Design a platform that runs autonomous AI agents which plan multi-step tasks, call tools, and act on a user's behalf (e.g. "research X and draft a report", or "monitor my inbox and reply to routine emails").
The platform should support:
- •A planning/reasoning loop (the agent decides the next action each step)
- •A tool registry (web search, code execution, API calls, file I/O)
- •Short-term (working) and long-term (persistent) memory
- •Long-running tasks lasting minutes to hours, with human-approval checkpoints
- •Thousands of agents running concurrently
What you'll be assessed on
The agent execution loop, tool orchestration & safety, state/memory management, handling long-running and failing tasks, and cost/loop control.