The actual stack we open on every new client project, split by what each tool earns its keep doing. Not “10 tools to revolutionise your workflow” — just the boring, reliable list that gets a build from kickoff to Demo Day.
1. Claude (Anthropic)
The default LLM for agents. Sonnet is our price/performance sweet spot. Tool-use accuracy and long-context recall are the headline reasons; prompt caching is the unglamorous one that pays for itself within a week.
2. LangGraph
State-machine orchestration for agents with branching logic. Checkpointing means we can pause-resume runs, which is essential when the agent is doing something a human needs to approve mid-flight. Built-in LangSmith integration is the cherry.
3. LangSmith
Tracing and evaluation. Every chain, agent step, retrieval, and tool call shows up in a UI you can scroll. Datasets capture production traffic so we can replay it against a new prompt version before shipping. This is the difference between “my agent works” and “my agent works at p95”.
4. Pinecone
The managed vector DB we reach for when the client doesn’t want to operate infrastructure. Serverless tier means there’s no sizing decision — you ship and scale. Filtered queries are fast enough that we don’t bolt on a second-stage filter.
5. Cursor
AI-first code editor. The agent mode genuinely accelerates building scaffolding code — boilerplate, glue, the unfun bits. Pair it with Claude Code for terminal-driven work and you’ve covered both halves of the dev loop.
6. Docker
Everything we ship is a Docker image. Multi-stage builds keep Python images under 500MB; BuildKit cache mounts make CI builds finish in 90 seconds. Without Docker, agent deployment becomes “works on my machine” theatre.
7. n8n
The connective tissue between agents and the rest of the world. Webhook triggers, Google Sheets sync, email send, Slack post — wired up visually in minutes. We self-host n8n on a $20 VPS for every client project. It saves us writing a hundred small integration scripts.
8. FastAPI
Python web framework of choice. Async-first, Pydantic-typed, OpenAPI docs for free. Every agent we ship has a FastAPI front door. The same Pydantic models we define for tool inputs become our HTTP request schemas, so there’s zero duplication.
9. Grafana + Prometheus
Observability for the agent. Token usage by model, p95 latency, agent loop depth distribution, cost per request, tool-call success rate. Without these dashboards, you have no idea whether your agent got better or worse when you changed the prompt. With them, you have a number to point at.
10. GitHub Actions + ArgoCD
Build → push → deploy. GitHub Actions builds the Docker image on every push, pushes it to a registry. ArgoCD watches the manifest repo and syncs the change to the Kubernetes cluster. The whole pipeline is YAML in Git — no secret-passing, no manual SSH, full audit trail.
Honourable Mentions
- LangFuse — open-source LangSmith alternative when data residency matters
- Qdrant — self-hosted vector DB when Pinecone’s pricing doesn’t fit
- Ollama — local model serving for the prompt-engineering exploration phase
- Groq — when wall-clock latency is the constraint and the model fits on their LPU
- Pydantic AI — for projects where the team lives in typed Python and wants validation-first agents
What’s Not On The List
Plenty of tools we used to reach for that we don’t anymore:
- OpenAI Assistants API — too opinionated, replaced by Claude + tool use
- Pinecone pod-based — Serverless replaced this
- Helm charts — Kustomize is enough for most agent deployments
- Self-managed Kafka — n8n + Redis cover 90% of the queueing we need
The stack churns. The Agentic AI Bootcamp curriculum updates each cohort to track it.