Deployment Topologies — Choosing How to Run Your Agents
A deployment topology is how you run agents, meaning the process and concurrency model. It is independent of how you package and ship them (Docker, Kubernetes, and CI/CD, which are covered in the Deployment section). The two are complementary: you pick a topology here, then package it there.
If you want to build several different agents on top of Paladin, the first decision is which of the five topologies below fits. Paladin is designed to be embedded as a library: paladin-ai (library name paladin) is your composition root, not a framework that owns your process, so every topology below is something you assemble in your own binary.
The five topologies at a glance
| Topology | Process model | Concurrency | Use when | Avoid when | Key crates / features |
|---|---|---|---|---|---|
| Embedded library | One process, agents in your main | tokio tasks; agents are Send + Sync behind Arc | You control invocation in-code and want the simplest setup | You need an external caller or independent scaling | paladin-ai |
| Battalion orchestration | One process, many agents collaborating | Built-in (Phalanx parallel, Campaign DAG, …) | The agents form a workflow on one task | The agents are independent request handlers | paladin-battalion |
| HTTP service host | One long-running process, agents resident behind an API | Concurrent requests over a shared agent registry | You need request/response access to many agents | A single embedded call already suffices | axum + paladin-ai (compose your own; paladin-web for user/auth) |
| Queue / worker (distributed) | Producer(s) + a pool of worker processes | Horizontal scale; backpressure via the queue | You need scale-out, retries, or fault isolation | Load is low and in-process execution is enough | paladin-storage (redis-queue) |
| Sidecar (separate process) | Agent in its own process, called over the network | Per-sidecar; caller is decoupled | You need hard process/security/deploy isolation per agent | In-process hosting gives the same benefit cheaper | HTTP host + an HTTP client (no IPC ships today) |
Two of these are documented in depth elsewhere: the embedded-library and Battalion pages here are short topology overviews that link to the full Paladin Agents and Orchestration Patterns guides for the complete API.
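To make the "Send + Sync behind Arc" concurrency model of the embedded topology concrete, here is a minimal sketch that shares one agent across several concurrent calls. The `Agent` trait, the `Shout` type, and `run_concurrently` are hypothetical names invented for illustration and are not Paladin's API. The sketch also uses `std::thread` instead of tokio tasks so that it has no dependencies; the sharing pattern is the same either way.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for an agent type (not Paladin's actual trait).
/// The `Send + Sync` bound is what lets one instance be shared across threads.
pub trait Agent: Send + Sync {
    fn run(&self, input: &str) -> String;
}

/// Toy agent that returns its input in upper case.
pub struct Shout;

impl Agent for Shout {
    fn run(&self, input: &str) -> String {
        input.to_uppercase()
    }
}

/// Runs one shared agent over several inputs concurrently, one thread per input.
/// Results come back in input order because the handles are joined in order.
pub fn run_concurrently(agent: Arc<dyn Agent>, inputs: Vec<String>) -> Vec<String> {
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|input| {
            // Each thread gets its own reference-counted handle to the same agent.
            let agent = Arc::clone(&agent);
            thread::spawn(move || agent.run(&input))
        })
        .collect();

    handles
        .into_iter()
        .map(|handle| handle.join().expect("agent thread panicked"))
        .collect()
}
```

Calling `run_concurrently(Arc::new(Shout), inputs)` from your own `main` is the whole topology: there is no server and no queue, only your binary invoking agents in-process.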
Choosing a topology
flowchart TD
start([I want to run a number of agents]) --> q1{Do the agents collaborate on one task?}
q1 -->|Yes| battalion[Battalion orchestration]
q1 -->|No| q2{Does an external caller need to invoke them?}
q2 -->|No| embedded[Embedded library]
q2 -->|Yes| q3{Need scale-out, retries, or backpressure?}
q3 -->|Yes| queue[Queue / worker]
q3 -->|No| q4{Need hard process / deploy isolation per agent?}
q4 -->|No| http[HTTP service host]
q4 -->|Yes| sidecar[Sidecar]
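
The same decision tree can be written as a small function, which is sometimes easier to review than a diagram. This is only a sketch of the flowchart above: the `Requirements` struct, the `Topology` enum, and `choose_topology` are hypothetical names for illustration and are not part of any Paladin crate.

```rust
/// The five topologies from the table (illustrative enum, not a Paladin type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Topology {
    EmbeddedLibrary,
    Battalion,
    HttpServiceHost,
    QueueWorker,
    Sidecar,
}

/// One field per question in the flowchart; every field defaults to `false`.
#[derive(Debug, Clone, Copy, Default)]
pub struct Requirements {
    pub agents_collaborate_on_one_task: bool,
    pub external_caller: bool,
    pub needs_scale_out_retries_or_backpressure: bool,
    pub needs_hard_isolation_per_agent: bool,
}

/// Asks the questions in the same order as the flowchart (q1 through q4).
pub fn choose_topology(r: Requirements) -> Topology {
    if r.agents_collaborate_on_one_task {
        return Topology::Battalion;
    }
    if !r.external_caller {
        return Topology::EmbeddedLibrary;
    }
    if r.needs_scale_out_retries_or_backpressure {
        return Topology::QueueWorker;
    }
    if r.needs_hard_isolation_per_agent {
        Topology::Sidecar
    } else {
        Topology::HttpServiceHost
    }
}
```

Note that the order of the questions matters: collaboration is checked first, so agents that form a workflow land on Battalion even if an external caller also exists.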
Recommendation
Start with the embedded library topology (and Battalion when agents collaborate). When you need an external caller, wrap the embedded setup in an HTTP service host. Reach for the queue / worker topology only when load or fault isolation demands it, and for the sidecar topology only when you need process isolation that in-process hosting cannot give, because the sidecar carries the most operational overhead.
These topologies also compose: a worker process is an embedded host; a sidecar is an HTTP host called from another process.
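
As a rough illustration of "a worker process is an embedded host", the sketch below runs a pool of workers that each pull jobs from a bounded queue and then make an ordinary in-process call. It is a single-process stand-in under stated assumptions: a `std::sync::mpsc::sync_channel` plays the role of the external queue, threads play the role of worker processes, and `Job`, `handle`, and `run_pool` are hypothetical names, not Paladin or paladin-storage APIs.

```rust
use std::sync::mpsc::{channel, sync_channel};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical unit of work placed on the queue.
pub struct Job {
    pub id: u32,
    pub input: String,
}

/// The "embedded host" part: a worker handles a job with a plain function call.
pub fn handle(job: &Job) -> String {
    format!("job {}: {}", job.id, job.input.to_uppercase())
}

/// Feeds `jobs` through a bounded queue to a pool of workers and returns the
/// results sorted by job id. A full queue blocks the producer (backpressure).
pub fn run_pool(jobs: Vec<Job>, workers: usize, capacity: usize) -> Vec<String> {
    let (job_tx, job_rx) = sync_channel::<Job>(capacity);
    let job_rx = Arc::new(Mutex::new(job_rx));
    let (out_tx, out_rx) = channel::<(u32, String)>();

    // Always start at least one worker so the producer cannot block forever.
    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let out_tx = out_tx.clone();
            thread::spawn(move || loop {
                // The lock is released at the end of this statement.
                let next = job_rx.lock().expect("queue lock poisoned").recv();
                match next {
                    Ok(job) => out_tx
                        .send((job.id, handle(&job)))
                        .expect("result receiver dropped"),
                    Err(_) => break, // queue closed and drained
                }
            })
        })
        .collect();
    drop(out_tx); // only the workers hold result senders now

    for job in jobs {
        job_tx.send(job).expect("all workers exited early");
    }
    drop(job_tx); // closing the queue lets the workers finish

    for handle in handles {
        handle.join().expect("worker panicked");
    }
    let mut results: Vec<(u32, String)> = out_rx.iter().collect();
    results.sort_by_key(|(id, _)| *id);
    results.into_iter().map(|(_, text)| text).collect()
}
```

In a real queue / worker deployment the channel would be an external queue and each worker its own process, but the body of the worker loop stays the same shape: receive a job, call the agent in-process, report the result.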