LLM Providers
Three provider types ship in-tree (Anthropic, OpenAI-compatible, Ollama),
powering 21 pre-configured providers via different base_url and model
combinations. Adding a new provider type means implementing one trait and
wiring a factory branch.
The Trait
#[async_trait]
pub trait LlmProvider: Send + Sync {
    async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse>;

    async fn stream(
        &self,
        request: CompletionRequest,
        tx: tokio::sync::mpsc::UnboundedSender<StreamEvent>,
    ) -> Result<()> {
        // Default: complete then send Done event.
        let response = self.complete(request).await?;
        let _ = tx.send(StreamEvent::Done(response));
        Ok(())
    }

    fn name(&self) -> &str;
    fn model(&self) -> &str;
    fn supports_tools(&self) -> bool;
}
The Send + Sync bound is required because providers are wrapped in
Box<dyn LlmProvider> and stored on Agent, which is itself shared across
async tasks.
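To make the bound concrete, here is a minimal std-only sketch of the same shape. It is not the real trait: Provider is a synchronous stand-in for LlmProvider, OS threads stand in for async tasks, std::sync::mpsc replaces the tokio channel, and EchoProvider, Agent's single field, and run_on_threads are hypothetical names introduced for illustration.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::Arc;
use std::thread;

// Hypothetical, simplified stand-ins for the real response and event types.
#[derive(Debug, Clone, PartialEq)]
pub struct CompletionResponse {
    pub text: String,
}

#[derive(Debug, PartialEq)]
pub enum StreamEvent {
    Done(CompletionResponse),
}

// Synchronous stand-in for `LlmProvider`; the supertraits are the point here.
pub trait Provider: Send + Sync {
    fn complete(&self, prompt: &str) -> Result<CompletionResponse, String>;

    // Mirrors the default `stream`: complete, then send a single Done event.
    fn stream(&self, prompt: &str, tx: Sender<StreamEvent>) -> Result<(), String> {
        let response = self.complete(prompt)?;
        let _ = tx.send(StreamEvent::Done(response));
        Ok(())
    }
}

// Hypothetical provider that only overrides `complete`.
pub struct EchoProvider;

impl Provider for EchoProvider {
    fn complete(&self, prompt: &str) -> Result<CompletionResponse, String> {
        Ok(CompletionResponse { text: format!("echo: {prompt}") })
    }
}

// Hypothetical agent that owns a boxed provider.
pub struct Agent {
    pub provider: Box<dyn Provider>,
}

// Runs one streamed request per worker thread against a shared agent.
pub fn run_on_threads(agent: Arc<Agent>, prompts: Vec<String>) -> Vec<StreamEvent> {
    let (tx, rx) = channel();
    let handles: Vec<_> = prompts
        .into_iter()
        .map(|prompt| {
            let agent = Arc::clone(&agent);
            let tx = tx.clone();
            // This `spawn` compiles only because `dyn Provider` is Send + Sync.
            thread::spawn(move || agent.provider.stream(&prompt, tx))
        })
        .collect();
    drop(tx); // Close the original sender so `rx` ends once workers finish.
    for handle in handles {
        handle.join().expect("worker panicked").expect("stream failed");
    }
    rx.into_iter().collect()
}
```

Removing Send + Sync from the trait definition in this sketch makes the thread::spawn call fail to compile, which is the same class of error the real bound prevents when an Agent crosses task boundaries.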
Built-in Implementations
| Provider Type | File | API | Streaming | Used By |
|---|---|---|---|---|
| Anthropic | providers/anthropic.rs | Messages API | Real SSE — text deltas, tool input deltas, stop reasons, usage | claude |
| OpenAI | providers/openai.rs | Chat Completions | Real SSE streaming | openai, openrouter-*, groq, together |
| Ollama | providers/ollama.rs | Native API | Default stream (complete, then Done) + tool-call ID synthesis | local |
The OpenAI provider type is reused for OpenRouter, Groq, and Together by
setting base_url in fledge.toml to point at their API endpoints.
The factory:
// merlin-core/src/providers/mod.rs
pub fn create_provider(
    _name: &str,
    config: &ProviderConfig,
) -> anyhow::Result<Box<dyn LlmProvider>> {
    match config.provider_type.as_str() {
        "anthropic" => Ok(Box::new(anthropic::AnthropicProvider::new(config)?)),
        "openai" => Ok(Box::new(openai::OpenAiProvider::new(config)?)),
        "ollama" => Ok(Box::new(ollama::OllamaProvider::new(config)?)),
        other => anyhow::bail!("unknown provider type: {other}"),
    }
}
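The sketch below illustrates how one provider type can serve several configured providers. It is a std-only approximation, not the real module: only provider_type is taken from the factory above, while the base_url and model fields, the Provider stand-in trait, the endpoint method, and the /chat/completions suffix are assumptions made for illustration.

```rust
// Hypothetical, trimmed-down config; only `provider_type` appears in the
// factory above, the other fields are assumed for illustration.
#[derive(Debug, Clone)]
pub struct ProviderConfig {
    pub provider_type: String,
    pub base_url: String,
    pub model: String,
}

// Stand-in for `LlmProvider` with only synchronous accessors.
pub trait Provider {
    fn model(&self) -> &str;
    fn endpoint(&self) -> String;
}

// One implementation that can point at any OpenAI-compatible host.
pub struct OpenAiCompatible {
    base_url: String,
    model: String,
}

impl Provider for OpenAiCompatible {
    fn model(&self) -> &str {
        &self.model
    }

    // Appends an assumed Chat Completions path to the configured base URL.
    fn endpoint(&self) -> String {
        format!("{}/chat/completions", self.base_url.trim_end_matches('/'))
    }
}

// Dispatches on the type string, not on the configured provider name.
pub fn create_provider(config: &ProviderConfig) -> Result<Box<dyn Provider>, String> {
    match config.provider_type.as_str() {
        "openai" => Ok(Box::new(OpenAiCompatible {
            base_url: config.base_url.clone(),
            model: config.model.clone(),
        })),
        other => Err(format!("unknown provider type: {other}")),
    }
}
```

Two configs that both declare the "openai" type but differ in base_url and model yield two independent providers from the same branch, which is how a short list of provider types can back a much longer list of pre-configured providers.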
Adding a Provider
Implement LlmProvider for your type, add a branch to
create_provider, and register the type string in fledge.toml. The
agent loop is provider-agnostic — once the trait is satisfied, you’re
done.
A skeleton:
pub struct MyProvider {
    client: reqwest::Client,
    api_key: String,
    model: String,
}

#[async_trait]
impl LlmProvider for MyProvider {
    async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse> {
        // Map CompletionRequest → wire format.
        // POST. Map response → CompletionResponse.
        unimplemented!()
    }

    // Optionally override `stream` for true SSE.

    fn name(&self) -> &str { "myprovider" }
    fn model(&self) -> &str { &self.model }
    fn supports_tools(&self) -> bool { true }
}
Tool-call Mapping
Every provider speaks a slightly different dialect for tool calls:
- Anthropic: tool_use content blocks with structured input JSON, and tool_result blocks with tool_use_id for the response.
- OpenAI: top-level tool_calls array with id and JSON-string arguments.
- Ollama: similar to OpenAI but with no id; we synthesize one from a monotonic timestamp.
Internally we standardize on Anthropic’s shape (ContentBlock::ToolUse and
ContentBlock::ToolResult). Adapter code lives in each provider’s file.
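As a rough illustration of the adapter step, the sketch below normalizes an OpenAI-style or Ollama-style tool call into a single internal shape. The ToolUse struct, WireToolCall, normalize, synthesize_id, and the call_ prefix are all hypothetical; arguments stay as raw JSON text to avoid a JSON dependency, and the counter suffix is an added safeguard rather than something the text above describes.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::OnceLock;
use std::time::Instant;

// Hypothetical internal shape, loosely modeled on a tool-use content block.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolUse {
    pub id: String,
    pub name: String,
    pub input_json: String,
}

// Hypothetical wire-level tool call: OpenAI-style responses carry an id,
// Ollama-style ones do not.
pub struct WireToolCall {
    pub id: Option<String>,
    pub name: String,
    pub arguments: String,
}

static START: OnceLock<Instant> = OnceLock::new();
static SEQ: AtomicU64 = AtomicU64::new(0);

// Builds an id from a monotonic clock reading; the counter suffix keeps ids
// distinct even if two calls land on the same clock tick.
pub fn synthesize_id() -> String {
    let start = START.get_or_init(Instant::now);
    let nanos = start.elapsed().as_nanos();
    let seq = SEQ.fetch_add(1, Ordering::Relaxed);
    format!("call_{nanos}_{seq}")
}

// Keeps a provider-supplied id when present, otherwise synthesizes one.
pub fn normalize(call: WireToolCall) -> ToolUse {
    ToolUse {
        id: call.id.unwrap_or_else(synthesize_id),
        name: call.name,
        input_json: call.arguments,
    }
}
```

Whatever scheme is used, the synthesized id only needs to be unique within a conversation so that the matching tool result can be paired with its call.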
Streaming Events
Implementations push events through the channel as they arrive:
| Event | Emit when |
|---|---|
| TextDelta(String) | A text chunk arrives |
| ToolUseStart { id, name } | A new tool-use block begins |
| ToolUseInputDelta(String) | Partial JSON for tool arguments |
| ToolUseEnd | The current tool block closes |
| Done(CompletionResponse) | Final assembled response |
| Error(String) | Stream-level error; agent bubbles it up |
The agent loop accumulates ToolUseInputDelta chunks for the current
block and emits a single AgentEvent::ToolCall at ToolUseEnd.
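The accumulation step can be sketched as a fold over the event sequence. This is a simplified, std-only approximation: the StreamEvent variants follow the table, but CompletionResponse's field, the AgentEvent::Text variant, the fields of AgentEvent::ToolCall, and the fold_events function are assumptions, and an iterator stands in for the channel receiver.

```rust
// Hypothetical stand-in for the real response type.
#[derive(Debug, Clone, PartialEq)]
pub struct CompletionResponse {
    pub text: String,
}

// Variants follow the event table; payload types are simplified.
#[derive(Debug, Clone, PartialEq)]
pub enum StreamEvent {
    TextDelta(String),
    ToolUseStart { id: String, name: String },
    ToolUseInputDelta(String),
    ToolUseEnd,
    Done(CompletionResponse),
    Error(String),
}

// Hypothetical agent-level events; only `ToolCall` is named in the text.
#[derive(Debug, Clone, PartialEq)]
pub enum AgentEvent {
    Text(String),
    ToolCall { id: String, name: String, input_json: String },
}

// Consumes stream events in order and emits one ToolCall per closed block.
pub fn fold_events<I>(events: I) -> Result<Vec<AgentEvent>, String>
where
    I: IntoIterator<Item = StreamEvent>,
{
    let mut out = Vec::new();
    // (id, name, accumulated argument JSON) for the block currently open.
    let mut current: Option<(String, String, String)> = None;

    for event in events {
        match event {
            StreamEvent::TextDelta(text) => out.push(AgentEvent::Text(text)),
            StreamEvent::ToolUseStart { id, name } => {
                current = Some((id, name, String::new()));
            }
            StreamEvent::ToolUseInputDelta(chunk) => {
                // Deltas outside an open block are ignored in this sketch.
                if let Some((_, _, buf)) = current.as_mut() {
                    buf.push_str(&chunk);
                }
            }
            StreamEvent::ToolUseEnd => {
                if let Some((id, name, input_json)) = current.take() {
                    out.push(AgentEvent::ToolCall { id, name, input_json });
                }
            }
            StreamEvent::Done(_) => break,
            StreamEvent::Error(message) => return Err(message),
        }
    }
    Ok(out)
}
```

Buffering until ToolUseEnd means the argument JSON is only parsed once it is complete, so a partial chunk never reaches a tool.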