LLM Providers

Three provider types ship in-tree (Anthropic, OpenAI-compatible, Ollama), powering 21 pre-configured providers via different base_url and model combinations. Adding a new provider type means implementing one trait and wiring a factory branch.

The Trait

#[async_trait]
pub trait LlmProvider: Send + Sync {
    async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse>;

    async fn stream(
        &self,
        request: CompletionRequest,
        tx: tokio::sync::mpsc::UnboundedSender<StreamEvent>,
    ) -> Result<()> {
        // Default: complete then send Done event.
        let response = self.complete(request).await?;
        let _ = tx.send(StreamEvent::Done(response));
        Ok(())
    }

    fn name(&self) -> &str;
    fn model(&self) -> &str;
    fn supports_tools(&self) -> bool;
}

The Send + Sync bound is required because providers are wrapped in Box<dyn LlmProvider> and stored on Agent, which is itself shared across async tasks.
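To make the bound concrete, here is a minimal sketch that uses OS threads as a stand-in for async tasks. The names `Provider`, `Echo`, `Agent` (with a single `provider` field), and `names_from_workers` are hypothetical and simplified for illustration; sharing the agent through `Arc` is also an assumption, since the real `Agent` may be shared differently.

```rust
use std::sync::Arc;
use std::thread;

// Simplified, synchronous stand-in for LlmProvider (assumption: the real
// trait is async and has more methods).
pub trait Provider: Send + Sync {
    fn name(&self) -> &str;
}

pub struct Echo;

impl Provider for Echo {
    fn name(&self) -> &str {
        "echo"
    }
}

// Hypothetical agent that owns a boxed provider, as described above.
pub struct Agent {
    pub provider: Box<dyn Provider>,
}

// Reads the provider name from several workers at once. This compiles only
// because `dyn Provider` is Send + Sync; drop the supertrait bounds and
// `thread::spawn` rejects the closure that captures `Arc<Agent>`.
pub fn names_from_workers(agent: Arc<Agent>, workers: usize) -> Vec<String> {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let agent = Arc::clone(&agent);
            thread::spawn(move || agent.provider.name().to_string())
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}
```

The same reasoning applies to a multi-threaded async runtime: a spawned task may move between worker threads, so everything it captures has to be safe to send and to share.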

Built-in Implementations

| Provider Type | File | API | Streaming | Used By |
|---|---|---|---|---|
| Anthropic | providers/anthropic.rs | Messages API | Real SSE: text deltas, tool input deltas, stop reasons, usage | claude |
| OpenAI | providers/openai.rs | Chat Completions | Real SSE streaming | openai, openrouter-*, groq, together |
| Ollama | providers/ollama.rs | Native API | Default + tool-call ID synthesis | local |

The OpenAI provider type is reused for OpenRouter, Groq, and Together by setting base_url in fledge.toml to point at their API endpoints.

The factory:

// merlin-core/src/providers/mod.rs
pub fn create_provider(
    _name: &str,
    config: &ProviderConfig,
) -> anyhow::Result<Box<dyn LlmProvider>> {
    match config.provider_type.as_str() {
        "anthropic" => Ok(Box::new(anthropic::AnthropicProvider::new(config)?)),
        "openai"    => Ok(Box::new(openai::OpenAiProvider::new(config)?)),
        "ollama"    => Ok(Box::new(ollama::OllamaProvider::new(config)?)),
        other => anyhow::bail!("unknown provider type: {other}"),
    }
}

Adding a Provider

Implement LlmProvider for your type, add a branch to create_provider, and register the type string in fledge.toml. The agent loop is provider-agnostic — once the trait is satisfied, you’re done.

A skeleton:

pub struct MyProvider {
    client: reqwest::Client,
    api_key: String,
    model: String,
}

#[async_trait]
impl LlmProvider for MyProvider {
    async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse> {
        // Map CompletionRequest → wire format.
        // POST. Map response → CompletionResponse.
        unimplemented!()
    }
    // Optionally override `stream` for true SSE.
    fn name(&self) -> &str { "myprovider" }
    fn model(&self) -> &str { &self.model }
    fn supports_tools(&self) -> bool { true }
}
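The factory half of that recipe can be sketched in isolation. The block below is a self-contained approximation, not the real module: `SketchConfig`, `NamedProvider`, `MyProviderSketch`, and `create_provider_sketch` are hypothetical stand-ins, errors are plain `String`s instead of `anyhow` errors, and the empty-model check is an invented example of constructor validation.

```rust
// Stand-in for ProviderConfig; only `provider_type` appears in the factory
// above, so the `model` field is an assumption.
pub struct SketchConfig {
    pub provider_type: String,
    pub model: String,
}

// Reduced, synchronous stand-in for LlmProvider.
pub trait NamedProvider {
    fn name(&self) -> &str;
    fn model(&self) -> &str;
}

pub struct MyProviderSketch {
    model: String,
}

impl MyProviderSketch {
    // Fallible constructor, mirroring the `new(config)?` pattern in the factory.
    pub fn new(config: &SketchConfig) -> Result<Self, String> {
        if config.model.is_empty() {
            return Err("myprovider requires a model".to_string());
        }
        Ok(Self { model: config.model.clone() })
    }
}

impl NamedProvider for MyProviderSketch {
    fn name(&self) -> &str {
        "myprovider"
    }
    fn model(&self) -> &str {
        &self.model
    }
}

pub fn create_provider_sketch(
    config: &SketchConfig,
) -> Result<Box<dyn NamedProvider>, String> {
    match config.provider_type.as_str() {
        // The new branch: one match arm per provider type string.
        "myprovider" => Ok(Box::new(MyProviderSketch::new(config)?)),
        other => Err(format!("unknown provider type: {other}")),
    }
}
```

Keeping construction fallible lets a provider reject bad configuration at startup rather than on the first request.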

Tool-call Mapping

Every provider speaks a slightly different dialect for tool calls:

  • Anthropic: tool_use content blocks with structured input JSON, tool_result blocks with tool_use_id for the response.
  • OpenAI: top-level tool_calls array with id and JSON-string arguments.
  • Ollama: similar to OpenAI but no id; we synthesize one from a monotonic timestamp.

Internally we standardize on Anthropic’s shape (ContentBlock::ToolUse + ContentBlock::ToolResult). Adapter code lives in each provider’s file.
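For the Ollama case, one way to synthesize an ID from a timestamp is sketched below. This is an illustration under stated assumptions, not the code in providers/ollama.rs: the function name `synthesize_tool_call_id` and the `call_` prefix are hypothetical, and the atomic counter is an added safeguard so that two calls within the same clock tick still get distinct, increasing IDs.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

// Last ID value handed out, shared across threads.
static LAST_ID: AtomicU64 = AtomicU64::new(0);

// Returns an ID of the form "call_<n>", where <n> is the current time in
// nanoseconds since the Unix epoch, bumped if needed so that every ID is
// strictly greater than the previous one.
pub fn synthesize_tool_call_id() -> String {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos() as u64)
        .unwrap_or(0); // clock before the epoch: fall back to the counter alone

    let mut prev = LAST_ID.load(Ordering::Relaxed);
    loop {
        // Never go backwards, even if the wall clock does.
        let next = now.max(prev + 1);
        match LAST_ID.compare_exchange_weak(prev, next, Ordering::Relaxed, Ordering::Relaxed) {
            Ok(_) => return format!("call_{next}"),
            Err(actual) => prev = actual, // another thread won the race; retry
        }
    }
}
```

The ID only has to be unique within a conversation so that a later ToolResult can point back at the matching ToolUse, which is why a process-local counter is enough here.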

Streaming Events

Implementations push events through the channel as they arrive:

| Event | Emit when |
|---|---|
| TextDelta(String) | A text chunk arrives |
| ToolUseStart { id, name } | A new tool-use block begins |
| ToolUseInputDelta(String) | Partial JSON for tool arguments |
| ToolUseEnd | The current tool block closes |
| Done(CompletionResponse) | Final assembled response |
| Error(String) | Stream-level error; agent bubbles it up |

The agent loop accumulates ToolUseInputDelta chunks per current block and emits a single AgentEvent::ToolCall at ToolUseEnd.
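The accumulation step can be sketched as a small state machine. The types below are local stand-ins, not the crate's definitions: `Done` carries a `String` instead of a `CompletionResponse`, and `ToolCall` and `ToolCallAccumulator` are hypothetical names for whatever the agent loop uses internally.

```rust
// Local copy of the event shape from the table above (assumption: field
// types are Strings; `Done` is simplified).
#[derive(Debug, Clone, PartialEq)]
pub enum StreamEvent {
    TextDelta(String),
    ToolUseStart { id: String, name: String },
    ToolUseInputDelta(String),
    ToolUseEnd,
    Done(String),
    Error(String),
}

// A fully assembled tool call, ready to be surfaced to the caller.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub id: String,
    pub name: String,
    pub input_json: String,
}

// Tracks the tool-use block that is currently open, if any.
#[derive(Default)]
pub struct ToolCallAccumulator {
    current: Option<ToolCall>,
}

impl ToolCallAccumulator {
    // Feeds one event in; returns a completed call only at ToolUseEnd.
    pub fn push(&mut self, event: &StreamEvent) -> Option<ToolCall> {
        match event {
            StreamEvent::ToolUseStart { id, name } => {
                self.current = Some(ToolCall {
                    id: id.clone(),
                    name: name.clone(),
                    input_json: String::new(),
                });
                None
            }
            StreamEvent::ToolUseInputDelta(chunk) => {
                // Deltas outside an open block are ignored.
                if let Some(call) = self.current.as_mut() {
                    call.input_json.push_str(chunk);
                }
                None
            }
            StreamEvent::ToolUseEnd => self.current.take(),
            // Text, Done, and Error events do not affect tool-call assembly.
            _ => None,
        }
    }
}
```

Because the argument JSON arrives in arbitrary fragments, it is only safe to parse `input_json` after ToolUseEnd, once the string is complete.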