ZeroClaw-07: A Deep Dive into the Provider Ecosystem and Model Selection 🔗
A deep dive into the AI model provider ecosystem supported by ZeroClaw, from design principles to code implementation, and how to make the best trade-off between cost, quality, and speed.
Intended audience: technical decision-makers, architects, cost-control owners
Introduction: The AI Model "Marketplace" 🔗
The Model Selection Dilemma 🔗
Imagine walking into a huge open-air market:
- There is expensive imported fruit (GPT-4, Claude)
- There are cheap local vegetables (open-source models)
- There are stalls of every kind (different vendors)
- Every one of them is shouting "pick me, pick me"
ZeroClaw's answer: a single unified interface that plugs into every mainstream model and lets you switch at any time.
1. Provider Ecosystem Overview 🔗
1.1 Provider Category Map 🔗
flowchart TB
subgraph Commercial["💼 Commercial closed-source models"]
OPENAI["OpenAI
GPT-4o/o1/o3"]
ANTHROPIC["Anthropic
Claude 3.5/3.7"]
GOOGLE["Google
Gemini 2.0"]
COHERE["Cohere
Command R+"]
end
subgraph Aggregators["🌐 Aggregator platforms"]
OPENROUTER["OpenRouter
🟢 Recommended"]
TOGETHER["Together AI"]
GROQ["Groq
Ultra-fast inference"]
end
subgraph OpenSource["🔓 Open-source models"]
OLLAMA["Ollama
Runs locally"]
LMSTUDIO["LM Studio"]
VLLM["vLLM"]
end
subgraph China["🇨🇳 China-based services"]
DEEPSEEK["DeepSeek"]
QWEN["Qwen (Tongyi Qianwen)"]
GLM["Zhipu GLM"]
ZAI["Z AI"]
end
subgraph ZeroClaw["🦀 ZeroClaw"]
CORE["Unified Provider interface"]
FACTORY["Provider factory"]
ROUTER["Smart routing"]
end
end
OPENAI & ANTHROPIC & GOOGLE & COHERE --> CORE
OPENROUTER & TOGETHER & GROQ --> CORE
OLLAMA & LMSTUDIO & VLLM --> CORE
DEEPSEEK & QWEN & GLM & ZAI --> CORE
CORE --> FACTORY --> ROUTER
style OPENROUTER fill:#6366f1,color:#fff,stroke:#333,stroke-width:3px
style OLLAMA fill:#fff,stroke:#333,stroke-width:2px
1.2 Provider Trait Design 🔗
// src/providers/traits.rs
use async_trait::async_trait;
use futures::stream::BoxStream;
#[async_trait]
pub trait Provider: Send + Sync + std::fmt::Debug {
/// Provider name
fn name(&self) -> &str;
/// Complete a conversation
async fn complete(&self, request: CompletionRequest)
-> Result<CompletionResponse, ProviderError>;
/// Streaming completion
async fn complete_stream(&self, request: CompletionRequest)
-> Result<BoxStream<'static, Result<StreamChunk, ProviderError>>, ProviderError>;
/// List available models
async fn list_models(&self) -> Result<Vec<ModelInfo>, ProviderError>;
/// Health check
async fn health_check(&self) -> HealthStatus;
/// Warm up (establish connection pools, etc.)
async fn warmup(&self) -> Result<()> {
Ok(())
}
}
/// Completion request
#[derive(Debug, Clone)]
pub struct CompletionRequest {
pub messages: Vec<Message>,
pub model: Option<String>,
pub temperature: Option<f32>,
pub max_tokens: Option<u32>,
pub tools: Option<Vec<ToolSpec>>,
pub stream: bool,
}
/// Completion response
#[derive(Debug, Clone)]
pub struct CompletionResponse {
pub content: String,
pub tool_calls: Option<Vec<ToolCall>>,
pub usage: TokenUsage,
pub model: String,
}
/// Token accounting (Clone is required because CompletionResponse derives Clone)
#[derive(Debug, Clone)]
pub struct TokenUsage {
pub prompt_tokens: u32,
pub completion_tokens: u32,
pub total_tokens: u32,
}
Why this design?
| Design decision | Rationale |
|---|---|
| `async_trait` | Network requests have to be asynchronous |
| `stream` method | Streaming responses give a better real-time user experience |
| `health_check` | Needed for monitoring and failover |
| `warmup` | Pre-warming connection pools reduces first-request latency |
2. Mainstream Provider Implementations 🔗
2.1 OpenAI Provider 🔗
// src/providers/openai.rs
pub struct OpenAiProvider {
client: reqwest::Client,
api_key: String,
base_url: String,
default_model: String,
}
#[async_trait]
impl Provider for OpenAiProvider {
fn name(&self) -> &str {
"openai"
}
async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse, ProviderError> {
let body = json!({
"model": request.model.as_deref().unwrap_or(&self.default_model),
"messages": request.messages,
"temperature": request.temperature.unwrap_or(0.7),
"max_tokens": request.max_tokens,
"tools": request.tools,
});
let response = self.client
.post(format!("{}/v1/chat/completions", self.base_url))
.header("Authorization", format!("Bearer {}", self.api_key))
.json(&body)
.send()
.await
.map_err(|e| ProviderError::Network(e.to_string()))?;
if !response.status().is_success() {
let error = response.text().await.unwrap_or_default();
return Err(ProviderError::ApiError(error));
}
let result: OpenAiResponse = response.json().await
.map_err(|e| ProviderError::ParseError(e.to_string()))?;
let choice = result.choices.into_iter().next()
.ok_or_else(|| ProviderError::ParseError("empty choices in response".into()))?;
Ok(CompletionResponse {
content: choice.message.content.unwrap_or_default(),
tool_calls: choice.message.tool_calls.map(|tcs| {
tcs.into_iter().map(|tc| ToolCall {
id: tc.id,
name: tc.function.name,
arguments: tc.function.arguments,
}).collect()
}),
usage: TokenUsage {
prompt_tokens: result.usage.prompt_tokens,
completion_tokens: result.usage.completion_tokens,
total_tokens: result.usage.total_tokens,
},
model: result.model,
})
}
async fn complete_stream(&self, request: CompletionRequest)
-> Result<BoxStream<'static, Result<StreamChunk, ProviderError>>, ProviderError> {
// Streaming implementation omitted here; see the SSE parsing sketch after the response structures below.
todo!()
}
// list_models and health_check are omitted from this excerpt.
}
// OpenAI API response structures
#[derive(Debug, Deserialize)]
struct OpenAiResponse {
model: String,
choices: Vec<Choice>,
usage: Usage,
}
#[derive(Debug, Deserialize)]
struct Choice {
message: OpenAiMessage,
}
// Named OpenAiMessage/OpenAiToolCall to avoid clashing with the shared
// Message and ToolCall types from traits.rs.
#[derive(Debug, Deserialize)]
struct OpenAiMessage {
content: Option<String>,
tool_calls: Option<Vec<OpenAiToolCall>>,
}
#[derive(Debug, Deserialize)]
struct OpenAiToolCall {
id: String,
function: OpenAiFunction,
}
#[derive(Debug, Deserialize)]
struct OpenAiFunction {
name: String,
arguments: String,
}
#[derive(Debug, Deserialize)]
struct Usage {
prompt_tokens: u32,
completion_tokens: u32,
total_tokens: u32,
}
2.2 Ollama Provider (Local Models) 🔗
// src/providers/ollama.rs
pub struct OllamaProvider {
client: reqwest::Client,
base_url: String,
default_model: String,
}
#[async_trait]
impl Provider for OllamaProvider {
fn name(&self) -> &str {
"ollama"
}
async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse, ProviderError> {
let body = json!({
"model": request.model.as_deref().unwrap_or(&self.default_model),
"messages": request.messages,
"stream": false,
});
let response = self.client
.post(format!("{}/api/chat", self.base_url))
.json(&body)
.send()
.await
.map_err(|e| {
if e.is_connect() {
ProviderError::Network(
"无法连接到 Ollama,请检查服务是否运行".into()
)
} else {
ProviderError::Network(e.to_string())
}
})?;
let result: OllamaResponse = response.json().await
.map_err(|e| ProviderError::ParseError(e.to_string()))?;
Ok(CompletionResponse {
content: result.message.content,
tool_calls: None, // some Ollama models support tool calls, but they are not mapped in this excerpt
usage: TokenUsage {
// Token counts are not mapped in this excerpt; estimate them if needed (see the heuristic sketch after the comparison table below)
prompt_tokens: 0,
completion_tokens: 0,
total_tokens: 0,
},
model: result.model,
})
}
async fn health_check(&self) -> HealthStatus {
match self.client
.get(format!("{}/api/tags", self.base_url))
.send()
.await {
Ok(resp) if resp.status().is_success() => HealthStatus::Healthy,
_ => HealthStatus::Unhealthy("Cannot reach the Ollama service".into()),
}
}
// complete_stream and list_models are omitted from this excerpt.
}
Differences between Ollama and cloud providers:
| Feature | Ollama | OpenAI/OpenRouter |
|---|---|---|
| Network dependency | None (local) | Required |
| Cost | Free (hardware only) | Pay per token |
| Privacy | Fully local | Data sent to the cloud |
| Latency | Depends on hardware | Network latency |
| Model selection | Limited | Extensive |
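Because the Ollama excerpt above leaves `TokenUsage` at zero, cost and context accounting would silently under-count local usage. Below is a rough estimation sketch, assuming the `Message` type exposes a `content` string; the characters-per-token ratio is an approximation, not a tokenizer, and the helper names are illustrative.
/// Rough heuristic: roughly four characters per token for English-like text.
/// A real tokenizer will give different counts, especially for code and
/// non-English text.
fn estimate_tokens(text: &str) -> u32 {
    (text.chars().count() as u32 / 4).max(1)
}

// Assumes the Message type exposes a `content` string.
fn estimate_usage(messages: &[Message], completion: &str) -> TokenUsage {
    let prompt_tokens: u32 = messages.iter().map(|m| estimate_tokens(&m.content)).sum();
    let completion_tokens = estimate_tokens(completion);
    TokenUsage {
        prompt_tokens,
        completion_tokens,
        total_tokens: prompt_tokens + completion_tokens,
    }
}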
2.3 The Provider Factory 🔗
// src/providers/mod.rs
pub struct ProviderFactory;
impl ProviderFactory {
pub async fn create(config: &ProviderConfig) -> Result<Box<dyn Provider>> {
match config.type_.as_str() {
"openai" => Ok(Box::new(OpenAiProvider::new(config)?)),
"anthropic" => Ok(Box::new(AnthropicProvider::new(config)?)),
"ollama" => Ok(Box::new(OllamaProvider::new(config)?)),
"openrouter" => Ok(Box::new(OpenRouterProvider::new(config)?)),
"deepseek" => Ok(Box::new(DeepSeekProvider::new(config)?)),
"custom" => Ok(Box::new(CustomProvider::new(config)?)),
_ => Err(Error::UnknownProvider(config.type_.clone())),
}
}
pub fn list_available() -> Vec<&'static str> {
vec![
"openai",
"anthropic",
"ollama",
"openrouter",
"deepseek",
"groq",
"together",
"custom",
]
}
}
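A short usage sketch for wiring the factory into startup; beyond `type_`, the fields of `ProviderConfig` are not shown above, so this relies only on the factory itself, and `build_default_provider` and the `tracing` call are assumptions for illustration.
// Sketch: build the configured provider once at startup and hand out a trait object.
async fn build_default_provider(config: &ProviderConfig) -> Result<Box<dyn Provider>> {
    let provider = ProviderFactory::create(config).await?;
    // Optional: pre-warm connection pools before the first user request.
    provider.warmup().await?;
    tracing::info!("active provider: {}", provider.name());
    Ok(provider)
}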
3. Provider Selection 🔗
3.1 Capability Matrix 🔗
flowchart LR
subgraph Dimensions["Evaluation dimensions"]
D1["Coding ability"]
D2["Reasoning ability"]
D3["Tool calling"]
D4["Context length"]
D5["Chinese ability"]
D6["Cost-effectiveness"]
end
subgraph GPT4o["GPT-4o"]
G1["⭐⭐⭐⭐⭐"]
G2["⭐⭐⭐⭐"]
G3["⭐⭐⭐⭐⭐"]
G4["128K"]
G5["⭐⭐⭐⭐"]
G6["💰💰"]
end
subgraph Claude["Claude 3.5"]
C1["⭐⭐⭐⭐⭐"]
C2["⭐⭐⭐⭐⭐"]
C3["⭐⭐⭐⭐"]
C4["200K"]
C5["⭐⭐⭐"]
C6["💰💰"]
end
subgraph OpenRouter["OpenRouter"]
O1["⭐⭐⭐⭐⭐"]
O2["⭐⭐⭐⭐⭐"]
O3["⭐⭐⭐⭐⭐"]
O4["200K+"]
O5["⭐⭐⭐⭐"]
O6["💰"]
end
subgraph Ollama["Ollama (local)"]
LL1["⭐⭐⭐"]
LL2["⭐⭐⭐"]
LL3["⭐⭐⭐"]
LL4["128K"]
LL5["⭐⭐⭐"]
LL6["🆓"]
end
D1 --> G1 & C1 & O1 & LL1
D2 --> G2 & C2 & O2 & LL2
D3 --> G3 & C3 & O3 & LL3
D4 --> G4 & C4 & O4 & LL4
D5 --> G5 & C5 & O5 & LL5
D6 --> G6 & C6 & O6 & LL6
style OpenRouter fill:#6366f1,color:#fff
style Ollama fill:#afa
3.2 Selection Decision Tree 🔗
flowchart TD
START["Choose a provider"] --> Q1{"Budget?"}
Q1 -->|"Limited / free"| FREE["Free / low-cost options"]
Q1 -->|"Moderate"| MID["Balanced options"]
Q1 -->|"Ample"| PREM["Premium options"]
FREE --> Q2{"Local compute?"}
Q2 -->|"Plenty"| OLLAMA["Ollama
🟢 Recommended"]
Q2 -->|"Limited"| CHEAP["DeepSeek
Z AI
Best value for money"]
MID --> Q3{"Primary use?"}
Q3 -->|"Code development"| CODE["OpenRouter + Claude
🟢 Recommended"]
Q3 -->|"General chat"| GEN["OpenRouter
🟢 Recommended"]
Q3 -->|"Chinese-language work"| CN["DeepSeek/GLM
🟢 Recommended"]
PREM --> Q4{"Specific needs?"}
Q4 -->|"Strongest coding"| BEST_CODE["Claude 3.7 Sonnet
🟢 Recommended"]
Q4 -->|"Tool calling"| BEST_TOOL["GPT-4o
🟢 Recommended"]
Q4 -->|"Very long context"| BEST_CTX["Claude 200K
🟢 Recommended"]
OLLAMA --> CONFIG1["Config: provider = 'ollama'"]
CHEAP --> CONFIG2["Config: provider = 'deepseek' / 'zai'"]
CODE --> CONFIG3["Config: provider = 'openrouter'
model = 'anthropic/claude-sonnet-4'"]
GEN --> CONFIG4["Config: provider = 'openrouter'"]
CN --> CONFIG5["Config: provider = 'openrouter'
model = 'deepseek/deepseek-chat'"]
BEST_CODE --> CONFIG6["Config: provider = 'anthropic' / 'openrouter'"]
BEST_TOOL --> CONFIG7["Config: provider = 'openai' / 'openrouter'"]
BEST_CTX --> CONFIG8["Config: provider = 'anthropic'"]
style CODE fill:#6366f1,color:#fff
style GEN fill:#6366f1,color:#fff
3.3 Configuration Examples 🔗
# Quick start (recommended)
default_provider = "openrouter"
default_model = "openrouter/auto"
# Strongest coding ability
default_provider = "openrouter"
default_model = "anthropic/claude-sonnet-4"
# Local and free
default_provider = "ollama"
default_model = "llama3.2"
# Optimized for Chinese
default_provider = "openrouter"
default_model = "deepseek/deepseek-chat"
# Cost first
default_provider = "openrouter"
default_model = "openrouter/optimus"
4. Cost Optimization Strategies 🔗
4.1 Cost Structure Analysis 🔗
flowchart TB
subgraph Costs["Cost components"]
C1["Input tokens
Prompts"]
C2["Output tokens
Generated content"]
C3["Tool calls
Extra consumption"]
C4["Embedding service
Memory system"]
end
subgraph Strategies["Optimization strategies"]
S1["Trim the context"]
S2["Cache embeddings"]
S3["Pick cheaper models"]
S4["Local embeddings"]
end
subgraph Impact["Expected savings"]
E1["-30% cost"]
E2["-50% cost"]
E3["-70% cost"]
E4["-90% cost"]
end
C1 --> S1 --> E1
C4 --> S2 & S4
S2 --> E2
S4 --> E4
C1 & C2 --> S3 --> E3
style E4 fill:#afa,stroke:#333
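The "trim the context" strategy from the diagram can be made concrete with a small sketch that keeps only the most recent messages fitting a token budget. It reuses the `estimate_tokens` heuristic sketched in section 2.2 and assumes `Message` is `Clone` with a `content` field; a real implementation would also preserve the system prompt and any pinned context.
// Keep only the most recent messages that fit inside `budget_tokens`.
fn trim_context(messages: &[Message], budget_tokens: u32) -> Vec<Message> {
    let mut kept = Vec::new();
    let mut used = 0u32;
    for message in messages.iter().rev() {
        let cost = estimate_tokens(&message.content);
        if used + cost > budget_tokens {
            break;
        }
        used += cost;
        kept.push(message.clone());
    }
    kept.reverse(); // restore chronological order
    kept
}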
4.2 Implementing a Caching Strategy 🔗
// src/providers/cached.rs
pub struct CachedProvider {
inner: Box<dyn Provider>,
cache: Arc<RwLock<LruCache<String, CompletionResponse>>>,
}
#[async_trait]
impl Provider for CachedProvider {
async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse, ProviderError> {
// Build the cache key
let cache_key = format!(
"{}/{}/{}/{}",
request.model.as_deref().unwrap_or("default"),
hash_messages(&request.messages),
request.temperature.unwrap_or(0.7),
request.max_tokens.unwrap_or(0)
);
// Check the cache (the lru crate's get() needs a mutable borrow to update recency)
{
let mut cache = self.cache.write().await;
if let Some(cached) = cache.get(&cache_key) {
trace!("provider cache hit");
return Ok(cached.clone());
}
}
// Call the underlying provider
let result = self.inner.complete(request).await?;
// Store the result in the cache
{
let mut cache = self.cache.write().await;
cache.put(cache_key, result.clone());
}
Ok(result)
}
fn name(&self) -> &str {
self.inner.name()
}
// complete_stream, list_models, and health_check delegate to self.inner and are omitted from this excerpt.
}
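Two pieces are referenced but not shown above: the `hash_messages` helper used in the cache key, and a way to construct a `CachedProvider`. Below is a minimal sketch of both, assuming `Message` exposes `role` and `content` strings, the same `Arc`/`RwLock`/`LruCache` imports already used by the struct, and the `lru` crate's `LruCache::new(NonZeroUsize)` signature; the constructor is illustrative rather than ZeroClaw's actual API.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::num::NonZeroUsize;

// Sketch of the hash_messages helper referenced in the cache key.
fn hash_messages(messages: &[Message]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for message in messages {
        message.role.hash(&mut hasher);
        message.content.hash(&mut hasher);
    }
    hasher.finish()
}

impl CachedProvider {
    // Illustrative constructor: wrap any provider with an LRU cache of `capacity` entries.
    pub fn new(inner: Box<dyn Provider>, capacity: NonZeroUsize) -> Self {
        Self {
            inner,
            cache: Arc::new(RwLock::new(LruCache::new(capacity))),
        }
    }
}
Note that caching completions is only semantically safe for deterministic or low-temperature requests; a cached response generated at high temperature will simply be replayed verbatim.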
5. Provider Quick Reference 🔗
5.1 Recommendations by Scenario 🔗
| Scenario | Recommended provider | Recommended model | Why |
|---|---|---|---|
| Getting started | OpenRouter | openrouter/auto | Switch with one setting, automatic optimization |
| Professional development | OpenRouter | anthropic/claude-sonnet-4 | Strongest coding ability |
| Chinese-language work | DeepSeek/Z AI | deepseek-chat | Optimized for Chinese, low cost |
| Privacy first | Ollama | llama3.2 | Runs locally, data never leaves your machine |
| Cost sensitive | OpenRouter | openrouter/optimus | Automatically picks the best value |
| Very long documents | Anthropic | claude-3-opus | 200K context |
| Fastest responses | Groq | llama-3.1-70b | Millisecond-level responses |
| Enterprise deployment | Azure OpenAI | GPT-4o | SLA guarantees |
5.2 Complete Provider List 🔗
Commercial models:
- openai : GPT-4o, GPT-4o-mini, o1, o3
- anthropic : Claude 3.5/3.7 Sonnet, Opus, Haiku
- google : Gemini 1.5/2.0 Pro/Flash
- cohere : Command R, Command R+
Aggregator platforms:
- openrouter : unified interface to 100+ models (recommended)
- together : Together AI inference
- groq : ultra-fast inference platform
Open source / local:
- ollama : run Llama, Qwen, Phi, and others locally
- lmstudio : LM Studio local server
- vllm : vLLM inference engine
China-based services:
- deepseek : DeepSeek Coder/Chat
- zai : Z AI GLM-5
- openrouter-cn : access Chinese models through OpenRouter
Custom:
- custom:https:// : custom OpenAI-compatible endpoint
6. Custom Providers 🔗
6.1 Implementing a Custom Provider 🔗
// Implement a custom provider
pub struct MyCustomProvider {
client: reqwest::Client,
api_key: String,
}
#[async_trait]
impl Provider for MyCustomProvider {
fn name(&self) -> &str {
"my-provider"
}
async fn complete(&self, request: CompletionRequest)
-> Result<CompletionResponse, ProviderError> {
// Implement your own API call logic here
todo!()
}
async fn list_models(&self) -> Result<Vec<ModelInfo>, ProviderError> {
Ok(vec![
ModelInfo {
id: "my-model".into(),
name: "My Custom Model".into(),
context_length: 4096,
}
])
}
// complete_stream and health_check are omitted from this excerpt.
}
// Register with the factory.
// Note: this assumes ProviderFactory exposes a register() method that stores
// constructor closures; the match-based factory from section 2.3 would need to
// be extended for this (see the registry sketch below).
pub fn register_custom_providers(factory: &mut ProviderFactory) {
factory.register("my-provider", |config| {
Box::new(MyCustomProvider::new(config))
});
}
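The `factory.register(...)` call implies a registry-based factory rather than the hard-coded `match` shown in section 2.3. A minimal sketch of what such a registry could look like, storing constructor closures in a `HashMap`; `RegistryProviderFactory` is a hypothetical name, not ZeroClaw's actual type.
use std::collections::HashMap;

// Constructor closures keyed by the provider type name.
type ProviderCtor = Box<dyn Fn(&ProviderConfig) -> Box<dyn Provider> + Send + Sync>;

pub struct RegistryProviderFactory {
    constructors: HashMap<String, ProviderCtor>,
}

impl RegistryProviderFactory {
    pub fn new() -> Self {
        Self { constructors: HashMap::new() }
    }

    /// Register a constructor closure under a provider type name.
    pub fn register<F>(&mut self, name: &str, ctor: F)
    where
        F: Fn(&ProviderConfig) -> Box<dyn Provider> + Send + Sync + 'static,
    {
        self.constructors.insert(name.to_string(), Box::new(ctor));
    }

    /// Build a provider from config, erroring on unknown type names.
    pub fn create(&self, config: &ProviderConfig) -> Result<Box<dyn Provider>> {
        match self.constructors.get(&config.type_) {
            Some(ctor) => Ok(ctor(config)),
            None => Err(Error::UnknownProvider(config.type_.clone())),
        }
    }
}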
6.2 Using It in Configuration 🔗
# Use a custom provider
default_provider = "custom:https://api.example.com/v1"
api_key = "your-api-key"