# Google Gemini Integration Guide

This guide covers how to set up and use Google's Gemini models with Ragas for evaluation.
## Overview

Ragas supports Google Gemini models with automatic adapter selection. The framework routes Gemini requests through the LiteLLM adapter, providing seamless compatibility with Gemini's API.
## Setup

### Prerequisites

- A Google API key with Gemini API access
- Python 3.8+
- Ragas installed
### Installation

Install the required dependencies:
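The exact command is assumed here from the standard PyPI package names used throughout this guide (`ragas` and `google-generativeai`):

```bash
pip install ragas google-generativeai
```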
Or use a Ragas extra:
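The extras name below is an assumption; check the Ragas documentation for the exact extra:

```bash
pip install "ragas[gemini]"
```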
## Configuration

### Option 1: Using the Official Google Library (Recommended)

Using Google's official generativeai library is the simplest and most direct approach:
```python
import os

import google.generativeai as genai

from ragas.llms import llm_factory

# Configure with your API key
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))

# Create client
client = genai.GenerativeModel("gemini-2.0-flash")

# Create LLM - adapter is auto-detected for the google provider
llm = llm_factory(
    "gemini-2.0-flash",
    provider="google",
    client=client
)
```
### Option 2: Using a LiteLLM Proxy (Advanced)

For advanced use cases that need LiteLLM proxy features, first start a LiteLLM proxy server, then use:
```python
import os

from openai import OpenAI

from ragas.llms import llm_factory

# Requires a running proxy: litellm --model gemini-2.0-flash
client = OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"  # LiteLLM proxy endpoint
)

# Create LLM with explicit adapter selection
llm = llm_factory("gemini-2.0-flash", client=client, adapter="litellm")
```
## Supported Models

Ragas works with all Gemini models:

- Latest: `gemini-2.0-flash` (recommended)
- 1.5 series: `gemini-1.5-pro`, `gemini-1.5-flash`
- 1.0 series: `gemini-1.0-pro`
For the latest models and pricing, see Google AI Studio.
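Any of these model names can be passed straight to `llm_factory`; a minimal sketch switching to `gemini-1.5-pro`, following the same configuration pattern shown above:

```python
import os

import google.generativeai as genai

from ragas.llms import llm_factory

genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))

# Swap the model name in both the client and the factory call
client = genai.GenerativeModel("gemini-1.5-pro")
llm = llm_factory("gemini-1.5-pro", provider="google", client=client)
```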
## Embedding Configuration

Ragas metrics fall into two categories:

- LLM-only metrics (no embeddings required):
    - ContextPrecision
    - ContextRecall
    - Faithfulness
    - AspectCritic
- Embedding-based metrics (embeddings required):
    - AnswerCorrectness
    - AnswerRelevancy
    - AnswerSimilarity
    - SemanticSimilarity
    - ContextEntityRecall
### Automatic Provider Matching

When you use Ragas with Gemini, the embedding provider automatically matches your LLM provider: given a Gemini LLM, Ragas defaults to Google embeddings. No OpenAI API key is needed.
### Option 1: Default Embeddings (Recommended)

Let Ragas pick the right embeddings automatically based on your LLM:
```python
import os

from datasets import Dataset
import google.generativeai as genai

from ragas import evaluate
from ragas.llms import llm_factory
from ragas.metrics import (
    AnswerCorrectness,
    ContextPrecision,
    ContextRecall,
    Faithfulness
)

# Initialize Gemini client
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

# Create sample evaluation data
data = {
    "question": ["What is the capital of France?"],
    "answer": ["Paris is the capital of France."],
    "contexts": [["France is a country in Western Europe. Paris is its capital."]],
    "ground_truth": ["Paris"]
}
dataset = Dataset.from_dict(data)

# Define metrics - embeddings are auto-configured for Google
metrics = [
    ContextPrecision(llm=llm),
    ContextRecall(llm=llm),
    Faithfulness(llm=llm),
    AnswerCorrectness(llm=llm)  # Uses Google embeddings automatically
]

# Run evaluation
results = evaluate(dataset, metrics=metrics)
print(results)
```
### Option 2: Explicit Embeddings

For explicit control over embeddings, you can create them yourself. Google embeddings support several configuration options:
```python
import os

import google.generativeai as genai

from datasets import Dataset
from ragas import evaluate
from ragas.embeddings import GoogleEmbeddings
from ragas.embeddings.base import embedding_factory
from ragas.llms import llm_factory
from ragas.metrics import AnswerCorrectness, ContextPrecision, ContextRecall, Faithfulness

# Configure Google AI
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))

# Initialize Gemini LLM
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

# Initialize Google embeddings - pick one of the following:
# Option A: Simplest - auto-import (recommended)
embeddings = embedding_factory("google", model="text-embedding-004")
# Option B: From the genai module directly
# embeddings = GoogleEmbeddings(client=genai, model="text-embedding-004")
# Option C: No client (auto-imports genai)
# embeddings = GoogleEmbeddings(model="text-embedding-004")

# Create sample evaluation data
data = {
    "question": ["What is the capital of France?"],
    "answer": ["Paris is the capital of France."],
    "contexts": [["France is a country in Western Europe. Paris is its capital."]],
    "ground_truth": ["Paris"]
}
dataset = Dataset.from_dict(data)

# Define metrics with explicit embeddings
metrics = [
    ContextPrecision(llm=llm),
    ContextRecall(llm=llm),
    Faithfulness(llm=llm),
    AnswerCorrectness(llm=llm, embeddings=embeddings)
]

# Run evaluation
results = evaluate(dataset, metrics=metrics)
print(results)
```
## Example: Complete Evaluation

Here is a complete example of evaluating a RAG application with Gemini, using automatic embedding provider matching:
```python
import os

from datasets import Dataset
import google.generativeai as genai

from ragas import evaluate
from ragas.llms import llm_factory
from ragas.metrics import (
    AnswerCorrectness,
    ContextPrecision,
    ContextRecall,
    Faithfulness
)

# Initialize Gemini client
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

# Create sample evaluation data
data = {
    "question": ["What is the capital of France?"],
    "answer": ["Paris is the capital of France."],
    "contexts": [["France is a country in Western Europe. Paris is its capital."]],
    "ground_truth": ["Paris"]
}
dataset = Dataset.from_dict(data)

# Define metrics - embeddings automatically use the Google provider
metrics = [
    ContextPrecision(llm=llm),
    ContextRecall(llm=llm),
    Faithfulness(llm=llm),
    AnswerCorrectness(llm=llm)
]

# Run evaluation
results = evaluate(dataset, metrics=metrics)
print(results)
```
## Performance Considerations

### Model Selection

- `gemini-2.0-flash`: best speed and efficiency
- `gemini-1.5-pro`: better reasoning for complex evaluations
- `gemini-1.5-flash`: a good balance of speed and cost
### Cost Optimization

Gemini models are cost-effective. For large-scale evaluations:

- Use `gemini-2.0-flash` for most metrics
- Consider batching multiple evaluations (see the sketch below)
- Cache prompts where possible (Gemini supports prompt caching)
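As a minimal sketch of the batching idea (not a prescribed API, just the `evaluate()` call from the examples above applied to many rows at once), scoring all samples through a single `Dataset` avoids per-sample setup overhead:

```python
import os

from datasets import Dataset
import google.generativeai as genai

from ragas import evaluate
from ragas.llms import llm_factory
from ragas.metrics import Faithfulness

genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

# Collect many samples into one Dataset instead of calling
# evaluate() once per sample
data = {
    "question": ["What is the capital of France?", "What is the capital of Japan?"],
    "answer": ["Paris is the capital of France.", "Tokyo is the capital of Japan."],
    "contexts": [
        ["France is a country in Western Europe. Paris is its capital."],
        ["Japan is an island country in East Asia. Tokyo is its capital."],
    ],
    "ground_truth": ["Paris", "Tokyo"],
}
dataset = Dataset.from_dict(data)

# A single evaluate() call scores every row
results = evaluate(dataset, metrics=[Faithfulness(llm=llm)])
print(results)
```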
### Async Support

For high-throughput evaluations, use google-generativeai's async operations:
```python
import os

import google.generativeai as genai

from ragas.llms import llm_factory

# Configure and create the client (same as the sync setup)
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

# Use in async evaluation
# response = await llm.agenerate(prompt, ResponseModel)
```
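Expanding that commented call into a runnable sketch: the `agenerate(prompt, ResponseModel)` signature is taken from the comment above, while the Pydantic model and prompt here are illustrative assumptions:

```python
import asyncio
import os

import google.generativeai as genai
from pydantic import BaseModel

from ragas.llms import llm_factory


class Verdict(BaseModel):
    # Hypothetical response schema, for illustration only
    answer: str


async def main():
    genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
    client = genai.GenerativeModel("gemini-2.0-flash")
    llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

    # Await the async generation call shown in the comment above
    response = await llm.agenerate("What is the capital of France?", Verdict)
    print(response)


asyncio.run(main())
```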
## Adapter Selection

Ragas automatically picks the appropriate adapter based on your setup:
```python
from ragas.llms import llm_factory

# Auto-detection happens automatically:
# - for Gemini, the LiteLLM adapter is used
# - for other providers, the Instructor adapter is used

# Explicit selection (if needed)
llm = llm_factory(
    "gemini-2.0-flash",
    client=client,
    adapter="litellm"  # Explicit adapter selection
)

# Check the auto-detected adapter
from ragas.llms.adapters import auto_detect_adapter

adapter_name = auto_detect_adapter(client, "google")
print(f"Using adapter: {adapter_name}")  # Output: Using adapter: litellm
```
## Troubleshooting

### API Key Issues
```python
# Make sure your API key is set
import os

if not os.environ.get("GOOGLE_API_KEY"):
    raise ValueError("GOOGLE_API_KEY environment variable not set")
```
### Rate Limits

Gemini enforces rate limits. For production use, the LLM adapter handles retries and timeouts automatically. If you need fine-grained control, make sure your client is configured with appropriate timeouts at the HTTP-client level.
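If you want application-level control on top of that, a plain retry loop with exponential backoff is one option. A minimal sketch; the helper and backoff constants are illustrative, not part of Ragas:

```python
import time


def with_retries(fn, max_attempts=3, base_delay=2.0):
    """Retry fn() with exponential backoff on any exception.

    Illustrative helper, not a Ragas API.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))


# Example: retry a whole evaluation run on transient rate-limit errors
# results = with_retries(lambda: evaluate(dataset, metrics=metrics))
```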
### Model Availability

If a model is unavailable:

- Check your region and quota in the Google Cloud Console
- Try another model from the supported list
- Verify that your API key has access to the Generative AI API
## Migrating from Other Providers

### From OpenAI
```python
import os

from openai import OpenAI

from ragas.llms import llm_factory

# Before: OpenAI
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
llm = llm_factory("gpt-4o", client=client)

# After: Gemini, with the same code pattern
import google.generativeai as genai

genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)
```
### From Anthropic
```python
import os

from anthropic import Anthropic

from ragas.llms import llm_factory

# Before: Anthropic
client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
llm = llm_factory("claude-3-sonnet", provider="anthropic", client=client)

# After: Gemini
import google.generativeai as genai

genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)
```
## Using with Metric Collections (Modern Approach)

With the modern metric collections API, you create the LLM and embeddings explicitly:
```python
import asyncio
import os

import google.generativeai as genai

from ragas.embeddings.base import embedding_factory
from ragas.llms import llm_factory
from ragas.metrics.collections import AnswerCorrectness, ContextPrecision

# Configure Google AI
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))

# Create LLM
llm_client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=llm_client)

# Create embeddings - either option works:
# Option 1: Auto-import (simplest)
embeddings = embedding_factory("google", model="text-embedding-004")
# Option 2: Reuse the LLM's underlying client
# embeddings = embedding_factory("google", client=llm.client, model="text-embedding-004")

# Create metrics with explicit LLM and embeddings
metrics = [
    ContextPrecision(llm=llm),  # LLM-only metric
    AnswerCorrectness(llm=llm, embeddings=embeddings),  # Needs both
]

# Use the metrics in your evaluation workflow
# (ascore is async, so run it inside an event loop)
result = asyncio.run(
    metrics[1].ascore(
        user_input="What is the capital of France?",
        response="Paris",
        reference="Paris is the capital of France."
    )
)
```
Key differences from the legacy approach:

- Legacy `evaluate()`: embeddings are auto-created from the LLM provider
- Modern collections: you pass embeddings explicitly to each metric

This gives you more control and works seamlessly with Gemini.
## Supported Metrics

All Ragas metrics work with Gemini:

- Answer Correctness
- Answer Relevancy
- Answer Similarity
- Aspect Critique
- Context Precision
- Context Recall
- Context Entity Recall
- Faithfulness
- NLI Eval
- Response Relevancy

See the metrics reference for details.
## Advanced: Custom Model Parameters

Pass custom parameters to Gemini:
```python
llm = llm_factory(
    "gemini-2.0-flash",
    client=client,
    temperature=0.5,
    max_tokens=2048,
    top_p=0.9,
    top_k=40,
)
```