
Google Gemini Integration Guide

This guide covers how to set up and use Google's Gemini models with Ragas for evaluation.

Overview

Ragas supports Google Gemini models with automatic adapter selection. The framework intelligently routes Gemini requests through the LiteLLM adapter, providing seamless compatibility with the Gemini API.

Setup

Prerequisites

  • A Google API key with Gemini API access
  • Python 3.8+
  • Ragas installed

Installation

Install the required dependencies:

pip install ragas google-generativeai litellm

Or use the Ragas extra:

pip install "ragas[gemini]"

Configuration

Option 1: Using Google's Official Library (Recommended)

Using Google's official generativeai library is the simplest and most direct approach:

import os
import google.generativeai as genai
from ragas.llms import llm_factory

# Configure with your API key
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))

# Create client
client = genai.GenerativeModel("gemini-2.0-flash")

# Create LLM - adapter is auto-detected for google provider
llm = llm_factory(
    "gemini-2.0-flash",
    provider="google",
    client=client
)

Option 2: Using a LiteLLM Proxy (Advanced)

For advanced use cases that need LiteLLM proxy features, start a LiteLLM proxy server first, then use:

import os
from openai import OpenAI
from ragas.llms import llm_factory

# Requires running: litellm --model gemini-2.0-flash
client = OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"  # LiteLLM proxy endpoint
)

# Create LLM with explicit adapter selection
llm = llm_factory("gemini-2.0-flash", client=client, adapter="litellm")

Supported Models

Ragas works with all Gemini models:

  • Latest: gemini-2.0-flash (recommended)
  • 1.5 series: gemini-1.5-pro, gemini-1.5-flash
  • 1.0 series: gemini-1.0-pro

For the latest models and pricing, see Google AI Studio.
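
To check which models your API key can actually access, the google-generativeai client exposes a model-listing call. A minimal sketch:

import os
import google.generativeai as genai

genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))

# Print every model on this key that supports text generation
for m in genai.list_models():
    if "generateContent" in m.supported_generation_methods:
        print(m.name)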

Embeddings Configuration

Ragas metrics fall into two categories:

  1. LLM-only metrics (no embeddings required)
     • ContextPrecision
     • ContextRecall
     • Faithfulness
     • AspectCritic

  2. Embedding-based metrics (embeddings required)
     • AnswerCorrectness
     • AnswerRelevancy
     • AnswerSimilarity
     • SemanticSimilarity
     • ContextEntityRecall

Option 1: Automatic Provider Matching

When using Ragas with Gemini, the embeddings provider automatically matches your LLM provider: if you supply a Gemini LLM, Ragas defaults to Google embeddings, so no OpenAI API key is required.

Let Ragas pick the right embeddings automatically based on your LLM:

import os
from datasets import Dataset
import google.generativeai as genai
from ragas import evaluate
from ragas.llms import llm_factory
from ragas.metrics import (
    AnswerCorrectness,
    ContextPrecision,
    ContextRecall,
    Faithfulness
)

# Initialize Gemini client
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

# Create sample evaluation data
data = {
    "question": ["What is the capital of France?"],
    "answer": ["Paris is the capital of France."],
    "contexts": [["France is a country in Western Europe. Paris is its capital."]],
    "ground_truth": ["Paris"]
}

dataset = Dataset.from_dict(data)

# Define metrics - embeddings are auto-configured for Google
metrics = [
    ContextPrecision(llm=llm),
    ContextRecall(llm=llm),
    Faithfulness(llm=llm),
    AnswerCorrectness(llm=llm)  # Uses Google embeddings automatically
]

# Run evaluation
results = evaluate(dataset, metrics=metrics)
print(results)

Option 2: Explicit Embeddings

For explicit control over embeddings, you can create them separately. Google embeddings support several configuration options:

import os
import google.generativeai as genai
from ragas.llms import llm_factory
from ragas.embeddings import GoogleEmbeddings
from ragas.embeddings.base import embedding_factory
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import AnswerCorrectness, ContextPrecision, ContextRecall, Faithfulness

# Configure Google AI
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))

# Initialize Gemini LLM
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

# Initialize Google embeddings (pick ONE of the options below):

# Option A: Simplest - auto-import (recommended)
embeddings = embedding_factory("google", model="text-embedding-004")

# Option B: From genai module directly
embeddings = GoogleEmbeddings(client=genai, model="text-embedding-004")

# Option C: No client (auto-imports genai)
embeddings = GoogleEmbeddings(model="text-embedding-004")

# Create sample evaluation data
data = {
    "question": ["What is the capital of France?"],
    "answer": ["Paris is the capital of France."],
    "contexts": [["France is a country in Western Europe. Paris is its capital."]],
    "ground_truth": ["Paris"]
}

dataset = Dataset.from_dict(data)

# Define metrics with explicit embeddings
metrics = [
    ContextPrecision(llm=llm),
    ContextRecall(llm=llm),
    Faithfulness(llm=llm),
    AnswerCorrectness(llm=llm, embeddings=embeddings)
]

# Run evaluation
results = evaluate(dataset, metrics=metrics)
print(results)

Example: Complete Evaluation

Here is a complete example of evaluating a RAG application with Gemini, using automatic embeddings provider matching:

import os
from datasets import Dataset
import google.generativeai as genai
from ragas import evaluate
from ragas.llms import llm_factory
from ragas.metrics import (
    AnswerCorrectness,
    ContextPrecision,
    ContextRecall,
    Faithfulness
)

# Initialize Gemini client
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

# Create sample evaluation data
data = {
    "question": ["What is the capital of France?"],
    "answer": ["Paris is the capital of France."],
    "contexts": [["France is a country in Western Europe. Paris is its capital."]],
    "ground_truth": ["Paris"]
}

dataset = Dataset.from_dict(data)

# Define metrics - embeddings automatically use Google provider
metrics = [
    ContextPrecision(llm=llm),
    ContextRecall(llm=llm),
    Faithfulness(llm=llm),
    AnswerCorrectness(llm=llm)
]

# Run evaluation
results = evaluate(dataset, metrics=metrics)
print(results)

Performance Considerations

Model Selection

  • gemini-2.0-flash: best speed and efficiency
  • gemini-1.5-pro: better reasoning for complex evaluations
  • gemini-1.5-flash: good balance of speed and cost

Cost Optimization

Gemini models are cost-effective. For large-scale evaluations:

  1. Use gemini-2.0-flash for most metrics
  2. Batch multiple evaluations together (see the sketch after this list)
  3. Cache prompts where possible (Gemini supports prompt caching)
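
As a sketch of point 2, the simplest form of batching is to collect many samples into one Dataset and score them in a single evaluate() call (reusing the llm object created earlier), rather than evaluating one sample at a time:

from datasets import Dataset
from ragas import evaluate
from ragas.metrics import Faithfulness

# One Dataset with many rows means a single evaluate() call scores them all
data = {
    "question": ["What is the capital of France?", "What is the capital of Japan?"],
    "answer": ["Paris is the capital of France.", "Tokyo is the capital of Japan."],
    "contexts": [
        ["Paris is the capital of France."],
        ["Tokyo is the capital of Japan."],
    ],
}
dataset = Dataset.from_dict(data)

results = evaluate(dataset, metrics=[Faithfulness(llm=llm)])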

Async Support

For high-throughput evaluations, use the async operations from google-generativeai:

import os
import google.generativeai as genai
from ragas.llms import llm_factory

# Configure and create client (same as sync)
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

# Use in async evaluation
# response = await llm.agenerate(prompt, ResponseModel)
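
A minimal sketch of driving that commented-out call, assuming the agenerate(prompt, ResponseModel) signature shown above; the Pydantic response model here is hypothetical:

import asyncio
from pydantic import BaseModel

class Verdict(BaseModel):  # hypothetical response schema, for illustration only
    answer: str

async def main():
    # agenerate takes a prompt plus a response model, per the comment above
    response = await llm.agenerate("Is Paris the capital of France?", Verdict)
    print(response)

asyncio.run(main())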

Adapter Selection

Ragas automatically selects the appropriate adapter based on your setup:

# Auto-detection happens automatically
# For Gemini: uses LiteLLM adapter
# For other providers: uses Instructor adapter

# Explicit selection (if needed)
llm = llm_factory(
    "gemini-2.0-flash",
    client=client,
    adapter="litellm"  # Explicit adapter selection
)

# Check auto-detected adapter
from ragas.llms.adapters import auto_detect_adapter
adapter_name = auto_detect_adapter(client, "google")
print(f"Using adapter: {adapter_name}")  # Output: Using adapter: litellm

Troubleshooting

API Key Issues

# Make sure your API key is set
import os
if not os.environ.get("GOOGLE_API_KEY"):
    raise ValueError("GOOGLE_API_KEY environment variable not set")
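
If you keep credentials in a local .env file, python-dotenv (a separate package, not a Ragas dependency) can load them before running the check above:

# pip install python-dotenv
import os
from dotenv import load_dotenv

load_dotenv()  # reads GOOGLE_API_KEY=... from .env into os.environ
assert os.environ.get("GOOGLE_API_KEY"), "GOOGLE_API_KEY still not set"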

Rate Limits

Gemini enforces rate limits. For production use, the LLM adapter handles retries and timeouts automatically. If you need fine-grained control, make sure your client is configured with appropriate timeouts at the HTTP client level.
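
For the LiteLLM proxy path from Option 2, the OpenAI client accepts timeout and retry settings directly in its constructor. A sketch:

from openai import OpenAI

# Client-level timeout and retry settings for the LiteLLM proxy path
client = OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000",
    timeout=30.0,    # seconds per request
    max_retries=2,   # automatic retries on transient failures
)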

Model Availability

If a model is unavailable:

  1. Check your region/quota in the Google Cloud Console
  2. Try another model from the supported list (see the fallback sketch below)
  3. Verify that your API key has access to the Generative AI API
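
As a sketch of point 2, you can probe each candidate model with a tiny request and use the first one that responds; the candidate list and probe prompt here are illustrative:

import google.generativeai as genai

CANDIDATES = ["gemini-2.0-flash", "gemini-1.5-flash", "gemini-1.5-pro"]

def first_available(models):
    """Return the first model name that answers a trivial probe request."""
    for name in models:
        try:
            genai.GenerativeModel(name).generate_content("ping")
            return name
        except Exception as exc:  # unavailable region, exhausted quota, etc.
            print(f"{name} unavailable: {exc}")
    raise RuntimeError("No Gemini model from the candidate list is available")

model_name = first_available(CANDIDATES)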

Migrating from Other Providers

From OpenAI

# Before: OpenAI-only
import os
from openai import OpenAI
from ragas.llms import llm_factory
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
llm = llm_factory("gpt-4o", client=client)

# After: Gemini with similar code pattern
import google.generativeai as genai
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

From Anthropic

# Before: Anthropic
import os
from anthropic import Anthropic
from ragas.llms import llm_factory
client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
llm = llm_factory("claude-3-sonnet", provider="anthropic", client=client)

# After: Gemini
import google.generativeai as genai
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))
client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=client)

Using Metric Collections (Modern Approach)

With the modern metric collections API, you create the LLM and embeddings explicitly:

import os
import google.generativeai as genai
from ragas.llms import llm_factory
from ragas.embeddings.base import embedding_factory
from ragas.metrics.collections import AnswerCorrectness, ContextPrecision

# Configure Google AI
genai.configure(api_key=os.environ.get("GOOGLE_API_KEY"))

# Create LLM
llm_client = genai.GenerativeModel("gemini-2.0-flash")
llm = llm_factory("gemini-2.0-flash", provider="google", client=llm_client)

# Create embeddings (pick ONE of the options below):
# Option 1: Auto-import (simplest)
embeddings = embedding_factory("google", model="text-embedding-004")

# Option 2: From the LLM client
embeddings = embedding_factory("google", client=llm.client, model="text-embedding-004")

# Create metrics with explicit LLM and embeddings
metrics = [
    ContextPrecision(llm=llm),  # LLM-only metric
    AnswerCorrectness(llm=llm, embeddings=embeddings),  # Needs both
]

# Use metrics with your evaluation workflow
# (await must run inside an async function, e.g. driven by asyncio.run)
result = await metrics[1].ascore(
    user_input="What is the capital of France?",
    response="Paris",
    reference="Paris is the capital of France."
)

Key differences from the legacy approach:

  • Legacy evaluate(): embeddings are created automatically from the LLM provider
  • Modern metric collections: you pass embeddings explicitly to each metric

This gives you more control and still works seamlessly with Gemini.

Supported Metrics

All Ragas metrics work with Gemini:

  • Answer Correctness
  • Answer Relevancy
  • Answer Similarity
  • Aspect Critique
  • Context Precision
  • Context Recall
  • Context Entity Recall
  • Faithfulness
  • NLI Eval
  • Response Relevancy

See the metrics reference for details.

Advanced: Custom Model Parameters

Pass custom parameters to Gemini:

llm = llm_factory(
    "gemini-2.0-flash",
    client=client,
    temperature=0.5,
    max_tokens=2048,
    top_p=0.9,
    top_k=40,
)

Resources