# How to Estimate the Cost and Usage of Evaluations and Testset Generation

When using LLMs for evaluation and testset generation, cost is an important factor. Ragas provides a few tools to help you with this.

## Implement a `TokenUsageParser`

By default, Ragas does not calculate token usage for `evaluate()`. This is because langchain LLMs do not return token usage information in a uniform way, so to get the usage data we need to implement a `TokenUsageParser`.

A `TokenUsageParser` is a function that parses the `LLMResult` or `ChatResult` returned by a langchain model's `generate_prompt()` function and outputs the `TokenUsage` that Ragas expects.

For example, here is how to use one of the predefined parsers for OpenAI.
```python
from langchain_openai.chat_models import ChatOpenAI
from langchain_core.prompt_values import StringPromptValue

gpt4o = ChatOpenAI(model="gpt-4o")
p = StringPromptValue(text="hai there")
llm_result = gpt4o.generate_prompt([p])

# lets import a parser for OpenAI
from ragas.cost import get_token_usage_for_openai

get_token_usage_for_openai(llm_result)
```
This returns a `TokenUsage` object with the input and output token counts for the call. You can define your own parser, or import one if it is already defined. If you want to suggest a parser for an LLM provider or contribute your own, check out this issue 🙂.
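For illustration, a parser is just a plain function from a langchain result to a `TokenUsage`. The sketch below is a minimal, hypothetical example: it assumes your provider reports usage in `llm_result.llm_output["token_usage"]` with `prompt_tokens`/`completion_tokens` keys, as OpenAI-compatible models commonly do; adjust the lookups to match what your provider actually returns.

```python
from typing import Union

from langchain_core.outputs import ChatResult, LLMResult
from ragas.cost import TokenUsage


def my_token_usage_parser(result: Union[LLMResult, ChatResult]) -> TokenUsage:
    # Assumption: usage is reported under llm_output["token_usage"],
    # which is common for OpenAI-compatible providers but not universal.
    usage = (result.llm_output or {}).get("token_usage", {})
    return TokenUsage(
        input_tokens=usage.get("prompt_tokens", 0),
        output_tokens=usage.get("completion_tokens", 0),
    )
```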
## Token Usage for Evaluation

Let's use the `get_token_usage_for_openai` parser to calculate the token usage of an evaluation.
```python
from ragas import EvaluationDataset
from datasets import load_dataset

dataset = load_dataset("explodinggradients/amnesty_qa", "english_v3")
eval_dataset = EvaluationDataset.from_hf_dataset(dataset["eval"])
```
You can pass the parser to the `evaluate()` function, and the usage will be calculated and returned in the `Result` object.
```python
from ragas import evaluate
from ragas.metrics import LLMContextRecall
from ragas.cost import get_token_usage_for_openai

result = evaluate(
    eval_dataset,
    metrics=[LLMContextRecall()],
    llm=gpt4o,
    token_usage_parser=get_token_usage_for_openai,
)
```
You can compute the cost of each run by passing the per-token costs to the `Result.total_cost()` function.

In this case, GPT-4o costs $5 per 1M input tokens and $15 per 1M output tokens.
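Putting that together (the exact figure you get will depend on your run):

```python
result.total_cost(
    cost_per_input_token=5 / 1e6,
    cost_per_output_token=15 / 1e6,
)
```

The returned cost is simply `input_tokens * cost_per_input_token + output_tokens * cost_per_output_token`; for example, 1M input tokens and 200k output tokens would come to $5 + $3 = $8.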
## Token Usage for Testset Generation

You can use the same parser for testset generation, but you need to pass the `token_usage_parser` to the `generate()` function. For now it only calculates the cost of the generation process, not the cost of the transforms.

As an example, let's load an existing knowledge graph and generate a testset from it. If you want to learn more about how testsets are generated, check out the testset generation guide.
```python
from ragas.testset.graph import KnowledgeGraph

# loading an existing KnowledgeGraph
# make sure to change the path to the location of the KnowledgeGraph file
kg = KnowledgeGraph.load("../../../experiments/scratchpad_kg.json")
kg
```

Output

```
KnowledgeGraph(nodes: 47, relationships: 109)
```
### Choose your LLM
=== "OpenAI"
Install the langchain-openai package
```bash
pip install langchain-openai
```
then ensure you have your OpenAI key ready and available in your environment
```python
import os
os.environ["OPENAI_API_KEY"] = "your-openai-key"
```
Wrapp the LLMs in `LangchainLLMWrapper` so that it can be used with ragas.
```python
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
generator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o"))
generator_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings())
```
=== "AWS"
Install the langchain-aws package
```bash
pip install langchain-aws
```
then you have to set your AWS credentials and configurations
```python
config = {
"credentials_profile_name": "your-profile-name", # E.g "default"
"region_name": "your-region-name", # E.g. "us-east-1"
"llm": "your-llm-model-id", # E.g "anthropic.claude-3-5-sonnet-20241022-v2:0"
"embeddings": "your-embedding-model-id", # E.g "amazon.titan-embed-text-v2:0"
"temperature": 0.4,
}
```
define you LLMs and wrap them in `LangchainLLMWrapper` so that it can be used with ragas.
```python
from langchain_aws import ChatBedrockConverse
from langchain_aws import BedrockEmbeddings
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
generator_llm = LangchainLLMWrapper(ChatBedrockConverse(
credentials_profile_name=config["credentials_profile_name"],
region_name=config["region_name"],
base_url=f"https://bedrock-runtime.{config['region_name']}.amazonaws.com",
model=config["llm"],
temperature=config["temperature"],
))
generator_embeddings = LangchainEmbeddingsWrapper(BedrockEmbeddings(
credentials_profile_name=config["credentials_profile_name"],
region_name=config["region_name"],
model_id=config["embeddings"],
))
```
If you want more information on how to use other AWS services, please refer to the [langchain-aws](https://python.langchain.ac.cn/docs/integrations/providers/aws/) documentation.
=== "Google Cloud"
Google offers two ways to access their models: Google AI and Google Cloud Vertex AI. Google AI requires just a Google account and API key, while Vertex AI requires a Google Cloud account with enterprise features.
First, install the required packages:
```bash
pip install langchain-google-genai langchain-google-vertexai
```
Then set up your credentials based on your chosen API:
For Google AI:
```python
import os
os.environ["GOOGLE_API_KEY"] = "your-google-ai-key" # From https://ai.google.dev/
```
For Vertex AI:
```python
# Ensure you have credentials configured (gcloud, workload identity, etc.)
# Or set service account JSON path:
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/service-account.json"
```
Define your configuration:
```python
config = {
"model": "gemini-1.5-pro", # or other model IDs
"temperature": 0.4,
"max_tokens": None,
"top_p": 0.8,
# For Vertex AI only:
"project": "your-project-id", # Required for Vertex AI
"location": "us-central1", # Required for Vertex AI
}
```
Initialize the LLM and wrap it for use with ragas:
```python
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
# Choose the appropriate import based on your API:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_google_vertexai import ChatVertexAI
# Initialize with Google AI Studio
generator_llm = LangchainLLMWrapper(ChatGoogleGenerativeAI(
model=config["model"],
temperature=config["temperature"],
max_tokens=config["max_tokens"],
top_p=config["top_p"],
))
# Or initialize with Vertex AI
generator_llm = LangchainLLMWrapper(ChatVertexAI(
model=config["model"],
temperature=config["temperature"],
max_tokens=config["max_tokens"],
top_p=config["top_p"],
project=config["project"],
location=config["location"],
))
```
You can optionally configure safety settings:
```python
from langchain_google_genai import HarmCategory, HarmBlockThreshold
safety_settings = {
HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
# Add other safety settings as needed
}
# Apply to your LLM initialization
generator_llm = LangchainLLMWrapper(ChatGoogleGenerativeAI(
model=config["model"],
temperature=config["temperature"],
safety_settings=safety_settings,
))
```
Initialize the embeddings and wrap them for use with ragas:
```python
# Google AI Studio Embeddings
from langchain_google_genai import GoogleGenerativeAIEmbeddings
generator_embeddings = LangchainEmbeddingsWrapper(GoogleGenerativeAIEmbeddings(
model="models/embedding-001", # Google's text embedding model
task_type="retrieval_document" # Optional: specify the task type
))
```
```python
# Vertex AI Embeddings
from langchain_google_vertexai import VertexAIEmbeddings
generator_embeddings = LangchainEmbeddingsWrapper(VertexAIEmbeddings(
model_name="textembedding-gecko@001", # or other available model
project=config["project"], # Your GCP project ID
location=config["location"] # Your GCP location
))
```
For more information on available models, features, and configurations, refer to: [Google AI documentation](https://ai.google.dev/docs)
- [Vertex AI documentation](https://cloud.google.com/vertex-ai/docs)
- [LangChain Google AI integration](https://python.langchain.ac.cn/docs/integrations/chat/google_generative_ai)
- [LangChain Vertex AI integration](https://python.langchain.ac.cn/docs/integrations/chat/google_vertex_ai)
=== "Azure"
Install the langchain-openai package
```bash
pip install langchain-openai
```
Ensure you have your Azure OpenAI key ready and available in your environment.
```python
import os
os.environ["AZURE_OPENAI_API_KEY"] = "your-azure-openai-key"
# other configuration
azure_config = {
"base_url": "", # your endpoint
"model_deployment": "", # your model deployment name
"model_name": "", # your model name
"embedding_deployment": "", # your embedding deployment name
"embedding_name": "", # your embedding name
}
```
Define your LLMs and wrap them in `LangchainLLMWrapper` so that it can be used with ragas.
```python
from langchain_openai import AzureChatOpenAI
from langchain_openai import AzureOpenAIEmbeddings
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
generator_llm = LangchainLLMWrapper(AzureChatOpenAI(
openai_api_version="2023-05-15",
azure_endpoint=azure_configs["base_url"],
azure_deployment=azure_configs["model_deployment"],
model=azure_configs["model_name"],
validate_base_url=False,
))
# init the embeddings for answer_relevancy, answer_correctness and answer_similarity
generator_embeddings = LangchainEmbeddingsWrapper(AzureOpenAIEmbeddings(
openai_api_version="2023-05-15",
azure_endpoint=azure_configs["base_url"],
azure_deployment=azure_configs["embedding_deployment"],
model=azure_configs["embedding_name"],
))
```
If you want more information on how to use other Azure services, please refer to the [langchain-azure](https://python.langchain.ac.cn/docs/integrations/chat/azure_chat_openai/) documentation.
=== "Others"
If you are using a different LLM provider and using Langchain to interact with it, you can wrap your LLM in `LangchainLLMWrapper` so that it can be used with ragas.
```python
from ragas.llms import LangchainLLMWrapper
generator_llm = LangchainLLMWrapper(your_llm_instance)
```
For a more detailed guide, checkout [the guide on customizing models](../../howtos/customizations/customize_models.md).
If you using LlamaIndex, you can use the `LlamaIndexLLMWrapper` to wrap your LLM so that it can be used with ragas.
```python
from ragas.llms import LlamaIndexLLMWrapper
generator_llm = LlamaIndexLLMWrapper(your_llm_instance)
```
For more information on how to use LlamaIndex, please refer to the [LlamaIndex Integration guide](./../../howtos/integrations/_llamaindex.md).
If your still not able use Ragas with your favorite LLM provider, please let us know by by commenting on this [issue](https://github.com/explodinggradients/ragas/issues/1617) and we'll add support for it 🙂.
Now we can generate a testset against this knowledge graph and compute the total cost of the generation process.

```python
from ragas.testset import TestsetGenerator
from ragas.llms import llm_factory

tg = TestsetGenerator(llm=llm_factory(), knowledge_graph=kg)

# generating a testset
testset = tg.generate(testset_size=10, token_usage_parser=get_token_usage_for_openai)

# total cost for the generation process
testset.total_cost(cost_per_input_token=5 / 1e6, cost_per_output_token=15 / 1e6)
```