LangSmith
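LangSmith is a platform for developing and deploying large language model (LLM) applications. It provides a framework for tracing, analyzing, and optimizing LLM workflows, which makes complex interactions within an application easier to manage.
This tutorial explains how to log traces of Ragas evaluations with LangSmith. Because Ragas is built on LangChain, you only need to set up LangSmith, and it records the traces automatically.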
Setting up LangSmith
To set up LangSmith, make sure the following environment variables are set (see the LangSmith documentation for more details).
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
export LANGCHAIN_API_KEY=<your-api-key>
export LANGCHAIN_PROJECT=<your-project> # Defaults to "default" if not set
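If you work in a notebook or cannot edit the shell environment, the same variables can be set from Python before the evaluation runs. The following is a minimal sketch: `configure_langsmith` is a hypothetical helper introduced for illustration, not part of Ragas or LangSmith, and the key and project values passed in the usage line are placeholders.

```python
import os
from typing import Optional


def configure_langsmith(api_key: str, project: Optional[str] = None) -> None:
    """Set the LangSmith environment variables listed above (hypothetical helper)."""
    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
    os.environ["LANGCHAIN_API_KEY"] = api_key
    if project is not None:
        # When no project is given, the variable is left unset so the default project is used.
        os.environ["LANGCHAIN_PROJECT"] = project


# Placeholder values: replace them with your own key and project name.
configure_langsmith(api_key="<your-api-key>", project="<your-project>")
```

Call the helper before running the evaluation so that the variables are already in place when tracing starts.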
Getting the dataset
When creating an evaluation dataset or evaluation instance, make sure the field names match the schema used in SingleTurnSample or MultiTurnSample.
from ragas import EvaluationDataset
dataset = [
    {
        "user_input": "Which CEO is widely recognized for democratizing AI education through platforms like Coursera?",
        "retrieved_contexts": [
            "Andrew Ng, CEO of Landing AI, is known for his pioneering work in deep learning and for democratizing AI education through Coursera."
        ],
        "response": "Andrew Ng is widely recognized for democratizing AI education through platforms like Coursera.",
        "reference": "Andrew Ng, CEO of Landing AI, is known for democratizing AI education through Coursera.",
    },
    {
        "user_input": "Who is Sam Altman?",
        "retrieved_contexts": [
            "Sam Altman, CEO of OpenAI, has advanced AI research and advocates for safe, beneficial AI technologies."
        ],
        "response": "Sam Altman is the CEO of OpenAI and advocates for safe, beneficial AI technologies.",
        "reference": "Sam Altman, CEO of OpenAI, has advanced AI research and advocates for safe AI.",
    },
    {
        "user_input": "Who is Demis Hassabis and how did he gain prominence?",
        "retrieved_contexts": [
            "Demis Hassabis, CEO of DeepMind, is known for developing systems like AlphaGo that master complex games."
        ],
        "response": "Demis Hassabis is the CEO of DeepMind, known for developing systems like AlphaGo.",
        "reference": "Demis Hassabis, CEO of DeepMind, is known for developing AlphaGo.",
    },
    {
        "user_input": "Who is the CEO of Google and Alphabet Inc., praised for leading innovation across Google's product ecosystem?",
        "retrieved_contexts": [
            "Sundar Pichai, CEO of Google and Alphabet Inc., leads innovation across Google's product ecosystem."
        ],
        "response": "Sundar Pichai is the CEO of Google and Alphabet Inc., praised for leading innovation across Google's product ecosystem.",
        "reference": "Sundar Pichai, CEO of Google and Alphabet Inc., leads innovation across Google's product ecosystem.",
    },
    {
        "user_input": "How did Arvind Krishna transform IBM?",
        "retrieved_contexts": [
            "Arvind Krishna, CEO of IBM, transformed the company by focusing on cloud computing and AI solutions."
        ],
        "response": "Arvind Krishna transformed IBM by focusing on cloud computing and AI solutions.",
        "reference": "Arvind Krishna, CEO of IBM, transformed the company through cloud computing and AI.",
    },
]
evaluation_dataset = EvaluationDataset.from_list(dataset)
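Because the field names have to match the sample schema, it can help to check the raw records before building the dataset. The sketch below is plain Python and does not call Ragas. `EXPECTED_KEYS` and `find_key_mismatches` are hypothetical names, and the key set is an assumption copied from the records above rather than the full SingleTurnSample schema.

```python
# Assumption: these are the four fields used by the records in this tutorial.
# The full SingleTurnSample schema may allow additional fields.
EXPECTED_KEYS = {"user_input", "retrieved_contexts", "response", "reference"}


def find_key_mismatches(records):
    """Return {index: (missing_keys, unexpected_keys)} for records that deviate."""
    problems = {}
    for index, record in enumerate(records):
        keys = set(record)
        missing = EXPECTED_KEYS - keys
        unexpected = keys - EXPECTED_KEYS
        if missing or unexpected:
            problems[index] = (sorted(missing), sorted(unexpected))
    return problems


# Illustrative records: the second one uses "question" instead of "user_input".
sample = [
    {"user_input": "q", "retrieved_contexts": ["c"], "response": "r", "reference": "ref"},
    {"question": "q", "retrieved_contexts": ["c"], "response": "r", "reference": "ref"},
]
print(find_key_mismatches(sample))  # {1: (['user_input'], ['question'])}
```

An empty result means every record uses exactly the expected field names.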
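Tracing ragas metrics
Run the Ragas evaluation on your dataset. The traces will appear in your LangSmith dashboard under the project name you specified, or under "default" if none was set.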
from ragas import evaluate
from ragas.llms import LangchainLLMWrapper
from langchain_openai import ChatOpenAI
from ragas.metrics import LLMContextRecall, Faithfulness, FactualCorrectness
llm = ChatOpenAI(model="gpt-4o-mini")
evaluator_llm = LangchainLLMWrapper(llm)
result = evaluate(
    dataset=evaluation_dataset,
    metrics=[LLMContextRecall(), Faithfulness(), FactualCorrectness()],
    llm=evaluator_llm,
)
result
Output
Evaluating: 0%| | 0/15 [00:00<?, ?it/s]
{'context_recall': 1.0000, 'faithfulness': 0.9333, 'factual_correctness': 0.8520}