LangSmith
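LangSmith is a tool for developing and deploying applications built on large language models (LLMs). It provides a framework for tracing, analyzing, and optimizing LLM workflows, which makes complex interactions inside an application easier to manage.
This tutorial explains how to log traces of Ragas evaluations with LangSmith. Because Ragas is built on LangChain, you only need to set up LangSmith, and the traces are recorded automatically.
Setting up LangSmith
To set up LangSmith, make sure the following environment variables are set (see the LangSmith documentation for more details):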
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
export LANGCHAIN_API_KEY=<your-api-key>
export LANGCHAIN_PROJECT=<your-project> # Defaults to "default" if not set
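If you work in a notebook or script rather than a shell, the same variables can be set from Python before the evaluation runs. The following is a minimal sketch, not an official setup routine; the `langsmith_settings` name is illustrative and the key and project values are placeholders that you must replace with your own.

```python
import os

# Same variables as the shell exports above; the values are placeholders.
langsmith_settings = {
    "LANGCHAIN_TRACING_V2": "true",
    "LANGCHAIN_ENDPOINT": "https://api.smith.langchain.com",
    "LANGCHAIN_API_KEY": "<your-api-key>",  # placeholder, replace with a real key
    "LANGCHAIN_PROJECT": "<your-project>",  # placeholder, optional
}

# Set these before running the evaluation so they are in place when tracing starts.
os.environ.update(langsmith_settings)
```

Avoid committing a real API key to source control; loading it from a secrets store or an untracked file is usually the safer choice.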
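Getting the dataset
When you create an evaluation dataset or individual evaluation samples, make sure the field names match the schema used by SingleTurnSample or MultiTurnSample.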
from ragas import EvaluationDataset

dataset = [
    {
        "user_input": "Which CEO is widely recognized for democratizing AI education through platforms like Coursera?",
        "retrieved_contexts": [
            "Andrew Ng, CEO of Landing AI, is known for his pioneering work in deep learning and for democratizing AI education through Coursera."
        ],
        "response": "Andrew Ng is widely recognized for democratizing AI education through platforms like Coursera.",
        "reference": "Andrew Ng, CEO of Landing AI, is known for democratizing AI education through Coursera.",
    },
    {
        "user_input": "Who is Sam Altman?",
        "retrieved_contexts": [
            "Sam Altman, CEO of OpenAI, has advanced AI research and advocates for safe, beneficial AI technologies."
        ],
        "response": "Sam Altman is the CEO of OpenAI and advocates for safe, beneficial AI technologies.",
        "reference": "Sam Altman, CEO of OpenAI, has advanced AI research and advocates for safe AI.",
    },
    {
        "user_input": "Who is Demis Hassabis and how did he gain prominence?",
        "retrieved_contexts": [
            "Demis Hassabis, CEO of DeepMind, is known for developing systems like AlphaGo that master complex games."
        ],
        "response": "Demis Hassabis is the CEO of DeepMind, known for developing systems like AlphaGo.",
        "reference": "Demis Hassabis, CEO of DeepMind, is known for developing AlphaGo.",
    },
    {
        "user_input": "Who is the CEO of Google and Alphabet Inc., praised for leading innovation across Google's product ecosystem?",
        "retrieved_contexts": [
            "Sundar Pichai, CEO of Google and Alphabet Inc., leads innovation across Google's product ecosystem."
        ],
        "response": "Sundar Pichai is the CEO of Google and Alphabet Inc., praised for leading innovation across Google's product ecosystem.",
        "reference": "Sundar Pichai, CEO of Google and Alphabet Inc., leads innovation across Google's product ecosystem.",
    },
    {
        "user_input": "How did Arvind Krishna transform IBM?",
        "retrieved_contexts": [
            "Arvind Krishna, CEO of IBM, transformed the company by focusing on cloud computing and AI solutions."
        ],
        "response": "Arvind Krishna transformed IBM by focusing on cloud computing and AI solutions.",
        "reference": "Arvind Krishna, CEO of IBM, transformed the company through cloud computing and AI.",
    },
]

evaluation_dataset = EvaluationDataset.from_list(dataset)
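Because a misnamed key is an easy mistake to make, it can help to check the raw dictionaries before handing them to `EvaluationDataset.from_list`. The sketch below is a hypothetical helper and not part of Ragas. It assumes that the four field names used in this tutorial's dataset are the ones you intend to supply, and it does not attempt to cover the full SingleTurnSample or MultiTurnSample schema.

```python
# Hypothetical pre-check, not a Ragas API: the expected fields are simply
# the ones used by the dataset in this tutorial.
EXPECTED_FIELDS = {"user_input", "retrieved_contexts", "response", "reference"}


def find_schema_problems(rows):
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for index, row in enumerate(rows):
        missing = EXPECTED_FIELDS - set(row)
        if missing:
            problems.append(f"row {index}: missing {sorted(missing)}")
        contexts = row.get("retrieved_contexts")
        if contexts is not None and not (
            isinstance(contexts, list)
            and all(isinstance(item, str) for item in contexts)
        ):
            problems.append(f"row {index}: retrieved_contexts must be a list of strings")
    return problems


sample_rows = [
    {
        "user_input": "Who is Sam Altman?",
        "retrieved_contexts": [
            "Sam Altman, CEO of OpenAI, has advanced AI research and advocates for safe, beneficial AI technologies."
        ],
        "response": "Sam Altman is the CEO of OpenAI and advocates for safe, beneficial AI technologies.",
        "reference": "Sam Altman, CEO of OpenAI, has advanced AI research and advocates for safe AI.",
    },
    {
        # Deliberately wrong for illustration: "question" instead of "user_input",
        # and a bare string instead of a list of strings.
        "question": "How did Arvind Krishna transform IBM?",
        "retrieved_contexts": "Arvind Krishna, CEO of IBM, transformed the company by focusing on cloud computing and AI solutions.",
        "response": "Arvind Krishna transformed IBM by focusing on cloud computing and AI solutions.",
        "reference": "Arvind Krishna, CEO of IBM, transformed the company through cloud computing and AI.",
    },
]

schema_problems = find_schema_problems(sample_rows)
for problem in schema_problems:
    print(problem)
```

Running the same check on the tutorial's `dataset` list should report no problems, since every row there carries all four fields.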
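Tracing ragas metrics
Run the Ragas evaluation on your dataset. The traces appear in your LangSmith dashboard under the project name you specified, or under the "default" project if none was set.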
from ragas import evaluate
from ragas.llms import LangchainLLMWrapper
from langchain_openai import ChatOpenAI
from ragas.metrics import LLMContextRecall, Faithfulness, FactualCorrectness

llm = ChatOpenAI(model="gpt-4o-mini")
evaluator_llm = LangchainLLMWrapper(llm)

result = evaluate(
    dataset=evaluation_dataset,
    metrics=[LLMContextRecall(), Faithfulness(), FactualCorrectness()],
    llm=evaluator_llm,
)
result
Output
Evaluating: 0%| | 0/15 [00:00<?, ?it/s]
{'context_recall': 1.0000, 'faithfulness': 0.9333, 'factual_correctness': 0.8520}
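The progress bar total of 15 is consistent with five samples scored on three metrics each. As a rough sketch of how the aggregate scores might be used afterwards, the snippet below copies the numbers from the output above into a plain dictionary and lists the metrics that fall under a cutoff. The `MIN_SCORE` value of 0.9 is an arbitrary assumption for illustration, not a Ragas or LangSmith recommendation.

```python
# Aggregate scores copied from the output above.
scores = {
    "context_recall": 1.0000,
    "faithfulness": 0.9333,
    "factual_correctness": 0.8520,
}

# Hypothetical cutoff chosen only for this example.
MIN_SCORE = 0.9

# Collect the metrics whose score is below the cutoff.
below_cutoff = {name: value for name, value in scores.items() if value < MIN_SCORE}
print(below_cutoff)  # prints {'factual_correctness': 0.852}
```

A metric flagged this way is a reasonable starting point for opening the corresponding traces in LangSmith and inspecting the individual samples.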
