LlamaIndex

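LlamaIndex is a data framework for LLM applications that ingests, structures, and accesses private or domain-specific data. It makes connecting LLMs to your own data straightforward. To find the best configuration of LlamaIndex for your data, however, you need an objective measure of performance. This is where Ragas comes in: it helps you evaluate your QueryEngine and gives you the confidence to tune its configuration for the highest score.

This guide assumes you are familiar with the LlamaIndex framework.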

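Building the Testset

You will need a testset to evaluate your QueryEngine. You can either build one yourself or use the Testset Generator module in Ragas to get started with a small synthetic one.

Let's see how that works with LlamaIndex.

Load the documents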

from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader("./nyc_wikipedia").load_data()

Now let's initialize the TestsetGenerator object with a generator LLM and an embedding model.

from ragas.testset import TestsetGenerator

from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# generator with openai models
generator_llm = OpenAI(model="gpt-4o")
embeddings = OpenAIEmbedding(model="text-embedding-3-large")

generator = TestsetGenerator.from_llama_index(
    llm=generator_llm,
    embedding_model=embeddings,
)

Now you are all set to generate the dataset.

# generate testset
testset = generator.generate_with_llamaindex_docs(
    documents,
    testset_size=5,
)
df = testset.to_pandas()
df.head()
  | user_input | reference_contexts | reference | synthesizer_name
0 | Cud yu pleese explane the role of New York Cit... | [New York, often called New York City or NYC, ... | New York City serves as the geographical and d... | single_hop_specifc_query_synthesizer
1 | So like, what was New York City called before ... | [History == === Early history === In the pre-C... | Before it was called New York, the area was kn... | single_hop_specifc_query_synthesizer
2 | what happen in new york with slavery and how i... | [and rechristened it "New Orange" after Willia... | In the early 18th century, New York became a c... | single_hop_specifc_query_synthesizer
3 | What historical significance does Long Island ... | [<1-hop>\n\nHistory == === Early history === I... | Long Island holds historical significance in t... | multi_hop_specific_query_synthesizer
4 | What role does the Staten Island Ferry play in... | [<1-hop>\n\nto start service in 2017; this wou... | The Staten Island Ferry plays a significant ro... | multi_hop_specific_query_synthesizer
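Before moving on, it can help to check how the generated questions are split across synthesizers. The sketch below is a minimal illustration using only the standard library. The `rows` list is hand-copied from the `synthesizer_name` column of the output above; in practice it could presumably be obtained with `df.to_dict("records")`. The helper name `count_by_synthesizer` is hypothetical and not part of Ragas.

```python
from collections import Counter


def count_by_synthesizer(rows):
    """Count how many test questions each synthesizer produced."""
    return Counter(row["synthesizer_name"] for row in rows)


# Hand-copied from the synthesizer_name column shown above (assumption:
# real code would use the records of testset.to_pandas() instead).
rows = [
    {"synthesizer_name": "single_hop_specifc_query_synthesizer"},
    {"synthesizer_name": "single_hop_specifc_query_synthesizer"},
    {"synthesizer_name": "single_hop_specifc_query_synthesizer"},
    {"synthesizer_name": "multi_hop_specific_query_synthesizer"},
    {"synthesizer_name": "multi_hop_specific_query_synthesizer"},
]

counts = count_by_synthesizer(rows)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

A heavily skewed split may be a hint to generate a larger testset before drawing conclusions from the scores.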

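With a test dataset for our QueryEngine in hand, let's now build the engine and evaluate it.

Building the QueryEngine

To start, let's build a VectorStoreIndex over the Wikipedia page for New York City as an example and use Ragas to evaluate it.

Since we have already loaded the dataset into documents, let's use that.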

# build query engine
from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex.from_documents(documents)

query_engine = vector_index.as_query_engine()

Let's try a sample question from the generated testset to see if it works.

# convert it to pandas dataset
df = testset.to_pandas()
df["user_input"][0]
'Cud yu pleese explane the role of New York City within the Northeast megalopolis, and how it contributes to the cultural and economic vibrancy of the region?'
response_vector = query_engine.query(df["user_input"][0])

print(response_vector)
New York City serves as a key hub within the Northeast megalopolis, playing a significant role in enhancing the cultural and economic vibrancy of the region. Its status as a global center of creativity, entrepreneurship, and cultural diversity contributes to the overall dynamism of the area. The city's renowned arts scene, including Broadway theatre and numerous cultural institutions, attracts artists and audiences from around the world, enriching the cultural landscape of the Northeast megalopolis. Economically, New York City's position as a leading financial and fintech center, home to major stock exchanges and a bustling real estate market, bolsters the region's economic strength and influence. Additionally, the city's diverse culinary scene, influenced by its immigrant history, adds to the cultural richness of the region, making New York City a vital component of the Northeast megalopolis's cultural and economic tapestry.
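To spot-check more than one question, the same call can be wrapped in a loop. This is a sketch, not part of the Ragas or LlamaIndex API: `collect_responses`, `query_fn`, and `stub_query` are hypothetical names, the questions are placeholders, and a stub stands in for the query engine so the snippet runs without an API key.

```python
def collect_responses(questions, query_fn):
    """Run every question through query_fn and pair it with the answer text."""
    results = []
    for question in questions:
        results.append({"user_input": question, "response": query_fn(question)})
    return results


def stub_query(question):
    """Stand-in for a real query engine (assumption, for illustration only)."""
    return f"(stub answer to: {question})"


# Placeholder questions; real ones would come from df["user_input"].
questions = ["first placeholder question", "second placeholder question"]

responses = collect_responses(questions, stub_query)
for item in responses:
    print(item["user_input"], "->", item["response"])
```

With the real engine, `query_fn` could be something like `lambda q: str(query_engine.query(q))`, since printing the response object above yields its text.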

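Evaluating the QueryEngine

Now that we have a QueryEngine for the VectorStoreIndex, we can use the llama_index integration in Ragas to evaluate it.

To run an evaluation with Ragas and LlamaIndex you need 3 things:

  1. LlamaIndex QueryEngine: the object we will be evaluating
  2. Metrics: Ragas defines a set of metrics that measure different aspects of the QueryEngine. The available metrics and their meanings are listed in the Ragas documentation
  3. Questions: a list of questions that Ragas will use to test the QueryEngine.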

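First, let's settle on the questions. Ideally you should use questions you actually see in production, so that the distribution of questions used for evaluation matches the distribution seen in production. This ensures that the scores reflect the performance you will observe in production. To get started, though, we will use the questions from the testset generated above.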

Now let's import the metrics we will use for the evaluation.

# import metrics
from ragas.metrics import (
    Faithfulness,
    AnswerRelevancy,
    ContextPrecision,
    ContextRecall,
)

# init metrics with evaluator LLM
from ragas.llms import LlamaIndexLLMWrapper

evaluator_llm = LlamaIndexLLMWrapper(OpenAI(model="gpt-4o"))
metrics = [
    Faithfulness(llm=evaluator_llm),
    AnswerRelevancy(llm=evaluator_llm),
    ContextPrecision(llm=evaluator_llm),
    ContextRecall(llm=evaluator_llm),
]

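The evaluate() function expects an evaluation dataset whose samples carry the "user_input" and "reference" fields that the metrics need. You can easily convert the testset to that format.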

# convert to Ragas Evaluation Dataset
ragas_dataset = testset.to_evaluation_dataset()
ragas_dataset
EvaluationDataset(features=['user_input', 'reference_contexts', 'reference'], len=6)
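Because the metrics depend on the fields listed in the output above, a quick sanity check before running the evaluation may save a wasted run. The sketch below works on plain dictionaries (for example, the records of `testset.to_pandas()`). `find_incomplete_samples` is a hypothetical helper, not a Ragas function, and the sample records are made up for illustration.

```python
REQUIRED_FIELDS = ("user_input", "reference_contexts", "reference")


def find_incomplete_samples(records, required=REQUIRED_FIELDS):
    """Return (index, missing_fields) for records with an absent or empty field."""
    problems = []
    for i, record in enumerate(records):
        missing = [field for field in required if not record.get(field)]
        if missing:
            problems.append((i, missing))
    return problems


# Made-up records: the second one has an empty context list and reference.
records = [
    {
        "user_input": "placeholder question",
        "reference_contexts": ["placeholder context"],
        "reference": "placeholder answer",
    },
    {
        "user_input": "another placeholder question",
        "reference_contexts": [],
        "reference": "",
    },
]

problems = find_incomplete_samples(records)
print(problems)
```

An empty result means every record has all three fields populated.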

Finally, let's run the evaluation.

from ragas.integrations.llama_index import evaluate

result = evaluate(
    query_engine=query_engine,
    metrics=metrics,
    dataset=ragas_dataset,
)
# final scores
print(result)
{'faithfulness': 0.7454, 'answer_relevancy': 0.9348, 'context_precision': 0.6667, 'context_recall': 0.4667}

You can convert the result into a pandas DataFrame for further analysis.

result.to_pandas()
  | user_input | retrieved_contexts | reference_contexts | response | reference | faithfulness | answer_relevancy | context_precision | context_recall
0 | Cud yu pleese explane the role of New York Cit... | [and its ideals of liberty and peace. In the 2... | [New York, often called New York City or NYC, ... | New York City plays a significant role within ... | New York City serves as the geographical and d... | 0.615385 | 0.918217 | 0.0 | 0.0
1 | So like, what was New York City called before ... | [New York City is the headquarters of the glob... | [History == === Early history === In the pre-C... | New York City was named New Amsterdam before i... | Before it was called New York, the area was kn... | 1.000000 | 0.967821 | 1.0 | 1.0
2 | what happen in new york with slavery and how i... | [=== Province of New York and slavery ===\n\nI... | [and rechristened it "New Orange" after Willia... | Slavery became a significant part of New York'... | In the early 18th century, New York became a c... | 1.000000 | 0.919264 | 1.0 | 1.0
3 | What historical significance does Long Island ... | [==== River crossings ====\n\nNew York City is... | [<1-hop>\n\nHistory == === Early history === I... | Long Island played a significant role in the e... | Long Island holds historical significance in t... | 0.500000 | 0.931895 | 0.0 | 0.0
4 | What role does the Staten Island Ferry play in... | [==== Buses ====\n\nNew York City's public bus... | [<1-hop>\n\nto start service in 2017; this wou... | The Staten Island Ferry serves as a vital mode... | The Staten Island Ferry plays a significant ro... | 0.500000 | 0.936920 | 1.0 | 0.0
5 | How does Central Park's role as a cultural and... | [==== State parks ====\n\nThere are seven stat... | [<1-hop>\n\nCity has over 28,000 acres (110 km... | Central Park's role as a cultural and historic... | Central Park, located in middle-upper Manhatta... | 0.857143 | 0.934841 | 1.0 | 0.8
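As one example of such analysis, the sketch below averages the per-sample scores and lists the samples that fall below a cutoff. The numbers are copied by hand from the table above; the helper names `column_means` and `rows_below` and the 0.5 threshold are assumptions chosen for illustration, and real code would more likely work on the DataFrame directly.

```python
from statistics import mean

METRICS = ("faithfulness", "answer_relevancy", "context_precision", "context_recall")

# Per-sample scores copied from the table above, one tuple per row in METRICS order.
scores = [
    (0.615385, 0.918217, 0.0, 0.0),
    (1.000000, 0.967821, 1.0, 1.0),
    (1.000000, 0.919264, 1.0, 1.0),
    (0.500000, 0.931895, 0.0, 0.0),
    (0.500000, 0.936920, 1.0, 0.0),
    (0.857143, 0.934841, 1.0, 0.8),
]


def column_means(rows, names=METRICS):
    """Average each metric over all samples, rounded to four decimals."""
    return {name: round(mean(row[i] for row in rows), 4) for i, name in enumerate(names)}


def rows_below(rows, metric, threshold, names=METRICS):
    """Return the indices of samples whose score for `metric` is below `threshold`."""
    idx = names.index(metric)
    return [i for i, row in enumerate(rows) if row[idx] < threshold]


averages = column_means(scores)
weak_recall = rows_below(scores, "context_recall", 0.5)  # 0.5 is an arbitrary cutoff

print(averages)
print(weak_recall)
```

For this run the averages agree with the aggregate scores printed by evaluate(), which suggests the headline numbers are plain per-sample means. Samples 0, 3, and 4 have a context_recall of 0.0, so their retrieved contexts are a natural place to start when tuning the retriever.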