
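Generating a Synthetic Test Set for RAG-Based Question Answering with Ragas

Overview

In this tutorial, we explore the test set generation module in Ragas to create a synthetic test set for a Retrieval-Augmented Generation (RAG) based question-answering bot. Our goal is to design a Ragas Airline Assistant that can answer customer queries on a range of topics, including: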

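  • Flight booking
  • Flight changes and cancellations
  • Baggage policies
  • Viewing reservations
  • Flight delays
  • In-flight services
  • Special assistance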

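To make the synthetic dataset as realistic and diverse as possible, we create several customer personas. Each persona represents a distinct traveler type and behavior, which helps us build a comprehensive and representative test set. This lets us evaluate the effectiveness and robustness of the RAG model thoroughly.

Let's get started!

Download and Load Documents

Run the command below to download the dummy Ragas Airline dataset, then load the documents with LangChain.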

! git clone https://hugging-face.cn/datasets/explodinggradients/ragas-airline-dataset
from langchain_community.document_loaders import DirectoryLoader

path = "ragas-airline-dataset"
loader = DirectoryLoader(path, glob="**/*.md")
docs = loader.load()
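Before loading, it can help to confirm which files the `**/*.md` pattern is expected to pick up. The sketch below uses only the Python standard library and is not part of Ragas or LangChain. The helper name `list_markdown_files` is made up for illustration, and the example assumes the dataset was cloned into `ragas-airline-dataset` as above.

```python
from pathlib import Path


def list_markdown_files(root: str) -> list[str]:
    """Return sorted, root-relative paths of all .md files under root."""
    base = Path(root)
    # rglob("*.md") walks subdirectories, similar in spirit to glob="**/*.md".
    return sorted(
        p.relative_to(base).as_posix() for p in base.rglob("*.md") if p.is_file()
    )


if __name__ == "__main__":
    dataset_dir = Path("ragas-airline-dataset")  # assumed clone location
    if dataset_dir.is_dir():
        for name in list_markdown_files(str(dataset_dir)):
            print(name)
```

If the number of files printed differs from `len(docs)`, the loader pattern or the clone location is worth a second look.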

Set Up the LLM and Embedding Model

from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_openai import ChatOpenAI, OpenAIEmbeddings


generator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o-mini"))
generator_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings(model="text-embedding-3-small"))
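`ChatOpenAI` and `OpenAIEmbeddings` read the API key from the `OPENAI_API_KEY` environment variable by default. A small pre-flight check, sketched here with a hypothetical helper named `require_env`, can fail early with a clear message instead of at the first model call. This is an optional convenience, not something Ragas requires.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of an environment variable or raise a clear error."""
    # `env` is injectable for testing; by default the real environment is used.
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before creating the models.")
    return value
```

Calling `require_env("OPENAI_API_KEY")` before constructing the wrappers is one way to use it.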

Create the Knowledge Graph

Create a base knowledge graph from the documents.

from ragas.testset.graph import KnowledgeGraph
from ragas.testset.graph import Node, NodeType


kg = KnowledgeGraph()

for doc in docs:
    kg.nodes.append(
        Node(
            type=NodeType.DOCUMENT,
            properties={"page_content": doc.page_content, "document_metadata": doc.metadata}
        )
    )

kg
Output
KnowledgeGraph(nodes: 8, relationships: 0)

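Set Up the Transforms

In this tutorial, we create a single-hop query dataset from a knowledge graph built only from nodes. To enrich the graph and improve query generation, we apply three key transforms:

  • Headline extraction: Uses a language model to extract clear section titles from each document (for example, "Airline Initiated Cancellations" from flight cancellations.md). These titles isolate specific topics and give direct context for generating focused questions.
  • Headline splitting: Splits each document into manageable subsections based on the extracted headlines. This increases the number of nodes and leads to more granular, context-specific queries.
  • Keyphrase extraction: Identifies core thematic keyphrases (such as key seating information) that act as semantic seed points and enrich the diversity and relevance of the generated queries.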

from ragas.testset.transforms import apply_transforms
from ragas.testset.transforms import HeadlinesExtractor, HeadlineSplitter, KeyphrasesExtractor

headline_extractor = HeadlinesExtractor(llm=generator_llm, max_num=20)
headline_splitter = HeadlineSplitter(max_tokens=1500)
keyphrase_extractor = KeyphrasesExtractor(llm=generator_llm)

transforms = [
    headline_extractor,
    headline_splitter,
    keyphrase_extractor
]

apply_transforms(kg, transforms=transforms)
Applying HeadlinesExtractor: 100%|██████████| 8/8 [00:00<?, ?it/s]
Applying HeadlineSplitter: 100%|██████████| 8/8 [00:00<?, ?it/s]
Applying KeyphrasesExtractor: 100%|██████████| 25/25 [00:00<?, ?it/s]
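To make the splitting step more concrete, here is a plain-Python sketch of the idea: cut a document wherever a known headline occurs. It is an illustration only and not the `HeadlineSplitter` implementation; the function name `split_by_headlines` is hypothetical, and the sketch ignores chunk-size limits such as the `max_tokens` argument used above. Splitting is consistent with the progress output, where the keyphrase step processes 25 items rather than the original 8.

```python
def split_by_headlines(text: str, headlines: list[str]) -> list[str]:
    """Split text into chunks, each starting at one of the given headlines."""
    # Locate the first occurrence of each headline; skip headlines not found.
    positions = sorted({pos for pos in (text.find(h) for h in headlines) if pos != -1})
    if not positions:
        return [text]
    # Keep any preamble before the first headline as its own chunk.
    boundaries = ([0] if positions[0] > 0 else []) + positions + [len(text)]
    chunks = [text[a:b].strip() for a, b in zip(boundaries, boundaries[1:])]
    return [chunk for chunk in chunks if chunk]
```

Each resulting chunk would correspond to one finer-grained node, which is what lets later steps generate questions about a single section rather than a whole file.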

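Configure Personas for Query Generation

Personas provide context and perspective so that the generated queries are natural, user-specific, and diverse. By tailoring queries to different user viewpoints, the test set covers a wide range of scenarios: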

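  • First Time Flier: Generates queries that ask for detailed, step-by-step guidance, aimed at newcomers who need clear instructions.
  • Frequent Flier: Generates concise, efficiency-focused queries from seasoned travelers.
  • Angry Business Class Flier: Generates queries with a critical, urgent tone that reflect high expectations and a demand for immediate resolution.

from ragas.testset.persona import Persona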

persona_first_time_flier = Persona(
    name="First Time Flier",
    role_description="Is flying for the first time and may feel anxious. Needs clear guidance on flight procedures, safety protocols, and what to expect throughout the journey.",
)

persona_frequent_flier = Persona(
    name="Frequent Flier",
    role_description="Travels regularly and values efficiency and comfort. Interested in loyalty programs, express services, and a seamless travel experience.",
)

persona_angry_business_flier = Persona(
    name="Angry Business Class Flier",
    role_description="Demands top-tier service and is easily irritated by any delays or issues. Expects immediate resolutions and is quick to express frustration if standards are not met.",
)

personas = [persona_first_time_flier, persona_frequent_flier, persona_angry_business_flier]

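Generate Queries with Synthesizers

Synthesizers turn the enriched nodes and personas into queries. They select a node property (for example, "entities" or "keyphrases"), pair it with a persona, a style, and a query length, and then use an LLM to generate a query-answer pair grounded in the node's content.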
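The pairing step can be pictured with a small standard-library sketch. Everything in it is an assumption made for illustration: the `Scenario` class, the `sample_scenarios` function, and the style and length labels are invented here and do not mirror the internals of Ragas.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One planned query: what to ask about, as whom, and in what form."""

    term: str     # a value taken from a node property, e.g. a keyphrase
    persona: str  # persona name
    style: str    # illustrative style label
    length: str   # illustrative length label


def sample_scenarios(terms, personas, n, seed=0):
    """Randomly pair property values with personas, styles, and lengths."""
    rng = random.Random(seed)  # seeded so repeated runs give the same plan
    styles = ["perfect grammar", "misspelled", "web-search-like"]  # assumed labels
    lengths = ["short", "medium", "long"]  # assumed labels
    return [
        Scenario(
            term=rng.choice(terms),
            persona=rng.choice(personas),
            style=rng.choice(styles),
            length=rng.choice(lengths),
        )
        for _ in range(n)
    ]
```

In this picture, each scenario would then be handed to an LLM together with the node's text to produce the actual question and its reference answer.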

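Two instances of SingleHopSpecificQuerySynthesizer define the query distribution:

  • Headline-based synthesizer: Generates queries from the extracted document headlines, producing structured questions that refer to specific sections.
  • Keyphrase-based synthesizer: Forms queries around key concepts, producing broader, thematic questions.

Both synthesizers are weighted equally (0.5 each), which gives a balanced mix of specific and conceptual queries and increases the diversity of the test set.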

from ragas.testset.synthesizers.single_hop.specific import (
    SingleHopSpecificQuerySynthesizer,
)

query_distibution = [
    (
        SingleHopSpecificQuerySynthesizer(llm=generator_llm, property_name="headlines"),
        0.5,
    ),
    (
        SingleHopSpecificQuerySynthesizer(
            llm=generator_llm, property_name="keyphrases"
        ),
        0.5,
    ),
]
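As a rough mental model, the weights decide how many of the requested samples each synthesizer contributes. The sketch below shows one common way to turn weights into whole-number counts (largest-remainder rounding). The function name `allocate_samples` is hypothetical, and Ragas may allocate samples differently internally.

```python
def allocate_samples(weights: list[float], total: int) -> list[int]:
    """Split `total` samples across synthesizers in proportion to `weights`.

    Uses largest-remainder rounding so the counts always sum to `total`.
    """
    if total < 0 or not weights or any(w < 0 for w in weights):
        raise ValueError("need a non-negative total and non-negative weights")
    weight_sum = sum(weights)
    if weight_sum <= 0:
        raise ValueError("weights must not all be zero")
    exact = [w / weight_sum * total for w in weights]
    counts = [int(x) for x in exact]  # floor, since all values are non-negative
    # Hand leftover samples to the entries with the largest fractional parts.
    leftover = total - sum(counts)
    order = sorted(
        range(len(weights)), key=lambda i: exact[i] - counts[i], reverse=True
    )
    for i in order[:leftover]:
        counts[i] += 1
    return counts
```

With two equal weights and ten samples, this scheme gives five samples to each synthesizer.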

Generate the Test Set

from ragas.testset import TestsetGenerator

generator = TestsetGenerator(
    llm=generator_llm,
    embedding_model=generator_embeddings,
    knowledge_graph=kg,
    persona_list=personas,
)

Now we can generate the test set.

testset = generator.generate(testset_size=10, query_distribution=query_distibution)
testset.to_pandas()
Generating Scenarios: 100%|██████████| 2/2 [00:00<?, ?it/s]
Generating Samples: 100%|██████████| 10/10 [00:00<?, ?it/s]
Output

| | user_input | reference_contexts | reference | synthesizer_name |
|---|---|---|---|---|
| 0 | Wut do I do if my baggage is Delayed, Lost, or... | [Baggage Policies\n\nThis section provides a d... | If your baggage is delayed, lost, or damaged, ... | single_hop_specifc_query_synthesizer |
| 1 | Wht asistance is provided by the airline durin... | [Flight Delays\n\nFlight delays can be caused ... | Depending on the length of the delay, Ragas Ai... | single_hop_specifc_query_synthesizer |
| 2 | What is Step 1: Check Fare Rules in the contex... | [Flight Cancellations\n\nFlight cancellations ... | Step 1: Check Fare Rules involves logging into... | single_hop_specifc_query_synthesizer |
| 3 | How can I access my booking online with Ragas ... | [Managing Reservations\n\nManaging your reserv... | To access your booking online with Ragas Airli... | single_hop_specifc_query_synthesizer |
| 4 | What assistance does Ragas Airlines provide fo... | [Special Assistance\n\nRagas Airlines provides... | Ragas Airlines provides special assistance ser... | single_hop_specifc_query_synthesizer |
| 5 | What steps should I take if my baggage is dela... | [Baggage Policies This section provides a deta... | If your baggage is delayed, lost, or damaged w... | single_hop_specifc_query_synthesizer |
| 6 | How can I resubmit the claim for my baggage is... | [Potential Issues and Resolutions for Baggage ... | To resubmit the claim for your baggage issue, ... | single_hop_specifc_query_synthesizer |
| 7 | Wut are the main causes of flight delays and h... | [Flight Delays Flight delays can be caused by ... | Flight delays can be caused by weather conditi... | single_hop_specifc_query_synthesizer |
| 8 | How can I request reimbursement for additional... | [2. Additional Expenses Incurred Due to Delay ... | To request reimbursement for additional expens... | single_hop_specifc_query_synthesizer |
| 9 | What are passenger-initiated cancelations? | [Flight Cancellations Flight cancellations can... | Passenger-initiated cancellations occur when a... | single_hop_specifc_query_synthesizer |
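Before using a generated test set for evaluation, a few basic checks can catch empty fields or repeated questions. The sketch below works on a plain list of dictionaries, for example the result of `testset.to_pandas().to_dict("records")`. The function name `summarize_testset` is hypothetical, and the sketch assumes each row carries the four columns shown in the output, with `reference_contexts` stored as a list.

```python
from collections import Counter


def summarize_testset(rows: list[dict]) -> dict:
    """Report basic quality signals for generated test rows."""
    required = ("user_input", "reference_contexts", "reference", "synthesizer_name")
    # A row is incomplete if any required field is missing or empty.
    incomplete = [
        i for i, row in enumerate(rows) if any(not row.get(k) for k in required)
    ]
    # Case-insensitive comparison catches near-identical repeated questions.
    seen = Counter(str(row.get("user_input", "")).strip().lower() for row in rows)
    duplicates = sorted(q for q, n in seen.items() if q and n > 1)
    per_synthesizer = Counter(row.get("synthesizer_name") for row in rows)
    return {
        "total": len(rows),
        "incomplete_rows": incomplete,
        "duplicate_questions": duplicates,
        "per_synthesizer": dict(per_synthesizer),
    }
```

These checks are purely structural; they say nothing about whether a reference answer is actually correct, which still needs human or model-based review.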

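Final Thoughts

In this tutorial, we explored test set generation with the Ragas library, focusing on single-hop queries. In an upcoming tutorial, we will dive into multi-hop queries, extending these concepts to richer test set scenarios.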