MDER-DR: Multi-Hop Question Answering with Entity-Centric Summaries
The MDER-DR framework enhances multi-hop QA with entity-centric summaries, achieving up to a 66% improvement over standard RAG baselines.
Key Findings
Methodology
The MDER-DR framework combines the Map-Disambiguate-Enrich-Reduce (MDER) and Decompose-Resolve (DR) methods. MDER generates entity-level summaries during KG construction, avoiding explicit graph traversal during QA retrieval. DR decomposes user queries into resolvable triples and grounds them in the KG through iterative reasoning. Together, the two components make MDER-DR robust to sparse, incomplete, and complex relational data.
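The indexing idea can be illustrated with a minimal sketch. This is a hypothetical toy, not the paper's implementation: the `EntityIndex` class and its methods are invented here, and where MDER uses an LLM to enrich and compress descriptions, this sketch simply concatenates strings. The point it shows is that once context-derived triple descriptions are attached to both endpoint entities, a bridging entity's summary already contains every hop that passes through it.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the MDER indexing idea: each entity accumulates
# context-derived triple descriptions, so later retrieval can match on
# entity summaries instead of traversing graph edges.

@dataclass
class EntityIndex:
    summaries: dict = field(default_factory=dict)  # entity -> list of descriptions

    def add_triple(self, subj: str, rel: str, obj: str, context: str) -> None:
        # "Map" + "Enrich": attach a context-derived description to both endpoints
        desc = f"{subj} {rel} {obj} ({context})"
        for entity in (subj, obj):
            self.summaries.setdefault(entity, []).append(desc)

    def summary(self, entity: str) -> str:
        # "Reduce": in the paper an LLM compresses these; here we just join them
        return " ".join(self.summaries.get(entity, []))

index = EntityIndex()
index.add_triple("Marie Curie", "born in", "Warsaw", "1867")
index.add_triple("Warsaw", "capital of", "Poland", "present day")

# Multi-hop lookup without edge traversal: the bridging entity's summary
# already covers both hops.
print(index.summary("Warsaw"))
```

At QA time, a retriever only needs to score these summaries against the question; no edge of the graph is walked during inference.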
Key Results
- On standard and domain-specific benchmarks, MDER-DR achieved up to a 66% improvement over traditional RAG baselines; on WikiQA, its Soft EM score was 0.800, versus 0.538 for the strongest baseline, Vector-RAG.
- On HotpotQA, MDER-DR led the LLM-as-a-Judge evaluation with a score of 0.515, a marked improvement over other RAG architectures.
- On the BenchEE dataset, MDER-DR scored highly in human evaluations by domain experts, demonstrating strong handling of expert-level content.
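For context on the Soft EM numbers above: Soft EM commonly counts a prediction as correct if the normalized gold answer appears inside the normalized prediction, relaxing strict string equality. A minimal sketch of that common definition follows (the paper's exact normalization may differ):

```python
import re
import string

def normalize(text: str) -> str:
    # Lowercase, strip punctuation and articles, collapse whitespace
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def soft_em(prediction: str, gold: str) -> int:
    # Soft EM: credit if the normalized gold answer is contained in the prediction
    return int(normalize(gold) in normalize(prediction))

print(soft_em("The answer is Warsaw, Poland.", "Warsaw"))  # 1
print(soft_em("It is Krakow.", "Warsaw"))                  # 0
```

A dataset-level Soft EM score such as 0.800 is then just the mean of this 0/1 credit over all questions.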
Significance
The MDER-DR framework has significant implications for both academia and industry. It addresses long-standing challenges in multi-hop QA over KGs, particularly regarding information loss and complex relational reasoning. By compressing relational information during indexing, MDER-DR not only improves retrieval efficiency but also enhances cross-lingual robustness. This approach opens new possibilities for multilingual and multi-domain QA systems.
Technical Contribution
MDER-DR's central technical contribution is compressing multi-hop relational information into entity summaries during indexing, so that no explicit graph traversal is needed at inference time. This departs from existing SOTA methods, which typically walk graph edges at query time, and opens new theoretical and engineering possibilities. By using entity-centric summaries, MDER-DR achieves efficient retrieval and reasoning over complex relations.
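The efficiency claim can be made concrete with a toy retrieval step. This is an illustrative sketch, not the paper's retriever: real systems would score dense embeddings, while bag-of-words cosine similarity keeps the example self-contained. What it shows is that once multi-hop context lives inside entity summaries, inference reduces to scoring summaries against the query.

```python
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity over term-count vectors
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Toy entity summaries standing in for MDER's compressed output
summaries = {
    "Warsaw": "Warsaw is the capital of Poland and the birthplace of Marie Curie",
    "Paris": "Paris is the capital of France",
}

def retrieve(query: str) -> str:
    # Inference-time retrieval: score summaries, no graph traversal
    q = Counter(query.lower().split())
    return max(summaries, key=lambda e: cosine(q, Counter(summaries[e].lower().split())))

print(retrieve("in which capital was marie curie born"))  # "Warsaw"
```

The multi-hop question is answered by a single similarity lookup because the bridging facts were folded into the Warsaw summary at indexing time.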
Novelty
MDER-DR is the first framework to compress multi-hop relational information into entity summaries during indexing. Compared to existing multi-hop QA methods, it significantly improves efficiency and accuracy by eliminating the need for explicit graph traversal during inference.
Limitations
- When dealing with highly complex relational networks, MDER-DR may face issues with overly simplified entity summaries, leading to information loss.
- Due to its reliance on large language models, MDER-DR may encounter performance bottlenecks when processing very long texts.
- In certain specific domains, MDER-DR may require parameter adjustments to achieve optimal performance.
Future Work
Future research directions include further optimizing MDER-DR's performance in handling large-scale KGs and exploring its applications in more domains and languages. Additionally, integrating other advanced NLP technologies, such as deep learning and graph neural networks, may further enhance its performance.
AI Executive Summary
Knowledge Graphs (KGs) play a crucial role in structuring information, but existing Retrieval-Augmented Generation (RAG) methods often lose important contextual nuances when text is reduced to triples. This information loss is particularly detrimental in multi-hop QA tasks, which require composing answers from multiple entities, facts, or relations.
To address this issue, we propose a KG-based QA framework called MDER-DR, which covers both the indexing and retrieval/inference phases. MDER-DR consists of two main components: Map-Disambiguate-Enrich-Reduce (MDER) and Decompose-Resolve (DR). MDER generates context-derived triple descriptions during KG construction and integrates them with entity-level summaries, avoiding the need for explicit traversal of edges in the graph during QA retrieval. DR decomposes user queries into resolvable triples and grounds them in the KG via iterative reasoning.
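The Decompose-Resolve loop described above can be sketched in a few lines. This is a hypothetical toy: in MDER-DR an LLM performs the decomposition and the grounding against the KG, whereas here a plain dict stands in for the graph and the triples are written by hand. The sketch shows the iterative structure: each triple may contain unknowns, and resolving one hop binds a variable that the next hop consumes.

```python
# Toy KG standing in for the entity-summary index
kg = {
    ("Marie Curie", "born in"): "Warsaw",
    ("Warsaw", "capital of"): "Poland",
}

def resolve(triples, kg):
    # Iteratively ground each decomposed triple, carrying variable bindings
    bindings = {}
    for subj, rel, obj in triples:
        subj = bindings.get(subj, subj)  # substitute bindings from earlier hops
        bindings[obj] = kg[(subj, rel)]  # ground the triple in the KG
    return bindings

# "Which country's capital is the birthplace of Marie Curie?"
triples = [("Marie Curie", "born in", "?x"), ("?x", "capital of", "?ans")]
print(resolve(triples, kg)["?ans"])  # "Poland"
```

The first hop binds `?x` to Warsaw; the second hop substitutes that binding and grounds the remaining triple, yielding the final answer.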
MDER-DR performs exceptionally well across multiple multi-hop QA benchmarks, including cross-lingual and domain-specific settings. Experimental results demonstrate consistent improvements over standard RAG baselines, particularly in the WikiQA and HotpotQA datasets. MDER-DR's entity-centric summaries effectively preserve the details needed for exact answer extraction.
These results carry weight for both academia and industry: by compressing relational information during indexing, MDER-DR tackles long-standing problems of information loss and complex relational reasoning in multi-hop QA over KGs, while improving retrieval efficiency and cross-lingual robustness, and pointing toward multilingual and multi-domain QA systems.
However, MDER-DR may face challenges with overly simplified entity summaries when dealing with highly complex relational networks, leading to information loss. Additionally, due to its reliance on large language models, MDER-DR may encounter performance bottlenecks when processing very long texts. Future research directions include further optimizing MDER-DR's performance in handling large-scale KGs and exploring its applications in more domains and languages.
Deep Dive
Abstract
Retrieval-Augmented Generation (RAG) over Knowledge Graphs (KGs) suffers from the fact that indexing approaches may lose important contextual nuance when text is reduced to triples, thereby degrading performance in downstream Question-Answering (QA) tasks, particularly for multi-hop QA, which requires composing answers from multiple entities, facts, or relations. We propose a domain-agnostic, KG-based QA framework that covers both the indexing and retrieval/inference phases. A new indexing approach called Map-Disambiguate-Enrich-Reduce (MDER) generates context-derived triple descriptions and subsequently integrates them with entity-level summaries, thus avoiding the need for explicit traversal of edges in the graph during the QA retrieval phase. Complementing this, we introduce Decompose-Resolve (DR), a retrieval mechanism that decomposes user queries into resolvable triples and grounds them in the KG via iterative reasoning. Together, MDER and DR form an LLM-driven QA pipeline that is robust to sparse, incomplete, and complex relational data. Experiments show that on standard and domain-specific benchmarks, MDER-DR achieves substantial improvements over standard RAG baselines (up to 66%), while maintaining cross-lingual robustness. Our code is available at https://github.com/DataSciencePolimi/MDER-DR_RAG.
References (20)
From Local to Global: A Graph RAG Approach to Query-Focused Summarization
Darren Edge, Ha Trinh, Newman Cheng et al.
PullNet: Open Domain Question Answering with Iterative Retrieval on Knowledge Bases and Text
Haitian Sun, Tania Bedrax-Weiss, William W. Cohen
WikiQA: A Challenge Dataset for Open-Domain Question Answering
Yi Yang, Wen-tau Yih, Christopher Meek
Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions
H. Trivedi, Niranjan Balasubramanian, Tushar Khot et al.
How to Mitigate Information Loss in Knowledge Graphs for GraphRAG: Leveraging Triple Context Restoration and Query-Driven Feedback
Manzong Huang, Chenyang Bu, Yi He et al.
The Web as a Knowledge-Base for Answering Complex Questions
Alon Talmor, Jonathan Berant
QA Is the New KR: Question-Answer Pairs as Knowledge Bases
Wenhu Chen, William W. Cohen, Michiel de Jong et al.
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
Lianmin Zheng, Wei-Lin Chiang, Ying Sheng et al.
A Comprehensive Survey on Automatic Knowledge Graph Construction
Lingfeng Zhong, Jia Wu, Qian Li et al.
Unifying Large Language Models and Knowledge Graphs: A Roadmap
Shirui Pan, Linhao Luo, Yufei Wang et al.
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao, Jeffrey Zhao, Dian Yu et al.
ATLANTIC: Structure-Aware Retrieval-Augmented Language Model for Interdisciplinary Science
Sai Munikoti, Anurag Acharya, S. Wagle et al.
Constructing Datasets for Multi-hop Reading Comprehension Across Documents
Johannes Welbl, Pontus Stenetorp, Sebastian Riedel
Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP
O. Khattab, Keshav Santhanam, Xiang Lisa Li et al.
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Patrick Lewis, Ethan Perez, Aleksandra Piktus et al.
Knowledge Graphs
Aidan Hogan, E. Blomqvist, Michael Cochez et al.
Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision
Chen Liang, Jonathan Berant, Quoc V. Le et al.
Reified Input/Output logic: Combining Input/Output logic and Reification to represent norms coming from existing legislation
Livio Robaldo, Xin Sun
Fine-tuning Language Models for Triple Extraction with Data Augmentation
Yujia Zhang, Tyler Sadler, Mohammad Reza Taesiri et al.
A Survey on RAG with LLMs
Muhammad Arslan, Hussam Ghanem, Saba Munawar et al.