Multi-Agent Transactive Memory

TL;DR

The proposed Multi-Agent Transactive Memory (MATM) framework lets heterogeneous agent populations share trajectories, improving success rate by 8 percentage points and reducing average steps by 0.59 on ALFWorld interactive tasks.

cs.AI 🔴 Advanced 2026-06-18

To Eun Kim, Xuhong He, Dishank Jain, Ambuj Agrawal, Negar Arabzadeh, Fernando Diaz

Multi-Agent Systems, Knowledge Sharing, Trajectory Retrieval, Reinforcement Learning, Interactive Environments

Key Findings

Methodology

This paper introduces the Multi-Agent Transactive Memory (MATM) framework, built on the concept of transactive memory: a shared repository to which agents contribute procedural trajectories and from which they retrieve them. The system uses a state-conditioned key-value indexing scheme, encoding interaction sequences into dense vector representations with a shared embedding function f. A dense retriever fetches candidate trajectories, and a learning-to-rank (LTR) model such as LambdaMART or SVMRank then reranks them by relevance. The repository grows continuously as more agents contribute, so knowledge accumulates collectively. Experiments in the ALFWorld and WebArena environments show that retrieving trajectories from MATM improves downstream task success rates, reduces interaction steps, and benefits both weak and strong agents without joint training or coordination.
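
The contribute-and-retrieve loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: `embed` is a toy hashed bag-of-words stand-in for the shared embedding function f, and the names `SharedRepository`, `contribute`, and `retrieve` are hypothetical.

```python
import math
import zlib
from dataclasses import dataclass, field

DIM = 64  # assumed embedding size for this toy example


def embed(text: str) -> list[float]:
    """Toy stand-in for the shared embedding function f (hashed bag of words)."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit-normalised, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))


@dataclass
class SharedRepository:
    """Hypothetical shared store: keys are state embeddings, values are trajectories."""
    keys: list[list[float]] = field(default_factory=list)
    values: list[list[str]] = field(default_factory=list)

    def contribute(self, state: str, trajectory: list[str]) -> None:
        # Producer side: index the trajectory under the embedded state.
        self.keys.append(embed(state))
        self.values.append(trajectory)

    def retrieve(self, state: str, k: int = 1) -> list[list[str]]:
        # Consumer side: return the k trajectories whose keys best match the query.
        query = embed(state)
        ranked = sorted(range(len(self.keys)),
                        key=lambda i: cosine(query, self.keys[i]),
                        reverse=True)
        return [self.values[i] for i in ranked[:k]]


repo = SharedRepository()
repo.contribute("put a clean mug on the shelf",
                ["go to sink", "clean mug", "go to shelf", "put mug"])
repo.contribute("heat an egg in the microwave",
                ["take egg", "open microwave", "heat egg"])
top = repo.retrieve("place a clean mug on the shelf", k=1)
print(top[0])
```

A real deployment would presumably replace the exhaustive sort with an approximate nearest-neighbour index and pass the top candidates to the reranking stage; neither is shown here.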

Key Results

In ALFWorld, success rate increased from 47% to 55%, with an average step reduction from 11.77 to 11.18, and the RPP metric improved from -0.16 to -0.05, indicating better efficiency and effectiveness.
In WebArena, success rate rose from 18% to 20%, average steps decreased from 22.0 to 20.3, and RPP shifted from -0.05 to 0.03, confirming the framework's robustness across environments.
Adding a learning-to-rank reranker, especially SVMRank, further boosted performance: ALFWorld success rate reached 64.3%, a 17.2% increase over no retrieval, with average steps reduced to 10.35, demonstrating the benefit of optimizing retrieval.
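
The dense-retrieval figures above are absolute changes. As a quick check, the short script below recomputes the absolute and relative changes from the numbers reported in this section only; the function name `summarize` is hypothetical, and the RPP metric is left out because its definition is not given here.

```python
# Figures copied from the results above: (no retrieval, with MATM retrieval).
results = {
    "ALFWorld": {"success_pct": (47.0, 55.0), "avg_steps": (11.77, 11.18)},
    "WebArena": {"success_pct": (18.0, 20.0), "avg_steps": (22.0, 20.3)},
}


def summarize(before: float, after: float) -> tuple[float, float]:
    """Return (absolute change, relative change in percent)."""
    absolute = after - before
    relative = 100.0 * absolute / before
    return round(absolute, 2), round(relative, 1)


for env, metrics in results.items():
    gain_pts, gain_rel = summarize(*metrics["success_pct"])
    step_delta, step_rel = summarize(*metrics["avg_steps"])
    print(f"{env}: success +{gain_pts} points ({gain_rel}% relative), "
          f"steps {step_delta} ({step_rel}% relative)")
```

Reporting both views side by side makes it clear that the 8-point and 2-point success gains are percentage-point differences, not relative improvements.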

Significance

This work addresses the critical challenge of knowledge reuse in heterogeneous multi-agent systems, moving beyond traditional single-agent or centralized approaches. By enabling scalable, decentralized sharing of procedural trajectories, MATM facilitates collective learning and reduces redundant exploration. It paves the way for more autonomous, adaptable, and efficient multi-agent ecosystems, with potential impacts spanning robotics, virtual assistants, and autonomous vehicles. The framework's ability to improve task success and efficiency without additional training or coordination marks a significant advance in AI system design, fostering more sustainable and scalable intelligent environments.

Technical Contribution

The primary technical innovations include:
• A state-conditioned key-value indexing scheme that encodes long interaction sequences for efficient retrieval;
• Integration of learning-to-rank models (LambdaMART, SVMRank) to optimize trajectory relevance, surpassing simple similarity measures;
• A scalable, continuously growing shared repository that supports multi-source trajectory contributions and retrieval, enabling cross-agent transfer and cumulative knowledge growth.
These contributions collectively enhance the retrieval accuracy, scalability, and practical deployment of multi-agent procedural knowledge systems.
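
To make the reranking idea concrete, here is a small pairwise ranker in plain Python. It is a simplified, unregularised sketch that uses perceptron-style updates on a pairwise hinge loss, the same family of objective that SVMRank optimizes, and it is not the SVMRank or LambdaMART tooling itself. The two features and all training pairs are hypothetical.

```python
import random


def train_pairwise_ranker(pairs, dim, epochs=50, lr=0.1, seed=0):
    """Fit linear weights with pairwise hinge updates.

    `pairs` is a list of (better_features, worse_features) tuples; each update
    pushes the score of `better` above the score of `worse` by a margin of 1.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            if margin < 1.0:  # hinge loss is active: push the pair apart
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def rerank(candidates, w):
    """Sort (name, features) candidates by learned score, best first."""
    def score(feats):
        return sum(wi * fi for wi, fi in zip(w, feats))
    return sorted(candidates, key=lambda c: score(c[1]), reverse=True)


# Hypothetical features: [retriever similarity, producer succeeded (0 or 1)].
training_pairs = [
    ([0.70, 1.0], [0.90, 0.0]),  # a successful trajectory beats a closer failed one
    ([0.60, 1.0], [0.80, 0.0]),
    ([0.90, 1.0], [0.50, 1.0]),  # among successes, higher similarity wins
]
weights = train_pairwise_ranker(training_pairs, dim=2)

candidates = [("traj_a", [0.85, 0.0]), ("traj_b", [0.65, 1.0])]
order = [name for name, _ in rerank(candidates, weights)]
print(order)
```

In this toy setup the ranker learns to place a successful but less similar trajectory ahead of a more similar failed one, which is the kind of ordering a pure similarity score cannot produce.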

Novelty

This study is the first to embed the concept of transactive memory into multi-agent trajectory sharing, combining state-conditioned indexing with learned ranking for dynamic, large-scale knowledge bases. Unlike prior works limited to single-agent memory or centralized repositories, MATM facilitates decentralized, open, and scalable knowledge exchange among heterogeneous agents in real-time environments. Its ability to continuously grow and adapt distinguishes it from static or manually curated knowledge bases, marking a new paradigm in multi-agent collective intelligence.

Limitations

The effectiveness heavily depends on the quality and diversity of collected trajectories; poor or biased data can limit retrieval relevance and overall performance.
Handling extremely long or highly complex interaction sequences remains computationally challenging, potentially affecting retrieval speed and accuracy.
Training and maintaining the ranking models and index structures require substantial computational resources, which may hinder deployment in resource-constrained settings.

Future Work

Future directions include developing more efficient indexing and retrieval algorithms to support larger populations, integrating multi-modal data such as images and videos into trajectories, and exploring adaptive ranking models that dynamically adjust to environment changes. Additionally, research into transfer learning and knowledge distillation could further enhance the system's generalization and scalability, enabling broader application across diverse domains.

AI Executive Summary

The rapid expansion of multi-agent systems in complex, dynamic environments has underscored the need for effective knowledge sharing mechanisms among heterogeneous agents. Traditional approaches, such as centralized repositories or internal memory modules, face scalability and flexibility limitations, especially in open ecosystems where agents can join or leave freely. Addressing this gap, the present study introduces the Multi-Agent Transactive Memory (MATM) framework, inspired by human social memory systems, to facilitate decentralized, continuous sharing of procedural knowledge.

MATM operates as a shared, scalable repository where agents contribute trajectories (long sequences of actions and observations) generated during task execution. These trajectories encode rich procedural information, which other agents can retrieve to improve task performance. The core technical innovation is a state-conditioned key-value indexing scheme that encodes interaction sequences into dense vectors, enabling efficient retrieval even for long and complex trajectories. To improve relevance, the system applies learning-to-rank models such as LambdaMART and SVMRank, which prioritize trajectories by their utility for the current task and state.
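
One plausible reading of "state-conditioned" is that every state visited in a trajectory gets its own key, built from the task and the observation at that step, with the producer's remaining actions stored as the value. The sketch below follows that reading. It is an assumption made for illustration, not a scheme confirmed by the summary, and `Step`, `Entry`, and `index_trajectory` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    observation: str
    action: str


@dataclass(frozen=True)
class Entry:
    key_text: str            # text that would be embedded as the key
    value: tuple[str, ...]   # actions the producer took from this state onward


def index_trajectory(task: str, steps: list[Step]) -> list[Entry]:
    """Emit one key-value entry per visited state (an assumed scheme)."""
    entries = []
    for i, step in enumerate(steps):
        key_text = f"task: {task} | state: {step.observation}"
        remaining = tuple(s.action for s in steps[i:])
        entries.append(Entry(key_text, remaining))
    return entries


trajectory = [
    Step("You are in the kitchen.", "go to sink"),
    Step("You see a dirty mug in the sink.", "clean mug"),
    Step("You hold a clean mug.", "go to shelf"),
]
entries = index_trajectory("put a clean mug on the shelf", trajectory)
for entry in entries:
    print(entry.key_text, "->", entry.value)
```

Keying on intermediate states would let a consumer agent that is already partway through a task retrieve only the portion of a long trajectory that is still relevant.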

Experiments conducted in the ALFWorld and WebArena environments demonstrate the effectiveness of MATM. Retrieval from the shared repository improves success rates by 8 percentage points in ALFWorld and 2 percentage points in WebArena, while reducing average interaction steps per episode by about 0.6 in ALFWorld (11.77 to 11.18) and 1.7 in WebArena (22.0 to 20.3). When combined with learned rerankers, performance improves further, with ALFWorld success reaching 64.3%, a 17.2% increase over the baseline without retrieval. These findings suggest that collective procedural knowledge improves both efficiency and effectiveness across diverse tasks.

This framework's significance extends beyond immediate performance gains. It offers a scalable, decentralized solution for knowledge transfer in heterogeneous, open multi-agent ecosystems, reducing redundant exploration and fostering continual learning. By enabling agents to leverage collective experience, MATM addresses long-standing challenges in AI scalability, adaptability, and autonomous collaboration. Its potential applications span robotics, virtual assistants, and autonomous vehicles, promising a future where intelligent agents share and grow knowledge dynamically.

Despite these advances, limitations remain. The system's reliance on trajectory quality, the computational cost of indexing and retrieval, and challenges in handling extremely long sequences warrant further research. Future work aims to optimize indexing algorithms, incorporate multi-modal data, and develop adaptive ranking models. Overall, MATM represents a significant step toward realizing truly scalable, collaborative multi-agent systems capable of autonomous knowledge sharing and continuous improvement.

Deep Dive

Abstract

The decentralized deployment of LLM agents with diverse capabilities across diverse tasks motivates infrastructure for knowledge sharing across heterogeneous agent populations. Just as search engines index human-generated artifacts to support human problem solving, retrieval systems can organize agent-generated artifacts for reuse across agent populations. We extend retrieval-augmented generation - which demonstrates the value of human-authored artifacts to individual agents - to retrieval of agent-generated artifacts supporting a population of agents. In particular, agent trajectories encode reusable procedural knowledge, yet these artifacts are typically discarded after a single use or retained only by the producing agent, forcing newly instantiated agents to repeatedly rediscover existing solutions. We propose Multi-Agent Transactive Memory (MATM), a framework for population-level storage and retrieval of agent-generated trajectories, where producer agents contribute trajectories to a shared repository and consumer agents retrieve them to improve task execution. We focus on interactive environments (ALFWorld and WebArena), where trajectories are long and encode especially rich procedural structure. Our experiments demonstrate that retrieving trajectories from MATM improves downstream task performance and reduces interaction steps without coordination or joint training. These results position MATM as a design pattern for population-level experience sharing in open agent ecosystems.

cs.AI cs.CL cs.IR
