IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse
IndexCache accelerates sparse attention by reusing indices across layers, cutting 75% of computations and achieving a 1.82x speedup.
Yushi Bai, Qian Dong, Ting Jiang et al.
Introduced a Polish long-context encoder model handling up to 8192 tokens, significantly improving long-document task performance.
Sławomir Dadas, Rafał Poświata, Marek Kozłowski et al.
LifeSim simulates user cognition via a BDI model to improve the evaluation of personalized assistants.
Feiyu Duan, Xuanjing Huang, Zhongyu Wei
The MDER-DR framework enhances multi-hop QA with entity-centric summaries, achieving a 66% improvement.
Riccardo Campi, Nicolò Oreste Pinciroli Vago, Mathyas Giudici et al.
The IsalGraph method encodes any finite simple graph as a compact string over a nine-character instruction alphabet, suitable for graph similarity search.
Ezequiel Lopez-Rubio, Mario Pascual-Gonzalez
GLM-OCR combines a CogViT visual encoder and a GLM language decoder to enhance document understanding efficiency.
Shuaiqi Duan, Yadong Xue, Weihan Wang et al.
We release a large bilingual library dataset for GND-based multi-label classification.
Jennifer D'Souza, Sameer Sadruddin, Maximilian Kähler et al.
LLM-assisted MIPVU rule script generation enables interpretable Chinese metaphor identification; protocol choice is the main source of variation.
Weihang Huang, Mengna Liu
Introduced DOWIS dataset to evaluate SLLMs in multilingual settings, finding text prompts outperform spoken prompts.
Maike Züfle, Sara Papi, Fabian Retkowski et al.
N-gram models predict reading time best due to sensitivity to simple statistics.
James A. Michaelov, Roger P. Levy