Quantized Inference for OneRec-V2
OneRec-V2 achieves 49% latency reduction and 92% throughput increase via FP8 quantized inference.
Yi Su, Xinchen Luo, Hongtao Cheng et al.
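A minimal sketch of the idea behind FP8 (E4M3) quantized inference: scale a tensor so its largest magnitude maps into the E4M3 range, round to the low-precision grid, then rescale. The helpers below are illustrative only and are not OneRec-V2's actual kernels; the simplified rounding ignores subnormals.

```python
import math

E4M3_MAX = 448.0  # largest finite value in the FP8 E4M3 format

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest FP8 E4M3 value (1 sign, 4 exponent, 3 mantissa bits).

    Simplified sketch: clamps to the normal range and ignores subnormals.
    """
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), E4M3_MAX)
    m, e = math.frexp(mag)        # mag = m * 2**e with m in [0.5, 1)
    # 1 implicit + 3 explicit mantissa bits -> round m to multiples of 2**-4
    m_q = round(m * 16) / 16
    return sign * math.ldexp(m_q, e)

def quantize_tensor(values):
    """Per-tensor scaled quantization: map max |value| to E4M3_MAX,
    round each element to the FP8 grid, then rescale back."""
    amax = max(abs(v) for v in values) or 1.0
    scale = amax / E4M3_MAX
    return [quantize_e4m3(v / scale) * scale for v in values]
```

In real inference engines the dequantize step is folded into the matmul kernel, which is where the latency and throughput gains come from; this sketch only shows the numeric rounding behavior.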
MDER-DR framework enhances multi-hop QA with entity-centric summaries, achieving 66% improvement.
Riccardo Campi, Nicolò Oreste Pinciroli Vago, Mathyas Giudici et al.
COMIC system uses LLM critics to generate sketch comedy videos near professional quality.
Susung Hong, Brian Curless, Ira Kemelmacher-Shlizerman et al.
NeFTY achieves high-accuracy 3D thermal diffusion reconstruction using a differentiable physics framework, significantly improving defect localization.
Tao Zhong, Yixun Hu, Dongzhe Zheng et al.
V2M-Zero generates time-aligned music from video using event curves, achieving significant improvements in audio quality and beat alignment across datasets.
Yan-Bo Lin, Jonah Casebeer, Long Mai et al.
DynVLA uses Dynamics CoT to predict compact world dynamics, excelling on datasets like NAVSIM.
Shuyao Shang, Bing Zhan, Yunfei Yan et al.
IsalGraph method encodes any finite simple graph as a compact string over a nine-character instruction alphabet, suitable for graph similarity search.
Ezequiel Lopez-Rubio, Mario Pascual-Gonzalez
LLMGreenRec optimizes green product recommendations using a multi-agent system and large language models, reducing digital carbon footprint.
Hao N. Nguyen, Hieu M. Nguyen, Son Van Nguyen et al.
Leech Lattice Vector Quantization (LLVQ) achieves efficient LLM compression, outperforming Quip# and QTIP.
Tycho F. A. van der Ouderaa, Mart van Baalen, Paul Whatmough et al.
Study shows LLM-generated pseudo-relevance feedback significantly improves query performance, especially in low-resource tasks.
Nour Jedidi, Jimmy Lin
Using RCT methodology to evaluate AI systems' impact on human performance, revealing methodological challenges and solutions.
Patricia Paskov, Kevin Wei, Shen Zhou Hong et al.
Using cross-species transfer learning to enhance electrophysiology-to-transcriptomics mapping accuracy in cortical GABAergic interneurons.
Theo Schwider, Ramin Ramezani
MLP layers in Transformers perform binary routing; validated on GPT-2, where removing the MLP layers increases perplexity by 43.3%.
Peter Balogh
GLM-OCR combines a CogViT visual encoder with a GLM language decoder to enhance document understanding efficiency.
Shuaiqi Duan, Yadong Xue, Weihan Wang et al.
Efficient approximation of analytic and L^p functions using height-augmented ReLU networks, significantly improving approximation rates.
ZeYu Li, FengLei Fan, TieYong Zeng
We release a large bilingual library dataset for GND-based multi-label classification.
Jennifer D'Souza, Sameer Sadruddin, Maximilian Kähler et al.
LLM-assisted MIPVU rule script generation enables interpretable Chinese metaphor identification; protocol choice is the main source of variation.
Weihang Huang, Mengna Liu
RAGPerf is an end-to-end benchmarking framework for retrieval-augmented generation systems, supporting various datasets and embedding models with negligible performance overhead.
Shaobo Li, Yirui Zhou, Yuan Xu et al.
Using structured linked data as a memory layer improves RAG retrieval accuracy by 29.6% in a standard RAG pipeline and by 29.8% in an agentic pipeline.
Andrea Volpini, Elie Raad, Beatrice Gamba et al.
This paper presents an event-driven E-Skin system with dynamic binary scanning and real-time SNN classification, achieving a 12.8x scan reduction and 92.11% accuracy.
Gaishan Li, Zhengnan Fu, Anubhab Tripathi et al.