Batched Kernelized Bandits: Refinements and Extensions
The paper refines and extends the batched kernelized bandits problem, optimizing batch numbers and regret bounds.
Chenkai Ma, Keqin Chen, Jonathan Scarlett
VLM4Rec enhances multimodal recommendation by leveraging large vision-language models for semantic representation.
Ty Valencia, Burak Barlas, Varun Singhal et al.
InterDeepResearch enables human-agent collaborative information seeking through an interactive deep research framework, enhancing process observability and real-time steerability.
Bo Pan, Lunke Pan, Yitao Zhou et al.
Introduces Alternating Gradient Flow (AGF) to prevent structural collapse under 75% compression on ImageNet-1K.
Tianhao Qian, Zhuoxuan Li, Jinde Cao et al.
Study reveals three phases in fully-connected neural networks through dropout pruning: eumentia, dementia, and amentia.
Haining Pan, Nakul Aggarwal, J. H. Pixley
EVATok achieves efficient visual autoregressive generation with adaptive video tokenization, saving 24.4% tokens on average.
Tianwei Xiong, Jun Hao Liew, Zilong Huang et al.
MM-CondChain uses VPIR for visually grounded deep compositional reasoning, with the top model achieving only 53.33 Path F1.
Haozhan Shen, Shilin Yan, Hongwei Xue et al.
OmniStream achieves perception, reconstruction, and action in visual streams using causal spatiotemporal attention and 3D-RoPE, excelling across 29 datasets.
Yibin Yan, Jilan Xu, Shangzhe Di et al.
$Ψ_0$ model achieves 40% performance improvement using only 800 hours of human video and 30 hours of robot data.
Songlin Wei, Hongyi Jing, Boqian Li et al.
Achieves color control in FLUX's VAE latent space, revealing a structure that reflects Hue, Saturation, and Lightness.
Mateusz Pach, Jessica Bader, Quentin Bouniot et al.
HumDex system uses IMU tracking and learning methods for portable humanoid dexterous manipulation, enhancing data collection efficiency and generalization.
Liang Heng, Yihe Tang, Jiajun Xu et al.
DreamVideo-Omni achieves multi-subject video customization with latent identity reinforcement learning, enhancing identity fidelity and motion control precision.
Yujie Wei, Xinyu Liu, Shiwei Zhang et al.
AutoGaze autoregressively selects multi-scale video patches, reducing redundancy and enhancing efficiency, enabling 1K-frame 4K video processing.
Baifeng Shi, Stephanie Fu, Long Lian et al.
EndoCoT activates MLLMs' reasoning potential, achieving 92.1% accuracy, 8.3% higher than the baseline.
Xuanlang Dai, Yujie Zhou, Long Xing et al.
The study enhances performance in non-verifiable LLM post-training using reasoning LLM judges, with gpt-oss-120b as the gold standard.
Yixin Liu, Yue Yu, DiJia Su et al.
Separable Neural Architectures (SNA) unify predictive and generative intelligence by constraining interaction order and tensor rank.
Reza T. Batley, Apurba Sarker, Rajib Mostakim et al.
BiGain enhances diffusion models by frequency separation, improving classification accuracy by 7.15% and FID by 0.34.
Jiacheng Liu, Shengkun Tang, Jiacheng Cui et al.
STAMP framework uses the Polar mechanism to achieve superior privacy-utility trade-offs in text privacy.
Fengwei Tian, Payel Bhattacharjee, Heidi Hanson et al.
Incremental neural network verification via learned conflicts achieves up to 1.9x speedup in Marabou verifier.
Raya Elsaleh, Liam Davis, Haoze Wu et al.
Temporal Straightening improves latent planning success rates by 20-60% using curvature regularization.
Ying Wang, Oumayma Bounou, Gaoyue Zhou et al.