S2MAM: Semi-supervised Meta Additive Model for Robust Estimation and Variable Selection
S2MAM uses bilevel optimization for robust estimation and variable selection, and is validated on 16 datasets.
Xuelin Zhang, Hong Chen, Yingjie Wang et al.
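The digest gives no detail on the objective; as background, a minimal sketch of the bilevel pattern S2MAM builds on, where an inner problem fits the model and an outer problem tunes a hyperparameter against held-out loss (all data and names here are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    X_tr, X_va = rng.normal(size=(80, 10)), rng.normal(size=(40, 10))
    w_true = np.zeros(10); w_true[:3] = 1.0          # sparse ground truth
    y_tr = X_tr @ w_true + 0.5 * rng.normal(size=80)
    y_va = X_va @ w_true + 0.5 * rng.normal(size=40)

    def inner_solve(lam):
        # Inner problem: ridge regression on the training split
        d = X_tr.shape[1]
        return np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(d), X_tr.T @ y_tr)

    # Outer problem: choose the regularizer that minimizes validation loss
    lams = np.logspace(-3, 2, 20)
    val_losses = [np.mean((X_va @ inner_solve(l) - y_va) ** 2) for l in lams]
    print(f"selected lambda = {lams[int(np.argmin(val_losses))]:.3g}")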
Sessa enhances long-range memory by embedding selective attention in feedback paths.
Liubomyr Horbatko
Introduces the Bounded Ratio Reinforcement Learning (BRRL) framework, which outperforms PPO on MuJoCo control tasks.
Yunke Ao, Le Chen, Bruce D. Lee et al.
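BRRL's bounded-ratio objective is not reproduced in this digest; for reference, the PPO clipped surrogate it is compared against bounds the same importance ratio r_t(\theta) = \pi_\theta(a_t \mid s_t) / \pi_{\theta_{\mathrm{old}}}(a_t \mid s_t):

    L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\big[\min\big(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}(r_t(\theta),\,1-\epsilon,\,1+\epsilon)\,\hat{A}_t\big)\big]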
The Apollo model integrates 28 medical modalities and 12 specialties to predict disease risk up to 5 years in advance.
Andrew Zhang, Tong Ding, Sophia J. Wagner et al.
Analyzes generalization bounds in symbolic regression with genetic programming, revealing complexities in structure selection and constant fitting.
Masahiro Nomura, Ryoki Hamano, Isao Ono
VS-WNO fails to translate spike sparsity into a deployment cost advantage on the Jetson Orin Nano.
Jason Yoo, Shailesh Garg, Souvik Chakraborty et al.
A smaller model post-trained with reinforcement learning excels in small-molecule drug design tasks, rivaling state-of-the-art frontier models.
Shriram Chennakesavalu, Kirill Shmilovich, Hayley Weir et al.
Task-reward optimization enhances Llama-3.2-3B-Instruct's performance on math datasets.
Sarthak Mittal, Leo Gagnon, Guillaume Lajoie
The HILBERT framework achieves significant performance improvements in long-sequence audio-text representation learning through dual contrastive learning and information-balanced regularization.
Habibeh Naderi, Behrouz Haji Soleimani, Stan Matwin
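HILBERT's exact dual objective is not given here; a minimal sketch of the standard symmetric (two-direction) contrastive loss that paired audio-text encoders typically build on (function and variable names are illustrative):

    import torch
    import torch.nn.functional as F

    def symmetric_contrastive_loss(audio_emb, text_emb, temperature=0.07):
        # InfoNCE in both directions over a batch of paired embeddings
        a = F.normalize(audio_emb, dim=-1)
        t = F.normalize(text_emb, dim=-1)
        logits = a @ t.T / temperature                  # pairwise similarities
        labels = torch.arange(len(a), device=a.device)  # matched pairs on diagonal
        return 0.5 * (F.cross_entropy(logits, labels) +
                      F.cross_entropy(logits.T, labels))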
Detects and suppresses reward hacking using Gradient Fingerprints, achieving superior performance on math, code, and logical reasoning benchmarks.
Songtao Wang, Quang Hieu Pham, Fangcong Yin et al.
Prototype-Grounded Concept Models (PGCMs) verify concept alignment via visual prototypes, enhancing interpretability.
Stefano Colamonaco, David Debot, Pietro Barbiero et al.
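The PGCM formulation itself is not detailed in this digest; a minimal ProtoPNet-style sketch of scoring concepts by their best-matching visual prototype, illustrating the grounding idea (shapes and names are assumptions):

    import torch.nn.functional as F

    def concept_scores(patch_embs, prototypes):
        # patch_embs: (num_patches, d) embeddings for one image
        # prototypes: (num_concepts, d) learned visual prototypes
        sims = F.normalize(patch_embs, dim=-1) @ F.normalize(prototypes, dim=-1).T
        return sims.max(dim=0).values  # one score per concept: best patch match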
The Muon optimizer outperforms AdamW in MLP-based tabular deep learning and is recommended when its training overhead is acceptable.
Yury Gorishniy, Ivan Rubachev, Dmitrii Feoktistov et al.
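Muon's central step orthogonalizes each weight matrix's momentum-smoothed gradient with a Newton-Schulz iteration before applying it; a minimal sketch of that step (coefficients follow the public reference implementation; the surrounding optimizer loop is omitted):

    import torch

    def newton_schulz_orthogonalize(G, steps=5, eps=1e-7):
        # Approximately map G to the nearest semi-orthogonal matrix
        a, b, c = 3.4445, -4.7750, 2.0315
        X = G / (G.norm() + eps)
        transposed = X.shape[0] > X.shape[1]
        if transposed:
            X = X.T
        for _ in range(steps):
            A = X @ X.T
            X = a * X + (b * A + c * (A @ A)) @ X
        return X.T if transposed else X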
Analyzes the stability and generalization of looped transformers using a fixed-point framework, validated on chess, Sudoku, and prefix-sum tasks.
Asher Labovich
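The fixed-point view treats a looped (weight-tied) transformer as iterating one block until the hidden state stops changing; a minimal sketch of that iteration (module structure, names, and the stopping rule are illustrative, not the paper's exact setup):

    import torch.nn as nn

    class LoopedBlock(nn.Module):
        # One weight-tied block applied repeatedly to an approximate fixed point
        def __init__(self, d, heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(d, heads, batch_first=True)
            self.mlp = nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(),
                                     nn.Linear(4 * d, d))
            self.norm1, self.norm2 = nn.LayerNorm(d), nn.LayerNorm(d)

        def forward(self, h, x, max_loops=32, tol=1e-4):
            for _ in range(max_loops):
                h_prev = h
                attn_out, _ = self.attn(h, h, h)
                h = self.norm1(h + attn_out + x)  # re-inject the input each loop
                h = self.norm2(h + self.mlp(h))
                if (h - h_prev).norm() < tol * h_prev.norm():
                    break                         # approximate fixed point reached
            return h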
SortedRL accelerates RL training for LLMs through online length-aware scheduling, enhancing efficiency and performance.
Yiqi Zhang, Huiqiang Jiang, Xufang Luo et al.
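SortedRL's online scheduler is not specified in this digest; the core efficiency idea, grouping rollouts of similar length so batches waste less padding and long stragglers don't stall short ones, can be sketched in a few lines (illustrative, not the paper's algorithm):

    def length_aware_batches(rollouts, batch_size):
        # Sort generated rollouts by token length, then batch neighbors,
        # so each batch pads to a similar maximum length.
        ordered = sorted(rollouts, key=len)
        return [ordered[i:i + batch_size]
                for i in range(0, len(ordered), batch_size)]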
Introduces Graph Energy Matching (GEM), surpassing discrete diffusion models in molecular graph generation.
Michal Balcerak, Suprosanna Shit, Chinmay Prabhakar et al.
Scaling DoRA achieves high-rank adaptation via factored norms and fused kernels, significantly reducing memory usage and improving speed.
Alexandra Zelenin, Alexandra Zhuravlyova
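The factored-norm and fused-kernel details are not given here; as background, the base DoRA reparameterization being scaled factors each weight into a learned per-column magnitude and a low-rank-updated direction:

    import torch

    def dora_weight(W0, A, B, m):
        # W0: (d_out, d_in) frozen pretrained weight
        # B: (d_out, r), A: (r, d_in) low-rank adapters; m: (1, d_in) magnitudes
        V = W0 + B @ A                       # low-rank-updated direction
        V = V / V.norm(dim=0, keepdim=True)  # normalize each column
        return m * V                         # rescale by learned magnitudes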
The SPA method uses carefully designed prompts to generate large-scale synthetic data for effective knowledge injection.
Kexian Tang, Jiani Wang, Shaowen Wang et al.
Proposes a new algorithm for online learning with ranking feedback, where traditional numeric feedback is unavailable.
Mingyang Liu, Yongshan Chen, Zhiyuan Fan et al.
A cost-aware evasion framework reveals robustness gaps in phishing detection; the median evasion cost is 2, and over 80% of attacks target just three low-cost features.
Julian Allagan, Mohamed Elbakary, Zohreh Safari et al.
The paper introduces a maximum-entropy exploration method using future state-action visitation measures, improving feature visitation and convergence speed.
Adrien Bolland, Gaspard Lambrechts, Damien Ernst
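One common way to write such an objective (the paper's exact formulation may differ) augments the expected reward under the discounted state-action visitation measure d^\pi with its entropy:

    \max_\pi \ \mathbb{E}_{(s,a)\sim d^\pi}[r(s,a)] + \lambda\,\mathcal{H}(d^\pi),
    \qquad \mathcal{H}(d^\pi) = -\sum_{s,a} d^\pi(s,a)\,\log d^\pi(s,a)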