Learnability and Privacy Vulnerability are Entangled in a Few Critical Weights
Using CWRF, only critical weights are adjusted to enhance privacy while maintaining utility.
Xingli Fang, Jung-Eun Kim
MXNorm reuses MXFP8 block scales for efficient tensor normalization, reducing reduction size by 32x.
Callum McLean, Luke Y. Prince, Alexandre Payot et al.
ZO-SAM integrates zero-order optimization to reduce computational costs, enhancing efficiency and robustness in sparse training.
Jie Ji, Gen Li, Kaiyuan Deng et al.
BoSS enhances deep active learning performance by integrating multiple selection strategies, excelling on large-scale datasets.
Denis Huseljic, Paul Hahn, Marek Herde et al.
Zero-hyperparameter multi-corner analysis using learned priors reduces validation cost by over 10 times.
Wei W. Xing, Kaiqi Huang, Jiazhan Liu et al.
Influence Malleability in Linearized Attention: Dual Implications of Non-Convergent NTK Dynamics.
Jose Marie Antonio Miñoza, Paulo Mario P. Medina, Sebastian C. Ibañez
DDIM reverse chain as Partitioned Iterated Function Systems provides a unified design language for denoising diffusion models.
Ann Dooms
Achieves color control in FLUX's VAE latent space, revealing a structure that reflects hue, saturation, and lightness.
Mateusz Pach, Jessica Bader, Quentin Bouniot et al.
Separable Neural Architectures (SNA) unify predictive and generative intelligence by constraining interaction order and tensor rank.
Reza T. Batley, Apurba Sarker, Rajib Mostakim et al.
STAMP framework uses the Polar mechanism to achieve superior privacy-utility trade-offs in text privacy.
Fengwei Tian, Payel Bhattacharjee, Heidi Hanson et al.
Temporal Straightening improves latent planning success rates by 20-60% using curvature regularization.
Ying Wang, Oumayma Bounou, Gaoyue Zhou et al.
RandOpt enhances large-scale models via random perturbations and ensemble voting around pretrained weights.
Yulu Gan, Phillip Isola
Quantifies forgetting in generative models post-training using forward and reverse KL objectives, avoiding quality degradation.
Krishnakumar Balasubramanian, Shiva Prasad Kasiviswanathan
EnTransformer combines Transformer with engression for superior multivariate probabilistic forecasting.
Rajdeep Pathak, Rahul Goswami, Madhurima Panja et al.
NeFTY achieves high-accuracy 3D thermal diffusion reconstruction using a differentiable physics framework, significantly improving defect localization.
Tao Zhong, Yixun Hu, Dongzhe Zheng et al.
Leech Lattice Vector Quantization (LLVQ) achieves efficient LLM compression, outperforming QuIP# and QTIP.
Tycho F. A. van der Ouderaa, Mart van Baalen, Paul Whatmough et al.
Cross-species transfer learning improves electrophysiology-to-transcriptomics mapping accuracy in cortical GABAergic interneurons.
Theo Schwider, Ramin Ramezani
MLP layers in Transformers perform binary routing; validated in GPT-2, removing MLP increases perplexity by 43.3%.
Peter Balogh