cs.CL 2604.19642

Micro Language Models Enable Instant Responses

Micro Language Models (μLMs) enable instant responses by generating the first 4 to 8 words of a reply on-device, while cloud models complete the rest of the response.

Wen Cheng, Tuochao Chen, Karim Helwani et al.

2026-04-22 32
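
The summary above describes a handoff: a small on-device model produces the opening words right away, and a cloud model finishes the response. The following is a minimal sketch of how such a handoff might be organized, not the authors' implementation. The names `local_opener`, `cloud_continue`, and `respond` are hypothetical, and both model functions are canned stubs standing in for real models. The sketch also assumes the cloud model is conditioned on the opener so that its continuation stays consistent with it, which the summary does not state.

```python
from typing import Callable, Iterator

# Hypothetical stand-in for an on-device micro LM (not from the paper).
def local_opener(prompt: str, max_words: int = 8) -> str:
    """Return a short canned opener, truncated to at most max_words words."""
    opener = "Sure, here is a quick answer to that question"
    return " ".join(opener.split()[:max_words])

# Hypothetical stand-in for a cloud LM (not from the paper).
def cloud_continue(prompt: str, opener: str) -> str:
    """Return a canned continuation that follows the given opener."""
    return f"based on your request about {prompt!r}."

def respond(
    prompt: str,
    opener_fn: Callable[[str, int], str],
    continue_fn: Callable[[str, str], str],
    max_words: int = 8,
) -> Iterator[str]:
    """Yield the on-device opener first, then the cloud continuation."""
    opener = opener_fn(prompt, max_words)
    yield opener  # available immediately, before any network round trip
    # Assumption: the cloud model sees the opener and continues from it.
    yield continue_fn(prompt, opener)

if __name__ == "__main__":
    for chunk in respond("the weather", local_opener, cloud_continue):
        print(chunk)
```

In a real system the cloud request would presumably be issued concurrently with on-device generation so that the opener hides the network latency; the sequential generator here only illustrates the ordering of the two pieces.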