- Rush AM, Biderman S, Webson A, Sasanka Ammanamanchi P, Wang T, Sagot B, Muennighoff N, Villanova del Moral A, Ruwase O, Bawden R, Bekman S, McMillan-Major...
- Transcending Scaling Laws with 0.1% Extra Compute, arXiv:2210.11399
Muennighoff, Niklas; Rush, Alexander; Barak, Boaz; Le Scao, Teven; Tazi, Nouamane;...
- Retrieved 11 December 2023. Li, Raymond; Allal, Loubna Ben; Zi, Yangtian; Muennighoff, Niklas; Kocetkov, Denis; Mou, Chenghao; Marone, Marc; Akiki, Christopher;...
- Training Enables Zero-Shot Task Generalization". arXiv:2110.08207 [cs.LG].
Muennighoff, Niklas; Wang, Thomas; Sutawika, Lintang; Roberts, Adam; Biderman, Stella;...