Cramming: Training a Language Model on a Single GPU in One Day (arxiv.org)
3 points by andai 7 hours ago | 0 comments
151. 1 point by typeofhuman 7 hours ago | 3 comments
152. 1 point by perihelions 7 hours ago | 0 comments
153. 1 point by mdp2021 7 hours ago | 0 comments
154. 2 points by geox 7 hours ago | 0 comments
155. 1 point by gnabgib 7 hours ago | 1 comment
156. 1 point by foxtacles 7 hours ago | 0 comments
157. 1 point by abutbul 7 hours ago | 0 comments
158. 4 points by overclock351 7 hours ago | 6 comments
159. 3 points by jjwiseman 7 hours ago | 1 comment
160. 7 points by bloomingkales 7 hours ago | 17 comments
161. 2 points by nemoniac 7 hours ago | 0 comments
162. 8 points by solardev 7 hours ago | 7 comments
163. 2 points by bentocorp 7 hours ago | 0 comments
164. 9 points by vasco 7 hours ago | 0 comments
165. 3 points by speckx 7 hours ago | 0 comments
166. 2 points by aarestad 7 hours ago | 1 comment
167. 10 points by speckx 7 hours ago | 0 comments
168. 3 points by colinprince 7 hours ago | 1 comment
169. 1 point by chadk 7 hours ago | 0 comments
170. 2 points by sandwichsphinx 7 hours ago | 0 comments
171. 1 point by timbourcier 7 hours ago | 0 comments
172. 1 point by axiomdata316 7 hours ago | 1 comment
173. 84 points by mariuz 7 hours ago | 73 comments
174. 1 point by colinprince 7 hours ago | 1 comment
175. 2 points by egnehots 7 hours ago | 0 comments
176. 3 points by pseudolus 7 hours ago | 2 comments
177. 3 points by JSeymourATL 7 hours ago | 0 comments
178. 2 points by fork-bomber 7 hours ago | 0 comments
179. 1 point by xenodium 7 hours ago | 1 comment