2024-12-17 at 16:09

why is everyone being weird about scaling test-time compute vs. training compute?
https://x.com/ClementDelangue/status/1868740932251844806

or maybe that's just tweakers on AI twitter and not ppl in actual research communities

===

also I'm curious how the finding that "for each additional 10x of train-time compute, about 15x of test-time compute can be eliminated" maps onto humans

re:
https://yellow-apartment-148.notion.site/AI-Search-The-Bitter-er-Lesson-44c11acd27294f4495c3de778cd09c8d
"Moreover, the brilliant Scaling Scaling Laws with Board Games show that “for each additional 10× of train-time compute, about 15× of test-time compute can be eliminated” even down to single-neuron models. Recall that Stockfish beat Leela with a model 3 orders of magnitude smaller."
