2024-12-07 at 1727 Untitled 7
December 24, 2024
wait i swear ARC AGI tasks are interpolative on the goal
but then you need a reasoning thing to try out the solutions at test time just like we do
==========
if humans are strictly interpolative but we use that to do extrapolation, we have to ask how that happens
rn LLMs are surface level interpolative but extremely good at doing surface level interpolation across superhuman amounts of data
i think there's something about . humans interpolate on a much finer resolution, like on a much finer fundamental level of ideas
i wanna understand LLM interpolation and then think about whether we can eventually train LLMs to do deeper level interpolation?
i also wanna understand how LLM interpolation works bc i think im underestimating the resolution/depth of its interpolation ... also wait how does RL fit into this
can we eventually train LLMs to do deeper level interpolation OR do we really need a reasoning thing to break it down ?
oh wait i guess even if LLMs did deeper level interpolation it would still only be like Intuition doing the deeper level interpolation -- and remember what i was writing about how . even if we had a crazy insanely intuitive guy who could one-shot hard problems with pure intuition, i Believe we would still need like . type 2 reasoning
WAIT WE NEED TO TRAIN TYPE 1 AND TYPE 2 TOGETHER ... JUST LIKE WE TRAIN THE WORD EMBEDDINGS IN GPT 3 INSTEAD OF USING PREBUILT EMBEDDING WEIGHTS
or i guess we dont need to but it would be better
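(sketch of the embeddings analogy so i remember it -- hypothetical toy PyTorch, all names made up: a lower "Type 1" module and an upper "Type 2" module trained under one optimizer, vs freezing the lower one like prebuilt word vectors)

```python
# toy sketch (hypothetical): train both components end-to-end, the way
# GPT-3 learns its token embeddings jointly with the rest of the network
# instead of loading frozen pretrained vectors
import torch
import torch.nn as nn

vocab_size, dim, n_classes = 1000, 64, 10
embed = nn.Embedding(vocab_size, dim)  # stand-in for the Type 1 (intuition) part
head = nn.Linear(dim, n_classes)       # stand-in for the Type 2 (reasoning) part

# the "prebuilt weights" alternative would be:
# embed.weight.requires_grad_(False)

# one optimizer over BOTH parts, so they co-adapt during training
opt = torch.optim.Adam(list(embed.parameters()) + list(head.parameters()), lr=1e-3)

tokens = torch.randint(0, vocab_size, (32,))  # dummy batch
labels = torch.randint(0, n_classes, (32,))
loss = nn.functional.cross_entropy(head(embed(tokens)), labels)
opt.zero_grad(); loss.backward(); opt.step()
```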
===
i remember earlier in some other note i wanted to talk about how a lot of human creativity (like just for aesthetics and also for scientific breakthroughs) happens bc we DONT optimize for it
re: greatness cannot be planned
a lot of it comes from just curiosity and play and random serendipity
==========
ARC AGI: when we check the solutions (during test-time and also during training), dont just check for a binary win/loss but also give credit for the aspects that are correct (like positions but not color)
idk how chess NNs do this
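(here's roughly what i mean by non-binary checking -- a partial-credit scorer i'm making up on the spot, grids as lists of lists of ints 0-9:)

```python
# hypothetical partial-credit scorer: instead of a single pass/fail,
# score shape, positions (silhouette), and exact colors separately
def partial_score(pred, target):
    shape_ok = (len(pred) == len(target)
                and len(pred[0]) == len(target[0]))
    if not shape_ok:
        return {"shape": 0.0, "position": 0.0, "color": 0.0}
    cells = [(p, t) for pr, tr in zip(pred, target)
                    for p, t in zip(pr, tr)]
    n = len(cells)
    # position credit: the right cells are non-background, color ignored
    position = sum((p != 0) == (t != 0) for p, t in cells) / n
    # color credit: exact per-cell match
    color = sum(p == t for p, t in cells) / n
    return {"shape": 1.0, "position": position, "color": color}

# right silhouette, wrong palette -> position 1.0, color 0.5
print(partial_score([[0, 2], [2, 0]], [[0, 3], [3, 0]]))
```

(then a training signal could reward these components separately instead of only rewarding exact matches)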
brainstorming rules based on the examples, then testing the rules against the examples -- but also brainstorming subrules based on the examples
okay but also we cant train it from the public eval set and we cant hardcode algos using knowledge abt the types of problems from the public eval set
we need to build all our cognitive axioms from the public training (easy) dataset, and not use the public eval (hard) dataset
like . the public eval set is just to test out your performance .
Wait but . why does he say
To ensure fair evaluation results, be sure not to leak information from the evaluation set into your algorithm (e.g., by looking at the tasks in the evaluation set yourself during development, or by repeatedly modifying an algorithm while using its evaluation score as feedback.)
like obviously i wanna repeatedly modify an algorithm while using its evaluation score as feedback (or maybe he's specifically referring to the dumb way to do this, like using the public eval set as part of ur training data or using the public eval set to hardcode some things -- then obviously youre not learning anything general)
we shouldnt do that
==========
can we somehow train it to create a good correct list of axioms??? like not even embedded in the NN-algo but like we train it to write down a guide for itself almost
can we somehow train it while remembering how it solved problems? (either via NN or via scratchpad or smth that gives it "memory") -- like don't do 1 epoch as attempting all tasks and then optimize after that ... let's optimize our NN even just after 1 task so that it uses that knowledge to do the rest of the tasks
EDIT 2024-12-25 01:56:17 -- wait that is literally like stochastic gradient descent using batches
then also it can use similarities from prev problems
i guess that is what test-time learning is.... there can be an NN solution for that where we actually update the weights, and there can be just literally a Scratchpad
(the NN solution would be like a clever person updating their intuition, the scratchpad could still be good but it would be like a person who really really relies on a cheat sheet and they are continually updating their cheat sheet)
wait but i feel like . something that really contributes to how the clever person updates their intuition is that they have meta-knowledge about the test - like they know about the test maker and they just know how tests go, and more generally they also fkn Learned How To Learn via years of experience even before being introduced to arc-agi
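(sketching the NN-solution version from above -- hypothetical toy code, not a real recipe: a few gradient steps on each task's demo pairs before predicting, and if you drop the deepcopy the updates carry over into later tasks, which is the optimize-after-every-task idea)

```python
# hypothetical sketch of test-time weight updates (the "clever person
# updating their intuition" version, not the scratchpad version)
import copy
import torch
import torch.nn as nn

def solve_with_test_time_updates(model, task, steps=20, lr=1e-3):
    # task = {"train": [(inp, out), ...], "test": [inp, ...]},
    # with inp/out already encoded as flat float tensors
    tuned = copy.deepcopy(model)  # drop the copy to keep learning across tasks
    opt = torch.optim.SGD(tuned.parameters(), lr=lr)
    for _ in range(steps):
        for x, y in task["train"]:  # this task's demo pairs
            loss = nn.functional.mse_loss(tuned(x), y)
            opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        return [tuned(x) for x in task["test"]]

# toy usage: pretend 3x3 grids flattened to length-9 vectors
model = nn.Linear(9, 9)
task = {"train": [(torch.rand(9), torch.rand(9))], "test": [torch.rand(9)]}
preds = solve_with_test_time_updates(model, task)
```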
===
how would we train an extremely dumb human (or like a 4 year old) to get 90%, with only 400 practice problems?
we could train them on how to learn
i guess that is what the ARC challenge specifically attempts to gauge
but like . when you run the training algo on ARC training dataset, it's only learning how to learn for ARC specifically
also when we help humans to learn how to learn, these students are not going in completely blind -- we have many rules of thumb already developed because of centuries and centuries of humanity's experience with learning things, and we give those guidelines to the students
===
note that how you solve a set of tasks is not necessarily how someone else would go about solving that set of tasks
like . some ppl rly are the scratchpad/cheatsheet people
i feel like im just relatively rly good at these types of puzzles and i feel like im also relatively rly good at updating my intuition about brainstorming the i/o function in the puzzles
===
okay so observe that the task here is . we need to get good at figuring out the i/o function for a task
can we even train that intuition in humans? or i mean i wanna ask to what extent is that possible/easy?
and also, notice how we're only training the intuition for brainstorming the i/o function - oh and then i guess we have to train the reasoning for how to test those brainstormed guesses
God i really wonder how we train reasoning in children Lmfao .
also we can use python to replace some Type 2 thinking for checking our brainstorm guesses for a task
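(concretely, smth like this -- toy made-up transforms, but this is the checking step i mean: plain code verifies every brainstormed candidate i/o function against every demo pair, so the neural part only has to propose, not verify)

```python
# hypothetical sketch: Type 1 proposes candidate i/o functions,
# python (standing in for Type 2 checking) filters them
def flip_horizontal(g): return [row[::-1] for row in g]
def flip_vertical(g):   return g[::-1]
def transpose(g):       return [list(r) for r in zip(*g)]

CANDIDATES = [flip_horizontal, flip_vertical, transpose]

def surviving_rules(train_pairs, candidates=CANDIDATES):
    # keep only the rules consistent with ALL of the task's demo pairs
    return [f for f in candidates
            if all(f(inp) == out for inp, out in train_pairs)]

pairs = [([[1, 0], [0, 0]], [[0, 1], [0, 0]])]  # looks like a horizontal flip
print([f.__name__ for f in surviving_rules(pairs)])  # ['flip_horizontal']
```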
idk where we need to be on the spectrum of fully trained reasoner to somewhat manually coded reasoner, but it would be a lot Awesomer to lean towards training a reasoner
===
HOW DO WE TRAIN REASONING IN CHILDREN??????????????????
i think i almost forgot that literally some people are fuckshit at reasoning . but we somehow still train them to do things, just without reasoning