2024-12-25 at 14:10 summarizing past notes and what to do going forward

December 25, 2024•1,559 words

i have gone through my past notes and this note is me copypasting excerpts that i wanna keep in mind, trying to inform what to do rn / going forward

this note is unfinished (but i guess all of my notes on this blog are unfinished n messy anyway lol so wtvr)

=====

2024-12-24 at 18:01

okay i know ive been getting Slightly more into learning about ai, and i want to make sure im not making my ARC solution hypotheses too complex — i.e. i dont want the equivalent of feature creep

i wanna try the simple solutions first

lets remember our first thoughts from a few weeks ago and then draft a solution

okay so .

first, just try out some rly simple solutions. experiment on them. experiment on how you would teach a salenahuman to do ARC.

then, if they dont work, read up on Voyager (the minecraft thing) then perhaps try smth w that, and also read up on Constitutional RLAIF and perhaps try smth w that

also maybe look at MCTS and pruning search tree

also finish 3b1b and maybe start on karpathy zero to hero then david silver
also geohotz chess

=====

in the current state of LLMs, theyre like a rly knowledgeable salena. very terrible IQ, but they can follow a plan nicely.
is there a way that we can devise a straightforward plan for a salenahuman to do ARC? and then we iterate on the Plan. we can do training in the sense of . iterating on a Plan, like salena .... god i swear i wrote that before .. Oh yeah re: search "continually updating their cheat sheet"
can i train an LLM to use another LLM and/or direct the whole process?
- either through Constitutional RLAIF or AIF prompting
how cheap of a System 1 model (like an LLM) can we use and still produce good results?
can i dynamically RLHAIF like i would teach salena to do ARC?
do we need to implement some of the voyager stuff

transformation function guessing (subrules and whole transformation)
guess checker, subrule
guess checker, whole transformation
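
a rough sketch of what the two guess checkers could look like. everything here is an assumption for illustration: grids as lists of lists of ints, a "whole transformation" guess as a plain python function, a "subrule" guess as a yes/no claim about an (input, output) pair. the names (`check_whole_transformation`, `check_subrule`, the toy flip task) are made up, not from any ARC library.

```python
from typing import Callable, List, Tuple

Grid = List[List[int]]
Pair = Tuple[Grid, Grid]  # (input grid, expected output grid)


def check_whole_transformation(guess: Callable[[Grid], Grid], pairs: List[Pair]) -> bool:
    """True only if the guessed function reproduces every expected output exactly."""
    return all(guess(inp) == out for inp, out in pairs)


def check_subrule(subrule: Callable[[Grid, Grid], bool], pairs: List[Pair]) -> bool:
    """True if a partial claim about the task holds on every training pair."""
    return all(subrule(inp, out) for inp, out in pairs)


# hypothetical toy task: every row gets flipped left-to-right
pairs = [
    ([[1, 0], [2, 0]], [[0, 1], [0, 2]]),
    ([[3, 3, 0]], [[0, 3, 3]]),
]


def flip_lr(grid: Grid) -> Grid:
    """A guess at the whole transformation."""
    return [row[::-1] for row in grid]


def same_size(inp: Grid, out: Grid) -> bool:
    """A guess at a subrule: the output has the same dimensions as the input."""
    return len(inp) == len(out) and all(len(a) == len(b) for a, b in zip(inp, out))


whole_ok = check_whole_transformation(flip_lr, pairs)
subrule_ok = check_subrule(same_size, pairs)
```

the point of keeping the subrule checker separate is that a guesser can get confirmation on a small claim ("size is preserved") before it has any idea what the whole transformation is.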

when we check the solutions (during test-time and also during training) dont just check for binary win/loss but also check for aspects that are correct (like positions but not color)
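
a minimal sketch of that non-binary check, under my own assumptions: grids are lists of lists of ints, color 0 is the background, and "positions correct" means the predicted grid has non-background cells in exactly the places the target does, regardless of color. `partial_score` and its report keys are hypothetical names.

```python
from typing import Dict, List, Union

Grid = List[List[int]]


def partial_score(pred: Grid, target: Grid, background: int = 0) -> Dict[str, Union[bool, float]]:
    """Report which aspects of a predicted grid are right, not just win/loss."""
    report: Dict[str, Union[bool, float]] = {
        "shape": False, "positions": 0.0, "colors": 0.0, "exact": False,
    }
    same_shape = len(pred) == len(target) and all(
        len(p) == len(t) for p, t in zip(pred, target)
    )
    report["shape"] = same_shape
    if not same_shape:
        # without matching dimensions, cell-by-cell comparison is not meaningful
        return report

    cells = [(p, t) for p_row, t_row in zip(pred, target) for p, t in zip(p_row, t_row)]
    if not cells:
        # two empty grids of the same shape count as a trivial exact match
        report.update(positions=1.0, colors=1.0, exact=True)
        return report

    total = len(cells)
    # fraction of cells where "is this cell filled?" agrees with the target
    report["positions"] = sum((p != background) == (t != background) for p, t in cells) / total
    # fraction of cells where the exact color agrees with the target
    report["colors"] = sum(p == t for p, t in cells) / total
    report["exact"] = report["colors"] == 1.0
    return report


# right positions, wrong colors
target = [[0, 1], [0, 2]]
pred = [[0, 3], [0, 3]]
report = partial_score(pred, target)
```

here `report` says the shape and positions are fully right while only half the cells have the right color, which is a much more useful signal for a guesser than a flat "wrong".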

we want to train until we get 100% on the public training set (just like a human would, but make sure we try to generalize and not overfit. but tbh, a human would have problems with this too.)

can use the public eval set like sending off your salenahuman to go do a mock test with a proctor.

(the NN solution would be like a clever person updating their intuition, the scratchpad could still be good but it would be like a person who really really relies on a cheat sheet and they are continually updating their cheat sheet)

wait.

how would a cheat sheet (whether continually updating or not) even help a salena doing ARC?
- remembering possible subrules or possible subrule topics
- some sort of todo list/process
how would a corbin update their intuition during training for ARC?
- idk tbh.

wait i feel like . something that really contributes to how the clever person updates their intuition is that they have meta-knowledge about the test - like they know about the test maker and they just know how tests go, and more generally they also fkn Learned How To Learn via years of experience even before being introduced to arc-agi

some people are fuckshit at reasoning . but we somehow still train them to do things, just without reasoning

need to daydream/brainstorm/mindwander about how to teach salena to do ARC tasks

IS IT EVEN POSSIBLE TO HAVE A RLY DUMB SALENAHUMAN LEARN HOW TO MAKE GUESSES FOR ARC TRANSFORMATION FUNCTIONS????? like let's say we have someone/something else automatically check if her guesses r right/wrong. IS IT POSSIBLE TO TRAIN HER TO MAKE GOOD GUESSES?????

humans learn not from repeated correct examples but from the edge cases of mistakes
"I know this only happens if they are given a problem that is unsolvable without learning to think differently. If it's possible using their existing knowledge, they won't bother learning something new."

how would u train salena to make better guesses / have a better guessing process?????????

would we actually train her, or would we just make her have a cheat sheet and make her update her cheat sheet? if the former, how? if the latter, how?

god why am i so lost

how do you reason for things that feel genuinely out of your reasoning ability?
is there some sort of prompt that a human can follow?????

corbin stop thinking about ai things.
think about how the FUCK you would train a SALENAHUMAN to pass an ARC eval!!!!!!!!!!!!!!!!!!!!
let's set the constraints

can salena learn? or are we just gonna change her prompt + cheat sheet?
can we allow salena to do unsupervised or AI-supervised learning?

god the ironic thing about the pursuit for reasoning AGI is that you realize that your reasoning is so fucking bad too. or at least your creativity . or whatever. i just mean problem solving skills in general. the ironic thing about the pursuit for AGI that can problem-solve is that you realize that your own problem-solving is so fucking mortal and imperfect and obviously you cant even fucking solve THIS problem itself, the problem of inventing problem-solving agi.

how do you teach salena how to reason? there's a very similar irony here. the ironic thing about trying to teach salena how to problem-solve/reason is that you realize FUCK im so bad at problem-solving that i cant even solve THIS fucking problem itself... FUCK! and that's what salena must be feeling when she's trying to solve some math problem.

i need to figure out how to solve THIS problem, and then maybe i can teach salena how to solve the math problem.
again, how do you reason for things that feel genuinely out of your reasoning ability?

okay wait but . hold on let's not fall into a trap of bleh too quickly. if i was gonna teach salena how to do math, i would redo all of her understanding and make sure she gets the visual intuition, and then i would make her curious about math, and i'd make her do a mix of self-supervised practice (allow her to bang her head on the wall) and teacher-led practice (allow her to bang her head on the wall, but i give hints + tips + etc)
i feel like the secret is in that banging ur head against the wall. u can have a perfect teacher like the best most intuitive teacher in the whole wide world with the most intuitive easily understandable explanations, but EVEN AFTER THAT not everything will be immediately obvious. u still need to bang ur own head against ur own wall......... i think this is true.

okay thats cool and all and i guess i CAN teach salena how to do math. but how the FUCK do u teach salena to do smth like ARC????

cuz ARC isnt purely reasoning. math/physics psets are like . you can understand smth, and as long as u deeply understand it, then i feel like the solution just becomes obvious almost sorta kinda.

but ARC is like . some sort of creativity as well.. psets dont require much creativity/search. research problems require creativity/search (unlike most pset problems). and ARC requires creativity/search, albeit a much simpler kind.

how the fuck do u train to get better at divergent thinking that converges on the right thing????????????????????????????????????

like we dont even know how the fuck to do that.

this feels very Greatness Cannot Be Planned vibes...

=====

god . what i feel rn trying to solve this is what 4o/sonnet must feel like when theyre trying to solve ARC

how ironic

god there's just so many fucking questions

=====

what the fuck is my bottleneck rn

it's not that i dont know how to code the implementation of the simple thing i wanna do (i'm not at the coding stage of the process yet)

it's not that i don't know how the fuck to do RL (i'm not at the RL stage of the process yet, and i'm not even sure how much that is needed. i might need to learn it, but not now bc i wanna test other things first)

recall the plan at the top of this note... try out the simple salenahuman solutions first.
rn my bottleneck is that i dont know how the fuck i would teach a salenahuman to do this

=====

okay wait before entertaining a onepersonsalena approach, lets entertain a team of salenas doing evolutionary approach...

is it good to measure the evolutionary threads and then cut off the ones that arent meeting the benchmark? thats very Not how picbreeder worked . okay wait but i think the situations are different .. wait but how different are they

i mean arc has an actual objective (solve the puzzle), picbreeder is allowed to be not picky abt what the end state looks like
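
a toy sketch of the "measure the threads and cut off the weak ones" version, just to make the question concrete. heavy assumptions: a "thread" is a sequence of grid primitives, fitness is average per-cell accuracy on the training pairs, and each generation keeps only the top `keep` threads before extending them by one primitive. with no randomness this is honestly closer to a beam search than to real evolution, and the primitives and task are made up.

```python
from typing import Callable, Dict, List, Optional, Tuple

Grid = List[List[int]]
Pair = Tuple[Grid, Grid]
Program = Tuple[str, ...]

# hypothetical primitive set
PRIMITIVES: Dict[str, Callable[[Grid], Grid]] = {
    "flip_lr": lambda g: [row[::-1] for row in g],
    "flip_ud": lambda g: g[::-1],
    "transpose": lambda g: [list(col) for col in zip(*g)],
}


def run(program: Program, grid: Grid) -> Grid:
    """Apply each primitive in order."""
    for name in program:
        grid = PRIMITIVES[name](grid)
    return grid


def cell_accuracy(pred: Grid, target: Grid) -> float:
    """Fraction of matching cells; 0.0 if the shapes differ."""
    if len(pred) != len(target) or any(len(p) != len(t) for p, t in zip(pred, target)):
        return 0.0
    cells = [(p, t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    return sum(p == t for p, t in cells) / len(cells) if cells else 1.0


def fitness(program: Program, pairs: List[Pair]) -> float:
    """Average cell accuracy over all training pairs."""
    return sum(cell_accuracy(run(program, i), o) for i, o in pairs) / len(pairs)


def evolve(pairs: List[Pair], keep: int = 3, generations: int = 3) -> Optional[Program]:
    """Keep the top `keep` threads each generation, extend them, stop on a perfect fit."""
    population: List[Program] = [(name,) for name in PRIMITIVES]
    for _ in range(generations):
        ranked = sorted(population, key=lambda p: fitness(p, pairs), reverse=True)
        if fitness(ranked[0], pairs) == 1.0:
            return ranked[0]
        survivors = ranked[:keep]  # this is the "cut off the weak threads" step
        population = survivors + [p + (name,) for p in survivors for name in PRIMITIVES]
    best = max(population, key=lambda p: fitness(p, pairs))
    return best if fitness(best, pairs) == 1.0 else None


# hypothetical toy task: rotate the grid by 180 degrees
pairs = [
    ([[1, 2], [3, 4]], [[4, 3], [2, 1]]),
    ([[1, 0, 0]], [[0, 0, 1]]),
]
solution = evolve(pairs)
```

`keep` is the knob the picbreeder worry is about: a small `keep` prunes hard toward the benchmark, and can throw away a thread that scores badly now but would have been a stepping stone later.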

=====

GOSH WHY AM I SO CONFUSED I FEEL LIKE I HAVE BRAIN FOG AM I BEING DUMB

=====
