corbin

(an old blog) i was interested in ai reasoning for a bit >_< my thoughts were mostly naive, but it was rly interesting n fun trying to derive reasoning psych theory from first principles before diving into the nitty gritty stats/algos. these posts are straight from my notes / journal entries, with basically zero edits. check out listed.to/@corbin/58083 for an approx summary of my perspective rn. i am trying to tackle ARC-AGI and i think it would be crackable in a few weeks if i was someone like noam brown, but alas i am literally a noob (edit: openai's o3 has solved it pretty easily haha yay!). studying apma cs physics at brown

2024-12-27 yapping to friend about arc

this is a monologue im gonna text u things bc writing it out to another person helps me process my thoughts better than just writing it to myself but u dont have to respond like u can fully just not respond at all like i can just pretend u r reading these okay so . ive been trying to flesh out a program for ARC . and its rly crazy how salena is such a good analogy for an LLM like . she hallucinates just like an LLM (e.g. she confidently does the wrong equation) she litera...
Read post

2024-12-21 yapping to friend about o1/o3/search

this is a monologue so the way i like thinking abt it is models like 4o, claude sonnet, etc .. their pattern matching / interpolation / wtvr is all prettyyyy analogous to human type 1 thinking .. like imagine if u were asked a question and you gave zero conscious thought about it and just immediately spewed out words based on intuition (the analogy is not perfect but i think its pretty good so ill continue using it here) humans are actually pretty bad at type 1 thinking for things like...
Read post

2024-12-25 at 14:10 summarizing past notes and what to do going forward

i have gone through my past notes and this note is me copypasting excerpts that i wanna keep in mind, trying to inform what to do rn / going forward this note is unfinished (but i guess all of my notes on this blog are unfinished n messy anyway lol so wtvr) ===== 2024-12-24 at 18:01 okay i know ive been getting Slightly more into learning about ai, and i want to make sure im not making my ARC solution hypotheses too complex — i.e. i dont want the equivalent of feature creep i wanna try the ...
Read post

2024-12-25 at 02:13

wait dude. the thing is . humans r ALWAYS doing test-time training. it's not like training -> static level of performance... even our performance is training. i mean i guess Yes we can train and then have some static level of performance that we stagnate at, but even that performance is still solidifying smth (perhaps eg ur current bad habits in ur technique) ...
Read post

2024-12-23 misc3

if u dont understand smth u need to know that u dont fully understand it and that youve blackboxed it etc how does a good human reasoner learn that? and then how do we update our beliefs/understanding after learning it === u also need to learn the threshold of blackboxing, and the threshold of how blackboxed of a tool can u still be satisfied using how does a good human reasoner learn that? === id like an ai to be able to go through Purcell n Morin EM textbook, do all the practice problems...
Read post

2024-12-23 misc2

understand constitutional ai approach can i do RLAIF via prompting? like it wont change the model but it will change the prompt evolutionary approach is dumb if it's only evolutionary and you have set compute cutoffs. allow your time to be managed like a human would manage their time. okay maybe im being too harsh to jeremy berman evolutionary approach is actually pretty cool (picbreeder type beat) but i just feel like it's not needed for this problem i guess? i feel like it would be much more...
Read post

2024-12-22 at 17:38 more notes on Parables on the Power of Planning in AI (noam brown)

And this was, at the time, state of the art for predicting human moves in chess. Now, one thing that's really interesting about MAIA is that for high Elo models, it was about 100 to 300 points -- Elo points -- below the target Elo rating. So if you were to train it on 2,000 Elo-rated humans, it would only be about 1,700 Elo. For the lower Elo ratings, this ended up not being a problem. For the higher Elo ratings, it was a challenge. Now, one hypothesis for why this is the c...
Read post

2024-12-22 at 16:21 3b1b

cost function is average over all examples backprop gives you gradient of C(w1, w2, ...) (how?) but to calculate that youd need all examples instead, we use only a few examples at a time then calculate not the exact gradient but instead a Stochastic Gradient using backprop using those few examples . this makes sense bc it's also how humans learn . we dont need to retrain on 50000 examples before adjusting our strategy/intuition/wtvr (whether consciously or subconsciously) . we adjust as we go, ...
Read post
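a quick sketch of the minibatch idea from the note above, in numpy (a toy 1-parameter linear model — my own example, not anything from the 3b1b video itself):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy dataset: y = 3x plus noise; the "true" parameter is w = 3
x = rng.normal(size=50_000)
y = 3.0 * x + 0.1 * rng.normal(size=50_000)

def full_gradient(w):
    # exact gradient of the cost C(w) = mean((w*x - y)^2) over ALL 50k examples
    return np.mean(2 * (w * x - y) * x)

def stochastic_gradient(w, batch_size=32):
    # noisy estimate of the same gradient from a small random batch
    idx = rng.integers(0, len(x), size=batch_size)
    return np.mean(2 * (w * x[idx] - y[idx]) * x[idx])

# SGD: adjust as we go, one small batch at a time,
# instead of looking at all 50k examples before each step
w = 0.0
for _ in range(2000):
    w -= 0.01 * stochastic_gradient(w)

print(w)  # close to 3.0
```

each step's gradient is wrong (it's noisy), but it's cheap, and the noise averages out over many steps — which is exactly the "we dont need to retrain on 50000 examples before adjusting" point.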

2024-12-22 misc

it's crazy to think about how babies learn language . like Wtf the brain is just able to do that???? and humans' first learning experiences are like . model free RL isnt that fucking crazy . its not model based with rules n wtvr . its fucking model free deep RL... === ive been thinking about how to copy human reasoning and RL it like a human teacher -- but perhaps also think about how can we make reasoning emerge in the first place... i mean, even with the former method we can go beyond human r...
Read post

2024-12-22 at 02:07

i feel like i wouldnt have gotten most of these ideas if o1 didnt exist yet to inspire me ...
Read post

2024-12-22 at 00:03

for things like tennis n rock climbing, perhaps the partial derivatives are approximately only dependent on that variable like, at any point, you can improve any part of your technique and expect a reasonable gain in performance, and it doesn't rly matter what order you do this in, because the improvement from URGH im gonna stop explaining it in english i already understand it . whatever. a little more formally, let's say you have a cost function C which depends on factors x, y, ... let's say ...
Read post
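trying to say the same thing a little more formally (my notation, roughly what the note is gesturing at): if the cost is approximately separable across the factors, the partials decouple and the order of improvements doesn't matter.

```latex
% cost approximately separable across the factors:
C(x, y, \dots) \approx f(x) + g(y) + \cdots
% so each partial depends only on its own variable:
\frac{\partial C}{\partial x} \approx f'(x), \qquad
\frac{\partial C}{\partial y} \approx g'(y)
% and the mixed partials vanish, i.e. improving x doesn't
% change the gradient w.r.t. y, so improvement order doesn't matter:
\frac{\partial^2 C}{\partial x \, \partial y} \approx 0
```

for tennis n climbing this seems roughly true (footwork gains don't ruin your grip gains); for smth like Life Happiness the cross terms are huge, which is maybe why it feels so much harder to make progress on.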

2024-12-21 at 21:14

if we want ai to make serendipitously great discoveries we need it to play and that means we need “useless” ai agents that “waste” compute ...
Read post

2024-12-21 at 20:55

remember the thing shane said about adding more params just allows us to gradient descent even more instead of hitting a minimum and it seems to work rly rly rly well its rly interesting bc once i was talking w jason and i thought more params = harder to find gradient and less params = rly easy to find gradient we were comparing Life Happiness vs Tennis, and we made that analogy ... happiness is very hard to make progress on, tennis is very easy to make progress on.. (or at least it seems like ...
Read post

2024-12-21 at 20:45

BFS vs DFS play shows bfs ...
Read post

2024-12-21 at 20:30

wait humans r extremely bad at type 1 thinking but also sometimes we get a kamikaze divine gift and even more than sometimes we also somehow r rly good at connecting things or wtvr theres some sort of creativity process, ykwim? so cool ...
Read post

2024-12-21 at 18:54

learning by association vs learning by gradient descent etc ??? learning by RL , ai vs humans ??? ...
Read post

2024-12-17 at 16:09

why is everyone being weird about scaling test-time compute vs training compute https://x.com/ClementDelangue/status/1868740932251844806 or maybe thats just tweakers on AI twitter and not ppl in actual research communities === also im curious how the phenomenon of "for every 10x increase in training compute, we decrease 15x in inference compute" maps onto humans re: https://yellow-apartment-148.notion.site/AI-Search-The-Bitter-er-Lesson-44c11acd27294f4495c3de778cd09c8d "Moreover, the brillia...
Read post

2024-12-16 at 16:01

why are we not worried about human interpretability why is there no worry about human superintelligence like . why is there the assumption of . AI becomes compoundingly infinitely smart right after we get reasoning is it just bc . humans are limited by their speed and AI is assumed to have the ability to scale speed+knowledge with enough compute? also .. i'm realizing how human reasoning is so slow n unoptimal actually compared to the ideal that we sometimes think it is also . even if we get ...
Read post

2024-12-15 at 06:21

wait humans r actually so bad at first principles n reasoning for super open ended questions like we use so many heuristics and it just rly depends if ur lucky that ur heuristics work out and u only change heuristics when ur rly sure they dont work out and also emotions change ur rationalization so easily like . you cant verify each step from first principles for an emotional problem .. u can try but its impossible to really have some axiomatic base that u can build up from . all axioms in p...
Read post

2024-12-12 at 20:02

how do you reason for things that feel genuinely out of your reasoning ability? ...
Read post

2024-12-12 at 01:20

we need a tiny LLM to be trained on the brainstorming, not the output actually this is smth that i alr said but . idk i forgot it then rederived it the brainstorming is the actual interesting part the output is something that we can easily calculate using python we need to train the brainstorming to have that sort of . "oh we know how iq tests work and i know how to sorta kinda reason through it like a human sorta kinda reasons through it" ...
Read post

2024-12-11 at 20:03

i dont think i wrote abt this yet but . i remember when reading the Voyager minecraft thing and how they used embeddings of past solutions to help current/future solutions, i remember i was like "oh i've been thinking of that" (but i forgot what Exactly i was actually originally thinking and how i originally wanted to go about it, so now the earliest version i actually remember is just the voyager implementation) ========== constitutional ai not simple RL but like nuanced RL could we implement...
Read post

2024-12-11 at 03:51

summary of convo w shane today i met this guy beside the pool tables in faunce while i was talking about Hidden Markov Models w moses and then he chimed in and started teaching us about HMMs and then i asked him what year he was and he said he's a phd student and i asked him what research and he said ai rl and so we started talking about ai here's his website https://sparr.io/ double descent penalize large coeffs language changes perception not bc of the language but bc the language is a sc...
Read post
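the "penalize large coeffs" thing from this convo is basically ridge regression — a tiny sketch (my own toy data, not anything shane actually showed):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy: 20 examples, 10 features, only the first feature actually matters
X = rng.normal(size=(20, 10))
true_w = np.zeros(10)
true_w[0] = 2.0
y = X @ true_w + 0.1 * rng.normal(size=20)

def ridge_fit(X, y, lam):
    # minimize ||Xw - y||^2 + lam * ||w||^2  (closed-form solution)
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

w_plain = ridge_fit(X, y, lam=0.0)    # ordinary least squares
w_ridge = ridge_fit(X, y, lam=10.0)   # large coefficients penalized

# the penalty shrinks the weights overall
print(np.linalg.norm(w_ridge) < np.linalg.norm(w_plain))  # True
```

the penalty term drags every coefficient toward zero, so the model only keeps a coefficient large if the data rly insists on it — a crude way of preferring simpler explanations.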

2024-12-10 at 21:47

instead of providing the reasoner with a human-built cheat sheet / human-curated guidelines / etc, we want to allow the reasoner to struggle a lot we can give the reasoner tips on how to reason, but not shortcuts to reason same thing with training humans! like if we were teaching salena, we wouldnt wanna give her human-curated tips like "oh try noticing connected blocks, and try shearing, then try symmetry, then try rotation, then try [...]" we would wanna let her struggle through it, but give...
Read post

2024-12-09 at 19:20

humans learn not from repeated correct examples but from the edge cases of mistakes (wait we also do learn from repeating correct things like w spaced repetition, right? and also w reviewing the same practice problem ... oh wait but . perhaps that is also still actually traveling the edge case of mistakes, bc otherwise tbh we just skip the example when we're studying) eg dekeyser's skill building oh wait i guess this is what RL does (?) ========== also why havent we created a multimodal mode...
Read post

2024-12-09 at 14:22

arxiv paper that i found via youtube shorts yt = https://youtube.com/shorts/ZlvdInrdAYE arxiv link = https://arxiv.org/abs/2309.05660 kinda surprised that this was published late 2023 and presented in mid 2024 . also this is literally not discrete program search, right? but in the yt short, chollet says it is ? this is a naive implementation of literally what i was thinking about a few days ago (re: "brainstorming") i guess they were just trying to demonstrate base level capabilities, and per...
Read post

2024-12-09 at 13:31

i wanna be curious and just explore whatever but like also i dont wanna waste time,, i wanna actually be fast and speedy and actually catch up to things so that i can actually think smart about these things and actually start experimenting on novel things quickly, like ofc it's very useful to just sit around and think about hypotheses and we need that to get New Creative Ideas (re: greatness cannot be planned, etc) and that is very intangible but i also wanna be able to respond to tangible measu...
Read post

2024-12-06 at 03:41 mimi import

active inference how can an ai eventually do divergent thinking and connecting two seemingly unrelated things? how do WE interpolate for that? r we only capable of interpolation and our extrapolation is just built from interpolation? ...
Read post

2024-12-06 at 12:09 mimi import

humans r so dumb but we have 999999x distributed compute n teamwork ...
Read post

2024-12-08 at 13:31 mimi import

when youve honed your reasoning you can sorta Feel if an equation is off that feeling is an action tugging (just like how u feel a word on the tip of ur tongue or how you feel) (i rly like model of . viewing feelings as action tuggings … its not perfect but i like it) and thats bc your Type 2 reasoning is built using Type 1, and Type 1 is multimodal whose teleology (and thus ontology) is to control Actions how did humans figure out how to reason ? maybe instead of teaching a model how to reas...
Read post