2024-12-27 yapping to friend about arc
this is a monologue im gonna text u things bc writing it out to another person helps me process my thoughts better than just writing it to myself but u dont have to respond like u can fully just not respond at all like i can just pretend u r reading these okay so . ive been trying to flesh out a program for ARC . and its rly crazy how salena is such a good analogy for an LLM like . she hallucinates just like an LLM (e.g. she confidently does the wrong equation) she litera...
Read post
2024-12-21 yapping to friend about o1/o3/search
this is a monologue so the way i like thinking abt it is models like 4o, claude sonnet, etc .. their pattern matching / interpolation / wtvr is all prettyyyy analogous to human type 1 thinking .. like imagine if u were asked a question and you gave zero conscious thought about it and just immediately spewed out words based on intuition (the analogy is not perfect but i think its pretty good so ill continue using it here) humans are actually pretty bad at type 1 thinking for things like...
Read post
2024-12-25 at 14:10 summarizing past notes and what to do going forward
i have gone through my past notes and this note is me copypasting excerpts that i wanna keep in mind, trying to inform what to do rn / going forward this note is unfinished (but i guess all of my notes on this blog are unfinished n messy anyway lol so wtvr) ===== 2024-12-24 at 18:01 okay i know ive been getting Slightly more into learning about ai, and i want to make sure im not making my ARC solution hypotheses too complex — i.e. i dont want the equivalent of feature creep i wanna try the ...
Read post
2024-12-25 at 02:13
wait dude. the thing is . humans r ALWAYS doing test-time training. it's not like training -> static level of performance... even our performance is training. i mean i guess Yes we can train and then have some static level of performance that we stagnate at, but even that performance is still solidifying smth (perhaps eg ur current bad habits in ur technique) ...
Read post
2024-12-23 misc3
if u dont understand smth u need to know that u dont fully understand it and that youve blackboxed it etc how does a good human reasoner learn that? and then how do we update our beliefs/understanding after learning it === u also need to learn the threshold of blackboxing, and the threshold of how blackboxed of a tool can u still be satisfied using how does a good human reasoner learn that? === id like an ai to be able to go through Purcell n Morin EM textbook, do all the practice problems...
Read post
2024-12-23 misc2
understand constitutional ai approach can i do RLAIF via prompting? like it wont change the model but it will change the prompt evolutionary approach is dumb if it's only evolutionary and you have set compute cutoffs. allow your time to be managed like a human would manage their time. okay maybe im being too harsh to jeremy berman evolutionary approach is actually pretty cool (picbreeder type beat) but i just feel like it's not needed for this problem i guess? i feel like it would be much more...
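(a minimal sketch of the "RLAIF via prompting" idea: hill-climb the prompt w a critic's score, zero weight updates. the critic here is a toy stand-in -- a real one would be a model call judging against a constitution -- and the mutation list is made up:)

```python
import random

rng = random.Random(0)

# toy stand-in for an AI critic; real version = a model call scoring the
# prompt's outputs against a constitution
def critic_score(prompt: str) -> float:
    return ("example:" in prompt) + ("step by step" in prompt) - len(prompt) / 500

MUTATIONS = [" think step by step.", " example: [worked case].", " be concise."]

def mutate(prompt: str) -> str:
    return prompt + rng.choice(MUTATIONS)

# evolve the PROMPT, not the weights: keep a revision only if the critic
# prefers it, so the feedback accumulates in the prompt instead of the model
prompt = "solve the puzzle."
for _ in range(50):
    candidate = mutate(prompt)
    if critic_score(candidate) > critic_score(prompt):
        prompt = candidate

print(prompt)
```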
Read post
2024-12-22 at 17:38 more notes on Parables on the Power of Planning in AI (noam brown)
And this was, at the time, state of the art for predicting human moves in chess. Now, one thing that's really interesting about MAIA is that for high Elo models, it was about 100 to 300 Elo points below the target Elo rating. So if you were to train it on 2,000 Elo-rated humans, it would only be about 1,700 Elo. For the lower Elo ratings, this ended up not being a problem. For the higher Elo ratings, it was a challenge. Now, one hypothesis for why this is the c...
Read post
2024-12-22 at 16:21 3b1b
cost function is average over all examples backprop gives you gradient of C(w1, w2, ...) (how?) but to calculate that youd need all examples instead, we use only a few examples at a time then calculate not the exact gradient but instead a Stochastic Gradient using backprop using those few examples . this makes sense bc it's also how humans learn . we dont need to retrain on 50000 examples before adjusting our strategy/intuition/wtvr (whether consciously or subconsciously) . we adjust as we go, ...
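(a minimal sketch of the full-batch vs minibatch thing in numpy -- the toy data, cost, n learning rate are all made up for illustration:)

```python
import numpy as np

# toy data: fit y = w*x with squared-error cost averaged over all examples
rng = np.random.default_rng(0)
x = rng.normal(size=50_000)
y = 3.0 * x + rng.normal(scale=0.1, size=50_000)

def grad(w, xb, yb):
    # gradient of mean((w*x - y)^2) with respect to w
    return 2.0 * np.mean((w * xb - yb) * xb)

w, lr = 0.0, 0.1
for step in range(200):
    # stochastic gradient: estimate the gradient from a 32-example minibatch
    # instead of all 50,000 examples, then adjust immediately n move on
    idx = rng.integers(0, len(x), size=32)
    w -= lr * grad(w, x[idx], y[idx])

print(w)  # ≈ 3.0, without ever computing the exact full-dataset gradient
```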
Read post
2024-12-22 misc
it's crazy to think about how babies learn language . like Wtf the brain is just able to do that???? and humans' first learning experiences are like . model free RL isnt that fucking crazy . its not model based with rules n wtvr . its fucking model free deep RL... === ive been thinking about how to copy human reasoning and RL it like a human teacher -- but perhaps also think about how can we make reasoning emerge in the first place... i mean, even with the former method we can go beyond human r...
Read post
2024-12-22 at 02:07
i feel like i wouldnt have gotten most of these ideas if o1 didnt exist yet to inspire me ...
Read post
2024-12-22 at 00:03
for things like tennis n rock climbing, perhaps the partial derivatives are approximately only dependent on that variable like, at any point, you can improve any part of your technique and expect a reasonable gain in performance, and it doesn't rly matter what order you do this in, because the improvement from URGH im gonna stop explaining it in english i already understand it . whatever. a little more formally, let's say you have a cost function C which depends on factors x, y, ... let's say ...
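(my guess at the formalization this note was headed toward: if the cost separates additively across factors, each partial only depends on its own variable:)

```latex
C(x, y, \dots) = f(x) + g(y) + \cdots
\qquad\Longrightarrow\qquad
\frac{\partial C}{\partial x} = f'(x), \quad
\frac{\partial C}{\partial y} = g'(y), \ \dots
```

so improving any one factor pays off the same no matter where the others currently are, and order doesn't matter. compare a coupled cost like C(x, y) = xy, where ∂C/∂x = y: there the payoff from working on x depends entirely on where y is.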
Read post
2024-12-21 at 21:14
if we want ai to make serendipitously great discoveries we need it to play and that means we need “useless” ai agents that “waste” compute ...
Read post
2024-12-21 at 20:55
remember the thing shane said about adding more params just allows us to gradient descent even more instead of getting stuck at a minimum and it seems to work rly rly rly well its rly interesting bc once i was talking w jason and i thought more params = harder to find gradient and less params = rly easy to find gradient we were comparing Life Happiness vs Tennis, and we made that analogy ... happiness is very hard to make progress on, tennis is very easy to make progress on.. (or at least it seems like ...
Read post
2024-12-21 at 20:45
BFS vs DFS play shows bfs ...
Read post
2024-12-21 at 20:30
wait humans r extremely bad at type 1 thinking but also sometimes we get a kamikaze divine gift and even more than sometimes we also somehow r rly good at connecting things or wtvr theres some sort of creativity process, ykwim? so cool ...
Read post
2024-12-21 at 18:54
learning by association vs learning by gradient descent etc ??? learning by RL , ai vs humans ??? ...
Read post
2024-12-17 at 16:09
why is everyone being weird about scaling test-time compute vs training compute https://x.com/ClementDelangue/status/1868740932251844806 or maybe thats just tweakers on AI twitter and not ppl in actual research communities === also im curious how the phenomenon of "for every 10x increase in training compute, we decrease 15x in inference compute" maps onto humans re: https://yellow-apartment-148.notion.site/AI-Search-The-Bitter-er-Lesson-44c11acd27294f4495c3de778cd09c8d "Moreover, the brillia...
Read post
2024-12-16 at 16:01
why are we not worried about human interpretability why is there no worry about human superintelligence like . why is there the assumption of . AI becomes compoundingly infinitely smart right after we get reasoning is it just bc . humans are limited by their speed and AI is assumed to have the ability to scale speed+knowledge with enough compute? also .. i'm realizing how human reasoning is so slow n unoptimal actually compared to the ideal that we sometimes think it is also . even if we get ...
Read post
2024-12-15 at 06:21
wait humans r actually so bad at first principles n reasoning for super open ended questions like we use so many heuristics and it just rly depends if ur lucky that ur heuristics work out and u only change heuristics when ur rly sure they dont work out and also emotions change ur rationalization so easily like . you cant verify each step from first principles for an emotional problem .. u can try but its impossible to really have some axiomatic base that u can build up from . all axioms in p...
Read post
2024-12-12 at 20:02
how do you reason for things that feel genuinely out of your reasoning ability? ...
Read post
2024-12-12 at 01:20
we need a tiny LLM to be trained on the brainstorming, not the output actually this is smth that i alr said but . idk i forgot it then rederived it the brainstorming is the actual interesting part the output is something that we can easily calculate using python we need to train the brainstorming to have that sort of . "oh we know how iq tests work and i know how to sorta kinda reason through it like a human sorta kinda reasons through it" ...
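(a minimal sketch of the data-prep side of this, w hypothetical "task" / "brainstorm" / "answer" record fields -- the point is just that the supervision target is the messy trace, not the final grid:)

```python
# hypothetical logged traces; irl these would come from recorded
# brainstorming sessions on ARC-style tasks
dataset = [
    {
        "task": "blue cells become red, grid otherwise unchanged",
        "brainstorm": "hmm colors swap? check the counts.. yep, a palette map",
        "answer": "[[2,0],[0,2]]",
    },
]

def to_training_example(record: dict) -> dict:
    # train the tiny LLM on the brainstorm itself -- the final answer is
    # cheap to recompute w python once the reasoning is right
    prompt = f"task:\n{record['task']}\n\nthink out loud:\n"
    return {"input": prompt, "target": record["brainstorm"]}

examples = [to_training_example(r) for r in dataset]
```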
Read post
2024-12-11 at 20:03
i dont think i wrote abt this yet but . i remember when reading the Voyager minecraft thing and how they used embeddings of past solutions to help current/future solutions, i remember i was like "oh i've been thinking of that" (but i forgot what Exactly i was actually originally thinking and how i originally wanted to go about it, so now the earliest version i actually remember is just the voyager implementation) ========== constitutional ai not simple RL but like nuanced RL could we implement...
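(a minimal sketch of the voyager-style retrieval idea -- embed() here is a toy stand-in for whatever embedding model you'd actually call:)

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # placeholder hash-based bag-of-words embedding, purely for illustration
    v = np.zeros(256)
    for tok in text.lower().split():
        v[hash(tok) % 256] += 1.0
    return v / (np.linalg.norm(v) + 1e-9)

library = []  # (embedding, description, solution) from past tasks

def add_solution(description: str, solution: str):
    library.append((embed(description), description, solution))

def retrieve(query: str, k: int = 3):
    # nearest past solutions by cosine similarity, to seed the new attempt
    q = embed(query)
    scored = sorted(library, key=lambda e: -float(e[0] @ q))
    return [(d, s) for _, d, s in scored[:k]]

add_solution("recolor every blue cell red", "def solve(g): ...")
print(retrieve("swap the colors in the grid", k=1))  # feed into the prompt
```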
Read post
2024-12-11 at 03:51
summary of convo w shane today i met this guy beside the pool tables in faunce while i was talking about Hidden Markov Models w moses and then he chimed in and started teaching us about HMMs and then i asked him what year he was and he said he's a phd student and i asked him what research and he said ai rl and so we started talking about ai here's his website https://sparr.io/ double descent penalize large coeffs language changes perception not bc of the language but bc the language is a sc...
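(pinning down one line of that list: "penalize large coeffs" is, i believe, just ridge regression -- minimal numpy sketch:)

```python
import numpy as np

def ridge(X: np.ndarray, y: np.ndarray, lam: float) -> np.ndarray:
    # least squares plus a penalty on coefficient size:
    # w = argmin ||Xw - y||^2 + lam*||w||^2  =>  w = (X'X + lam*I)^-1 X'y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = X @ rng.normal(size=20) + rng.normal(scale=0.1, size=100)

# the penalty shrinks the coefficients (prints True)
print(np.linalg.norm(ridge(X, y, 10.0)) < np.linalg.norm(ridge(X, y, 0.0)))
```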
Read post
2024-12-10 at 21:47
instead of providing the reasoner with a human-built cheat sheet / human-curated guidelines / etc, we want to allow the reasoner to struggle a lot we can give the reasoner tips on how to reason, but not shortcuts to reason same thing with training humans! like if we were teaching salena, we wouldnt wanna give her human-curated tips like "oh try noticing connected blocks, and try shearing, then try symmetry, then try rotation, then try [...]" we would wanna let her struggle through it, but give...
Read post
2024-12-09 at 19:20
humans learn not from repeated correct examples but from the edge cases of mistakes (wait we also do learn from repeating correct things like w spaced repetition, right? and also w reviewing the same practice problem ... oh wait but . perhaps that is also still actually traveling the edge case of mistakes, bc otherwise tbh we just skip the example when we're studying) eg dekeyser's skill building oh wait i guess this is what RL does (?) ========== also why havent we created a multimodal mode...
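(a toy sketch of the "edge cases of mistakes" loop -- a threshold classifier that only ever updates on points it currently gets wrong; the data n update rule are made up:)

```python
import numpy as np

rng = np.random.default_rng(0)

# toy task: true label is (x > 0.7); the model is a learned threshold w
w = 0.0
data = [(x, float(x > 0.7)) for x in rng.uniform(0, 1, 500)]

for _ in range(20):
    # keep ONLY the mistakes -- repeating already-correct examples is
    # skipped, matching the note
    hard = [(x, y) for x, y in data if float(x > w) != y]
    if not hard:
        break
    # nudge the threshold toward the mean of the misclassified points
    w += 0.5 * (np.mean([x for x, _ in hard]) - w)

print(w)  # ≈ 0.7: the boundary is learned entirely from the mistakes
```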
Read post
2024-12-09 at 14:22
arxiv paper that i found via youtube shorts yt = https://youtube.com/shorts/ZlvdInrdAYE arxiv link = https://arxiv.org/abs/2309.05660 kinda surprised that this was published late 2023 and presented in mid 2024 . also this is literally not discrete program search, right? but in the yt short, chollet says it is ? this is a naive implementation of literally what i was thinking about a few days ago (re: "brainstorming") i guess they were just trying to demonstrate base level capabilities, and per...
Read post
2024-12-09 at 13:31
i wanna be curious and just explore whatever but like also i dont wanna waste time,, i wanna actually be fast and speedy and actually catch up to things so that i can actually think smart about these things and actually start experimenting on novel things quickly, like ofc it's very useful to just sit around and think about hypotheses and we need that to get New Creative Ideas (re: greatness cannot be planned, etc) and that is very intangible but i also wanna be able to respond to tangible measu...
Read post
2024-12-06 at 03:41 mimi import
active inference how can an ai eventually do divergent thinking and connecting two seemingly unrelated things? how do WE interpolate for that? r we only capable of interpolation and our extrapolation is just built from interpolation? ...
Read post
2024-12-06 at 12:09 mimi import
humans r so dumb but we have 999999x distributed compute n teamwork ...
Read post
2024-12-08 at 13:31 mimi import
when youve honed your reasoning you can sorta Feel if an equation is off that feeling is an action tugging (just like how u feel a word on the tip of ur tongue or how you feel) (i rly like the model of . viewing feelings as action tuggings … its not perfect but i like it) and thats bc your Type 2 reasoning is built using Type 1, and Type 1 is multimodal whose teleology (and thus ontology) is to control Actions how did humans figure out how to reason ? maybe instead of teaching a model how to reas...
Read post
2024-12-08 at 22:20 mimi import
train type 1 w type 2 (as in at the same time, just like how we train the word embedding matrix with the rest of the model), but type 2 is constructed from type 1 tho one new sorta reason ive thought of that it’s good to train type 1 and type 2 together is . this combined training will also determine how type 2 is constructed from type 1, and otherwise youre setting a sort of demarcation that is probably rly naive, but if u allow the demarcation to happen on its own then it follows how humans l...
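(the embedding-matrix analogy, concretely: a minimal pytorch sketch where the lookup table is trained jointly w the layers on top, so gradients decide the division of labor instead of a hand-set demarcation:)

```python
import torch
import torch.nn as nn

# the embedding table is a parameter like any other: gradients from the
# layers built on top of it flow back into it, shaping both together
model = nn.Sequential(
    nn.Embedding(num_embeddings=1000, embedding_dim=32),  # learned lookup
    nn.Flatten(),            # (batch, 8, 32) -> (batch, 256)
    nn.Linear(32 * 8, 2),    # the part constructed ON TOP of the embeddings
)

opt = torch.optim.Adam(model.parameters())  # one optimizer over both parts
tokens = torch.randint(0, 1000, (16, 8))    # 16 sequences of 8 token ids
labels = torch.randint(0, 2, (16,))

loss = nn.functional.cross_entropy(model(tokens), labels)
loss.backward()  # gradients reach the embedding table too
opt.step()
```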
Read post
2024-12-09 at 13:12
tokenization ...
Read post
2024-12-05 at 20:18 Untitledaksdjfhdas
i swear AI research is like a profitable tweaker's hobby.. like sure it's Valuable and you are contributing value to ppl. but like . ppl do it moreso for the prestige and the thrill (or at least me) . and i would feel more of the Contribution Feeling if i was doing smth like . actually personal . at least i think . ...
Read post
2024-12-07 at 17:27 Untitled 5
do we only interpolate? how do we interpolate on ideas how limited are LLMs? bc they do not interpolate on ideas wait but what if they Do interpolate on ideas . like what if with 170B params it actually interpolates on ideas in a similar way as us bc like . we are only trained on our output not our hidden layers (ideas) right ...? i mean sorta but also no. okay but with next word prediction it obviously does not interpolate on an Entire idea (or maybe it is?? and maybe our sentence generation...
Read post
2024-12-05 at 20:18 Untitled 3
ah . god im realizing how new i am to thinking about AI because im making many silly mistakes. but its okay. playing devil's advocate though: how do we make sure it doesn't optimize for: (1) taking shortcuts that look efficient but aren't actually good reasoning (2) finding patterns in the training problems rather than learning general reasoning (3) memorizing solution templates instead of actually reasoning oh wow all of these are problems we have with Humans too... (note to self: 2 and 3 are basica...
Read post
2024-12-07 at 17:27 Untitled 7
wait i swear ARC AGI tasks are interpolative on the goal but then you need a reasoning thing to try out the solutions at test time just like we do ========== if humans are strictly interpolative but we use that to do extrapolation, we have to ask how that happens rn LLMs are surface level interpolative but extremely good at doing surface level interpolation across superhuman amounts of data i think there's something about . humans interpolate on a much finer resolution, like on a much finer...
Read post
2024-12-05 at 14:44
need to figure out difference betw LLMs and smth like alphago has anyone done the equivalent of like . training a tiny LLM/NN to use a tiny LLM and how different is that from smth like . training a NN to play go ========== also . like humans, will it need to start from scratch for each new skill ? like will it just come out of the box sorta tabula rasa about how to reason in that skill ? Hm Perhaps but also surely some things will carry over from prev skills like humans for agi to be agi ....
Read post
2024-12-23 misc
LLMs are too confident and they need to not be they need to know their limits and they need to know to what threshold they can make intuitive jumps vs need to break it into smaller parts good human reasoners do this but actually this is not a closed problem for humans too like . for something like doing the problem of 572 * 205 ... we know we cant immediately do it so we don't just immediately say "oh the answer is 170380" or wtvr. we have to break it up into like 572 * 2 * 100 + 572 * 10 / 2 ...
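(the decomposition written out -- both routes land on the same value, and note how far off the confident type-1 guess above is:)

```latex
572 \times 205
= \underbrace{572 \times 2 \times 100}_{572 \times 200}
+ \underbrace{\tfrac{572 \times 10}{2}}_{572 \times 5}
= 114{,}400 + 2{,}860
= 117{,}260
```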
Read post