2024-12-22 misc

it's crazy to think about how babies learn language . like Wtf the brain is just able to do that????

and humans' first learning experiences are like . model-free RL
isn't that fucking crazy . it's not model-based with explicit rules n whatever . it's fucking model-free deep RL...
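(a toy sketch of the distinction i mean here, in python -- the tiny line-world environment, rewards, and hyperparameters are all made up for illustration, not a claim about how brains actually do it:)

```python
import random
from collections import defaultdict

ACTIONS = ["left", "right"]

def step(state, action):
    """tiny deterministic 'environment': walk along a line, reward at state 5."""
    next_state = state + (1 if action == "right" else -1)
    next_state = max(0, min(5, next_state))
    reward = 1.0 if next_state == 5 else 0.0
    return next_state, reward

# model-free (Q-learning): never learns the dynamics, just updates value estimates
Q = defaultdict(float)
alpha, gamma, eps = 0.1, 0.9, 0.1

def model_free_step(state):
    """act epsilon-greedily, then do a TD update straight from experience -- no world model anywhere."""
    if random.random() < eps:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
    next_state, reward = step(state, action)
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    return next_state

# model-based: use a model of the dynamics (here just the true step fn) to plan ahead
def model_based_action(state, depth=3):
    """pick the action with the best simulated lookahead value."""
    def value(s, d):
        if d == 0:
            return 0.0
        return max(r + gamma * value(ns, d - 1)
                   for ns, r in (step(s, a) for a in ACTIONS))
    scores = {}
    for a in ACTIONS:
        ns, r = step(state, a)
        scores[a] = r + gamma * value(ns, depth - 1)
    return max(scores, key=scores.get)

if __name__ == "__main__":
    s = 0
    for _ in range(1000):
        s = model_free_step(s)       # learns purely from raw experience
        s = 0 if s == 5 else s       # reset the episode at the goal
    print("model-free Q:", dict(Q))
    print("model-based picks:", model_based_action(0))
```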

===

ive been thinking about how to copy human reasoning and then RL it the way a human teacher would -- but perhaps i should also think about how we can make reasoning emerge in the first place...
i mean, even with the former method we can go beyond human reasoning through unsupervised RL (just like any human student can surpass their teacher, same thing) -- but it could still be an interesting direction to study how reasoning emerges in the first place, not for performance but to rely less on manually preset algorithmic heuristics and more on natural emergence

how did humans start reasoning?

also i know ive alr talked about this before but i think a rly important set of questions is . how do you teach humans how to reason? how do you try to expand your reasoning when you're at the edge of your capabilities?

what is different between people who have different reasoning skill levels? is it just practice? how do you practice reasoning? how do you get better at reasoning?

what neural architecture enables reasoning? what neural architecture / evolutionary aspects allowed reasoning to emerge in the first place?

// check out the "Evolution of Reasoning" chat in project aria in chatgpt

===

there's the usual tradeoff of 10x training compute vs 15x test-time compute: either you go $10 million -> $100 million in training compute, or you go $2.00 per 1M tokens -> $30.00 per 1M tokens at inference. that's what is often said, but i feel like ppl forget that the training compute also lowk has to be priced into the consumer cost / API cost. so it's really more like $2.00 -> $20.00 vs $2.00 -> $30.00

at least i think.

EDIT 2024-12-23 16:51:33
wait am i actually dumb
forget what i said before
this is more like a yearly subscription vs a one-time lifetime payment
like . one big payment up front and then very very cheap ongoing payments, or one small payment up front and then big ongoing payments . they're just two lines of y = mx + b and they intersect somewhere
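(quick back-of-the-envelope in python for that intersection, using the made-up numbers from this note -- $10M vs $100M training, $2 vs $30 per 1M tokens -- and "amortize the training run evenly over tokens served" is my own assumption:)

```python
# toy break-even calc for the "two lines of y = mx + b" framing above.
# the dollar figures are just the ones in this note, not real model economics.

def total_cost(training_cost, price_per_1m_tokens, millions_of_tokens_served):
    """fixed training cost + marginal inference cost, i.e. y = b + m*x."""
    return training_cost + price_per_1m_tokens * millions_of_tokens_served

# option A: 10x the training compute up front, cheap inference after
A_fixed, A_price = 100_000_000, 2.00    # $100M training, $2.00 / 1M tokens
# option B: base training compute, 15x test-time compute at inference
B_fixed, B_price = 10_000_000, 30.00    # $10M training, $30.00 / 1M tokens

# intersection of the two lines: A_fixed + A_price*x = B_fixed + B_price*x
break_even = (A_fixed - B_fixed) / (B_price - A_price)
print(f"break-even at ~{break_even:,.0f} million tokens served")  # ~3,214,286 (~3.2T tokens)
print(total_cost(A_fixed, A_price, break_even))  # both come out to ~$106.4M
print(total_cost(B_fixed, B_price, break_even))
```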

===

also wait, why does o1 have a private scratchpad instead of sharing it and letting us talk to it during the process? that's like a human taking a question, isolating themself, thinking through and solving the whole thing, and only then showing the answer.

===

thoughts r not just words
if youre just using words then ur cooked
like . thats why everyone is cooked at math, bc they only have equation/language knowledge and they dont have the physical/visual intuition for the concept that the language actually represents

like yeah, LLMs and SalenaHumans can read only words and somehow encode that into some vector space, and it's sorta like a concept
but i hypothesize there's no way that LLMs or SalenaHumans actually have the ACTUAL CONCEPT encoded in a way that touches the PRE-SYMBOLIC space.... like . hmm how do i articulate this better .... yeah, we can sorta encode concepts as directions, like king - queen = man - woman ... but even all of that was trained on a dataset consisting of Only symbols.... so you're not extracting any Pre-symbolic knowledge
wait.... but the dataset with only symbols was Created by humans using Pre-symbolic knowledge .... hm.... i guess you can get some pre-symbolic knowledge from only symbolic input? (which is a common argument, idk, i guess i just forgot it for a bit / disagreed with it for a bit.) but it will be a very surface-level pre-symbolic knowledge.... right?
also, like .... is there any universe where, if salena ONLY studied how she studies rn (no visual intuition, only memorizing equations and practice problems, etc), she could ever gain a "true" intuitive pre-symbolic understanding? and also, would she ever be able to solve New problems? wait, is it even important to be able to solve new problems ..?
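(a small sketch of the "directions in vector space" thing above -- i'm assuming gensim's downloadable GloVe vectors here, any pretrained word embeddings would do; the point is that the direction is learned purely from symbol co-occurrence:)

```python
# the king - queen = man - woman direction, recovered from embeddings that were
# trained on nothing but text. whatever pre-symbolic grounding the humans who
# wrote that text had is only present indirectly.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")  # small pretrained GloVe embeddings

# king - man + woman ~= queen, i.e. king - queen ~= man - woman
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```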

TODO: finish 3b1b series so i can see his commentary on how llms store facts

===

i never explicitly wrote this down i think, but like . u cant just do brute-force search for things like o1, bc the search space is way too big . u need some sort of reasoning RL, just like humans
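(rough numbers for why brute search over token sequences is hopeless -- the vocab size and chain length here are made-up ballpark figures, not anything o1-specific:)

```python
# back-of-the-envelope: count how many possible reasoning chains exist.
import math

vocab_size = 50_000     # ballpark LLM vocabulary size
chain_length = 100      # tokens in a modest reasoning chain

log10_chains = chain_length * math.log10(vocab_size)
print(f"~10^{log10_chains:.0f} possible chains")   # ~10^470

# even pruning 99.99% of continuations at every step leaves ~5^100 ≈ 10^70 paths,
# so you need a learned policy (reasoning RL) to propose good steps, not brute search.
```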
