2024-12-23 misc2

December 24, 2024•379 words

understand constitutional ai approach

can i do RLAIF via prompting?
like it wont change the model but it will change the prompt
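
rough sketch of what "RLAIF via prompting" could look like: the model weights never change, only the prompt accumulates feedback from an AI critic. everything here is an assumption for illustration: `refine_prompt`, `toy_generate`, and `toy_critique` are made-up names, and the two toy functions are stand-ins for real LLM calls.

```python
from typing import Callable, List, Tuple

def refine_prompt(
    prompt: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], str],
    rounds: int = 3,
) -> Tuple[str, List[str]]:
    """Loop: generate, get AI feedback, append the feedback to the prompt.

    The model is untouched; only the prompt text grows.
    An empty critique string means "good enough, stop".
    """
    history: List[str] = []
    for _ in range(rounds):
        output = generate(prompt)
        feedback = critique(prompt, output)
        if not feedback:
            break
        history.append(feedback)
        prompt = prompt + "\nlesson: " + feedback
    return prompt, history

# toy stand-ins for LLM calls (hypothetical, only so the sketch runs)
def toy_generate(prompt: str) -> str:
    return "HELLO" if "uppercase" in prompt else "hello"

def toy_critique(prompt: str, output: str) -> str:
    return "" if output.isupper() else "answer in uppercase"

final_prompt, history = refine_prompt("say hello", toy_generate, toy_critique)
```

the "learning" lives entirely in `final_prompt`, so it can be thrown away or carried over per task.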

evolutionary approach is dumb if it's only evolutionary and you have set compute cutoffs. allow your time to be managed like a human would manage their time.
okay maybe im being too harsh to jeremy berman
evolutionary approach is actually pretty cool (picbreeder type beat) but i just feel like it's not needed for this problem i guess? i feel like it would be much more efficient to have some sort of chain of thought with a list of attempted subrules and attempted full transform functions.
and sampling is not that bad actually. we do sampling too. i just think it wouldve been better to sample more strategically. like sample subrules instead of the full transform function.

ugh wait but if we sample subrules then we have to generate and run a whole python thing for each one
okay wait but i still think it would be much more efficient than sampling 200 fullass transform functions actually
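
toy sketch of the "sample subrules, not full transform functions" idea: keep a list of attempted subrule chains and check each one against the training pairs. assumptions: the subrules here are hardcoded grid functions (`flip_h`, `flip_v`, `transpose`) instead of LLM-sampled ones, and `search` and `apply_chain` are made-up names, not anything from an existing ARC solution.

```python
from itertools import product

# toy subrules: tiny grid -> grid functions (hypothetical and hardcoded;
# in the real thing these would be proposed by the LLM)
def flip_h(g):
    return [row[::-1] for row in g]

def flip_v(g):
    return [row[:] for row in g[::-1]]

def transpose(g):
    return [list(col) for col in zip(*g)]

SUBRULES = {"flip_h": flip_h, "flip_v": flip_v, "transpose": transpose}

def apply_chain(names, grid):
    """Apply the named subrules left to right."""
    for name in names:
        grid = SUBRULES[name](grid)
    return grid

def search(train_pairs, max_depth=2):
    """Try subrule chains of growing length; record every attempt."""
    attempted = []
    for depth in range(1, max_depth + 1):
        for names in product(SUBRULES, repeat=depth):
            attempted.append(names)
            if all(apply_chain(names, x) == y for x, y in train_pairs):
                return names, attempted
    return None, attempted

# one training pair: the output is the input rotated 90 degrees clockwise
train_pairs = [([[1, 2], [3, 4]], [[3, 1], [4, 2]])]
found, attempted = search(train_pairs)
```

each subrule is a cheap function call here, and the `attempted` list doubles as the memory of what has already been tried.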

===

i should evolutionarily sample a bunch of different simple approaches first, and get a baseline for how things perform. just do a lot of fast experiments.

test out qwen 32b coding. efficacy + efficiency
test out vision models
20 questions, finetuning using only prompts, memory of guesses for this task, memory of guesses for all tasks, etc

exhaust the simple approaches first. dont get lost in math n complicated RL stuff. make the minimum simplest solution that gets us there

naive approach -> rlaif with prompting -> rlaif on reasoning?

===

remember, 7.2 mins per task
but surely it should take less than that

===

dude we just need an LLM or LLM system that is as smart as salena and as malleable as salena. then we'll be bing chilling

these surely must exist
if not, it surely must be finetunable to exist

gosh i rly need to learn how constitutional ai works and then try replicating it so i can see how i might use it for ARC.

ugh but also constitutional ai only works for a set of principles that you have beforehand. what if these principles are dynamic!!!

