Human Compatible by Stuart Russell book notes

Inventing super-intelligent AI will be the biggest event in the future of humanity. It's unclear when it will arrive: many experts say around 2050, while a more conservative estimate is closer to 2100.

AI is already having a big impact on human activity. One example is Facebook's content selection algorithms in the 2016 elections. Facebook makes money when someone clicks on content, and people with extreme views are more likely to click on extreme content. Facebook's algorithm therefore ends up promoting extreme links and videos, since moderate content gets fewer clicks. These algorithms could contribute to a resurgence of fascism. Would Trump have been elected without these social media platforms?
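
As a toy illustration of that incentive (my own sketch, not Facebook's actual system; the item names and click probabilities are made up), a selector that optimizes only for observed click-through rate ends up showing extreme content far more often than moderate content:

```python
from collections import defaultdict
import random

items = ["moderate_article", "extreme_video", "extreme_link"]

# Hypothetical click probabilities: extreme content gets clicked more often.
true_click_prob = {"moderate_article": 0.05, "extreme_video": 0.20, "extreme_link": 0.15}

clicks = defaultdict(int)
shows = defaultdict(int)

def select_item():
    """Standard-model objective: show whichever item has the highest observed
    click-through rate, with a little random exploration."""
    if random.random() < 0.1 or not any(shows.values()):
        return random.choice(items)
    return max(items, key=lambda i: clicks[i] / shows[i] if shows[i] else 0.0)

random.seed(0)
for _ in range(10_000):
    item = select_item()
    shows[item] += 1
    if random.random() < true_click_prob[item]:
        clicks[item] += 1

# The extreme items end up dominating the feed; the moderate one is rarely shown.
print({i: shows[i] for i in items})
```

The objective here is exactly the kind of fixed, fully specified goal the book warns about: maximizing clicks is achieved, but nothing in the objective cares about what the clicks do to the people clicking.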

AI could also deliver large benefits to humans. The industrial revolution delivered a roughly 10x improvement in living standards between 1820 and 2000, and it's reasonable to think that a world with advanced AI could bring about a similar increase in our standard of living.

AI poses a "Gorilla problem". Gorillas' ancestors gave rise to the lineage that became humans, and now the gorillas' fate depends entirely on the goodwill of a more intelligent species; by building super-intelligent AI we risk putting ourselves in the same position. AI also poses a "King Midas problem", a cautionary tale about getting exactly what you ask for. King Midas got his wish that everything he touched turn to gold, including his food and his family, and he died in misery and alone. How can we prevent AI from taking over from humans, and how can we control it?

Currently, AI is built according to the "Standard Model": algorithms are designed to optimize a fixed objective specified by humans, like the Facebook content algorithm described above. The standard model can lead to the King Midas problem, where AIs pursue their stated objectives without regard to what humans actually want.

The way to avoid the Gorilla and King Midas problems is to design AI that is provably beneficial to humans.

The rules for how to create this AI are as follows (a toy sketch follows the list):

  • The machine's only objective is to maximize the realization of human preferences.
  • The machine is initially uncertain about what those preferences are.
  • The ultimate source of information about human preferences is human behavior.
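
Here is a minimal sketch of how these three principles might fit together (my own illustration, not Russell's formal framework; the action names, preference profiles, and confidence threshold are all hypothetical): the machine keeps a belief over candidate human preferences, updates it from observed human behavior, and defers when it isn't confident enough to act.

```python
ACTIONS = ["make_coffee", "make_tea", "do_nothing"]

# Hypothetical candidate preference profiles the human might have.
CANDIDATE_PREFS = {
    "likes_coffee": {"make_coffee": 1.0, "make_tea": 0.2, "do_nothing": 0.0},
    "likes_tea":    {"make_coffee": 0.2, "make_tea": 1.0, "do_nothing": 0.0},
}

# Principle 2: the machine starts uncertain about which profile is true.
belief = {name: 0.5 for name in CANDIDATE_PREFS}

def update_belief(observed_choice):
    """Principle 3: human behavior is evidence about human preferences.
    Bayesian update assuming the human picks actions roughly in proportion
    to how much they like them."""
    for name, prefs in CANDIDATE_PREFS.items():
        likelihood = prefs[observed_choice] / sum(prefs.values())
        belief[name] *= likelihood
    total = sum(belief.values())
    for name in belief:
        belief[name] /= total

def choose_action(confidence_threshold=0.9):
    """Principle 1: maximize expected satisfaction of human preferences.
    If no hypothesis is confident enough, stay deferential and do nothing."""
    if max(belief.values()) < confidence_threshold:
        return "do_nothing"
    expected = {
        a: sum(belief[name] * prefs[a] for name, prefs in CANDIDATE_PREFS.items())
        for a in ACTIONS
    }
    return max(expected, key=expected.get)

# The human reaches for tea twice; the machine's belief shifts accordingly.
for observed in ["make_tea", "make_tea"]:
    update_belief(observed)

print(belief)           # belief now heavily favours "likes_tea"
print(choose_action())  # "make_tea" once confident, "do_nothing" before that
```

Because the machine starts uncertain and treats behavior as evidence, it stays deferential until the evidence points clearly at one preference profile, rather than optimizing a fixed objective from the start.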

The rest of the book goes into how to design these systems and prove that they work.

