Human Compatible
Stuart Russell
Given how fast the field is moving, a book on AI published in 2019 may seem outdated. On the contrary: anyone interested in AI and wanting an in-depth understanding should read this book. Besides discussing the benefits to be gained from AI, Russell explains, in great detail and with clear examples, why our current approach to AI is not necessarily the best one. His core question is how we ensure machines remain beneficial to humans, and it addresses a fundamental challenge: increasingly intelligent machines that do what we ask but not what we intend.
Russell starts by demystifying AI and clarifying its capabilities and boundaries. He argues that AI may excel in structured environments like chess, where its logic navigates a finite set of moves. However, AI’s ability to predict outcomes falters in the dynamic tapestry of human decision-making, where actions are not necessarily bound by rationality. Today, dissecting complex social interactions to advise humans on decision-making is a formidable challenge for AI.
Russell also contemplates the possibility of artificial general intelligence (AGI), where a machine could learn and think like a human. Contrary to some experts, he believes we still need a major breakthrough, and roughly a human lifetime (about 80 years), before AGI is developed. Here, his arguments on the challenge of truly understanding human language make one wonder about the capabilities we have attributed to LLMs, especially ChatGPT. According to the author, current AI algorithms “can extract simple information from clearly stated facts, but cannot build complex knowledge structures from text; nor can they answer questions that require extensive chains of reasoning with information from multiple sources”.
Russell spends time discussing what could go wrong if we keep progressing down the current AI path, emphasizing the “unintended consequences” a superintelligent AI could produce simply because humans were not specific enough about what they want and do not want.
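To make that specification problem concrete (this is my own illustration, not an example from the book), consider how an optimizer pursues exactly the objective it is given and ignores everything left unstated. The coffee-brewing scenario, the function name `brew_plan`, and all the numbers below are invented for the sketch.

```python
# Toy sketch of objective misspecification (illustrative only, not from the book).
# We ask an optimizer to "maximize cups of coffee brewed" and forget to mention
# that the office water tank also matters for other purposes.

def brew_plan(hours, water_ml, ml_per_cup=200, cups_per_hour=10):
    """Greedily optimize the stated objective: number of cups brewed."""
    cups = 0
    for _ in range(hours * cups_per_hour):
        if water_ml >= ml_per_cup:
            water_ml -= ml_per_cup
            cups += 1
    return cups, water_ml

if __name__ == "__main__":
    cups, water_left = brew_plan(hours=8, water_ml=10_000)
    # The "optimal" plan drains the tank completely: 50 cups, 0 ml remaining.
    # Nothing in the objective mentioned saving water, so the optimizer did not.
    print(cups, water_left)
```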
This book is not technical; even when proposing how to build a better AI development philosophy, Russell's suggestions mostly revolve around teaching machines human values. He argues we should strive to build beneficial machines that are purely altruistic, and that this is only possible if machines learn not from data but from observing human behavior. When discussing how to do this, his suggestions fall short of being applicable, but that takes nothing away from the book's brilliant questions and discussions.
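As a rough illustration of what “learning values by observing behavior” can look like in practice, here is a minimal sketch of my own, not Russell's proposal: the travel options, the candidate preference weights, and the rationality constant `BETA` are all invented. The idea is that a machine keeps several hypotheses about what a person values and updates them as it watches the person choose.

```python
import math

# Toy sketch: inferring a human's hidden preference weight from observed choices.
# (Illustrative only; all options, weights, and constants are invented for the demo.)

# Each option gives (comfort, speed); the human's utility is
# w * comfort + (1 - w) * speed for some unknown weight w in [0, 1].
OPTIONS = {"train": (0.9, 0.4), "car": (0.6, 0.6), "plane": (0.2, 0.9)}
CANDIDATE_WEIGHTS = [0.1, 0.5, 0.9]   # hypotheses about w
BETA = 5.0                            # how rational we assume the human is

def utility(option, w):
    comfort, speed = OPTIONS[option]
    return w * comfort + (1 - w) * speed

def choice_prob(option, w):
    # Boltzmann-rational model: better options are chosen more often, not always.
    scores = {o: math.exp(BETA * utility(o, w)) for o in OPTIONS}
    return scores[option] / sum(scores.values())

def infer(observed_choices):
    # Bayesian update over candidate weights, starting from a uniform prior.
    posterior = {w: 1.0 / len(CANDIDATE_WEIGHTS) for w in CANDIDATE_WEIGHTS}
    for choice in observed_choices:
        posterior = {w: p * choice_prob(choice, w) for w, p in posterior.items()}
        total = sum(posterior.values())
        posterior = {w: p / total for w, p in posterior.items()}
    return posterior

if __name__ == "__main__":
    # Watching someone repeatedly pick the comfortable train shifts belief
    # toward a high comfort weight, without ever being told the weight directly.
    print(infer(["train", "train", "car", "train"]))
```

Russell's actual proposal goes well beyond such a toy: in the book, machines are meant to remain genuinely uncertain about human preferences and to defer to humans because of that uncertainty.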