Since the “birth” of artificial intelligence (AI) in the 1950s, games have been used to gauge advances in AI: Deep Blue mastered Chess, Watson defeated Jeopardy!’s best players, AlphaGo won 4-1 against a world Go champion, and Libratus beat the best players at Texas Hold’em poker. Each of these victories marked a significant advance in the history of AI. Real-time, multi-player strategy games are the next frontier.
Several AI researchers and organizations are engaged in this race, but it was OpenAI, a San Francisco-based non-profit research group, that made a breakthrough earlier this year. In a benchmark match in August, a team of five neural networks, called OpenAI Five, learned to collaborate and won a best-of-three series against a team of professional players in a simplified version of Dota 2.
Dota 2 what?
Dota 2 is one of the most popular esports games in the world (966 tournaments with more than $169 million awarded in prize money and over 10 million monthly active users in July 2018). Each player belongs to a team of five, controls a “hero” with specific strengths and weaknesses, and battles the opposing team in order to destroy its “Ancient” (a structure located in the opposing team’s base). Collaboration and coordination between players are crucial to success. Watch a video here for a fuller explanation of the game.
Such games are nightmares for AI programmers for several reasons:
- Continuous action space: each hero can decide whether to explore territory, target an enemy or get on with any other objective. Every fraction of a second, thousands of actions can be chosen.
- Continuous observation space: each hero can encounter various objects (trees, river, buildings, etc.), teammates or enemies. The environment is perpetually changing: every fraction of a second, there are more than 20,000 observations from the game.
- Long time horizons: any single short-term action may have only a minor impact; the team’s long-run strategy is the key to success.
- Incomplete information: each hero sees only a small area around itself; the rest of the battlefield is hidden and must be explored.
- The need for collaboration: unlike one-on-one games such as Chess or Go, Dota 2 requires high levels of communication and collaboration to achieve long-run objectives.
For an AI system to challenge and beat professionals in this environment is, therefore, an impressive achievement. But what does it really say about the progress of AI?
Is Artificial General Intelligence (AGI) around the corner?
Relax, we are still very far from it. First of all, OpenAI Five was able to beat professional human teams only under restricted rules (a limited pool of heroes and some gameplay options removed), which significantly tilted the game in its favor. Indeed, after the last major game restriction was removed, OpenAI Five lost both of its games against top Dota 2 players at The International, held at the end of August. The games were nevertheless considered “vigorous Dota matches,” lasting about an hour.
Since computers can make more precise calculations and react faster and more efficiently to split-second developments, OpenAI Five had a competitive advantage during fights. But as several commentators noticed (here and here), the bots fell behind when planning for the long game and failed to connect events dozens of minutes apart (the long time horizons problem listed above). While it is straightforward to connect cause and effect when killing an enemy, indirect links can be trickier for a machine to assimilate (e.g. taking time to complete objectives in order to gain gold, which can in turn be used to upgrade weapons and become powerful enough to win). The bots’ determination to play aggressively, even when the situation didn’t warrant it, highlighted their shortcomings. The teams that beat OpenAI Five exploited this weakness and learned to quickly corner the AI.
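To get a feel for why distant payoffs are hard to learn from, consider discounting, a common technique in reinforcement learning in which a reward counts for less the further in the future it arrives. The short sketch below is only an illustration: the discount factor and the number of decisions per second are assumed values chosen for the example, not figures taken from this article, and the function names are hypothetical.

```python
import math

def discount_weight(gamma: float, steps: int) -> float:
    """Weight given to a reward that arrives `steps` decisions in the future."""
    return gamma ** steps

def half_life_seconds(gamma: float, steps_per_second: float) -> float:
    """Seconds until a future reward counts half as much as an immediate one."""
    return math.log(0.5) / math.log(gamma) / steps_per_second

# Both constants are assumptions made for this illustration only.
STEPS_PER_SECOND = 7.5   # assumed number of decisions the agent makes per second
GAMMA = 0.998            # assumed discount factor

# A kill that pays off 2 seconds from now keeps almost all of its value...
immediate = discount_weight(GAMMA, steps=int(2 * STEPS_PER_SECOND))

# ...while a payoff 20 minutes away (e.g. gold that later buys an upgrade)
# is weighted close to zero.
delayed = discount_weight(GAMMA, steps=int(20 * 60 * STEPS_PER_SECOND))

print(f"weight of a reward 2 seconds away:  {immediate:.4f}")
print(f"weight of a reward 20 minutes away: {delayed:.2e}")
print(f"reward half-life: {half_life_seconds(GAMMA, STEPS_PER_SECOND):.1f} seconds")
```

Under these assumed numbers, an agent barely "feels" a reward that arrives many minutes after the action that earned it, which is one way to picture why indirect, long-range links are harder for a machine to pick up than the immediate result of a fight.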
Nevertheless, these defeats do not take away from the significant advances in artificial intelligence these games represent. As Greg Brockman, a co-founder of OpenAI, noted, a machine does not make the same mistake twice:
“Usually we start playing teams when we’re about at their level, then a week or two later we surpass them. And that has happened to us a number of times now.”
As illustrated by the figure below, the progress made by OpenAI Five in just a few weeks is impressive and promising. The hope is that these superhuman skills will help build advanced systems that can be applied to real-life challenges in the future.