Google’s latest artificial intelligence, AlphaZero, has defeated one of the best chess programs in the world after learning the game from scratch in just four hours.
The ‘superhuman’ AlphaZero AI played 100 games against rival computer program Stockfish 8, and won or drew all of them.
The AI is the work of Google’s DeepMind division, and is the latest in a series of incredible AI achievements by the company.
An earlier version of the machine, dubbed AlphaGo, was able to defeat the world’s top human players of the Chinese board game Go.
The results of the achievement were published in a report by researchers from Google’s DeepMind in London this week.
They found that the technology could go up against a powerful chess program that has been in development for nearly a decade, and still win.
In 100 games played against Stockfish, Google’s AI won 28 and drew the rest – after just four hours of training.
Writing in a research paper about the project, shared via the electronic print repository Arxiv, its authors said: ‘AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi (Japanese chess) as well as Go, and convincingly defeated a world-champion program in each case.’
Stockfish, which was first released in 2008, had previously won 2016’s Top Chess Engine Championship.
The open source software has been beaten by another program, Komodo, in two major computer chess challenges.
But one chess grandmaster was still impressed with Google’s victory.
‘I always wondered how it would be if a superior species landed on Earth and showed us how they played chess,’ Peter Heine Nielsen told the BBC. ‘Now I know.’
The AI was also able to beat an AI program called Elmo in the Japanese board game Shogi, after two hours of self-training.
It won 90 games, drew two and lost eight.
With hours of self-training it was also able to beat an earlier version of itself at the ancient Chinese game of Go.
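To put the match results in perspective, the short Python sketch below turns the reported win, draw and loss counts into a score out of 100 games, using the usual chess convention of one point for a win and half a point for a draw. The conversion to a rating gap uses the standard Elo expected-score formula; that step is an illustration added here, not a figure taken from the DeepMind paper, and the function names are made up for this example.

```python
import math

def score_fraction(wins, draws, losses):
    """Share of available points won: 1 per win, 0.5 per draw."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games

def implied_elo_gap(score):
    """Rating gap at which the standard Elo formula predicts this score.

    Only defined for 0 < score < 1.
    """
    return 400 * math.log10(score / (1 - score))

# Results as reported in the article.
chess = score_fraction(wins=28, draws=72, losses=0)   # AlphaZero vs Stockfish
shogi = score_fraction(wins=90, draws=2, losses=8)    # AlphaZero vs Elmo

print(f"chess: {chess:.2f} of the points, gap of about {implied_elo_gap(chess):.0f} Elo")
print(f"shogi: {shogi:.2f} of the points, gap of about {implied_elo_gap(shogi):.0f} Elo")
```

On these numbers AlphaZero took 64 per cent of the points in chess and 91 per cent in shogi, which the Elo formula reads as gaps of roughly 100 and 400 rating points. A 100-game sample is small, so these should be treated as rough indications only.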
The latest win follows a series of achievements by Google’s DeepMind division.
An earlier DeepMind AI, unveiled in 2015, proved it could do better than humans on dozens of Atari video games from the 1980s, like video pinball, boxing, and ‘Breakout.’
A new version of the software, called AlphaGo Zero, which learned to play simply by playing games against itself, was unveiled in October.
Demis Hassabis, co-founder and CEO of DeepMind, said at the time: ‘It’s amazing to see just how far AlphaGo has come in only two years.
‘AlphaGo Zero is now the strongest version of our program and shows how much progress we can make even with less computing power and zero use of human data.’
Previous versions of AlphaGo were initially trained on thousands of human amateur and professional games to learn how to play Go.
But AlphaGo Zero skipped this step and learned to play simply by playing games against itself, starting from completely random play.
During testing, the system quickly surpassed human level of play and defeated the previous version of AlphaGo by 100 games to 0.
It was able to do this using a form of reinforcement learning, in which it becomes its own teacher.
The system starts off with a neural network that knows nothing about the game of Go.
It then plays games against itself, combining this neural network with a powerful search algorithm.
As it plays, the neural network is tuned and updated to predict moves, as well as the eventual winner of the games.
This updated neural network is then recombined with the search algorithm to create a new, stronger version of AlphaGo Zero, and the process begins again.
In each iteration, the performance of the system improves by a small amount, and the quality of the self-play games increases, leading to more and more accurate neural networks and an ever stronger version.
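For readers who want to see the idea in miniature, here is a rough Python sketch of learning by self-play. It is an illustration only and not DeepMind’s code: a lookup table stands in for the neural network, occasional random moves stand in for the search algorithm, and a simple stone-taking game stands in for Go. All names (`train`, `self_play_game`, `best_move`) are made up for this example.

```python
import random

# Toy game (an assumption for illustration): players alternately take
# 1-3 stones from a pile, and whoever takes the last stone wins.
PILE = 10
ACTIONS = (1, 2, 3)

def legal(pile):
    return [a for a in ACTIONS if a <= pile]

def choose(q, pile, epsilon, rng):
    """Mostly play the best-known move, sometimes a random one to explore."""
    moves = legal(pile)
    if rng.random() < epsilon:
        return rng.choice(moves)
    return max(moves, key=lambda a: q.get((pile, a), 0.0))

def self_play_game(q, epsilon, rng):
    """Play one game against itself using the same table for both sides.

    Returns each player's list of (pile, move) decisions and the winner.
    """
    history = {0: [], 1: []}
    pile, player = PILE, 0
    while True:
        move = choose(q, pile, epsilon, rng)
        history[player].append((pile, move))
        pile -= move
        if pile == 0:
            return history, player
        player = 1 - player

def train(games=20000, epsilon=0.2, lr=0.05, seed=0):
    rng = random.Random(seed)
    q = {}  # empty table: the learner starts out knowing nothing
    for _ in range(games):
        history, winner = self_play_game(q, epsilon, rng)
        # Nudge every move that was played towards the game's final result:
        # +1 for the eventual winner's moves, -1 for the loser's.
        for player, steps in history.items():
            outcome = 1.0 if player == winner else -1.0
            for key in steps:
                old = q.get(key, 0.0)
                q[key] = old + lr * (outcome - old)
    return q

def best_move(q, pile):
    """The move the trained table currently rates highest."""
    return max(legal(pile), key=lambda a: q.get((pile, a), 0.0))

if __name__ == "__main__":
    q = train()
    for pile in (3, 5, 6, 7):
        print(f"pile of {pile}: take {best_move(q, pile)}")
```

In this toy game the sound strategy is to leave the opponent a multiple of four stones. The learner is never told that rule, yet after enough games against itself the table should come to favour exactly those moves, which is the same feedback loop in a much smaller setting.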
This technique is more powerful than those used in previous versions of AlphaGo because it is no longer constrained by the limits of human knowledge.
In trials, after 40 days of self-training, AlphaGo Zero was able to outperform the version of AlphaGo known as ‘Master’, which has defeated the world’s best players, including world number one Ke Jie.
AlphaGo became progressively more efficient thanks to hardware gains and, more recently, algorithmic advances.
And after millions of games, the system learned the game from scratch – something that would take a human player thousands of years.
If similar techniques can be applied to other problems, including reducing energy consumption or searching for revolutionary new materials, the resulting breakthroughs have the potential to massively impact society.
Mr Hassabis added: ‘Ultimately we want to harness algorithmic breakthroughs like this to help solve all sorts of pressing real world problems like protein folding or designing new materials.
‘If we can make the same progress on these problems that we have with AlphaGo, it has the potential to drive forward human understanding and positively impact all of our lives.’
In May, the previous version of AlphaGo defeated the world champion for the third time.
AlphaGo defeated 19-year-old world number one Ke Jie of China to sweep a three-game series that was closely watched as a measure of how far artificial intelligence (AI) has come.
Ke Jie anointed the program as the new ‘Go god’ after his defeat.