AlphaGo - Demonstration of the AI development

Lately (as many other people in the world, I guess) I’ve been following the event held by Alphabet; precisely, by Google DeepMind. Some rounds of Go matches were played between the Korean Champion Lee Sedol and a machine developed by DeepMind (AlphaGo) in a week interval, and the result was, simply, astounding. AlphaGo won 4-1 against Lee Sedol, reaching to a similar milestone that IBM achieved when the Russian chess player Garri Kaspárov was beaten by Deep Blue.

However, someone that doesn’t have some knowledge on this topic could ask:

… Which is the difference between the 2 machines?

The main difference is in the software they both have installed on themselves (and their computing power, but this difference isn’t so meaningful in this case). Deep Blue was, basically, a supercomputer that had the ability to compute every single probability of the present and future turns, evaluating the best option and taking that path. However, AlphaGo’s approach is totally different: the learning.

To understand the milestone’s significance better, we should highlight that Go was considered as an unreachable game to the programmers due to its complexity and its rapid growth.

To understand it graphically, this is a binary tree. It scalates as: 1 - 2 - 4 - 8 - ... - 2^n nodes. — To understand it graphically, this is a binary tree. It scalates as: 1 – 2 – 4 – 8 – … – 2^n nodes.

In order to make a comparison with totally arbitrary numbers, let’s imagine that Chess generates some kind of 30^n progression. This means that in d1 stage, there will be 30 nodes (30^1), 900 nodes in d2, 27000 nodes in d3, 810000 nodes in d4, etc. It might seem a lot, but let’s remind that Go boards are 18×18, and comparing to chess, a player can add 1 tile in any position, whereas in Chess, many options are limited or blocked. And, calculating a 364^n progression is a really expensive operation to do in computational terms.

So, looking at the previous situation, we could assert that IBM’s approach to win the Chess’ match was the most straightforward one, and, in fact, it made us aware which the computing power was at that time. Unfortunately, it wasn’t a viable option to tackle Go matches.

Then, the system generated by DeepMind combines different methods that allow the “learning”, ignoring the “brute force” calculations. In fact, it is related to neural networks, which, in a nutshell, try to simulate what a brain’s neurons do, adapting the knowledge it has obtained from Success/Failure trials to similar situations; something undoable with the “brute force” approach.

Besides these neural networks which limit the iterative behaviour of the program, a specific algorithm is also used here, which appears frequently in Artificial Intelligence topics and learning topics. This algorithm is known as the “Monte Carlo Tree Search” (MCTS), which is divided in 4 phases:

MCTS algorithm's 4 phases — MCTS algorithm’s 4 phases.

Selection: A specific branch which will be used by the program is selected.
Expansion: The program goes forward into a branch until a possible (previously existing or new) case is detected.
Simulation: The program runs at the particular case.
Backpropagation: The result is returned back to the beginning of the tree, registering the case in a (previously existing or new) node with the result obtained (Success/Failure).

This means that, essentially, a repetition of this algorithm is executed, which makes the machine recognize some situations with specific Success/Failure ratios, giving them priority over the rest of the options.

Because of that, AlphaGo isn’t merely a machine to beat Asian board players, but it’s main aim, as one of the Co-Founders of DeepMind Demis Hassabis explained at this and some other interviews, is to set a way to develop these learning techniques and to eventually apply them in some other fields such as the medicine or the smartphones production.

So, out of the eloquent quotes (link in Spanish) Google’s progress has generated, they are really showing a change in the paradigms of the AI, moving from the implementation of some basic algorithms that just allow to calculate a massive amount of alternatives, to an algorithm that can choose the optimal option in different cases, based on a backup of Success/Failure tries. And, joined to the initiatives businesses like Boston Dynamics are showing (which sadly chose to work separated to Google) might impress us quite a bit in a not-so-far future.

[Originally written at March 15th, 2016]

The Cookie Banner