Wikinews interviews the research team behind 'human-like' Maia chess engine


Monday, March 1, 2021

Portrait of Professor Ashton Anderson, one of the researchers on this study.

Interview with Professor Anderson

Portrait of Reid McIlroy-Young.
Image: Reid McIlroy-Young.
((WN)) What was the timeline of this study? Why did you choose the name 'Maia'? How many researchers were involved in this study and what were their roles?

((Ashton Anderson)) We started in late 2018, as one of the first projects of Reid's PhD. There were four people involved for most of the time. The programming and data analysis were done by Reid, in close collaboration with the other team members[.]

((WN)) How was Maia trained?

((Ashton Anderson)) We used first-generation Azure NC VMs with NVIDIA Tesla K80 GPUs; the final models were selected based on hyperparameter tuning. The final training time was a couple of days[.]

((WN)) How many games were selected for training the neural network? Did you exclude games where one of the players had quit or got disconnected from the game (i.e., letting the time run out instead of resigning)? After dividing the games based on ratings, how many games were used for each Maia version?

((Ashton Anderson)) We used 12 million games for each model; we truncated games where either player had 30 seconds or fewer, and we didn't filter by termination condition[.]
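(For illustration only: a minimal sketch of how such a clock-based truncation might look, assuming Lichess-style PGN files with embedded [%clk ...] comments and using the python-chess library. The file name and helper function are hypothetical, not the team's actual pipeline.)

    import chess.pgn

    CLOCK_CUTOFF = 30.0  # seconds, per the answer above

    def truncated_moves(game: chess.pgn.Game):
        """Yield moves, stopping at the first move after which the mover had 30s or less."""
        node = game
        while node.variations:
            node = node.variation(0)
            clock = node.clock()  # parses the [%clk ...] comment; None if absent
            if clock is not None and clock <= CLOCK_CUTOFF:
                break
            yield node.move

    with open("lichess_games.pgn") as f:  # hypothetical file name
        while (game := chess.pgn.read_game(f)) is not None:
            moves = list(truncated_moves(game))
            # ...the truncated move list would then feed the training-data writer...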

((WN)) If I understand correctly, different versions of Maia predict moves depending on the Elo rating; how was that achieved? Did the team choose only those games where both players were in the same rating division (e.g., both in 1500-1599)? If not (e.g., 1500 v 1700), did two versions of Maia train using that game?

((Ashton Anderson)) Yes, the models were only shown games where both players were within the targeted rating range. So all our models have fully separate training sets[.]
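(Again purely illustrative: a sketch of the per-bucket filter described above, reading the WhiteElo/BlackElo headers that Lichess PGN exports carry; the bucket bounds shown are assumptions.)

    import chess.pgn

    def in_bucket(game: chess.pgn.Game, low: int, high: int) -> bool:
        """Keep a game only if both players' ratings fall inside one model's target bucket."""
        try:
            white = int(game.headers["WhiteElo"])
            black = int(game.headers["BlackElo"])
        except (KeyError, ValueError):
            return False  # skip games with missing or malformed ratings
        return low <= white <= high and low <= black <= high

    # e.g. a hypothetical bucket for a Maia 1500 model:
    # keep = in_bucket(game, 1500, 1599)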

((WN)) Do we notice a pattern of a higher-rated Maia playing a rather easy game against weaker opponents?

((Ashton Anderson)) Maia doesn't take the opponent into account, so it should play the same against all opponents.

((WN)) Did you try playing Maia v5 (1500 Elo) against v1 or v9? Does the weaker model ever win against the stronger model?

((Ashton Anderson)) maia1 can win against maia9, but playing them raw requires an entropy source, as the models are deterministic and would otherwise just play the same game over and over again. I ran a few a long time ago, so you can search for maia1 vs maia9 on their accounts.
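(A generic illustration of such an entropy source, not the researchers' script: instead of always playing the model's single most likely move, one can sample from the predicted move distribution, optionally with a temperature. The move_probs mapping below is a hypothetical stand-in for the model's output.)

    import random

    def sample_move(move_probs: dict, temperature: float = 1.0):
        """Sample a move in proportion to its predicted probability (temperature-scaled)."""
        moves = list(move_probs)
        weights = [p ** (1.0 / temperature) for p in move_probs.values()]
        return random.choices(moves, weights=weights, k=1)[0]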

Stockfish showing different options for ideal moves.
Image: Lichess.

((WN)) Some players play aggressively, and some have different styles; how does Maia take those into consideration to predict the next move? Does it also provide a list of moves, ordered by the likelihood of being played by a human? (For perspective, Stockfish shows which was the ideal move and which was the second-best move, per its evaluation.)

((Ashton Anderson)) We don't consider play style; the models just average over the players. The models take in a board stack and output probabilities for all 1858 possible chess moves. We then filter that down to just the legal moves, convert it to a probability distribution, and select the top move.
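(A rough sketch of the masking step just described, assuming a raw 1858-way policy vector and a move-to-index mapping similar to the lc0 policy head; policy and move_index below are placeholders for the real internals.)

    import chess
    import numpy as np

    def pick_move(board: chess.Board, policy: np.ndarray, move_index: dict) -> chess.Move:
        """Mask the raw policy to legal moves, renormalize, and return the top move."""
        legal = list(board.legal_moves)
        probs = np.array([policy[move_index[m.uci()]] for m in legal])
        probs = probs / probs.sum()          # probability distribution over legal moves only
        return legal[int(np.argmax(probs))]  # the most human-likely move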

GM Hikaru Nakamura (black) checkmating the opponent's king in an esoteric fashion.
Image: Lichess.

((WN)) In the endgame, some players try to beat the opponent in a rather creative, or sometimes esoteric, fashion, chasing the king across the board. Does Maia also do that (something a sapient player does purely out of emotion, fun or thrill), or will Maia play the shortest series of winning moves all the time?

((Ashton Anderson)) The models tend to play for longer endgames. This is likely because those are more plentiful in the training data, since players who concede early don't leave samples[.]

The rating distribution for weekly Classical games on Lichess on March 7, 2021, forming a bell curve.
Image: Lichess.

((WN)) Given that the rating distribution of players follows a bell curve, what measures did the research team take to avoid over-fitting on games at the curve's peak and under-fitting at the extremes?

((Ashton Anderson)) We trained on the same number of games for each rating level (12 million).

Maia's move-matching accuracy.
Image: Ashton Anderson, Reid McIlroy-Young, Siddhartha Sen and Jon Kleinberg.

((WN)) This graph shows a trend: the maximum of the curve lies ahead of the rating it was trained for (e.g., Maia 1100's peak is at 1200, and so on). Could you please explain why?

((Ashton Anderson)) We hypothesize it's because the models are more like committees of players than a single player, so they tend to be a bit stronger than their targets. https://maiachess.com/assets/js/plots.js has the data used for the plots[.]

((WN)) The Microsoft announcement says "Maia could look at your games and tell which blunders were predictable and which were random mistakes." When Maia predicts a blunder, is Maia aware the move is a blunder? If so, how was that achieved?

((Ashton Anderson)) The models assign a probability of winning to each board position, so this can be used to say if a blunder occurred[.]
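(One generic way such a win-probability signal could flag blunders, shown only as an assumption: compare the mover's winning chances before and after the move and flag large drops. The win_prob function and the 0.10 threshold below are hypothetical.)

    import chess

    BLUNDER_DROP = 0.10  # arbitrary threshold for this sketch

    def is_blunder(board: chess.Board, move: chess.Move, win_prob) -> bool:
        """Flag a move whose win probability falls by more than the threshold."""
        before = win_prob(board)             # side to move's winning chances now
        after_board = board.copy()
        after_board.push(move)
        after = 1.0 - win_prob(after_board)  # opponent to move afterwards, so flip
        return (before - after) >= BLUNDER_DROP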

((WN)) What are some of the use cases of training a chess engine to play the way a sapient being would, rather than generating a minimax tree or Monte Carlo tree search to find a more optimal move?

((Ashton Anderson)) The main use case is developing training tools that can help people improve. If we can tell what common mistakes are, we can help people find them and target them in their training.

((WN)) Can a player run Maia on their computer to train, practice and improve their game? Or does the trained neural network take too much computational power?

((Ashton Anderson)) Yes, we are using the Leela/lc0 client, which can run on just about anything, although a GPU would make the models faster.
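(As a minimal local-play sketch: point a UCI client at lc0 loaded with a downloaded Maia weights file. The example below uses python-chess; the weights path is a placeholder, and the one-node limit makes lc0 play the raw network's move without search.)

    import chess
    import chess.engine

    engine = chess.engine.SimpleEngine.popen_uci(["lc0", "--weights=maia-1500.pb.gz"])

    board = chess.Board()
    result = engine.play(board, chess.engine.Limit(nodes=1))
    print("Maia suggests:", result.move)
    engine.quit()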

((WN)) Given that Lichess uses Glicko-2 ratings rather than Elo, did the research involve converting the player ratings on Lichess to Elo while training the neural network?

((Ashton Anderson)) We used the Lichess rating system, and sometimes refer to it as Elo as that's the more commonly used term.

((WN)) Time plays a factor in how a human player chooses a move. Does that affect Maia's move predictions when Maia is playing in blitz, rapid or classical time controls?

((Ashton Anderson)) No, the models don't know anything about move time, but we plan on incorporating this in future versions.

Portrait of GM Hikaru Nakamura.
Image: Andreas Kontokanis.

((WN)) The Microsoft announcement mentioned that games like bullet and ultra-bullet were filtered out, since the rate of blunders increases. However, there are players like Hikaru Nakamura who are good at bullet. Are there plans to extend Maia's domain to how skilled players play in such time controls? This could also help humans detect which types of blunders they might make under time pressure, though I understand this might pollute the neural network and give underwhelming results.

((Ashton Anderson)) We are looking into it, but access to enough training games also quickly becomes an issue.

((WN)) Glicko-2 ratings for new players start at 1500 on Lichess and change rapidly over their first few games. Their ratings look like "1642 (?)", as Glicko-2 is not yet confident of their rating. Did your team filter out games where either of the players was new?

((Ashton Anderson)) No, but they are infrequent in our sample, as we only looked at rated games. That said, maia-1500 has been an outlier on some tests (in a tournament with tree search it was the weakest), so we suspect the new players do have an effect on it[.]

((WN)) Since Maia was trained up to a 2500 rating, do we expect it to lose against players who are rated above 2500? Will Maia continually run and train itself while playing against a human opponent?

((Ashton Anderson)) The released models only go up to 1900-1999; we tested them up to 2500. The models are static and don't update or learn from play.

((WN)) Given that a particular position on the board may not appear frequently, how did you get around this in order to train Maia?

((Ashton Anderson)) Most positions don't occur in our training set; the deep-learning-based design means the models can extrapolate to novel positions[.]

((WN)) On Maia's website, it is said that "even when players make horrific blunders, Maia correctly predicts the exact blunder they make around 25% of the time". Does that indicate that humans in general tend to make the same types of blunders?

((Ashton Anderson)) That is our speculation too, but we can't generalize our results yet.

((WN)) The Microsoft announcement also said, "some personalized models can predict an individual's moves with accuracies up to 75%". Which player's moves were used to train the model? How many games were analysed? What is the reason for this improvement in prediction? How can it be improved further? Is it possible to download the model and train it with the games of a specific player on a regular computer?

((Ashton Anderson)) We recently found some issues with the data used in that analysis, so we would now say we get up to 65% accuracy. The updated paper is on arXiv. We did a variety of analyses, so there is no single answer. Once the paper has undergone peer review, we will have more information available about the code and models.

((WN)) Is there a way we could quantify, visualise or explain how the styles of any two players differ? It might be interesting to see. Moreover, it could be used to track how one's style changes as their rating changes. Ah, this reminds me of the "Play Magnus" app; maybe using Maia, one could make a better, more accurate and free alternative to such applications for various grandmasters; this is indeed brilliant! Is it possible to take the model and train it on, say, Alireza Firouzja's games in PGN?

((Ashton Anderson)) Yes!

((WN)) AI mimicking the actions of a sapient being is in the territory of the Turing test. Do you think a machine can pass the Turing test in a very niche domain? Will this make it harder for anti-cheat tools to detect whether a move was a sapient decision or assisted by a computer?

((Ashton Anderson)) We plan to test whether our systems pass a Turing test, so we're at least optimistic about it.

((WN)) In what way could someone misuse Maia?

((Ashton Anderson)) Cheating with them, like with any other chess engine.

((WN)) Is Maia ready to be used for detecting cheating on online chess websites like Lichess?

((Ashton Anderson)) No, that's a much harder problem. It could be a valuable input, though.

((WN)) Does the team intend to train Maia on the other chess variants offered by Lichess?

((Ashton Anderson)) Not currently, no.

((WN)) Does the team plan on training Maia with more datasets? Will Maia also be trained on live matches happening on Lichess?

((Ashton Anderson)) Yes, we'll release new versions of Maia in the coming months.


Sources

External links