Adversarial policies in Go

Vision transformer

Cyclic attacks work not only against KataGo but also against a range of other superhuman Go AIs, including ELF OpenGo, Leela Zero, Sai, Golaxy, and FineArt. While it is possible that each system has unique vulnerabilities to the cyclic attack, it seems more likely that shared properties cause their common vulnerability. One key shared property is that all systems use a convolutional neural network (CNN) backbone.

To investigate whether CNNs are responsible for the vulnerability, we trained an AlphaZero-style Go AI with a vision transformer (ViT) backbone instead of a CNN. We estimate that our ViT Go AI, ViT-victim, is just shy of superhuman performance at 32,768 visits. Despite this, it remains vulnerable to the original cyclic attack and consistently loses to a fine-tuned variant of the cyclic attack. This rules out CNN backbones as the root cause of the cyclic vulnerability.

We validate the strength of our ViT model by playing against KataGo, playing against members of the public on KGS, and commissioning professional games. Our games against KataGo (Appendix G.1) give an estimated goratings.org Elo of 3877 at 32,768 visits, comparable to the strongest professional players. We also deployed the ViT bot ViTKata001 on the KGS Go server, ranking as one of the top players and beating several KataGo bots. Finally, our ViT bot won two out of three games commissioned against Go professionals.

In particular, we commissioned the 7-dan professional Yilun Yang to play one game and the 4-dan professional Ryan Li to play two games. Our ViT bot beat Yilun Yang. However, it lost the first game to Ryan Li after falling behind early in a complicated corner pattern that has been a weakness of other Go AIs. The ViT bot then won the rematch, in which Li agreed to avoid this pattern. This indicates the ViT model has some gaps but generally plays at a strong professional level. We discuss the games in more detail in the following sections.
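To give a feel for what an Elo estimate such as 3877 means in practice, the sketch below converts a rating gap into an expected score and back. It assumes the usual logistic Elo curve with a 400-point scale; it is not the estimation procedure from Appendix G.1, and the opponent rating in the example is a hypothetical value chosen only for illustration.

```python
import math


def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B on a 400-point logistic Elo scale."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_gap_from_score(expected_score: float) -> float:
    """Rating gap (A minus B) that yields the given expected score for A."""
    if not 0.0 < expected_score < 1.0:
        raise ValueError("expected_score must be strictly between 0 and 1")
    return 400.0 * math.log10(expected_score / (1.0 - expected_score))


if __name__ == "__main__":
    vit_elo = 3877.0       # estimate quoted in the text
    opponent_elo = 3800.0  # hypothetical opponent rating, for illustration only
    score = elo_expected_score(vit_elo, opponent_elo)
    print(f"Expected score for the ViT bot: {score:.3f}")
    print(f"Implied rating gap: {elo_gap_from_score(score):.1f}")
```

Under this model, equal ratings give an expected score of exactly 0.5, and a 400-point lead corresponds to 10-to-1 odds.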

Professional game: Yilun Yang

After a close early game between ViT (black) and Yang (white), a complicated fight develops on the left side and center, starting around move 52. According to KataGo, move 52 would have been better as a cap two lines to the right of black's stone on the middle of the left side, or one line below the cap. In that case, the game would have remained even, or even slightly advantageous for white. Black 55 was also a mistake and should have been the cut where white played 56. In turn, if white had pressured black's stones above by playing one line above and to the left of the connection instead of connecting at 56, white would again have had a very slight advantage.

Nonetheless, after these moves the game remained close. The decisive moment was move 66, which white should have played directly below white's leftmost stone to probe how black connects the two stones above. If black answers with the best move, the empty triangle connection, a complicated ko situation may develop, but the game remains only marginally favorable for black, whereas other connections tilt the game in white's favor. After 66 in the game, however, black threatens to cut off a stone with 67, and the timing to probe the connection on the left is lost. In the end, white is able to live with 92, but black's position on the outside is better and black has a decent advantage.

After that, some very slightly suboptimal moves on the lower side reinforced black's lead. Black 107 was a mistake, as black should have played a diagonal move on the third line from the stone below it. But white's clamp at 108 failed to punish it; white should have played the turn on the 3-4 point, though the game would still have been very hard. After this, there were essentially no opportunities for white to recover.

Victim: ViT-victim, 65536 visits, 64 search threads

Adversary: Yilun Yang

| Victim color | Win color | Adversary win | Score difference | Game length |
| --- | --- | --- | --- | --- |
| b | b | False | -12.0 | 287 |

Professional game: Ryan Li – opening corner pattern

Moves 8 through 12 from black (Li) initiate the "flying dagger" joseki (opening corner pattern). This is a very complicated joseki that has been a known weakness of past AIs. KataGo fixed this weakness by training on manually constructed positions, which are not included in our training data. It appears this weakness emerged in our ViT system as well. Through move 33, white (ViT) plays some slight inaccuracies, leading to a small advantage for black. According to KataGo, move 34 is a more substantial mistake; white should have captured black's single stone instead. Move 36 is also a mistake; white had several better options, including the same capture. So is 38: white should have played one space below or, even better, given up the left and played for the outside with a cap two spaces to the right of black's top stone on the left. After the left side of the board is stabilized with black 61, black has a considerable advantage.

After this, white recovers slightly in the lower right, but black maintains a solid lead. At 104, white again misses several better options. Through 130, white captures four stones but loses the top right corner, which does not help white catch up. With 134, white attempts to live in black's area on the top part of the board, but black plays accurately and white dies, sealing the game.

Victim: ViT-victim, 65536 visits, 64 search threads

Adversary: Ryan Li

| Victim color | Win color | Adversary win | Score difference | Game length |
| --- | --- | --- | --- | --- |
| w | b | True | resignation | 209 |

Professional game: Ryan Li – rematch

After ViT's big loss in the flying dagger joseki in the preceding game, Li agreed to play a game where he would avoid that pattern. In this game, on move 11, black (ViT) offers a chance to initiate the flying dagger pattern, but white (Li) declines by playing at 12, leading to a much less complicated pattern which is fairly balanced but very slightly worse than the flying dagger according to KataGo.

The game then remains balanced for many moves, with both sides playing well and fighting back and forth over narrow margins. On move 71, black should have continued in the top right but makes a significant mistake by playing on the bottom side. This allows white to guarantee the safety of the top right through 76, giving a reasonable though not tremendous advantage. However, at move 100, white should have played at 102 directly; exchanging 100 for 101 lost some options, and the game becomes closer again. Still, white maintains the advantage.

Over the following moves, black gradually reduces white's advantage, and then on move 118 white misses some opportunities on the lower side and center, making the game virtually even again. Move 131 is a nice tesuji and the only way for black to save the corner cleanly. White begins to develop a small lead again step-by-step, but loses it with 152, which misses the timing to reduce the center after black reinforces with 153. Finally, 164 misses a chance to play a complicated sequence of reducing moves on the bottom and center. With 165, black fixes the largest remaining gap in the center. Thus, after a long game where white was sometimes even and sometimes favored, black (ViT) was able to close the game in the end.

Victim: ViT-victim, 65536 visits, 64 search threads

Adversary: Ryan Li

| Victim color | Win color | Adversary win | Score difference | Game length |
| --- | --- | --- | --- | --- |
| b | b | False | -4.0 | 255 |

Original cyclic attack

Our original cyclic adversary, base-adversary, already beats our ViT Go model in 2.5% of games at 512 visits of search, with no additional training. The games listed below are sampled to be equally balanced between wins and losses. For a more detailed description of how this attack works, please see our original results.

Victim: ViT-victim, 512 visits

Adversary: base-adversary

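| Victim color | Win color | Adversary win | Score difference | Game length |
| --- | --- | --- | --- | --- |
| w | b | True | 86.5 | 308 |
| w | b | True | 122.5 | 308 |
| w | b | True | 130.5 | 318 |
| b | w | True | 87.5 | 328 |
| w | w | False | -337.5 | 464 |
| w | w | False | -297.5 | 480 |
| b | b | False | -302.5 | 463 |
| b | b | False | -261.5 | 433 |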
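The games listed above are balanced between adversary wins and losses. One way such a balanced sample could be drawn from a larger pool is sketched below. This is an illustration only: the `GameRecord` type and `sample_balanced` function are hypothetical names introduced here, the actual sampling procedure is not described in this section, and the example pool simply reuses the eight games listed above.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GameRecord:
    """One finished game (hypothetical record type, for illustration)."""
    victim_color: str        # "b" or "w"
    win_color: str           # "b" or "w"
    score_difference: float  # positive when the adversary wins
    game_length: int

    @property
    def adversary_win(self) -> bool:
        # The adversary wins exactly when the winner is not the victim's color.
        return self.win_color != self.victim_color


def sample_balanced(games, per_outcome, seed=0):
    """Draw per_outcome adversary wins and per_outcome adversary losses at random."""
    rng = random.Random(seed)
    wins = [g for g in games if g.adversary_win]
    losses = [g for g in games if not g.adversary_win]
    if min(len(wins), len(losses)) < per_outcome:
        raise ValueError("not enough games of one outcome to balance the sample")
    return rng.sample(wins, per_outcome) + rng.sample(losses, per_outcome)


# Example pool: the eight games listed above.
GAMES = [
    GameRecord("w", "b", 86.5, 308),
    GameRecord("w", "b", 122.5, 308),
    GameRecord("w", "b", 130.5, 318),
    GameRecord("b", "w", 87.5, 328),
    GameRecord("w", "w", -337.5, 464),
    GameRecord("w", "w", -297.5, 480),
    GameRecord("b", "b", -302.5, 463),
    GameRecord("b", "b", -261.5, 433),
]

if __name__ == "__main__":
    for game in sample_balanced(GAMES, per_outcome=2):
        print(game)
```

Sampling each outcome separately keeps rare outcomes visible: with a 2.5% adversary win rate, a uniform random sample of a handful of games would most likely contain no adversary wins at all.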

Fine-tuned cyclic attack

We fine-tuned the cyclic adversary, resulting in an adversary that defeats our ViT model in 78% of games at 65536 visits of search. This confirms the ViT model fails to defend against cyclic attacks even at superhuman settings. Explore randomly sampled games below.

In the first game, the adversary stakes out the center on move 17 to build its group inside the future cyclic group; this inside group is fully formed by move 49. After that, the victim surrounds the central group, completing the cyclic group on move 248, while the adversary surrounds it from the outside. This leads to a very dense board, with adversary stones positioned low around the edges and the victim controlling a huge center, mostly filled by the cyclic group and its encirclement. This pattern is typical for this adversary. On move 255, the victim captures the inside group, but the adversary then reenters the space and establishes a new inside group. Although not universal, this behavior was also observed with the original cyclic adversary. Finally, at move 283, the victim is doomed, and the cyclic group is taken off the board at move 335.

Victim: ViT-victim, 65536 visits

Adversary: ViT-adversary

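| Victim color | Win color | Adversary win | Score difference | Game length |
| --- | --- | --- | --- | --- |
| w | b | True | 82.5 | 385 |
| w | b | True | 98.5 | 411 |
| w | b | True | 108.5 | 385 |
| w | b | True | 108.5 | 427 |
| w | b | True | 114.5 | 351 |
| b | w | True | 61.5 | 438 |
| b | w | True | 121.5 | 378 |
| w | w | False | -267.5 | 564 |
| b | b | False | -118.5 | 433 |
| b | b | False | -100.5 | 377 |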

Human replication of cyclic attack

A Go expert (Kellin Pelrine) was also able to use a cyclic attack to beat our ViT-victim. The game is shown below. Through move 56, white (Kellin) sets up a loosely surrounded square group destined to be the inside of the cyclic group. This follows the shape used in some of the wins of our original cyclic adversary against this victim. In subsequent moves, white gradually surrounds the cyclic group, with a particular focus on giving the surrounding groups plenty of liberties, so that black can only save its group by recognizing the danger early. Beginning around move 176, white fills in the cyclic group's liberties and presses black to make the last connections that complete the cyclic shape. Finally, after 208, black's fate is sealed. Black (ViT) plays on for several more moves, hoping for white to make a mistake, but ultimately resigns.

Victim: ViT-victim, 65536 visits, 64 search threads

Adversary: Kellin Pelrine (author)

| Victim color | Win color | Adversary win | Score difference | Game length |
| --- | --- | --- | --- | --- |
| b | w | True | resignation | 244 |

Citation Info

@inproceedings{wang2023adversarial,
  title={Adversarial Policies Beat Superhuman Go {AI}s},
  author={Wang, Tony T. and Gleave, Adam and Tseng, Tom and Pelrine, Kellin and Belrose, Nora and Miller, Joseph and Dennis, Michael D and Duan, Yawen and Pogrebniak, Viktor and Levine, Sergey and Russell, Stuart},
  booktitle={International Conference on Machine Learning},
  year={2023},
  eprint={2211.00241},
  archivePrefix={arXiv}
}
@misc{tseng2024ais,
  title={Can Go {AI}s be adversarially robust?},
  author={Tseng, Tom and McLean, Euan and Pelrine, Kellin and Wang, Tony T. and Gleave, Adam},
  year={2024},
  eprint={2406.12843},
  archivePrefix={arXiv}
}