Adversarial policies in Go

Game analysis

Qualitative analysis of adversary behavior

An expert-level (6 dan) human player on our team (Kellin Pelrine) analyzed the following game. It shows the typical behavior and outcome when an adversary trained against a pass-hardened KataGo victim plays that victim: the victim builds an early and seemingly insurmountable lead, the adversary sets a trap that would be easy for a human to see and avoid, but the victim is oblivious and collapses.

The adversary plays non-standard, subpar moves right from the beginning. By move 9, the victim estimates its win rate at over 90%, and a human in a high-level match would likewise hold a large advantage from this position.

On move 20, the adversary initiates a tactic we see consistently: producing a square-four group in one quadrant of the board that is 'dead', at least by normal judgment. Elsewhere, the adversary plays low, mostly on the second and third lines. This too is common in its games, and it leads the victim to turn the rest of the center into its sphere of influence. We suspect this helps the adversary later play moves in that area without the victim responding directly, because the victim is already strong there and feels confident ignoring a number of moves.

On move 74, the adversary begins mobilizing its 'dead' stones to set up an encirclement. Over the next 100+ moves, it gradually surrounds the victim in the top left. A key pattern here is that it leads the victim into forming an isolated group that loops around and connects to itself (a group whose adjacency graph contains a cycle rather than being a tree). David Wu, the creator of KataGo, suggested that Go-playing agents like the victim struggle to accurately judge the status of such groups, but they are normally very rare. This adversary seems to produce them consistently.
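The cycle-versus-tree distinction above can be made concrete. As an illustration (this is not code from the paper), the following sketch represents a connected group of stones as a set of board coordinates and uses the fact that a connected graph is a tree exactly when it has one fewer edge than it has vertices:

```python
def group_has_cycle(stones):
    """Return True if a *connected* group of stones, given as a set of
    (row, col) coordinates, contains a cycle in its adjacency graph.

    A connected graph with V vertices is a tree iff it has V - 1 edges,
    so any extra edge implies a cycle. This is a simplified graph-theoretic
    check for illustration; it does not capture everything about the
    loop-shaped groups discussed in the text.
    """
    stones = set(stones)
    # Count each orthogonal adjacency exactly once by only looking
    # at the neighbor below and the neighbor to the right.
    edges = sum(
        1
        for (r, c) in stones
        for neighbor in ((r + 1, c), (r, c + 1))
        if neighbor in stones
    )
    return edges >= len(stones)


# A straight string of three stones is a tree: no cycle.
line = {(0, 0), (0, 1), (0, 2)}

# A ring of stones around an empty point loops back on itself: a cycle.
ring = {(0, 0), (0, 1), (0, 2),
        (1, 0),         (1, 2),
        (2, 0), (2, 1), (2, 2)}
```

Here `group_has_cycle(line)` is False while `group_has_cycle(ring)` is True, matching the intuition that the ring "connects to itself."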

Until the adversary plays move 189, the victim can still save the cyclic group, and with it still win by a huge margin. There are straightforward moves to do so that would be trivial to find for any human playing at the victim's normal level; even a human who has played for only a few months might find them. For instance, on move 189 the victim could instead have played at the point marked 'A'. But after move 189, escape is impossible and the game is reversed. The victim seems to have been unable to detect the danger. Play continues for another 109 moves, but there is no chance for the victim (nor would there be for a human player) to recover from the massive deficit it was tricked into.


Victim: Latestdef, 1600 visits

Adversary: 498 million training steps, 600 visits

Victim color   Win color   Adversary win   Score difference   Game length
b              w           True            83.5               298

How the victim's predicted win rate varies over time

In this game, we find that the victim's predicted win rate oscillates several times before the victim's group is captured at move 273. At move 248, the victim predicts it will win with 91% confidence, yet by its next turn at move 250 its prediction has dropped below 1%. At move 254, it jumps back above 99%. A few moves later, the victim's win rate prediction again fluctuates dramatically, hitting below 1% at move 266, 99% at move 268, and below 1% at move 272. After the capture on the following turn, the victim (correctly) predicts a win rate below 1% until the end of the game.
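As an illustration of the oscillation described above (this helper and its threshold are our own, not from the paper), one can scan a sequence of (move, predicted win rate) pairs and flag every dramatic swing between consecutive evaluations:

```python
def flag_swings(predictions, threshold=0.5):
    """Given a list of (move_number, predicted_win_rate) pairs in move
    order, return (prev_move, move, change) for every consecutive pair
    whose prediction changes by more than `threshold`.

    `threshold` is an arbitrary illustrative cutoff, not a value used
    in the paper.
    """
    swings = []
    for (m0, p0), (m1, p1) in zip(predictions, predictions[1:]):
        if abs(p1 - p0) > threshold:
            swings.append((m0, m1, p1 - p0))
    return swings


# The win rate values reported in the text above:
predictions = [(248, 0.91), (250, 0.01), (254, 0.99),
               (266, 0.01), (268, 0.99), (272, 0.01)]
```

Applied to these values, `flag_swings` flags all five consecutive transitions, each a swing of 90 percentage points or more.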


Victim: Latest, 4096 visits

Adversary: 545 million training steps, 600 visits

Victim color   Win color   Adversary win   Score difference   Game length
w              b           True            140.5              347

Positions analyzed with varying visits

We make available here the full game records for the positions analyzed at different visit counts in the paper appendix discussing the role of search in robustness. For details, please refer to that appendix.


Victim: Latestdef, 1600 visits

Adversary: 498 million training steps, 600 visits

Victim color   Win color   Adversary win   Score difference   Game length
w              b           True            58.5               335
b              w           True            91.5               269
w              b           True            56.5               326
b              w           True            127.5              334
w              b           True            98.5               333
b              w           True            95.5               286
w              b           True            84.5               274
b              w           True            97.5               305
b              w           True            93.5               297
w              b           True            36.5               304
b              w           True            127.5              314
b              w           True            154.5              320
w              b           True            116.5              336
w              b           True            38.5               316
b              w           True            27.5               333
w              b           True            40.5               326

Positions analyzed with 1 billion visits

The following game records correspond to positions in which the victim still failed to find the correct move even when analyzed with 1 billion visits. The victim that originally played these games used 1 million visits. For details, please refer to the paper appendix discussing the role of search in robustness.


Victim: Latest, 1 million visits

Adversary: 545 million training steps, 600 visits

Victim color   Win color   Adversary win   Score difference   Game length
w              b           True            97.5               311
w              b           True            116.5              306
b              w           True            189.5              360
b              w           True            133.5              433
w              b           True            108.5              334

Citation Info

@inproceedings{wang2023adversarial,
  title={Adversarial Policies Beat Superhuman Go {AI}s},
  author={Wang, Tony T. and Gleave, Adam and Tseng, Tom and Pelrine, Kellin and Belrose, Nora and Miller, Joseph and Dennis, Michael D and Duan, Yawen and Pogrebniak, Viktor and Levine, Sergey and Russell, Stuart},
  booktitle={International Conference on Machine Learning},
  year={2023},
  eprint={2211.00241},
  archivePrefix={arXiv}
}

@misc{tseng2024ais,
  title={Can Go {AI}s be adversarially robust?},
  author={Tseng, Tom and McLean, Euan and Pelrine, Kellin and Wang, Tony T. and Gleave, Adam},
  year={2024},
  eprint={2406.12843},
  archivePrefix={arXiv}
}