Adversarial policies in Go

Pass attack

Our initial attempts at attacking KataGo resulted in adversaries that exploited KataGo's passing behavior. These pass-based adversaries trick KataGo into passing when it shouldn't. While this attack is effective against victims that do not use tree search, it stops working once victims use even a small amount of search. We developed the pass-hardening defense so that our adversaries would not get stuck learning this pass exploit. This worked surprisingly well: training against pass-hardened victims led our adversaries to learn an alternate strategy that works even in the high-search regime.

Without tree search, KataGo's Latest network plays at the strength of a top-100 European professional. Our pass-based adversary achieves a 99% win rate against this victim with a counterintuitive strategy: it stakes out a minority territory in the corner, allows KataGo to stake out the complement, and places weak stones inside KataGo's stake.

KataGo predicts a high win probability for itself and, in a way, it’s right—it would be simple to capture most of the adversary’s stones in KataGo’s stake, achieving a decisive victory. However, KataGo plays a pass move before it has finished securing its territory, allowing the adversary to pass in turn and end the game. This results in a win for the adversary under the standard Tromp-Taylor ruleset for computer Go, as the adversary gets points for its corner territory (devoid of victim stones) whereas the victim does not receive points for its unsecured territory because of the presence of the adversary’s stones.
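The scoring behavior that the exploit relies on can be sketched in a few lines. Below is a minimal Tromp-Taylor area scorer (an illustrative sketch, not KataGo's implementation): each player scores one point per stone on the board, plus one point per empty point in regions bordered exclusively by their own stones. An empty region touching both colors scores for neither player, which is why the adversary's weak stones inside the victim's unsecured territory nullify it.

```python
# Minimal Tromp-Taylor area scorer (illustrative sketch, not KataGo's code).
# Score = own stones + empty regions bordered only by own stones.
from collections import deque

def tromp_taylor_score(board):
    """board: list of equal-length strings of 'B', 'W', '.'.
    Returns (black_points, white_points)."""
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            pt = board[r][c]
            if pt in score:
                score[pt] += 1            # every stone on the board counts
            elif (r, c) not in seen:
                # flood-fill an empty region, noting which colors border it
                region, borders = 0, set()
                queue = deque([(r, c)])
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    region += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            n = board[ny][nx]
                            if n == "." and (ny, nx) not in seen:
                                seen.add((ny, nx))
                                queue.append((ny, nx))
                            elif n in score:
                                borders.add(n)
                if len(borders) == 1:     # region surrounded by a single color
                    score[borders.pop()] += region
    return score["B"], score["W"]
```

On a toy board where Black holds a clean corner while a stray black stone sits inside White's larger area, White's territory scores nothing: the game ends in a tie on points despite White surrounding far more of the board, which mirrors how the adversary's unsettled stones void the victim's territory at scoring time.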

Victim: Latest, no search

Adversary: 34.1 million training steps, 600 visits


KataGo with 8 visits

A search budget of 8 visits per move is around the limit of what our pass-based adversary can exploit. We achieve a win rate of 87.8% against this victim by modeling the victim perfectly during the adversary's search. The adversary wins with the same corner-staking strategy; it loses when the victim plays the game out to the end, resulting in a very full board.
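The idea behind modeling the victim during search can be sketched abstractly. In the toy recursion below (an illustrative sketch with hypothetical names, not KataGo's or our actual A-MCTS implementation), nodes where the adversary moves are maximized over, but nodes where the frozen victim moves are not expanded adversarially: the search follows the single move the modeled victim would actually play.

```python
# Illustrative sketch of victim-modeling lookahead (hypothetical structure,
# not the paper's actual search code). At adversary nodes we maximize; at
# victim nodes we follow the frozen victim's predicted move.

def adversary_value(state, tree, turns, victim_policy, leaf_value):
    """Value of `state` from the adversary's perspective.

    tree[s]          -> dict of move -> successor (absent for terminal states)
    turns[s]         -> "adv" or "victim", whoever moves at s
    victim_policy(s) -> the move the modeled victim plays at s
    leaf_value[s]    -> terminal value for the adversary
    """
    if state not in tree:                    # terminal position
        return leaf_value[state]
    if turns[state] == "adv":                # adversary node: maximize
        return max(adversary_value(tree[state][m], tree, turns,
                                   victim_policy, leaf_value)
                   for m in tree[state])
    # victim node: perfectly modeled, so only its actual move is explored
    m = victim_policy(state)
    return adversary_value(tree[state][m], tree, turns,
                           victim_policy, leaf_value)
```

The design choice this illustrates: against a perfectly modeled victim, the adversary can pursue lines that a worst-case (minimax) opponent would refute, because the search knows the victim will blunder at the critical node.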


Victim: Latest, 8 visits

Adversary: 34.1 million training steps, 200 visits, recursive modeling


Citation Info

@inproceedings{wang2023adversarial,
  title={Adversarial Policies Beat Superhuman Go {AI}s},
  author={Wang, Tony T. and Gleave, Adam and Tseng, Tom and Pelrine, Kellin and Belrose, Nora and Miller, Joseph and Dennis, Michael D and Duan, Yawen and Pogrebniak, Viktor and Levine, Sergey and Russell, Stuart},
  booktitle={International Conference on Machine Learning},
  year={2023},
  eprint={2211.00241},
  archivePrefix={arXiv}
}
@misc{tseng2024ais,
  title={Can Go {AI}s be adversarially robust?},
  author={Tseng, Tom and McLean, Euan and Pelrine, Kellin and Wang, Tony T. and Gleave, Adam},
  year={2024},
  eprint={2406.12843},
  archivePrefix={arXiv}
}