Adversarial policies in Go

Transfer

Our adversary apparently exploits a weakness common across several Go AI systems. We find that the attack transfers zero-shot against ELF OpenGo and Leela Zero, two other open-source Go AI systems that can play at a superhuman level.

ELF OpenGo

We pit our adversary against ELF OpenGo playing with its final network and 80,000 rollouts per move. The authors of ELF found that this number of rollouts was sufficient to consistently beat several top-30 Go players, even when using a weaker network. Our adversary achieves a win rate of 3.5% against ELF. (The games displayed are non-randomly selected to show the wins achieved by the adversary.)

Victim: ELF OpenGo, final network, 80,000 rollouts per move

Adversary: 545 million training steps, 600 visits

Victim color  Win color  Adversary win  Score difference  Game length

Leela Zero

We pit our adversary against Leela Zero playing with its final network (hash 0e9ea880 on the Leela training website), no time limit, and a maximum of 40,000 visits per move. Our adversary achieves a win rate of 6.1%. (The games displayed are non-randomly selected to show the wins achieved by the adversary.)

Victim: Leela Zero, final network, max 40,000 visits per move

Adversary: 545 million training steps, 600 visits

Victim color  Win color  Adversary win  Score difference  Game length
w             b          True           354.5             625
b             w          True           367.5             778
w             w          False          -41.5             330
w             w          False          -22.5             226
w             w          False          -21.5             335
b             b          False          -52.5             303
b             b          False          -46.5             235
b             b          False          -29.5             238
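
The columns of the Leela Zero game list are related to each other, and a short script can make that explicit. The following is a minimal Python sketch, not part of the original analysis. The `GameRecord` class and the `adversary_won` and `summarize` helpers are hypothetical names introduced here for illustration. It assumes that each fused value such as "354.5625" splits into a half-point score difference (354.5) and a game length (625), and that a positive score difference means the adversary finished ahead. Because the listed games are non-randomly selected, the tally describes only this sample and is not an estimate of the 6.1% win rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameRecord:
    victim_color: str        # "w" or "b": the color played by the victim
    win_color: str           # "w" or "b": the color that won the game
    adversary_win: bool      # value listed in the "Adversary win" column
    score_difference: float  # assumed: positive when the adversary is ahead
    game_length: int         # assumed: number of moves in the game


# Rows transcribed from the Leela Zero game list, split under the
# assumptions stated above (half-point scores, then game length).
GAMES = [
    GameRecord("w", "b", True, 354.5, 625),
    GameRecord("b", "w", True, 367.5, 778),
    GameRecord("w", "w", False, -41.5, 330),
    GameRecord("w", "w", False, -22.5, 226),
    GameRecord("w", "w", False, -21.5, 335),
    GameRecord("b", "b", False, -52.5, 303),
    GameRecord("b", "b", False, -46.5, 235),
    GameRecord("b", "b", False, -29.5, 238),
]


def adversary_won(game: GameRecord) -> bool:
    """The adversary wins exactly when the winning color is not the victim's."""
    return game.win_color != game.victim_color


def summarize(games: list[GameRecord]) -> dict:
    """Tally the sample and check that the columns agree with each other."""
    return {
        "games": len(games),
        "adversary_wins": sum(g.adversary_win for g in games),
        # The listed flag should match the flag derived from the two colors.
        "flags_consistent": all(adversary_won(g) == g.adversary_win for g in games),
        # The sign of the score difference should match the listed flag.
        "signs_consistent": all((g.score_difference > 0) == g.adversary_win for g in games),
    }


if __name__ == "__main__":
    print(summarize(GAMES))
```

Under these assumptions, the two listed adversary wins have score differences of 354.5 and 367.5, which is what one would expect if the winner were credited with the entire 19x19 board (361 points) and a komi of 6.5 were subtracted for Black or added for White. The komi value is an inference from the numbers, not something stated on this page.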

Citation Info

@inproceedings{wang2023adversarial,
  title={Adversarial Policies Beat Superhuman Go {AI}s},
  author={Wang, Tony T. and Gleave, Adam and Tseng, Tom and Pelrine, Kellin and Belrose, Nora and Miller, Joseph and Dennis, Michael D and Duan, Yawen and Pogrebniak, Viktor and Levine, Sergey and Russell, Stuart},
  booktitle={International Conference on Machine Learning},
  year={2023},
  eprint={2211.00241},
  archivePrefix={arXiv}
}
@misc{tseng2024ais,
  title={Can Go {AI}s be adversarially robust?},
  author={Tseng, Tom and McLean, Euan and Pelrine, Kellin and Wang, Tony T. and Gleave, Adam},
  year={2024},
  eprint={2406.12843},
  archivePrefix={arXiv}
}