Zachary Kenton
Google DeepMind
Verified email at google.com
Title
Cited by
Year
Ethical and social risks of harm from language models
L Weidinger, J Mellor, M Rauh, C Griffin, J Uesato, PS Huang, M Cheng, ...
arXiv preprint arXiv:2112.04359, 2021
Cited by 977 (2021)
Taxonomy of risks posed by language models
L Weidinger, J Uesato, M Rauh, C Griffin, PS Huang, J Mellor, A Glaese, ...
Proceedings of the 2022 ACM Conference on Fairness, Accountability, and …, 2022
Cited by 595 (2022)
Three factors influencing minima in SGD
S Jastrzębski, Z Kenton, D Arpit, N Ballas, A Fischer, Y Bengio, A Storkey
arXiv preprint arXiv:1711.04623, 2017
Cited by 528 (2017)
Alignment of language agents
Z Kenton, T Everitt, L Weidinger, I Gabriel, V Mikulik, G Irving
arXiv preprint arXiv:2103.14659, 2021
Cited by 162 (2021)
A systematic comparison of Bayesian deep learning robustness in diabetic retinopathy tasks
A Filos, S Farquhar, AN Gomez, TGJ Rudner, Z Kenton, L Smith, ...
arXiv preprint arXiv:1912.10481, 2019
Cited by 128 (2019)
On the relation between the sharpest directions of DNN loss and the SGD step length
S Jastrzębski, Z Kenton, N Ballas, A Fischer, Y Bengio, A Storkey
arXiv preprint arXiv:1807.05031, 2018
Cited by 128 (2018)
Specification gaming: the flip side of AI ingenuity
V Krakovna, J Uesato, V Mikulik, M Rahtz, T Everitt, R Kumar, Z Kenton, ...
DeepMind Blog 3, 2020
Cited by 110 (2020)
Imitating interactive intelligence
J Abramson, A Ahuja, I Barr, A Brussee, F Carnevale, M Cassin, ...
arXiv preprint arXiv:2012.05672, 2020
Cited by 73 (2020)
Ethical and social risks of harm from language models. arXiv
L Weidinger, J Mellor, M Rauh, C Griffin, J Uesato, PS Huang, M Cheng, ...
arXiv preprint arXiv:2112.04359 10, 2021
Cited by 71 (2021)
Goal misgeneralization: Why correct specifications aren't enough for correct goals
R Shah, V Varma, R Kumar, M Phuong, V Krakovna, J Uesato, Z Kenton
arXiv preprint arXiv:2210.01790, 2022
Cited by 64 (2022)
Explaining grokking through circuit efficiency
V Varma, R Shah, Z Kenton, J Kramár, R Kumar
arXiv preprint arXiv:2309.02390, 2023
Cited by 44 (2023)
The ethics of advanced AI assistants
I Gabriel, A Manzini, G Keeling, LA Hendricks, V Rieser, H Iqbal, ...
arXiv preprint arXiv:2404.16244, 2024
Cited by 40 (2024)
Finding flatter minima with SGD
S Jastrzębski, Z Kenton, D Arpit, N Ballas, A Fischer, Y Bengio, A Storkey
Cited by 39 (2018)
The squeezed limit of the bispectrum in multi-field inflation
Z Kenton, DJ Mulryne
Journal of Cosmology and Astroparticle Physics 2015 (10), 018, 2015
Cited by 39 (2015)
D-brane potentials in the warped resolved conifold and natural inflation
Z Kenton, S Thomas
Journal of High Energy Physics 2015 (2), 1-42, 2015
Cited by 39 (2015)
Discovering agents
Z Kenton, R Kumar, S Farquhar, J Richens, M MacDermott, T Everitt
Artificial Intelligence 322, 103963, 2023
Cited by 33 (2023)
Width of minima reached by stochastic gradient descent is influenced by learning rate to batch size ratio
S Jastrzębski, Z Kenton, D Arpit, N Ballas, A Fischer, Y Bengio, A Storkey
Artificial Neural Networks and Machine Learning–ICANN 2018: 27th …, 2018
Cited by 30 (2018)
Generalizing from a few environments in safety-critical reinforcement learning
Z Kenton, A Filos, Y Gal, O Evans
Safe Machine Learning workshop at ICLR, 2019
Cited by 26* (2019)
Benchmarking Bayesian deep learning with diabetic retinopathy diagnosis
A Filos, S Farquhar, AN Gomez, TGJ Rudner, Z Kenton, L Smith, ...
Preprint at https://arxiv.org/abs/1912.10481, 2019
Cited by 24 (2019)
The separate universe approach to soft limits
Z Kenton, DJ Mulryne
Journal of Cosmology and Astroparticle Physics 2016 (10), 035, 2016
Cited by 22 (2016)