Generating Wikipedia by summarizing long sequences PJ Liu, M Saleh, E Pot, B Goodrich, R Sepassi, L Kaiser, N Shazeer arXiv preprint arXiv:1801.10198, 2018 | 539 | 2018 |
Tensor2Tensor for neural machine translation A Vaswani, S Bengio, E Brevdo, F Chollet, AN Gomez, S Gouws, L Jones, ... arXiv preprint arXiv:1803.07416, 2018 | 485 | 2018 |
Model-based reinforcement learning for Atari L Kaiser, M Babaeizadeh, P Milos, B Osinski, RH Campbell, ... arXiv preprint arXiv:1903.00374, 2019 | 480 | 2019 |
Mesh-TensorFlow: Deep learning for supercomputers N Shazeer, Y Cheng, N Parmar, D Tran, A Vaswani, P Koanantakool, ... Advances in neural information processing systems 31, 2018 | 218 | 2018 |
PaLM: Scaling language modeling with Pathways A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ... arXiv preprint arXiv:2204.02311, 2022 | 59 | 2022 |
Pathways: Asynchronous distributed dataflow for ML P Barham, A Chowdhery, J Dean, S Ghemawat, S Hand, D Hurt, M Isard, ... Proceedings of Machine Learning and Systems 4, 430-449, 2022 | 6 | 2022 |
Attention-based decoder-only sequence transduction neural networks NM Shazeer, LM Kaiser, E Pot, M Saleh, BD Goodrich, PJ Liu, R Sepassi US Patent App. 16/759,690, 2020 | 1 | 2020 |
Scaling Up Models and Data with t5x and seqio A Roberts, HW Chung, A Levskaya, G Mishra, J Bradbury, D Andor, ... arXiv preprint arXiv:2203.17189, 2022 | | 2022 |
Intelligent investing RS Sepassi Harvard University Cambridge Massachusetts, 2010 | | 2010 |