| Title | Authors | Venue | Cited by | Year |
| --- | --- | --- | --- | --- |
| Gemini: a family of highly capable multimodal models | G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... | arXiv preprint arXiv:2312.11805 | 550 | 2023 |
| Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... | arXiv preprint arXiv:2403.05530 | 18 | 2024 |
| Mory flips of Type A (provisional title) | G Brown, M Reid | in preparation | 5 | |
| A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding | A Burns, K Srinivasan, J Ainslie, G Brown, BA Plummer, K Saenko, J Ni, ... | arXiv preprint arXiv:2305.03668 | 3 | 2023 |
| Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning for Versatile Multimodal Modeling | Y Wang, J Wu, T Dabral, J Zhang, G Brown, CT Lu, F Liu, Y Liang, B Pang, ... | arXiv preprint arXiv:2310.12100 | 2 | 2023 |
| WikiWeb2M: A Page-Level Multimodal Wikipedia Dataset | A Burns, K Srinivasan, J Ainslie, G Brown, BA Plummer, K Saenko, J Ni, ... | arXiv preprint arXiv:2305.05432 | 1 | 2023 |