Atari-GPT

Welcome to Atari-GPT

Atari-GPT introduces a novel benchmark for evaluating multimodal large language models (LLMs) as low-level controllers in Atari games. The research explores the potential of these models in dynamic, visually rich environments.

Explore the Highlights

Want to read more? You can read the full paper here.

Watch GPT-4o Play Atari!

Want to see more? Watch all the LLMs play Atari here!

GPT-4o Gameplay

How to Cite

If you find our work useful, please use the following citation:

        @misc{waytowich2024atarigptinvestigatingcapabilitiesmultimodal,
            title={Atari-GPT: Investigating the Capabilities of Multimodal Large Language Models as Low-Level Policies for Atari Games}, 
            author={Nicholas R. Waytowich and Devin White and MD Sunbeam and Vinicius G. Goecks},
            year={2024},
            eprint={2408.15950},
            archivePrefix={arXiv},
            primaryClass={cs.AI},
            url={https://arxiv.org/abs/2408.15950}
        }

You can also find our paper on arXiv.