by showlab
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
# Add to your Claude Code skills
git clone https://github.com/showlab/Awesome-GUI-Agent
<p align="center"> <img src="assets/teaser.webp" width="480px"/> </p> <p align="center"> Build a digital assistant on your screen. Generated by <a href="https://openai.com/index/dall-e-3/">DALL-E-3</a>. </p>

Contributions welcome!
🔥 This project is actively maintained, and we welcome your contributions. If you have any suggestions, such as missing papers or information, please feel free to open an issue or submit a pull request.
🤖 Try our Awesome-Paper-Agent. Provide an arXiv URL, and it automatically returns a formatted entry, like this:
User:
https://arxiv.org/abs/2312.13108
GPT:
+ [AssistGUI: Task-Oriented Desktop Graphical User Interface Automation](https://arxiv.org/abs/2312.13108) (Dec. 2023)
[Code](https://github.com/showlab/assistgui)
[Paper](https://arxiv.org/abs/2312.13108)
[Project page](https://showlab.github.io/assistgui/)
You can then copy this entry directly into your pull request.
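If you prefer to format an entry by hand, the first line of the example above can be reproduced with a few lines of Python. This is only a minimal sketch, not how Awesome-Paper-Agent works internally: the function names `arxiv_month` and `format_entry` are hypothetical, the title must be supplied by the caller (nothing is fetched from arXiv), and the date is inferred from the `YYMM` prefix of a new-style arXiv identifier, which may differ from the date a maintainer would choose for a conference paper.

```python
import re

# Month labels in the style used by the list entries (e.g. "Dec. 2023").
MONTHS = ["Jan.", "Feb.", "Mar.", "Apr.", "May", "Jun.",
          "Jul.", "Aug.", "Sep.", "Oct.", "Nov.", "Dec."]


def arxiv_month(url: str) -> str:
    """Derive 'Mon. YYYY' from a new-style arXiv identifier (YYMM.NNNNN)."""
    match = re.search(r"arxiv\.org/abs/(\d{2})(\d{2})\.\d{4,5}", url)
    if match is None:
        raise ValueError(f"not a new-style arXiv abs URL: {url}")
    year, month = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in arXiv identifier: {url}")
    # New-style identifiers started in 2007, so the century is always 20xx.
    return f"{MONTHS[month - 1]} {2000 + year}"


def format_entry(title: str, arxiv_url: str) -> str:
    """Build the first line of a list entry: '+ [title](url) (Mon. YYYY)'."""
    return f"+ [{title}]({arxiv_url}) ({arxiv_month(arxiv_url)})"


if __name__ == "__main__":
    print(format_entry(
        "AssistGUI: Task-Oriented Desktop Graphical User Interface Automation",
        "https://arxiv.org/abs/2312.13108",
    ))
```

Running the script prints the `+ [AssistGUI: ...](https://arxiv.org/abs/2312.13108) (Dec. 2023)` line from the example; the code, paper, and project links still need to be added manually.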
⭐ If you find this repository useful, please give it a star.
Quick Navigation: [Datasets / Benchmarks] [Models / Agents] [Surveys] [Projects]
World of Bits: An Open-Domain Platform for Web-Based Agents (Aug. 2017, ICML 2017)
A Unified Solution for Structured Web Data Extraction (Jul. 2011, SIGIR 2011)
Reinforcement Learning on Web Interfaces using Workflow-Guided Exploration (Feb. 2018, ICLR 2018)
[Mapping Natural Language Instructions to Mobile UI Action Sequences](https://arxiv.org/abs/2005.0...