<div align="center">
<h3>Mobile-Agent: The Powerful Mobile Device Operation Assistant Family</h3>
<div align="center">
<a href="https://huggingface.co/spaces/junyangwang0410/PC-Agent"><img src="https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-sm-dark.svg" alt="Open in Spaces"></a>
<a href="https://www.modelscope.cn/studios/wangjunyang/PC-Agent"><img src="assets/Demo-ModelScope-brightgreen.svg" alt="Demo ModelScope"></a>
<a href="https://arxiv.org/abs/2502.14282"><img src="https://img.shields.io/badge/Arxiv-2502.14282-b31b1b.svg?logo=arXiv" alt=""></a>
<a href="https://arxiv.org/abs/2501.11733"><img src="https://img.shields.io/badge/Arxiv-2501.11733-b31b1b.svg?logo=arXiv" alt=""></a>
<a href="https://arxiv.org/abs/2406.01014"><img src="https://img.shields.io/badge/Arxiv-2406.01014-b31b1b.svg?logo=arXiv" alt=""></a>
<a href="https://arxiv.org/abs/2401.16158"><img src="https://img.shields.io/badge/Arxiv-2401.16158-b31b1b.svg?logo=arXiv" alt=""></a>
</div>
<p align="center">
<a href="https://trendshift.io/repositories/7423" target="_blank"><img src="https://trendshift.io/api/badge/repositories/7423" alt="MobileAgent | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
</p>
</div>
<div align="center">
<a href="README.md">English</a> | <a href="README_zh.md">简体中文</a> | <a href="README_ja.md">日本語</a>
<hr>
</div>
## 📺Demo
### Newest PC-Agent
See [paper](https://arxiv.org/abs/2502.14282) for details.
Try the [demo](https://huggingface.co/spaces/junyangwang0410/PC-Agent) on Hugging Face Spaces.
Try the [demo](https://www.modelscope.cn/studios/wangjunyang/PC-Agent) on ModelScope.
https://github.com/user-attachments/assets/b13bbb14-b39a-4c6b-b4a6-3df97de517dc
### Mobile-Agent-E
See the [project page](https://x-plug.github.io/MobileAgent) for video demos.
<!-- <div style="display: flex; justify-content: space-between; gap: 10px; flex-wrap: wrap;">
<video width="30%" controls>
<source src="https://raw.githubusercontent.com/X-PLUG/MobileAgent/main/Mobile-Agent-E/static/videos/bouldering_gym.mp4" type="video/mp4">
</video>
<video width="30%" controls>
<source src="https://raw.githubusercontent.com/X-PLUG/MobileAgent/main/Mobile-Agent-E/static/videos/shopping.mp4" type="video/mp4">
</video>
<video width="30%" controls>
<source src="https://raw.githubusercontent.com/X-PLUG/MobileAgent/main/Mobile-Agent-E/static/videos/survey.mp4" type="video/mp4">
</video>
</div> -->
### Mobile-Agent-v3 (Note: The video is not accelerated)
**YouTube**
[Watch on YouTube](https://www.youtube.com/watch?v=EMbIpzqJld0)
**Bilibili**
[Watch on Bilibili](https://www.bilibili.com/video/BV1pPvyekEsa/?share_source=copy_web&vd_source=47ffcd57083495a8965c8cdbe1a751ae)
### PC-Agent
**Chrome and DingTalk**
https://github.com/user-attachments/assets/b890a08f-8a2f-426d-9458-aa3699185030
**Word**
https://github.com/user-attachments/assets/37f0a0a5-3d21-4232-9d1d-0fe845d0f77d
### Mobile-Agent-v2
https://github.com/X-PLUG/MobileAgent/assets/127390760/d907795d-b5b9-48bf-b1db-70cf3f45d155
### Mobile-Agent
https://github.com/X-PLUG/MobileAgent/assets/127390760/26c48fb0-67ed-4df6-97b2-aa0c18386d31
## 📢News
* 🔥🔥[2.21.25] We have released an updated version of PC-Agent. Check the [paper](https://arxiv.org/abs/2502.14282) for details. The code will be updated soon.
* 🔥🔥[1.20.25] We propose [Mobile-Agent-E](https://x-plug.github.io/MobileAgent), a hierarchical multi-agent framework capable of self-evolution through past experience, achieving stronger performance on complex, multi-app tasks.
* 🔥🔥[9.26] Mobile-Agent-v2 has been accepted by **The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024)**.
* 🔥[8.23] We proposed PC-Agent, a **PC** operation assistant supporting both **Mac and Windows** platforms.
* 🔥[7.29] Mobile-Agent won the **best demo award** at ***The 23rd China National Conference on Computational Linguistics*** (CCL 2024). At CCL 2024, we showed the upcoming Mobile-Agent-v3. It has a smaller memory overhead (8 GB), faster reasoning (10s-15s per operation), and uses only open-source models. For the video demo, see the 📺Demo section.
* [6.27] We released a demo where you can upload mobile phone screenshots to try Mobile-Agent-v2 on [Hugging Face](https://huggingface.co/spaces/junyangwang0410/Mobile-Agent) and [ModelScope](https://modelscope.cn/studios/wangjunyang/Mobile-Agent-v2). You don’t need to configure models or devices, so you can try it immediately.
* [6.4] Modelscope-Agent now supports Mobile-Agent-v2, based on the Android ADB environment. See the [application](https://github.com/modelscope/modelscope-agent/tree/master/apps/mobile_agent) for details.
* [6.4] We proposed Mobile-Agent-v2, a mobile device operation assistant with effective navigation via multi-agent collaboration.
* [3.10] Mobile-Agent has been accepted by the **ICLR 2024 Workshop on Large Language Model (LLM) Agents**.
## 📱Version
* [PC-Agent](PC-Agent/README.md) - A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC
* [Mobile-Agent-E](Mobile-Agent-E/README.md) - Stronger performance on complex, long-horizon, reasoning-intensive tasks, with self-evolution capability
* [Mobile-Agent-v3](Mobile-Agent-v3/README.md)
* [Mobile-Agent-v2](Mobile-Agent-v2/README.md) - Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
* [Mobile-Agent](Mobile-Agent/README.md) - Autonomous Multi-Modal Mobile Device Agent with Visual Perception
## ⭐Star History
[Star History Chart](https://star-history.com/#X-PLUG/MobileAgent&Date)
## 📑Citation
If you find Mobile-Agent useful for your research and applications, please cite using this BibTeX:
```
@article{liu2025pc,
title={PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC},
author={Liu, Haowei and Zhang, Xi and Xu, Haiyang and Wanyan, Yuyang and Wang, Junyang and Yan, Ming and Zhang, Ji and Yuan, Chunfeng and Xu, Changsheng and Hu, Weiming and Huang, Fei},
journal={arXiv preprint arXiv:2502.14282},
year={2025}
}
@article{wang2025mobile,
title={Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks},
author={Wang, Zhenhailong and Xu, Haiyang and Wang, Junyang and Zhang, Xi and Yan, Ming and Zhang, Ji and Huang, Fei and Ji, Heng},
journal={arXiv preprint arXiv:2501.11733},
year={2025}
}
@article{wang2024mobile2,
title={Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration},
author={Wang, Junyang and Xu, Haiyang and Jia, Haitao and Zhang, Xi and Yan, Ming and Shen, Weizhou and Zhang, Ji and Huang, Fei and Sang, Jitao},
journal={arXiv preprint arXiv:2406.01014},
year={2024}
}
@article{wang2024mobile,
title={Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception},
author={Wang, Junyang and Xu, Haiyang and Ye, Jiabo and Yan, Ming and Shen, Weizhou and Zhang, Ji and Huang, Fei and Sang, Jitao},
journal={arXiv preprint arXiv:2401.16158},
year={2024}
}
```
## 📦Related Projects
* [AppAgent: Multimodal Agents as Smartphone Users](https://github.com/mnotgod96/AppAgent)
* [mPLUG-Owl & mPLUG-Owl2: Modularized Multimodal Large Language Model](https://github.com/X-PLUG/mPLUG-Owl)
* [Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond](https://github.com/QwenLM/Qwen-VL)
* [GroundingDINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection](https://github.com/IDEA-Research/GroundingDINO)
* [CLIP: Contrastive Language-Image Pretraining](https://github.com/openai/CLIP)