base on A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.

# OS-Copilot: Towards Generalist Computer Agents with Self-Improvement

<div align="center">

<!-- [[PDF]](https://arxiv.org/pdf/2402.07456.pdf) [[Documentation]](https://os-copilot.readthedocs.io/en/latest/) -->

[![Website](https://img.shields.io/website?url=https://os-copilot.github.io/)](https://os-copilot.github.io/)
[![Paper](https://img.shields.io/badge/paper--blue)](https://arxiv.org/pdf/2402.07456.pdf)
[![Documentation](https://img.shields.io/badge/documentation--blue)](https://os-copilot.readthedocs.io/en/latest/)
![Python](https://img.shields.io/badge/python-3.10-blue)
[![Discord](https://img.shields.io/discord/1222168244673314847?logo=discord&style=flat)](https://discord.com/invite/rXS2XbgfaD)
[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/cloudposse.svg?style=social&label=Follow%20%40oscopilot)](https://twitter.com/oscopilot)

<p align="center">
  <img src='pic/demo.png' width="100%">
</p>

</div>

<!--
## 📖 Overview

- **OS-Copilot** is a pioneering conceptual framework for building generalist computer agents on Linux and MacOS, which provides a unified interface for app interactions in the heterogeneous OS ecosystem.

<p align="center">
  <img src='pic/framework.png' width="75%">
</p>

- Leveraging OS-Copilot, we built **FRIDAY**, a self-improving AI assistant capable of solving general computer tasks.

<p align="center">
  <img src='pic/FRIDAY.png' width="75%">
</p>
-->

## 🔥 News

- _2024.9_: 🎉 Friday is now equipped with vision! Try out the new [friday_vision](https://github.com/OS-Copilot/OS-Copilot/tree/main/examples/friday_vision)! It is still under development, but more stable versions are expected soon.
- _2024.6_: 🎉 The front-end interface of OS-Copilot is now available. Go check it out in the [frontend](https://github.com/OS-Copilot/OS-Copilot/tree/main/fronted) directory!
- _2024.3_: 🎉 OS-Copilot is accepted at the [LLM Agents Workshop](https://llmagents.github.io/)@ICLR 2024!

## What is OS-Copilot

OS-Copilot is an open-source library for building generalist agents that can automatically interface with a wide range of elements in an operating system (OS), including the web, code terminals, files, multimedia, and various third-party applications.

## ⚡️ Quickstart

1. **Clone the GitHub Repository:**

   ```
   git clone https://github.com/OS-Copilot/OS-Copilot.git
   ```

2. **Set Up Python Environment and Install Dependencies:**

   ```
   conda create -n oscopilot_env python=3.10 -y
   conda activate oscopilot_env

   cd OS-Copilot
   pip install -e .
   ```

3. **Set OpenAI API Key:** Copy the template, then configure your OpenAI API key in [.env](.env).

   ```
   cp .env_template .env
   ```

4. **Now you are ready to have fun:**

   ```
   python quick_start.py
   ```

\* **FRIDAY currently only supports single-round conversation**.

## 🛠️ Tutorial

| **Level** | **Tutorial** | **Description** |
| --- | --- | --- |
| **Beginner** | [Installation](https://os-copilot.readthedocs.io/en/latest/installation.html) | Explore three methods to install FRIDAY. |
| **Beginner** | [Getting Started](https://os-copilot.readthedocs.io/en/latest/quick_start.html) | The simplest demonstration of FRIDAY with a quick_start.py script. |
| **Intermediate** | [Adding Your Tools](https://os-copilot.readthedocs.io/en/latest/tutorials/add_tool.html) | Adding tools to and removing tools from FRIDAY. |
| **Intermediate** | [Deploying API Services](https://os-copilot.readthedocs.io/en/latest/tutorials/deploy_api_service.html) | Demonstrates the deployment of API services for FRIDAY. |
| **Intermediate** | [Example: Automating Excel Tasks](https://os-copilot.readthedocs.io/en/latest/tutorials/example_excel.html) | Automating Excel control using FRIDAY. |
| **Intermediate** | [Enhancing FRIDAY with Self-Learning for Excel Task Automation](https://os-copilot.readthedocs.io/en/latest/tutorials/self_learning.html) | Improved Excel control with self-directed learning. |
| **Advanced** | [Designing New API Tools](https://os-copilot.readthedocs.io/en/latest/tutorials/design_new_api_tool.html) | Guides on deploying custom API tools for FRIDAY to extend its functionalities. |

<!--
## 🛠️ FRIDAY-Gizmos

We maintain an open-source library of toolkits for FRIDAY, which includes tools that can be directly utilized within FRIDAY. For a detailed list of tools, please see [FRIDAY-Gizmos](https://github.com/OS-Copilot/FRIDAY-Gizmos).

The usage methods are as follows:

1. Find the tool you want to use in [FRIDAY-Gizmos](https://github.com/OS-Copilot/FRIDAY-Gizmos) and download its tool code.
2. Add the tool to FRIDAY's toolkit:
```shell
python friday/tool_repository/manager/tool_manager.py --add --tool_name [tool_name] --tool_path [tool_path]
```
3. If you wish to remove a tool, you can run:
```shell
python friday/tool_repository/manager/tool_manager.py --delete --tool_name [tool_name]
```

## 💻 User Interface (UI)

**Enhance Your Experience with Our Intuitive Frontend!** This interface is crafted for effortless control of your agents. For more details, visit [FRIDAY Frontend](https://github.com/OS-Copilot/FRIDAY-front).

## ✨ Deploy API Services

For comprehensive guidelines on deploying API services, please refer to the [OS-Copilot documentation](https://os-copilot.readthedocs.io/en/latest/).
-->

## 💻 User Interface (UI)

**Enhance Your Experience with Our Intuitive Frontend!** This interface is crafted for effortless control of your agents. For more details, visit [OS-Copilot Frontend](https://github.com/OS-Copilot/OS-Copilot/tree/main/fronted).
## 🏫 Community

Join our community to connect with other enthusiasts, researchers, and developers:

- **[Discord](https://discord.com/invite/rXS2XbgfaD)**: Join our Discord server for real-time discussions and support.
- **[Twitter](https://twitter.com/oscopilot)**: Follow us on Twitter for the latest news, or tag us to share your demos!

## 👨‍💻‍ Contributing

**Visit [the roadmap](./docs/roadmap.md) to preview what the community is working on and become a contributor!**

<a href="https://github.com/OS-Copilot/OS-Copilot/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=OS-Copilot/OS-Copilot" />
</a>

<!-- Made with [contrib.rocks](https://contrib.rocks). -->

## 🛡 Disclaimer

OS-Copilot is provided "as is" without warranty of any kind. Users assume full responsibility for any risks associated with its use, including **potential data loss** or **changes to system settings**. The developers of OS-Copilot are not liable for any damages or losses resulting from its use. Users must ensure their actions comply with applicable laws and regulations.

## 🔎 Citation

```
@article{wu2024copilot,
  title={Os-copilot: Towards generalist computer agents with self-improvement},
  author={Wu, Zhiyong and Han, Chengcheng and Ding, Zichen and Weng, Zhenmin and Liu, Zhoumianze and Yao, Shunyu and Yu, Tao and Kong, Lingpeng},
  journal={arXiv preprint arXiv:2402.07456},
  year={2024}
}
```

## 📬 Contact

If you have any inquiries or suggestions, or wish to contact us for any reason, we warmly invite you to email us at [email protected].
## Star History ![Star History Chart](https://api.star-history.com/svg?repos=OS-Copilot/OS-Copilot&type=Date) ", Assign "at most 3 tags" to the expected json: {"id":"10118","tags":[]} "only from the tags list I provide: [{"id":39,"name":"3d-generation","display_name":"3D generation","slug":"3d-generation"},{"id":3,"name":"ai-agent","display_name":"AI agent","slug":"ai-agent"},{"id":8,"name":"ai-coding","display_name":"AI coding assistant","slug":"ai-coding"},{"id":5,"name":"ai-image","display_name":"AI image generation","slug":"ai-image"},{"id":9,"name":"ai-infrastructure","display_name":"AI infrastructure","slug":"ai-infrastructure"},{"id":10,"name":"ai-memory","display_name":"AI memory","slug":"ai-memory"},{"id":11,"name":"ai-skills","display_name":"AI skills","slug":"ai-skills"},{"id":12,"name":"ai-translation","display_name":"AI translation","slug":"ai-translation"},{"id":6,"name":"ai-video","display_name":"AI video generation","slug":"ai-video"},{"id":4,"name":"ai-voice","display_name":"AI voice","slug":"ai-voice"},{"id":7,"name":"ai-workflow","display_name":"AI workflow","slug":"ai-workflow"},{"id":22,"name":"audio-processing","display_name":"Audio processing","slug":"audio-processing"},{"id":29,"name":"authentication","display_name":"Authentication","slug":"authentication"},{"id":51,"name":"bundler","display_name":"Bundler","slug":"bundler"},{"id":41,"name":"chatbot","display_name":"Chatbot","slug":"chatbot"},{"id":27,"name":"cloud-native","display_name":"Cloud native","slug":"cloud-native"},{"id":1,"name":"computer-vision","display_name":"Computer vision","slug":"computer-vision"},{"id":37,"name":"crypto-trading","display_name":"Crypto trading","slug":"crypto-trading"},{"id":57,"name":"curated-list","display_name":"Curated list","slug":"curated-list"},{"id":54,"name":"data-streaming","display_name":"Data streaming","slug":"data-streaming"},{"id":35,"name":"data-visualization","display_name":"Data 
visualization","slug":"data-visualization"},{"id":16,"name":"database-backup","display_name":"Database backup","slug":"database-backup"},{"id":49,"name":"design-system","display_name":"Design system","slug":"design-system"},{"id":38,"name":"digital-human","display_name":"Digital human","slug":"digital-human"},{"id":34,"name":"document-processing","display_name":"Document processing","slug":"document-processing"},{"id":44,"name":"ecommerce","display_name":"E-commerce","slug":"ecommerce"},{"id":45,"name":"emulator","display_name":"Emulator","slug":"emulator"},{"id":46,"name":"file-management","display_name":"File management","slug":"file-management"},{"id":32,"name":"fintech","display_name":"Fintech","slug":"fintech"},{"id":31,"name":"game-development","display_name":"Game development","slug":"game-development"},{"id":24,"name":"headless-browser","display_name":"Headless browser","slug":"headless-browser"},{"id":52,"name":"headless-cms","display_name":"Headless CMS","slug":"headless-cms"},{"id":36,"name":"home-automation","display_name":"Home automation","slug":"home-automation"},{"id":20,"name":"image-editing","display_name":"Image editing","slug":"image-editing"},{"id":28,"name":"iot","display_name":"IoT","slug":"iot"},{"id":13,"name":"local-llm","display_name":"Local LLM","slug":"local-llm"},{"id":17,"name":"mcp","display_name":"MCP","slug":"mcp"},{"id":47,"name":"monitoring","display_name":"Monitoring","slug":"monitoring"},{"id":2,"name":"nlp","display_name":"NLP","slug":"nlp"},{"id":26,"name":"observability","display_name":"Observability","slug":"observability"},{"id":40,"name":"pentesting","display_name":"Pentesting","slug":"pentesting"},{"id":48,"name":"programming-examples","display_name":"Programming examples","slug":"programming-examples"},{"id":42,"name":"proxy","display_name":"Proxy","slug":"proxy"},{"id":14,"name":"rag","display_name":"RAG","slug":"rag"},{"id":56,"name":"resume-building","display_name":"Resume 
building","slug":"resume-building"},{"id":33,"name":"robotics","display_name":"Robotics","slug":"robotics"},{"id":30,"name":"search","display_name":"Search","slug":"search"},{"id":43,"name":"self-hosted","display_name":"Self-hosted","slug":"self-hosted"},{"id":50,"name":"static-analysis","display_name":"Static analysis","slug":"static-analysis"},{"id":18,"name":"synthetic-data","display_name":"Synthetic data","slug":"synthetic-data"},{"id":19,"name":"text-to-speech","display_name":"Text to speech","slug":"text-to-speech"},{"id":53,"name":"ui-components","display_name":"UI components","slug":"ui-components"},{"id":15,"name":"vector-database","display_name":"Vector database","slug":"vector-database"},{"id":21,"name":"video-editing","display_name":"Video editing","slug":"video-editing"},{"id":25,"name":"web-scraping","display_name":"Web scraping","slug":"web-scraping"},{"id":55,"name":"webassembly","display_name":"WebAssembly","slug":"webassembly"},{"id":23,"name":"workflow-automation","display_name":"Workflow automation","slug":"workflow-automation"}]" returns me the "expected json"