# Voyager: An Open-Ended Embodied Agent with Large Language Models
<div align="center">

[[Website]](https://voyager.minedojo.org/)
[[Arxiv]](https://arxiv.org/abs/2305.16291)
[[PDF]](https://voyager.minedojo.org/assets/documents/voyager.pdf)
[[Tweet]](https://twitter.com/DrJimFan/status/1662115266933972993?s=20)

[![Python Version](https://img.shields.io/badge/Python-3.9-blue.svg)](https://github.com/MineDojo/Voyager)
[![GitHub license](https://img.shields.io/github/license/MineDojo/Voyager)](https://github.com/MineDojo/Voyager/blob/main/LICENSE)
______________________________________________________________________

https://github.com/MineDojo/Voyager/assets/25460983/ce29f45b-43a5-4399-8fd8-5dd105fd64f2

![](images/pull.png)

</div>

We introduce Voyager, the first LLM-powered embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention. Voyager consists of three key components: 1) an automatic curriculum that maximizes exploration, 2) an ever-growing skill library of executable code for storing and retrieving complex behaviors, and 3) a new iterative prompting mechanism that incorporates environment feedback, execution errors, and self-verification for program improvement. Voyager interacts with GPT-4 via blackbox queries, which bypasses the need for model parameter fine-tuning. The skills developed by Voyager are temporally extended, interpretable, and compositional, which compounds the agent’s abilities rapidly and alleviates catastrophic forgetting. Empirically, Voyager shows strong in-context lifelong learning capability and exhibits exceptional proficiency in playing Minecraft. It obtains 3.3× more unique items, travels 2.3× longer distances, and unlocks key tech tree milestones up to 15.3× faster than prior SOTA.
Voyager is able to utilize the learned skill library in a new Minecraft world to solve novel tasks from scratch, while other techniques struggle to generalize.

In this repo, we provide the Voyager code. This codebase is under the [MIT License](LICENSE).

# Installation
Voyager requires Python ≥ 3.9 and Node.js ≥ 16.13.0. We have tested on Ubuntu 20.04, Windows 11, and macOS. Follow the instructions below to install Voyager.

## Python Install
```
git clone https://github.com/MineDojo/Voyager
cd Voyager
pip install -e .
```

## Node.js Install
In addition to the Python dependencies, you need to install the following Node.js packages:
```
cd voyager/env/mineflayer
npm install -g npx
npm install
cd mineflayer-collectblock
npx tsc
cd ..
npm install
```

## Minecraft Instance Install

Voyager depends on the Minecraft game. You need to install Minecraft and set up a Minecraft instance.

Follow the instructions in [Minecraft Login Tutorial](installation/minecraft_instance_install.md) to set up your Minecraft instance.

## Fabric Mods Install

You need to install Fabric mods to support all the features in Voyager. Remember to use the correct Fabric version for all the mods.

Follow the instructions in [Fabric Mods Install](installation/fabric_mods_install.md) to install the mods.

# Getting Started
Voyager uses OpenAI's GPT-4 as the language model. You need an OpenAI API key to use Voyager. You can get one from [here](https://platform.openai.com/account/api-keys).
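
Because the API key is a secret, one option is to keep it out of source files. The sketch below reads the key from an environment variable and fails early if it is missing. The helper name `load_openai_api_key` and the variable name `OPENAI_API_KEY` are illustrative assumptions, not part of the Voyager codebase.

```python
import os


def load_openai_api_key(env=os.environ, var_name="OPENAI_API_KEY"):
    """Return the API key stored under `var_name`, or raise if it is unset.

    Both the helper and the variable name are hypothetical conveniences;
    Voyager itself only needs the key passed in as a string.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before running Voyager."
        )
    return key
```

The returned string can then be supplied wherever a snippet in this README uses `openai_api_key`, for example `openai_api_key = load_openai_api_key()`.
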
After the installation process, you can run Voyager by:
```python
from voyager import Voyager

# You can also use mc_port instead of azure_login, but azure_login is highly recommended
azure_login = {
    "client_id": "YOUR_CLIENT_ID",
    "redirect_url": "https://127.0.0.1/auth-response",
    "secret_value": "[OPTIONAL] YOUR_SECRET_VALUE",
    "version": "fabric-loader-0.14.18-1.19", # the version Voyager is tested on
}
openai_api_key = "YOUR_API_KEY"

voyager = Voyager(
    azure_login=azure_login,
    openai_api_key=openai_api_key,
)

# start lifelong learning
voyager.learn()
```

* If you are running with `Azure Login` for the first time, it will ask you to follow the command line instructions to generate a config file.
* For `Azure Login`, you also need to select the world and open it to LAN yourself. After you run `voyager.learn()`, the game will pop up shortly. You then need to:
  1. Select `Singleplayer` and press `Create New World`.
  2. Set Game Mode to `Creative` and Difficulty to `Peaceful`.
  3. After the world is created, press the `Esc` key and press `Open to LAN`.
  4. Select `Allow cheats: ON` and press `Start LAN World`. You will see the bot join the world soon.

# Resume from a checkpoint during learning

If you stop the learning process and want to resume from a checkpoint later, you can instantiate Voyager by:
```python
from voyager import Voyager

# azure_login and openai_api_key are defined as in Getting Started.
voyager = Voyager(
    azure_login=azure_login,
    openai_api_key=openai_api_key,
    ckpt_dir="YOUR_CKPT_DIR",
    resume=True,
)
```

# Run Voyager for a specific task with a learned skill library

If you want to run Voyager for a specific task with a learned skill library, you should first pass the skill library directory to Voyager:
```python
from voyager import Voyager

# First instantiate Voyager with skill_library_dir.
# azure_login and openai_api_key are defined as in Getting Started.
voyager = Voyager(
    azure_login=azure_login,
    openai_api_key=openai_api_key,
    skill_library_dir="./skill_library/trial1", # Load a learned skill library.
    ckpt_dir="YOUR_CKPT_DIR", # Feel free to use a new dir. Do not use the same dir as skill library because new events will still be recorded to ckpt_dir.
    resume=False, # Do not resume from a skill library because this is not learning.
)
```
Then, you can run task decomposition. Note that the decomposition is occasionally illogical. If the printed sub-goals look flawed, you can rerun the decomposition.
```python
# Run task decomposition
task = "YOUR TASK" # e.g. "Craft a diamond pickaxe"
sub_goals = voyager.decompose_task(task=task)
```
Finally, you can run the sub-goals with the learned skill library:
```python
voyager.inference(sub_goals=sub_goals)
```

For all valid skill libraries, see [Learned Skill Libraries](skill_library/README.md).

# FAQ
If you have any questions, please check our [FAQ](FAQ.md) first before opening an issue.

# Paper and Citation

If you find our work useful, please consider citing us!

```bibtex
@article{wang2023voyager,
  title   = {Voyager: An Open-Ended Embodied Agent with Large Language Models},
  author  = {Guanzhi Wang and Yuqi Xie and Yunfan Jiang and Ajay Mandlekar and Chaowei Xiao and Yuke Zhu and Linxi Fan and Anima Anandkumar},
  year    = {2023},
  journal = {arXiv preprint arXiv: Arxiv-2305.16291}
}
```

Disclaimer: This project is strictly for research purposes, and not an official product from NVIDIA.