<h1 align="center">🧠mindcraft⛏️</h1>

<h1 align="center">
  <a href="https://trendshift.io/repositories/9163" target="_blank"><img src="https://trendshift.io/api/badge/repositories/9163" alt="kolbytn%2Fmindcraft | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
</h1>

<p align="center">Crafting minds for Minecraft with LLMs and <a href="https://prismarinejs.github.io/mineflayer/#/">Mineflayer!</a></p>

<p align="center">
  <a href="https://github.com/mindcraft-bots/mindcraft/blob/main/FAQ.md">FAQ</a> |
  <a href="https://discord.gg/mp73p35dzC">Discord Support</a> |
  <a href="https://www.youtube.com/watch?v=gRotoL8P8D8">Video Tutorial</a> |
  <a href="https://kolbynottingham.com/mindcraft/">Blog Post</a> |
  <a href="https://github.com/users/kolbytn/projects/1">Contributor TODO</a> |
  <a href="https://mindcraft-minecollab.github.io/index.html">Paper Website</a> |
  <a href="https://github.com/mindcraft-bots/mindcraft/blob/main/minecollab.md">MineCollab</a>
</p>

> [!Caution]
> Do not connect this bot to public servers with coding enabled. This project allows an LLM to write and execute code on your computer. The code is sandboxed, but still vulnerable to injection attacks. Code writing is disabled by default; you can enable it by setting `allow_insecure_coding` to `true` in `settings.js`. Ye be warned.

# Getting Started

## Requirements

- [Minecraft Java Edition](https://www.minecraft.net/en-us/store/minecraft-java-bedrock-edition-pc) (up to v1.21.6; v1.21.6 recommended)
- [Node.js](https://nodejs.org/) (Node v18 or v20 LTS recommended; Node v24+ may cause issues with native dependencies)
- At least one API key from a supported API provider. See [supported APIs](#model-customization). OpenAI is the default.

> [!Important]
> If installing Node on Windows, ensure you check `Automatically install the necessary tools`.
>
> If you encounter `npm install` errors on macOS, see the [FAQ](FAQ.md#common-issues) for troubleshooting native module build issues.

## Install and Run

1. Make sure you have the requirements above.
2. Download the [latest release](https://github.com/mindcraft-bots/mindcraft/releases/latest) and unzip it, or clone the repository.
3. Rename `keys.example.json` to `keys.json` and fill in your API keys (you only need one). The desired model is set in `andy.json` or another profile; for other models, refer to the table below.
4. In a terminal/command prompt, run `npm install` from the installed directory.
5. Start a Minecraft world and open it to LAN on localhost port `55916`.
6. Run `node main.js` from the installed directory.

If you encounter issues, check the [FAQ](https://github.com/mindcraft-bots/mindcraft/blob/main/FAQ.md) or find support on [Discord](https://discord.gg/mp73p35dzC). We are currently not very responsive to GitHub issues.

To run tasks, please refer to the [Minecollab Instructions](minecollab.md#installation).

# Configuration

## Model Customization

You can configure project details in `settings.js`. [See file.](settings.js)

You can configure each agent's name, model, and prompts in its profile, like `andy.json`. The model is specified with the `model` field, with values like `model: "gemini-2.5-pro"`. You will need the correct API key for the API provider you choose; all supported APIs are listed below.
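For example, if you use OpenAI (the default provider), `keys.json` needs only that one entry. A minimal sketch, assuming the flat key-value layout of `keys.example.json` (the value is a placeholder, not a real key):

```json
{
    "OPENAI_API_KEY": "your-openai-api-key-here"
}
```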
<details>
<summary><strong>⭐ VIEW SUPPORTED APIs ⭐</strong></summary>

| API Name | Config Variable | Docs |
|------|------|------|
| `openai` | `OPENAI_API_KEY` | [docs](https://platform.openai.com/docs/models) |
| `google` | `GEMINI_API_KEY` | [docs](https://ai.google.dev/gemini-api/docs/models/gemini) |
| `anthropic` | `ANTHROPIC_API_KEY` | [docs](https://docs.anthropic.com/claude/docs/models-overview) |
| `xai` | `XAI_API_KEY` | [docs](https://docs.x.ai/docs) |
| `deepseek` | `DEEPSEEK_API_KEY` | [docs](https://api-docs.deepseek.com/) |
| `ollama` (local) | n/a | [docs](https://ollama.com/library) |
| `qwen` | `QWEN_API_KEY` | [Intl.](https://www.alibabacloud.com/help/en/model-studio/developer-reference/use-qwen-by-calling-api)/[cn](https://help.aliyun.com/zh/model-studio/getting-started/models) |
| `mistral` | `MISTRAL_API_KEY` | [docs](https://docs.mistral.ai/getting-started/models/models_overview/) |
| `replicate` | `REPLICATE_API_KEY` | [docs](https://replicate.com/collections/language-models) |
| `groq` (not grok) | `GROQCLOUD_API_KEY` | [docs](https://console.groq.com/docs/models) |
| `huggingface` | `HUGGINGFACE_API_KEY` | [docs](https://huggingface.co/models) |
| `novita` | `NOVITA_API_KEY` | [docs](https://novita.ai/model-api/product/llm-api?utm_source=github_mindcraft&utm_medium=github_readme&utm_campaign=link) |
| `openrouter` | `OPENROUTER_API_KEY` | [docs](https://openrouter.ai/models) |
| `glhf` | `GHLF_API_KEY` | [docs](https://glhf.chat/user-settings/api) |
| `hyperbolic` | `HYPERBOLIC_API_KEY` | [docs](https://docs.hyperbolic.xyz/docs/getting-started) |
| `vllm` | n/a | n/a |
| `cerebras` | `CEREBRAS_API_KEY` | [docs](https://inference-docs.cerebras.ai/introduction) |
| `mercury` | `MERCURY_API_KEY` | [docs](https://www.inceptionlabs.ai/) |

</details>

For more comprehensive model configuration and syntax, see [Model Specifications](#model-specifications).

For local models we support [ollama](https://ollama.com/), and we provide our own finetuned models for you to use. To install our models, install ollama and run the following terminal command:

```bash
ollama pull sweaterdog/andy-4:micro-q8_0 && ollama pull embeddinggemma
```

## Online Servers

To connect to online servers, your bot will need an official Microsoft/Minecraft account. You can use your own personal account, but you will need a second account if you want to connect at the same time and play alongside the bot. To connect, change these lines in `settings.js`:

```javascript
"host": "111.222.333.444",
"port": 55920,
"auth": "microsoft",

// rest is same...
```

> [!Important]
> The bot's name in the profile.json must exactly match the Minecraft profile name! Otherwise the bot will spam talk to itself. See the sketch below.

Mindcraft connects with whatever account the Minecraft launcher is currently signed into. To use a different account, switch accounts in the launcher, run `node main.js`, then switch back to your main account after the bot has connected.
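To illustrate the name-matching requirement above, here is a minimal profile sketch. The values are placeholders; `name` must equal the connecting account's Minecraft username:

```json
{
    "name": "YourMinecraftUsername",
    "model": "gpt-4o"
}
```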
## Tasks

Tasks automatically start the bot with a prompt and a goal item to acquire or a blueprint to construct.

To run a simple task that involves collecting 4 oak_logs, run:

`node main.js --task_path tasks/basic/single_agent.json --task_id gather_oak_logs`

Here is an example task json format:

```json
{
    "gather_oak_logs": {
        "goal": "Collect at least four logs",
        "initial_inventory": {
            "0": {
                "wooden_axe": 1
            }
        },
        "agent_count": 1,
        "target": "oak_log",
        "number_of_target": 4,
        "type": "techtree",
        "max_depth": 1,
        "depth": 0,
        "timeout": 300,
        "blocked_actions": {
            "0": [],
            "1": []
        },
        "missing_items": [],
        "requires_ctable": false
    }
}
```

The `initial_inventory` is what the bot will have at the start of the episode, `target` is the target item, and `number_of_target` is how many of the target item the agent must collect to successfully complete the task.

If you want more optimization and automatic launching of the Minecraft world, you will need to follow the instructions in the [Minecollab Instructions](minecollab.md#installation).

## Docker Container

If you intend to `allow_insecure_coding`, it is a good idea to run the app in a docker container to reduce the risks of running unknown code. This is strongly recommended before connecting to remote servers, although it still does not guarantee complete safety.

```bash
docker build -t mindcraft . && docker run --rm --add-host=host.docker.internal:host-gateway -p 8080:8080 -p 3000-3003:3000-3003 -e SETTINGS_JSON='{"auto_open_ui":false,"profiles":["./profiles/gemini.json"],"host":"host.docker.internal"}' --volume ./keys.json:/app/keys.json --name mindcraft mindcraft
```

or simply

```bash
docker-compose up --build
```

When running in docker, if you want the bot to join your local minecraft server, you have to use the special host address `host.docker.internal` to reach your localhost from inside the docker container. Put this into your [settings.js](settings.js):

```javascript
"host": "host.docker.internal", // instead of "localhost", to join your local minecraft from inside the docker container
```

To connect to an unsupported Minecraft version, you can try using [viaproxy](services/viaproxy/README.md).

# Bot Profiles

Bot profiles are json files (such as `andy.json`) that define:

1. Bot backend LLMs to use for talking, coding, and embedding.
2. Prompts used to influence the bot's behavior.
3. Examples that help the bot perform tasks.

## Model Specifications

LLM models can be specified simply as `"model": "gpt-4o"`, or more specifically with `"{api}/{model}"`, like `"openrouter/google/gemini-2.5-pro"`. See all supported APIs [here](#model-customization).

The `model` field can be a string or an object. A model object must specify an `api`, and can optionally specify a `model`, `url`, and additional `params`. You can also use different models/providers for chatting, coding, vision, embedding, and voice synthesis. See the example below.

```json
"model": {
    "api": "openai",
    "model": "gpt-4o",
    "url": "https://api.openai.com/v1/",
    "params": {
        "max_tokens": 1000,
        "temperature": 1
    }
},
"code_model": {
    "api": "openai",
    "model": "gpt-4",
    "url": "https://api.openai.com/v1/"
},
"vision_model": {
    "api": "openai",
    "model": "gpt-4o",
    "url": "https://api.openai.com/v1/"
},
"embedding": {
    "api": "openai",
    "url": "https://api.openai.com/v1/",
    "model": "text-embedding-ada-002"
},
"speak_model": "openai/tts-1/echo"
```

`model` is used for chat, `code_model` is used for newAction coding, `vision_model` is used for image interpretation, `embedding` is used to embed text for example selection, and `speak_model` is used for voice synthesis. If the other model fields are not specified, `model` is used for them by default.
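Because the string form already encodes the provider, a compact profile can also mix providers without full model objects. A sketch, assuming the `"{api}/{model}"` shorthand is accepted for the secondary model fields as well (model names are illustrative):

```json
"model": "anthropic/claude-3-5-sonnet-latest",
"code_model": "groq/llama-3.3-70b-versatile",
"vision_model": "openai/gpt-4o"
```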
Not all APIs support embeddings, vision, or voice synthesis. All APIs have default models and URLs, so those fields are optional.

The `params` field is optional and can be used to specify additional parameters for the model. It accepts any key-value pairs supported by the API. Params are not supported for embedding models.

## Embedding Models

Embedding models are used to embed text and efficiently select relevant examples for conversation and coding.

Supported Embedding APIs: `openai`, `google`, `replicate`, `huggingface`, `novita`

If you try to use an unsupported embedding model, it will default to a simple word-overlap method. Expect reduced performance; we recommend using a supported embedding API.

## Voice Synthesis Models

Voice synthesis models are used to narrate bot responses and are specified with `speak_model`. This field is parsed differently from other model fields and only supports strings formatted as `"{api}/{model}/{voice}"`, like `"openai/tts-1/echo"`. We only support `openai` and `google` for voice synthesis.

## Specifying Profiles via Command Line

By default, the program will use the profiles specified in `settings.js`. You can specify one or more agent profiles using the `--profiles` argument:

`node main.js --profiles ./profiles/andy.json ./profiles/jill.json`
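Equivalently, you can make profiles load by default by listing them in `settings.js`. A sketch, assuming the same `profiles` array shown in the Docker `SETTINGS_JSON` example above:

```javascript
"profiles": [
    "./profiles/andy.json",
    "./profiles/jill.json"
],
// rest of settings.js is unchanged...
```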
# Contributing

We welcome contributions to the project! We are generally less responsive to GitHub issues and more responsive to pull requests. Join the [discord](https://discord.gg/mp73p35dzC) for more active support and direction. While AI-generated code is allowed, please vet it carefully: submitting large amounts of sloppy code and documentation actively harms development.

## Patches

Some of the node modules that we depend on have bugs in them. To add a patch, change your local node module file and run `npx patch-package [package-name]`.

## Development Team

Thanks to all who contributed to the project, especially the official development team: [@MaxRobinsonTheGreat](https://github.com/MaxRobinsonTheGreat), [@kolbytn](https://github.com/kolbytn), [@icwhite](https://github.com/icwhite), [@Sweaterdog](https://github.com/Sweaterdog), [@Ninot1Quyi](https://github.com/Ninot1Quyi), [@riqvip](https://github.com/riqvip), [@uukelele-scratch](https://github.com/uukelele-scratch), [@mrelmida](https://github.com/mrelmida)

## Citation

This work is published in the paper [Collaborating Action by Action: A Multi-agent LLM Framework for Embodied Reasoning](https://arxiv.org/abs/2504.17950). Please use this citation if you use this project in your research:

```
@article{mindcraft2025,
    title   = {Collaborating Action by Action: A Multi-agent LLM Framework for Embodied Reasoning},
    author  = {White*, Isadora and Nottingham*, Kolby and Maniar, Ayush and Robinson, Max and Lillemark, Hansen and Maheshwari, Mehul and Qin, Lianhui and Ammanabrolu, Prithviraj},
    journal = {arXiv preprint arXiv:2504.17950},
    year    = {2025},
    url     = {https://arxiv.org/abs/2504.17950},
}
```