<p align="center">
<img src="./assets/logo.png" width="250"/>
</p>
<h2 align="center"> Pandora: Towards General World Model with Natural Language Actions and Video States</h2>
We introduce Pandora, a step towards a General World Model (GWM) that:
1. Simulates world states by generating videos in any domain
2. Allows any-time control with actions expressed in natural language
**Please refer to [world-model.ai](world-model.ai) for results.**
[[Website]](https://world-model.maitrix.org/)
[[Paper]](https://world-model.maitrix.org/assets/pandora.pdf)
[[Model]](https://huggingface.co/maitrix-org/Pandora)
[[Gallery]](https://world-model.maitrix.org/gallery.html)
<div align=center>
<img src="assets/architecture.png" width="780" alt="struct" align="center" />
</div>
## News
- __[2024/05/23]__ Release the model and inference code.
- __[2024/05/23]__ Launch the website and release the paper.
## Setup
```shell
conda create -n pandora python=3.11.0 nvidia/label/cuda-12.1.0::cuda-toolkit -y
conda activate pandora
pip install torch torchvision torchaudio
bash build_envs.sh
```
If your GPU doesn't support CUDA 12.1, you can also install with CUDA 11.8:
```shell
conda create -n pandora python=3.11.0 nvidia/label/cuda-11.8.0::cuda-toolkit -y
conda activate pandora
pip install torch torchvision torchaudio
bash build_envs.sh
```
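After either install, it can help to confirm that PyTorch actually sees the GPU before loading the model. The snippet below is a minimal sketch of such a check, not part of the Pandora codebase: `cuda_report` is a hypothetical helper name, and it relies only on standard PyTorch calls (`torch.cuda.is_available`, `torch.cuda.device_count`, `torch.version.cuda`).

```python
import importlib.util


def cuda_report():
    """Summarize whether PyTorch is installed and whether it can see a GPU.

    Hypothetical helper for illustration; not part of the Pandora repository.
    """
    report = {
        "torch_installed": False,
        "cuda_available": False,
        "cuda_version": None,
        "device_count": 0,
    }
    # Skip the import when torch is missing so the check itself never crashes.
    if importlib.util.find_spec("torch") is None:
        return report

    import torch

    report["torch_installed"] = True
    report["cuda_available"] = torch.cuda.is_available()
    report["cuda_version"] = torch.version.cuda  # None for CPU-only builds
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False`, the installed PyTorch build and the CUDA toolkit chosen above may not match the driver on your machine.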
## Inference
### Gradio Demo
1. Download the model checkpoint from [Hugging Face](https://huggingface.co/maitrix-org/Pandora). (***The model weights are currently hidden due to a data license issue. We will re-open them once this is resolved.***)
2. Run the following command in your terminal:
```shell
CUDA_VISIBLE_DEVICES={cuda_id} python gradio_app.py --ckpt_path {path_to_ckpt}
```
You can then interact with the model through the Gradio interface.
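The launch command can also be wrapped in a small script, which is convenient when the GPU id or checkpoint path comes from another program. The following is a sketch under stated assumptions: `build_launch` and `launch` are hypothetical helpers, and the only things taken from this README are the `gradio_app.py` script, its `--ckpt_path` flag, and the `CUDA_VISIBLE_DEVICES` variable.

```python
import os
import subprocess
import sys


def build_launch(cuda_id, ckpt_path, script="gradio_app.py"):
    """Return (env, argv) equivalent to the shell command in this README.

    Hypothetical helper for illustration; not part of the Pandora repository.
    """
    env = dict(os.environ)
    # Restrict the process to one GPU, as CUDA_VISIBLE_DEVICES={cuda_id} does.
    env["CUDA_VISIBLE_DEVICES"] = str(cuda_id)
    argv = [sys.executable, script, "--ckpt_path", str(ckpt_path)]
    return env, argv


def launch(cuda_id, ckpt_path):
    """Start the demo and block until it exits; fail early on a bad path."""
    if not os.path.exists(ckpt_path):
        raise FileNotFoundError(f"checkpoint not found: {ckpt_path}")
    env, argv = build_launch(cuda_id, ckpt_path)
    return subprocess.run(argv, env=env, check=True)
```

Calling `launch(0, path_to_ckpt)` from the repository root would then behave like the shell command above for GPU 0. Checking the path first gives a clearer error than letting the app fail while loading the checkpoint.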
## Citation
```bib
@article{xiang2024pandora,
title={Pandora: Towards General World Model with Natural Language Actions and Video States},
author={Jiannan Xiang and Guangyi Liu and Yi Gu and Qiyue Gao and Yuting Ning and Yuheng Zha and Zeyu Feng and Tianhua Tao and Shibo Hao and Yemin Shi and Zhengzhong Liu and Eric P. Xing and Zhiting Hu},
year={2024}
}
```