<p align="center">
<img src="assets/logo.png" height=150>
</p>
# WonderJourney: Going from Anywhere to Everywhere
<div align="center">
[![a](https://img.shields.io/badge/Website-WonderJourney-blue)](https://kovenyu.com/wonderjourney/)
[![arXiv](https://img.shields.io/badge/arXiv-2312.03884-red)](https://arxiv.org/abs/2312.03884)
[![twitter](https://img.shields.io/twitter/url?label=Koven_Yu&url=https%3A%2F%2Ftwitter.com%2FKoven_Yu)](https://twitter.com/Koven_Yu)
</div>
https://github.com/KovenYu/WonderJourney/assets/27218043/43c864b5-2416-4177-ae39-347150968bc3
https://github.com/KovenYu/WonderJourney/assets/27218043/70eb220d-2521-4033-b736-cf88755a3bcb
> #### [WonderJourney: Going from Anywhere to Everywhere](https://arxiv.org/abs/2312.03884)
> ##### [Hong-Xing "Koven" Yu](https://kovenyu.com/), [Haoyi Duan](https://haoyi-duan.github.io/), [Junhwa Hur](https://hurjunhwa.github.io/), [Kyle Sargent](https://kylesargent.github.io/), [Michael Rubinstein](https://people.csail.mit.edu/mrub/), [William T. Freeman](https://billf.mit.edu/), [Forrester Cole](https://people.csail.mit.edu/fcole/), [Deqing Sun](https://deqings.github.io/), [Noah Snavely](https://www.cs.cornell.edu/~snavely/), [Jiajun Wu](https://jiajunwu.com/), [Charles Herrmann](https://scholar.google.com/citations?user=LQvi5XAAAAAJ&hl=en)
## Getting Started
### Installation
Please proceed only if a CUDA-compatible GPU is available; the installation depends on it.
Running WonderJourney requires 24 GB of GPU memory.
Clone the repo and create the environment:
```bash
git clone https://github.com/KovenYu/WonderJourney.git
cd WonderJourney
mamba create --name wonderjourney python=3.10
mamba activate wonderjourney
```
We use <a href="https://github.com/facebookresearch/pytorch3d" target="_blank">PyTorch3D</a> for rendering.
Run the following commands to install it, or follow the official <a href="https://github.com/facebookresearch/pytorch3d/blob/main/INSTALL.md" target="_blank">installation guide</a> (this may take some time).
```bash
mamba install pytorch=1.13.0 torchvision pytorch-cuda=11.6 -c pytorch -c nvidia
mamba install -c fvcore -c iopath -c conda-forge fvcore iopath
mamba install -c bottler nvidiacub
mamba install pytorch3d -c pytorch3d
```
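If you want to verify the environment before continuing, a minimal sanity check like the one below (our sketch, not part of the repo) should report the PyTorch version, CUDA availability, and the PyTorch3D version without errors:
```python
# Minimal environment check (a sketch, run inside the wonderjourney env).
import torch
import pytorch3d

print("PyTorch:", torch.__version__)            # expected 1.13.0
print("CUDA available:", torch.cuda.is_available())  # should be True on a CUDA GPU
print("PyTorch3D:", pytorch3d.__version__)
```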
Install the rest of the requirements:
```bash
pip install -r requirements.txt
```
Download the English language model for spaCy:
```bash
python -m spacy download en_core_web_sm
```
Export your OpenAI API key (we use GPT-4 to generate scene descriptions):
```bash
export OPENAI_API_KEY='your_api_key_here'
```
Download the MiDaS DPT model and place it in the repository root directory:
```bash
wget https://github.com/isl-org/MiDaS/releases/download/v3_1/dpt_beit_large_512.pt
```
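As a final sanity check on the setup steps above, a short sketch like the following (again ours, not part of the repo; run it from the repository root) can catch a missing spaCy model, an unset API key, or a missing depth checkpoint early:
```python
# Setup sanity check (a sketch): spaCy model, OpenAI key, and MiDaS weights.
import os
import spacy

# spaCy English model downloaded above
spacy.load("en_core_web_sm")

# OpenAI API key exported above (needed for GPT-4 scene descriptions)
assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"

# MiDaS DPT checkpoint downloaded to the repo root
assert os.path.isfile("dpt_beit_large_512.pt"), "dpt_beit_large_512.pt not found in the repo root"
print("Setup looks complete.")
```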
### Run examples
- Example config file
To run an example, you first need to write a config. An example config, `./config/village.yaml`, is shown below:
```yaml
runs_dir: output/56_village
example_name: village
seed: -1
frames: 10
save_fps: 10
finetune_decoder_gen: True
finetune_decoder_interp: False # Turn this on for a higher-quality rendered video
finetune_depth_model: True
num_scenes: 4
num_keyframes: 2
use_gpt: True
kf2_upsample_coef: 4
skip_interp: False
skip_gen: False
enable_regenerate: True
debug: True
inpainting_resolution_gen: 512
rotation_range: 0.45
rotation_path: [0, 0, 0, 1, 1, 0, 0, 0]
camera_speed_multiplier_rotation: 0.2
```
The total number of frames in the generated example is `num_scenes` $\times$ `num_keyframes`. You can manually adjust `rotation_path` in the config file to control the camera's rotation in each frame: a value of $0$ means moving straight, $1$ means a right turn, and $-1$ means a left turn.
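In the village config above, `rotation_path` has exactly `num_scenes` $\times$ `num_keyframes` entries. A small sketch (ours, not part of the repo; it assumes PyYAML, which the YAML configs already imply) to check this consistency before starting a long run:
```python
# Preview a config before a long run (a sketch; assumes PyYAML is installed).
import yaml

with open("config/village.yaml") as f:
    cfg = yaml.safe_load(f)

total_frames = cfg["num_scenes"] * cfg["num_keyframes"]
print(f"Total frames to generate: {total_frames}")
print(f"rotation_path entries: {len(cfg['rotation_path'])}")

# In the shipped village example these two numbers match; warn if they do not.
if len(cfg["rotation_path"]) != total_frames:
    print("Warning: rotation_path length does not match num_scenes * num_keyframes")
```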
- Run
```bash
python run.py --example_config config/village.yaml
```
You will see results in `output/56_village/{time-string}_merged`.
### How to add more examples?
We highly encourage you to add new images and try new things!
You need to do the image-caption pairing separately (e.g., using DALL-E to generate the image and GPT-4V to generate the description). A small sketch for sanity-checking a new entry is shown at the end of this section.
- Add a new image in `./examples/images/`.
- Add content of this new image in `./examples/examples.yaml`.
Here is an example:
```yaml
- name: new_example
image_filepath: examples/images/new_example.png
style_prompt: DSLR 35mm landscape
content_prompt: scene name, object 1, object 2, object 3
negative_prompt: ''
background: ''
```
  - **content_prompt** follows the format "scene name, object 1, object 2, object 3"
  - **negative_prompt** and **background** are optional
For a controlled journey, you need to add `control_text`. Examples are as follows:
```yaml
- name: poem_jiangxue
image_filepath: examples/images/60_poem_jiangxue.png
style_prompt: black and white color ink painting
content_prompt: Expansive mountainous landscape, old man in traditional attire, calm river, mountains
negative_prompt: ""
background: ""
control_text: ["千山鸟飞绝", "万径人踪灭", "孤舟蓑笠翁", "独钓寒江雪"]
- name: poem_snowy_evening
image_filepath: examples/images/72_poem_snowy_evening.png
style_prompt: Monet painting
content_prompt: Stopping by woods on a snowy evening, woods, snow, village
negative_prompt: ""
background: ""
control_text: ["Snowy Woods and Farmhouse: A secluded farmhouse, a frozen lake, a dense thicket, a quiet meadow, a chilly wind, a pale twilight, a covered bridge, a rustic fence, a snow-laden tree, and a frosty ground", "The Traveler's Horse: A restless horse, a jingling harness, a snowy mane, a curious gaze, a sturdy hoof, a foggy breath, a leather saddle, a woolen blanket, a frost-covered tail, and a patient stance", "Snowfall in the Woods: A gentle snowflake, a whispering wind, a soft flurry, a white blanket, a twinkling icicle, a bare branch, a hushed forest, a crystalline droplet, a serene atmosphere, and a quiet night", "Deep, Dark Woods in the Evening: A mysterious grove, a shadowy tree, a darkened sky, a hidden trail, a silent owl, a moonlit glade, a dense underbrush, a quiet clearing, a looming branch, and an eerie stillness"]
```
- Write a config `config/new_example.yaml` for the new example, following `./config/village.yaml`
- Run
```bash
python run.py --example_config config/new_example.yaml
```
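Before launching a long run, a small hypothetical helper like the one below (not part of the repo; the entry name is a placeholder) can sanity-check the new `examples.yaml` entry:
```python
# Hypothetical helper (a sketch; assumes PyYAML) to validate a new example entry.
import os
import yaml

EXAMPLE_NAME = "new_example"  # placeholder: the name you added above

with open("examples/examples.yaml") as f:
    examples = yaml.safe_load(f)

entry = next((e for e in examples if e["name"] == EXAMPLE_NAME), None)
assert entry is not None, f"No entry named {EXAMPLE_NAME} in examples/examples.yaml"

# The referenced image must exist, and the content prompt should list the
# scene name followed by a few objects, as described above.
assert os.path.isfile(entry["image_filepath"]), "image_filepath does not exist"
assert len(entry["content_prompt"].split(",")) >= 2, "content_prompt looks too short"
print("Entry looks consistent:", entry["name"])
```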
## Citation
```
@article{yu2023wonderjourney,
title={WonderJourney: Going from Anywhere to Everywhere},
author={Yu, Hong-Xing and Duan, Haoyi and Hur, Junhwa and Sargent, Kyle and Rubinstein, Michael and Freeman, William T and Cole, Forrester and Sun, Deqing and Snavely, Noah and Wu, Jiajun and Herrmann, Charles},
journal={arXiv preprint arXiv:2312.03884},
year={2023}
}
```
## Acknowledgement
We thank the authors of [SceneScape](https://github.com/RafailFridman/SceneScape), [MiDaS](https://github.com/isl-org/MiDaS), [SAM](https://github.com/facebookresearch/segment-anything), [Stable Diffusion](https://huggingface.co/stabilityai/stable-diffusion-2-inpainting), and [OneFormer](https://github.com/SHI-Labs/OneFormer) for sharing their code.
", Assign "at most 3 tags" to the expected json: {"id":"5870","tags":[]} "only from the tags list I provide: [{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui "},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine "},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"