# En3D - Official PyTorch Implementation

### [Project page](https://menyifang.github.io/projects/En3D/index.html) | [Paper](https://arxiv.org/abs/2401.01173) | [Video](https://www.youtube.com/watch?v=YxMjaKgGdCc&t=5s) | [Online Demo](https://modelscope.cn/studios/alibaba_openvision_3dgen/En3D/summary)

**En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data**<br>
[Yifang Men](https://menyifang.github.io/), [Biwen Lei](mailto:[email protected]), [Yuan Yao](mailto:[email protected]), [Miaomiao Cui](mailto:[email protected]), [Zhouhui Lian](https://www.icst.pku.edu.cn/zlian/), [Xuansong Xie](https://scholar.google.com/citations?user=M0Ei1zkAAAAJ&hl=en)<br>
In: CVPR 2024

En3D is a large 3D human generative model trained on millions of synthetic 2D images, independent of any pre-existing 3D or 2D assets. This repo contains an implementation of En3D and a series of applications built upon it. It also serves as a creative tool for producing realistic 3D avatars from seeds, text prompts, or images, with support for automatic character animation and FBX export. All outputs are compatible with modern graphics workflows.

**Generative 3D humans**<br>
https://github.com/menyifang/En3D/assets/47292223/8b57a74d-6270-4b37-ae1e-ee2c0baad51d

**Text guided synthesis**<br>
![demo](assets/demo_text.gif)

**Image guided synthesis**<br>
![demo](assets/demo_img.gif)

More results can be found on the [project page](https://menyifang.github.io/projects/En3D/index.html).

## Updates

(2024-10-29) The code for avatar generation & animation is available.

(2024-10-29) The pretrained weights are available on [ModelScope](https://modelscope.cn/models/alibaba_openvision_3dgen/cv_en3d_3d_human_generation).

(2024-01-15) ModelScope and Hugging Face online demos are available! Try out [![ModelScope Spaces](https://img.shields.io/badge/ModelScope-Spaces-blue)](https://modelscope.cn/studios/alibaba_openvision_3dgen/En3D/summary) [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/menyifang/En3D).

(2024-01-15) A Rigged & Animated 3D Human library (3DHuman-Syn) is released, containing ~1000 avatars produced by En3D for a quick start. Support for unlimited avatars and actions is coming soon!

(2024-01-03) The paper and video are released.

## Web Demo

- Integrated an online demo into [ModelScope](https://modelscope.cn/studios/alibaba_openvision_3dgen/En3D/summary). Try it out and have fun!
- Integrated an online demo into [Hugging Face Spaces 🤗](https://huggingface.co/spaces/menyifang/En3D). Try it out and have fun!

## Requirements

* We recommend Linux for performance and compatibility reasons.
* 1&ndash;8 high-end NVIDIA GPUs. We have done all testing and development using V100, RTX 3090, and A100 GPUs.
* 64-bit Python 3.8 and PyTorch 1.11.0 (or later). See https://pytorch.org for PyTorch install instructions.
* CUDA toolkit 11.3 or later. We use the custom CUDA extensions from the StyleGAN3 repo; see [Troubleshooting](https://github.com/NVlabs/stylegan3/blob/main/docs/troubleshooting.md#why-is-cuda-toolkit-installation-necessary) for details.
* Python libraries: see [requirements.txt](./requirements.txt) for exact library dependencies.
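Once the environment is set up (see the commands below), a quick sanity check can confirm that PyTorch and the CUDA toolkit meet the requirements above. This is a minimal sketch, not part of the En3D codebase:

```python
# Minimal environment sanity check (not part of the En3D codebase).
# Run it after completing the conda setup below.
import torch

print(torch.__version__)          # expect 1.11.0 or later
print(torch.version.cuda)         # expect 11.3 or later
print(torch.cuda.is_available())  # True once the NVIDIA driver and toolkit are visible
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. V100 / RTX 3090 / A100
```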
You can use the following commands with Anaconda to create and activate your Python environment:

- `cd En3D`
- `conda create -n en3d python=3.8`
- `conda activate en3d`
- `pip install -r requirements.txt`

## Quick Start

### WebUI usage

[Recommended] Try the deployed [Online Demo](https://modelscope.cn/studios/alibaba_openvision_3dgen/En3D/summary).

![app](assets/app_thumb.png)

You can also deploy the demo on your own machine, which provides a flexible user interface. Both CPU and GPU are supported for avatar animation; avatar generation and rendering require a GPU (>24 GB memory).

```bash
python app.py
```

### Synthetic avatar library

We released a Rigged & Animated 3D Human library (3DHuman-Syn) containing ~1000 characters produced by En3D, along with 1000+ actions for animation.

- Avatar download and rendering

```bash
python render.py
```

- Avatar animation

```bash
python animation.py
```

- AR application <a href="https://3d-studio123.github.io/"><img src="https://img.shields.io/badge/Open in-iphone-blue" alt="open in iPhone badge"></a>

Convert the generated animation file (.glb) to .usdz format using Sketchfab (upload the .glb and download the .usdz) or other tools, and embed the animated avatar in a web page with [model-viewer](https://modelviewer.dev/):

```html
<!-- Import the component -->
<script type="module" src="https://ajax.googleapis.com/ajax/libs/model-viewer/3.3.0/model-viewer.min.js"></script>

<!-- Use it like any other HTML element -->
<model-viewer src="assets/human.glb" camera-controls ar shadow-intensity="1" ios-src="assets/human.usdz"></model-viewer>
```

The AR function is only supported on iPhone; open the [AR example](https://3d-studio123.github.io/) in the phone browser and tap the lower-right icon for the AR experience.

## Avatar Generation

### Download

- Download the whole `models` folder from the [link](https://modelscope.cn/models/alibaba_openvision_3dgen/cv_en3d_3d_human_generation/files) and put it under the root dir.

```bash
python download_models.py
```

### Generate 360° renderings

```bash
bash run_seed_pretrained.sh
```

### Text guided synthesis

```bash
bash run_text_synthesis.sh
```

https://github.com/user-attachments/assets/7ca89be8-9f5c-410d-9168-9f241778a197

### Image guided synthesis

Repose the human image to the canonical A-pose first, then use the following command to generate the 3D avatar:

```bash
bash run_img_synthesis.sh
```

https://github.com/user-attachments/assets/10ddc739-3ed5-4875-a6d0-908c8dbfcb31

## Avatar Animation

Auto-rig and animate the generated 3D avatar:

```bash
bash run_animate.sh
```

https://github.com/user-attachments/assets/581089a4-971d-4431-a272-e9663fe5bad5
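Before converting an animated avatar to .usdz for the AR application above, you can sanity-check its glTF structure. The sketch below is illustrative and not part of the En3D codebase: it assumes the third-party `pygltflib` package is installed (`pip install pygltflib`), and the output path is hypothetical.

```python
# Minimal GLB inspection sketch (assumptions: pygltflib is installed;
# "results/human.glb" is a hypothetical run_animate.sh output path).
from pygltflib import GLTF2

gltf = GLTF2().load("results/human.glb")

# A rigged, animated avatar should carry at least one skin and one animation.
print(f"meshes:     {len(gltf.meshes)}")
print(f"skins:      {len(gltf.skins)}")
print(f"animations: {len(gltf.animations)}")
for anim in gltf.animations:
    print(f"  clip {anim.name!r}: {len(anim.channels)} channels")
```

If the file lists zero skins or animations, the auto-rigging step did not produce an animated asset, and the .usdz conversion would yield a static model.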
## Acknowledgments

Here are some great resources we benefit from:

- [Multiview-Avatar](https://github.com/ArcherFMY/Multiview-Avatar) for 2D avatar synthesis
- [EG3D](https://nvlabs.github.io/eg3d/) for the general 3D GAN structure and 3D representation
- [Fantasia3D](https://github.com/Gorilla-Lab-SCUT/Fantasia3D) for DMTet optimization
- [ICON](https://github.com/YuliangXiu/ICON) for normal estimation

## Citation

If you find this code useful for your research, please use the following BibTeX entry.

```bibtex
@inproceedings{men2024en3d,
  title={En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data},
  author={Men, Yifang and Lei, Biwen and Yao, Yuan and Cui, Miaomiao and Lian, Zhouhui and Xie, Xuansong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}
```