Trendshift - Ask AI

base on An open-source impl. of Large Reconstruction Models # OpenLRM: Open-Source Large Reconstruction Models [![Code License](https://img.shields.io/badge/Code%20License-Apache_2.0-yellow.svg)](LICENSE) [![Weight License](https://img.shields.io/badge/Weight%20License-CC%20By%20NC%204.0-red)](LICENSE_WEIGHT) [![LRM](https://img.shields.io/badge/LRM-Arxiv%20Link-green)](https://arxiv.org/abs/2311.04400) [![HF Models](https://img.shields.io/badge/Models-Huggingface%20Models-bron)](https://huggingface.co/zxhezexin) [![HF Demo](https://img.shields.io/badge/Demo-Huggingface%20Demo-blue)](https://huggingface.co/spaces/zxhezexin/OpenLRM) <img src="assets/rendered_video/teaser.gif" width="75%" height="auto"/> <div style="text-align: left"> <img src="assets/mesh_snapshot/crop.owl.ply00.png" width="12%" height="auto"/> <img src="assets/mesh_snapshot/crop.owl.ply01.png" width="12%" height="auto"/> <img src="assets/mesh_snapshot/crop.building.ply00.png" width="12%" height="auto"/> <img src="assets/mesh_snapshot/crop.building.ply01.png" width="12%" height="auto"/> <img src="assets/mesh_snapshot/crop.rose.ply00.png" width="12%" height="auto"/> <img src="assets/mesh_snapshot/crop.rose.ply01.png" width="12%" height="auto"/> </div> ## News - [2024.03.13] Update [training code](openlrm/runners/train) and release [OpenLRM v1.1.1](https://github.com/3DTopia/OpenLRM/releases/tag/v1.1.1). - [2024.03.08] We have released the core [blender script](scripts/data/objaverse/blender_script.py) used to render Objaverse images. - [2024.03.05] The [Huggingface demo](https://huggingface.co/spaces/zxhezexin/OpenLRM) now uses `openlrm-mix-base-1.1` model by default. Please refer to the [model card](model_card.md) for details on the updated model architecture and training settings. - [2024.03.04] Version update v1.1. Release model weights trained on both Objaverse and MVImgNet. Codebase is majorly refactored for better usability and extensibility. Please refer to [v1.1.0](https://github.com/3DTopia/OpenLRM/releases/tag/v1.1.0) for details. - [2024.01.09] Updated all v1.0 models trained on Objaverse. Please refer to [HF Models](https://huggingface.co/zxhezexin) and overwrite previous model weights. - [2023.12.21] [Hugging Face Demo](https://huggingface.co/spaces/zxhezexin/OpenLRM) is online. Have a try! - [2023.12.20] Release weights of the base and large models trained on Objaverse. - [2023.12.20] We release this project OpenLRM, which is an open-source implementation of the paper [LRM](https://arxiv.org/abs/2311.04400). ## Setup ### Installation ``` git clone https://github.com/3DTopia/OpenLRM.git cd OpenLRM ``` ### Environment - Install requirements for OpenLRM first. ``` pip install -r requirements.txt ``` - Please then follow the [xFormers installation guide](https://github.com/facebookresearch/xformers?tab=readme-ov-file#installing-xformers) to enable memory efficient attention inside [DINOv2 encoder](openlrm/models/encoders/dinov2/layers/attention.py). ## Quick Start ### Pretrained Models - Model weights are released on [Hugging Face](https://huggingface.co/zxhezexin). - Weights will be downloaded automatically when you run the inference script for the first time. - Please be aware of the [license](LICENSE_WEIGHT) before using the weights. | Model | Training Data | Layers | Feat. Dim | Trip. Dim. | In. Res. | Link | | :--- | :--- | :--- | :--- | :--- | :--- | :--- | | openlrm-obj-small-1.1 | Objaverse | 12 | 512 | 32 | 224 | [HF](https://huggingface.co/zxhezexin/openlrm-obj-small-1.1) | | openlrm-obj-base-1.1 | Objaverse | 12 | 768 | 48 | 336 | [HF](https://huggingface.co/zxhezexin/openlrm-obj-base-1.1) | | openlrm-obj-large-1.1 | Objaverse | 16 | 1024 | 80 | 448 | [HF](https://huggingface.co/zxhezexin/openlrm-obj-large-1.1) | | openlrm-mix-small-1.1 | Objaverse + MVImgNet | 12 | 512 | 32 | 224 | [HF](https://huggingface.co/zxhezexin/openlrm-mix-small-1.1) | | openlrm-mix-base-1.1 | Objaverse + MVImgNet | 12 | 768 | 48 | 336 | [HF](https://huggingface.co/zxhezexin/openlrm-mix-base-1.1) | | openlrm-mix-large-1.1 | Objaverse + MVImgNet | 16 | 1024 | 80 | 448 | [HF](https://huggingface.co/zxhezexin/openlrm-mix-large-1.1) | Model cards with additional details can be found in [model_card.md](model_card.md). ### Prepare Images - We put some sample inputs under `assets/sample_input`, and you can quickly try them. - Prepare RGBA images or RGB images with white background (with some background removal tools, e.g., [Rembg](https://github.com/danielgatis/rembg), [Clipdrop](https://clipdrop.co)). ### Inference - Run the inference script to get 3D assets. - You may specify which form of output to generate by setting the flags `EXPORT_VIDEO=true` and `EXPORT_MESH=true`. - Please set default `INFER_CONFIG` according to the model you want to use. E.g., `infer-b.yaml` for base models and `infer-s.yaml` for small models. - An example usage is as follows: ``` # Example usage EXPORT_VIDEO=true EXPORT_MESH=true INFER_CONFIG="./configs/infer-b.yaml" MODEL_NAME="zxhezexin/openlrm-mix-base-1.1" IMAGE_INPUT="./assets/sample_input/owl.png" python -m openlrm.launch infer.lrm --infer $INFER_CONFIG model_name=$MODEL_NAME image_input=$IMAGE_INPUT export_video=$EXPORT_VIDEO export_mesh=$EXPORT_MESH ``` ### Tips - The recommended PyTorch version is `>=2.1`. Code is developed and tested under PyTorch `2.1.2`. - If you encounter CUDA OOM issues, please try to reduce the `frame_size` in the inference configs. - You should be able to see `UserWarning: xFormers is available` if `xFormers` is actually working. ## Training ### Configuration - We provide a sample accelerate config file under `configs/accelerate-train.yaml`, which defaults to use 8 GPUs with `bf16` mixed precision. - You may modify the configuration file to fit your own environment. ### Data Preparation - We provide the core [Blender script](scripts/data/objaverse/blender_script.py) used to render Objaverse images. - Please refer to [Objaverse Rendering](https://github.com/allenai/objaverse-rendering) for other scripts including distributed rendering. ### Run Training - A sample training config file is provided under `configs/train-sample.yaml`. - Please replace data related paths in the config file with your own paths and customize the training settings. - An example training usage is as follows: ``` # Example usage ACC_CONFIG="./configs/accelerate-train.yaml" TRAIN_CONFIG="./configs/train-sample.yaml" accelerate launch --config_file $ACC_CONFIG -m openlrm.launch train.lrm --config $TRAIN_CONFIG ``` ### Inference on Trained Models - The inference pipeline is compatible with huggingface utilities for better convenience. - You need to convert the training checkpoint to inference models by running the following script. ``` python scripts/convert_hf.py --config <YOUR_EXACT_TRAINING_CONFIG> convert.global_step=null ``` - The converted model will be saved under `exps/releases` by default and can be used for inference following the [inference guide](https://github.com/3DTopia/OpenLRM?tab=readme-ov-file#inference). ## Acknowledgement - We thank the authors of the [original paper](https://arxiv.org/abs/2311.04400) for their great work! Special thanks to Kai Zhang and Yicong Hong for assistance during the reproduction. - This project is supported by Shanghai AI Lab by providing the computing resources. - This project is advised by Ziwei Liu and Jiaya Jia. ## Citation If you find this work useful for your research, please consider citing: ``` @article{hong2023lrm, title={Lrm: Large reconstruction model for single image to 3d}, author={Hong, Yicong and Zhang, Kai and Gu, Jiuxiang and Bi, Sai and Zhou, Yang and Liu, Difan and Liu, Feng and Sunkavalli, Kalyan and Bui, Trung and Tan, Hao}, journal={arXiv preprint arXiv:2311.04400}, year={2023} } ``` ``` @misc{openlrm, title = {OpenLRM: Open-Source Large Reconstruction Models}, author = {Zexin He and Tengfei Wang}, year = {2023}, howpublished = {\url{https://github.com/3DTopia/OpenLRM}}, } ``` ## License - OpenLRM as a whole is licensed under the [Apache License, Version 2.0](LICENSE), while certain components are covered by [NVIDIA's proprietary license](LICENSE_NVIDIA). Users are responsible for complying with the respective licensing terms of each component. - Model weights are licensed under the [Creative Commons Attribution-NonCommercial 4.0 International License](LICENSE_WEIGHT). They are provided for research purposes only, and CANNOT be used commercially. ", Assign "at most 3 tags" to the expected json: {"id":"6249","tags":[]} "only from the tags list I provide: [{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui "},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine "},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"

AI prompts