# Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis

## CVPR 2024

[Project Page](https://oppo-us-research.github.io/SpacetimeGaussians-website/) | [Paper](https://arxiv.org/abs/2312.16812) | [Video](https://youtu.be/YsPPmf-E6Lg) | [Viewer & Pre-Trained Models](https://huggingface.co/stack93/spacetimegaussians/tree/main)

This is an official implementation of the paper "Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis".</br>
[Zhan Li](https://lizhan17.github.io/web/)<sup>1,2</sup>,
[Zhang Chen](https://zhangchen8.github.io/)<sup>1,&dagger;</sup>,
[Zhong Li](https://sites.google.com/site/lizhong19900216)<sup>1,&dagger;</sup>,
[Yi Xu](https://www.linkedin.com/in/yi-xu-42654823/)<sup>1</sup> </br>
<sup>1</sup> OPPO US Research Center, <sup>2</sup> Portland State University </br>
<sup>&dagger;</sup> Corresponding authors </br>

<img src="assets/output.gif" width="100%"/></br>

## Updates and News
- `Jun 16, 2024`: Added a fully fused MLP for testing ours-full models on the Technicolor and Neural 3D datasets (40 FPS improvement over the paper).
- `Jun 13, 2024`: Fixed minor issues affecting reproducibility on the ```coffee_martini``` and ```flame_salmon_1``` scenes (~0.1 PSNR).
- `Jun 9, 2024`: Added support for lazy loading and for storing ground-truth images as int8 on the GPU.
- `Dec 28, 2023`: Paper and code are released.

## Table of Contents
1. [Installation](#installation)
1. [Preprocess Datasets](#processing-datasets)
1. [Training](#training)
1. [Testing](#testing)
1. [Real-Time Viewer](#real-time-viewer)
1. [Creating Your Gaussians](#create-your-new-representations-and-rendering-pipeline)
1. [License Information](#license-information)
1. [Acknowledgement](#acknowledgement)
1. [Citations](#citations)

## Installation

### Windows users with WSL2
Please first refer to [here](./script/wsl.md) to install WSL2 (Windows Subsystem for Linux 2) and its dependencies. You can then set up our repo inside the Linux subsystem the same way as other Linux users.

### Linux users
Clone the source code of this repo.
```
git clone https://github.com/oppo-us-research/SpacetimeGaussians.git --recursive
cd SpacetimeGaussians
```
Then run the following command to set up the environments with conda. Note that we create two environments: one for preprocessing with Colmap (```colmapenv```) and one for training and testing (```feature_splatting```). Training, testing and preprocessing have been tested on Ubuntu 20.04. </br>
```
bash script/setup.sh
```
Note that you may need to manually install the following packages if you encounter errors while running the command above. </br>
```
conda activate feature_splatting
pip install thirdparty/gaussian_splatting/submodules/gaussian_rasterization_ch9
pip install thirdparty/gaussian_splatting/submodules/gaussian_rasterization_ch3
pip install thirdparty/gaussian_splatting/submodules/forward_lite
pip install thirdparty/gaussian_splatting/submodules/forward_full
```
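After installation, a quick way to confirm that the training environment can see the GPU is a small PyTorch check like the one below. This is a minimal sketch: the file name ```sanity_check.py``` is hypothetical, and it only assumes that ```setup.sh``` installed PyTorch with CUDA support into ```feature_splatting```.
```
# sanity_check.py -- minimal sketch; assumes PyTorch with CUDA support is
# installed in the feature_splatting environment (file name is hypothetical).
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
    print("Compute capability:", torch.cuda.get_device_capability(0))
```
Run it with ```conda activate feature_splatting``` followed by ```python sanity_check.py```; if CUDA is not available here, training and the custom rasterizer submodules will not run.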
## Processing Datasets

Note: our paper uses sparse points that follow 3DGS. Our per-frame SfM points only use ```point_triangulator``` in Colmap instead of dense points.

### Neural 3D Dataset
Download the dataset from [here](https://github.com/facebookresearch/Neural_3D_Video.git). After downloading the dataset, you can run the following command to preprocess it. </br>
```
conda activate colmapenv
python script/pre_n3d.py --videopath <location>/<scene>
```
```<location>``` is the path to the dataset root folder, and ```<scene>``` is the name of a scene in the dataset. </br>
- For example, if you put the dataset at ```/home/Neural3D``` and want to preprocess the ```cook_spinach``` scene, you can run the following command
```
conda activate colmapenv
python script/pre_n3d.py --videopath /home/Neural3D/cook_spinach/
```
Our codebase expects the following directory structure for the Neural 3D Dataset after preprocessing:
```
<location>
|---cook_spinach
|   |---colmap_<0>
|   |---colmap_<...>
|   |---colmap_<299>
|---flame_salmon1
```

### Technicolor Dataset
Please reach out to the authors of the paper "Dataset and Pipeline for Multi-View Light-Field Video" for access to the Technicolor dataset. </br>
Our codebase expects the following directory structure for this dataset before preprocessing:
```
<location>
|---Fabien
|   |---Fabien_undist_<00257>_<08>.png
|   |---Fabien_undist_<.....>_<..>.png
|---Birthday
```
Then run the following command to preprocess the dataset. </br>
```
conda activate colmapenv
python script/pre_technicolor.py --videopath <location>/<scene>
```

### Google Immersive Dataset
Download the dataset from [here](https://github.com/augmentedperception/deepview_video_dataset). After downloading and unzipping the dataset, you can run the following commands to preprocess it. </br>
```
conda activate colmapenv
python script/pre_immersive_distorted.py --videopath <location>/<scene>
python script/pre_immersive_undistorted.py --videopath <location>/<scene>
```
```<location>``` is the path to the dataset root folder, and ```<scene>``` is the name of a scene in the dataset. Please rename the original files to the names listed in ```Immersiveseven``` in [here](./script/pre_immersive_distorted.py).
- For example, if you put the dataset at ```/home/immersive``` and want to preprocess the ```02_Flames``` scene, you can run the following command
```
conda activate colmapenv
python script/pre_immersive_distorted.py --videopath /home/immersive/02_Flames/
```
1. Our codebase expects the following directory structure for the immersive dataset before preprocessing:
```
<location>
|---02_Flames
|   |---camera_0001.mp4
|   |---camera_0002.mp4
|---09_Alexa
```
2. Our codebase expects the following directory structure for the immersive dataset (raw videos, decoded images, distorted and undistorted) after preprocessing:
```
<location>
|---02_Flames
|   |---camera_0001
|   |---camera_0001.mp4
|   |---camera_<...>
|---02_Flames_dist
|   |---colmap_<0>
|   |---colmap_<...>
|   |---colmap_<299>
|---02_Flames_undist
|   |---colmap_<0>
|   |---colmap_<...>
|   |---colmap_<299>
|---09_Alexa
|---09_Alexa_dist
|---09_Alexa_undist
```
3. Copy the picked-views file to the scene directory. The views are generated by running inference with our model initialized from ```duration=1``` points, without training. We provide the generated views as pkl files for reproducibility and simplicity.
   - For example, for the scene ```09_Alexa``` with the distortion model, copy ```configs/im_view/09_Alexa/pickview.pkl``` to ```<location>/09_Alexa_dist/pickview.pkl```.
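If you want to preprocess several scenes in one go, a small driver script can save some typing. The sketch below assumes a Neural 3D layout; the dataset root and the scene list are placeholders you would adjust, and the script is expected to be run from the repo root with ```colmapenv``` activated.
```
# batch_preprocess.py -- minimal sketch for preprocessing several Neural 3D
# scenes in sequence. Assumptions: run from the repo root inside colmapenv;
# DATASET_ROOT and SCENES below are example placeholders, not fixed names.
import subprocess
from pathlib import Path

DATASET_ROOT = Path("/home/Neural3D")          # adjust to your <location>
SCENES = ["cook_spinach", "flame_salmon_1"]    # example scene names

for scene in SCENES:
    videopath = DATASET_ROOT / scene
    print(f"Preprocessing {videopath} ...")
    subprocess.run(
        ["python", "script/pre_n3d.py", "--videopath", str(videopath)],
        check=True,
    )
```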
## Training

You can train our model by running the following command: </br>
```
conda activate feature_splatting
python train.py --quiet --eval --config configs/<dataset>_<lite|full>/<scene>.json --model_path <path to save model> --source_path <location>/<scene>/colmap_0
```
In the argument to ```--config```, ```<dataset>``` can be ```n3d``` (for the Neural 3D Dataset) or ```techni``` (for the Technicolor Dataset), and you can choose between the ```full``` model and the ```lite``` model. </br>
You need 24GB of GPU memory to train on the Neural 3D Dataset. </br>
You need 48GB of GPU memory to train on the Technicolor Dataset. </br>
The large memory requirement is because training images are loaded into GPU memory. </br>

- For example, if you want to train the **lite** model on the first 50 frames of the ```cook_spinach``` scene in the Neural 3D Dataset, you can run the following command </br>
```
python train.py --quiet --eval --config configs/n3d_lite/cook_spinach.json --model_path log/cook_spinach_lite --source_path <location>/cook_spinach/colmap_0
```

- If you want to train the **full** model, you can run the following command </br>
```
python train.py --quiet --eval --config configs/n3d_full/cook_spinach.json --model_path log/cook_spinach/colmap_0 --source_path <location>/cook_spinach/colmap_0
```
Please refer to the .json config files for more options.

- If you want to train the **full** model on the **distorted** immersive dataset, you can run the following command </br>
```
PYTHONDONTWRITEBYTECODE=1 python train_imdist.py --quiet --eval --config configs/im_distort_full/02_Flames.json --model_path log/02_Flames/colmap_0 --source_path <location>/02_Flames_dist/colmap_0
```
Note: stale ```__pycache__``` files can sometimes affect the results. Please remove every ```__pycache__``` folder and retrain the model without generating bytecode by setting ```PYTHONDONTWRITEBYTECODE=1```.

- If you want to train the **lite** model on the **undistorted** immersive dataset, run the following command. Note that we remove ```--eval``` to reuse the Technicolor loader and to train with all cameras. ```--gtmask 1``` is specifically for training with undistorted fisheye images that contain black pixels.
```
python train.py --quiet --gtmask 1 --config configs/im_undistort_lite/02_Flames.json --model_path log/02_Flames/colmap_0 --source_path <location>/02_Flames_undist/colmap_0
```
Please refer to the .json config files for more options.

## Testing

- Test the model on the Neural 3D Dataset
```
python test.py --quiet --eval --skip_train --valloader colmapvalid --configpath configs/n3d_<lite|full>/<scene>.json --model_path <path to model> --source_path <location>/<scene>/colmap_0
```
- Test the model on the Technicolor Dataset
```
python test.py --quiet --eval --skip_train --valloader technicolorvalid --configpath configs/techni_<lite|full>/<scene>.json --model_path <path to model> --source_path <location>/<scenename>/colmap_0
```
- Test on the Google Immersive Dataset with the distorted camera model. First, install the fused MLP layer.
```
pip install thirdparty/gaussian_splatting/submodules/forward_full
```
```
PYTHONDONTWRITEBYTECODE=1 CUDA_VISIBLE_DEVICES=0 python test.py --quiet --eval --skip_train --valloader immersivevalidss --configpath configs/im_distort_<lite|full>/<scene>.json --model_path <path to model> --source_path <location>/<scenename>/colmap_0
```
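If you want to double-check image quality on rendered frames outside of ```test.py```, a standalone PSNR computation over two folders of aligned images might look like the sketch below. The folder names ```renders``` and ```gt``` are assumptions for illustration, not paths produced by this codebase.
```
# psnr_folders.py -- standalone PSNR sketch over two folders of aligned
# images. Assumptions: "renders/" and "gt/" are hypothetical folders whose
# PNG files share the same names; requires numpy and Pillow.
from pathlib import Path

import numpy as np
from PIL import Image

def psnr(img_a: np.ndarray, img_b: np.ndarray) -> float:
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 20.0 * np.log10(255.0 / np.sqrt(mse))

render_dir, gt_dir = Path("renders"), Path("gt")
scores = []
for render_path in sorted(render_dir.glob("*.png")):
    render = np.asarray(Image.open(render_path).convert("RGB"))
    gt = np.asarray(Image.open(gt_dir / render_path.name).convert("RGB"))
    scores.append(psnr(render, gt))

print(f"Mean PSNR over {len(scores)} frames: {np.mean(scores):.2f} dB")
```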
## Real-Time Viewer

The viewer is based on [SIBR](https://sibr.gitlabpages.inria.fr/) and [Gaussian Splatting](https://github.com/graphdeco-inria/gaussian-splatting).

### Pre-built Windows Binary
Download the viewer binary from [this link](https://huggingface.co/stack93/spacetimegaussians/tree/main) and unzip it. The binary works on Windows with CUDA >= 11.0. We also provide pre-trained models at the same link. For example, [n3d_sear_steak_lite_allcam.zip](https://huggingface.co/stack93/spacetimegaussians/blob/main/n3d_sear_steak_lite_allcam.zip) contains the lite model that uses all views during training for the sear_steak scene in the Neural 3D Dataset.

### Installation from Source
Please see the commented text at the bottom of [this script](./script/setup.sh).

### Running the Real-Time Viewer
After downloading the pre-built binary or installing from source, you can use the following command to run the real-time viewer. Adjust ```--iteration``` to match the training iterations of the model. </br>
```
./<SIBR install dir>/bin/SIBR_gaussianViewer_app_rwdi.exe --iteration 25000 -m <path to trained model>
```
The above command has been tested on an Nvidia RTX 3050 Laptop GPU + Windows 10.
- For 8K rendering, you can use the following command. </br>
```
./<SIBR install dir>/bin/SIBR_gaussianViewer_app_rwdi.exe --iteration 25000 --rendering-size 8000 4000 --force-aspect-ratio 1 -m <path to trained model>
```
8K rendering has been tested on an Nvidia RTX 4090 + Windows 11.

### Third-Party Implemented Web Viewer
We thank Kevin Kwok (Antimatter15) for the amazing web viewer of our method, splaTV. The web viewer is released on [github](https://github.com/antimatter15/splaTV). You can view one of our scenes in the [web viewer](http://antimatter15.com/splaTV/).

## Create Your New Representations and Rendering Pipeline

If you want to customize our codebase for your own models, you can refer to the following steps. </br>
- Step 1: Create a new Gaussian representation in this [folder](./thirdparty/gaussian_splatting/scene/). You can use ```oursfull.py``` or ```ourslite.py``` as a template. </br>
- Step 2: Create a new rendering pipeline in this [file](./thirdparty/gaussian_splatting/renderer/__init__.py). You can use the ```train_ours_full``` function as a template. </br>
- Step 3 (optional, for new datasets): Create a new dataloader in this [file](./thirdparty/gaussian_splatting/scene/__init__.py) and this [file](./thirdparty/gaussian_splatting/scene/dataset_readers.py). </br>
- Step 4: Update the intermediate API in the ```getmodel``` (for Step 1) and ```getrenderpip``` (for Step 2) functions in ```helper_train.py```; see the sketch after this list. </br>
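For orientation, the dispatch in ```helper_train.py``` can be thought of along the lines of the sketch below. This is illustrative only: the real ```getmodel``` signature, keys, and class names may differ, and ```ours_mine``` / ```oursmine.py``` are hypothetical placeholders for your new representation.
```
# Illustrative sketch of a getmodel-style dispatch (Step 4). The actual
# function in helper_train.py may have a different signature and keys;
# "ours_mine", oursmine.py and the class name GaussianModel are assumptions.
def getmodel(model_name: str):
    if model_name == "ours_full":
        from thirdparty.gaussian_splatting.scene.oursfull import GaussianModel
    elif model_name == "ours_lite":
        from thirdparty.gaussian_splatting.scene.ourslite import GaussianModel
    elif model_name == "ours_mine":  # your new representation (hypothetical)
        from thirdparty.gaussian_splatting.scene.oursmine import GaussianModel
    else:
        raise ValueError(f"Unknown model type: {model_name}")
    return GaussianModel
```
A matching branch in ```getrenderpip``` would then return your new rendering function created in Step 2.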
## License Information

The code in this repository (except the thirdparty folder) is licensed under the MIT license, see [LICENSE](LICENSE).

thirdparty/gaussian_splatting is licensed under the Gaussian-Splatting license, see [thirdparty/gaussian_splatting/LICENSE.md](thirdparty/gaussian_splatting/LICENSE.md).

thirdparty/colmap is licensed under the new BSD license, see [thirdparty/colmap/LICENSE.txt](thirdparty/colmap/LICENSE.txt).

## Acknowledgement

We sincerely thank the owners of the following source code repos, which are used by our released code: [Gaussian Splatting](https://github.com/graphdeco-inria/gaussian-splatting), [Colmap](https://github.com/colmap/colmap). Some parts of our code reference the following repos: [Gaussian Splatting with Depth](https://github.com/JonathonLuiten/diff-gaussian-rasterization-w-depth), [SIBR](https://gitlab.inria.fr/sibr/sibr_core.git), [fisheye-distortion](https://github.com/Synthesis-AI-Dev/fisheye-distortion).

We sincerely thank the anonymous reviewers of CVPR 2024 for their helpful feedback. We also thank Michael Rubloff for his post on [radiancefields](https://radiancefields.com/splatv-dynamic-gaussian-splatting-viewer/), and MrNeRF for [posts](https://x.com/janusch_patas/status/1740621964480217113?s=20) about our paper and other Gaussian Splatting based papers.

## Citations

Please cite our paper if you find it useful for your research.
```
@InProceedings{Li_STG_2024_CVPR,
    author    = {Li, Zhan and Chen, Zhang and Li, Zhong and Xu, Yi},
    title     = {Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {8508-8520}
}
```
Please also cite the following paper if you use Gaussian Splatting.
```
@Article{kerbl3Dgaussians,
    author  = {Kerbl, Bernhard and Kopanas, Georgios and Leimk{\"u}hler, Thomas and Drettakis, George},
    title   = {3D Gaussian Splatting for Real-Time Radiance Field Rendering},
    journal = {ACM Transactions on Graphics},
    number  = {4},
    volume  = {42},
    month   = {July},
    year    = {2023},
    url     = {https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/}
}
```
"},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"