AI prompts
base on Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust. <div align="center">
<h1>The Rust CUDA Project</h1>
<p>
<strong>An ecosystem of libraries and tools for writing and executing extremely fast GPU code fully in
<a href="https://www.rust-lang.org/">Rust</a></strong>
</p>
<h3>
<a href="guide">Guide</a>
<span> | </span>
<a href="guide/src/guide/getting_started.md">Getting Started</a>
<span> | </span>
<a href="guide/src/features.md">Features</a>
</h3>
<strong>⚠️ The project is still in early development, expect bugs, safety issues, and things that don't work ⚠️</strong>
</div>
## Goal
The Rust CUDA Project is a project aimed at making Rust a tier-1 language for extremely fast GPU computing
using the CUDA Toolkit. It provides tools for compiling Rust to extremely fast PTX code as well as libraries
for using existing CUDA libraries with it.
## Background
Historically, general purpose high performance GPU computing has been done using the CUDA toolkit. The CUDA toolkit primarily
provides a way to use Fortran/C/C++ code for GPU computing in tandem with CPU code with a single source. It also provides
many libraries, tools, forums, and documentation to supplement the single-source CPU/GPU code.
CUDA is exclusively an NVIDIA-only toolkit. Many tools have been proposed for cross-platform GPU computing such as
OpenCL, Vulkan Computing, and HIP. However, CUDA remains the most used toolkit for such tasks by far. This is why it is
imperative to make Rust a viable option for use with the CUDA toolkit.
However, CUDA with Rust has been a historically very rocky road. The only viable option until now has been to use the LLVM PTX
backend, however, the LLVM PTX backend does not always work and would generate invalid PTX for many common Rust operations, and
in recent years it has been shown time and time again that a specialized solution is needed for Rust on the GPU with the advent
of projects such as rust-gpu (for Rust -> SPIR-V).
Our hope is that with this project we can push the Rust GPU computing industry forward and make Rust an excellent language
for such tasks. Rust offers plenty of benefits such as `__restrict__` performance benefits for every kernel, An excellent module/crate system,
delimiting of unsafe areas of CPU/GPU code with `unsafe`, high level wrappers to low level CUDA libraries, etc.
## Structure
The scope of the Rust CUDA Project is quite broad, it spans the entirety of the CUDA ecosystem, with libraries and tools to make it
usable using Rust. Therefore, the project contains many crates for all corners of the CUDA ecosystem.
The current line-up of libraries is the following:
- `rustc_codegen_nvvm` Which is a rustc backend that targets NVVM IR (a subset of LLVM IR) for the [libnvvm](https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html) library.
- Generates highly optimized PTX code which can be loaded by the CUDA Driver API to execute on the GPU.
- For the near future it will be CUDA-only, but it may be used to target amdgpu in the future.
- `cuda_std` for GPU-side functions and utilities, such as thread index queries, memory allocation, warp intrinsics, etc.
- *Not* a low level library, provides many utility functions to make it easier to write cleaner and more reliable GPU kernels.
- Closely tied to `rustc_codegen_nvvm` which exposes GPU features through it internally.
- [`cudnn`](https://github.com/Rust-GPU/Rust-CUDA/tree/master/crates/cudnn) for a collection of GPU-accelerated primitives for deep neural networks.
- `cust` for CPU-side CUDA features such as launching GPU kernels, GPU memory allocation, device queries, etc.
- High level with features such as RAII and Rust Results that make it easier and cleaner to manage the interface to the GPU.
- A high level wrapper for the CUDA Driver API, the lower level version of the more common CUDA Runtime API used from C++.
- Provides much more fine grained control over things like kernel concurrency and module loading than the C++ Runtime API.
- `gpu_rand` for GPU-friendly random number generation, currently only implements xoroshiro RNGs from `rand_xoshiro`.
- `optix` for CPU-side hardware raytracing and denoising using the CUDA OptiX library.
In addition to many "glue" crates for things such as high level wrappers for certain smaller CUDA libraries.
## Related Projects
Other projects related to using Rust on the GPU:
- 2016: [glassful](https://github.com/kmcallister/glassful) Subset of Rust that compiles to GLSL.
- 2017: [inspirv-rust](https://github.com/msiglreith/inspirv-rust) Experimental Rust MIR -> SPIR-V Compiler.
- 2018: [nvptx](https://github.com/japaric-archived/nvptx) Rust to PTX compiler using the `nvptx` target for rustc (using the LLVM PTX backend).
- 2020: [accel](https://github.com/termoshtt/accel) Higher level library that relied on the same mechanism that `nvptx` does.
- 2020: [rlsl](https://github.com/MaikKlein/rlsl) Experimental Rust -> SPIR-V compiler (predecessor to rust-gpu)
- 2020: [rust-gpu](https://github.com/EmbarkStudios/rust-gpu) Rustc codegen backend to compile Rust to SPIR-V for use in shaders, similar mechanism as our project.
## License
Licensed under either of
- Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
at your discretion.
### Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
", Assign "at most 3 tags" to the expected json: {"id":"8414","tags":[]} "only from the tags list I provide: [{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui "},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine "},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"