noonghunna/club-3090 — GitHub trending stats & insights | Trendshift
noonghunna/club-3090
#Local LLM #Self-hosted
Community recipes for serving LLMs on RTX 3090/CUDA GPUs. Multi-engine (vLLM, llama.cpp, SGLang) and model-agnostic. Currently shipping Qwen3.6-27B, Qwen3.6 35B, Gemma 4 26B, and Gemma 4 31B configs for 1× and 2× cards.
Python
1.3k
67
13 contributors
Apache License 2.0
Social mentions
Recent discussions about this repository across the web
Well, it is basically the config I tried. I was on it but did not get the speed I expected.
@DragonGroky · x.com
Gemma4 12b is now available to everyone on club-3090. Go ahead and give it a shot if you have RTX 3090/4090/5090 P.S. MTP is currently blocked on llama/beellama.
@malikwas1f · x.com
Gemma 4 12b, latest situation. 121+ tps with MTP on vllm + 2 x 3090s, 2.9 * concurrency. @googlegemma @vllm_project Can you fix #39914 please. Clubbers! Grab the recipes from
@malikwas1f · x.com
Gemma 4 12b, latest situation on 2x3090s. @googlegemma @vllm_project Can you fix #39914 please. Clubbers! Grab the recipes from
@malikwas1f · x.com
5/5 My current takeaway: For fast single-card Gemma-4 on 3090/4090-class hardware, the real path forward seems to be: engine-specific optimizations, smarter KV handling, MTP/spec decode, Gemma-tuned…
@malikwas1f · x.com
Dual RTX 3090s took me from 40-50 tok/s to 70 tok/s. Switched from Windows to Ubuntu and hit 120 tok/s. Windows had CPU at 90C idle. Ubuntu runs 38C idle, 85C full load. Linux beats Windows for AI…
@Tech2Wild · x.com
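For running large models on an RTX 3090, club-3090 provides a ready-made deployment setup. It is essentially a "recipe library": it automatically configures multiple engines such as vLLM and llama.cpp, and even models like Qwen3.6-27B can run on 1-2 cards. It saves you the time of looking up parameters and applying patches yourself; you just follow the scripts step by step. Put simply, it exists so that running LLMs locally does not get stuck on configuration.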
@vintcessun · x.com
6/ Fair caveat: This was Q4 on a single 3090. The runaway reasoning tail may partly be a quantization artifact rather than the base model itself — shorter reasoning traces matched bf16 much more…
@malikwas1f · x.com
well well well, Beellama managed to merge Dflash+TurboQuant already. this unlocks Q5 quants. Things just keep getting better and better for those 3090s Join the 3090 club, we've got more recipes…
@malikwas1f · x.com
OMG! 97.8 TPS 🚀🚀🚀 single 3090 qwen3.6 27b dense. (P.S. No overclocking yet) Recipe cooking at
@malikwas1f · x.com
No trending activity
This repository has not yet been featured on GitHub Trending
Repository activities
repository's daily and monthly activities across stars, forks, merged PRs, issues, and closed issues