Local LLM
On-device semantic tool selection demo (SwiftUI + MLX) using LiquidAI LFM2.5 retrievers via mlx-swift-lm
Use Claude Code 100% free with 100+ NVIDIA NIM models via LiteLLM proxy. No Anthropic subscription needed. Works on Windows & macOS.
A simple but powerful TUI coding agent. Fully native Rust, cross-compatible.
Self-hosted AI workspace where chat becomes visual workflows, multi-agent operations, and reviewable automations. Local memory; local or cloud models
Tiny transformer language model running locally on a Palm Tungsten E2
Self-hosted, privacy-first document chat for attorneys: parse legal PDFs and query them with local open-source LLMs (Ollama) + verifiable page citations.
Local LLM chat panel for ComfyUI — LM Studio & Ollama, multi-session, vision support, Guide Materials
Compute's open sharded inference CLI for two-stage llama.cpp/GGUF experiments
Automated web interaction driven by a VLM
Hush — Privacy-first conversational notification manager for Android. Control notifications with voice commands powered by on-device AI (Gemini Nano).
ik_llama.cpp fork with experimental NUMA per-node mirroring of model weights and KV cache. This is an attempt to maximize CPU inference performance on multi-socket systems.
Production-ready vLLM deployment wrapper for Qwen3.6-27B (NVFP4) — self-hosted OpenAI-compatible inference
ROCmFPX Family for AMD Hardware and Processors. More quants and special agent quants
Self-hosted vLLM inference for Qwen3.6-35B-A3B-NVFP4
Tuned recipe: Xiaomi MiMo V2.5 (NVFP4) on 2x DGX Spark, vLLM TP=2 over RoCE. ~32-33 tok/s, Quality 89.9, 160K context, omnimodal + tool-calling. Non-eager + MTP=2.