Audio processing
The Infinite Crate is a DAW plugin built on JUCE, React, and the Lyria RealTime live music model
NPM library to transcribe audio and video entirely in the browser with WebGPU and WebCodecs. 100% private and offline, with WASM fallbacks.
Real-time transcription and AI assistant for Meta Ray-Ban smart glasses. Live speech-to-text, speaker diarization, Gemini Live vision+voice, and WebRTC streaming.
A SOTA Industrial-Grade Voice Activity Detection & Audio Event Detection, supporting 100+ languages, outperforming Silero-VAD, TEN-VAD, FunASR-VAD and WebRTC-VAD
A beautiful Windows program for analyzing audio quality — detects fake lossless, clipping, MQA, and AI-generated audio; includes a spectrogram viewer and more. Built-in player with EQ and spatial audio.
Petal is a native macOS menu bar app for fast, local-first audio transcription.
cliamp - Terminal music player inspired by Winamp
Slap your MacBook, it yells back. Uses Apple Silicon accelerometer via IOKit HID.
Rust implementation of Qwen3-ASR automatic speech recognition
A tiny audio streaming server (compatible with icecast2) written in Go, with support for multiple mountpoints, multiple sources, and relaying, and a lot more.
A desktop music player built with Electron that streams audio from YouTube Music. Clean UI, no accounts, no ads.
WhisperCrabs is a simple terminal-based floating recording button: click, record, and transcribe to input.
Ming-omni-tts: Simple and Efficient Unified Generation of Speech, Music, and Sound with Precise Control
🌋LavaSR: Fast Speech restoration and enhancement
A real-time and multilingual speech translation model
Pure C inference of Mistral Voxtral Realtime 4B speech to text model
Your voice is the fastest interface to AI
🎵 The Ultimate Open Source Suno Alternative - Professional UI for ACE-Step 1.5 AI Music Generation. Free, local, unlimited. Stop paying for Suno!
ComfyUI custom nodes for Qwen3-ASR (Automatic Speech Recognition) - audio-to-text transcription supporting 52 languages and dialects.
Open source macOS video transcriber that builds a self-organizing knowledge base 🐝
Qwen3-ASR is an open-source series of ASR models developed by the Qwen team at Alibaba Cloud, supporting stable multilingual speech/music/song recognition, language detection and timestamp prediction.