Instant voice cloning by MIT and MyShell. Audio foundation model.
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Official implementation code of the paper "AnyText: Multilingual Visual Text Generation And Editing"
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
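Free, open-source, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and QR code scanning and generation. Includes built-in multilingual language libraries.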
Run Mixtral-8x7B models in Colab or on consumer desktops
Code and dataset for photorealistic Codec Avatars driven from audio
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
The SQL IDE for Your Terminal.
A tool that automatically generates a cover letter with ChatGPT from your resume and a job description, and sends messages to bosses in China.
Official implementation of the paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
Unofficial Implementation of Animate Anyone
A collective list of free APIs
Stable Diffusion web UI
Focus on prompting and generating
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
An opinionated list of Python frameworks, libraries, tools, and resources
A feature-rich command-line audio/video downloader
AWS zero-to-hero repo for DevOps engineers to learn AWS in 30 days. This repo includes projects, presentations, interview questions, and real-time examples.
The agent engineering platform.
A lightweight coding agent for open models like DeepSeek, Kimi, and Qwen