Synthetic data
Synthetic datasets, experiment protocols, and evaluation code for "Governed Memory: A Production Architecture for Multi-Agent Workflows"
Generate realistic multi-agent workflow traces with LLM-enriched content, semantic validation, and PM4Py compatibility. Install with: pip install open-agent-traces
[SIGGRAPH 2026] SynthVerse: A Large-Scale Diverse Synthetic Dataset for Point Tracking
[CVPR 2026 Oral] Learning to Drive via Real-World Simulation at Scale
🎨 NeMo Data Designer: Generate high-quality synthetic data from scratch or from seed data.
Uses a deep research workflow to generate datasets for fine-tuning LLMs.
Cyber-Zero: Training Cybersecurity Agents Without Runtime
ACE-Step: A Step Towards Music Generation Foundation Model
A powerful tool for creating datasets for LLM fine-tuning, RAG, and evaluation
Deep learning tools for peptide substrate prediction and generation
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
AI-native platform for tabular data generation via CLI, WebUI or app.
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
Build, Evaluate, and Optimize AI Systems. Includes evals, RAG, agents, fine-tuning, synthetic data generation, dataset management, MCP, and more.
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
Official implementation of "En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data", CVPR 2024; 3D Avatar Generation and Animation