MatN23/AdaptiveTrainingSystem — GitHub trending stats & insights

A PyTorch framework for training transformer language models, with support for Mixture of Experts (MoE) and Mixture of Depths (MoD) architectures and DeepSpeed integration. It implements models from 70M to 300B parameters and includes automatic dataset processing, distributed training, and memory management.

Python

1 contributor

Apache License 2.0

Social mentions

Recent discussions about this repository across the web

I built a full transformer training stack from scratch. Custom CUDA kernels, autonomous orchestrator, MoE/MoD, theoretically scales to 300B+. Free Colab demo inside.

r/learnmachinelearning

I built a full transformer training stack from scratch. Custom CUDA kernels, autonomous orchestrator, MoE/MoD, theoretically scales to 300B+. Free Colab demo inside.

r/LocalLLaMA
