<div align="center">
<h2>LLM Twin Course: Building Your Production-Ready AI Replica</h2>
<h1>Learn to architect and implement a production-ready LLM & RAG system by building your LLM Twin</h1>
<h3>From data gathering to productionizing LLMs using LLMOps good practices.</h3>
<i>by <a href="https://decodingml.substack.com">Decoding ML</a></i>
</div>
<br/>
<p align="center">
<img src="media/cover.png" alt="Your image description">
</p>
## Why is this course different?
*By finishing the **"LLM Twin: Building Your Production-Ready AI Replica"** free course, you will learn how to design, train, and deploy a production-ready LLM twin of yourself powered by LLMs, vector DBs, and LLMOps good practices.*
> Why should you care? 🫵
>
> → **No more isolated scripts or Notebooks!** Learn production ML by building and deploying an end-to-end production-grade LLM system.
## What will you learn to build by the end of this course?
You will **learn** how to **architect** and **build a real-world LLM system** from **start** to **finish**, covering everything from **data collection** to **deployment**.
You will also **learn** to **leverage MLOps best practices**, such as experiment trackers, model registries, prompt monitoring, and versioning.
**The end goal?** Build and deploy your own LLM twin.
**What is an LLM Twin?** It is an AI character that learns to write like somebody by incorporating that person's style and personality into an LLM.
## The architecture of the LLM Twin is split into 4 Python microservices:
<p align="center">
<img src="media/architecture.png" alt="LLM Twin Architecture">
</p>
### The data collection pipeline
- Crawl your digital data from various social media platforms, such as Medium, Substack and GitHub.
- Clean, normalize and load the data to a [Mongo NoSQL DB](https://www.mongodb.com/) through a series of ETL pipelines.
- Send database changes to a [RabbitMQ](https://www.rabbitmq.com/) queue using the CDC pattern (see the sketch after this list).
- Learn to package the crawlers as AWS Lambda functions.
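To make the CDC step concrete, here is a minimal sketch that watches a MongoDB change stream and forwards each event to RabbitMQ. It assumes local Mongo and RabbitMQ instances; the `llm_twin` database and `cdc_queue` queue names are illustrative placeholders, not the course's actual identifiers.

```python
# Minimal CDC sketch: watch a MongoDB change stream and publish each
# change event to a RabbitMQ queue for downstream consumers.
import json

import pika
from bson import json_util
from pymongo import MongoClient

mongo = MongoClient("mongodb://localhost:27017")
db = mongo["llm_twin"]  # hypothetical database name

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="cdc_queue", durable=True)

# "updateLookup" includes the full document on updates, not just the delta.
with db.watch(full_document="updateLookup") as stream:
    for change in stream:
        # Convert BSON types (ObjectId, dates) into a JSON-safe dict.
        payload = json.loads(json_util.dumps(change))
        channel.basic_publish(
            exchange="",
            routing_key="cdc_queue",
            body=json.dumps(payload),
        )
```

Note that MongoDB change streams only work when the server runs as a replica set, so the local Mongo instance must be configured accordingly.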
### The feature pipeline
- Consume messages in real-time from a queue through a [Bytewax](https://github.com/bytewax/bytewax?utm_source=github&utm_medium=decodingml&utm_campaign=2024_q1) streaming pipeline.
- Each message is cleaned, chunked, embedded, and loaded into a [Qdrant](https://qdrant.tech/?utm_source=decodingml&utm_medium=referral&utm_campaign=llm-course) vector DB (see the sketch after this list).
- In the bonus series, we refactor the cleaning, chunking, and embedding logic using [Superlinked](https://github.com/superlinked/superlinked?utm_source=community&utm_medium=github&utm_campaign=oscourse), a specialized vector compute engine, and load and index the vectors in a [Redis vector DB](https://redis.io/solutions/vector-search/).
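As a rough illustration of the per-message logic (without the Bytewax dataflow wiring), the sketch below cleans, chunks, embeds, and upserts a document into Qdrant. The collection name, chunk size, and embedding model are assumptions made for the example.

```python
# Minimal feature-pipeline sketch: clean, chunk, embed, and load a
# document into a Qdrant collection.
import uuid

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings
client = QdrantClient(url="http://localhost:6333")

if not client.collection_exists("posts"):
    client.create_collection(
        collection_name="posts",  # hypothetical collection name
        vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    )

def chunk(text: str, size: int = 500) -> list[str]:
    # Naive fixed-size chunking; production splitters respect sentences.
    return [text[i : i + size] for i in range(0, len(text), size)]

def ingest(raw_text: str) -> None:
    cleaned = " ".join(raw_text.split())  # trivial whitespace normalization
    points = [
        PointStruct(
            id=str(uuid.uuid4()),
            vector=encoder.encode(piece).tolist(),
            payload={"text": piece},
        )
        for piece in chunk(cleaned)
    ]
    client.upsert(collection_name="posts", points=points)

ingest("Your crawled Medium, Substack, or GitHub content goes here.")
```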
### The training pipeline
- Create a custom instruction dataset from your digital data for supervised fine-tuning (SFT).
- Fine-tune an LLM using LoRA or QLoRA (see the sketch after this list).
- Use [Comet ML's](https://www.comet.com/signup/?utm_source=decoding_ml&utm_medium=partner&utm_content=github) experiment tracker to monitor the experiments.
- Evaluate the LLM using [Opik](https://github.com/comet-ml/opik).
- Save and version the best model to the [Hugging Face model registry](https://huggingface.co/models).
- Run and automate the training pipeline using [AWS SageMaker](https://aws.amazon.com/sagemaker/).
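For intuition on the fine-tuning core, here is a minimal LoRA sketch using Hugging Face `transformers` and `peft`. The base model, target modules, hyperparameters, and dataset path are illustrative placeholders; the actual pipeline additionally wires in Comet ML tracking, Opik evaluation, and SageMaker execution.

```python
# Minimal LoRA SFT sketch: wrap a causal LM with low-rank adapters and
# fine-tune it on a custom instruction dataset.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE = "meta-llama/Meta-Llama-3.1-8B"  # hypothetical base model

tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = tokenizer.eos_token  # causal LMs often lack a pad token
model = AutoModelForCausalLM.from_pretrained(BASE)

# Train small adapter matrices instead of all base weights.
model = get_peft_model(
    model,
    LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
               target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"),
)

# "instruct_dataset.json" is a placeholder for the generated SFT dataset.
dataset = load_dataset("json", data_files="instruct_dataset.json")["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2, learning_rate=2e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.push_to_hub("your-username/llm-twin")  # version it in the HF registry
```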
### The inference pipeline
- Load the fine-tuned LLM from the [Hugging Face model registry](https://huggingface.co/models).
- Deploy the LLM as a scalable REST API using [AWS SageMaker inference endpoints](https://aws.amazon.com/sagemaker/deploy/).
- Enhance the prompts using advanced RAG techniques (see the sketch after the UI screenshot below).
- Monitor the prompts and LLM-generated results using [Opik](https://github.com/comet-ml/opik).
- In the bonus series, we refactor the advanced RAG layer to write more optimal queries using [Superlinked](https://github.com/superlinked/superlinked?utm_source=community&utm_medium=github&utm_campaign=oscourse).
- Wrap up everything with a Gradio UI (as seen below) where you can start playing around with the LLM Twin to generate content that follows your writing style.
<p align="center">
<img src="media/ui-example.png" alt="Gradio UI">
</p>
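To show how the pieces fit, here is a minimal sketch of the retrieval-augmented prompting step: embed the user query, pull the closest chunks from Qdrant, and assemble an augmented prompt for the deployed model. The collection name and prompt template are assumptions; the course's advanced RAG layer adds techniques such as query expansion and reranking on top of this basic flow.

```python
# Minimal RAG inference sketch: retrieve context from Qdrant and build
# an augmented prompt for the fine-tuned LLM.
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # must match ingestion model
client = QdrantClient(url="http://localhost:6333")

def build_prompt(query: str, top_k: int = 3) -> str:
    # Retrieve the chunks closest to the query embedding.
    hits = client.search(
        collection_name="posts",  # hypothetical collection name
        query_vector=encoder.encode(query).tolist(),
        limit=top_k,
    )
    context = "\n\n".join(hit.payload["text"] for hit in hits)
    return (
        "Write in the author's voice, grounded in the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

# The assembled prompt is then sent to the SageMaker endpoint hosting the
# fine-tuned LLM (e.g., via boto3's sagemaker-runtime invoke_endpoint).
print(build_prompt("What do I think about vector databases?"))
```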
Alongside the 4 microservices, you will learn to integrate 4 serverless tools:
* [Comet ML](https://www.comet.com/signup/?utm_source=decoding_ml&utm_medium=partner&utm_content=github) as your experiment tracker and data registry;
* [Qdrant](https://qdrant.tech/?utm_source=decodingml&utm_medium=referral&utm_campaign=llm-course) as your vector DB;
* [AWS SageMaker](https://aws.amazon.com/sagemaker/) as your ML infrastructure;
* [Opik](https://github.com/comet-ml/opik) as your prompt evaluation and monitoring tool.
## Who is this for?
**Audience:** MLE, DE, DS, or SWE who want to learn to engineer production-ready LLM & RAG systems using LLMOps good practices.
**Level:** intermediate
**Prerequisites:** basic knowledge of Python and ML
## How will you learn?
The course contains **10 hands-on written lessons** and the **open-source code** you can access on GitHub, showing how to build an end-to-end LLM system.
Also, it includes **2 bonus lessons** on how to **improve the RAG system**.
You can read everything at your own pace.
## Costs?
The **articles** and **code** are **completely free**. They will always remain free.
If you plan to run the code while reading, be aware that we use several cloud tools that may incur additional costs.
**Pay as you go**
- [AWS](https://aws.amazon.com/) offers accessible plans for new users.
- A first-time account can get up to $300 in free credits, valid for 6 months. For more details, consult the [AWS Offerings](https://aws.amazon.com/free/offers/) page.
**Freemium** (Free-of-Charge)
- [Qdrant](https://qdrant.tech/?utm_source=decodingml&utm_medium=referral&utm_campaign=llm-course)
- [Comet ML](https://www.comet.com/signup/?utm_source=decoding_ml&utm_medium=partner&utm_content=github)
- [Opik](https://github.com/comet-ml/opik)
## Questions and troubleshooting
Please ask us any questions if anything gets confusing while studying the articles or running the code.
You can ask any question by [opening an issue](https://github.com/decodingml/llm-twin-course/issues) in this GitHub repository.
## Lessons
This self-paced course consists of 12 comprehensive lessons covering theory, system design, and hands-on implementation.
Our recommendation for each lesson:
1. Read the article
2. Run the code
3. Read the source code in depth
| Lesson | Article | Category | Description | Source Code |
|--------|---------|----------|-------------|-------------|
| 1 | [An End-to-End Framework for Production-Ready LLM Systems](https://medium.com/decodingml/an-end-to-end-framework-for-production-ready-llm-systems-by-building-your-llm-twin-2cc6bb01141f) | System Design | Learn the overall architecture and design principles of production LLM systems. | No code |
| 2 | [Data Crawling](https://medium.com/decodingml/your-content-is-gold-i-turned-3-years-of-blog-posts-into-an-llm-training-d19c265bdd6e) | Data Engineering | Learn to crawl and process social media content for LLM training. | `src/data_crawling` |
| 3 | [CDC Magic](https://medium.com/decodingml/i-replaced-1000-lines-of-polling-code-with-50-lines-of-cdc-magic-4d31abd3bc3b) | Data Engineering | Learn to implement Change Data Capture (CDC) for syncing two data sources. | `src/data_cdc` |
| 4 | [Feature Streaming Pipelines](https://medium.com/decodingml/sota-python-streaming-pipelines-for-fine-tuning-llms-and-rag-in-real-time-82eb07795b87) | Feature Pipeline | Build real-time streaming pipelines for LLM and RAG data processing. | `src/feature_pipeline` |
| 5 | [Advanced RAG Algorithms](https://medium.com/decodingml/the-4-advanced-rag-algorithms-you-must-know-to-implement-5d0c7f1199d2) | Feature Pipeline | Implement advanced RAG techniques for better retrieval. | `src/feature_pipeline` |
| 6 | [Generate Fine-Tuning Instruct Datasets](https://medium.com/decodingml/turning-raw-data-into-fine-tuning-datasets-dc83657d1280) | Training Pipeline | Create custom instruct datasets for LLM fine-tuning. | `src/feature_pipeline/generate_dataset` |
| 7 | [LLM Fine-tuning Pipeline](https://medium.com/decodingml/8b-parameters-1-gpu-no-problems-the-ultimate-llm-fine-tuning-pipeline-f68ef6c359c2) | Training Pipeline | Build an end-to-end LLM fine-tuning pipeline and deploy it to AWS SageMaker. | `src/training_pipeline` |
| 8 | [LLM & RAG Evaluation](https://medium.com/decodingml/the-engineers-framework-for-llm-rag-evaluation-59897381c326) | Training Pipeline | Learn to evaluate LLM and RAG system performance. | `src/inference_pipeline/evaluation` |
| 9 | [Implement and Deploy the RAG Inference Pipeline](https://medium.com/decodingml/beyond-proof-of-concept-building-rag-systems-that-scale-e537d0eb063a) | Inference Pipeline | Design, implement and deploy the RAG inference to AWS SageMaker. | `src/inference_pipeline` |
| 10 | [Prompt Monitoring](https://medium.com/decodingml/the-ultimate-prompt-monitoring-pipeline-886cbb75ae25) | Inference Pipeline | Build the prompt monitoring and production evaluation pipeline. | `src/inference_pipeline` |
| 11 | [Refactor the RAG Module Using 74.3% Less Code](https://medium.com/decodingml/build-a-scalable-rag-ingestion-pipeline-using-74-3-less-code-ac50095100d6) | Bonus on RAG | Optimize the RAG system. | `src/bonus_superlinked_rag` |
| 12 | [Multi-Index RAG Apps](https://medium.com/decodingml/build-multi-index-advanced-rag-apps-bd33d2f0ec5c) | Bonus on RAG | Build advanced multi-index RAG apps. | `src/bonus_superlinked_rag` |
> [!NOTE]
> Check the [INSTALL_AND_USAGE](https://github.com/decodingml/llm-twin-course/blob/main/INSTALL_AND_USAGE.md) doc for a step-by-step installation and usage guide.
## Project Structure
At Decoding ML, we teach how to build production ML systems, so the course follows the structure of a real-world Python project:
```text
llm-twin-course/
├── src/                        # Source code for all microservices
│   ├── data_crawling/          # Data collection pipeline code
│   ├── data_cdc/               # Change Data Capture pipeline code
│   ├── feature_pipeline/       # Feature engineering pipeline code
│   ├── training_pipeline/      # Training pipeline code
│   ├── inference_pipeline/     # Inference service code
│   └── bonus_superlinked_rag/  # Bonus RAG optimization code
├── .env.example                # Example environment variables template
├── Makefile                    # Commands to build and run the project
└── pyproject.toml              # Project dependencies
```
## Install & Usage
To understand how to **install and run the LLM Twin code end-to-end**, see the dedicated [INSTALL_AND_USAGE](https://github.com/decodingml/llm-twin-course/blob/main/INSTALL_AND_USAGE.md) document.
> [!NOTE]
> Even though you can run everything using only the [INSTALL_AND_USAGE](https://github.com/decodingml/llm-twin-course/blob/main/INSTALL_AND_USAGE.md) document, we recommend reading the articles to fully understand the LLM Twin system and its design choices.
### Bonus Superlinked series
The **bonus Superlinked series** has a dedicated [README](https://github.com/decodingml/llm-twin-course/blob/main/6-bonus-superlinked-rag/README.md) under the [src/bonus_superlinked_rag](https://github.com/decodingml/llm-twin-course/tree/main/src/bonus_superlinked_rag) directory.
There, we explain how to run the system with the improved RAG layer powered by [Superlinked](https://github.com/superlinked/superlinked?utm_source=community&utm_medium=github&utm_campaign=oscourse).
## License
This course is an open-source project released under the MIT license. As long as you distribute our LICENSE and acknowledge our work, you can safely clone or fork this project and use it as a source of inspiration for whatever you want (e.g., university projects, college degree projects, personal projects, etc.).
## Contributors
A big "Thank you ๐" to all our contributors! This course is possible only because of their efforts.
<p align="center">
<a href="https://github.com/decodingml/llm-twin-course/graphs/contributors">
<img src="https://contrib.rocks/image?repo=decodingml/llm-twin-course" />
</a>
</p>
## Sponsors
Also, another big "Thank you 🙏" to all our sponsors who supported our work and made this course possible.
<table>
<tr>
<td align="center">
<a href="https://www.comet.com/signup/?utm_source=decoding_ml&utm_medium=partner&utm_content=github" target="_blank">Comet</a>
</td>
<td align="center">
<a href="https://github.com/comet-ml/opik" target="_blank">Opik</a>
</td>
<td align="center">
<a href="https://github.com/bytewax/bytewax?utm_source=github&utm_medium=decodingml&utm_campaign=2024_q1" target="_blank">Bytewax</a>
</td>
<td align="center">
<a href="https://qdrant.tech/?utm_source=decodingml&utm_medium=referral&utm_campaign=llm-course" target="_blank">Qdrant</a>
</td>
<td align="center">
<a href="https://github.com/superlinked/superlinked?utm_source=community&utm_medium=github&utm_campaign=oscourse" target="_blank">Superlinked</a>
</td>
</tr>
<tr>
<td align="center">
<a href="https://www.comet.com/signup/?utm_source=decoding_ml&utm_medium=partner&utm_content=github" target="_blank">
<img src="media/sponsors/comet.png" width="150" alt="Comet">
</a>
</td>
<td align="center">
<a href="https://github.com/comet-ml/opik" target="_blank">
<img src="media/sponsors/opik.svg" width="150" alt="Opik">
</a>
</td>
<td align="center">
<a href="https://github.com/bytewax/bytewax?utm_source=github&utm_medium=decodingml&utm_campaign=2024_q1" target="_blank">
<img src="media/sponsors/bytewax.png" width="150" alt="Bytewax">
</a>
</td>
<td align="center">
<a href="https://qdrant.tech/?utm_source=decodingml&utm_medium=referral&utm_campaign=llm-course" target="_blank">
<img src="media/sponsors/qdrant.svg" width="150" alt="Qdrant">
</a>
</td>
<td align="center">
<a href="https://github.com/superlinked/superlinked?utm_source=community&utm_medium=github&utm_campaign=oscourse" target="_blank">
<img src="media/sponsors/superlinked.png" width="150" alt="Superlinked">
</a>
</td>
</tr>
</table>
## Next steps
Our **LLM Engineer's Handbook** inspired the **open-source LLM Twin course**.
Consider supporting our work by getting our book to **learn** a **complete framework** for **building and deploying production LLM & RAG systems**, from data to deployment.
It is perfect for practitioners who want **both theory** and **hands-on** expertise, connecting the dots between DE, research, MLE, and MLOps:
**[Buy the LLM Engineer's Handbook](https://www.amazon.com/LLM-Engineers-Handbook-engineering-production/dp/1836200072/)**
* [On Amazon](https://www.amazon.com/LLM-Engineers-Handbook-engineering-production/dp/1836200072/)
* [On Packt](https://www.packtpub.com/en-us/product/llm-engineers-handbook-9781836200062)
<p align="center">
<a href="https://www.amazon.com/LLM-Engineers-Handbook-engineering-production/dp/1836200072/">
<img src="media/llm_engineers_handbook_cover.png" alt="LLM Engineer's Handbook">
</a>
</p>
", Assign "at most 3 tags" to the expected json: {"id":"8993","tags":[]} "only from the tags list I provide: [{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui "},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine "},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"