# All RAG Techniques: A Simpler, Hands-On Approach ✨
Read this in your preferred language:
[Deutsch](https://www.readme-i18n.com/FareedKhan-dev/all-rag-techniques?lang=de) | [Español](https://www.readme-i18n.com/FareedKhan-dev/all-rag-techniques?lang=es) | [Français](https://www.readme-i18n.com/FareedKhan-dev/all-rag-techniques?lang=fr) | [日本語](https://www.readme-i18n.com/FareedKhan-dev/all-rag-techniques?lang=ja) | [한국어](https://www.readme-i18n.com/FareedKhan-dev/all-rag-techniques?lang=ko) | [Português](https://www.readme-i18n.com/FareedKhan-dev/all-rag-techniques?lang=pt) | [Русский](https://www.readme-i18n.com/FareedKhan-dev/all-rag-techniques?lang=ru) | [中文](https://www.readme-i18n.com/FareedKhan-dev/all-rag-techniques?lang=zh)
[Python 3.7+](https://www.python.org/downloads/release/python-370/) | [Nebius AI](https://cloud.nebius.ai/services/llm-embedding) | [OpenAI](https://openai.com/) | [Medium: Testing Every RAG Technique](https://medium.com/@fareedkhandev/testing-every-rag-technique-to-find-the-best-094d166af27f)
This repository takes a clear, hands-on approach to **Retrieval-Augmented Generation (RAG)**, breaking down advanced techniques into straightforward, understandable implementations. Instead of relying on frameworks like `LangChain` or `FAISS`, everything here is built with familiar Python libraries: `openai`, `numpy`, `matplotlib`, and a few others.
The goal is simple: provide code that is readable, modifiable, and educational. By focusing on the fundamentals, this project helps demystify RAG and makes it easier to understand how it really works.
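To give a feel for that style before opening the notebooks, here is a minimal sketch of the kind of helper the implementations are built from. It is an illustration rather than code copied from the repository: the exact chunking functions, names, and defaults in the notebooks may differ.

```python
# Illustrative sketch only: fixed-size chunking with overlap, the simplest of
# the chunking strategies explored in the notebooks. Plain Python, no framework.
from typing import List

def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into overlapping chunks of roughly chunk_size characters."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
    return chunks

sample = "Retrieval-Augmented Generation pairs a retriever with a generator. " * 50
print(len(chunk_text(sample)), "chunks")
```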
## 📢 Updates
- (12-May-2025) Added a new notebook on how to handle big data using Knowledge Graphs.
- (27-April-2025) Added a new notebook that finds the best RAG technique for a given query (Simple RAG + Reranker + Query Rewrite).
- (20-Mar-2025) Added a new notebook on RAG with Reinforcement Learning.
- (07-Mar-2025) Added 20 RAG techniques to the repository.
## 🚀 What's Inside?
This repository contains a collection of Jupyter Notebooks, each focusing on a specific RAG technique. Each notebook provides:
- A concise explanation of the technique.
- A step-by-step implementation from scratch.
- Clear code examples with inline comments.
- Evaluations and comparisons to demonstrate the technique's effectiveness.
- Visualizations of the results.
Here's a glimpse of the techniques covered:
| Notebook | Description |
| :-------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| [1. Simple RAG](01_simple_rag.ipynb) | A basic RAG implementation. A great starting point! |
| [2. Semantic Chunking](02_semantic_chunking.ipynb) | Splits text based on semantic similarity for more meaningful chunks. |
| [3. Chunk Size Selector](03_chunk_size_selector.ipynb) | Explores the impact of different chunk sizes on retrieval performance. |
| [4. Context Enriched RAG](04_context_enriched_rag.ipynb) | Retrieves neighboring chunks to provide more context. |
| [5. Contextual Chunk Headers](05_contextual_chunk_headers_rag.ipynb) | Prepends descriptive headers to each chunk before embedding. |
| [6. Document Augmentation RAG](06_doc_augmentation_rag.ipynb) | Generates questions from text chunks to augment the retrieval process. |
| [7. Query Transform](07_query_transform.ipynb) | Rewrites, expands, or decomposes queries to improve retrieval. Includes **Step-back Prompting** and **Sub-query Decomposition**. |
| [8. Reranker](08_reranker.ipynb) | Re-ranks initially retrieved results using an LLM for better relevance. |
| [9. RSE](09_rse.ipynb) | Relevant Segment Extraction: Identifies and reconstructs continuous segments of text, preserving context. |
| [10. Contextual Compression](10_contextual_compression.ipynb) | Implements contextual compression to filter and compress retrieved chunks, maximizing relevant information. |
| [11. Feedback Loop RAG](11_feedback_loop_rag.ipynb) | Incorporates user feedback to learn and improve the RAG system over time. |
| [12. Adaptive RAG](12_adaptive_rag.ipynb) | Dynamically selects the best retrieval strategy based on query type. |
| [13. Self RAG](13_self_rag.ipynb) | Implements Self-RAG, which dynamically decides when and how to retrieve, evaluates relevance, and assesses support and utility. |
| [14. Proposition Chunking](14_proposition_chunking.ipynb) | Breaks down documents into atomic, factual statements for precise retrieval. |
| [15. Multimodal RAG](15_multimodel_rag.ipynb) | Combines text and images for retrieval, generating captions for images using LLaVA. |
| [16. Fusion RAG](16_fusion_rag.ipynb) | Combines vector search with keyword-based (BM25) retrieval for improved results. |
| [17. Graph RAG](17_graph_rag.ipynb) | Organizes knowledge as a graph, enabling traversal of related concepts. |
| [18. Hierarchy RAG](18_hierarchy_rag.ipynb) | Builds hierarchical indices (summaries + detailed chunks) for efficient retrieval. |
| [19. HyDE RAG](19_HyDE_rag.ipynb) | Uses Hypothetical Document Embeddings to improve semantic matching. |
| [20. CRAG](20_crag.ipynb) | Corrective RAG: Dynamically evaluates retrieval quality and uses web search as a fallback. |
| [21. RAG with RL](21_rag_with_rl.ipynb) | Maximizes the reward of the RAG model using reinforcement learning. |
| [22. Big Data with Knowledge Graphs](22_Big_data_with_KG.ipynb) | Handles large datasets using Knowledge Graphs. |
| [Best RAG Finder](best_rag_finder.ipynb) | Finds the best RAG technique for a given query using Simple RAG + Reranker + Query Rewrite. |
## 🗂️ Repository Structure
```
fareedkhan-dev-all-rag-techniques/
├── README.md <- You are here!
├── 01_simple_rag.ipynb
├── 02_semantic_chunking.ipynb
├── 03_chunk_size_selector.ipynb
├── 04_context_enriched_rag.ipynb
├── 05_contextual_chunk_headers_rag.ipynb
├── 06_doc_augmentation_rag.ipynb
├── 07_query_transform.ipynb
├── 08_reranker.ipynb
├── 09_rse.ipynb
├── 10_contextual_compression.ipynb
├── 11_feedback_loop_rag.ipynb
├── 12_adaptive_rag.ipynb
├── 13_self_rag.ipynb
├── 14_proposition_chunking.ipynb
├── 15_multimodel_rag.ipynb
├── 16_fusion_rag.ipynb
├── 17_graph_rag.ipynb
├── 18_hierarchy_rag.ipynb
├── 19_HyDE_rag.ipynb
├── 20_crag.ipynb
├── 21_rag_with_rl.ipynb
├── 22_big_data_with_KG.ipynb
├── best_rag_finder.ipynb
├── requirements.txt <- Python dependencies
└── data/
    ├── val.json                      <- Sample validation data (queries and answers)
    ├── AI_Information.pdf            <- A sample PDF document for testing
    └── attention_is_all_you_need.pdf <- A sample PDF document for the Multimodal RAG notebook
```
## 🛠️ Getting Started
1. **Clone the repository:**
```bash
git clone https://github.com/FareedKhan-dev/all-rag-techniques.git
cd all-rag-techniques
```
2. **Install dependencies:**
```bash
pip install -r requirements.txt
```
3. **Set up your API key** (a quick client-setup sketch follows these steps):
- Obtain an API key from [Nebius AI](https://studio.nebius.com/). The notebooks access Nebius through the OpenAI-compatible client, so the key is stored as `OPENAI_API_KEY`.
- Set the API key as an environment variable:
```bash
export OPENAI_API_KEY='YOUR_NEBIUS_AI_API_KEY'
```
or, on Windows:
```bash
setx OPENAI_API_KEY "YOUR_NEBIUS_AI_API_KEY"
```
or, within your Python script/notebook:
```python
import os
os.environ["OPENAI_API_KEY"] = "YOUR_NEBIUS_AI_API_KEY"
```
4. **Run the notebooks:**
Open any of the Jupyter Notebooks (`.ipynb` files) in Jupyter Notebook or JupyterLab. Each notebook is self-contained and can be run independently; within a notebook, run the cells in order.
**Note:** The `data/AI_Information.pdf` file provides a sample document for testing; you can replace it with your own PDF. The `data/val.json` file contains sample queries and ideal answers for evaluation, and `data/attention_is_all_you_need.pdf` is used by the Multimodal RAG notebook.
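Once the key is set, a quick sanity check is to point the `openai` client at Nebius and make one embedding call and one chat call. This is a hedged sketch: the `base_url` below is an assumption (use the endpoint shown in your Nebius AI Studio account if it differs); the model names are the ones referenced in Core Concepts below.

```python
# Hypothetical sanity check for the API key. The base_url is an assumption --
# check your Nebius AI Studio dashboard for the exact OpenAI-compatible
# endpoint. Model names match those mentioned in the Core Concepts section.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.studio.nebius.com/v1/",  # assumed Nebius endpoint
    api_key=os.environ["OPENAI_API_KEY"],
)

# Embed a short string with the embedding model used in many notebooks.
emb = client.embeddings.create(model="BAAI/bge-en-icl", input="Hello, RAG!")
print("embedding dimensions:", len(emb.data[0].embedding))

# Ask the chat model used for generation in the notebooks.
chat = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(chat.choices[0].message.content)
```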
## 💡 Core Concepts
- **Embeddings:** Numerical representations of text that capture semantic meaning. The notebooks obtain them from Nebius AI's embedding API, most often with the `BAAI/bge-en-icl` model.
- **Vector Store:** A simple database to store and search embeddings. We create our own `SimpleVectorStore` class using NumPy for efficient similarity calculations (see the sketch after this list).
- **Cosine Similarity:** A measure of similarity between two vectors. Higher values indicate greater similarity.
- **Chunking:** Dividing text into smaller, manageable pieces. We explore various chunking strategies.
- **Retrieval:** The process of finding the most relevant text chunks for a given query.
- **Generation:** Using a Large Language Model (LLM) to create a response based on the retrieved context and the user's query. We use the `meta-llama/Llama-3.2-3B-Instruct` model via Nebius AI's API.
- **Evaluation:** Assessing the quality of the RAG system's responses, often by comparing them to a reference answer or using an LLM to score relevance.
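To make the vector store and cosine similarity bullets concrete, here is a minimal sketch of a NumPy-backed store in the spirit of the notebooks' `SimpleVectorStore`. The method names and details are illustrative and may not match the class defined in each notebook.

```python
# Illustrative sketch of a NumPy-backed vector store; the SimpleVectorStore
# class inside each notebook may differ in method names and details.
from typing import List
import numpy as np

class SimpleVectorStore:
    def __init__(self):
        self.vectors = []  # one 1-D NumPy array per stored chunk
        self.texts = []    # the chunk texts, aligned with self.vectors

    def add_item(self, text: str, embedding: List[float]) -> None:
        """Store a chunk of text together with its embedding vector."""
        self.vectors.append(np.asarray(embedding, dtype=np.float32))
        self.texts.append(text)

    def similarity_search(self, query_embedding: List[float], k: int = 5) -> List[str]:
        """Return the k stored texts most similar to the query embedding."""
        q = np.asarray(query_embedding, dtype=np.float32)
        scores = [
            float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
            for v in self.vectors
        ]
        top = np.argsort(scores)[::-1][:k]  # indices of highest cosine similarity
        return [self.texts[i] for i in top]

# Tiny usage example with made-up 3-dimensional embeddings.
store = SimpleVectorStore()
store.add_item("Dogs are mammals.", [0.9, 0.1, 0.0])
store.add_item("Paris is in France.", [0.0, 0.2, 0.9])
print(store.similarity_search([0.8, 0.2, 0.1], k=1))
```

Because the store is just Python lists and NumPy arrays, it is easy to read, instrument, and modify, which is the point of avoiding heavier frameworks.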
## 🤝 Contributing
Contributions are welcome!
", Assign "at most 3 tags" to the expected json: {"id":"14017","tags":[]} "only from the tags list I provide: [{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui "},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine "},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"