AI prompts
base on Interact with your documents using the power of GPT, 100% privately, no data leaks # PrivateGPT
<a href="https://trendshift.io/repositories/2601" target="_blank"><img src="https://trendshift.io/api/badge/repositories/2601" alt="imartinez%2FprivateGPT | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
[](https://github.com/zylon-ai/private-gpt/actions/workflows/tests.yml?query=branch%3Amain)
[](https://docs.privategpt.dev/)
[](https://discord.gg/bK6mRVpErU)
[](https://twitter.com/ZylonPrivateGPT)

PrivateGPT -built by Zylon- is a production-ready AI project that allows you to ask questions about your documents using the power
of Large Language Models (LLMs), even in scenarios without an Internet connection. 100% private, no data leaves your
execution environment at any point.
>[!TIP]
> If you are looking for an **enterprise-ready, fully private AI platform for regulated industries** like financial services (banks, insurance, investment), defense, critical infrastructure services, government and healthcare,
> check out [Zylon's website](https://zylon.ai) or [request a demo](https://cal.com/zylon/demo?source=pgpt-readme).
> **Zylon** is an enterprise AI platform delivering private generative AI and on-premise AI software for regulated industries, enabling secure deployment inside enterprise infrastructure without external cloud dependencies.
The project provides an API offering all the primitives required to build private, context-aware AI applications.
It follows and extends the [OpenAI API standard](https://openai.com/blog/openai-api),
and supports both normal and streaming responses.
The API is divided into two logical blocks:
**High-level API**, which abstracts all the complexity of a RAG (Retrieval Augmented Generation)
pipeline implementation:
- Ingestion of documents: internally managing document parsing,
splitting, metadata extraction, embedding generation and storage.
- Chat & Completions using context from ingested documents:
abstracting the retrieval of context, the prompt engineering and the response generation.
**Low-level API**, which allows advanced users to implement their own complex pipelines:
- Embeddings generation: based on a piece of text.
- Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested documents.
In addition to this, a working [Gradio UI](https://www.gradio.app/)
client is provided to test the API, together with a set of useful tools such as bulk model
download script, ingestion script, documents folder watch, etc.
## 🎞️ Overview
>[!WARNING]
> This README is not updated as frequently as the [documentation](https://docs.privategpt.dev/).
> Please check it out for the latest updates!
### Motivation behind PrivateGPT
Generative AI is a game changer for our society, but adoption in companies of all sizes and data-sensitive
domains like healthcare or legal is limited by a clear concern: **privacy**.
Not being able to ensure that your data is fully under your control when using third-party AI tools
is a risk those industries cannot take.
### Primordial version
The first version of PrivateGPT was launched in May 2023 as a novel approach to address the privacy
concerns by using LLMs in a complete offline way.
That version, which rapidly became a go-to project for privacy-sensitive setups and served as the seed
for thousands of local-focused generative AI projects, was the foundation of what PrivateGPT is becoming nowadays;
thus a simpler and more educational implementation to understand the basic concepts required
to build a fully local -and therefore, private- chatGPT-like tool.
If you want to keep experimenting with it, we have saved it in the
[primordial branch](https://github.com/zylon-ai/private-gpt/tree/primordial) of the project.
> It is strongly recommended to do a clean clone and install of this new version of
PrivateGPT if you come from the previous, primordial version.
### Present and Future of PrivateGPT
PrivateGPT is now evolving towards becoming a gateway to generative AI models and primitives, including
completions, document ingestion, RAG pipelines and other low-level building blocks.
We want to make it easier for any developer to build AI applications and experiences, as well as provide
a suitable extensive architecture for the community to keep contributing.
Stay tuned to our [releases](https://github.com/zylon-ai/private-gpt/releases) to check out all the new features and changes included.
## 📄 Documentation
Full documentation on installation, dependencies, configuration, running the server, deployment options,
ingesting local documents, API details and UI features can be found here: https://docs.privategpt.dev/
## 🧩 Architecture
Conceptually, PrivateGPT is an API that wraps a RAG pipeline and exposes its
primitives.
* The API is built using [FastAPI](https://fastapi.tiangolo.com/) and follows
[OpenAI's API scheme](https://platform.openai.com/docs/api-reference).
* The RAG pipeline is based on [LlamaIndex](https://www.llamaindex.ai/).
The design of PrivateGPT allows to easily extend and adapt both the API and the
RAG implementation. Some key architectural decisions are:
* Dependency Injection, decoupling the different components and layers.
* Usage of LlamaIndex abstractions such as `LLM`, `BaseEmbedding` or `VectorStore`,
making it immediate to change the actual implementations of those abstractions.
* Simplicity, adding as few layers and new abstractions as possible.
* Ready to use, providing a full implementation of the API and RAG
pipeline.
Main building blocks:
* APIs are defined in `private_gpt:server:<api>`. Each package contains an
`<api>_router.py` (FastAPI layer) and an `<api>_service.py` (the
service implementation). Each *Service* uses LlamaIndex base abstractions instead
of specific implementations,
decoupling the actual implementation from its usage.
* Components are placed in
`private_gpt:components:<component>`. Each *Component* is in charge of providing
actual implementations to the base abstractions used in the Services - for example
`LLMComponent` is in charge of providing an actual implementation of an `LLM`
(for example `LlamaCPP` or `OpenAI`).
## 💡 Contributing
Contributions are welcomed! To ensure code quality we have enabled several format and
typing checks, just run `make check` before committing to make sure your code is ok.
Remember to test your code! You'll find a tests folder with helpers, and you can run
tests using `make test` command.
Don't know what to contribute? Here is the public
[Project Board](https://github.com/users/imartinez/projects/3) with several ideas.
Head over to Discord
#contributors channel and ask for write permissions on that GitHub project.
## 💬 Community
Join the conversation around PrivateGPT on our:
- [Twitter (aka X)](https://twitter.com/PrivateGPT_AI)
- [Discord](https://discord.gg/bK6mRVpErU)
## 📖 Citation
If you use PrivateGPT in a paper, check out the [Citation file](CITATION.cff) for the correct citation.
You can also use the "Cite this repository" button in this repo to get the citation in different formats.
Here are a couple of examples:
#### BibTeX
```bibtex
@software{Zylon_PrivateGPT_2023,
author = {Zylon by PrivateGPT},
license = {Apache-2.0},
month = may,
title = {{PrivateGPT}},
url = {https://github.com/zylon-ai/private-gpt},
year = {2023}
}
```
#### APA
```
Zylon by PrivateGPT (2023). PrivateGPT [Computer software]. https://github.com/zylon-ai/private-gpt
```
## 🤗 Partners & Supporters
PrivateGPT is actively supported by the teams behind:
* [Qdrant](https://qdrant.tech/), providing the default vector database
* [Fern](https://buildwithfern.com/), providing Documentation and SDKs
* [LlamaIndex](https://www.llamaindex.ai/), providing the base RAG framework and abstractions
This project has been strongly influenced and supported by other amazing projects like
[LangChain](https://github.com/hwchase17/langchain),
[GPT4All](https://github.com/nomic-ai/gpt4all),
[LlamaCpp](https://github.com/ggerganov/llama.cpp),
[Chroma](https://www.trychroma.com/)
and [SentenceTransformers](https://www.sbert.net/).
", Assign "at most 3 tags" to the expected json: {"id":"8691","tags":[]} "only from the tags list I provide: [{"id":39,"name":"3d-generation","display_name":"3D generation","slug":"3d-generation"},{"id":3,"name":"ai-agent","display_name":"AI agent","slug":"ai-agent"},{"id":8,"name":"ai-coding","display_name":"AI coding assistant","slug":"ai-coding"},{"id":5,"name":"ai-image","display_name":"AI image generation","slug":"ai-image"},{"id":9,"name":"ai-infrastructure","display_name":"AI infrastructure","slug":"ai-infrastructure"},{"id":10,"name":"ai-memory","display_name":"AI memory","slug":"ai-memory"},{"id":11,"name":"ai-skills","display_name":"AI skills","slug":"ai-skills"},{"id":12,"name":"ai-translation","display_name":"AI translation","slug":"ai-translation"},{"id":6,"name":"ai-video","display_name":"AI video generation","slug":"ai-video"},{"id":4,"name":"ai-voice","display_name":"AI voice","slug":"ai-voice"},{"id":7,"name":"ai-workflow","display_name":"AI workflow","slug":"ai-workflow"},{"id":22,"name":"audio-processing","display_name":"Audio processing","slug":"audio-processing"},{"id":29,"name":"authentication","display_name":"Authentication","slug":"authentication"},{"id":51,"name":"bundler","display_name":"Bundler","slug":"bundler"},{"id":41,"name":"chatbot","display_name":"Chatbot","slug":"chatbot"},{"id":27,"name":"cloud-native","display_name":"Cloud native","slug":"cloud-native"},{"id":1,"name":"computer-vision","display_name":"Computer vision","slug":"computer-vision"},{"id":37,"name":"crypto-trading","display_name":"Crypto trading","slug":"crypto-trading"},{"id":57,"name":"curated-list","display_name":"Curated list","slug":"curated-list"},{"id":54,"name":"data-streaming","display_name":"Data streaming","slug":"data-streaming"},{"id":35,"name":"data-visualization","display_name":"Data visualization","slug":"data-visualization"},{"id":16,"name":"database-backup","display_name":"Database backup","slug":"database-backup"},{"id":49,"name":"design-system","display_name":"Design system","slug":"design-system"},{"id":38,"name":"digital-human","display_name":"Digital human","slug":"digital-human"},{"id":34,"name":"document-processing","display_name":"Document processing","slug":"document-processing"},{"id":44,"name":"ecommerce","display_name":"E-commerce","slug":"ecommerce"},{"id":45,"name":"emulator","display_name":"Emulator","slug":"emulator"},{"id":46,"name":"file-management","display_name":"File management","slug":"file-management"},{"id":32,"name":"fintech","display_name":"Fintech","slug":"fintech"},{"id":31,"name":"game-development","display_name":"Game development","slug":"game-development"},{"id":24,"name":"headless-browser","display_name":"Headless browser","slug":"headless-browser"},{"id":52,"name":"headless-cms","display_name":"Headless CMS","slug":"headless-cms"},{"id":36,"name":"home-automation","display_name":"Home automation","slug":"home-automation"},{"id":20,"name":"image-editing","display_name":"Image editing","slug":"image-editing"},{"id":28,"name":"iot","display_name":"IoT","slug":"iot"},{"id":13,"name":"local-llm","display_name":"Local LLM","slug":"local-llm"},{"id":17,"name":"mcp","display_name":"MCP","slug":"mcp"},{"id":47,"name":"monitoring","display_name":"Monitoring","slug":"monitoring"},{"id":2,"name":"nlp","display_name":"NLP","slug":"nlp"},{"id":26,"name":"observability","display_name":"Observability","slug":"observability"},{"id":40,"name":"pentesting","display_name":"Pentesting","slug":"pentesting"},{"id":48,"name":"programming-examples","display_name":"Programming examples","slug":"programming-examples"},{"id":42,"name":"proxy","display_name":"Proxy","slug":"proxy"},{"id":14,"name":"rag","display_name":"RAG","slug":"rag"},{"id":56,"name":"resume-building","display_name":"Resume building","slug":"resume-building"},{"id":33,"name":"robotics","display_name":"Robotics","slug":"robotics"},{"id":30,"name":"search","display_name":"Search","slug":"search"},{"id":43,"name":"self-hosted","display_name":"Self-hosted","slug":"self-hosted"},{"id":50,"name":"static-analysis","display_name":"Static analysis","slug":"static-analysis"},{"id":18,"name":"synthetic-data","display_name":"Synthetic data","slug":"synthetic-data"},{"id":19,"name":"text-to-speech","display_name":"Text to speech","slug":"text-to-speech"},{"id":53,"name":"ui-components","display_name":"UI components","slug":"ui-components"},{"id":15,"name":"vector-database","display_name":"Vector database","slug":"vector-database"},{"id":21,"name":"video-editing","display_name":"Video editing","slug":"video-editing"},{"id":25,"name":"web-scraping","display_name":"Web scraping","slug":"web-scraping"},{"id":55,"name":"webassembly","display_name":"WebAssembly","slug":"webassembly"},{"id":23,"name":"workflow-automation","display_name":"Workflow automation","slug":"workflow-automation"}]" returns me the "expected json"