# PrivateGPT <a href="https://trendshift.io/repositories/2601" target="_blank"><img src="https://trendshift.io/api/badge/repositories/2601" alt="imartinez%2FprivateGPT | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>

Interact with your documents using the power of GPT, 100% privately, no data leaks.

[![Tests](https://github.com/zylon-ai/private-gpt/actions/workflows/tests.yml/badge.svg)](https://github.com/zylon-ai/private-gpt/actions/workflows/tests.yml?query=branch%3Amain)
[![Website](https://img.shields.io/website?up_message=check%20it&down_message=down&url=https%3A%2F%2Fdocs.privategpt.dev%2F&label=Documentation)](https://docs.privategpt.dev/)
[![Discord](https://img.shields.io/discord/1164200432894234644?logo=discord&label=PrivateGPT)](https://discord.gg/bK6mRVpErU)
[![X (formerly Twitter) Follow](https://img.shields.io/twitter/follow/ZylonPrivateGPT)](https://twitter.com/ZylonPrivateGPT)

![Gradio UI](/fern/docs/assets/ui.png?raw=true)

PrivateGPT is a production-ready AI project that allows you to ask questions about your documents using the power of Large Language Models (LLMs), even in scenarios without an Internet connection. 100% private: no data leaves your execution environment at any point.

> [!TIP]
> If you are looking for an **enterprise-ready, fully private AI workspace**,
> check out [Zylon's website](https://zylon.ai) or [request a demo](https://cal.com/zylon/demo?source=pgpt-readme).
> Crafted by the team behind PrivateGPT, Zylon is a best-in-class AI collaborative
> workspace that can be easily deployed on-premise (data center, bare metal...) or in your private cloud (AWS, GCP, Azure...).

The project provides an API offering all the primitives required to build private, context-aware AI applications. It follows and extends the [OpenAI API standard](https://openai.com/blog/openai-api), and supports both normal and streaming responses.

The API is divided into two logical blocks:

**High-level API**, which abstracts all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation:
- Ingestion of documents: internally managing document parsing, splitting, metadata extraction, embedding generation and storage.
- Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt engineering and the response generation.

**Low-level API**, which allows advanced users to implement their own complex pipelines:
- Embeddings generation: based on a piece of text.
- Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested documents.

In addition, a working [Gradio UI](https://www.gradio.app/) client is provided to test the API, together with a set of useful tools such as a bulk model download script, an ingestion script, a documents folder watcher, etc. A minimal sketch of calling the API from Python is shown below.
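The snippet below is a minimal sketch of how a client might exercise both API blocks over HTTP. It assumes a local PrivateGPT server on port 8001 and endpoint paths in the style documented at https://docs.privategpt.dev/; exact paths and payload fields may differ between versions, so treat the names here as illustrative rather than authoritative.

```python
# Minimal, illustrative sketch of calling the PrivateGPT API from Python.
# Assumptions: server at http://localhost:8001 and OpenAI-style endpoints as
# documented at https://docs.privategpt.dev/ (verify paths for your version).
import requests

BASE_URL = "http://localhost:8001"  # assumed default; adjust to your deployment

# High-level API: ingest a document so it can be used as context.
with open("my_document.pdf", "rb") as f:
    ingest = requests.post(f"{BASE_URL}/v1/ingest/file", files={"file": f})
ingest.raise_for_status()

# High-level API: ask a question grounded in the ingested documents.
chat = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Summarize my document."}],
        "use_context": True,  # PrivateGPT extension: retrieve context from ingested docs
        "stream": False,
    },
)
print(chat.json()["choices"][0]["message"]["content"])

# Low-level API: retrieve the most relevant chunks for a query.
chunks = requests.post(f"{BASE_URL}/v1/chunks", json={"text": "my query"})
print(chunks.json())
```

Because the API follows and extends the OpenAI API standard, existing OpenAI-compatible client libraries can usually be pointed at the PrivateGPT base URL for the chat and completions endpoints.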
## 🎞️ Overview

> [!WARNING]
> This README is not updated as frequently as the [documentation](https://docs.privategpt.dev/).
> Please check it out for the latest updates!

### Motivation behind PrivateGPT

Generative AI is a game changer for our society, but adoption in companies of all sizes and in data-sensitive domains like healthcare or legal is limited by a clear concern: **privacy**. Not being able to ensure that your data is fully under your control when using third-party AI tools is a risk those industries cannot take.

### Primordial version

The first version of PrivateGPT was launched in May 2023 as a novel approach to address privacy concerns by using LLMs in a completely offline way.

That version, which rapidly became a go-to project for privacy-sensitive setups and served as the seed for thousands of local-focused generative AI projects, was the foundation of what PrivateGPT is becoming today. It remains a simpler, more educational implementation for understanding the basic concepts required to build a fully local, and therefore private, ChatGPT-like tool. If you want to keep experimenting with it, we have saved it in the [primordial branch](https://github.com/zylon-ai/private-gpt/tree/primordial) of the project.

> It is strongly recommended to do a clean clone and install of this new version of PrivateGPT if you come from the previous, primordial version.

### Present and Future of PrivateGPT

PrivateGPT is now evolving towards becoming a gateway to generative AI models and primitives, including completions, document ingestion, RAG pipelines and other low-level building blocks. We want to make it easier for any developer to build AI applications and experiences, as well as provide an extensible architecture for the community to keep contributing. Stay tuned to our [releases](https://github.com/zylon-ai/private-gpt/releases) to check out all the new features and changes included.

## 📄 Documentation

Full documentation on installation, dependencies, configuration, running the server, deployment options, ingesting local documents, API details and UI features can be found here: https://docs.privategpt.dev/

## 🧩 Architecture

Conceptually, PrivateGPT is an API that wraps a RAG pipeline and exposes its primitives.
* The API is built using [FastAPI](https://fastapi.tiangolo.com/) and follows [OpenAI's API scheme](https://platform.openai.com/docs/api-reference).
* The RAG pipeline is based on [LlamaIndex](https://www.llamaindex.ai/).

The design of PrivateGPT makes it easy to extend and adapt both the API and the RAG implementation. Some key architectural decisions are:
* Dependency Injection, decoupling the different components and layers.
* Usage of LlamaIndex abstractions such as `LLM`, `BaseEmbedding` or `VectorStore`, making it immediate to change the actual implementations of those abstractions.
* Simplicity, adding as few layers and new abstractions as possible.
* Ready to use, providing a full implementation of the API and RAG pipeline.

Main building blocks:
* APIs are defined in `private_gpt:server:<api>`. Each package contains an `<api>_router.py` (FastAPI layer) and an `<api>_service.py` (the service implementation). Each *Service* uses LlamaIndex base abstractions instead of specific implementations, decoupling the actual implementation from its usage.
* Components are placed in `private_gpt:components:<component>`. Each *Component* is in charge of providing actual implementations to the base abstractions used in the Services - for example `LLMComponent` is in charge of providing an actual implementation of an `LLM` (for example `LlamaCPP` or `OpenAI`). A simplified sketch of this layering is shown right after this list.
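To make the layering concrete, here is a simplified, illustrative sketch of the router, service and component pattern described above. The class and endpoint names are hypothetical stand-ins, and the real project wires dependencies through a dependency-injection container rather than the plain provider function used here; see the `private_gpt` package for the actual implementation.

```python
# Illustrative sketch of the router -> service -> component layering.
# Names are simplified stand-ins, not the real PrivateGPT modules.
from fastapi import APIRouter, Depends, FastAPI
from pydantic import BaseModel


class LLMComponent:
    """Component: provides a concrete implementation behind the `LLM`
    abstraction (e.g. LlamaCPP or OpenAI); swapping it leaves services intact."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"  # stand-in for a real LLM call


class ChatService:
    """Service: implements the use case against the component's interface,
    never against a specific LLM implementation."""

    def __init__(self, llm_component: LLMComponent) -> None:
        self._llm = llm_component

    def chat(self, prompt: str) -> str:
        return self._llm.complete(prompt)


# Dependency wiring: the real project uses a dependency-injection container;
# a plain provider function keeps this sketch short.
_llm_component = LLMComponent()


def get_chat_service() -> ChatService:
    return ChatService(_llm_component)


class ChatBody(BaseModel):
    prompt: str


chat_router = APIRouter()


@chat_router.post("/chat")
def chat_endpoint(body: ChatBody, service: ChatService = Depends(get_chat_service)) -> dict:
    # FastAPI layer (the `<api>_router.py` role): turns the HTTP request into a service call.
    return {"response": service.chat(body.prompt)}


app = FastAPI()
app.include_router(chat_router)
```

The point of this structure is that replacing `LLMComponent` with one backed by a different model, local or remote, leaves the service and the router untouched, which is what the LlamaIndex abstractions and dependency injection provide in the real codebase.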
## 💡 Contributing

Contributions are welcome! To ensure code quality, we have enabled several format and typing checks; just run `make check` before committing to make sure your code is OK. Remember to test your code! You'll find a tests folder with helpers, and you can run tests using the `make test` command.

Don't know what to contribute? Here is the public [Project Board](https://github.com/users/imartinez/projects/3) with several ideas. Head over to the Discord #contributors channel and ask for write permissions on that GitHub project.

## 💬 Community

Join the conversation around PrivateGPT on our:
- [Twitter (aka X)](https://twitter.com/PrivateGPT_AI)
- [Discord](https://discord.gg/bK6mRVpErU)

## 📖 Citation

If you use PrivateGPT in a paper, check out the [Citation file](CITATION.cff) for the correct citation.
You can also use the "Cite this repository" button in this repo to get the citation in different formats.

Here are a couple of examples:

#### BibTeX
```bibtex
@software{Zylon_PrivateGPT_2023,
author = {Zylon by PrivateGPT},
license = {Apache-2.0},
month = may,
title = {{PrivateGPT}},
url = {https://github.com/zylon-ai/private-gpt},
year = {2023}
}
```

#### APA
```
Zylon by PrivateGPT (2023). PrivateGPT [Computer software]. https://github.com/zylon-ai/private-gpt
```

## 🤗 Partners & Supporters

PrivateGPT is actively supported by the teams behind:
* [Qdrant](https://qdrant.tech/), providing the default vector database
* [Fern](https://buildwithfern.com/), providing Documentation and SDKs
* [LlamaIndex](https://www.llamaindex.ai/), providing the base RAG framework and abstractions

This project has been strongly influenced and supported by other amazing projects like
[LangChain](https://github.com/hwchase17/langchain),
[GPT4All](https://github.com/nomic-ai/gpt4all),
[LlamaCpp](https://github.com/ggerganov/llama.cpp),
[Chroma](https://www.trychroma.com/)
and [SentenceTransformers](https://www.sbert.net/).
"},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"