Trendshift - Ask AI

Based on Local Deep Research is an AI-powered assistant that transforms complex questions into comprehensive, cited reports by conducting iterative analysis using any LLM across diverse knowledge sources including academic databases, scientific repositories, web content, and private document collections.

# Local Deep Research

<div align="center">

[![GitHub stars](https://img.shields.io/github/stars/LearningCircuit/local-deep-research?style=for-the-badge)](https://github.com/LearningCircuit/local-deep-research/stargazers)
[![Docker Pulls](https://img.shields.io/docker/pulls/localdeepresearch/local-deep-research?style=for-the-badge)](https://hub.docker.com/r/localdeepresearch/local-deep-research)
[![PyPI Downloads](https://img.shields.io/pypi/dm/local-deep-research?style=for-the-badge)](https://pypi.org/project/local-deep-research/)
[![Tests](https://img.shields.io/github/actions/workflow/status/LearningCircuit/local-deep-research/tests.yml?branch=main&style=for-the-badge&label=Tests)](https://github.com/LearningCircuit/local-deep-research/actions/workflows/tests.yml)
[![CodeQL](https://img.shields.io/github/actions/workflow/status/LearningCircuit/local-deep-research/codeql.yml?branch=main&style=for-the-badge&label=CodeQL)](https://github.com/LearningCircuit/local-deep-research/security/code-scanning)
[![Discord](https://img.shields.io/discord/1352043059562680370?style=for-the-badge&logo=discord)](https://discord.gg/ttcqQeFcJ3)
[![Reddit](https://img.shields.io/badge/Reddit-r/LocalDeepResearch-FF4500?style=for-the-badge&logo=reddit)](https://www.reddit.com/r/LocalDeepResearch/)

**AI-powered research assistant for deep, iterative research**

*Performs deep, iterative research using multiple LLMs and search engines with proper citations*

</div>

## 🚀 What is Local Deep Research?

LDR is an AI research assistant that performs systematic research by:

- **Breaking down complex questions** into focused sub-queries
- **Searching multiple sources** in parallel (web, academic papers, local documents)
- **Verifying information** across sources for accuracy
- **Creating comprehensive reports** with proper citations

It aims to help researchers, students, and professionals find accurate information quickly while maintaining transparency about sources.

## 🎯 Why Choose LDR?

- **Privacy-Focused**: Run entirely locally with Ollama + SearXNG
- **Flexible**: Use any LLM, any search engine, any vector store
- **Comprehensive**: Multiple research modes from quick summaries to detailed reports
- **Transparent**: Track costs and performance with built-in analytics
- **Open Source**: MIT licensed with an active community

## 📊 Performance

**~95% accuracy on SimpleQA benchmark** (preliminary results)

- Tested with GPT-4.1-mini + SearXNG + focused-iteration strategy
- Comparable to state-of-the-art AI research systems
- Local models can achieve similar performance with proper configuration
- [Join our community benchmarking effort →](https://github.com/LearningCircuit/local-deep-research/tree/main/community_benchmark_results)

## ✨ Key Features

### 🔍 Research Modes

- **Quick Summary** - Get answers in 30 seconds to 3 minutes with citations
- **Detailed Research** - Comprehensive analysis with structured findings
- **Report Generation** - Professional reports with sections and table of contents
- **Document Analysis** - Search your private documents with AI

### 🛠️ Advanced Capabilities

- **[LangChain Integration](docs/LANGCHAIN_RETRIEVER_INTEGRATION.md)** - Use any vector store as a search engine
- **[REST API](docs/api-quickstart.md)** - Language-agnostic HTTP access
- **[Benchmarking](docs/BENCHMARKING.md)** - Test and optimize your configuration
- **[Analytics Dashboard](docs/analytics-dashboard.md)** - Track costs, performance, and usage metrics
- **Real-time Updates** - WebSocket support for live research progress
- **Export Options** - Download results as PDF or Markdown
- **Research History** - Save, search, and revisit past research
- **Adaptive Rate Limiting** - Intelligent retry system that learns optimal wait times
- **Keyboard Shortcuts** - Navigate efficiently (ESC, Ctrl+Shift+1-5)

### 🌐 Search Sources

#### Free Search Engines

- **Academic**: arXiv, PubMed, Semantic Scholar
- **General**: Wikipedia, SearXNG, DuckDuckGo
- **Technical**: GitHub, Elasticsearch
- **Historical**: Wayback Machine
- **News**: The Guardian

#### Premium Search Engines

- **Tavily** - AI-powered search
- **Google** - Via SerpAPI or Programmable Search Engine
- **Brave Search** - Privacy-focused web search

#### Custom Sources

- **Local Documents** - Search your files with AI
- **LangChain Retrievers** - Any vector store or database
- **Meta Search** - Combine multiple engines intelligently

[Full Search Engines Guide →](docs/search-engines.md)

## ⚡ Quick Start

### Option 1: Docker (Quickstart on Mac/ARM)

```bash
# Step 1: Pull and run SearXNG for optimal search results
docker run -d -p 8080:8080 --name searxng searxng/searxng

# Step 2: Pull and run Local Deep Research (on ARM, please build your own Docker image)
docker run -d -p 5000:5000 --name local-deep-research --volume 'deep-research:/install/.venv/lib/python3.13/site-packages/data/' localdeepresearch/local-deep-research
```

### Option 2: Docker Compose (Recommended)

LDR uses Docker Compose to bundle the web app and all its dependencies so you can get up and running quickly.

#### Option 2a: Quick Start (One Command)

```bash
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && docker compose up -d
```

Open http://localhost:5000 after ~30 seconds. This starts LDR with SearXNG and all dependencies.

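The ~30 second wait can be scripted instead of guessed. Below is a minimal sketch, assuming only that the web UI answers HTTP requests at http://localhost:5000 once the containers are ready. The helper name `wait_until_up` and its parameters are illustrative and not part of LDR.

```python
import time
import urllib.error
import urllib.request


def wait_until_up(url: str, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll `url` until it returns any HTTP response or `timeout` seconds pass.

    Hypothetical helper for illustration; LDR does not ship this function.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status, so it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: the service is still starting.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_until_up("http://localhost:5000")` after `docker compose up -d` returns `True` as soon as the port answers and `False` if nothing responds within the timeout, which may be handy in setup scripts.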
#### Option 2b: DIY docker-compose

See [docker-compose.yml](./docker-compose.yml) for a docker-compose file with reasonable defaults to get up and running with Ollama, SearXNG, and Local Deep Research all running locally.

Things you may want/need to configure:

* Ollama GPU driver
* Ollama context length (depends on available VRAM)
* Ollama keep alive (how long an idle model stays loaded in VRAM before it is unloaded automatically)
* Deep Research model (depends on available VRAM and preference)

#### Option 2c: Use Cookiecutter to tailor a docker-compose file to your needs

##### Prerequisites

- [Docker](https://docs.docker.com/engine/install/)
- [Docker Compose](https://docs.docker.com/compose/install/)
- `cookiecutter`: Run `pip install --user cookiecutter`

Clone the repository:

```bash
git clone https://github.com/LearningCircuit/local-deep-research.git
cd local-deep-research
```

### Configuring with Docker Compose

Cookiecutter will interactively guide you through the process of creating a `docker-compose` configuration that meets your specific needs. This is the recommended approach if you are not very familiar with Docker.

In the LDR repository, run the following commands to generate the compose file and start the services:

```bash
cookiecutter cookiecutter-docker/
docker compose -f docker-compose.default.yml up
```

[Docker Compose Guide →](docs/docker-compose-guide.md)

### Option 3: Python Package

```bash
# Step 1: Install the package
pip install local-deep-research

# Step 2: Setup SearXNG for best results
docker pull searxng/searxng
docker run -d -p 8080:8080 --name searxng searxng/searxng

# Step 3: Install Ollama from https://ollama.ai

# Step 4: Download a model
ollama pull gemma3:12b

# Step 5: Start the web interface
python -m local_deep_research.web.app
```

[Full Installation Guide →](https://github.com/LearningCircuit/local-deep-research/wiki/Installation)

## 💻 Usage Examples

### Python API

```python
from local_deep_research.api import quick_summary

# Simple usage
result = quick_summary("What are the latest advances in quantum computing?")
print(result["summary"])

# Advanced usage with custom configuration
result = quick_summary(
    query="Impact of AI on healthcare",
    search_tool="searxng",
    search_strategy="focused-iteration",
    iterations=2
)
```

### HTTP API

```bash
curl -X POST http://localhost:5000/api/v1/quick_summary \
  -H "Content-Type: application/json" \
  -d '{"query": "Explain CRISPR gene editing"}'
```

[More Examples →](examples/api_usage/)

### Command Line Tools

```bash
# Run benchmarks from CLI
python -m local_deep_research.benchmarks --dataset simpleqa --examples 50

# Manage rate limiting
python -m local_deep_research.web_search_engines.rate_limiting status
python -m local_deep_research.web_search_engines.rate_limiting reset
```

## 🔗 Enterprise Integration

Connect LDR to your existing knowledge base:

```python
from local_deep_research.api import quick_summary

# Use your existing LangChain retriever
# (your_retriever is a placeholder for a retriever object you have already built)
result = quick_summary(
    query="What are our deployment procedures?",
    retrievers={"company_kb": your_retriever},
    search_tool="company_kb"
)
```

Works with: FAISS, Chroma, Pinecone, Weaviate, Elasticsearch, and any LangChain-compatible retriever.

[Integration Guide →](docs/LANGCHAIN_RETRIEVER_INTEGRATION.md)

## 📊 Performance & Analytics

### Benchmark Results

Early experiments on small SimpleQA dataset samples:

| Configuration | Accuracy | Notes |
|--------------|----------|--------|
| gpt-4.1-mini + SearXNG + focused_iteration | 90-95% | Limited sample size |
| gpt-4.1-mini + Tavily + focused_iteration | 90-95% | Limited sample size |
| gemini-2.0-flash-001 + SearXNG | 82% | Single test run |

Note: These are preliminary results from initial testing. Performance varies significantly based on query types, model versions, and configurations. [Run your own benchmarks →](docs/BENCHMARKING.md)

### Built-in Analytics Dashboard

Track costs, performance, and usage with detailed metrics. [Learn more →](docs/analytics-dashboard.md)

## 🤖 Supported LLMs

### Local Models (via Ollama)

- Llama 3, Mistral, Gemma, DeepSeek
- LLM processing stays local (search queries still go to the web)
- No API costs

### Cloud Models

- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude 3)
- Google (Gemini)
- 100+ models via OpenRouter

[Model Setup →](docs/env_configuration.md)

## 📚 Documentation

### Getting Started

- [Installation Guide](https://github.com/LearningCircuit/local-deep-research/wiki/Installation)
- [Frequently Asked Questions](docs/faq.md)
- [API Quickstart](docs/api-quickstart.md)
- [Configuration Guide](docs/env_configuration.md)

### Core Features

- [All Features Guide](docs/features.md)
- [Search Engines Guide](docs/search-engines.md)
- [Analytics Dashboard](docs/analytics-dashboard.md)

### Advanced Features

- [LangChain Integration](docs/LANGCHAIN_RETRIEVER_INTEGRATION.md)
- [Benchmarking System](docs/BENCHMARKING.md)
- [Elasticsearch Setup](docs/elasticsearch_search_engine.md)
- [SearXNG Setup](docs/SearXNG-Setup.md)

### Development

- [Docker Compose Guide](docs/docker-compose-guide.md)
- [Development Guide](docs/developing.md)
- [Security Guide](docs/security/CODEQL_GUIDE.md)
- [Release Guide](docs/RELEASE_GUIDE.md)

### Examples & Tutorials

- [API Examples](examples/api_usage/)
- [Benchmark Examples](examples/benchmarks/)
- [Optimization Examples](examples/optimization/)

## 🤝 Community & Support

- [Discord](https://discord.gg/ttcqQeFcJ3) - Get help and share research techniques
- [Reddit](https://www.reddit.com/r/LocalDeepResearch/) - Updates and showcases
- [GitHub Issues](https://github.com/LearningCircuit/local-deep-research/issues) - Bug reports

## 🚀 Contributing

We welcome contributions! See our [Contributing Guide](CONTRIBUTING.md) to get started.

## 📄 License

MIT License - see [LICENSE](LICENSE) file.

Built with: [LangChain](https://github.com/hwchase17/langchain), [Ollama](https://ollama.ai), [SearXNG](https://searxng.org/), [FAISS](https://github.com/facebookresearch/faiss)

> **Support Free Knowledge:** Consider donating to [Wikipedia](https://donate.wikimedia.org), [arXiv](https://arxiv.org/about/give), or [PubMed](https://www.nlm.nih.gov/pubs/donations/donations.html).
", Assign "at most 3 tags" to the expected json: {"id":"14116","tags":[]} "only from the tags list I provide: [{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui "},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine 
"},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"

AI prompts
