Document processing

New 2026

📄 PDF/IMG -> MD/JSON document OCR API for PaddleOCR and GLMOCR. Self-hostable.

New 2026

LLM Wiki is a cross-platform desktop application that automatically turns your documents into an organized, interlinked knowledge base. Instead of traditional RAG (retrieve-and-answer from scratch every time), the LLM incrementally builds and maintains a persistent wiki from your sources.

New 2026

Claude Code skill. Drop code, papers, images, or notes into a folder and get a knowledge graph with community detection, god nodes, and an honest audit trail.

New 2026

Free offline all-in-one file converter for Windows. Converts documents, images, audio and video locally. No uploads, no internet, no dependencies. Built with Python & PySide6. Features dark/light theme, stats dashboard, achievements, and multi-engine fallback.

New 2026

OpenKB: Open LLM Knowledge Base

New 2026

AI Legal Assistant skill for Claude Code. Contract review, risk analysis, NDA generation, compliance auditing, negotiation strategy, and PDF reports: 14 skills, 5 parallel agents.

New 2026

🖍️ Convert anything to markdown. Mark it.

New 2026

A code-driven presentation generation framework. Generate presentations the way you build software projects.

New 2026

Claude Code skill that translates entire books (PDF/DOCX/EPUB) into any language using parallel subagents

New 2026

Hybrid RAG system combining vector search, knowledge graph (LightRAG), and cross-encoder reranking — with Docling document parsing, visual intelligence (image/table captioning), agentic streaming chat, and inline citations. Powered by Gemini or local Ollama models.

New 2026

OfficeCLI is the first and best Office suite purpose-built for AI agents to read, edit, and automate Word, Excel, and PowerPoint files. Free, open-source, single binary, no Office installation required.

New 2026

PDF library for Go: layout engine, HTML to PDF, forms, signatures, barcodes, and PDF/A. Apache 2.0.

New 2026

A diffusion-based framework for document OCR that replaces autoregressive decoding with block-level parallel diffusion decoding.

New 2026

Portable, fully offline Markdown editor for Windows. No install, no internet, no Electron.

New 2026

Client-side PDF editor toolkit

New 2026

🦖 Markdown browser for humans and agents. Browse docs from local directories, GitHub repos, and remote websites — all from your terminal.

New 2026

Thoth - Personal AI Sovereignty. A local-first AI assistant with integrated tools, a personal knowledge graph, voice, vision, shell, browser automation, scheduled tasks, health tracking, and messaging channels. Run locally via Ollama or add opt-in cloud models. Your data stays on your machine.

New 2026

TreeSearch: Search your codebase like a human, not like a vector database. No embeddings. No chunking. Just millisecond search over structured documents and large codebases.

New 2026

A native Android document reader application built with Kotlin and Jetpack Compose.

New 2026

A Claude Code skill that turns PDFs, docs, and codebases into Obsidian study vaults