A polyglot document intelligence framework with a Rust core. Extract text, metadata, images, and structured information from PDFs, Office documents, images, and 97+ formats. Available for Rust, Python, Ruby, Java, Go, PHP, Elixir, C#, R, C, TypeScript (Node/Bun/Wasm/Deno)- or use via CLI, REST API, or MCP server.
This repository is tracked by Trending Repos. The badge upgrades automatically if it ever cracks the top 100.
<img src="https://trending-repos.com/badge/kreuzberg-dev/kreuzberg.svg" alt="Trending Repos" />https://trending-repos.com/badge/kreuzberg-dev/kreuzberg.svg