TreeDex Documentation

Tree-based, vectorless document RAG framework.

Index any document into a navigable tree structure, then retrieve relevant sections using any LLM. No vector databases, no embeddings — just structured tree retrieval.



Why TreeDex?

| | TreeDex | Vector DB RAG |
|---|---|---|
| Structure | Preserves hierarchy (chapters → sections → subsections) | Flat chunks, no hierarchy |
| Storage | JSON file (human-readable, inspectable) | Opaque vector database |
| Infrastructure | None — just JSON files | Pinecone, Chroma, Weaviate, etc. |
| Attribution | Exact page ranges per section | Approximate chunk boundaries |
| Dependencies | 1 LLM API | 1 LLM + 1 embedding model + 1 database |
| PDF with ToC | Zero LLM calls for indexing | Still needs embedding |
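
To make the Structure, Storage, and Attribution rows concrete, here is a minimal sketch of what a hierarchical index with per-section page ranges could look like, and how a section lookup yields an exact page range. The field names (`title`, `start_page`, `end_page`, `children`) and the `find_section` helper are illustrative assumptions, not TreeDex's actual index schema or API.

```python
import json

# Hypothetical index layout; TreeDex's real JSON schema may differ.
index = {
    "title": "Textbook",
    "start_page": 1,
    "end_page": 100,
    "children": [
        {
            "title": "Chapter 1",
            "start_page": 1,
            "end_page": 40,
            "children": [
                {"title": "1.1 Methods", "start_page": 12, "end_page": 20, "children": []},
            ],
        },
        {"title": "Chapter 2", "start_page": 41, "end_page": 100, "children": []},
    ],
}


def find_section(node, title, path=()):
    """Depth-first search returning (path, node) for the first matching title."""
    path = path + (node["title"],)
    if node["title"] == title:
        return path, node
    for child in node["children"]:
        found = find_section(child, title, path)
        if found is not None:
            return found
    return None


path, section = find_section(index, "1.1 Methods")
print(" > ".join(path))
print(f"pages {section['start_page']}-{section['end_page']}")

# The whole index round-trips through plain JSON, so it stays inspectable.
assert json.loads(json.dumps(index)) == index
```

Because each node carries its own page range and its position in the hierarchy, a retrieved section can be cited by its full path and pages rather than by an approximate chunk offset.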

Key Features

  • 18+ LLM providers — Gemini, OpenAI, Claude, Mistral, Groq, Ollama, and more
  • Smart hierarchy detection — PDF ToC extraction, font-size heading markers, orphan repair
  • Dual language — Python and Node.js with identical APIs and cross-compatible index format
  • Agentic mode — Retrieve context AND generate answers in one call
  • Image support — Vision LLMs describe images embedded in PDFs
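
As an illustration of the font-size heading markers mentioned above, the sketch below tags lines of text with `[H1]`/`[H2]`/`[H3]` based on their font size relative to the body text. It is a simplified heuristic written for this page under stated assumptions: the `mark_headings` function, its `(text, font_size)` input format, and the "most common size is body text" rule are all hypothetical and may not match how TreeDex performs its font analysis.

```python
from collections import Counter


def mark_headings(lines):
    """Tag lines with [H1]/[H2]/[H3] based on font size relative to body text.

    `lines` is a list of (text, font_size) pairs. The most common font size
    is treated as body text; the three largest sizes above it become
    H1, H2, and H3, largest first. Everything else is left unmarked.
    """
    if not lines:
        return []
    body_size = Counter(size for _, size in lines).most_common(1)[0][0]
    heading_sizes = sorted({s for _, s in lines if s > body_size}, reverse=True)[:3]
    level = {size: i + 1 for i, size in enumerate(heading_sizes)}
    marked = []
    for text, size in lines:
        if size in level:
            marked.append(f"[H{level[size]}] {text}")
        else:
            marked.append(text)
    return marked


# Hypothetical extracted lines: (text, font size in points).
sample = [
    ("Introduction", 18.0),
    ("Background", 14.0),
    ("Body text one.", 10.0),
    ("Body text two.", 10.0),
    ("Details", 12.0),
    ("More body.", 10.0),
]
marked = mark_headings(sample)
for line in marked:
    print(line)
```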

v0.1.5 Highlights

  • PDF ToC extraction — Zero LLM calls when PDF has bookmarks
  • Font-size heading detection — [H1]/[H2]/[H3] markers from font analysis
  • Capped continuation context — 90% token savings on large documents
  • Orphan repair — Auto-insert synthetic parents for broken hierarchy
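
The orphan repair idea can be sketched as follows: when a heading skips a level (for example an H3 directly under an H1), a placeholder parent is inserted so that every node has a valid ancestor chain. This is a minimal illustration, not TreeDex's implementation; the `repair_orphans` function, the `(level, title)` input format, and the `"(untitled section)"` placeholder are assumptions made for the example.

```python
def repair_orphans(headings):
    """Insert synthetic parents wherever a heading skips a level.

    `headings` is a list of (level, title) pairs in document order, with
    level 1 as the top. Returns a new list in which every heading at
    level n > 1 has an ancestor at level n - 1 before it.
    """
    repaired = []
    current_level = 0  # level of the most recent heading; 0 is the document root
    for level, title in headings:
        # Fill each skipped level with a placeholder parent.
        for missing in range(current_level + 1, level):
            repaired.append((missing, "(untitled section)"))
        repaired.append((level, title))
        current_level = level
    return repaired


# "Details" is an H3 directly under an H1, so it is an orphan.
broken = [(1, "Intro"), (3, "Details"), (2, "Methods"), (3, "Setup")]
fixed = repair_orphans(broken)
for level, title in fixed:
    print("  " * (level - 1) + title)
```

Moving back up the hierarchy (H3 to H2) needs no repair, so only downward jumps of more than one level trigger an insertion.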

Quick Example

```python
from treedex import TreeDex, GeminiLLM

llm = GeminiLLM(api_key="your-key")
index = TreeDex.from_file("textbook.pdf", llm=llm)
index.show_tree()

result = index.query("What are the main methods?", agentic=True)
print(result.answer)       # Direct answer
print(result.pages_str)    # "pages 45-52"
```

Documentation

| Page | Description |
|---|---|
| Getting Started | Installation, quick start, basic usage |
| Architecture | System design, pipeline, data types |
| API Reference | Complete function and class reference |
| LLM Backends | 18+ providers, vision support, custom backends |
| Benchmarks | Performance numbers, scaling characteristics |
| Case Studies | Before/after comparisons, real-world scenarios |
| Configuration | Tuning guide for different use cases |
| Benchmark Report | TreeDex vs Vector RAG head-to-head on 244-page textbook |


TreeDex © 2024-2026 Mithun Gowda B. MIT License.
