LlamaIndex’s legal-kb redefines legal document search with agentic tools

LlamaIndex has just open-sourced legal-kb, a reference web app that turns legal document repositories into interactive knowledge bases where an AI agent can “crawl” files using familiar filesystem tools. Instead of running a single embedding search per query, the agent can list documents, scan contents, and fetch exact passages, behavior the team calls a “Retrieval Harness.” Built on LlamaIndex Index v2 and the LlamaParse platform, the demo shows how everyday file operations can drive multi-step retrieval pipelines.
A retrieval harness closer to the command line than to vector search
The Retrieval Harness replaces traditional one-shot retrieval with a sequence of tool calls. It exposes four tools that mirror common command-line actions: semantic and keyword search (retrieve), file discovery (findFiles), raw text extraction (readFile), and pattern matching (grepFile). Each tool is backed by Index v2’s retrieval APIs, so the agent can chain operations (first locate files, then retrieve chunks, and finally read or grep for exact wording) before citing sources. The agent is instructed to follow this order so that each answer can be traced back to the passages it cites.
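The chained workflow can be illustrated with a minimal Python sketch. It does not use the legal-kb code or the Index v2 APIs: the in-memory CORPUS, the toy keyword scoring in retrieve, and the helper answer_with_citations are all hypothetical stand-ins, and only the four tool roles and their order come from the description above.

```python
import fnmatch
import re

# Hypothetical in-memory corpus standing in for an indexed document repository.
CORPUS = {
    "contracts/nda_acme.txt": (
        "Mutual Non-Disclosure Agreement\n"
        "The Receiving Party shall keep Confidential Information secret.\n"
        "This Agreement terminates after 3 years."
    ),
    "contracts/lease_main_st.txt": (
        "Commercial Lease\n"
        "The Tenant shall pay rent on the first day of each month.\n"
        "Either party may terminate with 60 days written notice."
    ),
    "memos/termination_policy.txt": (
        "Internal memo\n"
        "Termination clauses should always state a notice period."
    ),
}


def find_files(pattern):
    """File discovery (findFiles role): paths matching a glob pattern."""
    return sorted(p for p in CORPUS if fnmatch.fnmatchcase(p, pattern))


def retrieve(query, paths, top_k=2):
    """Search (retrieve role). Assumption: a toy keyword-overlap score,
    not the semantic search a real index would provide."""
    terms = set(re.findall(r"\w+", query.lower()))
    hits = []
    for path in paths:
        for line_no, line in enumerate(CORPUS[path].splitlines(), start=1):
            score = len(terms & set(re.findall(r"\w+", line.lower())))
            if score:
                hits.append((score, path, line_no, line))
    # Highest score first; ties broken by path and line number.
    hits.sort(key=lambda h: (-h[0], h[1], h[2]))
    return hits[:top_k]


def read_file(path):
    """Raw text extraction (readFile role)."""
    return CORPUS[path]


def grep_file(path, regex):
    """Pattern matching (grepFile role): (line number, line) pairs."""
    pattern = re.compile(regex, re.IGNORECASE)
    lines = CORPUS[path].splitlines()
    return [(n, text) for n, text in enumerate(lines, start=1) if pattern.search(text)]


def answer_with_citations(query, glob, regex):
    """Chain the tools in the stated order: locate, retrieve, then grep."""
    paths = find_files(glob)
    citations = []
    for _score, path, _line_no, _line in retrieve(query, paths):
        for line_no, text in grep_file(path, regex):
            cite = {"path": path, "line": line_no, "text": text}
            if cite not in citations:
                citations.append(cite)
    return citations
```

Calling answer_with_citations("terminate notice", "contracts/*.txt", r"terminat") narrows the search to the contracts folder, ranks candidate lines, and returns only passages whose exact wording matches the pattern, each tagged with a file path and line number that could serve as a citation.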
From upload to answer in real time
Uploading a document triggers a background pipeline: files are pushed to LlamaCloud, metadata is stored in PostgreSQL, and an index sync starts automatically. The UI polls until the index is ready, and each re-upload is kept as a new version alongside the earlier versions of the same file. During chat, the agent can query the live index using either OpenAI or Anthropic models, with its reasoning streamed and citations attached. The entire flow, from upload to final citation, happens without leaving the browser-based TanStack Start app.
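The upload, versioning, and polling steps can be sketched in a few lines of Python. Everything named here is an assumption made for illustration: FakeIndex, DocumentStore, upload, and wait_until_ready are hypothetical in-memory stand-ins, not the LlamaCloud client, the PostgreSQL schema, or the app's actual code.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FakeIndex:
    """Hypothetical index whose sync finishes after a fixed number of status checks."""
    checks_until_ready: int = 3
    _checks: int = 0

    def start_sync(self):
        # A new sync resets progress, as a re-index after upload would.
        self._checks = 0

    def status(self):
        self._checks += 1
        return "ready" if self._checks >= self.checks_until_ready else "syncing"


@dataclass
class DocumentStore:
    """Stand-in for the metadata table: filename -> list of version records."""
    versions: dict = field(default_factory=dict)

    def add_version(self, filename, content):
        history = self.versions.setdefault(filename, [])
        # Re-uploads append a new record; earlier versions are never overwritten.
        record = {"version": len(history) + 1, "content": content}
        history.append(record)
        return record


def upload(store, index, filename, content):
    """Record the new version, then kick off an index sync."""
    record = store.add_version(filename, content)
    index.start_sync()
    return record


def wait_until_ready(index, interval=0.0, max_polls=10):
    """Poll the index status; return how many polls it took to become ready."""
    for polls in range(1, max_polls + 1):
        if index.status() == "ready":
            return polls
        time.sleep(interval)
    raise TimeoutError("index did not become ready")
```

The max_polls cap is a design choice in this sketch rather than something the article describes: it keeps a stalled sync from leaving the caller polling indefinitely.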
Source: MarkTechPost. AI-assisted editorial synthesis — TechnoExpress.

