Kshitij.ai
Client-Side, Interactive

Document Processing Pipeline

Watch documents flow through OCR, parsing, chunking, and embedding stages in real time. Click "Run Pipeline Demo" to see each processing stage animate with realistic timing.

Document Upload

Drop a document to start the pipeline

Supports PDF, DOCX, TXT, images (OCR)
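
As an illustration of how the supported types might be routed, here is a minimal sketch that picks an extraction path from the file extension. The function name `extraction_route`, the route labels, and the specific image extensions are assumptions for illustration; the page itself only lists "PDF, DOCX, TXT, images (OCR)". A real pipeline would likely also need OCR for scanned PDFs that have no text layer.

```python
from pathlib import Path

# Hypothetical extension sets; the page only names the broad categories.
TEXT_EXTENSIONS = {".pdf", ".docx", ".txt"}
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tiff"}


def extraction_route(filename: str) -> str:
    """Return "ocr" for images and "text" for PDF/DOCX/TXT files."""
    suffix = Path(filename).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "ocr"
    if suffix in TEXT_EXTENSIONS:
        return "text"
    raise ValueError(f"unsupported file type: {suffix or filename}")
```

Called on the sample file shown on this page, `extraction_route("sample_research_paper.pdf")` takes the direct text path, while an image upload would be sent to OCR.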

sample_research_paper.pdf, 2.4 MB, 12 pages

Pipeline Stages

1/6 Ingestion: pending

2/6 OCR / Text Extraction: pending

3/6 Parsing: pending

4/6 Chunking: pending

5/6 Embedding Generation: pending

6/6 Vector Storage: pending
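
The six stages above could be driven by a simple sequential runner, sketched here under stated assumptions. Only the stage names and the "pending" status come from this page; the `Stage` class, the `run_pipeline` function, the "running" and "done" statuses, and the log line format are hypothetical. The sketch performs no real OCR, chunking, or embedding work; it only shows the status transitions.

```python
from dataclasses import dataclass
from typing import Callable, List

# Stage names as listed on the page, in order.
STAGE_NAMES = [
    "Ingestion",
    "OCR / Text Extraction",
    "Parsing",
    "Chunking",
    "Embedding Generation",
    "Vector Storage",
]


@dataclass
class Stage:
    name: str
    status: str = "pending"  # the only status shown on the page


def run_pipeline(stages: List[Stage], log: Callable[[str], None]) -> None:
    """Advance each stage from pending to running to done, in order."""
    total = len(stages)
    for index, stage in enumerate(stages, start=1):
        stage.status = "running"  # assumed status name
        log(f"[{index}/{total}] {stage.name}: running")
        # A real pipeline would do the stage's work here.
        stage.status = "done"  # assumed status name
        log(f"[{index}/{total}] {stage.name}: done")


stages = [Stage(name) for name in STAGE_NAMES]
log_lines: List[str] = []
run_pipeline(stages, log_lines.append)
```

Passing the logger as a callable keeps the runner independent of where the log is shown, so the same loop could feed a console, a list as above, or an on-page log panel.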

Processing Log

$ awaiting pipeline start...

Tesseract OCR, LangChain Splitters, Sentence Transformers, HNSW Index, Vector Store, Document AI