U1-OCR-Parser

Light in weight, heavy on understanding

SOTA parsing across formats with one-step structured output.

U1-OCR-Parser: Light in weight, heavy on understanding

U1-OCR-Parser is a lightweight, high-performance document understanding model with just 0.9B parameters. Its core strengths are a compact size and high accuracy, purpose-built for enterprise document parsing. It accurately parses complex layouts, formulas, complex tables, and mixed-format documents, and natively outputs standardized structured results such as Markdown, JSON, and HTML, helping enterprises quickly complete content ingestion, retrieval, information extraction, and RAG knowledge base construction.

SOTA on authoritative benchmarks—industrial results you can measure and compare

94.63pts

OmniDocBench V1 overall score

93.93pts

D4LA overall F1 score

2.17pages / sec

Much higher PDF parsing throughput

0.63images / sec

More efficient high-resolution image processing

Key strengths

Higher parsing accuracy

Industrial-grade OCR with leading complex layout and reading-order recovery.

Structured output, ready to use

Standard JSON/Markdown layout signals for human review, search, and downstream development.

Multilingual and multi-format

Chinese and multilingual documents; office files, PDFs, images, scans, and more.

Downstream friendly

A stable, high-quality structural base for extraction, retrieval, Q&A, and knowledge ingestion.

Technical highlights

Complex tables, tuned

Merged cells, multi-level headers, nested and irregular tables restored reliably.

Strong on formulas and mixed layouts

Papers, research reports, textbooks, exam sheets, and other dense layouts.

Unusual content supported

Handwriting, stamps, seals, code, and special symbols with stable recognition.

Multi-engine deployment

VLLM, FastDeploy, and SGLang for flexible deployment options.

Use cases

General document digitization

One-click parsing for Word, PPT, Excel, PDF, screenshots, photos, and scans—searchable content, faster.

Complex tables to structured data

Government, research, and finance ledgers structured in one step—less manual entry and reformatting.

RAG knowledge bases

Better ingestion quality; fewer issues from "fragmentation, errors, and messy layout" in QA and search.

Archives and compliance

Structured output for archiving, search, audit, and reuse.

Capabilities

1) Complex layout parsing

Recognizes titles, body text, chart regions, and how content is organized—supports multi-column and mixed layouts.

2) Formulas and complex tables

Structured restoration for merged cells, multi-level headers, nested and irregular tables.

3) Unconventional text

Handwriting, stamps, seals, code, and special symbols supported.

4) Structured output

Native Markdown, JSON, and HTML—clean, reusable results.

5) Multilingual

Major languages including Chinese, English, Russian, Arabic, and Hindi, plus minority scripts—ready for regional and multilingual document parsing.

Get started

Flexible pricing, tailored solutions, and private deployment