Lift can complement OCR workflows, but it focuses on structured extraction by schema, not only text recognition.

Can Lift run locally?

Datalab lists local and server-based options, including Hugging Face, vLLM, SGLang, and Docker Model Runner.

What is Lift not good for?

For vague one-off document questions without a fixed target schema, a research or chat workflow is often a better fit.

Datalab Lift: structured extraction from PDFs

What this is about

Lift is a 9B vision model from Datalab for structured document extraction. The user provides a JSON schema, the model reads PDF or image documents, and it returns a JSON object that should match that schema. That makes Lift a concrete workflow tool, not a general chat assistant.

The reason to look at it now: Datalab introduced Lift as an open-weights model and made it available on Hugging Face. For teams that currently combine OCR, regex rules, and manual cleanup, this is a testable alternative.

What Lift actually does

Lift processes documents visually and extracts fields into a predefined structure. On the Hugging Face page, Datalab describes schema-constrained decoding: the model should not just answer in free text, but return valid, typed fields. Datalab also lists local and server-based usage paths, including vLLM, SGLang, Docker Model Runner, and compatible local apps.

The goal is not to summarize every paragraph. Lift is useful when the required fields are known in advance: invoice number, delivery date, total amount, table rows, contract parties, or measurement values.

Why it matters

Document data is the messy edge of automation in many companies. An ERP system expects structured fields, but the input arrives as a PDF, scan, or photo. Traditional OCR finds text, but it does not automatically know which number is the net amount and which number is a line-item quantity.

Lift aims at exactly that gap. In its announcement, Datalab describes an internal benchmark set with 225 documents, long multi-page cases, and roughly 11,000 scored fields. Independent tests still need to show how stable this is on other document classes. Even so, the value is clear: anyone building an extraction pipeline can compare Lift against existing OCR rules, commercial APIs, and manual samples.

In plain language

Lift is like a careful clerk who receives a form with empty boxes. It reads the stack of papers and fills only the boxes that fit, instead of writing a long essay about the document.

A practical example

A logistics team receives 8,000 freight invoices per month from 40 providers. The target schema contains 18 fields: invoice number, sender, recipient, shipment number, currency, net amount, tax, gross amount, and several line-item lists. The team tests Lift on 300 historical invoices. If at least 95 percent of required fields are correct and uncertain cases go to human review, Lift can first enter the accounting workflow as preprocessing.

Scope and limits

Lift needs a good schema. If the required fields are unclear, the process will not be stable.
Open weights do not mean easy operation; a 9B vision model needs suitable hardware or a hosted service.
Critical documents need validation, sampling, and fallbacks because wrong extraction can have financial or legal consequences.

The sensible next test is a small golden-set comparison: 100 real documents, human-verified target fields, Lift output, error rate per field, and a decision on which fields can be automated and which should stay assisted.

SEO & GEO keywords

Datalab Lift, document extraction, structured JSON extraction, PDF AI, vision language model, OCR workflow, invoice automation, open weights model, Hugging Face, schema-constrained decoding, document intelligence, AI data extraction

Datalab Lift extracts document data by schema

What this is about

What Lift actually does

Why it matters

In plain language

A practical example

Scope and limits

SEO & GEO keywords

💡 In plain English

Key Takeaways

FAQ

Is Lift an OCR tool?

Can Lift run locally?

What is Lift not good for?

Sources & Context