What is Schema Projection?
Schema Projection is DocIntell’s core differentiator: instead of dumping gigabytes of raw OCR data, you define exactly which fields you need and get back only that structured data.
The Problem with Traditional OCR
Traditional OCR APIs return everything they extract - bounding boxes, confidence scores, page coordinates - resulting in massive payloads.
DocIntell’s Approach: Schema Projection
With DocIntell, you define which fields matter and get back structured data:
- 20-2000x smaller payloads - Only the data you need, nothing more
- Ingest once, query many ways - Create multiple views for the same document
- Type-safe schemas - Well-defined field types with validation
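As a conceptual sketch only (DocIntell performs projection server-side, and this is not its implementation), the idea reduces to selecting named fields from a larger extraction result. The `project` helper and the sample values below are hypothetical.

```python
from typing import Any


def project(extraction: dict[str, Any], fields: list[str]) -> dict[str, Any]:
    """Return only the requested fields; fields absent from the extraction map to None."""
    return {name: extraction.get(name) for name in fields}


# Hypothetical full extraction result for one invoice (illustrative values).
raw = {
    "invoice_number": "INV-1001",
    "vendor_name": "Acme Corp",
    "total_amount": "1250.00",
    "due_date": "2024-07-01",
    "vendor_tax_id": "12-3456789",
    "payment_terms": "Net 30",
}

# Two projections of the same extraction, with no re-processing in between.
accounting = project(raw, ["invoice_number", "vendor_name", "total_amount", "due_date"])
compliance = project(raw, ["vendor_tax_id", "payment_terms"])
```

Each projection carries only the fields its consumer asked for, which is where the payload savings come from.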
Discover Available Document Types
Before creating views, discover what document types DocIntell supports and what fields are available for extraction.
List All Document Types
Get a high-level overview of all supported document types (GET /v1/schemas).
Get Full Schema Definition
Retrieve the complete field definitions for a specific document type (for example, GET /v1/schemas/invoice).
Understanding Field Definitions
| Field | Description |
|---|---|
| field_name | Field identifier (snake_case) - use this in views |
| field_type | Data type: string, decimal, date, monetary, boolean, integer, array |
| severity | hard = required field (extraction fails if missing); soft = optional field (extraction continues if missing) |
| is_nullable | Whether the field can be null even if present |
| description | Human-readable explanation of the field |
| pattern | Regex validation pattern (if applicable) |
Field Severity Matters:
- Hard fields are critical and must be present for extraction to succeed
- Soft fields are nice-to-have and won’t fail extraction if missing
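To make the severity rules concrete, here is a minimal client-side sketch that checks extracted values against field definitions shaped like the table above. The definitions, the `check_extraction` helper, and the choice to report null and pattern problems under the same severity as missing fields are all assumptions for illustration, not DocIntell's validation logic.

```python
import re
from typing import Any

# Hypothetical field definitions using the attributes from the table above.
FIELD_DEFS = [
    {"field_name": "invoice_number", "field_type": "string", "severity": "hard",
     "is_nullable": False, "pattern": r"^INV-\d+$"},
    {"field_name": "due_date", "field_type": "date", "severity": "soft",
     "is_nullable": True, "pattern": None},
]


def check_extraction(
    values: dict[str, Any], field_defs: list[dict[str, Any]]
) -> tuple[list[str], list[str]]:
    """Return (errors, warnings): hard-field problems are errors, soft-field problems are warnings."""
    errors: list[str] = []
    warnings: list[str] = []
    for definition in field_defs:
        name = definition["field_name"]
        bucket = errors if definition["severity"] == "hard" else warnings
        if name not in values:
            bucket.append(f"{name}: missing")
            continue
        value = values[name]
        if value is None:
            # A present-but-null value is acceptable only for nullable fields.
            if not definition["is_nullable"]:
                bucket.append(f"{name}: null not allowed")
            continue
        pattern = definition.get("pattern")
        if pattern and not re.fullmatch(pattern, str(value)):
            bucket.append(f"{name}: does not match {pattern}")
    return errors, warnings
```

A non-empty `errors` list corresponds to the case where extraction would fail; `warnings` only flags gaps in optional data.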
Create Custom Views
Views define which fields you want to retrieve when querying document data. Think of them as SQL SELECT statements that filter the extracted data.
Why Use Views?
Multiple Use Cases
Create different views for accounting, compliance, and auditing teams - all from the same extraction.
Reduced Payload Size
Only retrieve the fields you need. A “quick summary” view might return 5 fields instead of 50.
Separation of Concerns
Different teams see different data without re-processing the document.
Version Control
Name views like “accounting_v1” and “accounting_v2” to manage schema evolution.
Creating a View
Create a view by specifying the document type and which fields to include.
Default Views
Set is_default: true to make a view the default for its document type. When you query document data without specifying a view, the default view is used.
Only one default view is allowed per document type. Setting a new default automatically unsets the previous one.
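A sketch of what a create-view call might look like, built with the standard library and deliberately not sent. POST /v1/views and is_default come from this page; the host, the bearer-token header, and every other body key (`name`, `document_type`, `fields`, `description`) are assumptions to check against the API reference.

```python
import json
import urllib.request

BASE_URL = "https://api.docintell.example"  # placeholder host, not a real endpoint


def build_create_view_request(api_key, name, document_type, fields,
                              is_default=False, description=""):
    """Build (but do not send) a POST /v1/views request."""
    body = {
        "name": name,                    # assumed key
        "document_type": document_type,  # assumed key
        "fields": fields,                # assumed key
        "description": description,      # assumed key
        "is_default": is_default,        # flag described in the text above
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/views",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_view_request(
    "YOUR_API_KEY",
    name="accounting_v1",
    document_type="invoice",
    fields=["invoice_number", "vendor_name", "total_amount", "due_date"],
    is_default=True,
)
```

Passing `req` to `urllib.request.urlopen` would send it once the host, auth scheme, and body keys match your account.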
List Your Views
See all views you’ve created.
Update a View
Modify an existing view (fields, description, or default status).
Delete a View
Remove a view you no longer need. A successful delete returns 204 No Content.
Query Data with Views
Once you’ve created views, use them to retrieve extracted document data filtered to exactly the fields you need.
Query with a Specific View
Retrieve document data using a named view.
Query with Default View
If you don’t specify a view, the default view for the document type is used.
Include Field Metadata
Get additional metadata for each field (confidence scores, page numbers, etc.).
Field metadata is only available if you enable include_metadata=true. It’s disabled by default to reduce payload size.
Query the Same Document with Different Views
This is where Schema Projection shines - query the same document multiple ways.
Example: Accounting vs. Compliance Views
Accounting View (6 fields for AP processing)
Compliance View (8 fields for audit trail)
Same document, same extraction, different views - no re-processing.
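The queries in this section can be sketched as URL construction. Only include_metadata=true is taken from this page; the host, the `/v1/documents/{document_id}/data` path, and the `view` query parameter name are hypothetical stand-ins for whatever the API reference specifies.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.docintell.example"  # placeholder host, not a real endpoint


def build_query_url(document_id, view=None, include_metadata=False):
    """Build a data-query URL; the path and the `view` parameter name are assumptions."""
    params = {}
    if view is not None:
        params["view"] = view  # omit to fall back to the default view
    if include_metadata:
        params["include_metadata"] = "true"  # off by default to keep payloads small
    url = f"{BASE_URL}/v1/documents/{document_id}/data"
    return f"{url}?{urlencode(params)}" if params else url


# Same document, three different queries.
default_url = build_query_url("doc_123")
accounting_url = build_query_url("doc_123", view="accounting_v1")
review_url = build_query_url("doc_123", view="compliance_v1", include_metadata=True)
```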
Best Practices
1. Create Views for Each Use Case
Don’t use a single “all fields” view for everything. Create specific views for each team or workflow:
Accounting Team
accounting_v1: invoice_number, vendor_name, total_amount, due_date
Compliance Team
compliance_v1: vendor_tax_id, payment_terms, approved_by, approval_date
Audit Team
audit_v1: All financial fields + approval workflow fields
Quick Summary
summary_v1: Just 3-5 key fields for dashboards
2. Use Semantic Versioning for View Names
Plan for schema evolution by putting a version suffix in your view names (accounting_v1, accounting_v2):
- Migrate gradually - New code uses v2, old code continues using v1
- A/B test schema changes - Compare v1 vs v2 side-by-side
- Roll back if needed - Switch back to v1 if v2 has issues
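If view names follow the `<base>_v<N>` pattern used on this page, client code can pick a version explicitly instead of hard-coding strings. The helpers below are a hypothetical client-side convenience, not part of the DocIntell API.

```python
import re

# Matches names such as "accounting_v1": a base name followed by "_v" and an integer.
_VIEW_NAME = re.compile(r"^(?P<base>[a-z0-9_]+?)_v(?P<version>\d+)$")


def parse_view_name(name):
    """Split 'accounting_v2' into ('accounting', 2)."""
    match = _VIEW_NAME.match(name)
    if match is None:
        raise ValueError(f"not a versioned view name: {name!r}")
    return match.group("base"), int(match.group("version"))


def latest_view(names, base):
    """Return the highest-versioned view name for the given base."""
    candidates = []
    for name in names:
        parsed_base, version = parse_view_name(name)
        if parsed_base == base:
            candidates.append((version, name))
    if not candidates:
        raise LookupError(f"no views named {base}_vN")
    # Compare numerically so that v10 sorts after v2.
    return max(candidates)[1]
```

Rolling back is then a matter of pinning the older name (for example accounting_v1) rather than calling `latest_view`.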
3. Set Default Views for Common Queries
Make your most common view the default.
4. Validate Fields Before Creating Views
Always check the schema first to ensure your fields exist.
5. Use include_metadata Sparingly
Only request field metadata when you actually need it (e.g., for quality review).
6. Document Your Views
Maintain a mapping of views to use cases in your documentation.
Error Handling
Invalid Fields
If you try to create a view with fields that don’t exist in the schema, the API returns 400 Bad Request.
Fix: Check the schema (GET /v1/schemas/invoice) for valid field names.
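One way to catch this before sending the request is to compare the requested names against the schema locally. The sketch assumes the schema response can be reduced to a list of definitions that each carry a field_name; that shape and the `find_invalid_fields` helper are assumptions.

```python
def find_invalid_fields(requested, schema_fields):
    """Return requested names that are not in the schema, preserving request order."""
    valid = {definition["field_name"] for definition in schema_fields}
    return [name for name in requested if name not in valid]


# Hypothetical excerpt of the invoice schema's field definitions.
invoice_schema_fields = [
    {"field_name": "invoice_number"},
    {"field_name": "vendor_name"},
    {"field_name": "total_amount"},
]

invalid = find_invalid_fields(
    ["invoice_number", "total_amont", "vendor_name"],  # note the deliberate typo
    invoice_schema_fields,
)
```

An empty result means every requested field exists and the view can be created; otherwise fix the listed names first.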
View Not Found
If you query with a view that doesn’t exist, the API returns 404 Not Found.
Fix: Check your view name or create the view first (POST /v1/views).
Document Type Not Found
If you try to create a view for an unsupported document type, the API returns 404 Not Found.
Fix: List available document types (GET /v1/schemas).
Document Not Ready
If you query data before extraction completes, the API returns 400 Bad Request.
Fix: Wait for extraction to complete (check job status with GET /v1/jobs/{job_id}).
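Waiting for extraction usually means polling the job until it reaches a terminal state. The sketch below takes the status lookup as a callable, so it does not assume a response format; the status strings "completed" and "failed" are placeholders for whatever GET /v1/jobs/{job_id} actually reports.

```python
import time
from typing import Callable

TERMINAL_STATUSES = ("completed", "failed")  # assumed status names


def wait_for_extraction(
    get_status: Callable[[], str],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status until it returns a terminal status or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        # Give up if another full interval would run past the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

In real use, `get_status` would wrap a call to the job endpoint and return its status field; query the document data only once the returned status indicates success.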