After DocIntell processes your document, you receive two types of results: classification (what type of document it is) and extraction (the actual data pulled from the document). This guide explains how to retrieve and understand these results.
What You Get After Processing
Once a document completes processing (status: completed), you have access to:
Classification - What type of document was detected and why
Extraction - The actual data extracted from the document
Field Metadata - Confidence scores, page numbers, and source locations for each field
Validation Results - Whether the extraction passed validation rules
Use GET /v1/jobs/{job_id}/results to retrieve the complete extraction output for a job:
curl -X GET "https://api.docintell.com/v1/jobs/550e8400-e29b-41d4-a716-446655440000/results" \
  -H "Authorization: Bearer dk_live_YOUR_API_KEY"
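The same request can be issued from Python. The following is a minimal sketch using only the standard library. The helper names build_results_request and fetch_results are illustrative and not part of any DocIntell SDK, and the base URL is taken from the curl example.

```python
import json
import urllib.request

BASE_URL = "https://api.docintell.com/v1"  # taken from the curl example


def build_results_request(job_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for /jobs/{job_id}/results (hypothetical helper)."""
    return urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}/results",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_results(job_id: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = build_results_request(job_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from sending keeps the URL and header logic checkable without network access.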
When to use /results vs /documents/{id}/data:
Use /jobs/{job_id}/results for full extraction output with all metadata
Use /documents/{id}/data for filtered data based on your custom views (see Views Guide)
Understanding Classification
The classification tells you what type of document was detected and why.
Classification Fields
document_type - The detected document type code (e.g., invoice, capital_call, k1)
confidence - Classification confidence score from 0.0 to 1.0 (higher is more confident)
reasoning - 1-2 sentence explanation of why this type was chosen
citation - Direct quote from the document that supports the classification
citation_page - Page number where the citation was found (1-indexed)
Example Classification
{
  "classification": {
    "document_type": "capital_call",
    "confidence": 0.95,
    "reasoning": "Document is a capital call notice from a private equity fund requesting capital contribution from limited partners.",
    "citation": "CAPITAL CALL NOTICE - Fund IV, L.P.",
    "citation_page": 1
  }
}
How is the document type determined?
DocIntell uses a multi-stage classification process:
Visual analysis - Layout, headers, and document structure
Text analysis - Key phrases, terminology, and language patterns
Template matching - Common document formats (W-9, K-1, invoices, etc.)
The confidence score reflects how strongly the document matches the identified type. Scores above 0.90 are typically very reliable.
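One way to act on the classification is to gate downstream processing on its confidence and show a reviewer the supporting citation. This is only a sketch: the 0.90 cutoff mirrors the guidance here but is a starting point rather than a guarantee, and both function names are hypothetical.

```python
def needs_classification_review(classification: dict, threshold: float = 0.90) -> bool:
    """Return True when the classification confidence is below the threshold.

    A missing confidence is treated as 0.0, so it always triggers review.
    """
    return classification.get("confidence", 0.0) < threshold


def classification_summary(classification: dict) -> str:
    """Format the type, confidence, and supporting citation for a reviewer."""
    return (
        f'{classification["document_type"]} ({classification["confidence"]:.2f}): '
        f'"{classification["citation"]}" on page {classification["citation_page"]}'
    )


example = {
    "document_type": "capital_call",
    "confidence": 0.95,
    "reasoning": "Document is a capital call notice from a private equity fund.",
    "citation": "CAPITAL CALL NOTICE - Fund IV, L.P.",
    "citation_page": 1,
}
print(needs_classification_review(example))
print(classification_summary(example))
```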
Understanding Extracted Data
The extraction contains the actual data pulled from your document.
document_type - Document type (matches classification)
page_count - Number of pages in the document
extraction_model - LLM used for extraction (e.g., google-vertex:gemini-2.5-flash)
processing_time_ms - How long extraction took, in milliseconds
data - The extracted fields as key-value pairs. Field names are in snake_case.
field_metadata - Per-field metadata including confidence scores, page numbers, and source locations
validation - Validation results with hard/soft violations
{
  "extraction": {
    "document_type": "invoice",
    "page_count": 2,
    "extraction_model": "google-vertex:gemini-2.5-flash",
    "processing_time_ms": 3500,
    "data": {
      "invoice_number": "INV-2024-0892",
      "invoice_date": "2024-12-01",
      "due_date": "2024-12-31",
      "vendor_name": "Acme Corp",
      "total_amount": 1234.56,
      "currency": "USD",
      "line_items": [
        {
          "description": "Professional Services",
          "quantity": 40,
          "unit_price": 150.00,
          "amount": 6000.00
        }
      ]
    },
    "field_metadata": {
      "invoice_number": {
        "confidence": 0.98,
        "page_number": 1,
        "location_hint": "top right header",
        "raw_text": "INV-2024-0892"
      },
      "total_amount": {
        "confidence": 0.95,
        "page_number": 1,
        "location_hint": "bottom of page, summary section",
        "raw_text": "$1,234.56"
      }
    },
    "validation": {
      "is_valid": true,
      "hard_violations": [],
      "soft_violations": []
    }
  }
}
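Since data and field_metadata are keyed by the same field names, they can be joined into one row per field. A minimal sketch follows; fields_with_metadata is a hypothetical helper. It tolerates fields that have no metadata entry, as is the case for several fields in this example.

```python
def fields_with_metadata(extraction: dict) -> list[dict]:
    """Pair each top-level field in `data` with its metadata entry, if one exists."""
    metadata = extraction.get("field_metadata", {})
    rows = []
    for name, value in extraction.get("data", {}).items():
        meta = metadata.get(name, {})
        rows.append(
            {
                "field": name,
                "value": value,
                # None when the response carries no metadata for this field
                "confidence": meta.get("confidence"),
                "page_number": meta.get("page_number"),
            }
        )
    return rows


sample_extraction = {
    "data": {"invoice_number": "INV-2024-0892", "currency": "USD"},
    "field_metadata": {
        "invoice_number": {"confidence": 0.98, "page_number": 1},
    },
}
for row in fields_with_metadata(sample_extraction):
    print(row)
```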
Field metadata provides provenance and confidence information for each extracted field.
confidence - Self-reported confidence from the LLM (0.0 to 1.0). Directionally useful but not calibrated: a confidence of 0.90 does not mean 90% accuracy.
page_number - Page where the value was found (1-indexed). Useful for manual verification.
location_hint - Qualitative description of where on the page the value appears (e.g., “top header”, “in summary table”, “footer”)
raw_text - The original text as it appeared in the document before parsing
{
  "field_metadata": {
    "invoice_number": {
      "confidence": 0.98,
      "page_number": 1,
      "location_hint": "top right header",
      "raw_text": "INV-2024-0892"
    },
    "total_amount": {
      "confidence": 0.95,
      "page_number": 1,
      "location_hint": "bottom of page, summary section",
      "raw_text": "$1,234.56"
    },
    "due_date": {
      "confidence": 0.92,
      "page_number": 1,
      "location_hint": "near invoice date in header",
      "raw_text": "Due: December 31, 2024"
    }
  }
}
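Because raw_text preserves the original string, it can be used to spot-check a parsed numeric value. This is a rough sketch, assuming US-style number formatting (comma thousands separators, period as decimal point). The helper names are illustrative and not part of the DocIntell API.

```python
import re
from decimal import Decimal
from typing import Optional

# First run of digits, allowing thousands separators and an optional decimal part
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def amount_from_raw_text(raw_text: str) -> Optional[Decimal]:
    """Pull the first number out of text such as "Total: $1,234.56"; None if absent."""
    match = _NUMBER.search(raw_text)
    if match is None:
        return None
    return Decimal(match.group().replace(",", ""))


def raw_text_matches(value: float, raw_text: str) -> bool:
    """Compare a parsed numeric field against the number found in its raw_text."""
    parsed = amount_from_raw_text(raw_text)
    # str() gives the short decimal form of the float, avoiding binary noise
    return parsed is not None and parsed == Decimal(str(value))


print(raw_text_matches(1234.56, "$1,234.56"))
```

A mismatch does not prove the extraction is wrong (the document may use a different number format), but it is a cheap signal for routing a field to manual review.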
Confidence Score Guidelines (rules of thumb; scores are self-reported and not calibrated):
0.95 and above - Very high confidence (rarely wrong)
0.85 to below 0.95 - High confidence (generally reliable)
0.70 to below 0.85 - Moderate confidence (worth verifying)
Below 0.70 - Low confidence (manual review recommended)
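These bands translate directly into a small lookup. The sketch below uses hypothetical names. It treats each band as running up to the next threshold, so a score such as 0.945 falls in the "high" band, and the default review cutoff of 0.85 is an assumption you should tune.

```python
def confidence_band(score: float) -> str:
    """Map a confidence score to one of the guideline bands."""
    if score >= 0.95:
        return "very_high"
    if score >= 0.85:
        return "high"
    if score >= 0.70:
        return "moderate"
    return "low"


def fields_to_verify(field_metadata: dict, minimum: float = 0.85) -> list[str]:
    """List field names whose confidence falls below the chosen minimum."""
    return sorted(
        name
        for name, meta in field_metadata.items()
        if meta.get("confidence", 0.0) < minimum
    )


metadata = {
    "invoice_number": {"confidence": 0.98},
    "due_date": {"confidence": 0.92},
    "payment_terms": {"confidence": 0.74},
}
print({name: confidence_band(m["confidence"]) for name, m in metadata.items()})
print(fields_to_verify(metadata))
```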
Understanding Validation Results
Validation checks whether the extracted data meets expected constraints.
Validation Types
Hard Violations - Critical errors that indicate extraction failure
Soft Violations - Warnings that may require attention but don’t fail the extraction
is_valid - true if all hard constraints passed, false if any hard violations exist
hard_violations - List of critical validation failures
soft_violations - List of warnings or optional field issues
Example Validation (Passing)
{
  "validation": {
    "is_valid": true,
    "hard_violations": [],
    "soft_violations": [
      {
        "field": "swift_code",
        "severity": "soft",
        "message": "Optional field 'swift_code' not found in document"
      }
    ]
  }
}
Example Validation (Failing)
{
  "validation": {
    "is_valid": false,
    "hard_violations": [
      {
        "field": "due_date",
        "severity": "hard",
        "message": "Required field 'due_date' is missing"
      },
      {
        "field": "total_amount",
        "severity": "hard",
        "message": "Field 'total_amount' failed validation: must be a positive number"
      }
    ],
    "soft_violations": []
  }
}
When is_valid is false, the extracted data may be incomplete or unreliable. Review hard_violations to understand what went wrong.
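In client code, one reasonable pattern is to stop on hard violations and carry soft violations forward as warnings. This is a minimal sketch: ExtractionInvalid and check_validation are hypothetical names, not part of the API.

```python
class ExtractionInvalid(Exception):
    """Raised when the validation object reports hard violations."""


def check_validation(validation: dict) -> list[str]:
    """Raise on hard violations; return soft-violation messages as warnings."""
    hard = validation.get("hard_violations", [])
    if hard or not validation.get("is_valid", False):
        details = "; ".join(f'{v["field"]}: {v["message"]}' for v in hard)
        raise ExtractionInvalid(details or "is_valid is false")
    return [v["message"] for v in validation.get("soft_violations", [])]


passing = {
    "is_valid": True,
    "hard_violations": [],
    "soft_violations": [
        {
            "field": "swift_code",
            "severity": "soft",
            "message": "Optional field 'swift_code' not found in document",
        }
    ],
}
for warning in check_validation(passing):
    print("warning:", warning)
```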
Complete Example: Invoice
Here’s a full response for an invoice extraction:
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "document_id": "6789def0-abcd-4567-ef01-23456789abcd",
  "status": "completed",
  "classification": {
    "document_type": "invoice",
    "confidence": 0.96,
    "reasoning": "Document is a vendor invoice with itemized charges and payment terms.",
    "citation": "INVOICE",
    "citation_page": 1
  },
  "extraction": {
    "document_type": "invoice",
    "page_count": 2,
    "extraction_model": "google-vertex:gemini-2.5-flash",
    "processing_time_ms": 3500,
    "data": {
      "invoice_number": "INV-2024-0892",
      "invoice_date": "2024-12-01",
      "due_date": "2024-12-31",
      "vendor_name": "Acme Corp",
      "vendor_address": "123 Main St, San Francisco, CA 94105",
      "customer_name": "ABC Capital Partners",
      "total_amount": 1234.56,
      "currency": "USD",
      "payment_terms": "Net 30",
      "line_items": [
        {
          "description": "Professional Services - November 2024",
          "quantity": 40,
          "unit_price": 150.00,
          "amount": 6000.00
        },
        {
          "description": "Software License",
          "quantity": 1,
          "unit_price": 500.00,
          "amount": 500.00
        }
      ]
    },
    "field_metadata": {
      "invoice_number": {
        "confidence": 0.98,
        "page_number": 1,
        "location_hint": "top right header",
        "raw_text": "INV-2024-0892"
      },
      "invoice_date": {
        "confidence": 0.97,
        "page_number": 1,
        "location_hint": "header section below invoice number",
        "raw_text": "Date: December 1, 2024"
      },
      "total_amount": {
        "confidence": 0.95,
        "page_number": 1,
        "location_hint": "bottom of page, summary section",
        "raw_text": "Total: $1,234.56"
      },
      "vendor_name": {
        "confidence": 0.99,
        "page_number": 1,
        "location_hint": "top left header",
        "raw_text": "Acme Corp"
      }
    },
    "validation": {
      "is_valid": true,
      "hard_violations": [],
      "soft_violations": []
    }
  }
}
Complete Example: Capital Call
Here’s a full response for a capital call extraction:
{
  "job_id": "660e8400-e29b-41d4-a716-446655440001",
  "document_id": "7890abc1-def2-5678-9012-345678901234",
  "status": "completed",
  "classification": {
    "document_type": "capital_call",
    "confidence": 0.95,
    "reasoning": "Document is a capital call notice from a private equity fund requesting capital contribution from limited partners.",
    "citation": "CAPITAL CALL NOTICE - Fund IV, L.P.",
    "citation_page": 1
  },
  "extraction": {
    "document_type": "capital_call",
    "page_count": 3,
    "extraction_model": "google-vertex:gemini-2.5-flash",
    "processing_time_ms": 4200,
    "data": {
      "fund_name": "ABC Partners Fund IV, L.P.",
      "call_reference": "CC-2024-Q4-001",
      "notice_date": "2024-12-01",
      "due_date": "2024-12-15",
      "call_amount_lp": 4500000.00,
      "call_amount_fund": 50000000.00,
      "lp_ownership_percentage": 9.0,
      "investment_amount": 4200000.00,
      "management_fee_amount": 250000.00,
      "other_expenses_amount": 50000.00,
      "bank_name": "Silicon Valley Bank",
      "account_number": "****1234",
      "routing_number": "121000248",
      "swift_code": "SVBKUS6S",
      "wire_reference": "ABC Fund IV - CC-2024-Q4-001"
    },
    "field_metadata": {
      "fund_name": {
        "confidence": 0.99,
        "page_number": 1,
        "location_hint": "top of page, main header",
        "raw_text": "ABC Partners Fund IV, L.P."
      },
      "call_amount_lp": {
        "confidence": 0.96,
        "page_number": 1,
        "location_hint": "summary table, highlighted row",
        "raw_text": "$4,500,000.00"
      },
      "due_date": {
        "confidence": 0.98,
        "page_number": 1,
        "location_hint": "prominently displayed below header",
        "raw_text": "Payment Due: December 15, 2024"
      },
      "bank_name": {
        "confidence": 0.97,
        "page_number": 2,
        "location_hint": "wire instructions section",
        "raw_text": "Silicon Valley Bank"
      },
      "swift_code": {
        "confidence": 0.94,
        "page_number": 2,
        "location_hint": "wire instructions section",
        "raw_text": "SWIFT: SVBKUS6S"
      }
    },
    "validation": {
      "is_valid": true,
      "hard_violations": [],
      "soft_violations": [
        {
          "field": "call_amount_calculation",
          "severity": "soft",
          "message": "LP call amount ($4,500,000) does not exactly match fund call ($50,000,000) × ownership (9.0%) = $4,500,000. Difference: $0 (within tolerance)."
        }
      ]
    }
  }
}
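The soft violation in this example refers to a simple arithmetic relationship: $50,000,000 × 9.0% = $4,500,000. That check can be reproduced client-side. The sketch below also assumes that investment_amount, management_fee_amount, and other_expenses_amount sum to call_amount_lp. This holds for the numbers in this example ($4,200,000 + $250,000 + $50,000 = $4,500,000) but is not stated as a rule by the API. The function name and the one-cent tolerance are hypothetical.

```python
import math


def capital_call_checks(data: dict, tolerance: float = 0.01) -> dict:
    """Recompute the LP share and the component sum from extracted fields."""
    expected_lp = data["call_amount_fund"] * data["lp_ownership_percentage"] / 100
    components = (
        data["investment_amount"]
        + data["management_fee_amount"]
        + data["other_expenses_amount"]
    )
    lp = data["call_amount_lp"]
    return {
        # fund call x ownership percentage should equal the LP call
        "ownership_share_matches": math.isclose(lp, expected_lp, rel_tol=0.0, abs_tol=tolerance),
        # assumption: the three components add up to the LP call
        "components_sum_matches": math.isclose(lp, components, rel_tol=0.0, abs_tol=tolerance),
    }


capital_call = {
    "call_amount_lp": 4500000.00,
    "call_amount_fund": 50000000.00,
    "lp_ownership_percentage": 9.0,
    "investment_amount": 4200000.00,
    "management_fee_amount": 250000.00,
    "other_expenses_amount": 50000.00,
}
print(capital_call_checks(capital_call))
```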
Error Handling
Job Not Completed
If you try to get results before the job completes:
{
  "error": "job_not_completed",
  "message": "Job not completed. Current status: processing. Results are only available for completed jobs."
}
HTTP Status: 400 Bad Request
Fix: Wait for the job to complete or use webhooks for notifications.
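If webhooks are not an option, one approach is to retry the results request with a growing delay while the API keeps answering job_not_completed. This is a minimal sketch. It assumes a successful response uses HTTP 200 (not stated above) and that fetch is a caller-supplied function returning the status code and decoded JSON body. The names wait_for_results, fetch, and max_delay are illustrative.

```python
import time
from typing import Callable, Tuple


def wait_for_results(
    fetch: Callable[[], Tuple[int, dict]],
    max_attempts: int = 10,
    delay_seconds: float = 2.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until it succeeds, backing off while the job is still running."""
    for attempt in range(max_attempts):
        status, body = fetch()
        if status == 200:  # assumed success code
            return body
        if status == 400 and body.get("error") == "job_not_completed":
            if attempt < max_attempts - 1:
                # exponential backoff, capped at max_delay
                sleep(min(delay_seconds * (2 ** attempt), max_delay))
            continue
        # Any other error (for example 404 not_found) will not resolve by waiting
        raise RuntimeError(f"Unexpected response {status}: {body.get('message')}")
    raise TimeoutError(f"Job still not completed after {max_attempts} attempts")
```

The sleep parameter is injectable so the retry logic can be exercised in tests without real delays.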
Job Not Found
{
  "error": "not_found",
  "message": "Job not found: 550e8400-e29b-41d4-a716-446655440000. It may not exist or you may not have access to it."
}
HTTP Status: 404 Not Found
Possible Causes:
Job ID does not exist
Job belongs to a different tenant
Typo in the job ID
Best Practices
Check Confidence Scores - Review field-level confidence scores for critical data. Fields with low confidence may need manual verification.
Use Page Numbers - The page_number and location_hint fields help you quickly locate and verify extracted values in the original PDF.
Handle Soft Violations - Soft violations are warnings, not errors. They may indicate missing optional fields or minor inconsistencies.
Log Validation Failures - When is_valid is false, log the hard_violations for debugging and quality monitoring.