LLM Structured Outputs — Schema Design, Validation, and Retry Patterns for Production AI Systems

Why Freeform LLM Output Breaks Production

Every team that integrates an LLM into a production system eventually hits the same wall: the model returns text, and the application needs data. At first, string parsing feels manageable: a few regex patterns, a JSON block extracted with a slice, a hand-rolled parser that handles the two output variants you have seen so far. Then it's 3 AM on a Tuesday, the model has started appending a markdown code fence it didn't include yesterday, and your JSON parser throws an exception that a broad except clause swallows, silently dropping an entire batch of invoice records.

The problem compounds downstream. REST consumers expecting a typed InvoiceRecord object get None because the deserialization silently failed. Analytics pipelines see nulls where they expected amounts. An ML feature pipeline skips rows with missing fields and quietly degrades model accuracy over the next three weeks. None of these failures produce an obvious error — they produce silent data corruption.

The root cause is always the same: the contract between the LLM and the consuming system was expressed in natural language rather than a machine-verifiable schema. The fix is to move validation to the boundary — the moment the model response enters your system — using tools designed for exactly this purpose.

Note

Validate at the boundary, not deep in business logic. The further a malformed LLM response travels before being caught, the more state it will have corrupted. A Pydantic validation error raised at the API response handler is far easier to recover from than a KeyError raised six function calls deep in a payment processing flow.
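As a minimal sketch of boundary validation, the snippet below turns raw model text into a typed record or raises a single, explicit exception. The InvoiceRecord schema, the MalformedLLMResponse exception, and the helper names are hypothetical and exist only for illustration. The fence handling assumes the model wraps its JSON in at most one markdown code fence with an optional language tag.

```python
from pydantic import BaseModel, ValidationError

FENCE = "`" * 3  # markdown code fence marker, built indirectly to keep this listing fence-safe


class InvoiceRecord(BaseModel):
    # Hypothetical two-field schema, for illustration only
    invoice_number: str
    total_amount: float


class MalformedLLMResponse(Exception):
    """Raised at the boundary when model output cannot become a record."""


def strip_code_fence(text: str) -> str:
    """Remove one wrapping markdown fence (with optional language tag) if present."""
    text = text.strip()
    if text.startswith(FENCE) and text.endswith(FENCE) and len(text) >= 2 * len(FENCE):
        inner = text[len(FENCE):-len(FENCE)]
        first_line, sep, rest = inner.partition("\n")
        # Drop the opening line only when it is empty or a language tag such as "json"
        if sep and (not first_line.strip() or first_line.strip().isalpha()):
            inner = rest
        return inner.strip()
    return text


def parse_at_boundary(raw_text: str) -> InvoiceRecord:
    """Validate the moment the response enters the system."""
    try:
        # model_validate_json reports invalid JSON and schema violations alike
        return InvoiceRecord.model_validate_json(strip_code_fence(raw_text))
    except ValidationError as exc:
        raise MalformedLLMResponse(str(exc)) from exc


fenced = FENCE + 'json\n{"invoice_number": "INV-2026-0042", "total_amount": 5050.0}\n' + FENCE
print(parse_at_boundary(fenced))
```

Callers then handle exactly one exception type at the response handler, and nothing downstream ever sees an unvalidated dict.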

The Three Methods — JSON Mode, Tool Calling, and Native Structured Outputs

There are three primary mechanisms for constraining LLM output to a specific shape. Each has different guarantees, different failure modes, and different implementation complexity. Choosing the right one for your use case is the first design decision in any structured extraction pipeline.

JSON Mode

OpenAI's API supports a JSON mode (response_format={"type": "json_object"}) that instructs the model to return valid JSON. Anthropic's Messages API has no directly equivalent flag; there, JSON is typically requested through the prompt. With JSON mode the guarantee is syntactic only: the response will be parseable JSON, but the schema of that JSON is not enforced. You may get every field, some fields, or entirely different fields than you asked for. JSON mode is appropriate for exploratory use cases where schema flexibility is acceptable; it is not sufficient for production extraction pipelines where downstream consumers have rigid expectations.
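To make the "syntactic only" guarantee concrete, here is a small sketch that parses a JSON-mode style response and reports which expected fields are absent. The response text is invented for illustration, and missing_fields is a hypothetical helper rather than part of any SDK.

```python
import json

REQUIRED_FIELDS = {"invoice_number", "vendor", "total_amount", "due_date"}


def missing_fields(raw_json: str, required: set[str] = REQUIRED_FIELDS) -> list[str]:
    """Return the required top-level fields that the parsed JSON lacks."""
    payload = json.loads(raw_json)  # succeeds for any valid JSON, whatever its shape
    if not isinstance(payload, dict):
        return sorted(required)
    return sorted(required - set(payload))


# Valid JSON, but one field was renamed and another was dropped
response_text = (
    '{"invoice_number": "INV-2026-0042", '
    '"vendor_name": "Acme Supplies Ltd", "total_amount": 5050.0}'
)
print(missing_fields(response_text))  # ['due_date', 'vendor']
```

The parse succeeds, so nothing fails loudly. Only an explicit schema check reveals that two of the four requested fields are unusable.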

Tool Calling

Tool calling (also called function calling) allows you to define a JSON Schema for a “tool” and force the model to call it. The model's response is not free text: it is a structured tool call whose arguments are generated against the schema you provided. Unless the provider enforces the schema during decoding, those arguments can still deviate from it, so they need validation on receipt. This is the most widely supported and battle-tested mechanism for structured extraction across providers, and it is the approach we will focus on throughout this article.

Native Structured Outputs

OpenAI's Structured Outputs feature (introduced in 2024) goes one step further than tool calling by enforcing the schema at the model generation level using a constrained decoding algorithm. This provides stronger guarantees than tool calling alone, at the cost of some latency overhead. The closest counterpart used in this article for Anthropic models is forced tool calling with a specific tool name, a pattern implemented in the Anthropic SDK section below.

The following example shows all three approaches for extracting an invoice entity from unstructured text, so you can compare the code surface area and guarantee level:

# compare_extraction_methods.py
import json
import anthropic
import openai

TEXT = """
Invoice #INV-2026-0042 from Acme Supplies Ltd
Date: May 15, 2026
Due: June 15, 2026
Items:
  - Cloud infrastructure (monthly): $4,200.00
  - Support SLA tier 2: $850.00
Total due: $5,050.00
"""

# ── Method 1: JSON mode (syntactic guarantee only) ──────────────────────────
openai_client = openai.OpenAI()

json_mode_response = openai_client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "system",
            "content": "Extract invoice data as JSON with fields: invoice_number, vendor, total_amount, due_date.",
        },
        {"role": "user", "content": TEXT},
    ],
)
# No schema guarantee — fields may be missing or renamed
raw = json.loads(json_mode_response.choices[0].message.content)
print("JSON mode:", raw)


# ── Method 2: Tool calling (schema-constrained) ─────────────────────────────
INVOICE_TOOL = {
    "name": "extract_invoice",
    "description": "Extract structured invoice data from the provided text.",
    "input_schema": {
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string"},
            "vendor_name": {"type": "string"},
            "invoice_date": {"type": "string", "description": "ISO 8601 date"},
            "due_date": {"type": "string", "description": "ISO 8601 date"},
            "line_items": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "description": {"type": "string"},
                        "amount_usd": {"type": "number"},
                    },
                    "required": ["description", "amount_usd"],
                },
            },
            "total_amount_usd": {"type": "number"},
        },
        "required": ["invoice_number", "vendor_name", "total_amount_usd", "due_date"],
    },
}

anthropic_client = anthropic.Anthropic()

tool_response = anthropic_client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=[INVOICE_TOOL],
    tool_choice={"type": "tool", "name": "extract_invoice"},  # forced
    messages=[{"role": "user", "content": TEXT}],
)
tool_call = next(b for b in tool_response.content if b.type == "tool_use")
print("Tool calling:", tool_call.input)


# ── Method 3: OpenAI Structured Outputs (constrained decoding) ───────────────
from pydantic import BaseModel

class LineItem(BaseModel):
    description: str
    amount_usd: float

class InvoiceExtraction(BaseModel):
    invoice_number: str
    vendor_name: str
    due_date: str
    total_amount_usd: float
    line_items: list[LineItem]

structured_response = openai_client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "Extract the invoice data."},
        {"role": "user", "content": TEXT},
    ],
    response_format=InvoiceExtraction,
)
parsed: InvoiceExtraction = structured_response.choices[0].message.parsed
print("Structured outputs:", parsed)

Schema Design with Pydantic

Pydantic v2 is the de-facto standard for defining structured schemas in Python LLM applications. Its strengths — automatic JSON Schema generation, runtime validation, type coercion, and descriptive error messages — map directly to the requirements of a structured extraction pipeline. The key insight is that model.model_json_schema() gives you exactly the JSON Schema you need to pass as a tool definition to the LLM.

# invoice_schema.py
from __future__ import annotations

import json
from datetime import date
from enum import Enum
from typing import Optional, Union

from pydantic import BaseModel, Field, field_validator


class Currency(str, Enum):
    USD = "USD"
    EUR = "EUR"
    GBP = "GBP"
    PLN = "PLN"


class LineItem(BaseModel):
    description: str = Field(..., description="Human-readable line item description")
    quantity: Optional[float] = Field(None, ge=0)
    unit_price: Optional[float] = Field(None, ge=0)
    amount: float = Field(..., description="Total amount for this line item")
    currency: Currency = Currency.USD


class VendorInfo(BaseModel):
    name: str
    tax_id: Optional[str] = Field(None, description="VAT number or EIN, if present")
    address: Optional[str] = None


class InvoiceExtraction(BaseModel):
    """Structured extraction schema for vendor invoices."""

    invoice_number: str = Field(..., description="Invoice identifier as printed on the document")
    vendor: VendorInfo
    invoice_date: Optional[date] = None
    due_date: Optional[date] = None
    line_items: list[LineItem] = Field(default_factory=list)
    subtotal: Optional[float] = None
    tax_amount: Optional[float] = None
    total_amount: float = Field(..., description="Final amount due, after tax")
    currency: Currency = Currency.USD
    payment_terms: Optional[str] = Field(None, description="e.g. Net 30, Due on receipt")
    notes: Optional[str] = None

    # Optional purchase order reference; None when the invoice does not cite one
    po_reference: Optional[str] = Field(
        None, description="Purchase order number referenced on the invoice"
    )

    @field_validator("total_amount")
    @classmethod
    def total_must_be_positive(cls, v: float) -> float:
        if v <= 0:
            raise ValueError(f"total_amount must be positive, got {v}")
        return v


# Generate the JSON Schema for the LLM tool definition
schema = InvoiceExtraction.model_json_schema()
print(json.dumps(schema, indent=2))
# Use this schema in the "input_schema" field of your Anthropic tool definition

Note

Always add description fields to your Pydantic models. The LLM reads these descriptions when deciding how to populate each field. A field named amount with no description is ambiguous — the same field with description="Total amount for this line item, inclusive of discounts" significantly improves extraction accuracy on edge-case documents.
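The sketch below shows where those descriptions end up: Pydantic copies each Field description into the generated JSON Schema, which is the text the model receives as part of the tool definition. The two model classes are hypothetical and differ only in the description.

```python
from pydantic import BaseModel, Field


class LineItemBare(BaseModel):
    amount: float


class LineItemDescribed(BaseModel):
    amount: float = Field(
        ..., description="Total amount for this line item, inclusive of discounts"
    )


bare = LineItemBare.model_json_schema()["properties"]["amount"]
described = LineItemDescribed.model_json_schema()["properties"]["amount"]

print("description" in bare)     # False
print(described["description"])  # the text passed to Field(description=...)
```

Inspecting the generated schema this way is a cheap check that every field the model must populate carries guidance.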

Anthropic SDK — Tool Use for Structured Extraction

The Anthropic tool use API supports a tool_choice parameter that forces the model to call a specific named tool. This is the key to schema-constrained extraction: instead of asking the model to “return JSON”, you define a tool whose sole purpose is to receive the extracted data, and the model must call it. The model cannot respond with free text; it must produce a structured tool call that follows the tool's input schema.

# anthropic_extraction.py
import anthropic
from pydantic import BaseModel
from invoice_schema import InvoiceExtraction

client = anthropic.Anthropic()


def build_tool_from_pydantic(model_cls: type[BaseModel], tool_name: str, description: str) -> dict:
    """Convert a Pydantic model into an Anthropic tool definition."""
    return {
        "name": tool_name,
        "description": description,
        "input_schema": model_cls.model_json_schema(),
    }


INVOICE_TOOL = build_tool_from_pydantic(
    InvoiceExtraction,
    tool_name="extract_invoice",
    description=(
        "Extract all structured invoice data from the provided document text. "
        "Populate every field you can identify. Use null for fields not present in the document."
    ),
)


def extract_invoice(document_text: str) -> InvoiceExtraction:
    """
    Extract structured invoice data from unstructured document text.
    Uses forced tool calling to guarantee the model returns the expected schema.
    """
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=2048,
        tools=[INVOICE_TOOL],
        tool_choice={"type": "tool", "name": "extract_invoice"},  # force this specific tool
        messages=[
            {
                "role": "user",
                "content": (
                    "Extract all invoice data from the following document.\n\n"
                    f"{document_text}"
                ),
            }
        ],
    )

    # The model must have returned a tool_use block
    tool_block = next(
        (b for b in response.content if b.type == "tool_use" and b.name == "extract_invoice"),
        None,
    )
    if tool_block is None:
        raise ValueError(
            f"Model did not call the extract_invoice tool. stop_reason={response.stop_reason}"
        )

    # Validate the tool call arguments against our Pydantic schema
    return InvoiceExtraction.model_validate(tool_block.input)


# Usage
SAMPLE_DOC = """
Invoice #INV-2026-0042 from Acme Supplies Ltd (VAT: GB123456789)
Invoice Date: 2026-05-15  |  Due Date: 2026-06-15  |  PO: PO-2026-0019

  Cloud infrastructure (monthly) ........... USD 4,200.00
  Support SLA tier 2 ........................ USD   850.00

Subtotal: USD 5,050.00
Tax (0%): USD 0.00
Total Due: USD 5,050.00

Payment Terms: Net 30
"""

invoice = extract_invoice(SAMPLE_DOC)
print(invoice.model_dump_json(indent=2))

Note

When using tool_choice={"type": "tool", "name": "..."}, the model is not allowed to use any other tool or respond with free text. This eliminates the most common failure mode of tool-calling extraction: the model deciding to respond conversationally instead of calling the tool, which happens when the document is ambiguous or the model is uncertain.

Validation and Retry Patterns

Even with forced tool calling, the model can produce values that pass JSON Schema validation but fail Pydantic's richer validators — a negative total amount, a due date before the invoice date, a line item sum that doesn't match the stated subtotal. The correct response to these validation failures is not to crash or silently corrupt data: it is to send the validation error back to the model and ask it to correct its extraction.
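Cross-field rules like these cannot be expressed in the JSON Schema sent to the model, but they fit naturally into a Pydantic model validator. The following is a minimal sketch under stated assumptions: CheckedInvoice is a trimmed, hypothetical variant of the invoice schema, and the 0.01 tolerance for the subtotal comparison is an arbitrary choice.

```python
from datetime import date
from typing import Optional

from pydantic import BaseModel, Field, ValidationError, model_validator


class LineItem(BaseModel):
    description: str
    amount: float


class CheckedInvoice(BaseModel):
    invoice_date: Optional[date] = None
    due_date: Optional[date] = None
    line_items: list[LineItem] = Field(default_factory=list)
    subtotal: Optional[float] = None

    @model_validator(mode="after")
    def check_cross_field_rules(self) -> "CheckedInvoice":
        # Rule 1: an invoice cannot fall due before it was issued
        if self.invoice_date and self.due_date and self.due_date < self.invoice_date:
            raise ValueError("due_date is earlier than invoice_date")
        # Rule 2: line items must add up to the stated subtotal (assumed tolerance: 0.01)
        if self.subtotal is not None and self.line_items:
            items_total = sum(item.amount for item in self.line_items)
            if abs(items_total - self.subtotal) > 0.01:
                raise ValueError(
                    f"line items sum to {items_total:.2f}, but subtotal is {self.subtotal:.2f}"
                )
        return self


try:
    CheckedInvoice.model_validate(
        {"invoice_date": "2026-05-15", "due_date": "2026-04-15"}
    )
except ValidationError as exc:
    print(exc.errors()[0]["msg"])
```

Each rule produces a message specific enough to send back to the model as correction feedback.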

The retry-with-feedback pattern is one of the highest-value patterns in production LLM systems. It converts hard failures into model corrections and, in practice, resolves most validation failures on the second attempt. Use tenacity for exponential backoff on transient network errors, and a manual retry loop for validation failures where you want to inject error context into the conversation.

# retry_extraction.py
import json
import anthropic
from pydantic import ValidationError
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type
from invoice_schema import InvoiceExtraction

client = anthropic.Anthropic()

MAX_VALIDATION_RETRIES = 3


@retry(
    retry=retry_if_exception_type(anthropic.APIConnectionError),
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
    reraise=True,
)
def _call_model(messages: list, tools: list) -> anthropic.types.Message:
    """Call the Anthropic API with exponential backoff on transient errors."""
    return client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=2048,
        tools=tools,
        tool_choice={"type": "tool", "name": "extract_invoice"},
        messages=messages,
    )


def extract_with_retry(
    document_text: str,
    tool_def: dict,
    max_retries: int = MAX_VALIDATION_RETRIES,
) -> InvoiceExtraction:
    """
    Extract invoice data with automatic retry on validation failure.
    Sends the validation error back to the model as context for the next attempt.
    """
    messages: list[dict] = [
        {
            "role": "user",
            "content": (
                "Extract all invoice data from the following document.\n\n"
                f"{document_text}"
            ),
        }
    ]

    for attempt in range(1, max_retries + 1):
        response = _call_model(messages, [tool_def])

        tool_block = next(
            (b for b in response.content if b.type == "tool_use" and b.name == "extract_invoice"),
            None,
        )

        if tool_block is None:
            raise ValueError(f"Model did not call the tool on attempt {attempt}")

        # Append model response to history (required for multi-turn context)
        messages.append({"role": "assistant", "content": response.content})

        try:
            return InvoiceExtraction.model_validate(tool_block.input)

        except ValidationError as exc:
            if attempt == max_retries:
                raise RuntimeError(
                    f"Extraction failed after {max_retries} attempts. "
                    f"Last validation error:\n{exc}"
                ) from exc

            # Send validation error back to model as tool result + correction request
            error_detail = json.dumps(
                [{"field": e["loc"], "message": e["msg"]} for e in exc.errors()],
                indent=2,
            )
            messages.append({
                "role": "user",
                "content": [
                    {
                        "type": "tool_result",
                        "tool_use_id": tool_block.id,
                        "content": (
                            f"Validation failed with the following errors:\n{error_detail}\n\n"
                            "Please re-call the extract_invoice tool with corrected values."
                        ),
                        "is_error": True,
                    }
                ],
            })

    # Should not be reached
    raise RuntimeError("Extraction failed: exceeded retry loop bounds")
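The control flow of the feedback loop can be exercised without network access by injecting the model call and the validator as plain callables. The sketch below is a simplified illustration, not the listing above: retry_with_feedback, fake_model, and validate_total are hypothetical names, and the fake model simply returns a corrected payload once it has received feedback.

```python
from typing import Callable, Optional


def retry_with_feedback(
    call_model: Callable[[Optional[str]], dict],
    validate: Callable[[dict], Optional[str]],
    max_attempts: int = 3,
) -> tuple[dict, int]:
    """Return (valid_payload, attempts_used), or raise after max_attempts."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        payload = call_model(feedback)  # feedback is None on the first attempt
        feedback = validate(payload)    # None means the payload is valid
        if feedback is None:
            return payload, attempt
    raise RuntimeError(f"Extraction failed after {max_attempts} attempts: {feedback}")


def fake_model(feedback: Optional[str]) -> dict:
    # Stand-in for the LLM: wrong sign at first, corrected after feedback
    return {"total_amount": 5050.0} if feedback else {"total_amount": -5050.0}


def validate_total(payload: dict) -> Optional[str]:
    if payload["total_amount"] <= 0:
        return f"total_amount must be positive, got {payload['total_amount']}"
    return None


payload, attempts = retry_with_feedback(fake_model, validate_total)
print(payload, attempts)  # {'total_amount': 5050.0} 2
```

Keeping the loop free of SDK calls makes the retry policy itself unit-testable; the real model call and the Pydantic validation plug in as the two callables.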

Streaming Structured Outputs

Streaming is useful when you want to show the user progressive feedback while a long document is being processed, or when you need to start downstream processing on partial results as they arrive. Structured extraction with streaming requires accumulating the streamed tool call input deltas and parsing the complete JSON only after the stream ends. The Anthropic streaming API emits input_json_delta events for tool calls, which you accumulate into a string buffer before final parsing.

# streaming_extraction.py
import json
import anthropic
from invoice_schema import InvoiceExtraction

client = anthropic.Anthropic()


def extract_invoice_streaming(document_text: str, tool_def: dict) -> InvoiceExtraction:
    """
    Extract invoice data using the streaming API.
    Accumulates partial JSON deltas and validates on completion.
    """
    accumulated_input = ""
    tool_use_id: str | None = None
    tool_name: str | None = None

    with client.messages.stream(
        model="claude-sonnet-4-6",
        max_tokens=2048,
        tools=[tool_def],
        tool_choice={"type": "tool", "name": "extract_invoice"},
        messages=[
            {
                "role": "user",
                "content": f"Extract all invoice data from this document:\n\n{document_text}",
            }
        ],
    ) as stream:
        for event in stream:
            # Track tool use block start
            if event.type == "content_block_start":
                if hasattr(event.content_block, "type") and event.content_block.type == "tool_use":
                    tool_use_id = event.content_block.id
                    tool_name = event.content_block.name
                    accumulated_input = ""

            # Accumulate partial JSON
            elif event.type == "content_block_delta":
                if hasattr(event.delta, "type") and event.delta.type == "input_json_delta":
                    accumulated_input += event.delta.partial_json

            # Stream complete — final message available
            elif event.type == "message_stop":
                pass  # final_message = stream.get_final_message()

    if not accumulated_input or tool_name != "extract_invoice":
        raise ValueError("Stream did not produce an extract_invoice tool call")

    # Parse and validate the complete accumulated JSON
    raw_data = json.loads(accumulated_input)
    return InvoiceExtraction.model_validate(raw_data)
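The reason parsing waits until the stream ends is that intermediate buffers are not valid JSON. The sketch below simulates this with a hand-written list of fragments; the fragments are invented for illustration and merely imitate the shape of partial_json deltas.

```python
import json


def is_complete_json(buffer: str) -> bool:
    """True when the buffer parses as a complete JSON document."""
    try:
        json.loads(buffer)
        return True
    except json.JSONDecodeError:
        return False


def accumulate_tool_input(deltas: list[str]) -> dict:
    """Concatenate all fragments, then parse once at the end."""
    buffer = ""
    for partial_json in deltas:
        buffer += partial_json
    return json.loads(buffer)


# Simulated fragments; splits can fall in the middle of a key or a value
deltas = ['{"invoice_num', 'ber": "INV-2026-0042", "total_a', 'mount": 5050.0}']

# Parse status of the buffer after each fragment arrives
print([is_complete_json("".join(deltas[:i])) for i in range(1, len(deltas) + 1)])
# [False, False, True]
print(accumulate_tool_input(deltas))
# {'invoice_number': 'INV-2026-0042', 'total_amount': 5050.0}
```

Progress indicators can be driven by the raw fragments, but anything that needs typed data should wait for the final parse and validation.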

Type-Safe Output Pipelines

Once you have Pydantic models and forced tool calling in place, the next step is making the extraction function generic so it works with any schema without code duplication. A typed generic function lets mypy and pyright verify that the caller receives the correct type — if you call extract(text, InvoiceExtraction), the type checker knows the return type is InvoiceExtraction, not Any.

# typed_extraction.py
from __future__ import annotations

from typing import TypeVar

import anthropic
from pydantic import BaseModel

T = TypeVar("T", bound=BaseModel)

client = anthropic.Anthropic()


def extract(
    document_text: str,
    schema: type[T],
    tool_name: str = "extract_data",
    tool_description: str = "Extract structured data from the provided text.",
    model: str = "claude-sonnet-4-6",
    max_tokens: int = 2048,
) -> T:
    """
    Generic type-safe extraction function.
    Returns an instance of `schema` validated against the model's tool call output.

    Type-checked: mypy/pyright infers the return type from the `schema` parameter.
    """
    tool_def = {
        "name": tool_name,
        "description": tool_description,
        "input_schema": schema.model_json_schema(),
    }

    response = client.messages.create(
        model=model,
        max_tokens=max_tokens,
        tools=[tool_def],
        tool_choice={"type": "tool", "name": tool_name},
        messages=[{"role": "user", "content": document_text}],
    )

    tool_block = next(
        (b for b in response.content if b.type == "tool_use" and b.name == tool_name),
        None,
    )
    if tool_block is None:
        raise ValueError("Model did not produce the expected tool call")

    return schema.model_validate(tool_block.input)


# ── Discriminated unions for multi-intent classification ─────────────────────
from typing import Annotated, Literal, Union
from pydantic import Field as PydanticField


class SupportTicketBug(BaseModel):
    intent: Literal["bug_report"]
    severity: Literal["critical", "high", "medium", "low"]
    component: str
    reproduction_steps: list[str]
    expected_behavior: str
    actual_behavior: str


class SupportTicketFeature(BaseModel):
    intent: Literal["feature_request"]
    title: str
    description: str
    business_value: str
    priority: Literal["high", "medium", "low"]


class SupportTicketQuestion(BaseModel):
    intent: Literal["question"]
    topic: str
    question_text: str
    urgency: Literal["urgent", "normal", "low"]


# Discriminated union — Pydantic uses the `intent` field as the discriminator
SupportTicket = Annotated[
    Union[SupportTicketBug, SupportTicketFeature, SupportTicketQuestion],
    PydanticField(discriminator="intent"),
]


class ClassifiedTicket(BaseModel):
    """Wrapper to enable top-level discriminated union extraction."""
    ticket: SupportTicket


# Usage
ticket = extract(
    "Users are reporting that the export button crashes the app on iOS 17. "
    "This worked in version 2.3.1 but broke in 2.4.0.",
    ClassifiedTicket,
    tool_name="classify_ticket",
    tool_description="Classify a support ticket into bug_report, feature_request, or question.",
)
print(type(ticket.ticket).__name__)   # SupportTicketBug
print(ticket.ticket.severity)        # e.g. "high"
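The discriminator logic can be checked offline, with no model call, by validating plain dicts through a Pydantic TypeAdapter. This is a reduced sketch: BugReport and Question are hypothetical, trimmed stand-ins for the ticket classes above.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError


class BugReport(BaseModel):
    intent: Literal["bug_report"]
    severity: Literal["critical", "high", "medium", "low"]
    component: str


class Question(BaseModel):
    intent: Literal["question"]
    topic: str


# The `intent` value selects which class validates the rest of the payload
Ticket = Annotated[Union[BugReport, Question], Field(discriminator="intent")]
ticket_adapter = TypeAdapter(Ticket)

ticket = ticket_adapter.validate_python(
    {"intent": "bug_report", "severity": "high", "component": "export"}
)
print(type(ticket).__name__)  # BugReport

try:
    # An intent outside the union is rejected instead of being coerced
    ticket_adapter.validate_python({"intent": "feature_request", "title": "Dark mode"})
except ValidationError as exc:
    print(f"rejected with {exc.error_count()} error(s)")
```

Because only the selected class is tried, validation errors refer to the fields of that one class rather than to every member of the union.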

Production Patterns

Structured extraction in production requires operational discipline beyond the extraction function itself. The following patterns address the four most common failure modes in production LLM output pipelines: silent schema drift, untraceable validation failures, inconsistent schema versions across environments, and hard crashes on documents that cannot be parsed.

Validation Logging with model_dump()

Every extraction attempt — success or failure — should be logged as a structured document. Pydantic's model_dump() gives you a serializable dict that can be shipped to your data warehouse, a log aggregator, or an observability platform.

# extraction_logger.py
import uuid
import json
from datetime import datetime, timezone
from dataclasses import dataclass, field, asdict
from typing import Any, Optional
from pydantic import BaseModel, ValidationError


@dataclass
class ExtractionRecord:
    run_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
    schema_name: str = ""
    schema_version: str = ""
    model: str = ""
    document_hash: str = ""
    success: bool = False
    extracted_data: Optional[dict[str, Any]] = None
    validation_errors: Optional[list[dict]] = None
    retry_count: int = 0
    input_tokens: int = 0
    output_tokens: int = 0

    def record_success(self, instance: BaseModel, usage) -> None:
        self.success = True
        self.extracted_data = instance.model_dump(mode="json")
        self.input_tokens = usage.input_tokens
        self.output_tokens = usage.output_tokens

    def record_failure(self, exc: ValidationError, retry_count: int) -> None:
        self.success = False
        self.retry_count = retry_count
        self.validation_errors = [
            {"field": list(e["loc"]), "type": e["type"], "message": e["msg"]}
            for e in exc.errors()
        ]

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2, default=str)

Schema Pinning

Pin your Pydantic schema version alongside the model version in every extraction record. When you upgrade to a new model, run both the old and new model in parallel against a golden dataset before switching production traffic. Schema drift — a field the old model reliably extracted that the new model misses — is invisible without this paper trail.
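One way to make that comparison concrete is a per-field regression count over the golden dataset. The sketch below assumes extractions are available as plain dicts (for example from model_dump()); find_regressions and the sample records are hypothetical.

```python
def find_regressions(
    golden: list[dict], old_outputs: list[dict], new_outputs: list[dict]
) -> dict[str, int]:
    """Count, per field, documents the old model got right and the new model got wrong."""
    regressions: dict[str, int] = {}
    for expected, old, new in zip(golden, old_outputs, new_outputs):
        for field_name, expected_value in expected.items():
            old_ok = old.get(field_name) == expected_value
            new_ok = new.get(field_name) == expected_value
            if old_ok and not new_ok:
                regressions[field_name] = regressions.get(field_name, 0) + 1
    return regressions


# Illustrative data: the new model drops the purchase order reference
golden = [{"invoice_number": "INV-2026-0042", "po_reference": "PO-2026-0019"}]
old_outputs = [{"invoice_number": "INV-2026-0042", "po_reference": "PO-2026-0019"}]
new_outputs = [{"invoice_number": "INV-2026-0042", "po_reference": None}]

print(find_regressions(golden, old_outputs, new_outputs))  # {'po_reference': 1}
```

A non-empty result is the signal to hold the model upgrade until the schema descriptions or prompt have been adjusted.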

Validation Logging

Log every validation failure to a dedicated table, including the raw tool call input that failed validation. This lets you identify systematic model errors — fields the model consistently misnames, types it consistently gets wrong — and use them to improve your schema descriptions or add few-shot examples to your system prompt.
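Once failures are logged in a consistent shape, finding systematic errors is a small aggregation. The sketch below assumes each logged row carries a validation_errors list with field, type, and message keys, the shape produced by record_failure earlier; the sample rows are invented for illustration.

```python
from collections import Counter


def summarize_failures(failure_rows: list[dict]) -> list[tuple[tuple[str, str], int]]:
    """Rank (field path, error type) pairs by how often they occur."""
    counts: Counter = Counter()
    for row in failure_rows:
        for err in row["validation_errors"]:
            field_path = ".".join(str(part) for part in err["field"])
            counts[(field_path, err["type"])] += 1
    return counts.most_common()


# Hypothetical logged failures
rows = [
    {"validation_errors": [
        {"field": ["total_amount"], "type": "float_parsing", "message": "..."},
        {"field": ["vendor", "name"], "type": "missing", "message": "..."},
    ]},
    {"validation_errors": [
        {"field": ["total_amount"], "type": "float_parsing", "message": "..."},
    ]},
]

for (field_path, error_type), count in summarize_failures(rows):
    print(f"{count}x {field_path} ({error_type})")
```

The most frequent pairs are the best candidates for a sharper field description or a few-shot example.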

Schema Registry

Store schema versions as tagged artifacts alongside your model version registry. When you change a Pydantic schema in a way that is not backward-compatible (removing a required field, changing a type), bump the schema version and treat existing extraction records as schema v1 data. Mixing schema versions in a single table is a data quality bug waiting to happen.
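A lightweight way to detect an unversioned schema change is to fingerprint the generated JSON Schema and compare it with the fingerprint registered for that version. This is a sketch under stated assumptions: the registry is a plain dict, the 12-character truncated SHA-256 is an arbitrary choice, and both function names are hypothetical. A fingerprint flags any change, including harmless ones such as edited descriptions, so deciding whether a change is backward-compatible remains a human call.

```python
import hashlib
import json


def schema_fingerprint(json_schema: dict) -> str:
    """Stable short hash of a JSON Schema dict, independent of key order."""
    canonical = json.dumps(json_schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def needs_version_bump(registered: dict[str, str], version: str, json_schema: dict) -> bool:
    """True when `version` is already registered with a different fingerprint."""
    known = registered.get(version)
    return known is not None and known != schema_fingerprint(json_schema)


# In practice the dict would come from InvoiceExtraction.model_json_schema()
schema_v1 = {"type": "object", "properties": {"total_amount": {"type": "number"}}}
registry = {"v1": schema_fingerprint(schema_v1)}

changed = {"type": "object", "properties": {"total_amount": {"type": "string"}}}
print(needs_version_bump(registry, "v1", schema_v1))  # False
print(needs_version_bump(registry, "v1", changed))    # True
```

Running this check in CI catches the case where a schema edit ships under an old version tag.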

Graceful Degradation

On final validation failure after all retries, do not crash. Return a typed failure result — a dataclass with a success=False flag, the raw model output, and the validation errors — and route the document to a manual review queue. Silent data loss is worse than an explicit failure that can be reviewed and reprocessed.

# graceful_degradation.py
from __future__ import annotations
from dataclasses import dataclass
from typing import Generic, TypeVar, Optional, Any
from pydantic import BaseModel, ValidationError

T = TypeVar("T", bound=BaseModel)


@dataclass
class ExtractionResult(Generic[T]):
    success: bool
    data: Optional[T]
    raw_tool_input: Optional[dict[str, Any]]
    validation_errors: Optional[list[dict]]
    retry_count: int

    @classmethod
    def ok(cls, data: T, raw: dict, retries: int) -> "ExtractionResult[T]":
        return cls(success=True, data=data, raw_tool_input=raw, validation_errors=None, retry_count=retries)

    @classmethod
    def fail(cls, raw: dict, exc: ValidationError, retries: int) -> "ExtractionResult[T]":
        errors = [{"field": list(e["loc"]), "msg": e["msg"]} for e in exc.errors()]
        return cls(success=False, data=None, raw_tool_input=raw, validation_errors=errors, retry_count=retries)


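def safe_extract(
    document_text: str, schema: type[T], tool_def: dict, max_attempts: int = 3
) -> ExtractionResult[T]:
    """Extract with graceful degradation; never raises on validation failure."""
    import anthropic
    client = anthropic.Anthropic()

    last_raw: dict[str, Any] = {}
    last_exc: Optional[ValidationError] = None

    # Plain re-ask on failure; no error feedback is injected in this version
    for attempt in range(1, max_attempts + 1):
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=2048,
            tools=[tool_def],
            tool_choice={"type": "tool", "name": tool_def["name"]},
            messages=[{"role": "user", "content": document_text}],
        )
        tool_block = next(
            (b for b in response.content if b.type == "tool_use"), None
        )
        if tool_block is None:
            continue  # no tool call in this response; try again

        try:
            instance = schema.model_validate(tool_block.input)
            # retry_count counts retries, so a first-attempt success reports 0
            return ExtractionResult.ok(instance, tool_block.input, attempt - 1)
        except ValidationError as exc:
            # Keep the most recent failure so a later empty response cannot erase it
            last_raw, last_exc = tool_block.input, exc

    if last_exc is None:
        # No attempt produced a tool call: report an empty error list
        last_exc = ValidationError.from_exception_data(schema.__name__, [])
    return ExtractionResult.fail(last_raw, last_exc, max_attempts - 1)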
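On the consuming side, a typed failure result lets the caller route documents instead of catching exceptions. The sketch below is illustrative only: Result is a trimmed stand-in for ExtractionResult above, and the in-memory lists in Router stand in for a real datastore and manual review queue.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Result:
    # Trimmed stand-in for ExtractionResult, kept local so the sketch is self-contained
    success: bool
    data: Optional[dict[str, Any]]
    raw_tool_input: Optional[dict[str, Any]]
    validation_errors: Optional[list[dict]]


@dataclass
class Router:
    accepted: list = field(default_factory=list)
    review_queue: list = field(default_factory=list)

    def route(self, document_id: str, result: Result) -> str:
        """Store successes; park failures with enough context to reprocess them."""
        if result.success:
            self.accepted.append((document_id, result.data))
            return "accepted"
        self.review_queue.append({
            "document_id": document_id,
            "raw_tool_input": result.raw_tool_input,
            "validation_errors": result.validation_errors,
        })
        return "manual_review"


router = Router()
good = Result(True, {"total_amount": 5050.0}, {"total_amount": 5050.0}, None)
bad = Result(False, None, {"total_amount": -1},
             [{"field": ["total_amount"], "msg": "must be positive"}])

print(router.route("doc-1", good))  # accepted
print(router.route("doc-2", bad))   # manual_review
print(len(router.review_queue))     # 1
```

Because the failure record keeps the raw tool input and the errors, a reviewer can correct the document or replay it after a schema fix.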
