Why dbt Has Become the Standard for Production SQL Transformations
dbt (data build tool) solves a problem that every data team eventually hits: SQL transformations scattered across scripts, stored procedures, and ad-hoc queries with no version control, no tests, and no visibility into dependencies. It brings software engineering practices — version control, testing, documentation, modularity — to the layer where most data teams spend most of their time: SQL transformations.
But the gap between a dbt project that works on a developer's laptop and one that runs reliably in production is significant. Incremental models that rebuild the full table on every run because the logic is subtly wrong, tests that catch 10% of data quality issues because only the happy path is covered, CI pipelines that run for 40 minutes and still miss regressions — these are the failure modes that teams discover after going live.
This guide covers the patterns that separate production dbt projects from experimental ones: incremental model strategies, comprehensive testing approaches, macro libraries for reusable logic, and CI/CD with slim CI state-based selection so you only run what changed.
| Materialization | When to use | Trade-off |
|---|---|---|
| view | Staging models, lightweight transforms | No storage cost; query cost per read |
| table | Small dimension tables, seed-derived lookups | Full rebuild every run; simple to reason about |
| incremental | Large fact tables, event streams, logs | Faster runs; complex to get right |
| ephemeral | CTEs you want to name and reuse | No table created; inlined as CTE |
Incremental Models: Strategies and Pitfalls
Incremental models are the most powerful and most commonly misused dbt feature. They allow dbt to process only the rows that have changed since the last run, dramatically reducing compute costs for large tables. The core idea is simple: use is_incremental() to add a WHERE clause that filters to new rows, and let dbt handle the merge or append.
Strategy 1: append (simplest, for immutable event streams)
The append strategy inserts new rows without touching existing ones. It is only correct when source rows are truly immutable — events that are never updated or deleted. The risk is duplicates if your pipeline reruns with overlapping time windows.
-- models/fct_events.sql
-- append strategy: insert new rows only, never touch existing ones
-- Only correct for immutable events (never updated at source)
{{
    config(
        materialized='incremental',
        incremental_strategy='append',
        on_schema_change='sync_all_columns'
    )
}}

select
    event_id,
    user_id,
    event_type,
    properties,
    occurred_at
from {{ ref('stg_events') }}

{% if is_incremental() %}
    -- Only process events newer than the latest event in the target table
    -- Add a 3-hour lookback buffer for late-arriving events
    where occurred_at > (
        select dateadd(hour, -3, max(occurred_at))
        from {{ this }}
    )
{% endif %}

Strategy 2: merge (for mutable records with unique keys)
The merge strategy (the default on Snowflake, BigQuery, and Databricks when unique_key is set) performs an upsert: rows with a matching unique key are updated, new rows are inserted. This handles late-arriving updates and source corrections correctly. Recent dbt-postgres versions also support it on Postgres 15+.
-- models/fct_orders.sql
-- merge strategy: upsert on order_id
-- Handles late-arriving status updates and order corrections
{{
    config(
        materialized='incremental',
        unique_key='order_id',
        incremental_strategy='merge',
        merge_update_columns=['status', 'updated_at', 'shipped_at', 'delivered_at'],
        on_schema_change='sync_all_columns',
        cluster_by=['order_date']
    )
}}

with source as (

    select
        order_id,
        customer_id,
        status,
        total_amount,
        currency,
        order_date,
        updated_at,
        shipped_at,
        delivered_at
    from {{ ref('stg_orders') }}

),

-- Deduplicate: keep the most recent record per order_id
-- Critical for sources that emit multiple updates for the same order
deduped as (

    select *
    from source
    qualify row_number() over (
        partition by order_id
        order by updated_at desc
    ) = 1

)

select * from deduped

{% if is_incremental() %}
    -- Look back 7 days to catch late-arriving status updates
    -- Orders that shipped days after being placed still need to be captured
    where updated_at >= dateadd(day, -7, current_timestamp())
{% endif %}

Note
Set merge_update_columns explicitly. Without it, dbt updates ALL columns on a key match, which overwrites values that should be preserved from the first insert (e.g., created_at, original_amount). Listing only the mutable columns protects immutable attributes from accidental overwrite.

Strategy 3: insert_overwrite (partition replacement)
On BigQuery and Databricks, the insert_overwrite strategy is often more efficient than merge for partitioned tables. It replaces the affected partitions wholesale: every row in the partitions touched by the new data is deleted, and the full replacement set is inserted. This avoids the overhead of row-level merge operations and leverages partition pruning. Snowflake and Redshift offer the analogous delete+insert strategy, which deletes rows matching the unique_key before inserting.
-- models/fct_sessions.sql
-- insert_overwrite strategy on BigQuery, partitioned by session_date
-- Replaces affected partitions entirely, then re-inserts
-- More efficient than merge for partition-aligned incremental loads
-- (unique_key is not used by insert_overwrite; partitions are replaced
-- based on the dates present in the new result set)
{{
    config(
        materialized='incremental',
        incremental_strategy='insert_overwrite',
        partition_by={
            'field': 'session_date',
            'data_type': 'date',
            'granularity': 'day'
        },
        cluster_by=['user_id'],
        on_schema_change='sync_all_columns'
    )
}}

select
    session_id,
    user_id,
    date(session_start) as session_date,
    session_start,
    session_end,
    page_views,
    events_count,
    revenue_attributed
from {{ ref('stg_sessions') }}

{% if is_incremental() %}
    -- Partitions covering the last 3 days are rebuilt in full
    where date(session_start) >= date_sub(current_date(), interval 3 day)
{% endif %}

Testing: Generic, Singular, and Custom
dbt's testing framework has two layers: generic tests defined in YAML (listed under the data_tests key since dbt 1.8) and singular tests written as SQL files. Most teams use only the four built-in generic tests (unique, not_null, accepted_values, relationships), which catch perhaps 20% of the data quality issues that matter in production. The rest require custom tests.
# models/fct_orders.yml
# Comprehensive test coverage for the orders fact table
version: 2

models:
  - name: fct_orders
    description: "One row per order, fully merged with latest status"
    config:
      contract:
        enforced: true  # Enforce column types at build time (dbt 1.5+)
    columns:
      - name: order_id
        data_type: varchar
        description: "Unique order identifier from the source system"
        data_tests:
          - unique
          - not_null
      - name: customer_id
        data_type: varchar
        data_tests:
          - not_null
          - relationships:
              to: ref('dim_customers')
              field: customer_id
              config:
                severity: warn  # Warn on orphaned orders; don't fail the run
      - name: status
        data_type: varchar
        data_tests:
          - accepted_values:
              values: ['pending', 'confirmed', 'processing', 'shipped', 'delivered', 'cancelled', 'refunded']
      - name: total_amount
        data_type: numeric
        data_tests:
          - not_null
          - dbt_utils.accepted_range:
              min_value: 0
              inclusive: true
      - name: order_date
        data_type: date
        data_tests:
          - not_null
      # (Remaining columns omitted for brevity; an enforced contract
      # requires every column to be declared with its data_type.)
    data_tests:
      # Table-level: no orders can have delivered_at before shipped_at
      - dbt_utils.expression_is_true:
          expression: "delivered_at is null or shipped_at is null or delivered_at >= shipped_at"
      # Freshness: recency is a model-level test (it takes field as an
      # argument), so it lives here rather than under a column
      - dbt_utils.recency:
          datepart: day
          field: order_date
          interval: 3
          config:
            severity: warn
      # Table must have been updated in the last 25 hours
      - dbt_utils.recency:
          datepart: hour
          field: updated_at
          interval: 25

Singular tests for complex business rules
Singular tests are SQL files in the tests/ directory. They return the rows that FAIL the test — zero rows means the test passes. They are the right tool for business rules that cannot be expressed as a simple column assertion.
-- tests/assert_revenue_matches_line_items.sql
-- Revenue on fct_orders must match the sum of line item amounts
-- within a 0.01 tolerance (floating-point rounding)
-- Returns any orders where the totals diverge
select
    o.order_id,
    o.total_amount as order_total,
    coalesce(sum(li.unit_price * li.quantity), 0) as line_item_total,
    abs(o.total_amount - coalesce(sum(li.unit_price * li.quantity), 0)) as delta
from {{ ref('fct_orders') }} o
left join {{ ref('fct_order_line_items') }} li
    on o.order_id = li.order_id
group by 1, 2
-- coalesce ensures orders with no line items at all are also flagged
having abs(o.total_amount - coalesce(sum(li.unit_price * li.quantity), 0)) > 0.01

-- tests/assert_no_duplicate_daily_sessions.sql
-- Each user should have at most one session record per day
-- per session_type. Multiple records indicate a deduplication
-- failure in the incremental logic.
select
    user_id,
    session_date,
    session_type,
    count(*) as duplicate_count
from {{ ref('fct_sessions') }}
group by 1, 2, 3
having count(*) > 1

Custom generic tests as macros
When the same business rule applies across multiple models, write it as a generic test macro in macros/tests/. Generic tests accept the model and column name as inputs and can have additional parameters, making them reusable across the entire project.
-- macros/tests/test_is_valid_currency_code.sql
-- Generic test: validates that a column contains only ISO 4217 currency codes
-- Usage in schema YAML:
--   - is_valid_currency_code:
--       iso_codes: ['USD', 'EUR', 'GBP', 'PLN', 'JPY']
{% test is_valid_currency_code(model, column_name, iso_codes) %}

select {{ column_name }}
from {{ model }}
where {{ column_name }} is not null
  and {{ column_name }} not in (
      -- Render the list as quoted SQL literals: 'USD', 'EUR', ...
      '{{ iso_codes | join("', '") }}'
  )

{% endtest %}
-- macros/tests/test_no_future_timestamps.sql
-- Generic test: column values must not exceed current_timestamp by more
-- than tolerance_minutes (useful for catching clock-skew or bad ETL writes)
{% test no_future_timestamps(model, column_name, tolerance_minutes=5) %}

select {{ column_name }}, current_timestamp() as check_time
from {{ model }}
where {{ column_name }} > dateadd(
    minute, {{ tolerance_minutes }}, current_timestamp()
)

{% endtest %}
Macros: Reusable SQL Logic Across Models
Macros are Jinja2 templates that generate SQL at compile time. They eliminate copy-paste patterns across models, enforce consistent business logic, and centralize logic that would otherwise diverge across dozens of models. A team without macros typically has 10 slightly different implementations of the same date dimension logic.
Macro 1: generate_schema_name (environment routing)
The most important macro in any multi-environment dbt project. Without an override, custom schemas resolve the same way in every environment, so developer models can land in the same schemas as production and collide with production tables. Override it to prefix schemas with the target schema name in non-production environments.
-- macros/generate_schema_name.sql
-- Routes models to environment-prefixed schemas in non-production targets
-- Production: target schema as configured in profiles.yml
-- Dev/CI: <developer_name>_<schema> (e.g., alice_staging, alice_marts)
-- This prevents dev models from overwriting production tables
{% macro generate_schema_name(custom_schema_name, node) -%}

    {%- set default_schema = target.schema -%}

    {%- if target.name == 'prod' -%}
        {#- Production: use schema exactly as configured in dbt_project.yml -#}
        {%- if custom_schema_name is none -%}
            {{ default_schema }}
        {%- else -%}
            {{ custom_schema_name | trim }}
        {%- endif -%}
    {%- else -%}
        {#- Non-prod: prefix with target schema (developer name or CI run ID) -#}
        {%- if custom_schema_name is none -%}
            {{ default_schema }}
        {%- else -%}
            {{ default_schema }}_{{ custom_schema_name | trim }}
        {%- endif -%}
    {%- endif -%}

{%- endmacro %}

Macro 2: date_spine (generating calendar tables)
A date spine generates a complete sequence of dates between two boundaries. It is used to ensure that time-series aggregations have a row for every date, even when no data exists — preventing silent gaps in BI dashboards.
-- macros/date_spine.sql
-- Generates a continuous date sequence without gaps
-- Usage: {{ date_spine('2020-01-01', 'current_date()', 'day') }}
{% macro date_spine(start_date, end_date, datepart='day') %}
    {{
        dbt_utils.date_spine(
            datepart=datepart,
            start_date="cast('" ~ start_date ~ "' as date)",
            end_date=end_date
        )
    }}
{% endmacro %}
-- models/dim_date.sql
-- Uses the date_spine macro to build a full calendar dimension
-- Extend with fiscal calendar, holiday flags, etc.
{{
    config(
        materialized='table',
        tags=['dimensions', 'daily']
    )
}}

with spine as (

    {{ date_spine('2020-01-01', 'dateadd(year, 2, current_date())') }}

),

enriched as (

    select
        cast(date_day as date) as date_day,
        extract(year from date_day) as year,
        extract(quarter from date_day) as quarter,
        extract(month from date_day) as month,
        extract(week from date_day) as week_of_year,
        extract(dayofweek from date_day) as day_of_week,
        dayname(date_day) as day_name,
        monthname(date_day) as month_name,
        date_trunc('week', date_day) as week_start,
        date_trunc('month', date_day) as month_start,
        date_trunc('quarter', date_day) as quarter_start,
        case when dayofweek(date_day) in (0, 6)
            then false else true end as is_weekday,
        date_day = current_date() as is_today
    from spine

)

select * from enriched

Macro 3: audit_helper (pre/post run row count auditing)
Auditing macros compare model state before and after a run to surface unexpected changes. The dbt-audit-helper package provides compare_relations and compare_column_values macros that are invaluable when migrating models or changing incremental logic.
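As a quick sketch of the package in action (the legacy model name here is illustrative), a one-off comparison can live in analyses/ while you migrate a model:

-- analyses/compare_fct_orders_migration.sql
-- Compares the old and new models row by row on the primary key;
-- dbt compile renders the query, then run it in your warehouse console
{{
    audit_helper.compare_relations(
        a_relation=ref('fct_orders_legacy'),
        b_relation=ref('fct_orders'),
        primary_key='order_id'
    )
}}

For ongoing monitoring rather than one-off comparisons, a post-hook macro can log row counts on every run: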
-- macros/audit_row_count.sql
-- Post-hook macro that writes model row counts to an audit log table
-- Assumes the dbt_audit_log table already exists in the target schema
-- Add to dbt_project.yml as a post-hook on all mart models:
--   post-hook: "{{ audit_row_count(this) }}"
{% macro audit_row_count(model) %}

    insert into {{ target.schema }}.dbt_audit_log (
        model_name,
        schema_name,
        run_started_at,
        row_count,
        run_id
    )
    select
        '{{ model.name }}' as model_name,
        '{{ model.schema }}' as schema_name,
        '{{ run_started_at }}' as run_started_at,
        count(*) as row_count,
        '{{ invocation_id }}' as run_id
    from {{ model }}

{% endmacro %}

CI/CD: Slim CI with State-Based Selection
Running dbt build on every pull request — rebuilding all models and running all tests — is the most common CI bottleneck in large dbt projects. A project with 300 models can take 45 minutes per CI run. Slim CI solves this by using dbt's state:modified selector to build only the models that changed in the PR, plus their downstream dependents.
The state method compares the manifest compiled in the current run against a reference manifest from the previous production run. Any model whose compiled SQL, config, or upstream dependencies changed is selected. Pair it with --defer to resolve unmodified upstream refs against the production environment, so CI runs don't need to rebuild unchanged dependencies.
# .github/workflows/dbt-ci.yml
# Slim CI: build only changed models + downstream, defer to prod for upstreams
# Requires: dbt artifacts from the latest prod run stored in a GCS/S3 bucket
name: dbt CI

on:
  pull_request:
    branches: [main]
    paths:
      - 'models/**'
      - 'macros/**'
      - 'tests/**'
      - 'dbt_project.yml'
      - 'packages.yml'

jobs:
  dbt-slim-ci:
    name: dbt Slim CI
    runs-on: ubuntu-latest
    environment: ci
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install dbt
        # dbt_utils and dbt_expectations are dbt packages, not pip packages;
        # they are installed from packages.yml by `dbt deps` below
        run: pip install dbt-snowflake==1.8.0

      - name: Configure dbt profile
        env:
          SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
          SNOWFLAKE_USER: ${{ secrets.SNOWFLAKE_USER }}
          SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
          SNOWFLAKE_ROLE: ${{ secrets.SNOWFLAKE_ROLE }}
          SNOWFLAKE_DATABASE: ${{ secrets.SNOWFLAKE_DATABASE }}
          SNOWFLAKE_WAREHOUSE: ${{ secrets.SNOWFLAKE_WAREHOUSE }}
        run: |
          mkdir -p ~/.dbt
          cat > ~/.dbt/profiles.yml << EOF
          my_project:
            target: ci
            outputs:
              ci:
                type: snowflake
                account: "${SNOWFLAKE_ACCOUNT}"
                user: "${SNOWFLAKE_USER}"
                password: "${SNOWFLAKE_PASSWORD}"
                role: "${SNOWFLAKE_ROLE}"
                database: "${SNOWFLAKE_DATABASE}"
                warehouse: "${SNOWFLAKE_WAREHOUSE}"
                schema: "ci_pr_${{ github.event.pull_request.number }}"
                threads: 8
          EOF

      - name: Download production manifest (state reference)
        env:
          GCS_BUCKET: ${{ secrets.GCS_ARTIFACT_BUCKET }}
        run: |
          # Assumes GCP auth was configured earlier (e.g., google-github-actions/auth)
          # Download the most recent production manifest for state comparison
          mkdir -p prod-manifest
          gsutil cp gs://${GCS_BUCKET}/dbt/prod/manifest.json prod-manifest/manifest.json

      - name: dbt deps
        run: dbt deps

      - name: dbt build — slim CI (changed + downstream)
        run: |
          dbt build \
            --select "state:modified+" \
            --defer \
            --state prod-manifest \
            --target ci \
            --threads 8 \
            --fail-fast

      - name: Upload CI manifest
        if: always()
        env:
          GCS_BUCKET: ${{ secrets.GCS_ARTIFACT_BUCKET }}
        run: |
          gsutil cp target/manifest.json \
            gs://${GCS_BUCKET}/dbt/ci/pr-${{ github.event.pull_request.number }}/manifest.json

      - name: Cleanup CI schema
        if: always()
        run: |
          # drop_schema here is a small project macro wrapping `drop schema ... cascade`
          dbt run-operation drop_schema --args '{"schema_name": "ci_pr_${{ github.event.pull_request.number }}"}'

Note
The --defer flag resolves ref() calls to unmodified models against the production environment rather than rebuilding them in CI. This means your CI run only creates tables for the models that actually changed. Without --defer, state:modified+ still selects only the changed models, but their upstream ref() calls would fail because the upstream tables don't exist in the CI schema.

Production deployment pipeline
The production pipeline runs the full dbt build on merge to main, uploads the manifest for the next CI run's state comparison, and sends a notification with the run summary. Use a separate dbt Cloud job or orchestrate with Airflow/Prefect if you need retry logic, SLA monitoring, or dependency-aware scheduling.
# .github/workflows/dbt-prod.yml
# Production deployment: runs on merge to main
# Uploads manifest artifact so next CI run can use state comparison
name: dbt Production

on:
  push:
    branches: [main]

jobs:
  dbt-prod:
    name: dbt Production Build
    runs-on: ubuntu-latest
    environment: production
    concurrency:
      group: dbt-production
      cancel-in-progress: false  # Never cancel a running prod build
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'pip'

      - name: Install dbt
        # dbt packages (dbt_utils, dbt_expectations) are installed by `dbt deps`
        run: pip install dbt-snowflake==1.8.0

      - name: Configure dbt profile (production)
        run: |
          mkdir -p ~/.dbt
          # Quoted delimiter: the env_var() calls are resolved by dbt at runtime
          cat > ~/.dbt/profiles.yml << 'PROFILE'
          my_project:
            target: prod
            outputs:
              prod:
                type: snowflake
                account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
                user: "{{ env_var('SNOWFLAKE_USER') }}"
                password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
                role: TRANSFORMER
                database: ANALYTICS
                warehouse: TRANSFORM_WH
                schema: marts
                threads: 16
          PROFILE

      - name: dbt deps
        run: dbt deps

      - name: dbt build — full production run
        # env_var() in profiles.yml requires these at dbt runtime
        env:
          SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
          SNOWFLAKE_USER: ${{ secrets.SNOWFLAKE_USER }}
          SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
        run: |
          dbt build \
            --target prod \
            --threads 16 \
            --full-refresh  # Remove for daily incremental runs

      - name: Upload manifest to GCS (for slim CI state reference)
        if: success()
        env:
          GCS_BUCKET: ${{ secrets.GCS_ARTIFACT_BUCKET }}
        run: |
          gsutil cp target/manifest.json gs://${GCS_BUCKET}/dbt/prod/manifest.json
          gsutil cp target/run_results.json gs://${GCS_BUCKET}/dbt/prod/run_results.json

      - name: dbt docs generate and deploy
        if: success()
        env:
          SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
          SNOWFLAKE_USER: ${{ secrets.SNOWFLAKE_USER }}
          SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
          GCS_BUCKET: ${{ secrets.GCS_ARTIFACT_BUCKET }}
        run: |
          dbt docs generate
          gsutil -m rsync -r target/ gs://${GCS_BUCKET}/dbt-docs/

Production Project Structure
A well-structured dbt project separates models by layer: staging (raw data cleanup), intermediate (business logic joins), and marts (dimensional models for BI). Each layer has its own schema, materialization defaults, and tagging strategy. This separation makes it easy to understand the data lineage and apply different access controls per layer.
# dbt_project.yml
# Production-ready project configuration
# Enforces materialization defaults, schemas, and tags per layer
name: 'my_analytics'
version: '1.0.0'
config-version: 2
profile: 'my_project'

model-paths: ["models"]
test-paths: ["tests"]
macro-paths: ["macros"]
seed-paths: ["seeds"]
analysis-paths: ["analyses"]

target-path: "target"
clean-targets: ["target", "dbt_packages"]

# Run hooks: write an audit summary at the end of every invocation
# (on-run-end is a top-level key, not a models: config)
on-run-end:
  - "{{ audit_run_summary() }}"

models:
  my_analytics:
    # Staging: raw source cleanup, views only, no business logic
    staging:
      +materialized: view
      +schema: staging
      +tags: ['staging']
    # Intermediate: complex joins and aggregations (ephemeral or views)
    intermediate:
      +materialized: ephemeral
      +schema: intermediate
      +tags: ['intermediate']
    # Marts: dimensional models for BI tools and downstream consumers
    # Config keys match directories (not filename prefixes), so facts and
    # dimensions live in their own subfolders under models/marts/
    marts:
      +materialized: table
      +schema: marts
      +tags: ['marts']
      # models/marts/facts/: large fact tables, incremental by default
      facts:
        +materialized: incremental
      # models/marts/dimensions/: full rebuild (usually small)
      dimensions:
        +materialized: table

seeds:
  my_analytics:
    +schema: seeds
    +tags: ['seeds']

tests:
  my_analytics:
    +severity: error
    # Override severity for specific layers
    staging:
      +severity: warn  # Don't block CI on staging data quality issues

Production Readiness Checklist
Before promoting a dbt project to production, verify these practices are in place. They prevent the most common failure modes: incremental models silently returning wrong results, CI taking too long, and schema changes breaking downstream BI reports without warning.
Every incremental model has a full-refresh test in CI
Run dbt build --full-refresh periodically (weekly or on schema changes) to verify that the full-rebuild path still works. Incremental models that have drifted from the full-refresh result are silently incorrect. Set up a weekly CI job that runs with --full-refresh and compares row counts against the incremental result.
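A minimal sketch of such a job, assuming the same profile setup as the slim CI workflow above (the schedule and target are illustrative):

# .github/workflows/dbt-weekly-full-refresh.yml
name: dbt Weekly Full Refresh
on:
  schedule:
    - cron: '0 3 * * 0'  # Sundays at 03:00 UTC
jobs:
  full-refresh:
    runs-on: ubuntu-latest
    environment: ci
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dbt
        run: pip install dbt-snowflake==1.8.0
      # ... configure the dbt profile as in the slim CI workflow ...
      - name: dbt deps
        run: dbt deps
      - name: Rebuild all incremental models from scratch
        run: dbt build --select "config.materialized:incremental" --full-refresh --target ci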
Source freshness checks are wired to the pipeline
Run dbt source freshness before dbt build and block the run if sources are stale. A transformation that runs on stale data produces stale results, and the downstream dashboards will look correct but be hours behind. Configure loaded_at_field and freshness blocks in every source definition.
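As a sketch (source names and thresholds are illustrative), a freshness block in a source definition looks like this; dbt source freshness then warns or errors when the loader falls behind, and a non-zero exit code can gate the build step:

# models/staging/shop/_shop__sources.yml
version: 2
sources:
  - name: shop
    database: raw
    schema: shop
    loaded_at_field: _loaded_at  # timestamp written by the ingestion tool
    freshness:
      warn_after: {count: 6, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: orders
      - name: order_line_items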
Contracts are enforced on mart models
Enable contract: enforced: true on all models that BI tools consume directly. Contracts validate that column names and data types match the schema definition at compile time, catching breaking changes before they reach production. Without contracts, a column rename silently breaks downstream reports.
Test coverage includes at least one singular test per fact table
Generic tests (unique, not_null, relationships) are necessary but not sufficient. Each fact table should have at least one singular test validating a core business invariant — for example, that revenue totals balance, that session counts are non-negative, or that order statuses follow a valid state machine. These catch logic bugs that generic tests cannot.
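For instance, a sketch of a state-machine check against fct_orders (the status semantics here are assumptions about the example schema):

-- tests/assert_order_status_state_machine.sql
-- A shipped order must have shipped_at; a delivered order must have both
-- shipped_at and delivered_at. Rows returned here violate the state machine.
select order_id, status, shipped_at, delivered_at
from {{ ref('fct_orders') }}
where (status = 'shipped' and shipped_at is null)
   or (status = 'delivered' and (shipped_at is null or delivered_at is null))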
Slim CI is configured and the manifest artifact is uploaded
Without slim CI, CI run times grow linearly with the project size. With 300+ models and 500+ tests, a full CI run can take an hour. Slim CI with state:modified+ and --defer reduces this to 2–5 minutes for typical PRs. Ensure the production manifest is uploaded after every successful prod run, otherwise the state comparison falls back to a full build.
Exposures document downstream consumers
Define exposures for every dashboard, API endpoint, or ML feature that consumes a dbt model. Exposures make the downstream impact of model changes visible in the DAG, enable impact analysis before breaking changes, and document the business use of every model. Teams that skip exposures discover the business impact of model renames after the fact.
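A sketch of an exposure definition (the name, URL, and owner are illustrative):

# models/marts/exposures.yml
version: 2
exposures:
  - name: revenue_dashboard
    label: "Weekly Revenue Dashboard"
    type: dashboard
    maturity: high
    url: https://bi.example.com/dashboards/revenue
    description: "Executive revenue reporting consumed by the leadership team"
    depends_on:
      - ref('fct_orders')
      - ref('dim_customers')
    owner:
      name: Analytics Team
      email: analytics@example.com

With this in place, dbt ls --select +exposure:revenue_dashboard lists every model the dashboard depends on, which makes impact analysis a one-liner.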
Running dbt in production and hitting scaling or reliability challenges?
We help analytics and data engineering teams implement production-grade dbt workflows — from incremental model design and test coverage to macro libraries, CI/CD pipelines, and monitoring. Let’s talk.
Get in Touch