Back to Blog
dbtAnalytics EngineeringSQLData EngineeringJinja2CI/CDTestingMacros

dbt Advanced Patterns — Macros, Packages, Custom Tests, and Multi-Environment Deployments

A deep-dive into advanced dbt patterns for analytics engineering teams: writing parametric Jinja2 macros with dispatch for adapter-specific overrides, overriding generate_schema_name for multi-tenant schemas, using dbt-utils and dbt-expectations packages, authoring custom generic and singular tests with store_failures, multi-environment profiles.yml with target and env_var, SCD Type 2 snapshots with timestamp and check strategies, and slim CI with state-based selection and GitHub Actions.

2026-05-24

Why Advanced dbt Patterns Matter

Basic dbt — incremental models, schema tests, simple ref() chains — only takes analytics engineering teams so far. In production, with dozens of engineers and hundreds of models, the cracks start to show. Copy-paste SQL proliferates as teams replicate business logic across models because there is no shared macro library. Test coverage remains shallow because the four built-in generic tests (not_null, unique, accepted_values, relationships) cannot express complex business rules. Multi-environment deployments break because dev, staging, and production schemas are handled with ad-hoc workarounds rather than first-class dbt constructs. CI pipelines run the full model graph on every pull request — a process that takes twenty minutes and gets skipped under deadline pressure.

Advanced dbt patterns address each of these failure modes systematically. Jinja2 macros with the dispatch mechanism replace copy-paste SQL with parametric, adapter-aware functions. dbt-expectations and custom generic tests give teams the expressivity to encode business-specific data contracts directly in YAML. Profiles with multiple targets and env_var() injection make environment promotion a configuration concern, not a code change. Slim CI with state:modified+ selection cuts CI runtime from twenty minutes to two by running only the models affected by a given PR.

Note

This article assumes familiarity with core dbt concepts: models, sources, tests, and incremental materialization. If you are new to dbt, the official introduction is a thorough starting point. The patterns here apply to both dbt Core and dbt Cloud, and the code examples are written for BigQuery and Snowflake unless otherwise noted — adapter differences are called out explicitly.

Advanced Macro Development — Jinja2, Dispatch, and Global Overrides

dbt macros are Jinja2 functions stored in the macros/ directory and called from model SQL, other macros, or schema YAML. Beyond simple string-interpolation helpers, macros enable DRY SQL generation for patterns that repeat across models: safe division, surrogate key hashing, audit column injection, and environment-specific schema routing. The Jinja2 templating engine gives macros the full power of Python-like control flow — loops, conditionals, variable scoping — which makes them far more capable than SQL views or CTEs.

The dispatch Mechanism

The dispatch mechanism lets you write adapter-specific implementations of the same macro and have dbt select the correct one at compile time based on the target adapter (BigQuery, Snowflake, Redshift, DuckDB, etc.). dbt-utils uses dispatch extensively — every macro like dbt_utils.surrogate_key has adapter-specific variants (e.g., bigquery__surrogate_key, default__surrogate_key) that get resolved at compile time. You can override any dispatched macro in your own project by creating a macro with the naming convention <adapter>__<macro_name> or by adding your package to the dispatch search order indbt_project.yml.

-- macros/safe_divide.sql
-- Safe division macro: returns null instead of raising a ZeroDivisionError.
-- Usage: {{ safe_divide('revenue', 'sessions') }}
-- Adapter-aware: BigQuery uses SAFE_DIVIDE(), others use CASE WHEN.

{% macro safe_divide(numerator, denominator) %}
  {{ return(adapter.dispatch('safe_divide', 'my_project')(numerator, denominator)) }}
{% endmacro %}

{% macro default__safe_divide(numerator, denominator) %}
  CASE
    WHEN {{ denominator }} = 0 OR {{ denominator }} IS NULL THEN NULL
    ELSE {{ numerator }} / NULLIF({{ denominator }}, 0)
  END
{% endmacro %}

{% macro bigquery__safe_divide(numerator, denominator) %}
  SAFE_DIVIDE({{ numerator }}, {{ denominator }})
{% endmacro %}

{% macro snowflake__safe_divide(numerator, denominator) %}
  IFF({{ denominator }} = 0 OR {{ denominator }} IS NULL, NULL,
    {{ numerator }} / {{ denominator }})
{% endmacro %}


-- macros/generate_schema_name.sql
-- Override the default schema name resolution.
-- In production: use the custom schema as-is (no prefix).
-- In dev/staging: prefix with the target name to isolate developer schemas.
-- This prevents dev models from writing into production schemas.

{% macro generate_schema_name(custom_schema_name, node) -%}
  {%- set default_schema = target.schema -%}
  {%- if custom_schema_name is none -%}
    {{ default_schema }}
  {%- elif target.name == 'prod' -%}
    {#- In production, use the exact custom schema — no target prefix -#}
    {{ custom_schema_name | trim }}
  {%- else -%}
    {#- In dev/staging, prefix with target schema to isolate -#}
    {{ default_schema }}_{{ custom_schema_name | trim }}
  {%- endif -%}
{%- endmacro %}


-- macros/create_audit_table.sql
-- Uses run_query to execute DDL inside a macro.
-- Creates an audit log table if it does not already exist.
-- Call from an on-run-start hook in dbt_project.yml.

{% macro create_audit_table() %}
  {% set create_sql %}
    CREATE TABLE IF NOT EXISTS {{ target.schema }}.dbt_audit_log (
      run_started_at  TIMESTAMP,
      model_name      STRING,
      rows_affected   INT64,
      invocation_id   STRING
    )
  {% endset %}

  {% if execute %}
    {% set result = run_query(create_sql) %}
    {{ log("Audit table created or already exists", info=True) }}
  {% endif %}
{% endmacro %}

Note

The {% if execute %} guard inside macros that call run_query is critical. dbt parses and compiles every macro during the parse phase — before any SQL is executed. Without the guard, dbt will attempt to run your DDL statement during parsing, which fails because no database connection is available at that point. Always wrap run_query calls in {% if execute %}.

dbt Packages — dbt-utils, dbt-expectations, and Writing Your Own

dbt packages are reusable collections of models, macros, and tests installed via the dbt Hub or directly from a Git repository. They are declared in packages.yml at the project root and installed with dbt deps. Two packages are nearly universal in production dbt projects: dbt-utils for SQL utility macros and dbt-expectations for Great Expectations-style data quality tests.

dbt-utils Key Macros

The most valuable macros from dbt-utils in production are: surrogate_key for generating stable MD5-hashed primary keys from composite natural keys; star for selecting all columns from a ref while explicitly excluding a list; union_relations for safely unioning multiple tables with differing column sets (adds nulls for missing columns); get_column_values for dynamically reading distinct values from a column at compile time (used to build dynamic pivots); and pivot for generating column-per-value pivot SQL without manual repetition.

# packages.yml — install at project root, run: dbt deps

packages:
  - package: dbt-labs/dbt_utils
    version: [">=1.1.0", "<2.0.0"]

  - package: calogica/dbt_expectations
    version: [">=0.10.0", "<1.0.0"]

  # Install a private package from GitHub
  - git: "https://github.com/your-org/dbt-internal-utils.git"
    revision: v1.2.0


-- models/fct_orders.sql
-- Using surrogate_key to generate a stable, adapter-agnostic primary key.
-- surrogate_key concatenates the columns and hashes the result with MD5.

SELECT
  {{ dbt_utils.surrogate_key(['order_id', 'source_system']) }} AS order_sk,
  order_id,
  source_system,
  customer_id,
  order_date,
  total_amount_cents,
  status
FROM {{ ref('stg_orders') }}


-- models/fct_events_pivoted.sql
-- Dynamic pivot using get_column_values + pivot.
-- Reads distinct event_type values at compile time — no hardcoding required.

{%- set event_types = dbt_utils.get_column_values(
      ref('stg_events'),
      'event_type',
      default=[]
    ) -%}

SELECT
  user_id,
  date_trunc('day', event_at) AS event_date,
  {{ dbt_utils.pivot(
       'event_type',
       event_types,
       agg='count',
       then_value=1,
       else_value=0,
       suffix='_count'
     ) }}
FROM {{ ref('stg_events') }}
GROUP BY 1, 2


# schema.yml — dbt-expectations tests
# Great Expectations-style assertions in YAML

version: 2

models:
  - name: fct_orders
    columns:
      - name: order_sk
        tests:
          - not_null
          - unique

      - name: total_amount_cents
        tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_between:
              min_value: 0
              max_value: 10000000   # 100,000 USD in cents — flag absurd values
              config:
                severity: warn

      - name: order_date
        tests:
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: date

  - name: fct_order_items
    tests:
      # Table-level test: unit_price must always be <= order_item_total
      - dbt_expectations.expect_column_pair_values_A_to_be_greater_than_B:
          column_A: order_item_total_cents
          column_B: unit_price_cents
          or_equal: true

Writing and Publishing Your Own Package

Internal packages let you share macros, generic tests, and even staging models across multiple dbt projects in your organization. The structure mirrors a regular dbt project: a dbt_project.yml declaring the package name and version, a macros/ directory with shared helpers, and an optional macros/tests/ directory for custom generic tests. Version the package with Git tags and reference specific revisions in consumer projects. For cross-adapter dispatch, define a dispatch key in dbt_project.yml to control macro resolution order.

# dbt_project.yml for your internal package (my-org/dbt-internal-utils)

name: 'dbt_internal_utils'
version: '1.0.0'
config-version: 2

# Dispatch order: project overrides come first, then this package, then dbt_utils
dispatch:
  - macro_namespace: dbt_utils
    search_order: ['dbt_internal_utils', 'dbt_utils']

# No models in a utilities package — macros and tests only
# models:  (leave empty or omit)

# Declare this so consumers can reference it
require-dbt-version: [">=1.5.0", "<2.0.0"]

Custom Generic and Singular Tests

dbt supports two categories of tests beyond the four built-in generics. Generic tests are Jinja2 SQL templates stored in macros/tests/ (or any subdirectory of macros/) that accept a model and column as arguments and return a query whose result count must be zero to pass. They are reusable across any model by referencing the test name in schema YAML. Singular tests are standalone SQL files in the tests/ directory that return failing rows — useful for complex multi-table referential integrity checks that do not generalize to a parameter template.

The store_failures configuration tells dbt to materialize the failing rows to a table in the database rather than just returning a count. This makes test failures auditable — analysts and data consumers can query the failure table to understand exactly which records violated the contract. Combined with --store-failures in CI, your pipeline writes a snapshot of every test failure to the warehouse, enabling retrospective analysis of data quality trends over time.

-- macros/tests/expect_at_least_one_value.sql
-- Custom generic test: asserts that a column contains at least one non-null,
-- non-empty occurrence of a given value.
-- Usage in schema.yml:
--   tests:
--     - expect_at_least_one_value:
--         value: 'completed'

{% test expect_at_least_one_value(model, column_name, value) %}

WITH value_check AS (
  SELECT COUNT(*) AS match_count
  FROM {{ model }}
  WHERE {{ column_name }} = '{{ value }}'
)
SELECT match_count
FROM value_check
WHERE match_count = 0   -- test fails (returns rows) when no matches found

{% endtest %}


-- macros/tests/expect_no_orphan_fk.sql
-- Generic referential integrity test.
-- Asserts every value in column_name has a matching row in the parent table.
-- More flexible than the built-in 'relationships' test because it supports
-- multi-column joins and custom WHERE filters.

{% test expect_no_orphan_fk(model, column_name, parent_model, parent_column) %}

SELECT child.{{ column_name }}
FROM {{ model }} AS child
LEFT JOIN {{ ref(parent_model) }} AS parent
  ON child.{{ column_name }} = parent.{{ parent_column }}
WHERE child.{{ column_name }} IS NOT NULL
  AND parent.{{ parent_column }} IS NULL

{% endtest %}


-- tests/assert_order_items_match_orders.sql
-- Singular test: complex referential integrity check that cannot be
-- expressed as a generic template.
-- Returns rows that indicate a data integrity violation.

SELECT
  oi.order_item_id,
  oi.order_id,
  oi.created_at
FROM {{ ref('fct_order_items') }} AS oi
LEFT JOIN {{ ref('fct_orders') }} AS o
  ON oi.order_id = o.order_id
WHERE o.order_id IS NULL
  -- Only check recent data to avoid performance issues on large tables
  AND oi.created_at >= CURRENT_DATE - INTERVAL '30 days'


# schema.yml — using custom tests with severity and store_failures

version: 2

models:
  - name: fct_orders
    tests:
      # Table-level singular test equivalent via generic test
      - expect_at_least_one_value:
          column_name: status
          value: 'completed'
          config:
            severity: warn           # warn instead of error so CI still passes

    columns:
      - name: customer_id
        tests:
          - expect_no_orphan_fk:
              parent_model: dim_customers
              parent_column: customer_id
              config:
                severity: error
                store_failures: true          # materialize failures to warehouse
                store_failures_as: table      # 'table' or 'view'

Note

When store_failures: true is set, dbt writes failing rows to a table in a schema named <target_schema>_dbt_test__audit by default. You can customize the failure schema with the schema config key on the test. Run dbt test --store-failures in CI to enable this globally without modifying every test YAML. These failure tables are incremental by default in dbt 1.6+ — they append new failures rather than truncating, giving you a historical audit trail.

Multi-Environment Deployments — Profiles, Targets, and Vars

dbt's profiles.yml defines named connection profiles with multiple targets — one per environment (dev, staging, prod). Each target specifies the database connection parameters, the target schema, and any environment-specific settings. The active target is selected at runtime via the --target flag or the DBT_TARGET environment variable. Inside model SQL and macros, the {{ target.name }} variable resolves to the name of the active target, enabling conditional logic for environment-specific behavior.

For secrets and deployment-specific parameters, use env_var() to inject environment variables at compile time. This avoids hardcoding credentials or environment-specific values in version-controlled files. Combine env_var() with var() for values that should be overridable both via environment and CLI: {{ var('start_date', env_var('DBT_START_DATE', '2024-01-01')) }}.

# profiles.yml (~/.dbt/profiles.yml or project root for dbt Cloud)
# Three-environment setup: dev, staging, prod
# Credentials always come from env vars — never hardcoded.

analytics:
  target: dev  # default target when --target is not specified

  outputs:
    dev:
      type: bigquery
      method: oauth          # use gcloud auth for local dev
      project: my-project-dev
      dataset: "{{ env_var('DBT_SCHEMA', 'analytics_dev') }}"
      location: EU
      threads: 4
      timeout_seconds: 300

    staging:
      type: bigquery
      method: service-account
      project: my-project-staging
      dataset: analytics_staging
      keyfile: "{{ env_var('GOOGLE_APPLICATION_CREDENTIALS') }}"
      location: EU
      threads: 8
      timeout_seconds: 600

    prod:
      type: bigquery
      method: service-account
      project: my-project-prod
      dataset: analytics
      keyfile: "{{ env_var('GOOGLE_APPLICATION_CREDENTIALS') }}"
      location: EU
      threads: 16
      timeout_seconds: 900


# dbt_project.yml — environment-specific model configs and grants

name: 'analytics'
version: '1.0.0'
config-version: 2

profile: 'analytics'

model-paths: ["models"]
test-paths: ["tests"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

models:
  analytics:
    # Staging layer: views everywhere
    staging:
      +materialized: view
      +schema: staging

    # Intermediate layer: ephemeral in dev (no warehouse cost), table in prod
    intermediate:
      +materialized: "{{ 'ephemeral' if target.name == 'dev' else 'table' }}"

    # Marts: tables in prod and staging, views in dev for fast iteration
    marts:
      +materialized: "{{ 'table' if target.name in ['prod', 'staging'] else 'view' }}"
      +schema: marts
      # Grant SELECT to the reporting role in production only
      +grants:
        select:
          - "{{ 'reporting_role' if target.name == 'prod' else 'dev_role' }}"

# on-run-start hooks run before any models execute
on-run-start:
  - "{{ create_audit_table() }}"


-- models/marts/fct_revenue.sql
-- Model that branches behavior based on target environment.
-- In prod: scan the last 90 days only for cost efficiency.
-- In dev: scan only the last 7 days to keep query cost low.

{% set lookback_days = 90 if target.name == 'prod' else 7 %}
{% set source_project = env_var('DBT_SOURCE_PROJECT', 'my-project-dev') %}

SELECT
  date_trunc('{{ var("grain", "day") }}', event_at)   AS period,
  tenant_id,
  SUM(revenue_cents) / 100.0                          AS revenue_usd,
  COUNT(DISTINCT order_id)                            AS order_count,
  COUNT(DISTINCT customer_id)                         AS unique_customers
FROM `{{ source_project }}.raw.orders`
WHERE event_at >= CURRENT_DATE - INTERVAL {{ lookback_days }} DAY
  AND status = 'completed'
GROUP BY 1, 2

Note

Grants are a first-class dbt resource config since dbt 1.2. When grants is set on a model, dbt executes the appropriate GRANT SELECT ON ... statement after each model run — keeping access control co-located with model definition. In BigQuery, grants map to IAM bindings. In Snowflake and Redshift, they map to SQL GRANT statements. dbt does not revoke grants by default; set +grant_access_to carefully for strict access control scenarios.

Snapshots — SCD Type 2 for Slowly Changing Dimensions

dbt snapshots implement Slowly Changing Dimension Type 2 (SCD2): they track historical changes to source records by appending rows with validity windows instead of overwriting. Each snapshot execution compares the current state of the source against the snapshot table and inserts new rows for changed records, closing the previous row's validity window. The result is a table where every version of every record is preserved, queryable as-of any point in time using the dbt_valid_from and dbt_valid_to columns added by dbt.

Two strategies are available. The timestamp strategy uses an updated_at column on the source: a record is considered changed when its updated_at is later than the last snapshot. This is efficient and reliable when sources maintain accurate update timestamps. The check strategy compares a specified list of columns value-by-value: a record is considered changed when any watched column differs from the previous snapshot. Use the check strategy for sources that do not have reliable update timestamps.

-- snapshots/snp_customers.sql
-- SCD Type 2 snapshot of the customers table using timestamp strategy.
-- Run with: dbt snapshot

{% snapshot snp_customers %}

{{
  config(
    target_schema='snapshots',
    unique_key='customer_id',
    strategy='timestamp',
    updated_at='updated_at',   -- source column indicating last modification
    invalidate_hard_deletes=True  -- close validity window for deleted records
  )
}}

SELECT
  customer_id,
  email,
  first_name,
  last_name,
  plan_tier,        -- SCD: we want history when customers upgrade/downgrade
  country_code,
  created_at,
  updated_at
FROM {{ source('crm', 'customers') }}

{% endsnapshot %}


-- snapshots/snp_product_prices.sql
-- Check strategy: no reliable updated_at, so we watch specific columns.

{% snapshot snp_product_prices %}

{{
  config(
    target_schema='snapshots',
    unique_key='product_id',
    strategy='check',
    check_cols=['price_cents', 'currency', 'active'],
    invalidate_hard_deletes=True
  )
}}

SELECT
  product_id,
  sku,
  price_cents,
  currency,
  active
FROM {{ source('catalog', 'products') }}

{% endsnapshot %}


-- models/marts/fct_revenue_with_plan.sql
-- Join against the customers snapshot for point-in-time plan tier.
-- For each order, look up the customer's plan tier AT THE TIME of the order —
-- not their current plan tier. This is the core value of SCD2 snapshots.

SELECT
  o.order_id,
  o.customer_id,
  o.order_date,
  o.total_amount_cents,
  -- Point-in-time plan tier: what plan was the customer on when they ordered?
  c.plan_tier AS plan_tier_at_order_time,
  -- Current plan tier for comparison
  c_current.plan_tier AS current_plan_tier
FROM {{ ref('fct_orders') }} AS o

-- Join to the snapshot with validity window for point-in-time accuracy
LEFT JOIN {{ ref('snp_customers') }} AS c
  ON o.customer_id = c.customer_id
 AND o.order_date >= c.dbt_valid_from
 AND (o.order_date < c.dbt_valid_to OR c.dbt_valid_to IS NULL)

-- Join to the snapshot for current state (dbt_valid_to IS NULL = current record)
LEFT JOIN {{ ref('snp_customers') }} AS c_current
  ON o.customer_id = c_current.customer_id
 AND c_current.dbt_valid_to IS NULL

Note

Run snapshots before your dbt model runs — use dbt snapshot && dbt run in your CI/CD pipeline or orchestrator. Snapshots are append-only operations that read from sources, so they should execute first to capture the current state before any transformations. Avoid running snapshots more frequently than your source data changes — hourly snapshots of a table that updates once a day simply add unnecessary rows with identical data.

Slim CI with State-Based Selection

In large dbt projects, running the full model graph on every pull request is prohibitively expensive — both in time and warehouse cost. Slim CI solves this by using dbt's state-based selectionto run only the models changed in the PR and their downstream dependents. The mechanism relies on comparing the current project's manifest.json (produced by the current run) against the manifest.json from the previous successful run on the target branch. dbt compares model hashes to determine which models have changed.

The state:modified+ selector matches models whose SQL, config, or upstream sources have changed, plus all of their downstream dependents (+ suffix). Combined with dbt build (which runs models, tests, snapshots, and seeds in topological order), dbt build --select state:modified+ gives you a minimal, safe CI execution that proves the changed models and their dependents work correctly without touching unrelated parts of the graph.

Additionally, integrate dbt source freshness into your pipeline to check that source tables have been updated within their expected freshness windows before running transformations. Stale sources are a common cause of incorrect downstream results, and catching them at the start of the pipeline avoids wasted compute on outdated data.

# .github/workflows/dbt-slim-ci.yml
# Slim CI: run only changed models and their dependents on every PR.
# Downloads the previous manifest.json from the last successful production run
# and uses it as the state comparison baseline.

name: dbt Slim CI

on:
  pull_request:
    branches: [main]
    paths:
      - 'models/**'
      - 'macros/**'
      - 'tests/**'
      - 'snapshots/**'
      - 'dbt_project.yml'
      - 'packages.yml'

jobs:
  slim-ci:
    name: dbt Slim CI
    runs-on: ubuntu-latest
    timeout-minutes: 30

    env:
      DBT_TARGET: ci
      GOOGLE_APPLICATION_CREDENTIALS: /tmp/gcp-sa.json

    steps:
      - name: Checkout PR branch
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install dbt dependencies
        run: |
          pip install dbt-bigquery==1.8.*
          dbt deps

      - name: Write GCP service account key
        run: echo '${{ secrets.GCP_SA_KEY }}' > /tmp/gcp-sa.json

      # Download the previous manifest from the last successful production run.
      # We store it as a GitHub Actions artifact after each main branch deploy.
      - name: Download previous manifest (state baseline)
        uses: dawidd6/action-download-artifact@v6
        with:
          workflow: dbt-production.yml
          branch: main
          name: dbt-manifest
          path: ./state/
        continue-on-error: true   # first run will not have a previous manifest

      # Check source freshness before running any models
      - name: Check source freshness
        run: |
          dbt source freshness             --target ${{ env.DBT_TARGET }}             --profiles-dir .
        continue-on-error: true   # warn on stale sources but don't block CI

      # Run only models changed in this PR + their dependents
      # --defer: use production-run results for unmodified upstream models
      #          so tests on changed models can reference prod data
      - name: dbt build (slim CI — modified models only)
        run: |
          dbt build             --select state:modified+             --defer             --state ./state/             --target ${{ env.DBT_TARGET }}             --profiles-dir .             --store-failures             --fail-fast
        env:
          DBT_SCHEMA: "ci_pr_${{ github.event.number }}"

      # Upload the manifest from this run so the next PR can use it
      - name: Upload manifest artifact
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: dbt-manifest-ci
          path: ./target/manifest.json
          retention-days: 7

      # Notify on failure
      - name: Notify Slack on failure
        if: failure()
        uses: slackapi/slack-github-action@v1
        with:
          payload: |
            {
              "text": ":x: dbt Slim CI failed on PR #${{ github.event.number }}",
              "blocks": [
                {
                  "type": "section",
                  "text": {
                    "type": "mrkdwn",
                    "text": ":x: *dbt Slim CI failed* — PR #${{ github.event.number }} <${{ github.event.pull_request.html_url }}|View PR>"
                  }
                }
              ]
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
          SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK

Note

The --defer flag is the companion to slim CI: when running only modified models, dbt cannot resolve upstream refs that were not built in this run. With --defer, dbt falls back to the production database for unbuilt upstream models — so tests on your changed model can still reference real data from the production run. This requires the state manifest to include run results (run_results.json) alongside the manifest, and the CI service account must have read access to the production dataset.

Further Reading

  • Jinja2 and Macros in dbt — complete reference for macro syntax, context variables, filters, the run_query function, and the dispatch mechanism for adapter-specific overrides
  • dbt Hub — package registry for discovering and installing community and vendor dbt packages including dbt-utils, dbt-expectations, and adapter-specific utility packages
  • State Comparison Caveats — important edge cases in state-based selection: when models are considered modified even if SQL did not change, cross-version manifest compatibility, and freshness comparisons for sources
  • dbt-expectations on GitHub — full list of available test macros, configuration options, version compatibility matrix, and contribution guide for adding new expectations
  • dbt-project-evaluator — official dbt Labs package that audits your project for best practices violations: missing tests, missing documentation, fan-in/fan-out model DAG anti-patterns, and direct source references in non-staging models

Work with us

Running dbt in production and hitting walls with copy-paste SQL, weak test coverage, or brittle multi-environment deployments?

We help analytics engineering teams implement advanced dbt patterns — from macro libraries and package ecosystems to custom test frameworks, multi-environment deployment strategies, SCD Type 2 snapshots, and slim CI pipelines that keep your warehouse reliable. Let’s talk.

Get in touch

Related Articles

DataSOps Consulting

Need help implementing this in production?

We build and operate data pipelines, AI systems, and observability stacks for engineering teams. Reach out for a free 30-minute architecture review.