The Three Pillars Problem
Most engineering teams have observability — they just have it in three disconnected silos. Traces live in Jaeger, metrics in Prometheus, logs in Elasticsearch. When an incident hits, engineers tab-switch between tools, mentally correlating timestamps and request IDs. The data is there; the connections are not.
OpenTelemetry (OTel) solves this by providing a single, vendor-neutral instrumentation layer for all three signals. Instrument your code once, and export traces, metrics, and logs to any backend — Grafana Tempo, Jaeger, Prometheus, Datadog, or all of them simultaneously. No vendor lock-in, no re-instrumentation when you switch backends.
This article walks through practical OTel adoption: how the Collector works, how to instrument services in multiple languages, how to correlate signals, and how to avoid the pitfalls that trip up most teams in production.
OpenTelemetry Architecture
OTel consists of three layers: the API (interfaces for instrumentation), the SDK (implementation with processors and exporters), and the Collector (a standalone service that receives, processes, and exports telemetry). Understanding the boundaries between these layers is key to a clean deployment.
API Layer
The API defines how you create spans, record metrics, and emit log records. Libraries and frameworks instrument against the API only — they never depend on the SDK. This means library authors can add OTel instrumentation without forcing users into a specific backend or even requiring the SDK to be present. If no SDK is configured, the API is a no-op with zero overhead.
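The no-op default is easiest to see in miniature. The sketch below is a simplified illustration of the API/SDK split in plain Python, not the actual OpenTelemetry source: library code calls a stable API, and telemetry is only recorded if something (an "SDK") has registered a real implementation.

```python
class NoopSpan:
    """Default span: every operation is a cheap no-op."""
    def set_attribute(self, key, value): pass
    def end(self): pass

class NoopTracer:
    def start_span(self, name):
        return NoopSpan()

class RecordingSpan(NoopSpan):
    """What an SDK-backed span might look like."""
    def __init__(self, name):
        self.name = name
        self.attributes = {}
    def set_attribute(self, key, value):
        self.attributes[key] = value

class RecordingTracer(NoopTracer):
    def start_span(self, name):
        return RecordingSpan(name)

# The global provider defaults to no-op; an "SDK" replaces it at startup.
_tracer = NoopTracer()

def set_tracer(tracer):   # called once by the "SDK" at startup
    global _tracer
    _tracer = tracer

def get_tracer():         # called by libraries — API only
    return _tracer

# Library code instruments against the API and works either way:
span = get_tracer().start_span("db.query")
span.set_attribute("db.system", "postgresql")
span.end()                # no-op by default, effectively zero overhead

set_tracer(RecordingTracer())   # "SDK" configured
span = get_tracer().start_span("db.query")
span.set_attribute("db.system", "postgresql")
```

The library code is identical in both cases; only the globally registered implementation changes.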
SDK Layer
The SDK provides the concrete implementation: span processors, metric readers, log record processors, samplers, and exporters. Your application configures the SDK at startup to wire instrumentation to backends. The SDK also handles batching, retry, and resource attribution — attaching metadata like service name, version, and environment to every signal.
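The batching the SDK performs can be sketched in a few lines. This is a toy model of a batch span processor under stated assumptions (inline flushing, spans as plain values), not the real SDK class, which does this on a background thread:

```python
import time

class BatchProcessor:
    """Toy model of an SDK batch processor: buffer finished spans and
    flush when the batch is full or a timeout elapses."""

    def __init__(self, export_fn, max_batch=512, timeout_s=5.0):
        self.export_fn = export_fn
        self.max_batch = max_batch
        self.timeout_s = timeout_s
        self.buffer = []
        self.last_flush = time.monotonic()

    def on_end(self, span):
        self.buffer.append(span)
        if (len(self.buffer) >= self.max_batch
                or time.monotonic() - self.last_flush >= self.timeout_s):
            self.flush()

    def flush(self):
        if self.buffer:
            self.export_fn(self.buffer)   # one network call per batch
            self.buffer = []
        self.last_flush = time.monotonic()

exported = []
proc = BatchProcessor(exported.append, max_batch=3, timeout_s=60.0)
for name in ["a", "b", "c", "d"]:
    proc.on_end(name)
# "a".."c" went out as one batch of 3; "d" stays buffered until flush
proc.flush()
```

Batching is why a Collector outage rarely loses recent data immediately: spans sit in the buffer until an export succeeds or the buffer overflows.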
Collector
The OTel Collector is a vendor-agnostic proxy that sits between your applications and your backends. It receives telemetry over OTLP (or other protocols), processes it (filtering, sampling, enrichment), and exports it to one or more destinations. Running a Collector decouples your applications from backend specifics — switching from Jaeger to Tempo is a Collector config change, not a code change.
The Collector Pipeline
The Collector processes telemetry through a pipeline of receivers, processors, and exporters. Each pipeline handles one signal type (traces, metrics, or logs), and you can define multiple pipelines per signal.
```yaml
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    send_batch_size: 8192
    timeout: 200ms
  memory_limiter:
    check_interval: 1s
    limit_mib: 1024
    spike_limit_mib: 256
  attributes:
    actions:
      - key: environment
        value: production
        action: upsert
  resource:
    attributes:
      - key: deployment.environment
        value: production
        action: upsert

exporters:
  otlp/tempo:
    endpoint: tempo:4317
    tls:
      insecure: true
  prometheusremotewrite:
    endpoint: http://mimir:9009/api/v1/push
    resource_to_telemetry_conversion:
      enabled: true
  otlp/loki:
    endpoint: loki:3100
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch, attributes]
      exporters: [otlp/tempo]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [prometheusremotewrite]
    logs:
      receivers: [otlp]
      processors: [memory_limiter, batch, resource]
      exporters: [otlp/loki]
```

Note
Place the memory_limiter first in the processor chain. If the Collector runs out of memory, it crashes and you lose all in-flight telemetry. The memory limiter applies backpressure before that happens, refusing new data and giving exporters time to flush.
Agent vs. Gateway Deployment
In the agent pattern, a Collector runs as a sidecar or DaemonSet alongside each application. It handles local buffering and forwards to a central gateway. In the gateway pattern, a Collector cluster receives telemetry from all services and handles processing centrally. Most production deployments use both: agents for resilience and local enrichment, gateways for tail-based sampling and cross-service processing.
Distributed Tracing Done Right
A trace is a tree of spans that represents a single request flowing through your system. Each span records an operation — an HTTP handler, a database query, a message publish. The power of tracing is seeing the full chain: which services a request touched, how long each step took, and where errors occurred.
Auto-Instrumentation
OTel provides auto-instrumentation libraries for most popular frameworks and clients. These hook into HTTP servers, database drivers, gRPC clients, and message brokers to create spans automatically — no code changes required. Start with auto-instrumentation to get baseline visibility, then add custom spans for business-critical paths.
```shell
# Python — auto-instrumentation with zero code changes
# Install the packages
pip install opentelemetry-distro opentelemetry-exporter-otlp
opentelemetry-bootstrap -a install

# Run your app with auto-instrumentation
opentelemetry-instrument \
  --service_name order-service \
  --traces_exporter otlp \
  --metrics_exporter otlp \
  --logs_exporter otlp \
  --exporter_otlp_endpoint http://otel-collector:4317 \
  python app.py

# This automatically instruments:
# - Flask/Django/FastAPI HTTP handlers
# - requests/httpx/aiohttp outbound calls
# - psycopg2/asyncpg/SQLAlchemy queries
# - redis, celery, kafka-python, and 40+ more libraries
```
Custom Spans for Business Logic
Auto-instrumentation captures infrastructure operations. Custom spans capture what your business cares about: order validation, payment processing, inventory reservation. These are the spans that show up in incident reviews and SLO dashboards.
```typescript
// Node.js — custom spans with semantic attributes
import { trace, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('order-service', '1.0.0');

async function processOrder(order: Order): Promise<OrderResult> {
  return tracer.startActiveSpan('order.process', async (span) => {
    try {
      // Add business context as span attributes
      span.setAttribute('order.id', order.id);
      span.setAttribute('order.total_cents', order.totalCents);
      span.setAttribute('order.item_count', order.items.length);
      span.setAttribute('customer.id', order.customerId);
      span.setAttribute('customer.tier', order.customerTier);

      // Validate — creates a child span automatically
      const validation = await validateOrder(order);
      if (!validation.valid) {
        span.setStatus({ code: SpanStatusCode.ERROR, message: validation.reason });
        span.setAttribute('order.rejection_reason', validation.reason);
        return { status: 'rejected', reason: validation.reason };
      }

      // Reserve inventory — another child span
      const reservation = await tracer.startActiveSpan('inventory.reserve', async (invSpan) => {
        invSpan.setAttribute('warehouse.id', order.warehouseId);
        const result = await inventoryClient.reserve(order.items);
        invSpan.setAttribute('inventory.reserved_count', result.reservedCount);
        invSpan.end();
        return result;
      });

      // Process payment
      const payment = await tracer.startActiveSpan('payment.charge', async (paySpan) => {
        paySpan.setAttribute('payment.method', order.paymentMethod);
        paySpan.setAttribute('payment.currency', order.currency);
        const result = await paymentClient.charge(order);
        paySpan.setAttribute('payment.transaction_id', result.transactionId);
        paySpan.end();
        return result;
      });

      span.setAttribute('order.status', 'confirmed');
      span.addEvent('order.confirmed', { 'payment.transaction_id': payment.transactionId });
      return { status: 'confirmed', transactionId: payment.transactionId };
    } catch (error) {
      span.recordException(error as Error);
      span.setStatus({ code: SpanStatusCode.ERROR, message: (error as Error).message });
      throw error;
    } finally {
      span.end();
    }
  });
}
```
Note
Use OTel semantic convention names for attributes wherever they exist: http.request.method, db.system, rpc.service. Backends build dashboards and alerts around these conventions. Custom attribute names lose that integration.
Metrics — Beyond Prometheus Scraping
OTel metrics support both push and pull models. You can export directly via OTLP (push) or expose a Prometheus-compatible scrape endpoint (pull). The OTel metrics API provides three core instruments that cover virtually all use cases:
| Instrument | Example | Aggregation | When to Use |
|---|---|---|---|
| Counter | requests_total | Sum (monotonic) | Events that only go up |
| Histogram | request_duration_ms | Distribution | Latency, sizes, distributions |
| UpDownCounter | active_connections | Sum (non-monotonic) | Gauges that go up and down |
```go
// Go — OTel metrics with OTLP export
package main

import (
	"context"
	"time"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc"
	"go.opentelemetry.io/otel/metric"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

func initMetrics(ctx context.Context) (*sdkmetric.MeterProvider, error) {
	exporter, err := otlpmetricgrpc.New(ctx,
		otlpmetricgrpc.WithEndpoint("otel-collector:4317"),
		otlpmetricgrpc.WithInsecure(),
	)
	if err != nil {
		return nil, err
	}
	provider := sdkmetric.NewMeterProvider(
		sdkmetric.WithReader(
			sdkmetric.NewPeriodicReader(exporter, sdkmetric.WithInterval(15*time.Second)),
		),
	)
	otel.SetMeterProvider(provider)
	return provider, nil
}

// Application metrics
var (
	meter           = otel.Meter("order-service")
	ordersProcessed metric.Int64Counter
	orderLatency    metric.Float64Histogram
	activeOrders    metric.Int64UpDownCounter
)

func init() {
	var err error
	ordersProcessed, err = meter.Int64Counter("orders.processed",
		metric.WithDescription("Total number of orders processed"),
		metric.WithUnit("{order}"),
	)
	if err != nil {
		panic(err)
	}
	orderLatency, err = meter.Float64Histogram("orders.processing_duration",
		metric.WithDescription("Order processing latency in milliseconds"),
		metric.WithUnit("ms"),
		metric.WithExplicitBucketBoundaries(5, 10, 25, 50, 100, 250, 500, 1000, 2500),
	)
	if err != nil {
		panic(err)
	}
	activeOrders, err = meter.Int64UpDownCounter("orders.active",
		metric.WithDescription("Number of orders currently being processed"),
		metric.WithUnit("{order}"),
	)
	if err != nil {
		panic(err)
	}
}

func handleOrder(ctx context.Context, order Order) error {
	start := time.Now()
	activeOrders.Add(ctx, 1)
	defer func() {
		activeOrders.Add(ctx, -1)
		orderLatency.Record(ctx, float64(time.Since(start).Milliseconds()))
	}()
	err := processOrder(ctx, order)
	ordersProcessed.Add(ctx, 1,
		metric.WithAttributes(
			attribute.String("status", statusFromErr(err)),
			attribute.String("payment_method", order.PaymentMethod),
		),
	)
	return err
}
```
Structured Logs with Trace Correlation
OTel's log signal bridges your existing logging with traces and metrics. Instead of replacing your logger, OTel's log bridge API injects trace context (trace ID, span ID) into every log record. This means you can click from a trace to the exact log lines that were emitted during that span — the correlation that makes incidents solvable in minutes instead of hours.
```python
# Python — structured logging with automatic trace correlation
import logging

import structlog
from opentelemetry.instrumentation.logging import LoggingInstrumentor

# Enable OTel log correlation — injects trace_id and span_id into log records
LoggingInstrumentor().instrument(set_logging_format=True)

# Configure structlog with OTel context
structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        # trace_id and span_id are injected automatically by OTel
        structlog.processors.JSONRenderer(),
    ],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
)

logger = structlog.get_logger()

async def process_payment(order_id: str, amount_cents: int):
    # These logs automatically include trace_id and span_id
    logger.info("payment.started", order_id=order_id, amount_cents=amount_cents)
    try:
        result = await payment_gateway.charge(amount_cents)
        logger.info(
            "payment.completed",
            order_id=order_id,
            transaction_id=result.transaction_id,
            latency_ms=result.latency_ms,
        )
        return result
    except PaymentDeclined as e:
        logger.warning(
            "payment.declined",
            order_id=order_id,
            reason=e.reason,
            decline_code=e.code,
        )
        raise

# Output (every line includes trace context):
# {"event": "payment.started", "order_id": "ord-789",
#  "amount_cents": 4999, "level": "info",
#  "timestamp": "2026-04-15T10:23:45.123Z",
#  "otelTraceID": "a1b2c3d4e5f6...", "otelSpanID": "f6e5d4c3..."}
```
The Correlation Pattern in Practice
With trace-correlated logs, your incident workflow becomes: (1) alert fires on a metric anomaly — say, p99 latency exceeding the SLO. (2) You query for slow traces within that time window in Tempo or Jaeger. (3) You find a trace showing a 3-second database query. (4) You click through to the logs for that span and see the exact SQL query and the lock_wait event that caused the delay. Three signals, one workflow, no tab-switching.
Sampling Strategies That Don't Lose Signal
At scale, collecting 100% of traces is neither affordable nor useful. A service handling 10,000 requests per second generates millions of spans per hour. Sampling reduces volume while preserving the traces that matter. OTel supports three sampling approaches:
Head-Based Sampling
The decision is made at the start of the trace (the “head”). A TraceIdRatioBased sampler keeps a fixed percentage — say 10%. It's simple and efficient, but it can't know whether a trace will be interesting until it's complete. You might discard the one trace that shows a rare failure.
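To make the mechanics concrete, here is a simplified, hypothetical version of a ratio sampler in plain Python. The real TraceIdRatioBased sampler differs in detail, but the core idea is the same: derive a deterministic decision from the trace ID, so every SDK that sees the same trace agrees.

```python
def should_sample(trace_id_hex: str, ratio: float) -> bool:
    """Map the trace ID onto [0, 2^64) and compare against the ratio.
    Deterministic: the same trace ID always yields the same decision."""
    # Use the low 8 bytes of the 16-byte trace ID as an integer
    value = int(trace_id_hex[-16:], 16)
    return value < ratio * 2**64

low_id  = "0" * 32   # maps near the bottom of the range → kept at 10%
high_id = "f" * 32   # maps near the top of the range → dropped at 10%
```

Because the decision is a pure function of the trace ID, downstream services can honor the head's choice (via the sampled flag in traceparent) without re-rolling the dice.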
Tail-Based Sampling
The decision is made after the trace is complete (the “tail”). The Collector buffers complete traces and applies policies: keep all errors, keep traces slower than 2 seconds, keep 5% of everything else. This captures the interesting traces while dropping routine ones. The trade-off is memory: the Collector must buffer all traces until the decision point.
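A minimal model of those policies, with hypothetical field names, looks like this — not the Collector's actual code, just the decision logic it applies after buffering:

```python
import random

def tail_sample(trace, baseline_rate=0.05, slow_ms=2000, rng=random.random):
    """Decide after the whole trace is buffered. `trace` is a list of
    span dicts with 'status' and 'duration_ms' keys (illustrative)."""
    if any(span["status"] == "ERROR" for span in trace):
        return True                      # always keep errors
    if any(span["duration_ms"] > slow_ms for span in trace):
        return True                      # always keep slow traces
    return rng() < baseline_rate         # probabilistic baseline

error_trace = [{"status": "ERROR", "duration_ms": 12}]
slow_trace  = [{"status": "OK", "duration_ms": 3100}]
fast_trace  = [{"status": "OK", "duration_ms": 8}]
```

The `rng` parameter is injectable here only to make the baseline policy deterministic in tests; the point is that error and latency policies always win before the probabilistic fallback runs.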
Hybrid Sampling
Combine both: head-based sampling at 100% (collect everything locally), tail-based sampling at the gateway Collector. This gives you complete local traces for debugging while the gateway decides what reaches long-term storage. The agent-gateway Collector topology makes this natural.
```yaml
# Tail-based sampling in the Collector
processors:
  tail_sampling:
    decision_wait: 10s        # wait for trace completion
    num_traces: 100000        # max traces in memory
    expected_new_traces_per_sec: 1000
    policies:
      # Always keep error traces
      - name: errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      # Always keep slow traces (> 2s)
      - name: slow-traces
        type: latency
        latency:
          threshold_ms: 2000
      # Keep all traces from critical services
      - name: critical-services
        type: string_attribute
        string_attribute:
          key: service.name
          values: [payment-service, auth-service]
      # Sample 5% of everything else
      - name: baseline
        type: probabilistic
        probabilistic:
          sampling_percentage: 5

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, tail_sampling, batch]
      exporters: [otlp/tempo]
```

Note
Tail-based sampling requires every span of a trace to reach the same Collector instance. When running multiple gateway Collectors, use the loadbalancing exporter in your agent Collectors to route spans by trace ID to the appropriate gateway instance. Without this, you get partial traces and incorrect sampling decisions.
Context Propagation Across Boundaries
Distributed tracing only works if trace context propagates across service boundaries. OTel uses W3C Trace Context headers by default: traceparent carries the trace ID, span ID, and sampling decision. Every HTTP client and server in your stack must propagate these headers.
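To show what propagators do under the hood, here is a stdlib-only sketch of building and parsing a traceparent header. The function names are illustrative, not an OTel API; the header layout (version, 16-byte trace ID, 8-byte span ID, flags) follows the W3C format.

```python
def inject_traceparent(trace_id: str, span_id: str, sampled: bool) -> str:
    """Build a traceparent value: version 00, hex IDs, trace flags."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"

def extract_traceparent(header: str):
    """Parse a traceparent value back into its fields."""
    version, trace_id, span_id, flags = header.split("-")
    if version != "00" or len(trace_id) != 32 or len(span_id) != 16:
        raise ValueError(f"malformed traceparent: {header}")
    return {"trace_id": trace_id, "span_id": span_id,
            "sampled": flags == "01"}

header = inject_traceparent(
    "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4", "f6e5d4c3b2a1f6e5", sampled=True)
ctx = extract_traceparent(header)
```

In practice you never write this by hand; auto-instrumentation injects and extracts the header on every HTTP call, as the Java example below the fold illustrates.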
```java
// Java — context propagation with Spring Boot
// Auto-instrumentation handles this automatically, but here's what happens:
//
// Incoming request: extract context from headers
//   traceparent: 00-a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4-f6e5d4c3b2a1f6e5-01
// The span created for this request becomes a child of the remote span.
// When making outbound calls, context is injected automatically:

@Service
public class OrderService {
    private final RestClient restClient;

    // OTel auto-instrumentation wraps RestClient to inject traceparent
    public PaymentResult processPayment(Order order) {
        return restClient.post()
            .uri("http://payment-service/api/v1/charge")
            // traceparent header is injected automatically
            .body(new ChargeRequest(order.getTotalCents(), order.getCurrency()))
            .retrieve()
            .body(PaymentResult.class);
    }
}

// For message queues, context goes into message headers:
@Component
public class OrderEventPublisher {
    private final KafkaTemplate<String, OrderEvent> kafkaTemplate;

    public void publishOrderPlaced(OrderPlaced event) {
        // OTel Kafka instrumentation injects traceparent into Kafka headers
        kafkaTemplate.send("orders.placed", event.getOrderId(), event);
        // Consumer on the other end extracts it, creating a linked span
    }
}
```

For services using different propagation formats (B3 from Zipkin, X-Ray from AWS), configure the OTel SDK with composite propagators. The SDK can extract from multiple formats and inject in the standard W3C format, bridging legacy and modern instrumentation.
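The composite idea reduces to "try each format in order, first match wins". The sketch below is a simplified, hypothetical illustration in plain Python; real SDK propagators handle more variants and stricter validation.

```python
def extract_w3c(headers):
    """W3C Trace Context: traceparent = version-traceid-spanid-flags."""
    tp = headers.get("traceparent")
    if not tp:
        return None
    _, trace_id, span_id, _flags = tp.split("-")
    return {"trace_id": trace_id, "span_id": span_id}

def extract_b3(headers):
    """B3 single header: b3 = traceid-spanid-flags."""
    b3 = headers.get("b3")
    if not b3:
        return None
    parts = b3.split("-")
    return {"trace_id": parts[0], "span_id": parts[1]}

def composite_extract(headers, extractors=(extract_w3c, extract_b3)):
    # Try each propagator in order; the first one that finds context wins.
    for extract in extractors:
        ctx = extract(headers)
        if ctx:
            return ctx
    return None

modern = {"traceparent": "00-a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4-f6e5d4c3b2a1f6e5-01"}
legacy = {"b3": "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4-f6e5d4c3b2a1f6e5-1"}
```

Injection works the other way around: extract from whatever arrives, but emit the standard W3C header on outbound calls so downstream services converge on one format.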
Resource Attributes — Know Where Telemetry Comes From
Every span, metric, and log record carries resource attributes that identify its source. At minimum, set service.name, service.version, and deployment.environment. These attributes power service maps, version comparisons, and environment filtering in every observability backend.
```shell
# Environment variables — works with any language SDK
export OTEL_SERVICE_NAME=order-service
export OTEL_RESOURCE_ATTRIBUTES="service.version=1.4.2,deployment.environment=production,service.namespace=ecommerce"
export OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_TRACES_SAMPLER=parentbased_traceidratio
export OTEL_TRACES_SAMPLER_ARG=0.1
```

```yaml
# Kubernetes — set resource attributes from pod metadata
env:
  - name: OTEL_SERVICE_NAME
    value: order-service
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: >-
      service.version=$(IMAGE_TAG),
      deployment.environment=production,
      k8s.namespace.name=$(K8S_NAMESPACE),
      k8s.pod.name=$(K8S_POD_NAME),
      k8s.node.name=$(K8S_NODE_NAME)
  - name: K8S_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: K8S_POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
```
Common Pitfalls and How to Avoid Them
- Over-instrumenting hot paths. Adding a custom span to every loop iteration or every cache lookup creates noise and overhead. Instrument at the operation level (HTTP handler, queue consumer, batch job), not the line-of-code level.
- High-cardinality metric attributes. Using user IDs, request IDs, or URLs as metric labels explodes your time series count and kills your metrics backend. Stick to bounded values: HTTP methods, status code classes, service names, endpoint groups.
- Skipping the Collector. Exporting directly from applications to backends works in development. In production, it couples your code to specific vendors, loses the ability to transform telemetry centrally, and makes backend migration painful.
- Ignoring context propagation gaps. One service that doesn't propagate traceparent breaks the trace chain for every downstream service. Audit propagation across your entire call graph, including queues and async workers.
- Not setting resource attributes. Without service.name, every span shows up as “unknown_service” in your trace viewer. This is the most common setup mistake and the easiest to prevent.
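The high-cardinality pitfall has a mechanical fix: normalize raw values into bounded sets before attaching them as metric attributes. The helpers below are hypothetical illustrations of that pattern, not part of any OTel API.

```python
import re

def status_class(code: int) -> str:
    """Collapse ~60 HTTP status codes into ~5 classes: 503 -> "5xx"."""
    return f"{code // 100}xx"

def route_template(path: str) -> str:
    """Replace numeric and UUID-like path segments with placeholders so
    /orders/12345 and /orders/67890 count toward one time series."""
    path = re.sub(r"/\d+", "/{id}", path)
    return re.sub(r"/[0-9a-f]{8}-[0-9a-f-]{27}", "/{uuid}", path)
```

Apply normalizers like these at the instrumentation site, before the attribute ever reaches the SDK; the Collector's attributes processor can also rewrite values centrally if a service slips through.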
A Practical Adoption Strategy
Adopting OTel across an organization is a journey, not a switch flip. The most successful rollouts follow a phased approach:
- Phase 1 — Collector and auto-instrumentation. Deploy the Collector, enable auto-instrumentation on 2-3 services, and export to your existing backends. Prove the pipeline works without changing application code.
- Phase 2 — Custom spans and metrics. Add business-relevant spans and custom metrics to your most critical services. Build dashboards and alerts around these signals. This is where OTel starts paying back the investment.
- Phase 3 — Log correlation. Bridge your existing logging to OTel, inject trace context, and configure your log backend to link logs to traces. This closes the three-pillar loop.
- Phase 4 — Tail-based sampling and optimization. Once all services emit telemetry, deploy gateway Collectors with tail-based sampling. Optimize costs by filtering noise and keeping only the signals that matter.
One Standard, All Signals
OpenTelemetry is the first instrumentation standard that credibly unifies traces, metrics, and logs under one API, one SDK, and one collection pipeline. The value proposition is straightforward:
- Instrument once, export to any backend — no vendor lock-in
- Correlate traces with logs and metrics for faster incident resolution
- Auto-instrumentation gives you baseline visibility with zero code changes
- The Collector decouples your applications from backend specifics
- Tail-based sampling captures interesting traces while controlling costs
The ecosystem is mature: the specification is stable for traces and metrics, logs are GA as of late 2024, and every major observability vendor supports OTLP natively. If you're starting a new service or re-evaluating your observability stack, OTel is the foundation to build on.
Building a unified observability platform or migrating from fragmented monitoring?
We help teams implement OpenTelemetry end-to-end — from Collector pipelines and auto-instrumentation to tail-based sampling and SLO-driven alerting. Let’s talk.