Service Mesh with Istio — Traffic Management, mTLS, and Observability at Scale

Why Service Meshes Exist

When you move from a monolith to microservices, you immediately inherit a new class of problems: service-to-service encryption, mutual authentication, retries, timeouts, circuit breaking, and distributed tracing — all of which used to be handled inside a single process. The naive solution is to bake these concerns into every service library. The result is a maintenance nightmare: different teams implement retry logic differently, certificates expire silently, and adding a new observability requirement means re-deploying every service.

A service mesh moves these concerns into the infrastructure layer. Every pod gets a sidecar proxy (Envoy) injected at admission time. All inbound and outbound traffic flows through that proxy, which is centrally configured by a control plane. Your application code becomes simpler — it just makes HTTP or gRPC calls to localhost and the proxy handles the rest.

Istio is one of the most widely deployed service meshes. Its Ambient mode (beta in Istio 1.22, generally available from 1.24) removes the sidecar requirement by moving L4 handling to per-node ztunnel proxies. This article covers the battle-tested sidecar model (still the default) while noting where Ambient changes the picture.

Note

Istio's sidecar injection adds roughly 2–3ms of latency per hop and around 0.5 vCPU per 1000 RPS per proxy. For most workloads this overhead is negligible, but factor it into capacity planning for latency-critical services. Ambient Mesh reduces this significantly because the ztunnel operates at the node level rather than per-pod.
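To make the capacity-planning point concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (about 0.5 vCPU per 1000 RPS and 2–3 ms per hop). The function names and the example workload are hypothetical, and real overhead varies with payload size, telemetry settings, and filter configuration.

```python
def sidecar_cpu_vcpu(rps: float, vcpu_per_1000_rps: float = 0.5) -> float:
    """Estimated proxy CPU for one sidecar handling `rps` requests per second."""
    return rps / 1000.0 * vcpu_per_1000_rps


def added_latency_ms(hops: int, per_hop_ms: float = 2.5) -> float:
    """Estimated extra latency for a request that crosses `hops` meshed hops."""
    return hops * per_hop_ms


if __name__ == "__main__":
    # Hypothetical workload: 4000 RPS per pod, request path crosses 3 hops.
    print(sidecar_cpu_vcpu(4000))  # 2.0 vCPU of proxy overhead
    print(added_latency_ms(3))     # 7.5 ms of added latency
```

Treat the result as a starting point for load testing rather than a guarantee.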

Istio Architecture — Control Plane and Data Plane

Istio splits into two layers: the data plane (the Envoy sidecar proxies running next to every workload) and the control plane (istiod, the single binary that combines Pilot, Citadel, and Galley from older Istio versions).

istiod — Control Plane

istiod translates Istio CRDs (VirtualService, DestinationRule, Gateway, etc.) into Envoy xDS configuration and pushes it to all sidecar proxies via gRPC. It also acts as a certificate authority — it issues short-lived SPIFFE/X.509 certificates to every workload identity, enabling mTLS without manual cert management.

Envoy Sidecar — Data Plane

Each pod gets an Envoy proxy injected as a sidecar container. iptables rules, set up by an init container or by the Istio CNI plugin, redirect all TCP traffic through the proxy. Envoy terminates TLS, enforces routing rules, applies retry/timeout/circuit-breaker policies, and emits telemetry (access logs, metrics, traces).

Ingress Gateway

A dedicated Envoy deployment that handles north-south traffic (external clients → cluster). Configured via Gateway and VirtualService CRDs. Replaces standard Kubernetes Ingress for meshes — it gives you full Envoy feature parity at the cluster edge, including TLS termination, HTTP/2, WebSocket, and gRPC.

Installing Istio with istioctl

The recommended installation method for production is istioctl with an IstioOperator manifest stored in git. Avoid the quick istioctl install --set profile=demo in production: the demo profile is designed to showcase features with modest resource requirements and turns on verbose tracing and access logging.

# Download and install istioctl (Linux/macOS)
curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.22.1 sh -
export PATH="$PWD/istio-1.22.1/bin:$PATH"

# Verify pre-requisites on the target cluster
istioctl x precheck

# Install with a production-grade IstioOperator manifest
istioctl install -f istio-operator.yaml --verify

# istio-operator.yaml — production profile with resource limits
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: production
  namespace: istio-system
spec:
  profile: default
  components:
    pilot:
      k8s:
        resources:
          requests:
            cpu: 500m
            memory: 2Gi
          limits:
            cpu: "2"
            memory: 4Gi
        hpaSpec:
          minReplicas: 2
          maxReplicas: 5
    ingressGateways:
      - name: istio-ingressgateway
        enabled: true
        k8s:
          resources:
            requests:
              cpu: 200m
              memory: 256Mi
            limits:
              cpu: "1"
              memory: 512Mi
          hpaSpec:
            minReplicas: 2
            maxReplicas: 10
          service:
            type: LoadBalancer
  meshConfig:
    accessLogFile: /dev/stdout
    accessLogEncoding: JSON
    enableTracing: true
    defaultConfig:
      tracing:
        sampling: 10.0        # 10% sampling — adjust per traffic volume
        zipkin:
          address: jaeger-collector.observability:9411
  values:
    global:
      proxy:
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
      tracer:
        zipkin:
          address: jaeger-collector.observability:9411

Note

Enable sidecar injection per namespace rather than globally. Add the label istio-injection: enabled to namespaces that need mesh coverage. Exclude system namespaces (kube-system, kube-public) and any namespace where injection would break things (e.g. node-local-dns, GPU workloads with host network).

# Enable sidecar injection for a namespace
kubectl label namespace my-app istio-injection=enabled

# Verify injection is working — look for 2/2 READY in pods
kubectl get pods -n my-app

# Existing pods are only injected when they are recreated, so trigger a rolling restart
kubectl rollout restart deployment -n my-app

Traffic Management — VirtualService and DestinationRule

Traffic management in Istio is configured through two complementary CRDs: VirtualService (how to route requests — header matching, weights, retries, timeouts) and DestinationRule (what to do after routing — load balancing policy, connection pools, outlier detection, TLS settings). Think of VirtualService as the L7 routing table and DestinationRule as the endpoint policy.

Canary Release — Weight-Based Routing

The classic use case: route 10% of traffic to a new version and verify before shifting 100%.

# destination-rule-reviews.yaml
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews
  namespace: my-app
spec:
  host: reviews
  trafficPolicy:
    loadBalancer:
      simple: LEAST_REQUEST   # LEAST_CONN is the deprecated name for this policy
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
---
# virtual-service-reviews.yaml — canary: 90% v1, 10% v2
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews
  namespace: my-app
spec:
  hosts:
    - reviews
  http:
    - route:
        - destination:
            host: reviews
            subset: v1
          weight: 90
        - destination:
            host: reviews
            subset: v2
          weight: 10
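A canary is usually shifted in several steps rather than in one jump. The sketch below is a hypothetical helper, not part of Istio, that builds the route list of the VirtualService above for a given canary weight so the two weights always add up to 100. The rollout schedule is an assumption chosen for illustration.

```python
def canary_routes(host: str, stable: str, canary: str, canary_weight: int) -> list[dict]:
    """Build the `route` list of a VirtualService for a canary weight in 0..100."""
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")
    return [
        {"destination": {"host": host, "subset": stable}, "weight": 100 - canary_weight},
        {"destination": {"host": host, "subset": canary}, "weight": canary_weight},
    ]


# Hypothetical rollout schedule; pick steps that suit your traffic volume.
ROLLOUT_STEPS = [10, 25, 50, 100]

if __name__ == "__main__":
    for step in ROLLOUT_STEPS:
        routes = canary_routes("reviews", "v1", "v2", step)
        print(step, [r["weight"] for r in routes])
```

Each step would be rendered into the VirtualService and applied only after the error rate and latency of v2 look healthy at the previous step.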

Header-Based Routing for Dark Launches

Route a specific team or beta user cohort to a new version using a request header — without changing weights for everyone else.

# virtual-service-reviews-dark.yaml
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: reviews
  namespace: my-app
spec:
  hosts:
    - reviews
  http:
    # Requests with X-Beta-User: true go to v2
    - match:
        - headers:
            x-beta-user:
              exact: "true"
      route:
        - destination:
            host: reviews
            subset: v2
    # All other traffic stays on v1
    - route:
        - destination:
            host: reviews
            subset: v1
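The order of the http rules matters: the first rule whose match conditions are satisfied wins, and a rule without a match block acts as the catch-all. The following Python sketch mimics that evaluation for the two rules above. It is a simplified model for reasoning about rule order, not how Envoy is implemented, and pick_subset is a hypothetical name.

```python
def pick_subset(headers: dict[str, str]) -> str:
    """Mimic the rule order above: the first matching http rule wins."""
    rules = [
        # (required exact-match headers, destination subset)
        ({"x-beta-user": "true"}, "v2"),
        ({}, "v1"),  # catch-all: no match conditions
    ]
    # HTTP header names are case-insensitive, so normalise them before comparing.
    normalised = {name.lower(): value for name, value in headers.items()}
    for required, subset in rules:
        if all(normalised.get(name) == value for name, value in required.items()):
            return subset
    raise LookupError("no route matched")


if __name__ == "__main__":
    print(pick_subset({"X-Beta-User": "true"}))  # v2
    print(pick_subset({}))                       # v1
```

If the catch-all rule were listed first, the beta rule would never be reached, which is a common cause of header routing that appears to do nothing.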

Retries, Timeouts, and Fault Injection

# virtual-service-productpage.yaml — retries + timeout + fault injection
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: productpage
  namespace: my-app
spec:
  hosts:
    - productpage
  http:
    - route:
        - destination:
            host: productpage
            subset: v1
      timeout: 5s
      retries:
        attempts: 3
        perTryTimeout: 2s
        retryOn: 5xx,gateway-error,connect-failure,retriable-4xx
      # Fault injection: commented out here, uncomment only for resilience tests in staging.
      # Istio does not apply the timeout or retries on a route while a fault is configured on it.
      # fault:
      #   delay:
      #     percentage:
      #       value: 10
      #     fixedDelay: 1s
      #   abort:
      #     percentage:
      #       value: 5
      #     httpStatus: 503

Note

retryOn: retriable-4xx currently covers only HTTP 409 (Conflict), which is safe to retry for idempotent reads. Do not add retriable-4xx to write endpoints: retrying a failed payment or order creation on a 409 can cause duplicate transactions if your upstream is not idempotent.
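It is worth checking how timeout, attempts, and perTryTimeout interact in the productpage example. The sketch below does the arithmetic under two assumptions: attempts counts retries (so the total number of tries is attempts + 1), and the short back-off between tries is ignored. The function name is hypothetical.

```python
def retry_budget(timeout_s: float, attempts: int, per_try_timeout_s: float) -> dict:
    """Worst-case timing for a route's retry policy (back-off between tries ignored)."""
    total_tries = attempts + 1  # assumption: the first try plus `attempts` retries
    worst_case_s = total_tries * per_try_timeout_s
    # Number of full-length tries that fit before the overall timeout fires.
    tries_that_fit = int(timeout_s // per_try_timeout_s)
    return {
        "total_tries": total_tries,
        "worst_case_s": worst_case_s,
        "capped_by_timeout": worst_case_s > timeout_s,
        "full_tries_within_timeout": min(total_tries, tries_that_fit),
    }


if __name__ == "__main__":
    # Values from the productpage VirtualService: timeout 5s, 3 attempts, 2s per try.
    print(retry_budget(timeout_s=5, attempts=3, per_try_timeout_s=2))
```

With the values above, only two full-length tries fit inside the 5s route timeout, so the remaining retries are only useful when the earlier tries fail fast.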

Ingress Gateway — TLS Termination at the Edge

The Istio Ingress Gateway is the recommended entry point for external HTTPS traffic. It terminates TLS using a Secret you manage (or cert-manager automates), then routes to backend services using a VirtualService bound to the Gateway.

# gateway.yaml — TLS termination for api.example.com
apiVersion: networking.istio.io/v1
kind: Gateway
metadata:
  name: api-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 443
        name: https
        protocol: HTTPS
      tls:
        mode: SIMPLE
        credentialName: api-tls-cert   # kubectl create secret tls api-tls-cert ...
      hosts:
        - api.example.com
    - port:
        number: 80
        name: http
        protocol: HTTP
      tls:
        httpsRedirect: true            # redirect HTTP → HTTPS
      hosts:
        - api.example.com
---
# virtual-service-api.yaml — bind to Gateway and route to backend
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: api
  namespace: my-app
spec:
  hosts:
    - api.example.com
  gateways:
    - istio-system/api-gateway
  http:
    - match:
        - uri:
            prefix: /v1/
      route:
        - destination:
            host: api-service
            port:
              number: 8080

# cert-manager Certificate for automatic TLS renewal
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-tls-cert
  namespace: istio-system
spec:
  secretName: api-tls-cert
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - api.example.com

mTLS — Zero-Trust Service-to-Service Encryption

Mutual TLS (mTLS) means both sides of a connection present certificates — the client proves its identity, not just the server. In Istio, this happens automatically between injected pods using SPIFFE-compliant X.509 certificates issued by istiod. Each workload gets a certificate with a SPIFFE URI like spiffe://cluster.local/ns/my-app/sa/reviews.

Istio defaults to PERMISSIVE mode — accepting both plain text and mTLS traffic. This lets you migrate incrementally. Flip to STRICT once all workloads in a namespace are injected.

# peer-authentication-strict.yaml — enforce mTLS for the entire namespace
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: my-app
spec:
  mtls:
    mode: STRICT
---
# peer-authentication-mesh-wide.yaml — enforce mTLS mesh-wide (root namespace)
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system   # applies to the entire mesh
spec:
  mtls:
    mode: STRICT

# authorization-policy.yaml — deny traffic not from a specific service account
# Fine-grained RBAC layered on top of mTLS identity
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: reviews-allow-productpage
  namespace: my-app
spec:
  selector:
    matchLabels:
      app: reviews
  action: ALLOW
  rules:
    - from:
        - source:
            principals:
              - cluster.local/ns/my-app/sa/productpage  # only productpage SA can call reviews
      to:
        - operation:
            methods: ["GET"]
            paths: ["/reviews/*"]
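The principal in the policy above is the workload's SPIFFE identity written without the spiffe:// scheme. A small Python sketch of that relationship follows. The helper names are hypothetical, and cluster.local is assumed as the trust domain because it is the one used in the examples here.

```python
def spiffe_id(namespace: str, service_account: str, trust_domain: str = "cluster.local") -> str:
    """SPIFFE URI carried in the workload certificate."""
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{service_account}"


def authz_principal(namespace: str, service_account: str, trust_domain: str = "cluster.local") -> str:
    """Value for AuthorizationPolicy `principals`: the SPIFFE ID without its scheme."""
    return spiffe_id(namespace, service_account, trust_domain).removeprefix("spiffe://")


def parse_principal(principal: str) -> dict:
    """Split '<trust-domain>/ns/<namespace>/sa/<service-account>' into its parts."""
    trust_domain, ns_marker, namespace, sa_marker, service_account = principal.split("/")
    if (ns_marker, sa_marker) != ("ns", "sa"):
        raise ValueError(f"not a workload principal: {principal}")
    return {
        "trust_domain": trust_domain,
        "namespace": namespace,
        "service_account": service_account,
    }


if __name__ == "__main__":
    print(spiffe_id("my-app", "productpage"))
    print(authz_principal("my-app", "productpage"))
```

Generating principals from the namespace and service account, rather than typing them by hand, avoids the typos that otherwise show up as unexplained 403 responses.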

Note

When you enable STRICT mTLS, any client without a sidecar (legacy VMs, external services, and kubelet health checks if probe rewriting has been disabled) will be rejected. Exclude them with a namespace-scoped or workload-scoped PeerAuthentication override set to PERMISSIVE before enforcing STRICT mesh-wide. The istioctl analyze command flags mTLS policy conflicts before you apply them.

Observability — Kiali, Prometheus, and Jaeger

Every Envoy sidecar emits Prometheus metrics, structured access logs, and distributed trace spans out of the box — no instrumentation code required. The standard Istio observability stack is:

Kiali — Service Graph and Health Dashboard

Kiali visualises the live service topology, traffic flow rates, error rates, and latency percentiles. It can validate Istio configuration (detect mismatched subsets, missing DestinationRules, orphaned VirtualServices) and provides a UI for creating and editing traffic policies. Install with: kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.22/samples/addons/kiali.yaml

Prometheus + Grafana — Metrics

Istio exposes RED (Rate, Error, Duration) metrics for every service pair. Key metrics: istio_requests_total, istio_request_duration_milliseconds, istio_tcp_sent_bytes_total. The official Istio Grafana dashboards (install from samples/addons/) include a Mesh Dashboard, Service Dashboard, and Workload Dashboard.
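As an illustration of how these metrics turn into RED queries, the Python sketch below builds two PromQL strings: a 5xx error ratio and a p99 latency. It assumes the standard destination_service_name and response_code labels and a 5 minute window; adjust both to your own label set and scrape interval. The function names are hypothetical.

```python
def error_rate_query(service: str, window: str = "5m") -> str:
    """PromQL for the share of requests to `service` that returned a 5xx."""
    selector = f'destination_service_name="{service}"'
    errors = f'sum(rate(istio_requests_total{{{selector},response_code=~"5.."}}[{window}]))'
    total = f'sum(rate(istio_requests_total{{{selector}}}[{window}]))'
    return f"{errors} / {total}"


def p99_latency_query(service: str, window: str = "5m") -> str:
    """PromQL for the 99th percentile request duration (milliseconds) of `service`."""
    selector = f'destination_service_name="{service}"'
    buckets = f'istio_request_duration_milliseconds_bucket{{{selector}}}'
    return f"histogram_quantile(0.99, sum(rate({buckets}[{window}])) by (le))"


if __name__ == "__main__":
    print(error_rate_query("reviews"))
    print(p99_latency_query("reviews"))
```

The printed strings can be pasted into the Prometheus UI or a Grafana panel as a starting point for alerts.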

Jaeger — Distributed Tracing

Envoy generates trace spans and can originate B3 or W3C trace headers, but it cannot tell which outbound call belongs to which inbound request. Your application code therefore needs to forward the incoming trace headers on outbound calls; no tracing SDK is required for basic trace continuity. For deeper spans (DB queries, internal functions), instrument with OpenTelemetry and have it export to the same Jaeger collector.

# Install the Istio addons (development/demo clusters — use Helm for production)
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.22/samples/addons/prometheus.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.22/samples/addons/grafana.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.22/samples/addons/jaeger.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.22/samples/addons/kiali.yaml

# Open dashboards via port-forward
istioctl dashboard kiali
istioctl dashboard grafana
istioctl dashboard jaeger

# Python service — forward trace headers for Jaeger continuity
# Without this, each hop appears as a separate disconnected trace in Jaeger
from fastapi import FastAPI, Request
import httpx

app = FastAPI()

TRACE_HEADERS = [
    "x-request-id",
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
    "traceparent",    # W3C trace context
    "tracestate",
]


def extract_trace_headers(request: Request) -> dict:
    return {h: request.headers[h] for h in TRACE_HEADERS if h in request.headers}


@app.get("/reviews/{product_id}")
async def get_reviews(product_id: str, request: Request):
    headers = extract_trace_headers(request)
    async with httpx.AsyncClient() as client:
        # Forward trace headers to downstream services
        ratings_resp = await client.get(
            f"http://ratings/ratings/{product_id}",
            headers=headers,
        )
    return {"product_id": product_id, "ratings": ratings_resp.json()}

Circuit Breaking and Outlier Detection

Istio implements circuit breaking via DestinationRule outlier detection and connection pool limits. Outlier detection tracks errors per upstream host and ejects failing instances from the load balancing pool for an interval that grows each time the same host is ejected again. This prevents a single bad pod from causing cascading timeouts.

# destination-rule-circuit-breaker.yaml
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: reviews-circuit-breaker
  namespace: my-app
spec:
  host: reviews
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100          # max concurrent TCP connections
      http:
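        http2MaxRequests: 1000       # max concurrent requests (applies to HTTP/1.1 and HTTP/2)
        http1MaxPendingRequests: 100 # pending queue depth before 503
        maxRetries: 3                # max retries outstanding across all endpoints at once
    outlierDetection:
      consecutiveGatewayErrors: 3    # eject after 3 consecutive 502/503/504; keep below consecutive5xxErrors or it has no effect
      consecutive5xxErrors: 5        # or 5 consecutive 5xx errors of any kind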
      interval: 30s                  # evaluation window
      baseEjectionTime: 30s          # minimum ejection duration
      maxEjectionPercent: 50         # never eject more than 50% of endpoints
      minHealthPercent: 30           # stop ejecting if < 30% endpoints healthy

Note

Outlier detection ejects individual pod IPs, not the Kubernetes service. If you scale down a deployment, the ejected pod IP disappears naturally. The risk is at low pod counts: with 2 replicas and maxEjectionPercent: 50, one bad pod can be ejected while leaving 50% capacity. With a single replica there is no healthy endpoint left to take the traffic, so ejection cannot protect callers. Size your deployments with resilience in mind.
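A short Python sketch helps reason about ejection durations and remaining capacity. It follows Envoy's documented behaviour, where the ejection time is baseEjectionTime multiplied by the number of consecutive ejections, up to a cap. The 300 second cap is used as an assumed default and the function names are hypothetical.

```python
def ejection_seconds(base_ejection_s: float, consecutive_ejections: int,
                     max_ejection_s: float = 300.0) -> float:
    """Ejection duration: base time scaled by the ejection count, up to a cap."""
    return min(base_ejection_s * consecutive_ejections, max_ejection_s)


def capacity_after_ejection(replicas: int, ejected: int) -> float:
    """Fraction of endpoints still receiving traffic after `ejected` hosts are removed."""
    if replicas <= 0 or not 0 <= ejected <= replicas:
        raise ValueError("ejected must be between 0 and replicas")
    return (replicas - ejected) / replicas


if __name__ == "__main__":
    # baseEjectionTime: 30s from the DestinationRule above.
    print([ejection_seconds(30, n) for n in (1, 2, 3)])  # [30, 60, 90]
    print(capacity_after_ejection(2, 1))                 # 0.5
```

Running the numbers for your own replica counts shows quickly whether losing one ejected pod still leaves enough headroom.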

Fault Injection — Chaos Testing Without Code Changes

Istio's fault injection lets you inject latency and HTTP errors into traffic paths without modifying application code. This is the fastest way to verify that your retry logic, timeouts, and circuit breakers actually work in a staging environment.

# virtual-service-fault-delay.yaml — inject 3s delay for 20% of requests to ratings
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: ratings-fault-delay
  namespace: my-app
spec:
  hosts:
    - ratings
  http:
    - fault:
        delay:
          percentage:
            value: 20.0
          fixedDelay: 3s
      route:
        - destination:
            host: ratings
            subset: v1
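---
# virtual-service-fault-abort.yaml: return HTTP 503 for 10% of requests to ratings
# Apply one fault VirtualService at a time: sidecars do not merge multiple VirtualServices for the same host.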
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: ratings-fault-abort
  namespace: my-app
spec:
  hosts:
    - ratings
  http:
    - fault:
        abort:
          percentage:
            value: 10.0
          httpStatus: 503
      route:
        - destination:
            host: ratings
            subset: v1

Global Rate Limiting with Envoy RateLimit Service

Local rate limiting (per-proxy) is fast but counts independently per pod. Global rate limiting shares a counter across all instances using the Envoy RateLimit gRPC service (typically backed by Redis). Configure it via EnvoyFilter — Istio's escape hatch for raw Envoy configuration.
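The difference between local and global counting can be seen in a small in-memory simulation. This is a toy model in Python, not the Envoy RateLimit service: the class and function names are hypothetical, a dictionary stands in for Redis, and requests are spread round-robin over three proxies.

```python
from collections import defaultdict


class FixedWindowCounter:
    """Minimal in-memory fixed-window counter keyed by (descriptor, window)."""

    def __init__(self, limit: int, window_s: int = 60) -> None:
        self.limit = limit
        self.window_s = window_s
        self._counts: dict = defaultdict(int)

    def allow(self, key: str, now_s: float) -> bool:
        bucket = (key, int(now_s // self.window_s))
        if self._counts[bucket] >= self.limit:
            return False
        self._counts[bucket] += 1
        return True


def admitted(counters: list, requests: int, key: str = "api-key-1") -> int:
    """Send `requests` round-robin across proxies; each proxy consults its counter."""
    return sum(counters[i % len(counters)].allow(key, now_s=0.0) for i in range(requests))


if __name__ == "__main__":
    # Local: every proxy keeps its own counter, so the effective limit triples.
    local = [FixedWindowCounter(100) for _ in range(3)]
    print(admitted(local, 600))        # 300

    # Global: all proxies share one counter, so the limit holds cluster-wide.
    shared = FixedWindowCounter(100)
    print(admitted([shared] * 3, 600))  # 100
```

The trade-off is that the shared counter adds a network round trip to every request, which is why the global service is usually reserved for limits that must hold across the whole fleet.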

# ratelimit-config.yaml: 100 req/min per unique x-api-key header value
apiVersion: v1
kind: ConfigMap
metadata:
  name: ratelimit-config
  namespace: istio-system
data:
  config.yaml: |
    domain: productpage-ratelimit
    descriptors:
      # No value is set, so each distinct api_key value gets its own counter
      - key: api_key
        rate_limit:
          unit: minute
          requests_per_unit: 100

# envoyfilter-ratelimit.yaml: attach global rate limiting to the ingress gateway
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-ratelimit
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      istio: ingressgateway
  configPatches:
    # Patch 1: insert the rate limit HTTP filter ahead of the router
    - applyTo: HTTP_FILTER
      match:
        context: GATEWAY
        listener:
          filterChain:
            filter:
              name: envoy.filters.network.http_connection_manager
              subFilter:
                name: envoy.filters.http.router
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.filters.http.ratelimit
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
            domain: productpage-ratelimit
            failure_mode_deny: false   # allow traffic if rate-limit service is unavailable
            rate_limit_service:
              grpc_service:
                envoy_grpc:
                  cluster_name: outbound|8081||ratelimit.istio-system.svc.cluster.local
              transport_api_version: V3
            timeout: 10ms
    # Patch 2: send the x-api-key header value as the api_key descriptor.
    # Without a rate_limits action the filter has no descriptor to send to the service.
    - applyTo: VIRTUAL_HOST
      match:
        context: GATEWAY
        routeConfiguration:
          vhost:
            name: ""
            route:
              action: ANY
      patch:
        operation: MERGE
        value:
          rate_limits:
            - actions:
                - request_headers:
                    header_name: x-api-key
                    descriptor_key: api_key

Debugging Istio — Essential Commands

Most Istio production issues fall into four categories: misconfigured routing, mTLS policy conflicts, sidecar injection not happening, and Envoy proxy not receiving updated xDS config. These commands cover all four.

# Check for configuration issues in one namespace (use --all-namespaces for the whole mesh)
istioctl analyze -n my-app

# Show xDS sync status for every proxy connected to istiod
istioctl proxy-status

# Inspect the Envoy config that istiod pushed to a specific pod
istioctl proxy-config cluster deploy/productpage -n my-app
istioctl proxy-config route deploy/productpage -n my-app
istioctl proxy-config listener deploy/productpage -n my-app

# Check effective mTLS policy for a service
istioctl x describe service reviews.my-app

# Tail access logs from the Envoy sidecar of a pod in real time
kubectl logs -l app=productpage -n my-app -c istio-proxy -f

# List the AuthorizationPolicies that apply to a workload
istioctl x authz check deploy/reviews -n my-app

# Debug a 503 between two services — check endpoint health
istioctl proxy-config endpoint deploy/productpage -n my-app | grep reviews

Common Gotchas in Production

DestinationRule subset not found → 503 ENVOY_UPSTREAM_503

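If a VirtualService references a subset (e.g. v2) that no DestinationRule defines, Envoy has no cluster to route to and returns a 503, typically logged with the NC (no cluster found) response flag. If the subset exists but its labels match no running pods, the flag is UH (no healthy upstream) instead. Always deploy the DestinationRule before the VirtualService that references its subsets, and use istioctl analyze to catch the mismatch before you apply.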
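The same check can be scripted in CI before anything reaches the cluster. The Python sketch below takes manifests that have already been parsed into dictionaries (for example with a YAML loader) and reports subsets that a VirtualService references but no DestinationRule defines. It is a simplified, hypothetical helper: it compares host names literally and only looks at http routes.

```python
def missing_subsets(virtual_service: dict, destination_rules: list[dict]) -> set[tuple[str, str]]:
    """Return (host, subset) pairs referenced by the VirtualService but never defined."""
    defined = {
        (dr["spec"]["host"], subset["name"])
        for dr in destination_rules
        for subset in dr["spec"].get("subsets", [])
    }
    referenced = set()
    for http_rule in virtual_service["spec"].get("http", []):
        for route in http_rule.get("route", []):
            destination = route["destination"]
            if "subset" in destination:
                referenced.add((destination["host"], destination["subset"]))
    return referenced - defined


if __name__ == "__main__":
    vs = {"spec": {"http": [{"route": [
        {"destination": {"host": "reviews", "subset": "v1"}},
        {"destination": {"host": "reviews", "subset": "v2"}},
    ]}]}}
    dr = {"spec": {"host": "reviews", "subsets": [{"name": "v1"}]}}
    print(missing_subsets(vs, [dr]))  # {('reviews', 'v2')}
```

This does not replace istioctl analyze, which understands host name expansion and exportTo scoping, but it is cheap to run on every pull request.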

Readiness probes failing after STRICT mTLS

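The kubelet's liveness and readiness probes do not carry mTLS credentials, so under STRICT mode a probe sent straight to the application port would fail. Istio avoids this by rewriting HTTP, TCP, and gRPC probes on injected pods so that they are answered through the sidecar agent; exec probes run inside the container and are unaffected. Rewriting is enabled by default in current releases. If probes start failing after you enforce STRICT, check that it has not been turned off, for example through the sidecar.istio.io/rewriteAppHTTPProbers pod annotation.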

VirtualService not matching because of missing host

A VirtualService only applies to traffic whose Host header (or authority) matches one of its hosts entries. Istio expands a short name such as reviews relative to the namespace of the VirtualService itself, not the namespace of the caller or of the target service, so a rule created in another namespace can silently resolve to the wrong FQDN. Prefer the FQDN (reviews.my-app.svc.cluster.local) whenever the rule and the service may live in different namespaces. Use istioctl proxy-config route to inspect the resolved routes.
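The expansion rule is easy to get wrong, so here is a small Python sketch of it. It assumes the documented behaviour that a short name is expanded using the namespace of the rule that contains it, and it leaves any name that already contains a dot untouched. The function name and the cluster.local default are assumptions for illustration.

```python
def expand_host(host: str, rule_namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Expand a short service name relative to the namespace of the rule that uses it."""
    if host == "*" or "." in host:
        return host  # wildcards and dotted names are left as written in this sketch
    return f"{host}.{rule_namespace}.svc.{cluster_domain}"


if __name__ == "__main__":
    # The same short name means different services depending on where the rule lives.
    print(expand_host("reviews", "my-app"))    # reviews.my-app.svc.cluster.local
    print(expand_host("reviews", "frontend"))  # reviews.frontend.svc.cluster.local
```

The second call shows the failure mode: a VirtualService created in frontend with hosts: [reviews] does not target the reviews service in my-app.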

EnvoyFilter ordering breaks HTTP filters

Istio applies EnvoyFilters in creation timestamp order within the same priority. If two EnvoyFilters both INSERT_BEFORE the router filter, the second one may end up in the wrong position. Assign priority values explicitly using the priority field (Istio 1.20+) to control ordering deterministically.

Sidecar resource exhaustion at scale

At 200+ services, each Envoy sidecar holds the full xDS config for every service in the mesh. This can grow to 100MB+ of memory per sidecar. Use the Sidecar CRD to scope each workload's egress to only the services it actually calls — this reduces xDS config size by 60–80% in large meshes.

# Sidecar CRD — scope egress to only required services (reduces xDS config size)
apiVersion: networking.istio.io/v1
kind: Sidecar
metadata:
  name: productpage-sidecar
  namespace: my-app
spec:
  workloadSelector:
    labels:
      app: productpage
  egress:
    - hosts:
        - ./reviews.my-app.svc.cluster.local   # same namespace; the host part must be an FQDN
        - ./details.my-app.svc.cluster.local   # same namespace
        - istio-system/*     # control plane
