The Hidden Tax of Platform Fragmentation
At some point every growing engineering organization hits the same wall. Spinning up a new service requires hunting across five wikis for the right template, pinging three different teams to get permissions set up, and spending the first week just wiring together CI/CD, logging, and secrets management — again. The code that matters gets delayed by infrastructure ceremony.
This is the problem that Internal Developer Platforms (IDPs) are built to solve. An IDP is a self-service layer on top of your infrastructure — a single portal where engineers can browse the service catalog, spin up new services from golden-path templates, access documentation, and get visibility into deployments and health, all without waiting on a platform team ticket. Backstage, open-sourced by Spotify in 2020 and now a CNCF Incubating project, has become the de facto framework for building IDPs. This guide covers the architecture, core features, and practical implementation patterns that make Backstage IDPs actually useful in production.
Backstage Architecture — What You Are Actually Deploying
Backstage is a React frontend backed by a Node.js plugin host. You do not install Backstage like a SaaS tool — you generate your own app with the official CLI and own the deployment from then on. This is intentional: customization is a first-class concern.
The core components are: the Software Catalog (the metadata registry for all your services, APIs, libraries, and resources), TechDocs (documentation as code, co-located with the service it describes), Software Templates (the scaffolding engine for golden-path service creation), and the Plugin System (the extension mechanism that connects Backstage to the rest of your toolchain).
# Create a new Backstage app (requires Node.js 20+, yarn)
npx @backstage/create-app@latest
# App structure after creation:
backstage/
├── app-config.yaml          # Main configuration file
├── app-config.local.yaml    # Local overrides (gitignored)
├── packages/
│   ├── app/                 # React frontend — your customizations go here
│   │   └── src/
│   │       ├── App.tsx      # Plugin wiring and routes
│   │       └── components/  # Custom UI overrides
│   └── backend/             # Node.js backend — plugin host
│       └── src/
│           └── index.ts     # Backend plugin registration
├── plugins/                 # Custom plugins you build
└── catalog-info.yaml        # Backstage entity describing this repo
The Software Catalog — Your Single Source of Truth
The Software Catalog is Backstage's core feature and the foundation everything else builds on. It ingests catalog-info.yaml files from your repositories and builds a searchable registry of all your software entities: services, APIs, libraries, documentation sites, CI pipelines, cloud resources, and teams.
The catalog uses a typed entity model. The main kinds are Component (a service, library, or website), API (an interface exposed by a component), System (a group of related components), Domain (a business domain grouping systems), Resource (infrastructure like databases or S3 buckets), and Group/User (organizational entities for ownership).
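Entities reference each other through compact string references such as `group:payments-team` or `component:default/order-service`, following a `[kind]:[namespace/]name` format where the namespace defaults to `default`. The sketch below illustrates that format with a standalone parser; the real implementation lives in `@backstage/catalog-model`, so treat this as a description of the format rather than the library's API.

```typescript
// Illustrative parser for Backstage's "[kind]:[namespace/]name" entity-ref
// format. Standalone sketch — the real one is in @backstage/catalog-model.
type EntityRef = { kind: string; namespace: string; name: string };

function parseEntityRef(ref: string, defaultKind = 'component'): EntityRef {
  // Split off the kind before the first ':', falling back to a default kind
  const colon = ref.indexOf(':');
  const [kind, rest] =
    colon >= 0 ? [ref.slice(0, colon), ref.slice(colon + 1)] : [defaultKind, ref];
  // Split off the namespace before the first '/', defaulting to "default"
  const slash = rest.indexOf('/');
  const [namespace, name] =
    slash >= 0 ? [rest.slice(0, slash), rest.slice(slash + 1)] : ['default', rest];
  return { kind: kind.toLowerCase(), namespace, name };
}

console.log(parseEntityRef('group:payments-team'));
// → { kind: 'group', namespace: 'default', name: 'payments-team' }
console.log(parseEntityRef('component:default/order-service'));
// → { kind: 'component', namespace: 'default', name: 'order-service' }
```

This is why `owner: group:payments-team` in a `catalog-info.yaml` resolves to the `payments-team` Group entity in the default namespace.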
# catalog-info.yaml — annotate every repository with this file
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-service
  title: Payment Service
  description: Handles all payment processing and reconciliation
  annotations:
    # Link to your CI/CD system
    github.com/project-slug: myorg/payment-service
    # Link to Kubernetes deployments
    backstage.io/kubernetes-id: payment-service
    # Link to PagerDuty for on-call info
    pagerduty.com/service-id: P1234AB
    # Link to Datadog dashboards
    datadoghq.com/dashboard-url: https://app.datadoghq.com/dashboard/abc-123
    # TechDocs source
    backstage.io/techdocs-ref: dir:.
  tags:
    - payments
    - critical
    - go
  links:
    - url: https://grafana.internal/d/payment-service
      title: Grafana Dashboard
      icon: dashboard
spec:
  type: service
  lifecycle: production # production | experimental | deprecated
  owner: group:payments-team
  system: checkout-system
  dependsOn:
    - component:order-service
    - resource:postgres-payments-db
  providesApis:
    - payment-api

# app-config.yaml — configure catalog discovery
catalog:
  rules:
    - allow: [Component, API, System, Domain, Resource, Group, User, Template, Location]
  locations:
    # Discover from a GitHub org — scans all repos for catalog-info.yaml
    - type: github-discovery
      target: https://github.com/myorg
      rules:
        - allow: [Component, API, System, Domain, Resource]
    # Or register individual repos
    - type: url
      target: https://github.com/myorg/payment-service/blob/main/catalog-info.yaml
    # Load team structure from GitHub teams
    - type: github-org
      target: https://github.com/myorg
      rules:
        - allow: [Group, User]

Note: once every entity declares an owner pointing to a Group entity, the catalog becomes an accountability graph. You can query "which team owns the service that just went down?" and get an answer in seconds. Invest time in getting ownership right from day one — retrofitting it into hundreds of existing repos is painful.

TechDocs — Documentation That Stays Current
TechDocs is Backstage's docs-as-code solution. It renders MkDocs-format Markdown from your repository directly into Backstage, co-located with the service in the catalog. When a developer searches for how a service works, they get the docs alongside the code, on-call info, API spec, and deployment status — all in one place.
The standard setup uses the recommended build strategy: your CI pipeline generates the static docs site and publishes it to an object store (S3, GCS, or Azure Blob). Backstage serves the pre-built docs, keeping read performance fast without building on every request.
# mkdocs.yml — place in the root of each repository
site_name: Payment Service
site_description: Handles all payment processing and reconciliation
docs_dir: docs/
nav:
  - Home: index.md
  - Architecture: architecture.md
  - API Reference: api.md
  - Runbooks:
      - Incident Response: runbooks/incident-response.md
      - Scaling Procedures: runbooks/scaling.md
  - ADRs:
      - 001 - Async payments: decisions/001-async-payments.md
  - Changelog: changelog.md
plugins:
  - techdocs-core # Required Backstage plugin — adds search, etc.

# catalog-info.yaml annotation (points Backstage to docs)
# backstage.io/techdocs-ref: dir:.

# CI job to build and publish TechDocs (GitHub Actions example)
name: Publish TechDocs
on:
  push:
    branches: [main]
    paths:
      - docs/**
      - mkdocs.yml
jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install MkDocs TechDocs plugin
        run: pip install mkdocs-techdocs-core
      - name: Generate docs site
        run: npx @techdocs/cli generate --no-docker --source-dir . --output-dir site
      - name: Publish to S3
        run: |
          npx @techdocs/cli publish \
            --publisher-type awsS3 \
            --storage-name my-techdocs-bucket \
            --entity default/component/payment-service
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

Software Templates — The Golden Path in Action
Software Templates are Backstage's scaffolding engine — the feature that eliminates the "copy-paste from an old service and remove everything company-specific" anti-pattern. A template is a YAML file that defines a form (the inputs engineers fill out), a list of steps (actions to run), and an output (the generated repository or resource).
Good templates encode your organization's best practices: CI/CD setup, logging configuration, secrets management, Kubernetes manifests, monitoring dashboards, and a pre-registered catalog entry — all pre-wired and ready on day one. The engineer fills out a form and gets a production-ready repository in under two minutes.
# template.yaml — a Go microservice template
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: go-microservice
  title: Go Microservice
  description: Create a new production-ready Go service with CI/CD, Docker, and Kubernetes manifests
  tags:
    - go
    - microservice
    - kubernetes
spec:
  owner: group:platform-team
  type: service
  # Input form — shown in Backstage UI
  parameters:
    - title: Service Details
      required: [name, description, owner, system]
      properties:
        name:
          title: Service Name
          type: string
          pattern: '^[a-z][a-z0-9-]*[a-z0-9]$'
          description: Lowercase letters, numbers, and hyphens only
        description:
          title: Description
          type: string
        owner:
          title: Owner Team
          type: string
          ui:field: OwnerPicker # Backstage built-in picker
          ui:options:
            allowedKinds: [Group]
        system:
          title: System
          type: string
          ui:field: EntityPicker
          ui:options:
            catalogFilter:
              kind: System
    - title: Infrastructure
      properties:
        namespace:
          title: Kubernetes Namespace
          type: string
          default: production
          enum: [production, staging, development]
        replicas:
          title: Initial Replica Count
          type: integer
          default: 2
          minimum: 1
          maximum: 10
  # Steps executed by the scaffolder backend
  steps:
    - id: fetch-template
      name: Fetch Template
      action: fetch:template
      input:
        url: ./skeleton # Template files in ./skeleton directory
        values:
          name: ${{ parameters.name }}
          description: ${{ parameters.description }}
          owner: ${{ parameters.owner }}
          system: ${{ parameters.system }}
          namespace: ${{ parameters.namespace }}
          replicas: ${{ parameters.replicas }}
    - id: create-github-repo
      name: Create GitHub Repository
      action: publish:github
      input:
        allowedHosts: [github.com]
        description: ${{ parameters.description }}
        repoUrl: github.com?owner=myorg&repo=${{ parameters.name }}
        defaultBranch: main
        requireCodeOwnerReviews: true
        repoVisibility: private
    - id: register-catalog
      name: Register in Software Catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps['create-github-repo'].output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml
  output:
    links:
      - title: GitHub Repository
        url: ${{ steps['create-github-repo'].output.remoteUrl }}
      - title: Open in Catalog
        icon: catalog
        entityRef: ${{ steps['register-catalog'].output.entityRef }}

Note: skeleton file names and contents are templated, so a file named ${{ values.name }}-deployment.yaml becomes payment-service-deployment.yaml after scaffolding. Keep skeletons minimal and maintained — outdated templates that generate broken code erode trust in the platform faster than almost anything else.

Kubernetes Plugin — Deployment Visibility Without kubectl
The Backstage Kubernetes plugin surfaces live deployment status, pod health, replica counts, resource consumption, and recent events directly on the component page in the catalog. Engineers get deployment visibility without needing cluster access or kubectl — which matters both for productivity and for security (fewer people need direct cluster credentials).
# app-config.yaml — Kubernetes plugin configuration
kubernetes:
  serviceLocatorMethod:
    type: multiTenant # Route requests to multiple clusters
  clusterLocatorMethods:
    - type: config
      clusters:
        - name: production-eu-west-1
          url: https://prod-eu-west-1.k8s.internal
          authProvider: serviceAccount
          serviceAccountToken: ${K8S_PROD_EU_TOKEN}
          caData: ${K8S_PROD_EU_CA}
          skipTLSVerify: false
        - name: production-us-east-1
          url: https://prod-us-east-1.k8s.internal
          authProvider: aws
          # Uses IRSA — no static credentials needed in EKS
        - name: staging
          url: https://staging.k8s.internal
          authProvider: serviceAccount
          serviceAccountToken: ${K8S_STAGING_TOKEN}
          caData: ${K8S_STAGING_CA}

# Component annotation to link deployments
# Add to catalog-info.yaml of each service:
# annotations:
#   backstage.io/kubernetes-id: payment-service
#   backstage.io/kubernetes-namespace: production
#   backstage.io/kubernetes-label-selector: 'app=payment-service'

# RBAC — create a read-only service account for Backstage
# Apply to each cluster Backstage monitors
apiVersion: v1
kind: ServiceAccount
metadata:
  name: backstage-reader
  namespace: backstage-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: backstage-reader
rules:
  - apiGroups: [""]
    resources: [pods, services, configmaps, resourcequotas, limitranges]
    verbs: [get, list, watch]
  - apiGroups: [apps]
    resources: [deployments, replicasets, statefulsets, daemonsets]
    verbs: [get, list, watch]
  - apiGroups: [autoscaling]
    resources: [horizontalpodautoscalers]
    verbs: [get, list, watch]
  - apiGroups: [metrics.k8s.io]
    resources: [pods]
    verbs: [get, list]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: backstage-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: backstage-reader
subjects:
  - kind: ServiceAccount
    name: backstage-reader
    namespace: backstage-system

The Plugin Ecosystem — Integrating Your Toolchain
Backstage's value multiplies with each integration. The plugin marketplace lists 150+ community and vendor plugins. The ones with the highest adoption and production value are:
CI/CD — GitHub Actions, Jenkins, ArgoCD
Surface recent workflow runs, build status, and deployment history directly on the component page. Engineers see at a glance whether the latest commit built successfully and whether it has been deployed to staging or production — without leaving Backstage.
Observability — Grafana, PagerDuty, Datadog
Embed Grafana panels, show active PagerDuty incidents, and surface on-call schedules on each component page. When something is on fire, the responder can see current alerts, the runbook link from TechDocs, and the owning team — all from one URL.
Security — Snyk, SonarQube, Dependabot
Show vulnerability counts, code quality scores, and dependency update status on the component page. Platform teams can define quality gates and track adoption across the catalog without manual audits.
Cost — Backstage Cost Insights
The Cost Insights plugin (open source, originally from Spotify) surfaces cloud cost breakdowns per team and product. It transforms abstract billing data into actionable per-service costs that engineering teams actually understand and can act on.
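Most of these integrations are activated the same way: an annotation on the component's catalog-info.yaml. The sketch below makes that pattern concrete by checking which integrations an entity's annotations would light up. The annotation keys are the ones used earlier in this guide; the function itself is illustrative, not a Backstage API.

```typescript
// Map from well-known catalog annotation keys (all shown earlier in this
// guide) to the integration they enable on the component page.
const PLUGIN_ANNOTATIONS: Record<string, string> = {
  'github.com/project-slug': 'GitHub Actions CI/CD',
  'backstage.io/kubernetes-id': 'Kubernetes',
  'pagerduty.com/service-id': 'PagerDuty',
  'datadoghq.com/dashboard-url': 'Datadog',
  'backstage.io/techdocs-ref': 'TechDocs',
};

// Given an entity's metadata.annotations, report which integrations activate
function enabledIntegrations(annotations: Record<string, string>): string[] {
  return Object.entries(PLUGIN_ANNOTATIONS)
    .filter(([key]) => annotations[key] !== undefined)
    .map(([, plugin]) => plugin);
}

console.log(enabledIntegrations({
  'github.com/project-slug': 'myorg/payment-service',
  'pagerduty.com/service-id': 'P1234AB',
}));
// → ['GitHub Actions CI/CD', 'PagerDuty']
```

This annotation-driven design is why catalog data quality matters so much: a missing annotation silently disables an integration for that service.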
Building Custom Plugins
When the community ecosystem does not cover your internal tools, you build custom plugins. A Backstage plugin is a React package (frontend) optionally paired with an Express-based backend module. The frontend renders a component on the entity page; the backend handles API calls to internal systems.
# Scaffold a new plugin using the Backstage CLI
cd backstage/
yarn backstage-cli new --select plugin
# CLI prompts:
#   Plugin ID: feature-flags
#   → creates plugins/feature-flags/

# Plugin structure:
plugins/feature-flags/
├── package.json
├── src/
│   ├── index.ts                    # Public exports
│   ├── plugin.ts                   # Plugin definition and routes
│   ├── components/
│   │   └── FeatureFlagsCard/       # EntityCard component
│   │       ├── FeatureFlagsCard.tsx
│   │       └── index.ts
│   └── api/
│       ├── FeatureFlagsClient.ts   # API client calling your backend
│       └── types.ts

// plugin.ts — register the plugin and its routes
import { createPlugin, createRoutableExtension } from '@backstage/core-plugin-api';
import { rootRouteRef } from './routes';

export const featureFlagsPlugin = createPlugin({
  id: 'feature-flags',
  routes: {
    root: rootRouteRef,
  },
});

// Entity card — shows feature flags for the current catalog entity
export const EntityFeatureFlagsCard = featureFlagsPlugin.provide(
  createRoutableExtension({
    name: 'EntityFeatureFlagsCard',
    component: () =>
      import('./components/FeatureFlagsCard').then(m => m.FeatureFlagsCard),
    mountPoint: rootRouteRef,
  }),
);

// Wire into App.tsx — add to the entity page layout:
// <EntityLayout.Route path="/feature-flags" title="Feature Flags">
//   <EntityFeatureFlagsCard />
// </EntityLayout.Route>
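The plugin structure above names a `FeatureFlagsClient.ts` that calls your backend. Below is a sketch of what such a client might look like. Everything in it — the endpoint path, the `FeatureFlag` shape, the injected fetch function — is hypothetical; a real Backstage client would obtain the backend URL via `discoveryApi` and make requests via `fetchApi`, but the dependency-injection pattern is the same.

```typescript
// FeatureFlagsClient.ts — hypothetical API client sketch.
// The endpoint and response shape are illustrative, not a real backend contract.
export interface FeatureFlag {
  name: string;
  enabled: boolean;
}

// Minimal fetch-like signature so the client is testable without a network
type FetchFn = (url: string) => Promise<{ json(): Promise<unknown> }>;

export class FeatureFlagsClient {
  // baseUrl would come from discoveryApi in a real plugin; fetchFn is injected
  constructor(private baseUrl: string, private fetchFn: FetchFn) {}

  async listFlags(serviceName: string): Promise<FeatureFlag[]> {
    const res = await this.fetchFn(
      `${this.baseUrl}/api/feature-flags/${encodeURIComponent(serviceName)}`,
    );
    return (await res.json()) as FeatureFlag[];
  }
}
```

Injecting the fetch function keeps the client unit-testable: tests pass a stub that returns canned flags, while the wired-up plugin passes the real Backstage fetch API.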
Production Deployment on Kubernetes
For production, run Backstage on Kubernetes with PostgreSQL as the catalog database. The official Helm chart is the standard deployment path. The main configuration decisions are: authentication (use your existing SSO provider — Okta, Azure AD, or Google Workspace), database (managed PostgreSQL such as RDS, not the SQLite default used for local development), and TechDocs storage (S3 or GCS for the pre-built docs assets).
# values.yaml for the Backstage Helm chart
backstage:
  image:
    registry: ghcr.io
    repository: myorg/backstage
    tag: "1.2.3" # pin to a specific version
  appConfig:
    app:
      baseUrl: https://backstage.internal.mycompany.com
    backend:
      baseUrl: https://backstage.internal.mycompany.com
      database:
        client: pg
        connection:
          host: ${POSTGRES_HOST}
          port: 5432
          user: ${POSTGRES_USER}
          password: ${POSTGRES_PASSWORD}
          database: backstage
    auth:
      environment: production
      providers:
        microsoft:
          production:
            clientId: ${AZURE_CLIENT_ID}
            clientSecret: ${AZURE_CLIENT_SECRET}
            tenantId: ${AZURE_TENANT_ID}
    techdocs:
      builder: external # Pre-built in CI, not on-demand
      publisher:
        type: awsS3
        awsS3:
          bucketName: mycompany-techdocs
          region: eu-west-1
ingress:
  enabled: true
  host: backstage.internal.mycompany.com
  annotations:
    kubernetes.io/ingress-class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
  tls:
    - secretName: backstage-tls
      hosts:
        - backstage.internal.mycompany.com
postgresql:
  enabled: false # Use external managed PostgreSQL
  auth:
    existingSecret: backstage-postgres-secret

Adoption Pitfalls — What Makes IDPs Fail
Backstage is not hard to deploy. Making engineers actually use it is. The common failure patterns:
Building features before solving data quality
A catalog full of stale, incomplete, or inaccurate entries is worse than no catalog. Before launching Backstage broadly, get catalog-info.yaml files into your top 20 most-used services with accurate ownership and lifecycle information. Quality beats quantity.
Templates that generate unmaintained code
A scaffold template that generates a service with outdated dependencies or broken CI configuration trains engineers to immediately delete the generated files and start over. Assign ownership to each template and treat it like production code — it needs to be tested, updated, and reviewed.
No golden path incentive
If the manual path is faster than the Backstage template, engineers will take the manual path. The template must genuinely save time by pre-wiring things that are otherwise painful: secrets, service accounts, monitoring, CODEOWNERS, and security scanning. The value must be immediate, not theoretical.
Measuring IDP Success — DORA and Developer Experience Metrics
The primary metrics for evaluating your IDP are the DORA metrics — Deployment Frequency, Lead Time for Changes, Change Failure Rate, and Time to Restore Service. A well-implemented IDP improves all four: templates reduce lead time, golden paths reduce change failure rate, and runbooks in TechDocs reduce time to restore.
Supplement DORA with developer experience metrics: time from repository creation to first production deployment (target: under 2 hours with templates), percentage of services with complete catalog entries (target: 90%+), and TechDocs coverage (percentage of services with at least an architecture doc and a runbook).
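The catalog-completeness metric above can be computed directly from catalog data. The sketch below shows one way to do it; the completeness criteria (owner, description, TechDocs annotation) and the entity shapes are illustrative — in practice you would fetch entities from the Backstage catalog REST API rather than hard-code them.

```typescript
// Sketch: compute "% of services with complete catalog entries".
// Criteria here are illustrative; tune them to your own quality gates.
interface CatalogEntity {
  metadata: {
    name: string;
    description?: string;
    annotations?: Record<string, string>;
  };
  spec?: { owner?: string };
}

// An entry counts as complete if it has an owner, a description,
// and a TechDocs source annotation
function isComplete(e: CatalogEntity): boolean {
  return Boolean(
    e.spec?.owner &&
      e.metadata.description &&
      e.metadata.annotations?.['backstage.io/techdocs-ref'],
  );
}

function completenessPercent(entities: CatalogEntity[]): number {
  if (entities.length === 0) return 0;
  return Math.round((entities.filter(isComplete).length / entities.length) * 100);
}

const sample: CatalogEntity[] = [
  {
    metadata: {
      name: 'payment-service',
      description: 'Payments',
      annotations: { 'backstage.io/techdocs-ref': 'dir:.' },
    },
    spec: { owner: 'group:payments-team' },
  },
  { metadata: { name: 'legacy-batch' } }, // no owner, no docs — incomplete
];
console.log(completenessPercent(sample)); // → 50
```

Running a report like this weekly and publishing the trend per team is a lightweight way to drive the 90%+ completeness target without manual audits.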
# Track catalog completeness with the Backstage Scorecards plugin
# Define quality gates per entity type
# Example: Component scorecard rules
rules:
  - id: has-owner
    name: Has owner
    filter:
      kind: Component
    rules:
      - factRef: catalog:default/entity-metadata-fact
        path: $.spec.owner
        operator: greaterThan
        value: 0
  - id: has-techdocs
    name: Has TechDocs
    rules:
      - factRef: catalog:default/entity-metadata-fact
        path: $.metadata.annotations['backstage.io/techdocs-ref']
        operator: equal
        value: "dir:."
  - id: has-pagerduty
    name: Has PagerDuty service
    rules:
      - factRef: catalog:default/entity-metadata-fact
        path: $.metadata.annotations['pagerduty.com/service-id']
        operator: greaterThan
        value: 0

Building an Internal Developer Platform or improving your developer experience?
We help engineering teams design and implement IDPs with Backstage — from Software Catalog setup and golden-path templates to custom plugins and DORA metric tracking. Let's talk.
Get in Touch