Glossary
Acronyms and frameworks referenced throughout this playbook, in alphabetical order.
| Term | Meaning |
|---|---|
| ADR | Architecture Decision Record — a short document capturing a significant technical decision and its rationale, used as a durable artifact in Collaboration, Culture & Governance |
| AIOps | Applying AI/ML to operations data (logs, metrics, traces) to reduce alert noise and accelerate root-cause analysis; see Observability & SRE |
| DORA | DevOps Research and Assessment — the research program (now part of Google Cloud) behind the four key delivery metrics and the annual State of DevOps report; see Delivery Performance & DORA Metrics |
| DR | Disaster Recovery — the plans and mechanisms for restoring service after a major infrastructure or regional failure; see Incident Response, Resilience & Disaster Recovery |
| eBPF | Extended Berkeley Packet Filter — kernel technology enabling low-overhead network and security observability without modifying application code; see Kubernetes & Container Platform Operations |
| FinOps | Financial Operations — the operating model (formalized by the FinOps Foundation) for managing cloud cost as a cross-functional, data-driven discipline; see Cloud Infrastructure & FinOps |
| GitOps | Operating model where Git is the single source of truth for declarative infrastructure and application state, reconciled continuously by an operator (e.g. Argo CD, Flux); see GitOps & Declarative Deployment |
| IDP | Internal Developer Platform — a self-service platform (e.g. built with Backstage) that abstracts infrastructure complexity behind golden paths for application developers; see Platform Engineering & Internal Developer Platforms |
| MTTR | Mean Time To Restore (or Repair) — average time to recover service after a production incident; one of the four DORA metrics |
| OPA | Open Policy Agent — a general-purpose policy engine used to enforce Policy as Code (commonly paired with Gatekeeper for Kubernetes admission control); see Infrastructure as Code & Policy as Code |
| OpenTelemetry (OTel) | The CNCF-hosted, vendor-neutral standard for instrumenting applications to emit traces, metrics, and logs; see Observability & SRE |
| RTO / RPO | Recovery Time Objective / Recovery Point Objective: the target time to restore service after an incident, and the maximum tolerable data loss, expressed as a window of time (e.g. the last 15 minutes of writes); see Incident Response, Resilience & Disaster Recovery |
| SBOM | Software Bill of Materials — a machine-readable inventory of all components and dependencies in a software artifact; see DevSecOps & Software Supply Chain Security |
| SLI / SLO | Service Level Indicator / Objective — the measured signal of service health and the target threshold for it, the foundation of error-budget-based reliability work; see Observability & SRE |
| SLSA | Supply-chain Levels for Software Artifacts — an OpenSSF framework defining progressive levels of build integrity and provenance for software supply chains; see DevSecOps & Software Supply Chain Security |
| VSM | Value Stream Management — analyzing and optimizing the end-to-end flow of work from idea to production to find and remove bottlenecks; referenced in AI/MLOps & AI-Augmented DevOps |
| Zero Trust | A security model that assumes no implicit trust based on network location and requires continuous verification of every request; see DevSecOps & Software Supply Chain Security |
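
Two of the entries above, MTTR and SLI / SLO, are defined by simple arithmetic, so a small worked sketch may help make them concrete. The function names, the incident timestamps, and the request counts below are hypothetical and chosen only for illustration; real tooling will define incident boundaries and SLI measurement windows in its own way.

```python
from datetime import datetime, timedelta


def mean_time_to_restore(incidents):
    """Average restore time over (started, restored) datetime pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    durations = [restored - started for started, restored in incidents]
    return sum(durations, timedelta()) / len(durations)


def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Fraction of the error budget still unspent; negative if overspent.

    slo_target is the SLO as a fraction (e.g. 0.999 for 99.9%), and the
    SLI is assumed to be the share of successful requests in the window.
    """
    allowed_failures = (1 - slo_target) * total_requests
    return 1 - failed_requests / allowed_failures


# Hypothetical incidents: one lasting 30 minutes, one lasting 90 minutes.
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 30)),
    (datetime(2024, 1, 9, 12, 0), datetime(2024, 1, 9, 13, 30)),
]
mttr = mean_time_to_restore(incidents)

# Hypothetical window: 99.9% SLO, 1,000,000 requests, 250 failures.
budget_left = error_budget_remaining(0.999, 1_000_000, 250)
```

With these made-up inputs the MTTR is one hour, and roughly three quarters of the error budget remains, since a 99.9% target over 1,000,000 requests tolerates about 1,000 failures.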