Glossary

Acronyms and frameworks referenced throughout this playbook, in alphabetical order.

| Term | Meaning |
| --- | --- |
| ADR | Architecture Decision Record: a short document capturing a significant technical decision and its rationale, used as a durable artifact in Collaboration, Culture & Governance |
| AIOps | Applying AI/ML to operations data (logs, metrics, traces) to reduce alert noise and accelerate root-cause analysis; see Observability & SRE |
| DORA | DevOps Research and Assessment: the research program (now part of Google Cloud) behind the four key delivery metrics and the annual State of DevOps report; see Delivery Performance & DORA Metrics |
| DR | Disaster Recovery: the plans and mechanisms for restoring service after a major infrastructure or regional failure; see Incident Response, Resilience & Disaster Recovery |
| eBPF | Extended Berkeley Packet Filter: kernel technology enabling low-overhead network and security observability without modifying application code; see Kubernetes & Container Platform Operations |
| FinOps | Financial Operations: the operating model (formalized by the FinOps Foundation) for managing cloud cost as a cross-functional, data-driven discipline; see Cloud Infrastructure & FinOps |
| GitOps | Operating model where Git is the single source of truth for declarative infrastructure and application state, reconciled continuously by an operator (e.g. Argo CD, Flux); see GitOps & Declarative Deployment |
| IDP | Internal Developer Platform: a self-service platform (e.g. built with Backstage) that abstracts infrastructure complexity behind golden paths for application developers; see Platform Engineering & Internal Developer Platforms |
| MTTR | Mean Time To Restore (or Repair): average time to recover service after a production incident; one of the four DORA metrics |
| OPA | Open Policy Agent: a general-purpose policy engine used to enforce Policy as Code (commonly paired with Gatekeeper for Kubernetes admission control); see Infrastructure as Code & Policy as Code |
| OpenTelemetry (OTel) | The CNCF-hosted, vendor-neutral standard for instrumenting applications to emit traces, metrics, and logs; see Observability & SRE |
| RTO / RPO | Recovery Time Objective / Recovery Point Objective: the target time to restore service and the maximum tolerable data loss after an incident; see Incident Response, Resilience & Disaster Recovery |
| SBOM | Software Bill of Materials: a machine-readable inventory of all components and dependencies in a software artifact; see DevSecOps & Software Supply Chain Security |
| SLI / SLO | Service Level Indicator / Objective: the measured signal of service health and the target threshold for it, the foundation of error-budget-based reliability work; see Observability & SRE |
| SLSA | Supply-chain Levels for Software Artifacts: an OpenSSF framework defining progressive levels of build integrity and provenance for software supply chains; see DevSecOps & Software Supply Chain Security |
| VSM | Value Stream Management: analyzing and optimizing the end-to-end flow of work from idea to production to find and remove bottlenecks; referenced in AI/MLOps & AI-Augmented DevOps |
| Zero Trust | A security model that assumes no implicit trust based on network location and requires continuous verification of every request; see DevSecOps & Software Supply Chain Security |
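
The MTTR and SLI / SLO entries above describe quantities that reduce to short arithmetic, so a small illustration may help. The following is a minimal sketch, not a prescribed implementation: the function names `mean_time_to_restore` and `error_budget_remaining` are hypothetical, incidents are assumed to be recorded as (started, restored) timestamp pairs, and the SLI is assumed to be event-based (good events divided by total events).

```python
from datetime import datetime, timedelta


def mean_time_to_restore(incidents):
    """Average of (restored - started) over (started, restored) datetime pairs.

    Hypothetical helper: assumes each incident has a known start and restore time.
    """
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((restored - started for started, restored in incidents), timedelta())
    return total / len(incidents)


def error_budget_remaining(slo_target, good_events, total_events):
    """Fraction of the error budget still unspent for an event-based SLI.

    Assumption: the budget is (1 - slo_target) * total_events bad events.
    A negative result means the budget has been overspent.
    """
    if total_events == 0:
        return 1.0  # no traffic, so nothing has been spent
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    if allowed_bad == 0:
        return 0.0 if actual_bad else 1.0  # a 100% target leaves no budget
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    incidents = [
        (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)),  # 30 minutes
        (datetime(2024, 1, 2, 12, 0), datetime(2024, 1, 2, 13, 30)),  # 90 minutes
    ]
    print(mean_time_to_restore(incidents))
    print(error_budget_remaining(0.999, 999_500, 1_000_000))
```

With the sample data, the two incidents last 30 and 90 minutes, so the mean is one hour. A 99.9% target over one million events allows roughly 1,000 bad events, and 500 observed failures leave about half of the budget unspent.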