
People talk about GitOps like it is the final form of delivery. In reality, its value depends heavily on scale.
I have spent years helping teams go from one multi-tenant instance to hundreds of single-tenant instances. GitOps was useful early. At large scale, though, it became a constant fight.
One formula captures it well: P(failure) = 1 - p^n
where p is the probability that each individual change succeeds, and n is how many moving parts you have to coordinate. As n grows, failure risk climbs fast even when each single change is "pretty safe."
For example: you are deploying one release to 100 single-tenant customer environments, and each environment sync has a 99% success rate.
p = 0.99 (one environment sync succeeds 99% of the time)
n = 100 (100 environment syncs in the rollout wave)
1 - 0.99^100 ≈ 0.634

So that rollout has about a 63% chance that at least one customer environment fails to deploy cleanly on the first pass.
P(failure) = 1 - 0.99^n

1.00 ┤
     │
0.80 ┤                                                       ●       ●
     │                                               ●
0.60 ┤                               ●       ●
     │                       ●
0.40 ┤
     │               ●
0.20 ┤       ●
     │
0.00 └───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┴──
     0      20      40      60      80     100     120     140     160  n
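The curve above is easy to reproduce yourself. A minimal sketch, using the 99% per-sync success rate from the example:

```python
# P(failure) = 1 - p^n: the chance that at least one of n independent
# steps fails, given per-step success probability p.
def rollout_failure_probability(p: float, n: int) -> float:
    return 1 - p ** n

# The worked example: 100 environment syncs at 99% reliability each.
print(round(rollout_failure_probability(0.99, 100), 3))  # 0.634

# How risk climbs with fleet size at the same per-step reliability.
for n in (20, 60, 100, 160):
    print(n, round(rollout_failure_probability(0.99, n), 3))
```

Note the independence assumption: correlated failures (a bad manifest hits every environment) behave differently, but the per-environment flakiness this models is exactly what fleet rollouts fight.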
Formula-wise, you only have two levers:
- n (fewer independent steps per rollout)
- p (make each step more reliable)

GitOps alone does not raise p for you. To improve p, you need other tooling and controls like preflight checks, dependency validation, rollout orchestration, retries, and policy guardrails.
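To make the p lever concrete: a retry is the simplest control that raises p. A sketch, assuming retries are independent attempts (real infrastructure only approximates this, since correlated failures retry into the same wall):

```python
def effective_success(p: float, retries: int) -> float:
    """Per-step success probability when a failed step is retried
    up to `retries` extra times (independence assumed)."""
    return 1 - (1 - p) ** (retries + 1)

def rollout_failure(p: float, n: int) -> float:
    return 1 - p ** n

# 100 environments at 99% per-sync success:
print(round(rollout_failure(0.99, 100), 3))                        # no retries: ~0.63
print(round(rollout_failure(effective_success(0.99, 1), 100), 3))  # one retry per sync
print(round(rollout_failure(effective_success(0.99, 2), 100), 3))  # two retries per sync
```

One automatic retry takes per-sync success from 0.99 to 0.9999, which drops the fleet-level failure chance from roughly 63% to about 1%. That is the kind of math GitOps by itself never does for you.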
GitOps is great when:

small scale
dev -> PR -> merge -> deploy -> done
(few moving parts, easy to reason about)

In that setup, Git gives you clean history, solid audit trails, and predictable rollouts.
Once you have a big fleet, a few things happen fast.
Every deployment turns into repo choreography. More branches, more approvals, more waiting. You start optimizing for merge flow instead of delivery outcomes.
large scale
CI -> PR -> approval -> merge -> sync
      \-> policy check -> rebase -> approval -> merge -> sync
      \-> hotfix PR -> cherry-pick -> re-sync
Rolling back one service is easy. Rolling back a whole environment with dependencies is not. Git can show you what changed, but it cannot restore all runtime conditions.
At scale, you end up with endless overrides: customer-specific, region-specific, compliance-specific, and emergency patches. The issue is not YAML itself. The issue is how much state humans must keep in their heads.
This is the part people avoid saying out loud.
At scale, teams will make changes outside GitOps. During incidents, during customer escalations, during vendor outages. Not because they are careless, but because they are solving an immediate problem.
If your model assumes that never happens, it is too idealistic for enterprise operations.
GitOps lovers and GitOps haters are usually dealing with different scales.
At small scale, GitOps feels clean.
At enterprise scale, repo-centric workflows become too low-level for the job.
That is the real mismatch.
Do not throw away GitOps. Just stop treating Git as the entire control plane.
Use Git for intent and auditability, and add platform-level orchestration on top of it:
desired model

Git (intent) ---> Orchestrator ---> Fleet of environments
                   |                  |   |   |       |
                   +-> policy        e1  e2  e3  ...  eN
                   +-> rollout waves
                   +-> drift detection
                   +-> recovery paths
This is the key point: you need a tool that can actively orchestrate and enforce these runtime controls. GitOps alone cannot provide that. Git can store desired state. It does not run rollout logic, cross-environment safety checks, or live dependency coordination by itself.
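What "rollout logic above Git" means in practice can be sketched in a few lines. This is a hypothetical illustration, not any real tool's API: deploy in waves, check each wave, and halt the fleet when a wave blows its failure budget.

```python
# Hypothetical sketch of wave-based rollout logic that Git alone
# does not provide: deploy in small batches, verify, halt on failure.
from typing import Callable, List

def rollout_in_waves(
    environments: List[str],
    deploy: Callable[[str], bool],      # returns True on a clean sync
    wave_size: int = 10,
    max_failures_per_wave: int = 0,
) -> List[str]:
    """Deploy wave by wave; stop the fleet rollout as soon as a wave
    exceeds its failure budget. Returns the environments deployed."""
    done: List[str] = []
    for i in range(0, len(environments), wave_size):
        wave = environments[i:i + wave_size]
        failures = [env for env in wave if not deploy(env)]
        done.extend(env for env in wave if env not in failures)
        if len(failures) > max_failures_per_wave:
            # Halt here instead of syncing the rest of the fleet.
            break
    return done

# Toy run: the 25th environment fails, so the rollout stops mid-fleet
# instead of pushing a bad release to all 100 customers.
envs = [f"e{i}" for i in range(1, 101)]
deployed = rollout_in_waves(envs, deploy=lambda env: env != "e25")
print(len(deployed))  # environments synced before the halt
```

The point is not this particular loop; it is that someone has to run it, track live results, and decide to stop. A Git repo holds the desired state, but it has nowhere to execute this decision.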
That is the practical model: GitOps as an input, not the whole operating system.
GitOps is good. Pure GitOps at enterprise scale usually is not.
The bigger you get, the more you need orchestration that lives above pull requests.