We have spent over a decade helping teams adopt Terraform. In that time, we have built more than 160 open-source modules, contributed to some of the most widely used infrastructure patterns in the ecosystem, and worked hands-on with hundreds of platform engineering teams — from early-stage startups to large enterprises managing thousands of resources across dozens of accounts.
None of that is meant to impress. It is context. Every one of those engagements taught us something about what works, what breaks, and — more importantly — when it breaks.
The pattern is remarkably consistent. Teams start strong. They write clean Terraform, adopt good conventions, set up CI/CD pipelines with GitHub Actions. For the first few months, everything moves fast. Then the infrastructure grows. The team grows. And something starts to crack — not in the Terraform itself, but in everything around it.
What breaks is not Terraform
The tools for writing infrastructure code have gotten genuinely good. Terraform is mature. OpenTofu is a credible alternative. Module registries, testing frameworks, linting tools — the ecosystem is rich. The problem is not writing the code. The problem is operating it at scale.
Every team we have worked with eventually hits the same wall. A pull request touches six stacks across three AWS accounts. Someone needs to figure out what order to deploy them in. Someone needs to make sure two engineers are not running terraform apply against the same state at the same time. Someone needs to notice when a resource drifts from its declared configuration — ideally before it causes an incident at 2 AM, not during one.
These are coordination problems, not Terraform problems. And teams solve them the same way every time: with custom scripts, manual runbooks, and tribal knowledge that lives in one person's head.
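The ordering problem above is, at its core, a topological sort over the stack dependency graph. A minimal sketch using Python's standard-library `graphlib` — the stack names and dependencies here are invented for illustration; in practice they would come from your configuration, not be hand-written:

```python
# Hypothetical stacks from a PR touching several accounts. Each stack
# maps to the set of stacks that must be applied before it.
from graphlib import TopologicalSorter

stacks = {
    "network": set(),
    "dns":     {"network"},
    "eks":     {"network"},
    "rds":     {"network"},
    "app":     {"eks", "rds"},
    "monitor": {"app"},
}

# static_order() yields every stack with all of its dependencies first:
# "network" is always first, "app" only after "eks" and "rds".
deploy_order = list(TopologicalSorter(stacks).static_order())
print(deploy_order)
```

The sort is the easy part; the hard part teams end up maintaining is everything around it — detecting which stacks a PR actually touched, fanning out the applies, and halting downstream stacks when an upstream one fails.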
THE PATTERN WE KEPT SEEING
Deployment ordering solved with custom workflow logic that nobody wants to maintain
State locking gaps discovered after state corruption
Drift accumulating silently until it causes an incident
Cross-repository visibility cobbled together with dashboards and scripts
We watched this pattern repeat for years. Teams would spend weeks building a deployment orchestration layer, get it working, and then spend months maintaining it. The orchestration code became its own codebase — with its own bugs, its own edge cases, and its own on-call rotation.
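The drift-check half of that orchestration layer usually starts as a small script. Terraform's `plan -detailed-exitcode` flag is documented to exit 0 when there are no changes, 1 on error, and 2 when changes are pending — i.e., drift. A hedged sketch of the scheduled check teams typically hand-roll (`check_stack` is an illustrative helper, not part of any tool):

```python
# Sketch of a scheduled drift check built on `terraform plan`.
# Documented exit codes with -detailed-exitcode:
#   0 = no changes, 1 = error, 2 = pending changes (drift).
import subprocess

def classify_drift(exit_code: int) -> str:
    """Map a `terraform plan -detailed-exitcode` exit code to a status."""
    return {0: "clean", 1: "error", 2: "drifted"}.get(exit_code, "unknown")

def check_stack(stack_dir: str) -> str:
    # -input=false keeps the read-only check non-interactive; a real
    # pipeline would also capture the plan output and route an alert.
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=stack_dir,
        capture_output=True,
    )
    return classify_drift(result.returncode)
```

The script itself is trivial. Scheduling it across every stack, deduplicating alerts, and keeping it working as stacks are added is where the maintenance burden accumulates.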
What we learned from IaC platforms
We have used many of the major platforms for managing infrastructure as code. They are well-built products, and for some teams, they are the right choice. But there is a common pattern in how they are designed: they want to become your new center of gravity. They want you to move your state, your execution, your workflow into their platform. Your existing CI/CD system becomes a second-class citizen. Your team learns a new interface. Your processes reshape around the platform's opinions.
For some teams, that is a perfectly reasonable trade-off. For the teams we work with — teams that have already invested heavily in GitHub, in Atmos, in their own carefully tuned workflows — being asked to migrate to a new platform is not a solution. It is a new problem. It means re-training engineers, re-building integrations, and accepting a dependency on a system that is now load-bearing for your entire infrastructure lifecycle.
The best infrastructure tooling we encountered was always the lightest — the tools that augmented what you already had rather than replacing it.
The tooling we admired most over the years followed a different philosophy. It was narrow in scope, clear in purpose, and designed to make existing workflows better rather than obsolete. It did not ask you to change how you work. It just removed the friction from the work you were already doing.
The decision to build
About a year ago, we made a decision. We had built some version of the same orchestration layer — deployment ordering, locking, drift detection — for enough clients that it stopped making sense to build it from scratch each time. The problems were well-understood. The solutions were well-tested. What was missing was a product.
We spent that year working closely with early users, gathering feedback, and iterating on the design. We were deliberate about what Atmos Pro should be and, just as importantly, what it should not be. It should not be a platform you migrate to. It should not replace GitHub Actions or demand that you change your CI/CD pipeline. It should be a thin layer — the minimal glue — that makes your existing workflows work together. Ordered deployments. Coordinated locking. Drift detection that runs on a schedule and alerts you before your next incident. All of it operating through the GitHub infrastructure you already have.
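To make "coordinated locking" concrete: the guarantee is mutual exclusion — two applies against the same state can never both proceed. This is not how Atmos Pro implements it; it is only an illustration of the primitive, using an atomic exclusive file create:

```python
# Illustration of the mutual-exclusion guarantee behind deployment
# locking, using O_CREAT | O_EXCL so creation is atomic: exactly one
# caller can hold the lock at a time.
import os

def try_acquire(lock_path: str, holder: str) -> bool:
    """Atomically create the lock file; fail if it already exists."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another apply already holds the lock
    with os.fdopen(fd, "w") as f:
        f.write(holder)  # record who holds it, for visibility
    return True

def release(lock_path: str) -> None:
    os.remove(lock_path)
```

The first engineer's apply acquires the lock; a second attempt is refused until the first releases it. A real system layers on what a file cannot provide: lock expiry when a run crashes, and visibility into who holds what.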
Atmos Pro is the result of that year of work. It is not a grand reimagining of how infrastructure should be managed. It is a focused product that solves the specific coordination problems that every platform team encounters at scale — and solves them without asking you to abandon the tools and workflows you have already invested in.
What comes next
We are just getting started, but we wanted to share the thinking behind why we built this. If you are interested in the full feature overview — ordered deployments, deployment locking, drift detection, GitHub-native integration — read Introducing Atmos Pro. And if you are curious about the design philosophy that guided our decisions, we wrote about that separately in The Minimal Glue Philosophy.
We built Atmos Pro because we believe the best infrastructure tooling should disappear into the background. It should make the things that used to break stop breaking — and then get out of your way.
See what we built
Atmos Pro orchestrates your Terraform deployments through GitHub Actions — ordered, locked, and drift-aware.
