Once an organization grows past one account, two design questions get tangled together and people answer them as if they were one. First: how do we split accounts? Second: how do we split Terraform state? They are related but they are not the same line, and conflating them produces either a single state spanning forty accounts or one state per account with massive duplication. We try to keep them separate decisions.
Accounts split by isolation, not by app
An account is a security and blast-radius boundary first. We default to a per-environment-per-workload split for anything sensitive - prod payment processing does not share an account with the marketing site - and a shared services account for the things everyone needs (logging, CI, shared networking). The mistake is one account per microservice; you drown in cross-account IAM and you have not actually bought isolation, just overhead.
- Management/org account - billing, the org itself, SCP guardrails. Almost nothing runs here.
- Security/log-archive - centralized logs and audit, write-once, very few people touch it.
- Shared services - CI runners, shared DNS, transit networking.
- Per-environment workload accounts - prod, staging, dev, ideally per business domain for the sensitive ones.
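As a rough illustration of the list above, the account layout could be declared in the foundation root module along these lines. This is a sketch, not our exact code: the OU names, account names, and email addresses are hypothetical, and a real organization would likely carry more OUs and guardrails.

```hcl
# Sketch only: every name and email below is a hypothetical placeholder.
resource "aws_organizations_organization" "org" {
  feature_set          = "ALL"
  enabled_policy_types = ["SERVICE_CONTROL_POLICY"] # needed before SCPs can be attached
}

resource "aws_organizations_organizational_unit" "security" {
  name      = "security"
  parent_id = aws_organizations_organization.org.roots[0].id
}

resource "aws_organizations_organizational_unit" "workloads" {
  name      = "workloads"
  parent_id = aws_organizations_organization.org.roots[0].id
}

# Security/log-archive account: centralized logs and audit.
resource "aws_organizations_account" "log_archive" {
  name      = "log-archive"
  email     = "aws+log-archive@example.com"
  parent_id = aws_organizations_organizational_unit.security.id
}

# One workload account per environment for a sensitive business domain.
locals {
  workload_accounts = {
    "payments-prod"    = "aws+payments-prod@example.com"
    "payments-staging" = "aws+payments-staging@example.com"
    "payments-dev"     = "aws+payments-dev@example.com"
  }
}

resource "aws_organizations_account" "workload" {
  for_each  = local.workload_accounts
  name      = each.key
  email     = each.value
  parent_id = aws_organizations_organizational_unit.workloads.id
}
```

Adding a new domain then means adding entries to the map, not creating a new kind of account.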
State seams follow change frequency and ownership
Within and across those accounts, we cut state where ownership and change frequency change. The org-level foundation that provisions the accounts themselves is one state, run by the platform team, touched rarely. Inside each workload account, we cut again by layer - network, platform, application - just as we would in a single account. So one workload account commonly holds three or four states, and the foundation that created the account holds one more.
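To make the layering concrete, here is a minimal sketch of what the application layer's backend might look like inside one workload account. The bucket name, key scheme, region, and output name are all assumptions for illustration, and reading the network layer through `terraform_remote_state` is one possible way to wire layers together, not necessarily the only one.

```hcl
# application layer of a hypothetical payments-prod account
terraform {
  backend "s3" {
    bucket = "example-org-tfstate"                          # hypothetical central state bucket
    key    = "payments-prod/application/terraform.tfstate"  # one key per account and layer
    region = "eu-west-1"                                    # assumed region
  }
}

# Read-only view of the network layer's outputs. The network state has its
# own key, its own owners, and its own (slower) change cadence.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "example-org-tfstate"
    key    = "payments-prod/network/terraform.tfstate"
    region = "eu-west-1"
  }
}

locals {
  # Assumes the network layer exports an output named private_subnet_ids.
  private_subnet_ids = data.terraform_remote_state.network.outputs.private_subnet_ids
}
```

With a key scheme like this, the account name and the layer name are both visible in the state path, which keeps the two boundaries distinct.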
An account is a blast-radius boundary. A state file is an ownership boundary. Stop drawing them as one line.
Provider config is where this gets fiddly
The mechanical part people underestimate is assuming the right role in each account. We drive everything through `assume_role` in the provider block and pass the target account ID as a variable, so the same root module can be pointed at dev, staging, or prod by changing its inputs and the backend key. Nothing else differs between environments. The day your prod and staging code diverge because someone special-cased prod, you have lost the main benefit of the whole layout.
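The provider block that follows reads three variables. A minimal sketch of how they might be declared is shown here; the validation rule and the sample values in the comments are our own illustration, not a required part of the pattern.

```hcl
variable "region" {
  type        = string
  description = "AWS region the provider operates in."
}

variable "environment" {
  type        = string
  description = "Environment name, used for tagging (for example dev, staging, prod)."
}

variable "account_id" {
  # A string, not a number: AWS account IDs can start with zeros.
  type        = string
  description = "Target account in which the terraform-exec role is assumed."

  validation {
    condition     = can(regex("^[0-9]{12}$", var.account_id))
    error_message = "The account_id value must be a 12-digit AWS account ID."
  }
}

# A hypothetical prod.tfvars would then hold only values such as:
#   region      = "eu-west-1"
#   environment = "prod"
#   account_id  = "123456789012"
```

The validation catches a truncated or mistyped account ID at plan time, before Terraform tries to assume a role that does not exist.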
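```hcl
provider "aws" {
  region = var.region

  # Assume the execution role in whichever account this run targets.
  assume_role {
    role_arn = "arn:aws:iam::${var.account_id}:role/terraform-exec"
  }

  # Applied to every taggable resource this provider creates.
  default_tags {
    tags = {
      managed_by = "terraform"
      env        = var.environment
    }
  }
}
```
Bootstrapping the chicken and egg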
There is always a bootstrap problem: the state bucket has to exist before Terraform can use a remote backend, and the account has to exist before you can create the bucket in it. We bootstrap the foundation state with a local backend, create the org and the central state bucket, then migrate that foundation state into the bucket it just created. It feels circular because it is. Do it once, document it precisely, and never improvise it again - a botched bootstrap is one of the few ways to genuinely lock yourself out.
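The two phases can be sketched roughly as follows. The bucket name, key, and region are hypothetical, and enabling versioning is an assumption on our part (it is a common safeguard for state buckets), not something the procedure strictly requires.

```hcl
# Phase 1: no backend block is declared, so Terraform defaults to the local
# backend and writes terraform.tfstate next to the code.
resource "aws_s3_bucket" "tfstate" {
  bucket = "example-org-tfstate" # hypothetical name
}

resource "aws_s3_bucket_versioning" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id

  versioning_configuration {
    status = "Enabled" # assumption: keep old state versions recoverable
  }
}

# Phase 2: after the first successful apply, uncomment the block below and
# run `terraform init -migrate-state` to copy the local state into the bucket.
#
# terraform {
#   backend "s3" {
#     bucket = "example-org-tfstate"
#     key    = "foundation/terraform.tfstate"
#     region = "eu-west-1"
#   }
# }
```

Once the migration succeeds, the foundation state lives in the same bucket it manages, which is exactly the circularity described above.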