Terraform Modules: Three Patterns That Survive Contact With Reality

Most Terraform module libraries I audit have one of three failure modes:

A single mega-module per service that takes 60 inputs.
A pile of tiny modules so granular the consumer rebuilds the AWS console in HCL.
A "platform module" that abstracts so heavily that adding any new field requires a platform team PR.

All three are bad. The fix is to make every module fit exactly one of three patterns and stay disciplined about it.

Pattern 1: Composition modules

A composition module wraps a cloud primitive thinly. It encodes a small set of opinionated defaults and exposes the rest of the underlying resource's surface as optional inputs.

Example: an S3 bucket module that:

Defaults to private, encrypted, versioned, with a TLS-only bucket policy.
Exposes every other property of aws_s3_bucket as a passthrough.
Returns the bucket and its ARN as outputs.

That is the entire module. 60 lines. Used everywhere.
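
A minimal sketch of what such a module might look like, assuming AWS provider v4 or later (where versioning, encryption, and public-access settings are separate resources from aws_s3_bucket). The variable names, output names, and the choice of SSE-S3 are my assumptions for illustration, and only two passthrough inputs are shown; a real module would expose more.

```hcl
# main.tf of a hypothetical modules/s3-bucket composition module.
# Assumes AWS provider v4+; names below are illustrative, not prescriptive.

variable "name" {
  type        = string
  description = "Bucket name."
}

variable "force_destroy" {
  type        = bool
  default     = false
  description = "Passthrough to aws_s3_bucket.force_destroy."
}

variable "tags" {
  type        = map(string)
  default     = {}
  description = "Passthrough to aws_s3_bucket.tags."
}

resource "aws_s3_bucket" "this" {
  bucket        = var.name
  force_destroy = var.force_destroy
  tags          = var.tags
}

# Default: private.
resource "aws_s3_bucket_public_access_block" "this" {
  bucket                  = aws_s3_bucket.this.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Default: versioned.
resource "aws_s3_bucket_versioning" "this" {
  bucket = aws_s3_bucket.this.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Default: encrypted at rest (SSE-S3 assumed here).
resource "aws_s3_bucket_server_side_encryption_configuration" "this" {
  bucket = aws_s3_bucket.this.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# Default: deny any request that does not use TLS.
data "aws_iam_policy_document" "tls_only" {
  statement {
    sid       = "DenyInsecureTransport"
    effect    = "Deny"
    actions   = ["s3:*"]
    resources = [aws_s3_bucket.this.arn, "${aws_s3_bucket.this.arn}/*"]

    principals {
      type        = "*"
      identifiers = ["*"]
    }

    condition {
      test     = "Bool"
      variable = "aws:SecureTransport"
      values   = ["false"]
    }
  }
}

resource "aws_s3_bucket_policy" "tls_only" {
  bucket = aws_s3_bucket.this.id
  policy = data.aws_iam_policy_document.tls_only.json
}

output "bucket" {
  value       = aws_s3_bucket.this.id
  description = "Bucket name."
}

output "arn" {
  value       = aws_s3_bucket.this.arn
  description = "Bucket ARN."
}
```

Note what is absent: no notifications, no replication, no lifecycle opinions. Everything in the file is either the bucket or one of the four defaults.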

The trap people fall into is adding a feature: "what if we also want to attach a Lambda notification". Now the module has a lambda_arn input, and a lambda_events input, and a count guard, and somewhere a sub-module. Stop. The composition module's job is to be a thin opinionated wrapper. Lambda notifications belong in the consumer's code, calling aws_s3_bucket_notification directly with the bucket output from your module.

When in doubt: the composition module owns the resource itself. It does not own anything that hangs off it.
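
As a rough illustration of that boundary, here is how a consumer might wire the notification itself. The module path, bucket name, the processor_lambda_arn variable, and the module's bucket and arn output names are all hypothetical.

```hcl
# Consumer code, not module code. All names here are illustrative.

module "uploads" {
  source = "../modules/s3-bucket" # hypothetical path to the composition module
  name   = "example-uploads"      # hypothetical bucket name
}

variable "processor_lambda_arn" {
  type        = string
  description = "ARN of an existing Lambda function (hypothetical input)."
}

# S3 needs permission to invoke the function before the notification is created.
resource "aws_lambda_permission" "allow_s3" {
  statement_id  = "AllowExecutionFromS3"
  action        = "lambda:InvokeFunction"
  function_name = var.processor_lambda_arn
  principal     = "s3.amazonaws.com"
  source_arn    = module.uploads.arn
}

# The thing that hangs off the bucket lives with the consumer.
resource "aws_s3_bucket_notification" "uploads" {
  bucket = module.uploads.bucket

  lambda_function {
    lambda_function_arn = var.processor_lambda_arn
    events              = ["s3:ObjectCreated:*"]
  }

  depends_on = [aws_lambda_permission.allow_s3]
}
```

The bucket module never learns that Lambda exists, which is the point.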

Pattern 2: Service modules

A service module composes multiple primitives into a deployable unit. "ECS service with ALB and Route53 and CloudWatch alarms." This is where the value of Terraform modules really sits, and also where most teams over-reach.

Two rules:

A service module has at most one "main" resource it owns. Everything else is supporting.
A service module is allowed to be opinionated. That is the whole point. Inputs should be product-shaped, not infrastructure-shaped. service_name, container_image, port, environment. Not alb_idle_timeout, target_group_deregistration_delay, route53_ttl.

If your service module has more than 15 inputs, you are encoding a generic platform, not a service. Split it.
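
A sketch of what a product-shaped call site could look like. The module path, image reference, and values are hypothetical; the four input names are the ones listed above.

```hcl
# Calling a hypothetical ECS web-service module with product-shaped inputs.
module "checkout" {
  source = "../modules/ecs-web-service" # hypothetical module path

  service_name    = "checkout"
  container_image = "registry.example.com/checkout:1.4.2" # hypothetical image
  port            = 8080
  environment     = "staging"

  # Deliberately absent: alb_idle_timeout, target_group_deregistration_delay,
  # route53_ttl. Those are decided inside the module.
}
```

If a consumer genuinely needs to tune one of the hidden knobs, that is a signal to discuss the default, not to add an input by reflex.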

Pattern 3: Account-bootstrap modules

An account-bootstrap module is the thing you run once per AWS account, GCP project, or Azure subscription to lay down the baseline. IAM roles for humans and CI, a logging bucket, baseline VPCs, default KMS keys.

The key property: it is run rarely, and changes to it are reviewed like database migrations. It is not the thing you call from every service.

The mistake I see is teams putting baseline stuff inside service modules, "in case the account doesn't have it yet". This creates implicit ordering, drift, and wonderful 3am bugs where two services race to create the same logging bucket. Bootstrap belongs in its own pipeline.
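
One way to keep that boundary honest, sketched under the assumption that the bootstrap pipeline publishes the logging bucket's name somehow: the service module looks the bucket up and never creates it. The variable and local names are hypothetical.

```hcl
# Inside a service module: consume the baseline, do not create it.

variable "log_bucket_name" {
  type        = string
  description = "Name of the logging bucket created by the account bootstrap."
}

# Read-only lookup. If the bucket does not exist, the plan fails loudly
# instead of a second bucket being created on the side.
data "aws_s3_bucket" "logs" {
  bucket = var.log_bucket_name
}

locals {
  # Hand this to whatever in the service writes logs.
  log_bucket_arn = data.aws_s3_bucket.logs.arn
}
```

A failed plan at 2pm is a much better signal than a race at 3am.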

The three rules that actually matter

Across all three patterns, three rules:

Pin everything

Module sources pinned to a tag. Provider versions pinned to a minor. Terraform CLI pinned in CI. The day you let any of these float, you have invited a Tuesday morning where nothing has changed in your code and everything is broken.
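
Roughly, the three pins look like this. The version numbers, Git URL, and module name are illustrative assumptions, not recommendations.

```hcl
terraform {
  # Constrains which CLI versions may run this code (1.9.x patches only here).
  # CI still has to install one exact binary; this only rejects the wrong ones.
  required_version = "~> 1.9.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.60.0" # pinned to a minor: accepts 5.60.x patches only
    }
  }
}

module "assets" {
  # Hypothetical repository URL; ?ref= pins the module to a tag.
  source = "git::https://example.com/infra/terraform-modules.git//s3-bucket?ref=v1.4.0"
  name   = "example-assets"
}
```

Committing the .terraform.lock.hcl file alongside this records the exact provider version and checksums that were selected.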

Outputs are an API

Module outputs are a contract. Removing one is a breaking change. Renaming one is a breaking change. Changing the shape of one is a breaking change. Version your modules. Use SemVer. Mean it.
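
A small sketch of a non-breaking rename, assuming a module that originally exposed bucket_arn and now wants arn. Both output names are hypothetical; the idea is to keep the old one alive until the next major version.

```hcl
variable "name" {
  type        = string
  description = "Bucket name."
}

resource "aws_s3_bucket" "this" {
  bucket = var.name
}

# New, preferred name. Adding an output is a minor-version change.
output "arn" {
  value       = aws_s3_bucket.this.arn
  description = "Bucket ARN."
}

# Old name kept as an alias. Removing it is the breaking change,
# so it waits for the next major version.
output "bucket_arn" {
  value       = aws_s3_bucket.this.arn
  description = "Deprecated: use arn. Scheduled for removal in the next major version."
}
```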

State is a database

Treat state files like databases. They have backups (versioning on the bucket). They have access control. They have audit logs. They are not edited by hand. terraform state rm is a destructive operation and is treated as one.
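
A minimal sketch of what that can look like in configuration. The bucket name, key, and region are hypothetical, and the bucket itself (with versioning, access policy, and logging) is assumed to be created by the bootstrap pipeline. The removed block assumes Terraform 1.7 or later and is one reviewable alternative to running terraform state rm by hand; aws_s3_bucket.legacy is a made-up address.

```hcl
terraform {
  backend "s3" {
    bucket  = "example-terraform-state"             # hypothetical, versioned bucket
    key     = "services/checkout/terraform.tfstate" # hypothetical state path
    region  = "eu-west-1"
    encrypt = true
  }
}

# Terraform 1.7+: stop managing a resource without destroying it.
# This goes through code review like any other change, unlike an ad hoc CLI call.
removed {
  from = aws_s3_bucket.legacy

  lifecycle {
    destroy = false
  }
}
```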

Where the patterns break down

Honest disclosure: at very large scale, the three-pattern model leaks. Once you have hundreds of teams writing thousands of services, you end up wanting some kind of higher-level abstraction. The market answer in late 2024 is some flavour of internal developer platform: Backstage, Crossplane, Humanitec, Port. They put a portal or control plane in front of the infrastructure code, templating or driving Terraform in many setups (or, in Crossplane's case, replacing it with Kubernetes-native resources), instead of asking humans to write it.

That is fine and probably correct at that scale. But for the 95% of organisations that are not running thousands of services, the three-pattern module library is the right answer, and the worst thing you can do is jump to a heavier abstraction prematurely.

What I actually ship

For most clients, the module library I leave behind is roughly:

10-15 composition modules for the core primitives in their cloud.
5-8 service modules for the deploy archetypes (web service, worker, scheduled job, queue consumer).
1 account-bootstrap module per account class.
One README per module, with examples and a note on which of the three patterns the module follows.

That fits in one repo, one CI pipeline, and one engineer's head. Anything bigger and you are doing a platform team's job, not maintaining a module library.