Nobody opens the bill looking for data transfer. It is buried under EC2, it has a cryptic line-item name, and each gigabyte costs almost nothing. Then you add it up across a chatty microservices estate and it is 12% of the total. We have found six-figure annual egress hiding behind 'it's only two cents a gig'.
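To put a scale on that, here is a back-of-envelope sketch of how much traffic it takes for two cents a gigabyte to become a six-figure line. The only rate used is the two cents quoted above; the $100,000 target and the decimal unit conversion are our assumptions for illustration.

```python
RATE_PER_GB = 0.02                  # "two cents a gig", as quoted above
ANNUAL_BILL = 100_000               # assumption: the smallest six-figure bill, in USD
SECONDS_PER_YEAR = 365 * 24 * 3600

# Volume needed to reach the bill at the quoted rate.
gb_per_year = ANNUAL_BILL / RATE_PER_GB

# The same volume as a sustained rate (decimal units: 1 GB = 1000 MB).
mb_per_second = gb_per_year * 1000 / SECONDS_PER_YEAR

print(f"{gb_per_year:,.0f} GB per year")
print(f"{mb_per_second:,.0f} MB/s sustained")
```

That is about 5 PB a year, or roughly 160 MB/s sustained across the whole estate - a rate a handful of busy services can reach between them without anyone noticing.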
Know the three prices
The fix starts with knowing which transfer you are paying for, because they are priced wildly differently. Internet egress is the expensive headline. Cross-region sits in the middle. And inter-AZ traffic - data moving between availability zones inside the same region - is the sneaky one, charged in both directions at roughly 1 cent per GB each way, so about 2 cents for every GB that crosses a zone boundary.
- Internet egress - the dearest, and where CDNs earn their keep
- Cross-region replication - often necessary, occasionally accidental
- Inter-AZ chatter - the one that compounds quietly inside a busy cluster
- NAT Gateway processing - not a transfer price in its own right, but a per-GB charge on top of everything routed through it
Inter-AZ is usually self-inflicted
A service in zone A calling a database in zone B pays inter-AZ both ways, on every request, forever. In Kubernetes this is the default unless you do something about it - the scheduler spreads pods across zones for availability, Services load-balance across all of them regardless of zone, and together they accidentally maximise cross-zone traffic. Topology-aware routing and keeping hot service-to-service paths zone-local can cut this traffic by half or more without touching availability.
Inter-AZ traffic is charged twice and noticed never. It is the purest form of money leaking through a default setting.
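A rough way to size the problem, sketched in Python. It assumes callers pick a target uniformly at random across evenly populated zones, a rate of 1 cent per GB in each direction, and a hypothetical service doing 2,000 requests per second at 50 KB per round trip. The 10% cross-zone share for the zone-local case is also an assumption, not a measured result.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600

def cross_zone_fraction(zones):
    """Share of calls leaving the caller's zone if targets are picked uniformly across zones."""
    return (zones - 1) / zones

def monthly_inter_az_cost(requests_per_second, bytes_per_round_trip,
                          cross_fraction, rate_per_gb_each_way=0.01):
    """Estimated monthly inter-AZ cost in USD for one service-to-service path."""
    gb_moved = requests_per_second * bytes_per_round_trip * SECONDS_PER_MONTH / 1e9
    cross_zone_gb = gb_moved * cross_fraction
    # Each cross-zone GB is billed on the sending side and again on the receiving side.
    return cross_zone_gb * rate_per_gb_each_way * 2

# Hypothetical path: 2,000 req/s, 50 KB per request plus response, three zones.
spread = monthly_inter_az_cost(2000, 50_000, cross_zone_fraction(3))
# Assumption: zone-local routing leaves 10% of calls crossing zones.
zone_local = monthly_inter_az_cost(2000, 50_000, 0.10)

print(f"spread across zones: ${spread:,.0f} per month")
print(f"mostly zone-local:   ${zone_local:,.0f} per month")
```

With three zones and uniform routing, two thirds of calls cross a zone boundary, which is why the default is so expensive. Swap in your own request rate, payload size and rates before drawing conclusions.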
VPC endpoints and the NAT tax
Traffic from private subnets to S3 or DynamoDB through a NAT Gateway pays NAT processing per GB on top of the data transfer. Gateway VPC endpoints for S3 and DynamoDB are free and route that traffic off the NAT entirely. For other AWS APIs, interface endpoints have an hourly cost but still beat NAT processing once volume is real. This single change has paid for itself in a week on busy accounts.
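The break-even for an interface endpoint is easy to sketch. The rates below are illustrative assumptions (4.5 cents per GB NAT processing, 1 cent per hour per zone and 1 cent per GB for the endpoint), so check them against your own region's pricing. The NAT Gateway's own hourly charge is left out, on the assumption that the gateway stays in place for other traffic.

```python
HOURS_PER_MONTH = 730

def nat_processing_cost(gb, nat_per_gb=0.045):
    """Monthly NAT processing charge for the given volume (assumed rate)."""
    return gb * nat_per_gb

def interface_endpoint_cost(gb, zones=3, hourly_per_zone=0.01, per_gb=0.01):
    """Monthly cost of one interface endpoint deployed in `zones` zones (assumed rates)."""
    return zones * hourly_per_zone * HOURS_PER_MONTH + gb * per_gb

def break_even_gb(zones=3, hourly_per_zone=0.01, per_gb=0.01, nat_per_gb=0.045):
    """Monthly volume above which the endpoint is cheaper than NAT processing."""
    fixed = zones * hourly_per_zone * HOURS_PER_MONTH
    return fixed / (nat_per_gb - per_gb)

threshold = break_even_gb()
print(f"break-even at about {threshold:,.0f} GB per month")

# A gateway endpoint (S3, DynamoDB) has no charge, so the saving is the whole NAT fee.
print(f"10 TB to S3 via NAT: ${nat_processing_cost(10_000):,.2f} per month avoided")
```

Under these assumptions the endpoint wins past a few hundred GB a month per service, which a busy account clears quickly.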
-- aws cur ... -- find the offenders:
SELECT line_item_usage_type, SUM(line_item_unblended_cost) AS cost
FROM cur
WHERE line_item_usage_type LIKE '%DataTransfer%'
OR line_item_usage_type LIKE '%Bytes%'
GROUP BY 1 ORDER BY 2 DESC;
Make it visible, then it gets fixed
The reason data transfer grows is that it is invisible to the teams generating it. We pull the DataTransfer usage types out of the CUR into their own dashboard, attribute them to services, and suddenly the team responsible for a chatty replication job can see their line. Visibility alone usually shaves the worst 20-30% before any architectural work begins.
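A minimal sketch of that attribution step, in Python. It assumes the CUR has been exported to CSV and that services are identified by a cost-allocation tag exposed as a column named resource_tags_user_service. That column name and every row in the sample are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; a real export has many more columns.
SAMPLE_CUR = """line_item_usage_type,resource_tags_user_service,line_item_unblended_cost
USE1-DataTransfer-Regional-Bytes,checkout,412.50
USE1-NatGateway-Bytes,checkout,96.25
USE1-DataTransfer-Regional-Bytes,replication-job,1830.00
USE1-USW2-AWS-Out-Bytes,replication-job,640.75
USE1-DataTransfer-Out-Bytes,,75.00
USE1-BoxUsage:m5.large,checkout,2200.00
"""

def transfer_cost_by_service(cur_csv):
    """Sum data-transfer cost per service tag, keeping untagged spend visible."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(cur_csv)):
        usage_type = row["line_item_usage_type"]
        # Same filter as the query above: DataTransfer or Bytes usage types only.
        if "DataTransfer" not in usage_type and "Bytes" not in usage_type:
            continue
        service = row["resource_tags_user_service"] or "(untagged)"
        totals[service] += float(row["line_item_unblended_cost"])
    return dict(totals)

by_service = transfer_cost_by_service(SAMPLE_CUR)
for service, cost in sorted(by_service.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{service:16s} ${cost:,.2f}")
```

Keeping an explicit "(untagged)" bucket matters: transfer nobody owns is usually the part that keeps growing.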