After managing AWS infrastructure for dozens of clients, we've seen the same cost mistakes over and over. Some are quick wins you can fix this afternoon. Others require architectural changes. All of them are costing you more than they should.
The frustrating thing is that AWS gives you all the tools to spot these issues. Most teams just never look, or don't know what to look for.
1. Running Dev/Staging Environments 24/7
This is the most common waste we see, and it's the easiest to fix.
A typical setup has three environments: dev, staging, and production. That means you're paying 3x compute costs. But dev and staging are idle roughly 75% of the time — evenings, weekends, holidays, and all those hours nobody is actually deploying or testing anything.
The fix is scheduled scaling. Use EventBridge rules and a simple Lambda function to shut down non-production environments outside business hours. It takes an afternoon to set up and pays for itself within the first week.
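As a rough sketch of the Lambda side, assuming Python and boto3 (bundled with the Lambda Python runtime), the handler below decides from the clock whether non-production should be up, then starts or stops RDS instances to match. The business hours (08:00 to 18:00 UTC, Monday to Friday), the `NON_PROD_DB_IDS` list, and the function names are illustrative assumptions, not anything AWS defines.

```python
from datetime import datetime, timezone

# Hypothetical identifiers; replace with your own non-production instances.
NON_PROD_DB_IDS = ["dev-db", "staging-db"]

# Assumed business hours: 08:00-18:00 UTC, Monday to Friday.
OPEN_HOUR, CLOSE_HOUR = 8, 18


def should_be_running(now: datetime) -> bool:
    """Return True if `now` falls inside the assumed business hours."""
    is_weekday = now.weekday() < 5  # Monday=0 ... Friday=4
    return is_weekday and OPEN_HOUR <= now.hour < CLOSE_HOUR


def handler(event, context):
    """Lambda entry point, intended to be invoked by an EventBridge schedule."""
    import boto3  # imported here so the scheduling logic is testable without AWS

    rds = boto3.client("rds")
    want_running = should_be_running(datetime.now(timezone.utc))
    for db_id in NON_PROD_DB_IDS:
        described = rds.describe_db_instances(DBInstanceIdentifier=db_id)
        status = described["DBInstances"][0]["DBInstanceStatus"]
        if want_running and status == "stopped":
            rds.start_db_instance(DBInstanceIdentifier=db_id)
        elif not want_running and status == "available":
            rds.stop_db_instance(DBInstanceIdentifier=db_id)
```

Triggering this from an hourly EventBridge rule, rather than only at opening and closing time, means an instance that has been started outside business hours gets stopped again on the next run.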
RDS instances are the biggest culprit here. A db.r6g.xlarge running 24/7 costs roughly £600/month. Running it only during business hours (say 10 hours a day, 5 days a week) brings that down to about £180/month. Per instance. Per environment.
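The arithmetic behind those figures can be reproduced in a few lines. This is only a sketch: it assumes instance cost scales linearly with running hours, and it ignores storage, which is still billed while an instance is stopped. The function name is our own.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def scheduled_monthly_cost(always_on_cost: float, hours_per_day: float,
                           days_per_week: int) -> float:
    """Scale an always-on monthly compute cost by the fraction of hours used."""
    running_fraction = (hours_per_day * days_per_week) / HOURS_PER_WEEK
    return always_on_cost * running_fraction


cost = scheduled_monthly_cost(600, hours_per_day=10, days_per_week=5)
print(f"£{cost:.0f}/month")  # prints "£179/month"
```

Fifty hours out of 168 is just under 30% of the week, which is where the "about £180" comes from.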
Important: Don't delete and recreate RDS instances to save money — stop them. But be aware that stopped RDS instances auto-restart after 7 days. You need to automate the stop on a schedule, or AWS will quietly start billing you again.
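One way to catch instances that AWS has restarted is a periodic sweep that re-stops anything tagged as non-production. The sketch below assumes your instances carry an `Environment` tag with values `dev` or `staging` (our convention, not an AWS one) and works on the `DBInstances` list returned by the RDS `describe_db_instances` call.

```python
NON_PROD = {"dev", "staging"}  # assumed Environment tag values


def instances_to_stop(db_instances: list[dict]) -> list[str]:
    """Pick running instances whose Environment tag marks them as non-production.

    `db_instances` is the "DBInstances" list from RDS describe_db_instances.
    """
    to_stop = []
    for db in db_instances:
        tags = {t["Key"]: t["Value"] for t in db.get("TagList", [])}
        is_running = db["DBInstanceStatus"] == "available"
        if is_running and tags.get("Environment") in NON_PROD:
            to_stop.append(db["DBInstanceIdentifier"])
    return to_stop
```

Each identifier returned can then be passed to `stop_db_instance`. Only run the sweep outside business hours, otherwise it will stop databases people are using.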
EKS node groups are another big one. Scale to zero nodes outside business hours, or use Karpenter to automatically provision and deprovision nodes based on actual demand. If no pods are running, there are no nodes to pay for.
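For managed node groups, scaling to zero comes down to one API call. The helper below is a sketch that only builds the arguments for the EKS `update_nodegroup_config` call; the cluster and node group names in the usage comment are hypothetical.

```python
def scale_to_zero_request(cluster: str, nodegroup: str, max_size: int) -> dict:
    """Build keyword arguments for EKS update_nodegroup_config.

    minSize and desiredSize go to 0; maxSize is kept, floored at 1
    because EKS does not accept a maximum below 1.
    """
    return {
        "clusterName": cluster,
        "nodegroupName": nodegroup,
        "scalingConfig": {
            "minSize": 0,
            "maxSize": max(1, max_size),
            "desiredSize": 0,
        },
    }


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   eks = boto3.client("eks")
#   eks.update_nodegroup_config(**scale_to_zero_request("dev-cluster", "default", 3))
```

The reverse call in the morning restores `minSize` and `desiredSize` to their daytime values.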
In real numbers: a typical 3-environment setup wastes £1,500–3,000/month on idle non-production resources. That's £18,000–36,000 a year, doing nothing.
2. Ignoring Savings Plans and Reserved Instances
On-demand pricing is the most expensive way to use AWS. It's the sticker price. Nobody should be paying it for stable, predictable workloads.
If you've been running the same workloads for 3+ months and your usage is relatively consistent, you should be looking at Savings Plans. Specifically, Compute Savings Plans — they're the most flexible option because they apply across EC2, Fargate, and Lambda. You're not locked into a specific instance family or region.
The commitment feels scary, but the maths is straightforward:
- 1-year, no-upfront Compute Savings Plan: ~30% discount
- 3-year, no-upfront: ~50% discount
You don't need to guess. AWS Cost Explorer has a Savings Plans recommendations page that analyses your actual usage over the past 7, 30, or 60 days and suggests the right commitment level. Use it.
Start small. Commit to covering your baseline usage — the minimum you always run, even on quiet days. Buy additional coverage as you get comfortable with how the commitment model works. You can always add more later.
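To make "baseline" concrete, here is a minimal sketch that sizes a commitment from a sample of hourly on-demand spend. It assumes the quietest hour in the sample is representative and that a single flat discount applies. Both are simplifications, and the function name is ours. Cost Explorer's recommendation remains the better source for a real purchase.

```python
HOURS_PER_MONTH = 730  # average hours in a month


def baseline_commitment(hourly_on_demand_spend: list[float], discount: float) -> dict:
    """Size a Savings Plan to the quietest hour in the sample.

    `discount` is a fraction, e.g. 0.30 for a roughly 30% discount.
    """
    baseline = min(hourly_on_demand_spend)   # on-demand spend that is always in use
    commitment = baseline * (1 - discount)   # hourly amount committed at plan rates
    monthly_saving = baseline * discount * HOURS_PER_MONTH
    return {
        "hourly_commitment": round(commitment, 2),
        "monthly_saving": round(monthly_saving, 2),
    }


print(baseline_commitment([4.0, 5.5, 9.0, 4.2], discount=0.30))
# prints "{'hourly_commitment': 2.8, 'monthly_saving': 876.0}"
```

Anything above the baseline keeps running at on-demand rates, which is why starting with the minimum carries little risk.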
The same logic applies to RDS Reserved Instances. If you have databases that run continuously (and most production databases do), Reserved Instances save 30–50% compared to on-demand pricing.
Real impact: a company spending £10K/month on compute typically saves £3–4K/month with appropriate Savings Plans. That's £36–48K/year for filling in a form.
3. Oversized Instances (The "We Might Need It" Tax)
Most EC2 and RDS instances we audit are 2–4x larger than they need to be. Teams provision based on peak load estimates that came from a whiteboard session, then never go back and right-size once they have real data.
The fix is simple: use AWS Compute Optimizer. It analyses 14 days of CloudWatch metrics and recommends the right instance size based on actual utilisation. It's free. There's no reason not to have it enabled.
You can check recommendations right now from the CLI:
aws compute-optimizer get-ec2-instance-recommendations \
    --filters "name=Finding,values=Overprovisioned" \
    --output table
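If you would rather post-process the results than read a table, the same data is available through boto3. The sketch below assumes the response shape of the `compute-optimizer` client's `get_ec2_instance_recommendations` call and pulls out the top-ranked suggestion for each over-provisioned instance. The function name is ours.

```python
def over_provisioned(response: dict) -> list[tuple[str, str, str]]:
    """Return (instance ARN, current type, top recommended type) tuples.

    `response` is assumed to have the shape returned by boto3's
    compute-optimizer get_ec2_instance_recommendations call.
    """
    rows = []
    for rec in response.get("instanceRecommendations", []):
        # Normalise so both "Overprovisioned" and "OVER_PROVISIONED" match.
        finding = rec.get("finding", "").replace("_", "").lower()
        options = sorted(rec.get("recommendationOptions", []),
                         key=lambda option: option.get("rank", 0))
        if finding == "overprovisioned" and options:
            rows.append((rec["instanceArn"], rec["currentInstanceType"],
                         options[0]["instanceType"]))
    return rows


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("compute-optimizer")
#   print(over_provisioned(client.get_ec2_instance_recommendations()))
```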
RDS is consistently the worst offender. We regularly see db.r6g.2xlarge instances running at 15% CPU utilisation. That should be a db.r6g.large — same family, a quarter of the cost. RDS makes it easy to change instance size with a few minutes of downtime during a maintenance window.
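The reasoning behind "2xlarge at 15% should be a large" can be written down as a rule of thumb. This sketch assumes each size step halves the vCPU count and therefore doubles CPU utilisation, and it uses a 70% target ceiling that is our assumption. It ignores memory, which also halves at each step and needs checking separately.

```python
SIZES = ["large", "xlarge", "2xlarge", "4xlarge", "8xlarge"]


def right_size(current: str, peak_cpu_pct: float, target_cpu_pct: float = 70.0) -> str:
    """Step down one size at a time while projected peak CPU stays under target.

    Assumes each step down halves vCPUs and so doubles CPU utilisation.
    """
    index = SIZES.index(current)
    projected = peak_cpu_pct
    while index > 0 and projected * 2 <= target_cpu_pct:
        projected *= 2
        index -= 1
    return SIZES[index]


print(right_size("2xlarge", 15))  # prints "large"
```

At 15% on a 2xlarge, one step down projects to 30% and a second to 60%, so the rule stops at large. Each step also halves the hourly price, hence a quarter of the cost.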
While you're at it, look at Graviton (ARM) instances. They typically deliver the same or better performance at up to roughly 20% lower cost, depending on instance family and region. If your application runs on Linux and doesn't depend on x86-specific binaries, switching from m6i to m7g (or r6i to r7g) is essentially free money.
Don't forget EBS volumes either. We see gp2 volumes provisioned at 100GB "just in case" when 20GB would be plenty. And if you're still on gp2 at all, switch to gp3. It's cheaper with better baseline performance (3,000 IOPS and 125 MB/s included). There's almost never a reason to use gp2 anymore.
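To put numbers on the EBS point, here is a small storage-only comparison. The per-GB prices are the us-east-1 list prices as we understand them and should be treated as assumptions. Check the pricing page for your region before relying on them.

```python
# Assumed us-east-1 list prices in USD per GB-month; check your region.
PRICE_PER_GB = {"gp2": 0.10, "gp3": 0.08}


def monthly_storage_cost(volume_type: str, size_gb: int) -> float:
    """Storage-only monthly cost; extra provisioned IOPS/throughput not included."""
    return PRICE_PER_GB[volume_type] * size_gb


before = monthly_storage_cost("gp2", 100)
after = monthly_storage_cost("gp3", 20)
print(f"${before:.2f} -> ${after:.2f} per volume per month")
# prints "$10.00 -> $1.60 per volume per month"
```

The two changes differ in effort. Converting gp2 to gp3 is an in-place volume modification, whereas EBS volumes can be grown but not shrunk, so going from 100GB to 20GB means creating a smaller volume and copying the data across.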
4. Data Transfer Costs Nobody Budgeted For
This is the hidden cost that surprises everyone. Data transfer rarely shows up in architecture diagrams, but it shows up on your bill.
Inter-AZ data transfer costs $0.01/GB in each direction. That sounds trivial until you have microservices chatting across availability zones at high volume. A service making thousands of API calls to another service in a different AZ can rack up a surprisingly large bill.
NAT Gateway data processing costs $0.045/GB. This is the one that really catches people. If you have services in private subnets pulling data from S3 or DynamoDB through a NAT Gateway because nobody set up VPC endpoints, you're paying nearly 5 cents per gigabyte for traffic that should be free.
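A quick estimate shows how these per-gigabyte rates add up. The sketch uses the two rates quoted above and a hypothetical 10 TB of monthly traffic. It leaves out the NAT Gateway's hourly charge and any internet egress fees.

```python
INTER_AZ_PER_GB = 0.01         # USD, charged in each direction
NAT_PROCESSING_PER_GB = 0.045  # USD


def inter_az_cost(gb: float) -> float:
    """Both directions are charged, so each GB moved costs 2 x $0.01."""
    return gb * INTER_AZ_PER_GB * 2


def nat_processing_cost(gb: float) -> float:
    """Data processing charge only; the hourly gateway fee is extra."""
    return gb * NAT_PROCESSING_PER_GB


print(f"${inter_az_cost(10_000):.2f}")        # prints "$200.00"
print(f"${nat_processing_cost(10_000):.2f}")  # prints "$450.00"
```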
The fixes:
- S3 and DynamoDB VPC Gateway Endpoints are free. Completely free. If you don't have them configured, stop reading this and set them up now. It takes 5 minutes in the console or a few lines of Terraform.
- Co-locate chatty services in the same availability zone where possible, or use VPC endpoints for AWS service calls to avoid inter-AZ charges.
- CloudFront can actually be cheaper than direct S3 access for frequently-requested content, even ignoring the latency benefit. Data transfer from CloudFront to the internet is cheaper than from S3 directly.
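For teams not using Terraform, the gateway endpoints can also be created with boto3. The helper below is a sketch that builds the arguments for the EC2 `create_vpc_endpoint` call for both services. The region, VPC ID, and route table ID in the usage comment are placeholders.

```python
def gateway_endpoint_requests(region: str, vpc_id: str,
                              route_table_ids: list[str]) -> list[dict]:
    """Build keyword arguments for EC2 create_vpc_endpoint, one per service."""
    return [
        {
            "VpcEndpointType": "Gateway",
            "VpcId": vpc_id,
            "ServiceName": f"com.amazonaws.{region}.{service}",
            "RouteTableIds": route_table_ids,
        }
        for service in ("s3", "dynamodb")
    ]


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="eu-west-2")
#   for kwargs in gateway_endpoint_requests("eu-west-2", "vpc-0abc", ["rtb-0def"]):
#       ec2.create_vpc_endpoint(**kwargs)
```

Pass the route tables used by your private subnets. The endpoint adds a route for the service to each table, so that traffic stops going through the NAT Gateway.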
Real example: we've seen NAT Gateway bills of £500+/month that dropped to near-zero simply by adding S3 and DynamoDB VPC endpoints. That's a 10-minute fix for £6,000/year in savings.
5. No Tagging Strategy = No Cost Visibility
You can't optimise what you can't measure. Without consistent tags, your AWS bill is a single number. You can't attribute costs to teams, projects, or environments. You can't answer "why did our bill go up £2K last month?" because you don't know which part of your infrastructure caused it.
At minimum, tag every resource with:
- Environment — prod, staging, dev
- Project — which product or service this belongs to
- Owner — which team is responsible
- CostCentre — for finance to allocate spend
Use AWS Organizations tag policies to standardise tag keys and values, and SCPs to block creation of resources that lack mandatory tags. Then activate Cost Allocation Tags in the billing console so you can break down costs by tag in Cost Explorer.
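For backfilling existing resources, a small compliance check is often enough to get started. This sketch takes a resource's tags as a plain dictionary and reports which of the four tags listed above are missing or empty. How you collect the tags (for example from the Resource Groups Tagging API) is left out.

```python
REQUIRED_TAGS = ("Environment", "Project", "Owner", "CostCentre")


def missing_tags(tags: dict[str, str], required=REQUIRED_TAGS) -> list[str]:
    """Return the required tag keys that are absent or empty on a resource."""
    return [key for key in required if not tags.get(key, "").strip()]


print(missing_tags({"Environment": "dev", "Project": "checkout"}))
# prints "['Owner', 'CostCentre']"
```

Passing a shorter `required` tuple lets you start with just `Environment` and `Project` and tighten the check later.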
Set up AWS Budgets with alerts. You should know within 24 hours if spending spikes unexpectedly. A simple budget alert has saved our clients from runaway costs more times than we can count.
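A budget with an email alert can be created in the console or through the API. The sketch below builds the arguments for the AWS Budgets `create_budget` call. The budget name, the 80% threshold, and the account ID and email address in the usage comment are illustrative assumptions.

```python
def monthly_budget_request(account_id: str, limit_usd: int, email: str,
                           threshold_pct: float = 80.0) -> dict:
    """Build keyword arguments for the AWS Budgets create_budget call."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": "monthly-total",  # hypothetical name
            "BudgetLimit": {"Amount": str(limit_usd), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [{
            "Notification": {
                "NotificationType": "ACTUAL",  # alert on actual, not forecast, spend
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": threshold_pct,
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
        }],
    }


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("budgets").create_budget(
#       **monthly_budget_request("111111111111", 5000, "ops@example.com"))
```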
Be pragmatic. Perfect tagging on day one is unrealistic. Start with Environment and Project tags, enforce them on new resources, and backfill existing resources over time. Done is better than perfect here.
Bonus: The Quick Checklist
These take 30 minutes to check and can save hundreds per month:
- Check for unattached EBS volumes: they cost money while doing absolutely nothing
- Review old snapshots and AMIs — set lifecycle policies to auto-delete after a retention period
- Check for idle load balancers with no healthy targets
- Enable S3 Intelligent-Tiering for buckets with unpredictable access patterns
- Review CloudWatch log retention — the default is "never expire," and log storage adds up quietly
- Check for unused Elastic IPs: about $3.65/month each (at $0.005/hour), which adds up when you have a dozen sitting around from old projects
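Several of these checks are easy to script. The helpers below are a sketch that filters the lists returned by the EC2 `describe_volumes` and `describe_addresses` calls and the CloudWatch Logs `describe_log_groups` call. The function names are ours, and pagination is left to the caller.

```python
def unattached_volumes(volumes: list[dict]) -> list[str]:
    """Volume IDs in the "available" state, i.e. not attached to any instance.

    `volumes` is the "Volumes" list from EC2 describe_volumes.
    """
    return [v["VolumeId"] for v in volumes if v.get("State") == "available"]


def unassociated_addresses(addresses: list[dict]) -> list[str]:
    """Allocation IDs of Elastic IPs with no association.

    `addresses` is the "Addresses" list from EC2 describe_addresses.
    """
    return [a["AllocationId"] for a in addresses if "AssociationId" not in a]


def log_groups_without_retention(log_groups: list[dict]) -> list[str]:
    """Names of log groups that never expire (no "retentionInDays" key).

    `log_groups` is the "logGroups" list from CloudWatch Logs describe_log_groups.
    """
    return [g["logGroupName"] for g in log_groups if "retentionInDays" not in g]
```

Treat the output as a review list, not a delete list. An unattached volume may be the only copy of something somebody still needs.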
Most companies we work with save 25–40% on their AWS bill within the first month of a proper review. The changes aren't complex — they just need someone to look.