Right-Sizing Strategies — Instance Right-Sizing, Compute Optimizer, Auto-Scaling, Workload Profiling
Right-sizing is the practice of matching instance types and sizes to actual workload requirements. Most organizations over-provision by 30-50%, wasting millions annually. This guide covers tools and strategies for right-sizing across AWS, Azure, and GCP.
What You’ll Learn
You’ll use AWS Compute Optimizer and Azure Advisor for recommendations, profile workloads to determine optimal instance families, configure auto-scaling policies that match demand, and implement governance to prevent future over-provisioning.
Why Right-Sizing Matters
Every over-provisioned instance is money burned. A t3.medium running at 10% utilization could be replaced by a t3.nano at roughly 1/8th the cost, provided the workload also fits in the nano’s 0.5 GiB of memory. Right-sizing is the highest-ROI cost optimization activity: it needs no architectural changes, is low risk (easy to revert), and immediately reduces the monthly bill.
Learning Path
flowchart LR
A[Anomaly Detection] --> B[Right-Sizing<br/>You are here]
B --> C[Auto-Scaling]
C --> D[Reserved Instances]
style B fill:#f90,color:#fff
Workload Profiling
Before right-sizing, understand your workload patterns:
CPU-Bound vs Memory-Bound vs I/O-Bound
| Workload Type | Characteristic | Right-Sizing Strategy |
|---|---|---|
| CPU-bound | High CPU, low memory | Compute-optimized (c-series) |
| Memory-bound | High RAM, moderate CPU | Memory-optimized (r-series, x-series) |
| I/O-bound | High disk/network I/O | Storage-optimized (i-series, d-series) |
| Burstable | Low average, periodic spikes | Burstable (t-series) or auto-scaling |
| Steady-state | Consistent utilization | Fixed instance (m-series) |
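As a rough illustration of how the table above could drive a first-pass classification, the sketch below maps utilization figures to a workload type. The names `Utilization`, `classify_workload`, and `STRATEGY`, along with every threshold, are assumptions chosen for the example; they do not come from any provider's tooling and would need tuning against real workloads.

```python
from dataclasses import dataclass


@dataclass
class Utilization:
    """Utilization over the profiling window, all values in percent."""
    avg_cpu: float
    peak_cpu: float
    avg_memory: float
    avg_io: float  # share of the instance's disk/network throughput limit


# Right-sizing strategy for each workload type, copied from the table
STRATEGY = {
    "CPU-bound": "Compute-optimized (c-series)",
    "Memory-bound": "Memory-optimized (r-series, x-series)",
    "I/O-bound": "Storage-optimized (i-series, d-series)",
    "Burstable": "Burstable (t-series) or auto-scaling",
    "Steady-state": "Fixed instance (m-series)",
}


def classify_workload(u: Utilization) -> str:
    """Pick a workload type; all thresholds here are illustrative assumptions."""
    # Low average with spikes several times higher suggests a burstable profile
    if u.avg_cpu < 20 and u.peak_cpu > 3 * max(u.avg_cpu, 1):
        return "Burstable"
    if u.avg_io >= 60:
        return "I/O-bound"
    if u.avg_memory >= 60 and u.avg_cpu < 60:
        return "Memory-bound"
    if u.avg_cpu >= 60 and u.avg_memory < 40:
        return "CPU-bound"
    return "Steady-state"


sample = Utilization(avg_cpu=8, peak_cpu=70, avg_memory=30, avg_io=5)
kind = classify_workload(sample)
print(kind, "->", STRATEGY[kind])
```

The checks run in a fixed order, so a workload that matches several rows is assigned to the first one that applies; treat the result as a starting point for investigation rather than a final answer.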
Profiling with CloudWatch
import boto3
import pandas as pd
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client('cloudwatch')

def profile_instance(instance_id, days=30):
    """Get CPU, network, and disk utilization for an instance"""
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    metrics = {}
    for metric_name, namespace in [
        ('CPUUtilization', 'AWS/EC2'),
        ('NetworkIn', 'AWS/EC2'),
        ('NetworkOut', 'AWS/EC2'),
        ('DiskReadBytes', 'AWS/EC2'),
        ('DiskWriteBytes', 'AWS/EC2')
    ]:
        response = cloudwatch.get_metric_statistics(
            Namespace=namespace,
            MetricName=metric_name,
            Dimensions=[{'Name': 'InstanceId', 'Value': instance_id}],
            StartTime=start,
            EndTime=end,
            Period=3600,  # 1-hour periods
            # Percentiles are not valid Statistics values (they need a separate
            # ExtendedStatistics call), so P95 is computed from the averages below
            Statistics=['Average', 'Maximum']
        )
        # One DataFrame per metric with a row per hourly datapoint
        metrics[metric_name] = pd.DataFrame(
            response['Datapoints'], columns=['Timestamp', 'Average', 'Maximum']
        )
    # Analysis
    cpu = metrics['CPUUtilization']
    print(f"Instance: {instance_id}")
    if cpu.empty:
        print(f"  No CPU datapoints in the last {days} days")
        return metrics
    print(f"  Avg CPU: {cpu['Average'].mean():.1f}%")
    print(f"  Peak CPU: {cpu['Maximum'].max():.1f}%")
    print(f"  P95 CPU (hourly averages): {cpu['Average'].quantile(0.95):.1f}%")
    print(f"  Recommendation: {recommend_instance(metrics)}")
    return metrics

def recommend_instance(metrics):
    """Recommend instance family based on utilization patterns"""
    avg_cpu = metrics['CPUUtilization']['Average'].mean()
    peak_cpu = metrics['CPUUtilization']['Maximum'].max()
    if avg_cpu < 10 and peak_cpu < 30:
        return "Downsize or use burstable (t-series)"
    elif avg_cpu < 30 and peak_cpu < 60:
        return "Consider smaller size or t-series"
    elif avg_cpu > 80:
        return "Consider scaling out or compute-optimized"
    else:
        return "Current size likely appropriate"
AWS Compute Optimizer
AWS Compute Optimizer analyzes CloudWatch metrics and recommends optimal instance types:
# Enable Compute Optimizer
aws compute-optimizer update-enrollment-status --status Active
# Get EC2 recommendations
aws compute-optimizer get-ec2-instance-recommendations \
  --instance-arns "arn:aws:ec2:us-east-1:ACCOUNT:instance/i-12345"
# Get auto-scaling group recommendations
aws compute-optimizer get-auto-scaling-group-recommendations
# Get Lambda function recommendations
aws compute-optimizer get-lambda-function-recommendations
# Export to CSV
aws compute-optimizer export-ec2-instance-recommendations \
  --s3-destination-config bucket=my-bucket,keyPrefix=compute-optimizer/
Expected recommendation output:
{
  "instanceRecommendations": [
    {
      "instanceArn": "arn:aws:ec2:us-east-1:ACCOUNT:instance/i-12345",
      "currentInstanceType": "m5.xlarge",
      "recommendationOptions": [
        {
          "instanceType": "m5.large",
          "performanceRisk": 1,
          "projectedUtilizationMetrics": [
            {"name": "Cpu", "statistic": "Maximum", "value": 85.0}
          ],
          "rank": 1,
          "savingsOpportunity": {
            "savingsOpportunityPercentage": 48.5,
            "estimatedMonthlySavings": {"value": 124.50}
          }
        }
      ],
      "inferredWorkloadTypes": ["WebApp"],
      "finding": "Overprovisioned"
    }
  ]
}
Azure Advisor Cost Recommendations
Azure Advisor analyzes VM usage and makes right-sizing recommendations:
# List cost recommendations
az advisor recommendation list \
  --category Cost \
  --query "[?impact=='High']"
# List VMs with their current sizes
az vm list --query "[].{name:name, size:hardwareProfile.vmSize}" -o table
# Check VM utilization (hourly CPU for the last 7 days)
az monitor metrics list \
  --resource /subscriptions/SUB/resourceGroups/RG/providers/Microsoft.Compute/virtualMachines/VM_NAME \
  --metric "Percentage CPU" \
  --interval PT1H \
  --offset 7d \
  --aggregation Average Maximum
Azure VM Sizing Best Practices
# Identify underutilized VMs
# Look for VMs with avg CPU < 10% and max CPU < 30% over 30 days
# The query lists the days on which this VM stayed under both thresholds
az monitor metrics list \
  --resource /subscriptions/SUB/resourceGroups/RG/providers/Microsoft.Compute/virtualMachines/VM_NAME \
  --metric "Percentage CPU" \
  --interval P1D \
  --offset 30d \
  --aggregation Average Maximum \
  --query 'value[0].timeseries[0].data[?average < `10` && maximum < `30`].{day:timeStamp, avg:average, max:maximum}' \
  -o table
GCP Rightsizing Recommendations
GCP’s Recommender provides rightsizing for Compute Engine:
# List rightsizing recommendations
gcloud recommender recommendations list \
  --recommender=google.compute.instance.MachineTypeRecommender \
  --project=my-project \
  --location=us-central1-a
# Apply recommendation (change machine type; the instance must be stopped first)
gcloud compute instances set-machine-type INSTANCE_NAME \
  --machine-type=e2-standard-4 \
  --zone=us-central1-a
# List GKE clusters whose node pools you want to review
gcloud container clusters list
Auto-Scaling Strategies
Target Tracking Scaling
# AWS: target tracking based on CPU
import boto3

asg = boto3.client('autoscaling')
asg.put_scaling_policy(
    AutoScalingGroupName='web-app-asg',
    PolicyName='cpu-target-50',
    PolicyType='TargetTrackingScaling',
    TargetTrackingConfiguration={
        'PredefinedMetricSpecification': {
            'PredefinedMetricType': 'ASGAverageCPUUtilization'
        },
        'TargetValue': 50.0,
        'DisableScaleIn': False
    }
)
Predictive Scaling (AWS)
# Create a predictive scaling policy (reuses the asg client created above)
asg.put_scaling_policy(
    AutoScalingGroupName='web-app-asg',
    PolicyName='predictive-scaling',
    PolicyType='PredictiveScaling',
    PredictiveScalingConfiguration={
        'MetricSpecifications': [{
            'TargetValue': 50.0,
            'PredefinedMetricPairSpecification': {
                'PredefinedMetricType': 'ASGCPUUtilization'
            }
        }],
        'Mode': 'ForecastAndScale',
        'SchedulingBufferTime': 300,
        'MaxCapacityBreachBehavior': 'IncreaseMaxCapacity',
        'MaxCapacityBuffer': 10
    }
)
Kubernetes Cluster Autoscaler
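# cluster-autoscaler-deployment.yaml (excerpt)
# Cluster Autoscaler reads its settings from command-line flags on its
# Deployment. The cluster-autoscaler-status ConfigMap is written by the
# autoscaler to report status; it does not hold configuration.
# NODE_GROUP_NAME and the aws cloud provider are placeholders.
spec:
  containers:
    - name: cluster-autoscaler
      command:
        - ./cluster-autoscaler
        - --cloud-provider=aws
        - --nodes=3:20:NODE_GROUP_NAME  # min:max:node group name
        - --scale-down-enabled=true
        - --scale-down-delay-after-add=10m
        - --scale-down-unneeded-time=10m
        # Only scale down if utilization < 50%
        - --scale-down-utilization-threshold=0.5
Right-Sizing Governance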
Tagging for Right-Sizing
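# Tag instances with right-sizing classification
# CLASSIFICATION and YYYY-MM-DD are placeholders for your own values
aws ec2 create-tags --resources i-12345 \
  --tags Key=RightSizing,Value=CLASSIFICATION Key=ReviewDate,Value=YYYY-MM-DD
Automated Right-Sizing with Lambda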
import boto3

def lambda_handler(event, context):
    """Automatically downsize severely over-provisioned instances"""
    ce = boto3.client('compute-optimizer')
    ec2 = boto3.client('ec2')
    sns = boto3.client('sns')
    # Get over-provisioned instances (first page only; follow nextToken for more)
    response = ce.get_ec2_instance_recommendations(
        filters=[{'name': 'Finding', 'values': ['Overprovisioned']}]
    )
    downsized = []
    for rec in response.get('instanceRecommendations', []):
        options = rec.get('recommendationOptions', [])
        if not options:
            continue
        # Rank 1 is the top recommendation
        best_option = min(options, key=lambda o: o['rank'])
        savings_pct = best_option.get('savingsOpportunity', {}).get(
            'savingsOpportunityPercentage', 0)
        # Only auto-downsize if savings > 40% and performance risk is low.
        # performanceRisk is a 0-4 score (0 = lowest risk), not a percentage.
        if savings_pct > 40 and best_option['performanceRisk'] <= 1:
            instance_id = rec['instanceArn'].split('/')[-1]
            new_type = best_option['instanceType']
            # Stop, modify instance type, start (this causes downtime, and the
            # waiter must finish within the Lambda timeout)
            ec2.stop_instances(InstanceIds=[instance_id])
            waiter = ec2.get_waiter('instance_stopped')
            waiter.wait(InstanceIds=[instance_id])
            ec2.modify_instance_attribute(
                InstanceId=instance_id,
                InstanceType={'Value': new_type}
            )
            ec2.start_instances(InstanceIds=[instance_id])
            downsized.append({
                'instance': instance_id,
                'from': rec['currentInstanceType'],
                'to': new_type,
                'savings_pct': savings_pct
            })
    if downsized:
        sns.publish(
            TopicArn='arn:aws:sns:us-east-1:ACCOUNT:ops-alerts',
            Subject=f'Auto-rightsized {len(downsized)} instances',
            Message=str(downsized)
        )
    return {'downsized': downsized}
Common Right-Sizing Mistakes
1. Rightsizing Without Monitoring
Changing instance types based on guesswork is dangerous. Always profile for at least 14 days (including weekends) to capture peak loads.
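To see why a short window can mislead, the sketch below builds a made-up hourly CPU series and compares the peak seen in a 1-day window with the peak seen over 14 days. The workload shape (quiet nights, busy office hours, a heavy batch job early every Saturday) and the helper names `synthetic_cpu` and `peak_over_window` are assumptions invented for this illustration.

```python
def synthetic_cpu(days):
    """Hourly CPU % for a hypothetical workload; day 0 is a Monday."""
    series = []
    for day in range(days):
        weekday = day % 7  # 0 = Monday ... 6 = Sunday
        for hour in range(24):
            cpu = 10.0  # idle baseline
            if weekday < 5 and 9 <= hour < 18:
                cpu = 30.0  # office-hours traffic on weekdays
            if weekday == 5 and hour == 2:
                cpu = 85.0  # weekly batch job, Saturday 02:00
            series.append(cpu)
    return series


def peak_over_window(series, window_days):
    """Highest hourly value in the most recent `window_days` days."""
    return max(series[-window_days * 24:])


series = synthetic_cpu(28)  # four weeks, ending on a Sunday
one_day_peak = peak_over_window(series, 1)
two_week_peak = peak_over_window(series, 14)
print(f"1-day peak: {one_day_peak}%  14-day peak: {two_week_peak}%")
```

With this series, the one-day window only sees a quiet Sunday, while the 14-day window includes the Saturday batch job that actually determines how much capacity the instance needs.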
2. Ignoring Memory Pressure
CloudWatch doesn’t track memory by default — you need the CloudWatch Agent. Without memory metrics, you might downsize an instance into OOM territory.
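Once memory metrics are available, a simple headroom check can guard a downsize decision. The following is a minimal sketch: the function names and the 20% safety margin are assumptions for illustration, and the input is assumed to be a peak memory-used percentage such as the `mem_used_percent` metric published by the CloudWatch Agent.

```python
def peak_memory_gib(current_memory_gib, peak_mem_used_percent):
    """Convert a peak memory-used percentage into GiB on the current instance."""
    return current_memory_gib * peak_mem_used_percent / 100


def fits_in_memory(peak_used_gib, target_memory_gib, headroom=0.2):
    """True if peak usage plus a safety margin fits in the target's RAM.

    headroom=0.2 (20%) is an illustrative margin, not a recommended value.
    """
    return peak_used_gib * (1 + headroom) <= target_memory_gib


# Hypothetical instance: 8 GiB of RAM, peaking at 38.75% memory used
peak = peak_memory_gib(8, 38.75)
print(f"Peak memory in use: {peak:.2f} GiB")
print("Fits in 4 GiB:", fits_in_memory(peak, 4))
print("Fits in 2 GiB:", fits_in_memory(peak, 2))
```

A CPU-only analysis would treat both targets as equivalent; the memory check is what separates a safe downsize from one that risks OOM kills.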
3. Rightsizing Stateful Instances
Don’t rightsize databases or stateful services without load testing. A smaller instance might handle the same CPU but have less network bandwidth or EBS throughput.
4. Rightsizing in Isolation
An application often runs as multiple instances behind a load balancer, so judge the fleet as a whole rather than one instance’s metrics. Roll the change out one instance at a time so you can measure the impact without a full outage.
5. Not Considering Graviton/ARM
AWS Graviton instances (ARM-based) are advertised by AWS as offering up to 40% better price/performance than comparable x86 instances for many workloads. Always check if your workload can run on ARM before choosing x86.
6. Over-Aggressive Auto-Scaling
Setting target tracking to 80% CPU gives no room for traffic spikes. Start at 50%, monitor, and adjust. Use predictive scaling for predictable patterns.
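The effect of the target value can be estimated with simple arithmetic. The sketch below assumes a fleet running exactly at its target utilization and asks how much extra load it can absorb before reaching 100% CPU, ignoring the time scale-out takes; the function name `spike_headroom_pct` is hypothetical and the model is deliberately simplified.

```python
def spike_headroom_pct(target_cpu_pct):
    """Extra load (in % of current load) a fleet at its target can absorb
    before hitting 100% CPU, assuming load scales linearly with CPU."""
    if not 0 < target_cpu_pct <= 100:
        raise ValueError("target_cpu_pct must be in (0, 100]")
    return (100.0 / target_cpu_pct - 1.0) * 100.0


for target in (50, 80):
    print(f"target {target}% -> absorbs a {spike_headroom_pct(target):.0f}% spike")
```

Under these assumptions, a 50% target lets traffic double before instances saturate, while an 80% target leaves room for only a 25% increase until new capacity comes online.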
7. Forgetting GPUs and Specialized Hardware
GPU instances (p3, p4, g4) are expensive. If your ML model trains for 2 hours a day on a p3.2xlarge but you leave the instance running 24/7, you’re paying for 22 idle GPU hours every day. Use spot instances or scheduled start/stop.
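The idle cost in the GPU example is easy to quantify. In this sketch the hourly rate is a placeholder, not a quoted AWS price, and `idle_waste` is a hypothetical helper; substitute the current on-demand rate for your region and instance type.

```python
def idle_waste(hours_used_per_day, hourly_rate, hours_per_day=24.0):
    """Idle hours, idle share of the day, and daily cost of the idle time."""
    idle_hours = hours_per_day - hours_used_per_day
    return {
        "idle_hours": idle_hours,
        "idle_fraction": idle_hours / hours_per_day,
        "daily_waste": idle_hours * hourly_rate,
    }


# 2 hours of training per day; 3.00/hour is a placeholder rate
result = idle_waste(hours_used_per_day=2, hourly_rate=3.00)
print(f"Idle hours per day: {result['idle_hours']:.0f}")
print(f"Share of spend wasted: {result['idle_fraction']:.1%}")
print(f"Daily waste at the placeholder rate: {result['daily_waste']:.2f}")
```

Whatever the actual rate, leaving the instance on around the clock for a two-hour job means more than 90% of the spend buys idle time.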
Practice Questions
1. What metrics should you analyze before right-sizing? CPU utilization (avg, max, P95), memory usage (requires agent), network throughput, disk I/O, and application-level metrics (response times, throughput). Collect for at least 14 days.
2. How does AWS Compute Optimizer determine if an instance is over-provisioned? It analyzes the instance’s CloudWatch metrics (CPU, network and disk I/O, plus memory when the CloudWatch Agent publishes it) over a lookback period, 14 days by default, and compares them against the capabilities of other instance types. If a smaller type can handle the observed load with acceptable performance risk, it’s recommended.
3. What is the difference between target tracking and step scaling? Target tracking maintains a metric at a target value (e.g., CPU at 50%). Step scaling adds or removes instances based on alarm breaches with step adjustments. Target tracking is simpler; step scaling gives more control.
4. Why should you profile for at least 14 days? A single day might miss weekend patterns, monthly batch jobs, or weekly peaks. Fourteen days captures most workload cycles and provides a reliable baseline.
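5. Challenge: A production web server runs on an m5.4xlarge (16 vCPU, 64GB RAM) with average CPU of 12%, peak CPU of 35%, and memory usage of 8GB. Design a right-sizing plan. Answer: This is significantly over-provisioned. Average load is about 2 vCPUs (12% of 16) and peak load is about 5.6 vCPUs (35% of 16), with 8GB of memory in use. A single m5.xlarge (4 vCPU, 16GB) covers the average but would saturate at peak, so there are two reasonable plans: move to one m5.2xlarge (8 vCPU, 32GB), which peaks around 70% CPU and saves about 50%, or run m5.xlarge instances in an Auto Scaling group with target tracking (50% CPU) so that additional instances absorb the peaks, which saves up to 75% during off-peak hours. If acceptable, test t3.xlarge (burstable) for further savings.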
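The arithmetic behind a plan like the one in the challenge can be checked with a short sketch. It converts the CPU percentages into vCPUs of demand and finds the smallest single m5 size that keeps peak CPU under a ceiling. The 80% CPU ceiling, the 25% memory headroom, and the helper names are assumptions chosen for illustration.

```python
# (name, vCPUs, memory in GiB) for the m5 sizes discussed, smallest first
M5_SIZES = [
    ("m5.large", 2, 8),
    ("m5.xlarge", 4, 16),
    ("m5.2xlarge", 8, 32),
    ("m5.4xlarge", 16, 64),
]


def vcpus_in_use(total_vcpus, cpu_pct):
    """Convert a CPU percentage on the current instance into vCPUs of demand."""
    return total_vcpus * cpu_pct / 100


def smallest_single_instance(peak_vcpus, memory_gib,
                             max_peak_util=0.8, memory_headroom=0.25):
    """Smallest size that keeps peak CPU under the ceiling and fits memory.

    Both limits are illustrative assumptions, not provider guidance.
    """
    for name, vcpus, mem in M5_SIZES:
        cpu_ok = peak_vcpus / vcpus <= max_peak_util
        mem_ok = memory_gib * (1 + memory_headroom) <= mem
        if cpu_ok and mem_ok:
            return name, vcpus, mem
    return None


avg_vcpus = vcpus_in_use(16, 12)   # average demand on the m5.4xlarge
peak_vcpus = vcpus_in_use(16, 35)  # peak demand on the m5.4xlarge
choice = smallest_single_instance(peak_vcpus, memory_gib=8)
print(f"Average demand: {avg_vcpus:.2f} vCPUs, peak demand: {peak_vcpus:.2f} vCPUs")
print(f"Smallest single instance under the ceiling: {choice[0]}")
print(f"Peak CPU on one m5.xlarge: {peak_vcpus / 4:.0%}")
```

A single m5.xlarge would need 140% of its CPU at peak, which is why that size only works when auto-scaling can add instances during busy periods.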
Mini Project: Right-Sizing Analyzer
Create a script that analyzes an AWS account for right-sizing opportunities:
#!/bin/bash
# rightsizing_analyzer.sh - Analyze EC2 instances and find right-sizing opportunities
# Requires: AWS CLI, jq, bc, GNU date
echo "=== EC2 Right-Sizing Analysis ==="
echo "Date: $(date)"
echo ""
# Region and account are the same for every instance, so look them up once
REGION=$(aws configure get region)
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
# 14-day window in UTC (date -d is GNU syntax)
START=$(date -u -d '14 days ago' +%Y-%m-%dT%H:%M:%SZ)
END=$(date -u +%Y-%m-%dT%H:%M:%SZ)
# Get all running instances
echo "Analyzing instances..."
INSTANCES=$(aws ec2 describe-instances \
  --filters Name=instance-state-name,Values=running \
  --query 'Reservations[].Instances[].{ID:InstanceId,Type:InstanceType,LaunchTime:LaunchTime}' \
  --output json)
echo "$INSTANCES" | jq -r '.[] | "\(.ID) \(.Type) \(.LaunchTime)"' | while read -r id type launch; do
  echo "--- Instance: $id ($type) ---"
  echo "Launched: $launch"
  # Get average CPU over last 14 days (empty if there are no datapoints)
  AVG_CPU=$(aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions "Name=InstanceId,Value=$id" \
    --start-time "$START" \
    --end-time "$END" \
    --period 86400 \
    --statistics Average \
    --output json | jq -r '[.Datapoints[].Average] | if length > 0 then add / length else empty end')
  # Get peak CPU from hourly maximums (empty if there are no datapoints)
  PEAK_CPU=$(aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions "Name=InstanceId,Value=$id" \
    --start-time "$START" \
    --end-time "$END" \
    --period 3600 \
    --statistics Maximum \
    --output json | jq -r '[.Datapoints[].Maximum] | max // empty')
  echo "Average CPU: ${AVG_CPU:-N/A}%"
  echo "Peak CPU: ${PEAK_CPU:-N/A}%"
  # Recommendation
  if [ -n "$AVG_CPU" ] && [ -n "$PEAK_CPU" ]; then
    if (( $(echo "$AVG_CPU < 10" | bc -l) )); then
      echo "Recommendation: DOWNGRADE - seriously over-provisioned"
    elif (( $(echo "$AVG_CPU < 30" | bc -l) )); then
      echo "Recommendation: Consider smaller instance type"
    elif (( $(echo "$AVG_CPU > 70" | bc -l) )); then
      echo "Recommendation: UPGRADE - instance under-powered"
    else
      echo "Recommendation: Current size appears appropriate"
    fi
    # Get Compute Optimizer recommendation
    echo "Compute Optimizer:"
    aws compute-optimizer get-ec2-instance-recommendations \
      --instance-arns "arn:aws:ec2:${REGION}:${ACCOUNT_ID}:instance/$id" \
      --query 'instanceRecommendations[0].recommendationOptions[0].{Type:instanceType,Savings:savingsOpportunity.savingsOpportunityPercentage}' \
      --output json 2>/dev/null || echo " (not available)"
  fi
  echo ""
done
FAQ
What’s Next
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Updated 2026-06-20.