Right-Sizing Strategies — Instance Right-Sizing, Compute Optimizer, Auto-Scaling, Workload Profiling

DodaTech Updated Jun 20, 2026 9 min read

Right-sizing is the practice of matching instance types and sizes to actual workload requirements. Most organizations over-provision by 30-50%, wasting millions annually. This guide covers tools and strategies for right-sizing across AWS, Azure, and GCP.

What You’ll Learn

You’ll use AWS Compute Optimizer and Azure Advisor for recommendations, profile workloads to determine optimal instance families, configure auto-scaling policies that match demand, and implement governance to prevent future over-provisioning.

Why Right-Sizing Matters

Every over-provisioned instance is money burned. A t3.medium running at 10% utilization could be replaced by a t3.nano at 1/8th the cost, provided the workload fits in the t3.nano's 0.5 GiB of memory. Right-sizing is the highest-ROI cost optimization activity: it needs no architectural changes, is low risk (easy to revert), and immediately lowers the monthly bill.
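
To make the cost ratio concrete, here is a minimal sketch of the arithmetic. The hourly prices are assumptions (on-demand Linux rates in us-east-1 at the time of writing) and should be replaced with current figures for your region; the function names are illustrative.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month

# Assumed on-demand Linux prices in us-east-1 (USD per hour); verify before use.
ASSUMED_HOURLY_PRICE = {"t3.medium": 0.0416, "t3.nano": 0.0052}


def monthly_cost(instance_type, count=1):
    """Monthly on-demand cost for `count` instances of the given type."""
    return ASSUMED_HOURLY_PRICE[instance_type] * HOURS_PER_MONTH * count


def savings_fraction(current, target):
    """Fraction of the bill saved by moving from `current` to `target`."""
    return 1 - ASSUMED_HOURLY_PRICE[target] / ASSUMED_HOURLY_PRICE[current]


if __name__ == "__main__":
    for name in ASSUMED_HOURLY_PRICE:
        print(f"{name}: ${monthly_cost(name):.2f}/month")
    print(f"Savings: {savings_fraction('t3.medium', 't3.nano'):.1%}")
```

The price ratio only tells half the story: a smaller size also has less memory and a lower CPU baseline, so the saving is real only if the workload still fits.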

Learning Path

    flowchart LR
  A[Anomaly Detection] --> B[Right-Sizing<br/>You are here]
  B --> C[Auto-Scaling]
  C --> D[Reserved Instances]
  style B fill:#f90,color:#fff
  

Workload Profiling

Before right-sizing, understand your workload patterns:

CPU-Bound vs Memory-Bound vs I/O-Bound

| Workload Type | Characteristic | Right-Sizing Strategy |
| --- | --- | --- |
| CPU-bound | High CPU, low memory | Compute-optimized (c-series) |
| Memory-bound | High RAM, moderate CPU | Memory-optimized (r-series, x-series) |
| I/O-bound | High disk/network I/O | Storage-optimized (i-series, d-series) |
| Burstable | Low average, periodic spikes | Burstable (t-series) or auto-scaling |
| Steady-state | Consistent utilization | Fixed general-purpose instance (m-series) |
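
The table can be turned into a rough first-pass classifier. This is a sketch only: the thresholds and the `io_wait` input are assumptions chosen for illustration, not values from any provider, and should be tuned against your own fleet.

```python
# Instance families from the table above, keyed by workload type.
FAMILY_FOR_WORKLOAD = {
    "CPU-bound": "Compute-optimized (c-series)",
    "Memory-bound": "Memory-optimized (r-series, x-series)",
    "I/O-bound": "Storage-optimized (i-series, d-series)",
    "Burstable": "Burstable (t-series) or auto-scaling",
    "Steady-state": "Fixed general-purpose instance (m-series)",
}


def classify_workload(avg_cpu, peak_cpu, avg_mem, io_wait):
    """Classify a workload; all inputs are percentages (0-100).

    Thresholds are hypothetical starting points, not provider guidance.
    """
    if io_wait >= 30:
        return "I/O-bound"
    if avg_mem >= 70 and avg_cpu < 50:
        return "Memory-bound"
    if avg_cpu >= 60 and avg_mem < 50:
        return "CPU-bound"
    if avg_cpu < 20 and peak_cpu >= 3 * avg_cpu:
        return "Burstable"
    return "Steady-state"


if __name__ == "__main__":
    kind = classify_workload(avg_cpu=8, peak_cpu=60, avg_mem=30, io_wait=2)
    print(kind, "->", FAMILY_FOR_WORKLOAD[kind])
```

Treat the result as a prompt for investigation rather than a decision; a workload near a threshold deserves a closer look at its raw metrics.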

Profiling with CloudWatch

import boto3
import pandas as pd
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client('cloudwatch')

# Memory is not published under AWS/EC2; it requires the CloudWatch Agent.
# DiskReadBytes/DiskWriteBytes cover instance store volumes only.
EC2_METRICS = [
    'CPUUtilization',
    'NetworkIn',
    'NetworkOut',
    'DiskReadBytes',
    'DiskWriteBytes'
]

def profile_instance(instance_id, days=30):
    """Get hourly CPU, network, and disk statistics for an instance"""

    end = datetime.now(timezone.utc)
    start = end - timedelta(days=days)

    metrics = {}
    for metric_name in EC2_METRICS:
        response = cloudwatch.get_metric_statistics(
            Namespace='AWS/EC2',
            MetricName=metric_name,
            Dimensions=[{'Name': 'InstanceId', 'Value': instance_id}],
            StartTime=start,
            EndTime=end,
            Period=3600,  # 1-hour periods
            Statistics=['Average', 'Maximum']
        )
        # One DataFrame per metric (columns: Timestamp, Average, Maximum, Unit)
        metrics[metric_name] = pd.DataFrame(response['Datapoints'])

    # Analysis
    cpu = metrics['CPUUtilization']
    if cpu.empty:
        print(f"Instance: {instance_id} (no CPU datapoints in the last {days} days)")
        return metrics

    print(f"Instance: {instance_id}")
    print(f"  Avg CPU: {cpu['Average'].mean():.1f}%")
    print(f"  Peak CPU: {cpu['Maximum'].max():.1f}%")
    # P95 of the hourly averages. Percentile statistics such as 'p95' are not
    # valid in Statistics; CloudWatch expects them in ExtendedStatistics.
    print(f"  P95 CPU: {cpu['Average'].quantile(0.95):.1f}%")
    print(f"  Recommendation: {recommend_instance(cpu)}")

    return metrics

def recommend_instance(cpu):
    """Recommend instance family based on CPU utilization patterns"""
    avg_cpu = cpu['Average'].mean()
    peak_cpu = cpu['Maximum'].max()

    if avg_cpu < 10 and peak_cpu < 30:
        return "Downsize or use burstable (t-series)"
    elif avg_cpu < 30 and peak_cpu < 60:
        return "Consider smaller size or t-series"
    elif avg_cpu > 80:
        return "Consider scaling out or compute-optimized"
    else:
        return "Current size likely appropriate"
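
If you want CloudWatch itself to compute the 95th percentile, the request has to use `ExtendedStatistics` rather than `Statistics` (a single `get_metric_statistics` call accepts one or the other). The sketch below only builds the request arguments and reads the response, so it can be checked without AWS credentials; `build_p95_request` and `p95_values` are illustrative helper names, not part of boto3.

```python
from datetime import datetime, timedelta, timezone


def build_p95_request(instance_id, days=30, now=None):
    """Build keyword arguments for cloudwatch.get_metric_statistics (p95 CPU)."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(days=days),
        "EndTime": end,
        "Period": 3600,  # hourly datapoints
        "ExtendedStatistics": ["p95"],  # percentiles go here, not in Statistics
    }


def p95_values(response):
    """Extract the p95 value from each datapoint of a response dict."""
    return [d["ExtendedStatistics"]["p95"] for d in response.get("Datapoints", [])]
```

Usage would look like `cloudwatch.get_metric_statistics(**build_p95_request("i-12345"))`, followed by `max(p95_values(response))` to find the worst hour.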

AWS Compute Optimizer

AWS Compute Optimizer analyzes CloudWatch metrics and recommends optimal instance types:

# Enable Compute Optimizer
aws compute-optimizer update-enrollment-status --status Active

# Get EC2 recommendations
aws compute-optimizer get-ec2-instance-recommendations \
  --instance-arns "arn:aws:ec2:us-east-1:ACCOUNT:instance/i-12345"

# Get auto-scaling group recommendations
aws compute-optimizer get-auto-scaling-group-recommendations

# Get Lambda function recommendations
aws compute-optimizer get-lambda-function-recommendations

# Export to CSV (the shorthand key for the S3 prefix is keyPrefix)
aws compute-optimizer export-ec2-instance-recommendations \
  --s3-destination-config bucket=my-bucket,keyPrefix=compute-optimizer/

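Illustrative recommendation output, abridged to a single instance (the real response returns a list under instanceRecommendations, and performanceRisk is a score from 0 to 4):

{
    "instanceRecommendation": {
        "instanceArn": "arn:aws:ec2:us-east-1:ACCOUNT:instance/i-12345",
        "currentInstanceType": "m5.xlarge",
        "recommendationOptions": [
            {
                "instanceType": "m5.large",
                "performanceRisk": 1,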
                "projectedUtilizationMetrics": [
                    {"name": "Cpu", "statistic": "Maximum", "value": 85.0}
                ],
                "rank": 1,
                "savingsOpportunity": {
                    "savingsOpportunityPercentage": 48.5,
                    "estimatedMonthlySavings": {"value": 124.50}
                }
            }
        ],
        "inferredWorkloadTypes": ["WebApp"],
        "finding": "Overprovisioned"
    }
}
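
A short sketch of how such a recommendation might be consumed: pick the option with the best rank and annualize its savings. The sample dictionary mirrors the field names in the output above; the second option and its numbers are made up for illustration.

```python
def best_option(recommendation):
    """Return the top-ranked option (rank 1 is best) instead of relying on list order."""
    return min(recommendation["recommendationOptions"], key=lambda o: o["rank"])


def annual_savings(option):
    """Estimated yearly savings in the currency of the recommendation."""
    monthly = (
        option.get("savingsOpportunity", {})
        .get("estimatedMonthlySavings", {})
        .get("value", 0.0)
    )
    return monthly * 12


# Illustrative data only; field names follow the sample output above.
sample = {
    "currentInstanceType": "m5.xlarge",
    "recommendationOptions": [
        {"instanceType": "t3.large", "rank": 2, "performanceRisk": 2,
         "savingsOpportunity": {"estimatedMonthlySavings": {"value": 80.0}}},
        {"instanceType": "m5.large", "rank": 1, "performanceRisk": 1,
         "savingsOpportunity": {"estimatedMonthlySavings": {"value": 124.50}}},
    ],
}

if __name__ == "__main__":
    option = best_option(sample)
    print(f"{sample['currentInstanceType']} -> {option['instanceType']}: "
          f"${annual_savings(option):.2f}/year")
```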

Azure Advisor Cost Recommendations

Azure Advisor analyzes VM usage and makes right-sizing recommendations:

# List cost recommendations
az advisor recommendation list \
  --category Cost \
  --query "[?impact=='High']"

# List VMs and their current sizes
az vm list --query "[].{name:name, size:hardwareProfile.vmSize}" -o table

# Check VM utilization (hourly CPU for the last 7 days = 168 data points)
az monitor metrics list \
  --resource /subscriptions/SUB/resourceGroups/RG/providers/Microsoft.Compute/virtualMachines/VM_NAME \
  --metric "Percentage CPU" \
  --interval PT1H \
  --offset 7d \
  --aggregation Average Maximum

Azure VM Sizing Best Practices

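# Identify underutilized VMs
# Look for VMs with avg CPU < 10% and max CPU < 30% over 30 days.
# This lists the days on which the VM stayed below both thresholds;
# a VM that appears on (nearly) every day is a downsizing candidate.
az monitor metrics list \
  --resource /subscriptions/SUB/resourceGroups/RG/providers/Microsoft.Compute/virtualMachines/VM_NAME \
  --metric "Percentage CPU" \
  --interval P1D \
  --offset 30d \
  --aggregation Average Maximum \
  --query 'value[0].timeseries[0].data[?average < `10` && maximum < `30`].{day:timeStamp, avg:average, max:maximum}' \
  -o table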
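
The same rule can be applied offline to the JSON that `az monitor metrics list` prints. This sketch assumes the command was run with both the Average and Maximum aggregations and that the output has the usual `value` → `timeseries` → `data` nesting; `is_underutilized` is an illustrative name.

```python
import json


def is_underutilized(metrics_json, avg_limit=10.0, max_limit=30.0):
    """True if mean of averages < avg_limit and overall peak < max_limit."""
    doc = json.loads(metrics_json)
    points = [
        point
        for metric in doc.get("value", [])
        for series in metric.get("timeseries", [])
        for point in series.get("data", [])
        # Skip intervals with no samples (fields absent or null).
        if point.get("average") is not None and point.get("maximum") is not None
    ]
    if not points:
        return False  # no data: do not recommend a change
    mean_avg = sum(p["average"] for p in points) / len(points)
    peak = max(p["maximum"] for p in points)
    return mean_avg < avg_limit and peak < max_limit
```

A typical use is to save the CLI output with `-o json > cpu.json` and pass the file contents to `is_underutilized`.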

GCP Rightsizing Recommendations

GCP’s Recommender provides rightsizing for Compute Engine:

# List rightsizing recommendations
gcloud recommender recommendations list \
  --recommender=google.compute.instance.MachineTypeRecommender \
  --project=my-project \
  --location=us-central1-a

# Apply recommendation (change machine type; the instance must be stopped first)
gcloud compute instances set-machine-type INSTANCE_NAME \
  --machine-type=e2-standard-4 \
  --zone=us-central1-a

# List GKE clusters. This command has no rightsizing flag; for workloads on GKE,
# right-size CPU/memory requests with the Vertical Pod Autoscaler instead.
gcloud container clusters list

Auto-Scaling Strategies

Target Tracking Scaling

# AWS — target tracking based on CPU
import boto3

asg = boto3.client('autoscaling')

asg.put_scaling_policy(
    AutoScalingGroupName='web-app-asg',
    PolicyName='cpu-target-50',
    PolicyType='TargetTrackingScaling',
    TargetTrackingConfiguration={
        'PredefinedMetricSpecification': {
            'PredefinedMetricType': 'ASGAverageCPUUtilization'
        },
        'TargetValue': 50.0,
        'DisableScaleIn': False
    }
)
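
To build intuition for what a 50% target means, the following sketch uses a simplified proportional model: capacity is scaled by the ratio of the observed metric to the target, then clamped to the group's limits. This is an approximation for reasoning about sizing, not the service's actual algorithm, which also involves CloudWatch alarms, warm-up, and more conservative scale-in.

```python
import math


def desired_capacity(current_capacity, metric_value, target_value, min_size, max_size):
    """Approximate capacity needed to bring the metric back to the target."""
    raw = math.ceil(current_capacity * metric_value / target_value)
    return max(min_size, min(max_size, raw))


if __name__ == "__main__":
    # 4 instances at 75% average CPU with a 50% target need about 6 instances.
    print(desired_capacity(4, 75.0, 50.0, min_size=2, max_size=10))
```

The model also shows why the target matters: the lower the target, the more instances run at any given load, and the more headroom each one has for spikes.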

Predictive Scaling (AWS)

# Create a predictive scaling policy
asg.put_scaling_policy(
    AutoScalingGroupName='web-app-asg',
    PolicyName='predictive-scaling',
    PolicyType='PredictiveScaling',
    PredictiveScalingConfiguration={
        'MetricSpecifications': [{
            'TargetValue': 50.0,
            'PredefinedMetricPairSpecification': {
                'PredefinedMetricType': 'ASGCPUUtilization'
            }
        }],
        'Mode': 'ForecastAndScale',
        'SchedulingBufferTime': 300,
        'MaxCapacityBreachBehavior': 'IncreaseMaxCapacity',
        'MaxCapacityBuffer': 10
    }
)
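
The interaction between `MaxCapacityBreachBehavior` and `MaxCapacityBuffer` is easier to see with numbers. The sketch below models the documented behavior (a buffer of 10 means 10% above the forecast capacity); the rounding rule and the function name are assumptions made for illustration.

```python
def effective_max_capacity(max_capacity, forecast_capacity, buffer_percent, behavior):
    """Upper capacity limit that predictive scaling may use for a forecast."""
    if behavior == "HonorMaxCapacity":
        return max_capacity  # never exceed the group's configured maximum
    # IncreaseMaxCapacity: allow forecast plus buffer (integer ceiling, assumed).
    buffer_units = (forecast_capacity * buffer_percent + 99) // 100
    return max(max_capacity, forecast_capacity + buffer_units)


if __name__ == "__main__":
    # Forecast of 50 with a group maximum of 40 and a 10% buffer.
    print(effective_max_capacity(40, 50, 10, "IncreaseMaxCapacity"))
```

With `IncreaseMaxCapacity`, the group maximum stops being a hard cost ceiling, so pair it with a budget alert if spend matters more than availability.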

Kubernetes Cluster Autoscaler

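# cluster-autoscaler-deployment.yaml (excerpt of the pod spec)
# Cluster Autoscaler reads its settings from command-line flags on its own
# Deployment. The cluster-autoscaler-status ConfigMap is written by the
# autoscaler to report status; it is not a place to put configuration.
# Assumes AWS; NODE_GROUP_NAME is a placeholder for your node group.
containers:
  - name: cluster-autoscaler
    command:
      - ./cluster-autoscaler
      - --cloud-provider=aws
      - --nodes=3:20:NODE_GROUP_NAME  # min:max:name
      - --scale-down-enabled=true
      - --scale-down-delay-after-add=10m
      - --scale-down-unneeded-time=10m
      # Only scale down if utilization < 50%
      - --scale-down-utilization-threshold=0.5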
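
The utilization threshold is measured against pod resource requests, not live CPU usage. A minimal sketch of that check, assuming utilization is the larger of the CPU and memory request ratios for a node (function names are illustrative):

```python
def node_utilization(requested_cpu_m, allocatable_cpu_m,
                     requested_mem_mi, allocatable_mem_mi):
    """Request-based utilization: the larger of the CPU and memory ratios."""
    return max(requested_cpu_m / allocatable_cpu_m,
               requested_mem_mi / allocatable_mem_mi)


def scale_down_candidate(utilization, threshold=0.5):
    """A node is a removal candidate only while it stays below the threshold."""
    return utilization < threshold


if __name__ == "__main__":
    # 1 of 4 cores and 6 of 16 GiB requested on this node.
    u = node_utilization(1000, 4000, 6144, 16384)
    print(f"utilization={u:.3f}, candidate={scale_down_candidate(u)}")
```

Because the check uses requests, inflated requests keep nodes alive even when the containers are idle, which is why request right-sizing and node autoscaling go together.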

Right-Sizing Governance

Tagging for Right-Sizing

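# Tag instances with right-sizing classification
# (each tag needs a Key and a Value; CLASSIFICATION and YYYY-MM-DD are placeholders)
aws ec2 create-tags --resources i-12345 \
  --tags Key=RightSizing,Value=CLASSIFICATION Key=ReviewDate,Value=YYYY-MM-DD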

Automated Right-Sizing with Lambda

import boto3

def lambda_handler(event, context):
    """Automatically downsize severely over-provisioned instances"""

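    optimizer = boto3.client('compute-optimizer')
    ec2 = boto3.client('ec2')
    sns = boto3.client('sns')

    # Get over-provisioned instances (first page only; follow nextToken for more)
    response = optimizer.get_ec2_instance_recommendations(
        filters=[{'name': 'Finding', 'values': ['Overprovisioned']}]
    )

    downsized = []
    for rec in response.get('instanceRecommendations', []):
        # Pick the top-ranked option rather than assuming list order
        best_option = min(rec['recommendationOptions'], key=lambda o: o['rank'])
        savings_pct = best_option.get('savingsOpportunity', {}).get(
            'savingsOpportunityPercentage', 0)
        # Only auto-downsize if savings > 40% and performance risk is low.
        # performanceRisk is a 0-4 score (0 = lowest risk), not a percentage;
        # the cutoff of 1 is a conservative choice, adjust to taste.
        if savings_pct > 40 and best_option['performanceRisk'] <= 1: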

            instance_id = rec['instanceArn'].split('/')[-1]
            new_type = best_option['instanceType']

            # Stop, modify instance type, start
            ec2.stop_instances(InstanceIds=[instance_id])
            waiter = ec2.get_waiter('instance_stopped')
            waiter.wait(InstanceIds=[instance_id])

            ec2.modify_instance_attribute(
                InstanceId=instance_id,
                InstanceType={'Value': new_type}
            )
            ec2.start_instances(InstanceIds=[instance_id])

            downsized.append({
                'instance': instance_id,
                'from': rec['currentInstanceType'],
                'to': new_type,
                'savings_pct': best_option['savingsOpportunity']['savingsOpportunityPercentage']
            })

    if downsized:
        sns.publish(
            TopicArn='arn:aws:sns:us-east-1:ACCOUNT:ops-alerts',
            Subject=f'Auto-rightsized {len(downsized)} instances',
            Message=str(downsized)
        )

    return {'downsized': downsized}
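
Before letting automation resize anything, it helps to put the decision in a small pure function that can be unit-tested. The sketch below adds a guard so that only instances explicitly tagged as non-production are touched; the `Environment` tag key and its values are hypothetical conventions, and the thresholds are starting points rather than recommendations.

```python
PROTECTED_ENVIRONMENTS = {"production", "prod"}  # hypothetical tag values


def should_auto_downsize(option, tags, min_savings_pct=40.0, max_risk=1.0):
    """Decide whether a recommendation option may be applied automatically.

    option: one entry of recommendationOptions
    tags:   dict of the instance's tags, e.g. {"Environment": "staging"}
    """
    env = tags.get("Environment")
    # Untagged instances are treated as protected: when in doubt, do nothing.
    if env is None or env.lower() in PROTECTED_ENVIRONMENTS:
        return False
    savings = option.get("savingsOpportunity", {}).get(
        "savingsOpportunityPercentage", 0.0)
    # A missing risk score is treated as the worst case (4 on the 0-4 scale).
    risk = option.get("performanceRisk", 4.0)
    return savings > min_savings_pct and risk <= max_risk
```

Instances that fail the check can still be reported through the SNS topic so a person can review them.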

Common Right-Sizing Mistakes

1. Rightsizing Without Monitoring

Changing instance types based on guesswork is dangerous. Always profile for at least 14 days (including weekends) to capture peak loads.

2. Ignoring Memory Pressure

CloudWatch doesn’t track memory by default — you need the CloudWatch Agent. Without memory metrics, you might downsize an instance into OOM territory.

3. Rightsizing Stateful Instances

Don’t rightsize databases or stateful services without load testing. A smaller instance might handle the same CPU but have less network bandwidth or EBS throughput.

4. Rightsizing in Isolation

An application often runs on several instances behind a load balancer, so judge the fleet as a whole rather than one instance's metrics. When you do resize, change one instance at a time so you can measure the impact without a full outage.

5. Not Considering Graviton/ARM

AWS Graviton instances (ARM-based) offer up to 40% better price/performance than comparable x86 instances for many workloads, according to AWS. Always check whether your workload can run on ARM before choosing x86.

6. Over-Aggressive Auto-Scaling

Setting target tracking to 80% CPU gives no room for traffic spikes. Start at 50%, monitor, and adjust. Use predictive scaling for predictable patterns.

7. Forgetting GPUs and Specialized Hardware

GPU instances (p3, p4, g4) are expensive. If your ML model trains in 2 hours on a p3.2xlarge but you leave it running 24/7, you’re wasting 22 hours of GPU cost. Use spot instances or scheduled start/stop.
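
The waste from an always-on training instance is simple to quantify. In this sketch the hourly price is passed in by the caller; the 3.06 USD figure in the example is an assumed on-demand rate for a p3.2xlarge and should be checked against current pricing.

```python
def idle_gpu_waste(hourly_price, used_hours_per_day, days=30):
    """Idle hours, idle fraction, and wasted spend for an always-on instance."""
    idle_per_day = 24 - used_hours_per_day
    idle_hours = idle_per_day * days
    return {
        "idle_hours": idle_hours,
        "idle_fraction": idle_per_day / 24,
        "wasted_usd": idle_hours * hourly_price,
    }


if __name__ == "__main__":
    # Assumed price: 3.06 USD/hour; 2 hours of real training per day.
    report = idle_gpu_waste(hourly_price=3.06, used_hours_per_day=2)
    print(f"Idle {report['idle_fraction']:.0%} of the time, "
          f"about ${report['wasted_usd']:.0f} wasted per month")
```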

Practice Questions

1. What metrics should you analyze before right-sizing? CPU utilization (avg, max, P95), memory usage (requires agent), network throughput, disk I/O, and application-level metrics (response times, throughput). Collect for at least 14 days.

2. How does AWS Compute Optimizer determine if an instance is over-provisioned? It compares the instance’s CPU, network, and disk utilization (plus memory, when the CloudWatch Agent publishes it) over the lookback period, 14 days by default, against the capabilities of smaller instance types. If a smaller type can handle the observed load with acceptable performance risk, it’s recommended.

3. What is the difference between target tracking and step scaling? Target tracking maintains a metric at a target value (e.g., CPU at 50%). Step scaling adds or removes instances based on alarm breaches with step adjustments. Target tracking is simpler; step scaling gives more control.

4. Why should you profile for at least 14 days? A single day might miss weekend patterns or weekly peaks. Fourteen days covers two full weekly cycles and provides a reliable baseline; extend the window to 30 days or more if the workload has monthly batch jobs.

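5. Challenge: A production web server runs on an m5.4xlarge (16 vCPU, 64GB RAM) with average CPU of 12%, peak CPU of 35%, and memory usage of 8GB. Design a right-sizing plan. Answer: This is significantly over-provisioned. Average load is about 1.9 vCPU (12% of 16) and peak load about 5.6 vCPU (35% of 16), with 8GB of memory in use. An m5.xlarge (4 vCPU, 16GB) covers the average at roughly 48% CPU but not the peak on its own, so run m5.xlarge instances in an Auto Scaling group with target tracking scaling (50% CPU) and let a second instance absorb peaks. If the server must stay a single instance, m5.2xlarge (8 vCPU, 32GB) peaks near 70% CPU. Test t3.xlarge (burstable) only with care, because a 48% average exceeds its 40% per-vCPU baseline. Estimated savings: up to 75% at baseline with one m5.xlarge, or 50% with m5.2xlarge.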
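
A quick way to sanity-check a sizing plan like the one in the challenge is to project the observed CPU percentages onto the candidate size. The sketch assumes CPU demand scales linearly with vCPU count and that per-vCPU performance is the same within the family, which is a simplification.

```python
def projected_cpu(current_vcpus, cpu_percent, target_vcpus):
    """CPU percentage the same load would produce on a different vCPU count."""
    return cpu_percent * current_vcpus / target_vcpus


if __name__ == "__main__":
    current_vcpus, avg_cpu, peak_cpu = 16, 12, 35  # m5.4xlarge from the challenge
    for name, vcpus in [("m5.2xlarge", 8), ("m5.xlarge", 4)]:
        avg = projected_cpu(current_vcpus, avg_cpu, vcpus)
        peak = projected_cpu(current_vcpus, peak_cpu, vcpus)
        note = "needs scale-out at peak" if peak > 100 else "fits on one instance"
        print(f"{name}: avg {avg:.0f}%, peak {peak:.0f}% ({note})")
```

A projected peak above 100% means a single instance of that size cannot carry the load alone, so the plan must include scale-out or a larger size.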

Mini Project: Right-Sizing Analyzer

Create a script that analyzes an AWS account for right-sizing opportunities:

#!/bin/bash
# rightsizing_analyzer.sh — Analyze EC2 instances and find right-sizing opportunities
# Requires: AWS CLI, jq

echo "=== EC2 Right-Sizing Analysis ==="
echo "Date: $(date)"
echo ""

# Get all running instances
echo "Analyzing instances..."
INSTANCES=$(aws ec2 describe-instances \
  --filters Name=instance-state-name,Values=running \
  --query 'Reservations[].Instances[].{ID:InstanceId,Type:InstanceType,LaunchTime:LaunchTime}' \
  --output json)

echo "$INSTANCES" | jq -r '.[] | "\(.ID) \(.Type) \(.LaunchTime)"' | while read id type launch; do
    echo "--- Instance: $id ($type) ---"
    echo "Launched: $launch"

    # Get average CPU over last 14 days (mean of daily averages; empty if no data)
    # Timestamps are generated in UTC (-u) to match the trailing Z
    AVG_CPU=$(aws cloudwatch get-metric-statistics \
      --namespace AWS/EC2 \
      --metric-name CPUUtilization \
      --dimensions "Name=InstanceId,Value=$id" \
      --start-time "$(date -u -d '14 days ago' +%Y-%m-%dT%H:%M:%SZ)" \
      --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
      --period 86400 \
      --statistics Average \
      --output json | jq -r '[.Datapoints[].Average] | if length > 0 then add / length else empty end')

    PEAK_CPU=$(aws cloudwatch get-metric-statistics \
      --namespace AWS/EC2 \
      --metric-name CPUUtilization \
      --dimensions "Name=InstanceId,Value=$id" \
      --start-time "$(date -u -d '14 days ago' +%Y-%m-%dT%H:%M:%SZ)" \
      --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
      --period 3600 \
      --statistics Maximum \
      --output json | jq -r '[.Datapoints[].Maximum] | max // empty')

    echo "Average CPU: ${AVG_CPU:-N/A}%"
    echo "Peak CPU: ${PEAK_CPU:-N/A}%"

    # Recommendation
    if [ -n "$AVG_CPU" ] && [ -n "$PEAK_CPU" ]; then
        if (( $(echo "$AVG_CPU < 10" | bc -l) )); then
            echo "Recommendation: DOWNGRADE - seriously over-provisioned"
        elif (( $(echo "$AVG_CPU < 30" | bc -l) )); then
            echo "Recommendation: Consider smaller instance type"
        elif (( $(echo "$AVG_CPU > 70" | bc -l) )); then
            echo "Recommendation: UPGRADE - instance under-powered"
        else
            echo "Recommendation: Current size appears appropriate"
        fi

        # Get Compute Optimizer recommendation
        echo "Compute Optimizer:"
        aws compute-optimizer get-ec2-instance-recommendations \
          --instance-arns "arn:aws:ec2:$(aws configure get region):$(aws sts get-caller-identity --query Account --output text):instance/$id" \
          --query 'instanceRecommendations[0].recommendationOptions[0].{Type:instanceType,Savings:savingsOpportunity.savingsOpportunityPercentage}' \
          --output json 2>/dev/null || echo "  (not available)"
    fi

    echo ""
done

FAQ

How often should I right-size?
Review rightsizing recommendations monthly. Automated rightsizing can happen weekly for non-production. Production instances should be manually reviewed and tested.
Can right-sizing cause downtime?
Yes. Changing the instance type requires stopping and starting the instance (for EBS-backed instances), which causes downtime. To avoid it, use a rolling replacement strategy: launch a new instance, verify, then terminate the old one.
What’s the savings potential from right-sizing?
Most organizations save 20-40% on compute costs through right-sizing alone. The largest savings come from: (1) eliminating unused instances, (2) downsizing over-provisioned instances, (3) using burstable instances where appropriate.
Does right-sizing apply to containers?
Yes — container rightsizing means optimizing CPU/memory requests and limits. Over-requested resources waste cluster capacity. Tools like Vertical Pod Autoscaler (VPA) recommend optimal resource settings.
What’s the difference between right-sizing and auto-scaling?
Right-sizing selects the appropriate instance type for the workload. Auto-scaling adjusts the number of instances. They work together: right-size first, then configure auto-scaling for the right-sized instance.
How do I handle workloads with unpredictable spikes?
Use burstable instances (t-series) for moderate baseline with occasional spikes. For frequent spikes, use auto-scaling with a buffer. For rare extreme spikes, accept the cost of over-provisioning during those periods.
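
Whether a burstable instance suits a spiky workload comes down to CPU credits: one credit is one vCPU at 100% for one minute. The sketch below derives the baseline from the credit rate and shows whether a given average load builds or drains the balance. The t3.medium figures (24 credits per hour, 2 vCPUs) are taken from AWS's published specifications; verify them for other sizes.

```python
def baseline_percent(credits_per_hour, vcpus):
    """Per-vCPU utilization that exactly consumes the credits earned."""
    return credits_per_hour * 100 / (60 * vcpus)


def hourly_credit_change(credits_per_hour, vcpus, avg_cpu_percent):
    """Credits gained (positive) or drained (negative) per hour at a given load."""
    spent = vcpus * 60 * avg_cpu_percent / 100
    return credits_per_hour - spent


if __name__ == "__main__":
    # t3.medium: 24 credits/hour, 2 vCPUs
    print(f"baseline: {baseline_percent(24, 2):.0f}% per vCPU")
    for load in (10, 30):
        print(f"at {load}% average CPU: {hourly_credit_change(24, 2, load):+.0f} credits/hour")
```

A workload whose long-run average sits above the baseline will drain its credits and then either be throttled or incur surplus charges, in which case a fixed-performance family is usually the better fit.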

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Updated 2026-06-20.
