Spot & Preemptible Instances: 80-90% Discount on Compute

DodaTech Updated Jun 20, 2026 7 min read

Spot and preemptible instances offer 80-90% discounts on cloud compute in exchange for the risk of interruption — ideal for batch processing, CI/CD, stateless web workers, and any fault-tolerant workload that can handle being terminated with little notice.

What You’ll Learn

  • AWS Spot Instances and Spot Fleet
  • Azure Spot VMs and eviction policies
  • GCP Preemptible and Spot VMs
  • Spot pricing mechanisms and how to bid
  • Interruption handling and graceful shutdowns
  • Checkpointing for long-running batch jobs
  • Designing fault-tolerant spot workloads
  • Spot orchestration with Spot.io/NetApp

Why It Matters

On-demand compute is the most expensive option. For workloads that can tolerate interruption — batch processing, testing, CI/CD pipelines, stateless microservices — spot instances reduce compute costs by 80-90%. DodaTech runs all DodaZIP build agents and Durga Antivirus Pro’s malware analysis sandbox on spot instances, saving $14k/month.

    flowchart LR
    A[Workload Type] --> B{Fault-Tolerant?}
    B -->|Yes| C[Spot / Preemptible]
    B -->|No| D[On-Demand / Reserved]
    C --> E[Spot Fleet / Node Group]
    C --> F[Checkpointing]
    C --> G[Interruption Handling]
    E --> H[80-90% Savings]
    style H fill:#22c55e,color:#fff
  

1. AWS Spot Instances

AWS Spot Instances use spare EC2 capacity at up to 90% discount. Pricing varies by instance type, region, and availability.

# Request a Spot Instance
aws ec2 request-spot-instances \
  --spot-price "0.05" \
  --instance-count 5 \
  --type "one-time" \
  --launch-specification '{
    "ImageId": "ami-0c55b159cbfafe1f0",
    "InstanceType": "m5.large",
    "Placement": {"AvailabilityZone": "us-east-1a"}
  }'

# Describe spot price history
aws ec2 describe-spot-price-history \
  --instance-types m5.large \
  --product-description "Linux/UNIX" \
  --start-time 2026-06-01T00:00:00Z

# Check spot instance status
aws ec2 describe-spot-instance-requests \
  --filters "Name=state,Values=active"

Spot price history output:

Time                     InstanceType  ProductDesc    SpotPrice
2026-06-19T12:00:00Z     m5.large      Linux/UNIX      $0.0284
2026-06-19T11:00:00Z     m5.large      Linux/UNIX      $0.0250
2026-06-19T10:00:00Z     m5.large      Linux/UNIX      $0.0312
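
To turn a price history like this into a savings figure, the arithmetic can be sketched as follows. The spot prices are the three sample rows above; the on-demand rate is an assumption for illustration and should be replaced with the published rate for your region.

```python
# Spot prices copied from the sample output above (USD per hour).
spot_prices = [0.0284, 0.0250, 0.0312]

# Assumption: on-demand rate for the same instance type. Look up the real
# figure for your region before relying on the result.
ON_DEMAND_PRICE = 0.096


def average(values):
    """Arithmetic mean of a non-empty list of prices."""
    return sum(values) / len(values)


def discount_percent(spot_price, on_demand_price):
    """Percentage saved relative to on-demand."""
    return (1 - spot_price / on_demand_price) * 100


avg_spot = average(spot_prices)
saving = discount_percent(avg_spot, ON_DEMAND_PRICE)
print(f"Average spot price: ${avg_spot:.4f}/hr")
print(f"Discount vs on-demand: {saving:.1f}%")
```

With these inputs the discount comes out near 70%, which shows why it is worth checking real prices rather than assuming the headline maximum.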

Spot Fleet

Spot Fleet automatically launches and maintains the optimal mix of spot instances across pools to meet target capacity.

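# Create an EC2 Fleet of spot instances (create-fleet is the newer API; the legacy Spot Fleet call is request-spot-fleet)
# --type "instant" launches once; use "maintain" to have interrupted capacity replaced automatically
aws ec2 create-fleet \
  --target-capacity-specification '{"TotalTargetCapacity": 20, "DefaultTargetCapacityType": "spot"}' \
  --launch-template-configs '{"LaunchTemplateSpecification": {"LaunchTemplateName": "worker-template", "Version": "1"}}' \
  --type "instant"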
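
A spot pool is one instance type in one availability zone, so "a mix across pools" means giving the fleet several type and zone combinations to choose from. A minimal sketch of generating that configuration is shown here; the instance types and zones are hypothetical choices, and the output is meant to be saved to a file and passed to the CLI.

```python
import json
from itertools import product


def build_overrides(instance_types, availability_zones):
    """One override per (instance type, AZ) pair, i.e. one per spot pool."""
    return [
        {"InstanceType": itype, "AvailabilityZone": az}
        for itype, az in product(instance_types, availability_zones)
    ]


# Hypothetical choices for illustration; pick types your workload can run on.
instance_types = ["m5.large", "m5a.large", "m6i.large"]
zones = ["us-east-1a", "us-east-1b"]

launch_template_configs = [
    {
        "LaunchTemplateSpecification": {
            "LaunchTemplateName": "worker-template",
            "Version": "1",
        },
        "Overrides": build_overrides(instance_types, zones),
    }
]

# Save the output to configs.json and pass it as
#   --launch-template-configs file://configs.json
print(json.dumps(launch_template_configs, indent=2))
```

Three types across two zones yields six pools, so a capacity shortage in any single pool no longer blocks the whole fleet.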

2. Azure Spot VMs

Azure Spot VMs offer up to 90% discount with eviction policies: Deallocate (stop VM but keep disk) or Delete (remove VM and disk).

# Create an Azure Spot VM
az vm create \
  --resource-group batch-rg \
  --name spot-worker-1 \
  --image Ubuntu2204 \
  --size Standard_D4s_v3 \
  --priority Spot \
  --eviction-policy Delete \
  --max-price -1

# Create a VMSS with Spot priority
az vmss create \
  --resource-group batch-rg \
  --name spot-vmss \
  --image Ubuntu2204 \
  --instance-count 5 \
  --vm-sku Standard_D4s_v3 \
  --priority Spot \
  --eviction-policy Delete \
  --max-price -1 \
  --single-placement-group false

Azure eviction policy choices:

  • Deallocate: VM stops, disk persists, restart later (preserves state)
  • Delete: VM and disks removed (best for stateless, lowest cost)
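
Whichever policy is chosen, an Azure VM can learn about an upcoming eviction through the Scheduled Events metadata endpoint, which reports it as a `Preempt` event. The sketch below separates fetching from parsing so the parsing can be tried anywhere; the sample document and its values are made up for illustration, and the API version shown is an assumption you may need to update.

```python
import json
import urllib.request

# Azure Instance Metadata Service endpoint for Scheduled Events.
EVENTS_URL = "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"


def fetch_scheduled_events(url=EVENTS_URL, timeout=5):
    """Fetch the Scheduled Events document (only works from inside an Azure VM)."""
    request = urllib.request.Request(url, headers={"Metadata": "true"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def preempt_events(document):
    """Return the events that announce a Spot eviction."""
    return [e for e in document.get("Events", []) if e.get("EventType") == "Preempt"]


# Sample document shaped like a Scheduled Events response; values are made up.
sample = {
    "DocumentIncarnation": 1,
    "Events": [
        {
            "EventId": "example-event-id",
            "EventType": "Preempt",
            "ResourceType": "VirtualMachine",
            "Resources": ["spot-worker-1"],
            "EventStatus": "Scheduled",
            "NotBefore": "Fri, 19 Jun 2026 12:00:30 GMT",
        }
    ],
}

for event in preempt_events(sample):
    print(f"Eviction scheduled for {event['Resources']} not before {event['NotBefore']}")
```

On a real VM, a worker would call `fetch_scheduled_events()` every few seconds and start saving state as soon as `preempt_events()` returns a non-empty list.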

3. GCP Preemptible and Spot VMs

GCP offers two types of interruptible VMs:

Type          Max Runtime   Discount   Termination Notice
Preemptible   24 hours      60-91%     30 seconds
Spot          None          60-91%     30 seconds

# Create a GCP Spot VM
gcloud compute instances create spot-worker-1 \
  --zone us-central1-a \
  --machine-type e2-standard-4 \
  --provisioning-model=SPOT \
  --instance-termination-action=STOP

# Create a preemptible VM
gcloud compute instances create preemptible-worker-1 \
  --zone us-central1-a \
  --machine-type e2-standard-4 \
  --preemptible

# Create a Spot VM that is deleted on preemption and runs for at most 4 hours
gcloud compute instances create resilient-worker \
  --zone us-central1-a \
  --machine-type e2-standard-4 \
  --provisioning-model=SPOT \
  --instance-termination-action=DELETE \
  --max-run-duration=4h
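
Because the notice on GCP is only about 30 seconds, a worker has to notice preemption quickly. One way to do that, sketched here under the assumption that the code runs on the VM itself, is to read the `preempted` value from the metadata server; the function names are illustrative.

```python
import urllib.request

# GCP metadata server path that reports whether this VM has been preempted.
PREEMPTED_URL = "http://metadata.google.internal/computeMetadata/v1/instance/preempted"


def parse_preempted(body):
    """The endpoint returns the text TRUE or FALSE."""
    return body.strip().upper() == "TRUE"


def wait_for_preemption(url=PREEMPTED_URL):
    """Block until the value changes (only works from inside a GCP VM)."""
    request = urllib.request.Request(
        url + "?wait_for_change=true",
        headers={"Metadata-Flavor": "Google"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_preempted(response.read().decode())


print(parse_preempted("FALSE"))   # False
print(parse_preempted("TRUE\n"))  # True
```

Using `wait_for_change=true` turns the request into a long poll, so the worker does not need a tight polling loop.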

4. Interruption Handling and Graceful Shutdown

Spot instances receive a termination notice — handle it to save work and exit cleanly.

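# AWS: Poll the instance metadata service (IMDSv2) for a spot termination notice
import time

import requests

IMDS = "http://169.254.169.254/latest"

def imds_token():
    # IMDSv2 needs a session token; instances that enforce it reject plain GETs with 401
    resp = requests.put(
        f"{IMDS}/api/token",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
        timeout=2,
    )
    resp.raise_for_status()
    return resp.text

def check_termination():
    url = f"{IMDS}/meta-data/spot/termination-time"
    while True:
        try:
            resp = requests.get(
                url, headers={"X-aws-ec2-metadata-token": imds_token()}, timeout=5
            )
            # 404 means no interruption is scheduled; 200 carries the termination time
            if resp.status_code == 200:
                print(f"Termination at: {resp.text}")
                save_checkpoint()
                return
        except requests.exceptions.RequestException:
            pass  # metadata service briefly unreachable; retry on the next poll
        time.sleep(5)

def save_checkpoint():
    print("Saving checkpoint before termination...")
    # Save state, upload results, drain connections

if __name__ == "__main__":
    check_termination()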

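# GCP: the metadata server exposes a similar flag (returns TRUE once the VM is preempted)
# curl -H "Metadata-Flavor: Google" "http://metadata.google.internal/computeMetadata/v1/instance/preempted?wait_for_change=true"

# Kubernetes: Handle spot interruption with node lifecycle handler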
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: spot-handler
  namespace: kube-system
data:
  spot-handler.sh: |
    #!/bin/bash
    kubectl cordon \$NODE_NAME
    kubectl drain \$NODE_NAME --ignore-daemonsets --delete-emptydir-data
EOF
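
When a node is drained, its pods are evicted and their containers receive SIGTERM, so the application also needs to react to that signal. A minimal sketch of a worker loop that stops between tasks is shown below; `ShutdownFlag` and `run_worker` are hypothetical names, and the signal is simulated so the example runs anywhere.

```python
import signal


class ShutdownFlag:
    """Records that a termination signal arrived so the loop can stop cleanly."""

    def __init__(self):
        self.requested = False

    def handle(self, signum, frame):
        self.requested = True

    def install(self):
        # Call once from the main thread at startup.
        signal.signal(signal.SIGTERM, self.handle)


def run_worker(tasks, flag, handle_task):
    """Process tasks until done or until shutdown is requested; return what is left."""
    remaining = list(tasks)
    while remaining and not flag.requested:
        handle_task(remaining.pop(0))
    return remaining


flag = ShutdownFlag()
done = []


def handle_task(task):
    done.append(task)
    if task == "b":
        # Simulate the signal arriving while task "b" is in flight.
        flag.handle(signal.SIGTERM, None)


leftover = run_worker(["a", "b", "c"], flag, handle_task)
print(done, leftover)  # ['a', 'b'] ['c']
```

In a real worker, `flag.install()` would be called at startup and the leftover tasks returned to the queue or recorded in a checkpoint before the process exits.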

5. Checkpointing for Long-Running Jobs

Batch processing workloads must save progress periodically so they resume from the last checkpoint after interruption.

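# checkpoint_worker.py
import json
import os

# Assumption: in production this path points at storage that outlives the instance
# (a network file system, a persistent disk, or a file synced to object storage).
# A local /tmp file is lost when the VM is deleted.
CHECKPOINT_FILE = os.environ.get("CHECKPOINT_FILE", "/tmp/checkpoint.json")

def process(task):
    # Placeholder for the real work
    print(f"Processing {task['data']}")

class BatchProcessor:
    def __init__(self, tasks):
        self.tasks = tasks
        self.completed = self.load_checkpoint()

    def load_checkpoint(self):
        # JSON replaces dill here: the checkpoint is a plain list of task IDs
        if os.path.exists(CHECKPOINT_FILE):
            with open(CHECKPOINT_FILE) as f:
                return set(json.load(f))
        return set()

    def save_checkpoint(self, task_id):
        self.completed.add(task_id)
        # Write to a temp file, then rename, so an interruption mid-write
        # cannot leave a corrupt checkpoint behind
        tmp_path = CHECKPOINT_FILE + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(sorted(self.completed), f)
        os.replace(tmp_path, CHECKPOINT_FILE)

    def run(self):
        for task in self.tasks:
            if task["id"] in self.completed:
                continue  # finished before the last interruption
            process(task)
            self.save_checkpoint(task["id"])

if __name__ == "__main__":
    processor = BatchProcessor([{"id": i, "data": f"item-{i}"} for i in range(1000)])
    processor.run()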
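
To see the resume behaviour without waiting for a real interruption, the idea can be exercised with a simulated one. This is a self-contained sketch, not the worker above: `run_batch` and its `stop_after` parameter are hypothetical, and the checkpoint lives in a temporary directory.

```python
import json
import os
import tempfile


def run_batch(task_ids, checkpoint_path, processed, stop_after=None):
    """Process tasks, skipping finished ones; stop_after simulates an interruption."""
    completed = set()
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            completed = set(json.load(f))

    handled = 0
    for task_id in task_ids:
        if task_id in completed:
            continue
        if stop_after is not None and handled >= stop_after:
            return False  # "interrupted" before finishing
        processed.append(task_id)  # stand-in for the real work
        completed.add(task_id)
        with open(checkpoint_path, "w") as f:
            json.dump(sorted(completed), f)
        handled += 1
    return True


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "checkpoint.json")
    processed = []
    first = run_batch(range(10), path, processed, stop_after=4)  # interrupted
    second = run_batch(range(10), path, processed)               # resumed
    print(first, second, processed)
```

The second run picks up at task 4, so every task is processed exactly once across both runs.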

6. Fault-Tolerant Workload Design

Architectures that work well on spot:

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│ Load         │────▶│ Spot Worker  │────▶│ Queue/Storage│
│ Balancer     │     │ Pool (Auto   │     │ (Persistent) │
│              │     │ Scaled)      │     │              │
└──────────────┘     └──────────────┘     └──────────────┘
     │                      │                     │
     │                      ▼                     │
     │              ┌──────────────┐              │
     │              │ Termination  │              │
     └──────────────▶ Handler      │──────────────┘
                     └──────────────┘
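
The persistent queue is what makes this design tolerate interruptions: a task taken by a worker is only hidden temporarily and reappears if the worker never confirms it. A minimal in-memory sketch of that lease pattern follows; `LeaseQueue` is a hypothetical stand-in for a managed queue service, and time is passed in explicitly to keep the example deterministic.

```python
class LeaseQueue:
    """In-memory queue where a received task must be acknowledged before its lease expires."""

    def __init__(self, tasks, lease_seconds):
        self.pending = list(tasks)
        self.lease_seconds = lease_seconds
        self.in_flight = {}  # task -> lease expiry time

    def receive(self, now):
        # Return expired leases to the queue first.
        for task, expiry in list(self.in_flight.items()):
            if expiry <= now:
                del self.in_flight[task]
                self.pending.append(task)
        if not self.pending:
            return None
        task = self.pending.pop(0)
        self.in_flight[task] = now + self.lease_seconds
        return task

    def ack(self, task):
        # Acknowledging removes the task for good.
        self.in_flight.pop(task, None)


queue = LeaseQueue(["job-1", "job-2"], lease_seconds=60)

first = queue.receive(now=0)   # worker A takes job-1
second = queue.receive(now=1)  # worker B takes job-2
queue.ack(second)              # worker B finishes
# Worker A is interrupted and never acks; once the lease expires job-1 reappears.
retry = queue.receive(now=61)
print(first, second, retry)  # job-1 job-2 job-1
```

Because an unacknowledged task can be delivered twice, task handlers in this design should be idempotent.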

7. Spot Orchestration Tools

Spot.io (NetApp) automates spot instance management across AWS, Azure, and GCP:

# Spot.io Elastigroup example (illustrative; check the Spot CLI reference for exact command and flag names)
spotinst elastigroup create \
  --name "batch-workers" \
  --provider "aws" \
  --region "us-east-1" \
  --strategy '{"risk": 100, "fallbackToOd": true}' \
  --capacity '{"minimum": 2, "maximum": 20, "target": 5}'

Common Mistakes

  1. No interruption handling: Applications crash and lose data when instances are terminated. Always implement termination notice listeners and checkpoints.

  2. Using spot for stateful workloads: Databases, message queues, and stateful apps should never run on spot. They lose data on termination.

  3. Setting the maximum price too low: With AWS, you pay the current spot price, not your maximum. A low maximum only increases interruptions. Leaving it at the on-demand price is safe, because you will never pay more than on-demand.

  4. Single instance type / AZ: If that instance type has no spot capacity, your workload won’t run. Diversify across instance families and availability zones.

  5. No fallback to on-demand: Always configure a fallback strategy — when spot is unavailable, launch on-demand to maintain availability.
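
The fallback in item 5 comes down to simple arithmetic: fill as much of the target as spot can supply and cover the shortfall with on-demand. A minimal sketch, with `plan_capacity` as a hypothetical helper name:

```python
def plan_capacity(target, spot_available):
    """Fill the target from spot first and cover any shortfall with on-demand."""
    if target < 0 or spot_available < 0:
        raise ValueError("capacities must be non-negative")
    spot = min(target, spot_available)
    return {"spot": spot, "on_demand": target - spot}


print(plan_capacity(target=20, spot_available=20))  # {'spot': 20, 'on_demand': 0}
print(plan_capacity(target=20, spot_available=12))  # {'spot': 12, 'on_demand': 8}
print(plan_capacity(target=20, spot_available=0))   # {'spot': 0, 'on_demand': 20}
```

Managed options such as fleets and autoscaling groups can apply this logic for you; the sketch only shows what the setting is expected to do.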

Practice Questions

  1. What is the difference between GCP Preemptible and GCP Spot VMs? Answer: Preemptible VMs have a 24-hour max runtime. Spot VMs have no max runtime and are the recommended option. Both offer similar discounts.

  2. How do you handle spot instance termination gracefully? Answer: Listen to the termination notice endpoint (metadata) and execute a shutdown script: save checkpoints, drain connections, upload results, and terminate.

  3. What workloads are best suited for spot instances? Answer: Stateless, fault-tolerant workloads: batch processing, CI/CD agents, render farms, web workers, testing environments, and data analysis pipelines.

  4. Can you mix spot and on-demand in a Spot Fleet? Answer: Yes. Set an on-demand target capacity alongside the spot target (e.g., 14 spot and 6 on-demand out of 20, a 70/30 split) to maintain availability while maximizing savings.

Challenge

Design a CI/CD build cluster on spot instances: AWS Spot Fleet with 5 instance types across 3 AZs, termination handler that drains jobs gracefully, checkpoints build artifacts to S3 every 5 minutes, fallback to on-demand when spot price exceeds $0.10/hr, and autoscaling from 0 to 50 workers based on queue depth.

FAQ

How much can I save with spot instances?
: 60-91% compared to on-demand. Most users pay 70-80% less.
Can spot instances be used for web servers?
: Only if they’re stateless, behind a load balancer, and can handle being replaced. Session data must be externalized to Redis or a database.
What happens when a spot instance is terminated?
: AWS sends a 2-minute termination notice. Azure sends a 30-second notice. GCP sends a 30-second notice. Use these to drain traffic and save state.
How do I get notified before spot termination?
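: AWS: 169.254.169.254/latest/meta-data/spot/termination-time (or spot/instance-action). Azure: Scheduled Events at 169.254.169.254/metadata/scheduledevents, which reports a Preempt event. GCP: metadata.google.internal/computeMetadata/v1/instance/preempted.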
Is spot pricing stable?
: No. Spot prices fluctuate with supply and demand. Diversify across instance types, and on AWS leave the maximum price at the on-demand rate so you never pay more than on-demand.

What’s Next

Topic                                 Description
Reserved Instances & Savings Plans    Save 40-70% with commitment-based discounts
Kubernetes Cost Optimization          Reduce K8s infrastructure spend

Related topics: Cloud Cost Optimization, AWS, Azure, GCP

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
