Cloud GPU Providers with Spot / Preemptible Instances

Spot or preemptible GPU instances are typically priced 50-90% below on-demand rates, in exchange for the possibility that the provider reclaims them, most often when demand for capacity is high. They suit fault-tolerant workloads such as distributed training with checkpointing, batch inference, and hyperparameter sweeps. This guide lists cloud GPU providers that offer spot pricing so you can compare them on price, hardware, and billing.

Updated June 2026. Showing 4 GPU providers.

Provider	Trustpilot Rating	Trustpilot Reviews	Review Change (7d / 30d / 90d)	Starting Price	Max VRAM	Billing
Vast.ai	4.1	237	+0 / +8 / +26	$0.06/hr	192 GB	Per-second
RunPod	3.4	245	+1 / +13 / +37	$0.06/hr	288 GB	Per-second
Novita AI	2.9	not listed	+0 / +0 / +2	$0.11/hr	80 GB	Per-second
Vultr	1.7	557	+1 / +4 / +19	$0.47/hr	288 GB	Per-hour

What spot and preemptible GPU instances actually are

A spot or preemptible GPU instance is rented from a provider’s pool of spare capacity at a steep discount in exchange for one critical condition: the provider can reclaim the machine at any time, usually with little or no warning. The hardware is identical to the on-demand version of the same GPU — the same VRAM, the same tensor cores, the same interconnect — but the contract around its availability is different. You are buying compute that is cheap precisely because it is interruptible. Every provider in the list above marked as offering spot or preemptible capacity exposes this trade in some form, though the names differ: spot, preemptible, interruptible, community, or surplus instances all describe the same underlying idea.

The discount exists because data centers rarely run at 100% utilization. Idle GPUs earn nothing, so providers sell that slack at a fraction of the standard rate and accept that they may need to take it back the moment a full-price customer wants it, or when their own scheduling needs change. For the renter, that means the headline savings are real, but they come with operational obligations you do not have on a guaranteed on-demand node.

Why interruptible pricing matters for real workloads

Spot capacity is worth understanding because GPU rental is expensive, and the discount on interruptible instances is often large enough to change which projects are economically viable. The catch is that not every workload tolerates being killed mid-run. The deciding factor is almost always how well your job checkpoints and resumes.

Excellent fit: long training and fine-tuning runs that save checkpoints to durable storage every few minutes, large batch-inference or embedding jobs, offline rendering, hyperparameter sweeps where individual trials are independent, and any pipeline already built around fault tolerance.
Poor fit: real-time or low-latency inference serving a live application, interactive development sessions where losing the box means losing unsaved work, and tightly synchronized multi-GPU training that cannot recover gracefully when one node disappears.

The mental model is simple: if losing the instance costs you only the minutes since your last checkpoint, spot is almost always the right call. If losing it costs you a request, a customer, or hours of unsaved state, the on-demand premium buys you peace of mind that is worth paying for.
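To make that mental model concrete, here is a minimal Python sketch of a checkpoint-and-resume loop. It is an illustration under assumptions, not any provider's API: the function names, the JSON state, and the stop_at parameter (which simulates a reclaim) are all hypothetical, and the checkpoint path is assumed to sit on storage that survives the instance.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so a reclaim mid-write cannot corrupt the file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # atomic rename within the same filesystem


def load_checkpoint(path):
    """Return the last saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def run(path, total_steps, checkpoint_every, stop_at=None):
    """Toy training loop; stop_at (hypothetical) simulates a reclaim at that step."""
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        if stop_at is not None and state["step"] == stop_at:
            return state  # instance reclaimed: progress since last save is lost
        state["step"] += 1
        state["total"] += state["step"]  # stand-in for real work
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        ckpt = os.path.join(d, "ckpt.json")
        run(ckpt, total_steps=100, checkpoint_every=10, stop_at=37)
        resumed_from = load_checkpoint(ckpt)["step"]
        final = run(ckpt, total_steps=100, checkpoint_every=10)
        print(resumed_from, final["step"])
```

In this toy run the simulated reclaim happens at step 37, the job resumes from the checkpoint at step 30, and only seven steps are repeated. A shorter checkpoint interval shrinks that loss at the cost of more frequent writes.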

The trade-offs to weigh

Interruption is the obvious cost, but it is not the only one. When you compare providers on this dimension, keep the full picture in mind:

Reclaim behavior: some providers give a short termination notice (often a couple of minutes) so your job can save state and exit cleanly; others can pull the machine instantly. A grace period is enormously valuable because it lets you trigger a final checkpoint.
Availability variance: spot pools fluctuate. The exact GPU you want at the price you saw can be unavailable for stretches, and the most in-demand accelerators are reclaimed more aggressively than older or less popular cards.
Storage that outlives the instance: if your checkpoints live only on the instance’s local disk, an interruption wipes them. Spot only works safely when your data and checkpoints sit on persistent or network storage that survives the node.
Restart friction: after a reclaim you must re-acquire capacity, re-pull your container image and data, and resume — so cold-start time and image size affect your effective throughput and cost.
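The value of a grace period can be sketched in Python as follows. This assumes the termination notice reaches your process as SIGTERM, which is common for containers but is an assumption here; some providers signal a reclaim through a metadata endpoint or not at all, so check your provider's documentation. The names StopFlag, install_handlers, and train are hypothetical.

```python
import signal


class StopFlag:
    """Holds whether a termination notice has been received."""

    def __init__(self):
        self.requested = False


stop_flag = StopFlag()


def on_termination_notice(signum, frame):
    """Signal handler: only record the request; the loop does the real work."""
    stop_flag.requested = True


def install_handlers():
    # Assumption: the provider delivers SIGTERM (or SIGINT) before reclaiming.
    # signal.signal must be called from the main thread.
    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, on_termination_notice)


def train(max_steps, do_step, save_checkpoint):
    """Run until finished or until a notice arrives, then save one last time."""
    step = 0
    while step < max_steps and not stop_flag.requested:
        step += 1
        do_step(step)  # stand-in for one training step
    save_checkpoint(step)  # final checkpoint, written inside the grace period
    return step


# Typical use at job start:
#   install_handlers()
#   train(max_steps, do_step, save_checkpoint)
```

The handler deliberately does nothing except set a flag, so the checkpoint is written from the main loop at a step boundary rather than in the middle of an update. The final save has to finish within the notice window, which is one reason checkpoint size and storage throughput matter.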

What to check before renting spot capacity

Because the same word can mean different things across providers, use the comparison above to confirm the specifics rather than assuming. Before committing a workload to interruptible instances, work through this checklist:

Notice window: does the provider warn you before reclaiming, and how long is the grace period? Even 30–120 seconds changes how you design your checkpointing.
Reclaim aggressiveness: are spot machines taken back only under genuine capacity pressure, or also for routine rebalancing? Frequent, low-pressure reclaims erode the savings.
Checkpoint plumbing: can you write checkpoints to durable object or network storage cheaply, and is egress to retrieve them reasonable? This is the single most important enabler of safe spot use.
Automatic re-acquisition: does the platform automatically requeue and restart your job when capacity returns, or must you script that yourself? Managed requeue makes spot far less hands-on.
Multi-GPU and multi-node behavior: if you need several GPUs together, losing one can stall the whole job. Check whether the provider can hold a group atomically or only offers single-GPU spot.
Billing granularity: per-second or per-minute billing pairs well with spot because you only pay for the time you actually ran before a reclaim, rather than rounding up.
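The billing-granularity point is easy to check with arithmetic. The Python sketch below assumes a simple model in which usage is rounded up to whole billing units; the function name, the 40-minute runtime, and the hourly rate are illustrative and are not a quote from any provider.

```python
import math


def billed_cost(runtime_seconds, hourly_rate, billing_unit_seconds):
    """Cost when usage is rounded up to whole billing units."""
    units = math.ceil(runtime_seconds / billing_unit_seconds)
    return units * billing_unit_seconds * hourly_rate / 3600


runtime = 40 * 60  # a run reclaimed after 40 minutes
rate = 0.47        # illustrative hourly rate in dollars

per_second = billed_cost(runtime, rate, 1)
per_hour = billed_cost(runtime, rate, 3600)
print(f"per-second: ${per_second:.4f}, per-hour: ${per_hour:.4f}")
```

Under these assumptions the reclaimed run costs about $0.31 with per-second billing and the full $0.47 with per-hour billing. The gap grows with the number of short, interrupted runs.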

A practical pattern many teams adopt is a hybrid setup: run the bulk of throughput-oriented, checkpointable work on spot to capture the discount, while keeping a small on-demand footprint for anything latency-sensitive or stateful. That blend captures most of the savings without exposing the parts of your pipeline that genuinely cannot tolerate interruption.
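One way to express that hybrid policy is a small routing rule. The Python sketch below is a hypothetical illustration: the Job fields and the choose_capacity function are made-up names, and a real scheduler would also weigh price, availability, and deadlines.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    checkpointable: bool
    latency_sensitive: bool = False
    holds_unsaved_state: bool = False


def choose_capacity(job):
    """Return 'spot' only when an interruption costs little, else 'on-demand'."""
    if job.latency_sensitive or job.holds_unsaved_state:
        return "on-demand"
    return "spot" if job.checkpointable else "on-demand"


jobs = [
    Job("fine-tune", checkpointable=True),
    Job("live-inference", checkpointable=False, latency_sensitive=True),
    Job("notebook-session", checkpointable=False, holds_unsaved_state=True),
]
for job in jobs:
    print(job.name, "->", choose_capacity(job))
```

The rule defaults to on-demand whenever there is doubt, so a job only lands on spot when it has been explicitly marked as safe to interrupt.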

Frequently asked questions

Will I lose my work when a spot GPU instance is reclaimed?

You lose any state that exists only on the instance at the moment it is reclaimed — including unsaved progress and anything on local-only disk. You do not lose work that you have already written to persistent or network storage. This is why frequent checkpointing to durable storage is the core discipline of using spot capacity safely; with good checkpointing you lose at most the few minutes since your last save.

Is the GPU hardware different on spot versus on-demand instances?

No. Spot and on-demand instances draw from the same physical hardware, so the GPU, its VRAM, its tensor cores, and its interconnect are identical. The only difference is the contract around availability and price: spot is cheaper but interruptible, while on-demand costs more and is not reclaimed out from under you. You are paying for guaranteed continuity, not for faster silicon.

How much can spot instances actually save compared to on-demand?

The discount is typically substantial and is the main reason to choose interruptible capacity, but the exact figure varies by provider, GPU model, region, and current demand, and it moves constantly. Rather than rely on a single number, check the live comparison above for the current spot versus on-demand spread on the specific GPU you want.
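When you compare the spread, it helps to account for time lost to reclaims. The Python sketch below uses a deliberately simple model, assumed for illustration: each interruption wastes a fixed number of minutes (work since the last checkpoint plus restart time), and you pay the spot rate for that wasted time. All rates and interruption figures are made-up inputs, not measurements.

```python
def effective_hourly_rate(spot_rate, interruptions_per_hour,
                          wasted_minutes_per_interruption):
    """Spot price per hour of useful work, after time lost to reclaims."""
    useful_fraction = 1 - interruptions_per_hour * wasted_minutes_per_interruption / 60
    if useful_fraction <= 0:
        raise ValueError("interruptions consume all of the runtime")
    return spot_rate / useful_fraction


def effective_savings(on_demand_rate, spot_rate, interruptions_per_hour,
                      wasted_minutes_per_interruption):
    """Fractional saving versus on-demand once interruption overhead is included."""
    effective = effective_hourly_rate(
        spot_rate, interruptions_per_hour, wasted_minutes_per_interruption
    )
    return 1 - effective / on_demand_rate


# Illustrative numbers only, not quotes from any provider.
headline = 1 - 0.80 / 2.00
realized = effective_savings(
    2.00, 0.80, interruptions_per_hour=0.5, wasted_minutes_per_interruption=12
)
print(f"headline: {headline:.0%}, realized: {realized:.0%}")
```

With these inputs a 60% headline discount becomes roughly 56% after overhead. The realized figure falls quickly as interruptions become more frequent or each restart gets slower.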

Which workloads should never run on spot instances?

Avoid spot for anything that cannot survive a sudden disappearance: live, low-latency inference behind a production application, interactive sessions holding unsaved work, and tightly coupled multi-GPU jobs that cannot recover when one node is lost. For those, the on-demand premium is worth it. Everything that checkpoints cleanly and tolerates restarts — training, fine-tuning, batch inference, rendering, and sweeps — is well suited to spot.

Vast.ai vs RunPod - Comparison of Top Firms in This Guide

Vast.ai vs RunPod - GPU Provider Comparison (June 2026)

Head-to-head comparison of Vast.ai and RunPod. Compare GPU models, hourly pricing, billing granularity, spot instances, VRAM, infrastructure, developer tools, Kubernetes support, and compliance before choosing a provider. Data refreshed June 2026.

Bottom Line: Vast.ai vs RunPod

Vast.ai comes out ahead overall, leading in 4 of 5 compared categories.

Where Vast.ai leads

Trustpilot Rating (4.1 vs 3.4)
GPU Models (35 vs 30)
Regions (500+ locations and 40+ data centers vs 31 global regions)
Compliance (4 listed certifications vs 1)

Where RunPod leads

Max VRAM (GB) (288 vs 192)

Choose Vast.ai if Trustpilot rating, GPU model selection, or compliance coverage matters most. Choose RunPod if you need the larger maximum VRAM (288 GB vs 192 GB).

Frequently Asked Questions

Is Vast.ai or RunPod better?

Vast.ai leads in 4 of 5 compared categories. The right choice still depends on the factors that matter most to you.

Which has a better Trustpilot Rating, Vast.ai or RunPod?

Vast.ai (4.1 vs 3.4).

Which offers more maximum VRAM, Vast.ai or RunPod?

RunPod (288 GB vs 192 GB).

	Vast.ai	RunPod
Overview
Trustpilot Rating	4.1	3.4
Headquarters	United States	United States
Provider Type	GPU Marketplace	GPU-Focused
Best For	AI training, inference, fine-tuning, Stable Diffusion, batch processing, research, LLM serving, generative AI	AI training, inference, fine-tuning, Stable Diffusion, batch processing, rendering, research, LLM serving, generative AI
GPU Hardware
GPU Models	B200, H200, H100 SXM, H100 NVL, A100 SXM, A100 PCIe, RTX 5090, RTX 5080, RTX 5070 Ti, RTX 6000 Pro, RTX 6000 Ada, RTX 4500 Ada, RTX A6000, RTX A5000, RTX A4000, L40S, L40, A40, A10, RTX 4090, RTX 4080, RTX 4070 Ti, RTX 4070, RTX 4060 Ti, RTX 4060, RTX 3090 Ti, RTX 3090, RTX 3080 Ti, RTX 3080, RTX 3070 Ti, RTX 3070, Tesla V100, Tesla T4, A2, GTX 1080	B300, B200, H200, H100 SXM, H100 PCIe, H100 NVL, MI300X, A100 SXM, A100 PCIe, RTX 5090, RTX PRO 6000, L40S, L40, RTX 6000 Ada, RTX 5000 Ada, RTX A6000, RTX A5000, RTX 4090, RTX 4080 SUPER, RTX 4080, RTX 4070 Ti, RTX 3090 Ti, RTX 3090, RTX 3080 Ti, RTX 3080, RTX 3070, A40, A30, A2, L4
Max VRAM (GB)	192	288
Max GPUs/Instance	8	8
Interconnect	NVLink, InfiniBand	NVLink
Pricing
Starting Price ($/hr)	$0.06/hr	$0.06/hr
Billing Granularity	Per-second	Per-second
Spot/Preemptible	Yes	Yes
Reserved Discounts	Up to 50% (1-6 month reserved)	15-29% (1-month to 1-year plans)
Free Credits	Small test credit on signup	$5-$500 bonus after first $10 spend
Egress Fees	Varies by host ($/TB)	None (Free)
Storage	Varies by host ($/GB/hr, charged while instance exists)	Container/Volume ($0.10/GB/mo), Idle Volume ($0.20/GB/mo), Network Storage ($0.07/GB/mo 1TB)
Infrastructure
Regions	500+ locations, 40+ data centers	31 global regions
Uptime SLA	No formal SLA (host reliability scores visible)	99.99%
Developer Experience
Frameworks	PyTorch, TensorFlow, CUDA, vLLM, ComfyUI	PyTorch, TensorFlow, JAX, ONNX, CUDA
Docker Support	Yes	Yes
SSH Access	Yes	Yes
Jupyter Notebooks	Yes	Yes
API / CLI	Yes	Yes
Setup Time	Seconds	Instant
Kubernetes Support	No	No
Business Terms
Min Commitment	None	None
Compliance	SOC 2 Type II, HIPAA, GDPR, CCPA	SOC 2 Type II

Firms in this guide: Vast.ai (Trustpilot rating 4.1, United States), RunPod (3.4, United States), Novita AI (2.9, United States), Vultr (1.7, United States).

