Cloud GPU Providers with Persistent Storage

Persistent storage ensures that your datasets, model checkpoints, and training outputs survive instance restarts and shutdowns. Without persistent storage, you would need to re-upload data every time you start a new GPU instance. This guide lists cloud GPU providers that offer persistent block or network storage attached to GPU instances.

Updated June 2026. Showing 7 GPU providers (persistent storage: yes).
Provider | HQ | Trustpilot Rating | Trustpilot Reviews | New Reviews (7d / 30d / 90d) | Starting Price | Max VRAM | Max GPUs | Billing
Cherry Servers | Lithuania | 4.6 | 146 | +0 / +1 / +8 | $0.16/hr | 80 GB | 2 | Per-hour
DigitalOcean | United States | 4.6 | 2,427 | +13 / +47 / +141 | $0.76/hr | 192 GB | 8 | Per-second
Vast.ai | United States | 4.1 | 237 | +0 / +8 / +26 | $0.06/hr | 192 GB | 8 | Per-second
Latitude.sh | Brazil | 3.7 | 3 | +0 / +0 / +0 | $0.35/hr | 96 GB | 8 | Per-hour
RunPod | United States | 3.4 | 245 | +1 / +13 / +36 | $0.06/hr | 288 GB | 8 | Per-second
Novita AI | United States | 2.9 | 7 | +0 / +0 / +2 | $0.11/hr | 80 GB | 8 | Per-second
Vultr | United States | 1.7 | 557 | +1 / +4 / +19 | $0.47/hr | 288 GB | 16 | Per-hour

What persistent storage means when you rent a cloud GPU

By default, a rented GPU instance gives you a working disk that lives and dies with the instance. The moment you stop, destroy, or get preempted off that machine, the local disk is wiped and the bytes are gone. Persistent storage breaks that coupling: it is a storage volume whose lifetime is independent of any single GPU instance, so your datasets, model checkpoints, conda environments, and cached weights survive a shutdown and reattach to the next machine you spin up. The providers in the comparison above all offer some form of this, but the implementations differ enough that “yes” is only the start of the answer.

In practice persistent storage shows up in two main shapes. The first is a network volume (block or filesystem storage) that you mount over the provider’s internal network and can attach to whatever GPU node you launch. The second is object storage (S3-compatible buckets) that you pull from at job start and push results back to. A few providers also keep a persistent home directory on a fast local NVMe pool that is decoupled from the compute lifecycle. Each behaves very differently for throughput, latency, and how you wire it into a training loop.

Why it matters for real GPU workflows

The reason persistent storage is worth filtering for is that GPU time is the expensive resource and you do not want to waste it re-downloading and re-preparing data. Concretely, it changes these workflows:

  • Long training and fine-tuning runs write checkpoints every few hundred steps. If those checkpoints live only on ephemeral disk, a crashed or preempted node means restarting from zero. Persistent storage lets you resume from the last checkpoint on a fresh GPU.
  • Spot and interruptible instances become genuinely usable. The whole economics of cheap preemptible GPUs depends on being able to lose the node without losing the work, and that only holds if your state lives on a volume that outlives the instance.
  • Large datasets (multi-hundred-GB image, video, or token corpora) are painful to re-stage on every launch. A persistent volume holds the prepared, sharded data so each new session starts in seconds rather than after a long copy.
  • Iterative development benefits from a stable home directory: your environment, installed packages, cached Hugging Face weights, and notebooks are still there tomorrow without rebuilding from a container image.
  • Inference serving can keep model weights warm on attached storage so a scaled-up replica loads quickly instead of pulling tens of GB from a remote bucket on cold start.
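
The checkpoint-and-resume pattern behind the first two points can be sketched as follows. This is a minimal illustration using only the Python standard library, not any provider's API: the function names, the JSON checkpoint format, and the idea that the directory passed in sits on a persistent volume are all assumptions made for the example. A real training job would save model and optimizer state with its framework's own tools.

```python
import json
import os
import tempfile
from pathlib import Path


def save_checkpoint(ckpt_dir: Path, step: int, state: dict) -> Path:
    """Write a checkpoint atomically so a preemption mid-write cannot corrupt it."""
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    final_path = ckpt_dir / f"step_{step:08d}.json"
    # Write to a temporary file on the same volume, then rename it into place.
    fd, tmp_name = tempfile.mkstemp(dir=ckpt_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"step": step, "state": state}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_name, final_path)
    return final_path


def load_latest_checkpoint(ckpt_dir: Path):
    """Return (step, state) from the newest checkpoint, or (0, None) if none exists."""
    if not ckpt_dir.is_dir():
        return 0, None
    # Zero-padded step numbers make lexical order match numeric order.
    candidates = sorted(ckpt_dir.glob("step_*.json"))
    if not candidates:
        return 0, None
    with candidates[-1].open() as f:
        payload = json.load(f)
    return payload["step"], payload["state"]


def train(ckpt_dir: Path, total_steps: int, save_every: int = 100) -> dict:
    """Toy loop that resumes from the last checkpoint found in ckpt_dir."""
    start_step, state = load_latest_checkpoint(ckpt_dir)
    state = state or {"loss": 1.0}
    for step in range(start_step + 1, total_steps + 1):
        state["loss"] *= 0.999  # placeholder for a real training step
        if step % save_every == 0:
            save_checkpoint(ckpt_dir, step, state)
    return state


# Hypothetical usage: point ckpt_dir at wherever your persistent volume is mounted.
# train(Path("/path/to/persistent/volume/checkpoints"), total_steps=10_000)
```

If the node disappears between saves, the next instance that mounts the same volume and calls `train` with the same directory picks up from the last completed checkpoint, so at most `save_every` steps of work are repeated.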

The trade-offs to weigh

Persistent storage is not free of cost or friction, and the differences between providers usually live in these trade-offs rather than in whether the feature exists at all.

  • You pay for it while idle. Compute billing stops when you shut a GPU down, but a persistent volume keeps billing for capacity (typically per GB-month) whether or not a GPU is attached. A large volume left around between projects becomes a quiet recurring charge.
  • Region and zone pinning. A network volume usually lives in one region or data center. If GPUs of the type you want are only available in another region, you may not be able to attach your volume there, and migrating it can incur egress fees and copy time.
  • Throughput and latency vary widely. Local NVMe scratch can deliver gigabytes per second; a network filesystem may be far slower and can bottleneck a data-hungry training loop. For high-throughput data pipelines this gap matters more than capacity.
  • Concurrency limits. Some block volumes attach to only one instance at a time, while shared filesystems and object storage allow many readers. Multi-node training generally needs a shared filesystem or object store, not a single-attach block device.
  • Egress and transfer fees. Reading inside the same provider region is usually cheap, but pulling data out to your laptop or another cloud can carry egress charges that dwarf the storage cost.
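
To put rough numbers on the idle-charge trade-off, the sketch below compares a month of capacity billing against the GPU time spent re-copying the same data onto a fresh instance. The $0.10 per GB-month rate and the $0.76/hr GPU rate are the DigitalOcean figures listed on this page (treating GB and GiB as interchangeable for a rough estimate); the 500 GB volume size and the 200 MB/s copy throughput are illustrative assumptions, and the function names are invented for the example.

```python
def idle_volume_cost(size_gb: float, price_per_gb_month: float, months: float) -> float:
    """Capacity charge for keeping a volume around, whether or not a GPU is attached."""
    return size_gb * price_per_gb_month * months


def restage_cost(size_gb: float, throughput_mb_s: float, gpu_price_per_hr: float) -> float:
    """GPU rental spent waiting while the data is copied back onto a fresh instance."""
    seconds = size_gb * 1000 / throughput_mb_s  # 1 GB taken as 1000 MB
    return seconds / 3600 * gpu_price_per_hr


def break_even_launches(price_per_gb_month: float, throughput_mb_s: float,
                        gpu_price_per_hr: float, size_gb: float = 1.0) -> float:
    """Launches per month at which one month of storage equals the re-staging spend."""
    monthly = idle_volume_cost(size_gb, price_per_gb_month, months=1)
    return monthly / restage_cost(size_gb, throughput_mb_s, gpu_price_per_hr)


if __name__ == "__main__":
    size_gb = 500            # assumed dataset size
    storage_rate = 0.10      # $/GB-month, from the comparison on this page
    gpu_rate = 0.76          # $/hr, from the comparison on this page
    copy_speed = 200         # MB/s, assumed sustained copy throughput

    print(f"Idle volume, 1 month: ${idle_volume_cost(size_gb, storage_rate, 1):.2f}")
    print(f"One re-stage:         ${restage_cost(size_gb, copy_speed, gpu_rate):.2f}")
    print(f"Break-even launches:  {break_even_launches(storage_rate, copy_speed, gpu_rate):.0f}")
```

With these inputs the month of idle storage comes to $50.00 and a single re-stage to about $0.53 of GPU time. The estimate deliberately ignores data preparation time, egress fees, and the cost of your own waiting, all of which push the balance toward keeping the volume, so treat it as a lower bound on the value of persistence rather than a verdict.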

What to check in the comparison above

When you read the list above as a shortlist of providers that support persistent storage, drill into the specifics rather than treating “yes” as uniform:

  1. Volume type and throughput — is it block, network filesystem, or object storage, and what real read/write bandwidth does it sustain under a training load?
  2. Pricing model — per GB-month for the volume, and whether you keep paying while no GPU is attached.
  3. Region coupling — can the volume attach to the GPU types and regions you actually need, including spot capacity?
  4. Capacity and limits — maximum volume size, snapshot support, and whether it can be shared across multiple nodes.
  5. Egress terms — what it costs to move data out, since that often decides total spend more than the storage line item.

Match those answers to your workload: a single long fine-tune wants reliable checkpoint persistence and resume; a heavy data pipeline wants raw throughput; a serverless or autoscaling inference fleet wants fast shared reads of warm weights. The right provider in the table is the one whose persistent storage shape fits your dominant pattern.
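
For the throughput question in particular, a quick measurement on a trial instance is often more informative than a spec sheet. The following is a rough sketch, assuming you can write a temporary file into the mounted volume; the function name and default sizes are made up for illustration, and it measures only sequential access from a single process.

```python
import os
import tempfile
import time
from pathlib import Path


def measure_sequential_throughput(directory: Path, size_mb: int = 256, chunk_mb: int = 4) -> dict:
    """Write then read one temporary file in `directory`; return MB/s for each direction."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    n_chunks = max(1, size_mb // chunk_mb)
    total_mb = n_chunks * chunk_mb
    fd, name = tempfile.mkstemp(dir=directory, suffix=".bench")
    path = Path(name)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # count the time needed to reach the storage device
        write_s = max(time.perf_counter() - start, 1e-9)

        start = time.perf_counter()
        with path.open("rb") as f:
            while f.read(len(chunk)):
                pass
        read_s = max(time.perf_counter() - start, 1e-9)
    finally:
        path.unlink(missing_ok=True)  # always clean up the benchmark file
    return {"write_mb_s": total_mb / write_s, "read_mb_s": total_mb / read_s}


# Hypothetical usage: replace the path with your volume's actual mount point.
# print(measure_sequential_throughput(Path("/path/to/persistent/volume")))
```

The read figure can be inflated by the operating system's page cache, because the file was just written on the same machine. For a fairer read number, use a file larger than the instance's RAM or read data written in an earlier session. Random small-file access, which many image datasets produce, usually performs worse than this sequential test suggests.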

Frequently asked questions

Does persistent storage keep my data if I stop the GPU instance?

Yes — that is precisely its purpose. A persistent volume is decoupled from the compute instance, so stopping or destroying the GPU node leaves the volume and its contents intact. You reattach it to the next instance you launch. Just remember that the volume itself usually keeps incurring a capacity charge while it exists, even with no GPU running.

Is persistent storage included in the GPU rental price?

Usually not. The hourly GPU rate covers compute and a base ephemeral disk, while persistent volumes are billed separately by capacity, typically per GB per month. Always treat storage as a distinct line item when estimating total cost, and check the live comparison above for how each provider prices it.
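
A simple way to keep the two line items separate is to estimate them independently and add them at the end. The sketch below also accounts for billing granularity. It assumes that per-hour billing charges every started hour in full and that per-second billing charges every started second; check each provider's terms, since neither rule is confirmed here. The function names are invented for the example, and the rates in the usage lines are the Cherry Servers starting price and block storage rate listed on this page.

```python
import math


def billed_hours(runtime_seconds: float, granularity: str) -> float:
    """Hours charged for one session under the assumed rounding rules."""
    if granularity == "per-second":
        return math.ceil(runtime_seconds) / 3600
    if granularity == "per-hour":
        return float(math.ceil(runtime_seconds / 3600))
    raise ValueError(f"unknown billing granularity: {granularity!r}")


def estimate_total(runtime_seconds: float, gpu_price_per_hr: float, granularity: str,
                   volume_gb: float, price_per_gb_month: float, months: float) -> dict:
    """Return compute and storage as separate line items plus their sum."""
    compute = billed_hours(runtime_seconds, granularity) * gpu_price_per_hr
    storage = volume_gb * price_per_gb_month * months
    return {"compute": compute, "storage": storage, "total": compute + storage}


if __name__ == "__main__":
    # A 90-minute session with a 200 GB volume kept for one month (assumed workload).
    estimate = estimate_total(
        runtime_seconds=90 * 60,
        gpu_price_per_hr=0.16,     # $/hr starting price listed on this page
        granularity="per-hour",
        volume_gb=200,
        price_per_gb_month=0.071,  # $/GB-month block storage rate listed on this page
        months=1,
    )
    for item, dollars in estimate.items():
        print(f"{item}: ${dollars:.2f}")
```

For short sessions on inexpensive GPUs, the storage line can easily exceed the compute line, which is why it deserves its own row in any estimate. Note that starting prices refer to each provider's cheapest GPU, so they are not a like-for-like hardware comparison.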

Can I use persistent storage with cheap spot or interruptible GPUs?

That is one of the best reasons to want it. Because the volume outlives any single node, you can be preempted off a spot instance, lose nothing, and resume from your last checkpoint on a new machine. Confirm that the provider allows attaching the volume to spot capacity in the same region where those GPUs are available.

What is the difference between persistent storage and object storage like S3?

Object storage is one way to make data persist, but you read and write it as buckets and objects over an API rather than mounting it as a local disk. A persistent block or filesystem volume behaves like an attached drive that your code reads directly. Object storage scales to very large capacities and can be shared across many nodes; mounted volumes usually offer lower latency for an active training loop. Many workflows use both: buckets for cold archives and a mounted volume for the live working set.

Cherry Servers vs DigitalOcean - Comparison of Top Firms in This Guide

Cherry Servers vs DigitalOcean - GPU Provider Comparison (June 2026)

Head-to-head comparison of Cherry Servers and DigitalOcean. Compare GPU models, hourly pricing, billing granularity, spot instances, VRAM, infrastructure, developer tools, Kubernetes support, and compliance before choosing a provider. Data refreshed June 2026.

Bottom Line: Cherry Servers vs DigitalOcean

Cherry Servers and DigitalOcean are closely matched — each leads in several categories, so the right pick depends on your priorities.

Where Cherry Servers leads

  • Starting price: $0.16/hr vs $0.76/hr
  • Uptime SLA: 99.97% vs 99%
  • Regions: 6 vs 5

Where DigitalOcean leads

  • Max VRAM: 192 GB vs 80 GB
  • Max GPUs per instance: 8 vs 2
  • Frameworks: 7 vs 3
  • Jupyter Notebooks: Yes vs No

Choose Cherry Servers if the lowest starting price matters most. Choose DigitalOcean if you need the highest maximum VRAM.

Frequently Asked Questions

Is Cherry Servers or DigitalOcean better?

It is close: Cherry Servers and DigitalOcean each lead in several categories. Compare the points that matter most to you below.

Which has the lower starting price, Cherry Servers or DigitalOcean?

Cherry Servers ($0.16/hr vs $0.76/hr).

Which offers more maximum VRAM, Cherry Servers or DigitalOcean?

DigitalOcean (192 GB vs 80 GB).

Cherry Servers
Bare metal GPU servers with 24 years of hosting experience and full hardware-level control.
DigitalOcean
Simple, scalable GPU cloud for AI/ML
Attribute | Cherry Servers | DigitalOcean

Overview
Trustpilot Rating | 4.6 | 4.6
Headquarters | Lithuania | United States
Provider Type | N/A | N/A
Best For | AI training, inference, fine-tuning, rendering, research, HPC, generative AI, deep learning | AI training, inference, fine-tuning, LLM deployment, LLM serving, computer vision, startups, generative AI, research

GPU Hardware
GPU Models | A100, A40, A16, A10, A2, Tesla P4 | RTX 4000 Ada, RTX 6000 Ada, L40S, MI300X, H100 SXM, H200
Max VRAM (GB) | 80 | 192
Max GPUs/Instance | 2 | 8
Interconnect | PCIe | NVLink

Pricing
Starting Price ($/hr) | $0.16/hr | $0.76/hr
Billing Granularity | Per-hour | Per-second
Spot/Preemptible | No | No
Reserved Discounts | N/A | N/A
Free Credits | None | $200 free credit for 60 days
Egress Fees | N/A | None (included in plan)
Storage | NVMe SSD, Elastic Block Storage ($0.071/GB/mo) | 500-720 GiB NVMe boot (included), 5 TiB NVMe scratch on larger configs, Volumes at $0.10/GiB/mo

Infrastructure
Regions | Lithuania, Netherlands, Germany, Sweden, US, Singapore (6 locations) | New York (NYC2), Toronto (TOR1), Atlanta (ATL1), Richmond (RIC1), Amsterdam (AMS3)
Uptime SLA | 99.97% | 99%

Developer Experience
Frameworks | PyTorch, TensorFlow, CUDA (bare metal, full stack control) | PyTorch, TensorFlow, Jupyter, Miniconda, CUDA, ROCm, Hugging Face
Docker Support | Yes | Yes
SSH Access | Yes | Yes
Jupyter Notebooks | No | Yes
API / CLI | Yes | Yes
Setup Time | Minutes | Minutes
Kubernetes Support | Yes | Yes

Business Terms
Min Commitment | None | None
Compliance | ISO 27001, ISO 20000-1, GDPR, PCI DSS | SOC 2 Type II, SOC 3, HIPAA (with BAA), CSA STAR Level 1
