Filtering for an on-demand price ceiling of $1.00 per hour draws a clear line through the cloud GPU market. Below this point you are renting consumer and prosumer-class accelerators, older datacenter cards, or fractional slices of larger GPUs. You are generally not renting current flagship datacenter silicon at full capacity, because those cards command a large multiple of this rate even on interruptible markets. The sub-$1 bracket is where hobbyists, students, indie developers, and cost-sensitive production inference workloads live, and it is large enough that the comparison above will usually return many options.

The defining trade-off at this tier is memory. Most cards you can rent under a dollar an hour use GDDR-class memory (GDDR6 or GDDR6X) rather than the HBM stacks found on training flagships. That means VRAM capacities clustered around 8 GB, 16 GB, and 24 GB, and memory bandwidth ranging from a few hundred GB/s to roughly 1 TB/s, rather than the multiple TB/s a modern HBM card delivers. For the workloads this tier suits, that is frequently enough, but it is the first spec you should read off the table.

What hardware typically lands here

The under-$1 list is usually populated by a few recognizable categories of accelerator:

Consumer and prosumer cards with 8–24 GB of GDDR6/GDDR6X, strong FP16/BF16 throughput and tensor cores, but limited or no NVLink and capped FP64 performance.
Previous-generation datacenter GPUs that have aged out of the premium bracket — cards with 16–24 GB that still offer ECC memory and reliable multi-tenant behavior, often the sweet spot for steady inference.
Fractional or time-sliced GPUs, where a provider partitions a larger card so you rent a guaranteed slice of memory and compute rather than the whole device.
Interruptible / spot instances of mid-range cards, where the low headline rate is contingent on the provider being able to reclaim the machine.

Because the bracket spans both whole small GPUs and slices of big ones, two listings at the same price can behave very differently. A whole 16 GB consumer card and a 16 GB MIG-style slice of a datacenter GPU may rent for similar money but differ sharply in bandwidth, isolation, and burst behavior.

Which workloads fit comfortably under $1/hour

This tier is genuinely well matched to:

Inference and serving of small-to-mid models: anything that fits in 8–24 GB at your chosen precision, especially with INT8, FP8, or 4-bit quantization to stretch VRAM.
Fine-tuning with parameter-efficient methods such as LoRA/QLoRA, which keep memory pressure low enough for a single mid-range card.
Prototyping, learning, and notebook experimentation, where you want a GPU on tap without committing to flagship rates.
Rendering, image and video generation, and light batch jobs that are compute-bound but fit in modest VRAM.

It is underpowered for large-model pretraining, multi-billion-parameter full-weight fine-tunes, and anything that needs to shard one model across several NVLink-connected GPUs. The moment a job requires high inter-GPU bandwidth or more VRAM than a single 24 GB card provides, you have outgrown this bracket and should look at the pricier tiers.
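
Whether a model fits in 8–24 GB at a given precision can be roughly estimated before booking. The sketch below is a back-of-the-envelope approximation, not a measurement: it adds weight memory to KV-cache memory and pads the sum with an assumed 10% overhead. The function name, the overhead fraction, and the model dimensions in the example are illustrative assumptions, not figures from any specific model or provider.

```python
def estimate_inference_vram_gb(
    params_billion,
    bytes_per_param,
    n_layers,
    n_kv_heads,
    head_dim,
    context_len,
    batch_size=1,
    kv_bytes_per_value=2,   # assumption: KV cache stored in FP16/BF16
    overhead_frac=0.10,     # assumption: 10% for activations, buffers, runtime
):
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes) for serving a transformer."""
    weight_bytes = params_billion * 1e9 * bytes_per_param
    # One key and one value vector per layer, per KV head, per token, per sequence.
    kv_cache_bytes = (
        2 * n_layers * n_kv_heads * head_dim
        * context_len * batch_size * kv_bytes_per_value
    )
    return (weight_bytes + kv_cache_bytes) * (1 + overhead_frac) / 1e9


# Hypothetical 7B-parameter model; these dimensions are made up for illustration.
model = dict(params_billion=7, n_layers=32, n_kv_heads=32,
             head_dim=128, context_len=4096)

fp16_gb = estimate_inference_vram_gb(bytes_per_param=2.0, **model)
int4_gb = estimate_inference_vram_gb(bytes_per_param=0.5, **model)

for label, need in [("FP16", fp16_gb), ("4-bit", int4_gb)]:
    fits = [size for size in (8, 16, 24) if need <= size]
    print(f"{label}: ~{need:.1f} GB needed, fits card sizes (GB): {fits}")
```

Under these assumptions the same model needs a 24 GB card at FP16 but clears an 8 GB card at 4-bit, which is why quantization matters so much at this tier. Real usage depends on the serving framework, so treat the result as a first filter and leave headroom.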

Reading the comparison above at this price point

Because everything here is inexpensive, small differences in the fine print dominate your real cost. Check these before booking:

On-demand vs interruptible. The lowest rates in this band are often spot/preemptible. That is fine for fault-tolerant batch and checkpointed training, but risky for a live endpoint that cannot tolerate eviction.
Billing granularity. Per-second or per-minute billing matters proportionally more when the hourly rate is low, because minimum charges and rounding can quietly double the effective cost of short jobs.
Storage and egress. At under a dollar an hour for compute, a few cents per GB of egress or a separate charge for persistent volumes can become the larger line item.
VRAM and precision support. Confirm the card supports the precision you plan to run (BF16/FP8/INT8) and that the VRAM clears your model plus KV cache or activation overhead.
Whole card vs slice. Verify whether the price is for a full GPU or a partition, since that changes bandwidth and noisy-neighbor exposure.
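
The billing and egress points in the checklist above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a simple model in which runtime is rounded up to a billing increment and egress is charged per GB. The function names, the $0.50/hour rate, and the $0.05/GB egress price are hypothetical values chosen for illustration, not quotes from any provider.

```python
import math


def billed_compute_cost(rate_per_hour, runtime_seconds,
                        increment_seconds=1, minimum_seconds=0):
    """Compute cost after applying a minimum charge and rounding up
    to the provider's billing increment (both are assumed billing rules)."""
    billable = max(runtime_seconds, minimum_seconds)
    billable = math.ceil(billable / increment_seconds) * increment_seconds
    return rate_per_hour * billable / 3600


def total_job_cost(rate_per_hour, runtime_seconds,
                   egress_gb=0.0, egress_per_gb=0.0, **billing):
    """Compute cost plus data egress for a single job."""
    compute = billed_compute_cost(rate_per_hour, runtime_seconds, **billing)
    return compute + egress_gb * egress_per_gb


rate = 0.50          # hypothetical $/hour
runtime = 30 * 60    # a 30-minute job, in seconds

per_second = billed_compute_cost(rate, runtime, increment_seconds=1)
per_hour = billed_compute_cost(rate, runtime, increment_seconds=3600)
with_egress = total_job_cost(rate, runtime, egress_gb=10,
                             egress_per_gb=0.05, increment_seconds=1)

print(f"per-second billing: ${per_second:.2f}")
print(f"hourly billing:     ${per_hour:.2f}")
print(f"per-second + 10 GB egress: ${with_egress:.2f}")
```

With these made-up numbers, hourly rounding doubles the compute bill for a 30-minute job, and 10 GB of egress costs more than the compute itself.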

How this contrasts with cheaper and pricier tiers

Drop materially below this ceiling and you are mostly looking at deeper spot discounts, older or smaller cards, and shorter availability windows — viable, but with more scheduling friction. Move above it and the character of the market changes: you start renting HBM-equipped datacenter GPUs with NVLink, much higher bandwidth, larger VRAM, and the ability to scale across nodes. The $1.00 line is therefore a meaningful inflection point — it roughly separates “single mid-range accelerator for inference and light tuning” from “serious training-class hardware.” Exact figures move constantly, so treat the table above as the source of truth for live pricing and current availability.

Frequently asked questions

Can I train a large language model for under $1/hour?

Not a large one from scratch. This tier suits inference, prototyping, and parameter-efficient fine-tuning like LoRA on small-to-mid models. Full pretraining or large full-weight fine-tunes need flagship HBM cards with high interconnect bandwidth, which sit well above this price ceiling — see the higher tiers for that hardware.

Why are some sub-$1 GPUs so much cheaper than others?

Three factors usually explain the spread: whether the instance is on-demand or interruptible spot capacity, whether you are renting a whole card or a fractional slice, and the card’s age and memory type. A current consumer card on-demand and an older datacenter card on spot can both land here for very different reasons, so read the VRAM, bandwidth, and availability columns together.

Is interruptible capacity safe to use at this price?

It is excellent for fault-tolerant, checkpointed work — batch rendering, offline inference, or training runs that save state frequently. It is risky for anything that must stay up, like a production endpoint, because the provider can reclaim the machine with little warning. Match the billing model to whether your job can survive an eviction.
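
What "saves state frequently" can look like in practice is sketched below, using only the Python standard library. It is a toy job, not a training framework: the function names, the JSON state layout, and the `stop_after` parameter (which simulates an eviction) are all assumptions made for illustration. The one idea worth carrying over is the atomic write, which ensures a reclaim in the middle of a save cannot leave a corrupt checkpoint.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write to a temp file, then rename, so the checkpoint is never half-written."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # atomic rename on the same filesystem


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_item": 0, "total": 0}


def run_job(items, path, checkpoint_every=10, stop_after=None):
    """Sum `items`, resuming from the last checkpoint.

    `stop_after` simulates an eviction after that many items in this run;
    anything processed since the last checkpoint is lost, as it would be
    on a reclaimed machine.
    """
    state = load_checkpoint(path)
    done_this_run = 0
    for i in range(state["next_item"], len(items)):
        if stop_after is not None and done_this_run >= stop_after:
            return None  # simulated eviction: no final save
        state["total"] += items[i]
        state["next_item"] = i + 1
        done_this_run += 1
        if state["next_item"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

A first call such as `run_job(items, "job.ckpt", stop_after=25)` stops partway through, and a second call with the same path picks up from the last saved multiple of `checkpoint_every` rather than from zero. The checkpoint interval is the knob: a shorter interval loses less work per eviction at the cost of more writes.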

What spec should I check first under $1/hour?

VRAM capacity. At this tier you are typically choosing between roughly 8, 16, and 24 GB of GDDR memory, and that number decides which models and batch sizes you can actually run. After VRAM, confirm precision support and whether billing is per-second, since both have outsized impact when the hourly rate is this low.

A16 vs L40S vs A40: top picks from this guide

	A16 Ampere · 64 GB	L40S Ada Lovelace · 48 GB	A40 Ampere · 48 GB
Specifications
Manufacturer	NVIDIA	NVIDIA	NVIDIA
Architecture	Ampere	Ada Lovelace	Ampere
VRAM	64 GB GDDR6	48 GB GDDR6	48 GB GDDR6
Bandwidth	800 GB/s	864 GB/s	696 GB/s
FP16 (Tensor)	72 TFLOPS	366 TFLOPS	150 TFLOPS
FP32	18 TFLOPS	91.6 TFLOPS	37.4 TFLOPS
TDP	250 W	350 W	300 W
Release year	2021	2023	2020
Segment	Datacenter	Datacenter	Datacenter
Cloud pricing
Cheapest on-demand	$0.47/hr	$0.55/hr	$0.30/hr
Providers	2	7	5
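
One way to read the comparison table is to turn its figures into cost ratios. The sketch below uses only the VRAM, FP16 tensor throughput, and cheapest on-demand prices listed in the table. The dictionary layout and function name are illustrative assumptions, peak spec-sheet TFLOPS are not sustained throughput, and prices move constantly, so this is a ranking aid rather than a benchmark.

```python
# Figures copied from the comparison table (cheapest on-demand price per hour).
gpus = {
    "A16":  {"vram_gb": 64, "fp16_tflops": 72,  "price_per_hour": 0.47},
    "L40S": {"vram_gb": 48, "fp16_tflops": 366, "price_per_hour": 0.55},
    "A40":  {"vram_gb": 48, "fp16_tflops": 150, "price_per_hour": 0.30},
}


def cost_ratios(spec):
    """Dollars per GB of VRAM per hour, and peak FP16 TFLOPS per dollar-hour."""
    return {
        "usd_per_gb_hour": spec["price_per_hour"] / spec["vram_gb"],
        "fp16_tflops_per_usd_hour": spec["fp16_tflops"] / spec["price_per_hour"],
    }


ratios = {name: cost_ratios(spec) for name, spec in gpus.items()}

cheapest_memory = min(ratios, key=lambda n: ratios[n]["usd_per_gb_hour"])
most_compute = max(ratios, key=lambda n: ratios[n]["fp16_tflops_per_usd_hour"])

for name, r in ratios.items():
    print(f"{name}: ${r['usd_per_gb_hour']:.4f}/GB-hr, "
          f"{r['fp16_tflops_per_usd_hour']:.0f} peak FP16 TFLOPS per $/hr")
print("Cheapest VRAM per hour:", cheapest_memory)
print("Most peak compute per dollar:", most_compute)
```

On the table's numbers, the card with the cheapest memory and the card with the most peak compute per dollar are not the same, so the better pick depends on whether the workload is memory-bound or compute-bound.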

Cheapest cloud GPUs under $1 per hour, June 2026

What the under-$1/hour tier actually buys you