Does NVIDIA GB200 Superchip support BF16 and FP8?

Answer

Yes. The NVIDIA GB200 Superchip is built on the Blackwell generation, whose Tensor Cores support both BF16 and FP8. The rest of the spec sheet reads: 384 GB of HBM3e memory, 16,000 GB/s memory bandwidth, 4,500 TFLOPS FP16, 150 TFLOPS FP32, 2,700 W power draw, released in 2024.

Memory is typically the constraint for real-time serving of large models. At 2 bytes per parameter, 384 GB holds the FP16 weights of a model of up to roughly 190 billion parameters before KV cache and activations are counted; at 1 byte per parameter, FP8 or INT8 roughly doubles that. The 16,000 GB/s figure matters most for autoregressive decoding, which is bound by reading weights and KV cache from memory, so memory bandwidth caps tokens/second more than raw compute does.
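The capacity and bandwidth reasoning above can be sketched as a back-of-the-envelope calculation. This is a minimal estimate, not a benchmark: the function names, the 70-billion-parameter example model, and the simplification that each decoded token reads all weights plus the KV cache exactly once are assumptions made for illustration. Only the 384 GB and 16,000 GB/s figures come from the spec sheet.

```python
GB = 1e9  # decimal gigabytes, matching the spec-sheet units

HBM_CAPACITY_GB = 384         # from the spec sheet
HBM_BANDWIDTH_GB_S = 16_000   # from the spec sheet

# Storage cost of one parameter in each numeric format.
BYTES_PER_PARAM = {"fp16": 2, "bf16": 2, "fp8": 1, "int8": 1}


def weight_footprint_gb(n_params: float, dtype: str) -> float:
    """Memory needed for the weights alone, in GB (no KV cache, no activations)."""
    return n_params * BYTES_PER_PARAM[dtype] / GB


def max_params_that_fit(dtype: str, reserve_fraction: float = 0.0) -> float:
    """Largest parameter count whose weights fit in HBM.

    reserve_fraction is a hypothetical knob: the share of memory set aside
    for KV cache and activations.
    """
    usable_bytes = HBM_CAPACITY_GB * GB * (1.0 - reserve_fraction)
    return usable_bytes / BYTES_PER_PARAM[dtype]


def decode_tokens_per_s_ceiling(n_params: float, dtype: str,
                                kv_cache_gb: float = 0.0) -> float:
    """Upper bound on single-sequence decode speed.

    Assumes every generated token streams all weights and the KV cache
    from memory once, and that nothing else limits throughput.
    """
    bytes_per_token = (weight_footprint_gb(n_params, dtype) + kv_cache_gb) * GB
    return HBM_BANDWIDTH_GB_S * GB / bytes_per_token


if __name__ == "__main__":
    n = 70e9  # hypothetical 70B-parameter model
    for dtype in ("fp16", "fp8"):
        print(dtype,
              f"weights: {weight_footprint_gb(n, dtype):.0f} GB,",
              f"decode ceiling: {decode_tokens_per_s_ceiling(n, dtype):.0f} tokens/s,",
              f"max params that fit: {max_params_that_fit(dtype) / 1e9:.0f}B")
```

Under these assumptions, halving the bytes per parameter both doubles the largest model that fits and doubles the bandwidth-bound decode ceiling, which is why FP8 helps on both fronts. Real throughput will be lower, and batching changes the picture because the weights are read once per step for the whole batch.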

Full specs, benchmarks, and comparisons are on the NVIDIA GB200 Superchip page.
