Does NVIDIA GB200 Superchip support BF16 and FP8?
Answer
Yes: the Blackwell-generation Tensor Cores in the NVIDIA GB200 Superchip support both BF16 and FP8. The full spec sheet reads: Blackwell generation, 384 GB of HBM3e VRAM, 16,000 GB/s memory bandwidth, 4,500 TFLOPS FP16, 150 TFLOPS FP32, 2,700 W power draw, released in 2024.
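Memory is typically the constraint for large-model real-time serving. At 384 GB, the NVIDIA GB200 Superchip can hold the weights of a model with up to roughly 190 billion parameters in FP16 (2 bytes per parameter) and roughly twice that in FP8/INT8 (1 byte per parameter), before accounting for KV cache and activations. The 16,000 GB/s figure matters most for autoregressive decoding, which spends most of its time reading weights and KV cache from memory, so memory bandwidth caps tokens/second more than raw compute does.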
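As a rough illustration of the sizing argument above, the following Python sketch estimates weight footprint and a bandwidth-bound ceiling on decode speed from the two spec-sheet figures. The function names, the 70B and 300B example model sizes, and the simplification that each decoded token reads all weights plus the KV cache once (batch size 1, no overlap or caching effects) are assumptions made for illustration, not measured behavior.

```python
# Spec-sheet figures quoted in the answer above.
HBM_CAPACITY_GB = 384
HBM_BANDWIDTH_GB_S = 16_000

# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"fp16": 2, "bf16": 2, "fp8": 1, "int8": 1}


def weight_footprint_gb(params_billions: float, dtype: str) -> float:
    """Weight memory in GB: 1e9 params * bytes/param / 1e9 bytes per GB."""
    return params_billions * BYTES_PER_PARAM[dtype]


def fits_in_memory(params_billions: float, dtype: str, kv_cache_gb: float = 0.0) -> bool:
    """True if weights plus an assumed KV-cache budget fit in HBM."""
    return weight_footprint_gb(params_billions, dtype) + kv_cache_gb <= HBM_CAPACITY_GB


def decode_tokens_per_s_ceiling(params_billions: float, dtype: str, kv_cache_gb: float = 0.0) -> float:
    """Upper bound on single-stream decode speed.

    Assumes (hypothetically) that every token requires one full read of the
    weights and the KV cache, so speed = bandwidth / bytes read per token.
    """
    gb_read_per_token = weight_footprint_gb(params_billions, dtype) + kv_cache_gb
    return HBM_BANDWIDTH_GB_S / gb_read_per_token


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only to exercise the arithmetic.
    for size_b in (70, 300):
        for dtype in ("fp16", "fp8"):
            print(
                f"{size_b}B {dtype}: "
                f"{weight_footprint_gb(size_b, dtype):.0f} GB weights, "
                f"fits={fits_in_memory(size_b, dtype)}, "
                f"ceiling={decode_tokens_per_s_ceiling(size_b, dtype):.1f} tok/s"
            )
```

Under these assumptions, halving the bytes per parameter (FP16 to FP8) both halves the weight footprint and doubles the bandwidth-bound ceiling, which is why lower-precision formats matter for serving. Real throughput will be lower and depends on batch size, kernel efficiency, and KV-cache size.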
Full specs, benchmarks, and comparisons are on the NVIDIA GB200 Superchip page.