Does the NVIDIA GB200 Superchip support BF16 and FP8?

Answer

Yes. NVIDIA's Blackwell architecture supports both BF16 and FP8 on its Tensor Cores. The listed specifications for the NVIDIA GB200 Superchip are: Blackwell generation, 384 GB of HBM3e VRAM, 16,000 GB/s memory bandwidth, 4,500 TFLOPS FP16, 150 TFLOPS FP32, 2,700 W power draw, released in 2024.

Memory is typically the constraint for large-model real-time serving. At 384 GB, the NVIDIA GB200 Superchip comfortably holds mid-sized transformers in FP16 and much larger models in FP8/INT8, since 8-bit weights take half the space of 16-bit ones. The 16,000 GB/s figure matters most for KV-cache-bound autoregressive decoding, where memory bandwidth caps tokens/second more than raw compute does.
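To put rough numbers on the memory and bandwidth claims, here is a back-of-the-envelope sketch in Python. It uses only the 384 GB and 16,000 GB/s figures quoted above. Everything else is an assumption made for illustration: the function names are hypothetical, gigabytes are decimal, the capacity estimate counts weights only (no KV cache, activations, or runtime overhead), and the decoding estimate assumes batch size 1 with every weight read from memory once per generated token. Real serving throughput will be lower.

```python
# Rough sizing helpers; all names here are illustrative, not from any NVIDIA tool.
GB = 1e9  # decimal gigabyte, assumed to match the spec-sheet units

HBM_CAPACITY_GB = 384        # from the spec sheet
HBM_BANDWIDTH_GBPS = 16_000  # from the spec sheet

# Storage cost of one parameter for each weight format.
BYTES_PER_PARAM = {"fp16": 2, "bf16": 2, "fp8": 1, "int8": 1}


def max_params_billions(dtype: str, capacity_gb: float = HBM_CAPACITY_GB) -> float:
    """Upper bound on model size (billions of parameters) if weights alone filled memory."""
    return capacity_gb * GB / BYTES_PER_PARAM[dtype] / 1e9


def decode_tokens_per_second_ceiling(
    params_billions: float,
    dtype: str,
    bandwidth_gbps: float = HBM_BANDWIDTH_GBPS,
) -> float:
    """Bandwidth-bound ceiling for batch-size-1 decoding.

    Assumes each generated token requires reading every weight once and
    ignores KV-cache traffic, so this is an optimistic upper bound.
    """
    weight_bytes = params_billions * 1e9 * BYTES_PER_PARAM[dtype]
    return bandwidth_gbps * GB / weight_bytes


if __name__ == "__main__":
    for dtype in ("fp16", "fp8"):
        print(f"{dtype}: up to ~{max_params_billions(dtype):.0f}B parameters (weights only)")
    # 70B is a hypothetical model size chosen only for illustration.
    for dtype in ("fp16", "fp8"):
        ceiling = decode_tokens_per_second_ceiling(70, dtype)
        print(f"70B in {dtype}: at most ~{ceiling:.0f} tokens/s per sequence")
```

Under these assumptions the weights-only ceiling is about 192B parameters in FP16 or BF16 and about 384B in FP8 or INT8. A hypothetical 70B model tops out near 114 tokens/s per sequence in FP16 and roughly twice that in FP8, which is why 8-bit formats help with decoding speed as well as with capacity.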

Full specs, benchmarks, and comparisons are on the NVIDIA GB200 Superchip page.
