Docker GPU benchmark
Docker GPU benchmark CLI
Run this command to benchmark the GPU from inside a Docker container (it runs NVIDIA's nbody CUDA sample with 640,000 bodies):
docker run --rm --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark -numbodies=640000
Sample results
NVIDIA GeForce GTX 1060 3GB
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Pascal" with compute capability 6.1
> Compute 6.1 CUDA device: [NVIDIA GeForce GTX 1060 3GB]
number of bodies = 640000
640000 bodies, total time for 10 iterations: 29656.438 ms
= 138.115 billion interactions per second
= 2762.301 single-precision GFLOP/s at 20 flops per interaction
Quadro RTX 3000
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Turing" with compute capability 7.5
> Compute 7.5 CUDA device: [Quadro RTX 3000]
number of bodies = 640000
640000 bodies, total time for 10 iterations: 21452.439 ms
= 190.934 billion interactions per second
= 3818.680 single-precision GFLOP/s at 20 flops per interaction
NVIDIA GeForce RTX 3060
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Ampere" with compute capability 8.6
> Compute 8.6 CUDA device: [NVIDIA GeForce RTX 3060]
number of bodies = 640000
640000 bodies, total time for 10 iterations: 11055.232 ms
= 370.503 billion interactions per second
= 7410.066 single-precision GFLOP/s at 20 flops per interaction
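The last two lines of each result can be recomputed from the body count and the total time. The sketch below assumes the benchmark counts one interaction per ordered pair of bodies per iteration (bodies × bodies × iterations). That assumption is inferred from the figures above, not taken from the sample's source, and the function name `nbody_throughput` is made up for illustration.

```python
def nbody_throughput(bodies, iterations, total_ms, flops_per_interaction=20):
    """Return (billion interactions per second, GFLOP/s) for one benchmark run.

    Assumes an all-pairs simulation: every body interacts with every body
    once per iteration. This is an assumption inferred from the logged
    figures, not a documented formula.
    """
    interactions = bodies * bodies * iterations
    seconds = total_ms / 1000.0
    billion_per_second = interactions / seconds / 1e9
    return billion_per_second, billion_per_second * flops_per_interaction


# Total times (ms) for 10 iterations, copied from the sample results above.
samples = {
    "NVIDIA GeForce GTX 1060 3GB": 29656.438,
    "Quadro RTX 3000": 21452.439,
    "NVIDIA GeForce RTX 3060": 11055.232,
}

for name, total_ms in samples.items():
    rate, gflops = nbody_throughput(640000, 10, total_ms)
    print(f"{name}: {rate:.3f} billion interactions/s, {gflops:.3f} GFLOP/s")
```

Because the workload is identical in all three runs, the ratio of two total times is also the speedup between the two GPUs. For example, 29656.438 / 11055.232 puts the RTX 3060 at roughly 2.7 times the GTX 1060 3GB on this benchmark.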