Document Scope:
This cheat sheet provides a quick reference guide for using NVIDIA System Management Interface (NVIDIA-SMI) commands to monitor and manage NVIDIA GPU devices on Linux systems. It includes common commands and their descriptions to help users quickly access GPU-related information and perform basic management tasks.
Commands:
- nvidia-smi
- Description: Displays overall GPU information, including utilization, temperature, memory usage, and more.
- Usage:
nvidia-smi
- nvidia-smi -L
- Description: Lists all detected GPU devices along with their unique identifiers.
- Usage:
nvidia-smi -L
- nvidia-smi -q
- Description: Queries and displays detailed information about GPU devices, including temperature, power usage, clock speeds, and more.
- Usage:
nvidia-smi -q
- nvidia-smi -a
- Description: Displays all available GPU information, including utilization, memory usage, clock speeds, and more. This option is a deprecated alias of -q.
- Usage:
nvidia-smi -a
- nvidia-smi -i [gpu_index]
- Description: Displays detailed information about a specific GPU device identified by its index.
- Usage:
nvidia-smi -i 0
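The -i flag also accepts a GPU UUID or PCI bus ID (as listed by nvidia-smi -L) and can be combined with other options to scope them to a single device, for example:
nvidia-smi -q -i 0 -d UTILIZATION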
- nvidia-smi --format=csv --query-gpu=index,name,memory.total,memory.used,memory.free
- Description: Displays GPU information in CSV format, including index, name, total memory, used memory, and free memory.
- Usage:
nvidia-smi --format=csv --query-gpu=index,name,memory.total,memory.used,memory.free
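For periodic logging, the query form combines with -l (loop interval in seconds) and -f (write output to a file); the file name below is illustrative:
nvidia-smi --query-gpu=index,name,memory.used --format=csv -l 60 -f gpu_memory.log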
- nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv
- Description: Displays GPU utilization metrics (GPU and memory) in CSV format.
- Usage:
nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv
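For scripting, the noheader and nounits modifiers strip the CSV header row and unit suffixes, leaving bare values:
nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv,noheader,nounits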
- nvidia-smi -r
- Description: Triggers a reset of the GPU. This requires root privileges and an idle GPU (no processes using it), and is typically combined with -i to target a specific device.
- Usage:
nvidia-smi -r
- nvidia-smi --help
- Description: Displays a list of available NVIDIA-SMI command-line options and their descriptions.
- Usage:
nvidia-smi --help
- Gather GPU Performance Information
- Description: Gathers clock speed, power, and performance information using the query mode's -d display filter (a combined example follows the list below).
- Usage:
Supported Clock Values: nvidia-smi -q -d SUPPORTED_CLOCKS
Current GPU Clock: nvidia-smi -q -d CLOCK
GPU Memory Details: nvidia-smi -q -d MEMORY
GPU Power Details: nvidia-smi -q -d POWER
GPU Performance Details: nvidia-smi -q -d PERFORMANCE
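Multiple -d sections can be requested at once as a comma-separated list, optionally scoped to one GPU with -i, for example:
nvidia-smi -q -d CLOCK,POWER -i 0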
- GPU Performance State
- Description: P-States describe a GPU's active performance capability and power consumption. P-States range from P0 to P15, with P0 being the highest performance/power state and P15 the lowest. Each P-State maps to a performance level, and not all P-States are available on a given system. The current P-State definitions are as follows:
- P0/P1 - Maximum 3D performance
- P2/P3 - Balanced 3D performance-power
- P8 - Basic HD video playback
- P10 - DVD playback
- P12 - Minimum idle power consumption
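The current P-State of each GPU can also be read directly through the query interface:
nvidia-smi --query-gpu=pstate --format=csv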
- Power
power.min_limit: Minimum power limit that can be set for GPU (watts)
power.max_limit: Maximum power limit that can be set for GPU (watts)
power.draw: Power being consumed by GPU at this moment (watts)
nvidia-smi --query-gpu=power.min_limit,power.max_limit,power.draw --format=csv
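A new power limit (in watts, within the min/max range reported above) can be set with -pl; this requires root privileges, and the 250 W value below is purely illustrative:
sudo nvidia-smi -pl 250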
- Temperature
temperature.gpu: GPU temperature (C)
temperature.memory: HBM memory temperature (C)
nvidia-smi --query-gpu=temperature.gpu,temperature.memory --format=csv
- Current Clock Values
clocks.current.sm: Current SM clock (MHz)
clocks.current.memory: Current memory clock (MHz)
clocks.current.graphics: Current graphics clock (MHz)
clocks.current.video: Current video encoder/decoder clock (MHz)
nvidia-smi --query-gpu=clocks.current.sm,clocks.current.memory,clocks.current.graphics,clocks.current.video --format=csv
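On GPUs that support it, application clocks can be pinned with -ac (a memory,graphics pair taken from the SUPPORTED_CLOCKS list) and restored with -rac; the clock values below are illustrative, and root privileges are required:
sudo nvidia-smi -ac 5001,1590
sudo nvidia-smi -rac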
- Utilization
utilization.gpu: Percent of time over the past sampling period during which one or more kernels was executing on the GPU.
utilization.memory: Percent of time over the past sampling period during which device memory was being read or written.
nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv
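As a minimal shell sketch building on the query above (the GPU index 0 and the 80% threshold are illustrative), utilization can be checked from a script:
# Read the bare utilization value for GPU 0 (no header, no '%' unit)
util=$(nvidia-smi -i 0 --query-gpu=utilization.gpu --format=csv,noheader,nounits)
# Warn when utilization exceeds the chosen threshold
if [ "$util" -gt 80 ]; then
    echo "GPU 0 utilization is ${util}%"
fi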
- Modes
persistence_mode: Current persistence mode
ecc.mode.current: Current ECC mode
mig.mode.current: Current MIG mode
nvidia-smi --query-gpu=persistence_mode,ecc.mode.current,mig.mode.current --format=csv
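On supported GPUs (Ampere and later data-center parts), MIG mode itself is toggled with the -mig flag; this requires root privileges and an idle GPU, and a GPU reset or reboot may be needed for the change to take effect:
sudo nvidia-smi -i 0 -mig 1   # enable MIG mode
sudo nvidia-smi -i 0 -mig 0   # disable MIG mode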
- ECC Mode
nvidia-smi -e 1   # enable ECC
nvidia-smi -e 0   # disable ECC
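Changing ECC mode requires root privileges, and the new setting takes effect only after the next reboot (or GPU reset). A single device can be targeted with -i:
sudo nvidia-smi -i 0 -e 1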
- Persistence Mode
nvidia-smi -pm 1   # enable persistence mode
nvidia-smi -pm 0   # disable persistence mode
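Enabling persistence mode requires root privileges; note that on modern Linux drivers NVIDIA recommends the nvidia-persistenced daemon over this legacy per-GPU setting:
sudo nvidia-smi -pm 1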
- PCIe Link Generation
pcie.link.gen.max       # Maximum PCIe link generation supported
pcie.link.gen.current   # Current PCIe link generation
nvidia-smi --query-gpu=pcie.link.gen.max,pcie.link.gen.current --format=csv
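The link width can be queried alongside the generation using the pcie.link.width fields:
nvidia-smi --query-gpu=pcie.link.gen.max,pcie.link.gen.current,pcie.link.width.current --format=csv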
- Monitor GPU Performance
nvidia-smi dmon -s pceutv
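dmon also accepts -i (device selection), -d (sampling interval in seconds), and -c (number of samples before exiting), for example:
nvidia-smi dmon -s u -d 5 -c 12 -i 0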
- Monitor GPU Processes
nvidia-smi pmon
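pmon similarly supports -s (u for utilization, m for memory), -d (interval), and -c (sample count), for example:
nvidia-smi pmon -s um -d 1 -c 10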
Conclusion:
The NVIDIA-SMI cheat sheet provides users with a handy reference for accessing and managing NVIDIA GPU devices on Linux systems using the command line. By familiarizing themselves with these commands, users can efficiently monitor GPU performance, troubleshoot issues, and optimize system resources for their GPU-accelerated applications. Experimenting with different command options and combinations can further enhance users' understanding of their NVIDIA GPU devices and their capabilities.