How To: NVIDIA-SMI Cheat Sheet

Alexander Hill

Document Scope:

This cheat sheet provides a quick reference guide for using NVIDIA System Management Interface (NVIDIA-SMI) commands to monitor and manage NVIDIA GPU devices on Linux systems. It includes common commands and their descriptions to help users quickly access GPU-related information and perform basic management tasks.

 

Commands:

  • nvidia-smi
    • Description: Displays overall GPU information, including utilization, temperature, memory usage, and more.
    • Usage:
      nvidia-smi 
  • nvidia-smi -L
    • Description: Lists all detected GPU devices along with their unique identifiers.
    • Usage:
      nvidia-smi -L 
  • nvidia-smi -q
    • Description: Queries and displays detailed information about GPU devices, including temperature, power usage, clock speeds, and more.
    • Usage:
      nvidia-smi -q 
  • nvidia-smi -a
    • Description: Displays all available GPU information, including utilization, memory usage, clock speeds, and more.
    • Usage:
      nvidia-smi -a 
  • nvidia-smi -i [gpu_index]
    • Description: Displays detailed information about a specific GPU device identified by its index.
    • Usage:
      nvidia-smi -i 0
  • nvidia-smi --format=csv --query-gpu=index,name,memory.total,memory.used,memory.free
    • Description: Displays GPU information in CSV format, including index, name, total memory, used memory, and free memory (a scripting-friendly variant is shown at the end of this list).
    • Usage:
      nvidia-smi --format=csv --query-gpu=index,name,memory.total,memory.used,memory.free
  • nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv
    • Description: Displays GPU utilization metrics (GPU and memory) in CSV format.
    • Usage:
      nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv 
  • nvidia-smi -r
    • Description: Triggers a reset of the GPU. Typically used with -i to target a specific GPU; requires root privileges and no active processes on the GPU.
    • Usage:
      nvidia-smi -r 

  • nvidia-smi --help
    • Description: Displays a list of available NVIDIA-SMI command-line options and their descriptions.
    • Usage:
      nvidia-smi --help
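  • Scripting with CSV queries
    • Description: For use in scripts, the csv format accepts the noheader and nounits modifiers, and -l <seconds> repeats the query at a fixed interval. A minimal sketch, assuming nvidia-smi is on the PATH:
    • Usage:
      nvidia-smi --query-gpu=index,name,memory.total,memory.used,memory.free --format=csv,noheader,nounits -l 5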

 

  • Gather GPU Performance Information

    • Description: Gathers information such as clock speeds, memory details, power details, and performance information.
    • Usage:
      Supported Clock Values: nvidia-smi -q -d SUPPORTED_CLOCKS
      Current GPU Clock: nvidia-smi -q -d CLOCK
      GPU Memory Details: nvidia-smi -q -d MEMORY
      GPU Power Details: nvidia-smi -q -d POWER
      GPU Performance Details: nvidia-smi -q -d PERFORMANCE
  • GPU Performance state

    • P-States describe the GPU's performance capability and power consumption while it is active/executing.

      P-States range from P0 to P15, with P0 being the highest performance/power state and P15 the lowest. Each P-State maps to a performance level, and not all P-States are available on a given system. The common P-States are defined as follows:

    • P-State definitions:
      • P0/P1 - Maximum 3D performance
      • P2/P3 - Balanced 3D performance-power
      • P8 - Basic HD video playback
      • P10 - DVD playback
      • P12 - Minimum idle power consumption
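
      To check which P-State each GPU is currently in, the pstate field can be queried with the same --query-gpu pattern used elsewhere in this sheet (a minimal example):
      nvidia-smi --query-gpu=index,pstate --format=csv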

 

  • Power:

    power.min_limit: Minimum power limit that can be set for GPU (watts)

    power.max_limit: Maximum power limit that can be set for GPU (watts)

    power.draw: Power being consumed by GPU at this moment (watts)

    nvidia-smi --query-gpu=power.min_limit,power.max_limit,power.draw --format=csv
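
    To apply a power cap within the reported min/max range, the -pl (--power-limit) option can be used; the 200-watt value and GPU index below are illustrative, and root privileges are required:

    nvidia-smi -i 0 -pl 200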
  • Temperature

    temperature.gpu: GPU temperature (C)

    temperature.memory: HBM memory temperature (C)

    nvidia-smi --query-gpu=temperature.gpu,temperature.memory --format=csv
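
    To watch temperatures over time, the same query can be repeated at a fixed interval with -l (interval in seconds); a minimal sketch:

    nvidia-smi --query-gpu=temperature.gpu,temperature.memory --format=csv -l 5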
  • Current Clock Values

    clocks.current.sm

    clocks.current.memory

    clocks.current.graphics

    clocks.current.video

    nvidia-smi --query-gpu=clocks.current.sm,clocks.current.memory,clocks.current.graphics,clocks.current.video --format=csv
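
    Application clocks can be pinned to one of the memory,graphics pairs reported by nvidia-smi -q -d SUPPORTED_CLOCKS using -ac, and reset with -rac; the clock values below are illustrative only, root privileges are required, and not all GPUs support application clocks:

    nvidia-smi -ac 5001,1590
    nvidia-smi -rac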
  • Utilization

    utilization.gpu: Percent of sampling interval time that GPU was being used.

    utilization.memory: Percent of sampling interval time that device memory was being used.

    nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv
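
    For longer-term logging, a timestamp field can be added to the query, -l sets the sampling interval in seconds, and -f writes the output to a file (the filename below is illustrative):

    nvidia-smi --query-gpu=timestamp,index,utilization.gpu,utilization.memory --format=csv -l 10 -f gpu_util.csv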
  • Modes

    persistence_mode: Current persistence mode

    ecc.mode.current: Current ECC mode

    mig.mode.current: Current MIG mode

    nvidia-smi --query-gpu=persistence_mode,ecc.mode.current,mig.mode.current --format=csv
  • ECC Mode

    nvidia-smi -e 1 # enable ECC
    nvidia-smi -e 0 # disable ECC

    Note: changing the ECC mode requires root privileges and takes effect after the next GPU reset or reboot.
  • Persistence Mode

    nvidia-smi -pm 1 # enable persistence mode
    nvidia-smi -pm 0 # disable persistence mode
  • PCIe Link Generation

    pcie.link.gen.max # Maximum supported PCIe link generation

    pcie.link.gen.current # Current PCIe link generation

    nvidia-smi --query-gpu=pcie.link.gen.max,pcie.link.gen.current --format=csv
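
    The negotiated link width can be queried the same way:

    nvidia-smi --query-gpu=pcie.link.width.max,pcie.link.width.current --format=csv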
  • Monitor GPU Performance

    nvidia-smi dmon -s pceutv
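
    The -s flags select metric groups (p = power/temperature, c = clocks, e = ECC/PCIe replay errors, u = utilization, t = PCIe throughput, v = violations). A sampling interval and sample count can also be set; a minimal sketch:

    nvidia-smi dmon -s pu -d 2 -c 30 # power/temperature and utilization, every 2 seconds, 30 samples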
  • Monitor GPU Processes

    nvidia-smi pmon
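
    By default pmon prints one line per GPU process each second; -d sets the sampling interval, -c the number of samples, and -s selects the metrics shown (u = utilization, m = frame buffer memory). A minimal sketch:

    nvidia-smi pmon -s um -d 5 -c 12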

 

Conclusion:

The NVIDIA-SMI cheat sheet provides users with a handy reference for accessing and managing NVIDIA GPU devices on Linux systems using the command line. By familiarizing themselves with these commands, users can efficiently monitor GPU performance, troubleshoot issues, and optimize system resources for their GPU-accelerated applications. Experimenting with different command options and combinations can further enhance users' understanding of their NVIDIA GPU devices and their capabilities.
