Intel Habana Labs System Management Interface (hl-smi) Cheat Sheet

Alexander Hill

Document Scope:

This cheat sheet is a quick reference for using the Intel Habana Labs System Management Interface tool (hl-smi) to monitor and manage Intel Habana Gaudi 3 GPU devices on Linux systems. It lists common commands with short descriptions so that users can quickly look up GPU-related information and perform basic management tasks.

 

Commands:

  • hl-smi
    • Description: Displays overall GPU information, including utilization, temperature, memory usage, and more.
    • Usage:
      hl-smi 
  • hl-smi -L
    • Description: Lists all detected GPU devices along with their unique identifiers.
    • Usage:
      hl-smi -L 
  • hl-smi -q
    • Description: Queries and displays detailed information about GPU devices, including temperature, power usage, clock speeds, and more.
    • Usage:
      hl-smi -q 
  • hl-smi -i [gpu_pci-addr]
    • Description: Displays detailed information about a specific GPU device identified by its PCI address.
    • Usage:
      hl-smi -i 0000:17:00.0
  • hl-smi -f csv -Q
    • Description: Displays GPU information in CSV format, including index, name, total memory, used memory, and free memory.
    • Usage:
      hl-smi -i 0000:17:00.0 -Q index,name,memory.total,memory.used,memory.free -f csv
  • hl-smi -r
    • Description: Triggers a reset of the HABANALABS AIP. Requires root privileges and the -i switch to target a specific device.
    • Usage:
      sudo hl-smi -r -i 0000:17:00.0
  • hl-smi --help
    • Description: Displays a list of available HL-SMI command-line options and their descriptions.
    • Usage:
      hl-smi --help
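
The CSV query above lends itself to light post-processing. The following is a minimal sketch, not an official utility: mem_used_percent is a hypothetical helper name, and it assumes that the CSV output starts with one header row, separates fields with a comma followed by a space, and reports memory values with a unit suffix such as "MiB". Check the actual output on your system before relying on it.

```shell
#!/usr/bin/env bash
# Sketch: summarize per-device memory usage from hl-smi CSV output.
# ASSUMPTION: one header row, then one row per device; fields separated by
# ", "; memory values formatted as a number followed by a unit (e.g. "1024 MiB").

# mem_used_percent (hypothetical helper): reads CSV on stdin with the columns
# index,name,memory.total,memory.used,memory.free and prints
# "<index> <name> <used-percent>" for each device.
mem_used_percent() {
  awk -F', *' 'NR > 1 && NF >= 5 {
    total = $3 + 0   # numeric coercion drops the trailing unit text
    used  = $4 + 0
    if (total > 0) printf "%s %s %.1f%%\n", $1, $2, 100 * used / total
  }'
}

# Usage (requires hl-smi on the PATH):
#   hl-smi -Q index,name,memory.total,memory.used,memory.free -f csv | mem_used_percent
```

Reading from standard input keeps the helper independent of hl-smi itself, so it can be tried against a saved CSV file first.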

 

  • Gather GPU Performance Information
    • Description: Gathers information such as clock speed, power details, and performance data.
    • Usage:
    • Current GPU Clock: hl-smi -q -d CLOCK
    • GPU Memory Details: hl-smi -q -d MEMORY
    • GPU Power Details: hl-smi -q -d POWER
    • GPU Product Details: hl-smi -q -d PRODUCT
    • GPU Temperature Details: hl-smi -q -d TEMPERATURE
    • GPU Performance Details: hl-smi -q -d PERFORMANCE
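
The per-section queries above can be gathered into a single report. This is a rough sketch under stated assumptions: collect_details and the HL_SMI variable are hypothetical names introduced here for illustration, and the section names passed in must be ones your hl-smi version accepts with -d.

```shell
#!/usr/bin/env bash
# Sketch: run "hl-smi -q -d <SECTION>" for several sections in a row.
# HL_SMI is a hypothetical override for the binary name or path.
HL_SMI="${HL_SMI:-hl-smi}"

# collect_details (hypothetical helper): prints a banner for each section
# name given as an argument, followed by the output of the matching query.
collect_details() {
  local section
  for section in "$@"; do
    printf '===== %s =====\n' "$section"
    # Keep going if one section fails, but report the failure on stderr.
    "$HL_SMI" -q -d "$section" || printf 'query failed for section %s\n' "$section" >&2
  done
}

# Usage (requires hl-smi on the PATH):
#   collect_details CLOCK MEMORY POWER > hl-smi-report.txt
```

Passing the section names as arguments keeps the list easy to trim if a section is not supported on a given installation.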

 

  • Power
    power.draw: The total power consumption (W)
    hl-smi -Q power.draw -f csv
  • Temperature
    temperature.aip: Maximum temperature read (C)
    hl-smi -Q temperature.aip -f csv
  • Memory
    memory.total: The total size of available memory
    memory.free: Available size of unused memory
    memory.used: The size of used memory
    hl-smi -Q memory.total,memory.free,memory.used -f csv
  • Utilization
    utilization.aip: Percentage of the sampling interval during which the GPU was in use.
    hl-smi -Q utilization.aip -f csv
  • Speed
    pcie.link.gen.max: Maximum PCIe link speed (generation)
    pcie.link.gen.current: Current PCIe link speed (generation)
    hl-smi -Q pcie.link.gen.max,pcie.link.gen.current -f csv
  • Monitor GPU Performance
    hl-smi dmon -i 0000:17:00.0
  • GPU Topology
     hl-smi topo
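
The query fields above can feed simple checks. The sketch below flags devices whose reported temperature exceeds a limit. It is illustrative only: over_temp is a hypothetical helper, the threshold of 85 in the usage line is an arbitrary example value, and the parsing assumes one header row, ", " as the field separator, and temperature values formatted like "42 C".

```shell
#!/usr/bin/env bash
# Sketch: flag devices above a temperature limit using hl-smi CSV output.
# ASSUMPTION: one header row, fields separated by ", ", and temperature
# values formatted as a number followed by a unit (e.g. "42 C").

# over_temp (hypothetical helper): $1 is the limit in degrees C. Reads CSV
# with the columns index,temperature.aip on stdin and prints one line per
# device whose temperature is above the limit.
over_temp() {
  awk -F', *' -v limit="$1" 'NR > 1 && ($2 + 0) > limit {
    print "device " $1 " at " $2
  }'
}

# Usage (requires hl-smi on the PATH; 85 is only an example limit):
#   hl-smi -Q index,temperature.aip -f csv | over_temp 85
```

The same pattern should carry over to power.draw or utilization.aip by changing the queried field, provided the value format matches the assumption above.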

 

Conclusion:

The HL-SMI cheat sheet gives users a handy reference for accessing and managing Intel Habana Gaudi GPU devices on Linux systems from the command line. With these commands, users can monitor GPU performance, troubleshoot issues, and optimize system resources for their GPU-accelerated applications. Experimenting with different command options and combinations can further improve users' understanding of their GPU devices and their capabilities.
