Overview
Matcha is a CLI tool that measures GPU energy consumption during AI training runs.
You prefix your training command with `matcha run`. Your training executes at full speed. When it finishes, Matcha reports how much energy your GPUs consumed.
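For example, if your usual entry point were `python train.py` (a hypothetical script name, standing in for whatever command you normally run), the invocation would look like:

```shell
# Prefix your existing training command with `matcha run`.
# Everything after `run` is executed unchanged as the training command.
matcha run python train.py
```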
Who it’s for
Engineers and researchers running training or fine-tuning jobs on NVIDIA GPUs, whether on cloud instances (RunPod, Lambda, AWS), on-premises clusters, or local workstations.
What it measures
- Total energy consumed (joules, watt-hours)
- Average and peak GPU power draw (watts)
- Duration of the full run
- Per-step energy breakdown (with `matcha wrap`)
How it works
Matcha spawns your training command as a child process. In a separate background thread, it polls GPU power via NVML at 100ms intervals. Your training process runs natively. Matcha never intercepts stdout, modifies your code, or injects anything into your training loop.
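The background-sampling pattern can be sketched as follows. This is an illustrative sketch, not Matcha's actual implementation: `read_power_watts` is a stand-in callable where the real tool queries NVML, and all names here are assumptions.

```python
import threading
import time

def sample_power(read_power_watts, stop_event, interval_s=0.1):
    """Poll a power source at a fixed interval until stop_event is set.

    Returns a list of (timestamp, watts) samples. In the real tool the
    reader would wrap an NVML power query; here it is any callable.
    """
    samples = []
    while not stop_event.is_set():
        samples.append((time.monotonic(), read_power_watts()))
        stop_event.wait(interval_s)  # sleeps, but wakes early when stopped

    return samples

# Usage: run the sampler in a background thread while the training
# process (simulated here by a sleep) does its work undisturbed.
stop = threading.Event()
results = []
sampler = threading.Thread(
    target=lambda: results.extend(sample_power(lambda: 250.0, stop, 0.02))
)
sampler.start()
time.sleep(0.1)  # stand-in for the child training process
stop.set()
sampler.join()
```

Because the sampler runs in its own thread and only reads power counters, the training process itself is untouched.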
When the training process exits, Matcha computes total energy using trapezoidal integration of the power readings and prints a single summary line.
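Trapezoidal integration means each pair of adjacent power readings contributes the average of the two values multiplied by the gap between their timestamps. A minimal sketch of that computation (the function name is illustrative, not Matcha's API):

```python
def trapezoid_energy(samples):
    """Integrate (timestamp_s, watts) samples into total joules."""
    joules = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        joules += (p0 + p1) / 2.0 * (t1 - t0)
    return joules

# Three readings 100 ms apart at a steady 100 W:
# 2 intervals * (0.1 s * 100 W) = 20 J.
readings = [(0.0, 100.0), (0.1, 100.0), (0.2, 100.0)]
energy_j = trapezoid_energy(readings)
energy_wh = energy_j / 3600.0  # joules to watt-hours
```

The trapezoid rule handles uneven sample spacing gracefully, which matters if an occasional poll is delayed under load.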
What you need
- An NVIDIA GPU with drivers installed (`nvidia-smi` must work)
- Python 3.9+
- `pip` or `uv`