In the above example, benchmarkseconds is not wall-clock seconds (unless every frame of the demo runs at exactly 60 fps). Rather, it runs 211×60=12,660 frames using a fixed timestep of 1/60 of a second, or 16.67 milliseconds. This means that, if you have your project set up to run a camera flythrough on startup, it will advance through the flythrough using fixed timesteps and a fixed random seed. It will then shut down automatically after a fixed number of frames. This can be useful for gathering repeatable average frame time data for your level.

Another technique for reducing noise in profile results is to run with fixed clocks. Most GPUs have a default power management system that switches to a lower clock frequency when idle to save power. This trades performance for lower power consumption and can introduce noise in our benchmarks, as the clocks may not scale the same way between runs of our application. You may fix the clocks on your GPU to reduce this variance. Many third-party tools exist, but the Radeon Developer Panel that comes with the Radeon GPU Profiler has a Device Clocks tab under Applications which can be used to set a stable clock on AMD RDNA™ GPUs, as shown below:

We can see that the DrawIndexedInstanced() call takes 211 µs to complete. To inspect the details of the pixel shader running on the GPU, right-click on the draw call, select “View in Pipeline State” and click on PS in the pipeline. The Information tab shows that our pixel shader is only running 1 wavefront and only taking up 32 threads of that wavefront. On GCN GPUs and above, this kind of GPU workload will execute in ‘partial waves’, which means the GPU is being underutilized.
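
For the fixed-timestep benchmark run, the relationship between benchmark seconds, frame count, and wall-clock time can be sketched in a few lines of Python. This is only an illustration of the arithmetic. The function names and the 120 fps figure are hypothetical and are not part of the engine or any tool mentioned here.

```python
# Illustrative arithmetic only; these helpers are hypothetical, not an engine API.

def benchmark_frames(benchmark_seconds, fixed_fps=60):
    """Return (frame_count, timestep_ms) for a fixed-timestep benchmark run."""
    frame_count = benchmark_seconds * fixed_fps   # frames simulated, regardless of speed
    timestep_ms = 1000.0 / fixed_fps              # simulated time advanced per frame
    return frame_count, timestep_ms


def wall_clock_seconds(frame_count, actual_fps):
    """Real time the run takes if the hardware renders at actual_fps."""
    return frame_count / actual_fps


frames, dt_ms = benchmark_frames(211)
print(frames, round(dt_ms, 2))            # 12660 16.67

# Assumed example: a machine rendering at 120 fps finishes in half the time.
print(wall_clock_seconds(frames, 120))    # 105.5
```

The frame count depends only on the benchmark seconds and the fixed timestep, which is why the run is repeatable. Only the wall-clock duration varies with how fast the hardware renders each frame.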