NVIDIA PerfKit 5: New Tools for 3D Developers
Enhanced PerfHUD 5.0
PerfHUD is a convenient utility for tuning performance and debugging Direct3D applications. It helps solve difficult problems with rendering speed and quality by monitoring performance, inspecting pipeline state, and showing debug information on a heads-up display (HUD). That is, the PerfHUD interface is drawn right on top of the application frame. It contains graphs, text fields, and controls.
The program collects data from the application, driver, API, and GPU. Once started, it runs together with the application and displays the collected data on top of the application's output. PerfHUD relies on special code in the driver that reads GPU counters and intercepts API calls to gather statistics and integrate the interface into the application. This is why performance is somewhat lower than in normal mode, without the instrumented driver and with the HUD disabled. However, PerfHUD does not otherwise interfere with the application's operation.
This utility shows on-the-fly how a frame is rendered, call by call. At the same time, you can analyze (and, in the latest version, also modify!) shaders, geometry data, textures, etc. Other utilities, such as PIX, can also monitor all Direct3D API calls, but they don't do it in real time, only at the user's request. So PerfHUD is more convenient to use, although it offers a tad fewer features than PIX. On the whole, PIX works better for debugging, while PerfHUD is a better tool for detecting and eliminating performance bottlenecks. That said, the two cannot really be compared directly, and PerfKit also complements PIX: it gives PIX access to low-level GPU counters, whose data are very useful for debugging.
PerfHUD 5 is the fifth version of NVIDIA's performance analysis utility, one of the key components of PerfKit. Like the previous version, PerfHUD 5 shows how a 3D application renders a frame call by call, and offers several so-called experiments. All these features work in real time. A single key press gives you the list of draw calls, grouped by time. It's usually not easy to collect data about GPU units, especially as modern games render a frame with several thousand draw calls. PerfHUD lets you split a scene and analyze each call separately. It makes the task easier by displaying errors and performance bottlenecks so that a developer can eliminate them.
PerfHUD 5 has a lot of improvements in functionality and user interface. The key new features are full support for NVIDIA G8x GPUs, the Direct3D 10 API, and Microsoft Windows Vista, a lot of new functions and counters, a new customizable user interface, and the ability to edit shaders.
Here is a detailed list of changes in NVIDIA PerfHUD 5:
- Support for NVIDIA G8x architecture in Windows Vista and Windows XP
- Support for DirectX 10 applications in Windows Vista
- Support for DirectX 9 applications in Windows XP and Windows Vista
- Edit & Continue mode (you can make changes on-the-fly) for HLSL and *.fx vertex, geometry, and pixel shaders
- Edit & Continue mode (you can make changes on-the-fly) for raster operations
- Customizable user interface: you can select up to four counters per graph from the full set of Direct3D and GPU counters in PerfSDK, control the size and position of each graph, and save your custom layout to a file
- Improved Frame Debugger mode that shows 1D, 2D and 3D textures, shadow maps, cube maps, and texture arrays. You can choose to display only one draw call
- Improved Frame Profiler with Instruction Count Ratio graphs, pop-up graph tips, and support for hierarchical Performance Markers
- Improved user interface, support for a hardware mouse cursor, and display of hardware and software configuration information
- Various improvements in compatibility and stability, small bug fixes
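The hierarchical Performance Markers mentioned in the list above are named, nested regions that an application wraps around groups of draw calls so that a profiler can attribute time to them. In Direct3D 9 such markers are typically emitted with D3DPERF_BeginEvent and D3DPERF_EndEvent. The following is only a portable sketch of the nesting idea: MarkerLog, ScopedMarker, renderFrame, and the pass names are hypothetical and are not part of PerfHUD or Direct3D. A real application would call the D3DPERF functions where begin() and end() appear.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a marker sink. In a Direct3D 9 application,
// begin() would map to D3DPERF_BeginEvent and end() to D3DPERF_EndEvent.
class MarkerLog {
public:
    void begin(const std::string& name) {
        // Indent each entry by two spaces per nesting level.
        entries_.push_back(std::string(depth_ * 2, ' ') + name);
        ++depth_;
    }
    void end() {
        assert(depth_ > 0 && "end() without a matching begin()");
        --depth_;
    }
    std::size_t depth() const { return depth_; }
    const std::vector<std::string>& entries() const { return entries_; }

private:
    std::vector<std::string> entries_;
    std::size_t depth_ = 0;
};

// RAII guard: opens a marker on construction and closes it on destruction,
// so begin/end pairs stay balanced even on early returns.
class ScopedMarker {
public:
    ScopedMarker(MarkerLog& log, const std::string& name) : log_(log) {
        log_.begin(name);
    }
    ~ScopedMarker() { log_.end(); }
    ScopedMarker(const ScopedMarker&) = delete;
    ScopedMarker& operator=(const ScopedMarker&) = delete;

private:
    MarkerLog& log_;
};

// Hypothetical frame with two passes; draw calls would go inside the scopes.
void renderFrame(MarkerLog& log) {
    ScopedMarker frame(log, "Frame");
    {
        ScopedMarker shadows(log, "Shadow pass");
        // ... shadow map draw calls ...
    }
    {
        ScopedMarker scene(log, "Scene pass");
        ScopedMarker terrain(log, "Terrain");
        // ... terrain draw calls ...
    }
}
```

Tying each marker to a scope keeps the hierarchy consistent with the structure of the rendering code, which is what lets a profiler show per-pass totals rather than a flat list of calls.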
Along with G8x and Direct3D 10 support, the most interesting new feature, in our opinion, is editing shaders and render states in real time. The utility lets you modify shader code on-the-fly and immediately see the result, which makes it much easier to test new ideas and optimize shader code.
As PerfHUD is a powerful tool for analyzing 3D applications, NVIDIA implemented protection that prevents third parties from analyzing applications without their developers' consent. In order to work with PerfHUD, an application must support it explicitly: it needs several lines of code in its Direct3D initialization routine. When an application is started under PerfHUD, the driver creates a special video adapter named NVIDIA PerfHUD, and the application has to select this adapter, or PerfHUD won't work. Besides, the application must request the reference rasterizer device type for this adapter. It will still use the hardware features of the GPU, because it selects the NVIDIA PerfHUD video adapter.
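The selection logic described above can be sketched as follows. This is a minimal, platform-independent illustration, not NVIDIA's code: DeviceType, AdapterChoice, and chooseAdapter are hypothetical names. In a real Direct3D 9 application the adapter descriptions would come from IDirect3D9::GetAdapterIdentifier, and the two device types would be D3DDEVTYPE_HAL and D3DDEVTYPE_REF.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for D3DDEVTYPE_HAL and D3DDEVTYPE_REF.
enum class DeviceType { Hardware, Reference };

struct AdapterChoice {
    std::size_t index;  // adapter ordinal to pass to device creation
    DeviceType type;    // device type to request for that adapter
};

// Scans the adapter descriptions for the special PerfHUD adapter.
// If it is present, select it and request the reference device type;
// otherwise fall back to the default adapter (ordinal 0) and a normal
// hardware device.
AdapterChoice chooseAdapter(const std::vector<std::string>& descriptions) {
    assert(!descriptions.empty() && "at least one adapter is expected");
    for (std::size_t i = 0; i < descriptions.size(); ++i) {
        if (descriptions[i].find("PerfHUD") != std::string::npos) {
            return {i, DeviceType::Reference};
        }
    }
    return {0, DeviceType::Hardware};
}
```

Because the fallback is the ordinary hardware device on the default adapter, the same initialization code behaves normally when the application is started without PerfHUD. A developer who does not want a shipped build to be analyzable can simply compile this check out.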
This solution has been used for a long time, since PerfHUD 2.x, so developers can use the new version with their programs without any modifications. Applications that do not support PerfHUD in the described way cannot be analyzed with this utility. In PerfHUD 5, the conditions have been toughened. Previously, we could still see the interface and some data (FPS and the number of triangles in a scene) for such applications. Now we don't see any PerfHUD data at all.
In order to run a Direct3D application together with PerfHUD, you should specify the path to its executable file in the command line of the utility, or drag the application or its shortcut onto the PerfHUD icon. The interface uses hot keys for fast access to functions, and there are also mouse-driven controls. Input switches between the user application and PerfHUD with the hot key specified in the settings. When the program is started for the first time, it displays a configuration window, where you should specify key settings.
Here you choose a hot key to call PerfHUD, specify the disk path for log files, choose a method to intercept mouse and keyboard input (DirectInput or standard system methods), and change settings for Frame Debugger and Frame Profiler modes. The latest version also lets you force the software mouse cursor. A hardware cursor is used by default, which improves mouse control at low frame rates, but it may cause problems on rare occasions. Later on, you can open the configuration window by launching the program without specifying an application name.
PerfHUD modes:
- Performance Dashboard - Generic performance analysis and bottleneck identification with graphs and GPU usage statistics.
- Debug Console - Review DirectX Debug runtime messages, PerfHUD warnings, and custom messages.
- Frame Debugger - Analyzes stages of the graphics pipeline by freezing the current frame and examining how a scene is rendered call by call.
- Frame Profiler - A profiling mode that automatically detects and shows the most demanding draw calls. It helps find performance problems and shows how fully an application uses GPU features.
Here's how the program works. When a Direct3D application is started from PerfHUD, it runs in the initial interface mode, Performance Dashboard, which overlays the image rendered by the application. This mode is convenient for initial tests. It provides basic information about the state of the GPU pipeline while the application runs. Then you should bring up the scene that you want to analyze more thoroughly. If it shows rendering errors, it's easier to find the cause in Frame Debugger mode, where you can see how the scene is rendered call by call, and inspect the geometry, textures, shaders, and raster operations for each call. Frame Profiler will help you with performance problems. It provides advanced profiling and displays a lot of useful statistics in the form of an automatic analysis, with full information on all draw calls and the time spent in various GPU units. Let's analyze each mode in more detail.