GPGPU-5, the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units, London UK, March 2012.
This paper presents a method for auto-tuning interactive ray tracing on GPUs using a hardware model. Getting full performance from modern GPUs is a challenging task. Workloads that require guaranteed performance over several runs must select parameters for the worst-performing run. Our method uses an analytical GPU performance model to predict the current frame's rendering time for a selected set of parameters. These parameters are then optimised for a selected frame rate on the particular GPU architecture. We use auto-tuning to determine parameters such as Phong shading, shadow rays and the number of ambient occlusion rays. We sample a priori information about the current rendering load to estimate the frame workload. A GPU model is run iteratively using this information to tune rendering parameters for a target frame rate. We use the OpenCL API, allowing tuning across different GPU architectures. Our auto-tuning enables each frame to be rendered in a predicted time, so a target frame rate can be achieved even with widely varying scene complexities. Using this method we can select optimal parameters for the current execution, taking into account the current viewpoint and scene, and achieve performance improvements over predetermined parameters.
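The tuning loop described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: `predict_frame_ms` is a hypothetical linear stand-in for the analytical GPU model, and the names `RenderParams`, `tune`, `cost_ms_per_ray`, the shading factor of 1.2 and the limit `max_ao` are all assumptions made for illustration. The sketch only shows the idea of running a model iteratively to find parameters whose predicted frame time fits a target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderParams:
    phong: bool        # Phong shading on or off
    shadow_rays: bool  # one shadow ray per pixel, on or off
    ao_rays: int       # ambient occlusion rays per pixel


def predict_frame_ms(params, primary_rays, cost_ms_per_ray):
    """Hypothetical stand-in for the GPU model: time is linear in rays traced."""
    rays_per_pixel = 1 + (1 if params.shadow_rays else 0) + params.ao_rays
    shading = 1.2 if params.phong else 1.0  # made-up shading overhead factor
    return primary_rays * rays_per_pixel * cost_ms_per_ray * shading


def tune(target_ms, primary_rays, cost_ms_per_ray, max_ao=64):
    """Return the richest parameter set whose predicted time fits the target."""
    # Prefer keeping shading features; drop them only if nothing else fits.
    for phong, shadows in ((True, True), (True, False), (False, False)):
        lo, hi, best = 0, max_ao, None
        # Predicted time grows with ao_rays, so binary search is valid.
        while lo <= hi:
            mid = (lo + hi) // 2
            cand = RenderParams(phong, shadows, mid)
            if predict_frame_ms(cand, primary_rays, cost_ms_per_ray) <= target_ms:
                best, lo = cand, mid + 1  # fits: try more AO rays
            else:
                hi = mid - 1              # too slow: try fewer AO rays
        if best is not None:
            return best
    # Nothing fits the target: fall back to the cheapest configuration.
    return RenderParams(False, False, 0)
```

In this sketch the workload estimate is reduced to a single number, `primary_rays`, and features are dropped in a fixed order. A real tuner would take the sampled workload information and the per-architecture model parameters as inputs instead.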
Author generated version of the paper
The frame times show that the model can very accurately track the application's frame-to-frame behaviour, but there is an offset: the overall performance of the GPU is not correctly measured, which leads to the offset error seen in the graphs. The performance targets for the Fairy and Cabin scenes are 333 ms and 600 ms respectively.
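The distinction between tracking and offset can be made concrete with a small Python sketch. The helper `split_error` and the frame times in the usage note are hypothetical and are not taken from the paper's measurements. The mean of the prediction residuals captures a constant offset, while their spread captures how well the model follows frame-to-frame changes.

```python
from statistics import mean, pstdev


def split_error(predicted_ms, measured_ms):
    """Split prediction error into a constant offset and a tracking spread."""
    residuals = [p - m for p, m in zip(predicted_ms, measured_ms)]
    offset = mean(residuals)    # systematic over- or under-prediction
    spread = pstdev(residuals)  # frame-to-frame tracking error around the offset
    return offset, spread
```

For invented measured times of 300, 320, 310 and 330 ms and predictions of 280, 300, 290 and 310 ms, the offset is -20 ms and the spread is 0: the model follows every change exactly but is consistently 20 ms too optimistic.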
Here the HD4000 shows behaviour similar to the other GPUs. Again, it is noticeable that the HD4000 has slightly more AO rays than the 5870 while running at roughly the same speed. In general, though, the number of AO rays cannot be compared between GPUs in these graphs, because each GPU has a different frame time target.
The errors are similar to those of the other GPUs. It is notable that the HD4000 has a more consistent error, most likely due to the lack of good GPU parameters.