Optimizations for Shadows in Interactive Environments

A master's thesis in computer graphics
Author: Jon Hasselgren
Supervisor: Lennart Ohlsson, Department of Computer Science, Lund University


One of the most important steps in rendering an artificial three-dimensional image is to compute the flow of light from light sources through the world and shade objects appropriately based on the number of photons that hit them. Even though interactive graphics evolves at a high rate, there’s still a long way to go until accurate light computations can be used with acceptable performance.

This thesis presents a new algorithm intended to optimize the performance of pixel-accurate direct lighting with shadows, which is currently the state of the art in the field. The optimizations exploit assumed properties of the virtual environment, such as static objects and lights. An important part of the algorithm is that objects and light sources with dynamic properties can easily be added to the scene and receive the same quality of lighting and shadowing as the static objects, while still keeping a higher overall performance.

The algorithm has been implemented in a demo program and evaluated on mainstream hardware (GeForce FX 5200). It showed a significant increase in performance in all test cases compared to a fairly optimized traditional implementation of per-pixel lighting and shadowing.

Screenshots that illustrate the lighting quality that can be maintained with interactive or "real-time" performance using the algorithm described in the report.

The Report


The purpose of this master's thesis is to research possible optimizations for shadows in interactive graphics. In interactive graphics, performance is always an important factor since it limits the complexity of the images that can be drawn at a pace high enough to generate a sequence of images that seems animated to the user. Several optimizations for shadows already exist, but this thesis takes an alternate approach. The optimizations are based on a pre-computation phase that generates important data that can be saved to the hard drive and used in the interactive part of the application to achieve higher run-time performance.

The thesis report presents an algorithm for high-performance light and shadow rendering under the assumption that many of the lights and objects in the world are static during the whole execution time. The algorithm has been evaluated on mainstream graphics hardware, and a demo application has been constructed that implements both the algorithm described in the report and an optimized version of a common shadowing algorithm.


The final report can be downloaded in high or low quality. High quality is recommended only for printing purposes.

The Demo


Download the demo [5.8Mb]. Please read the readme while you wait for the demo to download.


Tech demo for the master's thesis:
"Optimizations for Shadows in Interactive Environments"

The tech demo (and thesis) was developed on GeForce FX 5200 and GeForce3 cards.
It's compatible with all GeForce cards from the GeForce3 onward and also all
Radeon cards supporting the ARB_vertex_program and ARB_fragment_program
extensions. Since poor performance can be expected when running
ARB_fragment_program on GeForce cards, geforce.bat will execute the demo with
an alternate, more GeForce-friendly configuration. The program can be tweaked
by editing the .cfg files.

The following keys can be used during execution of the demo:

C - Toggles optimized culling of shadow volumes for static shadows on dynamic
    objects. This culling hasn't been optimized as well as it perhaps should
    be (with tree structures etc.) and may therefore cause a CPU slowdown of
    the program at high resolutions or with very fast hardware.
S - Toggles shadowing method between the method described in the thesis and 
    an optimized implementation of the common shadow volume algorithm.
D - Toggles dynamic shadows. For the common shadow volume algorithm
    implementation this turns off all shadows, and for the optimized
    implementation this turns off shadows on and cast by dynamic objects.
E - Toggles rendering of edges (including light meshes). Open edges, which
    belong to only a single polygon, are rendered in red while other
    edges are yellow.
W - Wireframe mode
B - Bounding box rendering
V - Renders the scissor boxes used for optimizations
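
The open-edge distinction used by the E key can be illustrated with a small
sketch. This is not the demo's code; it assumes a mesh given as a list of
triangles with integer vertex indices, and the name classify_edges is
hypothetical. An edge used by exactly one triangle is open (red in the demo),
and any edge used by more than one triangle is treated as shared (yellow).

```python
from collections import Counter


def classify_edges(triangles):
    """Split the edges of an indexed triangle mesh into open and shared.

    An edge is stored as (low_index, high_index) so that both winding
    directions of the same edge are counted together.
    """
    counts = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[(min(u, v), max(u, v))] += 1
    open_edges = sorted(e for e, n in counts.items() if n == 1)
    shared_edges = sorted(e for e, n in counts.items() if n > 1)
    return open_edges, shared_edges


# Two triangles forming a quad: only the diagonal (0, 2) is shared.
quad = [(0, 1, 2), (0, 2, 3)]
open_edges, shared_edges = classify_edges(quad)
print("open:", open_edges)
print("shared:", shared_edges)
```

A closed mesh such as a tetrahedron yields no open edges at all, which is why
red edges in the demo point at holes or borders in the geometry.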

The program defaults to a 1024x768 window. If you want to see the best
performance difference and your graphics hardware is powerful, it's best to
turn up all the fill-rate eating features such as anisotropic texture
filtering, FSAA and a high resolution. The less fill-rate bound the demo is,
the smaller the difference in performance will be.

There are three .bat script files that run the demo with different
configurations:

geforce.bat    - GeForce optimizations. This won't start on Radeon cards but
                 will work on gf3/4/FX. This mode is recommended for FX cards
                 since it uses a more distant near clipping plane, which
                 removes some possible artifacts.
other.bat      - ps 2.0 compatible hardware. Radeon compatible but won't work
                 on gf3/4.
overdraw.bat   - Renders the overdraw factor. Should only work on ps 2.0
                 compatible hardware.
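
For readers unfamiliar with the term, the sketch below shows one common way
to define an overdraw factor. It is an illustration only and says nothing
about how overdraw.bat computes or displays its result. It assumes that
overdraw means the total number of pixel writes divided by the number of
screen pixels, and it stands in axis-aligned rectangles (x0, y0, x1, y1, with
exclusive upper bounds) for rasterized polygons. The function name
overdraw_factor is hypothetical.

```python
def overdraw_factor(width, height, rects):
    """Average number of writes per screen pixel for a list of rectangles."""
    counts = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in rects:
        # Clip each rectangle to the screen before counting its pixels.
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                counts[y][x] += 1
    total_writes = sum(sum(row) for row in counts)
    return total_writes / (width * height)


# A full-screen rectangle plus one covering a quarter of a 4x4 screen:
# 16 + 4 writes over 16 pixels.
print(overdraw_factor(4, 4, [(0, 0, 4, 4), (0, 0, 2, 2)]))
```

Shadow volume polygons tend to cover large screen areas, so a high overdraw
factor is a sign that the demo is fill-rate bound.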

Disclaimer / Legal
The software is provided as is with no warranty. All use is at your own risk.
The author shall not be held responsible for any damage caused by the
software. Blah blah etc.

Any textures used in this software are copies or altered copies of free
textures which can be found on www.3dcafe.com and www.grsites.com/textures
among others. I claim no credit for any textures used.

The demo may be freely copied but only in its original form and including
this readme file.

Known issues

- Illegal exception error on exit due to double object deletion. Fixing this
  is currently not a high priority.
- Cg compilation of old shaders (aka fp20) is _very_ slow, which causes a big
  CPU performance penalty, while execution of new shaders (arb/fp30) is slow
  (on FX 5200), which causes poor performance. I've also noted that using
  fp30 shaders can make performance drop due to geometry throughput at low
  resolutions, which is indeed strange. If I run the program with fp30
  shaders at a very low resolution (40x20 for instance), the triangle
  throughput, which isn't very high, still limits performance somehow.
- Radeon doesn't support a nice depth pass implementation. In this demo this
  is worked around by a "hack", namely moving the near clipping plane closer
  to the camera. This really doesn't work well, so shadows on dynamic objects
  will be incorrect on ATI cards.
- For the optimized method, shadows are not guaranteed to be correct if the
  camera is positioned outside the map. This is not an issue since the
  position of the camera should be restricted to inside the map only.
- With the common shadow volumes implementation the demo sometimes gets
  exception errors on ATI hardware. It has something to do with vertex arrays,
  but I'm not sure if it's a program bug, driver bug or "specification issue".