This is the first part of an article I began writing not too long ago. University assignments and exams halted my progress, but as I am almost done, I thought I might as well put up the first part now and add the rest once my exams are finished.
I hope you like this introduction and return to see the rest of the series. I also hope that it will encourage you to seek out resources, linked or otherwise, on the topic so you can have a try before I post how I did it. Discussion of my implementation and views, and of what you have done, is welcome and encouraged. Feel free to post in the comments and provide feedback.
Recently I have found the time to get started on the engine design I’ve had in mind for a while now, and finally make a start on some games I want to develop. Since my interest and focus is on 3D rather than 2D at the moment, I decided to go with a deferred renderer. However, I remembered a post I read last year by Wolfgang Engel about a new renderer design called the Light Pre-Pass renderer, which took the ideas of deferred shading and added back some of the flexibility of forward rendering, at the cost of an extra pass.
Since then, Crytek have announced they are using a similar technique in CryEngine 3, and other companies have used the technique, or a similar version in their own games, sometimes going under the name of Deferred Lighting.
For those who have implemented deferred shading before, this will be quite easy to understand. For those who are not yet familiar with the concept of a deferred renderer and deferred shading, I will explain it here before I continue.
In traditional forward rendering, each light is applied to the affected meshes when those meshes are rendered. This means that inside the material shader for the mesh, each individual light that affects the model must be processed. This has traditionally limited meshes to 3-4 lights, as the instruction counts for earlier shader models have been limiting. Even as those limits are lifted, the time needed to render these lights on every pixel of the mesh (even parts that are not lit), for each mesh, can add up substantially.
Deferred rendering aims to solve this problem by rendering lights and models separately, thereby reducing the complexity of the shaders for the meshes. The idea is that the meshes are drawn, and all important data such as position, normals, texture colour (diffuse), etc. is stored in a series of render targets, taking advantage of the Multiple Render Targets (MRT) feature of modern graphics cards.
The lights are then drawn, either using a full screen quad for each light, or simple meshes that represent the shape of the light volume in the world. As these are drawn, the lighting calculations make use of the pre-processed data from the meshes in the other render targets, and so lights are not processed for any unnecessary pixels. This allows for tens to hundreds of lights, as lights are essentially just simple meshes.
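To make the two stages concrete, here is a minimal CPU-side sketch in Python of the light stage reading from a G-buffer. It is not shader code, and the names (GBufferTexel, PointLight, light_stage), the one-row "screen" and the linear attenuation falloff are all assumptions made for illustration. On a GPU, drawing the light volume is what restricts work to the covered pixels; here a distance check stands in for that.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class GBufferTexel:
    """Per-pixel data written during the mesh stage (hypothetical layout)."""
    position: Vec3  # world-space position
    normal: Vec3    # unit surface normal
    albedo: Vec3    # diffuse texture colour

@dataclass
class PointLight:
    position: Vec3
    colour: Vec3
    radius: float   # extent of the light volume

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def light_stage(gbuffer, lights):
    """Shade each texel using only G-buffer data; skip texels outside a light's volume."""
    lit = [(0.0, 0.0, 0.0) for _ in gbuffer]
    evaluations = 0  # how many times a lighting calculation actually ran
    for light in lights:
        for i, texel in enumerate(gbuffer):
            to_light = tuple(l - p for l, p in zip(light.position, texel.position))
            dist = dot(to_light, to_light) ** 0.5
            if dist == 0.0 or dist >= light.radius:
                continue  # outside the light volume: no work for this pixel
            evaluations += 1
            direction = tuple(c / dist for c in to_light)
            n_dot_l = max(0.0, dot(texel.normal, direction))
            attenuation = 1.0 - dist / light.radius  # assumed linear falloff
            lit[i] = tuple(o + a * c * n_dot_l * attenuation
                           for o, a, c in zip(lit[i], texel.albedo, light.colour))
    return lit, evaluations

# Four texels along a flat floor, one small light above the first texel.
gbuffer = [GBufferTexel((float(x), 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 1.0))
           for x in range(4)]
lights = [PointLight((0.0, 1.0, 0.0), (1.0, 1.0, 1.0), 1.5)]
lit, evaluations = light_stage(gbuffer, lights)
```

Only the two texels within the light's radius are shaded; the other two are never touched by the lighting calculation.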
There are downsides here, however. For one thing, Direct3D9 and the XBOX 360 do not support MultiSample Anti-Aliasing (MSAA) for MRTs. In addition, because only one position (or depth) can be stored for each pixel, transparent objects cannot be properly rendered.
The former issue can be resolved with some post processing: a selective blur with edge detect can soften the edges enough to remove aliasing. The transparency issue, however, will require a second pass using forward rendering (or more complexity if you choose to use the stippling route, which is something that I will not be covering).
Material diversity is also an issue, as all lighting calculations are done with a single shader. This means that a single uber-shader must be written to cover all the techniques needed, which is limiting, but if you do not have a diverse selection of materials, this should not be that much of an issue.
Deferred Lighting aka Light Pre-Pass
Deferred Shading also has the negative of having a rather fat frame-buffer during the mesh stage, as quite often 4 MRTs are used to store all the required information. Rendering each of these one by one would incur a huge cost, and so MRT support is required on the graphics card.
Light Pre-Pass attempts to solve not only this issue, but also the material issue mentioned before. It works by rendering just position/depth and normals during the mesh stage. This can be split into two passes for cards that do not have MRT support, so the technique can also support DX8.1 cards. The cost of rendering the meshes a second time at this point is much lower than the cost of rendering them four times.
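As a rough illustration of the difference in frame-buffer size, the arithmetic can be sketched as below. The 1280x720 resolution and the 4 bytes per texel (a 32-bit format for every target) are assumptions for the example; real formats vary, and the function name is hypothetical.

```python
def render_target_bytes(width, height, target_count, bytes_per_texel=4):
    """Memory written during the mesh stage, assuming every target uses the same format."""
    return width * height * target_count * bytes_per_texel

# Deferred shading with four render targets vs. LPP with depth + normals only.
deferred_shading = render_target_bytes(1280, 720, target_count=4)
light_pre_pass = render_target_bytes(1280, 720, target_count=2)

print(deferred_shading / (1024 * 1024), "MiB for 4 targets")
print(light_pre_pass / (1024 * 1024), "MiB for 2 targets")
```

Under these assumptions the mesh stage writes roughly 14 MiB for four targets against roughly 7 MiB for two, which is where the bandwidth saving comes from.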
Then the lights are drawn into their own single render target. The standard components of the Blinn-Phong lighting calculation are stored for the light being rendered, and all lights are blended additively into the render target, so that the contributions of overlapping lights sum.
The lighting components stored are as follows:
- LightColour.r * N.L * Att
- LightColour.g * N.L * Att
- LightColour.b * N.L * Att
- (H.N)^SpecPower * N.L * Att
(Att = Attenuation)
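The four components listed above can be written out for a single light and a single pixel as the following Python sketch. It is a CPU-side reference rather than shader code; the function and parameter names are hypothetical, and clamping the dot products to zero is an assumption (the list above does not state it, though it is the usual practice).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    if length == 0.0:
        return (0.0, 0.0, 0.0)  # guard against a degenerate vector
    return tuple(c / length for c in v)

def light_buffer_value(normal, to_light, to_eye, light_colour, attenuation, spec_power):
    """Return the (r, g, b, a) value one light writes for one pixel."""
    n = normalize(normal)
    l = normalize(to_light)
    v = normalize(to_eye)
    h = normalize(tuple(a + b for a, b in zip(l, v)))  # Blinn-Phong half vector
    n_dot_l = max(0.0, dot(n, l))   # assumed clamp
    h_dot_n = max(0.0, dot(h, n))   # assumed clamp
    diffuse_scale = n_dot_l * attenuation
    specular = (h_dot_n ** spec_power) * n_dot_l * attenuation
    r, g, b = light_colour
    # rgb: LightColour * N.L * Att, alpha: (H.N)^SpecPower * N.L * Att
    return (r * diffuse_scale, g * diffuse_scale, b * diffuse_scale, specular)

# Light and eye both directly above an upward-facing surface, half attenuated.
texel = light_buffer_value(
    normal=(0, 1, 0), to_light=(0, 1, 0), to_eye=(0, 1, 0),
    light_colour=(1.0, 0.5, 0.25), attenuation=0.5, spec_power=32)
```

In this head-on case both dot products are 1, so the stored value is simply the light colour scaled by the attenuation, with the attenuation itself in the alpha channel.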
As you can see, the three channels of colour for the light are stored in the rgb channels of the render target, and the specular component is stored in the alpha channel.
Modifications can be made to this to store the above in a different colour space, however that is outside the scope of this article. (For more information, check out the article on the CIE Colour Space by Pat Wilson from GarageGames in ShaderX7, and see the comments here for an interesting discussion on using Luminance and/or HSL for storing the light data.)
The lights are blended together additively, so each pixel in the buffer ends up holding a single combined lighting value.
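As a small sketch of how the per-light values combine, the following Python assumes the blend is additive (source and destination factors both set to one), so each light's four components are simply summed into the pixel. The function names and the example values are hypothetical.

```python
def accumulate(buffer_texel, contribution):
    """Additive blend of one light's (r, g, b, a) value into the light buffer."""
    return tuple(dst + src for dst, src in zip(buffer_texel, contribution))

def accumulate_all(contributions):
    texel = (0.0, 0.0, 0.0, 0.0)  # light buffer cleared to black
    for contribution in contributions:
        texel = accumulate(texel, contribution)
    return texel

# Two overlapping lights: a red one and a dimmer blue one.
red_light = (0.5, 0.0, 0.0, 0.25)
blue_light = (0.0, 0.0, 0.25, 0.125)
combined = accumulate_all([red_light, blue_light])
```

Note that with an 8-bit fixed-point render target the sum would saturate at 1.0 per channel, which this sketch does not model.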
Once this buffer is rendered, the meshes are re-rendered, but this time we simply take the lighting data stored in the existing buffer, and apply that to the material for the mesh. Using LPP we gain the ability to have diverse material types, and each material can have its own shader again.
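To show why this brings material variety back, here is a hedged Python sketch of two different "material shaders" reading the same light buffer value. Both functions and their parameters are hypothetical, and the way the specular term is combined (material specular colour multiplied by the stored alpha) is one simple choice assumed for illustration, not necessarily the one used later in this series.

```python
def matte_material(light_texel, albedo):
    """Diffuse-only material: ignores the specular term stored in alpha."""
    r, g, b, _specular = light_texel
    return (albedo[0] * r, albedo[1] * g, albedo[2] * b)

def glossy_material(light_texel, albedo, specular_colour):
    """Material that also adds the stored specular term, tinted by its own colour."""
    r, g, b, specular = light_texel
    return tuple(a * l + s * specular
                 for a, l, s in zip(albedo, (r, g, b), specular_colour))

# The same lighting value, fetched from the light buffer, feeds both materials.
light_texel = (0.5, 0.25, 0.125, 0.5)
matte = matte_material(light_texel, albedo=(1.0, 1.0, 1.0))
glossy = glossy_material(light_texel, albedo=(1.0, 1.0, 1.0),
                         specular_colour=(0.2, 0.2, 0.2))
```

The lighting is computed once in the light pass, and each material is free to interpret it however it likes in the second mesh pass.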
You can no doubt see that, compared with forward rendering, we can have 10, 20, even 50 lights all lighting a single pixel on the mesh, and we have to draw much less.
So far you can see the following benefits to using Light Pre-Pass over Deferred Shading:
- Lower cost per light due to smaller calculations in the light pass
- A greater material variety
- Less memory bandwidth usage and texture fetches – at most 2 during the light phase
- Does NOT require SM3.0 hardware for the 4x MRT support; it can actually run on DX8.1 hardware with no MRT support (just split depth + normal rendering)
MSAA is something I have not covered for LPP so far, although it can technically be done if you are fine with splitting the depth and normals rendering. If you go the same route as for supporting a DX 8.1 card, you only have a single render target at a time, and MSAA can be enabled for all passes. (You can turn it off for the light stage though, as it is not absolutely required there.)
I would suggest that if MSAA is enabled, you use the split path of separate depth and normals, and when MSAA is not enabled (and the hardware supports MRTs) you do the combined rendering. The speed-up from a depth-only write still does not match rendering the mesh only once during that stage. I will not be covering the split method in this article, as I want to keep it simple for newcomers. It is easy enough to add once you understand how to implement the standard form.
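The suggestion above boils down to a small decision, sketched here in Python. The function name and the "split"/"combined" labels are hypothetical, and falling back to the split path when MRTs are unavailable is my reading of the earlier DX8.1 discussion.

```python
def choose_gbuffer_path(msaa_enabled: bool, supports_mrt: bool) -> str:
    """Pick how depth and normals are rendered during the mesh stage."""
    if msaa_enabled or not supports_mrt:
        # One render target at a time, so MSAA can stay on (or MRTs are not needed).
        return "split"
    # Depth and normals written together in a single pass using MRTs.
    return "combined"
```

For example, choose_gbuffer_path(False, True) selects the combined path that this article implements.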
For this particular article, I am only going to implement the combined depth and normals path. This means that to properly follow along, you will need a card capable of at least two simultaneous render targets. Most modern cards will do this, and the XBOX 360 will do up to 4 at a time, so that will work fine. If you have a Shader Model 3.0 card, then you are set; with anything older, you would need to check.
Other requirements include:
- Understanding of the Blinn-Phong lighting equation – This is outside the scope of this article. Basic Phong will suffice if you know the different specular calculation.
- Visual Studio 2008 (Express will do)
- XNA Game Studio 3.1 (3.0 will suffice if you only have that, but why not get 3.1? It's free)
- FX Composer or ATI RenderMonkey – or anything else that will let you write shaders; these just provide syntax highlighting
- An understanding of Vector/Matrix maths
Continue to Implementation.