Goals
One of our artists best described Infinite's style as "exaggerated reality." The world of Columbia was colorful, highly saturated, and high in contrast. We needed to handle both bright, sunny exteriors and dark, moody interiors simultaneously. We were definitely not going for photorealism.
The levels were bigger than anything Irrational had attempted before. The previous game Irrational had worked on, BioShock, was more of an intimate corridor shooter. In contrast, we wanted Columbia to feel like a big city in the clouds. This meant much bigger and much more open spaces that still retained the high detail required for environmental storytelling, because much of the storytelling in a BioShock game was done via the world itself.
Finally, all of this had to perform well on all of our platforms.
The end result (image lost)
Hybrid Lighting System
The lighting system we came up with was a hybrid system between baked and dynamic lighting:
- Direct lighting was primarily dynamic
- Indirect lighting was baked in lightmaps and light volumes
- Shadows were a mixture of baked shadows and dynamic shadows
- The system handled both stationary and moving primitives.
Deferred Lighting
Dynamic lighting was handled primarily with a deferred lighting / light pre-pass renderer. This met our goals of high contrast/high saturation -- direct lighting baked into lightmaps tends to be flat, mostly because the specular approximations available were fairly limited. We went with the two-stage deferred lighting approach primarily because the information we needed for our BRDF and baked shadows would not fit in four render targets. We did not want to sacrifice UE3's per-pixel material parameterization, so something like a material ID system to compact the G-Buffers was out of the question. This of course meant two passes on the geometry instead of one, which we dealt with by running render dispatch in parallel, instancing, and clever art.
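As a rough illustration of the two-stage structure (every type and function here is a placeholder stub, not engine code), a frame looks something like this:

```cpp
// Illustrative outline of a light pre-pass frame; all names are placeholders.
struct GBuffer      {};  // normals + gloss, plus baked-shadow attenuation
struct LightBuffers {};  // separate HDR diffuse and specular accumulation

static void         RenderDepthPrePass()                                     {}
static GBuffer      RenderGBufferPass()                                      { return {}; }
static LightBuffers AccumulateDynamicLights(const GBuffer&)                  { return {}; }
static void         RenderMaterialPass(const GBuffer&, const LightBuffers&)  {}

void RenderFrame()
{
    RenderDepthPrePass();                                  // large, static, opaque occluders only
    GBuffer gbuffer = RenderGBufferPass();                 // pass 1 over the geometry
    LightBuffers lit = AccumulateDynamicLights(gbuffer);   // per-light projection into lighting buffers
    RenderMaterialPass(gbuffer, lit);                      // pass 2: full materials sample the lighting
    // translucency and post-processing follow
}
```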
There's been a ton written on this technique, so I'm just going to point out a few wrinkles about our approach.
We used separate specular and diffuse lighting buffers rather than do the combined trick Crytek used. Aside from getting better results, this was cheaper on all of our platforms. Storing the specular luminance basically requires a FP16 buffer since we need an HDR alpha channel. With separate buffers we used the 10/11 bit FP formats on 360 and PC. We encoded to RGBM and blended in the pixel shader on the PS3. This ends up being equivalent bandwidth to a single FP16 buffer.
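For reference, a generic RGBM encode/decode looks roughly like the following; the range constant and quantization details are placeholders rather than the exact shader math we shipped:

```cpp
#include <algorithm>
#include <cmath>

// Generic RGBM scheme: HDR color packed into 8-bit RGBA with the shared
// multiplier in alpha. kRGBMRange is an assumed maximum, not Infinite's value.
constexpr float kRGBMRange = 6.0f;

struct Color4 { float r, g, b, a; };

Color4 EncodeRGBM(float r, float g, float b)
{
    float m = std::max({r, g, b}) / kRGBMRange;
    m = std::min(1.0f, std::ceil(m * 255.0f) / 255.0f);          // quantize upward so decode never clips
    float scale = (m > 0.0f) ? 1.0f / (m * kRGBMRange) : 0.0f;
    return { r * scale, g * scale, b * scale, m };
}

void DecodeRGBM(const Color4& c, float& r, float& g, float& b)
{
    float m = c.a * kRGBMRange;                                   // recover the shared multiplier
    r = c.r * m;  g = c.g * m;  b = c.b * m;
}
```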
Doing a limited depth-only pre-pass was still a win on the consoles, but we disabled it on most PC hardware. We only rendered a subset of potential occluders in this pass. Primitives in the depth-only pass had to be static (no skinning), cover a reasonable screen area (nothing small), and require no state changes (simple opaque materials only). The player hands and gun were an exception to the "no skinning" rule, as they always covered a significant amount of screen space and needed to be masked out in stencil anyway. The extra pass was rendered in parallel and was really cheap to do, and on the consoles it saved much more GPU time than it cost.
We supported UE3-style light functions in the deferred pass by compiling unique light projection shaders for light function materials. This was much cheaper than the stock implementation and our artists used these to great effect.
Finally, our G-Buffer contained the normals and glossiness, as is fairly standard, but we had a second buffer which contained baked shadow information. More on this later.
Depth (image lost)
Normal/gloss buffer (image lost)
Light attenuation buffer with baked and dynamic shadows (image lost)
Diffuse lighting (image lost)
Specular lighting (image lost)
End scene color before post (image lost)
Our first BRDF was the legacy Phong model which has been used for ages in games. When we were putting together our first demo, we had a lot of trouble making materials that looked good in both bright and dark areas, which resulted in a ton of hacks and tweaking per-primitive and per-material.
We modified our BRDF to help solve this mid-project. It sounds crazy but the artists were willing. They didn't like having to tweak materials per-primitive in the world and knew it would be impossible to deliver on our quality goals if that state of affairs continued.
The new model used energy-conserving Phong, switched to using gloss maps, and added environmental specular with IBL. For IBL, artists would place env spec probes throughout the level, with volumes which determined their area of effect, and the lighting build generated pre-filtered cubemaps. We used Sébastien Lagarde's modified AMD CubeMapGen to filter the cubemaps. Most primitives used a single probe for their spec, but we also supported blending between two probes for certain primitives such as the player gun, to avoid popping when transitioning between cube probes.
For efficiency, our geometric term was set to cancel out with the divisor.
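As a rough sketch of an energy-conserving Phong specular term driven by a gloss map (the gloss-to-exponent mapping and the normalization constant here are common choices, not necessarily the exact ones we used):

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

float Dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

Vec3 Reflect(const Vec3& i, const Vec3& n)                 // reflect incident direction i about n
{
    float d = 2.0f * Dot(i, n);
    return { i.x - d*n.x, i.y - d*n.y, i.z - d*n.z };
}

// Specular response for a unit-intensity light: normalized Phong lobe times the
// lighting cosine. L and V point away from the surface and are assumed normalized.
float NormalizedPhongSpec(const Vec3& N, const Vec3& L, const Vec3& V, float gloss)
{
    const float kPi = 3.14159265f;
    float n = std::exp2(gloss * 10.0f + 1.0f);             // assumed gloss-to-exponent mapping
    Vec3  R = Reflect({ -L.x, -L.y, -L.z }, N);            // reflection of the light direction
    float rDotV = std::max(0.0f, Dot(R, V));
    // The (n + 2) / 2pi factor keeps the lobe's integral roughly constant as n changes,
    // so tight bright highlights and broad dim ones carry similar energy.
    return (n + 2.0f) / (2.0f * kPi) * std::pow(rDotV, n) * std::max(0.0f, Dot(N, L));
}
```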
We experimented with switching to a more physically plausible NDF such as Blinn-Phong, but too much content had been built assuming a Phong lobe, and it would have made the transition to the new model too difficult.
We could not afford to do per-light Fresnel, but we added a material option to use N dot V Fresnel for both env spec and analytic spec. This isn't correct, but I'm pretty sure a few other games have done it; unfortunately I can't find the links.
I would not in a million years say what we did was physically based shading (hence "influenced"), but we did use many of the ideas even if they were applied in an ad hoc fashion. We did get a lot more predictability and consistency of material response in different lighting environments, which was the goal, and did it in a way that minimized the transition pain for a project already in development. If I could have done it all over again, I would have concentrated on the BRDF much, much earlier and used more systemic reference.
UE3 had a built-in baked shadow system, but it had some limitations. "Toggleable" lights can't move, but they can change brightness and color. The system could bake the occlusion information to a texture for a given light by projecting the shadow into texture space using a unique UV mapping for each primitive. Each primitive-light combination required a unique texture in an atlas. The more primitive-light interactions you had, the more the memory used by these textures would grow.
We came up with a system that supported baked shadows but put a fixed upper bound on the storage required for baked shadows. The key observation was that if two lights do not overlap in 3D space, they will never overlap in texture space.
We made a graph of lights and their overlaps. Lights were the vertices in the graph and the edges were present if two lights' falloff shapes overlapped in 3D space. We could then use this graph to do a vertex coloring to assign one of four shadow channels (R,G,B,A) to each light. Overlapping lights would be placed in different channels, but lights which did not overlap could reuse the same channel.
This allowed us to pack a theoretically infinite number of lights into a single baked shadow texture, as long as the graph was 4-colorable. I explained this to artists as "no light can overlap with more than three other lights." Packing non-overlapping lights into the same channel is useful for large surfaces such as floors or hallways. The shadow data was packed into either a DXT1 or DXT5 texture, depending on how many shadow channels were allocated for a primitive, and packed into an atlas. Baked shadows were stored in gamma space, as we found this produced much better results; storing them in linear resulted in banding in the shadows.
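A minimal sketch of the greedy channel assignment (illustrative code, not our editor implementation):

```cpp
#include <array>
#include <vector>

// Each light gets one of four shadow channels (R,G,B,A); lights whose falloff
// shapes overlap must land in different channels. Names and layout are placeholders.
constexpr int kNumChannels = 4;    // maps to R, G, B, A in the baked shadow texture
constexpr int kNoChannel   = -1;

struct Light
{
    bool directional = false;        // directional lights get first pick (heuristic 1)
    int  channel     = kNoChannel;   // an existing assignment is kept if still legal (heuristic 2)
    std::vector<int> overlaps;       // indices of lights whose falloff overlaps this one
};

// Returns false if some light could not be assigned (the graph is not 4-colorable),
// which is where the editor would warn the artist about too many overlaps.
bool AssignShadowChannels(std::vector<Light>& lights)
{
    // Process directional lights first so they keep priority over other light types.
    std::vector<int> order;
    for (int pass = 0; pass < 2; ++pass)
        for (int i = 0; i < (int)lights.size(); ++i)
            if (lights[i].directional == (pass == 0))
                order.push_back(i);

    bool ok = true;
    for (int i : order)
    {
        std::array<bool, kNumChannels> used = {};            // channels taken by overlapping lights
        for (int j : lights[i].overlaps)
            if (lights[j].channel != kNoChannel)
                used[lights[j].channel] = true;

        if (lights[i].channel != kNoChannel && !used[lights[i].channel])
            continue;                                         // keep a still-valid existing channel

        lights[i].channel = kNoChannel;
        for (int c = 0; c < kNumChannels; ++c)
            if (!used[c]) { lights[i].channel = c; break; }

        ok &= (lights[i].channel != kNoChannel);              // too many overlaps: report it
    }
    return ok;
}
```

The channel index assigned here is what selects the R/G/B/A mask used at runtime, described next.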
During rendering we would unpack the data into the proper global channels, either using texture remap hardware on the consoles or a matrix multiply on the PC. The global shadow channels were rendered during the G-Buffer pass into a light attenuation buffer. Dynamic shadows from toggleable lights would be projected into this buffer using a MIN blend (since this is just storing obscurance, you want the more obscured value). When projecting lights, each light would sample the light attenuation buffer and do a dot product with a shadow channel mask to obtain its appropriate shadowing value.
Some notes on this approach:
- Vertex coloring of an arbitrary graph is NP-complete. We used an incremental greedy approximation with a couple of heuristics: first, directional lights had priority for their assigned channel over any other light type; second, if a light already had a shadow channel assigned, we preferred to keep it rather than reassign it.
- Because our shadow channel assignment was incremental, we could give artists instant feedback in the editor when they had too many overlaps.
- Point/Point and Point/Spot overlap detection is trivial, but for Spot/Spot we generated convex hulls that approximated the spotlight falloff shape and did a convex/convex intersection.
- Compression artifacts can occur due to packing independent channels into DXT colors, but in practice this didn't affect the final image much as it was mitigated by the inherent noise in our normal maps and diffuse maps.
- The sampling rate used for projecting the shadow into the texture can cause data to overlap when two lights' falloff shapes are close to each other but do not touch. In practice this does not cause an issue, because the two lights are generally already attenuated by their falloff in the overlapping areas.
- When projecting dynamic shadows on top of the baked shadows, it is important to clip the shadows to the falloff of the light because the shadow projection is a frustum that may go outside of the light's falloff boundary, which can cause incorrect shadowing on nearby lights sharing the same channel.
Baked Shadows on Dynamic Primitives
One problem with baked shadows was handling static primitives casting shadows on dynamic primitives. In stock UE3, the solution was "preshadowing" which did a dynamic projection of the static caster onto the dynamic primitive, but masked to the dynamic primitive via stencil. This was not sufficient for our needs as the whole point of baked shadows is to avoid the cost of projecting dynamic shadows from static geometry.
Our solution was to bake low-frequency shadowing data from static primitives into a volume texture. These volume textures were streamed into a virtual texture which surrounded the camera. Because we had a global shadow channel assigned per light, we knew that a light's baked shadow data would not conflict with any other lights' shadowing information.
Dynamic primitives just needed to do a single volume texture tap to get their shadowing information during the G-Buffer pass, and wrote it into the light attenuation buffer.
As the camera moved through the world, we streamed chunks of shadowing information into a single volume texture representing the shadowing information near the camera. We used UVW wrap mode to avoid having to actually shift these chunks around in memory - imagine a 2D tile scroller, but in 3D. The Far Cry 3 team independently developed a similar scheme for moving a virtual volume texture around the world for their deferred radiance transfer volumes, and they have a pretty good explanation of the technique.
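A minimal sketch of the wrap-around addressing (chunk size, resolution, and names are made up for illustration):

```cpp
#include <cmath>

constexpr int   kChunksPerAxis  = 8;       // assumed: volume texture covers 8x8x8 world chunks
constexpr float kChunkWorldSize = 512.0f;  // assumed world units covered by one chunk

struct Int3 { int x, y, z; };

// Which world-space chunk contains this position (floor division).
Int3 WorldToChunk(float x, float y, float z)
{
    return { (int)std::floor(x / kChunkWorldSize),
             (int)std::floor(y / kChunkWorldSize),
             (int)std::floor(z / kChunkWorldSize) };
}

// Where a world chunk lives inside the wrapped volume texture. Because the sampler
// uses UVW wrap, a chunk never has to move in memory: when the camera crosses a chunk
// boundary, only the newly exposed ring of chunks is streamed in, overwriting the
// chunks that just fell out of range on the far side.
Int3 ChunkToTextureSlot(const Int3& chunk)
{
    auto wrap = [](int v) { return ((v % kChunksPerAxis) + kChunksPerAxis) % kChunksPerAxis; };
    return { wrap(chunk.x), wrap(chunk.y), wrap(chunk.z) };
}
```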
For objects far away from the camera, we kept around an in-memory "average" shadowing volume texture that covered the entire map. To reduce memory consumption, this data was kept ZLIB-compressed in memory and sampled once per primitive on the CPU in a bulk job that ran in parallel.
We stored indirect lighting from "toggleable" lights in lightmaps and light volumes. This could be disabled for certain toggleable lights if their color or brightness was going to be radically modified at runtime. For some fill lights we baked both their direct and indirect contributions, to give artists flexibility in areas that had too many overlapping direct lights to make them all "toggleable," or where they needed extra performance.
For indirect lighting on static primitives, we used UE3's stock lightmaps pretty much unmodified, except we generated them with Autodesk Beast. UE3's Lightmass GI solver did not exist when we started the project.
For dynamic primitives, we used a similar scheme to baked shadows, storing baked lighting in a volume texture that was streamed around the camera. Light volumes were encoded as spherical harmonics. We used a heavily compressed encoding scheme that used two DXT3 textures to store the constant and linear bands of HDR spherical harmonic data. The constant band was stored as RGBM in a single texture. The linear band was stored as a direction in RGB and a scale in alpha in another DXT3. We used DXT3 rather than DXT5 for predictable quantization of the scale terms; we found this led to much less error when very bright samples were next to dark ones.
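One plausible way to pack such a sample looks roughly like this; reducing the linear band to a single direction and magnitude, and the range constants, are assumptions for illustration rather than the shipped encoding:

```cpp
#include <algorithm>
#include <cmath>

struct Color4 { float r, g, b, a; };

constexpr float kConstantRange = 8.0f;   // assumed max HDR value for the constant band
constexpr float kLinearRange   = 8.0f;   // assumed max magnitude for the linear band

// Constant band: HDR RGB packed as RGBM into one 8-bit RGBA texel.
Color4 PackConstantBandRGBM(float r, float g, float b)
{
    float m = std::min(1.0f, std::max({r, g, b}) / kConstantRange);
    float inv = (m > 0.0f) ? 1.0f / (m * kConstantRange) : 0.0f;
    return { r * inv, g * inv, b * inv, m };
}

// Linear band: a direction in RGB (biased into [0,1]) plus a magnitude in alpha.
Color4 PackLinearBandDirScale(float lx, float ly, float lz)
{
    float len   = std::sqrt(lx * lx + ly * ly + lz * lz);
    float scale = std::min(1.0f, len / kLinearRange);
    if (len > 0.0f) { lx /= len; ly /= len; lz /= len; }
    return { lx * 0.5f + 0.5f, ly * 0.5f + 0.5f, lz * 0.5f + 0.5f, scale };
}
```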
The biggest problems were due to bleed of lighting through surfaces due to the low sampling frequency of the light volumes. This was mitigated by the fact we primarily used this on dynamic primitives which did not take part in the GI solution, so there were no problems from self-occlusion there. Additionally, when generating the volumes we biased samples above floors (i.e. we bled light from above a floor to beneath it rather than the other way around).
One particular challenge was doors. Our doors were generally closed except for brief moments, and for lighting build purposes we had a static placeholder in the doorway to prevent light from bouncing between rooms. In game though, the door was a dynamic primitive. This meant it often got the indirect lighting from either one room or the other, depending on where it fell in the light volume sampling grid. One solution I considered was pushing out the light volume sample along the geometry normal, similar to Crytek. In the end, it was easier for me to just generate a lightmap for the door mesh since I knew it was mostly going to remain closed. Since direct lighting was dynamic and the door's shadow was dynamic, you still got proper runtime shadowing when the door opened, but indirect lighting would be baked.
Translucent lighting is always the bane of deferred rendering and we were no different. I considered using inferred lighting, but my prototypes showed the reduction of lighting resolution with even just one layer of translucency was unacceptable for our use cases. I did not want to maintain a separate forward path, as we didn't have the resources.
The solution we used came out of the prototype I had done for screen space spherical harmonic lighting. The basic idea was to do something similar to UE3's lighting environments, but completely on the GPU. Bungie developed a similar translucent lighting approach for Destiny.
We had three 96x32 FP16 render targets (3072 light environment samples) which would accumulate the lighting in 2-band SH in GPU light environments. Primitives would be assigned a pixel, and write their view space position into another FP16 texture. Each frame we'd project all the visible lights into SH and accumulate them into these render targets. This projection would use the baked shadow volume for shadowing from static primitives. We didn't support dynamic shadows on translucency, although the technique doesn't preclude it. Light volumes would also be projected and accumulated into these render targets.
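Projecting a directional light into a 2-band SH light environment looks roughly like this (standard real SH basis constants; per-light falloff and the exact projection details are omitted):

```cpp
// One SH vector per color channel, as in the GPU light environments above.
struct SH4 { float c[4]; };   // coefficients for Y0,0  Y1,-1  Y1,0  Y1,1

struct SHLightEnvironment
{
    SH4 red = {}, green = {}, blue = {};
};

// dir points *toward* the light and is assumed normalized; shadow is the 0..1
// visibility sampled from the baked shadow volume for this light's channel.
void AddDirectionalLight(SHLightEnvironment& env,
                         float dx, float dy, float dz,
                         float r, float g, float b,
                         float shadow)
{
    // Real spherical harmonic basis evaluated in the light direction.
    const float basis[4] = { 0.282095f,
                             0.488603f * dy,
                             0.488603f * dz,
                             0.488603f * dx };
    for (int i = 0; i < 4; ++i)
    {
        env.red.c[i]   += basis[i] * r * shadow;
        env.green.c[i] += basis[i] * g * shadow;
        env.blue.c[i]  += basis[i] * b * shadow;
    }
}
```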
When a translucent primitive was rendered, it would sample the appropriate pixel for its lighting environment. Even though we were sampling the same pixels over and over, on console we found it was actually faster to have a CPU job convert the SH textures into shader constants after the GPU was done producing them.
The GPU light environments had varying quality levels. The lowest was nondirectional lighting, which only used the constant band but was very fast. At the highest, we would generate a runtime lightmap by rendering a primitive's positions into UV space and allocating a small area of the SH textures as a lightmap. This was mostly used on large water sheets and other large translucent objects.
Our GPU light environments were useful for skin and hair rendering. For skin, we rendered both the standard deferred lighting and a GPU light environment. In the second pass which applied the deferred lighting, we took the GPU light environment, windowed the SH, multiplied it by the transmission color (reddish for skin), and blended it with the deferred lighting. This was in effect a very cheap approximation to blurring the lighting. For hair, we split the light environment into direct and indirect components, and extracted a directional light for each. We then used a hair spec model loosely based on this AMD presentation.
A few things that don't warrant their own section
- For SSAO, on consoles and low end PC we used Toy Story 3's line integral approach. On high end PC we used HDAO.
- For fog, we used exponential height fog. Fog settings were put into UE3's post process volume system, so artists could preview and tweak fog per-area easily.
- We also placed the main directional light's settings in the post process volume system. Artists would turn the sunlight off via the post process volumes when the player was in a fully interior area, which is a simple but very effective optimization.
- We rewrote UE3's stock light shafts. Mathi Nagarajan came up with an optimization to do the light shaft radial blur in two passes - a coarse pass and a refinement pass. This allowed us to get many more effective samples, making it practical to use them all the time on console. It does suffer from some stippling when a light source is near the edge of the screen and the viewer is at certain angles. On high end PC we increased the number of samples which handles most of these cases. In hindsight we should have tried increasing the number of samples as a light source got close to the edge of the screen, even on console.
- Dynamic shadows were projected at a lower resolution than the screen, using a small number of taps (4 on 360, 4 HW PCF on PS3, and 8 on PC). We then blurred (edge-aware) and up-sampled using a bilateral filter. Even though edge-aware blurs and bilateral filters are not separable, we implemented it as separable after reading this paper, and it worked pretty well.
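To make the last point concrete, here is a minimal sketch of one 1D pass of a depth-aware blur run separably (once horizontally, once vertically); the radius, sigmas, and weight formulation are placeholders, not the shipped filter:

```cpp
#include <cmath>
#include <vector>

// One 1D pass of a depth-aware (bilateral-style) blur over a low-resolution
// shadow mask. Strictly speaking the filter is not separable, but running it
// as two 1D passes works well enough in practice.
std::vector<float> BilateralBlur1D(const std::vector<float>& shadow,
                                   const std::vector<float>& depth,
                                   int radius = 3,
                                   float depthSigma = 0.05f)
{
    std::vector<float> result(shadow.size());
    for (int i = 0; i < (int)shadow.size(); ++i)
    {
        float sum = 0.0f, weightSum = 0.0f;
        for (int o = -radius; o <= radius; ++o)
        {
            int j = i + o;
            if (j < 0 || j >= (int)shadow.size()) continue;

            // Spatial weight: simple Gaussian falloff with the tap offset.
            float spatial = std::exp(-(float)(o * o) / (2.0f * radius * radius));
            // Range weight: taps at a very different depth barely contribute,
            // which keeps shadows from bleeding across silhouette edges.
            float dd    = depth[j] - depth[i];
            float range = std::exp(-(dd * dd) / (2.0f * depthSigma * depthSigma));

            sum       += shadow[j] * spatial * range;
            weightSum += spatial * range;
        }
        result[i] = (weightSum > 0.0f) ? sum / weightSum : shadow[i];
    }
    return result;
}
```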
Acknowledgements
Big games are always a collaborative undertaking, and Infinite was no exception. Toward the end of the project we probably had 5-6 programmers doing rendering work. I can't list them all here but did want to call out a few people specifically. Mathi Nagarajan was an exclusive contractor to Irrational who was on for the bulk of the project, and a key contributor. Iron Galaxy did a lot of platform optimization and bug-fixing, particularly for PS3.
On the art side there were so many awesome and talented artists who really made the game shine, but I want to call out two in particular who worked closely with me on the tech side. Spencer Luebbert was our Tech Artist who iterated with me closely on many key lighting features, and did an excellent job of documenting and educating the rest of the team on how to get the best out of the engine. Stephen Alexander was our Lead Effects Artist, but he also often pushed the engine to its limits to do things I didn't even think possible.
Finally, I want to thank all the people who wrote an article, blog post, presentation, or paper about rendering techniques. This sharing was a big help to everything we did, and this entry is only a small down payment on the debt I owe.
Interesting read, and good job on the game obviously, thanks! I really loved the graphics of Bioshock Infinite, and also the fact that it actually ran with a solid framerate on my computer.
Awesome post Steve, great work. The end result speaks for itself.
well done on getting it out there and finishing these things up - bugs and all. the end result is visually exciting and - whilst I'm sure a lot of that is art rather than code - the visual appeal of the game is something that attracted me to it from the first. :)
just quick thought... and i'm guessing something that may have been considered and just not done due to quality of the data or time constraints:
> We also placed the main directional light's settings in the post process volume system. Artists would turn the sunlight off via the post process volumes when the player was in a fully interior area, which is a simple but very effective optimization.
this is a small saving for artists but if you have some partitioning and pvs solution out of UE3 (i assume so by their static BSP geometry and baking of things - but i've not touched the code sadly and i don't have the tools to test and its late... blah blah lazy) then you can do this at the exact correct times... this way you can turn it off if e.g. stood outside but looking into an area of complete shadow. the thing is not to see if the light itself is visible but everything visible to the light itself in the pvs (i.e. the pvs for the pvs volume containing the light) which is precisely the set of geometry on which light/shadow can occur from the light source. it may be that none of this is visible if you are e.g. stood behind a building which is blocking the sunlight, looking through windows etc as well as entirely inside and blocked off... :)
i have a particular bee in my bonnet about getting artists/designers to toggle things in volumes or place volumes in the world. its fine for exceptional things (eg one exterior lighting source) but when most of your culling relies on this sort of work (which sadly i have seen - twice now) it becomes uneconomical... if not embarrassing that the dev team does not understand the technology that let Quake get out of the door at framerate all those years ago.
and a small further note you can use the same technique to cull the geometry fed into the shadow pass as well - if its not in the pvs from the light's perspective then it can't cast a shadow you will ever see from that light. :)
UE3 used GPU occlusion queries for occlusion culling, and that's the system we used.
Toward the end of Infinite, Epic added PVS primarily for mobile platforms which don't support HW occlusion queries, but we were well into development by then. And their solution still requires some artist intervention: https://udn.epicgames.com/Three/PrecomputedVisibility.html
For the most part we used their stock GPU occlusion query system. We did some tweaks to mitigate pop-in due to misprediction (results were frame-delayed), and we completely rewrote the viz cull to run in parallel.
This was all automatic and applied to lights and shadows as well as primitives, and best of all did not require any lengthy build time or memory. Iteration speed was very important to us, and we were very tight on memory.
The directional light was a manual exception, and something that's important to realize is the artists set up these post process volumes anyway, because they used them heavily to control the look of different areas. Even if the directional light was handled automatically, they would still place these volumes for art direction reasons. So there's not really much time savings there.
I go back and forth on visibility solutions that require precomputation - the tradeoff between iteration time and runtime perf is not always a clear win one way or the other, I think it is game and workflow dependent.
An excellent read and very thorough! I thoroughly enjoyed the art style and appreciate the nitty gritty details put into the scenes by the entire team to make it possible!
Great summary! It would be nice to see what effects you could achieve with UE4. Have you considered Clustered Deferred Rendering?
Thanks! One of our programmers who came on after Infinite, Paul Zimmons, was working on converting the renderer to tiled deferred, and clustered deferred was on his todo list to explore.
Yes, amazing work! I love the look of Infinite! I'm playing through the game now, and really enjoying it.
Wow, this is VERY close to our lighting system. Interesting how parallel development works. Nice article.
I am confused... you didn't want to add a forward pipeline. But how did you handle rendering the translucents before lighting?