Unity: Forward, Forward+, Deferred, Deferred+

https://discussions.unity.com/t/deferred/1579636/3

Forward

Render both transparent and opaque objects, one at a time. For each object, we bind a list of the lights that affect it. On the GPU, we iterate over this list and accumulate lighting.
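As a rough illustration of the per-object light list, here is a minimal Python sketch. It is not Unity code: the `Light` class, the sphere-based range test, and the linear falloff are all assumptions made up for the example. It only shows the shape of the idea: the CPU picks the lights for an object, and the shader loops over that list.

```python
from dataclasses import dataclass

@dataclass
class Light:
    position: tuple   # (x, y, z) in world space
    intensity: float
    radius: float     # range beyond which the light contributes nothing

def distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

def lights_affecting(obj_center, obj_radius, lights):
    """CPU side: build the light list for one object (hypothetical sphere test)."""
    return [l for l in lights
            if distance(obj_center, l.position) <= l.radius + obj_radius]

def shade(point, light_list):
    """GPU side (simulated): iterate over the bound list and accumulate lighting."""
    total = 0.0
    for l in light_list:
        d = distance(point, l.position)
        if d < l.radius:
            total += l.intensity * (1.0 - d / l.radius)  # toy linear falloff
    return total

lights = [Light((0, 0, 0), 1.0, 5.0), Light((100, 0, 0), 1.0, 5.0)]
bound = lights_affecting((1, 0, 0), 1.0, lights)  # only the nearby light is bound
```

The far light never reaches the shading loop for this object, which is why the per-object list keeps the GPU work small for simple scenes.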

The upside of this approach is that it is dead simple and has a low CPU cost. For very simple scenes, this is a win.

Downsides are (a) overdraw, i.e. you pay for lighting pixels that will later be occluded by other geometry, (b) a smaller light limit, (c) incompatibility with GPU-driven rendering and GPU occlusion culling.
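To make the overdraw cost concrete, here is a toy count of lighting evaluations. The fragment list and both helper functions are hypothetical. The comparison assumes a depth-tested forward pass with objects drawn in list order, against a scheme that lights each covered pixel once after all geometry has been drawn.

```python
# Each fragment is (pixel_id, depth); a smaller depth is closer to the camera.
# List order stands in for draw order.
fragments = [(0, 9.0), (0, 5.0), (0, 1.0), (1, 2.0), (1, 7.0)]

def forward_lighting_evals(fragments):
    """Lighting runs for every fragment that passes the depth test when drawn."""
    depth_buffer = {}
    shaded = 0
    for pixel, depth in fragments:
        if depth < depth_buffer.get(pixel, float("inf")):
            depth_buffer[pixel] = depth
            shaded += 1  # paid for even if a later draw covers this pixel
    return shaded

def lit_once_per_pixel_evals(fragments):
    """Lighting runs once per covered pixel, after all geometry is drawn."""
    return len({pixel for pixel, _ in fragments})

forward_cost = forward_lighting_evals(fragments)     # 4 evaluations
deferred_cost = lit_once_per_pixel_evals(fragments)  # 2 evaluations
```

Pixel 0 is drawn back to front here, so the forward pass lights it three times while only the last result survives.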

Forward+

On the CPU using Burst, sort all visible lights and reflection probes into screen-space clusters. Now that we have a light list for each screen-space cluster, upload this to the GPU. Transparent and opaque objects can now be instanced, which is great for performance. While shading, we calculate which cluster the shaded pixel lies in, and then loop over the light list for it.
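The clustering step can be sketched roughly as follows. This is a simplified stand-in, not Unity's actual implementation: the tile size, the number of depth slices, the linear depth slicing, and the representation of a light as screen-space bounds `(x0, y0, x1, y1, z0, z1)` are all assumptions chosen for illustration.

```python
TILE = 64                 # tile size in pixels (illustrative value)
SLICES = 4                # number of depth slices (illustrative value)
NEAR, FAR = 0.0, 100.0    # depth range covered by the slices (illustrative)

def cluster_of(px, py, depth):
    """Map a screen position and depth to a cluster key (tile x, tile y, slice)."""
    z = int((depth - NEAR) / (FAR - NEAR) * SLICES)
    z = max(0, min(SLICES - 1, z))
    return (px // TILE, py // TILE, z)

def build_clusters(light_bounds):
    """CPU side: add each light's index to every cluster its bounds overlap."""
    clusters = {}
    for index, (x0, y0, x1, y1, z0, z1) in enumerate(light_bounds):
        cx0, cy0, cz0 = cluster_of(x0, y0, z0)
        cx1, cy1, cz1 = cluster_of(x1, y1, z1)
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for cz in range(cz0, cz1 + 1):
                    clusters.setdefault((cx, cy, cz), []).append(index)
    return clusters

def lights_for_pixel(clusters, px, py, depth):
    """GPU side (simulated): find the pixel's cluster and return its light list."""
    return clusters.get(cluster_of(px, py, depth), [])

# Two lights given as hypothetical screen-space bounds.
light_bounds = [(0, 0, 100, 100, 10.0, 30.0), (200, 200, 250, 250, 60.0, 70.0)]
clusters = build_clusters(light_bounds)
```

Because the light list depends only on where the pixel lands on screen, not on which object is being drawn, no per-object light binding is needed, which is what makes instancing possible.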

Upsides of this approach are (a) Compatibility with GPU-driven rendering and GPU occlusion culling, (b) higher light limit.

The downsides are that (a) the clustering has a CPU cost, which might hurt on low-end devices, (b) you still end up paying for overdraw, similar to Forward.

Deferred

Opaque objects first render their material properties to intermediate textures, called GBuffers. After all opaque objects are rendered, there is a pass for each local light. Each light renders a volume to the stencil buffer: a cone for spotlights and a sphere for point lights. Pixels that pass this stencil test are lit by the light, and the lighting result is additively blended into the final color. Once a pass has been drawn for each light, we end up with a fully lit opaque frame. Forward mode is then used to draw transparent objects.
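A toy version of the per-light passes might look like this. Everything here is an assumption for illustration: the 4x4 frame, the single-channel albedo GBuffer, and a 2D circle test standing in for the stencilled light volume. The point is only that each light touches the final color separately and additively.

```python
WIDTH, HEIGHT = 4, 4

# GBuffer: per-pixel material data, written once by the opaque geometry pass.
gbuffer_albedo = [[0.5] * WIDTH for _ in range(HEIGHT)]

# Each light: (center_x, center_y, radius_in_pixels, intensity).
lights = [(1, 1, 1.5, 1.0), (2, 2, 1.5, 2.0)]

def light_pass(final, light):
    """One pass for one light; returns how many final-color writes it made."""
    cx, cy, radius, intensity = light
    writes = 0
    for y in range(HEIGHT):
        for x in range(WIDTH):
            # Circle test: a stand-in for the stencil test against the light volume.
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                final[y][x] += gbuffer_albedo[y][x] * intensity  # additive blend
                writes += 1
    return writes

final_color = [[0.0] * WIDTH for _ in range(HEIGHT)]
writes_per_light = [light_pass(final_color, light) for light in lights]
total_writes = sum(writes_per_light)
```

Pixels covered by both lights are written twice, once per pass, so the number of final-color writes grows with the number of overlapping lights.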

Upsides of this approach are (a) you don’t pay for overdraw for opaque objects, (b) you don’t pay a CPU cost for light clustering.

Downsides are (a) you’re writing to the final frame memory once per light, so the texture bandwidth usage is higher, which can be a problem on mobile devices, (b) this isn’t compatible with GPU-driven rendering or GPU occlusion culling.

Deferred+

Similar to Forward+, perform light and reflection probe clustering on the CPU. Then, similar to Deferred, render all opaque objects to GBuffers. Now, unlike Deferred, do a single full-screen lighting pass. In this pass, loop over the lights and reflection probes for the pixel's cluster, accumulate lighting, and write it to the final texture at the end. Then use Forward+ for drawing transparent objects.
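The single full-screen pass can be sketched like this. As before, this is a simplified model and not Unity's code: the 8x8 frame, the 2D tiles used in place of full clusters, the bounding-box binning, and the circle-shaped lights are all assumptions for the example.

```python
import math

WIDTH, HEIGHT, TILE = 8, 8, 4

# GBuffer: per-pixel material data, written once by the opaque geometry pass.
gbuffer_albedo = [[0.5] * WIDTH for _ in range(HEIGHT)]

# Each light: (center_x, center_y, radius_in_pixels, intensity).
lights = [(1, 1, 1.5, 1.0), (3, 3, 1.5, 2.0)]

def build_tile_lists(lights):
    """CPU side: conservatively bin each light into the tiles its bounds touch."""
    tiles = {}
    for index, (cx, cy, radius, _intensity) in enumerate(lights):
        tx0 = max(0, math.floor(cx - radius)) // TILE
        tx1 = min(WIDTH - 1, math.ceil(cx + radius)) // TILE
        ty0 = max(0, math.floor(cy - radius)) // TILE
        ty1 = min(HEIGHT - 1, math.ceil(cy + radius)) // TILE
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                tiles.setdefault((tx, ty), []).append(index)
    return tiles

def lighting_pass(tiles):
    """One full-screen pass: accumulate in a local value, write each pixel once."""
    final = [[0.0] * WIDTH for _ in range(HEIGHT)]
    writes = 0
    for y in range(HEIGHT):
        for x in range(WIDTH):
            total = 0.0
            for index in tiles.get((x // TILE, y // TILE), []):
                cx, cy, radius, intensity = lights[index]
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                    total += gbuffer_albedo[y][x] * intensity
            final[y][x] = total  # the only write to the final texture
            writes += 1
    return final, writes

tiles = build_tile_lists(lights)
final_color, writes = lighting_pass(tiles)
```

The write count equals the pixel count regardless of how many lights overlap, since accumulation happens in a local variable before the single write.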

Upsides of this approach are (a) you don’t pay for overdraw for opaque objects, (b) you write to the final frame once, using less texture bandwidth, (c) you can use GPU-driven rendering and GPU occlusion culling, (d) higher light limit.

The downside of this approach is that, like Forward+, the clustering has a CPU cost, which might hurt on low-end devices.

