Depth Gradient for Implementing the SVGF Denoiser

When I supervised a student integrating different image denoisers into my line renderer LineVis, we faced the problem that SVGF, one of the most widely used denoisers for real-time path-traced applications, depends on ∇z(p), which the paper describes as the "gradient of clip-space depth with respect to screenspace coordinates" [1] at the point p. It is used in formula (3) of the paper …

Fragment Shader Interlock Performance on the Steam Deck

This is an update to the article "Fragment Shader Interlock Performance (α Compositing)", in which I previously looked at the performance of a GPU feature called fragment shader interlock for alpha compositing. In that article, I said: I would love to also test whether fragment shader interlock also comes out as the performance winner on AMD hardware, but unfortunately AMD does not …

Precision of Hardware-Accelerated Trilinear Interpolation

A student I'm currently supervising found an interesting phenomenon on NVIDIA GPUs. When using trilinear interpolation with a 3D texture in Vulkan, the interpolation will not return interpolated values less than 1/2048, even if the texture stores 32-bit floating-point values. In the image below, you can see a test case I created where a very small step size is used for volumetric path tracing and …

Fragment Shader Interlock Performance (α Compositing)

Over the last decades, the ways in which GPUs can be used have evolved radically. In the olden days™, GPUs had specialized units for vertex and fragment processing. For multiple reasons, one being that different applications may utilize these different types of units to different degrees, a shift happened towards unified units that perform both vertex and fragment processing. For a more in-depth view on the …