User:Lukasstockner97/GSoC 2016/Weekly Reports/Week 11

Revision as of 12:42, 6 August 2016 (Saturday) by Lukasstockner97

Hi!

I started this week with a short side project - implementing the recent Solid Angle paper "Blue-noise Dithered Sampling" into Cycles. Initial results are pretty nice - especially for low sample counts (< ~16spp), a visual improvement is clear. However, there are some tricky cases like the hashing that's currently performed for Branched Path Tracing, so a proper implementation will take a bit longer.
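As a rough illustration of the general idea (not the Cycles code, and not necessarily how the paper's implementation is structured): every pixel can reuse the same low-discrepancy sequence, shifted toroidally by a per-pixel value read from a tiled mask. The 4x4 `MASK` below is a hypothetical placeholder that is not actually blue noise, and the function names are made up for this sketch; a real implementation would load a precomputed blue-noise texture instead.

```python
def radical_inverse_base2(i: int) -> float:
    """Van der Corput sequence: mirror the bits of i around the binary point."""
    result, f = 0.0, 0.5
    while i:
        if i & 1:
            result += f
        i >>= 1
        f *= 0.5
    return result


# Hypothetical 4x4 mask with values in [0, 1). This Bayer-style matrix only
# stands in for a real precomputed blue-noise texture.
MASK = [
    [0 / 16, 8 / 16, 2 / 16, 10 / 16],
    [12 / 16, 4 / 16, 14 / 16, 6 / 16],
    [3 / 16, 11 / 16, 1 / 16, 9 / 16],
    [15 / 16, 7 / 16, 13 / 16, 5 / 16],
]


def dithered_sample(x: int, y: int, sample_index: int) -> float:
    """Shared sequence value, shifted toroidally by the mask entry of pixel (x, y)."""
    offset = MASK[y % len(MASK)][x % len(MASK[0])]
    return (radical_inverse_base2(sample_index) + offset) % 1.0


if __name__ == "__main__":
    # The first four samples of two neighbouring pixels: same sequence, different shift.
    for px in (0, 1):
        print(px, [dithered_sample(px, 0, s) for s in range(4)])
```

Because neighbouring pixels only differ by their mask offset, the way the error is distributed over the image is decided by the mask, which is why the choice of mask matters most at low sample counts.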

On Monday, I updated my wiki page (https://wiki.blender.org/index.php/User:Lukasstockner97/GSoC_2016), which now contains all my weekly reports and a few rendered image comparisons. Detailed documentation will be added later, once the final design is fixed.

I then went back to coding and fixed the remaining "black spot" problems, which could be tracked down to an uninitialized variable and a few more NaNs in Cycles. Since then, I haven't encountered a single black spot, so I'm optimistic that the issue is pretty much solved.
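To see why a handful of NaNs matters so much for a filter, here is a minimal sketch with a plain weighted average standing in for the denoising kernel (the real kernel is far more involved, and all names here are hypothetical). A single NaN input poisons every output whose window contains it, which is one plausible way for one bad pixel to turn into a visible spot. The guarded variant is only there for contrast; according to the report, the actual fix was made at the source, inside Cycles.

```python
import math


def weighted_average(values, weights):
    """Plain weighted mean, a stand-in for one output pixel of a filter."""
    total = sum(v * w for v, w in zip(values, weights))
    return total / sum(weights)


def weighted_average_skip_nonfinite(values, weights):
    """Same mean, but taps whose value is NaN or infinite are ignored."""
    pairs = [(v, w) for v, w in zip(values, weights) if math.isfinite(v)]
    if not pairs:
        return 0.0
    return sum(v * w for v, w in pairs) / sum(w for _, w in pairs)


# One NaN in an otherwise well-behaved window.
window = [0.5, 0.6, float("nan"), 0.55, 0.5]
weights = [1.0, 2.0, 3.0, 2.0, 1.0]

poisoned = weighted_average(window, weights)
cleaned = weighted_average_skip_nonfinite(window, weights)

if __name__ == "__main__":
    print("plain mean is NaN:", math.isnan(poisoned))
    print("guarded mean:", cleaned)
```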

Next up, I did the denoise pass storage refactor - instead of directly accessing the RenderBuffers through complicated addressing logic, all relevant data is now copied into a linear array before processing. That makes denoising a bit faster (~15-20%, depending on the scene) and allows the code to be simplified a lot; it is also necessary for the upcoming big CPU speedup from SIMD instructions. I'm still not sure whether I should go for SSE or AVX instructions - AVX handles eight floats per instruction and is therefore up to eight times as fast as scalar code (SSE "only" four times), but requires a fairly new processor to get any speedup (Sandy Bridge or newer for Intel, Bulldozer or newer for AMD). If I go for AVX, the default half-window size should be reduced to 7 - a 15-pixel window needs 2 instructions, while a 17-pixel one needs 3...
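The window-size arithmetic can be checked with a few lines. This sketch assumes that one SIMD instruction covers one register's worth of consecutive pixels in a window row (4 floats for 128-bit SSE, 8 floats for 256-bit AVX); the function name and the "idle lanes" figure are illustrative and not taken from the Cycles code.

```python
import math

# Floats per SIMD register: SSE is 128 bits wide, AVX is 256 bits, a float is 32 bits.
SIMD_LANES = {"SSE": 4, "AVX": 8}


def vectors_per_row(half_window: int, lanes: int) -> int:
    """Registers needed to cover one row of a (2 * half_window + 1)-pixel window."""
    width = 2 * half_window + 1
    return math.ceil(width / lanes)


if __name__ == "__main__":
    for half_window in (7, 8):
        width = 2 * half_window + 1
        for name, lanes in SIMD_LANES.items():
            n = vectors_per_row(half_window, lanes)
            idle = n * lanes - width  # lanes loaded but not used by the window
            print(f"half-window {half_window} ({width} px), {name}: "
                  f"{n} vectors, {idle} idle lanes")
```

Under this assumption, a half-window of 8 spends a third AVX register on a single extra pixel per row, which is the motivation for dropping the default to 7.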

At the end of the week, I started to work on the standalone tool and cross-frame filtering. The changes to the actual kernel are remarkably small, but the host side needs a lot of extra code for it.

So, for next week, I'll finish the standalone frame sequence denoising for animations, then continue with the NLM weights and, if I haven't run out of week by that point, the SIMD CPU version.

Note that this week's changes didn't alter the actual filter result at all; it's just a bit faster, and the code is way cleaner and ready for further improvement.

Lukas