User:Lukasstockner97/GSoC 2016/Weekly Reports/Week 5


Hi!

This week, I finished the GPU denoising kernels. It turned out to be a bit more work than I expected, since I planned to use a different strategy for the seamless filtering there. On the CPU, a tile is denoised once its neighbors have been rendered. On GPUs, however, that approach has several problems: tiles tend to be much larger and memory usage is already a major concern, so keeping 10 or more 256x256 tiles with 30 or more channels in memory isn't exactly great. It would also cause trouble on multi-GPU systems, where neighboring tiles might live on different cards. Therefore, a different approach is used on GPUs: instead of fetching the pixels around the border from other tiles, each tile is internally rendered a bit bigger, and only the pixels that will be visible in the end are filtered. Since the tiles are pretty large, the performance impact isn't that big: for 256x256 tiles at default settings, Cycles internally renders 272x272 tiles, a 13% increase in pixels. That's not great, but it's the best option available as far as I can see. To support this, I modified the RenderBuffer and RenderResult sync code to deal with "overscan", as I call it for now. With that in place, the CUDA kernels were fairly easy to add.
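To make the overscan numbers concrete, here is a minimal Python sketch of the arithmetic. It is not Cycles code: the function names are made up for illustration, and the symmetric margin of 8 pixels per side is only inferred from the 256 to 272 figures above.

```python
def overscan_stats(tile_size, margin):
    """Return (padded tile edge, fractional increase in rendered pixels).

    `margin` is the number of extra pixels rendered on each side of the
    tile (assumed symmetric; hypothetical parameter name).
    """
    padded = tile_size + 2 * margin
    extra = (padded * padded) / (tile_size * tile_size) - 1.0
    return padded, extra


def visible_window(tile_size, margin):
    """Slice selecting the visible pixels along one axis of the padded tile."""
    return slice(margin, margin + tile_size)


padded, extra = overscan_stats(256, 8)
print(padded, f"{extra:.1%}")  # 272 12.9%
```

Only the pixels inside `visible_window` would be filtered and written out; the surrounding margin exists purely so the filter has valid neighbors at the tile border.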

Next, I started working on the option to re-denoise after rendering is completed. Since I wanted a clean and flexible solution, I added a general "postprocess" option to the External Render API. It consists of a poll function that checks whether a particular RenderResult can be post-processed, and a function that actually does the post-processing. Both are used by an operator that's called by a button in the image editor. On the Cycles side, the polling function already works, but the actual processing isn't implemented yet, so the button doesn't do anything for now. The reason is that I'm not sure how to fit it into the Session handling in Cycles. It could be added as a third option, similar to how baking exists next to rendering. However, that isn't really great: rendering and baking share most of the syncing code, and only the final details (which device tasks are executed) really differ. For denoising, the situation is quite different. All that's needed is to 1. create a RenderBuffer from the existing result, 2. get tiles from the tile manager, 3. call the device to denoise them (no two-stage processing or overscan needed, since the image is fully rendered to begin with), and 4. write the result back. The second option would therefore be to add a simpler DenoisingSession, but that's not really nice either from a code design point of view...
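The poll/postprocess pair and the four steps above can be sketched in plain Python. This is only an illustration of the control flow, not the External Render API or Cycles code: the dictionary layout, the `has_denoise_passes` flag, all function names, and the 3x3 box blur standing in for the real denoiser are assumptions.

```python
def split_into_tiles(width, height, tile_size):
    """Yield (x, y, w, h) rectangles that cover the whole image."""
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            yield x, y, min(tile_size, width - x), min(tile_size, height - y)


def can_postprocess(result):
    """Poll: only results that still carry the needed passes qualify."""
    return bool(result.get("has_denoise_passes", False))


def box_blur_tile(src, dst, x, y, w, h):
    """Stand-in filter: 3x3 box blur that reads the full image for neighbors."""
    height, width = len(src), len(src[0])
    for j in range(y, y + h):
        for i in range(x, x + w):
            samples = [src[jj][ii]
                       for jj in range(max(j - 1, 0), min(j + 2, height))
                       for ii in range(max(i - 1, 0), min(i + 2, width))]
            dst[j][i] = sum(samples) / len(samples)


def postprocess(result, denoise_tile=box_blur_tile, tile_size=256):
    """Run the four steps; returns False if the result cannot be processed."""
    if not can_postprocess(result):
        return False
    src = result["pixels"]                     # 1. buffer from the existing result
    dst = [row[:] for row in src]
    height, width = len(src), len(src[0])
    for x, y, w, h in split_into_tiles(width, height, tile_size):  # 2. get tiles
        denoise_tile(src, dst, x, y, w, h)     # 3. filter each tile
    result["pixels"] = dst                     # 4. write the result back
    return True
```

Because every tile reads its neighbors from the complete source image, the output does not depend on the tile size, which is why no overscan is needed in this case.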

So, from a user perspective, not too much changed in the branch: I added a currently useless button, but at least GPU denoising works now - or at least should.

Besides the denoiser work, I also did some work on other Cycles topics:

  • Bug fixing: I fixed T48691, T48698 and a few unreported bugs:
    • Building the Cycles CUDA kernels at runtime failed due to a new file missing from CMake (dfa7ddd)
    • Cycles crashed when using an environment texture with OSL (73cfbb0)
    • The anisotropic BSDF failed with OSL because the automatically inserted Convert node was connected incorrectly due to a missing entry in the Socket Type -> Name mapping (9bce807)
  • Multi-Scattering GGX closures (D2002): Finally in master (23c2768)! I implemented the OSL bindings, fixed a few remaining numerical issues, etc. Thanks to Brecht and Thomas for the review and suggestions! Next up is the Metallic BSDF based on the same paper (D2003).

Oh, and sorry for not providing any more test renders yet - I somehow overlooked the mail until now, and before that I didn't think it made much sense because the filter was still buggy. Also, on weekends I'm usually on my laptop, which means that every larger render takes a few hours... I'll render a few as soon as I'm back on the decent PC, though.

Lukas