User talk:Jbakker
Blender OpenCL proposal
Typical desktop systems have multiple processing units (PUs). The two most common are the CPU and the GPU. The CPU is the main processor of a system; the GPU is specialized in graphics processing. Over the last few years the GPU has become more powerful than the CPU. A CPU typically has 1-4 cores, while a GPU has on the order of 100 smaller cores.
A GPU is accessed differently from a CPU, which makes it hard to develop applications for it. OpenCL is an open standard that makes developing for GPUs easier; its eventual goal is heterogeneous computing (systems where multiple types of processing units work together).
Currently only Mac OS X installs the OpenCL libraries by default. On Windows and Linux systems, the ATI and NVIDIA drivers can install them.
To use OpenCL on Windows or Linux you need to install an OpenCL implementation. NVIDIA and AMD/ATI each ship an implementation of the standard, but for now these are specialized for their own GPUs. At the moment only ATI has drivers that combine CPU and GPU in a single OpenCL implementation.
Within the Blender architecture, OpenCL can only be an optional component that is used only when the user enables it, because:
- not all development systems have OpenCL SDKs installed
- not all users' systems have OpenCL
- OpenCL has technical limitations (for example memory limits, or OpenCL code that is not optimal for the specific task)
Compiler settings
BF_OPENCL: this setting allows Blender to be compiled with OpenCL support. Without this flag, no code that depends on OpenCL is compiled or linked.
Commandline options
--enable-opencl: this option enables OpenCL inside Blender. By default OpenCL is not used.
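The interaction between the compile-time setting and the startup option could look roughly like the sketch below. The macro name WITH_OPENCL and both function names are assumptions made for illustration; the proposal only names BF_OPENCL and --enable-opencl, and Blender's real argument parsing is not shown here.

```c
#include <assert.h>
#include <string.h>

// WITH_OPENCL is a hypothetical macro that a build with BF_OPENCL
// enabled might define; the proposal does not name it.
#ifdef WITH_OPENCL
#  define OPENCL_COMPILED_IN 1
#else
#  define OPENCL_COMPILED_IN 0
#endif

// Hypothetical helper: returns 1 if --enable-opencl appears in argv.
int has_enable_opencl_flag(int argc, const char **argv)
{
    assert(argc == 0 || argv != NULL);
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--enable-opencl") == 0) {
            return 1;
        }
    }
    return 0;
}

// OpenCL is only requested when it was compiled in AND the user asked for it.
int opencl_requested(int argc, const char **argv)
{
    return OPENCL_COMPILED_IN && has_enable_opencl_flag(argc, argv);
}
```

With this arrangement a build without OpenCL support silently ignores the option, so the same command line works on every build.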
Changes in Blenderkernel
Add BKE_opencl.h and BKE_opencl.c. These files add basic functions that can be used to determine whether OpenCL must be used.
// This function can be called inside Blender code to check whether OpenCL
// has been enabled by the user.
int BKE_opencl_is_enabled();

// This function is called when the user passes the --enable-opencl startup flag.
// It initializes the OpenCL library and searches for suitable OpenCL devices.
// When no suitable devices are found, BKE_opencl_is_enabled() returns false.
BKE_opencl_init();

// This function is called as a shutdown hook of the Blender process.
// It deinitializes the OpenCL system and releases all allocated resources.
BKE_opencl_deinit();

// Get the GPU context and the GPU device.
BKE_opencl_get_gpu_context(hints);
BKE_opencl_get_gpu_device(hints);

BKE_opencl_resultcode();
BKE_opencl_status();
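The lifecycle described by these comments (init, query, deinit) can be sketched without touching the OpenCL library at all. In the sketch below the signatures, the DeviceProbeFn type and the two stand-in probes are assumptions for illustration only; the proposal gives no signatures, and a real implementation would query the OpenCL runtime where the probe is called.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical probe: returns the number of suitable devices found.
// A real build would ask the OpenCL runtime instead.
typedef int (*DeviceProbeFn)(void);

static int opencl_enabled = 0; // 1 only after a successful init

// Assumed signature; returns 1 when at least one suitable device exists.
int BKE_opencl_init(DeviceProbeFn probe)
{
    assert(probe != NULL);
    opencl_enabled = (probe() > 0);
    return opencl_enabled;
}

int BKE_opencl_is_enabled(void)
{
    return opencl_enabled;
}

void BKE_opencl_deinit(void)
{
    // A real implementation would release contexts and queues here.
    opencl_enabled = 0;
}

// Stand-in probes used only to exercise the sketch.
static int probe_no_devices(void) { return 0; }
static int probe_one_device(void) { return 1; }
```

Keeping the enabled state behind BKE_opencl_is_enabled() means callers never need to know why OpenCL is unavailable (not requested, no device, or already shut down).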
Design principles
There must always, and I say ALWAYS, be a fallback to normal CPU execution (without use of the OpenCL library).
GPU-based rendering is not the goal. The first aim is to gain experience with how the library works. Converting the Blender Internal renderer is not a target of this implementation.
First step
- implement the changes to the Blender kernel
- use these changes to implement a first test case (e.g. remove doubles?)
- use this test case to check how the implementation works on different platforms (OS, graphics card manufacturer, etc.) and fine-tune where needed
- implement a good showcase such as the fractal generator (not in trunk, http://www.youtube.com/watch?v=F43e_iwRUJ0 ): not functional, but able to show the speedup and get sponsors (developers and studios) interested
Note: other suggestions are welcome.
Next steps
- device and platform selection in the system preferences
- basic OpenCL monitoring
- use OpenCL in different areas:
  - Compositing node system
  - Particle systems: SPH, boids
  - Baking (for fluids, smoke, etc.)
  - Others…
Implementation pattern
Wherever OpenCL is used, the following implementation pattern can be applied.
if (BKE_opencl_is_enabled() && doGPUCalculationInitialization() != ERROR) {
    executeGPUCalculation();
}
else {
    // do the normal CPU calculation, because of memory limits or because OpenCL is not enabled on this system
    executeCPUCalculation();
}
Note: doGPUCalculationInitialization also allocates device memory. When this fails (out of memory, for example), the CPU-based implementation is used.
Note: more complex memory models are possible (GPU->CPU->GPU to address a larger memory space).
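As a concrete instance of this pattern, the sketch below sums an array and records which path was taken. Everything in it is an assumption for illustration: the memory budget DEVICE_MEM_LIMIT, the opencl_enabled variable standing in for BKE_opencl_is_enabled(), and the "GPU" routine, which is only a placeholder that computes on the CPU.

```c
#include <assert.h>
#include <stddef.h>

#define DEVICE_MEM_LIMIT 1024 // hypothetical device memory budget, in bytes

typedef enum { CALC_OK = 0, CALC_ERROR = 1 } CalcStatus;
typedef enum { PATH_CPU, PATH_GPU } CalcPath;

static int opencl_enabled = 0; // stand-in for BKE_opencl_is_enabled()

// Stand-in for doGPUCalculationInitialization(): fails when the data
// would not fit in the (hypothetical) device memory budget.
static CalcStatus gpu_calculation_init(size_t bytes_needed)
{
    return (bytes_needed <= DEVICE_MEM_LIMIT) ? CALC_OK : CALC_ERROR;
}

static float sum_cpu(const float *values, size_t count)
{
    float total = 0.0f;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

// Placeholder: a real version would enqueue an OpenCL kernel here.
static float sum_gpu(const float *values, size_t count)
{
    return sum_cpu(values, count);
}

float sum_values(const float *values, size_t count, CalcPath *path_used)
{
    assert(values != NULL && path_used != NULL);
    if (opencl_enabled && gpu_calculation_init(count * sizeof(float)) != CALC_ERROR) {
        *path_used = PATH_GPU;
        return sum_gpu(values, count);
    }
    // Fallback: OpenCL is disabled or device initialization failed.
    *path_used = PATH_CPU;
    return sum_cpu(values, count);
}
```

Because the enabled check comes first in the condition, no device memory is ever requested when OpenCL is switched off.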
Release target
I am not targeting this for Blender 2.5! Perhaps a first release in Blender 2.6, but only when stable and tested. There are other people already working on this subject inside the game engine and planning. The first implementation will be done in a separate branch.
Developer documentation will be created.
A big issue is the maintenance of the double implementation (CPU-based and OpenCL-based).
Open questions
- API for determining computing scenarios: which scenario best fits the task at hand?
- Is Blenderkernel the correct location, or should an additional library be used?
- How to maintain the dual implementation of CPU-based and GPU-based code (test cases?)
- An OpenCL-based CPU fallback works far from optimally; in many cases it is better to optimize for the CPU directly
- Setup for node systems: per node or per system?
- Debug tooling (result codes, memory management, device info, etc.): feedback and API
- Why no auto-detection of OpenCL?
  - Some tasks are easier on the CPU
  - Some systems have a faster CPU than GPU
  - Memory limits when using large datasets
- Look at MiniCL (used in Bullet). Not sure where to place MiniCL, as CPU optimization is different from GPU optimization.
Side effects
How does the defocus node work (algorithm)
http://sicg.atmind.nl/index.php?option=com_content&view=article&id=29:blender-defocus-algorithm
Proposal: Nodes property windows enhancement
Current situation
In the current situation, the node editor has a properties panel (press the 'n' key). This panel displays some information about the node, the backdrop and the grease pencil. The UI of the properties panel is typically vertically oriented. Nodes, on the other hand, have no fixed orientation. Both areas are drawn via the same draw function.
For some nodes this creates a user-unfriendly UI (try the color balance node, for instance). The three color circles are drawn next to each other; it would be better to draw them below each other.
When creating more complex nodes you don't want to display all handles in both the node panel and the properties panel. Fine-tuning handles, for instance, should appear only in the properties panel to save space in the node itself.
Proposal
My proposal is to separate the draw functions of the properties panel and the node panel. When no special draw function is created for the properties panel, the draw function of the node is used as a fallback.
Impact
BKE_node.h
Add a new uifunc (called uifuncbut) to the bNodeType struct. Its definition is the same as that of uifunc.
node_buttons.c
If uifuncbut is set, call it. The code currently calls the uifunc function.
drawnode.c
static void node_composit_set_butfunc(bNodeType *ntype): set the uifuncbut function where needed. If uifuncbut is still empty at the end of the function, set it to uifunc.
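The fallback rule can be sketched in a few lines. The struct below is a deliberately simplified stand-in for bNodeType, and the draw-function signature, the helper name and the two example draw functions are assumptions for illustration; only the field names uifunc and uifuncbut come from the proposal.

```c
#include <assert.h>
#include <stddef.h>

// Simplified, hypothetical draw-function signature.
typedef void (*NodeDrawFn)(void *layout);

// Simplified stand-in for bNodeType with only the two relevant fields.
typedef struct SketchNodeType {
    NodeDrawFn uifunc;    // draws the buttons inside the node
    NodeDrawFn uifuncbut; // draws the buttons in the properties panel
} SketchNodeType;

// Hypothetical helper for the last step of the setup function:
// if no dedicated panel draw function was set, reuse the node's own.
void sketch_finish_butfuncs(SketchNodeType *ntype)
{
    assert(ntype != NULL);
    if (ntype->uifuncbut == NULL) {
        ntype->uifuncbut = ntype->uifunc;
    }
}

// Example draw functions that do nothing; used only to exercise the sketch.
static void draw_in_node(void *layout) { (void)layout; }
static void draw_in_panel(void *layout) { (void)layout; }
```

With the fallback applied once at setup time, the properties panel code can call uifuncbut for every node type that has a uifunc, without checking which variant it is getting.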