Inlining OpenCL with managed code works
June 3, 2012
It’s a dumb example, but it shows the basic idea.
ProgramCL progCL(
"__kernel void add_scalar(__global float* a, float s)"
"{"
" a[get_global_id(0)] += s;"
"}");
Arrayf32 A = make1(N, cpuA);
Arrayf32 B = make1(N, cpuB);
Arrayf32 C = A + B;
(progCL, "add_scalar", C, 1.5f)(N, 1);
C += A - 3.3f;
C.read1(cpuC, N * sizeof(float));
OpenCL kernels can be inserted anywhere in the managed execution trace. The virtual machine and JIT handle the memory transfers and the generated kernel boundaries around them. It just works. If nothing else, this allows using OpenCL as the kernel language while the virtual machine manages memory. It’s also a very natural way to extend the platform dynamically.
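To make “generated kernel boundaries” concrete, here is a toy model in C++. It is not the virtual machine’s actual scheduler. TraceOp and splitAtInlineKernels are hypothetical names, and the only assumption is that consecutive managed operations may be fused into one generated kernel while an inlined OpenCL kernel always gets a launch of its own.

```cpp
#include <string>
#include <vector>

// Toy model only: a trace entry is either a managed array operation
// or a call to a hand-written OpenCL kernel that was inlined.
struct TraceOp {
    std::string text;
    bool inlineKernel;
};

// Groups consecutive managed operations into one segment (one generated
// kernel) and puts every inlined kernel into a segment of its own.
std::vector<std::vector<std::string>>
splitAtInlineKernels(const std::vector<TraceOp>& trace) {
    std::vector<std::vector<std::string>> segments;
    bool open = false;  // true while the last segment can absorb managed ops
    for (const TraceOp& op : trace) {
        if (op.inlineKernel) {
            // An inlined kernel closes the current segment and stands alone.
            segments.push_back(std::vector<std::string>{op.text});
            open = false;
        } else {
            if (!open) {
                segments.push_back(std::vector<std::string>());
                open = true;
            }
            segments.back().push_back(op.text);
        }
    }
    return segments;
}
```

Under this model, a trace of “C = A + B”, the add_scalar call, and “C += A - 3.3f” splits into three segments: a generated kernel, the inlined kernel, and a second generated kernel.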
OpenCL programs can have multiple kernels, local memory, images, etc. The ProgramCL object parses the source code text and does the right thing with arguments. The work space dimensions can be 1D or 2D. The example above is 1D with “(global0, local0)”. For 2D, this would be “(global0, global1, local0, local1)”.
The syntax is unusual. It is also the best compromise I could find. I did not want the verbose Java “setData()” style of interface design. However, I also did not want to expose magic macros or C++ templates in the API. A C-style variadic function call does not work because the array variable objects are not Plain Old Data. So I overloaded the comma and function call operators. It’s a balance between flexibility, natural syntax, and compile-time safety.
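The operator overloading can be sketched in a few lines of C++. This is a minimal illustration, not the actual implementation: Program, ArgList, Array, Arg, and Launch are hypothetical stand-ins, and the sketch only records a description of the launch instead of calling the OpenCL runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not the real API.
struct Array {                       // not POD: owns heap storage
    std::vector<float> data;
};

struct Arg {                         // one recorded kernel argument
    Array* array;                    // non-null for an array argument
    float  scalar;                   // meaningful only when array is null
};

struct Launch {                      // what a real runtime would enqueue
    std::string kernel;
    std::vector<Arg> args;
    std::vector<std::size_t> global, local;
};

class ArgList {
public:
    explicit ArgList(const std::string& kernel) { launch_.kernel = kernel; }

    // Each comma appends one argument. Only arrays and values convertible
    // to float are accepted; anything else fails to compile.
    ArgList& operator,(Array& a) { launch_.args.push_back(Arg{&a, 0.0f}); return *this; }
    ArgList& operator,(float s)  { launch_.args.push_back(Arg{nullptr, s}); return *this; }

    // 1D work space: (global0, local0)
    Launch operator()(std::size_t global0, std::size_t local0) {
        assert(local0 > 0);
        launch_.global = {global0};
        launch_.local = {local0};
        return launch_;
    }

    // 2D work space: (global0, global1, local0, local1)
    Launch operator()(std::size_t global0, std::size_t global1,
                      std::size_t local0, std::size_t local1) {
        assert(local0 > 0 && local1 > 0);
        launch_.global = {global0, global1};
        launch_.local = {local0, local1};
        return launch_;
    }

private:
    Launch launch_;
};

struct Program {
    std::string source;              // OpenCL source text
    // The first comma names the kernel and starts the argument list.
    ArgList operator,(const char* kernelName) const { return ArgList(kernelName); }
};
```

With these definitions, an expression of the form (prog, "add_scalar", C, 1.5f)(N, 1) type-checks every argument through an overload and then records a 1D launch. A non-POD object like Array is passed by reference through the comma overload, which a C-style variadic call cannot do safely.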
I have to switch gears and work on my presentation for AFDS 2012. The final slide deck deadline with AMD is June 6th. There’s also another slide deck I’m working on.