September 28, 2011
GATLAS is perhaps 30% integrated. It’s more work than I expected. My notion of rewriting this to make it slimmer now seems ridiculous.
I made some changes to the JIT so auto-tuning can happen before kernels must be scheduled. This allows the JIT to make better decisions about use of memory buffers, images, and vectorization. It’s the kind of memory optimization you would otherwise make by hand.
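To make the idea concrete, here is a minimal sketch of tuning before scheduling. It is not the actual JIT code: `KernelVariant`, `autoTune`, and `toyCost` are hypothetical names, and the cost function stands in for really timing a kernel on the device.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical description of one way to generate a kernel.
enum class MemoryKind { Buffer, Image };

struct KernelVariant {
    MemoryKind memory;  // plain memory buffer or image (texture) access
    int vectorWidth;    // 1 = scalar, 4 = float4-style vectorization
};

// Measures every candidate and returns the one with the lowest time.
// In a real system measureSeconds would compile and run the kernel.
KernelVariant autoTune(
    const std::vector<KernelVariant>& candidates,
    const std::function<double(const KernelVariant&)>& measureSeconds) {
    assert(!candidates.empty());
    KernelVariant best = candidates.front();
    double bestTime = std::numeric_limits<double>::infinity();
    for (const KernelVariant& v : candidates) {
        const double t = measureSeconds(v);
        if (t < bestTime) {
            bestTime = t;
            best = v;
        }
    }
    return best;
}

// Made-up cost model (an assumption for illustration only):
// images are cheaper than buffers, wider vectors are cheaper than scalar.
double toyCost(const KernelVariant& v) {
    const double base = (v.memory == MemoryKind::Image) ? 1.0 : 1.5;
    return base / v.vectorWidth;
}

// The winning variant is known before any kernel is scheduled, so the
// scheduler can allocate images or buffers to match it.
KernelVariant pickVariantBeforeScheduling() {
    const std::vector<KernelVariant> candidates = {
        {MemoryKind::Buffer, 1}, {MemoryKind::Buffer, 4},
        {MemoryKind::Image, 1},  {MemoryKind::Image, 4}};
    return autoTune(candidates, toyCost);
}
```

The point of the ordering is that the memory decision falls out of the tuning result instead of being guessed up front.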
The main thing now is porting the code over. The wrapper library around OpenCL has changed. OpenCL is kind of like Xlib (which no one still working in the software industry remembers). It’s a good C API but generally too awkward to use directly. Applications use a higher-level wrapper library instead.
So like others, I wrote an OO wrapper around OpenCL. When I wrote GATLAS, the wrapper was very C-like with integer handles to objects maintained by the runtime. Since then, it has been rewritten to be more object-based. (I don’t want to say object-oriented as it really is not.)
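The difference between the two wrapper styles looks roughly like this. It is a sketch only, not the real wrapper: `FakeDeviceBuffer` stands in for an OpenCL memory object so the example runs without OpenCL, and `HandleRuntime` and `Buffer` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Stand-in for a device allocation; a real wrapper would hold a cl_mem.
struct FakeDeviceBuffer {
    std::vector<float> data;
};

// Style 1: C-like. The runtime owns every object and hands out integer
// handles. The caller must remember to release each handle.
class HandleRuntime {
public:
    int createBuffer(std::size_t n) {
        const int handle = next_++;
        buffers_[handle].data.assign(n, 0.0f);
        return handle;
    }
    void releaseBuffer(int handle) { buffers_.erase(handle); }
    std::size_t size(int handle) const {
        return buffers_.at(handle).data.size();
    }
    std::size_t liveCount() const { return buffers_.size(); }

private:
    int next_ = 0;
    std::map<int, FakeDeviceBuffer> buffers_;
};

// Style 2: object-based. Each Buffer owns its resource and frees it when
// it goes out of scope. No inheritance or virtual calls are involved.
class Buffer {
public:
    explicit Buffer(std::size_t n) : impl_(new FakeDeviceBuffer) {
        impl_->data.assign(n, 0.0f);
        ++counter();
    }
    Buffer(Buffer&& other) noexcept = default;  // moved-from object is empty
    ~Buffer() {
        if (impl_) --counter();  // only a live owner counts as released
    }
    std::size_t size() const { return impl_->data.size(); }
    static std::size_t liveCount() { return counter(); }

private:
    static std::size_t& counter() {
        static std::size_t n = 0;
        return n;
    }
    std::unique_ptr<FakeDeviceBuffer> impl_;
};
```

With handles, forgetting `releaseBuffer` leaks the object inside the runtime; with the object-based version, cleanup follows scope, which is most of why the calling code reads more cleanly.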
That’s why GATLAS must be ported. It’s something that had to happen anyway. In terms of performance and binary size, the ported code should be about the same. However, it’s cleaner and easier to read.