Next post will be the alpha release
February 11, 2012
I think I’m done with this alpha release. The next post will include a link to a GitHub download. The source will be checked in there too. I’ll probably create a new WordPress page just for releases to avoid forcing people to search through this blog narrative.
This release has shifted my mindset from self-absorbed virtual machine language builder to considering users. So although it isn't much, and some of it makes me look bad, I've decided to be more open about the state of this project and technology. I'm taking the long view: if this technology has merit and is good, it will live. Otherwise, it will die. It doesn't matter too much how I portray it.
Here's my test matrix: a meagre collection of small demo applications run on every compute device I have.
OK - everything works
AF - autotuning fails for batched kernels, which are scheduled to the interpreter
IR - interpreter support only (compute device RNG not implemented)
IS - interpreter support only (single-precision-only compute device)
OF - everything works, except for rare failures (often enough to be noticed?)
SV - segmentation fault (also bus error)
KF - segmentation fault in a kernel generated by the shader compiler
UC - usually correct results, sometimes wrong
UW - results generally wrong or out of order
Application   PentiumM  Core2Duo  Corei7_920  Corei7_920  5440  5670  5770  5870  480
              AMD       AMD       AMD         INTEL       AMD   AMD   AMD   AMD   NVIDIA
============  ========  ========  ==========  ==========  ====  ====  ====  ====  ======
cg            KF        OK        OK          OK          OK    OK    OK    OK    OK
cg64          KF        OK        OK          OK          IS    IS    IS    OK    OK
index         OK        OK        OK          OK          OK    OK    OK    OK    OK
kirch         SV        SV        SV          SV          OK    OK    OK    OK    OK
loopsum_omp   KF        UC        SV          SV          UC    UC    UW    OK    OK
loopsum_pth   KF        UC        SV          SV          UC    UC    UC    OK    OK
loopsum_uni   KF        OK        OK          OK          OK    OK    OK    OK    OK
loopsum_vec   KF        UW        UW          UW          UW    UC    UW    OK    OK
matmul_omp    KF        AF        AF          AF          OK    OK    OK    OK    OK
matmul_pth    KF        AF        AF          AF          OK    OK    OK    OK    OK
matmul_uni    KF        OK        OK          OK          OK    OK    OK    OK    OK
matmul_vec    KF        AF        AF          AF          OK    OK    OK    OK    OK
matmul64_omp  KF        AF        AF          AF          IS    IS    IS    OK    OK
matmul64_pth  KF        AF        AF          AF          IS    IS    IS    OK    OK
matmul64_uni  KF        OK        OK          OK          IS    IS    IS    OK    OK
matmul64_vec  KF        AF        AF          AF          IS    IS    IS    OK    OK
mingle        KF        OK        OK          OK          OK    OK    OK    OK    OK
mingle64      KF        OK        OK          OK          IS    IS    IS    OK    OK
monte         IR        IR        IR          IR          IR    IR    IR    IR    IR
pi            IR        IR        IR          IR          IR    IR    IR    IR    IR
sum_omp       KF        OK        SV          SV          OF    OF    OF    OF    OK
sum_pth       KF        OK        SV          SV          OF    OF    OF    OF    OK
sum_uni       KF        OK        OK          OK          OK    OK    OK    OK    OK
sum_vec       KF        OK        OK          OK          OK    OK    OK    OK    OK
You can see some of the skeletons in the virtual machine's closet through this table.
Some of that is the inherently complex nature of the technology. Vendors have to market GPGPU as a solved problem and a mature technology. If that were true, it wouldn't be interesting (easy stuff generally lacks mystique).
Some of it is also bugs, especially race conditions and logic errors. I've come to rely on testing as an integral part of development. That is just being realistic: I can't keep the entire design in my head at the same time any more.
- Tomorrow, I'll read The Java Native Interface: Programmer's Guide and Specification by Sheng Liang. I've done JNI before, but that was ten years ago. I haven't done any Java in over two years either, so I am very rusty. My thinking is to create a simple bridge from Java to the native code libraries, analogous to how BerkeleyDB works.
- I am flying out to Savannah for SIAM Parallel Processing 2012 this coming Tuesday (Valentine’s Day).
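The JNI bridge mentioned above might look something like the following minimal sketch. Every name here (ChaiBridge, the "chai" library, evaluate) is a hypothetical placeholder, not a real API; this only illustrates the pattern of one thin Java class declaring native methods backed by a shared library, roughly the way BerkeleyDB's Java API sits on top of its native library.

```java
// Hypothetical sketch of a Java-to-native bridge. Nothing here is the
// project's actual API; ChaiBridge, "chai", and evaluate are placeholders.
class ChaiBridge {
    static {
        // A real bridge would load the native library once per JVM:
        //   System.loadLibrary("chai");
        // Commented out because no such library exists yet.
    }

    // Declared native: the body lives in the shared library, bound by JNI
    // name mangling to a C function named
    //   Java_ChaiBridge_evaluate(JNIEnv*, jobject, jfloatArray)
    public native float[] evaluate(float[] input);

    public static void main(String[] args) throws Exception {
        // Without the library loaded, calling evaluate() would throw
        // UnsatisfiedLinkError, but reflection still shows the binding.
        java.lang.reflect.Method m =
            ChaiBridge.class.getDeclaredMethod("evaluate", float[].class);
        System.out.println(
            java.lang.reflect.Modifier.isNative(m.getModifiers()));
    }
}
```

Keeping the entire native surface behind one small class like this is the design choice that makes BerkeleyDB-style bridges maintainable: the JNI boundary stays narrow, and everything else is plain Java.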