there are so many cores


Almost at feature freeze for first release

Added this week:

  • index arrays (index_f32, index_f64 – both 1D and 2D variants along width or height)
  • gather data shuffling (gather1_floor, gather2_floor – allows array subscripts with ordinates from data instead of loop indexes and work-item IDs)
  • outer product matrix multiply in generated kernel
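To make the three additions above concrete, here is a minimal plain-Python sketch of what they might compute. The names `index_2d` and `matmul_outer` are hypothetical, and the semantics shown for `gather1_floor` and `gather2_floor` (floor each ordinate, then use it as a subscript) are an assumption based on the names, not the project's actual API. Ordinates are assumed to be in range.

```python
import math

def index_2d(width, height, along_width=True):
    """Hypothetical 2D index array as a list of rows.

    Each element holds its own column number (along width)
    or its own row number (along height), as a float.
    """
    return [[float(x if along_width else y) for x in range(width)]
            for y in range(height)]

def gather1_floor(data, ordinates):
    """Assumed semantics: result[i] = data[floor(ordinates[i])]."""
    return [data[math.floor(o)] for o in ordinates]

def gather2_floor(data, xs, ys):
    """Assumed semantics: result[i] = data[floor(ys[i])][floor(xs[i])]."""
    return [data[math.floor(y)][math.floor(x)] for x, y in zip(xs, ys)]

def matmul_outer(a, b):
    """Matrix multiply as a sum of outer products.

    For each p, the outer product of column p of a and row p of b
    is accumulated into c.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for p in range(k):
        for i in range(m):
            a_ip = a[i][p]
            for j in range(n):
                c[i][j] += a_ip * b[p][j]
    return c
```

The point of the gather functions is that the subscript comes from the data itself rather than from a loop index or work-item ID, which is what allows arbitrary shuffling.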

The last remaining major feature is auto-tuned GEMV (matrix-vector multiply). That shouldn’t be too hard: I’ve done this before and still have the old code, and GEMM is already integrated. I want to get this done tomorrow.
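The general idea of auto-tuning can be sketched in a few lines: run several equivalent variants on representative data, time them, and keep the fastest. The sketch below is not the project's tuner. The two GEMV variants and the `autotune` helper are hypothetical, and a GPU tuner would presumably time generated kernel variants rather than Python functions.

```python
import time

def gemv_row_dot(a, x):
    """y[i] is the dot product of row i of a with x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def gemv_col_axpy(a, x):
    """Accumulate y column by column: y += x[j] * (column j of a)."""
    y = [0.0] * len(a)
    for j, x_j in enumerate(x):
        for i in range(len(a)):
            y[i] += a[i][j] * x_j
    return y

def autotune(variants, a, x, repeats=3):
    """Return the name of the variant with the best observed time."""
    best_name, best_time = None, float("inf")
    for name, fn in variants.items():
        elapsed = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(a, x)
            # Keep the minimum over repeats to reduce timing noise.
            elapsed = min(elapsed, time.perf_counter() - start)
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    return best_name

variants = {"row_dot": gemv_row_dot, "col_axpy": gemv_col_axpy}
a = [[float(i + j) for j in range(64)] for i in range(64)]
x = [1.0] * 64
chosen = autotune(variants, a, x)
```

Which variant wins depends on the machine and the problem size, which is the reason to measure instead of guessing.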

As mentioned earlier, there’s no time to work on GPU random number support in the immediate future. Any execution traces that use the RNG API will be scheduled on the CPU interpreter.
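The fallback rule can be pictured as a simple check over the operations in a trace. This is only a sketch under assumptions: the trace is modeled as a list of operation names, and `rng_uniform`, `rng_normal`, `choose_backend`, and the backend labels are all hypothetical names, not the project's real identifiers.

```python
# Hypothetical names for operations the GPU path does not support yet.
GPU_UNSUPPORTED_OPS = {"rng_uniform", "rng_normal"}

def choose_backend(trace):
    """Pick a backend for a trace given as a list of operation names.

    Any unsupported operation sends the whole trace to the CPU
    interpreter; otherwise the trace goes to the GPU JIT.
    """
    for op in trace:
        if op in GPU_UNSUPPORTED_OPS:
            return "cpu_interpreter"
    return "gpu_jit"
```

The decision is made per trace, so one RNG call is enough to keep the entire trace on the CPU.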

The JIT does work but has many dark corners. That’s why it is important to stop adding new features and start fixing bugs. For the first release, not everything will work – which makes it more important to know what does work.

Correctness is the prime requirement. After that comes stable and consistent behavior, which is sometimes an issue with managed platforms (e.g. unpleasant surprises from a database execution plan optimizer). Failures are OK if they are known and not silent.

That’s my mindset for this technology. It must be useful. It doesn’t have to be perfect. I want to add value, not uncertainty.

