there are so many cores

Alpha 5 release: anticipating embedded configuration

The alpha 5 code is committed to GitHub and as usual, there is a small download without metadata.

I spent about a week working on JIT changes to support more aggressive private register use. None of that made it into this code commit (in fact, I will throw all that away and do it differently). This drop has no functional changes. It is purely refactoring of source code.

One obvious question is the market segment for Chai:

  1. Server clusters, maybe in the cloud
  2. Mobile platforms like smart phones and tablets

After my AFDS12 technical presentation, someone asked me this question. My answer was that I had not decided. It was a lame answer but also true.

Chai was designed to mimic PeakStream from 2007 – a vision of desktop HPC using big discrete GPUs.

The world has changed. The middle of the market (servers, workstations, and PCs) is in decline. Growth now comes from smaller devices using services in big clouds. Society seeks cheaper solutions.

I haven’t been blind to this. With all of the effort to just make Chai work, I was too occupied to think about the big picture. Now that the basic engineering problems are solved (well enough for an alpha prototype), I am starting to look around at the situation.

Business thinks about total cost in terms of risk over a time horizon. The risks in any new platform are large. This is why software languages and platforms are usually given away for free. To offset the risk, the price must be zero.

This is also why business usually chooses conservative solutions. Throwing hardware at a problem or using the less efficient platform everyone knows may be inefficient and expensive – but the risk of surprises is much lower.

PeakStream deliberately chose to be a language inside C++ to reduce risk. Chai made the same choice. However, even with its performance advantage, C++ has become an embedded language. It is used when constraints on performance, memory, and power efficiency prevent using anything else.

That’s why I believe it is necessary to go farther and change the vision. The world has changed.

Here’s my argument. There are three ways GPGPU could go.

  1. (Status quo) Nothing changes as technology is mature and good enough.
  2. (STI Cell, Larrabee) Multi-core CPUs become many-core. GPGPU becomes irrelevant.
  3. (SoC CPU+GPU, APU) Heterogeneous balance of multi-core CPU with integrated GPU.

If the future is 1, then Chai will go nowhere. What happened to PeakStream and RapidMind? They were acquired and disappeared. The market is so small that no standard platform has appeared to address this need since then.

If the future is 2, then Chai will go nowhere. Why do all this rocket science to program GPUs? However, history has voted here too. A many-core processor is not necessarily easier to program than a GPU and is less efficient by design. It is also too expensive.

If the future is 3, then Chai may be useful. The GPU becomes the data parallel math coprocessor. Part of this is already happening as SoC processors with integrated GPUs are now standard. However, these are still graphics-oriented (e.g. no double precision support).

So I am not claiming that future 3 will happen (although this is AMD’s existential bet with HSA). Rather, it is the only future that has Chai in it.

One last thing. Roy Spliet sent me this video. It embarrassed me when I first watched it. During the DDP flashmob through the Hyatt, a camera crew stopped me for an interview. I had wondered what happened to that footage.
