[vox] Who thinks Java is cool?

Norm Matloff matloff at cs.ucdavis.edu
Fri Jun 17 11:00:54 PDT 2011


On Thu, Jun 16, 2011 at 11:59:50PM -0700, Bill Broadley wrote:
> On 06/15/2011 11:41 PM, Norm Matloff wrote:

> > It's generally believed in the parallel
> > processing community that the shared-memory paradigm makes for clearer
> > code than does message-passing, and I personally agree.
 
> That's an interesting assertion, possibly better discussed in person.

Well, there's fact and there's taste.

I think it's fair to say that it is a fact that shared-memory code is
simpler, i.e. takes up fewer lines of source code.

But is simpler clearer?  That is absolutely a matter of taste.
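
To make that concrete, here is a rough sketch of the same parallel sum
written in both styles.  I've used Go rather than OpenMP/MPI, since Go
comes up further down in this message; the worker count and chunking
are arbitrary choices of mine, just for illustration.

package main

import (
    "fmt"
    "sync"
)

// Shared-memory style: workers write partial sums into one shared slice.
func sumShared(x []int, nw int) int {
    partial := make([]int, nw) // shared array, one slot per worker
    chunk := (len(x) + nw - 1) / nw
    var wg sync.WaitGroup
    for w := 0; w < nw; w++ {
        wg.Add(1)
        go func(w int) {
            defer wg.Done()
            lo, hi := w*chunk, (w+1)*chunk
            if hi > len(x) {
                hi = len(x)
            }
            if lo > hi {
                lo = hi
            }
            for _, v := range x[lo:hi] {
                partial[w] += v
            }
        }(w)
    }
    wg.Wait()
    total := 0
    for _, p := range partial {
        total += p
    }
    return total
}

// Message-passing style: workers send partial sums over a channel.
func sumMessages(x []int, nw int) int {
    results := make(chan int, nw)
    chunk := (len(x) + nw - 1) / nw
    for w := 0; w < nw; w++ {
        go func(w int) {
            lo, hi := w*chunk, (w+1)*chunk
            if hi > len(x) {
                hi = len(x)
            }
            if lo > hi {
                lo = hi
            }
            s := 0
            for _, v := range x[lo:hi] {
                s += v
            }
            results <- s
        }(w)
    }
    total := 0
    for w := 0; w < nw; w++ {
        total += <-results
    }
    return total
}

func main() {
    x := make([]int, 1000)
    for i := range x {
        x[i] = i + 1
    }
    fmt.Println(sumShared(x, 4), sumMessages(x, 4)) // both print 500500
}

In a toy example like this the two versions come out about the same
length; the gap shows up in real OpenMP-versus-MPI codes, where the
message-passing version also has to distribute the data explicitly.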

> The HPC world of parallel processing seems to be largely invested in MPI
> which seems orders of magnitude more popular than shared memory both in
> HPC clusters and in HPC codes.  

That was true until two or three years ago.  But then GPUs became big.

> Sure OpenMP works, but that's only for toy scaling.  If you want
> to scale to say a factor of 100, 1000 or 10,000 on a code that's not
> embarrassingly parallel then you use message passing.  

In general, one can do more tweaking in a message-passing environment.
This is especially true if the alternative is a cache-coherent
shared-memory setting.  That picture changes radically with GPUs.  If
one's criterion is how many different applications can get an excellent
performance/price ratio, GPUs would probably win hands down.

Of course, one does even better in a multi-GPU setting, linking the GPUs
using message-passing, and currently this is the typical mode in the
"world class" supercomputers.  (Take that with a grain of salt, as the
applications used are rather narrow.)

I'm really surprised that you say one can get speedups of multiple
orders of magnitude on non-embarrassingly parallel problems.  I've been
trying to think up examples of this, or find examples that others have
implemented, for quite a while now, and I haven't found any.  If you
know of examples, PLEASE LET ME KNOW.  The all-caps format here is not
to shout at you or to challenge you, but rather because it really
matters.  Among other things, I'm writing a book on R programming, and I
have a chapter on parallel R.  Things get a little more complicated in
R, but the basic issue remains:  Can one achieve speedups of that
magnitude on non-embarrassingly parallel problems in large-scale
distributed systems?

> I've had a few experiences with MPI over shared memory (SGI and Mpich)
> and both worked rather poorly.  Granted that's not relevant to getting
> useful things done with shared memory.  As counter intuitive as it was,
> it was faster to send a message to a pci-e attached card for local
> messages than it was to send a message using shared memory.

Yes, I've had a similar experience.  For those doing message-passing, I
don't recommend using shared memory as the underlying communications
medium.

> Go has pointers, but not pointer arithmetic (unless I'm
> misremembering my languages).

Then does it not have shared arrays?
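
What I have in mind is something along these lines -- a minimal
sketch, assuming goroutines in one process really do see the same
slice (the array size and the squaring are arbitrary, just to show
the writes landing in one place):

package main

import (
    "fmt"
    "sync"
)

func main() {
    const n = 8
    a := make([]int, n) // one array, visible to every goroutine

    var wg sync.WaitGroup
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            a[i] = i * i // each goroutine writes its own slot
        }(i)
    }
    wg.Wait()
    fmt.Println(a) // [0 1 4 9 16 25 36 49]
}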

Norm


