[vox-tech] beowulf cluster

Bill Broadley vox-tech@lists.lugod.org
Fri, 10 Jan 2003 17:24:26 -0800


On Fri, Jan 10, 2003 at 04:43:25PM -0800, Shawn P. Neugebauer wrote:
> On Friday 10 January 2003 03:41 pm, Bill Broadley wrote:
> > On Mon, Jan 06, 2003 at 02:48:38PM -0800, Ryan Detert wrote:
> > > I am looking for a good howto or a really clear book on setting up a
> > > beowulf cluster. I have 3 computers and I am wondering first off if it
> > > would be easier to use NFS or having each node have a completely
> > > functional OS.
> >
> > Hrm, that's a pretty large subject, what exactly are your goals?  Resume
> > fodder?  A particular application?  Parallelizing a particular code?
> 
> I hear lots and lots of talk about building beowulf clusters, but I don't
> hear much talk about applications.  Is there existing software that
> will distribute problems across such clusters *automatically*?  

In general, no.

> I'm thinking about high-level software, e.g., Matlab, or octave, or even
> a more special-purpose application.  Or is the power of these clusters

There are very specific things that can be auto-parallelized.  But
often the "parallel matlab" is just a small set of calls that runs
in parallel.  

> only harnessed when I write near-custom, MPI-based code that
> specifically parallelizes *my* problem? 

Correct, that is exactly how most of the research is done.  There are
commercial applications that are starting to support MPI for a particular
area of research.

The other case is embarrassingly parallel situations.  Say, for instance,
a collider generates 100,000 collisions and each one takes 8 hours of
number crunching.  If you send one per CPU you can easily keep N<100,000
CPUs busy.

> I recognize the difficulty in
> auto-magically parallelizing, but what *good* is such a cluster if I
> have to write custom code all the time?

Well, everyone wants a computer that magically answers all questions with
no work.  Unfortunately, the reality for now is that writing parallel
code that scales well on a beowulf cluster is pretty tough.  The flip
side is that the skill can be rather valuable.  I've been programming
for a few decades now, and parallel programming is definitely the most
difficult thing I've tried.

> I've got 3 computers, too, Ryan, and I've always wondered what I
> could do to combine them into a more useful whole, especially for
> Matlab-like processing.

Well, running 3 separate jobs is easy (one per CPU).  There is a parallel
matlab product, but I believe it's not a generalized solution, just a
few routines that can use MPI to utilize more than one CPU.

I haven't looked recently, but there might even be a similar open source
project.

-- 
Bill Broadley
Mathematics
UC Davis