[vox-tech] mmap()

Norm Matloff matloff at cs.ucdavis.edu
Wed Oct 23 23:51:06 PDT 2013


Thanks, everyone, for the comments.

Someone asked what I am trying to do.  Here it is in a nutshell:

I have an R package, Rdsm, for shared-memory parallel computing:
http://cran.r-project.org/web/packages/Rdsm/index.html
It does genuine shared-memory access when all the processes (nominally
independent R processes) are on the same machine, and I'd like to be
able to do the same on a cluster, giving the programmer the illusion of
shared memory.  Performance would of course be much lower in the cluster
case, but better than nothing (and probably no worse than Rmpi etc.).

Rdsm runs on top of another R package, bigmemory.  The latter allows
storage in files rather than solely in memory, so in principle Rdsm
should work on a cluster running a shared file system.  However, it
turns out not to do so (even though using files for the storage is fine
if all the processes are on the same machine).

The culprit seems to be mmap(), which apparently behaves quite
differently from ordinary file access on a shared file system.  With
ordinary reads and writes, the changes made by one process seem to
propagate to the other processes reasonably quickly; with mmap(), they
apparently do not.
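
For anyone who wants to poke at the difference, here is a minimal
sketch in Python (not R, and not what bigmemory does internally).  The
helper names and the demo file are made up for illustration.  It writes
through a shared mapping, calls flush() (which issues msync() on Unix),
and then reads the same bytes back with ordinary file I/O.  On a single
machine the two views agree.  On a cluster, whether and when another
node sees the change depends on the shared file system's caching, so
treat this as a test harness, not a fix.

```python
import mmap
import os
import tempfile


def write_via_mmap(path, offset, payload):
    """Write payload into the file at path through a shared memory mapping."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the default mapping on Unix is shared.
        with mmap.mmap(f.fileno(), 0) as m:
            m[offset:offset + len(payload)] = payload
            # Ask the kernel to write the dirty pages back to the file.
            m.flush()


def read_via_file(path, offset, length):
    """Read bytes with ordinary file I/O, without creating a mapping."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Hypothetical demo file; a real experiment would put it on the shared
# file system and run the reader on a different node.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"\0" * 4096)
os.close(fd)

write_via_mmap(demo_path, 16, b"hello")
result = read_via_file(demo_path, 16, 5)
print(result)
os.remove(demo_path)
```

Running write_via_mmap() on one node and read_via_file() on another,
against a file on the shared file system, would show how long (if ever)
the change takes to become visible there.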

I'm probably going to give up, and bypass the mmap() part completely in
the cluster case, settling instead for plain files (calling R's save()
in one process and load() in the other).
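
The plain-file approach could look roughly like the sketch below.  It
is a Python analogue, with pickle standing in for R's save() and load();
the function names, file name, and directory are hypothetical.  The one
design choice worth noting is that the writer saves to a temporary file
in the same directory and then renames it, so a reader should never see
a half-written file.  How quickly another node notices the new file
still depends on the shared file system.

```python
import os
import pickle
import tempfile


def save_object(obj, path):
    """Write obj to path, replacing any previous version in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be in the same directory so the rename
    # stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(obj, f)
        f.flush()
        os.fsync(f.fileno())   # push the data out before renaming
    os.replace(tmp_path, path)  # readers see the old file or the new one


def load_object(path):
    """Read back whatever object was last saved at path."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical stand-in for a directory on the shared file system.
shared_dir = tempfile.mkdtemp()
shared_path = os.path.join(shared_dir, "shared_matrix.pkl")

save_object([[1.0, 2.0], [3.0, 4.0]], shared_path)
loaded = load_object(shared_path)
print(loaded)
```

In the cluster setting, one process would call save_object() and the
others would call load_object() on the same path.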

Norm


