[vox-tech] collaborative data storage (of excel files)

Dylan Beaudette dylan.beaudette at gmail.com
Tue Jan 15 10:33:04 PST 2008


On Tuesday 15 January 2008, Henry House wrote:
> On 2008-01-15, wrote Dylan Beaudette:
> > Hi,
> >
> > Some of the people in my lab are interested in collaboratively compiling
> > a large quantity of environmental data, with each user appending several
> > hundred measurements of several variables every week.
> >
> > They are currently emailing around a spreadsheet file and there have
> > been numerous data accidents. Now they are asking to put the file onto a
> > shared drive, so that they can access it remotely. This sounds like a
> > terrible idea to me, even worse than the previous attempt.
>
> Their idea is distilled evil! But you already knew that.

Indeed! 


> > The data are essentially rows and cols of numbers that are added to and
> > edited weekly.
> >
> > At first I thought subversion might be helpful, but revision control
> > doesn't work so well with binary data (excel files)... unless there is
> > something I don't know about. It would be hard to detect conflicts, or to
> > merge data. However, it would allow for timestamps and revision numbers
> > to provide some level of authority.
> >
> > Designing some kind of database-driven system seems like a logical
> > choice, but I do not have the time to do this. Perhaps there is already
> > something out there.
> >
> > Does anyone have some insight into how to solve this data management
> > nightmare?
>
> The right way to do this is to use a database. But, an easier
> maybe-almost-as-good solution might be to use subversion and save the
> data as CSV text files (excel can do this just fine). It is useful to
> add comment lines (maybe you could have an internal convention about
> this) that help subversion to figure out where to merge in changes.

This is a good idea. Unfortunately we might be stuck with 4 or 5 huge CSV
files which are constantly appended to, but SVN should be able to keep
things sane. The trick will be to get people to realize that these are
plain CSV files, so no monkeying around with formulas, etc.
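
A sanity check run before each commit might help enforce that. What follows
is only a minimal sketch in Python, not a tested tool: the function name
check_csv, the "#" comment-line convention, and the rule that any cell
starting with "=" counts as a formula are all assumptions made for
illustration.

```python
import csv
import io


def check_csv(text, expected_columns=None):
    """Return a list of (line_number, message) problems found in CSV text.

    Assumptions (hypothetical conventions, not an agreed standard):
      * rows whose first cell starts with "#" are comment lines
      * a cell starting with "=" is a leftover spreadsheet formula
      * the first non-comment row (the header) sets the column count
    """
    problems = []
    reader = csv.reader(io.StringIO(text))
    for row in reader:
        line_no = reader.line_num
        if not row:
            continue  # skip blank lines
        if row[0].startswith("#"):
            continue  # skip comment lines
        if expected_columns is None:
            expected_columns = len(row)
        elif len(row) != expected_columns:
            problems.append(
                (line_no, "expected %d columns, found %d"
                 % (expected_columns, len(row))))
        for cell in row:
            if cell.lstrip().startswith("="):
                problems.append((line_no, "formula-like cell: %r" % cell))
    return problems


if __name__ == "__main__":
    sample = (
        "site,depth_cm,ph\n"
        "# week 1\n"
        "A,10,6.5\n"
        "B,20,=AVERAGE(C2:C3)\n"
        "C,30\n"
    )
    for line_no, message in check_csv(sample):
        print("line %d: %s" % (line_no, message))
```

Something like this could be run by hand before "svn commit", or wired into
a repository hook, so that a bad row is caught before it reaches everyone
else's working copy.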

I'll keep digging around for ideas.

Dylan

-- 
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341

