[vox-tech] Tar, MySQL, and cron-ed backups

Mitch Patenaude vox-tech@lists.lugod.org
Fri, 5 Dec 2003 16:43:25 -0800


> So, if tar is saying the "file changed as we read it", does that mean 
> that tar:
> * skipped the file,

No

> * made a copy of the version that existed when tar *started* the 
> operation,

No

> * made a copy of the version that existed when tar *finished* the 
> operation, or

No

> * some combination of these?

yes. if it was half-way through reading the file.. then the first half 
is the old file, and the second half is the new files.  If more than 
one change was made, then more than one file is there.

Even if tar doesn't complain, the database is probably still be 
corrupted, since several independent files all need to be "in sync" for 
the database to be consistent. Almost certainly the files in the 
archive will be badly corrupted (indices inconsistent with tables, 
records that were changed half-way through a read), and therefore the 
backup of those files will be useless.  Even if they "work" (mysqld 
doesn't barf on startup) they are almost certainly not transactionally 
consistent, and that will lead to unpredictable behavior and incorrect 
results.  You'll need to either shut down the database or exclude those 
files whatever the final solution.

This is the kind a dilemma that a good DBA/consultant gets paid so much 
to resolve.  Basically, you can use a tool like mysqldump to dump a 
version of the tables, which can be used to reconstruct the database 
later.   If you don't have enough disk space, then get another (or 
bigger) disk.  Even then, transactional rules that aren't strictly 
enforced by the database may lead to inconsistent data if you're 
dumping a "live" database.

For example... Say the mysqldump script is working on an order 
processing database.  If it dumps the customer table first, followed by 
the product table and then the order table you can end up with a 
customer record which was added to the database after the dump for the 
customer table completed, (and therefore isn't in the backup) but a 
corresponding order record was added before the backup of the order 
table, which means that the restored database would have an order 
corresponding to a non-existent customer record, violating the foreign 
key constraint.

If you "fix" this by backing up the order table first, then you end up 
with the same problem with customer record deletions.  The solution 
here is to have a way to "suspend" certain types of operations during 
the backup.  (e.g. don't delete customers/orders, just set a flag in 
the table that says they're invalid, and delete them later.. this is 
still no guaranty though... but it may lessen the amount of work needed 
to manually "fix" the inconsistencies when you have to restore.

Good luck,
   -- Mitch