[Fwd: [vox-tech] corrupted ext3 filesystem]

Jonathan Stickel vox-tech@lists.lugod.org
Tue, 25 Feb 2003 09:42:05 -0800


Thanks for all your comments and advice.  I finally had time to explore 
a few of these (comments below).

msimons@moria.simons-clan.com wrote:
> On Tue, Feb 11, 2003 at 04:20:03PM -0800, Jonathan Stickel wrote:
> 
>>kjournald[150] exited with preempt count 1
> 
> 
>   From a few minutes in google, this appears to be relevant:
> 
> http://lwn.net/Articles/17846/
> # Kernel preemption.
> # ~~~~~~~~~~~~~~~~~
> # - The much talked about preemption patches made it into 2.5.
> #   With this included you should notice much lower latencies especially
> #   in demanding multimedia applications. 
> # - Note, there are still cases where preemption must be temporarily disabled
> #   where we do not. These areas occur in places where per-CPU data is used.
> # - If you get "xxx exited with preempt count=n" messages in syslog,
> #   don't panic, these are non fatal, but are somewhat unclean.
> #   (Something is taking a lock, and exiting without unlocking)
> # - If you DO notice high latency with kernel preemption enabled in
> #   a specific code path, please report that to Andrew Morton <akpm@digeo.com>
> #   and Robert Love <rml@tech9.net>.
> #   The report should be something like "the latency in my xyz application
> #   hits xxx ms when I do foo but is normally yyy" where foo is an action
> #   like "unlink a huge directory tree".
> 
> (while this document is talking about 2.5, redhat normally applies a bunch
> of custom patches to their production kernels, and I didn't bother checking 
> if this has made it into 2.4 mainline).

I am actually running a custom 2.4.18 kernel, not one of RedHat's.  For 
now, I will ignore this "preempt count" message.

>   for future reference a very good way to force a filesystem check is
> ===
> shutdown -F -r now
> ===
>   the -F asks for a forced file system check on bootup... while it
> requires some support from the bootup scripts to happen, I imagine
> support for it is standard on most linux distributions.

Yes, this is very helpful.  I use this now whenever I get my unmount 
errors on shutdown, which still occur about once every 2 weeks. 
Fortunately, fsck has not found any errors since.

>   A while ago I noticed that the Redhat installer created ext3 
> filesystems that will never be checked periodically.  This can
> lead to massive filesystem corruption later on if small errors
> in the filesystem go undetected and the filesystem continues to
> be used.
>   This corruption happens because the kernel filesystem drivers
> don't cross check the filesystem data on each use (it would be
> slower), and since only the filesystem driver should change the
> data it is trusted to be correct... if different records go out 
> of sync very bizarre things can happen.
> 
>   You can use 'tune2fs -l' to check what the "Maximum mount count"
> and "Check interval" are.  I would recommend having max mount be
> something between 20 and 40, and check interval be something between
> 3 and 6 months.

You were right:  my RH8 system defaults to 0 for maximum mount count 
and check interval.  I have changed these to 20 mounts and 20 days. 
RedHat should put this in their documentation...
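
For anyone who wants to do the same, here is a rough sketch of the 
change described above.  The device name /dev/hda1 and the helper 
function fsck_schedule_cmd are placeholders of mine, not anything from 
the thread; the -c (maximum mount count) and -i (check interval) options 
are the standard tune2fs ones.  The script only prints the command, so 
you can look it over before running it as root.

```shell
#!/bin/sh
# Sketch only: /dev/hda1 is a hypothetical device; substitute your own partition.
DEVICE=${DEVICE:-/dev/hda1}

# Build (but do not run) the tune2fs command that sets the check schedule.
#   $1 = device, $2 = maximum mount count, $3 = check interval (e.g. 20d, 2w, 3m)
fsck_schedule_cmd() {
    echo "tune2fs -c $2 -i $3 $1"
}

# To inspect the current settings first (needs root):
#   tune2fs -l "$DEVICE" | grep -E 'Maximum mount count|Check interval'

# Print the command for 20 mounts / 20 days; run it as root once reviewed.
fsck_schedule_cmd "$DEVICE" 20 20d
```

A value of 0 for either option turns that check off, which matches 
the RH8 defaults mentioned above.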