[vox-tech] Hardware Fault on Mandrake System?
Marc Elliot Hall
vox-tech@lists.lugod.org
Sun, 14 Sep 2003 08:53:09 -0700
The situation is this:
I have a Mandrake 9.1 system with the stock kernel running on an EPIA
mini-ITX mainboard with an 800MHz VIA CPU (Ezra) and 512 MB of RAM.
Other than the motherboard, PSU, and a pair of cooling fans, the only
things in the box are the mass storage devices:
* /dev/hda 76 GB Western Digital with an EXT3 filesystem (Mandrake
  identifies this as a Maxtor 6Y080P0 for some reason)
* /dev/hdc 36 GB Maxtor 6L040J2 with an EXT3 filesystem
* /dev/hdd 36 GB Maxtor 6L040J2 with an EXT3 filesystem
* /dev/scd0 SAMSUNG CD-R/RW SW-240B drive
/dev/hda is a single partition, upon which I have stored my media files
-- including my vast, *legally acquired* (for the benefit of any RIAA
spiders), music collection. More on this in a moment.
Mandrake 9.1 correctly identified and set up all the hardware on this box
(with the exception noted above) when I updated it from Mandrake 8.2.
(Mandrake 9.0 needed a patch for this CPU, but I didn't like the
results, so I rolled it back.) I love this system!
This machine is used as my daily desktop system, home network DNS
server, and backup web server for several domains.
The problem is this:
Going about my daily work (my now seven-month-long job search,
thankyouverymuch) in a KDE environment, I keep an XMMS session running
through a number of playlists. Periodically (by which I mean about once
a week), XMMS will freeze mid-song and the hard disk LED will light up
and stay on. Over the next thirty seconds or so, all other programs
become non-responsive, one by one (ctrl+alt+Fx to drop to a console
stops working, too).
If I am fast enough to quit a top session in a running shell and do a
killall xmms, I can recover and work normally; however, I've only
managed that a couple of times. Usually, what happens is the entire
system hangs: no mouse, no keyboard (not even numlock), no remote login.
I don't know about SysRq... probably should try that, but I never think
about it at the time.
Upon reboot, the system recognizes an unclean shutdown and asks if I
want to e2fsck. If I say no, it correctly finds journal entries and
cleans up the disks. Total boot time: 2 minutes, 30 seconds. If I say
yes, it proceeds through a file system check and nukes anywhere from 0
to 2 GB on /dev/hda. *ALWAYS* /dev/hda. Most of the time (but not
always), these files can be recovered from lost+found. Total boot time:
anywhere from five to thirty minutes (once, about four months ago, it
took three hours).
My question is this:
Is it likely that /dev/hda is faulty? Is there a utility I can use to
check disk integrity and the drive hardware? Should I have split the disk into
smaller partitions? I don't think this is heat related, as my disks are
each spaced a full bay apart (big case, small mobo) and I have two
(running) case fans in addition to the PSU fan. This problem is annoying
rather than mission critical at the moment, but I imagine that if it
continues much longer, permanent harm may result. Any thoughts?
--
Marc Elliot Hall www.hallmarc.net/quick_resume.html
P.O. Box 435
Shingle Springs, CA 95682
(530)409-0372 cell
(530)672-8504 home
www.hallmarc.net Hire me!