[vox-tech] ECC memory --- is it worth it? (semi-OT)

hajhouse hajhouse at houseag.com
Sun Apr 8 00:30:54 PDT 2007


ECC memory is supposed to correct single-bit errors that can be caused
by radiation and other freak events of the quantum-mechanical field in
which we live. That sounds like a good thing that I would like to have
and an important feature for a machine that will be up for months
between reboots. 

However, ECC modules cost more than standard modules. Also, most
motherboards don't list ECC support in their feature lists. I assume
this means that ECC modules would either not work at all in such a
board, or would work as standard memory with their error-correcting
capability unused (rather pointless). Choosing ECC memory therefore
also means picking from a smaller set of motherboards, and probably
paying more for the board, because only high-end boards have ECC
support.

How many of you are using non-ECC (standard) memory on long-uptime
machines? Are you having any problems because of it? Do you think ECC is
worth the premium?

My current main machine does have ECC memory. I've not made a habit of
looking at /proc/ram to see whether my machine has had RAM errors, but
currently it shows none.

Chipset ECC capability : ECC with hardware scrubber
Current ECC mode : ECC with hardware scrubber
Bank	Size	Type	ECC	SBE	MBE
0	128M	RDR	Y	0	0
1	128M	RDR	Y	0	0
2	128M	RDR	Y	0	0
3	128M	RDR	Y	0	0
4	128M	RDR	Y	0	0
5	128M	RDR	Y	0	0
6	128M	RDR	Y	0	0
7	128M	RDR	Y	0	0
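
For anyone who would rather total those counters than eyeball them, here
is a minimal Python sketch that parses a table laid out like the one
above. It assumes the column names shown (Bank, Size, Type, ECC, SBE, MBE)
and whitespace-separated fields; the function names parse_banks and
error_totals are made up for illustration, and the sample text is simply
the output pasted above rather than a live read of /proc/ram.

```python
# Sample text pasted from the output above. On a machine that provides
# /proc/ram in this layout (an assumption), the same text could be read
# from that file instead.
SAMPLE = """\
Chipset ECC capability : ECC with hardware scrubber
Current ECC mode : ECC with hardware scrubber
Bank Size Type ECC SBE MBE
0 128M RDR Y 0 0
1 128M RDR Y 0 0
2 128M RDR Y 0 0
3 128M RDR Y 0 0
4 128M RDR Y 0 0
5 128M RDR Y 0 0
6 128M RDR Y 0 0
7 128M RDR Y 0 0
"""


def parse_banks(text):
    """Return one dict per bank row, keyed by the header column names."""
    header = None
    banks = []
    for line in text.splitlines():
        fields = line.split()  # splits on tabs or spaces alike
        if not fields:
            continue
        if header is None:
            # Skip the preamble until the header row, which starts with "Bank".
            if fields[0] == "Bank":
                header = fields
            continue
        if len(fields) == len(header):
            banks.append(dict(zip(header, fields)))
    return banks


def error_totals(banks):
    """Sum single-bit (SBE) and multi-bit (MBE) error counts over all banks."""
    sbe = sum(int(b["SBE"]) for b in banks)
    mbe = sum(int(b["MBE"]) for b in banks)
    return sbe, mbe


banks = parse_banks(SAMPLE)
sbe, mbe = error_totals(banks)
print(f"{len(banks)} banks, SBE total {sbe}, MBE total {mbe}")
```

A nonzero SBE total would mean errors were corrected; a nonzero MBE total
would mean errors were detected that could not be corrected.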

For background, here's what Wikipedia says:

   Memory controllers in most modern PCs can typically detect, and correct
   errors of a single bit per 64 bit "word" (the unit of bus transfer), and
   detect (but not correct) errors of two bits per 64 bit word. Some
   systems also 'scrub' the errors, by writing the corrected version back
   to memory. The BIOS in some computers, and operating systems such as
   Linux, allow counting of detected and corrected memory errors, in part
   to help identify failing memory modules before the problem becomes
   catastrophic. Unfortunately, most modern PCs are supplied with memory
   modules that have no parity or ECC bits.
   ...
   A reasonable rule of thumb is to expect one bit error, per month, per
   gigabyte of memory. Actual error rates vary widely.

   (http://en.wikipedia.org/wiki/Dynamic_RAM)
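
To put that rule of thumb in terms of uptime, here is a rough Python
sketch. It takes the quoted figure of one bit error per month per gigabyte
at face value and additionally assumes that errors arrive independently at
a constant rate (a Poisson model). That modelling assumption is mine, not
Wikipedia's, and since actual rates vary widely the numbers are only
illustrative. The function names are made up for this example.

```python
import math

# The quoted rule of thumb: one bit error per month per gigabyte.
ERRORS_PER_GB_MONTH = 1.0


def expected_errors(gigabytes, months, rate=ERRORS_PER_GB_MONTH):
    """Expected number of bit errors over the given uptime."""
    return rate * gigabytes * months


def prob_at_least_one(gigabytes, months, rate=ERRORS_PER_GB_MONTH):
    """Chance of one or more bit errors, assuming Poisson arrivals."""
    return 1.0 - math.exp(-expected_errors(gigabytes, months, rate))


# My machine above has 8 banks of 128M, i.e. 1 GB in total.
for months in (1, 3, 6):
    n = expected_errors(1.0, months)
    p = prob_at_least_one(1.0, months)
    print(f"{months} month(s): expect {n:.1f} errors, P(>=1) = {p:.3f}")
```

Under these assumptions a 1 GB machine would have roughly a 63% chance of
at least one bit error in a single month of uptime.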

-- 
Henry House
+1 530 753 3361 ext. 13
Please don't send me HTML mail! My mail system frequently rejects it.
The unintelligible text that may follow is a digital signature.
See <http://hajhouse.org/pgp> to find out how to use it.
My OpenPGP key: <http://hajhouse.org/hajhouse.asc>.

