[vox-tech] gzip bug?

Jan Wynholds vox-tech@lists.lugod.org
Thu, 14 Mar 2002 13:49:47 -0800 (PST)


Hi Kevin:

Have you checked your memory?  Whenever I have problems with something that
_should_ be rock solid (like bzip2 and gzip), I check memory first...  Not to
say it couldn't be some other random hardware problem, but bad memory is what I
have seen most often.  Has any other hardware changed since you started seeing
this behavior?
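
If you want a quick way to put the memory under load while you track down a
proper tester (memtest86 is the usual tool), something like the little Python
loop below might help.  It is only a sketch -- the chunk size and pass count
are arbitrary -- but it hammers RAM with compress/decompress round trips and
complains if anything ever comes back different:

import os
import zlib

CHUNK = 8 * 1024 * 1024    # 8 MB of random data per pass -- arbitrary
PASSES = 200               # also arbitrary; more passes, more confidence

for i in range(PASSES):
    original = os.urandom(CHUNK)            # fresh random data each pass
    packed = zlib.compress(original)        # squeeze it...
    if zlib.decompress(packed) != original: # ...and make sure it comes back
        print("round-trip mismatch on pass", i)
        break
else:
    print("all", PASSES, "passes came back clean")

If that ever reports a mismatch, gzip is almost certainly off the hook.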

On a RedHat 7.1 system I have used bzip2 and gzip to handle many gigabytes of
tape data, so I doubt your software is the problem.

Is anything else giving you problems on this box?  Does gcc work correctly?
Will a kernel compiled on that box run correctly?  Do any programs halt with a
segmentation fault (sig11)?
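
One crude way to answer the gcc question is to compile the same throwaway file
over and over and watch for intermittent failures -- cc1 dying with signal 11
on a compile that usually works is the classic bad-RAM symptom.  Here is a
sketch (the source file name, output path, and flags are just placeholders):

import subprocess

for i in range(100):
    # Placeholder compile job -- point it at any small C file you have handy.
    result = subprocess.run(
        ["gcc", "-O2", "-o", "/tmp/sig11_test", "hello.c"],
        capture_output=True,
    )
    if result.returncode != 0:
        print("compile failed on pass", i)
        print(result.stderr.decode(errors="replace"))
        break
else:
    print("100 clean compiles in a row")

If that fails on some passes and not others, you are almost certainly looking
at hardware, not software.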

I have seen this sort of thing before on RedHat boxen with bad memory.  Have
you added or upgraded memory lately?  I ask only because bzip2 and gzip should
not croak on files of that size, and since you are using RH 7.1, the 2 GB file
size limit shouldn't be your problem either.  Your problem is quite weird,
because gzip is tested and retested (to the point of being bulletproof), so it
is very doubtful that it is your software.  My guess is that it is something in
your hardware.  I found a lot of useful (hardware) testing information in the
Sig11 FAQ at:

http://www.bitwizard.nl/sig11/

Here is some text from that page on (very nearly) your problem:

QUESTION
Is it always signal 11?
ANSWER
Nope. Other signals like four, six and seven also occur occasionally.
Signal 11 is most common though.
As long as memory is getting corrupted, anything can happen. I'd
expect bad binaries to occur much more often than they really
do. Anyway, it seems that the odds are heavily biased towards gcc
getting a signal 11. Also seen:

free_one_pmd: bad directory entry 00000008

EXT2-fs warning (device 08:14): ext2_free_blocks bit already
     cleared for block 127916

Internal error: bad swap device

Trying to free nonexistent swap-page

kfree of non-kmalloced memory ...

scsi0: REQ before WAIT DISCONNECT IID

Unable to handle kernel NULL pointer dereference at virtual
     address c0000004 

put_page: page already exists 00000046

invalid operand: 0000

Whee.. inode changed from under us. Tell Linus

<<This might be akin to your problem:>>

crc error  --  System halted  (During the uncompress of the Linux kernel)

Segmentation fault

"unable to resolve symbol" 

make [1]: *** [sub_dirs] Error 139

make: *** [linuxsubdirs] Error 1

The X Window system can terminate with a "caught signal xx"

The first few ones are cases where the kernel "suspects" a
kernel-programming-error that is actually caused by the bad memory.
The last few point to application programs that end up with the
trouble.
-- S.G.de Marinis (trance@interseg.it)
-- Dirk Nachtmann (nachtman@kogs.informatik.uni-hamburg.de)

<<END>>
HTH,

jan
--- Kevin Dawson <kdawson@ucdavis.edu> wrote:
> Hi Pete,
> 
> The files I'm having the problem with are just 50M and 30M after gzip.
> 
> I followed your suggestion. Everything was going fine up to 1M. Using
> 10M, I had to learn that I didn't have 20G disk space available on
> /home. Using 100k, the created file size is in the range of my problem
> files (50M), can be compressed to only 50k, and uncompressed perfectly.
> It looks like the problem is dependent on the size of the
> compressed file (30M in my case).
> 
> bunzip2 is also having trouble. It says:
> bunzip2: Data integrity error when decompressing.
> 	Input file=bigfile.bz2, output file=bigfile
> 
> It is possible that the compressed file(s) have become corrupted.
> You can use the -tvv option to test integrity of such files.
> 
> When I do the test, it will print
> [N: huff+mtf rt+rld] where N goes up to 29.
> Then it prints: data integrity (CRC) error in data.
> 
> When gunzipping the files, I don't hear any unusual disk churning.
> 
> Thanks again,
> 
> Kevin
> 
