[vox-tech] Linux Block Layer is Lame (it retries too much)

Michael Wenk vox-tech@lists.lugod.org
Wed, 28 May 2003 14:58:30 -0700


On Wednesday 28 May 2003 12:09 pm, Mike Simons wrote:
> On Wed, May 28, 2003 at 11:31:56AM -0700, Jeff Newmiller wrote:
> > On Tue, 27 May 2003, Mike Simons wrote:
> > >   Last week I was having problems with the ide layer... it retries
> > > way too many times.  I was trying to read 512 byte blocks from a dying
> > > /dev/hda (using dd_rescue which calls pread), for each bad sector the
> > > kernel would try 8 times,
>
> [...]
>
> > >   Even better because the process is inside a system call, it is not
> > > killable and so there is no practical way to speed up the process.
> >
> > It should be open to termination between the time the read system call
> > returns and the write system call starts.
>
>   Yes, it was "killable" in that you could ^C or send a signal with kill,
> after waiting 10 minutes the kernel would finish retrying and the
> process would exit cleanly on the signal.
>
>   I meant there was no way to abort the 8 sector read attempt.
>
> > > - How does a 1 sector read get expanded to an 8 sector chunk?
> >
> > I don't know.  But I suspect it has to do with the "natural" way files
> > are read in... by "mmap"ing them to pages in RAM.  i386 memory managers
> > usually use 4k pages... ergo, 8 x 512B sectors.
> >
> > Some of this behavior may be due to the algorithms in dd_rescue.
>
>   Nah... dd_rescue is certainly not the cause.  It is a very simple
> program that reads blocks of a size you can specify on the command line.
>
>   It has the concept of a "soft block size" which it uses to quickly cover
> the good sections of disk, and a "hard block size" which it uses to
> slowly walk the bad sections of disk.  By default it will use the soft
> size until a read error happens, it will then drop to the hard block
> size and read until it travels a few "soft" block sizes without errors.
>   I realize I was not explicit enough, but I had set the "soft" and
> "hard" block size to 512 bytes, which because soft and hard are the
> same prevents dd_rescue from retrying the read of any bad blocks...
>
> > > - Any other ideas on how to pull the disk blocks?
> >
> > Not easy ones. (Build your own device driver that doesn't use mmap.)
>
>   Michael Wenk suggested using O_DIRECT on the open call, which is
> an excellent idea.  This was what the Oracle people described at their
> Clustering Filesystem talk.  I have one more failing hard drive around which I'm
> going to try that on...
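
For reference, a minimal sketch of what the O_DIRECT approach might look
like for a single 512 byte sector.  The function name read_sector_direct
and the 4096 byte buffer alignment are assumptions for illustration and
are not taken from dd_rescue.  O_DIRECT alignment rules differ between
kernel versions and devices, so a 512 byte direct read may be refused
with EINVAL on some systems.

```c
#define _GNU_SOURCE            /* needed for O_DIRECT with glibc */
#define _FILE_OFFSET_BITS 64   /* 64-bit off_t so large disks are addressable */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SECTOR_SIZE 512
#define BUF_ALIGN   4096       /* assumed alignment; a conservative guess */

/*
 * Hypothetical helper: read one sector from `path` with O_DIRECT so the
 * request bypasses the page cache.  `out` must hold SECTOR_SIZE bytes.
 * Returns 0 on success, -1 on failure with errno set.
 */
int read_sector_direct(const char *path, off_t sector, unsigned char *out)
{
    void *buf = NULL;
    ssize_t n;
    int fd, err, rc = -1;

    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;

    /* O_DIRECT wants an aligned buffer; malloc() gives no such guarantee. */
    err = posix_memalign(&buf, BUF_ALIGN, SECTOR_SIZE);
    if (err != 0) {
        close(fd);
        errno = err;
        return -1;
    }

    n = pread(fd, buf, SECTOR_SIZE, sector * SECTOR_SIZE);
    if (n == SECTOR_SIZE) {
        memcpy(out, buf, SECTOR_SIZE);
        rc = 0;
    } else if (n >= 0) {
        errno = EIO;           /* treat a short read as an I/O error */
    }

    /* Preserve errno across cleanup. */
    err = errno;
    free(buf);
    close(fd);
    errno = err;
    return rc;
}
```

A caller would loop over sector numbers, e.g.
read_sector_direct("/dev/hda", n, block), and log the sectors that fail.
Whether this actually avoids the 8 sector expansion is something to
verify on the failing drive.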


The other thing I was looking at was an ioctl call for BLKRASET, or BLKFRASET.  
I googled this and came up with an interesting link on my first shot.  

http://www.linuxtv.org/mailinglists/vdr/2002/04-2002/msg00061.html


In this link you have the following proc settings: 

echo file_readahead:0 > /proc/ide/hdc/settings
echo breada_readahead:0 > /proc/ide/hdc/settings

Maybe this or the ioctl will help (or possibly O_DIRECT).
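
A rough sketch of the ioctl route, assuming the BLKRASET and BLKFRASET
definitions from <linux/fs.h>.  The function name set_readahead is made
up for illustration.  Both ioctls take the new readahead value (in 512
byte sectors) directly as the argument and normally need root.  I have
not confirmed that setting it to 0 stops the kernel from reading whole
4k pages.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>     /* BLKRASET, BLKFRASET */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: set both the block-device readahead (BLKRASET)
 * and the filesystem readahead (BLKFRASET) of `dev` to `sectors`.
 * Returns 0 on success, -1 on failure with errno set.
 */
int set_readahead(const char *dev, unsigned long sectors)
{
    int fd, rc, saved;

    fd = open(dev, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The value is passed by value, not through a pointer. */
    rc = ioctl(fd, BLKRASET, sectors);
    if (rc == 0)
        rc = ioctl(fd, BLKFRASET, sectors);

    /* Preserve errno across close(). */
    saved = errno;
    close(fd);
    errno = saved;
    return rc;
}
```

Usage would be something like set_readahead("/dev/hda", 0) before
starting dd_rescue.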

Mike


-- 
wenk@praxis.homedns.org
Mike Wenk