[vox-tech] File recovery?

ME vox-tech@lists.lugod.org
Mon, 2 Sep 2002 02:48:49 -0700 (PDT)


On Mon, 2 Sep 2002, Ryan wrote:
> On Monday 02 September 2002 12:20 am, ME wrote:
> > However, if you are set on this direction, you could try using grep. :-)
> >
> > grep --binary-files=text --byte-offset "hello" /dev/hda1 |less
> > variations to this can further help refine the search for the offset.
> >
> > It will likely be messy. :-/
> 
> I just tried using mplayer on an offset in the partition. Only plays a few
> frames before quitting due to "end of file"

Yes, this would be expected. First, when you read the device directly and
seek to the offset, even if the file *is* stored sequentially, filesystem
information is still mixed in with it in the physical space. Reading the
"file" straight off the device means your application also reads that
filesystem data and tries to decipher the filesystem's "header" overhead
as actual file contents. When this happens, checksums will fail. The
application may also hit what it takes to be end-of-file sooner than
expected, because bytes that the filesystem driver uses "elsewhere" for
its own bookkeeping show up in the stream even though they are not part of
the file. None of this happens with "normal" IO, where you ask the system
to open a file for reading and the kernel deals with the filesystem to
extract the "payload".
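
To make the first point concrete, here is a toy sketch in Python. It is
not how ext2/ext3 actually lays out data: the 4096-byte block size, the
single metadata block placed after 12 data blocks, and the names
build_toy_disk and first_divergence are all assumptions made up for the
illustration. It only shows that a raw sequential read matches the real
file contents up to the point where filesystem bookkeeping sits in the way.

```python
BLOCK = 4096  # assumed block size for this toy model


def build_toy_disk(payload: bytes, direct_blocks: int = 12) -> bytes:
    """Lay the payload out block by block, with one made-up metadata
    block inserted after the first `direct_blocks` data blocks."""
    blocks = [payload[i:i + BLOCK] for i in range(0, len(payload), BLOCK)]
    metadata = b"\x00" * BLOCK  # stand-in for filesystem bookkeeping
    head = b"".join(blocks[:direct_blocks])
    rest = b"".join(blocks[direct_blocks:])
    return head + metadata + rest


def first_divergence(payload: bytes, raw: bytes) -> int:
    """Return the first offset where a raw read stops matching the
    real file contents, or -1 if the compared bytes never differ."""
    for i, (a, b) in enumerate(zip(payload, raw)):
        if a != b:
            return i
    return -1


# A 20-block "file" whose bytes are never 0x00, so the zeroed
# metadata block is guaranteed to differ from it.
payload = bytes((i % 251) + 1 for i in range(20 * BLOCK))
disk = build_toy_disk(payload)

# Read the same number of bytes straight off the "device".
raw = disk[:len(payload)]
print(first_divergence(payload, raw))  # 49152 here, i.e. 12 * 4096
```

A player fed the raw bytes would see valid data for the first 48 KiB of
this toy file and garbage after that.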

In addition to this, files may not be stored sequentially on disk. In
filesystems built on balanced trees (some? many? all?), not only is the
db/index/node use in the tree balanced, but disk location/use can be too.
I know very little about actual filesystems in a "real world" view, as I
have not written or deciphered in detail any modern filesystem (anything
after FAT/VFAT, i.e. something that includes permissions, complex ACLs, or
journaling). I know just enough of the basic concepts to say the wrong
thing, so please take this paragraph as if it were spoken by a complete
newbie. I hope to convey at least a basic conceptual reason for
non-sequential storage on disk beyond the normal fragmentation issues.

Add the two above together, and you can see two good reasons I offer for
why grep would be messy. :-(
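
For reference, the offset search that grep does can be sketched in Python
roughly as follows. The function name find_offsets and the chunk size are
my own choices for illustration, and the in-memory stream stands in for
the device, which you would instead open read-only in binary mode. It only
reports where a byte pattern occurs; it does nothing about the two
problems above.

```python
import io


def find_offsets(stream, needle, chunk_size=1 << 20):
    """Yield the absolute byte offset of every occurrence of `needle`
    in a binary stream, reading it one chunk at a time."""
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1  # bytes kept so matches can span chunks
    tail = b""
    base = 0  # absolute offset of the first byte of `tail + chunk`
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        pos = buf.find(needle)
        while pos != -1:
            yield base + pos
            pos = buf.find(needle, pos + 1)
        # The tail is shorter than the needle, so it can never hold a
        # complete match and nothing is reported twice.
        tail = buf[-overlap:] if overlap else b""
        base += len(buf) - len(tail)


# Small in-memory stand-in for a device; the tiny chunk size forces
# matches to straddle chunk boundaries.
data = b"xxhelloyyhellozz" + b"a" * 10 + b"hello"
print(list(find_offsets(io.BytesIO(data), b"hello", chunk_size=4)))
# [2, 9, 26]
```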

> TCT looks like it's not going to do anything for recovering files if debugfs
> can't find any deleted inodes, besides text files and crap. I mentioned
> earlier, this stuff is mostly fansubbed anime, so it looks as though I'm just
> going to have to re-download the lost files.

I *thought* TCT walked the actual device. If TCT can recognize the
filesystem (it was made before ext3, I think; the docs will say for sure
whether it can do ext3 now), then it tries to piece together bits of
*data* on disk without inodes. Undeleting files in Linuxland ext2/ext3 is
not as easy as with vfat/fat. I tried to piece a file together by hand on
an ext2 filesystem before TCT, and I can say it was a huge pain! I only
retrieved about 80% of the file.
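
Piecing a file together by hand boils down to something like the Python
sketch below: once you have somehow worked out which blocks belonged to
the file and in what order (that is the painful part, and nothing here
helps with it), you read those blocks off the device and glue them
together. The function name reassemble, the block list, and the 4-byte
block size in the demo are all hypothetical, and this is not a
description of how TCT does it.

```python
import io


def reassemble(device, block_numbers, block_size=4096, length=None):
    """Concatenate the listed blocks, in the given order, from a
    seekable binary stream. `length` trims the padding that follows
    the end of the file inside its last block."""
    out = bytearray()
    for n in block_numbers:
        device.seek(n * block_size)
        out += device.read(block_size)
    if length is not None:
        out = out[:length]
    return bytes(out)


# Toy "device" with 4-byte blocks: the file lives in blocks 2, 3, 0,
# and block 1 belongs to something else.
image = b"CCCC" + b"----" + b"AAAA" + b"BBBB"
print(reassemble(io.BytesIO(image), [2, 3, 0], block_size=4, length=10))
# b'AAAABBBBCC'
```

Get one block number wrong or out of order and that part of the file is
garbage.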

> Output from debugfs:
> 
>  debugfs 1.27 (8-Mar-2002)
> debugfs:  lsdel
>  Inode  Owner  Mode    Size    Blocks   Time deleted
> 0 deleted inodes found.
> 
> I don't understand why this happened, I'm running ext3, shouldn't it have
> just recovered the journal and been fine?

I have not converted to ext3 yet, and I do not have an answer for you on
this question. Sorry. :-(

-ME

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS/CM$/IT$/LS$/S/O$ !d--(++) !s !a+++(-----) C++$(++++) U++++$(+$) P+$>+++ 
L+++$(++) E W+++$(+) N+ o K w+$>++>+++ O-@ M+$ V-$>- !PS !PE Y+ PGP++
t@-(++) 5+@ X@ R- tv- b++ DI+++ D+ G--@ e+>++>++++ h(++)>+ r*>? z?
------END GEEK CODE BLOCK------
decode: http://www.ebb.org/ungeek/ about: http://www.geekcode.com/geek.html