[vox-tech] late night musings: stripping

Peter Jay Salzman vox-tech@lists.lugod.org
Thu, 26 Feb 2004 09:22:12 -0800


On Thu 26 Feb 04,  8:49 AM, Mitch Patenaude said:
> Hi Pete,
> 
> 
> The difference you're seeing is one of debugging information vs. symbol 
> table.
> 
> The symbol table is used during linking, and contains the addresses of 
> function entry points and global variables.  gdb can use this to decode 
> the stack frames to tell what the call stack is, but can't give you 
> more detailed information.
 
Hi Mitch,

Thanks for the reply!  With debugging information, the finish command,
of course, resumes execution until the current stack frame returns (code
is listed below):

   p@satan$ gdb hello2
   (gdb) break myfunction 
   Breakpoint 1 at 0x8048386: file hello2.c, line 13.
   (gdb) run
   Starting program: /home/p/stuff/hello2 

   Breakpoint 1, myfunction () at hello2.c:13
   13              printf("hello world\n");
   (gdb) finish
   Run till exit from #0  myfunction () at hello2.c:13
   hello world
   main () at hello2.c:7
   7               return 0;

so we were in myfunction and "finished" until we got back to main.  From
what you said, I would've expected this to work without debugging
information, but it doesn't:

   (gdb) break main
   Breakpoint 1 at 0x804836a
   (gdb) run
   Starting program: /home/p/stuff/hello2 

   Breakpoint 1, 0x0804836a in main ()
   (gdb) stepi
   0x0804836d in main ()
   (gdb) stepi
   0x08048372 in main ()
   (gdb) stepi
   0x08048374 in main ()
   (gdb) stepi
   0x08048380 in myfunction ()
   (gdb) finish 
   Run till exit from #0  0x08048380 in myfunction ()
   hello world
   0x40045dc6 in __libc_start_main () from /lib/libc.so.6

   (gdb) bt
   #0  0x40045dc6 in __libc_start_main () from /lib/libc.so.6
   #1  0x00000001 in ?? ()

   (gdb) print &main
   $1 = (<text variable, no debug info> *) 0x8048364 <main>
   (gdb) print &myfunction 
   $2 = (<text variable, no debug info> *) 0x8048380 <myfunction>

we finished right into glibc.  shouldn't GDB have known when myfunction
returned to main, even if there's no debugging information?


> The debugging info will tell you much more, since it will allow gdb to 
> tie the PC to the source, so you can see what source line is actually 
> executing.  It will also give symbolic access to local automatic and 
> static variables, as well as allow the debugger to display more 
> complicated data structures intelligently.
> 
> The -g option only gives minimal information.  Better to use -ggdb
 
heh.  i just recently learned about -ggdb.  force of habit.  :)

> As for disabling copy protection/license checking/etc.... You're 
> right..though you need to set the appropriate return value as well, and 
> cracked versions of programs often use that technique.  However, the 
> developers know that, and they take steps to make this more difficult.  
> It starts with stripping the executable, burying the check deep in a 
> library somewhere, and making more than one check.  There are bunches 
> of other techniques as well...

what would be useful would be something like GDB which can follow a
process and collect information about:

1. control flow (what functions call what).
2. get the parameters and return values of the function calls.


the only way i know how to get #1 is to sit there, using stepi (and
possibly nexti over uninteresting libc functions) with a pencil and
paper in hand.

as for #2, seems like the only way to do that would be to disassemble
the code.  i don't know a lick of x86 assembly, but i did notice that
%eax appears to be the register for returning integers.

i rewrote myfunction to return an int of 1.

   p@satan$ gdb hello3
   (gdb) disassemble myfunction 
   Dump of assembler code for function myfunction:
   0x8048380 <myfunction>: push   %ebp
   0x8048381 <myfunction+1>:       mov    %esp,%ebp
   0x8048383 <myfunction+3>:       sub    $0x8,%esp
   0x8048386 <myfunction+6>:       movl   $0x80484b4,(%esp,1)
   0x804838d <myfunction+13>:      call   0x8048288 <printf>
   0x8048392 <myfunction+18>:      mov    $0x1,%eax
   0x8048397 <myfunction+23>:      leave  
   0x8048398 <myfunction+24>:      ret    

here, myfunction returns a float of 1.0:

   p@satan$ gdb hello4
   (gdb) disassemble myfunction 
   Dump of assembler code for function myfunction:
   0x8048382 <myfunction>: push   %ebp
   0x8048383 <myfunction+1>:       mov    %esp,%ebp
   0x8048385 <myfunction+3>:       sub    $0x8,%esp
   0x8048388 <myfunction+6>:       movl   $0x80484b4,(%esp,1)
   0x804838f <myfunction+13>:      call   0x8048288 <printf>
   0x8048394 <myfunction+18>:      fld1   
   0x8048396 <myfunction+20>:      leave  
   0x8048397 <myfunction+21>:      ret    

i'll need to google for fld1, but its general idea is clear.


anyway, it would be neat to have a program that automated all this
stuff.

pete





> On Thursday, Feb 26, 2004, at 06:08 US/Pacific, Peter Jay Salzman wrote:
> 
> >there's no point to this post, other than to share some things i found
> >interesting while playing around with some code last night.
> >
> >
> >
> >here's some code that i compiled WITHOUT an enhanced symbol table:
> >
> >
> >   #include <stdio.h>
> >   void myfunction(void);
> >
> >   int main(void)
> >   {
> >      myfunction();
> >      return 0;
> >   }
> >
> >
> >   void myfunction(void)
> >   {
> >      printf("hello world\n");
> >   }
> >
> >
> >i can still set a breakpoint at main, since that's a libc thing.  every
> >program has a main function; even ones that don't define one (like
> >fortran) are given one:
> >
> >   (gdb) break main
> >   Breakpoint 1 at 0x804836a
> >   (gdb) run
> >   Starting program: /home/p/stuff/hello2
> >
> >   Breakpoint 1, 0x0804836a in main ()
> >   (gdb) stepi
> >   0x0804836d in main ()
> >   (gdb)
> >   0x08048372 in main ()
> >   (gdb)
> >   0x08048374 in main ()
> >   (gdb)
> >   0x08048380 in myfunction ()
> >   (gdb)
> >   0x08048381 in myfunction ()
> >   (gdb)
> >   0x08048383 in myfunction ()
> >   (gdb)
> >   0x08048386 in myfunction ()
> >   (gdb)
> >   0x0804838d in myfunction ()
> >   (gdb)
> >   0x08048288 in printf ()
> >   ...
> >
> >
> >i found it odd that gdb has any concept of what i name my functions when
> >i don't specify -g.  but it obviously does:
> >
> >   (gdb) info functions
> >   All defined functions:
> >
> >   Non-debugging symbols:
> >   0x08048250  _init
> >   0x08048278  __libc_start_main
> >   0x08048288  printf
> >   0x080482c4  call_gmon_start
> >   0x080482f0  __do_global_dtors_aux
> >   0x08048330  frame_dummy
> >   0x08048364  main
> >   0x08048380  myfunction
> >   0x080483a0  __libc_csu_init
> >   0x08048400  __libc_csu_fini
> >   0x08048450  __i686.get_pc_thunk.bx
> >   0x08048460  __do_global_ctors_aux
> >   0x08048490  _fini
> >
> >although i didn't compile the function with "-g", file reports the
> >executable as unstripped:
> >
> >   p@satan$ file hello2
> >   hello2: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for
> >   GNU/Linux 2.2.0, dynamically linked (uses shared libs), not stripped
> >
> >so let's strip it:
> >
> >   p@satan$ strip hello2
> >
> >function names are gone:
> >
> >   p@satan$ gdb hello2
> >   (no debugging symbols found)...(gdb)
> >   (gdb) info functions
> >   All defined functions:
> >
> >   Non-debugging symbols:
> >   0x08048278  __libc_start_main
> >   0x08048288  printf
> >
> >
> >i'm not sure what stripped is, but my little experiment certainly hints
> >at what it is.
> >
> >i have mathematica installed on my system (legally).  it is statically
> >linked and unstripped.
> >
> >i also have the intel fortran compiler on my system (legally).  it is
> >unstripped and dynamically linked.  it's also protected by a license
> >manager called flexlm (which i've had many bad experiences with.  i've
> >had other software (legally) where flexlm decided to stop working out of
> >the blue, and at the worst possible moments).  it SHOULD be possible to
> >step through the compiler, figure out the flexlm function that
> >grants access to the program and NOP it out.
> >
> >not that i would do that.  i believe this would be illegal under the
> >DMCA.  and i have the compiler installed on my system legally.  but it
> >is an interesting thought.
> >
> >yet another thing to put on my google/reading list....
> >
> >pete
> >
> >ps- the intel compiler / debugger is non-free (it's free as in beer, not
> >free as in liberty) but very good.  i've been able to get DDD to use
> >intel's debugger (idb) as a backend.  it's not perfect, but it works
> >well enough.
> >
> >although it doesn't "support" debian, i was able to install it within
> >minutes.  a combination of "alien" and looking at some bash scripts to
> >discover where certain directories are made it a snap to install.

-- 
Make everything as simple as possible, but no simpler.  -- Albert Einstein
GPG Instructions: http://www.dirac.org/linux/gpg
GPG Fingerprint: B9F1 6CF3 47C4 7CD8 D33E 70A9 A3B9 1945 67EA 951D