[vox-tech] late night musings: stripping
Peter Jay Salzman
vox-tech@lists.lugod.org
Thu, 26 Feb 2004 09:22:12 -0800
On Thu 26 Feb 04, 8:49 AM, Mitch Patenaude said:
> Hi Pete,
>
>
> The difference you're seeing is one of debugging information vs. symbol
> table.
>
> The symbol table is used during linking, and contains the addresses of
> function entry points and global variables. gdb can use this to decode
> the stack frames to tell what the call stack is, but can't give you
> more detailed information.

Hi Mitch,

Thanks for the reply! With debugging information, the finish command,
of course, resumes execution until the current stack frame returns (code
is listed below):

   p@satan$ gdb hello2
   (gdb) break myfunction
   Breakpoint 1 at 0x8048386: file hello2.c, line 13.
   (gdb) run
   Starting program: /home/p/stuff/hello2

   Breakpoint 1, myfunction () at hello2.c:13
   13         printf("hello world\n");
   (gdb) finish
   Run till exit from #0 myfunction () at hello2.c:13
   hello world
   main () at hello2.c:7
   7          return 0;

so we were in myfunction and "finished" until we got back to main. From
what you said, I would've expected this to work without debugging
information, but it doesn't:

   (gdb) break main
   Breakpoint 1 at 0x804836a
   (gdb) run
   Starting program: /home/p/stuff/hello2

   Breakpoint 1, 0x0804836a in main ()
   (gdb) stepi
   0x0804836d in main ()
   (gdb) stepi
   0x08048372 in main ()
   (gdb) stepi
   0x08048374 in main ()
   (gdb) stepi
   0x08048380 in myfunction ()
   (gdb) finish
   Run till exit from #0 0x08048380 in myfunction ()
   hello world
   0x40045dc6 in __libc_start_main () from /lib/libc.so.6
   (gdb) bt
   #0 0x40045dc6 in __libc_start_main () from /lib/libc.so.6
   #1 0x00000001 in ?? ()
   (gdb) print &main
   $1 = (<text variable, no debug info> *) 0x8048364 <main>
   (gdb) print &myfunction
   $2 = (<text variable, no debug info> *) 0x8048380 <myfunction>

we finished right into glibc. shouldn't GDB have known when myfunction
returned to main, even if there's no debugging information?
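
a possible explanation, offered as a guess rather than a statement about
gdb internals: 0x08048380 is the very first instruction of myfunction,
before `push %ebp` has run, so %ebp still describes main's frame. on i386
the return address into main sits at the top of the stack at that moment,
but a debugger that trusts %ebp when it has no debug info would pick up
main's own return address instead, and that one lies inside
__libc_start_main. the sketch below only shows that the return address is
available at run time with no debugging information at all. it relies on
the GCC builtin __builtin_return_address, and the function names are made
up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of hello2.c. Marked noinline so that it
   is a real call with a real return address. */
__attribute__((noinline))
void *where_do_i_return(void)
{
    /* GCC builtin: the address this function will return to. */
    return __builtin_return_address(0);
}

/* Prints the address that where_do_i_return() goes back to. */
void show_return_address(void)
{
    void *ra = where_do_i_return();

    assert(ra != NULL);   /* a called function always has one */
    printf("where_do_i_return goes back to %p\n", ra);
}
```

the printed address should be the instruction right after the call to
where_do_i_return inside show_return_address, which can be checked
against gdb's disassemble output.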
> The debugging info will tell you much more, since it will allow gdb to
> tie the PC to the source, so you can see what source line is actually
> executing. It will also give symbolic access to local automatic and
> static variables, as well as allow the debugger to display more
> complicated data structures intelligently.
>
> The -g option only gives minimal information. Better to use -ggdb
heh. i just recently learned about -ggdb. force of habit. :)
> As for disabling copy protection/license checking/etc.... You're
> right, though you need to set the appropriate return value as well, and
> cracked versions of programs often use that technique. However, the
> developers know that, and they take steps to make this more difficult.
> It starts with stripping the executable, burying the check deep in a
> library somewhere, and making more than one check. There are bunches
> of other techniques as well...
what would be useful would be something like GDB which can follow a
process and collect information about:
1. control flow (what functions call what).
2. the parameters and return values of the function calls.
the only way i know how to get #1 is to sit there, using stepi (and
possibly nexti over uninteresting libc functions) with a pencil and
paper in hand.
as for #2, seems like the only way to do that would be to disassemble
the code. i don't know a lick of x86 assembly, but i did notice that
%eax appears to be the register for returning integers.

i rewrote myfunction to return an int of 1.

   p@satan$ gdb hello3
   (gdb) disassemble myfunction
   Dump of assembler code for function myfunction:
   0x8048380 <myfunction>:     push   %ebp
   0x8048381 <myfunction+1>:   mov    %esp,%ebp
   0x8048383 <myfunction+3>:   sub    $0x8,%esp
   0x8048386 <myfunction+6>:   movl   $0x80484b4,(%esp,1)
   0x804838d <myfunction+13>:  call   0x8048288 <printf>
   0x8048392 <myfunction+18>:  mov    $0x1,%eax
   0x8048397 <myfunction+23>:  leave
   0x8048398 <myfunction+24>:  ret

here, myfunction returns a float of 1.0:

   p@satan$ gdb hello4
   (gdb) disassemble myfunction
   Dump of assembler code for function myfunction:
   0x8048382 <myfunction>:     push   %ebp
   0x8048383 <myfunction+1>:   mov    %esp,%ebp
   0x8048385 <myfunction+3>:   sub    $0x8,%esp
   0x8048388 <myfunction+6>:   movl   $0x80484b4,(%esp,1)
   0x804838f <myfunction+13>:  call   0x8048288 <printf>
   0x8048394 <myfunction+18>:  fld1
   0x8048396 <myfunction+20>:  leave
   0x8048397 <myfunction+21>:  ret

i'll need to google for fld1, but its general idea is clear.
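
for the record, fld1 pushes the constant +1.0 onto the x87 floating point
register stack, and the i386 calling convention returns floating point
values in the top x87 register, st(0), the same way it returns integers
in %eax. the source of hello3 and hello4 isn't shown in this message, so
the following is only a sketch of what the two variants might look like.
the names myfunction_int, myfunction_float and check_variants are made up
so that both versions fit in one file.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for hello3's myfunction. On i386 the int result travels
   back to the caller in %eax (the `mov $0x1,%eax` in the listing). */
int myfunction_int(void)
{
    printf("hello world\n");
    return 1;
}

/* Stand-in for hello4's myfunction. On i386 the float result is left
   in st(0), which is what the `fld1` in the listing sets up. */
float myfunction_float(void)
{
    printf("hello world\n");
    return 1.0f;
}

/* Quick sanity check that both variants hand back the value 1. */
void check_variants(void)
{
    assert(myfunction_int() == 1);
    assert(myfunction_float() == 1.0f);
}
```

on other targets the registers differ, so the disassembly will not look
the same outside 32-bit x86.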
anyway, it would be neat to have a program that automated all this
stuff.
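
for the control flow half (#1), and only when the source is available, one
option is GCC's -finstrument-functions, which makes the compiler call two
hook functions on every function entry and exit. this is a different
technique from following a stripped binary with gdb, and it needs a
recompile, so treat it as a sketch of the idea rather than the tool
described above. the hook names __cyg_profile_func_enter and
__cyg_profile_func_exit are the ones GCC documents; trace_depth and
trace_current_depth are made-up helpers.

```c
#include <assert.h>
#include <stdio.h>

/* Current call depth. Hypothetical name, used only for indentation. */
static int trace_depth = 0;

/* Lets a caller inspect the depth. Excluded from instrumentation so
   the hooks never trace themselves. */
__attribute__((no_instrument_function))
int trace_current_depth(void)
{
    return trace_depth;
}

/* Called by GCC-generated code on entry to every function that was
   compiled with -finstrument-functions. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    fprintf(stderr, "%*senter %p (called from %p)\n",
            2 * trace_depth, "", this_fn, call_site);
    trace_depth++;
}

/* Called just before every instrumented function returns. */
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)call_site;              /* unused in this sketch */
    if (trace_depth > 0)
        trace_depth--;
    assert(trace_depth >= 0);     /* depth never goes negative */
    fprintf(stderr, "%*sexit  %p\n", 2 * trace_depth, "", this_fn);
}
```

linked into a program built with `gcc -finstrument-functions`, every call
prints an indented enter/exit line to stderr. on a build where the
addresses aren't relocated at load time, the raw addresses can be matched
against gdb's `info functions` listing. it says nothing about parameters
or return values (#2).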
pete
> On Thursday, Feb 26, 2004, at 06:08 US/Pacific, Peter Jay Salzman wrote:
>
> >there's no point to this post, other than to share some things i found
> >interesting while playing around with some code last night.
> >
> >
> >
> >here's some code that i compiled WITHOUT an enhanced symbol table:
> >
> >
> > #include <stdio.h>
> > void myfunction(void);
> >
> > int main(void)
> > {
> >    myfunction();
> >    return 0;
> > }
> >
> >
> > void myfunction(void)
> > {
> >    printf("hello world\n");
> > }
> >
> >
> >i can still set a breakpoint at main, since that's a libc thing. every
> >program has a main function, even ones that don't have a main function
> >are given a main function (like fortran):
> >
> > (gdb) break main
> > Breakpoint 1 at 0x804836a
> > (gdb) run
> > Starting program: /home/p/stuff/hello2
> >
> > Breakpoint 1, 0x0804836a in main ()
> > (gdb) stepi
> > 0x0804836d in main ()
> > (gdb)
> > 0x08048372 in main ()
> > (gdb)
> > 0x08048374 in main ()
> > (gdb)
> > 0x08048380 in myfunction ()
> > (gdb)
> > 0x08048381 in myfunction ()
> > (gdb)
> > 0x08048383 in myfunction ()
> > (gdb)
> > 0x08048386 in myfunction ()
> > (gdb)
> > 0x0804838d in myfunction ()
> > (gdb)
> > 0x08048288 in printf ()
> > ...
> >
> >
> >i found it odd that gdb has any concept of what i name my functions when
> >i don't specify -g. but it obviously does:
> >
> > (gdb) info functions
> > All defined functions:
> >
> > Non-debugging symbols:
> > 0x08048250 _init
> > 0x08048278 __libc_start_main
> > 0x08048288 printf
> > 0x080482c4 call_gmon_start
> > 0x080482f0 __do_global_dtors_aux
> > 0x08048330 frame_dummy
> > 0x08048364 main
> > 0x08048380 myfunction
> > 0x080483a0 __libc_csu_init
> > 0x08048400 __libc_csu_fini
> > 0x08048450 __i686.get_pc_thunk.bx
> > 0x08048460 __do_global_ctors_aux
> > 0x08048490 _fini
> >
> >although i didn't compile the function with "-g", file reports the
> >executable as unstripped:
> >
> > p@satan$ file hello2
> > hello2: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for
> > GNU/Linux 2.2.0, dynamically linked (uses shared libs), not stripped
> >
> >so let's strip it:
> >
> > p@satan$ strip hello2
> >
> >function names are gone:
> >
> > p@satan$ gdb hello2
> > (no debugging symbols found)...(gdb)
> > (gdb) info functions
> > All defined functions:
> >
> > Non-debugging symbols:
> > 0x08048278 __libc_start_main
> > 0x08048288 printf
> >
> >
> >i'm not sure what stripped is, but my little experiment certainly hints
> >at what it is.
> >
> >i have mathematica installed on my system (legally). it is statically
> >linked and unstripped.
> >
> >i also have the intel fortran compiler on my system (legally). it is
> >unstripped and dynamically linked. it's also protected by a license
> >manager called flexlm (which i've had many bad experiences with. i've
> >had other software (legally) where flexlm decided to stop working out of
> >the blue, and at the worst possible moments). it SHOULD be possible to
> >step through the compiler, figure out the flexlm function that
> >grants access to the program and NOP it out.
> >
> >not that i would do that. i believe this would be illegal under the
> >DMCA. and i have the compiler installed on my system legally. but it
> >is an interesting thought.
> >
> >yet another thing to put on my google/reading list....
> >
> >pete
> >
> >ps- the intel compiler / debugger is non-free (it's free as in beer, not
> >free as in liberty) but very good. i've been able to get DDD to use
> >intel's debugger (idb) as a backend. it's not perfect, but it works
> >well enough.
> >
> >although it doesn't "support" debian, i was able to install it within
> >minutes. a combination of "alien" and looking at some bash scripts to
> >discover where certain directories are made it a snap to install.
--
Make everything as simple as possible, but no simpler. -- Albert Einstein
GPG Instructions: http://www.dirac.org/linux/gpg
GPG Fingerprint: B9F1 6CF3 47C4 7CD8 D33E 70A9 A3B9 1945 67EA 951D