[vox-tech] Need X Windows performance monitoring help
Peter Jay Salzman
vox-tech@lists.lugod.org
Fri, 16 Aug 2002 00:32:05 -0700
begin Eric Nelson <en77@attbi.com>
> On Thursday 15 August 2002 21:19, Peter Jay Salzman wrote:
> > >
> > > My question is, how can we time stamp events,
> >
> > what do you mean by "time stamp"? depending on what you mean, i
> > would either recommend using the "time" command or the "top"
> > command.
>
> What I mean is,
> request to show image -> Xserver 10:43:42.234
> Xserver sends commands to hardware 10:43:42.678
> image shows 10:43:43.012
> etc.
> In the time from start to finish, where are the hangups. In other
> words, do we need to improve the graphics adapter, the Xserver, add
> ram, what?
holy cow, that's hardcore. you can get that info, but it'll be a mess
and a bitch to implement.
there are two profiling tools i've used: gprof (gnu) and pgprof (pgi).
the pgi version has a graphical interface. they both work the same way;
they do _almost_ what you want.
i say "almost", but an inch is as bad as a mile in this case.
first, you'll need to recompile the X server. similarly to how you
compile binaries with an augmented symbol table for gdb. you'll need to
add profiling data.
they'll profile user space functions, not:
system calls made on X's behalf (like DRI/DRM). i think these are
atomic operations as far as a profiler is concerned.
events like "image shows up at this time". you'll need to gain some
kind of familiarity with the X source to translate function calls to
a meaningful event. i've never looked at the code. maybe it would
be obvious. i dunno.
in either case, you're talking about a BOATLOAD of hard work.
> We can film the screen, and see how much time things take there, but
> where are other delays?
> >
> > > or get a trace of memory
> >
> > i'm not exactly sure what a memory trace is, either. but if it
> > helps, take a look at /proc/meminfo and see if that shows what you
> > want.
> >
> meminfo and ps have good data, but you would like to see what they
> say, each, say 10 milliseconds.
10 ms would be the lowest granularity you'll get. the linux kernel
programs the internal clock to issue a timer interrupt every 10
miliseconds.
i don't know if that's x86 specific or just linux in general. i suspect
it's general. that means the kernel can't keep time more accurate than
10 ms.
> Taking the analytical approach, we can put hooks in the code, say
> toggle a bit on the parallel port, and watch w/ a logic analyzer,
> but, whew, what a job. I was hoping to get some type of strace w/
> time stamps.
egads. i think it might be easier to design your own custom made
microprocessor for the job! ;)
> The kind of ironic thing is that, w/ 8 bit processors, a good analyzer
> will do all these things. Now, with the complexity of modern OS's,
> it's almost like going back to the very early days of microprocessor
> development, as far as tools go. Like, printk is about the most
> powerful tool out there :~)
> > if that's not what you want, take a look at top.
> >
> > > and swap usage,
> >
> > ok, that i know. try:
> >
> > cat /proc/swaps
> >
> > to get how much swap is being used. if you want to know what
> > processes have been swaped to disk, it's good old "ps". see "man
> > ps" and do a search on "swap".
>
> I read somewhere that maybe the libraries swap out. How can I tell?
no, i don't think this is right. i mean, why bother? a copy of the
library is already on disk. it doesn't make sense to swap out something
that's already on disk. look at a typical /proc/<pid>/map file:
p@satan% cat 326/maps
08048000-08077000 r-xp 00000000 03:05 258566 /usr/X11R6/bin/xterm
08077000-0807f000 rw-p 0002e000 03:05 258566 /usr/X11R6/bin/xterm
0807f000-080dd000 rwxp 00000000 00:00 0
40000000-40013000 r-xp 00000000 03:05 225817 /lib/ld-2.2.5.so
40013000-40014000 rw-p 00013000 03:05 225817 /lib/ld-2.2.5.so
40015000-40017000 r-xp 00000000 03:05 1548 /usr/lib/gconv/ISO8859-1.so
40017000-40018000 rw-p 00001000 03:05 1548 /usr/lib/gconv/ISO8859-1.so
40023000-40038000 r-xp 00000000 03:05 81876 /usr/X11R6/lib/libXft.so.1.1
40038000-4003a000 rw-p 00014000 03:05 81876 /usr/X11R6/lib/libXft.so.1.1
this field ^
shows the location of the shared library on disk. major number 3 (hda),
minor number 5 (partition 5).
swapping only makes sense for things like stacks and heaps. data that
you can write to.
however, each shared library is loaded into 2 maps: one which is
executable, and one which is writable. i don't know why that is. but
maybe you're right. maybe the writable portion copy does get swapped.
i dunno. if you find out, i'd like to know.
by the way, doesn't the "sticky bit" of a file ensure that the file
stays in core? i thought that was the whole point of why hackers try to
get a copy of a root shell with the sticky bit set.
> > i saw rod's reply. as usual, he makes total sense. :)
>
> Yes, often the empirical approach is easiest, and since you are
> measuring what you want to measure, and not some artificial
> byproduct, it's more accurate. And, you get the quickest results.
> We vary ram, clock speed, maybe even different graphics adapters.
> I'm pretty sure that's the best, most direct approach.
rod has a nasty habit of being right about alot of things. :)
pete
--
GPG Fingerprint: B9F1 6CF3 47C4 7CD8 D33E 70A9 A3B9 1945 67EA 951D