[vox-tech] Need X Windows performance monitoring help

Fri, 16 Aug 2002 00:32:05 -0700

begin Eric Nelson <en77@attbi.com> 
> On Thursday 15 August 2002 21:19, Peter Jay Salzman wrote:
> > >
> > > My question is, how can we time stamp events,
> >
> > what do you mean by "time stamp"?  depending on what you mean, i
> > would either recommend using the "time" command or the "top"
> > command.
> 
> What I mean is, 
> request to show image -> Xserver 10:43:42.234
> Xserver sends commands to hardware 10:43:42.678
> image shows 10:43:43.012
> etc.  
> In the time from start to finish, where are the hangups.  In other 
> words, do we need to improve the graphics adapter, the Xserver, add 
> ram, what?

holy cow, that's hardcore.  you can get that info, but it'll be a mess
and a bitch to implement.

there are two profiling tools i've used: gprof (gnu) and pgprof (pgi).
the pgi version has a graphical interface.  they both work the same way;
they do _almost_ what you want.

i say "almost", but an inch is as bad as a mile in this case.

first, you'll need to recompile the X server with profiling enabled
(-pg for gprof), similarly to how you compile binaries with -g to get
an augmented symbol table for gdb.

they'll profile user space functions, not:

   system calls made on X's behalf (like DRI/DRM).  i think these are
   atomic operations as far as a profiler is concerned.

   events like "image shows up at this time".  you'll need to gain some
   kind of familiarity with the X source to translate function calls to
   a meaningful event.  i've never looked at the code.  maybe it would
   be obvious.  i dunno.

in either case, you're talking about a BOATLOAD of hard work.
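
for what it's worth, if all you need is the coarse timeline eric
describes (request sent, reply back) rather than a full profile, you can
stamp events yourself from the client side.  a minimal sketch in python,
assuming you can call something at each point of interest.  the Timeline
class and the event labels are made up for illustration and aren't part
of X or any library.  it only sees when your own code reaches each mark,
not when the pixels actually land on the screen.

```python
import time


class Timeline:
    """collects (label, seconds) pairs from a monotonic clock."""

    def __init__(self, clock=time.perf_counter):
        # the clock is injectable so the class can be tested with fake times
        self.clock = clock
        self.events = []

    def mark(self, label):
        """record the current clock reading under a label."""
        self.events.append((label, self.clock()))

    def deltas(self):
        """return (from label, to label, elapsed seconds) per consecutive pair."""
        return [
            (a[0], b[0], b[1] - a[1])
            for a, b in zip(self.events, self.events[1:])
        ]


if __name__ == "__main__":
    t = Timeline()
    t.mark("request sent")      # hypothetical event names
    time.sleep(0.01)            # stand-in for the real work
    t.mark("reply received")
    for start, end, dt in t.deltas():
        print(f"{start} -> {end}: {dt * 1000:.3f} ms")
```

the biggest delta tells you which stage to look at first, which is about
as far as this approach goes.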

> We can film the screen, and see how much time things take there, but 
> where are other delays? 
> >
> > > or get a trace of memory
> >
> > i'm not exactly sure what a memory trace is, either.   but if it
> > helps, take a look at /proc/meminfo and see if that shows what you
> > want.
> >
> meminfo and ps have good data, but you would like to see what they 
> say, each, say 10 milliseconds.  

10 ms would be the finest granularity you'll get.  the linux kernel
programs the internal clock to issue a timer interrupt every 10
milliseconds.

i don't know if that's x86 specific or just linux in general.  i suspect
it's general.  that means the kernel can't keep time more accurately
than 10 ms.
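
polling /proc/meminfo on a timer is easy enough to sketch.  this is a
rough python version, assuming the "Key: value" line layout and the
MemFree and SwapFree keys.  the function names, the default interval and
the sample count are my own choices for illustration.  the sleep is only
a request, so the real spacing between samples depends on the scheduler
and may well be longer than what you ask for.

```python
import time


def parse_meminfo(text):
    """parse 'Key:   value [unit]' lines into {key: int}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # skip lines without a colon or without a leading number
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def sample(path="/proc/meminfo", interval=0.01, count=5,
           keys=("MemFree", "SwapFree")):
    """read the file `count` times, pausing `interval` seconds between reads."""
    rows = []
    for _ in range(count):
        with open(path) as f:
            info = parse_meminfo(f.read())
        # wall-clock stamp plus the selected values (None if a key is missing)
        rows.append((time.time(), {k: info.get(k) for k in keys}))
        time.sleep(interval)
    return rows


if __name__ == "__main__":
    for stamp, values in sample():
        print(f"{stamp:.3f} {values}")
```

comparing consecutive stamps in the output shows how close to the
requested interval you actually got.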

> Taking the analytical approach, we can put hooks in the code, say 
> toggle a bit on the parallel port, and watch w/ a logic analyzer, 
> but, whew, what a job.  I was hoping to get some type of strace w/ 
> time stamps.

egads.  i think it might be easier to design your own custom made
microprocessor for the job!   ;)

> The kind of ironic thing is that, w/ 8 bit processors, a good analyzer 
> will do all these things.  Now, with the complexity of modern OS's, 
> it's almost like going back to the very early days of microprocessor 
> development, as far as tools go.  Like, printk is about the most 
> powerful tool out there :~)

> > if that's not what you want, take a look at top.
> >
> > > and swap usage,
> >
> > ok, that i know.   try:
> >
> >    cat /proc/swaps
> >
> > to get how much swap is being used.  if you want to know what
> > processes have been swapped to disk, it's good old "ps".  see "man
> > ps" and do a search on "swap".
> 
> I read somewhere that maybe the libraries swap out.  How can I tell?

no, i don't think this is right.  i mean, why bother?  a copy of the
library is already on disk.  it doesn't make sense to swap out something
that's already on disk.  look at a typical /proc/<pid>/maps file:

p@satan% cat 326/maps 
08048000-08077000 r-xp 00000000 03:05 258566   /usr/X11R6/bin/xterm
08077000-0807f000 rw-p 0002e000 03:05 258566   /usr/X11R6/bin/xterm
0807f000-080dd000 rwxp 00000000 00:00 0
40000000-40013000 r-xp 00000000 03:05 225817   /lib/ld-2.2.5.so
40013000-40014000 rw-p 00013000 03:05 225817   /lib/ld-2.2.5.so
40015000-40017000 r-xp 00000000 03:05 1548     /usr/lib/gconv/ISO8859-1.so
40017000-40018000 rw-p 00001000 03:05 1548     /usr/lib/gconv/ISO8859-1.so
40023000-40038000 r-xp 00000000 03:05 81876    /usr/X11R6/lib/libXft.so.1.1
40038000-4003a000 rw-p 00014000 03:05 81876    /usr/X11R6/lib/libXft.so.1.1

this field                       ^

shows the device the shared library lives on: major number 3 (hda),
minor number 5 (partition 5).

swapping only makes sense for things like stacks and heaps.  data that
you can write to.

however, each shared library is loaded into 2 maps: one which is
executable, and one which is writable.  i don't know why that is.  but
maybe you're right.  maybe the writable portion does get swapped.
i dunno.   if you find out, i'd like to know.
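
to poke at this, the maps lines can be picked apart mechanically.  a
rough python sketch, assuming the six-column layout shown above, with
the device field as hex major:minor.  the Mapping tuple and the describe
labels are my own invention.  the labels encode my understanding, which
could be off: a private writable mapping (rw-p) is copy-on-write, so
pages the process has modified no longer match the file and would have
to go to swap, while untouched and read-only pages can just be re-read
from disk.

```python
from collections import namedtuple

Mapping = namedtuple("Mapping", "start end perms offset major minor inode path")


def parse_maps_line(line):
    """split one /proc/<pid>/maps line into its fields."""
    fields = line.split(None, 5)
    addr, perms, offset, dev, inode = fields[:5]
    # anonymous mappings have no path column
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    major, minor = (int(x, 16) for x in dev.split(":"))
    return Mapping(start, end, perms, int(offset, 16), major, minor,
                   int(inode), path)


def describe(m):
    """heuristic label only; it does not ask the kernel what is in swap."""
    if m.inode == 0:
        return "anonymous"
    if "w" in m.perms and m.perms.endswith("p"):
        return "file-backed, private writable"
    return "file-backed, read-only or shared"


if __name__ == "__main__":
    with open("/proc/self/maps") as f:
        for line in f:
            m = parse_maps_line(line)
            size_kb = (m.end - m.start) // 1024
            print(f"{size_kb:8d} kB  {m.major}:{m.minor}  {describe(m):34s} {m.path}")
```

this only classifies mappings by their permissions.  it says nothing
about which pages are actually sitting in swap right now.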

by the way, doesn't the "sticky bit" of a file ensure that the file
stays in core?  i thought that was the whole point of why hackers try to
get a copy of a root shell with the sticky bit set.

> > i saw rod's reply.  as usual, he makes total sense.  :)
> 
> Yes, often the empirical approach is easiest, and since you are 
> measuring what you want to measure, and not some artificial 
> byproduct, it's more accurate.  And, you get the quickest results.  
> We vary ram, clock speed, maybe even different graphics adapters.  
> I'm pretty sure that's the best, most direct approach.

rod has a nasty habit of being right about a lot of things.   :)

pete

-- 
GPG Fingerprint: B9F1 6CF3 47C4 7CD8 D33E  70A9 A3B9 1945 67EA 951D