[vox-tech] Need X Windows performance monitoring help

Eric Nelson vox-tech@lists.lugod.org
Tue, 27 Aug 2002 21:50:04 -0700


On Tuesday 27 August 2002 00:17, Bill Broadley wrote:
> On Thu, Aug 15, 2002 at 07:16:52PM -0700, Eric Nelson wrote:
> > We are developing an embedded system which uses Java and X
> > Windows. It doesn't have a VGA.
> >
> > When we run some code on a desktop with lots of ram and vga, we
> > get pretty good performance, but on our embedded system,
> > performance is poor.
>
> First of all I'd like to advocate the IBM JVM for x86 if it fits
> your license requirements.  I've been playing europa (a java
> xbattle-like game), and among all the people who play, IBM's JVM is
> the only one that seems entirely stable; it's quite fast as well.

I agree.  The IBM product (VisualAge ??) seems to be far better
than the Sun product.  However, the decision makers in our company
want to use Sun because we already have a product using it.  Go
figure.
>
> > There are several factors which may contribute to this problem:
> > swapping of the libraries or X Server itself, latency in the X
>
> vmstat is a good place to start; I'd read the man page fully.  Try
> vmstat or vmstat 1 (1 data point per second).

I will.
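
For logging memory and swap from inside a program rather than from
vmstat, a minimal sketch in C follows.  It assumes a Linux /proc
filesystem whose /proc/meminfo has "MemFree:" and "SwapFree:" lines;
the function names meminfo_kb and log_meminfo are made up for this
illustration and are not from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Look up one "Key:   value kB" line in /proc/meminfo-style text.
   Returns the value in kB, or -1 if the key is absent. */
long
meminfo_kb (const char *text, const char *key)
{
  size_t klen;
  const char *p = text;

  assert (text != NULL && key != NULL);
  klen = strlen (key);
  while (p && *p)
    {
      long val;

      if (strncmp (p, key, klen) == 0 && p[klen] == ':'
          && sscanf (p + klen + 1, "%ld", &val) == 1)
        return val;
      p = strchr (p, '\n');     /* advance to the next line */
      if (p)
        p++;
    }
  return -1;
}

/* Read /proc/meminfo once and print free RAM and free swap.
   Returns 0 on success, -1 if the file cannot be opened. */
int
log_meminfo (void)
{
  char buf[8192];
  size_t n;
  FILE *f = fopen ("/proc/meminfo", "r");

  if (!f)
    return -1;
  n = fread (buf, 1, sizeof buf - 1, f);
  fclose (f);
  buf[n] = '\0';
  printf ("MemFree %ld kB  SwapFree %ld kB\n",
          meminfo_kb (buf, "MemFree"), meminfo_kb (buf, "SwapFree"));
  return 0;
}
```

Calling log_meminfo() once a second from a low-priority thread gives a
rough trace to line up against the moments the UI feels slow.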
>
> Keep in mind that with linux you have a unified cache, and that
> for read-only binaries without enough memory you get ONLY
> page-ins for code.  Since the binaries are read-only, their pages
> are simply invalidated rather than written out.  So for a 32 MB
> binary and 24 MB ram you will continuously keep reading from
> storage to try to get the 32 MB in memory (assuming the right
> memory access pattern).  As opposed to traditional swapping where
> each page miss results in a read and a write (which will happen if
> you try to have too much data in memory).

I really don't understand.  Does this mean the binaries continually
swap in and out?
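
The 32 MB binary in 24 MB of ram case can be illustrated with a small
simulation.  This is only a sketch: it assumes strict
least-recently-used eviction (the real kernel uses an approximation),
treats one "page" as 1 MB for round numbers, and the name
count_page_ins is invented here.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_FRAMES 256

/* Count page-ins when `passes` sequential sweeps over `pages` read-only
   pages run with only `frames` page frames, using strict LRU eviction.
   Evicting a clean page costs no write, so every miss is one read. */
long
count_page_ins (int pages, int frames, int passes)
{
  int resident[MAX_FRAMES];     /* page number held by each frame */
  long last_use[MAX_FRAMES];    /* time of last access; 0 = empty */
  long now = 0, page_ins = 0;
  int i, pass, page;

  assert (frames > 0 && frames <= MAX_FRAMES);
  for (i = 0; i < frames; i++)
    {
      resident[i] = -1;
      last_use[i] = 0;
    }

  for (pass = 0; pass < passes; pass++)
    for (page = 0; page < pages; page++)
      {
        int hit = -1, victim = 0;

        now++;
        for (i = 0; i < frames; i++)
          {
            if (resident[i] == page)
              hit = i;
            if (last_use[i] < last_use[victim])
              victim = i;
          }
        if (hit < 0)
          {
            /* miss: drop the least recently used page, read ours in */
            resident[victim] = page;
            page_ins++;
            hit = victim;
          }
        last_use[hit] = now;
      }
  return page_ins;
}
```

With count_page_ins (32, 24, 10) every one of the 320 accesses is a
page-in, because each page is evicted just before it is needed again.
With 32 frames the same run costs only the initial 32 reads.  So yes,
under that access pattern the code pages are read in over and over,
but nothing is ever written out.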
>
> Of course embedded systems often have much slower I/O systems than
> your average IDE disk, which exacerbates the problems.
>

Well, I would think the flash reads are faster than IDE.

> Flash can be VERY slow to write, you should definitely be VERY
> careful to ensure all writes are non-blocking for any kind of
> performance/realtime requirements.
>

I'm going to use a ram drive for, say, /tmp and part of /var, to
minimize writes to flash.  Also, flash wears out, and you can get
file corruption, but that's another thread.

> > Server, processor speed, Java, the video driver, etc.  So maybe
> > we need more ram, or a faster video driver, or to lock X into
> > ram, or simply go to a faster clock.
> >
> > My question is, how can we time stamp events, or get a trace of
> > memory and swap usage, or whatever necessary, to pin down where
> > the problem is?
>
> It's fairly easy to write a cycle counter to get very accurate
> timing that will not affect the system much, although you don't
> mention the cpu you're using; afaik only Pentiums and higher have
> the traditional cycle counter.  I use something like:
>
> /* Return the cycle counter in an unsigned long long.  If you have
>    similar code for any other architecture/os please email me. */
> static inline unsigned long long
> getcycle (void)
> {
>   unsigned long high, low;
>
>   /* 0x0f,0x31 is the rdtsc opcode: low half in eax, high in edx */
>   __asm__ __volatile__ (".byte 0x0f,0x31"
>                         : "=a" (low), "=d" (high));
>
>   return (((unsigned long long) high << 32) + low);
> }
>
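
A rough sketch of how such a counter might be wrapped around a piece
of code to see where the time goes.  The helper time_call and the
busy_work workload are hypothetical names for this example, and the
clock_gettime fallback for non-x86 cpus is an assumption added here,
not part of the code quoted above.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Cycle counter: rdtsc on x86, else a nanosecond clock as a stand-in. */
static inline unsigned long long
getcycle (void)
{
#if defined(__i386__) || defined(__x86_64__)
  unsigned int high, low;

  __asm__ __volatile__ ("rdtsc" : "=a" (low), "=d" (high));
  return ((unsigned long long) high << 32) + low;
#else
  struct timespec ts;

  clock_gettime (CLOCK_MONOTONIC, &ts);
  return (unsigned long long) ts.tv_sec * 1000000000ULL + ts.tv_nsec;
#endif
}

/* Hypothetical helper: run fn(arg) once and return the ticks it took. */
unsigned long long
time_call (void (*fn) (void *), void *arg)
{
  unsigned long long start, end;

  assert (fn != 0);
  start = getcycle ();
  fn (arg);
  end = getcycle ();
  return end - start;
}

/* Example workload: sum 0..n-1 into a volatile so the loop is kept. */
void
busy_work (void *arg)
{
  volatile unsigned long sum = 0;
  unsigned long i, n = *(unsigned long *) arg;

  for (i = 0; i < n; i++)
    sum += i;
}
```

Something like time_call (busy_work, &n) returns the elapsed ticks for
one call.  The numbers are only comparable on the same machine, and a
context switch in the middle of the measured region will inflate them,
so it is worth repeating each measurement a few times.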

Yeah, but I have deadlines, commitments, a boss who doesn't appreciate
the beauty of really solving a problem how it should be done.  I
would enjoy hacking the code like that, but, to do this w/ X will
require understanding how it all fits together.  Egads!

As Tom Christiansen says,
there are three qualities of a good programmer:
laziness
impatience
hubris
I've got plenty of the first two.  But, I appreciate the suggestion
and will remember it for when I do my own drivers (soon).

>
> If you're not swapping (and you shouldn't be), I'd sprinkle getcycle
> calls throughout your code and track where the time is.  I'm not
> that familiar with java profiling, but it's not too hard to call a
> c function from java (unless it's an applet).

We won't have swap.  But, I have heard, X can lock up when ram runs
low.  We will be checking for that.  I will be learning how to
interface C to Java soon, when we interface our drivers to Java.
>
> So what cpu are you targeting (intel x86? clone?)  Application?
> Ram? Storage? (flash?)  Vram?  How much?  Video chipset? Any?
> Supporting bitblt?  Even the most basic bitblt support can make a
> radical difference for anything involving scrolling.

What is bitblt?  The cpu is Geode, by National, w/ Cyrix CS5530
companion.  It's an oddball processor, not so well supported by
Linux.  It's a point of sale terminal w/ touchscreen, mag. stripe
reader, several serial ports, networking, database.  I'm having
problems w/ audio, console, X server.  But, it's low power, (no
fans), low mips :-(  , and I am getting things to work, with a lot of
coaxing.  We will use 128 to 256 M ram.  We will use disk on chip for
main storage and compact flash for backup.  Probably 128 Meg on the
DOC.  Would like to use JFFS2, Journalling Flash File System,
(another thread) but will probably start w/ ext3.

To be honest, another guy is worrying about performance now, but I am
sure it will be a recurring issue.

>
> Send us more detail and we might be able to offer other
> suggestions.

As the project continues, I am sure there will be plenty of
interesting problems, and you guys are great to present questions to.
;~)