[vox-tech] lame question on memory allocation

Tim Riley vox-tech@lists.lugod.org
Tue, 21 Jan 2003 16:53:47 -0800


Jeff,

This response is getting away from Pete's question; however,
I would like to understand memory management better.

Jeff Newmiller wrote:

> On Tue, 21 Jan 2003, Tim Riley wrote:
>
> >
> >
> > Peter Jay Salzman wrote:
> <snip>

> > > for example, the man page says that malloc() may be required to return
> > > word aligned memory pages, so in the diagram:
> > >
> >
> > This implies that malloc() may return more memory than you request.
> > It might be that malloc( 1 ) gets you an entire page -- 4 or 8 K. However,
> > subsequent malloc( 1 )s will just fill in the existing page. When it fills
> > up you get a new page.
> >
> > The gcc compiler knows about page alignments and will allocate
> > memory requested on the stack efficiently. So, whenever I need
> > string space, I'll do: "char buffer[ 1024 ];" because I believe that
> > "char buffer[ 1000 ];" gets me 1024 anyway.
>
> a) malloc() affects heap memory, not stack memory.

This is true; however, both the heap and stack are paged. The
compiler probably aligns a stack of memory just like it aligns
a heap of memory. My example should have been "malloc( 1000 )"
vs. "malloc( 1024 )."

> <snip>

> My point is that your "optimization" is rather implementation-specific...
> it can backfire if the code is ported, yielding less efficient use of
> memory by requiring two pages instead of one.  (If you don't mind some
> platform dependencies in your search for performance, this may be par for
> the course.)
>

This is a risk I hadn't considered.

>
> <snip>

>
> > > what's a word?  :)
> > >
> >
> > A word is the minimum amount of addressable memory. So, if you
> > declare "char c;" on the stack of a 386, even though you plan on storing
> > only 8 bits, more bits are available. To figure out the word size, take
> > the base-2 log of the constant UINT_MAX + 1 (in limits.h) -- probably
> > it's 32.
>
> Actually, the minimum amount of addressable memory is more commonly
> referred to as a byte.  (Technically, the minimum amount of addressable
> memory _large enough to represent any printable character in the "C"
> locale_ is a byte... some architectures can address bits.)
>

I learned that a byte is 8 bits, no matter how many bits are available for
storage. I also learned that the CPU stores both an integer and a byte in
memory as a word. Try this test:

/* test1.c */
int main( int argc, char **argv )
{
    static char c;
}

/* test2.c */
int main( int argc, char **argv )
{
    static int c;
}

ls -l test1 test2 <-- the sizes are the same on my computer.
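
The file size may not be the best measure, since an executable also carries
headers, startup code, and padding that can swamp a few bytes of data. A
more direct check is to ask the compiler with sizeof and CHAR_BIT. This is
only a sketch, and the function names are invented for illustration:

```c
#include <limits.h>
#include <stddef.h>

/* Bits in one char (one C "byte"); the standard requires at least 8. */
int bits_in_char(void)
{
    return CHAR_BIT;
}

/* Bytes of storage for one char; always 1 by definition of sizeof. */
size_t bytes_in_char(void)
{
    return sizeof(char);
}

/* Bits of storage occupied by an unsigned int on this platform.
 * This matches the base-2 log of UINT_MAX + 1 when the type has no
 * padding bits, which is the usual case. */
size_t bits_in_unsigned_int(void)
{
    return sizeof(unsigned int) * CHAR_BIT;
}
```

Printing these three values from main() shows how much storage the compiler
assigns to "char c" and "int c", independent of how large the linked
executable turns out to be.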

> A word is the natural unit of data that can be moved from memory to a
> processor register. [1]

Right. The CPU moves words from memory to registers and back. It moves
memory in chunks of words because that is how it addresses them.

>  Most modern processors have data paths composed
> of multiple bytes, and separate "enable" signals allow smaller units of
> data than a "word" to be read or written.  This definition differs from
> the x86 version, which is fixed by convention at 16 bits to support
> upward software compatibility for certain compilers and assemblers.
>

Is this for backward compatibility with 16-bit buses? My guess is that
by now there's a "move a 32-bit word from memory to a register in
one operation" x86 instruction.

>
> It may be inefficient to move a "word" around that is not stored beginning
> with the first addressable byte in the data bus.

Hardware is not my forte, but I don't see how this can even be possible,
much less inefficient. What instruction addresses the middle of a word?
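
To make the question concrete: on a byte-addressed machine every byte has
its own address, so a 4-byte value can start at an address that is not a
multiple of 4. Here is a small sketch of that situation in C. The function
name roundtrip_at_offset() is made up, and it uses memcpy() because
dereferencing a misaligned uint32_t pointer directly is undefined behavior
in C:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: stores a 32-bit value starting `offset` bytes
 * into buf, then reads it back from the same place. The caller must
 * make sure buf has at least offset + 4 bytes. With offset = 1 the
 * value does not begin on a 4-byte boundary. */
uint32_t roundtrip_at_offset(unsigned char *buf, size_t offset,
                             uint32_t value)
{
    uint32_t out;

    memcpy(buf + offset, &value, sizeof value);  /* possibly unaligned store */
    memcpy(&out, buf + offset, sizeof out);      /* possibly unaligned load  */
    return out;
}
```

With an 8-byte buffer, roundtrip_at_offset( buf, 1, 0xDEADBEEF ) stores the
value in bytes 1 through 4, which overlaps two aligned 4-byte words. Whether
the hardware needs extra bus operations for that is the part I can't answer.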

>  Part of the word has to
> be read in one bus operation, and the rest in a second bus
> operation.

Does the data bus have to be smaller than a register for the CPU to have
to use two move operations? This probably doesn't happen any more.

<snip>