[vox-tech] Memory addressing?
Ken Bloom
kbloom at gmail.com
Wed Jun 23 11:36:54 PDT 2010
On Wed, 2010-06-23 at 10:42 -0700, timriley at appahost.com wrote:
> > -------- Original Message --------
> > Subject: Re: [vox-tech] Memory addressing?
> > From: "Chanoch (Ken) Bloom" <kbloom at gmail.com>
> > Date: Tue, June 22, 2010 9:46 am
> > To: lugod's technical discussion forum <vox-tech at lists.lugod.org>
> >
> >
> > On Tue, Jun 22, 2010 at 09:11:44AM -0700, Brian Lavender wrote:
> > > Can someone confirm what is correct?
> > >
> > > Tim and I were discussing memory addressing at Crepeville last night
> > > and we had a disagreement about how memory is addressable. I say that
> > > on today's common intel i386 32 bit architecture (in case you are one of
> > > those souls who builds your hardware from scratch), that memory is byte
> > > (octet) addressable. You can load a byte from memory into the lower 8
> > > bits of a register. Tim says that memory is only addressable on 32 bit
> > > word boundaries.
> > >
>
> I stand corrected. Now that I have my hardware textbook open,
> Tanenbaum 1990, I see that the smallest addressable unit on most
> computers is the byte -- 8 bits. Bytes are grouped into words. The
> word-length is the size of the registers, not registers and memory.
>
> <snip>
>
> > Consider the following program. The fact that you can get pointers to
> > arbitrary characters should be enough proof that the architecture is
> > byte addressable.
>
> Tanenbaum calls the smallest addressable unit a cell. He then lists
> examples of cell lengths for 11 computers. Before cells were
> standardized to 8 bits, they ranged from 1 bit to 60 bits per cell.
> Therefore, a 60 bits per cell computer would store a single ASCII
> character in the lowest 7 bits and set the upper 53 bits to off
> (probably). You could then allocate a pointer to address that 60 bit
> cell.
>
> The reason for the discussion was that Brian's intrusion detection
> implementation stored the incoming packets in a hash table. The key
> to the hash table was quite large -- inbound IP address, outbound IP
> address, inbound port, and outbound port.
12 bytes in IPv4, 36 bytes in IPv6.
> I thought to myself, on a very large implementation (say
> Google) the table could grow to a billion entries. Could a hash table
> store this amount in memory? Could you allocate an array of half the
> total memory? Could you allocate an array of a billion integers? Brian,
> on his laptop, couldn't allocate a billion integers. But he could
> allocate a billion characters (bytes). Since I thought both bytes and
> integers were words, and since I thought memory stored words
> like registers stored words, we had our discussion.
>
> The following failed to compile with an array-too-large error:
> int main( void )
> {
>     int table[ 1000000000 ];
>     return 0;
> }
On a 32-bit machine, a billion 4-byte ints is about 4GB, so this will
eat up nearly all of the computer's 4GB address space, including *all*
application space and some kernel space (so you can expect things to
segfault). On a 64-bit machine, it should work though.
It in fact compiles perfectly on a 64-bit machine, so what your
compiler's doing is detecting the limits of the OS and architecture
and refusing to build something that can't fit within them.
Malloc should make this compile on a 32-bit machine, but the code will
fail at run time -- malloc should return NULL.
> The following compiled successfully:
> int main( void )
> {
>     char table[ 1000000000 ];
>     return 0;
> }
> This compiles but core dumps:
> #include <stdio.h>
> int main( void )
> {
>     char table[ 1000000000 ];
>     printf( "Hello world!\n" );
> }
Stuff gets pushed on the stack after the end of the array, approximately
1GB into memory. This means you have a stack that's 1GB long when you
get inside printf, which is far more than the OS is prepared to let you
have for your stack. (This will overflow the stack whether your machine
is 32-bit or 64-bit.) Try mallocing your array instead.
(Oh, and change that printf to be
printf("%p Hello world\n", (void *)table); so you can see whether the
allocation is working.)
If you're looking at an IDS that big, you're going to need to find an
appropriate caching data structure that can write the infrequently used
parts out to disk. Or you're going to have to find some other way of
minimizing the number of packets you're keeping track of at a time.