[vox-tech] Memory addressing?

Brian Lavender brian at brie.com
Wed Jun 23 12:39:25 PDT 2010


On Wed, Jun 23, 2010 at 10:42:18AM -0700, timriley at appahost.com wrote:
> 
> The reason for the discussion was Brian's intrusion detection
> implementation stored the incoming packets in a hash
> table. The key to the hash table was quite
> large -- inbound IP address, outbound IP address, inbound port, and
> outbound port. I thought to my self, on a very large implementation (say
> Google) the table could grow to a billion entries. Could a hash table
> store this amount in memory? Could you allocate an array of half the
> total memory? Could you allocate an array of a billion integers? Brian,
> on his laptop, couldn't allocate a billion integers. But he could
> allocate a billion characters (bytes). Since I thought both bytes and
> integers were words, and since I thought memory stored words
> like registers stored words, we had our discussion.

What started this is that having a hash provides ideally a O(1).  So,
I say that to achieve this, one would ideally want to have a large array
to store your IP addresses and a hashing function that provides good
distribution. To which, Tim pointed to using a smaller array and using
chaining, because he initially thought that arrays were limited to smaller
sizes than was observed by doing some simple tests. To which I replied,
using a fixed size array and the chaining could cause the hash to decay
into a linked list which has a O(n). And to which Tim has noted using a
large fixed size array may just take up all your architectural limits
of the program. I imagine that having such a large array would push
the lower bound of the stack up in memory, or if allocated at runtime,
would push the heap way down from the top.

In my Genetic Algorithm, I used Gnome's glib to do the hashing for me. I
figured that someone else already looked at this problem. In my Genetic
Algorithm, I am only storing a maximum of 32 bits in the hash key.

In the nProbe code, Luca Deri wrote the hash, using the key values as Tim
described. I have not figured out exactly how Luca does it, but I will
assume that he used a suitable method that works well enough for me. ;-)
If I don't contain the scope on my Master's Project, I won't finish!

Thus, this was the impetus for our discussion. ;-)

brian
-- 
Brian Lavender
http://www.brie.com/brian/

"There are two ways of constructing a software design. One way is to
make it so simple that there are obviously no deficiencies. And the other
way is to make it so complicated that there are no obvious deficiencies."

Professor C. A. R. Hoare
The 1980 Turing award lecture


More information about the vox-tech mailing list