Saturday, May 19, 2007

Casting Pointer Types to Integral Types in Beautiful C/C++

The somewhat intuitive question I should answer before talking about this issue is: why do we need to cast a pointer type to an integral type?
Ok, there is an answer. A good one too. And I am not writing this article just to talk about unthinkable things that C/C++ let us do :-D
The need for casting a pointer type to an integral type is in a hashing function where the pointer is the key. This is the short answer, to make it more clear I will describe what a hashing function and a hash table are. Then I will answer the question that some readers may ask: Again, why do we need to use a pointer as the key to a hash?

The Hash:
A hash table is simply an array that you can access with indexes that are not integers (e.g. strings, floats, user-defined types, etc...)
Lets say we have an addressbook with two attributes; name and email. We want to access the address book by giving it a name, and expecting and email for return value.
One way we could do this is by adding all address book entries to a linked-list. But then every time we want to do a "find" operation on the linked-list to get an email corresponding to a certain name, we may have to traverse the whole list to find our goal. This is time consuming, especially for large linked-lists [complexity O(n)].
Hashtables provide a solution for this problem, they reduce the complexity of the "find", "search", "insert" operations to "ideally" O(1).
The idea is to make a function that will map the key value (the name, in our case) to an integral index for the array. This function is called the hash-function. So now whenever we want to get an email of someone, we could use something like:
email e = myHashTable.get("Ahmad");

or alternatively if we have the [] operator overloaded:
email e = myHashTable["Ahmad"];
Internally the get() function (or the operator[]) would look like this:
email get(char* name) {
int index = hash(name);
return hashStore[index];
}
where hashStore is an array of email, and hash() is the hashing function.
It is worth-mentioning that the STL has a hash table implementation under the name map.

Alright, but why use a pointer as a key to a hash table? If we already have a pointer to an object, then why we need to search for it?
Let's say we have a class Rectangle, we create an array of Rectangles as follows:
Rectangle myRectangles[] = new Rectangle[10];
and then we make a for loop to initialize our array:
for (int i=0; i<10;>
//new Rectangle of random width and height
int height = rand();
int width = rand();
myRectangles[i] = new Rectangle(width, height);
if (width == height){
//if it is a square, display it
myRectangle[i].display();
//hash it, to know what do we have on display
displayedRectsHash.add(&myRectangle[i]);
}//end if
}//end for
//Now that you have displayed all Squares, display all the rectangles
for (i=0; i<10;>
Rectangle * aRect = displayedRectsHash.find(&myRectangle[i]);
if (aRect == NULL) {
//it is not in the hash, i.e. it is not displayed, so we display it
myRectangles[i].display();
//according to need, we could hash this pointer, too, here
}//end if
}//end for
The example above is trivial of course, but it is only meant to explain the idea. Because we could have used the index of the array as the key to the hash. However, sometimes the situation is not as easy as that.

Ok now that I explained when and why one might need to cast a pointer type to an integral type. Let's see how do we do it:
unsigned long hashPointer(void* key) {
return hashLong((unsigned long)key));
}
unsigned long hashLong(unsigned long key) {
unsigned long result;
//...real hashing algorithm here
return result;
}
Note the use of unsigned long variables and casts, not just long. Because if we used long casts, and the pointer held address larger than 2^31, the result of the cast will be a negative value, which of course is undesirable (unless of course if you design the hash function in a way that makes it aware of that fact, which is unusual).
Long type is 32-bit size, according to the C standards. The address space of a modern Intel x86 processor is 32-bit size as well. A unsigned long variable can hold values from 0 to 2^32-1. Large enough to contain any memory address on a x86 processor. On the other hand, signed long variables can contain values from -1*(2^31-1) to 2^31-1. Which is enough for mapping the address space of x86, too. But half of it will be mapped into negative values. And thus the negative result of the cast.

Wednesday, April 18, 2007

Intro - 1st Blog

Hello my firends, colleagues, buddies, and just surfers-by :-)

For the last couple of years I've been thinking I should make a blog, but I was too lazy to, or may be I wasn't excited enough.

I graduated in summer 2006, prior to that, I used to express my ideas and experiences throu our class's web forum. It was the perfect place to communicate and keep our things (b)logged. Now that all the classmates are busy in their jobs, our forum is slowly, but inevitably, fading away, leaving me with no place to (b)log my stuff and shaking up my old idea of creating my own blog.

Another reason for bringing this blog to life is the numerous exciting stories and situations I've been throu and wanted to record. All the adventures during summer 2006 and all the experiences, starting from the graduation and the hunt for a job to my near-death experience on the way to the North Coast. Also as I started my work at Mentor Graphics, I wanted to record some of my most interesting technical experiences.
[To those who don't know it: Mentor Graphics, Inc (NASDAQ: MENT) is a US-based multinational corporation dealing in electronic design automation (EDA) for electrical engineering and electronics, as of 2004, ranked third in the EDA industry it helped create. The company, founded in 1981, is headquartered in Wilsonville, Oregon, and employs 4,000 people worldwide (source).]

As well as my practical experience, my readings in computer engineering and computer science and my research as a Master of Science (M.Sc.) student provide me with a fertile medium for my technical blogs.

I expect that technical blogs will take the largest share of my blogs here. May be when they grow large enough I will move them to a separate technical blog and leave this one for personal blogs only. But untill then do not expect this blog to be purely technical, you will stumble across political posts sometimes, posts about my opinions on movies, restaurants, cafes, since I spend a good portion of my life hanging out :-D

Anywayz, ladies and gentlemen, finally the Blog thought sees the light now in April, 2007... well, actually I've set up the blog a while earlier but left it postless untill now :-)

Enjoy and visit again soon!