Bitwise operations help

jaywhy13
09-11-2009, 06:49 PM
Okay... I have a binary string from a Postgis db that looks like this:
0101000020E6100000000000000000F03F000000000000F03F

It's a sequence of 25 bytes, each represented by 2 hex digits. I'm trying to understand how the getInt method works.
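As a rough illustration of the "2 hex digits per byte" point, here is a minimal Java sketch that turns such a string into a byte array. The class name HexToBytes and the helper parseHex are made up for this example and are not part of the original code.

```java
class HexToBytes {
    // Hypothetical helper: converts a hex string (two digits per byte) into a byte array.
    static byte[] parseHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            // parseInt with radix 16 yields 0..255; the cast wraps values above 127 to negative bytes.
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] arr = parseHex("0101000020E6100000000000000000F03F000000000000F03F");
        System.out.println(arr.length); // 25
        System.out.println(arr[0]);     // 1
    }
}
```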

Now imagine that all this data is in a byte array called "arr" and there's a position variable that gets incremented whenever any of the following methods are called.

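int getByte() {
    // mask with 0xFF to undo sign extension, so the result is 0..255
    return arr[position++] & 0xFF;
}

int getInt() {
    // little endian: the byte at index is the least significant one
    return ((arr[index + 3] & 0xFF) << 24) + ((arr[index + 2] & 0xFF) << 16)
            + ((arr[index + 1] & 0xFF) << 8) + (arr[index] & 0xFF);
}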

Calling getByte on the stream will give the first byte (01):
0101000020E6100000000000000000F03F000000000000F03F

What will calling getInt return?
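For anyone wondering why getByte masks with 0xFF: Java bytes are signed, so a byte such as 0xF0 widens to a negative int unless it is masked. A small sketch of that effect follows; the class SignExtensionDemo and the helper toUnsigned are invented for illustration.

```java
class SignExtensionDemo {
    // Hypothetical helper: reads a byte as an unsigned value in the range 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xF0;
        int widened = b;             // sign-extended when widened to int
        int masked = toUnsigned(b);  // upper 24 bits cleared by the mask
        System.out.println(widened);                       // -16
        System.out.println(masked);                        // 240
        System.out.println(Integer.toHexString(widened));  // fffffff0
        System.out.println(Integer.toHexString(masked));   // f0
    }
}
```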

oracleguy
09-11-2009, 08:02 PM
That is impossible to say without knowing the value of index. You need to provide more information.

Both of those functions should really take parameters. For example, getInt should take the index and the array as parameters.

jaywhy13
09-11-2009, 11:06 PM
The index is 1. I figured it out. Since the byte order is NDR / little endian, the least significant byte comes first (on the left), so getInt() has to reverse the bytes. That's exactly what the code does. So it returns 1.

This is the correct array btw... the values are in base 10 though.
{0=1, 1=1, 2=0, 3=0, 4=0, 5=0, 6=0, 7=0, 8=0, 9=0, 10=0, 11=-16, 12=63, 13=0, 14=0, 15=0, 16=0, 17=0, 18=0, 19=-16, 20=63}

NB: That's listed as index=value
The getInt method is concerned with getting an int starting @ index 1. Hence values 1=1, 2=0, 3=0 and 4=0.
As said, the bytes need to be reversed since it's actually 0 0 0 1... it's just stored 1 0 0 0 by little endian convention.
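The byte reversal described above can be sketched in Java roughly like this. The class LittleEndianReader is a made-up stand-in, not the OpenMap code; it keeps the array and position as fields instead of globals and uses the array values listed in this post.

```java
class LittleEndianReader {
    private final byte[] arr;
    private int position;

    LittleEndianReader(byte[] arr) {
        this.arr = arr;
    }

    // Reads one byte as an unsigned value and advances the position.
    int getByte() {
        return arr[position++] & 0xFF;
    }

    // Reads four bytes, least significant byte first, and advances the position.
    int getInt() {
        int value = (arr[position] & 0xFF)
                | ((arr[position + 1] & 0xFF) << 8)
                | ((arr[position + 2] & 0xFF) << 16)
                | ((arr[position + 3] & 0xFF) << 24);
        position += 4;
        return value;
    }

    public static void main(String[] args) {
        byte[] data = {1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, -16, 63, 0, 0, 0, 0, 0, 0, -16, 63};
        LittleEndianReader reader = new LittleEndianReader(data);
        System.out.println(reader.getByte()); // 1 (index 0)
        System.out.println(reader.getInt());  // 1 (indices 1..4, stored as 1 0 0 0)
    }
}
```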

However... I do have another question.

int typeword = data.getInt(); // int we previously discussed as being one
int realtype = typeword & 0x1FFFFFFF; // cut off high flag bits

The value "realtype" is also 1.
My evaluation is .....

0x1FFFFFFF anded with
0x00000001 gives
-----------
0x00000001

Here are some other pieces of the code.

boolean haveZ = (typeword & 0x80000000) != 0;
boolean haveM = (typeword & 0x40000000) != 0;
boolean haveS = (typeword & 0x20000000) != 0;

I'm assuming all of those will be false, since each AND gives zero???
0x80000000 &
0x00000001
-----------
0x00000000
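The evaluations above can be checked with a short Java sketch. The masks are the ones from the snippets in this post; the class TypewordFlags and its constant and method names are invented here for illustration.

```java
class TypewordFlags {
    static final int Z_FLAG = 0x80000000;
    static final int M_FLAG = 0x40000000;
    static final int S_FLAG = 0x20000000;
    static final int TYPE_MASK = 0x1FFFFFFF;

    // Keeps the low 29 bits, i.e. cuts off the three high flag bits.
    static int realType(int typeword) {
        return typeword & TYPE_MASK;
    }

    // True when every bit of the given flag mask is set in the typeword.
    static boolean has(int typeword, int flag) {
        return (typeword & flag) != 0;
    }

    public static void main(String[] args) {
        int typeword = 1;
        System.out.println(realType(typeword));     // 1
        System.out.println(has(typeword, Z_FLAG));  // false
        System.out.println(has(typeword, M_FLAG));  // false
        System.out.println(has(typeword, S_FLAG));  // false
    }
}
```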

oracleguy
09-11-2009, 11:16 PM
Whether integers are stored little endian or big endian depends on the system architecture. Intel x86, for example, uses little endian. In your program, at the application level, the only time you need to worry about endianness is when you are reading data stored in the opposite byte order from the system architecture. That in itself is extremely rare; one of the few times it comes up is when reading raw Ethernet packets, which are big endian. But even then, 99% of the time you'd be using a socket library and not have to worry about it.

jaywhy13 wrote:
> Is that evaluation correct? Also, does the system automatically represent ints as little endian? I'm using Windows XP here. I'm thinking that 1 in this case would be 00000000 00000000 00000000 00000001 but the little endian version would be 10000000 00000000 00000000 00000000
>
> Is this correct?

Assuming your first byte is the lowest memory address, no, you have it backwards.

As far as how an integer is stored in memory with little endian, location 0 would hold the least significant byte and location 3 would hold the most significant byte. (Assuming we are talking about 32-bit integers.)
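To make the byte layout concrete, here is a small Java sketch using java.nio.ByteBuffer, which lets you pick the byte order explicitly (its default is big endian regardless of the hardware). The class ByteOrderDemo and the helper encode are hypothetical names for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class ByteOrderDemo {
    // Hypothetical helper: returns the four bytes of a 32-bit int in the given byte order.
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    public static void main(String[] args) {
        // Index 0 of each array is the lowest address.
        System.out.println(Arrays.toString(encode(1, ByteOrder.LITTLE_ENDIAN))); // [1, 0, 0, 0]
        System.out.println(Arrays.toString(encode(1, ByteOrder.BIG_ENDIAN)));    // [0, 0, 0, 1]
        // The byte order of the machine running this code (output varies by platform).
        System.out.println(ByteOrder.nativeOrder());
    }
}
```

Note that the bits inside each byte are not reversed; only the order of the bytes changes.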

The nice thing about big endian is that when looking at memory dumps, you can read it from left to right. I've done embedded programming using Motorola processors that were big endian.

Those code snippets, are they from a program you wrote or what? The reason I ask is their apparent use of global variables in lieu of parameters, which is really bad programming practice.

jaywhy13
09-11-2009, 11:35 PM
Ok... thanks. I corrected my earlier post. Thanks so much for the explanation. It makes things a lot clearer. You can probably read my updated post. I see I was misunderstanding the typeword ANDing operations, thinking that the value would need to be reversed.

I copied code from a PostGISLayer in openmap. I condensed the code to make it easier to read. Instead of showing ALL the methods I just yanked the lines out of the methods and put them all together. So :) it's not half as bad as I made it look.

Thanks again though.

jaywhy13
09-11-2009, 11:42 PM
Also... this line JUST made sense to me :o

int realtype = typeword & 0x1FFFFFFF; // cut off high flag bits
That 1 at the start of the hex is actually 00000001. So when they say "cut off high flag bits" I'm assuming that they would store different flags in that first byte.

Example...

boolean haveZ = (typeword & 0x80000000) != 0;
boolean haveM = (typeword & 0x40000000) != 0;
boolean haveS = (typeword & 0x20000000) != 0;
The z flag would be 00001000
The m flag would be 00000100
The s flag would be 00000010

Therefore haveZ, haveM and haveS would all be true if the typeword was
0xE0000001?
Is that correct?

oracleguy
09-12-2009, 12:25 AM
jaywhy13 wrote:
> boolean haveZ = (typeword & 0x80000000) != 0;
> boolean haveM = (typeword & 0x40000000) != 0;
> boolean haveS = (typeword & 0x20000000) != 0;
> The z flag would be 00001000
> The m flag would be 00000100
> The s flag would be 00000010
>
> Therefore haveZ, haveM and haveS would all be true if the typeword was
> 0xE0000001?
> Is that correct?

No, they would all be true if the top nibble was 0xE; the remaining 28 bits could be anything and they'd still all be true. This is because Z, M and S are each looking for one specific bit being set; as long as it is, it doesn't matter what the other bits are.

jaywhy13
09-12-2009, 12:45 AM
Given the current value of the typeword (1), haveZ, haveM and haveS all evaluate to false. I'm thinking that an example of a typeword having a lower value of 1 and also having the z, m and s flag bits set would look like 0xE0000001?

That's not correct? If not, what would the typeword look like if the realtype still evaluates to 1 and haveZ, haveM and haveS all evaluate to true?

int realtype = typeword & 0x1FFFFFFF; // cut off high flag bits

I understand that 0xE would have the z, m and s flags set. I was just wondering what the typeword would need to look like in order to satisfy realtype = 1 and haveZ, haveM and haveS = true.

oracleguy
09-12-2009, 03:35 AM
jaywhy13 wrote:
> I understand that 0xE would have the z, m and s flags set. I was just wondering what the typeword would need to look like in order to satisfy realtype = 1 and haveZ, haveM and haveS = true.

Then it would be: 0xE0000001
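A quick Java sketch of how such a typeword could be put together and checked. The masks come from the snippets quoted in this thread; the class TypewordCompose and the helpers compose and allFlagsSet are invented for illustration.

```java
class TypewordCompose {
    static final int TYPE_MASK = 0x1FFFFFFF;
    static final int ALL_FLAGS = 0xE0000000; // Z | M | S

    // Hypothetical helper: packs a geometry type and the three flags into one int.
    static int compose(int realtype, boolean z, boolean m, boolean s) {
        int word = realtype & TYPE_MASK;
        if (z) word |= 0x80000000;
        if (m) word |= 0x40000000;
        if (s) word |= 0x20000000;
        return word;
    }

    // True when the Z, M and S bits are all set, whatever the low 29 bits hold.
    static boolean allFlagsSet(int typeword) {
        return (typeword & ALL_FLAGS) == ALL_FLAGS;
    }

    public static void main(String[] args) {
        int typeword = compose(1, true, true, true);
        System.out.println(Integer.toHexString(typeword)); // e0000001
        System.out.println(typeword & TYPE_MASK);          // 1
        System.out.println(allFlagsSet(typeword));         // true
        System.out.println(allFlagsSet(0xE0000005));       // true (other low bits do not matter)
        System.out.println(allFlagsSet(0x60000001));       // false (Z bit missing)
    }
}
```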

jaywhy13
09-13-2009, 07:37 AM
Okay... THANKS A LOT!!!! Great help. Needed this information for a project I was working on!