byte swapping

  • OK, I'm really hoping someone can explain this to me. I am running on an
    Intel Mac, and the file I am working with was written by code running on
    an Intel Mac.

    If I look at my file with HexEdit, I can see the following Bytes:

    00 00 07 F2

    If I read these bytes in using FSReadFork and then run the following code:

                for ( long x = 0; x < sizeof( ObjectHeader ); x++ ) {
                    CFShow( CFStringCreateWithFormat( kCFAllocatorDefault,
                                                      NULL,
                                                      CFSTR( "%x" ),
                                                      ((char*)&objectHeader)[x] ) );
                }

    I see printed out:

    0
    0
    7
    fffffff2

    So, the bytes are in the proper order in my objectHeader struct.

    However, when I try to print them out using the following code:

                CFShow( CFStringCreateWithFormat( kCFAllocatorDefault,
                                                  NULL,
                                                  CFSTR( "%ld %x" ),
                                                  objectHeader.fSize,
                                                  objectHeader.fSize ) );

    I get:

    -234422272  f2070000

    Suddenly and unexpectedly my bytes get swapped around.

    Now, I am assuming there is a perfectly logical explanation for this,
    but I am not sure what that might be.
    Can anyone explain it?

    Thank you.
  • On 10/6/06, Eric <mailist...> wrote:

    > Now, I am assuming there is a perfectly logical explanation for this,
    > but I am not sure what that might be.
    > Can anyone explain it?

    Intel systems are little endian.

    In the first code block you are listing out the data as single bytes
    in order of increasing address. In the second code block you are
    interpreting a block of four bytes as an integer. On Intel (little
    endian) systems the high-order byte of that integer is at the highest
    byte address and the lowest-order byte is at the lowest byte address.

    "[#]" byte address in memory

    [0] = 0x00
    [1] = 0x00
    [2] = 0x07
    [3] = 0xF2

    Integer32 of that sequence of bytes on a little endian system...

    [3][2][1][0] = 0xF2070000

    ...and on a big endian system...

    [0][1][2][3] = 0x000007F2

    -Shawn
  • On 10/6/06, Shawn Erickson <shawnce...> wrote:
    > On 10/6/06, Eric <mailist...> wrote:
    >
    >> Now, I am assuming there is a perfectly logical explanation for this,
    >> but I am not sure what that might be.
    >> Can anyone explain it?
    >
    > Intel systems are little endian.
    >
    > In the first code block you are listing out the data as single bytes
    > in order of increasing address. In the second code block you are
    > interpreting a block of four bytes as an integer. On Intel (little
    > endian systems) the high-order byte of that integer is at highest byte
    > address and the lowest-order byte is at the lowest byte address.
    >
    > "[#]" byte address in memory
    >
    > [0] = 0x00
    > [1] = 0x00
    > [2] = 0x07
    > [3] = 0xF2
    >
    > Integer32 of that sequence of bytes on a little endian system...
    >
    > [3][2][1][0] = 0xF2070000
    >
    > ...and on a big endian system...
    >
    > [0][1][2][3] = 0x000007F2

    So the important point is that it looks like the bytes are not in the
    correct order in memory for a little endian system if you are going to
    interpret those bytes as an integer. You basically need to decide what
    byte order you will use when you write out integers, etc. to the data
    file and then swap the bytes as needed when you read and write them
    from your data file depending on the endianness of the platform you are
    on.

    Review Apple's documentation on this subject...

    <http://developer.apple.com/documentation/MacOSX/Conceptual/universal_binary/universal_binary_byte_swap/chapter_4_section_1.html>

    -Shawn
  • Shawn Erickson wrote:
    > On 10/6/06, Shawn Erickson <shawnce...> wrote:
    >> On 10/6/06, Eric <mailist...> wrote:
    >>
    >>> Now, I am assuming there is a perfectly logical explanation for this,
    >>> but I am not sure what that might be.
    >>> Can anyone explain it?
    >>
    >> Intel systems are little endian.
    >>
    >> In the first code block you are listing out the data as single bytes
    >> in order of increasing address. In the second code block you are
    >> interpreting a block of four bytes as an integer. On Intel (little
    >> endian systems) the high-order byte of that integer is at highest byte
    >> address and the lowest-order byte is at the lowest byte address.
    >>
    >> "[#]" byte address in memory
    >>
    >> [0] = 0x00
    >> [1] = 0x00
    >> [2] = 0x07
    >> [3] = 0xF2
    >>
    >> Integer32 of that sequence of bytes on a little endian system...
    >>
    >> [3][2][1][0] = 0xF2070000
    >>
    >> ...and on a big endian system...
    >>
    >> [0][1][2][3] = 0x000007F2
    >
    > So the important point is that it looks like the bytes are not in the
    > correct order in memory for a little endian system if you are going to
    > interpret those bytes as an integer. You basically need to decide what
    > byte order you will use when you write out integers, etc. to the data
    > file and then swap the bytes as needed when you read and write them
    > from your data file depending on the endianness of the platform you are
    > on.
    Perhaps I am just being dense, but 0xf2070000 should be the little
    endian representation of 2034.

    So, why is it that:

              CFShow( CFStringCreateWithFormat( kCFAllocatorDefault,
                                                NULL,
                                                CFSTR( "%ld %x" ),
                                                objectHeader.fSize,
                                                objectHeader.fSize ) );

    was outputting:

    -234422272  f2070000

    I would have expected this to output either:

    -234422272  0x000007F2

    or

    2034  f2070000

    So that, in my little endian system, the hex output would be consistent
    with the decimal output.
    I still fail to see how outputting -234422272  f2070000 can be correct
    and consistent.
  • Why are you doing

    CFSTR( "%ld %x" ),

    and not

    CFSTR("%ld %lx"),

    Why is one a long and the other not?  Also, 'u' is the unsigned
    counterpart of 'd' in printf format strings ('x' already prints
    unsigned; 'X' just uses uppercase hex digits).

    On Oct 9, 2006, at 9:08 AM, Eric wrote:

    Shawn Erickson wrote:
    > On 10/6/06, Shawn Erickson <shawnce...> wrote:
    >> On 10/6/06, Eric <mailist...> wrote:
    >>
    >>> Now, I am assuming there is a perfectly logical explanation for
    >> this,
    >>> but I am not sure what that might be.
    >>> Can anyone explain it?
    >>
    >> Intel systems are little endian.
    >>
    >> In the first code block you are listing out the data as single bytes
    >> in order of increasing address. In the second code block you are
    >> interpreting a block of four bytes as an integer. On Intel (little
    >> endian systems) the high-order byte of that integer is at highest
    >> byte
    >> address and the lowest-order byte is at the lowest byte address.
    >>
    >> "[#]" byte address in memory
    >>
    >> [0] = 0x00
    >> [1] = 0x00
    >> [2] = 0x07
    >> [3] = 0xF2
    >>
    >> Integer32 of that sequence of bytes on a little endian system...
    >>
    >> [3][2][1][0] = 0xF2070000
    >>
    >> ...and on a big endian system...
    >>
    >> [0][1][2][3] = 0x000007F2
    >
    > So the important point is that it looks like the bytes are not in the
    > correct order in memory for a little endian system if you are going to
    > interpret those bytes as an integer. You basically need to decide what
    > byte order you will use when you write out integers, etc. to the data
    > file and then swap the bytes as needed when you read and write them
    > from your data file depending on the endianness of the platform you are
    > on.
    Perhaps I am just being dense, but 0xf2070000 should be the little
    endian representation of 2034.

    So, why is it that:

              CFShow( CFStringCreateWithFormat( kCFAllocatorDefault,
                                                NULL,
                                                CFSTR( "%ld %x" ),
                                                objectHeader.fSize,
                                                objectHeader.fSize ) );

    was outputting:

    -234422272  f2070000

    I would have expected this to output either:

    -234422272  0x000007F2

    or

    2034  f2070000

    So that, in my little endian system, the hex output would be
    consistent with the decimal output.
    I still fail to see how outputting -234422272  f2070000 can be
    correct and consistent.

    _______________________________________________
    MacOSX-dev mailing list
    <MacOSX-dev...>
    http://www.omnigroup.com/mailman/listinfo/macosx-dev
  • Kevin Stone wrote:
    > Why are you doing
    >
    > CFSTR( "%ld %x" ),
    >
    > and not
    >
    > CFSTR("%ld %lx"),
    >
    > Why is one a long and the other not?
    > Also, 'u' is the unsigned counterpart of 'd' in printf format strings
    > ('x' already prints unsigned; 'X' just uses uppercase hex digits).

    Well, mostly because of this page:

    http://developer.apple.com/documentation/Cocoa/Conceptual/Strings/Articles/formatSpecifiers.html#//apple_ref/doc/uid/TP40004265


    I should actually be using %d instead of %ld, but none of these variants
    change the output.