Mac Pro memory sizes

  • Hi,
    Not sure if I'm addressing the right list for this topic.
    I'm just trying to get a notion of my memory requirements for a
    program I am designing to run on my Mac Pro. I will have large volumes
    of data passing through the program and I'm worrying about minimising
    page collisions.

    I've had a quick look on the net but can't find answers to the types
    of questions I'm asking below.
    If anyone knows a good link I'd appreciate it.
    I guess I ought to run some trials and no doubt will do once I have
    something running but someone ought to know the answers so ....
    My questions are:

    About This Mac says that I have 2GB of internal memory.
    Is this 2GB of 64-bit words or 2GB of 8-bit bytes?
    I appreciate that GB is Giga Byte but ......

    Similarly with respect to the L2 Cache, I have 12 MB per processor, is
    that 12 MB by 8 bits or 64 bits?

    In thinking about memory usage, where previously I would think of my
    program in terms of 8 or 16 or 32 bit words should I now be thinking
    in terms of 64 bit words?
    That is, should I think of my available internal memory space as
    effectively being 500MB words?

    Similarly, say that I had 100MB of 2 x 8-bit byte integers to save to
    disk, should I now think that this will be saved as 100MB by 64 bit
    (i.e. 8 x 8-bit byte) integers?
    If it is 100MB by 64 bit integers then should I think of compressing
    the data so as to reduce bandwidth requirements?

    Thanks
    Julius

    http://juliuspaintings.co.uk
  • On 11 Jan 2009, at 21:44, julius wrote:

    > About This Mac says that I have 2GB of internal memory.
    > Is this 2GB of 64-bit words or 2GB of 8-bit bytes?

    8 bits. Always 8 bits.
  • On Jan 11, 2009, at 4:44 PM, julius wrote:

    > About This Mac says that I have 2GB of internal memory.
    > Is this 2GB of 64-bit words or 2GB of 8-bit bytes?
    > I appreciate that GB is Giga Byte but ......
    >
    > Similarly with respect to the L2 Cache, I have 12 MB per processor,
    > is that 12 MB by 8 bits or 64 bits?

    Although the size of a byte can vary with the architecture, that is
    rarely the case now.  You can be pretty much sure that when something
    says byte it is supposed to mean 8 bits.

    Now technically a gigabyte (GB) is 10^6 bytes (1,000,000 bytes) but
    often people mean 2^20 bytes (1,048,576 bytes).  In actuality a
    gibibyte (GiB) is 2^20 bytes but it's not used in all the places it
    should be used.

    Here's some more information for you:
    <http://en.wikipedia.org/wiki/Byte>

    - Ken
  • On Jan 11, 2009, at 5:04 PM, Kenneth Bruno II wrote:

    > On Jan 11, 2009, at 4:44 PM, julius wrote:
    >
    >> About This Mac says that I have 2GB of internal memory.
    >> Is this 2GB of 64-bit words or 2GB of 8-bit bytes?
    >> I appreciate that GB is Giga Byte but ......
    >>
    >> Similarly with respect to the L2 Cache, I have 12 MB per processor,
    >> is that 12 MB by 8 bits or 64 bits?
    >
    > Although the term byte can be variable depending on architecture,
    > this is something that is rarely done now.  You can be pretty much
    > sure that when something says byte it is supposed to mean 8 bits.
    >
    > Now technically a gigabyte (GB) is 10^6 bytes (1,000,000 bytes) but
    > often people mean 2^20 bytes (1,048,576 bytes).  In actuality a
    > gibibyte (GiB) is 2^20 bytes but it's not used in all the places it
    > should be used.
    >
    > Here's some more information for you:
    > <http://en.wikipedia.org/wiki/Byte>

    My bad here, I accidentally used the wrong amounts and confused mega-
    and mebi- with giga- and gibi-

    It should be:

    megabyte (MB): 10^6 bytes (1,000,000 bytes)
    mebibyte (MiB): 2^20 bytes (1,048,576 bytes)
    gigabyte (GB): 10^9 bytes (1,000,000,000 bytes)
    gibibyte (GiB): 2^30 bytes (1,073,741,824 bytes)

    - Ken
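
    To make the table concrete, here is a minimal C sketch of the four
    units. The macro and function names are made up for illustration;
    only the values come from the table. It also shows why a drive sold
    as 500GB (decimal) reports as roughly 465 binary gigabytes.

```c
#include <assert.h>
#include <stdint.h>

/* Decimal (SI) and binary (IEC) byte units. */
#define BYTES_PER_MB  UINT64_C(1000000)     /* 10^6 */
#define BYTES_PER_MIB (UINT64_C(1) << 20)   /* 2^20 = 1,048,576 */
#define BYTES_PER_GB  UINT64_C(1000000000)  /* 10^9 */
#define BYTES_PER_GIB (UINT64_C(1) << 30)   /* 2^30 = 1,073,741,824 */

/* Whole gibibytes contained in a capacity quoted in decimal gigabytes.
   Hypothetical helper, not part of any library. */
uint64_t decimal_gb_to_whole_gib(uint64_t gigabytes)
{
    assert(gigabytes <= UINT64_MAX / BYTES_PER_GB);  /* guard overflow */
    return gigabytes * BYTES_PER_GB / BYTES_PER_GIB;
}
```

    For example, decimal_gb_to_whole_gib(500) returns 465.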
  • Depending on what sort of data you have, you could try allocating all of
    your memory on startup, organised into related "zones". That way you are
    not constantly allocating/deallocating anything. Just overwriting
    values. This can provide an unbelievable speed improvement, and low
    memory overheads/paging and so on.
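
    A minimal sketch of the allocate-once idea in C, assuming a simple
    bump-pointer zone. The names zone_t, zone_init, zone_alloc,
    zone_reset and zone_destroy are hypothetical and not an existing
    API; a real program would probably keep one zone per kind of data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A fixed-size "zone": one malloc at startup, then bump-pointer hand-outs. */
typedef struct {
    unsigned char *base;  /* start of the preallocated block */
    size_t capacity;      /* total bytes in the block */
    size_t used;          /* bytes handed out so far */
} zone_t;

/* Reserve the whole zone up front. Returns 0 on success, -1 on failure. */
int zone_init(zone_t *zone, size_t capacity)
{
    assert(zone != NULL);
    zone->base = malloc(capacity);
    zone->capacity = zone->base ? capacity : 0;
    zone->used = 0;
    return zone->base ? 0 : -1;
}

/* Hand out `size` bytes, rounded up to a multiple of 16; NULL when full. */
void *zone_alloc(zone_t *zone, size_t size)
{
    size_t rounded = (size + 15u) & ~(size_t)15u;
    if (rounded < size || rounded > zone->capacity - zone->used)
        return NULL;
    void *block = zone->base + zone->used;
    zone->used += rounded;
    return block;
}

/* Reuse the zone for the next batch; nothing goes back to the system. */
void zone_reset(zone_t *zone) { zone->used = 0; }

/* Release the block at shutdown. */
void zone_destroy(zone_t *zone)
{
    free(zone->base);
    zone->base = NULL;
    zone->capacity = zone->used = 0;
}
```

    After zone_reset the same memory is simply overwritten, which is the
    "just overwriting values" behaviour described above.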

  • On 11 Jan 2009, at 22:04:09, Kenneth Bruno II wrote:

    > In actuality a gibibyte (GiB) is 2^20 bytes but it's not used in all
    > the places it should be used.

    It's rarely used at all, for several reasons. One is that it makes
    little sense to your average consumer, but the more amusing reason
    that standard isn't used is because "kibibyte" sounds like a
    children's breakfast cereal.

    In general, it depends on the level of technical discussion going on.
    In this area it should always mean 2^30, in my opinion. In context of
    discussing hard drive sizes with your neighbour, it rarely matters.
    Remember: context. For example, a discussion on Wikipedia is leaning
    towards constant use of giga- anyway to prevent confusion. In
    general, the discussions which occur on these lists would not generate
    such confusion when gibi- is used, but use of giga- to mean 10^9 would
    be far less useful than the "real" value of 2^30.
  • It's important to note that the reason for this peculiarity is that in
    computer science we use powers of 2 extensively. As an electrical
    engineer, I find the use of kilo, mega, giga, etc. prefixes irritating
    as  these are defined by the SI system to be 10^3, 10^6 and 10^9,
    respectively. See http://en.wikipedia.org/wiki/SI_prefix.

    It is highly unfortunate that consumers are subjected to this
    confusion and that we are accustomed to our "500GB" drives only
    holding around 460GB in "real" terms.

    While "kibibyte" and "gibibyte" will never catch on, we need to
    realise that although the terminology is somewhat nonsensical, the
    quantity they represent is the power of 2 value reported by our
    operating systems: the "real" value.

    All of this aside, anyone who is sufficiently computer literate or has
    experience programming (read: anyone on this list) should be able to
    understand gigabyte in all contexts and be able to recognise the
    different values it can hold in each of those contexts.

    Kind regards,
    Jamie Toolin.

    PS: Sorry Scott for the slightly off-topic nature of this post.

  • On Jan 11, 2009, at 5:26 PM, Benjamin Dobson wrote:

    >
    > On 11 Jan 2009, at 22:04:09, Kenneth Bruno II wrote:
    >
    >> In actuality a gibibyte (GiB) is 2^20 bytes but it's not used in
    >> all the places it should be used.
    >
    > It's rarely used at all, for several reasons. One is that it makes
    > little sense to your average consumer, but the more amusing reason
    > that standard isn't used is because "kibibyte" sounds like a
    > children's breakfast cereal.
    >
    > In general, it depends on the level of technical discussion going
    > on. In this area it should always mean 2^30, in my opinion. In
    > context of discussing hard drive sizes with your neighbour, it
    > rarely matters. Remember: context. For example, a discussion on
    > Wikipedia is leaning towards constant use of giga- anyway to
    > prevent confusion. In general, the discussions which occur on these
    > lists would not generate such confusion when gibi- is used, but use
    > of giga- to mean 10^9 would be far less useful than the "real" value
    > of 2^30.

    Yes, I'd assume in this context that GB means 2^30 instead of 10^9 but
    it still stands to reason that you should be careful to find out
    exactly which meaning is relevant.

    One important thing to note is that if you assume GB means 10^9 bytes
    and you plan your resource usage accordingly then even if it really
    represents 2^30 bytes you won't over-allocate resources since 2^30 is
    larger than 10^9.  If you assume that GB means 2^30 bytes and it is
    really the smaller amount of 10^9 bytes then you could run out of
    resources.

    - Ken
  • On 11 Jan 2009, at 22:19, Jacob Rhoden wrote:

    > Depending on what sort of data you have, you could try allocating all
    > of your memory on startup, organised into related "zones". That way
    > you are not constantly allocating/deallocating anything. Just
    > overwriting values. This can provide an unbelievable speed
    > improvement, and low memory overheads/paging and so on.

    I'll take a good look at this.
    Thanks.

    I think I'll do a fair bit of packing 8 bit bytes into 64 bit integers
    - trade speed for memory, and think hard on how best to use my 4 x
    512GB RAID. I suspect experimentation is the only way.

    best wishes
    Julius

    http://juliuspaintings.co.uk
  • On Sun, Jan 11, 2009 at 4:44 PM, julius <julius...> wrote:
    > About This Mac says that I have 2GB of internal memory.
    > Is this 2GB of 64-bit words or 2GB of 8-bit bytes?
    > I appreciate that GB is Giga Byte but ......

    Others have covered this adequately but I just want to reinforce that
    there's essentially no other way to interpret 2GB these days other
    than as roughly 16 billion bits. ("Roughly" because of the whole
    decimal/binary silliness.)

    > Similarly with respect to the L2 Cache, I have 12 MB per processor, is that
    > 12 MB by 8 bits or 64 bits?

    Likewise here. This principle applies *everywhere*.

    > In thinking about memory usage, where previously I would think of my program
    > in terms of 8 or 16 or 32 bit words should I now be thinking in terms of 64
    > bit words?
    > That is, should I think of my available internal memory space as effectively
    > being 500MB words?

    No, this makes no sense. You have 2GB of memory. If you're working
    with 64-bit words then it makes sense to think of your memory as being
    roughly 256 million words (note: not 500) but that's not the same as
    "500MB words".

    > Similarly, say that I had 100MB of 2 x 8-bit byte integers to save to disk,
    > should I now think that this will be saved as 100MB by 64 bit (i.e. 8 x
    > 8-bit byte) integers?
    > If it is 100MB by 64 bit integers then should I think of compressing the
    > data so as to reduce bandwidth requirements?

    Just because your machine has a 64-bit processor doesn't mean you're
    suddenly required to work with 64-bit quantities everywhere. You can
    still work with 8, 16, or 32-bit quantities as you need. Which one is
    the most appropriate choice, I couldn't say, but you seem to have this
    strange idea that your 8-bit integers will somehow magically take up
    64 bits of storage just because you're running on a Core2
    architecture. It's simply not the case.

    Mike
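
    The arithmetic above can be sketched in C. The function names are
    hypothetical; the point is only that storage is set by the element
    type you choose, not by the processor's word size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of whole words of `word_bytes` bytes that fit in `total_bytes`. */
uint64_t words_that_fit(uint64_t total_bytes, size_t word_bytes)
{
    assert(word_bytes > 0);
    return total_bytes / word_bytes;
}

/* Bytes occupied by `count` elements of each integer width. */
size_t bytes_for_u8(size_t count)  { return count * sizeof(uint8_t);  }
size_t bytes_for_u16(size_t count) { return count * sizeof(uint16_t); }
size_t bytes_for_u64(size_t count) { return count * sizeof(uint64_t); }
```

    With 2GB taken as 2 x 2^30 bytes, words_that_fit gives 268,435,456
    eight-byte words, which is 256 x 2^20. One hundred 16-bit integers
    occupy 200 bytes whether the build is 32-bit or 64-bit.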
  • On 12 Jan 2009, at 02:32, "Michael Ash" <michael.ash...> wrote:
    >
    >
    >> In thinking about memory usage, where previously I would think of
    >> my program
    >> in terms of 8 or 16 or 32 bit words should I now be thinking in
    >> terms of 64
    >> bit words?
    >> That is, should I think of my available internal memory space as
    >> effectively
    >> being 500MB words?
    >
    > No, this makes no sense. You have 2GB of memory. If you're working
    > with 64-bit words then it makes sense to think of your memory as being
    > roughly 256 million words (note: not 500) but that's not the same as
    > "500MB words".
    Yes, of course. Silly mistake.
    >
    >
    >> Similarly, say that I had 100MB of 2 x 8-bit byte integers to save
    >> to disk,
    >> should I now think that this will be saved as 100MB by 64 bit (i.e.
    >> 8 x
    >> 8-bit byte) integers?
    >> If it is 100MB by 64 bit integers then should I think of
    >> compressing the
    >> data so as to reduce bandwidth requirements?
    >
    > Just because your machine has a 64-bit processor doesn't mean you're
    > suddenly required to work with 64-bit quantities everywhere. You can
    > still work with 8, 16, or 32-bit quantities as you need.
    Yes, but my understanding is that this will change when we go into a
    full 64 bit architecture and as a one man band I would prefer to write
    code that anticipates the change than to have to change everything
    later.
    Also, the documentation
    http://tinyurl.com/6ceoqz
    leads me to believe that if I need an address space of more than 4GB
    then I should be using 64 bit computing.
    > Which one is
    > the most appropriate choice, I couldn't say, but you seem to have this
    > strange idea that your 8-bit integers will somehow magically take up
    > 64 bits of storage just because you're running on a Core2
    > architecture. It's simply not the case.
    I'm glad you've flagged this up because  one of the reasons for my
    asking the question was to increase my understanding of what I should
    and should not be thinking about in this regard.

    So let me then ask: under the 64 bit architecture, will the standard C
    types like int, char etc still be available and not give me problems
    under garbage collection given I define them as strong?
    Currently I'm defining most of my variables as type NSInteger and
    CGFloat. Is that wrong?
    Or should I be implementing my numbers as NSNumber? I thought the main
    purpose of NSNumber was as a wrapper to enable us to put numbers into
    NSArray etc. Would not using it as the main representation seriously
    affect computation speed? What are other people doing?
    If you have any good links or advice they'd be much appreciated.

    Thanks

    Julius

    http://juliuspaintings.co.uk
  • On Jan 12, 2009, at 9:50 AM, julius wrote:

    > So let me then ask: under the 64 bit architecture, will the standard
    > C types like int, char etc still be available and not give me
    > problems under garbage collection given I define them as strong?
    > Currently I'm defining most of my variables as type NSInteger and
    > CGFloat. Is that wrong?
    > Or should I be implementing my numbers as NSNumber? I thought the
    > main purpose of NSNumber was as a wrapper to enable us to put
    > numbers into NSArray etc. Would not using it as the main
    > representation seriously affect computation speed? What are other
    > people doing?
    > If you have any good links or advice they'd be much appreciated.

    From what I understand there won't be much change for the smaller
    variable types.  Going 64 bit just means that the largest variables
    available will be larger and that you'll be able to address more
    memory at a time.  There will be some changes in size and alignment in
    some of the variable types but most of these changes will hardly be
    noticeable under most circumstances.

    NSInteger and NSUInteger are designed so that they will best fit the
    environment which the code is compiled for.  If you use them then you
    generally don't have to worry if you are compiling for 32 bit or 64
    bit, these variables are defined with an appropriate size for the
    environment.  If you need to know what that size is then you can use
    the constants NSIntegerMin, NSIntegerMax, and NSUIntegerMax.

    NSNumber is an object that holds values for you.  You use it when you
    can't use a plain variable, such as when you need to store a value in
    a container class that only holds objects, such as NSArray.  There is
    a small performance and memory hit for using NSNumber over a regular
    variable but if you use them in small amounts and don't allocate and
    deallocate them like mad then you shouldn't have any trouble.

    You can learn more about 64 bit computing under Mac OS X here:
    <http://developer.apple.com/documentation/Darwin/Conceptual/64bitPorting/intro/chapter_1_section_1.html>

    Here is the section on data type changes:
    <http://developer.apple.com/documentation/Darwin/Conceptual/64bitPorting/transition/chapter_3_section_3.html>

    As you can see in that last link the variable types char, short, and
    int are staying the same size.  Only the variable types long, pointer,
    and size_t are changing size.  This isn't really a big deal but if you
    are writing code to compile for both 32 bit and 64 bit environments
    then you might want to do some sanity checks against the max size for
    the types that are going to change if you think you might run up
    against these limits in the 32 bit environment.

    Lastly you can always use the exact-width integer types such as int8_t
    which are guaranteed to be exactly 8 bits in size.  These (and many
    more) are defined in the C99 standard:
    <http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf>

    Honestly though, most of us won't have to worry about these details.
    I'd use NSInteger and NSUInteger unless you are dealing with a lot of
    very small integers that you need to pack in as little memory as
    possible.  If you need an object instead of a plain variable then use
    NSNumber or NSValue.

    - Ken
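
    A short C sketch of the points above, with hypothetical function
    names. It prints the sizes that depend on the build, checks the
    exact-width types, and shows the kind of range check worth doing if
    a value held in a long must also fit when long is 32 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Print the sizes of the basic C types for the current build.
   char, short and int keep their size in a 64-bit Mac OS X build;
   long, pointers and size_t are the ones that grow. */
void print_type_sizes(void)
{
    printf("char   %zu\n", sizeof(char));
    printf("short  %zu\n", sizeof(short));
    printf("int    %zu\n", sizeof(int));
    printf("long   %zu\n", sizeof(long));
    printf("void * %zu\n", sizeof(void *));
    printf("size_t %zu\n", sizeof(size_t));
}

/* Exact-width types have the same size in every build. */
void check_exact_width_types(void)
{
    assert(sizeof(int8_t) == 1);
    assert(sizeof(int16_t) == 2);
    assert(sizeof(int32_t) == 4);
    assert(sizeof(int64_t) == 8);
}

/* Range check: would `value` still fit if long were only 32 bits wide? */
int fits_in_32_bit_long(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

    Running print_type_sizes in both a 32-bit and a 64-bit build is a
    quick way to see which types actually differ on your machine.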
  • On Mon, Jan 12, 2009 at 6:50 AM, julius <julius...> wrote:
    >
    > On 12 Jan 2009, at 02:32, "Michael Ash" <michael.ash...> wrote:
    >>
    >>
    >>> In thinking about memory usage, where previously I would think of my
    >>> program
    >>> in terms of 8 or 16 or 32 bit words should I now be thinking in terms of
    >>> 64
    >>> bit words?
    >>> That is, should I think of my available internal memory space as
    >>> effectively
    >>> being 500MB words?
    >>
    >> No, this makes no sense. You have 2GB of memory. If you're working
    >> with 64-bit words then it makes sense to think of your memory as being
    >> roughly 256 million words (note: not 500) but that's not the same as
    >> "500MB words".
    >
    > Yes, of course. Silly mistake.
    >>
    >>
    >>> Similarly, say that I had 100MB of 2 x 8-bit byte integers to save to
    >>> disk,
    >>> should I now think that this will be saved as 100MB by 64 bit (i.e. 8 x
    >>> 8-bit byte) integers?
    >>> If it is 100MB by 64 bit integers then should I think of compressing the
    >>> data so as to reduce bandwidth requirements?
    >>
    >> Just because your machine has a 64-bit processor doesn't mean you're
    >> suddenly required to work with 64-bit quantities everywhere. You can
    >> still work with 8, 16, or 32-bit quantities as you need.
    >
    > Yes, but my understanding is that this will change when we go into a full 64
    > bit architecture

    Not true.

    > and as a one man band I would prefer to write code that
    > anticipates the change than to have to change everything later.
    > Also, the documentation
    > http://tinyurl.com/6ceoqz
    > leads me to believe that if I need an address space of more than 4GB then I
    > should be using 64 bit computing.

    True.

    >>
    >> Which one is
    >> the most appropriate choice, I couldn't say, but you seem to have this
    >> strange idea that your 8-bit integers will somehow magically take up
    >> 64 bits of storage just because you're running on a Core2
    >> architecture. It's simply not the case.
    >
    > I'm glad you've flagged this up because  one of the reasons for my asking
    > the question was to increase my understanding of what I should and should
    > not be thinking about in this regard.
    >
    > So let me then ask: under the 64 bit architecture, will the standard C types
    > like int, char etc still be available

    Of course they will. Removing these types would render a C compiler useless.

    > and not give me problems under garbage
    > collection given I define them as strong?

    Garbage collection has nothing to do with integers, only pointers.
    There is no reason to define an int or char as strong.

    > Currently I'm defining most of my variables as type NSInteger and CGFloat.
    > Is that wrong?

    No, it isn't wrong, but it may be wasteful. My recommendation is to use
    NSInteger/NSUInteger and CGFloat for parameters, and local variables.
    However for things in structures or arrays, think carefully about
    whether or not you actually *need* a 64-bit type, otherwise, you could
    be wasting space.

    > Or should I be implementing my numbers as NSNumber? I thought the main
    > purpose of NSNumber was as a wrapper to enable us to put numbers into
    > NSArray etc. Would not using it as the main representation seriously affect
    > computation speed? What are other people doing?
    > If you have any good links or advice they'd be much appreciated.

    --
    Clark S. Cox III
    <clarkcox3...>
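
    To illustrate the space concern for structures and arrays, here is a
    small C sketch. The pixel records are hypothetical; int64_t stands
    in for what NSInteger becomes in a 64-bit build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record that uses a 64-bit field for every channel. */
struct wide_pixel {
    int64_t red, green, blue, alpha;
};

/* The same record using the smallest type that holds 0..255. */
struct compact_pixel {
    uint8_t red, green, blue, alpha;
};

/* Bytes needed for an array of `count` records of each kind. */
size_t wide_bytes(size_t count)
{
    assert(count <= SIZE_MAX / sizeof(struct wide_pixel));
    return count * sizeof(struct wide_pixel);
}

size_t compact_bytes(size_t count)
{
    assert(count <= SIZE_MAX / sizeof(struct compact_pixel));
    return count * sizeof(struct compact_pixel);
}
```

    For 1,600,000 records the wide layout needs 51,200,000 bytes and the
    compact one 6,400,000 bytes, an eightfold difference that costs
    nothing in local variables but adds up quickly in large arrays.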
  • On Mon, Jan 12, 2009 at 9:50 AM, julius <julius...> wrote:
    > Yes, but my understanding is that this will change when we go into a full 64
    > bit architecture and as a one man band I would prefer to write code that
    > anticipates the change than to have to change everything later.

    Well, that's wrong.

    > Also, the documentation
    > http://tinyurl.com/6ceoqz
    > leads me to believe that if I need an address space of more than 4GB then I
    > should be using 64 bit computing.

    It's true, but "64 bit computing" does not mean what you think it means.

    > So let me then ask: under the 64 bit architecture, will the standard C types
    > like int, char etc still be available and not give me problems under garbage
    > collection given I define them as strong?

    Yes, they are all identical to what they were before, with the
    exception of 'long', which becomes a 64-bit quantity. I don't know
    what your GC question is about, as GC only affects pointers.

    Let me briefly explain what "64-bit" is all about, because this seems
    to be the major point of confusion here.

    In a 32-bit processor, pointers are 32 bits long. Since 2^32 = 4
    billion and change, this means that you can address about 4GB of
    memory. Also, usually but not always, on a 32-bit processor the
    largest native integer quantity is 32 bits long. Usually there is
    software support for 64-bit integers but it suddenly becomes
    significantly slower because the CPU can only deal with 32 bits at a
    time.

    In a 64-bit processor, pointers are 64 bits long. Since 2^64 = really
    huge, that means you can address a really huge amount of memory (it's
    equal to 4 billion and change squared). You also get native support
    for 64-bit integers.

    That's it! Pointer size and native 64-bit integers are the only
    difference between the two! Your chars don't suddenly expand from 8
    bits to 64 bits. Your floats don't suddenly expand from 32 bits to 64
    bits. (On Mac OS X your longs do suddenly expand from 32 bits to 64
    bits, but int still gives you a 32-bit integer.) The only difference
    is the size of pointers and, sometimes, the ability to do native math
    on 64-bit integers.

    Mike
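
    The pointer-size arithmetic can be checked with a few lines of C.
    This is only a sketch with made-up function names; it reports the
    pointer width of whatever build it is compiled as.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Width of a data pointer in bits for the current build. */
unsigned pointer_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}

/* Bytes addressable with pointers `bits` wide, for bits below 64.
   2^64 itself does not fit in a uint64_t, hence the limit. */
uint64_t addressable_bytes(unsigned bits)
{
    assert(bits < 64);
    return UINT64_C(1) << bits;
}
```

    addressable_bytes(32) is 4,294,967,296, the "4 billion and change"
    mentioned above, while sizeof(char) stays 1 in either build.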
  • >> leads me to believe that if I need an address space of more than 4GB then I
    >> should be using 64 bit computing.
    >
    > True.

    Also note that loading of various runtime libraries will take up a big chunk
    of address space, so "needing an address space of more than 4GB" translates
    very roughly to "need to manipulate more than about 2GB of data".

    --
    Scott Ribe
    <scott_ribe...>
    http://www.killerbytes.com/
    (303) 722-0567 voice
  • On Jan 12, 2009, at 9:23 AM, Michael Ash wrote:

    > That's it! Pointer size and native 64-bit integers are the only
    > difference between the two!

    In addition to what Mike said, the transition from X86 to X86-64
    includes a few other benefits besides larger pointers and native
    integers. The number of registers was doubled, and the calling
    conventions were changed so that 80% of the time function/method
    arguments are stored in CPU registers instead of being placed in a
    four-byte-aligned position on the stack. And that 20% of cases only
    happen when you pass in a structure larger than 128 bits, or pass in
    an unaligned structure, or have a function that takes more than 6
    arguments.

    So typically a program ported from X86 to X86-64 will run just
    slightly faster, especially if the program passes around a lot of 64-
    bit arguments. This doesn't apply to the PPC64 architecture, which is
    almost unchanged from PPC, and so PPC64 programs are typically slower
    due to the extra overhead.

    Nick Zitzmann
    <http://www.chronosnet.com/>
  • Hi Julius,

    If I understand your problem correctly you are:
    1) processing a very large number of integers
    2) using highly optimized code, that is:
           a) you are manipulating the data directly via pointers
           b) the data in memory is expected to be in a specific
              order/structure
           c) the data is stored on disk in pure binary form,
              in the same format as in memory

    Several years back I optimized the code of a C program and gained a
    speed-up by a factor of 100 by doing the above and doing the pointer
    arithmetic by hand to access the data in the structure instead of
    using builtins and standard structures.

    So you do not need to worry about the size of your data, just how you
    access it. I had to have the program work on different architectures
    with different word sizes. The initial data were in text form, so the
    conversion to integer was easy. The trick was to use the sizeof
    operator to get the correct values for the pointer math.

    Stuffing two 32-bit values into a 64-bit value to avoid possible
    context switching is probably a very bad trade-off: handling such
    values and doing any kind of math with them will hurt you badly
    speed-wise, with no space savings.

    Of course, if you can do the math with bitwise operations directly
    you could process two integers at one time. But I do not know exactly
    what you are up to.

    Hope this helps.

    Keith.
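
    A minimal C sketch of storing integers on disk in their in-memory
    binary form, using sizeof for every size calculation. The function
    names are hypothetical, and 16-bit samples are only an example. Note
    that a file written this way is in the byte order of the machine
    that wrote it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Write `count` 16-bit samples exactly as they sit in memory.
   Returns the number of bytes written. */
size_t write_samples(FILE *file, const int16_t *samples, size_t count)
{
    assert(file != NULL && samples != NULL);
    return fwrite(samples, sizeof samples[0], count, file) * sizeof samples[0];
}

/* Read up to `count` samples back. Returns the number of samples read. */
size_t read_samples(FILE *file, int16_t *samples, size_t count)
{
    assert(file != NULL && samples != NULL);
    return fread(samples, sizeof samples[0], count, file);
}

/* Access by explicit pointer arithmetic: the compiler scales the offset
   by sizeof(int16_t), so base + index points at the same element as
   &base[index]. */
int16_t sample_at(const int16_t *base, size_t index)
{
    return *(base + index);
}
```

    Four 16-bit samples occupy 8 bytes on disk, in a 32-bit or a 64-bit
    build alike.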
  • On Jan 12, 2009, at 5:51 PM, Schultz Keith J. wrote:

    > Hi Julius,
    >
    > If I understand your problem correctly you are:
    > 1) processing a very large number of integers
    > 2) using highly optimized code, that is:
    >    a) you are manipulating the data directly via pointers
    >    b) the data in memory is expected to be in a specific
    >       order/structure
    >    c) the data is stored on disk in pure binary form,
    >       in the same format as in memory
    >
    > Several years back I optimized the code of a C program and gained a
    > speed-up by a factor of 100 by doing the above and doing the pointer
    > arithmetic by hand to access the data in the structure instead of
    > using builtins and standard structures.
    >
    > So you do not need to worry about the size of your data, just how
    > you access it. I had to have the program work on different
    > architectures with different word sizes. The initial data were in
    > text form, so the conversion to integer was easy. The trick was to
    > use the sizeof operator to get the correct values for the pointer
    > math.
    >
    > Stuffing two 32-bit values into a 64-bit value to avoid possible
    > context switching is probably a very bad trade-off: handling such
    > values and doing any kind of math with them will hurt you badly
    > speed-wise, with no space savings.
    >
    > Of course, if you can do the math with bitwise operations directly
    > you could process two integers at one time. But I do not know
    > exactly what you are up to.

    And of course you should take a look at the Apple documentation for
    this topic:

    Memory Usage Performance Guidelines
    <http://developer.apple.com/documentation/Performance/Conceptual/ManagingMemory/ManagingMemory.html>

    - Ken
  • On 12 Jan 09, at 14:51, Schultz Keith J. wrote:
    > Stuffing two 32-bit values into a 64-bit value to avoid possible
    > context switching is probably a very bad trade-off: handling such
    > values and doing any kind of math with them will hurt you badly
    > speed-wise, with no space savings.

    "Premature optimization is the root of all evil." -- Knuth

    Optimization for space, in this case. Simply declaring the array to
    contain 32-bit values is just as efficient in terms of storage space,
    and it will probably perform better (as well as being easier to write).

    > Of course, if you can do the math with bitwise operations directly
    > you could process two integers at one time. But I do not know
    > exactly what you are up to.

    If speed is actually a concern, vector operations are *much* more
    efficient for this sort of thing than any bitwise hacks could possibly
    be. Look up Accelerate.framework for details.
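
    To make the comparison concrete, here is a small C sketch with
    hypothetical function names. It does not use Accelerate.framework;
    it only shows that packing two 32-bit values into a 64-bit word
    takes the same space as a plain 32-bit array while adding shifts
    and masks to every access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack two 32-bit values into one 64-bit word, `high` in the top half. */
uint64_t pack_pair(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}

uint32_t unpack_high(uint64_t packed) { return (uint32_t)(packed >> 32); }
uint32_t unpack_low(uint64_t packed)  { return (uint32_t)(packed & 0xFFFFFFFFu); }

/* Sum a plain array of 32-bit values: one load and one add per element. */
uint64_t sum_plain(const uint32_t *values, size_t count)
{
    assert(values != NULL || count == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Sum the same data stored as packed pairs: extra shift and mask work. */
uint64_t sum_packed(const uint64_t *pairs, size_t pair_count)
{
    assert(pairs != NULL || pair_count == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < pair_count; i++)
        total += (uint64_t)unpack_high(pairs[i]) + unpack_low(pairs[i]);
    return total;
}
```

    Four values take 16 bytes in either layout, so the packed form buys
    no space and only complicates the arithmetic.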
  • Keith, hi,
    On 12 Jan 2009, at 22:51, Schultz Keith J. wrote:

    > Hi Julius,
    >
    > If I understand your problem correctly you are:
    > 1) processing a very large number of integers
    > 2) using highly optimized code, that is:
    >    a) you are manipulating the data directly via pointers
    >    b) the data in memory is expected to be in a specific
    >       order/structure
    >    c) the data is stored on disk in pure binary form,
    >       in the same format as in memory
    That's about right.
    I'm creating a painting system that I want eventually to make use of
    all the available pixels on large display panels, e.g. 1600 x 1000
    pixel resolution. The shape and contents of each brushstroke change
    over time and the contents themselves are complex, e.g. random
    patterns. Effectively there is very little redundancy both in each
    frame and in any image sequence. I don't know all the problems that
    trying to pump images of this size at say 15 or 25 fps onto the screen
    will entail so I'm adopting a gradualistic approach to program
    development. If I can, I want to anticipate difficulties well enough
    to at least maintain a stable overall program structure.
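
    A back-of-envelope sketch of the raw data rate for the figures above.
    The 4 bytes per pixel is an assumption (e.g. 32-bit RGBA) that is not
    stated anywhere in this thread, and frame_bandwidth is a hypothetical
    helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Raw bytes per second needed to redraw every pixel of every frame.
   bytes_per_pixel is an assumed value supplied by the caller. */
static inline uint64_t frame_bandwidth(uint64_t width, uint64_t height,
                                       uint64_t bytes_per_pixel, uint64_t fps)
{
    uint64_t bytes_per_frame = width * height * bytes_per_pixel;
    assert(bytes_per_frame > 0); /* guards against a zero dimension */
    return bytes_per_frame * fps;
}
```

    Under that assumption, frame_bandwidth(1600, 1000, 4, 25) comes to
    160,000,000 bytes per second and frame_bandwidth(1600, 1000, 4, 15)
    to 96,000,000 bytes per second, before any disk I/O or processing.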

    However, pushing this data out onto the screen is not the main
    problem. The main problem is that I need to have the picture with all
    its dynamics displayed as I am painting and that I want to keep my
    options when painting reasonably open, for instance to have the
    ability to edit stroke shape variation and colour variation after it
    has been painted. Essentially colour variation can be thought of as a
    movie. A very simple (13Mb) early example may be seen here:
    http://animatedpaint.co.uk/nestaMovies/pearCity4Half.mpg.
    Without going into details, I need to use lots of data, there's a fair
    bit of disk I/O and processing. With every increase in manipulative
    freedom and image complexity comes a corresponding increase in the
    data and processing requirement.
    >
    >
    > Several years back I had optimized the code of a C program and
    > gained a
    > speed bump by the factor of 100 by doing the above and doing the
    > pointer
    > arithmetic by hand for accessing the data in the structure instead
    > of using
    > builtins and standard structures.
    Yes, this can be a very good way to go. I was getting a bit scared off
    by my lack of familiarity with using GC on malloc'd data, though I'm
    over that now and it is in fact no problem. Working by oneself, with
    no one to discuss even the simplest of things, can blow anxieties up
    from molehills to veritable Everests.
    >
    >
    > So you do not need to worry about the size of your data just how you
    > access it,
    > I had to have the program work on different architectures with
    > different word sizes.
    > The initial data were in text form so the conversion to integer was
    > easy. The trick was
    > to use the sizeof function to get the correct values for the pointer
    > math.
    Right, I'll pay attention to these.
    >
    >
    > Far as stuffing two 32-bit values into a 64-bit value to avoid
    > possible context
    > switching is probably a very bad trade off as the handling to such
    > values and doing
    > any kind of math with will hurt you badly speed wise with no space
    > savings.
    Yes, which is why the earlier advice I received on using standard C
    types has been a big relief to me.
    >
    >
    > Of course if you can do the math with bitwise operation directly you
    > could process
    > two integers at one time. But, I do not know exactly what you are up
    > to.
    >
    > Hope this helps.
    >
    > Keith.
    >
    >
    Yes thanks loadsa.
    Essentially everything I'm doing is very simple. It is just that
    there's an awful lot of data and it could all become very complex
    indeed if I didn't continually struggle to stop it going that way.

    best wishes
    Julius

    http://juliuspaintings.co.uk
  • My thanks to all who replied to my query.
    Let me see if I can correctly summarise your advice.

    All references to Mac memory as GB or MB refer to standard 8-bit bytes.

    Mac 64-bit computing relates to the size of pointers into the address
    space and a number of native data types such as NSInteger, NSUInteger
    and CGFloat.

    When working in a 64-bit architecture there is no need to stop using
    standard C types such as char, int, float etc. These do not expand to
    fill 64-bit space but retain their accepted sizes. If one wants to be
    specific about the actual size of a type then use the types specified
    in the International Standard ISO/IEC 9899:
    http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
    For instance, quote: "The typedef name intN_t designates a signed
    integer type with width N, no padding bits, and a two’s complement
    representation. Thus, int8_t denotes a signed integer type with a
    width of exactly 8 bits."
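
    A minimal sketch of how one might check this on a given machine,
    assuming a C99 compiler with <stdint.h>. BITS_OF, check_widths,
    print_sizes and storage_bytes_int16 are hypothetical names used only
    for illustration:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Width in bits of a type, derived from sizeof and CHAR_BIT. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* The exact-width types keep their widths on 32-bit and 64-bit builds alike. */
static inline void check_widths(void)
{
    assert(BITS_OF(int8_t) == 8);
    assert(BITS_OF(int16_t) == 16);
    assert(BITS_OF(int32_t) == 32);
    assert(BITS_OF(int64_t) == 64);
}

/* Print the sizes that can differ between builds (long and pointers),
   next to int, for comparison on the current machine. */
static inline void print_sizes(void)
{
    printf("int:    %zu bytes\n", sizeof(int));
    printf("long:   %zu bytes\n", sizeof(long));
    printf("void *: %zu bytes\n", sizeof(void *));
}

/* Bytes occupied by n 16-bit integers: always 2 * n, never 8 * n. */
static inline size_t storage_bytes_int16(size_t n)
{
    return n * sizeof(int16_t);
}
```

    So a block of 2-byte integers written to disk from an int16_t array
    stays 2 bytes per element in a 64-bit build; only types such as long
    and pointers are likely to grow.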

    One can discover the ranges (and hence the sizes) of NSInteger and
    NSUInteger from the constants NSIntegerMin, NSIntegerMax, and
    NSUIntegerMax.

    It is a good idea to use NSInteger/NSUInteger and CGFloat for
    parameters and local variables.
    It is also good normally to use NSInteger etc and not worry unless
    working with large numbers of small integers that need to be packed
    into as little memory as possible.

    Also, regarding the transition from x86 to x86-64:
    The number of registers was doubled, and the calling conventions were
    changed so that about 80% of the time function/method arguments are
    passed in CPU registers instead of being placed in a four-byte-aligned
    position on the stack. The remaining 20% of cases only arise when you
    pass in a structure larger than 128 bits, pass in an unaligned
    structure, or have a function that takes more than 6 arguments.
    Thus a program ported from x86 to x86-64 will run slightly faster,
    especially if it passes around a lot of 64-bit arguments.

    I was glad to have my confusion about GC corrected, viz. in the
    context of GC one need only concern oneself with how the pointers are
    defined, e.g. to use __strong only on pointers to standard C types
    rather than, as I was doing, scattering __strongs amongst all my chars
    and ints. Clearly I need to get out more.

    I had not come across the following bit of documentation:
    http://developer.apple.com/documentation/Darwin/Conceptual/64bitPorting/intro/chapter_1_section_1.html


    I'd been using
    http://developer.apple.com/documentation/Cocoa/Conceptual/Cocoa64BitGuide/Introduction/chapter_1_section_1.html#//apple_ref/doc/uid/TP40004247-CH1-DontLinkElementID_26
    and had not seen the link to the former at the bottom of the page.

    It is also a good idea to revisit Memory Usage Performance Guidelines
    http://developer.apple.com/documentation/Performance/Conceptual/ManagingMemory/ManagingMemory.html


    Runtime libraries use a lot of address space so "needing an address
    space of more than 4GB" translates very roughly to "need to manipulate
    more than about 2GB of data".

    A useful optimisation technique is to do pointer arithmetic manually
    rather than rely on built-ins and standard structures. If one is
    working in different size architectures then use sizeof to get the
    correct sizes for pointer arithmetic.
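
    A minimal sketch of what that can look like, assuming a hypothetical
    4-byte Pixel record and a hypothetical helper sum_red. The step size
    and field offset come from sizeof and offsetof rather than hard-coded
    numbers, so the loop stays correct if the record layout changes.
    Whether this is actually faster than plain array indexing with a
    modern optimising compiler is something to measure, not assume:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pixel record, for illustration only. */
typedef struct {
    uint8_t r, g, b, a;
} Pixel;

/* Sum the red channel by stepping a byte pointer sizeof(Pixel) at a time. */
static inline uint32_t sum_red(const void *buffer, size_t count)
{
    assert(buffer != NULL || count == 0);

    const unsigned char *p = buffer;
    const unsigned char *end = p + count * sizeof(Pixel);
    uint32_t total = 0;

    for (; p != end; p += sizeof(Pixel)) {
        total += p[offsetof(Pixel, r)]; /* read the r field of this record */
    }
    return total;
}
```

    Called on an array of three Pixel values whose red components are 1,
    2 and 3, sum_red returns 6.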

    Finally I was advised to avoid premature optimisation and bit packing.

    So my thanks to you all.
    I return to the fray with mind much eased.
    Thanks again
    Julius

    http://juliuspaintings.co.uk