Altivec Linear Algebra Help

  • Hi,
    I'm part of the Mac team porting an open source 3D graphics library to
    OS X.  It has its own built-in linear algebra package.  I was
    wondering whether, instead of using its own matrix multiplication calls
    (etc.), I could use the AltiVec ones.

    From what I've read about AltiVec, the structure of the
    matrices and vectors would have to be changed into one recognized by
    AltiVec, otherwise we'd lose tons of speed just transforming data
    around.  If anyone has any advice, or comments on whether or not this
    is worthwhile, I'd be very grateful.  I'd really like to make this
    library great for OS X and this seems like a good way to start.

    Thanks!
    gabe taubman
  • Here are a couple of links you might find helpful:

    1) http://developer.apple.com/hardware/ve/vector_libraries.html
    2) http://hpc.sourceforge.net/
    3) http://developer.apple.com/hardware/ve/index.html

    #1 introduces the Apple vector/matrix libraries, which are AltiVec-accelerated
    #2 has a reference to macstl, which has accelerated numerics classes for C++
    #3 is the Velocity Engine / AltiVec homepage

    good luck!

    On Sun, 17 Oct 2004 00:38:48 -0400, gabe taubman <gtaubman...> wrote:
    > Hi,
    > I'm part of the mac team porting an open source 3d graphics library to
    > OS X.  It has its own built in linear algebra package.  I was
    > wondering if instead of using its own matrix multiplication calls
    > (etc) I could use the altivec ones instead.
    >
    > From what I've read about altivec, is that the structure of the
    > matrices and vectors would have to be changed into ones recognized by
    > altivec, otherwise we'd lose tons of speed just transforming data
    > around.  If anyone has any advice, or comments on whether or not this
    > is worthwhile I'd be very grateful.  I'd really like to make this
    > library great for OS X and it seems this is a good way to start.
    >
    > Thanks!
    > gabe taubman
    > _______________________________________________
    > MacOSX-dev mailing list
    > <MacOSX-dev...>
    > http://www.omnigroup.com/mailman/listinfo/macosx-dev
    >
  • Thanks for the links!  I was also wondering if anyone knows if this
    would be a useful way of utilizing AltiVec.  For instance, the
    programs people write with this library are most likely not
    multiplying thousands of matrices in a row.  However, it being a
    graphics package, they will be multiplying LOTS of vectors by matrices
    one by one, just not all in a bunch.

    On the Vector Libraries page, I read the section about CBLAS and it
    seems to have Matrix*Vector and similar routines.  So let's say that I
    comment out the function overload for Matrix * Vector and replace it
    with one that uses the CBLAS version (gemv, I think) and callers invoke
    it just once.  Will there be any significant speedup, or is it just
    not worth it?

    Thanks!
    gabe

    On Sun, 17 Oct 2004 16:52:14 +1000, Steven Marcus
    <steven.marcus...> wrote:
    > Here are a couple of links you might find helpful:
    >
    > 1) http://developer.apple.com/hardware/ve/vector_libraries.html
    > 2) http://hpc.sourceforge.net/
    > 3) http://developer.apple.com/hardware/ve/index.html
    >
    > #1 introduces the Apple vector/matrix libraries which are altivec accelerated
    > #2 has a reference to macstl which has accelerated numerics classes for C++
    > #3 is the Velocity Engine / Altivec homepage
    >
    > good luck!
  • So you've got some matrix M and you want to operate on a bunch of
    vectors v1,v2,...,vn. You're worried about the overhead of calling the
    BLAS function n times to act on each vector in turn?

    If that's a problem for you (and you can easily benchmark it to find
    out how much of a problem it is), then why don't you just store all
    your vectors contiguously to form a matrix V such that the ith column
    is the ith vector? Make one BLAS call to multiply matrix M by matrix V,
    and then the ith column of the resulting matrix will be the result
    you'd have got from multiplying M by your ith vector.
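
    As a rough sketch of that idea (plain reference loops rather than a
    real BLAS call, with hypothetical helper names transform_one and
    transform_batch, and assuming 4x4 matrices stored column-major), the
    batched product gives the same columns as the one-at-a-time version.
    In a real port the triple loop in transform_batch is the part you
    would hand to a single gemm call such as cblas_sgemm with
    CblasColMajor:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major storage: element (row r, column c) of a 4-row matrix
// lives at data[c * 4 + r]. All names here are illustrative only.

// y = M * v for one 4x4 matrix and one 4-vector.
std::vector<float> transform_one(const std::vector<float>& m,
                                 const std::vector<float>& v)
{
    assert(m.size() == 16 && v.size() == 4);
    std::vector<float> y(4, 0.0f);
    for (std::size_t c = 0; c < 4; ++c)
        for (std::size_t r = 0; r < 4; ++r)
            y[r] += m[c * 4 + r] * v[c];
    return y;
}

// R = M * V, where V packs n vectors contiguously as its columns.
// Column i of the result equals transform_one(m, vector i).
std::vector<float> transform_batch(const std::vector<float>& m,
                                   const std::vector<float>& packed)
{
    assert(m.size() == 16 && packed.size() % 4 == 0);
    const std::size_t n = packed.size() / 4;
    std::vector<float> result(packed.size(), 0.0f);
    for (std::size_t i = 0; i < n; ++i)          // each packed vector
        for (std::size_t c = 0; c < 4; ++c)      // each column of M
            for (std::size_t r = 0; r < 4; ++r)  // each row of M
                result[i * 4 + r] += m[c * 4 + r] * packed[i * 4 + c];
    return result;
}
```

    Whether the single batched call actually beats n small calls is
    something only a benchmark on your data will tell you.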

    If you're really paranoid about speed then you might want to look at
    ATLAS <http://math-atlas.sourceforge.net/>. I'm not sure what the state
    of play is nowadays, but there was a point when it produced a faster
    BLAS implementation than the one in Apple's Accelerate framework.

    Richard.

    On 17 Oct 2004, at 19:23, gabe taubman wrote:

    > Thanks for the links!  I was also wondering if anyone knows if this
    > would be a useful way of utilizing AltiVec.  For instance, the
    > programs people write with this library are most likely not
    > multiplying thousands of matrices in a row.  However, it being a
    > graphics package, they will be multiplying LOTS of vectors by matrices
    > one by one, just not all in a bunch.
    >
    > On the Vector Libraries page, I read the stuff about cBLAS and it
    > seems to have Matrix*Vector and stuff like that.  So let's say that i
    > comment out the function overload for Matrix * Vector and I replace it
    > with one that uses the cBLAS version (gemv I think it is) and they do
    > it just once.  Will there be any significant speed up or is it just
    > not worth it?
    >
    > Thanks!
    > gabe
  • On 18/10/2004, at 2:23 AM, gabe taubman wrote:

    > Thanks for the links!  I was also wondering if anyone knows if this
    > would be a useful way of utilizing AltiVec.  For instance, the
    > programs people write with this library are most likely not
    > multiplying thousands of matrices in a row.  However, it being a
    > graphics package, they will be multiplying LOTS of vectors by matrices
    > one by one, just not all in a bunch.
    >
    > On the Vector Libraries page, I read the stuff about cBLAS and it
    > seems to have Matrix*Vector and stuff like that.  So let's say that i
    > comment out the function overload for Matrix * Vector and I replace it
    > with one that uses the cBLAS version (gemv I think it is) and they do
    > it just once.  Will there be any significant speed up or is it just
    > not worth it?
    >
    > Thanks!
    > gabe

    I'm the miscreant responsible for macstl, listed on the HPC page.

    One of the advantages of macstl is minimal overhead and maximal
    flexibility when calling functions, since all functions are inlined as
    opposed to being in a code library. You can also use valarrays
    transparently and portably and not have to touch Altivec code unless
    you want to, e.g. you can use the slice arrays to step through a
    valarray and it will still use Altivec to optimally fetch the data. You
    can write your own functions in C++ to call through to macstl and if
    they inline, they will be very fast e.g.

    // Optimally fetches data from slices of v1 and v2, and multiplies,
    // all within AltiVec.
    valarray <float> v1, v2;
    valarray <float> v3 = v1 [slice (2, 3, 5)] * v2 [slice (1, 3, 7)];
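
    For comparison, here is a minimal sketch of the same strided multiply
    written against plain std::valarray only, so it says nothing about
    macstl's internals or how it vectorizes. The function name
    strided_product is hypothetical, and the size checks are my own
    assumption about what a caller would want:

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Multiplies three strided elements of a by three strided elements of b.
// std::slice(start, size, stride) selects start, start + stride, ...
std::valarray<float> strided_product(const std::valarray<float>& a,
                                     const std::valarray<float>& b)
{
    // The slices below touch a[2], a[7], a[12] and b[1], b[8], b[15].
    assert(a.size() > 12 && b.size() > 15);

    // Slicing a const valarray yields the selected elements by value.
    const std::valarray<float> lhs = a[std::slice(2, 3, 5)];
    const std::valarray<float> rhs = b[std::slice(1, 3, 7)];

    // Element-wise product of the two three-element arrays.
    return lhs * rhs;
}
```

    Note that both source arrays need enough elements for the largest
    index each slice touches (12 and 15 here); slicing past the end is
    undefined behaviour.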

    Here's a direct link:

    http://www.pixelglow.com/macstl/

    I'm going to break one of the cardinal rules of software marketing --
    "promise less, deliver more" :-) ... version 0.2 is on the way and it
    will work portably on SSE2 as well. It will also have fast integer and
    accurate division and square root algorithms, complex number
    arithmetic, transparent constant generation, fused multiply-add
    optimizations, alias safety etc. Licensing will likely change to
    open-source RPL + a paid-up proprietary license for those who don't
    want to reciprocate their code.

    Cheers, Glen Low

    ---
    pixelglow software | simply brilliant stuff
    www.pixelglow.com