CS 202 - MMX and SSE links
By Daniel Scharstein
The Pentium assembly language discussed so far in class (and in the
BO text) is really Intel's IA-32 (or i386) instruction set, which
dates back to 1985. Below is some information on the MMX and SSE
additions to the i386 instruction set. First, some links to overall
info:
MMX stands for "multi-media extensions" (or "matrix-math
extensions"?). It reuses the 8 floating point registers as additional
64-bit integer registers mm0 - mm7. Each register can be divided
into eight bytes, four 16-bit words, two 32-bit doublewords, or left
undivided as a single 64-bit quadword.
Most MMX instructions start with p and end with a letter representing
the data type they operate on (b=bytes, w=16-bit words, d=32-bit
doublewords, q=64-bit quadword). Some links:
SSE stands for "streaming SIMD extensions". It uses 8 NEW 128-bit
floating point registers xmm0 - xmm7, each of which holds four 32-bit
floats. Unlike with MMX, you can make the gcc compiler generate code
that uses SSE instructions for floating point operations, using the
options "-msse -mfpmath=sse".
- Wikipedia article
- My own examples for compiling with "-mfpmath=sse":
fadd.c,
faddi.c
- My timing results for adding arrays of 4 and N floats:
addF4main.c,
addF4fp.c,
addF4fp.s,
addF4sse.c,
addF4sse.s,
addF4mysse.s
addFmain.c,
addFfp.c,
addFfp.s,
addFsse.c,
addFsse.s,
addFmysse.s
- Dr. Dobb's introduction to SSE, part 1;
part 2;
part 3
- Intel's "Getting started with SSE/SSE2 for Pentium 4"
- Tommesani's SSE Primer
- Optimizing MILC Math Routines with SSE
- Example C program using SSE instructions by Indrek Kruusa
SSE2 / SSE3 (since Pentium 4, 2000 / 2004)
SSE2 extends SSE to support 64-bit floats, as well as 64-, 32-, 16- and
8-bit ints, thus essentially making MMX obsolete (SSE2 adapts the
integer MMX instructions to the new xmm registers). SSE3 adds a few
more instructions.