SSE2 - not all gold is good for you

A few users reported that SSW won't run on some AMD and Intel processors. After looking at the crash dump submitted by one user, I figured out that the culprit was an SSE2 instruction, like this one:

movsd   xmm0,mmword ptr [webalizer!_real (0045e290)]
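
One way to guard against this kind of crash is to ask the CPU whether it supports SSE2 before running any SSE2 code. The following is only a minimal sketch, not something SSW does: it assumes GCC or Clang on x86 and uses the `__get_cpuid` helper from `<cpuid.h>`; the name `cpu_has_sse2` is made up for illustration. On other compilers or architectures it simply reports "not supported".

```c
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

/*
 * Hypothetical helper: returns 1 if the CPU reports SSE2 support,
 * 0 if it does not or if support cannot be determined here.
 */
int cpu_has_sse2(void)
{
#if HAVE_X86_CPUID
    unsigned int eax, ebx, ecx, edx;

    /* CPUID leaf 1 returns the feature flags in ECX and EDX. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;

    /* EDX bit 26 is the SSE2 flag (bit 25 is SSE). */
    return (int)((edx >> 26) & 1u);
#else
    return 0;
#endif
}
```

A check like this only helps if the code that performs it is itself compiled without SSE2, since otherwise the program may fault before the check runs.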

I decided to make a special build so that people could run SSW on older architectures, and spent some time last weekend creating new build configurations. Once I was done, I wanted to check how much slower SSW would run with SSE2 disabled, so I ran a small test.

Much to my surprise, the build with SSE2 disabled processed a 100 MB log file about 750 records per second faster on my old 1.7 GHz P4. I ran a few more tests and saw a consistent 3% improvement in processing speed with streaming instructions disabled. I also tried switching to SSE, but the result was the same.
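
To put the two figures above side by side, the relative change between two throughput measurements can be computed as below. This is just an illustrative sketch; the function name is hypothetical, and the post does not state absolute throughput, so any baseline value plugged in is an assumption.

```c
/*
 * Hypothetical helper: percentage change of `candidate` relative to
 * `baseline`, both given in records per second. A positive result
 * means the candidate build is faster.
 */
double speed_change_percent(double baseline, double candidate)
{
    if (baseline <= 0.0)
        return 0.0;   /* avoid dividing by zero on a bad measurement */
    return (candidate - baseline) / baseline * 100.0;
}
```

For example, if the SSE2 build ran at an assumed 25,000 records per second, a build that is 750 records per second faster would come out at roughly 3%.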

Needless to say, I removed the new configuration and released SSW with streaming instructions disabled. Once again, this proved that there can be too much of a good thing!

Comments: