Jean-Marc Valin wrote:
> Duane Storey wrote:
>> Actually, it might just be an OS "feature". On most linux and mac
>> platforms, the memory managers align memory on proper boundaries -- this
>> doesn't occur on most versions of windows. I don't have all the code in
>> front of me, but it's possible that it's simply a side effect of windows not
>> aligning the memory, and an implicit assumption in the speex code that it
>> will have proper alignment.
>
> I actually doubt it's the OS (as much as I hate Windows). After all,
> it's the compiler that manages the stack and needs to ensure alignment.
>

On Win32, the OS aligns the stack to the native pointer size, meaning 4
bytes on most machines.

MS and Intel's compilers explicitly align the stack with an "and $-16" on
the stack pointer in every function that uses SSE. GCC uses a smarter
scheme and has the caller align the stack, so it is aligned to 16 bytes on
every function entry. This is better, as it allows the hardware to pipeline
moves to the stack without having to wait for the and instruction to
retire. However, this scheme breaks if the stack wasn't aligned to start
with.

The MinGW runtime aligns the stack before entering main(), so all
functions called from the main() thread are aligned. If you create
threads on your own, you'll have to align the stack yourself before you
call your thread entry function.
Come to think of it, that last paragraph should probably be in the Speex
Win32 FAQ, if there is one.

Anyway. If you have the "OS misaligned stack and you didn't correct it"
problem, the thread will crash on the very first SSE load it does, as that
load is unaligned. Since GCC maintains 16-byte alignment relative to the
starting SP, either all stack loads are aligned or they are all unaligned.

The case here was that GCC ignored the 16-byte alignment requirement of
__m128 when it was inside a union.
I found a simple fix though. Instead of

    union { float bla[4]; __m128 blah; }

use

    union { __m128 blah; float bla[4]; }

In other words, move the __m128 so it's the first entry in the union.

I've been unable to replicate this behaviour on GCC 4.1 or 4.2 running on
Ubuntu, so it seems to be specific to the MinGW build I'm using, which
means that one of the many configure options for GCC is causing this. I
don't really have time to figure out which, and as long as the above fix
works, I'm happy enough.
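[Editorial sketch, not part of the original message.] The reordering fix above relies on the compiler taking the union's alignment from its members. A more defensive variant would state the 16-byte requirement on the union itself. The snippet below is a minimal sketch under assumptions: `vec4`, `is_aligned16` and `sum_doubled` are hypothetical names, `v4sf` is a stand-in for `__m128` so that the code builds without the SSE headers, and it has not been tried against the MinGW build discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for __m128: a 16-byte vector of four floats (GCC/Clang extension). */
typedef float v4sf __attribute__((vector_size(16)));

/* Hypothetical type name. aligned(16) puts the requirement on the union
   itself, so it does not depend on which member is listed first. */
typedef union {
    float f[4];
    v4sf  v;
} __attribute__((aligned(16))) vec4;

/* Compile-time check (C11): the build fails if the alignment is dropped. */
_Static_assert(_Alignof(vec4) == 16, "vec4 must be 16-byte aligned");

/* Returns 1 if p lies on a 16-byte boundary, 0 otherwise. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Copies four floats into a stack-allocated union, doubles them with one
   vector add, and returns the sum of the four lanes. */
static float sum_doubled(const float *in)
{
    vec4 t;
    assert(is_aligned16(&t));   /* sanity check on the stack slot */
    for (int i = 0; i < 4; i++)
        t.f[i] = in[i];
    t.v = t.v + t.v;            /* element-wise vector add */
    return t.f[0] + t.f[1] + t.f[2] + t.f[3];
}
```

The static assertion catches a compiler that drops the type's alignment. It says nothing about a stack that was misaligned before the function was entered, which is the separate thread-entry problem described earlier.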
On 8/23/07, Thorvald Natvig <speex@natvig.com> wrote:
> Jean-Marc Valin wrote:
> > Duane Storey wrote:
> >> Actually, it might just be an OS "feature". On most linux and mac
> >> platforms, the memory managers align memory on proper boundaries -- this
> >> doesn't occur on most versions of windows. I don't have all the code in
> >> front of me, but it's possible that it's simply a side effect of windows not
> >> aligning the memory, and an implicit assumption in the speex code that it
> >> will have proper alignment.
> >
> > I actually doubt it's the OS (as much as I hate Windows). After all,
> > it's the compiler that manages the stack and needs to ensure alignment.
> >
>
> On Win32, the OS aligns the stack to the native pointer size. Meaning 4
> bytes for most machines.
>
> MS and Intel's compilers explicitly align the stack by "and $-16" it for
> every function that uses SSE. GCC uses a smarter scheme and has the
> caller align the stack so it's aligned to 16 bytes on every function
> entry. This is better as it allows the hardware to pipeline moves to the
> stack without having to wait for the and instruction to retire. However,
> this scheme breaks if the stack wasn't aligned to start with.
>
> The MinGW runtime aligns the stack before entering main(), so all
> functions called from the main() thread are aligned. If you create
> threads on your own, you'll have to align the stack yourself before you
> call your thread entry function.
> Come to think of it, that last paragraph should probably be in the Speex
> Win32 FAQ if there is one.
>
> Anyway. IF you have the "OS misaligned stack and you didn't correct it"
> problem, the thread will crash on the very first SSE load it does, as it
> is unaligned. Since GCC maintains 16-byte alignment relative to the
> starting SP, either ALL stack loads are aligned or they are all unaligned.

FFmpeg had problems with misaligned SSE stack variables under Windows too.
Now that GCC 4.2 is available on Windows, they solved it by setting
__attribute__((force_align_arg_pointer)) on publicly visible functions.
You can find their patch and a short description in this mail:
http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2007-August/034010.html

Speex could follow a similar approach to solve this problem.

--
Regards,
Alexander Chemeris.

SIPez LLC.
SIP VoIP, IM and Presence Consulting
http://www.SIPez.com
tel: +1 (617) 273-4000
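[Editorial sketch, not part of the original message.] The approach described above could look roughly like the following. This is a sketch under assumptions, not the FFmpeg patch itself: `REALIGN_STACK` and `public_entry_scratch_aligned` are hypothetical names, and the macro expands to nothing outside 32-bit x86 GCC-compatible builds, where the attribute is not needed.

```c
#include <stdint.h>

/* Hypothetical wrapper macro: apply the attribute only on 32-bit x86 with a
   GCC-compatible compiler, and expand to nothing everywhere else. */
#if defined(__GNUC__) && defined(__i386__)
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

/* Hypothetical public entry point. With the attribute, the function
   realigns the stack on entry instead of trusting its caller. */
REALIGN_STACK
int public_entry_scratch_aligned(void)
{
    /* A local that needs 16-byte alignment, as an SSE temporary would. */
    float scratch[4] __attribute__((aligned(16))) = {0.0f, 0.0f, 0.0f, 0.0f};

    /* Returns 1 if the local landed on a 16-byte boundary. */
    return ((uintptr_t)scratch & 15u) == 0;
}
```

Only functions that can be reached from foreign code (exported API functions, thread entry points, callbacks) would need the macro. Functions called from already-aligned code inherit the alignment through GCC's caller-side scheme.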
> FFmpeg had problems with misaligned SSE stack variables under
> Windows too. Now, with gcc 4.2 available on Windows they solved it
> with __attribute__((force_align_arg_pointer)) set for publicly visible
> functions. You may find their patch and short description in this mail:
> http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2007-August/034010.html
>
> Speex may follow similar way to solve this problem.

Looks a bit scary (and likely to cause problems in the future). Not quite
sure how to handle that. My first idea would be to abort compilation if
someone is trying to use SSE in an environment that has this bug, and
print a message saying "if you really know what you're doing and want to
live dangerously, do that". Otherwise, is there a compiler option to
force alignment on every single function? It's a bit of overkill, but
since Speex is really small, I don't think it'd be an issue.

	Jean-Marc
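[Editorial sketch, not part of the original message.] The "abort compilation" idea could be expressed as a preprocessor guard along the following lines. This is a sketch under assumptions: `SPEEX_ALLOW_UNSAFE_SSE` is a hypothetical opt-in macro, not an existing Speex option, and `sse_path_enabled` is a hypothetical helper. On the compiler-option question, GCC documents an x86 option, `-mstackrealign`, that realigns the stack at function entry. Whether the MinGW build discussed in this thread supports it has not been established here.

```c
/* Hypothetical guard: refuse to build the SSE path with GCC on 32-bit
   Windows unless the builder explicitly opts in. */
#if defined(_WIN32) && defined(__GNUC__) && defined(__i386__) && \
    defined(__SSE__) && !defined(SPEEX_ALLOW_UNSAFE_SSE)
#error "SSE with GCC on Win32 may see misaligned stacks. Define SPEEX_ALLOW_UNSAFE_SSE to build anyway."
#endif

/* Reports whether the SSE code path is compiled in (1) or not (0). */
static int sse_path_enabled(void)
{
#if defined(__SSE__)
    return 1;
#else
    return 0;
#endif
}
```

The guard only fires for the combination reported in this thread, so builds on other platforms are unaffected.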