bugzilla-daemon at bugzilla.mindrot.org
2010-Apr-09 22:18 UTC
[Bug 1753] New: Use -funroll-loops with umac.c
https://bugzilla.mindrot.org/show_bug.cgi?id=1753

           Summary: Use -funroll-loops with umac.c
           Product: Portable OpenSSH
           Version: -current
          Platform: Itanium
        OS/Version: Other
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Build system
        AssignedTo: unassigned-bugs at mindrot.org
        ReportedBy: imorgan at nas.nasa.gov

By default, umac.c is compiled with -O2 and performs well on x86 and x86_64 architectures. However, on other architectures the performance can be improved by adding -funroll-loops.

Using Ted Krovetz's original code, the performance for 1KB blocks on various architectures (clocks per byte) is as follows:

            gcc -O2    gcc -O2 -funroll-loops
x86_64:     0.95       1.04
IA64:       2.31       1.36
SPARC:      9.52       9.50
POWER5:     3.88       3.67

The architecture that benefits the most from this is IA64. A memory-to-memory test using ssh on a 1.5 GHz Itanium system shows an improvement of approximately 9 MB/s: 128 MB/s with just -O2 and 137 MB/s when -funroll-loops is added.

It may be worthwhile adding the following to Makefile.in:

umac.o: umac.c
	$(CC) $(CFLAGS) -funroll-loops $(CPPFLAGS) -c $?

Admittedly, this would be slightly detrimental to x86_64, and for architectures other than IA64 the benefit appears to be marginal.

--
Configure bugmail: https://bugzilla.mindrot.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.
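To put the clocks-per-byte figures from the report in relative terms, the following is a minimal sketch that computes the percentage reduction in clocks per byte for one pair of measurements. The helper name speedup_pct is hypothetical and is not part of OpenSSH; the inputs are the numbers quoted in the report.

```shell
#!/bin/sh
# Sketch: percentage reduction in clocks per byte when going from the
# baseline build to the unrolled build. speedup_pct is a hypothetical
# helper used only for illustration.
#   $1 = clocks/byte with gcc -O2
#   $2 = clocks/byte with gcc -O2 -funroll-loops
# A positive result means the unrolled build is faster, a negative
# result means it is slower.
speedup_pct() {
    awk -v base="$1" -v unrolled="$2" \
        'BEGIN { printf "%.1f\n", (base - unrolled) / base * 100 }'
}
```

Called as "speedup_pct 2.31 1.36" for IA64 this prints 41.1, while "speedup_pct 0.95 1.04" for x86_64 prints -9.5, which matches the report's reading that IA64 gains the most and x86_64 regresses slightly. SPARC and POWER5 come out at 0.2 and 5.4 respectively.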
bugzilla-daemon at bugzilla.mindrot.org
2010-Apr-10 00:25 UTC
[Bug 1753] Use -funroll-loops with umac.c
https://bugzilla.mindrot.org/show_bug.cgi?id=1753

Darren Tucker <dtucker at zip.com.au> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dtucker at zip.com.au

--- Comment #1 from Darren Tucker <dtucker at zip.com.au> 2010-04-10 10:25:36 EST ---
(In reply to comment #0)
> umac.o: umac.c
>         $(CC) $(CFLAGS) -funroll-loops $(CPPFLAGS) -c $?

Well, the first problem with that is that not all compilers understand -funroll-loops, so it'll stop it compiling with anything that's not gcc (or pretending to be gcc).

Can you just do "./configure --with-cflags=-funroll-loops"? Or does that have a detrimental impact elsewhere?
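The portability concern raised here can be illustrated with a small probe that only passes -funroll-loops along when the compiler accepts it. This is a minimal sketch, not how OpenSSH's configure script works; unroll_flag_for is a hypothetical helper, and the probe relies solely on the compiler's exit status.

```shell
#!/bin/sh
# Sketch: print "-funroll-loops" if the given compiler command accepts
# the flag, otherwise print nothing. unroll_flag_for is a hypothetical
# helper and is not part of OpenSSH's build system.
#   $1 = compiler command (for example gcc or cc)
unroll_flag_for() {
    cc=$1
    tmpdir=$(mktemp -d) || return 1
    # Smallest possible translation unit to compile with the flag.
    printf 'int main(void) { return 0; }\n' > "$tmpdir/probe.c"
    # Only the exit status is checked; a compiler that merely warns
    # about an unknown option and still exits 0 would pass this probe.
    if "$cc" -funroll-loops -c "$tmpdir/probe.c" -o "$tmpdir/probe.o" \
        >/dev/null 2>&1; then
        printf '%s\n' "-funroll-loops"
    fi
    rm -rf "$tmpdir"
}
```

One possible way to use it, assuming the compiler is named by $CC and falls back to cc:

    flag=$(unroll_flag_for "${CC:-cc}")
    ./configure ${flag:+--with-cflags="$flag"}

With this guard, --with-cflags is passed only when the probe printed the flag, so a compiler that rejects -funroll-loops gets a plain ./configure run.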
bugzilla-daemon at bugzilla.mindrot.org
2010-Apr-10 01:22 UTC
[Bug 1753] Use -funroll-loops with umac.c
https://bugzilla.mindrot.org/show_bug.cgi?id=1753

--- Comment #2 from Iain Morgan <imorgan at nas.nasa.gov> 2010-04-10 11:22:29 EST ---
Ah, yes, I hadn't considered the (lack of) portability with -funroll-loops.

I haven't seen any detrimental effects from adding it for the entire build process, but thought it might be better to take a more surgical approach. Using --with-cflags should do the trick.

I've been using this performance tweak on Itanium for the past two and thought that it was about time to pass on the observation to the community. I have to admit that I almost didn't submit this bug when I saw the marginal benefit on other architectures.