Displaying 12 results from an estimated 12 matches for "threadripper".

2019 Apr 05
0
Deep Replicable Bug With AMD Threadripper MultiCore
On 4 April 2019 at 17:28, ivo welch wrote: | The following program is whittled down from a much larger program that | always works on Intel, and always works on AMD's Threadripper with | lapply but not mclapply. With mclapply on AMD, all processes go into | "suspend" mode and the program then hangs. This bug is replicable on an | AMD Ryzen Threadripper 2950X 16-Core Processor (128GB RAM), running | latest Ubuntu 18.04. The R version 3.5.3 (2019-03-11) -- "Gr...
2019 Apr 05
2
Deep Replicable Bug With AMD Threadripper MultiCore
The following program is whittled down from a much larger program that always works on Intel, and always works on AMD's Threadripper with lapply but not mclapply. With mclapply on AMD, all processes go into "suspend" mode and the program then hangs. This bug is replicable on an AMD Ryzen Threadripper 2950X 16-Core Processor (128GB RAM), running latest Ubuntu 18.04. The R version 3.5.3 (2019-03-11) -- "Great Trut...
2023 Aug 27
1
Issue with gc() on Ubuntu 20.04
...is lowered self.pct to 99.36. Not much there. After some pondering, I added an options(gc.auto=Inf) at the beginning of each function, not resetting it at exit, but expecting the offending function(s) to plead guilty. Not so, although it did lower the gc() time to 95.84%. This was on a 16-core Threadripper 1950X box so I was intending to use library parallel but I tried it on my lowly Windows box that is years old and got it down to 88.07%. The only thing I can think of is that there are quite a lot of cases where a function is generated on the fly as in: eval(parse(t=paste("dprob <- fu...
2023 Aug 27
1
Issue with gc() on Ubuntu 20.04
On Sun, 27 Aug 2023 19:54:23 +0100 John Logsdon <j.logsdon at quantex-research.com> wrote: > Not so although it did lower the gc() time to 95.84%. > > This was on a 16 core Threadripper 1950X box so I was intending to > use library parallel but I tried it on my lowly windows box that is > years old and got it down to 88.07%. Does the Windows box have the same version of R on it? > The only thing I can think of is that there are quite a lot of cases > where a functio...
2018 Oct 30
1
IBM buying RedHat
> On 2018-10-30 02:46, Simon Matter wrote: >>> On 10/29/18 1:55 AM, Simon Matter wrote: >>>> To me it seems like, if they are smart, they will try to push IBM >>>> POWER >>>> and RedHat Linux together to establish real competition in the >>>> hardware >>>> market again (and of course don't forget to keep Fedora/CentOS
2020 May 18
6
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
...vements may still be seen when + * matching chunks from NVMe storage even on newer CPUs. + * + * Benchmarks C SSE2 SSSE3 + * - Intel Atom D2700 550 MB/s 750 MB/s 1000 MB/s + * - Intel i7-7700hq 1850 MB/s 2550 MB/s 4050 MB/s + * - AMD ThreadRipper 2950x 2900 MB/s 5600 MB/s 8950 MB/s + * + * This optimization for get_checksum1 is intentionally limited to x86-64 as + * no 32-bit CPU was available for testing. As 32-bit CPUs only have half the + * available xmm registers, this optimized version may not be faster than the + * pure C vers...
2020 May 18
0
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
...> + * matching chunks from NVMe storage even on newer CPUs. > + * > + * Benchmarks C SSE2 SSSE3 > + * - Intel Atom D2700 550 MB/s 750 MB/s 1000 MB/s > + * - Intel i7-7700hq 1850 MB/s 2550 MB/s 4050 MB/s > + * - AMD ThreadRipper 2950x 2900 MB/s 5600 MB/s 8950 MB/s > + * > + * This optimization for get_checksum1 is intentionally limited to x86-64 > as > + * no 32-bit CPU was available for testing. As 32-bit CPUs only have half > the > + * available xmm registers, this optimized version may not be f...
2020 May 19
5
[PATCHv2] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
...vements may still be seen when + * matching chunks from NVMe storage even on newer CPUs. + * + * Benchmarks C SSE2 SSSE3 + * - Intel Atom D2700 550 MB/s 750 MB/s 1000 MB/s + * - Intel i7-7700hq 1850 MB/s 2550 MB/s 4050 MB/s + * - AMD ThreadRipper 2950x 2900 MB/s 5600 MB/s 8950 MB/s + * + * This optimization for get_checksum1() is intentionally limited to x86-64 + * as no 32-bit CPU was available for testing. As 32-bit CPUs only have half + * the available xmm registers, this optimized version may not be faster than + * the pure C ve...
2020 May 18
2
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
...hunks from NVMe storage even on newer CPUs. >> + * >> + * Benchmarks C SSE2 SSSE3 >> + * - Intel Atom D2700 550 MB/s 750 MB/s 1000 MB/s >> + * - Intel i7-7700hq 1850 MB/s 2550 MB/s 4050 MB/s >> + * - AMD ThreadRipper 2950x 2900 MB/s 5600 MB/s 8950 MB/s >> + * >> + * This optimization for get_checksum1 is intentionally limited to x86-64 as >> + * no 32-bit CPU was available for testing. As 32-bit CPUs only have half the >> + * available xmm registers, this optimized version may no...
2020 May 20
0
[PATCHv2] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
...> + * matching chunks from NVMe storage even on newer CPUs. > + * > + * Benchmarks C SSE2 SSSE3 > + * - Intel Atom D2700 550 MB/s 750 MB/s 1000 MB/s > + * - Intel i7-7700hq 1850 MB/s 2550 MB/s 4050 MB/s > + * - AMD ThreadRipper 2950x 2900 MB/s 5600 MB/s 8950 MB/s > + * > + * This optimization for get_checksum1() is intentionally limited to x86-64 > + * as no 32-bit CPU was available for testing. As 32-bit CPUs only have half > + * the available xmm registers, this optimized version may not be faster th...
2020 May 18
3
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
What do you base this on? Per https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html : "For the x86-32 compiler, you must use -march=cpu-type, -msse or -msse2 switches to enable SSE extensions and make this option effective. For the x86-64 compiler, these extensions are enabled by default." That reads to me like we're fine for SSE2. As stated in my comments, SSSE3 support must be
2020 May 22
2
[PATCH] Optimized assembler version of md5_process() for x86-64
...mes. + * The MD5_CTX structure as expected here (from OpenSSL) is binary compatible + * with the md_context used by rsync, for the fields accessed. + * + * Benchmarks (in MB/s) C ASM + * - Intel Atom D2700 302 334 + * - Intel i7-7700hq 351 376 + * - AMD ThreadRipper 2950x 728 784 + * + * The original code was also incorporated into OpenSSL. It has since been + * modified there. Those changes have not been made here due to licensing + * incompatibilities. Benchmarks of those changes on the above CPUs did not + * show any significant difference in perfo...