Displaying 20 results from an estimated 8000 matches similar to: "Getting SSE2 instructions to work in 32-bit builds on Windows"
2006 Feb 23
2
Problems building R 2.2.1 with libgoto and SSE2 enabled
Hi,
I am trying to build R 2.2.1 with Kazushige Goto's BLAS library (libgoto) and
encountered a problem: I have two computers with almost identical
hardware (P4 Northwood CPU, i875 chipset, 2GB DDR400 RAM) and identical Linux
OS. I have the latest version of libgoto for this CPU installed on both boxes
(libgoto_northwood32p-r1.00.so) and I am using gcc compiler flags "-O2
2020 May 18
0
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
I think this is a great patch but, in my view, an even better way to tackle
the fundamental problem (the performance limitations) is to use a much
faster checksum like xxhash, as has been suggested before:
https://lists.samba.org/archive/rsync/2019-October/031975.html
Cheers,
Filipe
On Mon, 18 May 2020 at 17:08, Jorrit Jongma via rsync <rsync at lists.samba.org>
wrote:
> This drop-in
2020 May 20
0
[PATCHv2] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
Would it perhaps make sense to have a "--disable-sse2/3" command-line
switch in rsync, too, at least for some time until this is
considered "rock solid"?
I dislike having automatic CPU feature switching code in a tool which
needs to be reliable for me. This new optimization may have issues, and
without such a switch it can't easily be worked around without replacing
2020 May 18
2
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
I don't disagree that MD5 could (or even should) be replaced so it is
no longer the bottleneck in several real-world cases (including mine).
However this patch is not for MD5 performance, rather for the rolling
checksum rsync uses to match blocks on existing files on both ends to
reduce transfer size.
On Mon, May 18, 2020 at 5:44 PM Filipe Maia via rsync
<rsync at lists.samba.org>
2023 Aug 31
1
Problems with installing R packages from source and running C++ in R, even on fresh R installation
> So starting a new Rcmd.exe process fails for some reason.
>
> If you take the same R session where the environment variables are
> right and Sys.which() resolves Make and GCC and try to run
> tools:::.shlib_internal(c('-n', 'hello.c')) or
> tools:::.shlib_internal('hello.c'), does it do something useful?
I think I tried the commands in the right R
2023 Apr 05
1
path to rtools not updated in R 4.2.3 - line 1: gcc: command not found
Dear listers,
I have updated to rtools43 and, using R 4.2.3, I was surprised not
to be able to compile packages needing compilation when updating.
Looks like the path given in
gcc -I"C:/PROGRA~1/R/R-42~1.3/include" -DNDEBUG -DNTIMER
-I./SuiteSparse_config -DUSE_FC_LEN_T
-I"C:/rtools42/x86_64-w64-mingw32.static.posix/include" -O2 -Wall
-std=gnu99 -mfpmath=sse
2020 Feb 09
0
Development version of R fails tests and is not installed
On Sat, Feb 8, 2020 at 9:27 AM Berwin A Turlach
<berwin.turlach at gmail.com> wrote:
>
> G'day all,
>
> I have daily scripts running to install the patched version of the
> current R version and the development version of R on my linux box
> (Ubuntu 18.04.4 LTS).
>
> The last development version that was successfully compiled and
> installed was "R Under
2020 May 18
6
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
This drop-in patch increases the performance of the get_checksum1()
function on x86-64.
On the slow target CPU, performance of the function increased by nearly
50% in the x86-64 default SSE2 mode, and by nearly 100% if the
compiler was told to enable SSSE3 support. The increase was over 200%
on the fastest CPU tested in SSSE3 mode.
Transfer time improvement with large files existing on both ends
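For readers unfamiliar with the rolling checksum this thread is about, here is a minimal scalar sketch, assuming an Adler-32-style pair of running sums packed into 32 bits. The names weak_checksum and weak_checksum_roll are hypothetical, bytes are treated as unsigned for simplicity, and this is neither the rsync source nor the SSE2/SSSE3 code from the patch; it only illustrates the kind of loop such a patch vectorizes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: weak checksum of a block.
 * s1 is the sum of the bytes, s2 is the sum of the running s1 values.
 * Both are kept modulo 2^16 and packed as (s2 << 16) | s1. */
uint32_t weak_checksum(const unsigned char *buf, size_t len)
{
    uint32_t s1 = 0, s2 = 0;
    for (size_t i = 0; i < len; i++) {
        s1 += buf[i];
        s2 += s1;
    }
    return (s1 & 0xffff) | (s2 << 16);
}

/* Hypothetical helper: slide a window of `len` bytes one position to the
 * right, dropping byte `out` and appending byte `in`, without rescanning. */
uint32_t weak_checksum_roll(uint32_t sum, size_t len,
                            unsigned char out, unsigned char in)
{
    uint32_t s1 = sum & 0xffff;
    uint32_t s2 = sum >> 16;
    s1 = (s1 - out + in) & 0xffff;
    s2 = (s2 - (uint32_t)len * out + s1) & 0xffff;
    return s1 | (s2 << 16);
}
```

Rolling the checksum of "abc" forward by one byte (dropping 'a', appending 'd') gives the same value as computing the checksum of "bcd" from scratch, which is what makes the checksum cheap to evaluate at every offset of a file.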
2020 May 19
5
[PATCHv2] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
I've read up some more on the subject, and it seems the proper way to
do this with GCC is g++ and target attributes. I've refactored the
patch that way, and it indeed uses SSSE3 automatically on supporting
CPUs, regardless of the build host, so this should be ideal both for
home builders and distros.
Getting the code to build right in c++ mode (checksum_sse2.cpp only)
was a bit of an
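The target-attribute approach mentioned in this post can be sketched in plain C as follows. This is an illustration only, assuming GCC or Clang on x86: the function names sum16_ssse3, sum16_scalar and sum16 are hypothetical, and the dispatch is written by hand with __builtin_cpu_supports rather than relying on the compiler's C++ function multiversioning. It is not the code from the patch.

```c
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <tmmintrin.h>
#define SKETCH_HAVE_X86_TARGET 1

/* Compiled with SSSE3 enabled for this one function only, regardless of
 * the flags used for the rest of the file. Sums 16 unsigned bytes. */
__attribute__((target("ssse3")))
uint32_t sum16_ssse3(const uint8_t *p)
{
    __m128i v = _mm_loadu_si128((const __m128i *)p);
    /* unsigned bytes times 1, adjacent pairs added: 8 x 16-bit sums */
    __m128i pairs = _mm_maddubs_epi16(v, _mm_set1_epi8(1));
    /* 16-bit values times 1, adjacent pairs added: 4 x 32-bit sums */
    __m128i quads = _mm_madd_epi16(pairs, _mm_set1_epi16(1));
    /* horizontal add of the four 32-bit lanes */
    quads = _mm_add_epi32(quads, _mm_srli_si128(quads, 8));
    quads = _mm_add_epi32(quads, _mm_srli_si128(quads, 4));
    return (uint32_t)_mm_cvtsi128_si32(quads);
}
#endif

/* Portable fallback with the same result. */
uint32_t sum16_scalar(const uint8_t *p)
{
    uint32_t s = 0;
    for (int i = 0; i < 16; i++)
        s += p[i];
    return s;
}

/* Runtime dispatch: use the SSSE3 version only if the running CPU has it. */
uint32_t sum16(const uint8_t *p)
{
#ifdef SKETCH_HAVE_X86_TARGET
    if (__builtin_cpu_supports("ssse3"))
        return sum16_ssse3(p);
#endif
    return sum16_scalar(p);
}
```

Because the SSSE3 code path is selected at run time, a binary built this way does not depend on the build host's CPU, which is the property the post describes as useful for both home builders and distros.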
2020 May 18
3
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
What do you base this on?
Per https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html :
"For the x86-32 compiler, you must use -march=cpu-type, -msse or
-msse2 switches to enable SSE extensions and make this option
effective. For the x86-64 compiler, these extensions are enabled by
default."
That reads to me like we're fine for SSE2. As stated in my comments,
SSSE3 support must be
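One way to confirm what the quoted GCC documentation says for a given build is to test the compiler's predefined macros. The sketch below assumes GCC/Clang's __SSE2__ macro and MSVC's _M_X64 and _M_IX86_FP macros; the function name sse2_enabled_at_compile_time is hypothetical.

```c
/* Returns 1 if this translation unit was compiled with SSE2 code
 * generation enabled, 0 otherwise. This reflects compiler flags only,
 * not the CPU the program later runs on. */
int sse2_enabled_at_compile_time(void)
{
#if defined(__SSE2__)
    /* GCC/Clang: set by -msse2 or a suitable -march, and by default on x86-64 */
    return 1;
#elif defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    /* MSVC: x64 always has SSE2; on 32-bit x86, /arch:SSE2 or higher */
    return 1;
#else
    return 0;
#endif
}
```

On a 32-bit GCC build without -msse2 or a suitable -march this would be expected to return 0, while a default x86-64 build returns 1.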
2016 Dec 03
1
Questions about libFLAC and SSE/SSE2/...
Erik de Castro Lopo wrote:
> lvqcl.mail wrote:
>> now. Removing OS check will greatly simplify src/libFLAC/cpu.c.
>
> That makes sense.
Should I post a patch that removes OS check and keeps only CPU check?
>> 2.
>> "configure" build system adds -msse2 option by default. It means that
>> x86 (32-bit) library won't work on older, non-SSE2
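The CPU-only check discussed in this thread can be sketched with the CPUID instruction. This is a minimal illustration, assuming GCC or Clang and their <cpuid.h> header; the function names cpu_has_sse and cpu_has_sse2 are hypothetical and this is not the code in src/libFLAC/cpu.c. It deliberately performs no OS check.

```c
#include <stdbool.h>

#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
#include <cpuid.h>
#define SKETCH_HAVE_CPUID 1
#endif

/* Reads the EDX feature flags from CPUID leaf 1; returns 0 if unavailable. */
static unsigned int cpuid_leaf1_edx(void)
{
#ifdef SKETCH_HAVE_CPUID
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return edx;
#endif
    return 0;
}

/* CPUID.1:EDX bit 25 reports SSE support. */
bool cpu_has_sse(void)
{
    return (cpuid_leaf1_edx() & (1u << 25)) != 0;
}

/* CPUID.1:EDX bit 26 reports SSE2 support. */
bool cpu_has_sse2(void)
{
    return (cpuid_leaf1_edx() & (1u << 26)) != 0;
}
```

A library built without -msse2 could call such a check once at startup and select SSE2 routines only when it returns true, so that the same 32-bit binary still runs on older CPUs.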
2016 Dec 02
4
Questions about libFLAC and SSE/SSE2/...
1.
A program can use SSE instructions only if both CPU and OS support SSE.
Currently libFLAC tests both CPU and OS for this support, but is it really
necessary? Maybe CPU check is enough? Operating systems that don't support
SSE (Win95, WinNT 4.0, Linux kernel 2.2 (iirc), ...) are really outdated
now. Removing OS check will greatly simplify src/libFLAC/cpu.c.
2.
"configure" build
2017 Jan 09
0
accelerating matrix multiply
> From: "Cohn, Robert S" <robert.s.cohn at intel.com>
>
> I am using R to multiply some large (30k x 30k double) matrices on a
> 64 core machine (xeon phi). I added some timers to
> src/main/array.c to see where the time is going. All of the time is
> being spent in the matprod function, most of that time is spent in
> dgemm. 15 seconds is in matprod in
2016 Dec 03
0
Questions about libFLAC and SSE/SSE2/...
lvqcl.mail wrote:
> 1.
> A program can use SSE instructions only if both CPU and OS support SSE.
> Currently libFLAC tests both CPU and OS for this support, but is it really
> necessary? Maybe CPU check is enough? Operating systems that don't support
> SSE (Win95, WinNT 4.0, Linux kernel 2.2 (iirc), ...) are really outdated
> now. Removing OS check will greatly simplify
2014 Mar 11
2
x86_64 SSE2/SSE41 optim not used
Hi Guys,
In stream_decoder.c, when assigning the lpc restore function,
only IA32 processors benefit from the SSE2 and SSE4.1 optimizations.
Shouldn't this be the case for x86_64 processors as well?
Thanks,
--
Olivier TRISTAN
uvi.net
2015 Apr 10
2
[PATCH] configure: only use -mstackrealign for mingw32
On Fri, Apr 10, 2015 at 1:40 PM, lvqcl <lvqcl.mail at gmail.com> wrote:
> Tristan Matthews wrote:
>
> > if test "x$asm_optimisation$sse_os" = "xyesyes" ; then
> > XIPH_ADD_CFLAGS([-msse2])
> > - XIPH_ADD_CFLAGS([-mstackrealign])
> > + if test "$host_os" = "mingw32" ; then
>
2018 Nov 27
1
Subsetting row in single column matrix drops names in resulting vector
Dmitriy Selivanov (selivanov.dmitriy at gmail.com) wrote:
> Consider following example:
>
> a = matrix(1:2, nrow = 2, dimnames = list(c("row1", "row2"), c("col1")))
> a[1, ]
> # 1
>
> It returns *unnamed* vector `1` where I would expect named vector. In fact
> it returns named vector when number of columns is > 1.
> Same issue applicable
2015 Apr 10
3
[PATCH] configure: only use -mstackrealign for mingw32
Tristan Matthews wrote:
> ---
> configure.ac | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/configure.ac b/configure.ac
> index eb9b0cc..e7d68c3 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -399,9 +399,11 @@ if test x$ac_cv_c_compiler_gnu = xyes ; then
>
> if test "x$asm_optimisation$sse_os" =
2015 Jul 14
0
Two bugs showing up mostly on SPARC systems
On 14/07/2015 6:08 PM, Radford Neal wrote:
> In testing pqR on Solaris SPARC systems, I have found two bugs that
> are also present in recent R Core versions. You can see the bugs and
> fixes at the following URLs:
>
> https://github.com/radfordneal/pqR/commit/739a4960a4d8f3a3b20cfc311518369576689f37
Thanks for the report. Just one followup on this one:
There are two sections
2015 Apr 10
2
[PATCH] configure: only use -mstackrealign for mingw32
---
configure.ac | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/configure.ac b/configure.ac
index eb9b0cc..4347c07 100644
--- a/configure.ac
+++ b/configure.ac
@@ -399,9 +399,10 @@ if test x$ac_cv_c_compiler_gnu = xyes ; then
if test "x$asm_optimisation$sse_os" = "xyesyes" ; then
XIPH_ADD_CFLAGS([-msse2])
- XIPH_ADD_CFLAGS([-mstackrealign])
+