Displaying 20 results from an estimated 1100 matches similar to: "Vectorizing structure reads, writes, etc on X86-64 AVX"
2015 Nov 03
2
Vectorizing structure reads, writes, etc on X86-64 AVX
Thank you for your reply. FWIW, I wrote the .ll by hand after taking
the C program, using clang to emit the llvm and seeing the memcpy. The
memcpy version that clang generates gets compiled into assembly that
uses the large sequence of movs and does not use the vector hardware
at all. When I started debugging, I took that clang produced .ll and
started to write it different ways trying to get
2015 Nov 03
2
Vectorizing structure reads, writes, etc on X86-64 AVX
----- Original Message -----
> From: "Sanjay Patel via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "Jay McCarthy" <jay.mccarthy at gmail.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Tuesday, November 3, 2015 12:30:51 PM
> Subject: Re: [llvm-dev] Vectorizing structure reads, writes, etc on X86-64 AVX
>
> If the
2015 Nov 04
2
Vectorizing structure reads, writes, etc on X86-64 AVX
Hi Jay -
I see the slow, small accesses using an older clang [Apple LLVM version
7.0.0 (clang-700.1.76)], but this looks fixed on trunk. I made a change
that comes into play if you don't specify a particular CPU:
http://llvm.org/viewvc/llvm-project?view=revision&revision=245950
$ ./clang -O1 -mavx copy.c -S -o -
...
movslq %edi, %rax
movq _spr_dynamic@GOTPCREL(%rip),
2007 Oct 15
2
clipping off words inside a vector of strings
Hi,
I have a vector of strings (class character) with 6 elements (length
6). I call it 'names'.
"Graham Chapman"
"John Cleese"
"Terry Gilliam"
"Eric Idle"
"Terry Jones"
"Michael Palin"
And I want to turn it into another vector of strings called
'shortnames' with the same length.
The new vector should look like:
2005 Oct 14
2
Wherefore SELinux?
I've tried to wrap my head around SELinux several times, and so far, all I've
managed to do is rap my head, and then somewhat painfully.
Is there a detailed "SELinux Unleashed" style book anybody can recommend?
Would anybody who has read O'Reilly's book care to comment?
Link:
2013 May 24
1
Libcurl.so.3: wherefore art thou?
I am utterly stumped. I need libcurl.so.3 for CentOS 6.4.
I have Googled everything and I have been going all over the Internet
looking for a solution. I give up; I need help. The error message I
get when I try to run a program is:
error while loading shared libraries: libcurl.so.3: cannot open shared
object file: No such file or directory
I have:
Package libcurl-7.19.7-36.el6_4.i686 already
2005 Oct 12
3
Wherefore whitebox?
Well,
I'm a recent convert from WBEL. My biggest concern with CentOS is that the
community here seems to want to be more than a recompile of RHEL.
But WBEL is floundering, what with Katrina and Rita, and there really being
only 1 developer behind it, etc.
I offer an automated shell script to switch from WBEL4 to CentOS4 (easy, it's
hosted on my home DSL line!) It assumes that
2010 Jul 14
1
convert data.frame to matrix -> NA factor problem
Hi list,
I tried to convert a data.frame into a matrix using data.matrix.
Unfortunately my matrix contains missing values (NA) wherefore all columns
including NA's were changed into factors.
I thought that many people stumbled across that problem already wherefore
there must be a simple solution. But it seems there isn't. I tried lapply,
replace and other things in combination but still
2008 Feb 16
0
arris tm502g cablemodem FXS ports and zaptel 1.4.8
Hi there,
I have a cablemodem, ARRIS brand, model tm502G. It has two FXS ports.
I was wondering if anyone has details about the correct signalling of
these FXS ports when connected to original X100p.
Tests:
fxsks on the zapata.conf and zaptel.conf files. From my cellphone I
call the ARRIS, it starts ringing but the zap channel sees no call
coming in.
fxsls on the zapata.conf and zaptel.conf
2008 Apr 12
2
Wherefore is FUSE?
Tonight, I tried to roll out fuse on my CentOS 4 production system. (in order
to use GlusterFS)
I have two identical servers, and one took, the other didn't.
How simple could this be?
# yum install yum-plugin-priorities
# yum install rpmforge-release
# yum install fuse dkms-fuse
both of these seem to work. Yet I run
[root@kepler drivers]# modprobe fuse
FATAL: Module fuse not
2008 Mar 21
3
Oh inheritance, wherefore art thou
Is there any way to achieve inheritance of specifications?
Suppose I had an action like
def index
  filter = params[:filter] ? params[:filter] : '%'
  order = params[:order] ? params[:order] : 'created_at DESC'
  @posts = Post.find(:conditions => "title LIKE #{filter}", :order => order)
  ...
end
Now I might have half a dozen specs for this
2012 Jan 09
2
[LLVMdev] Calling conventions for YMM registers on AVX
I'll explain what we see in the code.
1. The caller saves XMM registers across the call if needed (according to DEFS definition).
YMMs are not in the set, so caller does not take care.
2. The callee preserves XMMs but works with YMMs and clobbers them.
3. So after the call, the upper part of YMM is gone.
- Elena
-----Original Message-----
From: Bruno Cardoso Lopes [mailto:bruno.cardoso at
2012 Jan 08
2
[LLVMdev] Calling conventions for YMM registers on AVX
Hi,
What are the calling conventions for YMM? According to the documents I have seen so far, the YMMs are scratch and not saved by the callee.
This is also the default behavior of the Intel Compiler.
In X86InstrControl.td the YMMs are not in "defs" set of call.
- Elena
---------------------------------------------------------------------
Intel Israel (74) Limited
This e-mail and any attachments
2012 Jan 09
0
[LLVMdev] Calling conventions for YMM registers on AVX
Hi,
> What is the calling conventions for YMM. According to documents I saw till now, the YMMs are scratch and not saved in callee.
> This is also the default behavior of the Intel Compiler.
Non-Windows x86_64 targets use the rules defined in the x86_64 ABI.
> In X86InstrControl.td the YMMs are not in "defs" set of call.
The XMMs are subregisters of YMMs, and they are in the
2012 Jan 09
0
[LLVMdev] Calling conventions for YMM registers on AVX
On Jan 8, 2012, at 11:18 PM, Demikhovsky, Elena wrote:
> I'll explain what we see in the code.
> 1. The caller saves XMM registers across the call if needed (according to DEFS definition).
> YMMs are not in the set, so caller does not take care.
This is not how the register allocator works. It saves the registers holding values; it doesn't care which alias is clobbered.
Are you
2012 Jan 09
3
[LLVMdev] Calling conventions for YMM registers on AVX
On Jan 9, 2012, at 10:00 AM, Jakob Stoklund Olesen wrote:
>
> On Jan 8, 2012, at 11:18 PM, Demikhovsky, Elena wrote:
>
>> I'll explain what we see in the code.
>> 1. The caller saves XMM registers across the call if needed (according to DEFS definition).
>> YMMs are not in the set, so caller does not take care.
>
> This is not how the register allocator
2012 Jan 10
0
[LLVMdev] Calling conventions for YMM registers on AVX
This is the wrong code:
declare <16 x float> @foo(<16 x float>)
define <16 x float> @test(<16 x float> %x, <16 x float> %y) nounwind {
entry:
  %x1 = fadd <16 x float> %x, %y
  %call = call <16 x float> @foo(<16 x float> %x1) nounwind
  %y1 = fsub <16 x float> %call, %y
  ret <16 x float> %y1
}
./llc -mattr=+avx
2011 Nov 30
0
[PATCH 2/4] x86/emulator: add emulation of SIMD FP moves
Clone the existing movq emulation to also support the most fundamental
SIMD FP moves.
Extend the testing code to also exercise these instructions.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -629,6 +629,60 @@ int main(int argc, char **argv)
else
2013 Apr 09
1
[LLVMdev] inefficient code generation for 128-bit->256-bit typecast intrinsics
Hello,
LLVM generates two additional instructions for 128->256 bit typecasts
(e.g. _mm256_castsi128_si256()) to clear out the upper 128 bits of the YMM register corresponding to the source XMM register.
vxorps xmm2,xmm2,xmm2
vinsertf128 ymm0,ymm2,xmm0,0x0
Most of the industry-standard C/C++ compilers (GCC, Intel's compiler, Visual Studio compiler) don't
generate any extra moves
2009 Apr 30
2
[LLVMdev] RFC: AVX Feature Specification
I've been working on adding AVX to LLVM and have run across a number of
questions. Here's the first one.
In some ways AVX is "just another" SSE level. Having AVX implies you have
SSE1-SSE4.2. However AVX is very different from SSE and there are a number
of sub-features which may or may not be available on various implementations.
So right now I've done this:
def