similar to: Use Galois field New Instructions (GFNI) to combine affine instructions


2020 May 18
2
Use Galois field New Instructions (GFNI) to combine affine instructions
On 5/18/20 8:24 PM, Craig Topper wrote: > I can tell you that your avx512 issue is that v64i8 gfni instructions also > require avx512bw to be enabled to make v64i8 a supported type. The C > intrinsics handling in the front end knows this rule. But since you > generated your own intrinsics you bypassed that. Indeed that's the issue... I was sticking with what Intel announces here
2016 Sep 19
2
RFC: New intrinsics masked.expandload and masked.compressstore
Hi all, the AVX-512 ISA introduces new vector instructions VCOMPRESS and VEXPAND in order to allow vectorization of the following loops with two specific types of cross-iteration dependencies: Compress: for (int i=0; i<N; ++i) if (t[i]) *A++ = expr; Expand: for (i=0; i<N; ++i) if (t[i]) X[i] = *A++; else
2016 Sep 25
5
RFC: New intrinsics masked.expandload and masked.compressstore
| |Hi Elena, | |Technically speaking, this seems straightforward. | |I wonder, however, how target-independent this is in a practical |sense; will there be an efficient lowering when targeting any other |ISA? I don't want to get into the territory where, because the |vectorizer is supposed to be architecture independent, we need to |add target-independent intrinsics for all
2016 Sep 26
2
RFC: New intrinsics masked.expandload and masked.compressstore
| |How would this work in this case? The result would need to affect the |legality and cost of the memory instruction. From your poster, it looks |like we're talking about loops with constructs like this: | |for (i =0; i < N; i++) { | if (topVal > b[i]) { | *dst = a[i]; | dst++; | } |} | |is this loop vectorizable at all without these constructs? Good
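The loop quoted in the excerpt above is the "compress" pattern the RFC is about. As a point of reference, here is a minimal scalar sketch of it in C. The function name `compress_store`, its signature, and the returned element count are illustrative assumptions and do not come from the thread; only the loop body follows the quoted code.

```c
#include <stddef.h>

/* Scalar reference for the quoted loop: copy a[i] into the next free slot
   of dst whenever topVal > b[i]. The name, signature, and return value are
   hypothetical; only the loop body mirrors the excerpt. */
size_t compress_store(int *dst, const int *a, const int *b, size_t n, int topVal)
{
    int *out = dst;
    for (size_t i = 0; i < n; i++) {
        if (topVal > b[i]) {
            *out = a[i];   /* store the selected element contiguously */
            out++;         /* the cross-iteration dependency: dst advances only on a hit */
        }
    }
    return (size_t)(out - dst);  /* number of elements written */
}
```

The conditional increment of the output pointer is what makes the loop hard to vectorize with ordinary masked stores: the destination of each store depends on how many earlier iterations passed the test.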
2020 Jul 23
1
New x86-64 micro-architecture levels
Hello, On Wed, 22 Jul 2020, Mallappa, Premachandra wrote: > > That's deliberate, so that we can use the same x86-* names for 32-bit library selection (once we define matching micro-architecture levels there). > > Understood. > > > If numbers are out, what should we use instead? > > x86-sse4, x86-avx2, x86-avx512? Would that work? > > Yes please, I think
2020 Oct 08
4
__attribute__((apple_abi)): targeting Apple/ARM64 ABI from Linux (and others)
Hello everyone, I made a quick patch to clang/llvm to introduce an "apple_abi" function attribute (https://github.com/aguinet/llvm-project/commit/c4905ded3afb3182435df30e527955031cb0d098), to be able to compile functions for the Apple ARM64 ABI when targeting other ARM64 OSes (e.g. Linux). This can be seen as the Apple version of the already existing "ms_abi" attribute. In
2020 Jun 25
2
How to implement load/store for vector predicate register
Hi there, I am writing a backend, and I have run into a problem. We don't have load/store instructions for vector predicate registers (vpr for short). The hardware has 64 vector registers (vr for short) and 8 vector predicate registers, and there are no move instructions between vr and vpr. vr supports many operations, and vpr supports vpror, vprxor, vprand and vprinv operations. A vr has 512 bits, and
2008 Apr 01
3
Xen without APIC
Hi, I am trying to boot Xen on top of a simulator. I am having problems with the APIC module in my simulator, not related to Xen. So, I want to boot Xen without APIC support. Is there a way to disable APIC support in Xen? I am fine even if Xen is restricted to a uni-processor environment. Thanks, Bhaskar _______________________________________________ Xen-users mailing list
2020 Jun 26
2
How to implement load/store for vector predicate register
Hi, I am planning to expand the pseudo instructions in XXXTargetLowering::EmitInstrWithCustomInserter(), and use temporary virtual registers as operands. If I use virtual registers, do I need to mark them as "early clobber"? I saw that virtual registers are sometimes marked as "early clobber" in EmitInstrWithCustomInserter() in the MIPS backend. What is the effect of marking a
2017 Jun 17
5
LoopVectorize fails to vectorize loops with induction variables with PtrToInt/IntToPtr conversions
Hello all, There is a missed vectorization opportunity with clang 4.0 with the file attached. Indeed, when compiled with -O2, the "op_distance" function gets vectorized, but not the "op" one. For information, this test case has been reduced from a file generated by the Pythran compiler (https://github.com/serge-sans-paille/pythran). If we take a look at the generated
2018 Feb 05
0
LLVM Weekly - #214, Feb 5th 2018
LLVM Weekly - #214, Feb 5th 2018 ================================ If you prefer, you can read a HTML version of this email at <http://llvmweekly.org/issue/214>. Welcome to the two hundred and fourteenth issue of LLVM Weekly, a weekly newsletter (published every Monday) covering developments in LLVM, Clang, and related projects. LLVM Weekly is brought to you by [Alex
2008 Mar 12
6
Time is off by an hour in my XEN vm
Hello, I'm hiring a XEN virtual machine running Ubuntu at a hosting company. My XEN virtual machine is hosted on a server which has some other VMs running on it. They all use Ubuntu or Debian. After a crash sometime last week, the system clock of my VM is off by an hour (it says 19:49, although it's 18:49 here now). The other VMs don't have
2020 Apr 22
3
_ExtInt, LLVM integers and constant time
> On Apr 22, 2020, at 12:24 AM, Roman Lebedev via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > On Wed, Apr 22, 2020 at 9:35 AM Adrien Guinet via llvm-dev > <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> Hello everyone, >> >> After reading the nice blog post about _ExtInt, I was wondering whether >>
2012 Oct 19
1
OpenSSH and Galois/Counter mode i.e. GCM
Hello, Are there any known efforts to implement RFC 5647 i.e. AES Galois Counter Mode for the Secure Shell Transport Layer Protocol for OpenSSH? If not, would OpenSSH project be interested in such feature? Thanks.
2019 Apr 12
3
Generating C headers from LLVM
Hi List, is there any way to generate proper C header files for functions that are defined in LLVM-IR? My current attempts fail when clang does some fancy transformations (to adhere to some ABIs ??), e.g., for returning a struct. For example the declaration typedef struct {int64_t a; int64_t b; int64_t c;} test; test create_test(void); yields the LLVM code %struct.test = type { i64, i64, i64 }
2020 Jul 21
7
New x86-64 micro-architecture levels
* Premachandra Mallappa: > [AMD Public Use] > > Hi Florian, > >> I'm including a proposal for the levels below. I use single letters for them, but I expect that the concrete implementation of this proposal will use >> names like “x86-100”, “x86-101”, like in the glibc patch referenced above. (But we can discuss other approaches.) > > Personally I am not a big
2018 Feb 21
1
Finding and replacing instruction patterns
Hi all -- first time poster, hoping that this is going to the right list. Also a complete LLVM newbie, so please correct any glaring errors in my understanding. I am an architecture researcher at Penn State working on Processing in Memory (PIM) architectures. Currently, I plan to use LLVM to detect and replace groups of instructions which can be accelerated in memory. Once a group of
2008 Mar 19
1
Problems with Samba - Domain not reachable
I'm in despair. When trying to join a domain I get an error: "Domain not reachable". But to my mind the samba configuration is OK. The logs give no hint of an error. testparm smb.conf works fine. I created a shortcut to log on to the server, and that works (via IP and via name). But I cannot join the domain. Does anyone have any advice on where the problem could be? To my mind it is a
2019 Apr 18
3
Opt plugin linkage
The fundamental problem here is that opt doesn’t use ExecutionEngine (because it has no need to), so trying to use ExecutionEngine (or any other bit of llvm that opt doesn’t use for that matter) in an opt plugin isn’t going to work. The solution I’d go with would be to build llvm with shared libraries (use -DBUILD_SHARED_LIBS=ON on the cmake command) then link the plugin against ExecutionEngine.
2019 Apr 16
2
Opt plugin linkage
Hey: I spent some time debugging this; it seems that editing ``llvm/tools/opt.cpp`` and moving ``cl::ParseCommandLineOptions(argc, argv, "llvm .bc -> .bc modular optimizer and analysis printer\n");`` to the beginning of main() solved it for me. I'm not sure if this is a bug on the LLVM side. Zhang ------------------ Original ------------------ From: "Viktor Was BSc via