Displaying 20 results from an estimated 315 matches for "elena".

2016 Feb 26
2
how to force llvm generate gather intrinsic
...pec includes gather; whether it's slow or fast is an implementation detail. We need a feature bit / cost model entry somewhere to signify this, so we're not overloading the meaning of the architectural features with that implementation detail. On Fri, Feb 26, 2016 at 12:23 PM, Demikhovsky, Elena < elena.demikhovsky at intel.com> wrote: > No. Gather operation is slow on AVX2 processors. > > > > - * Elena* > > > > *From:* zhi chen [mailto:zchenhn at gmail.com] > *Sent:* Thursday, February 25, 2016 20:48 > *To:* Sanjay Patel <spatel at rotat...
2016 Feb 25
2
how to force llvm generate gather intrinsic
...ay Patel <spatel at rotateright.com> wrote: > I don't think gather has been enabled for AVX2 as of r261875. > Masked load/store were enabled for AVX with: > http://reviews.llvm.org/D16528 / http://reviews.llvm.org/rL258675 > > On Wed, Feb 24, 2016 at 11:39 PM, Demikhovsky, Elena < > elena.demikhovsky at intel.com> wrote: > >> Yes, masked load/store/gather/scatter are completed. >> >> >> >> - * Elena* >> >> >> >> *From:* zhi chen [mailto:zchenhn at gmail.com] >> *Sent:* Thursday, February 25, 2...
2016 Feb 26
0
how to force llvm generate gather intrinsic
...; whether it's slow or fast is an implementation detail. We need a feature > bit / cost model entry somewhere to signify this, so we're not overloading > the meaning of the architectural features with that implementation detail. > > On Fri, Feb 26, 2016 at 12:23 PM, Demikhovsky, Elena < > elena.demikhovsky at intel.com> wrote: > >> No. Gather operation is slow on AVX2 processors. >> >> >> >> - * Elena* >> >> >> >> *From:* zhi chen [mailto:zchenhn at gmail.com] >> *Sent:* Thursday, February 25, 2016...
2016 Feb 25
2
how to force llvm generate gather intrinsic
Yes, masked load/store/gather/scatter are completed. - Elena From: zhi chen [mailto:zchenhn at gmail.com] Sent: Thursday, February 25, 2016 01:20 To: Demikhovsky, Elena <elena.demikhovsky at intel.com> Cc: Sanjay Patel <spatel at rotateright.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org> Subjec...
2016 Feb 26
0
how to force llvm generate gather intrinsic
No. Gather operation is slow on AVX2 processors. - Elena From: zhi chen [mailto:zchenhn at gmail.com] Sent: Thursday, February 25, 2016 20:48 To: Sanjay Patel <spatel at rotateright.com> Cc: Demikhovsky, Elena <elena.demikhovsky at intel.com>; Nema, Ashutosh <Ashutosh.Nema at amd.com>; llvm-dev <llvm-dev at lists.llvm.org> Subjec...
2016 Feb 25
0
how to force llvm generate gather intrinsic
I don't think gather has been enabled for AVX2 as of r261875. Masked load/store were enabled for AVX with: http://reviews.llvm.org/D16528 / http://reviews.llvm.org/rL258675 On Wed, Feb 24, 2016 at 11:39 PM, Demikhovsky, Elena < elena.demikhovsky at intel.com> wrote: > Yes, masked load/store/gather/scatter are completed. > > > > - * Elena* > > > > *From:* zhi chen [mailto:zchenhn at gmail.com] > *Sent:* Thursday, February 25, 2016 01:20 > *To:* Demikhovsky, Elena <elena...
2016 Feb 24
0
how to force llvm generate gather intrinsic
Hi Elena, Are the masked_load and gather working now? Best, Zhi On Sat, Jan 23, 2016 at 12:06 PM, Demikhovsky, Elena < elena.demikhovsky at intel.com> wrote: > Can we legalize the same set of masked load/store operations for AVX1 > as AVX2? > > Yes, of course. > > > > -...
2014 Oct 27
4
[LLVMdev] Adding masked vector load and store intrinsics
we just follow a common recommendation to start with intrinsics: http://llvm.org/docs/ExtendingLLVM.html - Elena From: Owen Anderson [mailto:resistor at mac.com] Sent: Sunday, October 26, 2014 23:57 To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu; dag at cray.com Subject: Re: [LLVMdev] Adding masked vector load and store intrinsics What is the motivation for using intrinsics versus adding new instructions...
2014 Oct 28
2
[LLVMdev] Adding masked vector load and store intrinsics
...bly explain the criteria. What is the difference between fma and fadd? Or fptrunc and fabs? A new instruction like %a = loadm <4 x i32>* %addr, <4 x i32> %passthru, i32 4, <4 x i1> %mask is possible, but may not be very useful for most targets. So we start from intrinsics. - Elena From: Owen Anderson [mailto:resistor at mac.com] Sent: Monday, October 27, 2014 18:59 To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu; dag at cray.com Subject: Re: [LLVMdev] Adding masked vector load and store intrinsics Since this is something that you expect to be supported on all targets, an...
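The lane-by-lane behaviour of the masked load discussed in the snippet above can be sketched in a few lines of Python. This is only an illustrative model, not LLVM code: the function name masked_load and the flat list standing in for memory are assumptions made for the example. It shows the two properties the thread relies on: enabled lanes read from memory, and disabled lanes take the pass-through value without touching memory at all.

```python
from typing import List, Sequence


def masked_load(memory: Sequence[int], base: int,
                mask: Sequence[bool], passthru: Sequence[int]) -> List[int]:
    """Model of a masked vector load (hypothetical helper, not an LLVM API).

    Lane i reads memory[base + i] when mask[i] is true; otherwise it takes
    passthru[i], and memory is not accessed for that lane.
    """
    if len(mask) != len(passthru):
        raise ValueError("mask and passthru must have the same number of lanes")
    result = []
    for lane, (enabled, fallback) in enumerate(zip(mask, passthru)):
        if enabled:
            result.append(memory[base + lane])
        else:
            # Disabled lane: no memory access, so an out-of-range
            # address in this lane is harmless.
            result.append(fallback)
    return result


memory = [10, 20, 30]  # only three elements are addressable
loaded = masked_load(memory, 0, [True, False, True, False], [-1, -2, -3, -4])
print(loaded)
```

The last lane would read past the end of memory if it were enabled; because it is masked off, the call completes and that lane simply carries the pass-through value.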
2015 Apr 16
2
[LLVMdev] Code review for gather and scatter intrinsics
Hi Renato, I fully agree with you, but indexed load and store is the next step. I'm asking to review gather and scatter code. Thanks. - Elena -----Original Message----- From: Renato Golin [mailto:renato.golin at linaro.org] Sent: Thursday, April 16, 2015 17:17 To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu; Chandler Carruth; James Molloy Subject: Re: [LLVMdev] Code review for gather and scatter intrinsics On 16 April 2015 at 14:44,...
2016 Apr 12
2
X86 TRUNCATE cost for AVX & AVX2 mode
<Copied Cong> Thanks Elena. Mostly I was interested in why such a high cost of 30 is kept for TRUNCATE v16i32 to v16i8 in SSE41. Looking at the code, it appears that TRUNCATE v16i32 to v16i8 in SSE41 is very expensive vs SSE2. I feel this number should be the same as or close to the cost given for the same operation in SSE2ConversionTbl. B...
2016 May 20
5
Working on FP SCEV Analysis
...king coming from lack of good understanding on SCEV and their proper usages. Now, let’s assume we can postpone discussion about case A. What is the best approach to handle case B? Let me summarize the discussion so far. Hope I didn’t miss anything. 1) Extend SCEV was the initial approach taken by Elena. Elena thinks this solution “looks very structured”. If I’m not mistaken, some people think this is overkill and overly complicates already complicated SCEV. Anyone care to look at the patch Elena came up with? 2) IndVarSimplify::handleFloatingPointIV...
2016 May 25
1
RFC: FileCheck Enhancements
It's equivalent to {{\b}}whatever{{\b}}. I'm not sure whether the \b assertion is supported. \s will not match at the start and end of a line, but those positions should match. Elena. -----Original Message----- From: Jonathan Roelofs [mailto:jonathan at codesourcery.com] Sent: Tuesday, May 24, 2016 5:14 PM To: Elena Lepilkina <Elena.Lepilkina at synopsys.com>; llvm-dev <llvm-dev at lists.llvm.org> Subject: Re: [llvm-dev] RFC: FileCheck Enhancements On 5/24/16 8...
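The difference between a word-boundary assertion and a whitespace class mentioned in the snippet above can be checked with a small Python sketch. It uses Python's re module, not FileCheck's own regex engine (the message itself is unsure whether FileCheck supports \b), and the two helper names are made up for illustration. The point it shows: \b is zero-width and succeeds at the start and end of a line, while \s needs an actual whitespace character and so fails there.

```python
import re


def matches_whole_word(word: str, line: str) -> bool:
    """True if `word` occurs in `line` delimited by word boundaries (\\b)."""
    return re.search(r"\b" + re.escape(word) + r"\b", line) is not None


def matches_space_delimited(word: str, line: str) -> bool:
    """True if `word` occurs in `line` with whitespace on both sides (\\s)."""
    return re.search(r"\s" + re.escape(word) + r"\s", line) is not None


# The word alone on a line: \b matches at both ends, \s has nothing to consume.
print(matches_whole_word("whatever", "whatever"))
print(matches_space_delimited("whatever", "whatever"))

# Surrounded by spaces: both forms match.
print(matches_space_delimited("whatever", "say whatever now"))

# Embedded in a longer identifier: \b correctly rejects it.
print(matches_whole_word("whatever", "whatevers"))
```

A whitespace-based pattern therefore needs extra alternatives for line start and line end to behave like the boundary form, which is the gap the message points out.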
2016 Mar 04
2
Fwd: [PATCH] D17497: Support arbitrary address space for intrinsics
Per my previous email, I have just signed off on Artur's original patch. Philip On 03/02/2016 11:21 AM, Philip Reames via llvm-dev wrote: > Elena, > > I'd like to propose that we move forward with Artur's original patch > <http://reviews.llvm.org/D17270> and separate the discussion of how we > might change our intrinsic naming scheme. Artur's patch is addressing > a correctness problem; that has to overrul...
2017 Sep 18
1
Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'
...nePointerInfo()); > } else { > return DAG.getExtLoad(ISD::EXTLOAD, dl, N->getValueType(0), Store, StackPtr, MachinePointerInfo(), EltVT); > } I assume that we need the opposite - if (.. < 8) getExtLoad // VT should be MVT::i8, MemVT should be MVT::i1 else getLoad - Elena From: jingu at codeplay.com [mailto:jingu at codeplay.com] Sent: Monday, September 18, 2017 13:40 To: Demikhovsky, Elena <elena.demikhovsky at intel.com>; daniel_l_sanders at apple.com <daniel_l_sanders at apple.com>; Jon Chesterfield <jonathanchesterfield at gmail.com> Cc: llvm-...
2017 Sep 17
2
Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'
Please open a bugzilla ticket and attach your testcase. It will allow us to debug and fix the problem. Thanks - Elena From: JinGu [mailto:jingu at codeplay.com] Sent: Saturday, September 16, 2017 00:38 To: Demikhovsky, Elena <elena.demikhovsky at intel.com>; daniel_l_sanders at apple.com <daniel_l_sanders at apple.com>; Jon Chesterfield <jonathanchesterfield at gmail.com> Cc: llvm-dev at lists.l...
2015 Mar 03
4
[LLVMdev] Extending Vector GEP - proposal
...prefer to generate a splat-GEP, is compile-time saving. I should generate 2 (or more, for each splat element) redundant instructions (broadcast is insert+shuffle), hoist them outside the loop at some stage. Then look for them in the CodeGenPrepare pass, sink them back and rebuild the CFG. - Elena From: Nadav Rotem [mailto:nrotem at apple.com] Sent: Monday, March 02, 2015 19:01 To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu; Duncan P. N. Exon Smith; dag at cray.com; Philip Reames (listmail at philipreames.com); Hal Finkel (hfinkel at anl.gov); Chandler Carruth (chandlerc at gmail.com) Su...
2016 May 20
0
Working on FP SCEV Analysis
...ood understanding on SCEV and their proper usages. Now, let’s assume we > can postpone discussion about case A. What is the best approach to handle case B? Let me summarize > the discussion so far. Hope I didn’t miss anything. > > 1) > Extend SCEV was the initial approach taken by Elena. > Elena thinks this solution “looks very structured”. > If I’m not mistaken, some people think this is overkill and overly complicates already complicated SCEV. > Anyone care to look at the patch Elena came up with? > 2) > IndVarSimpli...
2014 Dec 15
2
[LLVMdev] Memory alignment model on AVX, AVX2 and AVX-512 targets
AFAIK, there is no additional penalty for AMD processors. From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chandler Carruth Sent: Monday, December 15, 2014 3:57 AM To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Memory alignment model on AVX, AVX2 and AVX-512 targets FWIW, this makes sense to me. I'd be interested to hear from folks that are supporting AMD processors which do support AVX to ensure that there isn't an undue runtime penalty for these...
2008 Oct 04
3
ggplot2: how to combine position=stack and position=dodge in a single graph?
...added two layers with the time series of the stacked data for both years. That worked well except the bars are obscuring each other. How can I shift one of the layers to get them displayed next to each other? Is there another, easier way to achieve this? Thanks a lot for any help on this. -- Elena