similar to: masked-load endpoints optimization

Displaying 20 results from an estimated 7000 matches similar to: "masked-load endpoints optimization"

2016 Mar 11 | 3 | masked-load endpoints optimization
Thanks, Ashutosh. Yes, either TTI or TLI could be used to limit the transform if we do it in CGP rather than the DAG. The real question I have is whether it is legal to read the extra memory, regardless of whether this is a masked load or something else. Note that the x86 backend already does this, so either my proposal is ok for x86, or we're already doing an illegal optimization: define
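For illustration only, here is a minimal C++ sketch (mine, not from the thread) of the access pattern being discussed, written with AVX2 intrinsics; the function name is made up. The mask is set only in the first and last lanes, and the transform in question would replace the masked load with one plain 8-lane load plus a blend, which also reads the six masked-off lanes in between:

    // Masked load whose mask covers only the endpoint lanes 0 and 7.
    #include <immintrin.h>

    __m256i load_endpoints(const int *p) {
        // Lanes 1..6 are masked off; their memory sits between the
        // two lanes the program actually asks for.
        const __m256i mask = _mm256_setr_epi32(-1, 0, 0, 0, 0, 0, 0, -1);
        return _mm256_maskload_epi32(p, mask);
    }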
2016 Mar 15 | 3 | the as-if rule / perf vs. security
[cc'ing cfe-dev because this may require some interpretation of language law] My understanding is that the compiler has the freedom to access extra data in C/C++ (not sure about other languages); AFAIK, the LLVM LangRef is silent about this. In C/C++, this is based on the "as-if rule": http://en.cppreference.com/w/cpp/language/as_if So the question is: where should the optimizer
2016 Mar 16 | 3 | the as-if rule / perf vs. security
Hi Ben - Thanks for your response. For the sake of argument, let's narrow the scope of the problem to eliminate some of the variables you have rightfully cited. Let's assume we're not dealing with volatiles, atomics, or FP operands. We'll even guarantee that the extra loaded value is never used. This is, in fact, the scenario that http://reviews.llvm.org/rL263446 is concerned
2016 Mar 16 | 3 | the as-if rule / perf vs. security
We are careful not to try this optimization where it would extend the range of loaded memory; this is purely for what I call a "load doughnut". :) Reading past either specified edge would be very bad because it could cause a memory fault / exception where there was none in the original program. That's definitely not legal. On Wed, Mar 16, 2016 at 12:20 PM, Craig, Ben <ben.craig
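For concreteness, a minimal C++ sketch (mine, not from the thread) of what a "load doughnut" looks like at the source level; the function name is invented:

    // The program reads a[0] and a[7] (the crust) but never a[1..6] (the
    // hole). Folding the two loads into one 32-byte vector load stays within
    // the byte range whose edges the program already touches, so it cannot
    // introduce a new fault; whether it is otherwise allowed is exactly the
    // as-if question raised above.
    int sum_crust(const int a[8]) {
        return a[0] + a[7];
    }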
2008 Dec 15 | 2 | Duplicates among columns of a data frame
Dear list, I have a data frame of survey respondents, a little like this: set.seed(20081215) n <- 100 dat <- data.frame(id=1:100, addr1=sample(LETTERS, n, replace=TRUE), addr2=sample(LETTERS, n, replace=TRUE), addr3=sample(LETTERS, n, replace=TRUE)) head(dat) id addr1 addr2 addr3 1 1 R H Q 2 2 H C K 3 3
2006 Jan 09 | 3 | Design Question
I am sure some of you can give me an insight into this. This is more towards the database design for the scenario below: say, for example, I have a person table and this person can have different address types. One could be Home and the other could be, say, Office. Should we model this as Table people id fname lname Table addresses id person_id addr1 addr2 .... or Table people id fname lname
2009 Dec 12 | 1 | Dovecot-sieve multiple redirect question
Hi, I have a question about redirecting a message to multiple addresses. I have a user script like the following: require ["copy"]; redirect :copy "addr1 at dom.ain"; redirect :copy "addr2 at dom.ain"; All works fine, but if addr1 at ... has exceeded quota, this script seems to stop working and addr2 at ... doesn't receive the message either. Is this correct
2014 Feb 28 | 1 | VoiceMail Issue
Hello, I am attempting again to resolve an issue with multi-tenancy and the forwarding to VMs between mailboxes. If in a multi-tenancy environment one uses custom contexts, i.e. [a1-ext1](a1) mailbox=101 at a1, and the associated voicemail.conf entry: [a1] 101 => 1234,My User 1,addr1 at email.com,,tz=eastern|imapuser=addr1 at email.com|imapfolder=Inbox 102 => 1234,My User 2,addr2 at
2008 Apr 05 | 3 | iaxmodem + hylafax w/ DID routing
Hi folks. I'm experimenting with iaxmodem + hylafax, using DID to determine where to send the fax to its final destination. However, I have difficulties passing the DID information from iaxmodem to hylafax. In extensions.conf: exten => _XXXX,1,Dial(IAX2/iaxmodem0/${EXTEN}|20|r) exten => _XXXX,n,Dial(IAX2/iaxmodem1/${EXTEN}|20|r) exten => _XXXX,n,Busy exten => _XXXX,n,Hangup
2013 Feb 27 | 2 | [LLVMdev] Question about intrinsic function llvm.objectsize
On Feb 27, 2013, at 12:37 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote: > Hi, Nuno and Arnold: > > Thank you all for the input. > > Let me coin a term, say "clique", for this discussion to avoid unnecessary confusion. > A clique is a statically or dynamically allocated, type-free stretch of memory. A "clique" > 1) is maximal in the sense
2014 Sep 12 | 6 | NVA3: Small misc mem reclocking fixes
Patch 1 fixes nva3 bailing due to not finding the right ramcfg. Patch 2 is a resend, rebased on 3.17.0-rc4, for setting the vblank period. Patches 3-5 handle writes to per-partition registers, for which NVA3 does not have special broadcast regs available. Patch 6 removes local structs from NVA3 reclocking in favour of the already existing "ram->base." variables, like in NVE0. As always,
2013 Feb 27 | 0 | [LLVMdev] Question about intrinsic function llvm.objectsize
Hi, Nuno and Arnold: Thank you all for the input. Let me coin a term, say "clique", for this discussion to avoid unnecessary confusion. A clique is a statically or dynamically allocated, type-free stretch of memory. A "clique" 1) is maximal in the sense that a clique does not have any enclosing data structure that can completely cover or partially
2011 Feb 21 | 0 | [LLVMdev] [PATCH] OpenCL support - update on keywords
The problem is that we use the ordering private, global, constant and local, and this is the same ordering that is used on Apple as well. As we already have OpenCL binaries out in public, making the change is problematic as we want to keep backward compatibility at all costs. Thanks, Micah > -----Original Message----- > From: Anton Lokhmotov [mailto:Anton.Lokhmotov at arm.com] > Sent:
2013 Feb 27 | 4 | [LLVMdev] Question about intrinsic function llvm.objectsize
On Feb 27, 2013, at 4:05 AM, Nuno Lopes <nunoplopes at sapo.pt> wrote: > Hi, > > Regarding the definition of object for @llvm.objectsize, it is identical to gcc's __builtin_object_size(). So it's not wrong; it's just the way it was defined to be. > > Regarding the BasicAA's usage of these functions, I'm unsure. It seems to me that isObjectSmallerThan()
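As a hedged illustration of the builtin being referenced (my example, not from the thread), using gcc/clang's __builtin_object_size with an invented struct:

    #include <cstddef>

    struct Packet { char header[8]; char payload[56]; };

    // Mode 1 reports the bytes remaining in the closest enclosing subobject
    // (payload); mode 0 reasons about the whole enclosing allocation, which
    // is roughly the "clique" in the terminology coined in this thread.
    std::size_t payload_room(Packet *p) {
        return __builtin_object_size(p->payload, 1);
    }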
2011 Feb 21 | 3 | [LLVMdev] [PATCH] OpenCL support - update on keywords
> > > > +enum OpenCLAddressSpace { > > > > + OPENCL_PRIVATE = 0, > > > > + OPENCL_GLOBAL = 1, > > > > + OPENCL_LOCAL = 2, > > > > + OPENCL_CONSTANT = 3 > > > > +}; > -----Original Message----- > From: Villmow, Micah [mailto:Micah.Villmow at amd.com] > > Anton, > Would there be any issue with switching
2014 Jun 27 | 1 | [PATCH] drm/nouveau/fb: Prevent inlining of ramfuc_reg
When gcc 4.8 inlines this function, it eats up 16 bytes on the stack every time. Eventually we hit warnings because our stack grew too much: ramnve0.c:1383:1: error: the frame size of 1496 bytes is larger than 1024 bytes. We fix this by preventing inlining for this function. Signed-off-by: Stéphane Marchesin <marcheu at chromium.org> --- drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h | 2
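A hedged C++ sketch of the mechanism (mine, not the actual patch; the helper is hypothetical): the noinline attribute keeps the helper's locals in its own frame instead of duplicating them at every call site inside the caller:

    #include <cstdint>

    // Stand-in for ramfuc_reg: without the attribute, gcc may inline this at
    // each call site and charge the caller's frame for every copy's locals.
    __attribute__((noinline))
    uint32_t make_reg(uint32_t bank, uint32_t addr) {
        return (bank << 16) | (addr & 0xffffu);
    }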
2016 Jun 15 | 3 | [Proposal][RFC] Strided Memory Access Vectorization
Sorry for the spam. Copy-paste didn't capture the Subject properly. Resending with the correct Subject so that the thread is captured properly. -----Original Message----- From: Saito, Hideki Sent: Wednesday, June 15, 2016 1:39 PM To: 'llvm-dev at lists.llvm.org' <llvm-dev at lists.llvm.org> Subject: RE: [llvm-dev] [Proposal][RFC] Strided Memory Access Ashutosh, First,
2013 Feb 27 | 0 | [LLVMdev] Question about intrinsic function llvm.objectsize
On 2/27/13 11:21 AM, Arnold Schwaighofer wrote: > On Feb 27, 2013, at 12:37 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote: > >> Hi, Nuno and Arnold: >> >> Thank you all for the input. >> >> Let me coin a term, say "clique" for this discussion to avoid unnecessary confusion. >> A clique is statically or dynamically allocated
2016 Jun 18 | 2 | [Proposal][RFC] Strided Memory Access Vectorization
>Vectorizer's output should be as clean as vector code can be so that analyses and optimizers downstream can do a great job optimizing. Guess I should clarify this philosophical position of mine. In terms of vector code optimization that complicates the output of the vectorizer: if the vectorizer is the best place to perform the optimization, it should do so. This includes the cases like
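For context, a minimal C++ sketch (mine, not from the RFC) of the kind of strided access the thread is about:

    // Each iteration reads a[2*i], a stride-2 load pattern. A vectorizer can
    // emit wide loads plus shuffles/deinterleaves, or a native strided or
    // gather load on targets that have one; the discussion is about which
    // form to hand to downstream optimizers and how to cost it.
    float sum_even(const float *a, int n) {
        float s = 0.0f;
        for (int i = 0; i < n; ++i)
            s += a[2 * i];
        return s;
    }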
2016 Jun 30 | 0 | [Proposal][RFC] Strided Memory Access Vectorization
One common concern raised for cases where the Loop Vectorizer generates bigger types than the target supports: based on the VF, we currently check the cost and generate the expected set of instruction[s] for the bigger type. This has two challenges: for bigger types the cost is not always correct, and code generation may not produce efficient instruction[s]. This can probably depend on the support provided by the below RFC by