Displaying 20 results from an estimated 7000 matches similar to: "masked-load endpoints optimization"
2016 Mar 11
3
masked-load endpoints optimization
Thanks, Ashutosh.
Yes, either TTI or TLI could be used to limit the transform if we do it in
CGP rather than the DAG.
The real question I have is whether it is legal to read the extra memory,
regardless of whether this is a masked load or something else.
Note that the x86 backend already does this, so either my proposal is ok
for x86, or we're already doing an illegal optimization:
define
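The IR example is cut off above at "define", but the shape of the transform under discussion can be sketched in plain C (an illustration of the idea, not code from the thread): when the first and last lanes of a masked load are known to be accessed, every byte in between lies within a range the program already reads, so loading the whole vector unconditionally cannot introduce a new fault.

#include <string.h>

/* Illustrative only: a scalar model of turning a masked 8-lane load whose
 * endpoint lanes are known-true into one unconditional wide load. */
void masked_copy(const int *src, int *dst, const unsigned char *mask) {
    /* Original form: only lanes with mask[i] set are ever read. */
    for (int i = 0; i < 8; ++i)
        if (mask[i])
            dst[i] = src[i];
}

void masked_copy_widened(const int *src, int *dst, const unsigned char *mask) {
    /* Transformed form, valid only when mask[0] and mask[7] are known true:
     * read all of src[0..7] at once; the interior lanes that end up unused
     * are the "extra memory" whose legality the thread is debating. */
    int tmp[8];
    memcpy(tmp, src, sizeof tmp);
    for (int i = 0; i < 8; ++i)
        if (mask[i])
            dst[i] = tmp[i];
}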
2016 Mar 15
3
the as-if rule / perf vs. security
[cc'ing cfe-dev because this may require some interpretation of language
law]
My understanding is that the compiler has the freedom to access extra data
in C/C++ (not sure about other languages); AFAIK, the LLVM LangRef is
silent about this. In C/C++, this is based on the "as-if rule":
http://en.cppreference.com/w/cpp/language/as_if
So the question is: where should the optimizer
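For readers skimming this truncated snippet, here is a minimal C illustration of the kind of "extra" access the as-if rule is being invoked for (a sketch under stated assumptions, not code from the thread): the source asks for two 4-byte loads, but a compiler may emit one 8-byte load of the whole struct because the difference is not observable to a conforming program.

#include <stdint.h>

struct pair {
    int32_t a;
    int32_t b;
};

int64_t sum_fields(const struct pair *p) {
    /* As written: two 4-byte loads. Under the as-if rule a compiler may
     * instead load all 8 bytes of *p in one instruction and split the
     * halves, since both fields belong to the same object and the wider
     * read cannot fault or change any observable behavior. */
    return (int64_t)p->a + (int64_t)p->b;
}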
2016 Mar 16
3
the as-if rule / perf vs. security
Hi Ben -
Thanks for your response. For the sake of argument, let's narrow the scope
of the problem to eliminate some of the variables you have rightfully
cited.
Let's assume we're not dealing with volatiles, atomics, or FP operands.
We'll even guarantee that the extra loaded value is never used. This is, in
fact, the scenario that http://reviews.llvm.org/rL263446 is concerned
2016 Mar 16
3
the as-if rule / perf vs. security
We are careful not to try this optimization where it would extend the range
of loaded memory; this is purely for what I call a "load doughnut". :)
Reading past either specified edge would be very bad because it could cause
a memory fault / exception where there was none in the original program.
That's definitely not legal.
On Wed, Mar 16, 2016 at 12:20 PM, Craig, Ben <ben.craig
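A hypothetical helper makes the bounds constraint in this message concrete (the function and its names are invented for illustration, not taken from the patch): a widened access is only acceptable when it stays inside the span the original program already reads; anything past either edge may land on an unmapped page and fault.

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: accept a widened load of [new_lo, new_hi) only if it
 * is contained in the originally accessed range [orig_lo, orig_hi), i.e.
 * only the "doughnut hole" between the endpoints is read, never bytes past
 * either edge. */
bool widened_load_within_edges(uintptr_t orig_lo, uintptr_t orig_hi,
                               uintptr_t new_lo, uintptr_t new_hi) {
    return new_lo >= orig_lo && new_hi <= orig_hi;
}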
2008 Dec 15
2
Duplicates among columns of a data frame
Dear list,
I have a data frame of survey respondents, a little like this:
set.seed(20081215)
n <- 100
dat <- data.frame(id=1:100,
                  addr1=sample(LETTERS, n, replace=TRUE),
                  addr2=sample(LETTERS, n, replace=TRUE),
                  addr3=sample(LETTERS, n, replace=TRUE))
head(dat)
  id addr1 addr2 addr3
1  1     R     H     Q
2  2     H     C     K
3  3
2006 Jan 09
3
Design Question
I am sure some of you can give me some insight into this. It is more of a
database design question for the scenario below:
Say for example,
I have a person table and this person can have different address types.
One could be Home and the other could be say Office.
Should we model this as
Table people
  id
  fname
  lname
Table addresses
  id
  person_id
  addr1
  addr2
  ....
or
Table people
  id
  fname
  lname
2009 Dec 12
1
Dovecot-sieve multiple redirect question
Hi,
I have a question about redirecting message to a multiple addresses.
I have an user script like following:
require ["copy"];
redirect :copy "addr1 at dom.ain";
redirect :copy "addr2 at dom.ain";
All works fine, but if addr1 at ... has exceeded its quota, this script seems to stop
working and addr2 at ... doesn't receive the message either.
Is this correct
2014 Feb 28
1
VoiceMail Issue
Hello,
I am attempting again to resolve an issue with multi-tenancy and the forwarding of VMs between mailboxes. If, in a multi-tenancy environment, one uses custom contexts, i.e.
[a1-ext1](a1)
mailbox=101 at a1
and the associated voicemail.conf entry:
[a1]
101 => 1234,My User 1,addr1 at email.com,,tz=eastern|imapuser=addr1 at email.com|imapfolder=Inbox
102 => 1234,My User 2,addr2 at
2008 Apr 05
3
iaxmodem + hylafax w/ DID routing
hi folks.
I'm experimenting with iaxmodem + hylafax, using DID to determine
where to send the fax to its final destination. However, I have
difficulties passing the DID information from iaxmodem to
hylafax.
in extensions.conf:
exten => _XXXX,1,Dial(IAX2/iaxmodem0/${EXTEN}|20|r)
exten => _XXXX,n,Dial(IAX2/iaxmodem1/${EXTEN}|20|r)
exten => _XXXX,n,Busy
exten => _XXXX,n,Hangup
2013 Feb 27
2
[LLVMdev] Question about intrinsic function llvm.objectsize
On Feb 27, 2013, at 12:37 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
> Hi, Nuno and Arnold:
>
> Thank you all for the input.
>
> Let me coin a term, say "clique" for this discussion to avoid unnecessary confusion.
> A clique is a statically or dynamically allocated, type-free stretch of memory. A "clique"
> 1) is maximal in the sense
2014 Sep 12
6
NVA3: Small misc mem reclocking fixes
Patch 1 fixes nva3 bailing due to not finding the right ramcfg
Patch 2 is a resend rebased on 3.17.0-rc4 for setting the vblank period
Patches 3-5 handle writes to per-partition registers, for which NVA3 does not
have special broadcast regs available.
Patch 6 removes local structs from NVA3 reclocking in favour of the already
existing "ram->base." variables, like in NVE0
As always,
2013 Feb 27
0
[LLVMdev] Question about intrinsic function llvm.objectsize
Hi, Nuno and Arnold:
Thank you all for the input.
Let me coin a term, say "clique" for this discussion to avoid
unnecessary confusion.
A clique is a statically or dynamically allocated, type-free stretch of
memory. A "clique"
1) is maximal in the sense that a clique does not have any
enclosing data structure that can
completely cover or, partially
2011 Feb 21
0
[LLVMdev] [PATCH] OpenCL support - update on keywords
The problem is that we use the ordering private, global, constant and local, and this is the same ordering that is used on Apple as well. As we already have OpenCL binaries out in public, making the change is problematic as we want to keep backward compatibility at all costs.
Thanks,
Micah
> -----Original Message-----
> From: Anton Lokhmotov [mailto:Anton.Lokhmotov at arm.com]
> Sent:
2013 Feb 27
4
[LLVMdev] Question about intrinsic function llvm.objectsize
On Feb 27, 2013, at 4:05 AM, Nuno Lopes <nunoplopes at sapo.pt> wrote:
> Hi,
>
> Regarding the definition of object for @llvm.objectsize, it is identical to gcc's __builtin_object_size(). So it's not wrong; it's just the way it was defined to be.
>
> Regarding the BasicAA's usage of these functions, I'm unsure. It seems to me that isObjectSmallerThan()
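Since the message above equates @llvm.objectsize with gcc's __builtin_object_size, a tiny C example of the builtin may help; the values in the comments follow the documented semantics for this layout and are not output taken from the thread.

#include <stdio.h>

struct S {
    char a[8];
    char b[8];
};

int main(void) {
    struct S s;
    /* Mode 0: bytes from the pointer to the end of the whole object (16). */
    printf("%zu\n", __builtin_object_size(&s.a[0], 0));
    /* Mode 1: bytes to the end of the closest enclosing subobject (8). */
    printf("%zu\n", __builtin_object_size(&s.a[0], 1));
    /* Cases the front end cannot fold are lowered to @llvm.objectsize. */
    return 0;
}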
2011 Feb 21
3
[LLVMdev] [PATCH] OpenCL support - update on keywords
> > > > +enum OpenCLAddressSpace {
> > > > + OPENCL_PRIVATE = 0,
> > > > + OPENCL_GLOBAL = 1,
> > > > + OPENCL_LOCAL = 2,
> > > > + OPENCL_CONSTANT = 3
> > > > +};
> -----Original Message-----
> From: Villmow, Micah [mailto:Micah.Villmow at amd.com]
>
> Anton,
> Would there be any issue with switching
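For clarity, the two orderings being weighed in this exchange can be written side by side; the first enum is the one quoted from the patch, while the second merely illustrates the ordering Micah describes as already shipped (its name and enumerator spellings are hypothetical).

/* Ordering proposed in the patch quoted above. */
enum OpenCLAddressSpace {
    OPENCL_PRIVATE  = 0,
    OPENCL_GLOBAL   = 1,
    OPENCL_LOCAL    = 2,
    OPENCL_CONSTANT = 3
};

/* Hypothetical rendering of the ordering Micah says existing binaries
 * (including Apple's) rely on: private, global, constant, local. */
enum OpenCLAddressSpaceShipped {
    SHIPPED_PRIVATE  = 0,
    SHIPPED_GLOBAL   = 1,
    SHIPPED_CONSTANT = 2,
    SHIPPED_LOCAL    = 3
};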
2014 Jun 27
1
[PATCH] drm/nouveau/fb: Prevent inlining of ramfuc_reg
When gcc 4.8 inlines this function, it eats up 16 bytes on the stack
every time. Eventually we hit warnings because our stack grew too
much:
ramnve0.c:1383:1: error: the frame size of 1496 bytes is larger than
1024 bytes
We fix this by preventing inlining for this function.
Signed-off-by: Stéphane Marchesin <marcheu at chromium.org>
---
drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h | 2
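The diffstat is cut off, but the technique the commit message describes is just a function attribute. Here is a generic sketch under stated assumptions: the struct and function below are invented for illustration; the real change is to ramfuc_reg in ramfuc.h.

/* Invented example: a small helper that gcc would otherwise inline at many
 * call sites, duplicating its 16-byte temporary on the caller's stack each
 * time, as the commit message describes.  Forbidding inlining keeps the
 * caller's frame below the 1024-byte warning limit. */
struct reg_desc {
    unsigned int addr;
    unsigned int data[3];        /* 16 bytes total, as in the case described */
};

static __attribute__((noinline)) struct reg_desc
make_reg_desc(unsigned int addr)
{
    struct reg_desc r = { addr, { 0, 0, 0 } };
    return r;
}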
2016 Jun 15
3
[Proposal][RFC] Strided Memory Access Vectorization
Sorry for the spam. Copy-paste didn't capture the Subject properly. Resending with the correct Subject so that the thread is captured properly.
-----Original Message-----
From: Saito, Hideki
Sent: Wednesday, June 15, 2016 1:39 PM
To: 'llvm-dev at lists.llvm.org' <llvm-dev at lists.llvm.org>
Subject: RE: [llvm-dev] [Proposal][RFC] Strided Memory Access
Ashutosh,
First,
2013 Feb 27
0
[LLVMdev] Question about intrinsic function llvm.objectsize
On 2/27/13 11:21 AM, Arnold Schwaighofer wrote:
> On Feb 27, 2013, at 12:37 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
>
>> Hi, Nuno and Arnold:
>>
>> Thank you all for the input.
>>
>> Let me coin a term, say "clique" for this discussion to avoid unnecessary confusion.
>> A clique is a statically or dynamically allocated
2016 Jun 18
2
[Proposal][RFC] Strided Memory Access Vectorization
>Vectorizer's output should be as clean as vector code can be so that analyses and optimizers downstream can
>do a great job optimizing.
Guess I should clarify this philosophical position of mine. In terms of vector code optimizations that complicate
the output of the vectorizer:
if the vectorizer is the best place to perform the optimization, it should do so.
This includes cases like
2016 Jun 30
0
[Proposal][RFC] Strided Memory Access Vectorization
One common concern raised for cases where the Loop Vectorizer generates
bigger types than the target supports:
based on the VF, we currently check the cost and generate the expected set of
instructions for the bigger type. This has two challenges: for bigger types the cost
is not always correct, and code generation may not produce efficient
instructions.
We can probably depend on the support provided by the RFC below by
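The snippet ends mid-sentence, but for context, this is the shape of loop the RFC concerns (a generic example, not taken from the proposal): a stride-2 access that a vectorizer must turn into either gathers or wider contiguous loads plus shuffles, and whether those wider loads are well supported by the target is exactly the cost/codegen concern raised above.

/* Generic stride-2 example, not from the RFC itself: each iteration reads
 * every other element of b.  Vectorizing by VF=4 needs either a gather of
 * b[2i], b[2i+2], b[2i+4], b[2i+6], or two wider contiguous loads followed
 * by a shuffle that keeps the even lanes. */
void scale_even(float *restrict a, const float *restrict b, int n)
{
    for (int i = 0; i < n; ++i)
        a[i] = 3.0f * b[2 * i];
}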