thr3ads.net - llvm dev - [LLVMdev] Functions: sret and readnone [Oct 2009]

Stephan Reiter

2009-Oct-05 14:21 UTC

[LLVMdev] Functions: sret and readnone

Hi all,

I'm currently building a DSL for a computer graphics project that is
not unlike NVIDIA's Cg. I have an intrinsic with the following
signature

float4 sample(texture tex, float2 coords);

that is translated to this LLVM IR code:

declare void @"sample"(%float4* noalias nocapture sret, %texture,
%float2) nounwind readnone

The type float4 is basically an array of four floats, which cannot be
returned directly on an x86 using the traditional calling conventions
but only via the sret mechanism.

You might already have spotted that "readnone" attribute, which is
causing some problems: The GVN optimization pass seems to treat the
sret pointer just like any other pointer to memory and eliminates all
calls to the function, since it sees it as returning void without
touching any memory. Is there a way to make sure that the GVN pass
interprets the sret argument as the actual return value of the
function? Or are there other approaches I could try?

Currently, the only way to make sure that the sample function behaves
as expected is to drop the "readnone" attribute, but that obviously
hinders optimization ...

Thanks a lot,
Stephan
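
As a rough illustration of the lowering described in this message, the following C sketch shows the same function in its by-value form and in the pointer-based form that sret corresponds to. The type definitions, the names `sample_by_value` and `sample_sret`, and the placeholder body are assumptions made for illustration; they are not taken from the poster's DSL.

```c
/* Hypothetical stand-ins for the DSL types; not the poster's definitions. */
typedef struct { float v[4]; } float4;
typedef struct { float u, v; } float2;
typedef int texture; /* placeholder handle type */

/* Source-level view: the result is returned by value. */
float4 sample_by_value(texture tex, float2 coords)
{
    /* Placeholder computation that depends only on the arguments. */
    float4 r = { { coords.u, coords.v, (float)tex, 1.0f } };
    return r;
}

/* Lowered view: the caller passes a pointer to the result slot (the
   "sret" argument) and the function itself returns void. Its only
   observable effect is the store through `out`, so a promise of
   "accesses no memory" contradicts what this form actually does. */
void sample_sret(float4 *out, texture tex, float2 coords)
{
    *out = sample_by_value(tex, coords);
}
```

Seen this way, a void function that is also declared to touch no memory has no visible effect at all, which is consistent with the calls being removed.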

Kenneth Uildriks

2009-Oct-05 14:59 UTC

On Mon, Oct 5, 2009 at 9:21 AM, Stephan Reiter <stephan.reiter at
gmail.com> wrote:
> Hi all,
>
> I'm currently building a DSL for a computer graphics project that is
> not unlike NVIDIA's Cg. I have an intrinsic with the following
> signature
>
> float4 sample(texture tex, float2 coords);
>
> that is translated to this LLVM IR code:
>
> declare void @"sample"(%float4* noalias nocapture sret, %texture,
> %float2) nounwind readnone
>
> The type float4 is basically an array of four floats, which cannot be
> returned directly on an x86 using the traditional calling conventions
> but only via the sret mechanism.
>
> You might already have spotted that "readnone" attribute, which is
> causing some problems: The GVN optimization pass seems to treat the
> sret pointer just like any other pointer to memory and eliminates all
> calls to the function, since it sees it as returning void without
> touching any memory. Is there a way to make sure that the GVN pass
> interprets the sret argument as the actual return value of the
> function? Or are there other approaches I could try?
>
> Currently, the only way to make sure that the sample function behaves
> as expected is to drop the "readnone" attribute, but that obviously
> hinders optimization ...
>
> Thanks a lot,
> Stephan

I believe you are out of luck for the time being.

I plan to change the codegen stage so that it handles large struct
returns; then you could declare your function to return the four
floats directly and mark it readnone.  But I don't have a target date
for that change.
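
The direct-return form mentioned above can be sketched at the C level. This sketch uses the GCC/Clang `const` function attribute as a rough analogue of `readnone`; that correspondence, the type definitions, and the names `sample_direct` and `sum_first_twice` are assumptions for illustration only.

```c
/* Hypothetical stand-ins for the DSL types. */
typedef struct { float v[4]; } float4;
typedef struct { float u, v; } float2;
typedef int texture;

/* GCC/Clang `const` attribute: the result depends only on the argument
   values, and the function reads or writes no memory visible to the
   caller. Because the result is an ordinary return value rather than a
   store through a pointer, the promise matches the signature. */
__attribute__((const))
float4 sample_direct(texture tex, float2 coords)
{
    float4 r = { { coords.u, coords.v, (float)tex, 1.0f } };
    return r;
}

/* Two identical calls: a compiler may compute `sample_direct` once and
   reuse the value, which is the kind of optimization the attribute is
   meant to allow. */
float sum_first_twice(texture tex, float2 coords)
{
    float4 a = sample_direct(tex, coords);
    float4 b = sample_direct(tex, coords);
    return a.v[0] + b.v[0];
}
```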

Duncan Sands

2009-Oct-05 15:23 UTC

Hi Stephan,
> You might already have spotted that "readnone" attribute, which is
> causing some problems: The GVN optimization pass seems to treat the
> sret pointer just like any other pointer to memory and eliminates all
> calls to the function, since it sees it as returning void without
> touching any memory.
as explained in the language reference,
  http://llvm.org/docs/LangRef.html,
readonly functions must not write to any byval arguments.
The reason for this is that it allows the inliner to avoid introducing
a temporary variable and copy when inlining readonly functions with
a byval argument.

> Is there a way to make sure that the GVN pass
> interprets the sret argument as the actual return value of the
> function? Or are there other approaches I could try?
Not for the moment, sorry.

Ciao,

Duncan.

Chris Lattner

2009-Oct-05 16:56 UTC

On Oct 5, 2009, at 7:21 AM, Stephan Reiter wrote:
> Hi all,
>
> I'm currently building a DSL for a computer graphics project that is
> not unlike NVIDIA's Cg. I have an intrinsic with the following
> signature
>
> float4 sample(texture tex, float2 coords);
>
> that is translated to this LLVM IR code:
>
> declare void @"sample"(%float4* noalias nocapture sret, %texture,
> %float2) nounwind readnone
The best thing to do to handle this is to add a custom AliasAnalysis  
implementation, which will know the precise mod/ref sets for the  
function.  See docs/AliasAnalysis.html for some more information.

-Chris
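
To give a feel for what "precise mod/ref sets" could mean here, the following is a standalone toy model in C. It does not use the LLVM AliasAnalysis interface; the `ModRef` enum, the `CallSite` struct, and `get_mod_ref` are hypothetical names invented for this sketch, and memory locations are compared by plain pointer equality for simplicity.

```c
#include <string.h>

/* The four possible answers to "may this call touch that memory?". */
typedef enum { NO_MODREF = 0, REF = 1, MOD = 2, MODREF = 3 } ModRef;

/* Hypothetical description of one call site: the callee's name and the
   pointer passed as its first (sret) argument. */
typedef struct {
    const char *callee;
    const void *sret_ptr;
} CallSite;

/* Toy query: what can `call` do to the memory at `loc`?
   For the intrinsic "sample" the answer is precise: it writes its
   result slot and touches nothing else. */
ModRef get_mod_ref(const CallSite *call, const void *loc)
{
    if (strcmp(call->callee, "sample") == 0)
        return loc == call->sret_ptr ? MOD : NO_MODREF;
    return MODREF; /* unknown callee: assume the worst */
}
```

With an answer like this, an optimizer would know that a call writes its result slot and therefore must be kept, while still being free to move unrelated loads and stores across it.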

Dan Gohman

2009-Oct-05 21:33 UTC

On Oct 5, 2009, at 7:21 AM, Stephan Reiter wrote:
> Hi all,
>
> I'm currently building a DSL for a computer graphics project that is
> not unlike NVIDIA's Cg. I have an intrinsic with the following
> signature
>
> float4 sample(texture tex, float2 coords);
>
> that is translated to this LLVM IR code:
>
> declare void @"sample"(%float4* noalias nocapture sret, %texture,
> %float2) nounwind readnone
>
> The type float4 is basically an array of four floats, which cannot be
> returned directly on an x86 using the traditional calling conventions
> but only via the sret mechanism.
Is there a reason it needs to be an array? A vector of four floats
wouldn't have this problem, if that's an option.

Dan
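
The vector alternative suggested above can be sketched in C with the GCC/Clang `vector_size` extension, which is assumed here as a stand-in for an LLVM vector of four floats. The names `float4v` and `sample_vec` and the placeholder body are hypothetical.

```c
/* GCC/Clang vector extension: four floats treated as a single value. */
typedef float float4v __attribute__((vector_size(16)));

/* The result is returned directly as a value, so no result pointer is
   involved and the function has no memory effects to describe. */
float4v sample_vec(int tex, float u, float v)
{
    float4v r = { u, v, (float)tex, 1.0f };
    return r;
}
```

Individual components can be read with ordinary subscripts, for example `sample_vec(3, 0.25f, 0.5f)[2]`.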

Stephan

2009-Oct-06 07:00 UTC

On 5 Oct., 23:33, Dan Gohman <goh... at apple.com> wrote:
>
> Is there a reason it needs to be an array? A vector of four floats
> wouldn't have this problem, if that's an option.
>
Unfortunately that's not an option. At the moment I'm restricting
myself to the use of scalar code only, in order to be able to
vectorize the code easily later (e.g., float4 as it is now will then
become an array of four vectors for parallel processing of n (probably
4, SSE) pixels). But thanks for coming up with this idea!

Chris, I'll take a look at the AliasAnalysis functionality. Depending
on how much effort it is to implement a solution I might follow this
approach. If not, there's still Kenneth's new code generator to look
forward to. :)

Thanks,
Stephan
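
The planned widening described in this message, where each scalar component of float4 becomes a vector holding that component for several pixels, might look roughly like the sketch below. It assumes the GCC/Clang `vector_size` extension and a width of four lanes; the names `lanes4`, `float4_x4`, `scale_scalar`, and `scale_x4` are hypothetical.

```c
typedef float lanes4 __attribute__((vector_size(16)));

typedef struct { float  v[4]; } float4;    /* scalar form: one pixel    */
typedef struct { lanes4 v[4]; } float4_x4; /* widened form: four pixels */

/* Scalar code: scale the four components of one pixel. */
float4 scale_scalar(float4 p, float s)
{
    for (int i = 0; i < 4; ++i)
        p.v[i] *= s;
    return p;
}

/* Same loop shape after widening: each iteration now scales one
   component of four pixels at once. */
float4_x4 scale_x4(float4_x4 p, float s)
{
    lanes4 sv = { s, s, s, s }; /* broadcast the scalar to all lanes */
    for (int i = 0; i < 4; ++i)
        p.v[i] = p.v[i] * sv;
    return p;
}
```

Keeping float4 as an array means the scalar and widened versions share the same structure, with only the element type changing.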
