Displaying 20 results from an estimated 800 matches similar to: "builtins name mangling in SPIR 2.0"
2016 Sep 12
2
builtins name mangling in SPIR 2.0
Thanks a lot.
On Mon, Sep 12, 2016 at 1:42 PM, Liu, Yaxun (Sam) <Yaxun.Liu at amd.com> wrote:
> If you use the default header file under clang/lib/Headers/opencl-c.h,
> get_global_id will be mangled.
>
> If you want to declare get_global_id in your own header, add
> __attribute__((overloadable)), then it will be mangled.
>
> Sam
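To make the mangling mentioned above concrete, here is a minimal sketch in C of how an Itanium-style mangled name is assembled for a plain function: the prefix _Z, the length of the name, the name itself, then one code per parameter type. The helper name mangle_simple is hypothetical, and the sketch assumes the caller already supplies the parameter type codes (for example "j" for unsigned int). It does not handle namespaces, address spaces, or qualifiers.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "_Z<len><name><param_codes>" into out.
 * Returns 0 on success, -1 if the output buffer is too small.
 * Only covers unqualified function names with pre-encoded parameter codes. */
static int mangle_simple(const char *name, const char *param_codes,
                         char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "_Z%zu%s%s",
                     strlen(name), name, param_codes);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Called as mangle_simple("get_global_id", "j", buf, sizeof buf), this yields _Z13get_global_idj, the symbol one would expect for an overloadable get_global_id taking a single unsigned int.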
2016 Sep 16
2
builtins name mangling in SPIR 2.0
+ Alexey, Anastasia
According to SPIR spec v1.2 s2.10.3
2.10.3 The printf function
The printf function is supported, and is mangled according to its prototype as follows:
int printf(constant char * restrict fmt, ... )
Note that the ellipsis formal argument (...) is mangled to argument type specifier z
It seems printf should be mangled.
Alexey/Anastasia,
What do you think? Thanks.
Sam
From:
2016 Sep 18
2
builtins name mangling in SPIR 2.0
To be honest, I don't see any problem with mangling it, even though there seems to be only one prototype anyway.
We could add restrict in as well.
Cheers,
Anastasia
________________________________
From: Hongbin Zheng <etherzhhb at gmail.com>
Sent: 17 September 2016 05:32:54
To: Liu, Yaxun (Sam)
Cc: cfe-dev at lists.llvm.org; llvm-dev; Bader, Alexey (alexey.bader at intel.com); Anastasia
2016 Jan 11
4
Some llvm questions (for tgsi backend)
Hi,
After a few distractions I'm back to work on the llvm tgsi backend. I've
added clang integration and I can now compile a simple opencl program
to something which sort of looks like tgsi.
You can find my latest work on this here:
http://cgit.freedesktop.org/~jwrdegoede/llvm
http://cgit.freedesktop.org/~jwrdegoede/clang
(the latter may still need to sync)
I've a little test
2016 Jan 12
1
Some llvm questions (for tgsi backend)
Hi Tom,
Thanks for taking the time to answer this.
On 11-01-16 18:10, Tom Stellard wrote:
> On Mon, Jan 11, 2016 at 12:07:14PM +0100, Hans de Goede wrote:
>> Hi,
>>
>> After a few distractions I'm back to work on the llvm tgsi backend. I've
>> added clang integration and I can now compile a simple opencl program
>> to something which sort of looks like
2016 May 24
1
BitcodeReader non explicit error
Hi,
I'm working on OpenCL and I'm using clang as the compiler (based on clang 3.7.0).
I have an issue: I'm generating a bitcode file (which I can print before generating it), but when I try to read it back with clang, I get this error:
"error: Invalid record"
How can I find out where it comes from?
Thank you,
Romaric
Here is what is printed before the
2016 Jan 11
0
Some llvm questions (for tgsi backend)
On Mon, Jan 11, 2016 at 12:07:14PM +0100, Hans de Goede wrote:
> Hi,
>
> After a few distractions I'm back to work on the llvm tgsi backend. I've
> added clang integration and I can now compile a simple opencl program
> to something which sort of looks like tgsi.
>
> You can find my latest work on this here:
> http://cgit.freedesktop.org/~jwrdegoede/llvm
>
2016 Jan 11
0
Some llvm questions (for tgsi backend)
On Mon, Jan 11, 2016 at 6:07 AM, Hans de Goede <hdegoede at redhat.com> wrote:
> Hi,
>
> After a few distractions I'm back to work on the llvm tgsi backend. I've
> added clang integration and I can now compile a simple opencl program
> to something which sort of looks like tgsi.
>
> You can find my latest work on this here:
>
2017 Feb 06
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc,
Thanks a lot for reviewing this huge assembly function!
silk_warped_autocorrelation_FIX_c()'s kernel part is
for( n = 0; n < length; n++ ) {
tmp1_QS = silk_LSHIFT32( (opus_int32)input[ n ], QS );
/* Loop over allpass sections */
for( i = 0; i < order; i++ ) {
/* Output of allpass section */
tmp2_QS = silk_SMLAWB(
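The kernel quoted above is cut off in this excerpt. As a rough illustration of the recursion it performs, here is a minimal floating-point sketch in C. It is not the fixed-point SILK code: it assumes the usual chain of first-order allpass sections, uses double instead of the QS/QC fixed-point formats, and the names warped_autocorr_sketch and SKETCH_MAX_ORDER are hypothetical.

```c
#define SKETCH_MAX_ORDER 24  /* assumed upper bound on the order */

/* Floating-point sketch of a warped autocorrelation.
 * corr must have room for order + 1 values; order <= SKETCH_MAX_ORDER. */
static void warped_autocorr_sketch(double *corr, const double *input,
                                   double warping, int length, int order)
{
    double state[SKETCH_MAX_ORDER + 1] = { 0.0 };
    int n, i;

    for (i = 0; i <= order; i++) {
        corr[i] = 0.0;
    }

    for (n = 0; n < length; n++) {
        double tmp1 = input[n];
        /* Loop over allpass sections: each section consumes the output
         * of the previous one, so this inner loop is serial in i. */
        for (i = 0; i < order; i++) {
            /* Output of allpass section i */
            double tmp2 = state[i] + warping * (state[i + 1] - tmp1);
            state[i] = tmp1;
            corr[i] += state[0] * tmp1;
            tmp1 = tmp2;
        }
        state[order] = tmp1;
        corr[order] += state[0] * tmp1;
    }
}
```

With warping set to 0 the allpass sections become plain unit delays, so the result is the ordinary autocorrelation of the input. The state array is carried from one input sample to the next, which is the dependency that makes vectorizing this loop awkward.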
2017 Feb 07
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
This is a great idea. But the order (psEncC->shapingLPCOrder) can be
configured to 12, 14, 16, 20 and 24 according to complexity parameter.
It's hard to get a universal function to handle all these orders
efficiently. Any suggestions?
Thanks,
Linfeng
On Mon, Feb 6, 2017 at 12:40 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> Hi Linfeng,
>
> On 06/02/17 02:51 PM,
2017 Feb 07
3
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc,
Thanks for your suggestions. Will get back to you once we have some updates.
Linfeng
On Mon, Feb 6, 2017 at 5:47 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> Hi Linfeng,
>
> On 06/02/17 07:18 PM, Linfeng Zhang wrote:
> > This is a great idea. But the order (psEncC->shapingLPCOrder) can be
> > configured to 12, 14, 16, 20 and 24 according to
2017 Apr 05
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
I attached a new patch with a small cleanup (the disassembly is identical to that of
the last patch). We have done the same internal testing as usual.
Also attached are 2 failed temporary versions which try to reduce code size
(for code review reference only).
The new patch of silk_warped_autocorrelation_FIX_neon() has a code size of
3,228 bytes (with gcc).
smaller_slower.c has a code size of 2,304
2017 Apr 05
4
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Thanks, Jean-Marc!
The speedup percentages are all relative to the entire encoder.
Compared to master, this optimization patch speeds up the fixed-point SILK
encoder on NEON as follows:
Complexity 5: 6.1%
Complexity 6: 5.8%
Complexity 8: 5.5%
Complexity 10: 4.0%
when testing on an Acer Chromebook, ARMv7 Processor rev 3 (v7l), CPU max
MHz: 2116.5
Thanks,
Linfeng
On Wed, Apr 5, 2017 at 11:02 AM,
2010 Sep 20
1
ERROR: Object not found
Dear All,
I am trying to use the ODE solver "rk4" to solve an ODE system; however, it
keeps saying: Error in eval(expr, envir, enclos) : object "dIN" not found.
The sample code follows. Please help me. Thank you very
much!
rm(list=ls())
library(odesolve)
# The ODE system
ode <- function(t,x,p){
with(as.list(c(x,p)),{
2010 Sep 20
1
Ask for help with Error: Object not found
Dear All,
I am trying to use the ODE solver "rk4" to solve an ODE system; however, it
keeps saying: Error in eval(expr, envir, enclos) : object "dIN" not found.
The sample code follows. Please help me. Thank you very
much!
rm(list=ls())
library(odesolve)
# The ODE system
ode <- function(t,x,p){
with(as.list(c(x,p)),{
2017 Feb 06
0
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Linfeng,
On 06/02/17 02:51 PM, Linfeng Zhang wrote:
> However, the critical thing is that all the states in each stage when
> processing input[i] are reused by the next input[i+1]. That is
> input[i+1] must wait input[i] for 1 stage, and input[i+2] must wait
> input[i+1] for 1 stage, etc.
That is indeed the tricky part... and the one I think you could do
slightly differently. If
2017 Feb 07
0
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Linfeng,
On 06/02/17 07:18 PM, Linfeng Zhang wrote:
> This is a great idea. But the order (psEncC->shapingLPCOrder) can be
> configured to 12, 14, 16, 20 and 24 according to complexity parameter.
>
> It's hard to get a universal function to handle all these orders
> efficiently. Any suggestions?
I can think of two ways of handling larger orders. The obvious one is
2017 Apr 03
0
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Jean-Marc,
Attached is the silk_warped_autocorrelation_FIX_neon() which implements
your idea.
Speed improvement vs the previous optimization:
Complexity 0-4: Doesn't call this function.
Complexity 5: 2.1% (order = 16)
Complexity 6: 1.0% (order = 20)
Complexity 8: 0.1% (order = 24)
Complexity 10: 0.1% (order = 24)
Code size of silk_warped_autocorrelation_FIX_neon() changes from 2,644
2017 Apr 05
0
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Linfeng,
Thanks for the updated patch. I'll have a look and get back to you. When
you report speedup percentages, is that relative to the entire encoder
or relative to just that function in C? Also, what's the speedup
compared to master?
Cheers,
Jean-Marc
On 05/04/17 12:14 PM, Linfeng Zhang wrote:
> I attached a new patch with small cleanup (disassembly is identical as
> the
2017 Apr 06
0
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
Hi Linfeng,
I had a closer look at your patch and the code looks good -- and
slightly simpler than I had anticipated, so that's good.
I did some profiling on a Cortex A57 and I've been seeing slightly less
improvement than you're reporting, more like 3.5% at complexity 8. It
appears that the warped autocorrelation function itself is only faster
by a factor of about 1.35. That's a
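The two figures above can be related with a simple Amdahl-style estimate. The sketch below is only an illustration under a stated assumption: it treats the 3.5% as a reduction in total encoding time, and the function name implied_time_fraction is hypothetical. If a function that takes a fraction f of the run time becomes s times faster, the total time drops by f * (1 - 1/s), so f can be recovered from the other two numbers.

```c
/* Hypothetical helper: given the overall reduction in run time
 * (e.g. 0.035 for 3.5%) and the speedup factor of one function
 * (e.g. 1.35), return the fraction of the original run time that
 * function must have accounted for: f = reduction / (1 - 1/s). */
static double implied_time_fraction(double total_reduction,
                                    double func_speedup)
{
    return total_reduction / (1.0 - 1.0 / func_speedup);
}
```

Under that assumption, implied_time_fraction(0.035, 1.35) comes out at roughly 0.135, meaning the function would account for about 13.5% of the encoder's run time before the optimization.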