thr3ads.net - similar to: "[LLVMdev] Segfault on AArch64 LNT"

Displaying 20 results from an estimated 8000 matches similar to: "[LLVMdev] Segfault on AArch64 LNT"

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize

2014 Dec 29

Hi Timothy, It requires some extra effort if twiddles and input/output have different bit width. Since Opus uses int32 for twiddles, we are going to do the same thing. Thanks, Phil Wang

[LLVMdev] Address Space Casting

2013 Sep 10

Hello everybody, I am writing this mail to inform you about a patch that will be committed soon (subject to the current reviews). Here is the link to the first mail in llvm-commits: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130812/184422.html This patch introduces a new IR instruction named 'addrspacecast' that will be used to represent the casting operation between

[LLVMdev] RFC:LNT Improvements

2014 Apr 30

On 30 April 2014 10:50, Tobias Grosser <tobias at grosser.es> wrote:
> Only then we can judge the effects of changes that are aimed to increase the
> quality.
Agreed.
> My proposal is to do this right ahead. As there is enough data from the
> public X86 -O3 runs (10 samples each run, with 3-5 commits between each
> run), the only missing piece seems to be the LNT changes to

[LLVMdev] Support for Soft-float

2014 Sep 24

Hi, I'm trying to generate some SPARCv8 assembly for a sparc target that doesn't have an FPU. I'm unable to get the flow to generate calls to a soft-float library. Since I wasn't able to find a definitive answer, I was hoping someone might be able to offer some pointers or shed some light. Running "clang -c -emit-llvm -msoft-float test.c -o test.bc" doesn't generate

fixed point version for celt_pitch_xcorr on aarch64

2015 Jan 27

Hi, all, Does Opus need celt_pitch_xcorr's fixed point version for ARM aarch64 architecture? If yes, which version does Opus prefer: assembly or intrinsics? Thanks, Zhongwei

[LLVMdev] Contributing the Apple ARM64 compiler backend

2014 Mar 31

Hi, Apart from whether fast-isel should be enabled or disabled (I think enabled, personally), I haven't heard any dissenting voices about how to attack the merge problem yet. Tim, am I correct in saying that you believe AArch64 -> ARM64 is the right way to go? Does anyone disagree with that approach? Cheers, James

[LLVMdev] RFC:LNT Improvements

2014 Apr 29

Dear all, Following the Benchmarking BOF from the 2013 US dev meeting, I’d like to propose some improvements to the LNT performance tracking software. The most significant issue with the current implementation is that the report is filled with extremely noisy values. Hence it is hard to notice performance improvements or regressions. After investigation of LNT and the LLVM test suite, I propose

[LLVMdev] Address Space Casting

2013 Sep 10

Hi,
| This patch introduces a new IR instruction named 'addrspacecast' that will be
| used to represent the casting operation between pointers of different address
| spaces. This instruction will represent whatever kind of conversion (potentially
| both value and size of the pointer) and the semantic of the conversion between a
| pair of address spaces is target specific.
Assuming I

opus Digest, Vol 76, Issue 11

2015 May 11

Hi Jean-Marc, Thanks for pointing us the way. Yes, it is an overflow problem. I moved all scaling code in front of any other operations, and test_unit_mdct passes for all sizes. I will update Ne10 right after Vish double checks it on hardware. He will repost patches with more verification later this week. Regards, Phil Wang
Well, I see three questions that need to be answered at this point
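The reason that moving the scaling ahead of the other operations helps can be shown with a minimal Python sketch. This is not Ne10 code: the helper names, the halving butterfly, and the input values are assumptions chosen only to show why the order of operations matters in signed 32-bit arithmetic.

```python
def wrap_int32(value):
    """Return the value a signed 32-bit register would hold (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def add_then_scale(a, b):
    """Hypothetical butterfly: add first, halve afterwards. The sum can wrap."""
    return wrap_int32(wrap_int32(a + b) >> 1)


def scale_then_add(a, b):
    """Hypothetical butterfly: halve both inputs first, then add."""
    return wrap_int32((a >> 1) + (b >> 1))


# Two large positive inputs whose sum does not fit in 32 bits.
a = b = 0x60000000

print(add_then_scale(a, b))   # wrapped, wrong sign
print(scale_then_add(a, b))   # stays in range
```

Scaling first costs one bit of precision per input, which is the usual trade-off for the extra headroom.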

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?

2014 Dec 24

Hi, I am working on the DSP module of Ne10. I see there are fixed-point and floating-point FFTs inside Opus. Is the fixed-point FFT only a fallback for CPUs without VFP? On ARMv7-A and ARMv8-A, benchmark results show that fixed-point (int32) and floating-point (float32) FFTs have similar performance. I guess the fixed-point version is not often used on these platforms. Is it worth the effort to NEON-optimize

[ARM][FFT][NEON] Integrate Ne10 into Opus?

2014 Dec 18

Hi Ralph, I have pushed patches to enable radix 3 and radix 5. Github: https://github.com/projectNe10/Ne10/releases/tag/v1.2.0 Best Regards, Phil Wang
> Date: Thu, 11 Dec 2014 10:46:50 -0800
> From: Ralph Giles <giles at thaumas.net>
> Subject: Re: [opus] [ARM][FFT][NEON] Integrate Ne10 into Opus?
> To: opus at xiph.org
> Message-ID: <5489E69A.5000305 at thaumas.net>

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize

2015 Jan 19

Hi Jean-Marc, I have implemented fixed-point FFT with 32-bit twiddles. Now I want to evaluate the accuracy. What method does Opus use? I use a function implemented inside Ne10 to calculate SNR. Any comments?
| size | SNR (dB) |
| 16 | 82.558587 |
| 32 | 83.530298 |
| 60 | 80.292433 |
| 64 | 82.752950 |
| 120 | 79.625077 |
| 128 | 83.091260 |
| 240 | 79.555263 |
| 256 |
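For readers unfamiliar with the SNR figures quoted above, here is a minimal Python sketch of the conventional definition: signal power of a reference output divided by the power of the error, in dB. Whether the Ne10 function or the Opus tests compute exactly this is an assumption; the function name `snr_db` is hypothetical.

```python
import math


def snr_db(reference, measured):
    """Signal-to-noise ratio in dB between a reference and a measured output.

    Both arguments are equal-length sequences of real or complex samples,
    for example a double-precision FFT result and a fixed-point FFT result
    converted back to floating point.
    """
    if len(reference) != len(measured):
        raise ValueError("sequences must have the same length")
    signal_power = sum(abs(r) ** 2 for r in reference)
    noise_power = sum(abs(r - m) ** 2 for r, m in zip(reference, measured))
    if noise_power == 0:
        return math.inf  # identical outputs: no measurable error
    return 10.0 * math.log10(signal_power / noise_power)


# A 10% amplitude error on a single sample gives roughly 20 dB.
print(round(snr_db([10.0], [9.0]), 3))
```

Under this definition, each additional 6 dB corresponds to roughly one more correct bit in the output.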

[LLVMdev] Pseudo load and store instructions for AArch64

2014 Aug 22

Hi Renato,
> > I'm trying to add pseudo 64-bit load and store instructions for AArch64, which
> > should have latencies set to "1" while being otherwise exactly the same as
> > normal load and store instructions.
>
> Can I ask why would you need that?
This is the only way I found to stop Machine Instruction Scheduler from reordering load and store

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?

2014 Dec 25

Jean-Marc Valin wrote: > There is definitely some use for a Neon fixed-point FFT. How much > exactly I'm not sure. Fixed-point is a bit more than just a fall-back Well, we use fixed-point mode by default in Firefox for both Firefox OS and Fennec (Firefox on Android). The reason is that, although there is some NEON-class hardware where float does finally appear to be a little bit

[LLVMdev] RFC:LNT Improvements

2014 Apr 30

On 30/04/2014 16:20, Yi Kong wrote:
> Hi Tobias, Renato,
>
> Thanks for your attention to my RFC.
> On 30 April 2014 07:50, Tobias Grosser <tobias at grosser.es> wrote:
> >> - Show and graph total compile time
> >> There is no obvious way to scale up the compile time of
> >> individual benchmarks, so total time is the best thing we can do to
>

[LLVMdev] [cfe-dev] AArch64 Clang CLI interface proposal

2014 Jan 08

I knew I'd regret leaving that option in for the MIPS port back in 99. Basically this is the only acceptable way for mcpu to exist, but should never have been added to the GCC aarch64 port at all since there's no compatibility with existing build systems to worry about. I would still like you to show this mythical piece of software that needs this compatibility. -eric On Jan 8, 2014 3:06

opus Digest, Vol 72, Issue 17

2015 Feb 03

Hi all, I have already added support for scaled forward non-power-of-2 floating-point FFT: https://github.com/projectNe10/Ne10/commit/79c3d787302f8d74b9bcfe6545d487cdf1b101d9 Two flags are added to the cfg structure: is_forward_scaled and is_backward_scaled. By setting is_forward_scaled to anything but zero, ne10_fft_c2c_1d_float32_neon will scale the output. So we can remove the need for one buffer on

[RFC V3 7/8] armv7, armv8: Optimize fixed point fft using NE10 library

2015 Oct 06

I'm trying to get these cleaned up and landed, but I'm running into some trouble with this patch. Using commit a08b29d88e3c (July 21) of Ne10, I'm seeing test failures for 60-point FFTs:
nfft=60 inverse=0,snr = -3.312408
** poor snr: -3.312408 **
nfft=60 inverse=1,snr = -16.079597
** poor snr: -16.079597 **
All other sizes tested appear to work fine (84 to 140 dB of SNR). This

[LLVMdev] Contributing the Apple ARM64 compiler backend

2014 Mar 31

On Mon, Mar 31, 2014 at 1:01 PM, Renato Golin <renato.golin at linaro.org> wrote:
> On 31 March 2014 20:55, Tim Northover <t.p.northover at gmail.com> wrote:
> > I'd almost prefer to leave it in for the bugs to be discovered
> > (perhaps after some simple tests of our own). ARM went without
> > FastISel support on Linux for years simply because it was declared

[LLVMdev] [PBQP] Are edges between nodes from totally disjoint register classes necessary ?

2015 Feb 04

Hi Lang, While working on improving the debug dumps of the PBQP graphs, I found out that we can have some edges between nodes which belong to totally disjoint register classes (for example, on AArch64, this would be an int and a floating point register). Although it is true those two registers interfere, in the sense that they are alive at the same time, they never have any physical interference,
