thr3ads.net - search: "lsbs"

2008 Dec 16

2

[LLVMdev] Shifts that use only 5 LSBs.

I'm working on a Target that only uses the 5 lsbs of the shift amount. I only have 32 bit registers, no 64 bit, so 64 bit math is emulated, LLVM doing the transformations whenever I can get it to. I think I'm seeing a case where it ultimately looks like a standard multiword shift (from e.g. Hacker's Delight) is being inline expanded...

[LLVMdev] Shifts that use only 5 LSBs.

2008 Dec 17

0

[LLVMdev] Shifts that use only 5 LSBs.

On Tue, Dec 16, 2008 at 3:36 PM, Daniel M Gessel <gessel at apple.com> wrote:
> I'm working on a Target that only uses the 5 lsbs of the shift amount.
Okay, that's quite common... x86 is the same.
> I only have 32 bit registers, no 64 bit, so 64 bit math is emulated,
> LLVM doing the transformations whenever I can get it to.
x86 is the same.
> I think I'm seeing a case where it ultimately looks like a sta...

[LLVMdev] Shifts that use only 5 LSBs.

2008 Dec 17

0

[LLVMdev] Shifts that use only 5 LSBs.

On Tue, Dec 16, 2008 at 5:20 PM, Daniel M Gessel <gessel at apple.com> wrote:
> The problem here is that it looks like LLVM is introducing an expansion that
> assumes 32 bit shifts use more than 5 bits of the shift value.
> I created a simple test function:
> u64 mebbe_shift( u64 x, int test )
> {
>     if( test )
>         x <<= 2;
>     return x;
> }
> I compile using
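The concern in this thread can be illustrated with a minimal sketch of a 64-bit left shift built from 32-bit halves. It assumes a machine whose 32-bit shifts look only at the 5 LSBs of the amount; the helpers shl5 and shr5 and the function shl64 are hypothetical names that model that behaviour and are not taken from LLVM. A branch-free expansion that expects a shift by 32 or more to produce 0 would give wrong results on such a machine, so this version branches on the amount instead.

```c
#include <stdint.h>

/* Hypothetical model of a hardware shift that uses only the 5 LSBs
 * of the shift amount (so a shift by 32 behaves like a shift by 0). */
static uint32_t shl5(uint32_t x, uint32_t n) { return x << (n & 31u); }
static uint32_t shr5(uint32_t x, uint32_t n) { return x >> (n & 31u); }

/* 64-bit left shift of the pair (hi, lo), valid for 0 <= n <= 63.
 * Every amount handed to shl5/shr5 stays in 0..31, so nothing here
 * depends on an out-of-range shift yielding 0. */
static void shl64(uint32_t *hi, uint32_t *lo, uint32_t n)
{
    if (n == 0)
        return;                     /* avoids a right shift by 32 below */
    if (n >= 32) {
        *hi = shl5(*lo, n - 32);    /* low word moves into the high word */
        *lo = 0;
    } else {
        *hi = shl5(*hi, n) | shr5(*lo, 32 - n);
        *lo = shl5(*lo, n);
    }
}
```

Without the n == 0 check, the term shr5(*lo, 32 - n) would act as a shift by 0 and OR the whole low word into the high word.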

[LLVMdev] Shifts that use only 5 LSBs.

2008 Dec 18

0

[LLVMdev] Shifts that use only 5 LSBs.

On Thu, Dec 18, 2008 at 1:40 PM, Daniel M Gessel <gessel at apple.com> wrote:
> I can't find the bug you refer to.
Did the link not work? I'll try pasting it in again. In any case, I checked in a fix disabling the broken optimization; try updating to current SVN.
http://llvm.org/bugs/show_bug.cgi?id=3225
> Also, it doesn't have this problem in x86: it uses the shldl

[LLVMdev] Shifts that use only 5 LSBs.

2008 Dec 18

2

[LLVMdev] Shifts that use only 5 LSBs.

I can't find the bug you refer to. Also, it doesn't have this problem in x86: it uses the shldl instruction. PPC32, interestingly enough, generates something similar, but looks like it has extra instructions to OR in what's guaranteed to be 0. I'm reminding myself of some PPC assembler though, so I'm not 100% sure.
Thanks,
Dan
On Dec 16, 2008, at 9:27 PM, Eli Friedman wrote:

bitreader optimizations

2008 Mar 14

2

bitreader optimizations

...AC__bitreader_read_raw_uint32() */
 FLAC__bool FLAC__bitreader_read_rice_signed_block(FLAC__BitReader *br, int vals[], unsigned nvals, unsigned parameter)
-/* OPT: possibly faster version for use with MSVC */
-#ifdef _MSC_VER
 {
-    unsigned i;
-    unsigned uval = 0;
-    unsigned bits; /* the # of binary LSBs left to read to finish a rice codeword */
-    /* try and get br->consumed_words and br->consumed_bits into register;
      * must remember to flush them back to *br before calling other
-     * bitwriter functions that use them, and before returning */
-    register unsigned cwords;
-    register unsigne...

[LLVMdev] Shifts that use only 5 LSBs.

2008 Dec 17

2

[LLVMdev] Shifts that use only 5 LSBs.

On Dec 16, 2008, at 7:57 PM, Eli Friedman wrote:
> On Tue, Dec 16, 2008 at 3:36 PM, Daniel M Gessel <gessel at apple.com>
> wrote:
>> I'm working on a Target that only uses the 5 lsbs of the shift
>> amount.
>
> Okay, that's quite common... x86 is the same.
>
Thanks - yes, I'd heard rumors that x86 operates the same way.
>> I only have 32 bit registers, no 64 bit, so 64 bit math is emulated,
>> LLVM doing the transformations whenever I can...

[PATCH] Optimize FLAC__bitreader_read_rice_signed

2012 May 04

0

[PATCH] Optimize FLAC__bitreader_read_rice_signed

...AC__bitreader_read_raw_uint32() */
 FLAC__bool FLAC__bitreader_read_rice_signed_block(FLAC__BitReader *br, int vals[], unsigned nvals, unsigned parameter)
-/* OPT: possibly faster version for use with MSVC */
-#ifdef _MSC_VER
 {
-    unsigned i;
-    unsigned uval = 0;
-    unsigned bits; /* the # of binary LSBs left to read to finish a rice codeword */
-    /* try and get br->consumed_words and br->consumed_bits into register;
      * must remember to flush them back to *br before calling other
-     * bitwriter functions that use them, and before returning */
-    register unsigned cwords;
-    register unsigne...

bitreader optimizations

2008 Mar 17

0

bitreader optimizations

...AC__bitreader_read_raw_uint32() */
 FLAC__bool FLAC__bitreader_read_rice_signed_block(FLAC__BitReader *br, int vals[], unsigned nvals, unsigned parameter)
-/* OPT: possibly faster version for use with MSVC */
-#ifdef _MSC_VER
 {
-    unsigned i;
-    unsigned uval = 0;
-    unsigned bits; /* the # of binary LSBs left to read to finish a rice codeword */
-    /* try and get br->consumed_words and br->consumed_bits into register;
      * must remember to flush them back to *br before calling other
-     * bitwriter functions that use them, and before returning */
-    register unsigned cwords;
-    register unsigne...

[PATCH 1/3] Make FLAC__clz_soft_uint32 static.

2012 Aug 28

3

[PATCH 1/3] Make FLAC__clz_soft_uint32 static.

---
 src/libFLAC/include/private/bitmath.h | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/src/libFLAC/include/private/bitmath.h b/src/libFLAC/include/private/bitmath.h
index 61b0e03..d32b1a7 100644
--- a/src/libFLAC/include/private/bitmath.h
+++ b/src/libFLAC/include/private/bitmath.h
@@ -42,7 +42,7 @@
 #endif
 /* Will never be emitted for MSVC, GCC, Intel compilers */

Exc CB Search very little Question

2006 Sep 19

2

Exc CB Search very little Question

...straightforward. My goal is to make sure if nothing is embedded, the altered Speex version still needs to write exactly the same coefficients as the original non-stego version. For my scenario, I stick to NB encoding at 15kbps, so N is equal to 2. So all I can do, in my opinion, is to check if the LSBs of best_nind[0][i] and best_nind[1][i] (for each i in 0:nb_subvect) are different. The Problem is, that I would like to know how big the error is, which I introduce into the signal by this change. But in the end, I only know the ndist for all indexes of the last one of these nb_subvect indexes. (I...

Exc CB Search very little Question

2006 Sep 19

1

Exc CB Search very little Question

...h function to get the N best _combinations_ of CB IDs? Wouldn't it be easier and just as effective (for N==2) to do the following: Get nb_subvect indices to write into the stream. Write all but the last of these into the stream. If nind[0][nb_subvect-1] and nind[1][nb_subvect-1] have different LSBs and if the difference between ndist[0] and ndist[1] is small enough (smaller than the normal variation of ndist[0], which can be measured), we can decide to write nind[1][nb_subvect-1] into the stream instead of nind[0][nb_subvect-1]. (That's actually what I am doing right now.)
> Of course,...

2017 Apr 26

2

2 patches related to silk_biquad_alt() optimization

> ...o I'm not totally opposed to that, but it increases the
> testing/maintenance cost so it needs to be worth it. So the question is
> how much speedup can you get and how close you can make the result to
> the original function. If you can make the output be always within one
> of two LSBs of the C version, then the asm check can simply be a little
> bit more lax than usual. Otherwise it becomes more complicated. This
> isn't a function that scares me too much about going non-bitexact, but
> it's also not one of the big complexity costs either. In any case, let
> ...
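The "more lax" check mentioned in this message could look roughly like the sketch below, which compares an optimized output buffer against the reference C output and accepts a difference of a few LSBs per sample. The name within_lsbs and its signature are hypothetical; this is not the Opus test harness, only an illustration of a tolerance comparison.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every out[i] differs from ref[i] by at most tol LSBs,
 * and 0 otherwise. The difference is computed in 64 bits so that it
 * cannot overflow for any pair of 32-bit inputs. */
static int within_lsbs(const int32_t *ref, const int32_t *out,
                       size_t n, int32_t tol)
{
    for (size_t i = 0; i < n; i++) {
        int64_t diff = (int64_t)ref[i] - (int64_t)out[i];
        if (diff < 0)
            diff = -diff;
        if (diff > tol)
            return 0;
    }
    return 1;
}
```

A bit-exact check is the special case tol == 0.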

2017 Apr 25

2

2 patches related to silk_biquad_alt() optimization

On Mon, Apr 24, 2017 at 5:52 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> On 24/04/17 08:03 PM, Linfeng Zhang wrote:
> > Tested on my chromebook, when stride (channel) == 1, the optimization
> > has no gain compared with C function.
>
> You mean that the Neon code is the same speed as the C code for
> stride==1? This is not terribly surprising for an IIRC

2017 May 15

2

2 patches related to silk_biquad_alt() optimization

> ...testing/maintenance cost so it needs to be worth it. So the question is
> how much speedup can you get and how close you can make the result to
> the original function. If you can make the output be always within one
> of two LSBs of the C version, then the asm check can simply be a little
> bit more lax than usual. Otherwise it becomes more complicated. This
> isn't a function that scares me too much about going non-bitexact, but
> it's also not one of the b...

Exc CB Search very little Question

2006 Sep 19

0

Exc CB Search very little Question

> ...is to make sure if nothing is embedded,
> the altered Speex version still needs to write exactly the same
> coefficients as the original non-stego version.
>
> For my scenario, I stick to NB encoding at 15kbps, so N is equal to 2.
> So all I can do, in my opinion, is to check if the LSBs of
> best_nind[0][i] and best_nind[1][i] (for each i in 0:nb_subvect) are
> different. The Problem is, that I would like to know how big the error
> is, which I introduce into the signal by this change. But in the end, I
> only know the ndist for all indexes of the last one of these nb_...

2017 Apr 26

0

2 patches related to silk_biquad_alt() optimization

...ltiplication. OK, so I'm not totally opposed to that, but it increases the testing/maintenance cost so it needs to be worth it. So the question is how much speedup can you get and how close you can make the result to the original function. If you can make the output be always within one of two LSBs of the C version, then the asm check can simply be a little bit more lax than usual. Otherwise it becomes more complicated. This isn't a function that scares me too much about going non-bitexact, but it's also not one of the big complexity costs either. In any case, let me know what you fin...

Denoiser level and AEC problem

2006 Sep 20

2

Denoiser level and AEC problem

...straightforward. My goal is to make sure if nothing is embedded, the altered Speex version still needs to write exactly the same coefficients as the original non-stego version. For my scenario, I stick to NB encoding at 15kbps, so N is equal to 2. So all I can do, in my opinion, is to check if the LSBs of best_nind[0][i] and best_nind[1][i] (for each i in 0:nb_subvect) are different. The Problem is, that I would like to know how big the error is, which I introduce into the signal by this change. But in the end, I only know the ndist for all indexes of the last one of these nb_subvect indexes. (I...

Idea to possibly improve flac?

2011 Jan 08

0

Idea to possibly improve flac?

> ...tored
> in 24 bits might have some patterns, such as perhaps all 1s or all 0s, but
> there's no guarantee. Dithering starts by adding noise, and so the
> intermediate result has valid bits everywhere. When the dither process is
> complete, the result should be masked so that the 8 LSBs are all 0, but if
> the masking isn't done then they really could be anything. I have noticed
> interesting patterns with bit meters, but I'm not sure whether to trust a
> bit meter that I did not write myself, as the two I have used show
> completely different pictures of the...
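The masking step described in this message can be sketched as follows, assuming each sample sits in a signed 32-bit container on a two's complement machine. The names mask_low8 and low8_clear are hypothetical and do not come from the FLAC sources.

```c
#include <stdint.h>

/* Clear the 8 LSBs of a sample, keeping all higher bits and the sign.
 * Assumes two's complement representation of negative values. */
static int32_t mask_low8(int32_t sample)
{
    return sample & ~(int32_t)0xFF;
}

/* Nonzero if the 8 LSBs of the sample are already all 0. */
static int low8_clear(int32_t sample)
{
    return (sample & 0xFF) == 0;
}
```

A check like low8_clear, run over a whole block, is one simple way to see whether the low bits actually carry data.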

FLAC decoding details

2009 Apr 05

1

FLAC decoding details

Hello all, I am writing an implementation of a FLAC decoder and I am polishing up some details. The format <http://flac.sourceforge.net/format.html> page leaves some room for interpretation. Can anyone help me by clarifying the official rules about the following? Most of them are degenerate cases that probably don't happen in practice: Thanks, --Jonathan Can the bits per sample
