thr3ads.net - search: "doubleword"

Displaying 20 results from an estimated 32 matches for "doubleword".

2011 Oct 27

[LLVMdev] Trunc Load

...%tmp = load i64* @l %conv = trunc i64 %tmp to i32 ret i32 %conv } ================================== Of interest are the lines %tmp = load i64* @l %conv = trunc i64 %tmp to i32 ... which LLVM automatically translates to "load singleword" from memory, instead of "load doubleword; then truncate". However, this (simply load a singleword) gives an erroneous result. On my architecture, this results in the high bits being loaded into the return register, instead of the low bits, as should happen with truncate. Details: i64 's are stored in two adjacent 32 bit regis...

2010 Feb 22

[LLVMdev] Paired register allocation problem

...e 64bit instructions which are using pairs of these 32bit registers. I have defined registers, aliases and subregister set. The problem is that register allocator is using 32bit registers that are already used in a pair, for example: lw $r0, 16[$r12] // load word to r0 ld $p0, 36[$r12] // load doubleword to p0 shl $p0, $p0, $r0 // shift left p0 by r0 and store result in p0 where p0 is a pair r0:r1 Could anyone tell me what am I doing wrong? Thanks in advance Artur -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attac...

2015 Dec 17

llvm-3.6 MCAsmParser x64 Error "invalid operand for instruction" when msb set

Hello, I am experiencing problems, when trying to assemble these two x86-64 Opcodes "add r64, imm32" "imul r64, r64, imm32" When having the most significant bit set for imm32, for example: "add rax, 0x80000000", "add rax, 0xffffffff", ... "imul rbx, rsi, 0x80000000", "imul rbx, rsi, 0xffffffff", ... The Error Message I receive is the
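
A possible reading of the error, not confirmed by the truncated snippet above: the x86-64 forms "add r64, imm32" and "imul r64, r64, imm32" sign-extend the 32-bit immediate to 64 bits, so a positive 64-bit value such as 0x80000000 or 0xffffffff has no imm32 encoding. A minimal C sketch of that range check follows; the helper name fits_simm32 is hypothetical and is not part of LLVM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when `value` can be represented as a 32-bit
   immediate that the CPU sign-extends to 64 bits, i.e. when it lies in
   the range [-2^31, 2^31 - 1]. */
bool fits_simm32(int64_t value) {
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

Under this reading, 0x7fffffff and -1 pass the check, while 0x80000000 and 0xffffffff taken as positive 64-bit values do not.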

2011 Oct 27

[LLVMdev] Trunc Load

...i32 > ret i32 %conv > } > ================================== > > Of interest are the lines > %tmp = load i64* @l > %conv = trunc i64 %tmp to i32 > > ... which LLVM automatically translates to "load singleword" from > memory, instead of "load doubleword; then truncate". However, this > (simply load a singleword) gives an erroneous result. On my > architecture, this results in the high bits being loaded into the return > register, instead of the low bits, as should happen with truncate. > > Details: i64 's are stored in two...

2010 Feb 22

[LLVMdev] Paired register allocation problem

Hello, Artur > I have defined registers, aliases and subregister set. > The problem is that register allocator is using 32bit registers that are > already used in a pair, for example: > lw $r0, 16[$r12] // load word to r0 > ld $p0, 36[$r12] // load doubleword to p0 > shl $p0, $p0, $r0 // shift left p0 by r0 and store result in p0 > where p0 is a pair r0:r1 > Could anyone tell me what am I doing wrong? Have you defined aliases properly? Look how this is handled inside s390 backend (systemz). Note that in general you'll need to write "...

2013 Oct 25

[LLVMdev] Bug #16941

Nadav, The problem appears only for vectors longer than available hardware register (in doubleword elements, i.e. more than 4 on SSE4 and more than 8 on AVX). Select does weird thing. <8 x i1> mask comes as two XMM registers, select converts them to a single XMM registers (i.e. 8 x 16 bit), immediately after it converts back to two XMM registers and does blend. Conversion forth and back ha...

2004 Sep 10

libFLAC internals

...ough some runs, it appears that 'order' mod 4 is always 0. Is that guaranteed, either by the format or by higher functions in the reference decoder? Also, what assumptions can I make about the alignment of 'data' and 'qlp_coeff'? It would be really nice if these were both doubleword-aligned. Finally, in a more general context, is there an easy way to build for profiling, or do I have to edit the makefiles? I'm using gcc and gprof. Thanks in advance, -Brady -- Brady Patterson (brady@spaceship.com) Do you know Old Kentucky Shark?

2013 Oct 26

[LLVMdev] Bug #16941

...o split the vectors to increase ILP ? In that case ISPC should generate two vectors operations. Thanks, Nadav On Oct 25, 2013, at 2:16 PM, Dmitry Babokin <babokin at gmail.com> wrote: > Nadav, > > The problem appears only for vectors longer than available hardware register (in doubleword elements, i.e. more than 4 on SSE4 and more than 8 on AVX). Select does weird thing. <8 x i1> mask comes as two XMM registers, select converts them to a single XMM registers (i.e. 8 x 16 bit), immediately after it converts back to two XMM registers and does blend. Conversion forth and back ha...

2013 Oct 26

[LLVMdev] Bug #16941

...In that case ISPC should generate two > vectors operations. > > Thanks, > Nadav > > > On Oct 25, 2013, at 2:16 PM, Dmitry Babokin <babokin at gmail.com> wrote: > > Nadav, > > The problem appears only for vectors longer than available hardware > register (in doubleword elements, i.e. more than 4 on SSE4 and more than 8 > on AVX). Select does weird thing. <8 x i1> mask comes as two XMM registers, > select converts them to a single XMM registers (i.e. 8 x 16 bit), > immediately after it converts back to two XMM registers and does blend. > Conversi...

2011 Oct 27

[LLVMdev] Trunc Load

On Thu, Oct 27, 2011 at 9:29 AM, Johannes Birgmeier <e0902998 at student.tuwien.ac.at> wrote: > >> Hi Johannes, what processor are you targeting? Is it little-endian or >> big-endian? > Little-endian. (The truth: you can set it manually, but it is set to > little endian, for sure.) The processor is a TI TMS320C64x. > > Follow-up: I discovered that the

2011 Oct 27

[LLVMdev] Trunc Load

...is contradictory: on a little-endian processor, the address for > loading a 64-bit value is same as the address of the low word. Are > you sure you're modeling the semantics of your lddw and stddw > instructions correctly? ... I thought so until now. Because I implemented stdw (store doubleword) completely analogous to lddw: Just print out stdw with the given pointer and the register pair, just like lddw. (This seems obvious.) Well, ****. I just read the documentation very carefully (yeah I know. I'm sorry) and it seems that stdw doesn't care about the big/little endian settin...
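
The point made in the reply above, that on a little-endian machine the low word of a 64-bit value sits at the same address as the value itself, can be sketched in C. This is only an illustration on the host machine, not the TMS320C64x backend code; the function names word_at_offset and low_word_offset are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read the 32-bit word stored at byte `offset`
   (0 or 4) inside the in-memory representation of a 64-bit value. */
uint32_t word_at_offset(uint64_t value, size_t offset) {
    uint32_t word;
    memcpy(&word, (const unsigned char *)&value + offset, sizeof word);
    return word;
}

/* Hypothetical helper: byte offset of the low 32 bits of a 64-bit value
   on the host. Assumes a plain little- or big-endian byte order. */
size_t low_word_offset(void) {
    const uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1 ? 0 : 4; /* little-endian : big-endian */
}
```

A narrowed load that implements "trunc i64 to i32" has to read from low_word_offset(), which is 0 on a little-endian host; reading from the other offset returns the high bits instead, the symptom described in the thread.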

2019 Apr 10

[RFC PATCH 10/12] virtio/s390: consolidate DMA allocations

...find anything > in the virtio spec that would put requirements on how this > status field needs to be aligned. But I did not look to hard. > > The ccw.cda can hold an arbitrary data address AFAIR (for indirect, > of course we do have alignment requirements). I think it needs to be doubleword aligned. > > Apparently status used to be a normal field, and became a pointer with > 73fa21ea4fc6 "KVM: s390: Dynamic allocation of virtio-ccw I/O > data." (Cornelia Huck, 2013-01-07). I could not quite figure out why. In the beginning, the code used a below-2G-area for al...

2004 Sep 10

libFLAC internals

...> that guaranteed, either by the format or by higher functions in the reference > decoder? No, 1 <= order <= 32. There is -l option :). > Also, what assumptions can I make about the alignment of 'data' and > 'qlp_coeff'? It would be really nice if these were both doubleword-aligned. Everything should be 4 byte aligned, residual is 8 byte aligned on GNU libc based system. If this isn't good enough (and it isn't for SSE2), we will have to replace appropriate malloc calls. However, you can copy qlp_coeffs on stack for better alignment. > > Finally, in a...

2011 Oct 27

[LLVMdev] Trunc Load

> Hi Johannes, what processor are you targeting? Is it little-endian or > big-endian? Little-endian. (The truth: you can set it manually, but it is set to little endian, for sure.) The processor is a TI TMS320C64x. Follow-up: I discovered that the "guilty" method is DAGCombiner::ReduceLoadWidth. The error is introduced because the offset is not calculated correctly. The first

2019 Apr 11

[RFC PATCH 10/12] virtio/s390: consolidate DMA allocations

...ents on how this > > > status field needs to be aligned. But I did not look to hard. > > > > > > The ccw.cda can hold an arbitrary data address AFAIR (for indirect, > > > of course we do have alignment requirements). > > > > I think it needs to be doubleword aligned. > > > > I've re-read the part of the PoP that describes the ccw formats. And > it reinforced my position: for IDA and MIDA we need proper alignment, > but if the CCW ain't an indirect one there is no alignment requirement. > > QEMU also does not seem to...

2013 Oct 22

[LLVMdev] Bug #16941

On Oct 21, 2013, at 12:09 PM, Dmitry Babokin <babokin at gmail.com> wrote: > By the way, I'm curious, is the any reason why you focus on SSE4, not AVX? Seems that vectorizer should care the most about the latest silicon. > I am interested in looking at the SSE4 code because lowering of AVX code is more complicated, especially for masks. The problem that <8 x i1> can be

2014 Oct 09

[LLVMdev] PSA: Perfectly forwarding thunks can now be expressed in LLVM IR with musttail and varargs

On 8 Oct 2014, at 18:19, Reid Kleckner <rnk at google.com> wrote: > The one target I know about where varargs are passed differently from normal arguments is aarch64-apple-ios/macosx. After thinking a bit more, I think this forwarding thunk representation works fine even on that target. Typically a forwarding thunk is called indirectly, or at least through a bitcast, so the LLVM IR call

2013 Oct 21

[LLVMdev] Bug #16941

Nadav, You are right, ISPC may issue intrinsics as a result of AST selection. Though I believe that we should stick to LLVM IR whenever is possible. Intrinsics may appear to be boundaries for optimizations (on both data and control flow) and are generally not optimizable. LLVM may improve over time from performance stand point and we would benefit from it (or it may play against us, like in this

2010 Feb 22

[LLVMdev] Paired register allocation problem

...Anton, Thanks for reply > I have defined registers, aliases and subregister set. > > The problem is that register allocator is using 32bit registers that are > > already used in a pair, for example: > > lw $r0, 16[$r12] // load word to r0 > > ld $p0, 36[$r12] // load doubleword to p0 > > shl $p0, $p0, $r0 // shift left p0 by r0 and store result in p0 > > where p0 is a pair r0:r1 > > Could anyone tell me what am I doing wrong? > Have you defined aliases properly? Look how this is handled inside > s390 backend (systemz). I've compared again...

2006 Mar 11

mboot.c32, weird e820 map on HP blade machine, possible memory corruption

...values of the e820 printed correctly! its very weird why this may happen, and I haven't changed anything else in mboot.c from syslinux-2.11 code. Can someone think of anything? Another thing is, the e820 buffer should be zeroed out since some bios'es are buggy and dont overwrite the high doubleword of Length field of the AddrRangeDesc. This is seen on the Dell Poweredge1800's. So it should look like: while(((void *)(e820 + 1)) < __com32.cs_bounce + __com32.cs_bounce_size) { memset(e820, 0, sizeof(*e820)); e820->size = sizeof(*e820) - sizeof(e820->size); .......
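
The reason for zeroing the buffer first can be shown with a small host-side sketch. This is not the mboot.c code: the struct layout, its field names, and the buggy_bios_fill stand-in are all hypothetical, chosen only to mimic a BIOS that leaves the high doubleword of the length field unwritten.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor layout; field names are illustrative only. */
struct addr_range {
    uint32_t base_lo, base_hi;
    uint32_t len_lo, len_hi;
    uint32_t type;
};

/* Stand-in for a buggy BIOS call that never writes len_hi. */
void buggy_bios_fill(struct addr_range *d) {
    d->base_lo = 0x00100000u;
    d->base_hi = 0;
    d->len_lo = 0x07f00000u;
    d->type = 1;
}

/* Combine the two doublewords of the length into one 64-bit value. */
uint64_t range_length(const struct addr_range *d) {
    return ((uint64_t)d->len_hi << 32) | d->len_lo;
}

/* Zero the descriptor before the call so stale bytes cannot leak into
   fields the BIOS skips. */
uint64_t query_with_zeroing(struct addr_range *d) {
    memset(d, 0, sizeof *d);
    buggy_bios_fill(d);
    return range_length(d);
}
```

Without the memset, whatever bytes were left in the buffer end up in the upper half of the reported length; with it, the upper half reads as zero.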
