similar to: Where's the optimiser gone (part 11): use the proper instruction for sign extension

Displaying 20 results from an estimated 1100 matches similar to: "Where's the optimiser gone (part 11): use the proper instruction for sign extension"

2018 Nov 20
2
A pattern for portable __builtin_add_overflow()
Hi LLVM, clang, I'm trying to write a portable version of __builtin_add_overflow() in a way that the compiler would recognize the pattern and use the add_overflow intrinsic / the best possible machine instruction. Here are the docs about these builtins: https://clang.llvm.org/docs/LanguageExtensions.html#checked-arithmetic-builtins . With unsigned types this is easy: int uaddo_native(unsigned
2018 Nov 06
4
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
Hi @ll, while clang/LLVM recognizes common bit-twiddling idioms/expressions like unsigned int rotate(unsigned int x, unsigned int n) { return (x << n) | (x >> (32 - n)); } and typically generates "rotate" machine instructions for this expression, it fails to recognize other, equally common bit-twiddling idioms/expressions. The standard IEEE CRC-32 for "big
2018 Nov 27
2
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
"Sanjay Patel" <spatel at rotateright.com> wrote: > IIUC, you want to use x86-specific bit-hacks (sbb masking) in cases like > this: > unsigned int foo(unsigned int crc) { > if (crc & 0x80000000) > crc <<= 1, crc ^= 0xEDB88320; > else > crc <<= 1; > return crc; > } To document this for x86 too: rewrite the function
2012 Mar 27
1
[LLVMdev] Compiling integer mod
For the simple C program below I show the output of clang and the output of the VS compiler (I am on Windows). Maybe this is obvious to you, but is it really faster to do 2 multiplications, 3 movl instructions, 2 shifts, 1 add, and 1 subtract than to do 1 mov, 1 cdq, and 1 idiv? I ran into this while trying to understand why my code runs slower with llvm than a comparable program on Windows.
2014 Jan 11
3
[LLVMdev] Possible error in docs.
http://llvm.org/docs/CodeGenerator.html#machine-code-description-classes Section starting: Fixed (preassigned) registers It talks about converting: define i32 @test(i32 %X, i32 %Y) { %Z = udiv i32 %X, %Y ret i32 %Z } into ;; X is in EAX, Y is in ECX mov %EAX, %EDX sar %EDX, 31 idiv %ECX ret BUT, where does the "sar" come from? Kind Regards James
2014 Jan 02
4
EFI build problems
On 01/02/2014 04:09 AM, Ferenc Wagner wrote: > > Issuing another make after this gave the previous error again: > > isolinux.asm:1102: error: TIMES value -4 is negative > I just fixed this one... it seems to be a consequence of merging in the MOVZX isolinux fix into the firmware branch. -hpa
2003 Oct 07
1
is.na(v)<-b (was: Re: Beginner's query - segmentation fault)
I am puzzled by the advice to use is.na(x) <- TRUE instead of x <- NA. ?NA says Function `is.na<-' may provide a safer way to set missingness. It behaves differently for factors, for example. However, "MAY provide" is a bit scary, and it doesn't say WHAT the difference in behaviour is. I must say that "is.na(x) <- ..." is rather repugnant,
2014 Jan 03
1
EFI build problems
On 01/02/2014 10:12 PM, Ady wrote: > >> On 01/02/2014 04:09 AM, Ferenc Wagner wrote: >>> >>> Issuing another make after this gave the previous error again: >>> >>> isolinux.asm:1102: error: TIMES value -4 is negative >>> >> >> I just fixed this one... it seems to be a consequence of merging in the >> MOVZX isolinux fix into
2013 Oct 04
2
Again about encoding speed of different compiles
I downloaded current version of FLAC sources and compiled it with: * GCC 4.8.1 (MSYS from http://xhmikosr.1f0.de/tools/) * Intel C++ Composer XE 2013 update 5 * MSVS 2010 SP1 * MSVS 2012 update 3 (SSSE3 and SSE4.1 code was disabled for all compilers) Stereo 24-bit WAV file was encoded with -8 preset. Encoding time, in seconds: GCC 32-bit: 209 ICC 32-bit: 130 VS10 32-bit: 116 VS12 32-bit: 114
2011 Mar 19
2
[LLVMdev] Apparent optimizer bug on X86_64
Compiling a simple automaton created by GNU bison with -O1 or -O2 resulted in the following machine code: 1300 /*-----------------------------. 1301 | yyreduce -- Do a reduction. | 1302 `-----------------------------*/ 1303 yyreduce: 1304 /* yyn is the number of a rule to reduce with. */ 1305 yylen = yyr2[yyn]; 0x0000000000400c14 <rpcalc_parse+628>: mov
2018 Nov 28
2
Rather poor code optimisation of current clang/LLVM targeting Intel x86 (both -64 and -32)
On Wed, Nov 28, 2018 at 7:11 AM Sanjay Patel via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Thanks for reporting this and other perf opportunities. As I mentioned > before, if you could file bug reports for these, that's probably the only > way they're ever going to get fixed (unless you're planning to fix them > yourself). It's not an ideal situation, but
2013 Nov 26
3
Sysinux 6 will not boot ISOs on BIOS (i.e. pre-UEFI) systems
Hi, hpa wrote: > - mov dx,cx > + movzx edx,cx Gerardo Exequiel Pozzi: > Yes! Fixed :) > (maybe garbage in high word of "edx"?) I am now pondering too, why my machine booted from high LBAs. I understand that the new code zeros the upper 16 bits of EDX. Was there remaining garbage from early BIOS activity before isolinux.bin got started? Does my
2014 Jan 03
1
EFI build problems, fixed in which git repo
On 01/02/2014 09:59 PM, Geert Stappers wrote: > Op 2014-01-02 om 15:37 schreef H. Peter Anvin: >> On 01/02/2014 04:09 AM, Ferenc Wagner wrote: >>> >>> Issuing another make after this gave the previous error again: >>> >>> isolinux.asm:1102: error: TIMES value -4 is negative >>> >> >> I just fixed this one... it seems to be a
2009 Apr 27
3
Question about vk_check and rllunpack
I am hitting a problem on syslinux-3.80-pre1-2-g6c0fb9e (only last label in config file is found), but don't want to "cry wolf" (again), so let's start with a question: ui.inc: ; ; Now check if it is a "virtual kernel" ; vk_check: mov esi,[HighMemSize] ; Start from top of memory .scan: cmp esi,[VKernelEnd] jbe
2018 May 23
0
[PATCH v3 18/27] xen: Adapt assembly for PIE support
Change the assembly code to use the new _ASM_MOVABS macro, which gets a symbol reference while being PIE compatible. Adapt the relocation tool to ignore 32-bit Xen code. Position Independent Executable (PIE) support will allow extending the KASLR randomization range below the -2G memory limit. Signed-off-by: Thomas Garnier <thgarnie at google.com> --- arch/x86/tools/relocs.c | 16
2013 Dec 01
1
request backport fix for isolinux 4.xx branch
Recently a patch by HPA was added to the elflink branch, "isolinux: Clear upper half of EDX before using..." http://git.zytor.com/?p=syslinux/syslinux.git;a=commit;h=870b84dd8714d dfccc9288025331423efa6d76b7 The patch was then applied to the firmware branch too. The patch solves an issue introduced by a prior commit "isolinux: Update LBA in getlinsec loop". Since the
2008 Jul 15
2
meaning of tests presented in anova(ols(...)) {Design package}
Hi, I am curious about how to interpret the table produced by anova(ols(...)), from the Design package. I have a multiple linear regression model, with some interaction, defined by: ols(formula = log(ksat * 60 * 60) ~ log(sar) * pol(activity, 3) + log(conc) * pol(sand, 3), data = sm.clean, x = TRUE, y = TRUE) n Model L.R. d.f. R2 Sigma 1834 1203
2010 May 17
3
Xen Dom0 performance reporting/monitoring
Hi All, Debian Lenny, Xen 3.2.1 Apologies if this has been covered many times before, but I'm unsure how to go about this. I want to obtain stats from my dom0, IO, cpu etc over time to see if I'm getting any load peaks or if my hardware could handle more domU's. I'm thinking of something similar to sar which gives you 15 minute averages across the day. I
2009 Dec 25
1
questions relate to "sar"
We have CentOS 5.3 on a Dell server. I tried to use "sar -b" or "sar -u", and it only shows a report starting at 12:00 A.M. My questions are: 1. For "sar -u" or "sar -b", how can I generate a report from two or three days ago? 2. How do I generate a daily report from the "sa2" process? Thanks.
2017 Feb 13
3
[PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function
On February 13, 2017 2:53:43 AM PST, Peter Zijlstra <peterz at infradead.org> wrote: >On Mon, Feb 13, 2017 at 11:47:16AM +0100, Peter Zijlstra wrote: >> That way we'd end up with something like: >> >> asm(" >> push %rdi; >> movslq %edi, %rdi; >> movq __per_cpu_offset(,%rdi,8), %rax; >> cmpb $0, %[offset](%rax); >> setne %al;