Displaying 20 results from an estimated 39 matches for "uarch".

2017 Jan 10
0
[Bug 12508] New: fileflags & forcechange don't work for hardlinks
...QA Contact: rsync-qa at samba.org I've received the following bug report in FreeBSD's bugzilla: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=215688 Use case: # mkdir -pv /root/debug/{src,dst} # touch /root/debug/src/0x # ls -lio /root/debug/src/0x 330810 -rw-r--r-- 1 root wheel uarch 0 Jan 10 12:40 /root/debug/src/0x # ln /root/debug/src/0x /root/debug/src/1x # ln /root/debug/src/0x /root/debug/src/2x # ln /root/debug/src/0x /root/debug/src/3x # ls -lio /root/debug/src/ total 2 330810 -rw-r--r-- 4 root wheel uarch 0 Jan 10 12:40 0x 330810 -rw-r--r-- 4 root wheel uarch 0 J...
2016 Jul 21
3
Replication sieve scripts.
Hello, Thanks for the advice. I have looked for the libs and here is the difference: Dovecot production env 2.2.10: /usr/lib/dovecot/modules/doveadm -rw-r--r-- 1 root root 18560 Jan 9 2014 lib10_doveadm_acl_plugin.so -rw-r--r-- 1 root root 14256 Jan 9 2014 lib10_doveadm_expire_plugin.so -rw-r--r-- 1 root root 10232 Jan 9 2014 lib10_doveadm_quota_plugin.so -rw-r--r-- 1 root root
2019 Dec 16
3
Guidance on working with the NVIDIA GPU back-end
...rdware person but would like to do some compiler-architecture co-design research. Are there any good references for the NVPTX backend? I'd like to change that backend to have a limited number of physical registers rather than an unlimited number of virtual ones (for more realistic modeling in a uarch simulator). Being able to do register allocation and other optimizations on the virtual ISA (PTX) would be incredibly useful to the research community. Thanks in advance, --Nick -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/ll...
2015 May 10
1
FYI: dovecot (008632bdfd2c) compilation woes, and minor glitch regarding update-version.sh
...ntical: > > 108 -r-xr-xr-x 2 root wheel 382368 Nov 11 15:03 /bin/csh* > 118 -r-xr-xr-x 1 root wheel 142184 Nov 11 15:03 /bin/sh* > 108 -r-xr-xr-x 2 root wheel 382368 Nov 11 15:03 /bin/tcsh* ls -al /bin/sh /usr/local/bin/bash (FBSD 10.1-STABLE): | -r-xr-xr-x 1 root wheel uarch 142144 May 8 13:57 /bin/sh | -rwxr-xr-x 1 root wheel uarch 895712 May 8 13:09 /usr/local/bin/bash JFTR: Both scripts fail to run with FBSD's /bin/sh (lot of syntax errors), but run to completion when modifying the first line to "#!/usr/local/bin/bash" (needed, because ports are...
2009 Jul 23
1
[LLVMdev] Two Regalloc Enhancements
...> Post-ra scheduling has been working for a while. The reason it's not > turned on for x86 is it's not helping much (1 or 2%) while the compile > time cost is too high (~9% codegen time). I assume you guys are doing > your experiments using AMD processors. It could be Intel's uArch is > just not benefiting from the load scheduling. Yes, I can imagine there would be differences here. The memory architectures are quite different. > Round-robin register assignment probably will help post-ra scheduling. > However, for small functions it may end up increase the number...
2015 Jun 26
2
[LLVMdev] [cfe-dev] bitwise ops on booleans
...> this in C/C++ to achieve the desired effect: we want both comparisons to > be > > evaluated (do *not* want short-circuiting)? > > Why do you want that? > For performance. The statement that materializing the condition bits is as expensive as a branch is dependent on the arch, uarch, and data set. We're currently only using arch for the DAG transform. So for example, x86 globally changes all of these while PPC does not. SimplifyCFG doesn't even consider arch and does the inverse transform. The programmer may have more knowledge about the predictability of a specific b...
2018 Apr 09
1
SCEV and LoopStrengthReduction Formulae
> From: fglaser at apple.com <fglaser at apple.com> On Behalf Of escha at apple.com > Sent: Saturday, April 7, 2018 8:22 AM > >> I realize this is a micro-op saving a single cycle.  But this reduces the instruction count, one less >> instr to decode in a potentially hot path. If this all makes sense, and seems like a reasonable addition >> to llvm, would it make
2009 Jul 23
0
[LLVMdev] Two Regalloc Enhancements
...scheduler. Post-ra scheduling has been working for a while. The reason it's not turned on for x86 is it's not helping much (1 or 2%) while the compile time cost is too high (~9% codegen time). I assume you guys are doing your experiments using AMD processors. It could be Intel's uArch is just not benefiting from the load scheduling. Round-robin register assignment probably will help post-ra scheduling. However, for small functions it may end up increase the number of registers used. That can be bad for performance. > > What's the community's opinion on whet...
2015 May 09
2
FYI: dovecot (008632bdfd2c) compilation woes, and minor glitch regarding update-version.sh
Hi ? Teemu Huovila <teemu.huovila at dovecot.fi> wrote: > On 04/24/2015 10:00 PM, Michael Grimm wrote: >> 1) I'm trying to compile a recent hg dovecot version (008632bdfd2c) at a FBSD10-STABLE system without success: [?] >> fts-tokenizer-generic.c:214:18: error: use of undeclared identifier 'MidNum' >> if (uint32_find(MidNum, N_ELEMENTS(MidNum), c,
2017 Feb 13
2
(RFC) Adjusting default loop fully unroll threshold
On Mon, Feb 13, 2017 at 2:06 PM Gerolf Hoflehner via llvm-dev < llvm-dev at lists.llvm.org> wrote: > For unrolling specifically I agree with Hal that the hooks should be > target specific. Actually, I go further and think they should be uArch > specific. > They already are, it is just that no one has contributed a patch to use this on x86 microarchitectures. Until someone shows up with data showing that we need different tunings for different microarchitectures, it doesn't make sense for us to just make up numbers there. On...
2013 Oct 29
3
[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler
...t performance considerations. > > No one is compelled to use these if they don't want to. > >> x86 has this issue to an extent that goes far beyond what you describe here, and FWIW I've never seen a situation where it has been a problem. Usually when doing instruction-level/uarch-level optimization I find myself disassembling raw bytes in memory or in linked executables (or showing relocations in object files). The point of source code (even assembler) is to abstract over what is happening in the machine; when you specifically want to know what is happening in the machine y...
2013 Oct 27
0
[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler
...rt of thinking about performance considerations. No one is compelled to use these if they don't want to. > x86 has this issue to an extent that goes far beyond what you describe here, and FWIW I've never seen a situation where it has been a problem. Usually when doing instruction-level/uarch-level optimization I find myself disassembling raw bytes in memory or in linked executables (or showing relocations in object files). The point of source code (even assembler) is to abstract over what is happening in the machine; when you specifically want to know what is happening in the machine y...
2013 Oct 26
5
[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler
...kes it very difficult to read the assembly and do any > sort of thinking about performance considerations. > x86 has this issue to an extent that goes far beyond what you describe here, and FWIW I've never seen a situation where it has been a problem. Usually when doing instruction-level/uarch-level optimization I find myself disassembling raw bytes in memory or in linked executables (or showing relocations in object files). The point of source code (even assembler) is to abstract over what is happening in the machine; when you specifically want to know what is happening in the machine y...
2020 Jul 09
2
[RFC] carry-less multiplication instruction
(As per IRC discussion) I understand that the carry-less multiplication algorithm has its uses since/and it is implemented as an instruction in many architectures and that adding it as a general-purpose intrinsic will allow us to drop target-specific intrinsics as by-product. What I do *NOT* understand is: what is the actual/main goal/driving factor of adding an LLVM intrinsic for it? The
2009 Jul 23
5
[LLVMdev] Two Regalloc Enhancements
We have two features for register allocation we'd like to contribute if folks think they are worthwhile. We want to get a read on whether they will be useful to people. The first feature backschedules reloads during the spilling phase. As reloads are generated, we have some very simple code to try to schedule them as far ahead of the use as possible. The second feature modifies
2013 Oct 31
0
[LLVMdev] Add support for ldr pseudo instruction in ARM integrated assembler
...considerations. >> > > No one is compelled to use these if they don't want to. > > x86 has this issue to an extent that goes far beyond what you describe > here, and FWIW I've never seen a situation where it has been a problem. > Usually when doing instruction-level/uarch-level optimization I find myself > disassembling raw bytes in memory or in linked executables (or showing > relocations in object files). The point of source code (even assembler) is > to abstract over what is happening in the machine; when you specifically > want to know what is happen...
2015 Jun 27
2
[LLVMdev] [cfe-dev] bitwise ops on booleans
...+ to achieve the desired effect: we want both > > comparisons to be > > evaluated (do *not* want short-circuiting)? > > Why do you want that? > > For performance. The statement that materializing the condition bits > is as expensive as a branch is dependent on the arch, uarch, and > data set. We're currently only using arch for the DAG transform. So > for example, x86 globally changes all of these while PPC does not. > SimplifyCFG doesn't even consider arch and does the inverse > transform. > > I don't have any problem teaching codegen to...
2012 Feb 29
1
[LLVMdev] Proposed implementation of N3333 hashing interfaces for LLVM (and possible libc++)
...of a 'hash_value' routine or similar context. > // > // Note that 'hash_combine_range' contains very special logic for hashing > // a contiguous array of integers or pointers. This logic is *extremely* fast, > // on a modern Intel "Gainestown" Xeon (Nehalem uarch) @2.2 GHz, these were > // benchmarked at over 8.5 GiB/s for large keys, and <20 cycles/ 20 cycles per what? Don't keep us in suspense! > // > //===----------------------------------------------------------------------===// > > #ifndef LLVM_ADT_HASHING_H > #define LLVM_A...
2014 Jan 11
3
[LLVMdev] Possible error in docs.
http://llvm.org/docs/CodeGenerator.html#machine-code-description-classes Section starting: Fixed (preassigned) registers It talks about converting: define i32 @test(i32 %X, i32 %Y) { %Z = udiv i32 %X, %Y ret i32 %Z } into ;; X is in EAX, Y is in ECX mov %EAX, %EDX sar %EDX, 31 idiv %ECX ret BUT, where does the "sar" come from? Kind Regards James
2012 Jun 23
9
[PATCH 0/5] btrfs: lz4/lz4hc compression
WARNING: This is not compatible with the previous lz4 patchset. If you're using experimental compression that isn't in mainline kernels, be prepared to backup and restore or decompress before upgrading, and have backups in case it eats data (which appears not to be a problem any more, but has been during development). These patches add lz4 and lz4hc compression