similar to: [LLVMdev] generating LLVM code that meets the C ABI

2013 Sep 19
0
[LLVMdev] unaligned AVX store gets split into two instructions
Nadav, We see multiple regressions after r172868 in the ISPC compiler (based on the LLVM optimizer). The regressions are due to spills/reloads, which are due to increased register pressure. This matches Zach's analysis. We've filed bug 17285 for this problem. Is there any possibility to avoid splitting in case of multiple loads going together? Dmitry. On Wed, Jul 10, 2013 at 1:12 PM, Zach
2013 Jul 10
2
[LLVMdev] unaligned AVX store gets split into two instructions
I've narrowed this down to a single kernel (kernel.ll), which does a fixed-size matrix-matrix multiply: # ~/llvm-32-final/bin/llc kernel.ll -o kernel32.s # ~/llvm-33-final/bin/llc kernel.ll -o kernel33.s # ~/llvm-32-final/bin/clang++ harness.cpp kernel32.s -o harness32 # ~/llvm-32-final/bin/clang++ harness.cpp kernel33.s -o harness33 # time ./harness32 real 0m0.584s user 0m0.581s sys 0m0.001s
2013 Jul 10
0
[LLVMdev] unaligned AVX store gets split into two instructions
Thanks for all the info! I'm still in the process of narrowing down the performance difference in my code. I'm no longer convinced it's related to the unaligned loads/stores alone, since extracting this part of the kernel makes the performance difference disappear. I will try to narrow down what is going on and, if it seems related to LLVM, I will post an example. Thanks again, Zach
2013 Jul 10
3
[LLVMdev] unaligned AVX store gets split into two instructions
Hi, Yes. On Sandybridge 256-bit loads/stores are double pumped. This means that they go in one after the other in two cycles. On Haswell the memory ports are wide enough to allow a 256-bit memory operation in one cycle. So, on Sandybridge we split unaligned memory operations into two 128-bit parts to allow them to execute on two separate ports. This is also what GCC and ICC do. It is very
2013 Jul 10
0
[LLVMdev] unaligned AVX store gets split into two instructions
On Tue, Jul 9, 2013 at 9:01 PM, Zach Devito <zdevito at gmail.com> wrote: > I'm seeing a difference in how LLVM 3.3 and 3.2 emit unaligned vector loads > on AVX. > 3.3 is splitting up an unaligned vector load but in 3.2, it was emitted as a > single instruction (details below). > In a matrix-matrix inner-kernel, I see a ~25% decrease in performance, which > seems to be
2013 Jul 10
4
[LLVMdev] unaligned AVX store gets split into two instructions
I'm seeing a difference in how LLVM 3.3 and 3.2 emit unaligned vector loads on AVX. 3.3 is splitting up an unaligned vector load but in 3.2, it was emitted as a single instruction (details below). In a matrix-matrix inner-kernel, I see a ~25% decrease in performance, which seems to be due to this. Any ideas why this changed? Thanks! Zach LLVM Code: define <4 x double> @vstore(<4 x
2014 Mar 25
3
[LLVMdev] Getting the Debugging JIT-ed Code with GDB example to work
I'm trying to run the example described at: http://llvm.org/docs/DebuggingJITedCode.html I followed the sample command line session (below, with versions numbers for everything), but gdb doesn't stop at the breakpoints as described. Any idea what is wrong? Thanks, Zach zdevito at derp:~/terra/tests$ > ~/clang+llvm-3.4-x86_64-unknown-ubuntu12.04/bin/clang -cc1 -O0 -g >
2007 Jun 12
0
[PATCH][libxenapi] Fix segmentation fault in libxenapi
When calling xen_vbd_set_mode(), libxenapi attempted to convert the enum mode parameter to a string twice, resulting in a segfault. Removed the first conversion since conversion is taking place in the marshalling/demarshalling layer. Fixed similar double enum conversion in other places as well. Regards, Jim
2016 Jun 30
1
Entry for llvm.org/ProjectsWithLLVM - Terra programming language
Terra: A low-level counterpart to Lua By Zach DeVito (http://cs.stanford.edu/~zdevito) Terra (http://terralang.org/) is a system programming language that is embedded in and meta-programmed by Lua, which handles details like conditional compilation, type systems, namespaces, and templating/function specialization that are normally special constructs in other languages. Terra code shares
2016 Mar 28
0
RFC: atomic operations on SI+
On Fri, Mar 25, 2016 at 02:22:11PM -0400, Jan Vesely wrote: > Hi Tom, Matt, > > I'm working on a project that needs few coherent atomic operations (HSA > mode: load, store, compare-and-swap) for std::atomic_uint in HCC. > > the attached patch implements atomic compare and swap for SI+ > (untested). I tried to stay within what was available, but there are > few issues
2008 May 08
2
Testing render :text without has_text
A controller I'm trying to test simply delivers a text string to the client, which then demarshalls it to retrieve some objects. I want to test that the returned string is correct. I don't want to compare the string character-by-character with response.has_text because that ties me to the implementation of the Marshall class. Instead, I just want to demarshall the string
2006 Feb 28
0
Dallas Ruby Brigade meets March 7th
The Dallas Ruby Brigade begins! Following in the proud tradition of Seattle.rb, NYC.rb and many others, Dallas has joined the crowd with its own local Ruby Brigade. We'll be meeting Tuesday, March 7th in Addison. The plan is as follows: * Meet between 6:45 and 7:00 in the lobby downstairs. Because of the way security works, you'll want to be timely in your arrival. At
2018 Nov 09
0
Wine release 3.20
The Wine development release 3.20 is now available. What's new in this release (see below for details): - Async interfaces and ACF files in the IDL compiler. - Support for substorage transforms in MSI. - RPC/COM marshalling fixes. - Support for Unicode requests in WinHTTP. - Shell Autocomplete optimizations. - Various bug fixes. The source is available from the following
2009 May 22
0
Wine release 1.1.22
The Wine development release 1.1.22 is now available. What's new in this release (see below for details): - More improvements to OLE copy/paste. - Beginnings of x86_64 exception handling. - Direct3D locking fixes. - ARB shaders improvements. - Better OpenGL pixel format support. - Various bug fixes. The source is available from the following locations:
2013 Nov 23
2
OSX 10.9 appdmg/pkgdmg
Howdy, Trying to install dmg files with puppet. However, after running my manifest the .dmg file is never downloaded by curl. I tried pkgdmg and appdmg. I also tried using a local directory as the source. It seems to ignore any path I give as the source, even totally bogus ones. define pkg_deploy($sourcedir = false) { $sourcedir_real = $sourcedir ? { false
2007 Apr 18
0
[PATCH 13/21] i386 Gdt page isolation
Make GDT page aligned and page padded to support running inside of a hypervisor. This prevents false sharing of the GDT page with other hot data, which is not allowed in Xen, and causes performance problems in VMware. Rather than go back to the old method of statically allocating the GDT (which wastes unneeded space for non-present CPUs), the GDT for APs is allocated dynamically. Signed-off-by:
2007 Apr 18
0
[PATCH 17/21] i386 Ldt cleanups 1
Big cleanup of LDT code. This code has very little type checking and is not frequently used, so I audited the code, added type checking and size optimizations to generate smaller assembly code. First, just introduce some small definitions that will be used later. Signed-off-by: Zachary Amsden <zach@vmware.com> Index: linux-2.6.14-zach-work/arch/i386/kernel/entry.S
2009 Jan 18
1
Installations fail
Hi! I've tried to install several games in Wine; however, many of them can't even be installed. They should work pretty well according to the AppDB, so it has to be a problem with my computer... The problem comes up as soon as the program starts copying the files. The first few files are copied without any problems, however. OS: Ubuntu 8.10 Wine version: 1.1.13 (I upgraded after it