search for: 64x64

Displaying 20 results from an estimated 33 matches for "64x64".

2011 Mar 30
1
[LLVMdev] Bignums
...s register allocator to do the right thing on x86? I'm getting swaths of code like: movq 88(%rsp), %rax mulq 112(%rsp) movq %rax, %r15 addq %r11, %r15 movq %rdx, %r14 adcq %rcx, %r14 adcq $0, %r9 (that's a 64x64 -> 128-bit multiply with 192-bit accumulate.) The problem is, %r11 and %rcx are dead here. It should have just added %rax and %rdx into them. This results in more movs, more spills, more code and less performance. (3) Is there a way to avoid this whole mess? I'm using a script to spi...
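For readers unfamiliar with the operation named in this snippet, here is a minimal C sketch of a 64x64 -> 128-bit multiply whose product is added into a 192-bit accumulator. It is not code from the thread: the `acc192` type and `mac_64x64_192` function are hypothetical names, and the sketch assumes a compiler that provides the `unsigned __int128` extension (GCC and Clang on 64-bit targets).

```c
#include <stdint.h>

/* Hypothetical 192-bit accumulator, least significant word first. */
typedef struct { uint64_t w[3]; } acc192;

/* Adds the full 128-bit product a*b into acc, propagating carries.
   Relies on the unsigned __int128 compiler extension (assumption). */
static void mac_64x64_192(acc192 *acc, uint64_t a, uint64_t b)
{
    unsigned __int128 p = (unsigned __int128)a * b;   /* 64x64 -> 128 */

    /* Low word: add the low half of the product, keep the carry in bit 64. */
    unsigned __int128 s = (unsigned __int128)acc->w[0] + (uint64_t)p;
    acc->w[0] = (uint64_t)s;

    /* Middle word: add the high half of the product plus the carry. */
    s = (unsigned __int128)acc->w[1] + (uint64_t)(p >> 64) + (uint64_t)(s >> 64);
    acc->w[1] = (uint64_t)s;

    /* Top word: only the carry out of the middle word is added. */
    acc->w[2] += (uint64_t)(s >> 64);
}
```

The three steps correspond to the add, add-with-carry, and add-with-carry-of-zero pattern visible in the quoted assembly; which registers a compiler picks for them is the allocation question the poster is asking about.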
2020 Jul 09
2
[RFC] carry-less multiplication instruction
...code efficiently. [14] >> >>  ==clmul lowering without hardware support== >>  An 8x8=>16 clmul can also be lowered to a 32x32=>64 multiplication when there is no specialized instruction (also 15x15=>30, to a 60x60=>120, or if bitreverse is available 16x16=>32 to TWO 64x64=>64 multiplications)[3]. >> >>  [1] https://en.wikipedia.org/wiki/Carry-less_product >>  [2] (page 30) https://raw.githubusercontent.com/riscv/riscv-bitmanip/master/bitmanip-0.92.pdf >>  [3] https://www.bearssl.org/constanttime.html > > What benefit would this intri...
2017 Mar 20
1
[PATCH] inspect: get a better icon for ALT Linux guests (RHBZ#1433937)
...+ b/lib/inspect-icon.c @@ -455,7 +455,7 @@ icon_voidlinux (guestfs_h *g, struct inspect_fs *fs, size_t *size_r) return get_png (g, fs, VOIDLINUX_ICON, size_r, 20480); } -#define ALTLINUX_ICON "/usr/share/doc/alt-docs/altlogo.png" +#define ALTLINUX_ICON "/usr/share/icons/hicolor/64x64/apps/altlinux.png" static char * icon_altlinux (guestfs_h *g, struct inspect_fs *fs, size_t *size_r) -- 2.9.3
2020 Jul 05
8
[RFC] carry-less multiplication instruction
...code efficiently. [14] ==clmul lowering without hardware support== An 8x8=>16 clmul can also be lowered to a 32x32=>64 multiplication when there is no specialized instruction (also 15x15=>30, to a 60x60=>120, or if bitreverse is available 16x16=>32 to TWO 64x64=>64 multiplications)[3]. [1] https://en.wikipedia.org/wiki/Carry-less_product [2] (page 30) https://raw.git...
2009 Aug 17
2
[PATCH] kms: Fix <nv11 hardware cursor.
...nouveau_hw.h index a548a4c..aa2a3b4 100644 --- a/src/nouveau_hw.h +++ b/src/nouveau_hw.h @@ -287,6 +287,23 @@ static inline bool NVLockVgaCrtcs(NVPtr pNv, bool lock) return waslocked; } +/* nv04 cursor max dimensions of 32x32 (A1R5G5B5) */ +#define NV04_CURSOR_SIZE 32 +/* limit nv10 cursors to 64x64 (ARGB8) (we could go to 64x255) */ +#define NV10_CURSOR_SIZE 64 + +static inline int nv_cursor_width(NVPtr pNv) +{ + return pNv->NVArch >= 0x10 ? NV10_CURSOR_SIZE : NV04_CURSOR_SIZE; +} + +static inline int nv_cursor_pixels(NVPtr pNv) +{ + int width = nv_cursor_width(pNv); + + return width *...
2015 Jul 24
0
[LLVMdev] SIMD for sdiv <2 x i64>
...%rax > vmovq %rax, %xmm6 > vpunpcklqdq %xmm5, %xmm6, %xmm5 # xmm5 = xmm6[0],xmm5[0] AVX2 doesn't have integer vector division instructions and LLVM lowers divides by constants into (128 bit) multiplies. However, AVX2 doesn't have a way to get to the upper 64 bits of a 64x64->128 bit multiply either, so LLVM uses the scalar imulq instruction to do that. There's not much room to optimize here given the limitations of AVX2. You seem to be subtracting pointers though, so if you can guarantee that the pointers are aligned you could set the exact bit on your 'sd...
2018 May 11
1
[PATCH v2 2/4] drm/vc4: Take underscan setup into account when updating planes
...'s a cursor plane enabled > this feature is pretty much useless. But let's take a real use case to > show you how negligible the lack of scaling on the cursor plane will > be. Say you have borders taking 10% of your screen (which is already a > lot), and your cursor is a plane of 64x64 pixels, you'll end up with a > 64x64 cursor instead of 58x58. Quite frankly, I doubt you'll notice > the difference. Now you're assuming the cursor is only ever used as a cursor. It can be used for other things and those may need to be positioned pixel perfect in relation to othe...
2015 Jul 24
2
[LLVMdev] SIMD for sdiv <2 x i64>
It seems that it's hard to vectorize int64 in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there are no alternative AVX/2 instructions for int64? The same thing also happens to zext <2 x i32> -> <2 x i64> and trunc <2 x i64> -> <2 x i32>. Any ideas to optimize these
2018 May 11
0
[PATCH v2 2/4] drm/vc4: Take underscan setup into account when updating planes
...underscan when there's a cursor plane enabled this feature is pretty much useless. But let's take a real use case to show you how negligible the lack of scaling on the cursor plane will be. Say you have borders taking 10% of your screen (which is already a lot), and your cursor is a plane of 64x64 pixels, you'll end up with a 64x64 cursor instead of 58x58. Quite frankly, I doubt you'll notice the difference. Anyway, I'd like to hear back from Eric on that, since he is the one who asked me to work on this feature. ...
2018 Dec 30
3
[cfe-dev] Portable multiplication 64 x 64 -> 128 for int128 reimplementation
_mulx_u64 only exists when the target is x86_64. That's still not very portable. I'm not opposed to removing the bmi2 check, but gcc also has the same check so it doesn't improve portability much. ~Craig On Sat, Dec 29, 2018 at 4:44 PM Arthur O'Dwyer via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi Pawel, > > There is the _mulx_u64 intrinsic, but it
2020 Mar 18
0
[PATCH i-g-t] tests/kms_plane: Generate reference CRCs for partial coverage too
...these tests even when we don't correctly program our hardware plane's framebuffer. So, get rid of that TODO and implement this by converting convert_fb_for_mode__position() to a generic convert_fb_for_mode() function that allows us to create a colored FB, either with or without a series of 64x64 rectangles, and use that in test_grab_crc() to generate reference CRCs for the plane-position-hole tests. Additionally, we move around all of the test flags into a single enumerator, and make sure none of them have overlapping bits so we can correctly tell from test_grab_crc() whether or not our re...
2015 Jul 24
2
[LLVMdev] SIMD for sdiv <2 x i64>
...>> vmovq %rax, %xmm6 >> vpunpcklqdq %xmm5, %xmm6, %xmm5 # xmm5 = xmm6[0],xmm5[0] > AVX2 doesn't have integer vector division instructions and LLVM lowers divides by constants into (128 bit) multiplies. However, AVX2 doesn't have a way to get to the upper 64 bits of a 64x64->128 bit multiply either, so LLVM uses the scalar imulq instruction to do that. There's not much room to optimize here given the limitations of AVX2. > > You seem to be subtracting pointers though, so if you can guarantee that the pointers are aligned you could set the exact bit on you...
2018 May 11
3
[PATCH v2 2/4] drm/vc4: Take underscan setup into account when updating planes
On Fri, May 11, 2018 at 07:12:21PM +0200, Boris Brezillon wrote: > On Fri, 11 May 2018 19:54:02 +0300 > Ville Syrjälä <ville.syrjala at linux.intel.com> wrote: > > > On Fri, May 11, 2018 at 05:52:56PM +0200, Boris Brezillon wrote: > > > On Fri, 11 May 2018 18:34:50 +0300 > > > Ville Syrjälä <ville.syrjala at linux.intel.com> wrote: > > >
2019 Feb 05
4
[RFC] Vector Predication
On 2/5/19 1:27 AM, Philip Reames via llvm-dev wrote: > > On 1/31/19 4:57 PM, Bruce Hoult wrote: >> On Thu, Jan 31, 2019 at 4:05 PM Philip Reames via llvm-dev >> <llvm-dev at lists.llvm.org> wrote: >>> Do such architectures frequently have arithmetic operations on the >>> mask registers? (i.e. can I reasonably compute a conservative >>> length
2009 Mar 11
2
[NV50] application to confirm texture layout
...it's ok. Otherwise it spews a lot of errors. You have to manually set the TILE_HEIGHT, TILE_WIDTH and TILING_PARAM. TILE_WIDTH == TILE_PITCH in this case because it's 8bpp data. Use this list (TILING_PARAM) TILE_WIDTHxTILE_HEIGHT: (0x00) 32x4 (0x10) 64x8 (0x20) 64x16 (0x30) 64x32 (0x40) 64x64 I'm curious to know if anyone has different layouts. You may need to sudo run it. Maarten. -------------- next part -------------- A non-text attachment was scrubbed... Name: nv50_test.tar.bz2 Type: application/x-bzip2 Size: 1981 bytes Desc: not available Url : http://lists.freedesktop.org/a...
2006 Jan 08
0
Creating versions with file_column but only if file is image
I would like to create versions of images uploaded to a model that has a file_column, but only if the file is an image. file_column :file, :magick => { :versions => {"thumb" => "64x64"}} If I upload a txt file, this gives me the error "File invalid image". Any ideas? -- Posted via http://www.ruby-forum.com/.
2003 May 05
0
Macroblock Coding Issues
...most completed a new VP3 decoder implementation. In the course of doing so, I have encountered something odd about macroblock coding. As a quick overview, VP3 has a notion of fragments (8x8 pixels) and superblocks (32x32 pixels, 4x4 fragments). These apply to each individual plane (e.g., a 64x64 video will have 4 Y superblocks and 1+1 C superblocks, 64 Y fragments and 16+16 C fragments). VP3 also has a notion of macroblocks which are the same as MPEG. I.e., 1 macroblock encompasses 4 Y fragments and 1+1 C fragments. IOW, 1 macroblock applies to all 3 planes, whereas one fragment o...
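The counts quoted in this snippet for a 64x64 video can be reproduced with a little arithmetic. The sketch below is only an illustration of that bookkeeping, not decoder code: the function names are hypothetical, and it assumes each chroma plane is subsampled by two in both directions, which is consistent with the 16+16 C fragments quoted but is not stated in the snippet itself.

```c
#include <stddef.h>

enum {
    FRAGMENT_SIZE   = 8,   /* a fragment is 8x8 pixels */
    SUPERBLOCK_SIZE = 32   /* a superblock is 32x32 pixels (4x4 fragments) */
};

/* Integer division rounding up, so partial blocks at the edge are counted. */
static size_t ceil_div(size_t n, size_t d)
{
    return (n + d - 1) / d;
}

/* Number of 8x8 fragments covering one plane of the given size. */
static size_t plane_fragments(size_t width, size_t height)
{
    return ceil_div(width, FRAGMENT_SIZE) * ceil_div(height, FRAGMENT_SIZE);
}

/* Number of 32x32 superblocks covering one plane of the given size. */
static size_t plane_superblocks(size_t width, size_t height)
{
    return ceil_div(width, SUPERBLOCK_SIZE) * ceil_div(height, SUPERBLOCK_SIZE);
}

/* Macroblock count, using the snippet's rule of 4 Y fragments per macroblock. */
static size_t frame_macroblocks(size_t y_width, size_t y_height)
{
    return plane_fragments(y_width, y_height) / 4;
}
```

For a 64x64 luma plane this gives 64 fragments and 4 superblocks; for each assumed 32x32 chroma plane it gives 16 fragments and 1 superblock, matching the figures in the snippet. The macroblock helper only handles dimensions that are multiples of 16.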
2018 Dec 31
0
[cfe-dev] Portable multiplication 64 x 64 -> 128 for int128 reimplementation
...ents about GCC/Clang intrinsics. I never considered >> using them, but they might be a better alternative to inline assembly. >> Is there one for regular MUL? >> > > I'm not sure, but I think there currently does not exist any intrinsic to > generate the top half of a 64x64=128 multiply, except for `_mulx_u64`. > If Clang stopped requiring `-mbmi2`, I would then expect the `_mulx_u64` > intrinsic to generate a regular MUL instruction; similar to > how _addcarry_u64 generates ADCX/ADOX when available/useful and a regular > ADC otherwise. > MSVC calls this i...
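For context on what a portable fallback for the top half of a 64x64=128 multiply can look like, here is a minimal sketch in standard C that avoids intrinsics and 128-bit types by splitting each operand into 32-bit halves. It is an illustration only, not the approach settled on in the thread; the `u128_parts` type and `mul_64x64_128` name are hypothetical.

```c
#include <stdint.h>

/* Hypothetical result type: the two 64-bit halves of a 128-bit product. */
typedef struct { uint64_t lo, hi; } u128_parts;

/* Full unsigned 64x64 -> 128-bit multiply using only 64-bit arithmetic. */
static u128_parts mul_64x64_128(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    /* Four 32x32 -> 64 partial products. */
    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;

    /* Sum of three values below 2^32 each, so it cannot overflow 64 bits. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    u128_parts r;
    r.lo = (mid << 32) | (uint32_t)p0;
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}
```

Calling `mul_64x64_128(x, y).hi` yields the top half discussed in the snippet. Whether a given compiler recognizes this pattern and emits a single MUL instruction is not guaranteed and would need to be checked in the generated assembly.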
2009 Aug 02
4
X11 theme cursors
Does anybody know how I can use the X11 cursor theme in Wine?
2019 Feb 05
3
[RFC] Vector Predication
...generate from C loops, although it's > certainly possible. > > Here's an example: > > void foo(size_t n, int64_t *dst, int32_t *a, int32_t *b){ > for (size_t i=0; i<n; ++i) > dst[i] += a[i] * b[i]; > } > > If 32x32->64 multiplies are cheaper than 64x64->64 multiplies then you > might want to compile this to: > > # args n in a0, dst in a1, a in a2, b in a3, AVL in t0 > foo: > vsetvli a4, a0, vsew32,vlmul4 # vtype = 32-bit integer vectors, AVL in a4 > vlw.v v0, (a2) # Get 32b vector a into v0-v3 > vl...