Displaying 6 results from an estimated 6 matches for "maska".

2005 Dec 02
0
run time assembler patch for altivec, sse + bug fixes
...ec.h" spx_word32_t inner_prod_altivec(const float *a, const float *b, int len) { int i; float sum; int a_aligned = (((unsigned long)a) & 15) ? 0 : 1; int b_aligned = (((unsigned long)b) & 15) ? 0 : 1; __vector float MSQa, LSQa, MSQb, LSQb; __vector unsigned char maska, maskb; __vector float vec_a, vec_b; __vector float vec_result; vec_result = (__vector float)vec_splat_u8(0); if ((!a_aligned) && (!b_aligned)) { // This (unfortunately) is the common case. maska = vec_lvsl(0, a); maskb = vec_lvsl(0, b);...
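The `a_aligned` / `b_aligned` tests in the snippet above check whether the low four bits of each pointer are zero, i.e. whether the buffer starts on a 16-byte boundary, which is what aligned vector loads expect. A minimal sketch of the same test follows. The helper names `isAligned16` and `selfCheck` are hypothetical and not part of the patch, and `std::uintptr_t` is swapped in for the patch's `unsigned long` cast as an assumption about what is more portable.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): true when p sits on a
// 16-byte boundary, i.e. the low four address bits are all zero.
inline bool isAligned16(const void *p) {
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// Small self-check: a buffer forced to 16-byte alignment passes,
// while the address one float (4 bytes) further on does not.
inline void selfCheck() {
    alignas(16) float buf[8] = {};
    assert(isAligned16(buf));
    assert(!isAligned16(buf + 1));
}
```

With a check like this, a caller can choose between an aligned fast path and the unaligned path that the patch comment calls the common case.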
2013 May 16
0
[LLVMdev] Combining physical registers
On 5/16/2013 11:17 AM, Jakob Stoklund Olesen wrote: > > Would this TRI function solve your problem? >[...] > /// > /// Covering = getCoveringLanes(); > /// MaskA = getSubRegIndexLaneMask(SubA); > /// MaskB = getSubRegIndexLaneMask(SubB); > /// > /// If (MaskA & ~(MaskB & Covering)) == 0, then SubA is completely covered by > /// SubB. > unsigned getCoveringLanes() const { return CoveringLanes; } Yes, this would solve...
2013 May 16
1
[LLVMdev] Combining physical registers
...covered by the ssub_0 and ssub_1 lanes. /// This is related to the CoveredBySubRegs property on register definitions. /// /// This function returns a bit mask of lanes that completely cover their /// sub-registers. More precisely, given: /// /// Covering = getCoveringLanes(); /// MaskA = getSubRegIndexLaneMask(SubA); /// MaskB = getSubRegIndexLaneMask(SubB); /// /// If (MaskA & ~(MaskB & Covering)) == 0, then SubA is completely covered by /// SubB. unsigned getCoveringLanes() const { return CoveringLanes; } /jakob
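The covering test quoted above, `(MaskA & ~(MaskB & Covering)) == 0`, can be tried out with plain integers. This is only a sketch: the `LaneMask` alias, the `isCoveredBy` helper, and the lane bit assignments in `selfCheck` are made up for illustration and are not taken from LLVM or any real target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a lane mask; one bit per lane.
using LaneMask = std::uint32_t;

// True when every lane of maskA is also a covering lane of maskB,
// which is the condition from the quoted comment.
constexpr bool isCoveredBy(LaneMask maskA, LaneMask maskB, LaneMask covering) {
    return (maskA & ~(maskB & covering)) == 0;
}

// Illustrative lane layout (an assumption, not real target data):
// two 32-bit halves, each on its own lane, forming one 64-bit register.
inline void selfCheck() {
    const LaneMask ssub0 = 0b01, ssub1 = 0b10, dsub = 0b11;
    const LaneMask covering = 0b11;               // both lanes are covering
    assert(isCoveredBy(ssub0, dsub, covering));   // half is covered by whole
    assert(isCoveredBy(ssub1, dsub, covering));
    assert(!isCoveredBy(dsub, ssub0, covering));  // whole is not covered by half
    assert(!isCoveredBy(ssub1, dsub, 0b01));      // lane 1 is not a covering lane
}
```

The last case shows why the `Covering` term matters: a lane shared by both masks only counts if it is known to cover its sub-registers completely.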
2013 May 16
2
[LLVMdev] Combining physical registers
The function TII::canCombineSubRegIndices has been gone for a while now, and I was wondering if there is a target-independent way of determining if a certain set of physical registers "adds up" to a larger register. For example, on X86, AL and AH together form AX. On Hexagon, R0 and R1 are D0. The context here is an attempt to coalesce multiple loads/stores into fewer loads/stores
2004 Aug 06
2
[PATCH] Make SSE Run Time option. Add Win32 SSE code
Jean-Marc, >I'm still not sure I get it. On an Athlon XP, I can do something like >"mulps xmm0, xmm1", which means that the xmm registers are indeed >supported. Besides, without the xmm registers, you can't use much of >SSE. On the Athlon XP 2400+ that we have in our QA lab (Win2000), running that code generates an Illegal Instruction error. In addition,
2004 Aug 06
6
[PATCH] Make SSE Run Time option.
...p; CPU_MODE_ALTIVEC )) { #ifdef _USE_ALTIVEC int i; float sum; int a_aligned = (((unsigned long)a) & 15) ? 0 : 1; int b_aligned = (((unsigned long)b) & 15) ? 0 : 1; __vector float MSQa, LSQa, MSQb, LSQb; __vector unsigned char maska, maskb; __vector float vec_a, vec_b; __vector float vec_result; vec_result = (__vector float)vec_splat_u8(0); if ((!a_aligned) && (!b_aligned)) { // This (unfortunately) is the common case. maska = vec_lvsl(0, a);...