Displaying 4 results from an estimated 4 matches for "r108".
2018 Jun 21
2
NVPTX - Reordering load instructions
...onally, even with launch configurations where this could
hurt performance), it allows the CUDA PTX JIT to emit 128-bit loads:
> LDS.128 R76, [0x2f0];
> LDS.128 R60, [0xa0];
> LDS.128 R72, [0x130];
> LDS.128 R96, [0x1b0];
> LDS.128 R92, [0x30];
> LDS.128 R116, [0x50];
> LDS.128 R108, [0x1f0];
LLVM preserves the operations more or less as they are emitted by the
front-end, interleaving memory operations with arithmetic. As a result,
the SASS code contains many more 32-bit loads, which lowers performance
by ~10% on this specific benchmark.
What would be the best approach to im...
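The ordering issue described in this snippet can be illustrated with a small host-side C++ sketch. It is not the poster's kernel and not NVPTX code. The names `Vec4`, `sum_scaled_interleaved`, and `sum_scaled_grouped` are hypothetical, and `Vec4` only stands in for a 16-byte vector type such as CUDA's `float4`. The first function alternates 32-bit loads with arithmetic. The second reads the four adjacent elements as one 16-byte unit before any arithmetic, which is the shape that would let a backend choose a single wide load.

```cpp
#include <cstring>

// Hypothetical 16-byte-aligned vector of four floats (stand-in for float4).
struct alignas(16) Vec4 {
    float x, y, z, w;
};
static_assert(sizeof(Vec4) == 16, "Vec4 is expected to occupy 128 bits");

// Interleaved form: each 32-bit load is followed immediately by arithmetic,
// roughly the order a front-end might emit and the optimizer might preserve.
float sum_scaled_interleaved(const float* buf, float k) {
    float acc = 0.0f;
    float a = buf[0]; acc += a * k;
    float b = buf[1]; acc += b * k;
    float c = buf[2]; acc += c * k;
    float d = buf[3]; acc += d * k;
    return acc;
}

// Grouped form: the four adjacent elements are copied as one 16-byte unit
// first, then all arithmetic follows in the same order as above.
float sum_scaled_grouped(const float* buf, float k) {
    Vec4 v;
    std::memcpy(&v, buf, sizeof v);  // one contiguous 128-bit read
    float acc = 0.0f;
    acc += v.x * k;
    acc += v.y * k;
    acc += v.z * k;
    acc += v.w * k;
    return acc;
}
```

Both functions perform the same arithmetic in the same order, so they return the same value. Whether a given compiler or JIT actually emits a 128-bit load for the grouped form depends on alignment and target, which this sketch does not guarantee.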
2010 Jun 15
0
[LLVMdev] Question on X86 backend
...50, R51, R52, R53, R54, R55, R56, R57, R58, R59, R60, R61, R62, R63,
R64, R65, R66, R67, R68, R69, R70, R71, R72, R73, R74, R75, R76, R77, R78, R79,
R80, R81, R82, R83, R84, R85, R86, R87, R88, R89, R90, R91, R92, R93, R94, R95,
R96, R97, R98, R99, R100, R101, R102, R103, R104, R105, R106, R107, R108, R109, R110, R111,
R112, R113, R114, R115, R116, R117, R118, R119, R120, R121, R122, R123, R124, R125, R126, R127,
R128, R129, R130, R131, R132, R133, R134, R135, R136, R137, R138, R139, R140, R141, R142, R143,
R144, R145, R146, R147, R148, R149, R150, R151, R152, R153, R154, R155, R156, R157, R...
2010 Jun 15
2
[LLVMdev] Question on X86 backend
Hi Micah,
> In X86InstrInfo.td for Call Instructions, it mentions that Uses for
> argument registers are added manually. Can someone point me to the
> location where they are added as the comment doesn't reference a
> where or how?
the register uses are added by the function
X86TargetLowering::LowerCall() during the DAG Lowering phase. This is
the relevant code segment:
// Add
2018 Jun 21
2
NVPTX - Reordering load instructions
...allows the
> CUDA PTX JIT to emit 128-bit loads:
> >
> >> LDS.128 R76, [0x2f0];
> >> LDS.128 R60, [0xa0];
> >> LDS.128 R72, [0x130];
> >> LDS.128 R96, [0x1b0];
> >> LDS.128 R92, [0x30];
> >> LDS.128 R116, [0x50];
> >> LDS.128 R108, [0x1f0];
> > LLVM preserves the operations more or less as they are emitted by the
> > front-end, interleaving memory operations with arithmetic. As a result,
> > the SASS code contains many more 32-bit loads, which lowers performance
> > by ~10% on this specific benchmark....