thr3ads.net - search: "r108"

Displaying 4 results from an estimated 4 matches for "r108".

2018 Jun 21

NVPTX - Reordering load instructions

...onally, even with launch configurations where this could hurt performance), it allows the CUDA PTX JIT to emit 128-bit loads:
> LDS.128 R76, [0x2f0];
> LDS.128 R60, [0xa0];
> LDS.128 R72, [0x130];
> LDS.128 R96, [0x1b0];
> LDS.128 R92, [0x30];
> LDS.128 R116, [0x50];
> LDS.128 R108, [0x1f0];
LLVM preserves the operations more or less as they are emitted by the front-end, interleaving memory operations with arithmetic. As a result, the SASS code contains many more 32-bit loads, which lowers performance by ~10% on this specific benchmark. What would be the best approach to im...

[LLVMdev] Question on X86 backend

2010 Jun 15

...50, R51, R52, R53, R54, R55, R56, R57, R58, R59, R60, R61, R62, R63, R64, R65, R66, R67, R68, R69, R70, R71, R72, R73, R74, R75, R76, R77, R78, R79, R80, R81, R82, R83, R84, R85, R86, R87, R88, R89, R90, R91, R92, R93, R94, R95, R96, R97, R98, R99, R100, R101, R102, R103, R104, R105, R106, R107, R108, R109, R110, R111, R112, R113, R114, R115, R116, R117, R118, R119, R120, R121, R122, R123, R124, R125, R126, R127, R128, R129, R130, R131, R132, R133, R134, R135, R136, R137, R138, R139, R140, R141, R142, R143, R144, R145, R146, R147, R148, R149, R150, R151, R152, R153, R154, R155, R156, R157, R...

[LLVMdev] Question on X86 backend

2010 Jun 15

Hi Micah,
> In X86InstrInfo.td for Call Instructions, it mentions that Uses for
> argument registers are added manually. Can someone point me to the
> location where they are added as the comment doesn't reference a
> where or how?
The register uses are added by the function X86TargetLowering::LowerCall() during the DAG Lowering phase. This is the relevant code segment: // Add

NVPTX - Reordering load instructions

2018 Jun 21

... the
> CUDA PTX JIT to emit 128-bit loads:
> >
> >> LDS.128 R76, [0x2f0];
> >> LDS.128 R60, [0xa0];
> >> LDS.128 R72, [0x130];
> >> LDS.128 R96, [0x1b0];
> >> LDS.128 R92, [0x30];
> >> LDS.128 R116, [0x50];
> >> LDS.128 R108, [0x1f0];
> > LLVM preserves the operations more or less as they are emitted by the
> > front-end, interleaving memory operations with arithmetic. As a result,
> > the SASS code contains many more 32-bit loads, which lowers performance
> > by ~10% on this specific benchmark....
