Displaying 3 results from an estimated 3 matches for "optimizing_assembly".

2014 Jan 14
PATCH for lpc_asm.nasm
...Two comments ";ASSERT(lp_quantization <= 31)" in the new functions ..._wide_asm_ia32(), just to mention this constraint (the maximum possible value of lp_quantization is 15, so it is not a problem).
2) "mov cl, ..." was replaced with "mov ecx, ..." (again following Agner Fog, optimizing_assembly.pdf). Summary: a write to a partial register may result in false dependencies between instructions, so it is better to avoid it. (Also, bitreader_asm.nasm and stream_encoder_asm.nasm both have "mov ecx, ..." instructions and no "mov cl, ...".)
-------------- next part ------------...
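The partial-register point in the snippet above can be illustrated with a small Python model. This is only a sketch of the architectural semantics, not a timing measurement, and the function names `mov_cl` and `mov_ecx` are hypothetical helpers invented for this illustration. It shows that the result of "mov cl, ..." still depends on the previous contents of ECX, while "mov ecx, ..." does not, which is why the full-register form lets the processor forget the old value.

```python
MASK32 = 0xFFFFFFFF


def mov_cl(old_ecx: int, value: int) -> int:
    """Model `mov cl, value`: only bits 0-7 change; bits 8-31 come from the old ECX."""
    return (old_ecx & 0xFFFFFF00) | (value & 0xFF)


def mov_ecx(old_ecx: int, value: int) -> int:
    """Model `mov ecx, value`: all 32 bits are overwritten; the old ECX is ignored."""
    return value & MASK32


if __name__ == "__main__":
    # Two different "previous" ECX values, same value being loaded.
    for old in (0x00000000, 0xDEADBEEF):
        print(f"old={old:#010x}  mov cl -> {mov_cl(old, 15):#010x}  "
              f"mov ecx -> {mov_ecx(old, 15):#010x}")
```

In this model the "mov cl" result changes with the old register value and the "mov ecx" result never does. A later instruction that reads the whole register therefore has to wait for whatever produced the old upper bits, even when the program does not care about them.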
2014 Jan 03
PATCH: match calls and returns
According to Agner Fog, "...you must make sure that all calls are matched with returns. Never jump out of a subroutine without a return and never use a return as an indirect jump." (See paragraph 3.15 in microarchitecture.pdf and examples 3.5a and 3.5b in optimizing_assembly.pdf.)

Basically this patch replaces

    call .get_eip0
.get_eip0:
    pop eax

with

    call .mov_eip_to_eax
.get_eip0:

and

.mov_eip_to_eax:
    mov eax, [esp]
    ret

-------------- next part --------------
A non-text attachment was scrubbed... Name: get_eip.diff Type: application/octet-stream Size...
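The reasoning behind matching calls with returns can be sketched with a toy model of a return-address predictor, written here in Python rather than assembly. Everything in it is an assumption made for illustration: the `ReturnPredictor` class, the `run` function, and the call chain main -> g -> f are hypothetical, and real predictors are considerably more complex and differ between processors.

```python
class ReturnPredictor:
    """Toy return-address stack: CALL pushes a prediction, RET pops one."""

    def __init__(self):
        self.stack = []
        self.mispredicts = 0

    def call(self, return_addr):
        # The predictor remembers where this call is expected to return.
        self.stack.append(return_addr)

    def ret(self, actual_addr):
        # The predictor guesses the most recently pushed return address.
        predicted = self.stack.pop() if self.stack else None
        if predicted != actual_addr:
            self.mispredicts += 1


def run(use_matched_helper: bool) -> int:
    """Simulate main -> g -> f, where f obtains EIP with one of the two idioms."""
    p = ReturnPredictor()
    p.call("main")       # main calls g
    p.call("g")          # g calls f
    p.call(".get_eip0")  # f executes a call whose return address is .get_eip0
    if use_matched_helper:
        p.ret(".get_eip0")  # helper: mov eax, [esp] / ret
    # Otherwise `pop eax` consumes the address and no ret is executed,
    # so the predictor's entry for this call is never popped.
    p.ret("g")     # f returns to g
    p.ret("main")  # g returns to main
    return p.mispredicts


if __name__ == "__main__":
    print("call + pop eax:    ", run(False), "mispredicted returns")
    print("call + matched ret:", run(True), "mispredicted returns")
```

In this model the unmatched call leaves a stale entry on the predictor's stack, so every enclosing return is predicted one entry off. The matched helper keeps the stack balanced.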
2016 Jan 19
Lets do a 1.3.2 release
Dave Yeo wrote:
>> I cannot find information what version of binutils supports AVX/AVX2/FMA
>> instructions, but IIRC OS/2 doesn't support AVX instructions anyway,
>> so it doesn't matter much.
>
> Surprisingly, I've yet to have a report of an AVX related crash or trap
> (used in FFmpeg and projects based on it, Mozilla, probably others).
> As I