Why do we need the function x86/mmxencfrag.c:oc_enc_frag_sub_128_mmx()?

Take this fragment of 8-bit pixels:

   28  28  28  28  28  28  28  28
   28  28  28  28  28  28  28  28
   28  28  28  29  29  29  28  28
   28  29  28  29  29  29  28  28
   28  29  28  28  28  28  28  28
   29  28  29  28  28  28  28  28
   29  28  29  29  29  29  28  28
   29  29  29  29  29  29  29  29

After oc_enc_frag_sub_128_mmx() it is:

  -100 -100 -100 -100 -100 -100 -100 -100
  -100 -100 -100 -100 -100 -100 -100 -100
  -100 -100 -100  -99  -99  -99 -100 -100
  -100  -99 -100  -99  -99  -99 -100 -100
  -100  -99 -100 -100 -100 -100 -100 -100
   -99 -100  -99 -100 -100 -100 -100 -100
   -99 -100  -99  -99  -99  -99 -100 -100
   -99  -99  -99  -99  -99  -99  -99  -99

After the FDCT, x86/sse2fdct.c:oc_enc_fdct8x8_x86_64sse2(), it is:

  -3187     3    -3     2     0    -1     1    -1
     -5    -1     0     1     0    -1    -1    -1
      4     0     3     0     1     2     2     2
     -4     4     5    -1    -1     1     1     2
      2     0     2     0    -1    -2    -3    -2
      2     1    -4     3     2     0     2    -1
      2     1     2     0     1     1     1     1
     -1     0     0     1     0     0     0    -1

Is that right? So we need 13 bits to store DCT coefficients (for 8-bit
pixels)? See the formulas at
http://www.cs.cf.ac.uk/Dave/Multimedia/node231.html:

  max |F(i,j)| = 64 * 255 * 2/8 = 4080 = 12 bits; plus a sign bit = 13 bits
  (-4096...+4095)?
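To make the numbers above easy to reproduce, here is a minimal plain-C sketch
of the 128-subtraction step together with the worst-case magnitude arithmetic
from the linked formula. This is not libtheora's actual code: the helper name
frag_sub_128 and the scalar loop are illustrative (the real MMX routine
presumably performs the same subtraction several pixels at a time).

  #include <stdio.h>

  /* Illustrative scalar equivalent of the 128-subtraction step:
     turn unsigned 8-bit pixels into signed residuals centered on zero. */
  static void frag_sub_128(short res[64], const unsigned char *pix, int stride){
    int i, j;
    for(i = 0; i < 8; i++){
      for(j = 0; j < 8; j++) res[i * 8 + j] = (short)(pix[j] - 128);
      pix += stride;
    }
  }

  int main(void){
    /* The fragment from the dump above: 28 -> -100, 29 -> -99. */
    static const unsigned char frag[64] = {
      28,28,28,28,28,28,28,28,
      28,28,28,28,28,28,28,28,
      28,28,28,29,29,29,28,28,
      28,29,28,29,29,29,28,28,
      28,29,28,28,28,28,28,28,
      29,28,29,28,28,28,28,28,
      29,28,29,29,29,29,28,28,
      29,29,29,29,29,29,29,29
    };
    short res[64];
    int i;
    frag_sub_128(res, frag, 8);
    for(i = 0; i < 64; i++) printf("%5d%s", res[i], (i & 7) == 7 ? "\n" : " ");
    /* Worst case per the linked formula with N = 8 and all 64 samples at
       the maximum: 64 * 255 * 2/8 = 4080, i.e. 12 bits of magnitude,
       13 bits with the sign (-4096...+4095). */
    printf("max |F(i,j)| = %d\n", 64 * 255 * 2 / 8);
    return 0;
  }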