Delay in poly white.. what is working
Posted: Tue Sep 06, 2022 1:07 pm
//// EDIT: Sorry, I was misleading! There was a problem with the delay I was using!
//// The true answer will be in the next post, where I will upload a corrected file!
I am starting another topic because it became confusing what works and what does not. (Sorry, I also need to test more before posting!)
I ran into some problems with my last delay optimization when trying to use it in poly white.
Depending on the situation it is confusing, because it seems to work when it does not...
But since I had used shufps and arrays successfully in other poly code, it was not clear what was not working..
So here is more about it.. I am not yet sure of everything, but here is a delay that uses movaps xmm0,mem[eax];
in poly and is working.
(This is faster than the other way, which needs to convert the variable to 80-bit and then convert it back to 32-bit.)
It is a little experimental for now, so some instructions may not really be needed..
The real problem is when we do: movd eax,xmm7; movaps xmm0,mem[eax]; then shufps to mix the different reads back together. >> Does something detect a relation between the different SSE channels???
So here I use a variable where I store all the delay addresses. Then I write the result into a variable:
Code:
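// channel 0: fetch its stored delay address, read 4 floats, keep lane 0
mov eax,r1[0];
movaps xmm0,mem[eax];
movd eax,xmm0;
mov read1[0],eax;
// channel 1: same read, then copy lane 1 into every lane (85 = 01010101b)
mov eax,r1[1];
movaps xmm0,mem[eax];
shufps xmm0,xmm0,85;
movd eax,xmm0;
mov read1[1],eax;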
... .... ...
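A side note on the 85 passed to shufps above: it is 01010101 in binary, so each of the four 2-bit selector fields picks lane 1, and movd then reads that lane from the low position. The sketch below emulates this in plain C on arrays, only to illustrate how the immediate selects lanes. It is not FlowStone code, and the names shufps_emul and broadcast_imm are hypothetical helpers made up for the illustration.

```c
#include <assert.h>

/* Emulates SSE "shufps dst,src,imm8" on plain arrays:
   result lanes 0-1 are picked from dst, lanes 2-3 from src,
   each by one 2-bit field of the immediate. */
static void shufps_emul(float dst[4], const float src[4], unsigned imm)
{
    float out[4];
    out[0] = dst[imm & 3];          /* bits 1:0 select from dst */
    out[1] = dst[(imm >> 2) & 3];   /* bits 3:2 select from dst */
    out[2] = src[(imm >> 4) & 3];   /* bits 5:4 select from src */
    out[3] = src[(imm >> 6) & 3];   /* bits 7:6 select from src */
    for (int i = 0; i < 4; i++)
        dst[i] = out[i];
}

/* Immediate that copies lane ch into all four lanes when
   dst and src are the same register: 0, 85, 170 or 255. */
static unsigned broadcast_imm(unsigned ch)
{
    assert(ch < 4);
    return ch * 0x55u;
}
```

Calling shufps_emul(v, v, broadcast_imm(1)) on an array v leaves every lane equal to the old v[1], which is what the shufps xmm0,xmm0,85 step relies on before movd. With two different operands, the low two lanes come from the first and the high two from the second.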
At first I thought that movd was the problem.
But I use it to get the read values back..
Shufps seems OK, but only if it isolates one SSE channel after reading all of them, not when exchanging some of them or mixing different variables...
Maybe the real problem is using shufps on two operands that each read the 4 channels of the array?