Could this technique be faster?
Posted: Wed Apr 13, 2022 3:42 pm
Hi,
What do you think of this technique? Could it be faster?
Here's an example using a small sine oscillator.
Normally we must load 6 variables every cycle.
But 2 of the floats and 2 of the ints have the same value on all SSE channels.
The idea is to use the SSE channels to pack the 2 floats and 2 ints into 1 float variable and 1 int variable in stage0.
So we would load 2 variables instead of 4.
Then in stage 2 we have to copy each one to another register and shufps it. But I suppose this could still be faster?
In this example it doesn't change much, but maybe it would with code that needs more variables?
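To make the idea concrete, here is a rough sketch of the float half in C with SSE intrinsics instead of the stage assembler. The names pack_pair, splat_lane0, splat_lane1 and scale_offset4 are made up for illustration, and the "x * a + b" step is only a stand-in for the oscillator maths, not the actual oscillator code. The int pair would work the same way with pshufd (_mm_shuffle_epi32, SSE2).

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_set_ps, _mm_shuffle_ps, ... */

/* Pack two per-voice constants into one register: lanes = {a, b, 0, 0}.
   _mm_set_ps takes its arguments from lane 3 down to lane 0. */
static __m128 pack_pair(float a, float b) {
    return _mm_set_ps(0.0f, 0.0f, b, a);
}

/* Broadcast lane 0 to all four lanes (compiles to one shufps). */
static __m128 splat_lane0(__m128 packed) {
    return _mm_shuffle_ps(packed, packed, _MM_SHUFFLE(0, 0, 0, 0));
}

/* Broadcast lane 1 to all four lanes (compiles to one shufps). */
static __m128 splat_lane1(__m128 packed) {
    return _mm_shuffle_ps(packed, packed, _MM_SHUFFLE(1, 1, 1, 1));
}

/* Hypothetical per-sample step: out = x * a + b on all four channels,
   with a and b recovered from a single loaded variable. */
static void scale_offset4(const float x[4], const float packed_ab[4],
                          float out[4]) {
    __m128 p = _mm_loadu_ps(packed_ab);  /* one load instead of two */
    __m128 a = splat_lane0(p);           /* copy + shufps */
    __m128 b = splat_lane1(p);           /* copy + shufps */
    __m128 r = _mm_add_ps(_mm_mul_ps(_mm_loadu_ps(x), a), b);
    _mm_storeu_ps(out, r);
}
```

With x = {1, 2, 3, 4} and the pair (2.0, 0.5) packed once up front, scale_offset4 gives {2.5, 4.5, 6.5, 8.5}. So each value that no longer has its own load costs one register copy plus one shufps, and only a benchmark can say whether that trade wins.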
Also, do you think the aliasing is OK with this sine approximation?
Thanks for any response!