do the Shufps

Postby HughBanton » Thu May 06, 2021 11:23 am

Hi all,

Can anyone tell me the shufps code to change the SSE channel order from 0123 to 3210, i.e. to reverse it? What would be the 'n' in:
shufps xmm0,xmm0,n; ? (Maybe it needs more than one step ..)

I've read here that there was a handy shufps helper on the forum some years back, but I haven't been able to find it. A ref to that would be most useful!

Thanks
H
HughBanton
Posts: 265
Joined: Sat Apr 12, 2008 3:10 pm
Location: Evesham, Worcestershire

Postby tulamide » Thu May 06, 2021 6:52 pm

According to Intel x86 Assembly/SSE, this would be it:
shufps $0x1b, %xmm0, %xmm0 # reverse order of the 4 floats

The control byte (which in other syntaxes, such as NASM, is always written as the last operand rather than the first as here) is an 8-bit immediate and tells what goes where.
The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. The select operand is an 8-bit immediate: bits 0 and 1 select the value to be moved from the destination operand the low doubleword of the result, bits 2 and 3 select the value to be moved from the destination operand the second doubleword of the result, bits 4 and 5 select the value to be moved from the source operand the third doubleword of the result, and bits 6 and 7 select the value to be moved from the source operand the high doubleword of the result.

$0x1b is hexadecimal; in decimal it is 27 and in binary 00011011. Broken into 2-bit fields that is 00 01 10 11, i.e. the immediates 0, 1, 2, 3. I think it's MSB order (most significant field first).
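To make the bit layout concrete, here is a small Python sketch that emulates what shufps does with the control byte. It is only an illustration: the helper names shufps and make_imm8 are made up for this post, and plain lists stand in for XMM registers, with index 0 as the lowest element.

```python
def shufps(dst, src, imm8):
    """Emulate SHUFPS on two 4-element lists (index 0 = lowest element)."""
    # One 2-bit selector per result element; result element i uses bits 2i and 2i+1.
    sel = [(imm8 >> (2 * i)) & 0b11 for i in range(4)]
    # The two low result elements read from dst, the two high ones from src.
    return [dst[sel[0]], dst[sel[1]], src[sel[2]], src[sel[3]]]


def make_imm8(s0, s1, s2, s3):
    """Pack the selectors for result elements 0..3 (hypothetical helper)."""
    return s0 | (s1 << 2) | (s2 << 4) | (s3 << 6)


x = [10.0, 11.0, 12.0, 13.0]      # elements 0, 1, 2, 3
imm = make_imm8(3, 2, 1, 0)       # element 0 takes old 3, element 1 takes old 2, ...
reversed_x = shufps(x, x, imm)

print(hex(imm))       # 0x1b
print(reversed_x)     # [13.0, 12.0, 11.0, 10.0]
```

With the same register as both operands, the dst/src split does not matter, which is why a single shufps is enough for the reversal.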

Hope it helps!
"There lies the dog buried" (German saying translated literally)
tulamide
Posts: 2686
Joined: Sat Jun 21, 2014 2:48 pm
Location: Germany

Postby HughBanton » Thu May 06, 2021 8:15 pm

Hah - 27 .. that's it! Thanks Tula.

I had searched hi & lo, but couldn't find the logic written down anywhere. I'll make a note of all that.

I've been occasionally looking at Rotary Speaker stuff of late (about time ..?) and realised that swapping the mono-4 channels around like this would instantly simplify the spider's web inside the auto-panner that I've introduced. I'm trying to make the delay reflections move individually in stereo as they 'rotate', which seems to be a crucial Leslie element.

Anyway, more on all this when I eventually get something worth demonstrating.

Thanks again.

H

Postby tulamide » Thu May 06, 2021 9:36 pm

It's the first time I've had to deal with it, which shows that it's actually pretty easy. The select operand has 8 bits, split into four 2-bit fields, one per element of the result. The position of a field says which result element it fills, and its value says which input element gets copied there. You just need to learn 4 values:

0 = copy from least significant element
1 = copy from second element
2 = copy from third element
3 = copy from most significant element

The same numbers in 2-bit binary: 0 = 00, 1 = 01, 2 = 10, 3 = 11

These are the same for all 4 fields in the IMM8. But, and this is the catch, there's a specified order when using two registers: the two low elements of the result are taken from the destination register and the two high elements from the source register!
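A quick way to see the two-register catch is to emulate the instruction with two different inputs. This is just a sketch in Python, not real SSE code: the function name shufps and the labels a0..a3 and b0..b3 are invented for the illustration, and index 0 is the lowest element.

```python
def shufps(dst, src, imm8):
    """Emulate SHUFPS on two 4-element lists (index 0 = lowest element)."""
    sel = [(imm8 >> (2 * i)) & 0b11 for i in range(4)]  # selector per result element
    # Result elements 0 and 1 can only come from dst, elements 2 and 3 only from src.
    return [dst[sel[0]], dst[sel[1]], src[sel[2]], src[sel[3]]]


a = ["a0", "a1", "a2", "a3"]   # stands in for the destination register
b = ["b0", "b1", "b2", "b3"]   # stands in for the source register

mixed = shufps(a, b, 0x1B)     # same control byte as the reversal
print(mixed)                   # ['a3', 'a2', 'b1', 'b0']
```

So the control byte that reverses a single register gives a mix of both inputs as soon as the two operands differ.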

However, if you only work with one register, you can directly translate it:

ABCD to DABC
IMM8 2, 1, 0, 3 = mask 10 01 00 11 = binary 10010011 = decimal 147 = hex 0x93

The example above is called a rotation. If you are only interested in specific uses of shufps on one register, namely broadcast, swap and rotate, this page will help you a lot: it doesn't explain much, but gives ready-to-use code for specific tasks.
http://www.songho.ca/misc/sse/sse.html
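Whether 0x93 looks like a rotation to the right or to the left depends on which end of the register you write first. The sketch below emulates shufps in Python to show both readings; the names shufps and rotate_demo are made up for this illustration, and the assumption is that "ABCD" is either listed from element 0 upwards or from element 3 downwards.

```python
def shufps(dst, src, imm8):
    """Emulate SHUFPS on two 4-element lists (index 0 = lowest element)."""
    sel = [(imm8 >> (2 * i)) & 0b11 for i in range(4)]  # selector per result element
    return [dst[sel[0]], dst[sel[1]], src[sel[2]], src[sel[3]]]


def rotate_demo(text, low_element_first):
    """Apply imm8 = 0x93 to a 4-letter string under one of two writing conventions."""
    # Turn the written string into a list indexed from the lowest element.
    lanes = list(text) if low_element_first else list(reversed(text))
    out = shufps(lanes, lanes, 0x93)
    # Write the result back out in the same convention as the input.
    return "".join(out if low_element_first else reversed(out))


print(rotate_demo("ABCD", True))    # DABC  (A is element 0)
print(rotate_demo("ABCD", False))   # BCDA  (A is element 3)
```

The instruction does the same thing in both cases; only the order in which the letters are written down changes.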

EDIT: I told you it is in MSB order, but my example was in LSB order! Sorry! 0x93 would do ABCD to BCDA!
EDIT2: According to the tool Martin posted, my original explanation is absolutely correct. So please ignore the first edit!
Last edited by tulamide on Fri May 14, 2021 8:56 pm, edited 1 time in total.

Postby martinvicanek » Fri May 07, 2021 10:00 am

Wonderful tool by STW and infuzion!
Attachments
shufps ASM operand mask helper 1.5.2.fsm
(20.36 KiB) Downloaded 753 times
martinvicanek
Posts: 1315
Joined: Sat Jun 22, 2013 8:28 pm

Postby tulamide » Fri May 07, 2021 12:01 pm

martinvicanek wrote:Wonderful tool by STW and infuzion!

Interesting. The tool lays out the mask exactly as I did in my example: 0x93 does a right shift. But Intel explains it exactly the opposite way. According to their documentation, it should do a left shift.

What's going on here? :?:

Postby tulamide » Fri May 14, 2021 8:54 pm

Am I ignored, or does nobody know?

Postby Spogg » Sat May 15, 2021 6:37 am

tulamide wrote:Am I ignored, or does nobody know?


Definitely ignored. :lol:

We need an "I read your post but I know nothing" button!
Spogg
Posts: 3318
Joined: Thu Nov 20, 2014 4:24 pm
Location: Birmingham, England

Postby martinvicanek » Sat May 15, 2021 10:27 am

Sorry, Tula, not ignoring your post, just don't know the answer to your question.
tulamide wrote:
The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. The select operand is an 8-bit immediate: bits 0 and 1 select the value to be moved from the destination operand the low doubleword of the result, bits 2 and 3 select the value to be moved from the destination operand the second doubleword of the result, bits 4 and 5 select the value to be moved from the source operand the third doubleword of the result, and bits 6 and 7 select the value to be moved from the source operand the high doubleword of the result.

If this is Intel's explanation then I don't understand it. I have read it several times, but even the grammar seems odd to me. All I can say is that the shufps helper tool, which I have been using extensively for years, works flawlessly.

Postby tulamide » Sat May 15, 2021 1:58 pm

Thanks! Yes, as I said earlier, the tool and my explanation both do the correct thing. That's why I was confused that the documentation explains it in the opposite order.

But nobody ever complained about the description, so I assume its flaw has long been accepted and people are aware of it? Or it is a matter of little and big endian, which depends on the CPU. Maybe I was reading the description for big endian, instead of the little endian used by Intel CPUs? Well, I think we can leave it at that.