Optimized Chorus/Flanger/Echo&PingPong Delay/Tarrabia filter

Postby TrojakEW » Thu Dec 27, 2012 10:31 pm

I've made some adjustments and ASM optimizations to some stock modules in FlowStone. Nothing extraordinary, and nothing for advanced users. I also want to thank trogluddite, MyCo, infuzion, and cyto for their hard work on the SynthMaker forums. I learned a lot from their posts. So now it's my turn to help other users who are just beginning, even though I'm still a beginner too.

The stock Echo/Ping Pong delays are not CPU hungry, but I was able to reduce usage even more. I replaced almost all of the math with ASM versions and gained 20~25% in performance. I also added an additional delay module with stereo input.

The stereo chorus and chorus/flanger use 40% less CPU. I also replaced the stock delay with an optimized fractional delay (I don't know who optimized it, so credit goes to someone else).
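For anyone new to the term: a fractional delay reads the delay line at a non-integer number of samples, which is what lets a chorus or flanger sweep its delay time smoothly. Below is a minimal Python sketch of the idea, assuming plain linear interpolation between the two nearest samples. The interpolation used by the actual module is not stated in this thread, and the class and parameter names are hypothetical.

```python
class FractionalDelay:
    """Circular-buffer delay line read with linear interpolation (illustrative sketch)."""

    def __init__(self, max_delay_samples):
        # Two extra slots so the interpolation neighbour of the longest
        # allowed delay never collides with the sample just written.
        self.size = int(max_delay_samples) + 2
        self.buffer = [0.0] * self.size
        self.write_pos = 0

    def process(self, x, delay):
        """Store input sample x and return the input from `delay` samples ago.

        `delay` may be non-integer; it must satisfy 0 <= delay <= max_delay_samples.
        """
        self.buffer[self.write_pos] = x
        whole = int(delay)          # integer part of the delay
        frac = delay - whole        # fractional part, 0 <= frac < 1
        i0 = (self.write_pos - whole) % self.size   # sample `whole` steps back
        i1 = (i0 - 1) % self.size                   # sample `whole + 1` steps back
        y = (1.0 - frac) * self.buffer[i0] + frac * self.buffer[i1]
        self.write_pos = (self.write_pos + 1) % self.size
        return y


def run(signal, delay, max_delay=4):
    """Feed a list of samples through a fresh delay line at a fixed delay."""
    line = FractionalDelay(max_delay)
    return [line.process(s, delay) for s in signal]
```

Feeding a single impulse through `run` with a delay of 2.5 spreads it equally over the samples 2 and 3 steps later. In a chorus the `delay` argument would be modulated by an LFO on every sample instead of being held fixed.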

Last is the Tarrabia filter from the audiooak module pack. The optimized version is 45~55% faster, depending on the selected filter type.

Read more info/statistics and download here

That's all for now. More to come.
TrojakEW
Posts: 111
Joined: Sat Dec 25, 2010 10:12 am
Location: Slovakia

Re: Optimized Chorus/Flanger/Echo&PingPong Delay/Tarrabia fi

Postby infuzion » Sat Dec 29, 2012 3:51 am

Welcome :) Ah, a [wo]man after my own heart!

Yes, ASM can reduce CPU cycles a lot, but I'm not certain about the 40-50% you are claiming; even the now-old Core2Duo CPUs self-optimize decently. For AMD, I could see around 40%.
There are still a few places where you can squeeze out an extra cycle or 2. Or 5. But overall, well done :)
infuzion
 
Posts: 109
Joined: Tue Jul 13, 2010 11:55 am
Location: Kansas City, USA, Earth, Sol

Re: Optimized Chorus/Flanger/Echo&PingPong Delay/Tarrabia fi

Postby TrojakEW » Sat Dec 29, 2012 10:59 am

Thank you.
Well, I have an old Athlon 64 X2 4200, and for me those are the real numbers. Of course there is some room for more improvement by combining the ASM blocks into one in order to reduce movaps, but for me (and I'm sure for some other users) it's easier to navigate through a few blocks rather than one big piece of code (so I will sacrifice 2 or 5 cycles for that :mrgreen: ). That's why I chose FlowStone. Of course, if you have any suggestions, I'm always ready to learn more.

Re: Optimized Chorus/Flanger/Echo&PingPong Delay/Tarrabia fi

Postby TrojakEW » Sat Dec 29, 2012 12:06 pm


Re: Optimized Chorus/Flanger/Echo&PingPong Delay/Tarrabia fi

Postby infuzion » Sat Dec 29, 2012 3:30 pm

TrojakEW wrote:Well I have old athlon 64 x2 4200 and for me those are real number.
I learned to optimize SM-ASM with my Athlon 64 X2 also; it was the best CPU in its day, but it is like 7 years old now? I'm surprised people still use it... though I thought about reviving it to ensure my ASM is fastest. Most newer Intel CPUs (like my i7 laptop) do not allow accurate CPU% readings inside SM.

BTW, I would pick up a Core2Duo (even an old used one) if you can; not only will it help you not wear out your Athlon with ASM testing, but it is much faster in many ways, & a few ASM optimizations for the Athlon actually make the Intel lose a few cycles. Not many, but I did notice it when I was helping with the toolbox's DeZip & a few other projects. The C2D does some smart opcode self-interlacing, its opcodes are faster (esp. divides), & on some occasions it can run the same XMM registers at the same time, IIRC. A few SM primitives run faster also; IIRC the Selector used ~10 cycles in stream mode on the Athlon but 2-3 cycles on the C2D+, so your design choices may be different.

I have posted a few more tips for you in SM's forum over the past year; search "ASM" with user "infuzion".

Re: Optimized Chorus/Flanger/Echo&PingPong Delay/Tarrabia fi

Postby TrojakEW » Sat Dec 29, 2012 4:20 pm

I can't afford to buy anything now, even used stuff. If the difference is only a few cycles then it's OK (for now). But this also shows that it's almost impossible to optimize anything for all CPU types. As for your tips from the SM forum, I think I have read almost all of them. I hope. Some are self-explanatory while others need time to "master". Now I'm trying to optimize arrays in ASM, which is a little confusing for me. Most of the time I end up with the exact opposite effect and the code runs 4-8x slower :mrgreen: .

Re: Optimized Chorus/Flanger/Echo&PingPong Delay/Tarrabia fi

Postby TrojakEW » Fri Jan 25, 2013 11:08 pm

I have updated the chorus pack. Both modules are even faster (66-67% on my CPU). The big reduction is thanks to Trog Luddite's optimized delay. I replaced the sine LFO with an ASM version and the log scaling with a Ruby version in order to reduce the number of modules/components.