DSP Robotics Support

Posted: **Mon Jun 10, 2019 11:20 am**

Hello,
I am trying to optimise some of my modules.
What is the best tool for measuring the difference between two modules?
My tools are quite old (OSM times), and I suspect they are not accurate.

Let's begin by getting the right tools: is there an analyser/speed tester usable in 2019 from which we can expect reliable results?

thanks

Posted: **Mon Jun 10, 2019 2:06 pm**

The old ones still work, but they rely on stream. Stream takes poly and multiple signals into account and lets you use each of them in an ordered manner, which is why people never really made a new one.

Stream is a bit limiting though, admittedly.

Posted: **Mon Jun 10, 2019 3:27 pm**

The results are curious.
I have attached the 3 analysers I use.
Is there a best one now?
Which one is the most efficient?

Posted: **Mon Jun 10, 2019 4:20 pm**

Any of those three analysers would still be equally useful today. They all use the same "analyser" component internally, so there is little to choose between them other than which one you find easiest to work with (I prefer the one with the big GUI, but mostly because it's the one that I'm most familiar with.)

When I'm close to an optimal design, I always double-check by running a plugin inside a VST host - one with "per-track" CPU readout is usually most useful (e.g. Reaper.) The reason for this is that modern CPUs use many internal optimisations for efficiency; they can execute the instructions in a different order than you coded them in, and the way that memory is optimised by using CPU caches can have significant effects, too. Testing inside a host while the CPU is busy with many other tasks can show CPU/memory load effects that aren't always clear when using an isolated test set up.

Posted: **Mon Jun 10, 2019 4:38 pm**

Thanks,
Here is a simple test:
Start with a simple module: an INVERTER, that is STREAM * -1 (INV modules).
Translate it to DSP: out=in*-1 (INV DSP modules).
Translate it to ASM (direct) (INV ASM modules):

Code:

    streamin in;
    streamout out;
    float FM1=-1;

    movaps xmm0,in;
    mulps xmm0,FM1;
    //Assignment> sLeft=xmm0
    movaps out,xmm0;

Then testing... ZERO difference.
I had to chain 12 modules to see any difference...

The FSM INVERT is the fastest.

Why is there no gain when I simplify the module?

Posted: **Mon Jun 10, 2019 5:01 pm**

In the case of DSP vs. ASM in your example, there's no difference because they are running exactly the same code - the DSP code is translated to ASM internally, and in this case it gives exactly the same instructions.

The primitive is lighter because something similar happens for stream primitives - not just the primitives, but also the connections between them, are converted to ASM internally, and this sometimes allows FS to use some optimisations which aren't possible within the ASM/DSP blocks.

The general principle is this. Instructions using only the "xmm" registers are very fast, but anything which uses float variables, streamins, and streamouts requires storing things in memory, which is much slower. The DSP->ASM translator doesn't always optimise very well - it sometimes (but not in this case) uses memory reading/writing where it doesn't really need to, and we can often optimise these better by hand, by keeping values inside "xmm" registers instead of reading/writing memory.

For example, when you look at the ASM output of a DSP code block, you sometimes see something like this...

Code:

    movaps xmm0, variable
    // Process xmm0
    movaps variable, xmm0
    movaps xmm0, variable
    // Process xmm0
    movaps variable, xmm0

In this case, the middle two "movaps" lines are not necessary - it's storing something only to load it straight back into the same place, and the final "movaps" at the end is all we need to make sure that the final result gets stored.
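As a rough sketch of the hand-trimmed version (assuming nothing else needs to read `variable` between the two processing steps; `variable` and the "Process" comments are the same placeholders as in the snippet above, not real code), the value simply stays in the register:

```
movaps xmm0, variable   // load from memory once
// Process xmm0 (first step)
// Process xmm0 (second step) - the value is still in xmm0
movaps variable, xmm0   // store the final result once
```

That is one load and one store instead of two of each; how much it saves in practice is something to check with the analyser or in a host.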

Posted: **Mon Jun 10, 2019 5:19 pm**

My knowledge of ASM is not so accurate (the last program I wrote was on an early 8-bit processor about 45 years ago!), but I understand what you say.
I just note that FS must be well coded to give such results. At the least, there should be a difference between the DSP and ASM versions, since there is no "translation" in the second one.
When you run an FSM schematic in FS, is the schematic interpreted or compiled?
If interpreted, I say BRAVO again for the quality of the coding.

So, some other questions:
(I will test further on more complex modules; 4 ops are too few, I guess.)
Is the GUI of the module (cosmetic GUIs like the ones in my example) a slowing factor?
Is the number of nested modules a slowing factor?

Thanks for your very interesting answers.

Posted: **Mon Jun 10, 2019 5:47 pm**

payaDSP wrote: At the least, there should be a difference between the DSP and ASM versions, since there is no "translation" in the second one.

The "translation" to ASM only happens once - when you write the code. When the code is running, the DSP module is really the same as an ASM module which has the "translated" code typed into it. This is different to, say, the Ruby code, which really is translated ("interpreted") every single time.

payaDSP wrote: Is the GUI of the module (cosmetic GUIs like the ones in my example) a slowing factor?

It can be, yes - for example, if you use very large, fast animations. FS does not use the power of the graphics card to draw the graphics; it is mostly done by the main CPU, so it does add to the CPU load. However, this cannot be seen on the FlowStone CPU meter, which only shows audio stream processing. But it will show up on the Windows 'task manager' CPU meters (you may see a rise if you move controls very quickly, for example).

payaDSP wrote: Is the number of nested modules a slowing factor?

No. The modules are only a graphical help for the user to organise things - they don't affect the CPU load at all. They may make the FS or VST file a little bit bigger, but only by a very small amount.

payaDSP wrote: Thanks for your very interesting answers.

You're welcome. I started with 8-bit machines (Z80 mostly) myself.

Posted: **Mon Jun 10, 2019 6:45 pm**

Ah, the ZX Spectrum and the CPC 464...
And a little earlier the Apple II; I was very proud of my first program (blinking an LED) :lol:

Posted: **Tue Jun 11, 2019 1:56 am**

trogluddite wrote:
payaDSP wrote: Is the number of nested modules a slowing factor?

No. The modules are only a graphical help for the user to organise things - they don't affect the CPU load at all. They may make the FS or VST file a little bit bigger, but only by a very small amount.

This is actually at least debatable. A few years ago, Exo proved through some tests that the number of modules does have an impact on CPU load. One example was a schematic with deeply nested modules, everything just wonderfully clear and obvious to work with. Then the same schematic in just one module: a mess and hard to work with. However, it was lighter on the CPU.

optimisation - tools (2019)
