Comments (10)
@core-c, does the compiler not already optimize this loop? If you have a way to verify that there is a statistically significant speed up, I could make a pull request.
from arduinofft.
I can check that out, and get back on it.
There are more things that can be optimized though: Like assigning a value inside a loop, from a line of code that always produces the same result.. after some minor calculation.
Such things seem fine enough on a fast processor, maybe, but not so on an AVR, i think. Those MCUs run at low clockspeeds, where every tick counts.
from arduinofft.
So I compiled the code changing this block of code
for (uint16_t i = 0; i < this->_samples; i++) {
this->_vReal[i] /= this->_samples;
this->_vImag[i] /= this->_samples;
}
to
for (uint16_t i = 0; i < this->_samples; i++) {
double reciprocal = 1 / this->_samples;
this->_vReal[i] *= reciprocal;
this->_vImag[i] *= reciprocal;
}
The compiled code for each did change. I ran it through a disassembler, but the resulting output was completely different. It turns out that Arduino compiles with the gcc -Os
flag enabled. This means that Arduino optimizes for space instead of speed. The compiler still might also be optimizing this loop, but there is no way to know, so there might actually be a speed benefit for doing this. It warrants additional testing/research.
from arduinofft.
-Os optimizes for size.
There are other options you can try, like the most common: -O1, -O2, -O3 (increasingly more speed optimization).
In any case, you should calculate that reciprocal ONCE outside the for-loop.
Now it calculates reciprocal this->_samples times, while it keeps resulting the same value (_samples value does not change).
I will see if i can make things faster, in C.
But i am really thinking of writing it in AVR assembler (because i need it for my own projects). But for that, first the algorithm/calculation must be efficient.
For example, i see that function Exponent(). It is not used very often, but still it can be changed to make it run faster:
uint8_t arduinoFFT::Exponent(uint16_t value) {
uint8_t result = 0;
while (((value >> result) & 1) != 1)
result++;
return (result);
}
The AVR has bit-shift instructions that shift only 1 bit (to the left or right).
Say, you enter a value where only bit 10 is set. That while-loop starts manipulating that value, shifting result times in each loop-pass. The function does not store the intermediate value, but performs a new bit-shift on every loop-pass. Eventually, once it gets to the 10th bit, it did perform these shifts:
value >> 0
value >> 1
value >> 2
...all the way to...
value >> 10
If you want to shift a value 10 bits, the AVR will perform 10 shift instructions (because it can only shift a byte one bit at a time). !!note: the AVR can shift bytes. To shift a 16b word, it must shift 2 bytes!!
The way Exponent() is doing it, there will be even more shift instructions being executed (1+2+3+4+5+6+7+8+9+10 in total in the while-loop). That is too much and the function can easily be improved.
I guess this code will run faster:
uint8_t arduinoFFT::Exponent(uint16_t value) {
uint16_t v = value;
uint8_t result = 1;
while ((v & 1) == 0) {
result++;
v >>= 1;
}
return result;
}
!!note: Compilers are smart nowadays. With proper settings, it will not just shift 10 times, but it does shift 2 bits on the MSB of a word instead.. but you get the idea of my example.
But yeah, this is just an example of how an algorithm can be optimized.
Perhaps in C-code that little function seems longer (a few lines of code), but it does run faster.
I was also thinking of using fixed-point numbers. That would really improve performance.
The AVR has no floating-point instructions. If you use fixed-point numbers it'll make it a lot easier for the AVR to calculate with such numbers.
from arduinofft.
The develop branch has an FFT_SPEED_OVER_PRECISION
macro that you can define that does what you're proposing.
from arduinofft.
o cool. To be honest, i had only checked out this branch (a bit). I will have a look. Thanks..
While i was tracking down the origin of this "abandoned" project, i did find some code that was already highly optimized (written in assembler). I think it is from the creator..
I have no need, for myself, to optimize any C-code anymore. Someone already did it..
There is a github source that shows several FFT implementations for Arduino.
I also found more code that inspired the original maker of this project. There you find an FHT implementation (input & output are real values.. easy to use for Arduino users, and sufficient for many/most Arduino projects).
I am unsure to post the URLs here, because i do not want to discredit this project. email me, if you want to know
Something i want to mention:
Performance-wise: Have you ever tried to use only (signed) integer values, instead of doubles/floats?
For an AVR-Arduino i think this library is used most often in combination with sampled real values from the ADC (10-bit values most probably), all integers.
from arduinofft.
@core-c I am intrigued about checking out the URLs you mention
from arduinofft.
http://wiki.openmusiclabs.com/wiki/ArduinoFHT
https://github.com/imagest108/arduino/blob/master/libraries/PlainFFT/PlainFFT.cpp
https://github.com/imagest108/arduino/tree/master/libraries
https://github.com/Evg33/ArduinoFHT
https://github.com/kosme/arduinoFFT
..
from arduinofft.
sorry for a very late reply.. i was busy with RL. kosme
from arduinofft.
Both mentioned optimizations are implemented on version 1.6.1
from arduinofft.
Related Issues (20)
- Needs function to calculate octaves
- Implement C++ template HOT 5
- (improvement) use sqrtf() on ESP32 HOT 2
- Windowing / Weighing Factor HOT 1
- too much time cost for compute FFT on Esp32-C3 HOT 6
- Weird results while imag input other than 0 at example FFT_1 HOT 2
- I thought I was nearly there but can't quite get it over the line. HOT 5
- Updating Documentation and Examples HOT 1
- Needed FFT real time example HOT 1
- average FFT result HOT 1
- calc complex amplitude for only specific frequencies HOT 1
- What is the relation of the calculated magnitude to the original signal
- Windowing compensation factors HOT 1
- First two bins 0 Hz and 62.5 Hz always an order of magnitude larger than all other bins HOT 5
- v2.0 does not extract major peak, while v1.6.2 does HOT 24
- Inverse function does not work correctly HOT 1
- FFT Uncertainty HOT 1
- Issue with library dictionary HOT 1
- Cannot detect frequency from sound sensor in v2 HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from arduinofft.