Arduino/ESP32 loop execution time "stuck" at 80ms - optimization

I'm trying to build a quadcopter flight controller from scratch using an ESP32, BNO055 IMU, and Turnigy TGY-i6 Transmitter/Receiver. So far I've been able to get everything to work together and give me reasonable data using PID control loops, filters, and various libraries. Previously I was using an Arduino Nano in place of the ESP32 because it was what I had at the time and I thought it worked well with it.
After doing some (very rough) flight testing I ended up coming to the conclusion that the code was not executing fast enough with the Nano to sustainably keep the quadcopter in the air. I decided to measure the loop time by setting a t1 and t2 at the beginning and end of the loop, and took the difference to find the time it took to execute one loop of the code. With the Nano, it took roughly 80000 microseconds exactly with little deviation.
Seeing this, I bought the ESP32 knowing that it had a much faster CPU clock speed and overall performance compared against the Nano. After getting the code to work on the ESP32 with a few changes, I ran a speed test again, and got the same 80ms. I was a bit confused at first but I decided to try and isolate the problem by taking out chunks of the code to see if the loop time would change - and it did not. After reading about some specific inefficencies of Arduino IDE [such as digitalWrite()] I kept trying to take away specific parts of the code, and no difference arose as I kept measuring the clock speed. It kept at roughly 80000 microseconds no matter the changes I made.
This leads me to believe that there is something quite important that I'm unaware about in my code that is causing it to run so slow. Ideally, I would like to be able to lower the loop time to at most 10ms so that the quadcopter can autolevel with little to no oscillation.
I've used Arduino for a few years now, but by no means am an expert - so any help on optimizing the code and/or solving this odd problem would be very much appreciated, Thank you.
Note: I have a suspicion that it may be something related to the IMU (BNO055) because that is one part of the hardware and software I know very little about.
These are the libraries I'm using:
#include "ESC.h" //ESC
#include <ESP32Servo.h> //ESC
#include <Wire.h>
#include <Adafruit_Sensor.h> //IMU
#include <Adafruit_BNO055.h> //IMU
#include <utility/imumaths.h> //IMU
#include <PID_v1.h> //PID
Here is the rest of the code.

After reviewing the code and segmenting it further, I discovered the problem was coming from the pulseIn() functions I use to read the PPM signal from the receiver.
The function works by:
Reads a pulse (either HIGH or LOW) on a pin. For example, if value is HIGH, pulseIn() waits for the pin to go from LOW to HIGH, starts timing, then waits for the pin to go LOW and stops timing. Returns the length of the pulse in microseconds or gives up and returns 0 if no complete pulse was received within the timeout. Link
Looking back, this clearly causes a delay in the code.
Thank you to dimich for putting me on the right path to solve the problem.
Edit: How you can solve this:
Rather than use pulseIn(), you can use interrupts to solve the problem. I did this by attaching an interrupt to the pin that is sending the PPM signal and listening for a CHANGE to happen on it. You then check whether the pin is HIGH or LOW and either begin or stop your timer. This avoids using delays in the program and allows for a clean signal to be read.
Note: Some receivers use a PPMsum on one pin rather than individual pins. On my receiver, I use 4 different channels and 4 different GPIO pins for each axis + throttle.

Related

Can't erase the first 256KB of Nor Flash(M29W256GL) using BDI2000

I'm porting u-boot-2013.10 to MPC8306 based board. Previously, I can erase the first several sectors of Nor Flash using BDI2000. But after sometime, when the porting task is nearly done, (I mean that I can use gdb to trace the code execution and find the u-boot code runs into command line mainloop, though there are no serial output at the time, due to error configure of Serial Port)the first 256KB of Nor Flash can't be erased even if after power off reset. Other sectors can be erased normally.
The Nor Flash is Micron M29W256GL, with block size 128KB. I'm sure the WP# Pin has been pulled high, so there is no hardware protection upon the first block.
When config the jumper on board to change the PowerPC Config Word in order not to let MPC8306 fetch boot code at power up, the problem remains.
I used to run u-boot-1.1.6 on this board, I have erased this version of u-boot so many times without the problem mentioned above. I guess u-boot-2013.10 made some new approach to flash manipulation or others, for example, non-volatile protection on first 256KB of flash.
Is there someone can help me to solve the problem? I would very much appreciate your help.

Plot a graph of Time vs RSSI for a 433Mhz RF ASK Receiver

Hi Im using the following RF module
http://www.apogeekits.com/rf_receiver_module_rx433.htm
on an embedded board with the PIC16F628A. Sadly, I realized that the signal strength was in analog form and couldn't get any ideas to get the RSSI reading off the pin because well my PIC is digital DUH!.
My basic idea was
To get the RSSI value from my Receiver
Send it to the PIC
Link the PIC to a PC via RS232
Plot a graph of time vs RSSI of the receiver (so I can make out how close my TX is to my RX)
I thought it was bloody brilliant at first but ive hit a dead end here. Any ideas on getting the RSSI data to my PC from this receiver would be nice.
Thanks in Advance
You can get a PIC that has an integrated ADC for sampling the analog signal. Or, you can use an external ADC chip to do the conversion. You would connect that to your PIC using SPI or I2C.
The simplest thing to do is obviously to use a more appropriate microcontroller - one with an ADC! There are many (most), including PICs (though that wouldn't be my first choice).
Attaching an external SPI or I2C ADC might be a bit tedious since having no SPI or I2C on your part, you'd have to bit-bash it. If you do that, use an SPI part - its simpler. Your sample rate will suffer and may end-up being a bit jittery if you are not careful.
Another solution is to use a voltage controlled PWM, then use the timer input capture to time the pulse width. That will give you good regularity and potentially good resolution. You can get a chip (example) to do that, or grow your own. That last option requires a triangle wave input as well as the measured (control) voltage, but on the same site...
In a similar vein, you could use a low frequency VCO (example) and use the output to clock one of the timers, then using a second timer periodically sampling the first and reset it. The count will relate to the voltage, though not necessarily a linear relationship, linearisation could be none on the PIC or at the receiving PC - I'd go for the latter - your micro will suck at arithmetic (performance wise) - even integer arithmetic, especially if it involves division.

What does post-layout simulation take a long time?

I am wondering why post-layout simulations for digital designs take a long time?
Why can't software just figure out a chip's timing and model the behavior with a program that creates delays with sleep() or something? My guess is that sleep() isn't accurate enough to model hardware, but I'm not sure.
So, what is it actually doing that takes so long?
Thanks!
Post layout simulations (in fact - anything post synthesis) will be simulating gates rather than RTL, and there's a lot of gates.
I think you've got your understanding of how a simulator works a little confused. I say that because a call like sleep() is related to waiting for time as measured by the clock on the wall, not the simulator time counter. Simulator time advances however quickly the simulator runs.
A simulator is a loop that evaluates the system state. Each iteration of the loop is a 'time slice' e.g. what the state of the system is at time 100ns. It only advances from one time slice to the next when all the signals in it have reached a steady state.
In an RTL or untimed gate simulation, most evaluation of signals happens in 'zero time', which is to say that the effect of evaluating an assignment happens in the same time slice. The one exception tends to be the clock, which is defined to change at a certain time and it causes registers to fire, which causes them to change their output, which causes processes, modules, assignments which have inputs from registers to re-evaluate, which causes other signals to change, which causes other processes to re-evaluate, etc, etc, etc.... until everything has settled down, and we can move to the next clock edge.
In a post layout simulation, with back-annotated timing, every gate in the system has a time from input to output associated with it. This means nothing happens in 'zero time' any more. The simulator now has put the effect of every assignment on a list saying 'signal b will change to 1 at time 102.35ns'. Every gate has different timing. Every input on every gate will have different timing to the output. This means that a back-annotated simulation has to evaluate lots and lots of time slices as signals are changing state at lot's of different times. Not just when the clock changes. There probably isn't much happening in each slice, but there's lots of them.
...and I've only talked about adding gate timing. Add wire timing and things get even more complex.
Basically there's a whole lot more to worry about, and so the sims get slower.

Simple robust error correction for transmission of ascii over serial (RS485)

I have a very low speed data connection over serial (RS485):
9600 baud
actual data transmission rate is about 25% of that.
The serial line is going through an area of extremely high EMR. Peak fluctuations can reach 3000 KV.
I am not in the position (yet) to force a change in the physical medium, but could easily offer to put in a simple robust forward error correction scheme. The scheme needs to be easy to implement on a PIC18 series micro.
Ideas?
This site claims to implement Reed-Solomon on the PIC18. I've never used it myself, but perhaps it could be a helpful reference?
Search for CRC algorithm used in MODBUS ASCII protocol.
I develop with PIC18 devices and currently use the MCC18 and PICC18 compilers. I noticed a few weeks ago that the peripheral headers for PICC18 incorrectly map the Busy2USART() macro to the TRMT bit instead of the TRMT2 bit. This caused me major headaches for short time before I discovered the problem. Example, a simple transmission:
putc2USART(*p_value++);
while Busy2USART();
putc2USART(*p_value);
When the Busy2USART() macro was incorrectly mapped to the TRMT bit, I was never waiting for bytes to leave the shift register because I was monitoring the wrong bit. Before I realized the inaccurate header file, the only way I was able to successfully transmit a byte over 485 was to wait 1 ms between bytes. My baud rate was 91912 and the delays between bytes killed my throughput.
I also suggest implementing a means of collision detection and checksums. Checksums are cheap, even on a PIC18. If you are able to listen to your own transmissions, do so, it will allow you to be aware of collisions that may result from duplicate addresses on the same loop and incorrect timings.

Testing Real Time Operating System for Hardness

I have an embedded device (Technologic TS-7800) that advertises real-time capabilities, but says nothing about 'hard' or 'soft'. While I wait for a response from the manufacturer, I figured it wouldn't hurt to test the system myself.
What are some established procedures to determine the 'hardness' of a particular device with respect to real time/deterministic behavior (latency and jitter)?
Being at college, I have access to some pretty neat hardware (good oscilloscopes and signal generators), so I don't think I'll run into any issues in terms of testing equipment, just expertise.
With that kind of equipment, it ought to be fairly easy to sync the o-scope to a steady clock, produce a spike each time the real-time system produces an output, an see how much that spike varies from center. The less the variation, the greater the hardness.
To clarify Bob's answer maybe:
Use the signal generator to generate a pulse at some varying frequency.
Random distribution across some range would be best.
use the signal generator (trigger signal) to start the scope.
the RTOS has to respond, do it thing and send an output pulse.
feed the RTOS output into input 2 of the scope.
get the scope to persist/collect mode.
get the scope to start on A , stop on B. if you can.
in an ideal workd, get it to measure the distribution for you. A LeCroy would.
Start with a much slower trace than you would expect. You need to be able to see slow outliers.
You'll be able to see the distribution.
Assuming a normal distribution the SD of the response time variation is the SOFTNESS.
(This won't really happen in practice, but if you don't get outliers it is reasonably useful. )
If there are outliers of large latency, then the RTOS is NOT very hard. Does not meet deadlines well. Unsuitable then it is for hard real time work.
Many RTOS-like things have a good left edge to the curve, sloping down like a 1/f curve.
Thats indicitive of combined jitters. The thing to look out for is spikes of slow response on the right end of the scope. Keep repeating the experiment with faster traces if there are no outliers to get a good image of the slope. Should be good for some speculative conclusion in your paper.
If for your application, say a delta of 1uS is okay, and you measure 0.5us, it's all cool.
Anyway, you can publish the results ( and probably in the publish sense, but certainly on the web.)
Link from this Question to the paper when you've written it.
Hard real-time has more to do with how your software works than the hardware on its own. When asking if something is hard real-time it must be applied to the complete system (Hardware, RTOS and application). This means hard or soft real-time is system design issues.
Under loading exceeding the specification even a hard real-time system will fail (hopefully with proper failure indication) while a soft real-time system with low loading would give hard real-time results. How much processing must happen in time and how much pre/post processing can be performed is the real key to hard/soft real-time.
In some real-time applications some data loss is not a failure it should just be below a certain level, again a system criteria.
You can generate inputs to the board and have a small application count them and check at what level data is going to be lost. But that gives you a rating specific to that system running that application. As soon as you start doing more processing your computational load increases and you now have a different hard real-time limit.
This board will running a bare bones scheduler will give great predictable hard real-time performance for most tasks.
Running a full RTOS with heavy computational load you probably only get soft real-time.
Edit after comment
The most efficient and easiest way I have used to measure my software's performance (assuming you use a schedular) is by using a free running hardware timer on the board and to time stamp my start and end of my cycle. Or if you run a full RTOS time stamp you acquisition and transition. Save your Max time and run a average on the values over a second. If your average is around 50% and you max is within 20% of your average you are OK. If not it is time to refactor your application. As your application grows the cycle time will grow. You can monitor the effect of all your software changes on your cycle time.
Another way is to use a hardware timer generate a cyclical interrupt. If you are in time reset the interrupt. If you miss the deadline you have interrupt handler signal a failure. This however will only give you a warning once your application is taking to long but it rely on hardware and interrupts so you can't miss.
These solutions also eliminate the requirement to hook up a scope to monitor the output since the time information can be displayed in any kind of terminal by a background task. If it is easy to monitor you will monitor it regularly avoiding solving the timing problems at the end but as soon as they are introduced.
Hope this helps
I have the same board here at work. It's a slightly-modified 2.6 Kernel, I believe... not the real-time version.
I don't know that I've read anything in the docs yet that indicates that it is meant for strict RTOS work.
I think that this is not a hard real-time device, since it runs no RTOS.
I understand being geek, but using oscilloscope to test a computer with ethernet/usb/other digital ports and HUGE internal state (RAM) is both ineffective and unreliable.
Instead of watching wave forms, you can connect any PC to the output port and run proper statistical analysis.
The established procedure (if the input signal is analog by nature) is to test system against several characteristic inputs - traditionally spikes, step functions and sine waves of different frequencies - and measure phase shift and variance for each input type. Worst case is then used in specifications of the system.
Again, if you are using standard ports, you can easily generate those on PC. If the input is truly analog, a separate DAC or simply a good sound card would be needed.
Now, that won't say anything about OS being real-time - it could be running vanilla Linux or even Win CE and still produce good and stable results in those tests if hardware is fast enough.
So, you need to simulate heavy and varying loads on processor, memory and all ports, let it heat and eat memory for a few hours, and then repeat tests. If latency stays constant, it's hard real-time. If it doesn't, under any load and input signal type, increase above acceptable limit, it's soft. Otherwise, it's advertisement.
P.S.: Implication is that even for critical systems you don't actually need hard real-time if you have hardware.