FreeRTOS and the necessity of the UART transmit interrupt

For uart reception, it's pretty obvious to me what can go wrong in case of 'blocking receive' over uart. Even in freertos with a dedicated task to read from uart, context / task switching could result in missing bytes that were received in the uart peripheral.
But for transmission I am not really sure whether an interrupt-based approach is needed. I transmit from a task, and in my design it's no problem if that task is blocked for a short while (it also blocks/sleeps on mutexes, for example).
Is there another strong argument for using UART transmit in interrupt mode? I am not risking any loss of data, right?
In my case I use an stm32, but I guess the type of mcu is not really relevant here.

Let's focus on TX only and assume that we don't use interrupts and handle all the transmission with the tools provided by the RTOS.
µC UART hardware generally has a transmit shift register (TSR) and some kind of data register (DR). The software loads the DR, and if the TSR is empty, DR is instantly transferred into TSR and TX begins. The software is then free to load another byte into DR, and the hardware moves the new byte from DR to TSR whenever the TX (shift-out) of the previous byte finishes. The hardware provides status bits for querying the state of DR & TSR. This way, the software can use a polling method and still achieve continuous transmission with no gaps between the bytes.
I'm not sure if the hardware configuration I described above holds for every µC. I have experience with 8 & 16-bit PICs and STM32 F0, F1, F4 series. They are all similar. UART hardware doesn't provide additional hardware buffers.
Now, back to RTOS... Obviously, your TX task needs to be polling UART status bits. If we assume that the UART baud rate is 115200 (which is a common value), you waste ~87 µs of polling for each byte (10 bit times with 8-N-1 framing). The general rule of an RTOS is that if you are waiting for something to happen, your task needs to be blocked so other tasks can run. But block on what? What will tell you when to unblock? For this you need interrupts. Your task blocks on a task notification (ulTaskNotifyTake()), and the interrupt gives the notification using vTaskNotifyGiveFromISR(), the ISR-safe counterpart of xTaskNotifyGive().
So, I can't imagine any other way without using interrupts. But, the method mentioned above isn't good either. It makes no sense to block - unblock with each byte.
There are 2 possible solutions:
Move TX handling completely to interrupt handler (ISR), and notify the task when TX is completed.
Use DMA instead! Almost all modern 32-bit µCs have DMA support. DMA generates a single interrupt when the TX is completed. You can notify the task from the DMA transfer complete interrupt.
In this answer I've focused on TX, but using DMA is the proper way of handling reception (RX) too.
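To make the first option (TX handled entirely in the ISR) more concrete, here is a minimal host-side sketch. The names fake_uart_t, tx_job_t, uart_tx_start and uart_txe_isr are hypothetical and stand in for real registers and handlers; the done flag is where a FreeRTOS port would presumably call vTaskNotifyGiveFromISR() to wake the task blocked in ulTaskNotifyTake().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the UART: a data register and the
 * "data register empty" interrupt enable bit. */
typedef struct {
    uint8_t dr;         /* last byte written to the data register */
    bool    txe_irq_en; /* true while the "DR empty" interrupt is enabled */
} fake_uart_t;

/* State shared between the sending task and the ISR. */
typedef struct {
    const uint8_t *buf;
    size_t         len;
    size_t         pos;
    volatile bool  done; /* assumption: replaced by a task notification on target */
} tx_job_t;

/* Task side: describe the transfer, then let the ISR do the rest. */
void uart_tx_start(fake_uart_t *u, tx_job_t *job, const uint8_t *buf, size_t len)
{
    job->buf  = buf;
    job->len  = len;
    job->pos  = 0;
    job->done = false;
    u->txe_irq_en = true; /* DR is empty, so the ISR fires right away */
}

/* ISR side: runs once per "DR empty" interrupt. */
void uart_txe_isr(fake_uart_t *u, tx_job_t *job)
{
    if (job->pos < job->len) {
        u->dr = job->buf[job->pos++]; /* feed the next byte */
    } else {
        u->txe_irq_en = false;        /* nothing left: stop the interrupt */
        job->done = true;             /* the last byte may still be shifting out */
    }
}
```

The task is woken once per message instead of once per byte, which is the point of moving the per-byte work into the ISR. With DMA the structure is the same, except that the hardware replaces uart_txe_isr() and only the completion step remains.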

Related

Why are FIFO One-quarter full, Half-full, three-quarter full interrupts provided in a UART RX FIFO? What are their use cases?

I am implementing a protocol decoder which receives bytes through UART of a microcontroller. The ISR takes bytes from the UART peripheral and puts it in a ring buffer. The main loop reads from the ring buffer and runs a state machine to decode it.
The UART internally has a 32-byte receive FIFO, and provides interrupts when this FIFO is quarter-full, half-full, three-quarter full and completely full. How should I determine which of these interrupts should trigger my ISR? What is the tradeoff involved?
Note - The protocol involves packets of 32 bytes (fixed length), sent every 10 ms.
This depends on a lot of things, most of all the maximum baudrate supported, and how much time your application needs for executing other tasks.
Traditional ring buffers work on byte-per-byte interrupt basis. But it is of course always nice to reduce the number of interrupts. It probably doesn't matter much how often you let it trigger.
It is much more important to implement a double-buffer scheme. You should of course not start to run a state machine decoding straight from a single ring buffer. That will turn into a race condition nightmare.
Your main program should take the semaphore or disable the UART interrupt, then copy the whole buffer, then re-enable the interrupt. Ideally the buffer copy is done by swapping a pointer rather than doing a hard copy. The code doing this needs to be benchmarked to complete in less than 10/baudrate seconds (one character time), where 10 is the number of bits per character: 1 start, 8 data, 1 stop, assuming the UART is 8-N-1.
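A rough sketch of the pointer-swap idea and of the time budget, under the assumption of 32-byte packets and 8-N-1 framing. The type double_buf_t and the functions db_init, db_put, db_swap and uart_char_time_ns are made-up names for illustration; on a real target db_swap() would be called with the UART interrupt disabled.

```c
#include <stddef.h>
#include <stdint.h>

#define PKT_LEN 32u

typedef struct {
    uint8_t         storage[2][PKT_LEN];
    uint8_t        *volatile fill; /* buffer the ISR is currently writing */
    uint8_t        *ready;         /* buffer the main loop may decode */
    volatile size_t count;         /* bytes stored in the fill buffer */
} double_buf_t;

void db_init(double_buf_t *db)
{
    db->fill  = db->storage[0];
    db->ready = db->storage[1];
    db->count = 0;
}

/* ISR side: store one received byte. Returns -1 if the buffer is full. */
int db_put(double_buf_t *db, uint8_t byte)
{
    size_t n = db->count;
    if (n >= PKT_LEN) {
        return -1;
    }
    db->fill[n] = byte;
    db->count = n + 1;
    return 0;
}

/* Main-loop side: swap the two buffers instead of copying.
 * Call with the UART interrupt disabled. Returns the number of
 * valid bytes now available in db->ready. */
size_t db_swap(double_buf_t *db)
{
    uint8_t *tmp = db->fill;
    size_t   n   = db->count;
    db->fill  = db->ready;
    db->ready = tmp;
    db->count = 0;
    return n;
}

/* One character time in nanoseconds for 8-N-1 (10 bits per character).
 * This is the budget for the interrupts-disabled section above. */
uint64_t uart_char_time_ns(uint32_t baud)
{
    if (baud == 0) {
        return 0; /* invalid input, nothing sensible to report */
    }
    return (10ull * 1000000000ull) / baud;
}
```

At 115200 baud the budget comes out at roughly 86.8 µs, which a three-assignment swap should meet comfortably; a byte-by-byte copy of a large buffer might not.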
If available, use DMA over software ring buffers.
Given a packet-based protocol and a UART that interrupts only after more than one byte has been received, consider what should happen if the final byte of a packet is received but that final byte isn't enough to fill the FIFO past the threshold and trigger an interrupt. Is your application simply not going to receive that incomplete packet until some subsequent packet is received and the FIFO finally fills enough? What if the other end is waiting for a response and never sends another packet? Or is your application supposed to poll the UART to check for lingering bytes remaining in the UART FIFO? It seems overly complicated to both use an interrupt and poll for received bytes.
With the packet-based protocols I have implemented, the UART driver does not rely on the UART FIFO and configures the UART to interrupt when a single byte is available. This way the driver gets notified for every byte and there is no chance for the final byte of a packet to be left lingering in the UART's FIFO.
The UART's FIFO can be convenient for streaming protocols (such as audio or video data). When the driver is receiving a stream of data then there will always be incoming data to keep filling the FIFO. The driver can rely on the UART's FIFO to buffer some data. The driver can be more efficient by processing multiple bytes per interrupt and reducing the interrupt rate.
You might consider using the UART FIFO since your packets are a fixed length. But consider how the driver would recover if a single byte is dropped due to noise or whatever. I think it's still best to not rely on the FIFO for packet-based protocols regardless of whether the packets are fixed length.
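The lingering-byte problem can be put into numbers with a very simplified model: assume the FIFO starts empty, the interrupt fires each time the fill level reaches the threshold, and the ISR drains the FIFO completely. The function name fifo_stranded_bytes is hypothetical.

```c
#include <stdint.h>

/* Bytes still sitting in the RX FIFO after the last threshold interrupt,
 * under the simplified model described above. */
uint32_t fifo_stranded_bytes(uint32_t bytes_received, uint32_t threshold)
{
    if (threshold == 0) {
        return bytes_received; /* no threshold interrupt: everything lingers */
    }
    return bytes_received % threshold;
}
```

For the 32-byte packets in the question, thresholds of 8, 16 or 32 bytes leave nothing behind, while the three-quarter threshold (24 bytes) leaves 8 bytes waiting for the next packet. If a single byte is lost and only 31 arrive, even the 8-byte threshold leaves 7 bytes stranded, which is the recovery concern raised above.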

DSPIC33F UART DMA Example not working

I am trying to use DMA for my UART Rx and Tx. Till now I had the freeRTOS version of the serial demo working fine. It still works fine. However, now I have incorporated the UART DMA example, from the example projects.
The code is conditionally compiled, so that only when the switch _HAS_DMA == 1 is the DMA engine configured, the RAM buffers set up, and the default UART ISRs required by the FreeRTOS demo removed.
At this point, whenever I send a serial byte stream, the running project simply gets reset.
I am using MPLAB IDE 8.92, XC16 v1.20, Explorer-16 platform, dspic33fj256gp710 part.
The DMA code included does not use any FreeRTOS API calls.
I have set up the project so that stack overflow is detected using the FreeRTOS configuration option, but the code does not reach the stack overflow hook function. I have also included the U2ErrInterrupt ISR to see if incoming bytes are coming in fine; however, even that interrupt is not reached.
Has anyone faced this before?
Interestingly, the UART DMA Loopback example from the Microchip website, which uses the MPLAB C30 compiler, works fine on my board.
Any pointers on this one? I could not locate any code examples in the FreeRTOS forum on how to use the DMA for UART, but it is suggested to use this method in production code for efficiency.
Need help here.
Thanks and best regards,
Vishal
OK. I found the culprit. It's me. :))
When setting up the DMA to service UART receive requests, one should not also enable the UART interrupt separately in software, which is what I was doing. In addition, I had conditionally compiled out the UART ISRs!!! So in effect, whenever a byte was received by the UART, the processor got confused as to who would serve this interrupt: the DMA or the application code. I think the PC would point to the designated UART RX ISR vector location, where the processor would not find anything, and this caused the reset. Or maybe there was a race between the DMA and the processor to serve this interrupt, which caused the reset.
Now that I have set up the UART so that interrupts are not enabled separately by the application when the DMA is going to serve the UART RX, my code is working fine. I have yet to integrate the whole thing with FreeRTOS deferred interrupt processing using binary semaphores, but I hope I will not see any trouble there.
There is not much documented about this though...neither in Microchip manuals nor in the FreeRTOS examples.
Also, I found that when using DMA with UART, as per the manual, the DMA receives WORDS from the UART RX engine, with lower byte having the data, and upper byte having the status. If the DMA is also used for UART Tx, and is set to transfer WORDS to UART TXREG, the two intelligently manage to send only the lower data byte out. So the receiving party still gets expected bytes. This is also not documented well.
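As a small illustration of the word layout described above (data in the lower byte, status in the upper byte), a DMA receive buffer could be unpacked along these lines. The names uart_rx_sample_t and uart_rx_split are made up, and the meaning of the individual status bits is deliberately left open since it is not covered here.

```c
#include <stdint.h>

/* One word as the DMA stores it from the UART receive register. */
typedef struct {
    uint8_t data;   /* lower byte: the received character */
    uint8_t status; /* upper byte: status bits, check the device manual */
} uart_rx_sample_t;

static inline uart_rx_sample_t uart_rx_split(uint16_t word)
{
    uart_rx_sample_t s;
    s.data   = (uint8_t)(word & 0x00FFu);
    s.status = (uint8_t)(word >> 8);
    return s;
}
```

A consumer that only cares about the payload can loop over the DMA buffer and keep s.data, while still having s.status available to detect receive errors.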
I will try to post my code here for future generations though :)).

How to make sure interrupt is delayed until "sleep for interrupt" is issued?

Imagine the following scenario. A microcontroller is a slave on a bus, say SPI. After writing a byte to the bus, it wants to sleep until the byte is transferred. The code would look something like this:
write_byte_to_bus(byte);
wait_for_interrupt(); /* a single assembly instruction */
Now, since the microcontroller is not the master, theoretically (and likely if the bus is really fast) the byte could be transferred immediately when the master requests it, and therefore something like this happens:
write_byte_to_bus(byte);
interrupt arrives saying
that the operation is done
wait_for_interrupt();
which results in the microcontroller sleeping for an interrupt after the interrupt has already arrived. How can one protect oneself from such cases?
For your specific microcontroller, you can disable interrupts first with a SIM, then execute your write_byte_to_bus and then WFI will re-enable interrupts appropriately.
Other microcontrollers may have different ways of managing this.

How CPU finds ISR and distinguishes between devices

I should first share all that I know - and that is complete chaos. There are several different questions on the topic, so please don't get irritated :).
1) To find an ISR, the CPU is provided with an interrupt number. In x86 machines (286/386 and above) there is an IVT with ISRs in it, each entry 4 bytes in size, so we need to multiply the interrupt number by 4 to find the ISR. So the first bunch of questions: I am completely confused about the mechanism by which the CPU receives the interrupt. To raise an interrupt, firstly the device shall probe for an IRQ - then what? Does the interrupt number travel "on the IRQ" towards the CPU? I also read something about the device putting the ISR address on the data bus; what's that then? What is the concept of devices overriding the ISR? Can somebody give me a few example devices where the CPU polls for interrupts? And where does it find the ISR for them?
2) If two devices share an IRQ (which is very much possible), how does the CPU distinguish between them? What if both devices raise an interrupt of the same priority simultaneously? I learned that interrupts of the same type and of lower priority will be masked, but how does this communication happen between the CPU and the device controller? I studied the role of the PIC and APIC for this problem, but could not understand it.
Thanks for reading.
Thank you very much for answering.
CPUs don't poll for interrupts, at least not in a software sense. With respect to software, interrupts are asynchronous events.
What happens is that hardware within the CPU recognizes the interrupt request, which is an electrical input on an interrupt line, and in response, sets aside the normal execution of events to respond to the interrupt. In most modern CPUs, what happens next is determined by a hardware handshake particular to the type of CPU, but most of them receive a number of some kind from the interrupting device. That number can be 8 bits or 32 or whatever, depending on the design of the CPU. The CPU then uses this interrupt number to index into the interrupt vector table, to find an address to begin execution of the interrupt service routine. Once that address is determined, (and the current execution context is safely saved to the stack) the CPU begins executing the ISR.
When two devices share an interrupt request line, they can cause different ISRs to run by returning a different interrupt number during that handshaking process. If you have enough vector numbers available, each interrupting device can use its own interrupt vector.
But two devices can even share an interrupt request line and an interrupt vector, provided that the shared ISR is clever enough to go back to all the possible sources of the given interrupt, and check status registers to see which device requested service.
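A shared ISR of the kind just described might look roughly like the sketch below. The device_t layout, the DEV_IRQ_PENDING bit and the function names are invented for illustration; real devices each have their own status registers and their own way of dropping the request.

```c
#include <stddef.h>
#include <stdint.h>

#define DEV_IRQ_PENDING 0x1u /* hypothetical "interrupt pending" status bit */

typedef struct {
    volatile uint32_t status;   /* stand-in for the device status register */
    unsigned          serviced; /* how many times this device was handled */
} device_t;

/* Device-specific handling; here it only acknowledges the request. */
static void device_service(device_t *dev)
{
    dev->status &= ~DEV_IRQ_PENDING;
    dev->serviced++;
}

/* Shared ISR: poll every device on this vector and service those that
 * actually requested attention. Returns the number of devices handled. */
size_t shared_isr(device_t *devs, size_t count)
{
    size_t handled = 0;
    for (size_t i = 0; i < count; i++) {
        if (devs[i].status & DEV_IRQ_PENDING) {
            device_service(&devs[i]);
            handled++;
        }
    }
    return handled;
}
```

Walking the whole list on every interrupt also covers the case where two devices assert the shared line at the same moment: both are found pending and both get serviced in one pass.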
A little more detail
Suppose you have a system composed of a CPU, an interrupt controller, and an interrupting device. In the old days, these would have been separate physical devices, but now all three might even reside in the same chip; all the signals are still there inside the ceramic case. I'm going to use a PowerPC (PPC) CPU with an integrated interrupt controller, connected to a device on a PCI bus, as an example that should serve nicely.
Let's say the device is a serial port that's transmitting some data. A typical serial port driver will load a bunch of data into the device's FIFO, and the CPU can do regular work while the device does its thing. Typically these devices can be configured to generate an interrupt request when the device is running low on data to transmit, so that the device driver can come back and feed more into it.
The hardware logic in the device will expect a PCI bus interrupt acknowledge, at which point, a couple of things can happen. Some devices use 'autovectoring', which means that they rely on the interrupt controller to see to it that the correct service routine gets selected. Others will have a register, which the device driver will pre-program, that contains an interrupt vector that the device will place on the data bus in response to the interrupt acknowledge, for the interrupt controller to pick up.
A PCI bus has only four interrupt request lines, so our serial device will have to assert one of those. (It doesn't matter which at the moment; it's usually somewhat slot dependent.) Next in line is the interrupt controller (e.g. PIC/APIC), which will decide whether to acknowledge the interrupt based on mask bits that have been set in its own registers. Assuming it acknowledges the interrupt, it then either obtains the vector from the interrupting device (via the data bus lines), or may, if so programmed, use a 'canned' value provided by the APIC's own device driver. So far, the CPU has been blissfully unaware of all these goings-on, but that's about to change.
Now it's time for the interrupt controller to get the attention of the CPU core. The CPU will have its own interrupt mask bit(s) that may cause it to just ignore the request from the PIC. Assuming that the CPU is ready to take interrupts, it's now time for the real action to start. The current instruction usually has to be retired before the ISR can begin, so with pipelined processors this is a little complicated, but suffice it to say that at some point in the instruction stream, the processor context is saved off to the stack and the hardware-determined ISR takes over.
Some CPU cores have multiple request lines, and can start the process of narrowing down which ISR runs via hardware logic that jumps the CPU instruction pointer to one of a handful of top level handlers. The old 68K, and possibly others, did it that way. The PowerPC (and I believe, the x86) have a single interrupt request input. The x86 itself behaves a bit like a PIC, and can obtain a vector from the external PIC(s), but the PowerPC just jumps to a fixed address, 0x00000500.
In the PPC, the code at 0x0500 is probably just going to immediately jump out to somewhere in memory where there's room enough for some serious decision making code, but it's still the interrupt service routine. That routine will first go to the PIC and obtain the vector, and also ask the PIC to stop asserting the interrupt request into the CPU core. Once the vector is known, the top level ISR can case out to a more specific handler that will service all the devices known to be using that vector. The vector specific handler then walks down the list of devices assigned to that vector, checking interrupt status bits in those devices, to see which ones need service.
When a device, like the hypothetical serial port, is found wanting service, the ISR for that device takes appropriate actions, for example, loading the next FIFO's worth of data out of an operating system buffer into the port's transmit FIFO. Some devices will automatically drop their interrupt request in response to being accessed, for example, writing a byte into the transmit FIFO might cause the serial port device to de-assert the request line. Other devices will require a special control register bit to be toggled, set, cleared, what-have-you, in order to drop the request. There are zillions of different I/O devices and no two of them ever seem to do it the same way, so it's hard to generalize, but that's usually the way of it.
Now, obviously there's more to say - what about interrupt priorities? what happens in a multi-core processor? What about nested interrupt controllers? But I've burned enough space on the server. Hope any of this helps.
I came across this question about 3 years later... hope I can help ;)
The Intel 8259A, or simply the "PIC", has 8 interrupt request inputs, IRQ0-IRQ7, and each one connects to a single device.
Let's suppose that you pressed a key on the keyboard. The voltage on the IRQ1 pin, which is connected to the keyboard, goes high. After the CPU gets interrupted and acknowledges the interrupt, the PIC simply adds 8 to the number of the IRQ line (its vector base being programmed to 8 here), so IRQ1 means 1+8, which is 9.
So the CPU loads its CS and IP from the 9th entry in the vector table, and because the IVT is an array of 4-byte entries, it just multiplies the entry number by 4 to get the entry's address ;)
CPU.CS=IVT[9].CS
CPU.IP=IVT[9].IP
The ISR then deals with the device through the I/O ports ;)
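The arithmetic above can be written out as a short sketch for x86 real mode. The vector base of 8 is taken from the description above; the function and type names are invented for illustration.

```c
#include <stdint.h>

#define PIC_VECTOR_BASE 8u /* offset the PIC adds to the IRQ number, as described above */

/* One real-mode IVT entry: offset (IP) in the low word, segment (CS) in the high word. */
typedef struct {
    uint16_t ip;
    uint16_t cs;
} ivt_entry_t;

/* Interrupt vector number delivered for a given IRQ line. */
uint8_t irq_to_vector(uint8_t irq)
{
    return (uint8_t)(PIC_VECTOR_BASE + irq);
}

/* Physical address of the IVT entry: 4 bytes per entry, table at address 0. */
uint32_t ivt_entry_address(uint8_t vector)
{
    return (uint32_t)vector * 4u;
}

/* Real-mode linear address the CPU starts executing at: CS * 16 + IP. */
uint32_t isr_linear_address(ivt_entry_t entry)
{
    return ((uint32_t)entry.cs << 4) + entry.ip;
}
```

For the keyboard example, IRQ1 becomes vector 9, whose entry sits at address 36 (0x24) in the table.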
Sorry for my bad english .. am an Arab though :)

Which Cortex-M3 interrupts can I use for general purpose work?

I have some code that needs to be run as the result of a particular interrupt going off.
I don't want to execute it in the context of the interrupt itself but I also don't want it to execute in thread mode.
I would like to run it at a priority that's lower than the high level interrupt that precipitated its running but also at a priority that's higher than thread level (and some other interrupts as well).
I think I need to use one of the other interrupt handlers.
Which ones are the best to use and what's the best way to invoke them?
At the moment I'm planning on just using the interrupt handlers for some peripherals that I'm not using and invoking them by setting bits directly through the NVIC but I was hoping there's a better, more official way.
Thanks,
ARM Cortex supports a very special kind of exception called PendSV. It seems that you could use this exception exactly to do your work. Virtually all preemptive RTOSes for ARM Cortex use PendSV to implement the context switch.
To make it work, you need to give PendSV the lowest priority (write 0xFF to the PRI_14 field, which sits in System Handler Priority Register 3 of the System Control Block rather than in the NVIC proper). You should also prioritize all IRQs above the PendSV (write lower numbers in the respective priority registers in the NVIC). When you are ready to process the whole message, trigger the PendSV from the high-priority ISR:
*((uint32_t volatile *)0xE000ED04) = 0x10000000; // trigger PendSV
The ARM Cortex CPU will then finish your ISR and all other ISRs that possibly were preempted by it, and eventually it will tail-chain to the PendSV exception. This is where your code for parsing the message should be.
Please note that PendSV could be preempted by other ISRs. This is all fine, but you need to obviously remember to protect all shared resources by a critical section of code (briefly disabling and enabling interrupts). In ARM Cortex, you disable interrupts by executing __asm("cpsid i") and you enable interrupts by __asm("cpsie i"). (Most C compilers provide built-in intrinsic functions or macros for this purpose.)
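The two register writes involved can be sketched as small helpers that take the register address as a parameter, so the bit manipulation can be checked on a host as well. The helper names are made up; the addresses in the comments are the architecturally defined ones for ARMv7-M (ICSR at 0xE000ED04, SHPR3 at 0xE000ED20), and on a project using CMSIS the vendor headers would normally be used instead.

```c
#include <stdint.h>

#define ICSR_PENDSVSET   (1u << 28)   /* writing 1 sets PendSV pending */
#define SHPR3_PRI_14_POS 16u          /* PendSV priority byte within SHPR3 */

/* Set PendSV pending. On target: pendsv_trigger((volatile uint32_t *)0xE000ED04u); */
static inline void pendsv_trigger(volatile uint32_t *icsr)
{
    *icsr = ICSR_PENDSVSET;
}

/* Give PendSV the lowest priority (0xFF) without touching the other fields.
 * On target: pendsv_set_lowest_priority((volatile uint32_t *)0xE000ED20u); */
static inline void pendsv_set_lowest_priority(volatile uint32_t *shpr3)
{
    *shpr3 |= (0xFFu << SHPR3_PRI_14_POS);
}
```

Only the implemented priority bits stick on real hardware, so reading SHPR3 back may show fewer than eight bits set; that is expected.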
Are you using an RTOS? Generally this type of thing would be handled by having a high priority thread that gets signaled to do some work by the interrupt.
If you're not using an RTOS, you only have a few tasks, and the work being kicked off by the interrupt isn't too resource intensive, it might be simplest having your high priority work done in the context of the interrupt handler. If those conditions don't hold, then implementing what you're talking about would be the start of a basic multitasking OS itself. That can be an interesting project in its own right, but if you're looking to just get work done, you might want to consider a simple RTOS.
Since you mentioned some specifics about the work you're doing, here's an overview of how I've handled a similar problem in the past:
For handling received data over a UART, one method that I've used when dealing with a simpler system that doesn't have full support for tasking (i.e., the tasks are round-robined in a simple while loop) is to have a shared queue for data that's received from the UART. When a UART interrupt fires, the data is read from the UART's RDR (Receive Data Register) and placed in the queue. The trick to deal with this in such a way that the queue pointers aren't corrupted is to carefully make the queue pointers volatile, and make certain that only the interrupt handler modifies the tail pointer and that only the 'foreground' task that's reading data off the queue modifies the head pointer. A high-level overview:
producer (the UART interrupt handler):
1. read queue.head and queue.tail into locals;
2. increment the local tail pointer (not the actual queue.tail pointer). Wrap it to the start of the queue buffer if you've incremented past the end of the queue's buffer.
3. compare local.tail and local.head - if they're equal, the queue is full, and you'll have to do whatever error handling is appropriate.
4. otherwise you can write the new data to the slot that queue.tail still points to (the position before the increment), so that it lines up with where the consumer reads
5. only now can you set queue.tail = local.tail
6. return from the interrupt (or handle other UART related tasks, if appropriate, like reading from a transmit queue)
consumer (the foreground 'task'):
1. read queue.head and queue.tail into locals;
2. if local.head == local.tail the queue is empty; return to let the next task do some work
3. read the byte pointed to by local.head
4. increment local.head and wrap it if necessary;
5. set queue.head = local.head
6. goto step 1
Make sure that queue.head and queue.tail are volatile (or write these bits in assembly) to make sure there are no sequencing issues.
Now just make sure that your UART received data queue is large enough that it'll hold all the bytes that could be received before the foreground task gets a chance to run. The foreground task needs to pull the data off the queue into its own buffers to build up the messages to give to the 'message processor' task.
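A minimal C sketch of the single-producer/single-consumer queue described above, assuming a single-core target where the ISR is the only writer of tail and the foreground loop the only writer of head. The names uart_queue_t, queue_put and queue_get and the size of 64 are arbitrary choices for illustration. Note that the byte is stored at the current tail slot and the incremented index is published afterwards, so the consumer never sees an index pointing at unwritten data.

```c
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_SIZE 64u /* one slot stays unused, so capacity is QUEUE_SIZE - 1 */

typedef struct {
    uint8_t           buf[QUEUE_SIZE];
    volatile uint16_t head; /* written only by the consumer (foreground) */
    volatile uint16_t tail; /* written only by the producer (ISR) */
} uart_queue_t;

/* Producer, called from the UART ISR. Returns false if the queue is full. */
bool queue_put(uart_queue_t *q, uint8_t byte)
{
    uint16_t head = q->head;
    uint16_t tail = q->tail;
    uint16_t next = (uint16_t)(tail + 1u);
    if (next == QUEUE_SIZE) {
        next = 0; /* wrap to the start of the buffer */
    }
    if (next == head) {
        return false; /* full: caller decides how to handle the overrun */
    }
    q->buf[tail] = byte; /* store first ... */
    q->tail = next;      /* ... then publish the new tail */
    return true;
}

/* Consumer, called from the foreground loop. Returns false if the queue is empty. */
bool queue_get(uart_queue_t *q, uint8_t *out)
{
    uint16_t head = q->head;
    uint16_t tail = q->tail;
    if (head == tail) {
        return false; /* empty */
    }
    *out = q->buf[head];
    head = (uint16_t)(head + 1u);
    if (head == QUEUE_SIZE) {
        head = 0;
    }
    q->head = head; /* release the slot back to the producer */
    return true;
}
```

The foreground loop would call queue_get() until it returns false and then yield to the next task. On multi-core parts or CPUs that reorder memory accesses, volatile alone is probably not enough and memory barriers would be needed as well.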
What you are asking for is pretty straightforward on the Cortex-M3. You can trigger the low-priority ISR from software by writing its IRQ number to the Software Trigger Interrupt Register (STIR); privileged code such as an ISR can do this directly, and only unprivileged code needs STIR access enabled first (the USERSETMPEND bit). When the high-priority ISR gets done with the critical stuff, it just triggers the low-priority interrupt and exits. The NVIC will then tail-chain to the low-priority handler, if there is nothing more important going on.
The "more official way" or rather the conventional method is to use a priority based preemptive multi-tasking scheduler and the 'deferred interrupt handler' pattern.
Check your processor documentation. Some processors will interrupt if you write the bit that you normally have to clear inside the interrupt. I am presently using a SiLabs c8051F344 and in the spec sheet section 9.3.1:
"Software can simulate an interrupt by setting any interrupt-pending flag to logic 1. If interrupts are enabled for the flag, an interrupt request will be generated and the CPU will vector to the ISR address associated with the interrupt-pending flag."