Issues using PIC18 as an SPI slave - embedded

I have been working on a PIC18F45k20 running at 16 MHz and using it as an SPI slave.  I find that no matter the SPI clock rate (SCK) from the master I always have to add a significant delay (~64 us) between SPI bytes to avoid SPI collisions or receive overflow.   Without the delay and at very slow SPI clock rates, 95% of the SPI packets will get through without collision or overflow.
Online posts lead me to think that this may be a "feature" of this, and other, PIC18 processors.
Have others observed this same slave “feature”?
If this is a “feature”, is it found in all PIC18 processors?
I tested the PIC18 without an interrupt with the following:
if (SSPSTATbits.BF)
{
    DataIn = SSPBUF;
    SSPBUF = DataOut;
}
Also tested using an interrupt and saw the same challenge.
Makes me wonder if it doesn’t truly detect the SPI clock properly.

If you have an oscilloscope, check that the chip select is not being released before the PIC has clocked out the last SPI data byte. You need to wait on the SPI busy bit before releasing the chip select line.

As far as I know, the PIC18 is an 8-bit microcontroller, although its integer type is 16 bits wide. The SPI module, however, works with 8-bit data. If your master sends this microcontroller more than 8 bits at a time, such as 16 bits, an overflow occurs in the SPI module and it can no longer respond to the master clock. So in slave mode, make sure the data from the master is structured in 8-bit units. If the PIC18 is the master, on the other hand, then even when its slave sends 16-bit data, the PIC18 holds the clock after the first 8 bits and waits until its buffer has been read and emptied before clocking the next 8 bits.

I've also come across this issue, and it seems that what one should take into account is that the supported SPI clock rate simply tells you how fast the MCU can receive one byte into SSPBUF.
Reading this byte from SSPBUF and storing it in a buffer will require some work like incrementing a pointer etc., which will take some time. This is what reduces actual SPI bandwidth for multi-byte SPI.
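To put rough numbers on that, here is a minimal sketch of the cycle budget. It relies on the PIC18 executing one instruction cycle per four oscillator clocks; the 1 MHz SCK used in the usage note is only an illustrative assumption, not a figure from the question, and the helper names are made up for this example.

```c
#include <stdint.h>

/* PIC18 cores execute one instruction cycle per four oscillator clocks. */
#define CLOCKS_PER_INSTRUCTION 4u

/* Instruction cycles available in a gap of gap_us microseconds at fosc_hz. */
static uint32_t cycles_in_gap(uint32_t fosc_hz, uint32_t gap_us)
{
    uint32_t fcy_hz = fosc_hz / CLOCKS_PER_INSTRUCTION;
    return (uint32_t)(((uint64_t)fcy_hz * gap_us) / 1000000u);
}

/* Instruction cycles that elapse while one 8-bit byte is shifted at sck_hz. */
static uint32_t cycles_per_byte(uint32_t fosc_hz, uint32_t sck_hz)
{
    uint32_t fcy_hz = fosc_hz / CLOCKS_PER_INSTRUCTION;
    return (uint32_t)(((uint64_t)fcy_hz * 8u) / sck_hz);
}
```

At 16 MHz, cycles_in_gap(16000000, 64) gives 256 instruction cycles for the 64 us gap, while cycles_per_byte(16000000, 1000000) gives only 32 cycles per byte at an assumed 1 MHz SCK. Everything the slave does per byte (polling BF or entering the ISR, reading SSPBUF, storing the byte, reloading SSPBUF) has to fit in whatever window the master leaves.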

Related

How to log a particular address from an STM32 NUCLEO-F334R8 with an inbuilt ST-LINK in real time using SWD & openOCD without halting the processor?

I am trying to learn how to debug an MCU non-intrusively using SWD & openOCD.
while (1)
{
    my_count++;
    HAL_GPIO_TogglePin(LD2_GPIO_Port, LD2_Pin);
    HAL_Delay(750);
}
The code running on my MCU has a free running counter "my_count" . I want to sample/trace the data stored in the address holding "my_count" in real time :
I was doing it this way:
while (1) { // generic algorithm, no specific language
    mdw 0x00000000200000ac; // openOCD command to read from an address
}
0x200000ac is the address of the variable my_count from the .map file.
But, this method is very slow and experiences data drops at high frequencies.
Is there any other way to trace the data at high frequencies without experiencing data drops?
I did some napkin math, and I have an idea that may work.
As per Reference Manual, page 948, the max baud rate for UART of STM32F334 is 9Mbit/s.
If we want to send memory at the specific address, it will be 32 bits. 1 bit takes 1/9Mbps or 1.111*10^(-7)s, multiply that by 32 bits, that makes it 3.555 microseconds. Obviously, as I said, it's purely napkin math. There are start and stop bits involved. But we have a lot of wiggle room. You can easily fit 64 bits into transmission too.
Now, I've checked with the internet, it seems the ST-Link based on STM32F103 can have max baud rate of 4.5Mbps. A bummer, but we simply need to double our timings. 3.55*2 = 7.1us for 32-bit and 14.2us for 64-bit transmission. Even given there is some start and stop bit overhead, we still seem to fit into our 25us time budget.
So the suggestion is the following:
You have a timer set to a 25 us period that fires an interrupt, which starts a DMA UART transmission. That way your MCU has very little overhead, since DMA handles the transmission autonomously while your MCU can do whatever it wants in the meantime. Entering and exiting the timer ISR will in fact be the greatest part of the overhead caused by this, since in the ISR you will literally flip a pair of bits to tell DMA to send stuff over UART at 4.5 Mbps.
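The napkin math can be redone with the start and stop bits included. This is a minimal sketch that assumes 8-N-1 framing (10 bits on the wire per payload byte) and back-to-back bytes with no idle time between them; the function name is made up for the example.

```c
#include <stdint.h>

/* Bits on the wire per payload byte for 8-N-1: 1 start + 8 data + 1 stop. */
#define BITS_PER_FRAME_8N1 10u

/* Time in nanoseconds to send payload_bytes back to back at baud bits/s. */
static uint32_t uart_tx_time_ns(uint32_t payload_bytes, uint32_t baud)
{
    uint64_t bits = (uint64_t)payload_bytes * BITS_PER_FRAME_8N1;
    return (uint32_t)((bits * 1000000000ull) / baud);
}
```

With these assumptions, uart_tx_time_ns(4, 4500000) is about 8.9 us for a 32-bit value and uart_tx_time_ns(8, 4500000) is about 17.8 us for a 64-bit value, so both still fit inside the 25 us period.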

what is the best way to design a shift register with stm32

I am using an STM32F031K6, clocked at 40 MHz, and I want to write a program that acts as a looping shift register: an external trigger clocks it, and the values in the shift register shift left every time a rising/falling edge is received. The output is one pin, either high or low.
I need the time between the clocking edge and the output to be less than 0.5 us, or failing that, as short as possible. The values of the shift register can be changed and the length can also be changed, but for now I'm just starting with a byte like 11000010.
I initially thought to implement this with an external interrupt, but it was suggested there may be a better way to implement it.
Any help is much appreciated.
You might use the SPI peripheral of the STM32F0 for your task. When configured in slave mode, each time an external clock edge is detected on the SCK signal, the MISO will be set to the next bit of a value loaded into an internal shift register via the SPI data register.
Check out the chapter on the Serial peripheral interface (SPI) in STM32F0 reference manual.
Especially have a look at the sections addressing the following keywords:
General description: SPI block diagram
Slave Mode (Master selection: Slave configuration)
Simplex communication: Transmit-only mode (RXONLY=0)
Slave select (NSS) pin management: Software NSS management (SSM = 1)
Data frame format (data size can be set from 4-bit up to 16-bit length)
Configuration of SPI
The SPI unit is highly configurable, e.g. regarding the polarity of clock signal. Since it is an independent hardware unit, it should be able to handle your 0.5us reaction time requirement. The MCU firmware needs to set up the SPI unit and then provide new data to the SPI unit, each time the Tx buffer empty flag (TXE) is set. This can also be done by interrupt (TXEIE) or even using a DMA channel (TXDMAEN) with a circular buffer. In the latter case the "shift register functionality" runs completely independent of the MCU core (after setup).
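For checking the output seen on the pin, a host-side reference model of the looping shift register can be handy. This is only a sketch of the intended behaviour (MSB shifted out first and rotated back in), not STM32 peripheral code; the type and function names are made up for the example.

```c
#include <stdint.h>

typedef struct {
    uint32_t bits;    /* current register contents, right-aligned */
    uint8_t  length;  /* register length in bits, 1..32 */
} loop_shift_reg;

/* Advance by one clock edge: output the MSB and rotate it back in as LSB. */
static uint8_t loop_shift_clock(loop_shift_reg *r)
{
    uint8_t out = (uint8_t)((r->bits >> (r->length - 1u)) & 1u);
    uint32_t mask = (r->length == 32u) ? 0xFFFFFFFFu
                                       : ((1u << r->length) - 1u);
    r->bits = ((r->bits << 1) | out) & mask;
    return out;
}
```

Starting from the byte 11000010 with length 8, eight clocks produce 1, 1, 0, 0, 0, 0, 1, 0 and leave the register back at its initial value. With the SPI approach, the same pattern would simply be the word that is reloaded into the data register (or kept in the circular DMA buffer).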

Where is the SPI multiplexer of my dreams?

Consider you have an SPI bus with only a single chip select.
Is there a chip such that I can connect 8 or more devices to that SPI bus?
To simplify things, you may assume that all devices agree on the SPI mode (data needs to be valid on a rising edge). Also, all devices are of the type where chip select stays low for the whole transfer and is not toggled after each word.
The SPI multiplexer could have 4 inputs:
MISO, MOSI, input clock, master chip select
and 9 outputs:
output clock, 8 slave chip selects
MISO and MOSI are connected directly to the slaves. The slaves have their SPI clock connected to the output clock, and their chip selects are each connected to one of the 8 slave chip selects.
The SPI multiplexer would take the first byte of each SPI transfer as its own input. The first byte could indicate which slave is to be selected. For configuration of the multiplexer, a ninth address could be allowed.
If one of the 8 slaves is selected, the multiplexer would then activate the slave's chip select after the first byte (or even after the first few bits of the first byte). The output clock would activate with the start of the second byte and it would be synchronous to the input clock. Leaving the clock inactive during the first byte makes sure that the slaves never take notice of the first byte.
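The address decoding described above can be sketched as a small behavioural model. This is not an existing part; the encoding (addresses 0 to 7 select a slave, address 8 is the configuration address, anything else selects nothing) and all names are assumptions made for illustration.

```c
#include <stdint.h>

#define MUX_NUM_SLAVES   8u
#define MUX_CONFIG_ADDR  8u    /* hypothetical "ninth address" for mux config */
#define MUX_CS_ALL_IDLE  0xFFu /* active-low chip selects, all deasserted */

/* Map the first byte to the active-low CS outputs.
 * Returns 1 if a slave is selected, 0 otherwise. */
static int mux_decode(uint8_t first_byte, uint8_t *cs_out, int *is_config)
{
    *cs_out = MUX_CS_ALL_IDLE;
    *is_config = (first_byte == MUX_CONFIG_ADDR);
    if (first_byte < MUX_NUM_SLAVES) {
        *cs_out = (uint8_t)~(1u << first_byte); /* pull one CS line low */
        return 1;
    }
    return 0;
}
```

In this model the output clock would only be forwarded for the bytes that follow the first one, and only while mux_decode reported a selected slave.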
Such a chip does not seem to exist. I found solutions with two chip selects, but that's not an option to upgrade old hardware designs with just a single chip select.
Does such a thing exist?
It does not exist because there is no need for it. Often CS is controlled from software as a regular GPIO before initiating an SPI transfer, in which case you can just wire the different slaves to different GPIOs and use them as CS. If the CS is generated and needed by the hardware block, you gate this signal with the external chip select to choose the wanted slave.
Your proposed one-byte-to-select scheme would also break all software that relies on writing from and reading into the same buffer, and it would decrease the available bandwidth by spending it on control signals.
As a side note, with recent developments in secure enclaves and TrustZone, the trend is not to share SPI buses but rather to hard-wire them, so that only code running in the trusted part of the SoC can access the connected slave.

Why are FIFO One-quarter full, Half-full, three-quarter full interrupts provided in a UART RX FIFO? What are their use cases?

I am implementing a protocol decoder which receives bytes through UART of a microcontroller. The ISR takes bytes from the UART peripheral and puts it in a ring buffer. The main loop reads from the ring buffer and runs a state machine to decode it.
The UART internally has a 32-byte receive FIFO, and provides interrupts when this FIFO is quarter-full, half-full, three-quarter full and completely full. How should I determine which of these interrupts should trigger my ISR? What is the tradeoff involved?
Note - The protocol involves packets of 32 bytes (fixed length), sent every 10 ms.
This depends on a lot of things, most of all the maximum baudrate supported, and how much time your application needs for executing other tasks.
Traditional ring buffers work on byte-per-byte interrupt basis. But it is of course always nice to reduce the number of interrupts. It probably doesn't matter much how often you let it trigger.
It is much more important to implement a double-buffer scheme. You should of course not start to run a state machine decoding straight from a single ring buffer. That will turn into a race condition nightmare.
Your main program should take the semaphore or disable the UART interrupt, then copy the whole buffer, then re-enable the interrupt. Ideally the buffer copy is done by swapping a pointer rather than doing a hard copy. The code doing this needs to be benchmarked to run faster than 1/baudrate * 10 seconds, where 10 is 1 start bit, 8 data bits and 1 stop bit, assuming the UART is 8-N-1.
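A minimal sketch of the pointer-swap idea follows. The interrupt enable/disable hooks are empty placeholders (hypothetical names) standing in for whatever masks the UART RX interrupt on the actual part, and the 32-byte packet length is taken from the question. As an example of the time limit, at an assumed 115200 baud one 8-N-1 byte takes about 87 us.

```c
#include <stddef.h>
#include <stdint.h>

#define PACKET_LEN 32u

static uint8_t buf_a[PACKET_LEN], buf_b[PACKET_LEN];
static uint8_t *volatile fill_buf = buf_a;  /* buffer the ISR writes to */
static volatile size_t fill_count = 0;
static volatile int packet_ready = 0;

/* Hypothetical hooks: on real hardware these mask/unmask the RX interrupt. */
static void uart_rx_irq_disable(void) {}
static void uart_rx_irq_enable(void) {}

/* Called from the RX interrupt for each received byte.
 * Bytes arriving while a full buffer has not been taken yet are dropped. */
static void uart_rx_isr_byte(uint8_t b)
{
    if (fill_count < PACKET_LEN) {
        fill_buf[fill_count++] = b;
    }
    if (fill_count == PACKET_LEN) {
        packet_ready = 1;
    }
}

/* Called from the main loop: returns a full buffer to decode, or NULL.
 * Only this short swap has to run with the interrupt disabled. */
static const uint8_t *take_packet(void)
{
    const uint8_t *done = NULL;
    uart_rx_irq_disable();
    if (packet_ready) {
        done = fill_buf;
        fill_buf = (fill_buf == buf_a) ? buf_b : buf_a;
        fill_count = 0;
        packet_ready = 0;
    }
    uart_rx_irq_enable();
    return done;
}
```

The main loop can then run the state machine over the returned buffer at its own pace, as long as it finishes before the other buffer fills.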
If available, use DMA over software ring buffers.
Given a packet based protocol and a UART that interrupts when more than one byte has been received, consider what should happen if the final byte of a packet is received but that final byte isn't enough to fill the FIFO past the threshold and trigger an interrupt. Is your application simply not going to receive that incomplete packet until some subsequent packet is received and the FIFO finally fills enough? What if the other end is waiting for a response and never sends another packet? Or is your application supposed to poll the UART to check for lingering bytes remaining in the UART FIFO? That seems overly complicated to both use an interrupt and poll for received bytes.
With the packet-based protocols I have implemented, the UART driver does not rely on the UART FIFO and configures the UART to interrupt when a single byte is available. This way the driver gets notified for every byte and there is no chance for the final byte of a packet to be left lingering in the UART's FIFO.
The UART's FIFO can be convenient for streaming protocols (such as audio or video data). When the driver is receiving a stream of data then there will always be incoming data to keep filling the FIFO. The driver can rely on the UART's FIFO to buffer some data. The driver can be more efficient by processing multiple bytes per interrupt and reducing the interrupt rate.
You might consider using the UART FIFO since your packets are a fixed length. But consider how the driver would recover if a single byte is dropped due to noise or whatever. I think it's still best to not rely on the FIFO for packet-based protocols regardless of whether the packets are fixed length.
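The tradeoff between the thresholds can be put into numbers with a small sketch. It assumes the FIFO starts empty, that the ISR drains it completely on every threshold interrupt, and that the UART has no separate receive-timeout interrupt; the function names are made up for the example.

```c
#include <stdint.h>

#define FIFO_DEPTH 32u

/* Bytes left sitting in the FIFO, below the threshold, after a whole
 * packet of packet_len bytes has arrived. */
static uint32_t lingering_bytes(uint32_t packet_len, uint32_t threshold)
{
    return packet_len % threshold;
}

/* Roughly how many byte times the ISR may be delayed after the threshold
 * interrupt before the FIFO is full. */
static uint32_t headroom_bytes(uint32_t threshold)
{
    return FIFO_DEPTH - threshold;
}
```

For 32-byte packets, thresholds of 8, 16 and 32 leave nothing behind, but 24 leaves 8 bytes lingering until the next packet. A completely-full threshold gives zero headroom, and if a single byte is dropped (31 bytes received) a threshold of 8 leaves 7 bytes stuck, which is the recovery problem described above.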

Sampling a high speed serial bit stream with MCU

I'm currently working on an application where an MCU is receiving data from a hardware chip in the form of an asynchronous serial bit transmission at 2Mbps. This data has no encoding and no protocol aside from a start sequence, after which it is raw binary data.
The current approach for recovery is using the SPI module in 3-pin mode to oversample the stream 4x at 8MHz, allowing for recovery of the asynchronous data. While seemingly effective thus far with a simulated testbench, this method is rather complicated as an internal clock needs to be routed to the SPI CLK as the device is run in slave mode in order for DMA to recover the transmitted data while the processor executes another task.
Would it be possible to use any other peripherals efficiently for this task aside from SPI? Faking a communication protocol to recover a serial bit stream seems a bit roundabout, but I am not sure how to utilize UART or I2C without doing the same, and those might not even be possible to use as the protocol bits are not present in the stream. I also want to avoid using an ADC in the interest of power, along with the fact that the data is already digital so it seems unnecessary.
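For reference, the software side of the 4x oversampling approach can be sketched as follows. This is a simplified illustration, not the poster's implementation: it assumes the capture buffer is packed MSB first, that the line idles at a known level before the start sequence, and it samples near the middle of each bit period without any resynchronisation, so clock drift over long streams is not handled.

```c
#include <stddef.h>
#include <stdint.h>

#define OVERSAMPLE 4u

/* Read sample i (0 or 1) from an MSB-first packed capture buffer. */
static uint8_t sample_at(const uint8_t *raw, size_t i)
{
    return (uint8_t)((raw[i / 8u] >> (7u - (i % 8u))) & 1u);
}

/* Index of the first sample that differs from the idle level,
 * or n_samples if there is none. */
static size_t find_first_edge(const uint8_t *raw, size_t n_samples,
                              uint8_t idle)
{
    for (size_t i = 0; i < n_samples; i++) {
        if (sample_at(raw, i) != idle) {
            return i;
        }
    }
    return n_samples;
}

/* Recover bits starting at the edge by taking one sample near the middle
 * of each bit period. Returns the number of bits written to bits_out. */
static size_t recover_bits(const uint8_t *raw, size_t n_samples, size_t edge,
                           uint8_t *bits_out, size_t max_bits)
{
    size_t n = 0;
    for (size_t i = edge + OVERSAMPLE / 2u;
         i < n_samples && n < max_bits; i += OVERSAMPLE) {
        bits_out[n++] = sample_at(raw, i);
    }
    return n;
}
```

A more robust decoder would re-align on every observed transition or take a majority vote over the middle samples; the single mid-bit sample is just the simplest variant.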