Why is clock stretching required in I2C - embedded

Clock stretching is used to slow down the master in I2C. If a slave device is I2C based, it should be able to work at 100 kHz according to the standard.
My confusion is: what is the need for clock stretching when the clock is already 100 kHz on both master and slave?
Please provide an example of this.

Clock stretching is needed because the slave may not be able to receive data from the master, or may require additional time to process the data it has already received.
Many slave devices are low-cost parts without much buffering; they may need a long time to read slow memory, and they use clock stretching to prevent overflow.
It is a cheap flow-control mechanism (compared with more complex asynchronous links), safe and simple.
It also helps when a low-cost device's CPU cannot service I2C while it is busy with another task. For example, a sensor may take an interrupt and hold the clock line low; once its internal work is done, it releases the clock and sends the data at normal speed.
Ref: http://cache.freescale.com/files/sensors/doc/app_note/AN4481.pdf P.12
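To make the mechanism concrete, here is a minimal sketch of how a bit-banged I2C master honours a stretched clock. The pin helpers (scl_release, scl_read, delay_us) are hypothetical and stand in for the target's GPIO driver:

/* Minimal sketch: a bit-banged I2C master honouring clock stretching.
 * scl_release/scl_read/delay_us are placeholders for the real GPIO driver. */
#include <stdbool.h>
#include <stdint.h>

extern void scl_release(void);   /* let SCL float high (open-drain output)  */
extern bool scl_read(void);      /* sample the actual level of the SCL line */
extern void delay_us(uint32_t us);

/* Returns false if the slave stretched the clock longer than timeout_us. */
static bool i2c_wait_scl_high(uint32_t timeout_us)
{
    scl_release();                     /* master stops driving SCL low       */
    while (!scl_read()) {              /* slave is still holding SCL low ... */
        if (timeout_us-- == 0)
            return false;              /* ... give up: bus stuck or faulty   */
        delay_us(1);                   /* the stretch itself is the flow control */
    }
    return true;                       /* slave released SCL, master may continue */
}

The master calls this after releasing SCL on every clock edge; as long as the slave keeps the line low, the master simply waits, which is exactly the "slow down the master" behaviour described above.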

Related

How to log a particular address from an STM32 NUCLEO-F334R8 with an inbuilt ST-LINK in real time using SWD & openOCD without halting the processor?

I am trying to learn how to debug an MCU non-intrusively using SWD & openOCD.
while (1)
{
    my_count++;                                   /* free-running counter to trace */
    HAL_GPIO_TogglePin(LD2_GPIO_Port, LD2_Pin);   /* blink the user LED            */
    HAL_Delay(750);
}
The code running on my MCU has a free-running counter "my_count". I want to sample/trace the data stored at the address holding "my_count" in real time.
I was doing it this way:
while (1) {                       // generic algorithm, no specific language
    mdw 0x00000000200000ac;       // openOCD command to read a word from an address
}
0x200000ac is the address of the variable my_count from the .map file.
But, this method is very slow and experiences data drops at high frequencies.
Is there any other way to trace the data at high frequencies without experiencing data drops?
I did some napkin math, and I have an idea that may work.
As per the reference manual (page 948), the maximum baud rate for the UART of the STM32F334 is 9 Mbit/s.
If we want to send the memory at that specific address, it will be 32 bits. One bit takes 1/9 Mbps, or 1.111*10^(-7) s; multiplied by 32 bits, that makes 3.555 microseconds. Obviously, as I said, it's purely napkin math; there are start and stop bits involved. But we have a lot of wiggle room. You can easily fit 64 bits into a transmission too.
I've also checked online, and it seems the ST-LINK, based on an STM32F103, has a maximum baud rate of 4.5 Mbps. A bummer, but we simply need to double our timings: 3.55*2 = 7.1 us for a 32-bit and 14.2 us for a 64-bit transmission. Even given some start and stop bit overhead, we still seem to fit into our 25 us time budget.
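For reference, the same estimate in code, this time including the usual 1 start + 1 stop bit per byte (the 4.5 Mbps figure is the ST-LINK limit mentioned above):

/* Napkin math spelled out: 10 bits per UART frame (1 start + 8 data + 1 stop). */
#include <stdio.h>

int main(void)
{
    const double baud = 4.5e6;               /* assumed ST-LINK VCP limit          */
    const double bits_per_byte = 10.0;       /* 1 start + 8 data + 1 stop          */
    double t32 = 4 * bits_per_byte / baud;   /* one 32-bit sample, in seconds      */
    double t64 = 8 * bits_per_byte / baud;   /* one 64-bit sample, in seconds      */
    printf("32-bit sample: %.2f us\n", t32 * 1e6);   /* ~8.9 us                    */
    printf("64-bit sample: %.2f us\n", t64 * 1e6);   /* ~17.8 us                   */
    return 0;
}

Both figures stay comfortably under the 25 us budget even with framing overhead included.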
So the suggestion is the following:
You have a timer set to a 25 us period that fires an interrupt, which activates a DMA UART transmission. That way your MCU has very little overhead, since the DMA will autonomously handle the transmission while your MCU does whatever it wants in the meantime. Entering and exiting the timer ISR will in fact be the greatest part of the overhead caused by this, since in the ISR you literally flip a pair of bits to tell the DMA to send the data over UART at 4.5 Mbps.
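As a rough sketch of that setup on an STM32 with the HAL: the handle names (huart2, htim6) and the 25 us timer period are assumptions; the 0x200000AC address is the one taken from the .map file above.

/* Hedged sketch: a periodic timer interrupt kicks off a DMA-driven UART
 * transfer of the watched word. Handle names and timer period are assumed. */
#include "stm32f3xx_hal.h"

extern UART_HandleTypeDef huart2;   /* UART wired to the ST-LINK VCP (assumed)     */
extern TIM_HandleTypeDef  htim6;    /* basic timer configured for a 25 us period   */

static volatile uint32_t *const sample_addr = (uint32_t *)0x200000ACu;
static uint32_t sample;             /* copy handed to DMA, so the value is stable  */

void HAL_TIM_PeriodElapsedCallback(TIM_HandleTypeDef *htim)
{
    if (htim == &htim6) {
        sample = *sample_addr;      /* grab the live value of my_count             */
        /* Fire-and-forget: DMA streams the 4 bytes while the CPU carries on. */
        HAL_UART_Transmit_DMA(&huart2, (uint8_t *)&sample, sizeof(sample));
    }
}

/* Somewhere in main(), after the usual MX init code: */
/* HAL_TIM_Base_Start_IT(&htim6); */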

Receive/Transmit FIFO vs Data Registers in UART

Could you please tell me what the difference is between the receive/transmit FIFO and the data register in a UART?
This is the first time I have developed driver code for a UART, and so the first time I have come across this. I am really not able to understand the difference. Please help me!
A FIFO (first-in-first-out) buffer in a UART is a hardware implemented queue of received or transmitted data. You do not access the FIFO directly, instead you read or write the data register and this data is automatically read or written from the head of the queue.
A FIFO can improve link efficiency because it allows software data read/write timing to vary while maintaining streaming data on the physical link.
When the FIFO is disabled or for a UART with no FIFO, there are only two bytes of buffering - the shift-register and the data-register. For input data, if the software does not read the data register in time before new data is received, it will be overwritten and data will be lost. Equally for transmission, if data is not written as quickly as it is transmitted, the full bandwidth and efficiency of the link may not be realised.
A FIFO is perhaps most useful on systems without deterministic real-time performance, where there may be no guarantee of timely servicing of the UART (such as a desktop PC running a general-purpose OS such as Windows). However, on an embedded system where buffered, interrupt-driven serial I/O is used, a FIFO may not be strictly necessary for low to moderate data rates in many cases. On microcontrollers, UARTs lacking a FIFO often support DMA instead, which can be more effective at managing large streaming data bursts.
Once you write to the transmit data register, the byte goes into the transmit FIFO, where it sits until the physical line is ready to transmit.
The other way around is the same: data comes in from the physical line and goes into the receive FIFO, waiting for user code to collect it by reading the receive data register.
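To illustrate the point, here is a hedged sketch using made-up register names; UART_SR, UART_DR, the base address and the flag bits are placeholders, since the real names depend on the part. From the software's point of view, both the FIFO-less and FIFO cases look the same: you poll a status flag and touch a single data register.

/* Illustrative only: register and bit names are placeholders. */
#include <stdint.h>

#define UART_BASE  0x40004400u                     /* hypothetical base address */
#define UART_SR    (*(volatile uint32_t *)(UART_BASE + 0x00))
#define UART_DR    (*(volatile uint32_t *)(UART_BASE + 0x04))
#define SR_TXE     (1u << 7)   /* data register (or TX FIFO slot) free           */
#define SR_RXNE    (1u << 5)   /* received data (or RX FIFO entry) available     */

static void uart_putc(uint8_t c)
{
    while (!(UART_SR & SR_TXE))    /* wait for room: with a FIFO this rarely blocks */
        ;
    UART_DR = c;                   /* the write lands at the tail of the TX FIFO    */
}

static uint8_t uart_getc(void)
{
    while (!(UART_SR & SR_RXNE))   /* wait until the RX FIFO head holds data        */
        ;
    return (uint8_t)UART_DR;       /* the read pops the head of the RX FIFO         */
}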

Issues using PIC18 as an SPI slave

I have been working on a PIC18F45k20 running at 16 MHz and using it as an SPI slave.  I find that no matter the SPI clock rate (SCK) from the master I always have to add a significant delay (~64 us) between SPI bytes to avoid SPI collisions or receive overflow.   Without the delay and at very slow SPI clock rates, 95% of the SPI packets will get through without collision or overflow.
Online posts lend me to think that this may be a "feature" of this, and other, PIC18 processors.
Have others observed this same slave “feature”?
If this is a “feature”, is it found in all PIC18 processors?
I tested the PIC18 without an interrupt with the following:
if (SSPSTATbits.BF)        /* MSSP buffer full: a byte has been received    */
{
    DataIn = SSPBUF;       /* read the received byte (clears BF)            */
    SSPBUF = DataOut;      /* preload the reply for the next transfer       */
}
Also tested using an interrupt and saw the same challenge.
Makes me wonder if it doesn’t truly detect the SPI clock properly.
If you have an oscilloscope check to make sure that the chip select is not being released prior to the PIC clocking out the last SPI data byte. You need to wait on the SPI busy bit before releasing the chip select line.
As far as I know, the PIC18 is an 8-bit microcontroller, even though its integer type maps to 16 bits. The SPI module, however, works with 8-bit data. That means if the master sends this microcontroller more than 8 bits at a time, for example 16 bits, the SPI module overflows and can no longer respond to the master clock. So in slave mode, make sure the data from the master is framed as 8-bit bytes. If the PIC18 were the SPI master instead, then even if its slave sent 16 bits, the PIC18 would hold the clock after the first 8 bits and wait until its buffer had been read and emptied before clocking in the next 8 bits.
I've also come across this issue, and it seems that what one should take into account is that the supported SPI clock rate simply tells how fast the MCU can receive one byte into SSPBUF.
Reading that byte from SSPBUF and storing it in a buffer requires some work, like incrementing a pointer, which takes time. This is what reduces the actual SPI bandwidth for multi-byte transfers.
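As an illustration, here is a sketch of the interrupt-driven variant, assuming XC8 and the PIC18 MSSP register names used in the question. The few instructions spent storing the byte and advancing the index are exactly the per-byte overhead being described, and they set the minimum gap the master must leave between bytes.

/* Sketch of an interrupt-driven SPI slave receive path (assumes XC8 and the
 * PIC18F45K20 register names from the question). */
#include <xc.h>
#include <stdint.h>

#define RX_LEN 32
static volatile uint8_t rx_buf[RX_LEN];
static volatile uint8_t rx_idx;

void __interrupt() isr(void)
{
    if (PIR1bits.SSPIF) {              /* MSSP: a byte has been shifted in     */
        uint8_t in = SSPBUF;           /* reading SSPBUF clears BF             */
        SSPBUF = 0x00;                 /* preload the byte for the next shift  */
        if (rx_idx < RX_LEN)
            rx_buf[rx_idx++] = in;     /* the "some work" that costs time      */
        PIR1bits.SSPIF = 0;            /* clear the interrupt flag             */
    }
}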

Synopsys USB OTG Controller (2.65a) occasionally truncates isochronous IN in USB device mode

I'm using a Synopsys OTG core in device mode, programming an isochronous IN high-speed endpoint (USB 2.0) for the maximum transfer per microframe (3 packets of 1024 bytes) using a periodic FIFO dedicated to this endpoint. It works 99+% of the time, but occasionally the transfer is truncated. For example, the first 1024 bytes will go onto the bus with the DATA0 PID (instead of the correct DATA2 PID) and the remaining 2048 bytes will not be sent. Since I've programmed the packet count, multicount, max packet size and transfer size correctly, I'm not sure what is causing this.
Obviously this is a very specific question and I don't have much hope of getting an answer, but I figured a shot in the dark was worth a try. Thanks in advance.
Isochronous transfers do not guarantee packet delivery. So if the host controller has other active transfers, it will silently drop isochronous packets. If you need guaranteed packet delivery, you should use bulk transfers (but then delivery time is not guaranteed).
Isochronous is ideal for applications like sound or video streaming, where you need constant delivery time but the loss of some frames is OK.
The specification places limits on the bus, allowing no more than 90% of any frame to be allocated for periodic transfers (Interrupt and Isochronous) on a full speed bus. On high speed buses this limitation gets reduced to no more than 80% of a microframe can be allocated for periodic transfers. (c) http://www.beyondlogic.org/usbnutshell/usb4.shtml
Answering my own question in case it may help someone else. It appears this OTG controller has a bug where the TX FIFO does not always empty properly. I found a successful workaround is to flush the FIFO after each TX. It's quick and the truncation symptom goes away.
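For reference, a sketch of what that per-transfer flush can look like. The GRSTCTL offset and bit positions below follow the commonly documented layout for this core family and must be checked against the databook for your configuration; the peripheral base address is a placeholder.

/* Sketch of the per-transfer TX FIFO flush workaround. Register offset and
 * bit positions are assumptions to verify against the core's databook. */
#include <stdint.h>

#define OTG_BASE          0x50000000u                 /* SoC-specific placeholder */
#define GRSTCTL           (*(volatile uint32_t *)(OTG_BASE + 0x010))
#define GRSTCTL_TXFFLSH   (1u << 5)                   /* TX FIFO flush request    */
#define GRSTCTL_TXFNUM(n) (((uint32_t)(n) & 0x1Fu) << 6)  /* which TX FIFO        */

static void flush_tx_fifo(uint32_t fifo_num)
{
    GRSTCTL = GRSTCTL_TXFNUM(fifo_num) | GRSTCTL_TXFFLSH;
    while (GRSTCTL & GRSTCTL_TXFFLSH)   /* core clears the bit when the flush completes */
        ;
}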

Direct memory access DMA - how does it work?

I read that if DMA is available, the processor can route long read or write requests for disk blocks to the DMA controller and concentrate on other work. But the DMA-to-memory data/control channel is busy during this transfer. What else can the processor do during this time?
First of all, DMA (per se) is almost entirely obsolete. As originally defined, DMA controllers depended on the fact that the bus had separate lines to assert for memory read/write, and I/O read/write. The DMA controller took advantage of that by asserting both a memory read and I/O write (or vice versa) at the same time. The DMA controller then generated successive addresses on the bus, and data was read from memory and written to an output port (or vice versa) each bus cycle.
The PCI bus, however, does not have separate lines for memory read/write and I/O read/write. Instead, it encodes one (and only one) command for any given transaction. Instead of using DMA, PCI normally does bus-mastering transfers. This means instead of a DMA controller that transfers memory between the I/O device and memory, the I/O device itself transfers data directly to or from memory.
As for what else the CPU can do at the time, it all depends. Back when DMA was common, the answer was usually "not much" -- for example, under early versions of Windows, reading or writing a floppy disk (which did use the DMA controller) pretty much locked up the system for the duration.
Nowadays, however, the memory typically has considerably greater bandwidth than the I/O bus, so even while a peripheral is reading or writing memory, there's usually a fair amount of bandwidth left over for the CPU to use. In addition, a modern CPU typically has a fairly large cache, so it can often execute instructions without using main memory at all.
The key point to note is that the bus is only partly used by the DMA transfer, and the remaining bandwidth is free for other jobs/processes to run. This is the key advantage of DMA over programmed I/O. Hope this answered your question :-)
But, DMA to memory data/control channel is busy during this transfer.
Being busy doesn't mean the bus is saturated and unable to handle other concurrent transfers. It's true the memory may be a bit less responsive than normal, but CPUs can still do useful work, and there are other things they can do unimpeded: crunch data that's already in their cache, receive hardware interrupts, etc. And it's not just about the quantity of data, but the rate at which it's generated: some devices create data in hard real time and need it consumed promptly, otherwise it's overwritten and lost. To handle this without DMA, the software may have to nail itself to a CPU core and spin waiting and reading, avoiding being swapped onto some other task for an entire scheduler time slice, even though most of the time further data isn't even ready.
During a DMA transfer, the CPU is idle and has no control over the memory bus; it is put into that state by placing its bus drivers in a high-impedance state while the DMA controller owns the bus.