How can I broadcast data to multiple SPI slaves and how it works?

How can I broadcast data to multiple SPI slaves and how it works? - embedded

I know basic SPI protocol and its master to slave operations. I want to know that is it possible to broadcast data on multiple slave? If it is possible then how it works.
I've heard that after writing to any SPI slave you have to read from slave even if you are not going to use read data. How is this possible in case broadcast if broadcast is possible?

No, it's not possible. Since SPI is 4- wire interface. The SS line is used for slave selection.One SPI master can have multiple slaves but it cannot select all the slaves at once since it is either using some bit or SS signal to select the slave(and cannot select multiple slave at a single time).
SPI works in full duplex mode(most of time and cases), and is implemented using shift registers. Hence while writing the data on MOSI(assumption: master is writing),we are getting the data in MISO line too (inserting data in master shift register does shift the data in slave shift register). Hence we get data on both of the TX and RX lines.

It is possible if you have multiple SPI controllers. I have four SPI controllers connected to four different chips of the same type. I control the SS in software because the the SPI controller can't control it in a useful way. My driver has read, write, and broadcast methods. Read and write act on one SPI controller instance while the broadcast method (write-only of course) takes an array of SPI controller instances.
The broadcast method then takes the common address buffer of size bytes and the common data buffer of size bytes and sends them to all four SPI controllers sequentially (but close in time) to effect a broadcast from the caller's perspective. It isn't a hardware based broadcast but it does allow higher level software to think it is broadcasting common commands and setup data to all four chips at once.
It is also overall more time-efficient than sequentially sending the same thing to all four chips since efficiencies are to be had by having the lower layer SPI controller do the work.

If you use simplex SPI with one master and many slaves, you can create an interface of 1 to many. Assuming DMA, you posit the size of the transaction and a data buffer. The physical interface is 2 wire plus ground - CLK and MOSI. NSS is software selected not hardware. In theory you DMA from the master and the slaves get the data. This is a one way channel. I have been playing around with these sorts of configurations and this one seems to be working. Hardware is STM32f0/g0 architecture. I am also looking into ring topologies.. which were fun it the 90s, and may still be fun..

You cannot broadcast data to multiple slaves at the same time.

Related

what is the best way to design a shift register with stm32

I am using a STM32F031K6, clocked at 40MHz, and I want to design a program which acts as a looping shift register - an external trigger is used to clock it, the values in the shift register left shift every time a rising/falling edge is received. the output is one pin either high or low.
I need to make the time between the clocking edge and the output less than 0.5uS, or failing that as quick as possible. The values of the shift register can be changed and the length can also be changed, but for now I'm just starting with a byte like 11000010 .
I initially thought to implement this with an external interrupt but it was suggested there may be a better way to implement it
any help much appreciated

You might use the SPI peripheral of the STM32F0 for your task. When configured in slave mode, each time an external clock edge is detected on the SCK signal, the MISO will be set to the next bit of a value loaded into an internal shift register via the SPI data register.
Check out the chapter on the Serial peripheral interface (SPI) in STM32F0 reference manual.
Especially have a look at the sections addressing the following keywords:
General description: SPI block diagram
Slave Mode (Master selection: Slave configuration)
Simplex communication: Transmit-only mode (RXONLY=0)
Slave select (NSS) pin management: Software NSS management (SSM = 1)
Data frame format (data size can be set from 4-bit up to 16-bit length)
Configuration of SPI
The SPI unit is highly configurable, e.g. regarding the polarity of clock signal. Since it is an independent hardware unit, it should be able to handle your 0.5us reaction time requirement. The MCU firmware needs to set up the SPI unit and then provide new data to the SPI unit, each time the Tx buffer empty flag (TXE) is set. This can also be done by interrupt (TXEIE) or even using a DMA channel (TXDMAEN) with a circular buffer. In the latter case the "shift register functionality" runs completely independent of the MCU core (after setup).

Where is the SPI multiplexer of my dreams?

Consider you have an SPI bus with only a single chip select.
Is there a chip such that I can connect 8 or more devices to that SPI bus?
To simplify things, you may assume that all devices agree on the SPI mode (data needs to be valid on a rising edge). Also, all devices are of the time, where chip selects stays low for the whole transfer, and is not toggled after each word.
The SPI multiplexer could have 4 inputs:
MISO, MOSI, input clock, master chip select
and 9 outputs:
output clock, 8 slave chip selects
MISO and MOSI are connected directly to the slaves. The slaves have their SPI clock connected to the output clock and their chips selects are connected to one of the 8 slave chip selects.
The SPI multiplexer would take the two bytes of each SPI transfer as its own input. The first byte could indicate which slave is to be selected. For configuration of the multiplexer, a ninth address could be allowed.
If one of the 8 slaves is selected, the multiplexer would then activate the slave's chip select after the first byte (or even after the first few bits of the first byte). The output clock would activate with the start of the second byte and it would be synchronous to the input clock. Leaving the clock inactive during the first byte makes sure that the slaves never take notice of the first byte.
Such a chip does not seem to exist. I found solutions with two chip selects, but that's not an option to upgrade old hardware designs with just a single chip select.
Does such a thing exist?

It does not exist because there is no need for it. Often CS is only controlled from software as a regular gpio before initiating an SPI transfer, then you could just wire the different slaves to different GPIO and use them as CS. If the CS is generated and needed by the hardware block you gate this signal with the external chip select to choose the wanted slave.
Your proposed one byte to select bus would also break all software which relay on writing from and reading to the same buffer, and it would be decrease the available bandwidth with control signals.
As a side note with recent development with secure enclaves and Trustzone the trend is not to share spi buses but rather hard wire them so only code running in the trusted part of the SoC will be able to access the connected slave.

UART vs I2C vs SPI for inter-processor communication between microcontrollers

I am examining a way to connect two microcontrollers. On the level of serialization I am thinking of using Nano protobuffers (http://code.google.com/p/nanopb/). This way I can encode/decode messages and send them between two processors.
Basically, one small processor would be the RPC server, capable of doing several functions. Bigger processor will call there RPCs via messages sent, and then when data is ready, it will read it from smaller processor.
What would be the pros/cons of using UART, I2C or SPI?
Messages will be put in the mailbox que prior to sending.

It depends on your total requirements and how expensive are pins.
I2C only needs two pins, but it's slow and to handle it with or without interrupts is a pain, even with the build in peripheral modules.
It's a master/slave system, it's good for controlling many slow devices like temp sensors.
Only two lines for all bus devices, the selection is done via an I2C-Address in the protocol.
Uart needs two pins, it's normally faster, easier to handle, but requires (nearly) the same clocks at both sides.
One to one asynchronous system, can be good if both systems needs to be send sometimes data without waiting for a master poll request.
Can also be used as a bus system, but then you need a master/slave structure or more complex protocols.
SPI needs 3 (or 4 with CS) pins, it's the fastest, simple to implement even with DMA, low cpu time overhead, often buffered.
When you have enough free pins I would prefer it.

All of these interfaces have pros/cons.
UART connection in it's basic functionality requires 2 pins: RX and TX. The SW implementation of how to message over that UART is quite a bit more complicated...you'll have to develop your own messenging protocol between the devices and decide what is a good message and what is a bad message. It could get quite complicated because you pretty much have to define how to "communicate" over the physical link, what is an error, retries, etc. Unless you are implementing a serial port connection to a PC or some other external device, I think a UART is highly overkill for a IC to IC communication path. Master and slave are not specifically defined.
SPI is a master-slave relationship and can be a faster interface (I've seen up to 60MHz clock rates, not common) but it also requires more pins, 3 at a minimum for a point-to-point communication scheme but the number of pins increases to 3+n as the number of "slaves" increases above 1. There are no error indications via SPI. SPI is a "de-facto" standard...meaning it can vary in implementation...your mileage may vary depending on how a IC supplier defined "their" SPI implementation. I generally consider the lack of a true standard for SPI to be a "con".
I2C is also a two pin interface and is an actual "standard" developed by Phillips (now NXP.) As a standard, it is well-defined in how it operates, how errors are raised, and is simple to implement. It has an addressing scheme, can send commands, and can support 0 or more data frames in a transaction. CRC (optional) and higher data rates can be supported (up to 5Mbits.) It does have cons, namely bus capacitance can limit actual data rates (rise/fall time) but generally you can design around this "problem".
In their most basic forms, all of these busses are "ground referenced"...and can suffer from system induced noise. Obviously, lower rail voltages can make this even more of issue. Again careful design practice can mitigate many of the problems some people report to be the bain of their existence.
For the point-to-point system initially asked by the poster, if a master-slave arrangement is required, a SPI or I2C interface may be appropriate (data rate dependent.) If a master-master relationship is required, I2C or UART may be required.
For ease of implementation from a software point of view, I'd rank these communication methods in the following order:
I2C, if you need faster data rates than I2C can handle, then SPI
SPI, if you need multi-master, then I2C or UART
UART as a last resort...has a lot more software overhead to manage the communications channel

I would use UART or CAN or ETH or any protocol that is asynchronous.
If you use a synchronous protocol, the master must always "ask" the slave if it has data and generate unwanted traffic.

Chip to chip communication protocol over SPI

I'm trying to design an efficient communication protocol between a micro-controller on one side and an ARM processor on a multi-core TI chip on the other side through SPI.
The requirements for the needed protocol:
1 - Multi-session with queuing support, as I have multiple sending/receiving threads, so it will be more than one application using this communication protocol and I need the protocol to handle queuing these requests (I will keep holding the buffer if the transmission is queue but I just need the protocol to manage scheduling the queues).
2 - Works over SPI as an underlying protocol.
3 - Simple error checking.
In this thread: "Simple serial point-to-point communication protocol", PPP was a recommended option, however I see PPP does only part of the job.
I also found Light weight IP (LwIP) project featuring PPP over serial (which I assume that I can use it over SPI), so I thought about the possibility of utilizing any of the upper layers protocols like TCP/UDP to do the rest of the required jobs. Fortunately, I found TI including LwIP as part of their ethernet SW in the starterware package, which I assume to ease porting at least on the TI chip side.
So, my questions are:
1 - Is it valid to use LwIP for this communication scheme? Won't this introduce much overhead due to IP headers which are not necessary for a point to point (on the chip level) communication and kill the throughput?
2 - Will the TCP or any similar protocol residing in LwIP handle the queuing of transmission requests, for example if I request transmission through a socket while the communication channel is busy transmitting/receiving request for another socket (session) of another thread, will this be managed by the protocol stack? If so, which protocol layer manages it?
3 - Is their a more efficient protocol stack than LwIP, that meets the above requirements?
Update 1: More points to consider
1 - SPI is the only available option, I use it with available GPIOs to indicate to the master when the slave has data to send.
2 - The current implemented (non-standard) protocol uses DMA with SPI, and a message format of《STX_MsgID_length_payload_ETX》with a fixed message fragments length, however the main drawback of the current scheme is that the master waits for a response on the message (not fragment) before sending another one, which kills the throughput and does not utilise the full duplex nature of SPI.
3- An improvement to this point was to use a kind of mailbox for receiving fragments, so a long message can be interrupted by a higher priority one so that fragments of a single message can arrive non sequentially, but the problem is that this design lead to complicating things especially that I don't have much available resources for many buffers to use the mailbox approach on the controller (master) side. So I thought that it's like I'm re-inventing the wheel by designing a protocol stack for a simple point to point link which may not be efficient.
4- What kind of higher level protocols can be normally used above SPI to establish multiple sessions and solve the queuing/scheduling of messages?
Update 2: Another useful thread "A good serial communications protocol/stack for embedded devices?"
Update 3: I had a look at Modbus protocol, it seems to specify the application layer then directly the data link layer for serial line communication, which sounds to skip the unnecessary overhead of network oriented protocols layers.
Do you think this will be a better option than LwIP for the intended purpose? Also, is there a widely used open source implementation like LwIP but for Modbus?

I think that perhaps you are expecting too much of the humble SPI.
An SPI link is little more a pair of shift registers one in each node. The master selects a single node to connect to its SPI shift register. As it shifts in its data, the slave simultaneously shifts data out. Data is not exchanged unless the master explicitly clocks the data out. Efficient protocols on SPI involve the slave having something useful to output while the master inputs. This may be difficult to arrange, so you usually need a means of indicating null data.
PPP is useful when establishing a connection between two arbitrary endpoints, when the endpoints are fixed and known a priori, PPP would serve no purpose other than to complicate things unnecessarily.
SPI is not a very sophisticated nor flexible interface and probably unsuited to heavyweight general purpose protocols such as TCP/IP. Since "addressing" on SPI is performed by physical chip-select, the addressing inherent in such protocols is meaningless.
Flow control is also a problem with SPI. The master has no way of determining that the slave has copied the data from SPI the shift register before pushing more data. If your slave SPI supports DMA you would be wise to use it.
Either way I suggest that you develop something specific to your purpose. Since SPI is not a network as such, you only need a means to address threads on the selected node. This could be as simple as STX<thread ID><length><payload>ETX.
Added 27 September 2013 in response to comments
Generally SPI as its names suggests is used to connect to peripheral devices, and in that context the protocol is defined by the peripheral. EEPROMS for example typically use a common or at least compatible command interface across vendors, and SD/MMC card SPI interface uses a standardised command test and protocol.
Between two microcontrollers, I would imagine that most implementations are proprietary and application specific. Open protocols are designed for generic interoperability and to achieve that might impose significant unnecessary overhead for a closed system, unless perhaps the nodes were running a system that already had a network stack built in.
I would suggest that if you do want to use a generic network stack that you should abstract the SPI with device drivers at each end that give the SPI a standard I/O stream interface (open(), close(), read(), write() etc.), then you can use the higher-level PPP and TCP/IP protocols (although PPP can probably be avoided since the connection is permanent). However that would only be attractive if both nodes already supported these protocols (running Linux for example), otherwise it will be significant effort and code for little benefit, and would certainly not be "efficient".

I assume you dont really want or have room for a full ip (lwip) stack on the microcontroller? This just sounds like a lot of overkill. Why not just roll your own simple packet structure to move the data items you need to move. Depending on how spi is supported on both sides you may or may not be able to use it to define the frame for your data, if not a simple start pattern, length and a trailing checksum and maybe tail pattern would suffice for finding packet boundaries in the stream (no different than a serial/uart solution). You can even use the PPP solution for that with a start pattern and I think end pattern with the payload using a two byte pattern whenever the start pattern happens to show up in the data. I dont remember all the details now.
Whatever your frame is then add a packet type and your handshakes, or if the data is going to just be microcontroller to arm then you dont even need to do that.
To get back to your direct question. Yes, I think that an ip stack (lwip or other) will introduce a lot of overhead. both bandwidth and more important the amount of code needed to support that stack will chew up rom/ram on both sides. If you ultimately need to present this data in an ip fashion (a website hosted by the embedded system) then somewhere in the path you need an ip stack, etc.
I cant imagine that lwip manages your queues for you. I assume you would need to do that yourself. the various queues might want to talk to a single driver that deals with the single spi bus (assuming there is a single spi bus with multiple chip selects). It also depends on how you are using the spi interface, if you are allowing the arm to talk to multiple microcontrollers and the packets of data are broken up into a little bit from this controller a little from that controller so that nobody has to wait to long before they get a few more bytes of data. Or will a complete frame have to move from one microcontroller before moving onto the next gpio interrupt to pull that guys data? The long and short of it is I would assume you have to manage the shared resource just like you would in any other situation where you have multiple users of a shared resource (rtos, full blown operating system, etc). I dont remember lwip that well at all but with a full blown berkeley sockets application interface the user could write separate applications where each application only cared about one TCP or UDP port and the libraries and drivers managed separating those packets out to each application as well as all of the rules for the IP stack.
If you are not already doing experiments with moving data over the spi interface(s) I would start with simple experiments first just to get the feel for how well it is or isnt going to work, the sizes of transfers you can do reliably per spi transction, etc. Your solution may naturally just fall out of those experiments.

Control stepper motors via USB

I'm doing a USB device is to control stepper motors. I've done this before using a parallel port. because these ports do not exist in current motherboards, I decided to implement a USB communication between my device and the PC (host).
To achieve My objective, I endowed the freescale microcontroller the device with that has a USB module 12Mbps.
My USB device must receive 4 bytes (one for each motor driver) at a given time, because every byte is a step that should move the engine.
In the PC (Host) an application of user processes a text file with information and make the trajectory coordinates sending bytes at a certain rate for each motor (time is trivial to achieve the acceleration and speed of the motors) .
Using the parallel port was an easy the task because each byte is sent sequentially to a time determined by the user app.
doing a little research about full speed USB protocol understood that the frame is sent every 1ms.
then you can send 4 byte or many more every 1ms but I can not manage time like I did with the parallel port.
My microcontroller can send up to 64 bytes per frame (Based on transfer papers type Control, Bulk, Int, Iso ..).
question 1:
I want to know in what way I can send 4-byte packets faster than every 1 ms?
question 2:
What type of transfer can advise me for these type of devices?
Thanks.

Like Ricardo said, USB-serial will suffice.
As for the type of transfer, try implementing a CDC stack and use your SCI receiver to listen for PC commands. That will give you a receive buffer which will meet your needs.
Initialize your SCI (baud, etc)
Enable receiver and interrupt
On data receive, move it to your 4-byte command buffer
Clear receive buffer, wait for more
When you have all 4 bytes, fire off the steppers! Four bytes should take µs.
Check with Freescale to see if your processor is supported.
http://cache.freescale.com/files/microcontrollers/doc/support_info/USB_STACK_RELEASE_NOTES_V4.1.1.pdf?fpsp=1
There might even be some sample code to get you started.
-Cheers

I am achieving the same goal (driving/control CNC machines) like this:
the USB device is just synchronous I/O parallel port. Using continuous bulk transfer one pipe as input and one as output. This way I was able to achieve synchronous 64bit parallel communication with ~70KHz sample rate. It uses traffic around (i)4.27+(o)4.27 MBit/s that is limit for mine MCU and code. Bigger speeds cause jitter on the output due to USB events interrupts.
How to do it (on MCU side)
I have 2 FIFO's one for ingoing and one for outgoing data. I have timer interrupt occurring with sample rate frequency. In it I read the inputs and feed it to the first FIFO and read data from the other FIFO and send it to the outputs.
On top of that the USB task is called (inside the same interrupt) checking FIFO for sending to and incoming data from USB handling the transfer itself
I choose ATMEL AT32UC3A chips for this task. After a long and pain full research I decided these MCU's because they have enough memory for both FIFO's and program so no need for additional IC. It has FPGA package which can be used (BGA is not an option). It has HS USB (most USB MCU's have only FS like yours). It runs at 66MHz. It supports many interesting features (did interesting projects with it in the past) and of coarse I have experience with ATMEL MCU's from past
So if you want to achieve something similar then
start with bulk transfer (PC -> USB -> MCU -> output)
add FIFO if needed
do not know the sample rate you need. The old LPT's could handle from 80-196KHz depend on the manufactor. The modern ones are much much slower (which is silly and sad).
measure the critical sample rate
you need oscilloscope or very good hearing for this. The output data must be synchronous so no holes in it, no jitter, etc...
if any of these are present you have to lower the sample rate. Mine setup could handle even 1MHz sample rate but the USB jitter was present (sometimes USB event froze the sending for longer that one sample...) so I achieve only 70KHz of stable output.
if needed also inputs then add them
but only if the output is working as it should. Do not forget to lower the sample rate after this too ... Use separate bulk pipes and FIFOs for input and output.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas