UART vs I2C vs SPI for inter-processor communication between microcontrollers - embedded

I am examining a way to connect two microcontrollers. On the level of serialization I am thinking of using Nano protobuffers (http://code.google.com/p/nanopb/). This way I can encode/decode messages and send them between two processors.
Basically, one small processor would be the RPC server, capable of doing several functions. Bigger processor will call there RPCs via messages sent, and then when data is ready, it will read it from smaller processor.
What would be the pros/cons of using UART, I2C or SPI?
Messages will be put in the mailbox que prior to sending.

It depends on your total requirements and how expensive are pins.
I2C only needs two pins, but it's slow and to handle it with or without interrupts is a pain, even with the build in peripheral modules.
It's a master/slave system, it's good for controlling many slow devices like temp sensors.
Only two lines for all bus devices, the selection is done via an I2C-Address in the protocol.
Uart needs two pins, it's normally faster, easier to handle, but requires (nearly) the same clocks at both sides.
One to one asynchronous system, can be good if both systems needs to be send sometimes data without waiting for a master poll request.
Can also be used as a bus system, but then you need a master/slave structure or more complex protocols.
SPI needs 3 (or 4 with CS) pins, it's the fastest, simple to implement even with DMA, low cpu time overhead, often buffered.
When you have enough free pins I would prefer it.

All of these interfaces have pros/cons.
UART connection in it's basic functionality requires 2 pins: RX and TX. The SW implementation of how to message over that UART is quite a bit more complicated...you'll have to develop your own messenging protocol between the devices and decide what is a good message and what is a bad message. It could get quite complicated because you pretty much have to define how to "communicate" over the physical link, what is an error, retries, etc. Unless you are implementing a serial port connection to a PC or some other external device, I think a UART is highly overkill for a IC to IC communication path. Master and slave are not specifically defined.
SPI is a master-slave relationship and can be a faster interface (I've seen up to 60MHz clock rates, not common) but it also requires more pins, 3 at a minimum for a point-to-point communication scheme but the number of pins increases to 3+n as the number of "slaves" increases above 1. There are no error indications via SPI. SPI is a "de-facto" standard...meaning it can vary in implementation...your mileage may vary depending on how a IC supplier defined "their" SPI implementation. I generally consider the lack of a true standard for SPI to be a "con".
I2C is also a two pin interface and is an actual "standard" developed by Phillips (now NXP.) As a standard, it is well-defined in how it operates, how errors are raised, and is simple to implement. It has an addressing scheme, can send commands, and can support 0 or more data frames in a transaction. CRC (optional) and higher data rates can be supported (up to 5Mbits.) It does have cons, namely bus capacitance can limit actual data rates (rise/fall time) but generally you can design around this "problem".
In their most basic forms, all of these busses are "ground referenced"...and can suffer from system induced noise. Obviously, lower rail voltages can make this even more of issue. Again careful design practice can mitigate many of the problems some people report to be the bain of their existence.
For the point-to-point system initially asked by the poster, if a master-slave arrangement is required, a SPI or I2C interface may be appropriate (data rate dependent.) If a master-master relationship is required, I2C or UART may be required.
For ease of implementation from a software point of view, I'd rank these communication methods in the following order:
I2C, if you need faster data rates than I2C can handle, then SPI
SPI, if you need multi-master, then I2C or UART
UART as a last resort...has a lot more software overhead to manage the communications channel

I would use UART or CAN or ETH or any protocol that is asynchronous.
If you use a synchronous protocol, the master must always "ask" the slave if it has data and generate unwanted traffic.

Related

SPI slave within boundary scan on STM32F4

I'd like to test a SPI slave on my STM32F4 via JTAG boundary scanning methods (best would be using OpenOCD, instead of other special tool).
Does somebody know details and typical pitfalls of such thing?
What I found was this site, whereas this neatly explains boundary scanning.
I am thankful for any hint on that topic.
As the linked site points out, testing your µC's output on the SPI pins through boundary scan will suffer from very low speed (because you have to feed the corresponding bit-banging commands through the boundary-scan protocol, which is far from efficient).
Using the STM32F4 controller, I therefore suggest you to keep the CPU in debug (break), and to set up the GPIOs and SPI through JTAG (as if the firmware were doing this from inside). Then, you are free to put entire data bytes/words into the TX register and poll the SPI status and RX register. This is one or two levels above the (plain) boundary scan method, but it will be quite easy to implement.
(Only) if you want to take this idea even some steps further, you can use JTAG first to switch the clock settings to higher speed or to add DMA (and write larger amounts of data to RAM before triggering the SPI transfer).

Sampling a high speed serial bit stream with MCU

I'm currently working on an application where an MCU is receiving data from a hardware chip in the form of an asynchronous serial bit transmission at 2Mbps. This data has no encoding and no protocol aside from a start sequence, after which it is raw binary data.
The current approach for recovery is using the SPI module in 3-pin mode to oversample the stream 4x at 8MHz, allowing for recovery of the asynchronous data. While seemingly effective thus far with a simulated testbench, this method is rather complicated as an internal clock needs to be routed to the SPI CLK as the device is run in slave mode in order for DMA to recover the transmitted data while the processor executes another task.
Would it be possible to use any other peripherals efficiently for this task aside from SPI? Faking a communication protocol to recover a serial bit stream seems a bit roundabout, but I am not sure how to utilize UART or I2C without doing the same, and those might not even be possible to use as the protocol bits are not present in the stream. I also want to avoid using an ADC in the interest of power, along with the fact that the data is already digital so it seems unnecessary.

Chip to chip communication protocol over SPI

I'm trying to design an efficient communication protocol between a micro-controller on one side and an ARM processor on a multi-core TI chip on the other side through SPI.
The requirements for the needed protocol:
1 - Multi-session with queuing support, as I have multiple sending/receiving threads, so it will be more than one application using this communication protocol and I need the protocol to handle queuing these requests (I will keep holding the buffer if the transmission is queue but I just need the protocol to manage scheduling the queues).
2 - Works over SPI as an underlying protocol.
3 - Simple error checking.
In this thread: "Simple serial point-to-point communication protocol", PPP was a recommended option, however I see PPP does only part of the job.
I also found Light weight IP (LwIP) project featuring PPP over serial (which I assume that I can use it over SPI), so I thought about the possibility of utilizing any of the upper layers protocols like TCP/UDP to do the rest of the required jobs. Fortunately, I found TI including LwIP as part of their ethernet SW in the starterware package, which I assume to ease porting at least on the TI chip side.
So, my questions are:
1 - Is it valid to use LwIP for this communication scheme? Won't this introduce much overhead due to IP headers which are not necessary for a point to point (on the chip level) communication and kill the throughput?
2 - Will the TCP or any similar protocol residing in LwIP handle the queuing of transmission requests, for example if I request transmission through a socket while the communication channel is busy transmitting/receiving request for another socket (session) of another thread, will this be managed by the protocol stack? If so, which protocol layer manages it?
3 - Is their a more efficient protocol stack than LwIP, that meets the above requirements?
Update 1: More points to consider
1 - SPI is the only available option, I use it with available GPIOs to indicate to the master when the slave has data to send.
2 - The current implemented (non-standard) protocol uses DMA with SPI, and a message format of《STX_MsgID_length_payload_ETX》with a fixed message fragments length, however the main drawback of the current scheme is that the master waits for a response on the message (not fragment) before sending another one, which kills the throughput and does not utilise the full duplex nature of SPI.
3- An improvement to this point was to use a kind of mailbox for receiving fragments, so a long message can be interrupted by a higher priority one so that fragments of a single message can arrive non sequentially, but the problem is that this design lead to complicating things especially that I don't have much available resources for many buffers to use the mailbox approach on the controller (master) side. So I thought that it's like I'm re-inventing the wheel by designing a protocol stack for a simple point to point link which may not be efficient.
4- What kind of higher level protocols can be normally used above SPI to establish multiple sessions and solve the queuing/scheduling of messages?
Update 2: Another useful thread "A good serial communications protocol/stack for embedded devices?"
Update 3: I had a look at Modbus protocol, it seems to specify the application layer then directly the data link layer for serial line communication, which sounds to skip the unnecessary overhead of network oriented protocols layers.
Do you think this will be a better option than LwIP for the intended purpose? Also, is there a widely used open source implementation like LwIP but for Modbus?
I think that perhaps you are expecting too much of the humble SPI.
An SPI link is little more a pair of shift registers one in each node. The master selects a single node to connect to its SPI shift register. As it shifts in its data, the slave simultaneously shifts data out. Data is not exchanged unless the master explicitly clocks the data out. Efficient protocols on SPI involve the slave having something useful to output while the master inputs. This may be difficult to arrange, so you usually need a means of indicating null data.
PPP is useful when establishing a connection between two arbitrary endpoints, when the endpoints are fixed and known a priori, PPP would serve no purpose other than to complicate things unnecessarily.
SPI is not a very sophisticated nor flexible interface and probably unsuited to heavyweight general purpose protocols such as TCP/IP. Since "addressing" on SPI is performed by physical chip-select, the addressing inherent in such protocols is meaningless.
Flow control is also a problem with SPI. The master has no way of determining that the slave has copied the data from SPI the shift register before pushing more data. If your slave SPI supports DMA you would be wise to use it.
Either way I suggest that you develop something specific to your purpose. Since SPI is not a network as such, you only need a means to address threads on the selected node. This could be as simple as STX<thread ID><length><payload>ETX.
Added 27 September 2013 in response to comments
Generally SPI as its names suggests is used to connect to peripheral devices, and in that context the protocol is defined by the peripheral. EEPROMS for example typically use a common or at least compatible command interface across vendors, and SD/MMC card SPI interface uses a standardised command test and protocol.
Between two microcontrollers, I would imagine that most implementations are proprietary and application specific. Open protocols are designed for generic interoperability and to achieve that might impose significant unnecessary overhead for a closed system, unless perhaps the nodes were running a system that already had a network stack built in.
I would suggest that if you do want to use a generic network stack that you should abstract the SPI with device drivers at each end that give the SPI a standard I/O stream interface (open(), close(), read(), write() etc.), then you can use the higher-level PPP and TCP/IP protocols (although PPP can probably be avoided since the connection is permanent). However that would only be attractive if both nodes already supported these protocols (running Linux for example), otherwise it will be significant effort and code for little benefit, and would certainly not be "efficient".
I assume you dont really want or have room for a full ip (lwip) stack on the microcontroller? This just sounds like a lot of overkill. Why not just roll your own simple packet structure to move the data items you need to move. Depending on how spi is supported on both sides you may or may not be able to use it to define the frame for your data, if not a simple start pattern, length and a trailing checksum and maybe tail pattern would suffice for finding packet boundaries in the stream (no different than a serial/uart solution). You can even use the PPP solution for that with a start pattern and I think end pattern with the payload using a two byte pattern whenever the start pattern happens to show up in the data. I dont remember all the details now.
Whatever your frame is then add a packet type and your handshakes, or if the data is going to just be microcontroller to arm then you dont even need to do that.
To get back to your direct question. Yes, I think that an ip stack (lwip or other) will introduce a lot of overhead. both bandwidth and more important the amount of code needed to support that stack will chew up rom/ram on both sides. If you ultimately need to present this data in an ip fashion (a website hosted by the embedded system) then somewhere in the path you need an ip stack, etc.
I cant imagine that lwip manages your queues for you. I assume you would need to do that yourself. the various queues might want to talk to a single driver that deals with the single spi bus (assuming there is a single spi bus with multiple chip selects). It also depends on how you are using the spi interface, if you are allowing the arm to talk to multiple microcontrollers and the packets of data are broken up into a little bit from this controller a little from that controller so that nobody has to wait to long before they get a few more bytes of data. Or will a complete frame have to move from one microcontroller before moving onto the next gpio interrupt to pull that guys data? The long and short of it is I would assume you have to manage the shared resource just like you would in any other situation where you have multiple users of a shared resource (rtos, full blown operating system, etc). I dont remember lwip that well at all but with a full blown berkeley sockets application interface the user could write separate applications where each application only cared about one TCP or UDP port and the libraries and drivers managed separating those packets out to each application as well as all of the rules for the IP stack.
If you are not already doing experiments with moving data over the spi interface(s) I would start with simple experiments first just to get the feel for how well it is or isnt going to work, the sizes of transfers you can do reliably per spi transction, etc. Your solution may naturally just fall out of those experiments.

Why is not USB interrupt-driven?

What's the rationale behind making USB a polling mechanism rather than interrupt-driven? The answers I can come up with some reasoning are:
Leave control of processing efficiency and granularity to OS, rather than the device itself.
Prevent "interrupt storms" by faulty devices.
Some explanations on the net that I found say that it's mostly because of the nature of USB devices. They are mostly microcontroller-based systems which cannot queue larger transfers therefore require short interrupt intervals and such short interrupt intervals may not be the most efficient. Is that true?
Could there be other reasons?
The overarching premise of the development of USB was, "cheap chips". This was done, through the use of polling, which reduces the need for a higher arbitration protocol.
Firewire, which did allow for interrupts from the devices and even DMA, was much more expensive. So USB won in the low-cost field, and firewire in low-latency/low-overhead/... field. Due to history USB more or less won.
What's the rationale behind making USB a polling mechanism rather than interrupt-driven?
This seems to be anti-USB FUD (as in Fear-Uncertainy-Doubt).
The reason is that this simplifies things on the harware level quite a bit - no more collisions for example. USB is half-duplex to reducex the amount of wires in the cable, so only one can talk anyway.
While USB uses polling on the wire, once you use it in software you will notice that you have interrupts in USB. The only issue is a slight increase in latency - neglible in most use cases. Since the polling is usually realized in hardware IIRC, software only gets notified if there is new data.
On the software level, there are so-called "interrupt endpoints" - and guess what, every HID device uses them: Mice, Keyboard and Josticks are HID.
There are three ways to avoid data transfer collisions on a bus:
Have a somewhat complex bus management protocol. Such a protocol has to be rather sophisticated, as when too simple, it will make the bus rather slow (see Token Ring, which is rather simple, yet inefficient). However, having a sophisticated protocol makes all components expensive as all of them require management logic and need to understand of how the bus actually works (see Firewire).
Don't avoid them at all, allow them but detect and handle them. This is also somewhat complex and the bus cannot guarantee any speed or latency as if there are constantly collisions, throughput will fall and latency will raise (see Ethernet without switches, see WiFi).
Have one bus master that controls who can use the bus at which time and for how long. This is inexpensive, as only the master must be sophisticated and the master can give any possible guarantee regarding speed or latency.
And (3) is how USB works and not even the master USB chips need to be sophisticated as the master is usually a computer with a fast CPU and can perform all bus management in software.
USB device chips are dump as toast. They don't need to understand the nifty details of the bus. They just have to look for packets addressed to them which are either control packets, e.g. requesting meta data or selecting a configuration, data packets sent by the master, or poll requests from the master saying "if you have something to send, the bus is now yours".
To make sure the master polls them on time, they hand out a simple description table on request, that explains which endpoints they offer, how often those need to be polled, and how much data they will at most transfer when being polled. The master can use that information to build up a poll schedule that ensures that all devices are polled on time and get the bus for long enough to allow their maximum transfer size. Of course, this is not possible under all circumstances. If you connect too many devices that require very frequent polling and always want to send a lot of data, your system may refuse to add a new device with the error, that its poll requirements cannot be satisfied anymore. Yet that situation is rare in practice and USB is limited to 127 devices (hubs count as devices, so does the master itself).
Power management works in a similar way. Every device tells the master how much power it needs and the master ensures that the bus can still deliver that much, taking active hubs into account. If you connect another device and the bus cannot power it anymore, adding the device will fail with an error.
This allows a fairly complex, powerful, and fast bus system, yet with components that don't even need a real CPU. The simplest USB chips out there are just bridges to a serial data line (like an internal RS-232 bus or an I2C bus) and there is nothing really configurable and they cannot run software or have a firmware one could update. They just place incoming data packets into a buffer and then sent the buffer content bit for bit over the serial bus and they receive serial data in another buffer and return the buffer content when being polled. As for telling the master the configuration (including device and vendor IDs, as well as human readable strings), they simply send the content of a small external EPROM. Things cannot get much simpler than that yet such a chip is enough to build plenty of USB hardware already.

Why are GPIOs used?

I have been searching around [in vain] for some good links/sources to help understand GPIOs and why they are used in embedded systems. Can anyone please point me to some ?
In any useful system, the CPU has to have some way to interact with the outside world - be it lights or sounds presented to the user or electrical signals used to communicate with other parts of the system. A GPIO (general purpose input/output) pin lets you either get input for your program from outside the CPU or to provide output to the user.
Some uses for GPIOs as inputs:
detect button presses
receive interrupt requests from external devices
Some uses for GPIOs as outputs:
blink an LED
sound a buzzer
control power for external devices
A good case for a bidirectional GPIO or a set of GPIOs can be to "bit-bang" a protocol that your SoC doesn't provide natively. You could roll your own SPI or I2C interface, for example.
The reason you cannot find an answer is probably because if you know what an embedded system is and does, or indeed anything about digital electronic systems, then the answer is rather too obvious to write down! That is to say that if you get as far a s actually implementing a working embedded system, you should already know what they are.
GPIO pins are as a minimum, two state digital logic I/O. In most cases some or all of them may also be interrupt sources. These interrupts may have options for be rising, falling, dual edge, or level triggering.
On some targets GPIO pins may have configurable output circuitry to allow, for example, external pull-ups to be omitted, or to allow connection to devices that require open-collector outputs, and in some cases even to provide filtering of high frequency noise and glitches.
In most embedded systems, a processor will be ultimately responsible for sensing the state of various devices which translate external stimuli to digital-level logic voltages (e.g. when a button is pushed, a pin will go low; otherwise it will sit high), and controlling devices which translate logic-level voltages directly into action (e.g. when a pin is high, a light will go on; when low, it will go off). It used to be that processors did not have general-purpose I/O, but would instead have to use a shared bus communicate with devices that could process I/O requests and set or report the state of the external circuits. Although this approach was not entirely without advantages (one processor could monitor or control thousands of circuits on a shared bus) it was inconvenient in many real-world applications.
While it is possible for a processor to control any number of inputs and outputs using a four-wire SPI bus or even a two-wire I2C bus, in many cases the number of signals a processor will need to monitor or control is sufficiently small that it's easier to simply include the circuitry to monitor or control some signals directly on the chip itself. Although dedicated interfacing hardware will frequently have output-only or input-only pins (the person choosing the hardware interface chips will know how many signals need to be monitored, and how many need to be controlled), a particular family of processor may be used in some applications that require e.g. 4 inputs and 28 outputs, and other applications that require 28 inputs and 4 outputs. Instead of requiring that different parts be used in applications with different balances between inputs and outputs, it's simpler to just have one part with inputs that can be configured as inputs or outputs, as needed.
I think you have it backwards. GPIO is the default in electronics. It's a pin, a signal, that can be programmed. Everything is made up of these. For a processor, dedicated peripherals are a special case, they're extras for when you know you want a more limited function.
From a chip manufacturers perspective, you often don't know exactly what the user needs so you can't make the exact peripherals on your chip. You make generic ones instead. Many applications are so rare that there's no market for a specific chip. Only thing you can do is use GPIO or make specific hardware yourself. Also, all (unused or potentially unused) pins are worth turning into GPIO because that makes the part even more generic and reusable. Generic and reusable is very nearly the whole point of programmable chips, otherwise you would just make ASICs.
Some particularly suitable applications:
Reset parts (chips) in a system
Interface to switches, keypads, lights (all they have is one pin/signal!)
Controlling loads with relays or semicondctor switches (on-off)
Solenoid, motor, heater, valve...
Get interrupts from single signals
Thermostats, limit switches, level detectors, alarm devices...
BTW, the Parallax Propeller has practically nothing but GPIO pins. Peripherals are made in software. It works very well for many uses.