I'm attempting to speed up a rather sluggish bootloader. Currently I'm sending data on a single USB HID output endpoint, and as it's a low-speed device I'm apparently limited to one 8-byte packet per 10 ms interval for a whopping 800 bytes/second.
Is it possible to increase the reporting frequency somehow? Or to use multiple output endpoints in a single interface or as part of a composite device? Or perhaps to abuse the control endpoint to send additional data?
Better compression is always an alternative I suppose, but it's an area of diminishing returns, and redesigning the hardware to allow full-speed USB isn't really an option.
For the record I'd be happy with a Windows-only solution.
Or perhaps to abuse the control endpoint to send additional data?
You can use "Vendor specific requests" for that. The TI TUSB3410 Chip works that way AFAIK. Many USB stacks have the hooks for them already in place.
This requires a driver or libusb on the host side, however.
I was able to speed up the upload by orders of magnitude by using SET_REPORT requests on the control endpoint, instead of declaring a separate interrupt out endpoint. That way you get all of the bandwidth available for control transfers.
Also using a larger report split into multiple segments helped reduce the number of SETUP packets needed.
Who says you are limited to an 8-byte packet per 10ms? I don't know the exact numbers off the top of my head, but I know you can send larger packets than that. I did an HID device and was using 64-byte packets. I think I could go larger, but that limit is probably hardware-specific. What hardware are you using?
Also, have you consulted USB in a NutShell?
The actual limit is 8 bytes every 10ms for low-speed devices, and 64 bytes every 1ms for high-speed devices, per interrupt-based endpoint.
So it seems that the first thing to try is switching to high-speed mode, if the hardware supports it. The next thing on the list is using multiple endpoints. If you really want to get the highest possible transfer rate, the HID class is a bad choice.
Related
I'm trying to interface a board level USB camera with a STM32 family microcontroller and send the image file to a central computer using CANbus. Just want to know if this is possible/ has been done before and how involved a task it would be.
I worked at a company where we sent live (low-resolution infra-red) video streams over CAN, but towards the end of my time there they shifted towards ethernet.
So it is possible, but certainly not what it is best suited for. The main advantages of CAN are that it is a multi-point, multi-master bus with built in arbitration. It is meant for short packets, typically 8 bytes (CAN FD allows you to increase that).
If your camera is USB, why not just get a USB repeater cable or USB-over-ethernet gateway?
If there is already a CAN network in place that you are piggy-backing onto then you need to consider what impact you will have on the existing traffic.
If you are starting from scratch then of course CAN will work but it would be an odd choice.
Depending on if its CAN or CANFD (Affects the maximum bulk transfer packet size) you have higher level protocol options to packetise your images and send them over canbus like any other block of data.
For just reguarlar CAN your after part of the standard called J1939.21 Data Link Layer, there are public versions of this floating around online, however due to the agreement when purchasing the standard, I am not able to share the specifics from what I have.
Its on pages 27-28 of the 2001 revision.
I am working with hardware which support USB3.0 interface. According to USB standard spec it support 5Gbps data rates.
Now I want to know what is best way of effective utilization of USB bandwidth.
What's the recommended maximum packet size and frame sizes?
In my scenario I need to send huge amount of data (60GB) as fast as possible. How do form these data into packets.
As per USB spec we can send 1frame/mille second, so to get 5Gbps data rate what is the number of packets/frame required?
Give some brief explanation on USB3.0 protocol.
Thank you.
I want to interface two microcontrollers with a UART interface and I search a protocol to exchange data between them.
In practice, I want to exchange data periodically (ie: sensors reading) and also data on event (GPIO state). I have around 100-200 bytes to exchange every 100 milli second.
Does anybody know a protocol or library to achieve this kind of task ?
For now, I see protobuf and nano protobuff ? Is there something else ?
It would be nice if I could add a software layer over the UART and use "virtual data stream" like if it was a TCP/IP connection to N ports.
Any idea ?
Thanks
I think the most straight forward way is to roll your own.
You'll find RS232 drivers in the manufacturers chip support library.
RS232 is a stream oriented transport, that means you will need to encode your messages into some frameing structure when you send them and detect frame boundaries on the receiver side. A clever and easy to use mechanism to do this is "Consistent Overhead Byte Stuffing".
https://en.wikipedia.org/wiki/Consistent_Overhead_Byte_Stuffing
This simple algorithm turns zeros in your messages into some other value, so the zero-byte can be used to detect start and end of frame. If a byte gets corrupted on the way you can even resynchronize to the stream and keep going.
The code on Wikipedia should be easy enough even for the smallest micro-processors.
Afterwards you can define your message format. You can probably keep it very simple and directly send your data-structures as is.
Suggestion for a simple message format:
Byte-ID Meaning
---------------------------------
0 Destination port number
1 message type (define your own)
2 to n message data
If you want to send variable length messages you can either send out a length byte or derive the length from the output of the Constant Overhead Byte Stuffing framing.
By the way, UART/RS232 is nice and easy to work with, but you may also want to take a look at SPI. The SPI interface is more suitable to exchange data between two micro-controllers. It is usually faster than RS232 and more robust because it has a dedicated clock-line.
How about this: eRPC https://community.nxp.com/docs/DOC-334083
The eRPC (Embedded Remote Procedure Call) is a Remote Procedure Call (RPC) system created by NXP. An RPC is a mechanism used to invoke a software routine on a remote system using a simple local function call. The remote system may be any CPU connected by an arbitrary communications channel: a server across a network, another CPU core in a multicore system, and so on. To the client, it is just like calling a function in a library built into the application. The only difference is any latency or unreliability introduced by the communications channel.
I have use it in a two processor embedded system, a cortext-A9 CPU with a Context-M4 MCU, which communicate each other with SPI/GPIO.
Erpc can run over UART, SPI, rpmsg and network(tcp). even when using serial or SPI as transport tunnel, it can do bidirectional
calls and with very minimal footprint.
Simple serial point-to-point communication protocol
http://www.zipplet.co.uk/index.php/content/openformats_mise
It depends if you need master/slave implementation, noise protection, point-point or multi-point (and in this case collision detection), etc
but, as our colleague said, I would go with the simplest solution that fits the problem, following the KISS principle http://en.wikipedia.org/wiki/KISS_principle
Just add some header information like ID and length, if necessary CRC checking, and be happy :)
Try Microcontroller Interconnect Network (MIN) 1.0:
https://github.com/min-protocol/min
It has framing using byte-stuffing to keep receiver sync, 16-bit Fletcher's algorithm for checksum, an identifier for use by the application and a variable payload of up to 15 bytes.
There's embedded C code there plus also a Python implementation to make it easier to talk to a PC.
As the first answer starts, the simplest result is to roll your own. Define your header (the "format" above) as needed, perhaps including status information so each processor knows that the other is working properly. I have had success with a protocol that includes
2 byte ascii prefix and suffix such as "[" and "]" so that a
protocol analyzer can show you message boundaries.
The number of bytes.
The command ID (parsed to indicate what command handler to use.
Command arguments (I used 3 32 bit words).
A CRC or checksum to verify transfer integrity
The parser then recognizes the [* as the start of the message, and dispatches the body to the command handler for the particular command ID with the associated arguments as long as the checksum matches.
I'm doing a USB device is to control stepper motors. I've done this before using a parallel port. because these ports do not exist in current motherboards, I decided to implement a USB communication between my device and the PC (host).
To achieve My objective, I endowed the freescale microcontroller the device with that has a USB module 12Mbps.
My USB device must receive 4 bytes (one for each motor driver) at a given time, because every byte is a step that should move the engine.
In the PC (Host) an application of user processes a text file with information and make the trajectory coordinates sending bytes at a certain rate for each motor (time is trivial to achieve the acceleration and speed of the motors) .
Using the parallel port was an easy the task because each byte is sent sequentially to a time determined by the user app.
doing a little research about full speed USB protocol understood that the frame is sent every 1ms.
then you can send 4 byte or many more every 1ms but I can not manage time like I did with the parallel port.
My microcontroller can send up to 64 bytes per frame (Based on transfer papers type Control, Bulk, Int, Iso ..).
question 1:
I want to know in what way I can send 4-byte packets faster than every 1 ms?
question 2:
What type of transfer can advise me for these type of devices?
Thanks.
Like Ricardo said, USB-serial will suffice.
As for the type of transfer, try implementing a CDC stack and use your SCI receiver to listen for PC commands. That will give you a receive buffer which will meet your needs.
Initialize your SCI (baud, etc)
Enable receiver and interrupt
On data receive, move it to your 4-byte command buffer
Clear receive buffer, wait for more
When you have all 4 bytes, fire off the steppers! Four bytes should take µs.
Check with Freescale to see if your processor is supported.
http://cache.freescale.com/files/microcontrollers/doc/support_info/USB_STACK_RELEASE_NOTES_V4.1.1.pdf?fpsp=1
There might even be some sample code to get you started.
-Cheers
I am achieving the same goal (driving/control CNC machines) like this:
the USB device is just synchronous I/O parallel port. Using continuous bulk transfer one pipe as input and one as output. This way I was able to achieve synchronous 64bit parallel communication with ~70KHz sample rate. It uses traffic around (i)4.27+(o)4.27 MBit/s that is limit for mine MCU and code. Bigger speeds cause jitter on the output due to USB events interrupts.
How to do it (on MCU side)
I have 2 FIFO's one for ingoing and one for outgoing data. I have timer interrupt occurring with sample rate frequency. In it I read the inputs and feed it to the first FIFO and read data from the other FIFO and send it to the outputs.
On top of that the USB task is called (inside the same interrupt) checking FIFO for sending to and incoming data from USB handling the transfer itself
I choose ATMEL AT32UC3A chips for this task. After a long and pain full research I decided these MCU's because they have enough memory for both FIFO's and program so no need for additional IC. It has FPGA package which can be used (BGA is not an option). It has HS USB (most USB MCU's have only FS like yours). It runs at 66MHz. It supports many interesting features (did interesting projects with it in the past) and of coarse I have experience with ATMEL MCU's from past
So if you want to achieve something similar then
start with bulk transfer (PC -> USB -> MCU -> output)
add FIFO if needed
do not know the sample rate you need. The old LPT's could handle from 80-196KHz depend on the manufactor. The modern ones are much much slower (which is silly and sad).
measure the critical sample rate
you need oscilloscope or very good hearing for this. The output data must be synchronous so no holes in it, no jitter, etc...
if any of these are present you have to lower the sample rate. Mine setup could handle even 1MHz sample rate but the USB jitter was present (sometimes USB event froze the sending for longer that one sample...) so I achieve only 70KHz of stable output.
if needed also inputs then add them
but only if the output is working as it should. Do not forget to lower the sample rate after this too ... Use separate bulk pipes and FIFOs for input and output.
What's the rationale behind making USB a polling mechanism rather than interrupt-driven? The answers I can come up with some reasoning are:
Leave control of processing efficiency and granularity to OS, rather than the device itself.
Prevent "interrupt storms" by faulty devices.
Some explanations on the net that I found say that it's mostly because of the nature of USB devices. They are mostly microcontroller-based systems which cannot queue larger transfers therefore require short interrupt intervals and such short interrupt intervals may not be the most efficient. Is that true?
Could there be other reasons?
The overarching premise of the development of USB was, "cheap chips". This was done, through the use of polling, which reduces the need for a higher arbitration protocol.
Firewire, which did allow for interrupts from the devices and even DMA, was much more expensive. So USB won in the low-cost field, and firewire in low-latency/low-overhead/... field. Due to history USB more or less won.
What's the rationale behind making USB a polling mechanism rather than interrupt-driven?
This seems to be anti-USB FUD (as in Fear-Uncertainy-Doubt).
The reason is that this simplifies things on the harware level quite a bit - no more collisions for example. USB is half-duplex to reducex the amount of wires in the cable, so only one can talk anyway.
While USB uses polling on the wire, once you use it in software you will notice that you have interrupts in USB. The only issue is a slight increase in latency - neglible in most use cases. Since the polling is usually realized in hardware IIRC, software only gets notified if there is new data.
On the software level, there are so-called "interrupt endpoints" - and guess what, every HID device uses them: Mice, Keyboard and Josticks are HID.
There are three ways to avoid data transfer collisions on a bus:
Have a somewhat complex bus management protocol. Such a protocol has to be rather sophisticated, as when too simple, it will make the bus rather slow (see Token Ring, which is rather simple, yet inefficient). However, having a sophisticated protocol makes all components expensive as all of them require management logic and need to understand of how the bus actually works (see Firewire).
Don't avoid them at all, allow them but detect and handle them. This is also somewhat complex and the bus cannot guarantee any speed or latency as if there are constantly collisions, throughput will fall and latency will raise (see Ethernet without switches, see WiFi).
Have one bus master that controls who can use the bus at which time and for how long. This is inexpensive, as only the master must be sophisticated and the master can give any possible guarantee regarding speed or latency.
And (3) is how USB works and not even the master USB chips need to be sophisticated as the master is usually a computer with a fast CPU and can perform all bus management in software.
USB device chips are dump as toast. They don't need to understand the nifty details of the bus. They just have to look for packets addressed to them which are either control packets, e.g. requesting meta data or selecting a configuration, data packets sent by the master, or poll requests from the master saying "if you have something to send, the bus is now yours".
To make sure the master polls them on time, they hand out a simple description table on request, that explains which endpoints they offer, how often those need to be polled, and how much data they will at most transfer when being polled. The master can use that information to build up a poll schedule that ensures that all devices are polled on time and get the bus for long enough to allow their maximum transfer size. Of course, this is not possible under all circumstances. If you connect too many devices that require very frequent polling and always want to send a lot of data, your system may refuse to add a new device with the error, that its poll requirements cannot be satisfied anymore. Yet that situation is rare in practice and USB is limited to 127 devices (hubs count as devices, so does the master itself).
Power management works in a similar way. Every device tells the master how much power it needs and the master ensures that the bus can still deliver that much, taking active hubs into account. If you connect another device and the bus cannot power it anymore, adding the device will fail with an error.
This allows a fairly complex, powerful, and fast bus system, yet with components that don't even need a real CPU. The simplest USB chips out there are just bridges to a serial data line (like an internal RS-232 bus or an I2C bus) and there is nothing really configurable and they cannot run software or have a firmware one could update. They just place incoming data packets into a buffer and then sent the buffer content bit for bit over the serial bus and they receive serial data in another buffer and return the buffer content when being polled. As for telling the master the configuration (including device and vendor IDs, as well as human readable strings), they simply send the content of a small external EPROM. Things cannot get much simpler than that yet such a chip is enough to build plenty of USB hardware already.