RX vs messaging queues like rabbitmq or zeromq? [closed] - rabbitmq

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
Improve this question
I'm quite new to these high level concurrency paradigms, and I've started using the scala RX bindings. So I'm trying to understand how RX differs from messaging queues like RabbitMQ or ZeroMQ?
They both appear to use the subscribe/publish paradigm. Somewhere I saw a tweet about RX being run atop RabbitMQ.
Could someone explain the differences between RX and messaging queues? Why would I choose one over the other? Can one be substituted for the other, or are they mutually exclusive? In what areas do they overlap?

It's worth clicking the learn more link on the [system.reactive] tag, we put a fair bit of info there!
From the intro there you can see that Rx is not a message queueing technology:
The Reactive Extensions (Rx) is a library for composing asynchronous and event-based programs using observable sequences and LINQ-style query operators. System.Reactive is the root namespace used through the library. Using Rx, developers represent asychronous data streams using LINQ operators, and parameterize the concurrency in the asynchronous data streams using Schedulers. Simply put, Rx = Observables + LINQ + Schedulers.
So, Rx and message queueing are really distinct technologies that can complement each other quite well. A classic example is a stock price ticker service - this might be delivered via message queuing, but then transformed by Rx to group, aggregate and filter prices.
You can go further: much as Entity Framework turns IQueryable<T> queries into SQL run directly on the database, you can create providers that turn Rx IQbservable<T> queries into native queries - for example a Where filter might leverage the native filtering capability that exists in many message queueing technologies to apply filters directly. This is quite a lot of work though, and hard.
It's much easier, and not uncommon to see message queue messages fed into an Rx Subject so that incoming messages from a queue are transformed into an Rx stream for easy consumption in the client. Rx is also used extensively in GUIs to work with client side events like button presses and text box changes to make traditionally difficult scenarios like drag-n-drop and text auto-completion via asynchronous server queries dramatically easier. There's a good hands on lab covering the later scenario here. It was written against an early release of Rx, but still is very relevant.
I recommend looking at this video presentation by Bart de Smet for a fantastic intro: Curing Your Event Processing Blues with Reactive Extensions (Rx) - including the classic Rx demo that writes a query over a live feed from Kinect to interpret hand-waving!

Could someone explain the differences between RX and these other messaging queues?
Rx is simply an abstraction over Events (any kind of event!). Receiving a message from a distributed queue is an Event, and often, ZeroMQ / RabbitMQ solutions often have to use and combine different Events quite a bit, which Rx is very good at.
So often, Rx makes writing ZeroMQ / RabbitMQ apps much easier than it would be otherwise :)

Related

What pub/sub protocols have subscriber based data propagation?

I'm trying to evaluate different pub/sub messaging protocols on their ability to horizontally scale without producing unnecessary cross chatter.
My architecture will have NodeJS servers with web socket clients connected. I plan on using a consistent hashing based router to direct clients to servers based off of the topics they're interested in subscribing to. This would mean that for a given topic, only a subset of servers will have clients subscribing to that topic. Messages will then be published to a pub/sub broker, which would be responsible for fanning out that data to servers that have subscribers.
The situation I want to avoid is one in which every broker receives every request, and the network becomes saturated. This is a clear issue with scaling Redis Pub/Sub. Adding servers shouldn't create an n squares' problem.
The number of clients on the pub/sub protocol would be the number of servers. Ideally, each server would be able to have a local broker to fan out data efficiently to multiple NodeJS processes, as to avoid unnecessary network bandwidth. In most cases, for a given topic, all subscribers would be on that same server.
What pub/sub protocols offer this sort of topic based data propagation?
The protocols I'm evaluating are: MQTT, RabbitMQ, ZMQ, nanomsg. This isn't inclusive, and SAAS options are acceptable.
The quality assurance constraints are easy. At most once, or at least once are both adequate. Acknowledgment isn't important. Event order isn't important. We're looking for fire and forget, with an emphasis on horizontal scalability.
First, let me address a risk of mis-understanding
In many cases, similar words do not mean the same thing. The more the abbreviations.
Having that said, let me review a PUB/SUB terminus technicus.
Martin SUSTRIK's and Pieter HINTJENS' team in imatix & 250bpm have developed a few smart messaging frameworks over the past decades, so these guys know a lot about the architecture benefits, constraints and implementation compromises.
That said helps me to state that these fathers, who introduced grounds of the modern messaging, do not consider PUB/SUB to be a protocol.
It is, at least in nanomsg & ZeroMQ, rather a smart Distributed Scaleability-focused Formal Communication Pattern -- i.e. a behaviour emulated by all involved parties.
Both ZeroMQ and nanomsg are broker-less.
In this sense, asking "what protocols" does not have solid grounds.
Let's start from the "data propagation" side
In initial ZeroMQ implementations PUB had no other choice but distribute all messages to all SUB-s that were in a connected-state. Pieter HINTJENS explained numerous times this decision that actual subscription-based filtering was performed on SUB-side ( distributed in 1:all-connected manner ).
It came much later to implement PUB-side subscription based filtering and you may check revisions history to find since which version this started to avoid 1:all-connected broadcasts of data.
Similarly, you may check the nanomsg remarks from Martin SUSTRIK, who gave many indepth posts on performance improvements designed in his fabulous nanomsg project.
Scaleability as a priority No.1
If Scaleability is the focus of your post and if it were a serious Project, my question number one would be what is the quantitative metric for comparing feasible candidates according to such Project goal - i.e. what is the feasibility translated into a utility function to score candidates to compare all the parallel attributes your Project is interested in?

Chip to chip communication protocol over SPI

I'm trying to design an efficient communication protocol between a micro-controller on one side and an ARM processor on a multi-core TI chip on the other side through SPI.
The requirements for the needed protocol:
1 - Multi-session with queuing support, as I have multiple sending/receiving threads, so it will be more than one application using this communication protocol and I need the protocol to handle queuing these requests (I will keep holding the buffer if the transmission is queue but I just need the protocol to manage scheduling the queues).
2 - Works over SPI as an underlying protocol.
3 - Simple error checking.
In this thread: "Simple serial point-to-point communication protocol", PPP was a recommended option, however I see PPP does only part of the job.
I also found Light weight IP (LwIP) project featuring PPP over serial (which I assume that I can use it over SPI), so I thought about the possibility of utilizing any of the upper layers protocols like TCP/UDP to do the rest of the required jobs. Fortunately, I found TI including LwIP as part of their ethernet SW in the starterware package, which I assume to ease porting at least on the TI chip side.
So, my questions are:
1 - Is it valid to use LwIP for this communication scheme? Won't this introduce much overhead due to IP headers which are not necessary for a point to point (on the chip level) communication and kill the throughput?
2 - Will the TCP or any similar protocol residing in LwIP handle the queuing of transmission requests, for example if I request transmission through a socket while the communication channel is busy transmitting/receiving request for another socket (session) of another thread, will this be managed by the protocol stack? If so, which protocol layer manages it?
3 - Is their a more efficient protocol stack than LwIP, that meets the above requirements?
Update 1: More points to consider
1 - SPI is the only available option, I use it with available GPIOs to indicate to the master when the slave has data to send.
2 - The current implemented (non-standard) protocol uses DMA with SPI, and a message format of《STX_MsgID_length_payload_ETX》with a fixed message fragments length, however the main drawback of the current scheme is that the master waits for a response on the message (not fragment) before sending another one, which kills the throughput and does not utilise the full duplex nature of SPI.
3- An improvement to this point was to use a kind of mailbox for receiving fragments, so a long message can be interrupted by a higher priority one so that fragments of a single message can arrive non sequentially, but the problem is that this design lead to complicating things especially that I don't have much available resources for many buffers to use the mailbox approach on the controller (master) side. So I thought that it's like I'm re-inventing the wheel by designing a protocol stack for a simple point to point link which may not be efficient.
4- What kind of higher level protocols can be normally used above SPI to establish multiple sessions and solve the queuing/scheduling of messages?
Update 2: Another useful thread "A good serial communications protocol/stack for embedded devices?"
Update 3: I had a look at Modbus protocol, it seems to specify the application layer then directly the data link layer for serial line communication, which sounds to skip the unnecessary overhead of network oriented protocols layers.
Do you think this will be a better option than LwIP for the intended purpose? Also, is there a widely used open source implementation like LwIP but for Modbus?
I think that perhaps you are expecting too much of the humble SPI.
An SPI link is little more a pair of shift registers one in each node. The master selects a single node to connect to its SPI shift register. As it shifts in its data, the slave simultaneously shifts data out. Data is not exchanged unless the master explicitly clocks the data out. Efficient protocols on SPI involve the slave having something useful to output while the master inputs. This may be difficult to arrange, so you usually need a means of indicating null data.
PPP is useful when establishing a connection between two arbitrary endpoints, when the endpoints are fixed and known a priori, PPP would serve no purpose other than to complicate things unnecessarily.
SPI is not a very sophisticated nor flexible interface and probably unsuited to heavyweight general purpose protocols such as TCP/IP. Since "addressing" on SPI is performed by physical chip-select, the addressing inherent in such protocols is meaningless.
Flow control is also a problem with SPI. The master has no way of determining that the slave has copied the data from SPI the shift register before pushing more data. If your slave SPI supports DMA you would be wise to use it.
Either way I suggest that you develop something specific to your purpose. Since SPI is not a network as such, you only need a means to address threads on the selected node. This could be as simple as STX<thread ID><length><payload>ETX.
Added 27 September 2013 in response to comments
Generally SPI as its names suggests is used to connect to peripheral devices, and in that context the protocol is defined by the peripheral. EEPROMS for example typically use a common or at least compatible command interface across vendors, and SD/MMC card SPI interface uses a standardised command test and protocol.
Between two microcontrollers, I would imagine that most implementations are proprietary and application specific. Open protocols are designed for generic interoperability and to achieve that might impose significant unnecessary overhead for a closed system, unless perhaps the nodes were running a system that already had a network stack built in.
I would suggest that if you do want to use a generic network stack that you should abstract the SPI with device drivers at each end that give the SPI a standard I/O stream interface (open(), close(), read(), write() etc.), then you can use the higher-level PPP and TCP/IP protocols (although PPP can probably be avoided since the connection is permanent). However that would only be attractive if both nodes already supported these protocols (running Linux for example), otherwise it will be significant effort and code for little benefit, and would certainly not be "efficient".
I assume you dont really want or have room for a full ip (lwip) stack on the microcontroller? This just sounds like a lot of overkill. Why not just roll your own simple packet structure to move the data items you need to move. Depending on how spi is supported on both sides you may or may not be able to use it to define the frame for your data, if not a simple start pattern, length and a trailing checksum and maybe tail pattern would suffice for finding packet boundaries in the stream (no different than a serial/uart solution). You can even use the PPP solution for that with a start pattern and I think end pattern with the payload using a two byte pattern whenever the start pattern happens to show up in the data. I dont remember all the details now.
Whatever your frame is then add a packet type and your handshakes, or if the data is going to just be microcontroller to arm then you dont even need to do that.
To get back to your direct question. Yes, I think that an ip stack (lwip or other) will introduce a lot of overhead. both bandwidth and more important the amount of code needed to support that stack will chew up rom/ram on both sides. If you ultimately need to present this data in an ip fashion (a website hosted by the embedded system) then somewhere in the path you need an ip stack, etc.
I cant imagine that lwip manages your queues for you. I assume you would need to do that yourself. the various queues might want to talk to a single driver that deals with the single spi bus (assuming there is a single spi bus with multiple chip selects). It also depends on how you are using the spi interface, if you are allowing the arm to talk to multiple microcontrollers and the packets of data are broken up into a little bit from this controller a little from that controller so that nobody has to wait to long before they get a few more bytes of data. Or will a complete frame have to move from one microcontroller before moving onto the next gpio interrupt to pull that guys data? The long and short of it is I would assume you have to manage the shared resource just like you would in any other situation where you have multiple users of a shared resource (rtos, full blown operating system, etc). I dont remember lwip that well at all but with a full blown berkeley sockets application interface the user could write separate applications where each application only cared about one TCP or UDP port and the libraries and drivers managed separating those packets out to each application as well as all of the rules for the IP stack.
If you are not already doing experiments with moving data over the spi interface(s) I would start with simple experiments first just to get the feel for how well it is or isnt going to work, the sizes of transfers you can do reliably per spi transction, etc. Your solution may naturally just fall out of those experiments.

Microcontroller interfacing [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a book, tool, software library, tutorial or other off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
Improve this question
I'm new to the world of embedded programming and I'm looking for information about interfacing with a microcontroller using I2C, USB, UART, CAN, etc. Does anybody know any good links, books, tutorials, about this subject? Since I´m a real newb on this subject I prefer if it is as basic as possible.
Since you are already a desktop developer, you can probably jump in somewhere in the middle. Download the user manual for the controller you plan to use. Get the example code from the manufacturers etc. sites for one of the simpler peripherals - UART is good.
Get a development board and an Eclipse/Crossworks/whatever development system that supports your board and get something to work - flash LED, UART echo or the like. Don't try to use a multitasker and interrupt driver first off - just poll the UART with as few lines of code as possible. Why - because just getting a development setup to compile, link, download and run one line of code is a considerable exercise in itself, without any complications from complex code. You have your development setup and hardware to debug first before you can effectively write/debug any code for the controller. Just getting 'blinky' code that merely flashes an on-board LED to work is a huge step forwards:)
Most controllers have dedicated groups/blogs, either on the uC manufacturers site or linked from it - join them.
If you want to get into this efficiently, get a board and try to get it to do something - there is no better way. Once you can get a LED to flash or a UART to issue a string of chars, you're off to the races:)
Developing simple 'blinky' or UART-polling functions are not wasted - you can continue to use them later on when you have more complex code. Blinking a LED, (with the delay-loop replaced by an OS sleep), is such a good indicator that the code is sorta running that I've always kept it in on delivered systems. The UART poll is useful too - it will run without any interrupts and so you can call it from the data/prefetch/whatever abort vectors to issue the many 'CRITICAL ERROR - SYSTEM HALTED' messages that you will get during ongoing development:)
Wikipedia really is as good a place to start as any to learn the basics surrounding each of these mechanisms.
Once you've got a grasp, I'd recommend looking at an actual datasheet and user's guide for a microcontroller that has some of these features. For example, this 16-bit PIC has a dedicated UART, I2C and SPI bus. Looking at the relevant sections of the documentation, armed with your new knowledge of the underlying principles, you'll start to see how to design a system that uses them.
The next step would be to buy a development board for such a device and then, using example code (of which there is tons), code yourself up some datalinks. Incidentally, a UART is certainly the easiest to test, since all PCs can transmit using the RS-232 protocol from a terminal, so in this case you can code a loopback to transmit and receive characters with relative ease. With I2C and SPI however, I think you would need to buy a dedicated host dongle to allow you to transmit using the protocols, although I think Windows 8 might be introducing native support (but don't quote me on that).
I haven't implemented a datalink using CAN, so I can't comment specifically, although I just did a quick Google search and there's a PIC family that supports it, so I'm sure you could follow a similar approach. As for USB, I consider it a bit of a black art, so I'll leave someone else to answer that one - although I do know you can get a USB software stack for PICs, so again, a similar approach could probably be followed.
On a final note, I've only mentioned PICs as an example microcontroller. If you're not familiar with microcontrollers in general, you'll soon realise that all the major microcontroller families from companies like Microchip, TI and Atmel generally follow the same design, so I would have a browse and pick the family who's documentation you're most comfortable with (or least uncomfortable with, as suggested by Martin James!).
For the most part the data sheet or reference manual for the microcontroller you intend to use is what you need. Each vendor will implement these interfaces in different ways, so that is your only definitive source for low-level programming information.
I2C and UART are relatively simple with no standard higher-level protocol stack; that would be defined by you or the device you might be connection to. Your microcontroller vendor will almost ceratainly have example code and/or application notes.
CAN is more complex, and typically CAN networks use a higher-level application protocol of which there are several for different application domains such as CANopen, NMEA2000, DeviceNet, IEC J1939 and more. Most often a third-party library is the most cost effective method of implementing an application protocol stack, but if the network comprises of only devices you are implementing, then that complexity may not be necessary. Again your microcontroller vendor is likely to have examples.
USB is very strongly defined by the USB Implementers Forum, and protocol stacks are non-trivial. Most microcontroller vendors that have on-chip USB interfaces will provide sample code and application notes for at least device-class USB. USB Host stacks are more complex, and again the use of a third-party library (which are relatively expensive) is the quickest way to market. Fees you would otherwise be obliged to pay the USB IF may make writing your own host stack no less expensive. Even for a USB Device interface you will strictly need a USB Vendor ID at $2000 from USB IF. Some microcontroller and library vendors will allow you to use their Vendor ID for a defined subset of Product IDs at little or no cost, but you would probably have to be a significant customer in terms of volumes. For internal experimentation, you can probably get away without an official VID, but such a product could not be released commercially.

What is the best way to implement my program for Keil MCB1700 evaluation board?

I want to develop a program for MCB1700 evaluation board.
Client software of PC reads a picture from HDD.
Then it sends the picture to MCB1700 evaluation board through socket (Ethernet).
Server of MCB1700 receives pictures from PC through Socket-connection and displays it on LCD.
Also server must perform such tasks:
To save picture to USB-stick;
To read picture from USB-stick and send it to client through socket;
To send and receive information through CAN
COM-logging.
etc.
Socket connection can be implemented with help of CMSIS and RL-ARM libraries.
But, as far as I understood, in both cases sofrware has to listen for incoming TCP-connection and handle network's events in an endless loop - All examples of Keil are based on such principle.
I always thought, that it is a poor way for embedded programming to use endless loops.
Moreover, I read such interesting statement
"it is certainly possible to create real-time programs without an RTOS
(by executing one or more tasks in a loop)"
http://www.keil.com/support/man/docs/rlarm/rlarm_ar_artxarm.htm
So, as I understood, it is normal practice to execute a lot of tasks in loop?
while (1) {
task1();
task2();
...
taskN();
}
I think that it is better to handle all events by interrupts.
Is it possible to use socket conection of CMSIS and RL-ARM libraries and organize all my software by handling of interrupts?
My server (on MCB1700) has to perform a lot of tasks. I guess, I should use RTOS RTX in my software. Isn't it so? Is it better to implement my software without RTX?
Simple real-time systems often operate in a "big-loop" architecture, with some time critical events handled by interrupts. It is not a "bad" architecture, bit it is somewhat inflexible, scales poorly, and any change may affect the timing of and or all parts of the system.
It is not true that RL-TCPnet must work this way, but it is designed to run stand alone, and the examples work that way so that there are no dependencies on other libraries for the widest applicability. They are only examples and are intended to demonstrate RL-TCPnet and nothing else. In that sense the examples are not realistic and should be adapted to your requirements.
You might devise a system where all your application code runs in interrupt handlers and the network stack is services in an endless-loop in the thread context, but architecturally that is probably far worse than the "big-loop" architecture, since you may end up with very long interrupt handlers, with the higher priority ones starving and affecting the real-time response of lower priority ones. You end up with a system that is difficult to schedule satisfactorily. Moreover not all library routines are suitable for execution in an interrupt handler, so it would be rather restrictive.
It is possible to service the network stack in a low priority thread in an RTOS. The framework for such operation is described in the documentation in the section: Using RL-TCPnet with RTX kernel.. This will require you to understand and use the RL-RTX kernel library, but given your reservations about "big loop" code, and the description of tasks your system must perform, it sounds like that is what you want to do in any case. If you choose to use a different RTOS, then RL-TCPnet can work in a similar way with any RTOS.

Why would I consider using an RTOS for my embedded project?

First the background, specifics of my question will follow:
At the company that I work at the platform we work on is currently the Microchip PIC32 family using the MPLAB IDE as our development environment. Previously we've also written firmware for the Microchip dsPIC and TI MSP families for this same application.
The firmware is pretty straightforward in that the code is split into three main modules: device control, data sampling, and user communication (usually a user PC). Device control is achieved via some combination of GPIO bus lines and at least one part needing SPI or I2C control. Data sampling is interrupt driven using a Timer module to maintain sample frequency and more SPI/I2C and GPIO bus lines to control the sampling hardware (ie. ADC). User communication is currently implemented via USB using the Microchip App Framework.
So now the question: given what I've described above, at what point would I consider employing an RTOS for my project? Currently I'm thinking of these possible trigger points as reasons to use an RTOS:
Code complexity? The code base architecture/organization is still small enough that I can keep all the details in my head.
Multitasking/Threading? Time-slicing the module execution via interrupts suffices for now for multitasking.
Testing? Currently we don't do much formal testing or verification past the HW smoke test (something I hope to rectify in the near future).
Communication? We currently use a custom packet format and a protocol that pretty much only does START, STOP, SEND DATA commands with data being a binary blob.
Project scope? There is a possibility in the near future that we'll be getting a project to integrate our device into a larger system with the goal of taking that system to mass production. Currently all our projects have been experimental prototypes with quick turn-around of about a month, producing one or two units at a time.
What other points do you think I should consider? In your experience what convinced (or forced) you to consider using an RTOS vs just running your code on the base runtime? Pointers to additional resources about designing/programming for an RTOS is also much appreciated.
There are many many reasons you might want to use an RTOS. They are varied & the degree to which they apply to your situation is hard to say. (Note: I tend to think this way: RTOS implies hard real time which implies preemptive kernel...)
Rate Monotonic Analysis (RMA) - if you want to use Rate Monotonic Analysis to ensure your timing deadlines will be met, you must use a pre-emptive scheduler
Meet real-time deadlines - even without using RMA, with a priority-based pre-emptive RTOS, your scheduler can help ensure deadlines are met. Paradoxically, an RTOS will typically increase interrupt latency due to critical sections in the kernel where interrupts are usually masked
Manage complexity -- definitely, an RTOS (or most OS flavors) can help with this. By allowing the project to be decomposed into independent threads or processes, and using OS services such as message queues, mutexes, semaphores, event flags, etc. to communicate & synchronize, your project (in my experience & opinion) becomes more manageable. I tend to work on larger projects, where most people understand the concept of protecting shared resources, so a lot of the rookie mistakes don't happen. But beware, once you go to a multi-threaded approach, things can become more complex until you wrap your head around the issues.
Use of 3rd-party packages - many RTOSs offer other software components, such as protocol stacks, file systems, device drivers, GUI packages, bootloaders, and other middleware that help you build an application faster by becoming almost more of an "integrator" than a DIY shop.
Testing - yes, definitely, you can think of each thread of control as a testable component with a well-defined interface, especially if a consistent approach is used (such as always blocking in a single place on a message queue). Of course, this is not a substitute for unit, integration, system, etc. testing.
Robustness / fault tolerance - an RTOS may also provide support for the processor's MMU (in your PIC case, I don't think that applies). This allows each thread (or process) to run in its own protected space; threads / processes cannot "dip into" each others' memory and stomp on it. Even device regions (MMIO) might be off limits to some (or all) threads. Strictly speaking, you don't need an RTOS to exploit a processor's MMU (or MPU), but the 2 work very well hand-in-hand.
Generally, when I can develop with an RTOS (or some type of preemptive multi-tasker), the result tends to be cleaner, more modular, more well-behaved and more maintainable. When I have the option, I use one.
Be aware that multi-threaded development has a bit of a learning curve. If you're new to RTOS/multithreaded development, you might be interested in some articles on Choosing an RTOS, The Perils of Preemption and An Introduction to Preemptive Multitasking.
Lastly, even though you didn't ask for recommendations... In addition to the many numerous commercial RTOSs, there are free offerings (FreeRTOS being one of the most popular), and the Quantum Platform is an event-driven framework based on the concept of active objects which includes a preemptive kernel. There are plenty of choices, but I've found that having the source code (even if the RTOS isn't free) is advantageous, esp. when debugging.
RTOS, first and foremost permits you to organize your parallel flows into the set of tasks with well-defined synchronization between them.
IMO, the non-RTOS design is suitable only for the single-flow architecture where all your program is one big endless loop. If you need the multi-flow - a number of tasks, running in parallel - you're better with RTOS. Without RTOS you'll be forced to implement this functionality in-house, re-inventing the wheel.
Code re-use -- if you code drivers/protocol-handlers using an RTOS API they may plug into future projects easier
Debugging -- some IDEs (such as IAR Embedded Workbench) have plugins that show nice live data about your running process such as task CPU utilization and stack utilization
Usually you want to use an RTOS if you have any real-time constraints. If you don’t have real-time constraints, a regular OS might suffice. RTOS’s/OS’s provide a run-time infrastructure like message queues and tasking. If you are just looking for code that can reduce complexity, provide low level support and help with testing, some of the following libraries might do:
The standard C/C++ libraries
Boost libraries
Libraries available through the manufacturer of the chip that can provide hardware specific support
Commercial libraries
Open source libraries
Additional to the points mentioned before, using an RTOS may also be useful if you need support for
standard storage devices (SD, Compact Flash, disk drives ...)
standard communication hardware (Ethernet, USB, Firewire, RS232, I2C, SPI, ...)
standard communication protocols (TCP-IP, ...)
Most RTOSes provide these features or are expandable to support them