How does PPP or Ethernet recover from errors? - error-handling

Looking at the data-link level standards, such as PPP general frame format or Ethernet, it's not clear what happens if the checksum is invalid. How does the protocol know where the next frame begins?
Does it just scan for the next occurrence of "flag" (in the case of PPP)? If so, what happens if the packet payload just so happens to contain "flag" itself? My point is that, whether packet-framing or "length" fields are used, it's not clear how to recover from invalid packets where the "length" field could be corrupt or the "framing" bytes could just so happen to be part of the packet payload.
UPDATE: I found what I was looking for (which isn't strictly what I asked about) by looking up "GFP CRC-based framing". According to Communication networks
The GFP receiver synchronizes to the GFP frame boundary through a three-state process. The receiver is initially in the hunt state where it examines four bytes at a time to see if the CRC computed over the first two bytes equals the contents of the next two bytes. If no match is found the GFP moves forward by one byte as GFP assumes octet synchronous transmission given by the physical layer. When the receiver finds a match it moves to the pre-sync state. While in this intermediate state the receiver uses the tentative PLI (payload length indicator) field to determine the location of the next frame boundary. If a target number N of successful frame detection has been achieved, then the receiver moves into the sync state. The sync state is the normal state where the receiver examines each PLI, validates it using cHEC (core header error checking), extracts the payload, and proceeds to the next frame.
In short, each packet begins with "length" and "CRC(length)". There is no need to escape any characters and the packet length is known ahead of time.
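As a rough illustration of the hunt state quoted above, here is a minimal C sketch. The names crc16 and gfp_hunt are made up for this example, and the CRC parameters (polynomial 0x1021, initial value 0) are an assumption; the real standard has further details that are not modelled here.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16, polynomial x^16 + x^12 + x^5 + 1 (0x1021), initial value 0.
 * These parameters are assumed for illustration only. */
uint16_t crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(data[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Hunt state: examine four bytes at a time and slide forward one byte
 * until CRC(bytes 0-1) equals bytes 2-3.
 * Returns the offset of the candidate header, or -1 if none is found. */
long gfp_hunt(const uint8_t *stream, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++) {
        uint16_t hec = (uint16_t)((stream[i + 2] << 8) | stream[i + 3]);
        if (crc16(&stream[i], 2) == hec)
            return (long)i;
    }
    return -1;
}
```

A match here is only a candidate: as the quoted text says, the receiver still has to follow the length field through several consecutive frames before it trusts the boundary.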
There seem to be two major approaches to packet framing:
encoding schemes (bit/byte stuffing, Manchester encoding, 4b5b, 8b10b, etc)
unmodified data + checksum (GFP)
The former is safer, the latter is more efficient. Both are prone to errors if the payload just happens to contain a valid packet and line corruption causes the preceding bytes to contain the "start of frame" byte sequence, but that sounds highly improbable. It's difficult to find hard numbers for GFP's robustness, but a lot of modern protocols seem to use it, so one can assume that their designers know what they're doing.

Both PPP and Ethernet have mechanisms for framing - that is, for breaking a stream of bits up into frames, in such a way that if a receiver loses track of what's what, it can pick up at the start of the next frame. These sit right at the bottom of the protocol stack; all the other details of the protocol are built on the idea of frames. In particular, the preamble, LCP, and FCS are at a higher level, and are not used to control framing.
PPP, over serial links like dialup, is framed using HDLC-like framing. A byte value of 0x7e, called a flag sequence, indicates the start of the frame. The frame continues until the next flag byte. Any occurrence of the flag byte in the content of the frame is escaped. Escaping is done by writing 0x7d, known as the control escape byte, followed by the byte to be escaped xor'd with 0x20. The flag sequence is escaped to 0x5e; the control escape itself also has to be escaped, to 0x5d. Other values can also be escaped if their presence would upset the modem. As a result, if a receiver loses synchronisation, it can just read and discard bytes until it sees a 0x7e, at which point it knows it's at the start of a frame again. The contents of the frame are then structured, containing some odd little fields that aren't really important, but are retained from an earlier IBM protocol, along with the PPP packet (called a protocol data unit, PDU), and also the frame check sequence (FCS).
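The escaping rule described above can be sketched in a few lines of C. This is only an illustration: ppp_stuff is a made-up name, and the sketch escapes just 0x7e and 0x7d, not the extra modem-sensitive values that can also be negotiated.

```c
#include <stddef.h>
#include <stdint.h>

#define PPP_FLAG   0x7e  /* frame delimiter */
#define PPP_ESCAPE 0x7d  /* control escape */
#define PPP_XOR    0x20  /* value XORed into an escaped byte */

/* Hypothetical helper: wraps `len` payload bytes in flag bytes and escapes
 * any 0x7e or 0x7d inside the payload. Returns the number of bytes written
 * to `out`, or 0 if `out_cap` is too small (worst case is 2 * len + 2). */
size_t ppp_stuff(const uint8_t *in, size_t len, uint8_t *out, size_t out_cap)
{
    size_t n = 0;
    if (out_cap < 2)
        return 0;
    out[n++] = PPP_FLAG;
    for (size_t i = 0; i < len; i++) {
        uint8_t b = in[i];
        if (b == PPP_FLAG || b == PPP_ESCAPE) {
            if (n + 2 > out_cap)
                return 0;
            out[n++] = PPP_ESCAPE;
            out[n++] = (uint8_t)(b ^ PPP_XOR);
        } else {
            if (n + 1 > out_cap)
                return 0;
            out[n++] = b;
        }
    }
    if (n + 1 > out_cap)
        return 0;
    out[n++] = PPP_FLAG;  /* closing flag */
    return n;
}
```

With the payload 01 7e 7d 02, this produces 7e 01 7d 5e 7d 5d 02 7e, so a raw 0x7e on the wire can only ever be a frame boundary.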
Ethernet uses a logically similar approach, of having symbols which are recognisable as frame start and end markers rather than data, but rather than having reserved bytes plus an escape mechanism, it uses a coding scheme which is able to express special control symbols that are distinct from data bytes - a bit like using punctuation to break up a sequence of letters. The details of the system used vary with the speed.
Standard (10 Mb/s) ethernet is encoded using a thing called Manchester encoding, in which each bit to be transmitted is represented as two successive levels on the line, in such a way that there is always a transition between levels in every bit, which helps the receiver to stay synchronised. Frame boundaries are indicated by violating the encoding rule, leading to there being a bit with no transition (I read this in a book years ago, but can't find a citation online, so I might be wrong about this). In effect, this system expands the binary code to three symbols - 0, 1, and violation.
Fast (100 Mb/s) ethernet uses a different coding scheme, based on a 4b/5b code, where groups of four data bits (nybbles) are represented as groups of five bits on the wire, and transmitted directly, without the Manchester scheme. The expansion to five bits lets the sixteen necessary patterns be chosen to fulfil the requirement for frequent level transitions, again to help the receiver stay synchronised. However, there's still room to choose some extra symbols, which can be transmitted but don't correspond to a data value, in effect expanding the set of nybbles to twenty-four symbols - the nybbles 0 to F, and symbols called Q, I, J, K, T, R, S and H. Ethernet uses a JK pair to mark frame starts, and TR to mark frame ends.
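For reference, here is the four-to-five-bit mapping described above (usually written 4B/5B) as a C table, together with the J, K, T and R control codes. The encode_byte_4b5b helper and its low-nybble-first ordering are my own assumptions for illustration.

```c
#include <stdint.h>

/* 4B/5B data code groups, indexed by nybble value 0x0..0xF. */
static const uint8_t code_4b5b[16] = {
    0x1E, /* 0: 11110 */  0x09, /* 1: 01001 */
    0x14, /* 2: 10100 */  0x15, /* 3: 10101 */
    0x0A, /* 4: 01010 */  0x0B, /* 5: 01011 */
    0x0E, /* 6: 01110 */  0x0F, /* 7: 01111 */
    0x12, /* 8: 10010 */  0x13, /* 9: 10011 */
    0x16, /* A: 10110 */  0x17, /* B: 10111 */
    0x1A, /* C: 11010 */  0x1B, /* D: 11011 */
    0x1C, /* E: 11100 */  0x1D, /* F: 11101 */
};

/* Control code groups used for framing; none of them is a data code. */
enum {
    SYM_J = 0x18, /* 11000, first half of start-of-frame */
    SYM_K = 0x11, /* 10001, second half of start-of-frame */
    SYM_T = 0x0D, /* 01101, first half of end-of-frame */
    SYM_R = 0x07  /* 00111, second half of end-of-frame */
};

/* Encodes one byte as two 5-bit groups packed into 10 bits, with the low
 * nybble's group in the upper five bits (ordering assumed for this sketch). */
uint16_t encode_byte_4b5b(uint8_t b)
{
    return (uint16_t)((code_4b5b[b & 0x0F] << 5) | code_4b5b[b >> 4]);
}
```

Because the control groups never appear in the data table, a receiver that sees J followed by K knows it is looking at a frame start, whatever the payload contains.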
Gigabit ethernet is similar to fast ethernet, but with a different coding scheme - the optical fibre versions use an 8b/10b code instead of the 4b/5b code, and the twisted-pair version uses some very complex quinary code arrangement which I don't really understand. Both approaches yield the same result, which is the ability to transmit either data bytes or one of a small set of additional special symbols, and those special symbols are used for framing.
On top of this basic framing structure, there is then a fixed preamble, followed by a frame delimiter, and some control fields of varying pointlessness (hello, LLC/SNAP!). Validity of these fields can be used to validate the frame, but they can't be used to define frames on their own.

You're pretty close to the correct answer already. Basically if it starts with a preamble and ends in something that matches as a checksum, it's a frame and passed up to higher layers.
PPP and ethernet both look for the next frame start signal. In the case of Ethernet, it's the preamble: 56 alternating bits followed by an 8-bit start frame delimiter, 64 bits in total. If an ethernet decoder sees that, it simply assumes what follows is a frame. By capturing the bits and then checking if the checksum matches, it decides if it has a valid frame.
As for the payload containing the FLAG, in PPP it is escaped with additional bytes to prevent such misinterpretation.

As far as I know, PPP only supports error detection, and does not support any form of error correction or recovery.
Backed up by Cisco here: http://www.cisco.com/en/US/docs/internetworking/technology/handbook/PPP.html

This Wikipedia PPP line activation section describes the basics of RFC 1661.
A Frame Check sequence is used to detect transmission errors in a frame (described in the earlier Encapsulation section).
The diagram from RFC 1661 on this Wikipedia page describes how the Network protocol phase can restart with Link Establishment on an error.
Also, notes from the Cisco page referred to by Suvesh.
PPP Link-Control Protocol
The PPP LCP provides a method of establishing, configuring, maintaining, and terminating the point-to-point connection. LCP goes through four distinct phases.
First, link establishment and configuration negotiation occur. Before any network layer datagrams (for example, IP) can be exchanged, LCP first must open the connection and negotiate configuration parameters. This phase is complete when a configuration-acknowledgment frame has been both sent and received.
This is followed by link quality determination. LCP allows an optional link quality determination phase following the link-establishment and configuration-negotiation phase. In this phase, the link is tested to determine whether the link quality is sufficient to bring up network layer protocols. This phase is optional. LCP can delay transmission of network layer protocol information until this phase is complete.
At this point, network layer protocol configuration negotiation occurs. After LCP has finished the link quality determination phase, network layer protocols can be configured separately by the appropriate NCP and can be brought up and taken down at any time. If LCP closes the link, it informs the network layer protocols so that they can take appropriate action.
Finally, link termination occurs. LCP can terminate the link at any time. This usually is done at the request of a user but can happen because of a physical event, such as the loss of carrier or the expiration of an idle-period timer.
Three classes of LCP frames exist. Link-establishment frames are used to establish and configure a link. Link-termination frames are used to terminate a link, and link-maintenance frames are used to manage and debug a link.
These frames are used to accomplish the work of each of the LCP phases.

Related

What happens at the receive part when we write with SPI?

When an SPI master writes to the slave, something is shifted into the receive buffer, right?
If yes, is it normal for the "RXDATAAVAILABLE" flag to be set? It seems like nonsense: we send data, and when the data is sent, we get notified that data has been received.
If all of my statements are correct, then how do we know which data in the RX FIFO is the correct data?
Suppose we send a two-byte frame. The first byte is the address and the second is a dummy byte, sent in order to read the value at that address (of the slave). Then suppose we have a two-level RX FIFO. In that FIFO, instead of just the value read from the slave, we have two bytes: the first is who knows what, and the second is the value read from the slave.
So the question is: how do we manage to receive only what is necessary, without getting junk data during the write part of the frame?
SPI works like simple 8 bit shift registers. You shift out bytes on MOSI on each clock edge and at the same time you shift in new data from MISO. Thus you send and receive at the same time. Hence the names MOSI = Master Out Slave In, and MISO = Master In Slave Out.
SPI peripherals on microcontrollers are more intricate than that though, and have separate data registers that are different from the actual hardware shift register, so that we can write data without worrying about the pending transmission. Some may even have multiple data buffers. But on the fundamental level, SPI always works as a shift register, most commonly 8 bits wide.
When the microcontroller acting as SPI master writes something, there are usually two flags, one that says that the data buffer has become available, and another that tells that transmission is done.
When you are done sending, you are also done receiving. You'll get some sort of flag set. This is assuming that all devices are implementing SPI as intended, which is often not the case.
Note that some devices implement a system where you first send x bytes of data, and after that receive x bytes of data. This seems to be the scenario you describe. Sending and receiving is not done at the same time for that device, but instead in sequence. Meaning that during the first transmission, you'll clock in garbage, and then in order to receive data, you must clock out garbage. This is no fault of SPI, but how the manufacturer of the specific device has specified things.
Note that SPI is very poorly standardized and therefore all manner of weird crap exists on the market. The manner of sending/receiving data may vary, the clock polarity (which edge is active) may vary, where the device clocks the data may vary. Some devices might need delays between data bytes. Some devices might need some obscure handling of the Slave Select pin in order to work. It's all one big mess and the lack of international standardization is to blame.
An SPI master engine's received data available flag will be set as a simple result of the occurrence of a word's worth of clock cycles generated by the master itself. It tells you nothing about the operation or even existence of a peripheral on the bus.
When this flag is set, it is entirely up to you and your software to know if the contents of the received data register will have meaning or not.
If you have properly selected and interacted with an existent, operational peripheral in a read or transfer operation where it is documented to give a result, they will have meaning.
If you have performed a purely write operation to a peripheral for which no reply data is documented at the word position in question, it will be meaningless, effectively no different than reading some random legal memory location. Note that in most cases, a write operation is simply a transfer where the received data is to be ignored - at implementation level there is usually no other difference.
If you have failed to address any existent peripheral, it will similarly be meaningless.
As with any other memory or read operation, it is up to you to know if the contents of the register in a particular situation are meaningful or not.
Since you know that the first byte contains "who knows what" while the second has meaning, write your software to ignore the first and use the second.
(As an aside, many, though by no means all, SPI peripherals are documented to shift out whatever constitutes their primary status register during the address phase, since that makes for a quick way to poll it)
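To make the "ignore the first byte, use the second" advice concrete, here is a small C sketch. spi_transfer stands in for whatever your MCU's SPI driver provides, and the four-register slave behind it is simulated purely so the example can run on a PC; none of these names come from a real library.

```c
#include <stdint.h>

/* Simulated slave for illustration: it answers a register address with
 * that register's value during the NEXT byte on the bus. */
static const uint8_t sim_regs[4] = {0x10, 0x20, 0x30, 0x40};
static uint8_t sim_next_reply = 0xFF;  /* "who knows what" before addressing */

/* Stand-in for the SPI driver: shifts one byte out on MOSI and returns
 * the byte shifted in on MISO during the same eight clocks. */
uint8_t spi_transfer(uint8_t mosi)
{
    uint8_t miso = sim_next_reply;
    sim_next_reply = sim_regs[mosi & 0x03];
    return miso;
}

/* Two-byte read frame: address byte, then a dummy byte. */
uint8_t read_register(uint8_t addr)
{
    (void)spi_transfer(addr);   /* reply during the address phase: discard */
    return spi_transfer(0x00);  /* reply during the dummy byte: the value */
}
```

The receive flag is set after both transfers; it is the software that knows only the second received byte is worth keeping.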

Microcontroller to microcontroller communication library (over UART/RS232)

I want to interface two microcontrollers with a UART interface and I am looking for a protocol to exchange data between them.
In practice, I want to exchange data periodically (e.g. sensor readings) and also data on events (GPIO state). I have around 100-200 bytes to exchange every 100 milliseconds.
Does anybody know a protocol or library to achieve this kind of task?
For now, I have found protobuf and nanopb (nano protobuf). Is there something else?
It would be nice if I could add a software layer over the UART and use a "virtual data stream", as if it were a TCP/IP connection to N ports.
Any idea ?
Thanks
I think the most straightforward way is to roll your own.
You'll find RS232 drivers in the manufacturer's chip support library.
RS232 is a stream oriented transport, which means you will need to encode your messages into some framing structure when you send them and detect frame boundaries on the receiver side. A clever and easy to use mechanism to do this is "Consistent Overhead Byte Stuffing".
https://en.wikipedia.org/wiki/Consistent_Overhead_Byte_Stuffing
This simple algorithm turns zeros in your messages into some other value, so the zero-byte can be used to detect start and end of frame. If a byte gets corrupted on the way you can even resynchronize to the stream and keep going.
The code on Wikipedia should be easy enough even for the smallest micro-processors.
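For a feel of how little code is involved, here is a minimal COBS encoder sketch in C. It is written from the algorithm's description rather than copied from any library, and cobs_encode is a name I picked for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* COBS-encodes `len` bytes from `in` into `out`; the output contains no
 * zero bytes. `out` must hold at least len + len / 254 + 1 bytes. The
 * caller appends the 0x00 frame delimiter. Returns the encoded length. */
size_t cobs_encode(const uint8_t *in, size_t len, uint8_t *out)
{
    size_t write = 1;     /* next free output position */
    size_t code_pos = 0;  /* where the current block's length code goes */
    uint8_t code = 1;     /* distance to the next zero (or block end) */

    for (size_t i = 0; i < len; i++) {
        if (in[i] == 0) {
            out[code_pos] = code;  /* a zero closes the current block */
            code_pos = write++;
            code = 1;
        } else {
            out[write++] = in[i];
            /* A block holds at most 254 non-zero bytes; start a new one
             * only if more input follows. */
            if (++code == 0xFF && i + 1 < len) {
                out[code_pos] = code;
                code_pos = write++;
                code = 1;
            }
        }
    }
    out[code_pos] = code;
    return write;
}
```

For example, the message 11 22 00 33 encodes to 03 11 22 02 33, and the sender then transmits a 00 after it as the frame delimiter.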
Afterwards you can define your message format. You can probably keep it very simple and directly send your data-structures as is.
Suggestion for a simple message format:
Byte-ID   Meaning
---------------------------------
0         Destination port number
1         Message type (define your own)
2 to n    Message data
If you want to send variable length messages you can either send out a length byte or derive the length from the output of the Consistent Overhead Byte Stuffing framing.
By the way, UART/RS232 is nice and easy to work with, but you may also want to take a look at SPI. The SPI interface is more suitable to exchange data between two micro-controllers. It is usually faster than RS232 and more robust because it has a dedicated clock-line.
How about this: eRPC https://community.nxp.com/docs/DOC-334083
The eRPC (Embedded Remote Procedure Call) is a Remote Procedure Call (RPC) system created by NXP. An RPC is a mechanism used to invoke a software routine on a remote system using a simple local function call. The remote system may be any CPU connected by an arbitrary communications channel: a server across a network, another CPU core in a multicore system, and so on. To the client, it is just like calling a function in a library built into the application. The only difference is any latency or unreliability introduced by the communications channel.
I have used it in a two-processor embedded system, a Cortex-A9 CPU with a Cortex-M4 MCU, which communicate with each other over SPI/GPIO.
eRPC can run over UART, SPI, rpmsg and network (TCP). Even when using serial or SPI as the transport, it can do bidirectional calls, with a very small footprint.
Simple serial point-to-point communication protocol
http://www.zipplet.co.uk/index.php/content/openformats_mise
It depends if you need master/slave implementation, noise protection, point-point or multi-point (and in this case collision detection), etc
but, as our colleague said, I would go with the simplest solution that fits the problem, following the KISS principle http://en.wikipedia.org/wiki/KISS_principle
Just add some header information like ID and length, if necessary CRC checking, and be happy :)
Try Microcontroller Interconnect Network (MIN) 1.0:
https://github.com/min-protocol/min
It has framing using byte-stuffing to keep receiver sync, 16-bit Fletcher's algorithm for checksum, an identifier for use by the application and a variable payload of up to 15 bytes.
There's embedded C code there plus also a Python implementation to make it easier to talk to a PC.
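If you have not met Fletcher's checksum before, the 16-bit version is only a few lines of C. This is the textbook algorithm, not code taken from the MIN repository, so check their source for the exact variant they use.

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook Fletcher-16: two running sums modulo 255, packed as
 * (sum2 << 8) | sum1. */
uint16_t fletcher16(const uint8_t *data, size_t len)
{
    uint16_t sum1 = 0;
    uint16_t sum2 = 0;
    for (size_t i = 0; i < len; i++) {
        sum1 = (uint16_t)((sum1 + data[i]) % 255);
        sum2 = (uint16_t)((sum2 + sum1) % 255);
    }
    return (uint16_t)((sum2 << 8) | sum1);
}
```

For the ASCII string "abcde" this gives 0xC8F0. Unlike a plain additive checksum, the second sum makes the result depend on byte order, so swapped bytes are detected.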
As the first answer starts, the simplest result is to roll your own. Define your header (the "format" above) as needed, perhaps including status information so each processor knows that the other is working properly. I have had success with a protocol that includes
2 byte ascii prefix and suffix such as "[" and "]" so that a protocol analyzer can show you message boundaries.
The number of bytes.
The command ID (parsed to indicate what command handler to use).
Command arguments (I used 3 32 bit words).
A CRC or checksum to verify transfer integrity
The parser then recognizes the [* as the start of the message, and dispatches the body to the command handler for the particular command ID with the associated arguments as long as the checksum matches.

Guidelines for designing forward compatible communication protocols?

I'm working on a communication protocol between embedded devices. The protocol will definitely need new commands and fields in the future. What do I need to do to make sure I'm not painting myself into a corner?
This is a wide open question. Here are some random thoughts about it:
Leave spares.
Use a very basic header with a "number of bytes to follow" field.
If there are enumerated message types, make sure the type field can accommodate growth.
If you use bitflags, leave spares.
Possibly include a "raw data" message, which can be used to wrap any protocol future generations think up.
In summary, Leave spares.
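One way the "number of bytes to follow" advice pays off is sketched below in C: a receiver that can step over message types it has never heard of. The message type values and the function name are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

enum { MSG_TEMPERATURE = 0x01, MSG_STATUS = 0x02 };  /* hypothetical types */

/* Walks a buffer of [type][length][payload...] records and counts the ones
 * this firmware version understands. Unknown types are skipped using the
 * length field, so a newer sender does not break an older receiver.
 * Returns -1 if a record claims more bytes than remain in the buffer. */
int count_known_messages(const uint8_t *buf, size_t len)
{
    int known = 0;
    size_t pos = 0;
    while (pos + 2 <= len) {
        uint8_t type = buf[pos];
        uint8_t body = buf[pos + 1];  /* number of bytes to follow */
        if (pos + 2 + body > len)
            return -1;
        switch (type) {
        case MSG_TEMPERATURE:
        case MSG_STATUS:
            known++;                  /* a real handler would be called here */
            break;
        default:
            break;                    /* unknown type: ignore and keep going */
        }
        pos += 2u + body;
    }
    return known;
}
```

Without the length field, the first unknown type would leave the parser with no way to find the start of the next message.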
If at all possible, allow a human at one end of a cable to figure out what is at the other end of the cable.
Ideally, a human could hook up a dumb terminal and hit the keyboard three times (Enter Question-mark Enter), then a long, detailed message would come back describing what kind of machine it is, what is its model number, the name and phone number and web site of the organization that built it, the "official" protocol version number, and the unofficial build time:
__DATE__ ": " __TIME__
Also send the same detailed message every time the machine boots up.
If at all possible, try to design your protocol so that a human being with a dumb terminal can talk to your device.
HTTP is one such human-readable protocol, and I suspect this is one of the reasons for its popularity.
Human-readable implies, among other things:
Limit yourself to characters that a human can read and type.
Avoid special control characters. Take advantage of the power of plain text.
Always send CR+LF at the end of each packet (as mandated by many Internet protocols).
Accept characters at any rate, from maximum-speed file upload from a PC to a non-touch-typing human slowly pecking at a keyboard.
You might also want to glance over the list of common protocols for embedded systems.
Perhaps one already meets your requirements?
Is there any reason to use something more difficult to decode than the standard Netstring format?
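For anyone who has not seen it, a netstring is just the decimal length, a colon, the bytes, and a trailing comma. A minimal encoder sketch in C (netstring_encode is my own name for it) looks like this:

```c
#include <stdio.h>
#include <string.h>

/* Writes `len` bytes of `data` as a netstring, "<decimal length>:<bytes>,",
 * into `out`. The result is not NUL-terminated.
 * Returns the number of bytes written, or 0 if it does not fit. */
size_t netstring_encode(const char *data, size_t len, char *out, size_t out_cap)
{
    int head = snprintf(out, out_cap, "%zu:", len);
    if (head < 0 || (size_t)head + len + 1 > out_cap)
        return 0;
    memcpy(out + head, data, len);
    out[(size_t)head + len] = ',';
    return (size_t)head + len + 1;
}
```

Encoding the 12 bytes of "hello world!" yields "12:hello world!,". The receiver reads digits up to the colon, then knows exactly how many bytes to take, and the comma gives a cheap sanity check.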
The question is a little too general for a clear answer. There are many aspects an embedded system may need to consider when communicating, such as:
How many peers will it need to communicate with?
How much data does it need to communicate?
How tightly synchronized do the systems need to be?
What is the physical media for the protocol and what are the bandwidth limitations, and error susceptibility considerations?
All of these requirements and resource limitations will certainly constrain the system, and then you can start to figure out what the protocol will need. Once you know these issues you can then project how some of the requirements may change/expand in the future. From there you can design the protocol to accommodate (or not) the worst case use cases.
I would use HDLC. I have had good luck with it in the past. I would for a point to point serial just use the Asynchronous framing and forget about all of the other control stuff as it would probably be overkill.
In addition to using HDLC for the framing of the packet, I format my packet like the following. This is how options are passed in 802.11.
U8 cmd;
U8 len;
U8 payload[len];
The total size of each command packet is len + 2
You then define commands like
#define TRIGGER_SENSOR 0x01
#define SENSOR_RESPONSE 0x02
The other advantage is that you can add new commands and if you design your parser correctly to ignore undefined commands then you will have some backwards compatibility.
So putting it all together the packet would look like the following.
// total packet length, not counting the flags, is len + 4
U8 sflag; // 0x7e, start-of-frame flag from HDLC
U8 cmd; // tells the other side what to do
U8 len; // payload length
U8 payload[len]; // may be zero length
U16 crc;
U8 eflag; // 0x7e, end-of-frame flag
The system will then monitor the serial stream for the flag 0x7e, and when it arrives you check that the packet length satisfies pklen >= 4 and pklen == len + 4, and that the crc is valid. Note: do not rely on the crc alone for small packets, as you will get a lot of false positives; also check the length. If the length or crc does not match, just reset the length and crc and start decoding the new frame. If it is a match, then copy the packet to a new buffer and pass it to your command processing function. Always reset length and crc when a flag is received.
For your command processing function grab the cmd and len and then use a switch to handle each type of command. I also require that certain events send a response so the system behaves like a remote procedure call that is event driven.
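The length-plus-crc check described above might look roughly like this in C. The answer does not say which CRC or byte order it uses, so the CRC-16 parameters (polynomial 0x1021, initial value 0xFFFF), the high-byte-first crc field, and both function names are assumptions for the sake of a runnable sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-16 with polynomial 0x1021 and initial value 0xFFFF (assumed). */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(data[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* `pkt` holds the pklen bytes seen between two 0x7e flags, after any
 * unescaping: cmd, len, payload[len], crc high byte, crc low byte.
 * Returns 1 only if both the length and the crc check out. */
int frame_is_valid(const uint8_t *pkt, size_t pklen)
{
    if (pklen < 4)
        return 0;                          /* too short to hold cmd, len, crc */
    if (pklen != (size_t)pkt[1] + 4)
        return 0;                          /* len field disagrees with flags */
    uint16_t want = (uint16_t)((pkt[pklen - 2] << 8) | pkt[pklen - 1]);
    return crc16_ccitt(pkt, pklen - 2) == want;
}
```

Checking the length first is cheap and rejects most garbage before the crc is even computed.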
So for example the sensor device can have a timer or respond to a command to take a reading. It then would format a packet and send it to the PC and the PC would respond that it received the packet. If not then the sensor device could resend on a timeout.
Also, when you are doing a network transfer you should design it as a network stack like the OSI model. The HDLC is the data link layer and the RPC and command handling is the Application Layer.

How are SYNC words chosen?

I'm using a data transmission system which uses a fixed SYNC word (0xD21DB8) at the beginning of every superframe. I'd be curious to know how such SYNC words are chosen, i.e. based on which criteria designers choose the length and the value of such a SYNC word.
In short:
high probability of uniqueness
high density of transitions
It depends on the underlying "server layer" (in communication terms). If the said server layer doesn't provide a means of distinguishing payload data from control signals then a protocol must be devised. It is common in synchronous bit-stream oriented transport layer to rely on a SYNC pattern in order to delineate payload units. A good example of such technique used is in SONET/SDH/OTN, the major optical transport communication technologies.
Usually, the main criterion for choosing a SYNC word is high probability of uniqueness. Of course what makes its uniqueness property depend on the encoding used for the payload.
Example: in SONET/SDH, once the SYNC word has been found, it is validated for a number of superframes (I don't remember exactly how many) before declaring a valid sync state. This is required because false positives can occur: encoding on a synchronous bit stream cannot be guaranteed to generate encoded payload patterns orthogonal to the SYNC word.
There is another criterion: high density of transitions. Sometimes, the server layer carries clock and data combined in a single signal (i.e. not on separate lines). In this case, for the receiver to be able to delineate symbols from the stream, it is critical to ensure a maximum number of 0->1 and 1->0 transitions in order to extract the clock signal.
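If you want to put a number on "density of transitions" for a candidate SYNC word, a few lines of C will do it. count_transitions is just a helper name I made up for this sketch.

```c
#include <stdint.h>

/* Counts 0->1 and 1->0 transitions between adjacent bits in the low
 * `nbits` bits of `word` (nbits between 2 and 32). */
unsigned count_transitions(uint32_t word, unsigned nbits)
{
    unsigned count = 0;
    for (unsigned i = 0; i + 1 < nbits; i++) {
        unsigned a = (word >> i) & 1u;
        unsigned b = (word >> (i + 1)) & 1u;
        count += (a != b);
    }
    return count;
}
```

Applied to the 24-bit word from the question, 0xD21DB8, this gives 11 transitions out of a possible 23. A pattern like 0xAAAAAA would score the full 23, but it would score poorly on the uniqueness criterion, so a real SYNC word is a compromise between the two.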
Hope this helps.
Updated: these presentations might be of interest too.
At the physical layer, another consideration (besides those mentioned in jldupont's answer) is that a sync word may be used to synchronise the receiver's communication clock to that of the sender. Synchronisation may only require zeroing the receiver's clock, but it may also involve changing the frequency of the clock to match the sender's more closely.
For a typical asynchronous protocol, the sender and receiver are required to have clocks that are the same. In reality of course, the clocks are never precisely the same, so a maximum error is normally specified.
Some protocols don't require the receiver to adjust its clock rate, but tolerate the error by oversampling, or some other method. For example, a typical UART is able to cope with errors by zeroing on the first edge of the start bit, and thereafter, taking multiple samples at the point where it expects the middle of each bit to be. In this case, the sync word is just the start bit, and ensures a transition at start of the message.
In the HART industrial protocol, the sync word is 0xFF, plus a zero parity bit, repeated a number of times. This is represented as an analogue waveform, encoded using FSK, and appears as 8 periods (equal to 8 bits times) of a 1200 Hz sinusoidal wave, followed by one bit time at 2200 Hz. This pattern allows the receiver to detect that there is a valid signal, and then synchronise to the start of a byte by detecting the transition from 2200 Hz back to 1200 Hz. If required, the receiver can also use this waveform to adjust its clock.

Protocols used to talk between an embedded CPU and a PC

I am building a small device with its own CPU (AVR Mega8) that is supposed to connect to a PC. Assuming that the physical connection and passing of bytes has been accomplished, what would be the best protocol to use on top of those bytes? The computer needs to be able to set certain voltages on the device, and read back certain other voltages.
At the moment, I am thinking a completely host-driven synchronous protocol: computer send requests, the embedded CPU answers. Any other ideas?
Modbus might be what you are looking for. It was designed for exactly the type of problem you have. There is lots of code/tools out there and adherence to a standard could mean easy reuse later. It also supports human readable ASCII so it is still easy to understand/test.
See FreeModBus for Windows and embedded source.
There's a lot to be said for client-server architecture and synchronous protocols. Simplicity and robustness, to start. If speed isn't an issue, you might consider a compact, human-readable protocol to help with debugging. I'm thinking along the lines of modem AT commands: a "wakeup" sequence followed by a set/get command, followed by a terminator.
Host --> [V02?] // Request voltage #2
AVR --> [V02=2.34] // Reply with voltage #2
Host --> [V06=3.12] // Set voltage #6
AVR --> [V06=3.15] // Reply with voltage #6
Each side might time out if it doesn't see the closing bracket, and they'd re-synchronize on the next open bracket, which cannot appear within the message itself.
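A parser for the bracketed commands above can be quite small. Here is a sketch in C using sscanf; the function and enum names are invented, and on a small AVR you may prefer hand-rolled parsing because floating-point scanf support costs code space.

```c
#include <stdio.h>

typedef enum { CMD_INVALID, CMD_QUERY, CMD_SET } cmd_kind;

/* Parses "[Vnn?]" (query) or "[Vnn=x.xx]" (set). On success stores the
 * channel number and, for a set command, the requested value. The %n
 * conversion records how far the match got, so a missing closing bracket
 * or trailing junk is rejected. */
cmd_kind parse_command(const char *msg, int *channel, float *value)
{
    int end = 0;
    if (sscanf(msg, "[V%2d?]%n", channel, &end) == 1 &&
        end > 0 && msg[end] == '\0')
        return CMD_QUERY;

    end = 0;
    if (sscanf(msg, "[V%2d=%f]%n", channel, value, &end) == 2 &&
        end > 0 && msg[end] == '\0')
        return CMD_SET;

    return CMD_INVALID;
}
```

Anything that does not match either pattern exactly is reported as invalid, which is the point at which the device would drop the input and wait for the next open bracket.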
Depending on speed and reliability requirements, you might encode the commands into one or two bytes and add a checksum.
It's always a good idea to reply with the actual voltage, rather than simply echoing the command, as it saves a subsequent read operation.
Also helpful to define error messages, in case you need to debug.
My vote is for the human readable.
But if you go binary, try to put a header byte at the beginning to mark the beginning of a packet. I've always had bad luck with serial protocols getting out of sync. The header byte allows the embedded system to re-sync with the PC. Also, add a checksum at the end.
I've done stuff like this with a simple binary format
#include <stdint.h>
struct PacketHdr
{
    uint8_t syncByte1;
    uint8_t syncByte2;
    uint8_t packetType;
    uint8_t bytesToFollow; // -or- totalPacketSize
};
struct VoltageSet
{
    struct PacketHdr hdr;
    int16_t channelId;
    int16_t voltageLevel;
    uint16_t crc;
};
struct VoltageResponse
{
    struct PacketHdr hdr;
    int16_t data[N]; // N = number of channels, a fixed constant defined elsewhere
    uint16_t crc;
};
The sync bytes are less critical in a synchronous protocol than in an asynchronous one, but they still help, especially when the embedded system is first powering up, and you don't know if the first byte it gets is the middle of a message or not.
The type should be an enum that tells how to interpret the packet. Size could be inferred from type, but if you send it explicitly, then the receiver can handle unknown types without choking. You can use 'total packet size', or 'bytes to follow'; the latter can make the receiver code a little cleaner.
The CRC at the end adds more assurance that you have valid data. Sometimes I've seen the CRC in the header, which makes declaring structures easier, but putting it at the end lets you avoid an extra pass over the data when sending the message.
The sender and receiver should both have timeouts starting after the first byte of a packet is received, in case a byte is dropped. The PC side also needs a timeout to handle the case when the embedded system is not connected and there is no response at all.
If you are sure that both platforms use IEEE-754 floats (PCs do) and have the same endianness, then you can use floats as the data type. Otherwise it's safer to use integers, either raw A/D bits, or a preset scale (i.e. 1 bit = .001V gives a +/-32.767 V range)
Adam Liss makes a lot of great points. Simplicity and robustness should be the focus. Human readable ASCII transfers help a LOT while debugging. Great suggestions.
They may be overkill for your needs, but HDLC and/or PPP add in the concept of a data link layer, and all the benefits (and costs) that come with a data link layer. Link management, framing, checksums, sequence numbers, re-transmissions, etc... all help ensure robust communications, but add complexity, processing and code size, and may not be necessary for your particular application.
USB will answer all your requirements. It might be a very simple USB device with only a control pipe to send requests to your device, or you can add an interrupt pipe that will allow you to notify the host about changes in your device.
There are a number of simple USB controllers that can be used, for example from Cypress or Microchip.
The protocol on top of the transfer is really about your requirements. From your description it seems that a simple synchronous protocol is definitely enough. What makes you wonder and look for an additional approach? Share your doubts and we will try to help :).
If I wasn't expecting to need to do efficient binary transfers, I'd go for the terminal-style interface already suggested.
If I do want to do a binary packet format, I tend to use something loosely based on the PPP byte-async HDLC format, which is extremely simple and easy to send and receive, basically:
Packets start and end with 0x7e
You escape a char by prefixing it with 0x7d and toggling bit 5 (i.e. xor with 0x20)
So 0x7e becomes 0x7d 0x5e
and 0x7d becomes 0x7d 0x5d
Every time you see an 0x7e then if you've got any data stored, you can process it.
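A receiver for the rules above fits in one small function. The following C sketch is only an illustration: the hdlc_rx type, the function name and the 64-byte buffer size are all my own choices.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t buf[64];  /* unescaped bytes of the frame in progress */
    size_t  len;
    int     escaped;  /* set when the previous byte was 0x7d */
} hdlc_rx;

/* Feeds one received byte into the decoder. Returns the frame length when
 * a 0x7e completes a non-empty frame (the data is then in rx->buf),
 * otherwise 0. Bytes beyond the buffer size are dropped in this sketch;
 * a real receiver would discard the whole oversized frame. */
size_t hdlc_rx_byte(hdlc_rx *rx, uint8_t b)
{
    if (b == 0x7e) {            /* flag: end of one frame, start of the next */
        size_t done = rx->len;
        rx->len = 0;
        rx->escaped = 0;
        return done;
    }
    if (b == 0x7d) {            /* escape: the next byte needs bit 5 toggled */
        rx->escaped = 1;
        return 0;
    }
    if (rx->escaped) {
        b ^= 0x20;
        rx->escaped = 0;
    }
    if (rx->len < sizeof rx->buf)
        rx->buf[rx->len++] = b;
    return 0;
}
```

Because every 0x7e resets the state, a corrupted or half-received frame costs you that one frame and nothing more.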
I usually do host-driven synchronous stuff unless I have a very good reason to do otherwise. It's a technique which extends from simple point-point RS232 to multidrop RS422/485 without hassle - often a bonus.
As you may have already gathered from the fact that few of the responses point you at a specific protocol, a roll-your-own approach is probably your best choice.
So, this got me thinking and well, here are a few of my thoughts --
Given that this chip has 6 ADC channels, that you are most likely using RS-232 serial comms (a guess from your question), and of course the limited code space, defining a simple command structure will help, as Adam points out. You may wish to keep the input processing at the chip to a minimum, so binary sounds attractive, but the trade-off is in ease of development AND servicing (you may have to troubleshoot a dead input 6 months from now), and HyperTerminal is a powerful debug tool. So that got me thinking about how to implement a simple command structure with good reliability.
A few general considerations --
keep commands the same size -- makes decoding easier.
Framing the commands and optional check sum, as Adam points out can be easily wrapped around your commands. (with small commands, a simple XOR/ADD checksum is quick and painless)
I would recommend a start up announcement to the host with the firmware version at reset - e.g., "HELLO; Firmware Version 1.00z" -- would tell the host that the target just started and what's running.
If you are primarily monitoring, you may wish to consider a "free run" mode where the target would simply cycle through the analog and digital readings -- of course, this doesn't have to be continuous, it can be spaced at 1, 5, 10 seconds, or just on command. Your micro is always listening so sending an updated value is an independent task.
Terminating each output line with a CR (or other character) makes synchronization at the host straightforward.
for example your micro could simply output the strings;
V0=3.20
V1=3.21
V2= ...
D1=0
D2=1
D3=...
and then start over --
Also, commands could be really simple --
? - Read all values -- there's not that many of them, so get them all.
X=12.34 - To set a value, the first byte is the port, then the voltage and I would recommend keeping the "=" and the "." as framing to ensure a valid packet if you forgo the checksum.
Another possibility, if your outputs are within a set range, you could prescale them. For example, if the output doesn't have to be exact, you could send something like
5=0
6=9
2=5
which would set port 5 off, port 6 to full on, and port 2 to half value -- With this approach, ascii and binary data are just about on the same footing in regards to computing/decoding resources at the micro. Or for more precision, make the output 2 bytes, e.g., 2=54 -- OR, add an xref table and the values don't even have to be linear where the data byte is an index into a look-up table ...
As I like to say; simple is usually better, unless it's not.
Hope this helps a bit.
Had another thought while re-reading; adding a "*" command could request the data wrapped with HTML tags, and now your host app could simply redirect the output from your micro to a browser and voilà, browser ready --
:)