Reliable packetization over a noisy serial line
10 May 2012
Many peripherals still use raw UART-based protocols like RS-232 and RS-485, particularly in industrial electronics. Any protocol for communication over one of these serial lines needs to be able to cope with transmission errors and garbage.
This document describes a system for transmitting and receiving variable-length packets via a serial line. Packets are received fully intact, or not at all (like UDP datagrams). The protocol is also highly resistant to garbage. Garbage between well-formed packets is unlikely to prevent them from being received, and the receiver’s state will not be adversely affected by a long stream of garbage (for example, data transmitted at the wrong baud rate).
Message format
The basic message format is always the same:
[SOF] [LENGTH] [PAYLOAD] [CHECKSUM]
The parts of the message, and their purpose, are explained below:
SOF
Start-of-frame sequence. In this document, it’s assumed to be the two-character sequence AB. The only requirement is that the initial character is unique within the sequence.
The purpose of this sequence is to prevent the receiver from interpreting garbage bytes as a length field, and then waiting for a possibly lengthy payload while the transmitter is trying to begin a new message.
LENGTH
Two-byte payload size field, sent high byte first. This field is necessary because we want to be able to send variable-length messages.
Payloads have a maximum size. If the receiver does somehow get triggered by a spurious SOF sequence, this limits how long it can be stuck misinterpreting incoming data as a payload.
PAYLOAD
The message data. This field is entirely unspecified, apart from being bound by the maximum payload size described above.
CHECKSUM
A 16-bit (two-byte) CRC, calculated over the payload and sent high byte first. Messages without a valid CRC are ignored.
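As an illustration of the format above, here is a minimal sketch of a transmit-side encoder. The text does not say which CRC to use or what the maximum payload size is, so the CRC-16/CCITT-FALSE polynomial, the MAX_PAYLOAD value of 256, and the names crc16 and frame_encode are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PAYLOAD 256u  /* assumed limit; the text only requires that one exists */

/* CRC-16/CCITT-FALSE (poly 0x1021, init 0xFFFF). An assumed choice of CRC. */
static uint16_t crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Writes [SOF] [LENGTH] [PAYLOAD] [CHECKSUM] into out.
 * Returns the frame size in bytes, or 0 if the payload is too long
 * or the output buffer is too small. */
static size_t frame_encode(const uint8_t *payload, size_t len,
                           uint8_t *out, size_t out_cap)
{
    if (len > MAX_PAYLOAD || out_cap < len + 6)
        return 0;

    uint16_t crc = crc16(payload, len);
    size_t n = 0;

    out[n++] = 'A';                    /* SOF */
    out[n++] = 'B';
    out[n++] = (uint8_t)(len >> 8);    /* LENGTH, high byte first */
    out[n++] = (uint8_t)(len & 0xFF);
    for (size_t i = 0; i < len; i++)   /* PAYLOAD */
        out[n++] = payload[i];
    out[n++] = (uint8_t)(crc >> 8);    /* CHECKSUM, high byte first */
    out[n++] = (uint8_t)(crc & 0xFF);
    return n;
}
```

A frame always carries six bytes of overhead: two for the SOF, two for the length and two for the checksum.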
Receiver implementation
The receiver can be implemented as a state machine with a single event type: an incoming character. The state machine is reset by putting it into the IDLE state.
The states are described below:
IDLE:
- If 'A' received, go to SOF_A.
SOF_A:
- If 'A' received, stay in SOF_A.
- If 'B' received, go to LENGTH_HI.
- Otherwise, go to IDLE.
LENGTH_HI:
- Store incoming byte as high byte of length.
- Go to LENGTH_LO.
LENGTH_LO:
- Store incoming byte as low byte of length.
- If length greater than maximum allowed, go to IDLE.
- If length is zero, go to CKSUM_HI.
- Otherwise, set pointer = 0 and go to PAYLOAD.
PAYLOAD:
- Store incoming byte at current offset in payload buffer and
  increment pointer.
- If pointer >= length, go to CKSUM_HI.
- Otherwise, stay in PAYLOAD.
CKSUM_HI:
- Store incoming byte as high byte of checksum.
- Go to CKSUM_LO.
CKSUM_LO:
- Store incoming byte as low byte of checksum.
- If computed checksum matches stored checksum, process the
  payload as a received message.
- Go to IDLE.
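The states above translate fairly directly into a switch statement. The following is a sketch rather than a reference implementation: the CRC variant (CRC-16/CCITT-FALSE), the MAX_PAYLOAD value, and the names struct receiver and rx_byte are assumptions for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PAYLOAD 256u  /* assumed maximum payload size */

enum rx_state { IDLE, SOF_A, LENGTH_HI, LENGTH_LO, PAYLOAD, CKSUM_HI, CKSUM_LO };

struct receiver {
    enum rx_state state;
    uint16_t length;   /* payload length from the LENGTH field */
    uint16_t pointer;  /* next write offset in payload[] */
    uint16_t cksum;    /* checksum received from the line */
    uint8_t payload[MAX_PAYLOAD];
};

/* CRC-16/CCITT-FALSE (poly 0x1021, init 0xFFFF). An assumed choice of CRC. */
static uint16_t crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Feed one incoming byte to the state machine.
 * Returns 1 when a complete, valid message is available in
 * rx->payload (rx->length bytes long), otherwise 0. */
static int rx_byte(struct receiver *rx, uint8_t c)
{
    switch (rx->state) {
    case IDLE:
        if (c == 'A')
            rx->state = SOF_A;
        break;
    case SOF_A:
        if (c == 'B')
            rx->state = LENGTH_HI;
        else if (c != 'A')       /* a repeated 'A' stays in SOF_A */
            rx->state = IDLE;
        break;
    case LENGTH_HI:
        rx->length = (uint16_t)(c << 8);
        rx->state = LENGTH_LO;
        break;
    case LENGTH_LO:
        rx->length |= c;
        if (rx->length > MAX_PAYLOAD) {
            rx->state = IDLE;
        } else if (rx->length == 0) {
            rx->state = CKSUM_HI;
        } else {
            rx->pointer = 0;
            rx->state = PAYLOAD;
        }
        break;
    case PAYLOAD:
        rx->payload[rx->pointer++] = c;
        if (rx->pointer >= rx->length)
            rx->state = CKSUM_HI;
        break;
    case CKSUM_HI:
        rx->cksum = (uint16_t)(c << 8);
        rx->state = CKSUM_LO;
        break;
    case CKSUM_LO:
        rx->cksum |= c;
        rx->state = IDLE;
        return rx->cksum == crc16(rx->payload, rx->length);
    }
    return 0;
}
```

The caller initializes the structure with state set to IDLE, calls rx_byte for every character read from the UART, and handles the payload whenever it returns 1.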
This state machine can be easily implemented in under 100 lines of C. The states used to parse the SOF marker are what give the receiver its resistance to inter-packet garbage. More states can be added to parse longer SOF markers. For example, if the sequence ABC is used as a marker:
IDLE:
- If 'A' received, go to SOF_A.
SOF_A:
- If 'A' received, stay in SOF_A.
- If 'B' received, go to SOF_B.
- Otherwise, go to IDLE.
SOF_B:
- If 'A' received, go to SOF_A.
- If 'C' received, go to LENGTH_HI.
- Otherwise, go to IDLE.
...
Note that the initial character of the SOF marker acts as a kind of reset before a complete marker has been parsed. This is why it must be unique – it allows an SOF to be recognized correctly even if it’s preceded by garbage that happens to contain partial SOF sequences.
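Because the first character is unique, the SOF states for a marker of any length can be collapsed into a single counter instead of one hand-written state per character. The sketch below shows one way to do that; the names SOF, SOF_LEN and sof_step are hypothetical, and it relies on the uniqueness requirement described above.

```c
#include <stddef.h>
#include <stdint.h>

/* The marker to search for. Its first byte must not appear again in it. */
static const uint8_t SOF[] = { 'A', 'B', 'C' };
#define SOF_LEN (sizeof SOF)

/* matched is the number of marker bytes recognized so far (0 is IDLE).
 * Returns the new count. A result of SOF_LEN means the marker is
 * complete and the next byte belongs to the LENGTH field. */
static size_t sof_step(size_t matched, uint8_t c)
{
    if (matched < SOF_LEN && c == SOF[matched])
        return matched + 1;   /* next expected marker byte */
    if (c == SOF[0])
        return 1;             /* initial byte restarts the match */
    return 0;                 /* anything else: back to IDLE */
}
```

With the marker ABC, the input ABABC produces the counts 1, 2, 1, 2, 3: the second A restarts the match instead of discarding it.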
When communication starts with a receiver, it may be in an unknown state. You can force the receiver into the IDLE state from any starting state by transmitting at most N+4 non-SOF bytes, where N is the maximum allowable payload length.
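A transmitter could build such a flush sequence once at start-up. The sketch below assumes a MAX_PAYLOAD of 256 and uses 0x00 as the filler byte; both are assumptions, and any byte that does not occur in the SOF marker would do. The name make_flush is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PAYLOAD 256u                 /* assumed maximum payload size (N) */
#define FLUSH_LEN   (MAX_PAYLOAD + 4u)   /* N+4: length (2) + payload (N) + checksum (2) */

/* Fills out with FLUSH_LEN filler bytes that are not part of the SOF
 * marker. Returns the number of bytes written, or 0 if out is too small. */
static size_t make_flush(uint8_t *out, size_t out_cap)
{
    if (out_cap < FLUSH_LEN)
        return 0;
    memset(out, 0x00, FLUSH_LEN);
    return FLUSH_LEN;
}
```

The worst case is a receiver that has just seen a complete SOF: it then consumes two length bytes, up to N payload bytes and two checksum bytes before returning to IDLE.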
Enhancements
If this protocol is implemented in a semi-realtime environment, a good way of improving reliability is to add the requirement of a maximum intra-frame delay (the maximum allowable delay between successive characters of a frame).
The receiver should detect a line-idle condition of greater than the intra-frame delay and reset the state machine when it occurs. This helps to prevent the state machine from getting stuck in the PAYLOAD state if it receives a partial message.
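One way to detect the idle condition is to timestamp each incoming character and compare against the previous one. This sketch assumes a free-running millisecond tick supplied by the caller and a 50 ms limit; the limit and the names idle_timer and idle_expired are illustrative, not part of the protocol.

```c
#include <stdint.h>

#define INTRA_FRAME_TIMEOUT_MS 50u  /* assumed limit; tune to the baud rate */

struct idle_timer {
    uint32_t last_ms;   /* arrival time of the previous character */
    int have_last;      /* 0 until the first character has been seen */
};

/* Call for every incoming character, before feeding it to the state
 * machine. Returns 1 if the line was idle for longer than the limit,
 * in which case the caller should reset the state machine to IDLE
 * and then process the character as usual. */
static int idle_expired(struct idle_timer *t, uint32_t now_ms)
{
    /* Unsigned subtraction stays correct when the tick counter wraps. */
    int expired = t->have_last &&
                  (uint32_t)(now_ms - t->last_ms) > INTRA_FRAME_TIMEOUT_MS;
    t->last_ms = now_ms;
    t->have_last = 1;
    return expired;
}
```

Checking only when a character arrives is enough here, because a stuck state machine does no harm until the next byte is interpreted.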
If the maximum payload length is quite small relative to the maximum value of the length field, the unused upper bits may be used as a parity-check on the length value. This reduces the probability of getting stuck in the PAYLOAD state if a spurious SOF is received.
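As a minimal example of this idea, suppose the maximum payload length fits in 15 bits, so the top bit of the length field is free to carry an even-parity bit over the other 15. The bit layout and the names parity15, length_encode and length_decode are assumptions for the sketch; a smaller maximum would leave room for a stronger check.

```c
#include <stdint.h>

/* Parity (0 or 1) of the low 15 bits of v. */
static unsigned parity15(uint16_t v)
{
    v &= 0x7FFF;
    v ^= v >> 8;
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}

/* Builds the 16-bit LENGTH field: bit 15 is the parity of bits 0..14. */
static uint16_t length_encode(uint16_t len)
{
    return (uint16_t)((len & 0x7FFF) | (parity15(len) << 15));
}

/* Checks the parity bit. On success stores the length in *len and
 * returns 1. On a mismatch returns 0 and the receiver should go
 * straight back to IDLE. */
static int length_decode(uint16_t field, uint16_t *len)
{
    if (parity15(field) != (unsigned)(field >> 15))
        return 0;
    *len = field & 0x7FFF;
    return 1;
}
```

In the receiver, the check would sit in the LENGTH_LO state, before the comparison against the maximum length.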
For devices which are expected to be hot-pluggable, it may be a good idea to send periodic “keep-alive” messages in each direction. Receivers can then reset a timeout counter whenever they receive a valid packet, and perform a higher-level reset if the timeout expires.
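The timeout counter can be as simple as remembering when the last valid packet arrived. The sketch below assumes a millisecond tick supplied by the caller and a 3-second limit; the limit and the names link_watchdog, watchdog_feed and watchdog_expired are hypothetical.

```c
#include <stdint.h>

#define LINK_TIMEOUT_MS 3000u  /* assumed: several keep-alive periods */

struct link_watchdog {
    uint32_t last_valid_ms;  /* arrival time of the last valid packet */
};

/* Call whenever a packet with a valid checksum is received. */
static void watchdog_feed(struct link_watchdog *w, uint32_t now_ms)
{
    w->last_valid_ms = now_ms;
}

/* Poll periodically. Returns 1 when no valid packet has arrived within
 * the timeout, meaning a higher-level reset is due. */
static int watchdog_expired(const struct link_watchdog *w, uint32_t now_ms)
{
    /* Unsigned subtraction stays correct when the tick counter wraps. */
    return (uint32_t)(now_ms - w->last_valid_ms) > LINK_TIMEOUT_MS;
}
```

Choosing a timeout of several keep-alive periods means a single lost or corrupted keep-alive does not trigger a reset.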