Reliable packetization over a noisy serial line

Daniel Beer

10 May 2012

Many peripherals still use raw UART-based protocols like RS-232 and RS-485, particularly in industrial electronics. Any protocol for communication over one of these serial lines needs to be able to cope with transmission errors and garbage.

This document describes a system for transmitting and receiving variable-length packets via a serial line. Packets are received fully intact, or not at all (like UDP datagrams). The protocol is also highly resistant to garbage. Garbage between well-formed packets is unlikely to prevent them from being received, and the receiver's state will not be adversely affected by a long stream of garbage (for example, data transmitted at the wrong baud rate).

Message format

The basic message format is always the same:

[SOF] [LENGTH] [PAYLOAD] [CHECKSUM]

The parts of the message, and their purpose, are explained below:

SOF

Start-of-frame sequence. In this document, it's assumed to be the two-character sequence AB. The only requirement is that the initial character is unique within the sequence.

The purpose of this sequence is to prevent the receiver from interpreting any garbage bytes as a length field, and then expecting a possibly lengthy payload while the transmitter may be attempting to begin transmission of a new message.

LENGTH

Two-byte payload size field. This field is clearly necessary because we want to be able to send variable-length messages.

There is a maximum size to payloads. If the receiver does somehow get triggered by a spurious SOF sequence, we need to make sure that there's a limit to how long it could be stuck misinterpreting incoming data as a payload.

PAYLOAD

The message data. This field is entirely unspecified, apart from being bound by the maximum payload size described above.

CHECKSUM

A 16-bit (two byte) CRC, calculated over the payload. Messages without a valid CRC are ignored.
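
As an illustration of the format, here is a minimal transmitter-side sketch in C. Several details are assumptions rather than part of the protocol description: the names crc16, frame_encode and MAX_PAYLOAD are hypothetical, the limit of 256 bytes is arbitrary, and the CRC is CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF) because no particular CRC is specified. Multi-byte fields are sent high byte first, which is the order the receiver states in this document expect.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PAYLOAD 256u  /* assumed limit; the text only requires that one exists */

/* CRC-16/CCITT-FALSE (poly 0x1021, init 0xFFFF). An assumed choice:
 * the protocol description does not name a particular CRC. */
static uint16_t crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Build [SOF][LENGTH][PAYLOAD][CHECKSUM] in out. Returns the frame size,
 * or 0 if the payload is too long or the output buffer is too small. */
static size_t frame_encode(const uint8_t *payload, size_t len,
                           uint8_t *out, size_t out_size)
{
    if (len > MAX_PAYLOAD || out_size < len + 6)
        return 0;

    uint16_t crc = crc16(payload, len);  /* CRC covers the payload only */
    size_t n = 0;

    out[n++] = 'A';                      /* SOF */
    out[n++] = 'B';
    out[n++] = (uint8_t)(len >> 8);      /* LENGTH, high byte first */
    out[n++] = (uint8_t)(len & 0xFF);
    for (size_t i = 0; i < len; i++)     /* PAYLOAD */
        out[n++] = payload[i];
    out[n++] = (uint8_t)(crc >> 8);      /* CHECKSUM, high byte first */
    out[n++] = (uint8_t)(crc & 0xFF);
    return n;
}
```

The framing overhead is a fixed 6 bytes per message: 2 for the SOF, 2 for the length and 2 for the checksum.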

Receiver implementation

The receiver can be implemented as a state machine with a single event type: an incoming character. The state machine is reset by putting it into the IDLE state.

Receiver state machine for the described protocol.

The states are described below:

IDLE:
  - If 'A' received, go to SOF_A.

SOF_A:
  - If 'A' received, stay in SOF_A.
  - If 'B' received, go to LENGTH_HI.
  - Otherwise, go to IDLE.

LENGTH_HI:
  - Store incoming byte as high byte of length.
  - Go to LENGTH_LO.

LENGTH_LO:
  - Store incoming byte as low byte of length.
  - If length greater than maximum allowed, go to IDLE.
  - If length is zero, go to CKSUM_HI.
  - Otherwise, set pointer = 0 and go to PAYLOAD.

PAYLOAD:
  - Store incoming byte at current offset in payload buffer and
    increment pointer.
  - If pointer >= length, go to CKSUM_HI.
  - Otherwise, stay in PAYLOAD.

CKSUM_HI:
  - Store incoming byte as high byte of checksum.
  - Go to CKSUM_LO.

CKSUM_LO:
  - Store incoming byte as low byte of checksum.
  - If computed checksum matches stored checksum, process the
    payload as a received message.
  - Go to IDLE.
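
The states above translate fairly directly into a switch statement. The following is a sketch rather than a reference implementation: the names struct receiver and rx_byte are hypothetical, MAX_PAYLOAD is an arbitrary 256 bytes, and the checksum is assumed to be CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF), since no particular CRC is specified.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PAYLOAD 256u  /* assumed maximum payload length */

enum rx_state { IDLE, SOF_A, LENGTH_HI, LENGTH_LO, PAYLOAD, CKSUM_HI, CKSUM_LO };

struct receiver {
    enum rx_state state;
    uint16_t length;               /* payload length from the LENGTH field */
    uint16_t pointer;              /* next free offset in payload[] */
    uint16_t cksum;                /* checksum as received */
    uint8_t payload[MAX_PAYLOAD];
};

/* CRC-16/CCITT-FALSE, an assumed choice of CRC. */
static uint16_t crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Feed one incoming byte to the state machine. Returns 1 when a complete
 * message with a valid checksum is available in rx->payload (rx->length
 * bytes), and 0 otherwise. */
static int rx_byte(struct receiver *rx, uint8_t c)
{
    switch (rx->state) {
    case IDLE:
        if (c == 'A')
            rx->state = SOF_A;
        break;
    case SOF_A:
        if (c == 'B')
            rx->state = LENGTH_HI;
        else if (c != 'A')         /* a repeated 'A' stays in SOF_A */
            rx->state = IDLE;
        break;
    case LENGTH_HI:
        rx->length = (uint16_t)(c << 8);
        rx->state = LENGTH_LO;
        break;
    case LENGTH_LO:
        rx->length |= c;
        if (rx->length > MAX_PAYLOAD) {
            rx->state = IDLE;      /* implausible length: treat as garbage */
        } else if (rx->length == 0) {
            rx->state = CKSUM_HI;  /* empty payload */
        } else {
            rx->pointer = 0;
            rx->state = PAYLOAD;
        }
        break;
    case PAYLOAD:
        rx->payload[rx->pointer++] = c;
        if (rx->pointer >= rx->length)
            rx->state = CKSUM_HI;
        break;
    case CKSUM_HI:
        rx->cksum = (uint16_t)(c << 8);
        rx->state = CKSUM_LO;
        break;
    case CKSUM_LO:
        rx->cksum |= c;
        rx->state = IDLE;
        return rx->cksum == crc16(rx->payload, rx->length);
    }
    return 0;
}
```

A caller would zero-initialize the structure (which puts it in IDLE) and call rx_byte() for each byte read from the UART, handling the payload whenever it returns 1. The payload buffer is only valid until the next byte is fed in, so it should be processed or copied immediately.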

This state machine can easily be implemented in under 100 lines of C. The states used to parse the SOF marker are what give the receiver its resistance to inter-packet garbage. More states can be added to parse longer SOF markers. For example, if the sequence ABC is used as a marker:

IDLE:
  - If 'A' received, go to SOF_A.

SOF_A:
  - If 'A' received, stay in SOF_A.
  - If 'B' received, go to SOF_B.
  - Otherwise, go to IDLE.

SOF_B:
  - If 'A' received, go to SOF_A.
  - If 'C' received, go to LENGTH_HI.
  - Otherwise, go to IDLE.

...

Note that the initial character of the SOF marker acts as a kind of reset before a complete marker has been parsed. This is why it must be unique: it allows an SOF to be recognized correctly even if it's preceded by garbage that happens to contain partial SOF sequences.

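When communication starts with a receiver, it may be in an unknown state. You can force the receiver into the IDLE state from any starting state by transmitting at most N+4 non-SOF bytes, where N is the maximum allowable payload length. The worst case is a receiver that has just parsed an SOF: it may consume two length bytes, up to N payload bytes and two checksum bytes before it returns to IDLE.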

Enhancements

If this protocol is implemented in a semi-realtime environment, a good way of improving reliability is to add the requirement of a maximum intra-frame delay (the maximum allowable delay between successive characters of a message).

The receiver should detect a line-idle condition of greater than the intra-frame delay and reset the state machine when it occurs. This helps to prevent the state machine from getting stuck in the PAYLOAD state if it receives a partial message.
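
One way to detect the line-idle condition is sketched below, assuming the platform provides a free-running millisecond counter. The names idle_detector and idle_check, and the 10 ms limit, are hypothetical. This version checks lazily, when the next byte arrives, rather than with a timer interrupt; for parsing purposes the effect is the same, because the state machine only acts on incoming bytes.

```c
#include <stdint.h>

#define INTRA_FRAME_TIMEOUT_MS 10u  /* assumed value; choose to suit the baud rate */

struct idle_detector {
    uint32_t last_byte_ms;  /* time at which the previous byte arrived */
    int have_byte;          /* 0 until the first byte has been seen */
};

/* Call for every received byte, before feeding it to the state machine.
 * now_ms is the current value of a free-running millisecond counter.
 * Returns 1 if the gap since the previous byte exceeded the limit, in
 * which case the caller should reset the state machine to IDLE before
 * processing the new byte. */
static int idle_check(struct idle_detector *d, uint32_t now_ms)
{
    /* Unsigned subtraction gives the right gap even if the counter wraps. */
    int expired = d->have_byte &&
                  (uint32_t)(now_ms - d->last_byte_ms) > INTRA_FRAME_TIMEOUT_MS;

    d->last_byte_ms = now_ms;
    d->have_byte = 1;
    return expired;
}
```

The byte that arrives after the gap is still fed to the state machine once it has been reset, since it may be the first character of a new SOF.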

If the maximum payload length is quite small relative to the maximum value of the length field, the unused upper bits may be used as a parity-check on the length value. This reduces the probability of getting stuck in the PAYLOAD state if a spurious SOF is received.
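
As one simple instance of this idea, the sketch below assumes payloads never exceed 32767 bytes, so that bit 15 of the length field is free, and uses that bit as an even-parity bit over the whole field. The function names are hypothetical, and a design with more spare bits could use a stronger check.

```c
#include <stdint.h>

#define LENGTH_MASK 0x7FFFu  /* assumes the maximum payload fits in 15 bits */

/* Returns 1 if an odd number of bits are set in v, 0 otherwise. */
static unsigned parity16(uint16_t v)
{
    v ^= v >> 8;
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}

/* Transmitter side: set bit 15 so that the 16-bit field has even parity.
 * Lengths above LENGTH_MASK are not representable and are truncated. */
static uint16_t length_encode(uint16_t len)
{
    len &= LENGTH_MASK;
    return (uint16_t)(len | (parity16(len) << 15));
}

/* Receiver side: returns the payload length, or -1 if the parity check
 * fails, in which case the state machine should go back to IDLE. */
static int32_t length_decode(uint16_t field)
{
    if (parity16(field))
        return -1;
    return field & LENGTH_MASK;
}
```

In the receiver, the check would sit in the LENGTH_LO state, before the comparison against the maximum length. A single parity bit rejects about half of all random length fields.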

For devices which are expected to be hot-pluggable, it may be a good idea to send periodic "keep-alive" messages in each direction. Receivers can then reset a timeout counter whenever they receive a valid packet, and perform a higher-level reset if the timeout expires.
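
The timeout counter can be as simple as a countdown that is reloaded on every valid packet. In this sketch the names link_watchdog, link_packet_received and link_tick are hypothetical, and the timeout of 50 ticks is an arbitrary value that would normally be set to several keep-alive periods.

```c
#include <stdint.h>

#define LINK_TIMEOUT_TICKS 50u  /* assumed: several keep-alive periods */

struct link_watchdog {
    uint32_t ticks_left;  /* 0 means expired or not yet started */
};

/* Call whenever a valid packet (keep-alive or otherwise) is received. */
static void link_packet_received(struct link_watchdog *w)
{
    w->ticks_left = LINK_TIMEOUT_TICKS;
}

/* Call once per timer tick. Returns 1 on the tick at which the timeout
 * expires, so that the caller can perform its higher-level reset once,
 * and 0 on every other tick. */
static int link_tick(struct link_watchdog *w)
{
    if (w->ticks_left == 0)
        return 0;
    return --w->ticks_left == 0;
}
```

Because link_tick() reports expiry only once, the higher-level reset is not repeated on every tick while the device remains unplugged.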