How USB Works: Communication Protocol (Part 2)

Published Apr 23, 2024

In the previous tutorial, we learned about the hardware-level connection and functionalities of the USB protocol. Now, in this tutorial, we’ll get a working idea of how the firmware for the USB host communicates with the devices.

We all know that we can plug a USB peripheral such as a keyboard, mouse, or disk drive into a computer and it will be up and running immediately; it seems pretty straightforward. However, this only became the case with the development of the USB protocol. Peripheral devices such as keyboards and mice that use a PS/2 port have to be connected to the computer before it is turned on. If they are connected after the boot process, the devices won’t work, and the computer needs to be restarted to get them working; these devices are not hot-pluggable. USB peripherals are hot-pluggable, i.e., they can be plugged in or removed without restarting the system. In the upcoming tutorials, we’ll understand what makes these devices hot-pluggable.

USB Endpoints

To understand the data flow in the USB protocol, we first need to remember that USB 2.0 uses a polling mechanism, i.e., only the host can initiate a transaction on the bus. With that said, let’s move on to the software components of the USB architecture. A USB device contains a collection of endpoints; a device endpoint is a uniquely addressable portion of the device that acts as the source or destination of data exchanged between the host and the device. There is a catch here: an endpoint number is unique only within its own device. So, several devices with different addresses can each expose the same endpoint numbers. For instance, multiple USB storage drives attached to the host controller each have a unique device address but share similar types of endpoints. The endpoints are connected to the buffers of the host via pipes. Pipes might sound like physical entities, but they are logical channels established between a specific endpoint on the USB device and the host controller. Figure 1.1 gives a demonstration of the USB data flow model.

USB endpoints are unidirectional, and data can only flow in the direction permitted by the endpoint. Transactions use cyclic redundancy checks (CRC) to detect errors. CRC is one of the most common techniques for checking the integrity of a received packet. Before sending the data, the sender runs an algorithm over the data packet to calculate a check value, called the CRC checksum, and appends it to the packet. When the receiver gets the packet, it calculates its own checksum over the data and compares it with the appended one. If both checksums match, no error was detected in the transmission; a mismatch indicates that an error occurred during transmission.
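As a minimal sketch of the append-and-compare procedure described above, the following Python snippet uses the CRC16 variant that USB applies to data packets (generator polynomial x^16 + x^15 + x^2 + 1, processed least significant bit first). The function names are illustrative and are not part of any USB library.

```python
def crc16_usb(data: bytes) -> int:
    """CRC16 as used for USB data packets (bit-reflected polynomial 0xA001)."""
    crc = 0xFFFF                      # the shift register starts as all ones
    for byte in data:
        crc ^= byte
        for _ in range(8):            # process one bit at a time, LSB first
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc ^ 0xFFFF               # the result is inverted before sending


def append_crc(payload: bytes) -> bytes:
    """Sender side: append the checksum (low byte first) to the payload."""
    return payload + crc16_usb(payload).to_bytes(2, "little")


def check_crc(packet: bytes) -> bool:
    """Receiver side: recompute the checksum and compare it with the appended one."""
    payload, received = packet[:-2], packet[-2:]
    return crc16_usb(payload).to_bytes(2, "little") == received


packet = append_crc(b"123456789")
print(check_crc(packet))              # True: the checksums match

corrupted = bytes([packet[0] ^ 0x01]) + packet[1:]
print(check_crc(corrupted))           # False: a single flipped bit is detected
```

Note that a matching checksum only means no error was detected; a CRC cannot repair a damaged packet, so recovery relies on retransmission.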

The USB specification allows up to 32 endpoints per device, consisting of 16 IN endpoints and 16 OUT endpoints, which can be accessed after the initial configuration. To communicate initially, a special pair of endpoints is used, known as the Control Endpoint or Endpoint 0. Endpoint 0 is defined as Endpoint 0 IN and Endpoint 0 OUT, and it doesn’t require a separate descriptor, as it acts as the default endpoint for the host to communicate with the device. Remember that in the USB protocol, IN data moves from the device to the host, and OUT data moves from the host to the device.

Naturally, the question may arise: what is the need for endpoints? Why complicate things unnecessarily? USB works on polling, i.e., a connected device can only communicate with the host when the host requests data from it, and USB is defined to be a universal connector that can connect any supported input, output, or storage device. All these devices have different functionalities and work at different polling rates. Devices such as keyboards and mice require fast polling rates with small packet sizes, while USB thumb drives can tolerate slower polling rates but require a larger packet size. To cater to these needs, the USB specification defines four types of transfers:

Control Endpoints – These endpoints support control transfers, which all devices must support. They are bidirectional endpoints. On Low-Speed and Full-Speed buses, 10% of the bus bandwidth is reserved for control transfers, whereas on High-Speed buses it is 20%. Control transfers are important for USB system-level control and for the initial configuration and setup of the device. This endpoint plays an important role in the device enumeration process.

Interrupt Endpoints – These endpoints support interrupt transfers. The name of this transfer can be misleading: it doesn’t provide a true interrupt mechanism; rather, it ensures that the host polls the device at a fixed rate, i.e., the host checks for data at a predictable interval. These transfers give high reliability for small amounts of data and guarantee accuracy, as errors are detected and failed transactions are retransmitted. This is very important for Human Interface Devices (HID) such as keyboards, mice, and game controllers.

Interrupt transfers can be allocated up to 90% of the bus bandwidth on Low- and Full-Speed buses and up to 80% on High-Speed buses; this budget is shared with isochronous transfers. The maximum packet size of an interrupt endpoint varies with the device speed: it is 1024 bytes on High-Speed devices, 64 bytes on Full-Speed devices, and 8 bytes on Low-Speed devices.

Bulk Endpoints – These endpoints are used in devices where a large amount of data is transferred. The data is transferred in bulk, and this type of transfer is called a bulk transfer. Bulk transfers guarantee accuracy because errors are detected and packets are re-sent on failure. However, there is a tradeoff between accuracy and time: the delivery time depends on the available bandwidth on the bus and is therefore unpredictable, so bulk transfers are suited to data that is not time-sensitive. Note that Low-Speed devices do not support bulk transfers.

Isochronous Endpoints – These endpoints support continuous, real-time transfers with a pre-negotiated bandwidth. These transfers do detect errors but are error-tolerant, as they do not support error recovery mechanisms or handshaking (error recovery and handshaking are discussed in the next part of the tutorial). They are used for streaming applications, where the occasional loss of data is not a big problem because human ears and eyes barely notice it. Isochronous transfers can be allocated up to 90% of the bandwidth on Full-Speed buses and up to 80% on High-Speed buses, and this bandwidth is shared with interrupt transfers. Isochronous transfers are not available on Low-Speed devices.
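To put the percentages above into numbers, here is a rough back-of-the-envelope calculation. It assumes the nominal bus rates (12 Mbit/s for Full-Speed, 480 Mbit/s for High-Speed) and frame lengths (1 ms frames, 125 µs microframes), and it ignores protocol overhead such as sync fields, bit stuffing, and gaps between packets, so the real payload budgets are smaller. The names are illustrative only.

```python
# speed: (bit rate in bit/s, frame length in microseconds, control share in percent)
BUS = {
    "full-speed": (12_000_000, 1000, 10),
    "high-speed": (480_000_000, 125, 20),
}


def frame_budget(speed: str) -> dict:
    """Split the raw byte capacity of one (micro)frame by transfer class."""
    rate, frame_us, control_pct = BUS[speed]
    bytes_per_frame = rate * frame_us // 1_000_000 // 8
    return {
        "bytes_per_frame": bytes_per_frame,
        # share reserved for control transfers
        "control_reserved": bytes_per_frame * control_pct // 100,
        # upper limit shared by interrupt and isochronous transfers
        "periodic_max": bytes_per_frame * (100 - control_pct) // 100,
    }


for speed in BUS:
    print(speed, frame_budget(speed))
```

Under these assumptions a Full-Speed frame has room for 1500 raw bytes (150 reserved for control, at most 1350 for periodic transfers), and a High-Speed microframe has room for 7500 raw bytes (1500 and 6000 respectively). Bulk transfers use whatever is left over in a given frame.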

This explains how endpoints play an important role in the USB protocol to keep it a truly universal bus protocol for any type of device or peripheral. Now, let’s discuss how the communication is carried out in the USB protocol.

Communication

Communication in the USB 2.0 protocol is half-duplex, i.e., data cannot be transmitted and received on the bus simultaneously. Transactions are initiated by the host, while the devices connected to the hubs can only respond to them. One important fact to remember about the USB protocol is that data on the bus is transmitted with the least significant bit (LSB) first.

If we look at the USB protocol from a time perspective, it contains a series of frames. A frame consists of the start of frame (SOF) packet and one or more transactions. SOF marks the beginning of a new frame on the USB bus.

Figure 2.1: USB frame, transaction and packets

Figure 2.1 shows what a transaction looks like; a transaction is a set of a token packet, an optional data packet, and a handshake packet. A packet can contain the following information:

Packet ID (PID): These bits identify the type of packet. This field is 8 bits long: the first 4 bits are type bits and the last 4 bits are error-check bits, which are the complement of the first 4 bits.

For example, an OUT token has a PID of 0001 1110. The receiver can check for the consistency between the upper and lower bits to detect transmission errors within the PID itself.

Optional Device Address: It contains the address of the device and has a length of 7 bits.
Optional Endpoint Address: The USB specification supports up to 32 endpoints per device, i.e., up to 16 IN endpoints and 16 OUT endpoints. The endpoint address is 4 bits long, which gives 16 possible endpoint numbers (0 to 15).
Optional Payload Data: It is the actual chunk of data that has to be sent. Its size can range from 0 to 1024 bytes on High-Speed devices; Full-Speed devices are limited to 1023 bytes and Low-Speed devices to 8 bytes.
Optional CRC: Cyclic redundancy check is the error-checking mechanism used to check the integrity of the packet. The length of the CRC field can be either 5 or 16 bits, depending on the CRC technique used, i.e., CRC5 or CRC16.
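The complement check on the PID field described above can be sketched in a few lines of Python. This sketch assumes the 4 type bits are stored in the low nibble of a byte and the check bits in the high nibble, which matches the order in which they leave an LSB-first bus; the function names are made up for illustration.

```python
def make_pid_byte(pid: int) -> int:
    """Pack a 4-bit PID type with its bitwise complement as check bits."""
    if not 0 <= pid <= 0xF:
        raise ValueError("PID type must fit in 4 bits")
    return ((~pid & 0xF) << 4) | pid


def pid_is_valid(byte: int) -> bool:
    """A PID byte is consistent if its high nibble is the complement of its low nibble."""
    return (byte & 0xF) == (~(byte >> 4) & 0xF)


out_pid = make_pid_byte(0b0001)       # OUT token: type 0001, check 1110
print(f"{out_pid:#04x}")              # 0xe1
print(pid_is_valid(out_pid))          # True
print(pid_is_valid(out_pid ^ 0x01))   # False: one corrupted bit breaks the pairing
```

Because every bit is paired with its complement, any single-bit error in the PID field makes the two halves disagree, and the receiver can ignore the packet.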

Now that we have an abstract idea of how a USB packet looks on the representational level we can dive deep into how it is transmitted on the physical level.

Figure 2.3: USB Packet as captured by a logic analyzer

We already know that the USB protocol describes bus signaling in terms of line states (J, K, SE0, SE1) rather than a differential 0 or 1, because the definition of these states changes with the device speed. A new packet on the USB bus is indicated by a sync pattern, and the end of the packet is marked by an End of Packet (EOP).

To start a transmission, the data lines of the bus are driven to the K state, and the sync pattern is sent to mark the start of a new packet. The sync pattern consists of 8 states in total: 3 KJ pairs followed by 2 K’s (KJKJKJKK). After that, the actual packet data is sent, i.e., the packet ID, payload, etc. Finally, an EOP is sent on the bus to mark the end of the packet. The EOP pattern is an SE0 for 2 bit times followed by a J state for at least 1 bit time. Note that the J state can last longer than one bit time; this depends on the configuration of the host controller.
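The framing described above can be illustrated with a small sketch that lists the line states for a packet consisting of only a PID byte, such as a handshake. It assumes NRZI encoding (a 0 bit toggles the line between J and K, a 1 bit leaves it unchanged), which is not covered in this part, and it assumes the bus idles in the J state. Bit stuffing is deliberately left out, so the sketch is only suitable for a short field like a lone PID. The byte 0xD2 is the ACK packet ID (type 0010 plus its complement 1101).

```python
IDLE = "J"
EOP = ["SE0", "SE0", "J"]             # SE0 for 2 bit times, then J


def nrzi(bits, state):
    """NRZI encode: a 0 bit toggles the line state, a 1 bit keeps it."""
    out = []
    for bit in bits:
        if bit == 0:
            state = "K" if state == "J" else "J"
        out.append(state)
    return out


def lsb_first(byte):
    """Bits of a byte in transmission order (least significant bit first)."""
    return [(byte >> i) & 1 for i in range(8)]


def line_states(pid_byte):
    """Line states for sync pattern + PID byte + EOP (no bit stuffing)."""
    sync = nrzi([0, 0, 0, 0, 0, 0, 0, 1], IDLE)
    pid = nrzi(lsb_first(pid_byte), sync[-1])
    return sync + pid + EOP


print(" ".join(line_states(0xD2)))
# K J K J K J K K J J K J J K K K SE0 SE0 J
```

Seen this way, the KJKJKJKK sync pattern is simply seven 0 bits followed by a 1 bit, encoded from an idle J line.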

There is support for four different types of packets in the USB 2.0 specification.

Token packets
Data packets
Handshake packets
Special packets

We’ll start by looking at the function of each packet one by one.

1. Token Packets

Token packets are always sent by the host and are used to direct the traffic on the bus. A token packet decides the type of transaction that will be carried out between the host and the device. This packet contains a packet ID, a 7-bit device address, a 4-bit endpoint number, and a 5-bit CRC. There are four different types of token packets:

IN token packet: This type of token packet is used to request the device for data and in the transaction the data packet is sent from the device to the host.
OUT token packet: This type of token packet is used to notify the device that the host is ready to send data to the device.
SETUP token packet: This type of token packet is issued during the setup and configuration of the device.

SOF token packet: This type of token packet is issued to mark the start of a new frame. It is a special type of token packet: instead of a device address and endpoint number, it carries an 11-bit field called the frame number, which identifies the current frame. The host controller increments the frame number once every 1 ms. On a bus with 1 ms frames, every SOF packet therefore carries a new number. On a High-Speed bus, an SOF packet is sent every 125 µs, but the frame number still increments only once per 1 ms, so the same number is repeated in consecutive SOF packets.
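As a sketch of how the fields of an IN, OUT, or SETUP token fit together, the snippet below packs a packet ID, a 7-bit address, a 4-bit endpoint number, and a CRC5 into three bytes. It assumes the CRC5 generator polynomial x^5 + x^2 + 1 with an all-ones starting value and an inverted result, computed over the 11 address and endpoint bits in transmission order. The helper names are hypothetical, and the sync pattern and EOP are not included.

```python
def crc5_usb(value: int, nbits: int = 11) -> int:
    """CRC5 over the low `nbits` of value, processed LSB first."""
    crc = 0x1F
    for i in range(nbits):
        bit = (value >> i) & 1
        if (crc ^ bit) & 1:
            crc = (crc >> 1) ^ 0x14   # 0x14 is x^5 + x^2 + 1, bit-reflected
        else:
            crc >>= 1
    return crc ^ 0x1F


def token_packet(pid: int, address: int, endpoint: int) -> bytes:
    """Build the PID byte followed by the address, endpoint, and CRC5 fields."""
    if not (0 <= address < 128 and 0 <= endpoint < 16):
        raise ValueError("address is 7 bits, endpoint is 4 bits")
    pid_byte = ((~pid & 0xF) << 4) | (pid & 0xF)
    fields = address | (endpoint << 7)            # 11 bits, address first
    word = fields | (crc5_usb(fields) << 11)      # CRC5 fills the top 5 bits
    return bytes([pid_byte]) + word.to_bytes(2, "little")


# SETUP token (PID 1101) to device address 0, endpoint 0
print(token_packet(0b1101, 0, 0).hex())           # 2d0010
```

The example output is the token a host sends to a freshly attached device, which answers on address 0 and Endpoint 0 until it has been assigned its own address.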

As discussed earlier, communication on the USB bus is carried out in frames. For Low-Speed (1.5 Mbit/s) and Full-Speed (12 Mbit/s) devices, the host controller divides the bus time into 1 ms frames, whereas for High-Speed devices (480 Mbit/s) each frame lasts 125 µs; these 125 µs frames are called microframes. In each frame or microframe, multiple transactions can take place. The type of the token packet decides the type of transaction: an IN token packet on the bus starts an upstream transaction, and an OUT token packet starts a downstream transaction.
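The relationship between SOF packets and the frame number can be sketched as follows. The sketch assumes that the 11-bit frame number wraps from 2047 back to 0, and that a High-Speed bus sends eight SOF packets per millisecond (1 ms / 125 µs) that all carry the same frame number. The function name is illustrative.

```python
FRAME_NUMBER_BITS = 11


def sof_frame_numbers(count: int, high_speed: bool = False, start: int = 0) -> list:
    """Frame numbers carried by `count` consecutive SOF packets."""
    sofs_per_ms = 8 if high_speed else 1          # microframes per 1 ms frame
    modulo = 1 << FRAME_NUMBER_BITS               # 2048 possible values
    return [(start + i // sofs_per_ms) % modulo for i in range(count)]


# 1 ms frames: a new number with every SOF, wrapping after 2047
print(sof_frame_numbers(3, start=2046))           # [2046, 2047, 0]

# 125 us microframes: the number changes only every eighth SOF
print(sof_frame_numbers(10, high_speed=True))     # [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
```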

Figure 2.6: Role of SOF packet on the bus

2. Data Packets

Data packets carry the payload that is requested or sent by the host controller. A data packet consists of a packet ID, the payload data, and a CRC16 field to check the integrity of the received packet. Data packets support two packet IDs: DATA0 and DATA1. The question may arise as to why two different packet IDs are used when a single ID could mark the packet as a data packet. The reason is that each time a data packet is sent, the packet ID is toggled to its alternate state, i.e., if the current packet is DATA0, the next packet will be DATA1, and vice versa.

This technique of toggling the packet ID helps the sender and receiver stay in sync when packets are lost. Suppose the last data packet the device accepted had DATA0 as its packet ID, and the next data packet also arrives as DATA0 instead of the expected DATA1. The device can conclude that something went missing in between: typically its handshake never reached the host, so the host has sent the same packet again. The device can then discard the duplicate rather than process the same data twice. The payload in each data packet can range from 0 to 1024 bytes.
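A receiver that tracks the data toggle can be sketched as below. The sketch assumes the usual handling in which a packet that arrives with the unexpected packet ID is treated as a retransmission: it is acknowledged again, but its payload is not stored a second time. The class and method names are hypothetical.

```python
class ToggleReceiver:
    """Tracks which data PID is expected next and filters out repeats."""

    def __init__(self):
        self.expected = "DATA0"
        self.received = []

    def on_data(self, pid: str, payload: bytes) -> str:
        if pid == self.expected:
            self.received.append(payload)
            # flip the toggle for the next packet
            self.expected = "DATA1" if pid == "DATA0" else "DATA0"
            return "accepted"
        # same PID as last time: the sender repeated the previous packet
        return "duplicate"


rx = ToggleReceiver()
print(rx.on_data("DATA0", b"first"))    # accepted
print(rx.on_data("DATA0", b"first"))    # duplicate (repeat of the first packet)
print(rx.on_data("DATA1", b"second"))   # accepted
print(rx.received)                      # [b'first', b'second']
```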

3. Handshake Packets

These packets are used to conclude each transaction. Each handshake packet consists of an 8-bit packet ID and is sent by the receiver of the transaction.

ACK: Acknowledge successful completion.
NAK: Negative acknowledgement.
STALL: Error indication sent by a device.
NYET: Indicates the device is not ready to receive another data packet. This type of packet ID is only supported on the High-Speed devices.

4. Special Packets

PRE: This is issued by the host to hubs to indicate that the next packet is low-speed traffic travelling over a Full-Speed bus segment.
PING: Only available on High-Speed devices, it is used to check the status of USB devices after receiving an NYET packet.

Packet Type	PID Value	Packet Identifier
Token	0001	OUT Token
Token	1001	IN Token
Token	0101	SOF Token
Token	1101	SETUP Token
Data	0011	DATA0
Data	1011	DATA1
Handshake	0010	ACK
Handshake	1010	NAK
Handshake	1110	STALL
Handshake	0110	NYET
Special	1100	Preamble (PRE)
Special	0100	PING
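The table above maps directly onto a lookup structure. The sketch below decodes a received PID byte into its packet type and name, again assuming the type bits sit in the low nibble and the check bits in the high nibble. It also shows a pattern visible in the table: the two lowest type bits identify the packet group. The names are illustrative.

```python
PID_TABLE = {
    0b0001: ("Token", "OUT"),
    0b1001: ("Token", "IN"),
    0b0101: ("Token", "SOF"),
    0b1101: ("Token", "SETUP"),
    0b0011: ("Data", "DATA0"),
    0b1011: ("Data", "DATA1"),
    0b0010: ("Handshake", "ACK"),
    0b1010: ("Handshake", "NAK"),
    0b1110: ("Handshake", "STALL"),
    0b0110: ("Handshake", "NYET"),
    0b1100: ("Special", "PRE"),
    0b0100: ("Special", "PING"),
}

# The two lowest type bits are the same for every packet in a group.
GROUPS = {0b01: "Token", 0b11: "Data", 0b10: "Handshake", 0b00: "Special"}


def decode_pid(byte: int) -> tuple:
    """Return (packet type, name) for a PID byte, or raise ValueError."""
    pid, check = byte & 0xF, byte >> 4
    if check != (~pid & 0xF):
        raise ValueError("PID check bits do not match")
    if pid not in PID_TABLE:
        raise ValueError("PID not listed in the table")
    return PID_TABLE[pid]


print(decode_pid(0x69))   # ('Token', 'IN')
print(decode_pid(0xD2))   # ('Handshake', 'ACK')
```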

Types of Transactions

There are mainly three types of transactions on the bus. They are as follows:

Upstream Transactions

Upstream/Read/IN transactions are transactions in which data is sent from the device to the host. The host initiates them by requesting data from the device with an IN token packet. The device sends a data packet, and the host responds with a handshake packet.

If the device is not ready to send data at that moment, it responds with a NAK instead of a data packet to notify the host that it has no data ready yet, and the host tries again after some time.

Downstream Transactions

Downstream/Write/OUT transactions are transactions in which data is sent from the host to the device. The host initiates them with an OUT token packet, followed by the data packet. The device responds with a handshake packet on reception of the data.

If the device cannot accept the data sent by the host, it responds with a NAK. The host concludes that the device cannot accept the packet right now and retries after some time.
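The two transaction types above, including the NAK retry behaviour, can be modelled with a toy simulation. Everything here is hypothetical: the `Device` class, the `busy_polls` counter that makes it answer NAK a few times, and the `max_tries` limit are stand-ins to show the packet order, not a model of a real host controller.

```python
class Device:
    """Toy device that answers NAK for the first `busy_polls` requests."""

    def __init__(self, data: bytes = b"", busy_polls: int = 0):
        self.data = data
        self.busy_polls = busy_polls
        self.stored = []

    def _busy(self) -> bool:
        if self.busy_polls > 0:
            self.busy_polls -= 1
            return True
        return False

    def handle_in(self):
        return ("NAK", None) if self._busy() else ("DATA", self.data)

    def handle_out(self, payload: bytes) -> str:
        if self._busy():
            return "NAK"
        self.stored.append(payload)
        return "ACK"


def in_transaction(device, max_tries=5):
    """Host reads from the device: IN token, then DATA + ACK, or NAK and retry."""
    log = []
    for _ in range(max_tries):
        log.append("host: IN token")
        kind, payload = device.handle_in()
        if kind == "DATA":
            log += ["device: DATA", "host: ACK"]
            return payload, log
        log.append("device: NAK")
    return None, log


def out_transaction(device, payload, max_tries=5):
    """Host writes to the device: OUT token + DATA, then ACK, or NAK and retry."""
    log = []
    for _ in range(max_tries):
        log += ["host: OUT token", "host: DATA"]
        reply = device.handle_out(payload)
        log.append(f"device: {reply}")
        if reply == "ACK":
            return True, log
    return False, log


payload, log = in_transaction(Device(data=b"key press", busy_polls=2))
print(payload)
print("\n".join(log))
```

In both directions the host sends the token and decides when to retry; the device only ever answers.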

Control Transactions

Control transfers identify, configure, and control devices and enable the host to read information about a device, set the device address, establish configuration, and issue certain commands. A control transfer is always directed to the control endpoint (Endpoint 0) of a device. Control transfers have three stages: the setup stage, the data stage, and the status stage.

The setup packet is only used in a control transaction in the setup stage. This packet sends USB requests from the host to the device. The device always acknowledges the setup stage and cannot NAK it.

The data stage is optional in a control transaction; it can have multiple data transactions and is only required when a data payload is to be transferred between the host and the device.

The final stage, the status stage, consists of a single IN or OUT transaction that reports the success or failure of the previous stages. Its data packet always uses the DATA1 packet ID and carries a zero-length payload. The status stage ends with a handshake packet sent by the receiver of that data packet.
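The three stages can be laid out as a sequence of packets. The sketch below goes slightly beyond the text and assumes details from the USB 2.0 specification: the setup stage carries its request in a DATA0 packet, the data stage starts with DATA1 and then toggles, and the status stage runs in the opposite direction to the data stage (or as an IN transaction when there is no data stage). The function name and the tuple layout are illustrative.

```python
def control_transfer(direction: str, data_packets: int = 0) -> list:
    """List the packets of a control transfer as (sender, packet) tuples.

    direction is "IN" (control read) or "OUT" (control write); it only
    matters when data_packets > 0.
    """
    # Setup stage: the device must answer with ACK
    packets = [("host", "SETUP"), ("host", "DATA0"), ("device", "ACK")]

    # Optional data stage: one transaction per data packet
    sender, receiver = ("device", "host") if direction == "IN" else ("host", "device")
    for i in range(data_packets):
        pid = "DATA1" if i % 2 == 0 else "DATA0"
        packets += [("host", direction), (sender, pid), (receiver, "ACK")]

    # Status stage: opposite direction, zero-length DATA1
    if data_packets > 0 and direction == "IN":
        token, s_sender, s_receiver = "OUT", "host", "device"
    else:
        token, s_sender, s_receiver = "IN", "device", "host"
    packets += [("host", token), (s_sender, "DATA1 (zero-length)"), (s_receiver, "ACK")]
    return packets


# A control read with two data packets
for sender, packet in control_transfer("IN", data_packets=2):
    print(f"{sender:>6}: {packet}")
```

The final ACK shown here is the successful case; a device that cannot complete the request would report that in the status stage instead.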

In this tutorial, we learned about USB Transfer types and how they are crucial for extending the support of USB protocol across all devices despite their functionalities. Then, we learned how transactions occur on the bus. In the next tutorial, we’ll understand the USB enumeration and configuration process and the significance of USB descriptors.

Authored By

Jesal Shah

An electronics engineer who programs robots. Amazed by how simple algorithms can solve the hardest problems. Learning robotics and enjoys photography and videography.

