Writing Drivers for SPI Chips in Embedded Systems

A practical reference for writing embedded SPI drivers for ADCs, sensors, DACs, digital potentiometers, and other ordinary peripheral chips, with datasheet reading, transaction design, register maps, timing, testing, and bring-up checks.

Part 2 of 3

Previously: We set up the driver boundary, SPI transaction shape, register helpers, ID checks, and command-style ADC access.

In this part: We handle conversion timing, polling, decoding, burst reads, shared buses, DMA thresholds, and asynchronous reads.

Model the conversion sequence

A conversion sequence is state, even in a small blocking driver. There is usually a moment when configuration is applied, a moment when conversion begins, a period where data is not ready, and a read phase where the returned bytes finally mean something. Naming those moments prevents stale data from looking like a valid measurement.

The failure mode is usually subtle. The driver returns numbers, the numbers change, and the system appears alive. Only later does someone notice that a channel change is one sample late, or that the first sample after changing gain is invalid. Modeling the sequence gives you a place to discard or flag those cases intentionally.

ADCs and sensors often have a measurement sequence that is separate from the SPI read. Some parts continuously convert. Some require a start command. Some need a conversion delay before the read. Some return the previous conversion while the new one starts. Some expose a data-ready bit or interrupt pin.

If the driver treats all of those chips as "read register and return value", it will eventually return stale or half-ready data.

[Figure: numbered conversion sequence for an SPI ADC or sensor, with arrows connecting the stages.]
For converters and sensors, the useful driver operation is usually a sequence. The SPI read is only one step in that sequence.

The driver should make the sequence explicit. A blocking version is fine for small systems when the conversion time is short and known. A nonblocking version is better when the conversion time is long, when the system has a watchdog, or when the part is sampled periodically.

The blocking helper can sit on top of the same sequence if that fits the project. What matters is that the code names the waiting step and the timeout.
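As a minimal sketch of that idea, the code below names the stages, bounds the wait, and discards the first conversion after a channel change. Everything in it is an assumption for illustration: the `adc_ops_t` hooks, the `adc_*` function names, the poll-count timeout, and the one-sample discard rule, which must come from the real part's datasheet. The `fake_*` functions model a hypothetical part for host testing only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { ADC_OK, ADC_ERR_BUS, ADC_ERR_TIMEOUT } adc_status_t;

/* Named stages of one conversion; the names are illustrative. */
typedef enum { CONV_IDLE, CONV_CONFIGURED, CONV_RUNNING, CONV_READY } conv_stage_t;

/* Hypothetical board-level operations, each one a short SPI transaction. */
typedef struct {
    bool (*write_config)(void *ctx, uint8_t channel);
    bool (*start)(void *ctx);
    bool (*is_ready)(void *ctx);
    bool (*read_raw)(void *ctx, uint16_t *raw);
    void *ctx;
} adc_ops_t;

typedef struct {
    const adc_ops_t *ops;
    conv_stage_t stage;
    uint8_t discard;      /* conversions to drop after a configuration change */
} adc_dev_t;

adc_status_t adc_set_channel(adc_dev_t *dev, uint8_t channel)
{
    assert(dev && dev->ops);
    if (!dev->ops->write_config(dev->ops->ctx, channel)) return ADC_ERR_BUS;
    dev->stage = CONV_CONFIGURED;
    dev->discard = 1;     /* assumption: the first result after a change is stale */
    return ADC_OK;
}

/* Start, wait with a bounded number of polls, then read. */
static adc_status_t convert_once(adc_dev_t *dev, uint32_t max_polls, uint16_t *raw)
{
    const adc_ops_t *ops = dev->ops;
    if (!ops->start(ops->ctx)) return ADC_ERR_BUS;
    dev->stage = CONV_RUNNING;
    for (uint32_t i = 0; i < max_polls; i++) {
        if (ops->is_ready(ops->ctx)) {
            dev->stage = CONV_READY;
            bool ok = ops->read_raw(ops->ctx, raw);
            dev->stage = CONV_IDLE;
            return ok ? ADC_OK : ADC_ERR_BUS;
        }
    }
    dev->stage = CONV_IDLE;
    return ADC_ERR_TIMEOUT;
}

adc_status_t adc_read_blocking(adc_dev_t *dev, uint32_t max_polls, uint16_t *raw)
{
    assert(dev && dev->ops && raw);
    while (dev->discard > 0) {          /* drop stale conversions on purpose */
        uint16_t stale;
        adc_status_t st = convert_once(dev, max_polls, &stale);
        if (st != ADC_OK) return st;
        dev->discard--;
    }
    return convert_once(dev, max_polls, raw);
}

/* Fake part for host tests: the first conversion after a channel change
 * still measures the old channel, and data-ready takes two polls. */
typedef struct { uint8_t cfg, latched; uint16_t result; int busy; } fake_adc_t;

bool fake_cfg(void *c, uint8_t ch) { ((fake_adc_t *)c)->cfg = ch; return true; }
bool fake_start(void *c)
{
    fake_adc_t *f = c;
    f->result = (uint16_t)(100u * f->latched);  /* value encodes the channel */
    f->latched = f->cfg;
    f->busy = 2;
    return true;
}
bool fake_ready(void *c)
{
    fake_adc_t *f = c;
    if (f->busy > 0) { f->busy--; return false; }
    return true;
}
bool fake_read(void *c, uint16_t *raw) { *raw = ((fake_adc_t *)c)->result; return true; }
```

With the fake, selecting channel 3 and reading once returns the channel-3 value because the stale conversion is dropped inside the driver, and a poll budget that is too small surfaces as a timeout instead of a hang.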

Poll status instead of guessing delays

Polling exists because conversion time is often a function of configuration. Oversampling, filter settings, power mode, reference settling, and sensor temperature can all move the ready time. A fixed delay that was safe in the lab may be too short in a cold chamber or unnecessarily long in the normal case.

There is a balance here. Polling too aggressively can waste bus bandwidth and CPU time, especially on a shared bus. A good driver uses the ready flag when available, waits at a reasonable interval, and still keeps a timeout so a broken device does not trap the system forever.

Fixed delays are tempting because they make the first bench demo quick, but they age poorly: supply voltage and part revisions shift conversion timing just as the configuration settings above do. If the chip provides a data-ready flag, a busy bit, or a data-ready pin, use it and keep a timeout as the guard.

There are chips where a fixed delay is the only option. In that case, keep the delay named, keep it close to the datasheet value, and avoid hiding it deep inside a function called read().
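A polling loop with a named interval and a timeout might look like the sketch below. The hook structure, the `wait_data_ready` name, and the position of the data-ready bit are assumptions; the real bit comes from the datasheet, and the time hooks would map to a tick counter and a delay or RTOS sleep. The `fake_*` functions exist only so the loop can run on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { POLL_READY, POLL_TIMEOUT, POLL_BUS_ERROR } poll_result_t;

/* Hypothetical hooks supplied by the board layer. */
typedef struct {
    bool (*read_status)(void *ctx, uint8_t *status);  /* one SPI status read */
    uint32_t (*now_ms)(void *ctx);                    /* free-running ms tick */
    void (*sleep_ms)(void *ctx, uint32_t ms);         /* delay or RTOS sleep */
    void *ctx;
} poll_hooks_t;

#define STATUS_DRDY_MASK 0x01u  /* assumed bit; take the real one from the datasheet */

poll_result_t wait_data_ready(const poll_hooks_t *h,
                              uint32_t interval_ms, uint32_t timeout_ms)
{
    assert(h && h->read_status && h->now_ms && h->sleep_ms);
    const uint32_t start = h->now_ms(h->ctx);
    for (;;) {
        uint8_t status = 0;
        if (!h->read_status(h->ctx, &status)) return POLL_BUS_ERROR;
        if (status & STATUS_DRDY_MASK) return POLL_READY;
        /* Unsigned subtraction stays correct when the tick wraps around. */
        if ((uint32_t)(h->now_ms(h->ctx) - start) >= timeout_ms) return POLL_TIMEOUT;
        h->sleep_ms(h->ctx, interval_ms);   /* do not hammer a shared bus */
    }
}

/* Fake chip and clock for host tests: ready after a number of status reads. */
typedef struct { uint32_t now; uint32_t ready_after; uint32_t reads; } fake_chip_t;

bool fake_status(void *c, uint8_t *status)
{
    fake_chip_t *f = c;
    f->reads++;
    *status = (f->reads > f->ready_after) ? STATUS_DRDY_MASK : 0u;
    return true;
}
uint32_t fake_now(void *c) { return ((fake_chip_t *)c)->now; }
void fake_sleep(void *c, uint32_t ms) { ((fake_chip_t *)c)->now += ms; }
```

The three result values keep "bus failed" and "device never became ready" apart, and the status is always read at least once before the timeout is checked, so a device that is already ready costs one transaction.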

Decode raw data in the driver

Raw decoding belongs close to the transaction because that is where the byte order, alignment, status bits, and sign rules are still visible. If each caller decodes raw bytes independently, the project can end up with two definitions of the same measurement.

Lesson learned: One of the most common ADC bugs is a value that is correct near zero and wrong near full scale because sign extension or masking was done in the wrong order. Keeping the decoder small and isolated makes it easy to test those edge values without running the whole firmware.

Raw SPI bytes are not the same thing as a measurement. Many devices return left-aligned values, status bits, sign bits, two's complement values, or multi-byte big-endian fields. Do the decoding in one place and return a typed result.

The application can still request raw data when it needs calibration or diagnostics. The default API should return values that are difficult to misuse.
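To make the sign-extension point concrete, here is a small decoder for an assumed frame layout: one status byte with an overrange flag in bit 7, followed by a 12-bit two's complement sample left-aligned in a big-endian 16-bit word. The layout, the type names, and the scaling (full scale equals the reference) are illustrative assumptions, not any particular chip's format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed frame layout for a hypothetical 12-bit bipolar ADC:
 *   byte 0    : status, bit 7 = overrange
 *   bytes 1..2: big-endian word, sample left-aligned in bits 15..4 */
typedef struct {
    int16_t code;       /* signed code, -2048..2047 */
    bool overrange;
} adc_sample_t;

adc_sample_t adc_decode(const uint8_t frame[3])
{
    assert(frame);
    adc_sample_t s;
    s.overrange = (frame[0] & 0x80u) != 0u;
    uint16_t word = (uint16_t)(((uint16_t)frame[1] << 8) | frame[2]);
    uint16_t field = (uint16_t)(word >> 4);   /* right-align first: 0..4095 */
    /* Sign-extend from bit 11 with arithmetic only, so the result does not
     * depend on implementation-defined shifts or narrowing conversions. */
    s.code = (int16_t)((int32_t)(field ^ 0x800u) - 0x800);
    return s;
}

/* Scale to microvolts, assuming code 2048 would correspond to +vref. */
int32_t adc_code_to_uv(int16_t code, int32_t vref_uv)
{
    return (int32_t)(((int64_t)code * vref_uv) / 2048);
}
```

Because the decoder is a pure function, the edge values (positive full scale, negative full scale, minus one, zero) can be checked on a host without any SPI hardware.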

Handle burst reads deliberately

Burst reads are attractive because they reduce overhead and can keep related fields together. They also rely on several device-specific promises: address auto-increment, snapshot behavior, byte order, and sometimes a required read sequence. If any of those assumptions are wrong, the driver can return a mixture of old and new fields.

This is especially visible in motion sensors and environmental sensors. A single-axis read may work for weeks, then a later optimization changes it to a burst read and introduces occasional impossible vectors or pressure jumps. The burst helper should document what the device promises and what the driver assumes.

Sensors often support multi-byte burst reads for X, Y, Z axes or temperature, pressure, and humidity fields. Burst reads reduce overhead and keep fields coherent, but only when the chip supports auto-increment and when the register snapshot behavior is understood.

Do not assume every chip snapshots all fields at the first byte of a burst. Some devices require a specific status read, a data latch, or a read order. That assumption belongs beside the burst helper, not in a caller that only sees decoded axes.
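One way to keep those assumptions beside the helper is to write them into the helper's comment. The sketch below reads three axes in one chip-select frame. The register address, the read bit, the command format, the little-endian layout, and the `spi_xfer_fn` transport signature are all assumptions for a hypothetical accelerometer; some real parts need an extra command bit to enable auto-increment. The fake transport models a plain auto-incrementing register file for host testing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transport: full-duplex transfer under one chip-select assertion. */
typedef bool (*spi_xfer_fn)(void *ctx, const uint8_t *tx, uint8_t *rx, size_t len);

#define REG_OUT_X_L  0x28u  /* assumed address of the first output register */
#define SPI_READ_BIT 0x80u  /* assumed: bit 7 of the command byte selects a read */

typedef struct { int16_t x, y, z; } accel_raw_t;

/* Little-endian two's complement to int16_t, using arithmetic only. */
static int16_t le16_to_i16(const uint8_t *p)
{
    uint16_t u = (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
    return (int16_t)((int32_t)(u ^ 0x8000u) - 0x8000);
}

/* Driver assumptions, to be confirmed against the datasheet:
 *  - the register address auto-increments within one chip-select frame
 *  - the six output bytes are latched together when the burst starts
 *  - each axis is little-endian two's complement */
bool accel_read_xyz(spi_xfer_fn xfer, void *ctx, accel_raw_t *out)
{
    assert(xfer && out);
    uint8_t tx[7] = { SPI_READ_BIT | REG_OUT_X_L };  /* command + 6 dummy bytes */
    uint8_t rx[7] = { 0 };
    if (!xfer(ctx, tx, rx, sizeof tx)) return false;
    out->x = le16_to_i16(&rx[1]);
    out->y = le16_to_i16(&rx[3]);
    out->z = le16_to_i16(&rx[5]);
    return true;
}

/* Fake device for host tests: a register file with address auto-increment. */
typedef struct { uint8_t regs[128]; } fake_accel_t;

bool fake_xfer(void *ctx, const uint8_t *tx, uint8_t *rx, size_t len)
{
    fake_accel_t *f = ctx;
    uint8_t addr = (uint8_t)(tx[0] & 0x7Fu);
    for (size_t i = 1; i < len; i++) {
        rx[i] = f->regs[addr & 0x7Fu];
        addr++;
    }
    return true;
}
```

If the real device needs a status read or a latch command before the burst, that step would go inside `accel_read_xyz`, so callers never see a vector assembled from two different samples.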

Preserve error categories

Error categories are not decoration. They decide what the rest of the system can do next. A bus error may suggest wiring, DMA, or peripheral configuration trouble. A timeout may suggest a missing ready flag or a dead device. An overrange sample may be a valid measurement event that should not be treated like a broken bus.

When all of these collapse into one failure code, debugging becomes log archaeology. Engineers start adding prints around every call site because the driver threw away the useful information at the source. Keeping categories separate is one of the cheapest ways to reduce future bench time.

There is a big difference between an SPI bus error, a chip timeout, a bad device ID, an invalid parameter, and a data overrange condition. Collapsing them into a single boolean false makes debugging much harder.

If the driver keeps a counter per error category, treat those counters as diagnostic state. If they are touched from both an interrupt and a task, protect them with the same synchronization rules you use elsewhere. Marking them volatile does not make increments atomic.
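A sketch of separate status codes with one diagnostic counter per category could look like this. The enum values, the `drv_diag_t` structure, and the helper names are illustrative assumptions. The recording helper is deliberately not interrupt-safe on its own; it assumes a single calling context or a caller-provided critical section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    DRV_OK = 0,
    DRV_ERR_BUS,        /* SPI peripheral or DMA reported a failure */
    DRV_ERR_TIMEOUT,    /* device never signalled ready */
    DRV_ERR_BAD_ID,     /* ID register did not match the expected value */
    DRV_ERR_PARAM,      /* caller passed an invalid argument */
    DRV_ERR_OVERRANGE,  /* transfer worked, input exceeded the range */
    DRV_STATUS_COUNT
} drv_status_t;

typedef struct {
    uint32_t count[DRV_STATUS_COUNT];  /* one counter per category */
    drv_status_t last_error;
} drv_diag_t;

/* Count the outcome and hand it back, so a call site can write
 * `return drv_record(&diag, DRV_ERR_TIMEOUT);`.
 * Not interrupt-safe by itself: guard it if an ISR also records. */
drv_status_t drv_record(drv_diag_t *diag, drv_status_t st)
{
    assert(diag && (unsigned)st < (unsigned)DRV_STATUS_COUNT);
    diag->count[st]++;
    if (st != DRV_OK) diag->last_error = st;
    return st;
}

/* Overrange is a measurement event: the communication path still works. */
bool drv_is_communication_fault(drv_status_t st)
{
    return st == DRV_ERR_BUS || st == DRV_ERR_TIMEOUT || st == DRV_ERR_BAD_ID;
}

const char *drv_status_name(drv_status_t st)
{
    switch (st) {
    case DRV_OK:            return "ok";
    case DRV_ERR_BUS:       return "bus";
    case DRV_ERR_TIMEOUT:   return "timeout";
    case DRV_ERR_BAD_ID:    return "bad-id";
    case DRV_ERR_PARAM:     return "param";
    case DRV_ERR_OVERRANGE: return "overrange";
    default:                return "unknown";
    }
}
```

A supervisor can then reinitialize the bus on a communication fault while simply flagging an overrange sample, which is the distinction a single false return cannot express.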

Share the SPI bus carefully

Shared SPI buses fail in ways that look unrelated to the driver being edited. A display update can leave the peripheral in another mode, a flash driver can change the clock divisor, or a sensor that is held in reset can drive or load MISO enough to corrupt another device's data. The driver cannot fix bad hardware, but it can make its own bus requirements explicit before every transaction.

The maintenance consequence is straightforward: every new device on the bus becomes less risky when each existing driver reapplies its own timing and mode. Without that habit, adding one harmless peripheral can break a measurement chip that nobody touched.

Multiple SPI chips can share MOSI, MISO, and SCLK, but they do not share timing assumptions. One chip might need mode 0 at 8 MHz. Another might need mode 3 at 1 MHz. Some chips release MISO cleanly when chip select is high. Some boards need pullups or series resistors because one device behaves poorly during reset.

The driver should not assume the bus is already configured correctly. The board transport can reconfigure the SPI peripheral before each transaction if needed.

This is especially important in projects where an RTOS driver, bootloader, display library, or storage stack also uses the same SPI peripheral.
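The habit of reapplying bus settings can be sketched as a per-device configuration that the transport applies on every transaction. The `spi_bus_t` hooks, the `spi_dev_cfg_t` fields, and the function name are assumptions; on a real board the hooks would wrap the vendor HAL and an RTOS mutex. The fake bus only records which settings each transfer actually saw.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per-device bus requirements, taken from each chip's datasheet. */
typedef struct {
    uint32_t max_hz;    /* highest SCLK the device tolerates */
    uint8_t  mode;      /* SPI mode 0..3 */
    uint8_t  cs_index;  /* chip-select line that belongs to this device */
} spi_dev_cfg_t;

/* Hypothetical board transport; every hook here is an assumption. */
typedef struct {
    void (*lock)(void *ctx);
    void (*unlock)(void *ctx);
    bool (*configure)(void *ctx, uint32_t hz, uint8_t mode);
    void (*set_cs)(void *ctx, uint8_t cs_index, bool asserted);
    bool (*transfer)(void *ctx, const uint8_t *tx, uint8_t *rx, size_t len);
    void *ctx;
} spi_bus_t;

/* Lock, reapply this device's clock and mode, then run one framed transfer. */
bool spi_dev_transfer(const spi_bus_t *bus, const spi_dev_cfg_t *dev,
                      const uint8_t *tx, uint8_t *rx, size_t len)
{
    assert(bus && dev && dev->mode <= 3u);
    bool ok = false;
    bus->lock(bus->ctx);
    if (bus->configure(bus->ctx, dev->max_hz, dev->mode)) {
        bus->set_cs(bus->ctx, dev->cs_index, true);
        ok = bus->transfer(bus->ctx, tx, rx, len);
        bus->set_cs(bus->ctx, dev->cs_index, false);
    }
    bus->unlock(bus->ctx);
    return ok;
}

/* Fake bus for host tests: records what each transfer actually saw. */
typedef struct {
    uint32_t hz; uint8_t mode; int lock_depth; int active_cs;
    uint32_t seen_hz; uint8_t seen_mode; int seen_cs;
} fake_bus_t;

void fake_lock(void *c)   { ((fake_bus_t *)c)->lock_depth++; }
void fake_unlock(void *c) { ((fake_bus_t *)c)->lock_depth--; }
bool fake_configure(void *c, uint32_t hz, uint8_t mode)
{
    fake_bus_t *f = c;
    f->hz = hz;
    f->mode = mode;
    return true;
}
void fake_set_cs(void *c, uint8_t cs, bool asserted)
{
    ((fake_bus_t *)c)->active_cs = asserted ? (int)cs : -1;
}
bool fake_transfer(void *c, const uint8_t *tx, uint8_t *rx, size_t len)
{
    fake_bus_t *f = c;
    (void)tx; (void)rx; (void)len;
    if (f->lock_depth != 1 || f->active_cs < 0) return false;
    f->seen_hz = f->hz;
    f->seen_mode = f->mode;
    f->seen_cs = f->active_cs;
    return true;
}
```

Reconfiguring on every transaction costs a few register writes. If that overhead matters, the transport could cache the last applied settings and skip the write when nothing changed, but only if no other code touches the peripheral behind its back.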

Use DMA when the transaction is large enough

DMA is an engineering tradeoff, not a badge of maturity. It reduces CPU involvement for long transfers, but it adds lifetime rules for buffers, completion handling, interrupt interactions, and sometimes cache maintenance. For short register transfers, the simple blocking path is often easier to reason about and faster end to end.

Field note: A common migration bug appears when a driver is moved from blocking SPI to DMA and keeps using stack buffers. The transfer starts, the function returns, the stack frame is reused, and the DMA engine later transmits or receives garbage. A documented threshold and a board-level DMA wrapper make that class of bug much easier to avoid.

DMA is useful for long bursts, display transfers, high-rate ADC captures, and sensor FIFOs. It is not automatically better for a two-byte register read. DMA setup overhead, cache maintenance, buffer alignment, and completion synchronization can be more expensive than a short blocking transfer.

Choose a byte-count threshold below which transfers stay on the blocking path, measure it on the target, and document it.

If the MCU has a data cache, DMA buffers may need alignment, a cache clean before transmit, and a cache invalidate after receive. That belongs in the board transport, not in every device driver.
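The decision itself can be a small, testable function in the board transport. In this sketch the 32-byte threshold, the 32-byte cache line, and the idea of a dedicated DMA buffer region are all assumptions to be replaced with measured and documented values for the target. Requiring the buffer to sit inside that region is one conservative way to keep stack buffers off the DMA path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SPI_DMA_MIN_BYTES 32u  /* assumed threshold; measure it on the target */
#define CACHE_LINE_BYTES  32u  /* assumed line size; check the MCU manual */

typedef enum { XFER_BLOCKING, XFER_DMA } xfer_path_t;

/* Address range reserved for DMA buffers, for example a dedicated linker
 * section. The range is hypothetical; `end` is exclusive. */
typedef struct { uintptr_t start, end; } dma_region_t;

/* Conservative rule: the buffer must lie inside the DMA region and occupy
 * whole cache lines, so clean/invalidate cannot touch neighbouring data. */
bool dma_buffer_ok(const dma_region_t *r, uintptr_t addr, size_t len)
{
    assert(r && r->start <= r->end);
    if (addr < r->start || addr >= r->end || len > r->end - addr) return false;
    return (addr % CACHE_LINE_BYTES) == 0u && (len % CACHE_LINE_BYTES) == 0u;
}

/* Short transfers and unsafe buffers fall back to the blocking path. */
xfer_path_t spi_choose_path(const dma_region_t *r, const void *buf, size_t len)
{
    if (len < SPI_DMA_MIN_BYTES) return XFER_BLOCKING;
    return dma_buffer_ok(r, (uintptr_t)buf, len) ? XFER_DMA : XFER_BLOCKING;
}
```

A two-byte register read stays blocking regardless of where its buffer lives, while a long FIFO drain uses DMA only when the buffer passes the region and alignment checks.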

Keep asynchronous operation explicit

Hidden blocking is one of those problems that looks harmless until the system grows. A helper named like a normal read can hide a conversion delay, a retry loop, or a long DMA wait. That affects watchdog servicing, control-loop timing, UI responsiveness, and power management.

An explicit state machine makes the cost visible. It gives the scheduler a chance to run other work, gives the watchdog a predictable path, and gives diagnostics a state name when something stalls. That is much easier to maintain than a call stack stuck somewhere inside a sensor read.

For slow sensors, high-resolution ADCs, and periodic acquisition, an asynchronous state machine is often clearer than a blocking function with hidden delays. The state machine can start conversion, return to the scheduler, poll readiness later, and read the result when it is available.

The value of this pattern is not style. It makes watchdog behavior, scheduler latency, and acquisition timing visible during review.
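A minimal version of such a state machine is sketched below. Each call to the step function performs at most one short bus operation and returns, and the failure states keep bus errors and timeouts apart. The state names, the `acq_ops_t` hooks, and the millisecond timeout are assumptions; the `fake_*` functions stand in for a slow sensor on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    ACQ_IDLE, ACQ_START, ACQ_WAIT_READY, ACQ_READ,
    ACQ_DONE, ACQ_FAIL_BUS, ACQ_FAIL_TIMEOUT
} acq_state_t;

/* Hypothetical device operations, each a short SPI transaction. */
typedef struct {
    bool (*start)(void *ctx);
    bool (*is_ready)(void *ctx, bool *ready);
    bool (*read)(void *ctx, int32_t *value);
    void *ctx;
} acq_ops_t;

typedef struct {
    const acq_ops_t *ops;
    acq_state_t state;
    uint32_t started_ms;
    uint32_t timeout_ms;
    int32_t value;          /* valid only in ACQ_DONE */
} acq_t;

/* Request a new acquisition; refused while one is in flight. */
bool acq_request(acq_t *a)
{
    assert(a && a->ops);
    if (a->state == ACQ_START || a->state == ACQ_WAIT_READY || a->state == ACQ_READ)
        return false;
    a->state = ACQ_START;
    return true;
}

/* Advance by at most one bus operation. Never waits. */
acq_state_t acq_step(acq_t *a, uint32_t now_ms)
{
    assert(a && a->ops);
    const acq_ops_t *ops = a->ops;
    switch (a->state) {
    case ACQ_START:
        if (!ops->start(ops->ctx)) { a->state = ACQ_FAIL_BUS; break; }
        a->started_ms = now_ms;
        a->state = ACQ_WAIT_READY;
        break;
    case ACQ_WAIT_READY: {
        bool ready = false;
        if (!ops->is_ready(ops->ctx, &ready)) a->state = ACQ_FAIL_BUS;
        else if (ready) a->state = ACQ_READ;
        else if ((uint32_t)(now_ms - a->started_ms) >= a->timeout_ms)
            a->state = ACQ_FAIL_TIMEOUT;
        break;
    }
    case ACQ_READ:
        a->state = ops->read(ops->ctx, &a->value) ? ACQ_DONE : ACQ_FAIL_BUS;
        break;
    default:                /* IDLE, DONE, and failures wait for a request */
        break;
    }
    return a->state;
}

/* Fake slow sensor for host tests: ready after a number of polls. */
typedef struct { int polls_until_ready; int32_t value; } fake_sensor_t;

bool fake_start(void *c) { (void)c; return true; }
bool fake_is_ready(void *c, bool *ready)
{
    fake_sensor_t *f = c;
    if (f->polls_until_ready > 0) { f->polls_until_ready--; *ready = false; }
    else *ready = true;
    return true;
}
bool fake_read(void *c, int32_t *v) { *v = ((fake_sensor_t *)c)->value; return true; }
```

A periodic task or main loop would call `acq_step` with the current tick and act on `ACQ_DONE` or a failure state. When something stalls, the state name says where.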

By the end of this second page, the driver can handle the runtime behavior of ordinary SPI peripherals: conversion timing, status polling, data decoding, burst reads, shared buses, DMA thresholds, and asynchronous operation. Page 3 turns that into something you can bring up on hardware, test on a host, and maintain across board revisions.

Saeid Yazdani

Embedded Systems Engineer with 15+ years of professional experience developing firmware, electronics, measurement systems, and hardware-software solutions. I have been programming for more than two decades and write about Embedded C/C++, STM32, AURIX, PCB design, debugging, and practical engineering lessons from real-world projects.
