Layer 2 · Threading & reactor

spdk_thread — the logical unit of work.

A spdk_thread is a logical thread of execution that the framework multiplexes onto a reactor. It's not a pthread. It's a struct with a mailbox. When you want to do work, you send a message to the thread; the thread's reactor will run it. When the thread has nothing to do, it sits idle and other threads on the same core get all the CPU.

~15 min read · 1 diagram · prerequisite: Layer 2.1
On this page
  1. Why a thread-on-top-of-a-reactor
  2. The struct, end to end
  3. Creating a thread — and the lifetime rules
  4. The mailbox: spdk_thread_send_msg
  5. How a Go JSON-RPC call ends up on an spdk_thread
  6. Pollers vs. messages vs. threads
  7. Migration: when a thread hops reactors
  8. Edge cases & what trips people up

Why a thread-on-top-of-a-reactor

In 2.1 you saw that a reactor is one pthread per core, running a tight poll loop. That's a powerful primitive, but it's also rigid: a single reactor is one execution context. It has one spdk_get_thread(), one TLS variable, one "current io channel per subsystem" cache.

Real applications need many execution contexts. An SPDK-based NVMe-oF target has:

  • an RPC handler thread (for JSON-RPC requests)
  • one nvmf target thread per core (for I/O submission)
  • a poller thread per subsystem (bdev, copy, etc.)
  • one thread per active TCP connection, in some transports

Each of these needs its own state. The bdev subsystem, for example, caches an spdk_io_channel per execution context — and an "execution context" here means "the thing that submitted the I/O." If two threads on the same reactor shared a channel, they'd contend on the same submission queue and the bdev's poller would have no idea who was waiting for what completion.

The metaphor that helps: think of an spdk_thread as a goroutine, and a reactor as the OS thread that goroutines get multiplexed onto. It's the same M:N idea: many logical threads scheduled onto a small, fixed set of pthreads. The framework uses this model because, with one pthread per core, you can have many "threads" of execution without paying the cost of a pthread for each.
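To make the multiplexing concrete, here is a minimal sketch in plain C. It is a toy model, not SPDK code: the names toy_lthread and toy_reactor_run_once are hypothetical, and "work" is reduced to a counter. It only shows the shape of the idea, which is one loop giving every logical thread a turn.

```c
#include <stddef.h>

/* Toy model only: none of these names are SPDK APIs. */
struct toy_lthread {
    const char *name;
    int polls;      /* how many times the loop has polled this thread */
    int work_left;  /* pending units of work; 0 means the thread is idle */
};

/* One pass of a toy reactor loop: poll every logical thread once.
 * Returns the number of threads that had work to do on this pass. */
int toy_reactor_run_once(struct toy_lthread *threads, size_t n)
{
    int busy = 0;

    for (size_t i = 0; i < n; i++) {
        threads[i].polls++;
        if (threads[i].work_left > 0) {
            threads[i].work_left--;  /* do one unit of work */
            busy++;
        }
    }
    return busy;
}
```

Calling toy_reactor_run_once() repeatedly on an array of two or three toy_lthread values shows the property the text describes: an idle thread costs one cheap check per pass, and the busy ones get the rest of the core.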

The struct, end to end

Creating a thread — and the lifetime rules

A thread is created with spdk_thread_create():

Once created, a thread is "live" — it is on some reactor's threads list, the reactor is calling spdk_thread_poll() on it, and any spdk_thread_send_msg() targeted at it will be delivered.

When you're done with a thread, you call spdk_thread_exit(), but only from the thread itself, and only after all of its I/O channels have been released with spdk_put_io_channel() and all of its pollers have been removed with spdk_poller_unregister(). The exit sequence is asynchronous: spdk_thread_exit() just flips the state to EXITING and starts a 5-second timeout. Subsequent reactor iterations check thread_exit() at lib/thread/thread.c:672 to see whether all the cleanup has actually happened.
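The asynchronous exit can be pictured as a small state machine. The sketch below is a toy model in plain C with hypothetical names (toy_exit_model, toy_request_exit, toy_check_exit). Time is counted in abstract ticks rather than seconds, and it assumes that an expired timeout forces the transition to the exited state; treat that last point as an assumption of the model rather than a statement about the library.

```c
#include <stdbool.h>

/* Toy model only: names and fields are illustrative, not SPDK's. */
enum toy_state { TOY_RUNNING, TOY_EXITING, TOY_EXITED };

struct toy_exit_model {
    enum toy_state state;
    int open_channels;    /* channels not yet released */
    int active_pollers;   /* pollers not yet unregistered */
    long exit_deadline;   /* tick at which the exit timeout expires */
};

#define TOY_EXIT_TIMEOUT_TICKS 5

/* Request exit: only flips the state and arms the timeout. */
void toy_request_exit(struct toy_exit_model *t, long now)
{
    if (t->state == TOY_RUNNING) {
        t->state = TOY_EXITING;
        t->exit_deadline = now + TOY_EXIT_TIMEOUT_TICKS;
    }
}

/* Called once per loop iteration; returns true once the thread has exited.
 * Assumption of this model: a timeout forces the transition. */
bool toy_check_exit(struct toy_exit_model *t, long now)
{
    if (t->state == TOY_EXITING) {
        bool clean = (t->open_channels == 0 && t->active_pollers == 0);

        if (clean || now >= t->exit_deadline) {
            t->state = TOY_EXITED;
        }
    }
    return t->state == TOY_EXITED;
}
```

The point of the split is that the request is cheap and never blocks; the loop keeps polling the thread until the outstanding resources are gone.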

The mailbox: spdk_thread_send_msg

This is the workhorse of the framework. Every cross-thread handoff that isn't a poller goes through spdk_thread_send_msg():

There is one more variant: spdk_thread_send_critical_msg():
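Independently of the real signatures, the mailbox idea itself fits in a few lines of plain C. The sketch below is a single-threaded toy with hypothetical names (toy_mailbox, toy_send_msg, toy_poll). A real cross-thread ring has to cope with concurrent producers, which this model deliberately ignores.

```c
#include <stddef.h>

/* Toy single-threaded mailbox; none of these names are SPDK APIs. */
typedef void (*toy_msg_fn)(void *ctx);

struct toy_msg {
    toy_msg_fn fn;
    void *ctx;
};

#define TOY_RING_SIZE 8

struct toy_mailbox {
    struct toy_msg ring[TOY_RING_SIZE];
    size_t head;   /* next slot to read */
    size_t count;  /* messages currently queued */
};

/* Enqueue fn(ctx) for later; returns 0 on success, -1 if the ring is full. */
int toy_send_msg(struct toy_mailbox *mb, toy_msg_fn fn, void *ctx)
{
    if (mb->count == TOY_RING_SIZE) {
        return -1;
    }
    size_t tail = (mb->head + mb->count) % TOY_RING_SIZE;
    mb->ring[tail].fn = fn;
    mb->ring[tail].ctx = ctx;
    mb->count++;
    return 0;
}

/* Run the messages that were queued before this call; returns how many ran.
 * Messages sent by a handler during the drain wait for the next poll. */
int toy_poll(struct toy_mailbox *mb)
{
    size_t n = mb->count;

    for (size_t i = 0; i < n; i++) {
        struct toy_msg m = mb->ring[mb->head];
        mb->head = (mb->head + 1) % TOY_RING_SIZE;
        mb->count--;
        m.fn(m.ctx);
    }
    return (int)n;
}

/* Example handler: increments the int that ctx points to. */
void toy_bump(void *ctx)
{
    (*(int *)ctx)++;
}
```

Sending is only an enqueue: nothing runs until the owner of the mailbox calls toy_poll(), which is the property the rest of this page keeps coming back to.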

How a Go JSON-RPC call ends up on an spdk_thread

The connection between the two layers is worth tracing once, end to end, because it's the abstraction boundary your diskengine code crosses constantly.

  1. Go code: spdkClient.BdevLvolCreate(...) in diskengine
  2. JSON-RPC encode: Client.Call serializes to JSON, writes to Unix socket
  3. SPDK RPC server: spdk_jsonrpc_server thread reads the socket
  4. Dispatch: RPC handler runs on a poller thread (the framework routes it)
  5. bdev_lvol_create: Handler submits bdev I/O via the bdev module's submit callback
  6. Completion: bdev poller fires, completes the bdev_io
  7. RPC response: Handler sends the response, the RPC server writes to the socket
  8. Go code resumes: Client.Call unblocks with the result

The detail that matters is step 4. The RPC framework doesn't run the handler on a thread you chose — it runs the handler on whatever thread the framework dispatches JSON-RPC work to (typically the "app thread" or a dedicated RPC thread). The handler then may decide to send the work to yet another thread (e.g. the nvmf target's submit thread) via spdk_thread_send_msg(). This is what the "every I/O channel is bound to one thread" rule looks like in practice: the bdev module's submit callback has to be called on the thread that owns the channel.

The diskengine side is intentionally simple: the Go code in Client.Call:43 does a synchronous request/response and waits. It has no concept of "which reactor am I on" because it's not on one. The threading model is entirely internal to the SPDK process. From Go's perspective, SPDK is a service that responds to JSON-RPC requests.

Pollers vs. messages vs. threads

Three abstractions, three use cases. Mixing them up is the most common architectural mistake.

| Abstraction | What it does | When to use it |
| --- | --- | --- |
| Poller | A function that runs repeatedly on the thread, on a period (or as fast as possible). | When you need to poll a state, complete I/O, recheck a queue, etc. Anything that needs to run on every reactor iteration or on a timer. |
| Message | A one-shot function delivered to the thread's mailbox, run on the next reactor iteration. | When you want a callback to run "soon, on this thread" without registering a recurring poller. RPC handlers, I/O completions, deferred cleanup, state transitions. |
| Thread | A logical unit of work that has its own state (pollers, channels, message ring). | When you have a subsystem with long-lived state. The bdev module's submit callback, the nvmf target's poller, the RPC server's request thread: each is its own spdk_thread. |

Rule of thumb: if you're tempted to "just register a 1 ms poller to do this one thing," you're almost always better off sending a message instead. Pollers are for recurring work; messages are for one-shots.
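The difference between recurring and one-shot work is easy to see in a toy model. The sketch below is plain C with hypothetical names (toy_poller, toy_oneshot) and abstract ticks instead of real time; it is an illustration of the rule of thumb, not the framework's implementation.

```c
/* Toy model only: none of these names are SPDK APIs. */
typedef int (*toy_fn)(void *ctx);

/* Recurring work: stays registered and runs every `period` ticks. */
struct toy_poller {
    toy_fn fn;
    void *ctx;
    long period;
    long next_run;
};

/* Runs the poller if it is due and reschedules it. Returns 1 if it ran. */
int toy_poller_tick(struct toy_poller *p, long now)
{
    if (now < p->next_run) {
        return 0;
    }
    p->fn(p->ctx);
    p->next_run = now + p->period;
    return 1;
}

/* One-shot work: runs once, then it is gone. */
struct toy_oneshot {
    toy_fn fn;
    void *ctx;
    int pending;
};

/* Delivers the message if it is still pending. Returns 1 if it ran. */
int toy_oneshot_tick(struct toy_oneshot *m)
{
    if (!m->pending) {
        return 0;
    }
    m->pending = 0;
    m->fn(m->ctx);
    return 1;
}

/* Example callback: increments the int that ctx points to. */
int toy_count(void *ctx)
{
    return ++*(int *)ctx;
}
```

A poller used for a single action has to remember to unregister itself; the one-shot has that behavior built in, which is why a message is usually the better fit for "do this once, soon".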

Migration: when a thread hops reactors

With the static scheduler, threads never migrate. With the dynamic scheduler (gpm), they can. Here's the mechanism:

Edge cases & what trips people up

1. spdk_thread_send_msg() from the target thread itself

The function does not check whether the caller is the target; it cheerfully enqueues a message on the very thread that just called it. The message sits in the ring until the next reactor iteration, and then runs. This is a recipe for deadlock if your fn is waiting for the message to be delivered. The pattern "send a message to self, then wait for it to be processed" is broken: the thread cannot drain its own mailbox while it is blocked waiting on it. Use a regular function call (or spdk_thread_exec_msg() at include/spdk/thread.h:547, which detects the local case and runs the function immediately).
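The "run inline if already local, otherwise defer" pattern can be sketched in plain C. Everything here is a toy with hypothetical names: toy_current stands in for the thread-local current-thread pointer, and the mailbox is reduced to a single slot to keep the example short.

```c
#include <stddef.h>

/* Toy model only: none of these names are SPDK APIs. */
typedef void (*toy_msg_fn)(void *ctx);

struct toy_target {
    toy_msg_fn queued_fn;  /* one-slot mailbox, enough for the illustration */
    void *queued_ctx;
};

/* Stand-in for the thread-local "current thread" pointer. */
static struct toy_target *toy_current;

/* Always defers, even when the caller is the target.
 * Returns 0 on success, -1 if the slot is already occupied. */
int toy_send_msg(struct toy_target *t, toy_msg_fn fn, void *ctx)
{
    if (t->queued_fn != NULL) {
        return -1;
    }
    t->queued_fn = fn;
    t->queued_ctx = ctx;
    return 0;
}

/* Runs fn inline when already on the target; otherwise defers. */
int toy_exec_msg(struct toy_target *t, toy_msg_fn fn, void *ctx)
{
    if (t == toy_current) {
        fn(ctx);
        return 0;
    }
    return toy_send_msg(t, fn, ctx);
}

/* Example handler: sets the int that ctx points to. */
void toy_set_flag(void *ctx)
{
    *(int *)ctx = 1;
}
```

With toy_current pointing at a target, toy_send_msg() to that same target leaves the flag untouched until something drains the slot, while toy_exec_msg() sets it before returning. Code that waits on the flag right after a self-send would spin forever.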

2. Calling spdk_get_io_channel() on a thread that doesn't exist

The function at lib/thread/thread.c:2376 does thread = _get_thread(); if (!thread) ... abort(). If you're in a pthread that the framework didn't set up — for example, a Go goroutine that crossed the CGo boundary — tls_thread is NULL and you abort. There is no implicit "current thread" for non-SPDK threads. Everything inside SPDK requires you to be on a known spdk_thread.

3. The first spdk_thread_create() sets the app thread

The atomic compare-and-exchange at lib/thread/thread.c:632 means "the first thread wins." If you create thread A, then create thread B, then create thread C, all of A, B, and C are normal threads, but A is the "app thread" because it was first. Framework init and fini must happen from the app thread.
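The "first one wins" behavior is what a compare-and-exchange against NULL gives you. Here is a minimal sketch using C11 atomics; the names toy_app_thread and toy_thread_register are hypothetical and the struct is a placeholder, so read it as an illustration of the technique rather than a copy of the library code.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Toy model only: none of these names are SPDK APIs. */
struct toy_named_thread {
    const char *name;
};

/* Zero-initialized (NULL) until the first thread claims the slot. */
static _Atomic(struct toy_named_thread *) toy_app_thread;

/* Try to claim the app-thread slot. The exchange succeeds only while the
 * slot still holds NULL, so every caller after the first is a no-op. */
void toy_thread_register(struct toy_named_thread *t)
{
    struct toy_named_thread *expected = NULL;

    atomic_compare_exchange_strong(&toy_app_thread, &expected, t);
}

/* Returns nonzero if t is the thread that won the slot. */
int toy_is_app_thread(struct toy_named_thread *t)
{
    return atomic_load(&toy_app_thread) == t;
}
```

Registering A, then B, then C leaves A in the slot; B and C are ordinary threads. Because the claim is atomic, the outcome is the same even if the registrations race.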

4. What happens when the target's reactor is busy

Your spdk_thread_send_msg() succeeds — the message is in the ring. The target thread is still mid-poller on something slow. The message waits. Send-and-forget has unbounded latency. The framework gives you spdk_thread_send_critical_msg() for "I really need this to run now" but that still waits for the current poller to return. There is no preemption. If you need a back-pressure mechanism, the framework gives you the ring's fill level — check spdk_ring_count() before sending, or design your message handlers to be fast.
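One way to picture sender-side back-pressure is a high-watermark check on the queue's fill level. The sketch below is a toy in plain C with hypothetical names (toy_ring, toy_try_send, toy_drain). It tracks only the fill level, and the watermark policy is an assumption chosen for illustration.

```c
#include <stddef.h>

/* Toy model only: none of these names are SPDK APIs. */
#define TOY_RING_CAP 4

struct toy_ring {
    size_t count;  /* only the fill level matters for this illustration */
};

/* Returns 1 if the message was queued, 0 if the caller should back off
 * and retry later because the ring is at or above the watermark. */
int toy_try_send(struct toy_ring *r, size_t high_watermark)
{
    if (r->count >= high_watermark || r->count >= TOY_RING_CAP) {
        return 0;
    }
    r->count++;
    return 1;
}

/* The consumer drains up to `budget` messages per iteration and
 * returns how many it actually processed. */
size_t toy_drain(struct toy_ring *r, size_t budget)
{
    size_t n = r->count < budget ? r->count : budget;

    r->count -= n;
    return n;
}
```

The sender never blocks: a 0 from toy_try_send() is a signal to do something else and come back, which fits a run-to-completion model better than waiting.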

5. Migration while a poller is running

The migration check is at lib/event/reactor.c:922, in reactor_post_process_lw_thread(). It runs after the thread's pollers, not during, so a poller is guaranteed to run to completion on the current reactor. After it returns, the thread might get moved. If your poller stashes a pointer to reactor-local data and assumes the data is still valid on the next iteration, that assumption is wrong: each iteration is "fresh." Persist data on the spdk_thread struct, not on the reactor.
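The ordering (pollers first, migration check afterwards) can be shown with a toy model. The names below (toy_movable, toy_reactor_iterate) are hypothetical, reactors are reduced to integer ids, and the "scheduler" is just a requested destination stored on the thread.

```c
/* Toy model only: none of these names are SPDK APIs. */
struct toy_movable {
    int reactor;      /* id of the reactor currently hosting the thread */
    int move_to;      /* requested destination, or -1 for none */
    int last_ran_on;  /* recorded by the poller, for illustration */
    int iterations;   /* state kept on the thread survives a move */
};

/* The thread's poller: always sees a consistent current reactor. */
static void toy_poller(struct toy_movable *t)
{
    t->last_ran_on = t->reactor;
    t->iterations++;
}

/* One reactor iteration: run pollers to completion, then check
 * whether the thread has been asked to move. */
void toy_reactor_iterate(struct toy_movable *t)
{
    toy_poller(t);

    if (t->move_to >= 0) {  /* post-processing step */
        t->reactor = t->move_to;
        t->move_to = -1;
    }
}
```

Two iterations with a pending move show the effect: the first poller run records the old reactor, the second records the new one, and the counter stored on the thread keeps counting across the hop.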

6. Foreign threads, foreign locks

If a Go goroutine calls into the C side via CGo and that path tries to take an spdk_spinlock, it will trip the SPIN_ERR_NOT_SPDK_THREAD assertion at lib/thread/thread.c:3273. The lock expects to be held by an spdk_thread. If you need a lock that a Go goroutine can take, take a pthread_mutex on the Go side and design the C side to never block waiting for it. The same is true for spdk_io_channel: the channel is bound to a thread, and "the thread" is the spdk_thread that acquired it, not whatever pthread happens to be running.

7. Holding an spdk_thread * across reactor iterations

The spdk_thread pointer is stable for the lifetime of the thread, but the thread can be destroyed (via spdk_thread_exit + spdk_thread_destroy), and once it's destroyed the pointer is dangling. If you're tempted to "just keep the pointer in a global and send a message to it later," ask yourself: who guarantees it's still alive? The answer in practice is the framework's for_each_count / pending_unregister_count machinery, which is why spdk_for_each_thread() bumps those counts and refuses to unregister a thread that's the target of an in-flight iteration. Read lib/thread/thread.c:2049 if you ever write an spdk_for_each_thread of your own.

8. The diskengine never knows which thread it talked to

Look at BdevLvolCreate:97. The Go code just gets back a UUID string. It has no idea which spdk_thread the bdev module ran on, which reactor processed the request, or how many polls it took. This is the abstraction working as designed. If you ever find yourself wanting to "pass an spdk_thread pointer back to Go and use it later," stop. The pointer is meaningless outside the SPDK process.

What to take away

An spdk_thread is the unit of "where does this I/O submission come from." It's a struct, a name, a list of pollers, an io_channel tree, and a message ring. The reactor loop walks the threads. The thread's mailbox delivers cross-thread work. Pollers run on schedule; messages run on demand. The combination gives you a goroutine-like model on top of a pthread-per-core runtime, with the property that no syscall can yield your CPU to someone else.

The next page — 2.3 — spdk_io_channel + pollers — looks at the per-thread state that actually caches the I/O submission path. The spdk_thread is the thing; the spdk_io_channel is what the thing owns.