The threading rules.
Five rules. Learn them. Every bug you'll ever have with SPDK threading is a violation of one of them. This page is the single most important reference in the curriculum.
- Why this page exists
- Rule 1: Don't send yourself a message
- Rule 2: Channels are bound to one thread
- Rule 3: Complete bdev_io on the right thread
- Rule 4: Initialize via spdk_app_start
- Rule 5: Don't block the reactor
- The QMP quit wedge, in terms of these rules
- The diskengine client and these rules
- The threading violation checklist
- Edge cases & subtle traps
Why this page exists
The previous three pages described the threading model:
reactors, threads, channels, pollers, messages. The model
is flexible. The flexibility is the point. But flexibility
is not safety, and a flexible threading model with no
rules is just a flexible deadlock. SPDK has rules. The
rules are enforced by assert() in the C
source, but the asserts only fire in the failure path —
by the time you see them, something has already gone
wrong.
This page is the rules. Read it. Bookmark it. When you're debugging a "the bdev isn't completing" or "the RPC handler just hangs" problem, come back here and check each rule. One of them is the answer.
Rule 1: Don't send yourself a message
If you're on thread T, don't call
spdk_thread_send_msg(T, fn, ctx) as a way to
"do work later." The message will be enqueued
on T's message ring and delivered on the next reactor
iteration — which is after your current
function returns. If your function is waiting for the
work to complete (a common mistake), you deadlock
yourself. The reactor can never process the message
because the function is still on the call stack.
Wrong:
int
my_poller(void *arg)
{
    struct my_state *s = arg;
    int rc;

    /* Submit some work, then wait for completion. */
    spdk_thread_send_msg(spdk_get_thread(), do_later, s);
    /* ^ This is wrong: do_later() will never run
     * because we're still on the call stack. */
    rc = wait_for_completion(s); /* spins forever */
    return rc ? SPDK_POLLER_BUSY : SPDK_POLLER_IDLE; /* never reached */
}
Right:
int
my_poller(void *arg)
{
    struct my_state *s = arg;

    /* Do the work synchronously, since we're
     * already on the right thread. */
    do_later(s);
    /* ... or, if you really need deferred delivery,
     * use spdk_thread_send_msg to a DIFFERENT
     * thread. */
    return SPDK_POLLER_BUSY;
}
Rule 2: Channels are bound to one thread
An spdk_io_channel is bound to the
spdk_thread that acquired it.
Use it only from that thread. This is enforced by
the wrong_thread check in
spdk_put_io_channel and
spdk_io_channel_ref. The check is
a single comparison:
if (ch->thread != spdk_get_thread()) {
    wrong_thread("spdk_io_channel_ref", "ch", ch->thread, spdk_get_thread());
    return NULL;
}
There's no per-subsystem enforcement, only the
framework-level check on put. The subsystem
itself has to enforce the rule on submit.
For the bdev, the check is in the bdev's
submit callback, but only as
an assertion, not as a runtime guard. The
rule is: if your code is on a different
thread than the channel, send a message to the
channel's thread and do the work there.
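The ownership check is small enough to model without SPDK. The sketch below is a toy, not SPDK code: toy_thread, toy_channel, toy_get_thread and toy_submit are hypothetical names, and the thread-local pointer stands in for whatever spdk_get_thread() consults. It shows the runtime guard a subsystem could add on submit, which the text above says the bdev layer only asserts.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for spdk_thread and spdk_io_channel (hypothetical). */
struct toy_thread { const char *name; };
struct toy_channel { struct toy_thread *owner; int ios_submitted; };

/* Models the per-pthread "current SPDK thread" pointer. */
static _Thread_local struct toy_thread *tls_current;

static struct toy_thread *toy_get_thread(void) { return tls_current; }
static void toy_set_thread(struct toy_thread *t) { tls_current = t; }

/* Bind the channel to whichever thread acquires it.
 * Fails on a pthread the framework does not know about. */
static int toy_get_channel(struct toy_channel *ch)
{
    assert(ch != NULL);
    if (toy_get_thread() == NULL) {
        return -1;
    }
    ch->owner = toy_get_thread();
    ch->ios_submitted = 0;
    return 0;
}

/* Submit guard: 0 on the owning thread, -1 anywhere else. */
static int toy_submit(struct toy_channel *ch)
{
    struct toy_thread *cur = toy_get_thread();

    assert(ch != NULL);
    if (cur == NULL || cur != ch->owner) {
        return -1; /* wrong thread: the caller must send a message instead */
    }
    ch->ios_submitted++;
    return 0;
}
```

A caller that gets -1 from toy_submit should not retry in place. It should hand the work to the owning thread and return.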
What this looks like in the wild
The diskengine spdkclient
Client.Call:43 doesn't
violate this rule, because the Go side doesn't
hold channel pointers. But consider a Go code
path that calls into C, asks for a channel, and
tries to use it from a different Go goroutine.
The Go goroutine is on a different OS thread, so
spdk_get_thread() on the C side
returns NULL (or, worse, the wrong SPDK thread).
Either way, the put is rejected. The
rule applies even if you don't see the SPDK
threads directly.
Rule 3: Complete a bdev_io on the right thread
A bdev_io must be completed on the thread that submitted it. This is a specific instance of Rule 2 (the bdev_io carries a reference to the channel that submitted it), but it's worth its own rule because it's the most common violation in practice.
The bdev framework's submit path looks roughly like:
void
bdev_submit_request(struct spdk_io_channel *ch, struct spdk_bdev_io *bdev_io)
{
    struct spdk_bdev *bdev = bdev_io->bdev;

    /* ... validate, then hand the request to the module ... */
    bdev->fn_table->submit_request(ch, bdev_io);
    /* submit_request returns void: a module that refuses the
     * request reports it by completing the bdev_io with a
     * failed status. */
}
The module's submit callback submits to
the underlying device. The completion callback
(spdk_bdev_io_complete) is the one that
needs to fire on the right thread. The framework
enforces this by:
- Every spdk_bdev_io has a ch (channel) pointer embedded in it. The channel knows its thread.
- When the device's poller drains the completion queue, it calls spdk_bdev_io_complete.
- spdk_bdev_io_complete looks at the bdev_io's channel, finds the channel's thread, and uses spdk_thread_send_msg to deliver the completion to that thread.
If you ever see a "the I/O completed on the wrong thread" error, the bdev module's submit code is doing one of these:
- Storing the bdev_io and completing it from a different thread (often a Go goroutine that crossed the CGo boundary).
- Using a stale bdev_io pointer after the channel was put and re-acquired (the bdev_io was freed and reallocated, and you're completing the old one).
- Calling spdk_bdev_io_complete from inside a poller that's not on the channel's thread, with no send_msg redirection.
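The send_msg redirection mentioned in the last bullet can be sketched with a toy message ring. Nothing here is SPDK code: toy_thread, toy_send_msg, toy_thread_poll and toy_io_complete are hypothetical, and g_current is a stand-in for "the thread whose code is running right now." The point is only the shape: record the submitting thread in the I/O, and if completion is attempted anywhere else, forward it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RING_SIZE 8

typedef void (*toy_msg_fn)(void *ctx);
struct toy_msg { toy_msg_fn fn; void *ctx; };
struct toy_thread { struct toy_msg ring[TOY_RING_SIZE]; unsigned head, tail; };

static struct toy_thread *g_current; /* thread whose code is "running" */

static void toy_send_msg(struct toy_thread *t, toy_msg_fn fn, void *ctx)
{
    assert(t->tail - t->head < TOY_RING_SIZE); /* ring must not be full */
    t->ring[t->tail++ % TOY_RING_SIZE] = (struct toy_msg){ fn, ctx };
}

/* Run every queued message with t as the current thread.
 * Returns the number of messages executed. */
static int toy_thread_poll(struct toy_thread *t)
{
    struct toy_thread *prev = g_current;
    int n = 0;

    g_current = t;
    while (t->head != t->tail) {
        struct toy_msg m = t->ring[t->head++ % TOY_RING_SIZE];
        m.fn(m.ctx);
        n++;
    }
    g_current = prev;
    return n;
}

struct toy_io {
    struct toy_thread *submit_thread; /* recorded at submit time */
    struct toy_thread *completed_on;  /* filled in by the completion */
};

static void toy_io_complete_local(void *ctx)
{
    struct toy_io *io = ctx;
    io->completed_on = g_current;
}

/* Complete in place on the submitting thread, otherwise forward. */
static void toy_io_complete(struct toy_io *io)
{
    if (g_current == io->submit_thread) {
        toy_io_complete_local(io);
    } else {
        toy_send_msg(io->submit_thread, toy_io_complete_local, io);
    }
}
```

Calling toy_io_complete from a foreign thread leaves completed_on untouched until the submitting thread is next polled, which is exactly the delay a caller has to tolerate.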
Rule 4: Initialize via spdk_app_start
Every SPDK program must call
spdk_app_start as its entry point.
The framework initializes the reactors, the thread
library, the event framework, the JSON-RPC server,
and the memory pool. If you try to do any of this
manually — by calling
spdk_thread_create directly, or
spdk_poller_register, or even
spdk_env_init from your own main
loop — you'll get undefined behavior, missing
initialization, or an assert.
Rule 5: Don't block the reactor
Code running on a reactor (poller, message
handler, or anything inside spdk_thread_poll)
must not block on a kernel primitive. This
was covered in detail in
2.1; the threading
rule is the consequence: if a poller blocks, the
reactor doesn't return, and every other thread on
that core is starved.
The full set of "don'ts" for a reactor:
| Don't | Why | What to do instead |
|---|---|---|
| read / write on a non-SPDK file descriptor | May sleep; the kernel preempts your pthread. | Use spdk_io_channel for the subsystem; let its poller do the I/O. |
| Take a pthread_mutex held by another pthread | Will block until the holder releases; could be never. | Use spdk_thread_send_msg to get the work done on the holding thread. |
| malloc a large buffer | May page in zeroed pages; might block on swap. | Pre-allocate in startup. Use mempool. |
| sleep / usleep | Explicit yield to the kernel. Defeats the whole point. | Use a periodic poller. |
| sched_yield | Same as sleep: gives up the CPU. | Don't yield. Trust the model. |
| CGo call into a blocking Go function | Go's scheduler can park the goroutine on any pthread, including the reactor's. | Issue an RPC, don't synchronously call into Go from C. |
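"Use a periodic poller" is the row people most often ask about, so here is a minimal sketch of the idea, assuming a tick counter supplied by the caller. toy_timer, toy_timer_init and toy_timer_poll are hypothetical names. In SPDK itself you would normally get the same effect by passing a non-zero period to spdk_poller_register rather than tracking deadlines by hand.

```c
#include <assert.h>
#include <stdint.h>

/* A poller that wants to act every period_ticks without sleeping. */
struct toy_timer {
    uint64_t period_ticks;
    uint64_t next_due;
    int fired;
};

static void toy_timer_init(struct toy_timer *t, uint64_t now, uint64_t period)
{
    assert(period > 0);
    t->period_ticks = period;
    t->next_due = now + period;
    t->fired = 0;
}

/* Poller body: compares the clock and returns immediately.
 * Returns 1 if it did work this call, 0 if it was idle. */
static int toy_timer_poll(struct toy_timer *t, uint64_t now)
{
    if (now < t->next_due) {
        return 0; /* not yet: give the reactor back right away */
    }
    t->fired++;
    t->next_due = now + t->period_ticks;
    return 1;
}
```

The function never waits for the deadline. It is called on every reactor iteration and simply declines to act until the clock says so.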
The QMP quit wedge, in terms of these rules
The "QMP quit wedge" is a real failure mode you'll see
in the nvmf / vhost layers (Layer 7). It happens when
a VM issues a quit via QMP and the
SPDK target wedges. Translating into the threading
rules:
1. The VM issues quit, which QEMU forwards to the SPDK vhost-user backend.
2. The backend's virtio poller fires (a poller on the vhost thread). It sees the QMP quit notification.
3. The poller decides to drain all in-flight I/O and shut down the vhost connection.
4. To drain, it needs to wait for completions on every bdev_io submitted under the connection.
5. The bdev_io completions will arrive via spdk_thread_send_msg on the thread that submitted them, but those messages cannot be processed while the poller is still running (the poller is on the same thread, see Rule 1).
6. The poller spins waiting for the completions, which will never arrive while it's holding the thread. The connection never drains. The VM appears to hang. The nvmf target appears to be dead. The wedge.
The fix: send a message to the bdev poller's thread asking it to drain, return from the poller, let the message run on the next iteration. This is a structural pattern — "I can't do X in a poller, send a message to do X, return from the poller." The threading rules make it explicit.
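One way to picture the fix is as a small state machine that advances by at most one step per poller call. This is a sketch under assumptions, not the vhost code: toy_conn, toy_conn_poll and toy_io_done are hypothetical, and the in-flight counter stands in for whatever the real connection tracks.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_conn_state { TOY_RUNNING, TOY_DRAINING, TOY_CLOSED };

struct toy_conn {
    enum toy_conn_state state;
    int inflight;        /* I/Os submitted but not yet completed */
    bool quit_requested; /* set when the quit notification is seen */
};

/* Completion handler. In a real reactor this would run as a message
 * on the same thread, between poller invocations. */
static void toy_io_done(void *ctx)
{
    struct toy_conn *c = ctx;

    assert(c->inflight > 0);
    c->inflight--;
}

/* Poller: takes one step and returns. It never loops waiting.
 * Returns 1 if it changed state, 0 if there was nothing to do. */
static int toy_conn_poll(struct toy_conn *c)
{
    switch (c->state) {
    case TOY_RUNNING:
        if (!c->quit_requested) {
            return 0;
        }
        c->state = TOY_DRAINING; /* stop accepting new I/O */
        return 1;
    case TOY_DRAINING:
        if (c->inflight > 0) {
            return 0; /* return so the completions can be delivered */
        }
        c->state = TOY_CLOSED;
        return 1;
    case TOY_CLOSED:
    default:
        return 0;
    }
}
```

The DRAINING branch is the whole fix: when completions are still outstanding, the poller returns instead of spinning, and the thread is free to deliver them.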
The diskengine client and these rules
The diskengine spdkclient at CreateClientWithJsonCodec:232 is a JSON-RPC client. It runs in Go, on a Go goroutine, possibly on a Go runtime thread that has no relationship to any SPDK reactor. From the SPDK threading rules' perspective:
- Rule 1 doesn't apply — the Go side never sends a message to an SPDK thread directly. It sends an RPC request to the SPDK RPC server. The framework routes the request to a thread.
- Rule 2 doesn't apply — the Go side never holds an spdk_io_channel pointer across an RPC call. Channels are internal.
- Rule 3 doesn't apply — the Go side never sees a bdev_io. bdev_ios are internal to the bdev framework.
- Rule 4 doesn't apply — diskengine is a Go program that calls into SPDK via RPC. It does not embed the SPDK event loop; it talks to a separate SPDK process.
- Rule 5 doesn't apply — the Go runtime can do whatever it wants. It's not on a reactor.
The implication: if diskengine ever stops being a JSON-RPC client and starts embedding the SPDK framework (via CGo), every one of the five rules starts to apply. The "RPC client" pattern is the safe path. The "embedded SPDK" pattern is the path where the rules bite.
The threading violation checklist
When you have a bug that smells like a threading
violation — "the I/O is hanging," "the RPC never
returns," "the channel is in a weird state," "the
process aborted with a wrong_thread
error" — run through this checklist:
1. Which spdk_thread is the code running on? Get this from spdk_get_thread or by walking the call stack. If you can't answer, you're in a pthread that the framework doesn't know about, and that's already the bug (Rule 4).
2. Which spdk_thread is the resource bound to? Channels and bdev_ios are bound to specific threads. If the answer to (1) doesn't match, send a message to the right thread instead of operating on the resource directly (Rule 2, Rule 3).
3. Are you sending a message to the thread you're currently on? If yes, use spdk_thread_exec_msg instead (Rule 1).
4. Are you holding a spinlock? The framework checks this on poller return via SPIN_ASSERT at lib/thread/thread.c:1005. A lock held across a poller return is undefined behavior, and it will deadlock the moment the thread tries to migrate or a different poller tries to take the same lock.
5. Is the poller doing anything that could block? A list of things that are subtly blocking: getaddrinfo, open on a network socket, malloc of a large buffer, reading from a regular file descriptor, calling into Go via CGo (which can yield). If any of these are in the hot path, the reactor is yielding and you have a Rule 5 violation.
6. Is the channel still alive? If you cached a channel pointer from a previous get and the corresponding put has happened, the pointer is dangling. Always re-acquire.
7. Did you call spdk_thread_exit on all non-app threads before app shutdown? If not, you'll see "spdk_thread_exit was not called on thread" at lib/event/reactor.c:1045 at shutdown. The framework tolerates this for the app thread but not for others.
Edge cases & subtle traps
These are the cases that don't show up in the rule statements but show up in the wild.
1. The "innocent global variable" trap
You have a global struct my_state g_state
and you read it from a poller. The poller is on one
thread. The write happens on another (via a message
or an RPC handler). Without a memory fence, the
read might be stale. Without an atomic, the read
might be torn. Globals are shared state.
Shared state needs synchronization. The framework
gives you spdk_spinlock; use it.
2. The "I called send_msg to the right thread, so I'm safe"
No. spdk_thread_send_msg is asynchronous.
The function returns before the message is delivered.
If your code continues to use a resource that the
message is supposed to free, you'll race the message
handler. Send-and-forget is fine for
fire-and-forget. Send-and-immediately-use is not.
The convention for "I need a completion" is to
send a follow-up message back to the requester.
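The follow-up-message convention looks like this in a toy model with two message rings. All names (toy_thread, toy_request, toy_do_work, toy_on_reply) are hypothetical. The requester treats the result as valid only after its own reply handler has run.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RING_SIZE 8

typedef void (*toy_msg_fn)(void *ctx);
struct toy_msg { toy_msg_fn fn; void *ctx; };
struct toy_thread { struct toy_msg ring[TOY_RING_SIZE]; unsigned head, tail; };

static void toy_send_msg(struct toy_thread *t, toy_msg_fn fn, void *ctx)
{
    assert(t->tail - t->head < TOY_RING_SIZE); /* ring must not be full */
    t->ring[t->tail++ % TOY_RING_SIZE] = (struct toy_msg){ fn, ctx };
}

/* Deliver everything queued on t. Returns the number of messages run. */
static int toy_thread_poll(struct toy_thread *t)
{
    int n = 0;

    while (t->head != t->tail) {
        struct toy_msg m = t->ring[t->head++ % TOY_RING_SIZE];
        m.fn(m.ctx);
        n++;
    }
    return n;
}

struct toy_request {
    struct toy_thread *requester; /* where the reply must be delivered */
    int input;
    int result;
    int done; /* set only by the reply handler, on the requester */
};

/* Runs on the requester: the only place that marks the request done. */
static void toy_on_reply(void *ctx)
{
    struct toy_request *r = ctx;
    r->done = 1;
}

/* Runs on the worker: do the work, then send the follow-up message. */
static void toy_do_work(void *ctx)
{
    struct toy_request *r = ctx;

    r->result = r->input * 2;
    toy_send_msg(r->requester, toy_on_reply, r);
}
```

Between the send and the reply the request belongs to the worker. The requester must not read result or free the request until done is set.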
3. The "I'm in a CGo callback, so I'm on a Go thread" trap
spdk_get_thread returns the
spdk_thread associated with the
current pthread (via thread-local storage).
If you're in a CGo callback running on a Go runtime
pthread, the pthread is not a reactor pthread, and
spdk_get_thread returns NULL. Calling
any SPDK API that needs an spdk_thread will assert
or return an error. From a CGo callback,
you're not in the framework. The safe
pattern is: in CGo, just marshal data and return.
Use SPDK's RPC framework for any actual work.
4. The "poller is on a busy loop and the timer never fires"
You have a busy poller that does
while (!done) { poll_something(); }.
You're waiting for an event that will set
done = true. The event is supposed to
arrive via spdk_thread_send_msg.
But the reactor is in your busy poller; it
never returns to the loop top; the message is
never processed; done is never
set; your poller spins forever. Never
busy-wait for a message in a poller.
Return from the poller and let the message
run on the next iteration.
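The spin can be reproduced in a toy single-thread reactor. The sketch assumes the reactor delivers queued messages at the top of each iteration and then runs the poller. toy_reactor, toy_bad_poller and toy_good_poller are hypothetical names, and the bad poller is capped at 1000 spins only so the demonstration terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*toy_msg_fn)(void *ctx);
typedef int (*toy_poller_fn)(void *arg);

/* One-thread toy reactor with a single pending-message slot. */
struct toy_reactor { toy_msg_fn pending_fn; void *pending_ctx; };

struct toy_state {
    struct toy_reactor *reactor;
    bool requested;
    bool done;
    int spins;
};

static void toy_send_msg(struct toy_reactor *r, toy_msg_fn fn, void *ctx)
{
    assert(r->pending_fn == NULL); /* one slot is enough for this demo */
    r->pending_fn = fn;
    r->pending_ctx = ctx;
}

/* One iteration: deliver the queued message, then run the poller. */
static void toy_reactor_iterate(struct toy_reactor *r, toy_poller_fn poller, void *arg)
{
    toy_msg_fn fn = r->pending_fn;

    if (fn != NULL) {
        r->pending_fn = NULL;
        fn(r->pending_ctx);
    }
    poller(arg);
}

static void toy_set_done(void *ctx)
{
    struct toy_state *s = ctx;
    s->done = true;
}

/* Wrong: waits inside the poller for a message that cannot arrive. */
static int toy_bad_poller(void *arg)
{
    struct toy_state *s = arg;

    toy_send_msg(s->reactor, toy_set_done, s);
    while (!s->done && s->spins < 1000) { /* real code would spin forever */
        s->spins++;
    }
    return 1;
}

/* Right: request once, return, and look at the flag on later calls. */
static int toy_good_poller(void *arg)
{
    struct toy_state *s = arg;

    if (!s->requested) {
        toy_send_msg(s->reactor, toy_set_done, s);
        s->requested = true;
    }
    return s->done ? 0 : 1; /* 1 while still waiting */
}
```

With the bad poller, done is still false after every spin is used up. With the good poller, the second iteration delivers the message before the poller runs and the flag is already set.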
5. The "I unregistered a poller, but it's still firing"
spdk_poller_unregister is
asynchronous. The poller's state goes to
UNREGISTERED, but the actual
free happens on the next
reactor iteration. If your code expects
the poller to be gone immediately, you'll
be surprised. Don't reuse the
poller pointer after unregister.
The framework sets it to NULL for you
(it's passed by reference), and you should
treat that NULL as the contract.
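The pass-by-reference contract is easy to show in miniature. This is a toy with hypothetical names (toy_poller, toy_poller_unregister, toy_reactor_reap), modelling only the two facts stated above: the caller's handle is cleared immediately, and the memory is released later by the reactor.

```c
#include <assert.h>
#include <stdlib.h>

enum toy_poller_state { TOY_POLLER_WAITING, TOY_POLLER_UNREGISTERED };
struct toy_poller { enum toy_poller_state state; };

static struct toy_poller *toy_poller_register(void)
{
    /* calloc zeroes the struct, so state starts as TOY_POLLER_WAITING. */
    return calloc(1, sizeof(struct toy_poller));
}

/* Marks the poller for deferred free and clears the caller's handle.
 * Calling it again on the now-NULL handle is a harmless no-op. */
static void toy_poller_unregister(struct toy_poller **pp)
{
    if (pp == NULL || *pp == NULL) {
        return;
    }
    (*pp)->state = TOY_POLLER_UNREGISTERED;
    *pp = NULL;
}

/* Reactor side, on a later iteration: free pollers marked unregistered.
 * Returns 1 if the poller was freed, 0 if it is still live. */
static int toy_reactor_reap(struct toy_poller *p)
{
    assert(p != NULL);
    if (p->state != TOY_POLLER_UNREGISTERED) {
        return 0;
    }
    free(p);
    return 1;
}
```

Only the reactor's own reference is allowed to touch the poller after unregister. Any copy of the pointer the caller made beforehand is a bug waiting to happen.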
6. The "I'm in an interrupt handler, so I can't call send_msg"
Actually you can. The framework's
spdk_thread_send_critical_msg
exists for exactly this. It uses an atomic
CAS instead of a ring, and the message runs
first on the next reactor iteration. Use
critical messages from signal handlers, not
regular messages. Regular messages
can allocate from the mempool, which can take
a lock; critical messages never allocate.
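The "atomic CAS instead of a ring" idea can be sketched with C11 atomics. This is a model under assumptions, not the SPDK implementation: toy_thread, toy_send_critical_msg and toy_run_critical_msg are hypothetical, and passing a context pointer at run time is a simplification made so the effect is observable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*toy_crit_fn)(void *ctx);

/* A single atomic slot: no ring, no allocation, no lock. */
struct toy_thread { _Atomic(toy_crit_fn) critical_msg; };

/* Sender side. Returns 0 if the slot was free and is now claimed,
 * -1 if a critical message is already pending. */
static int toy_send_critical_msg(struct toy_thread *t, toy_crit_fn fn)
{
    toy_crit_fn expected = NULL;

    assert(fn != NULL);
    return atomic_compare_exchange_strong(&t->critical_msg, &expected, fn) ? 0 : -1;
}

/* Reactor side, first thing in an iteration: take the slot and run it.
 * Returns 1 if a critical message ran, 0 if the slot was empty. */
static int toy_run_critical_msg(struct toy_thread *t, void *ctx)
{
    toy_crit_fn fn = atomic_exchange(&t->critical_msg, (toy_crit_fn)NULL);

    if (fn == NULL) {
        return 0;
    }
    fn(ctx);
    return 1;
}

/* Example handler: bump a counter supplied by the reactor. */
static void toy_mark(void *ctx)
{
    *(int *)ctx += 1;
}
```

Because there is only one slot, a second send before the reactor runs is rejected rather than queued. The sender has to be prepared for that.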
7. The "channel is alive but the device isn't"
You acquired a channel for io_device X. Then
someone called spdk_io_device_unregister.
The unregister is deferred until all references
to X are released. Your channel is still
valid; the framework is just refusing to
create new ones. Once the last outstanding channel is put, the device
is freed. If you try to get a new channel for X
in the meantime, you'll get NULL with an error
log. Don't do that.
8. The "I'm using a Go runtime mutex from a C poller"
Go's sync.Mutex is implemented in
Go, with the Go scheduler. If you take it in C
and a Go goroutine tries to release it, the
release happens on whatever pthread the Go
scheduler chose. There's no contract with
the SPDK threading model. Don't take
Go locks from C. Don't take C locks from
Go. Use a Go-only channel for the handoff
if you need synchronization.
9. The "I called poller_unregister from a non-thread context"
The function at lib/thread/thread.c:1836 does
thread = spdk_get_thread(); if (!thread)
{ assert(false); return; }. If you're
calling this from a pthread that isn't a reactor,
the assert fires. Poller lifecycle
operations must happen on the thread that owns
the poller. If you need to unregister
a poller on a different thread, send a message
to the owning thread asking it to do the
unregister.
10. The "spdk_thread_send_msg aborts on enqueue failure"
At lib/thread/thread.c:1452,
the enqueue failure path is abort.
Not return, not log — abort. The ring is
65536 slots. The only way to fill it is for
the target thread to be completely starved
(its reactor is blocked). If you ever see
"msg could not be enqueued" in a log, the
target reactor is wedged. Look for a
blocking poller on the target reactor.
What to take away
Five rules. They look simple. They are simple. The
challenge is that "the right thread" is an
instance of a thread, not a property
of a pthread. The framework's
spdk_thread is the unit of
identity. Your code has to think in terms of
spdk_threads, not in terms of
pthreads, cores, or Go goroutines.
Every bug you ever debug with SPDK threading will reduce to one of these five rules. When you see the bug, find the rule. When you write the code, name the rule you're enforcing. When you review someone else's code, ask which rule each call site satisfies.
You're now done with Layer 2. The next layer —
3.1 — JSON-RPC
over a Unix socket — is the first layer
built on top of this model. Every RPC handler
runs on an spdk_thread. Every
channel was acquired on the right thread. No
poller blocks. The rules are about to
become relevant in a new way.