Layer 2 · Threading & reactor

The threading rules.

Five rules. Learn them. Every bug you'll ever have with SPDK threading is a violation of one of them. This page is the single most important reference in the curriculum.

~15 min read · 1 diagram · prerequisite: Layer 2.3
On this page
  1. Why this page exists
  2. Rule 1: Don't send yourself a message
  3. Rule 2: Channels are bound to one thread
  4. Rule 3: Complete bdev_io on the right thread
  5. Rule 4: Initialize via spdk_app_start
  6. Rule 5: Don't block the reactor
  7. The QMP quit wedge, in terms of these rules
  8. The diskengine client and these rules
  9. The threading violation checklist
  10. Edge cases & subtle traps

Why this page exists

The previous three pages described the threading model: reactors, threads, channels, pollers, messages. The model is flexible. The flexibility is the point. But flexibility is not safety, and a flexible threading model with no rules is just a flexible deadlock. SPDK has rules. The rules are enforced by assert() in the C source, but the asserts only fire in the failure path — by the time you see them, something has already gone wrong.

This page is the rules. Read it. Bookmark it. When you're debugging a "the bdev isn't completing" or "the RPC handler just hangs" problem, come back here and check each rule. One of them is the answer.

Rule 1: Don't send yourself a message

If you're on thread T, don't call spdk_thread_send_msg(T, fn, ctx) as a way to "do work later." The message will be enqueued on T's message ring and delivered on the next reactor iteration — which is after your current function returns. If your function is waiting for the work to complete (a common mistake), you deadlock yourself. The reactor can never process the message because the function is still on the call stack.

Wrong
/* Pollers return int: SPDK_POLLER_BUSY or SPDK_POLLER_IDLE. */
int
my_poller(void *arg)
{
    struct my_state *s = arg;
    int rc;

    /* Submit some work, then wait for completion. */
    spdk_thread_send_msg(spdk_get_thread(), do_later, s);
    /* ^ This is wrong: do_later() will never run
     * because we're still on the call stack. */

    rc = wait_for_completion(s);   /* spins forever */
    return rc == 0 ? SPDK_POLLER_BUSY : SPDK_POLLER_IDLE;   /* never reached */
}
Right
/* Pollers return int: SPDK_POLLER_BUSY or SPDK_POLLER_IDLE. */
int
my_poller(void *arg)
{
    struct my_state *s = arg;

    /* Do the work synchronously, since we're
     * already on the right thread. */
    do_later(s);
    /* ... or, if you really need deferred delivery,
     * use spdk_thread_send_msg to a DIFFERENT
     * thread. */
    return SPDK_POLLER_BUSY;
}
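
To see why the wait can never finish, here is a minimal toy model in plain C. It is not SPDK code: toy_thread, toy_send_msg, and toy_poll are hypothetical stand-ins for an spdk_thread, spdk_thread_send_msg, and one reactor iteration. The only point is that a queued message runs when the loop next drains the ring, never while the sender is still executing.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_RING_SIZE 8

typedef void (*toy_msg_fn)(void *ctx);

struct toy_msg {
    toy_msg_fn fn;
    void *ctx;
};

/* Hypothetical stand-in for an spdk_thread: just a FIFO of messages. */
struct toy_thread {
    struct toy_msg ring[TOY_RING_SIZE];
    size_t head;   /* next message to run */
    size_t tail;   /* next free slot */
};

static void toy_thread_init(struct toy_thread *t)
{
    t->head = 0;
    t->tail = 0;
}

/* Queue fn(ctx). Returns false if the ring is full. Never runs fn itself. */
static bool toy_send_msg(struct toy_thread *t, toy_msg_fn fn, void *ctx)
{
    if (t->tail - t->head == TOY_RING_SIZE) {
        return false;
    }
    t->ring[t->tail % TOY_RING_SIZE] = (struct toy_msg){ fn, ctx };
    t->tail++;
    return true;
}

/* One loop iteration: run the messages that were queued before it started.
 * Returns how many messages ran. */
static int toy_poll(struct toy_thread *t)
{
    size_t end = t->tail;
    int ran = 0;

    while (t->head != end) {
        struct toy_msg m = t->ring[t->head % TOY_RING_SIZE];
        t->head++;
        m.fn(m.ctx);
        ran++;
    }
    return ran;
}

static void mark_done(void *ctx)
{
    *(bool *)ctx = true;
}

/* Sends a message to its own thread, then reports whether the work
 * has happened by the time the sender is about to return. */
static bool done_before_return(struct toy_thread *self, bool *done)
{
    *done = false;
    toy_send_msg(self, mark_done, done);
    return *done;   /* still false: nothing has drained the ring yet */
}
```

In this model, done_before_return always reports false, and the flag flips only after a later toy_poll call. A sender that loops until the flag is set is waiting on a poll that cannot start until the sender returns.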

Rule 2: Channels are bound to one thread

An spdk_io_channel is bound to the spdk_thread that acquired it. Use it only from that thread. This is enforced by the wrong_thread check in spdk_put_io_channel and spdk_io_channel_ref. The check is only a few lines:

if (ch->thread != spdk_get_thread()) {
    wrong_thread("spdk_io_channel_ref", "ch", ch->thread, spdk_get_thread());
    return NULL;
}

There's no per-subsystem enforcement, only the framework-level check on put. The subsystem itself has to enforce the rule on submit. For the bdev, the check is in the bdev's submit callback, but only as an assertion, not as a runtime guard. The rule is: if your code is on a different thread than the channel, send a message to the channel's thread and touch the channel from there.
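
The shape of that check, and of the "hop to the owning thread" response, can be sketched without SPDK. Everything here is a hypothetical toy: toy_current stands in for whatever spdk_get_thread would return, and bumping pending_msgs stands in for sending a message to the owner.

```c
#include <stddef.h>

struct toy_thread {
    int id;
    int pending_msgs;   /* messages waiting for this thread */
};

struct toy_channel {
    struct toy_thread *owner;   /* thread that acquired the channel */
    int uses;                   /* times the channel was used on its owner */
};

/* Stand-in for the thread-local "current thread". NULL means the caller
 * is a plain pthread the framework does not know about. */
static struct toy_thread *toy_current;

enum toy_result { TOY_USED, TOY_FORWARDED, TOY_REJECTED };

/* Use the channel if we are on its owner, forward the request to the
 * owner if we are on another known thread, reject it otherwise. */
static enum toy_result toy_channel_use(struct toy_channel *ch)
{
    if (toy_current == NULL) {
        return TOY_REJECTED;
    }
    if (ch->owner != toy_current) {
        ch->owner->pending_msgs++;   /* models a message to the owner */
        return TOY_FORWARDED;
    }
    ch->uses++;
    return TOY_USED;
}
```

The channel's state only ever changes in the TOY_USED branch, which is the property the rule is protecting.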

What this looks like in the wild

The diskengine spdkclient Client.Call:43 doesn't violate this rule, because the Go side doesn't hold channel pointers. But consider a Go code path that calls into C, asks for a channel, and tries to use it from a different Go goroutine. The Go goroutine is on a different OS thread, so spdk_get_thread() on the C side returns NULL (or, worse, the wrong SPDK thread). Either way, the put is rejected. The rule applies even if you don't see the SPDK threads directly.

Rule 3: Complete a bdev_io on the right thread

A bdev_io must be completed on the thread that submitted it. This is a specific instance of Rule 2 (the bdev_io carries a reference to the channel that submitted it), but it's worth its own rule because it's the most common violation in practice.

The bdev framework's submit path looks roughly like:

void
bdev_submit_request(struct spdk_io_channel *ch, struct spdk_bdev_io *bdev_io)
{
    struct spdk_bdev *bdev = bdev_io->bdev;

    /* ... validate, then hand the I/O to the module ... */
    bdev->fn_table->submit_request(ch, bdev_io);
    /* submit_request returns void. A module that refuses the I/O
     * reports that by completing the bdev_io with a failure status. */
}

The module's submit callback submits to the underlying device. The completion call (spdk_bdev_io_complete) is the one that has to happen on the right thread. The pieces involved:

  1. Every spdk_bdev_io records the channel it was submitted on. The channel knows its thread, and spdk_bdev_io_get_thread returns it.
  2. When the device's poller drains the completion queue, it calls spdk_bdev_io_complete.
  3. That call must be made on the submitting channel's thread. If the module notices the completion on some other thread, it has to hop over with spdk_thread_send_msg first; don't count on the framework to redirect a cross-thread completion for you.

If you ever see a "the I/O completed on the wrong thread" error, the bdev module's submit code is doing one of these:

  • Storing the bdev_io and completing it from a different thread (often a Go goroutine that crossed the CGo boundary).
  • Using a stale bdev_io pointer after the channel was put and re-acquired (the bdev_io was freed and reallocated, and you're completing the old one).
  • Calling spdk_bdev_io_complete from inside a poller that's not on the channel's thread, with no send_msg redirection.
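
The send_msg redirection that the last bullet refers to can be modeled in a few lines. This is a single-threaded toy with hypothetical names (toy_io, toy_complete, toy_poll); the inbox is a plain linked list and is not safe for real concurrent use. It only shows the control flow: complete inline when already on the submitter, otherwise hand the I/O to the submitter and let its loop finish it.

```c
#include <stddef.h>

struct toy_thread;

struct toy_io {
    struct toy_thread *submitter;      /* thread that submitted the I/O */
    struct toy_thread *completed_on;   /* NULL until the completion runs */
    struct toy_io *next;               /* link in the submitter's inbox */
};

struct toy_thread {
    struct toy_io *inbox;   /* completions handed over by other threads */
};

/* Runs the completion. Only ever called with the submitter as self. */
static void toy_finish(struct toy_io *io, struct toy_thread *self)
{
    io->completed_on = self;
}

/* Called by whoever notices the device is done; current is the caller. */
static void toy_complete(struct toy_io *io, struct toy_thread *current)
{
    if (current == io->submitter) {
        toy_finish(io, current);
        return;
    }
    /* Wrong thread: hand the I/O to the submitter instead of finishing it. */
    io->next = io->submitter->inbox;
    io->submitter->inbox = io;
}

/* One loop iteration on self: finish everything that was handed over.
 * Returns the number of completions that ran. */
static int toy_poll(struct toy_thread *self)
{
    int n = 0;

    while (self->inbox != NULL) {
        struct toy_io *io = self->inbox;
        self->inbox = io->next;
        io->next = NULL;
        toy_finish(io, self);
        n++;
    }
    return n;
}
```

Whichever thread calls toy_complete, completed_on always ends up equal to submitter, which is the invariant Rule 3 states.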

Rule 4: Initialize via spdk_app_start

Every SPDK program must call spdk_app_start as its entry point. The framework initializes the reactors, the thread library, the event framework, the JSON-RPC server, and the memory pool. If you try to do any of this manually — by calling spdk_thread_create directly, or spdk_poller_register, or even spdk_env_init from your own main loop — you'll get undefined behavior, missing initialization, or an assert.

Rule 5: Don't block the reactor

Code running on a reactor (poller, message handler, or anything inside spdk_thread_poll) must not block on a kernel primitive. This was covered in detail in 2.1; the threading rule is the consequence: if a poller blocks, the reactor doesn't return, and every other thread on that core is starved.

The full set of "don'ts" for a reactor:

Don't | Why | What to do instead
read / write on a non-SPDK file descriptor | May sleep; the kernel preempts your pthread. | Use spdk_io_channel for the subsystem; let its poller do the I/O.
Take a pthread_mutex held by another pthread | Will block until the holder releases; could be never. | Use spdk_thread_send_msg to get the work done on the holding thread.
malloc a large buffer | May page in zeroed pages; might block on swap. | Pre-allocate in startup. Use mempool.
sleep / usleep | Explicit yield to the kernel. Defeats the whole point. | Use a periodic poller.
sched_yield | Same as sleep: gives up the CPU. | Don't yield. Trust the model.
Go CGo call into a blocking Go function | Go's scheduler can park the goroutine on any pthread, including the reactor's. | Issue an RPC, don't synchronously call into Go from C.
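
The "use a periodic poller" alternative to sleep boils down to checking a deadline and returning at once when it has not passed. A minimal sketch in plain C, with hypothetical names and an abstract tick counter in place of a real clock:

```c
#include <stdint.h>

struct toy_timed_poller {
    uint64_t period;     /* ticks between runs */
    uint64_t next_run;   /* tick at which the work is next due */
    int runs;            /* how many times the work has run */
};

/* Called on every loop iteration with the current tick. Never waits:
 * returns 0 immediately when nothing is due, 1 after one unit of work. */
static int toy_timed_poll(struct toy_timed_poller *p, uint64_t now)
{
    if (now < p->next_run) {
        return 0;
    }
    p->runs++;                       /* the periodic work goes here */
    p->next_run = now + p->period;
    return 1;
}
```

The loop stays free to run other pollers and messages between deadlines. In SPDK itself you would not write this by hand: spdk_poller_register takes a period in microseconds and the framework does the deadline check for you.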

The QMP quit wedge, in terms of these rules

The "QMP quit wedge" is a real failure mode you'll see in the nvmf / vhost layers (Layer 7). It happens when a VM issues a quit via QMP and the SPDK target wedges. Translating into the threading rules:

  1. The VM issues quit, which QEMU forwards to the SPDK vhost-user backend.
  2. The backend's virtio poller fires (a poller on the vhost thread). It sees the QMP quit notification.
  3. The poller decides to drain all in-flight I/O and shut down the vhost connection.
  4. To drain, it needs to wait for completions on every bdev_io submitted under the connection.
  5. The bdev_io completions will arrive via spdk_thread_send_msg on the thread that submitted them — but those messages cannot be processed while the poller is still running (the poller is on the same thread, see Rule 1).
  6. The poller spins waiting for the completions, which will never arrive while it's holding the thread. The connection never drains. The VM appears to hang. The nvmf target appears to be dead. The wedge.

The fix: send a message to the bdev poller's thread asking it to drain, return from the poller, let the message run on the next iteration. This is a structural pattern — "I can't do X in a poller, send a message to do X, return from the poller." The threading rules make it explicit.
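
The "start the drain, return, check again next time" shape looks like this as a toy state machine. The names are hypothetical and toy_deliver_completion stands in for a completion message being processed between poller runs; none of this is the vhost code.

```c
#include <stdbool.h>

struct toy_conn {
    int outstanding;   /* I/Os submitted but not yet completed */
    bool draining;     /* a quit was seen, waiting for outstanding == 0 */
    bool closed;       /* drain finished, connection torn down */
};

/* Models one completion message being handled by the loop. */
static void toy_deliver_completion(struct toy_conn *c)
{
    if (c->outstanding > 0) {
        c->outstanding--;
    }
}

/* The poller. It notes the quit, checks progress once, and returns.
 * It never loops waiting for outstanding to reach zero. */
static int toy_conn_poll(struct toy_conn *c, bool quit_requested)
{
    if (quit_requested) {
        c->draining = true;
    }
    if (!c->draining || c->closed) {
        return 0;
    }
    if (c->outstanding == 0) {
        c->closed = true;
    }
    return 1;
}

/* Drives the loop: each iteration handles one completion, then runs the
 * poller. The quit arrives on the first iteration. Returns the iteration
 * on which the connection closed, or -1 if it never did. */
static int toy_run_until_closed(struct toy_conn *c, int max_iters)
{
    int i;

    for (i = 1; i <= max_iters; i++) {
        toy_deliver_completion(c);
        toy_conn_poll(c, i == 1);
        if (c->closed) {
            return i;
        }
    }
    return -1;
}
```

Because the poller returns on every call, the completions get their turn and the drain finishes. Replace the single check with a while loop inside the poller and toy_deliver_completion never runs again, which is the wedge.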

The diskengine client and these rules

The diskengine spdkclient at CreateClientWithJsonCodec:232 is a JSON-RPC client. It runs in Go, on a Go goroutine, possibly on a Go runtime thread that has no relationship to any SPDK reactor. From the SPDK threading rules' perspective:

  • Rule 1 doesn't apply — the Go side never sends a message to an SPDK thread directly. It sends an RPC request to the SPDK RPC server. The framework routes the request to a thread.
  • Rule 2 doesn't apply — the Go side never holds an spdk_io_channel pointer across an RPC call. Channels are internal.
  • Rule 3 doesn't apply — the Go side never sees a bdev_io. bdev_ios are internal to the bdev framework.
  • Rule 4 doesn't apply — diskengine is a Go program that calls into SPDK via RPC. It does not embed the SPDK event loop; it talks to a separate SPDK process.
  • Rule 5 doesn't apply — the Go runtime can do whatever it wants. It's not on a reactor.

The implication: if diskengine ever stops being a JSON-RPC client and starts embedding the SPDK framework (via CGo), every one of the five rules starts to apply. The "RPC client" pattern is the safe path. The "embedded SPDK" pattern is the path where the rules bite.

The threading violation checklist

When you have a bug that smells like a threading violation — "the I/O is hanging," "the RPC never returns," "the channel is in a weird state," "the process aborted with a wrong_thread error" — run through this checklist:

  1. Which spdk_thread is the code running on? Get this from spdk_get_thread or by walking the call stack. If you can't answer, you're in a pthread that the framework doesn't know about — that's already the bug (Rule 4).
  2. Which spdk_thread is the resource bound to? Channels and bdev_ios are bound to specific threads. If the answer to (1) doesn't match, send a message to the right thread instead of operating on the resource directly (Rule 2, Rule 3).
  3. Are you sending a message to the thread you're currently on? If yes, use spdk_thread_exec_msg instead, which runs the function inline when the caller is already on the target thread (Rule 1).
  4. Are you holding a spinlock? The framework checks this on poller return via SPIN_ASSERT at lib/thread/thread.c:1005. A lock held across a poller return is undefined behavior, and it will deadlock the moment the thread tries to migrate or a different poller tries to take the same lock.
  5. Is the poller doing anything that could block? A list of things that are subtly blocking: getaddrinfo, open on a network socket, malloc of a large buffer, reading from a regular file descriptor, calling into Go via CGo (which can yield). If any of these are in the hot path, the reactor is yielding and you have a Rule 5 violation.
  6. Is the channel still alive? If you cached a channel pointer from a previous get and the corresponding put has happened, the pointer is dangling. Always re-acquire.
  7. Did you call spdk_thread_exit on all non-app threads before app shutdown? If not, you'll see "spdk_thread_exit was not called on thread" at lib/event/reactor.c:1045 at shutdown. The framework tolerates this for the app thread but not for others.

Edge cases & subtle traps

These are the cases that don't show up in the rule statements but show up in the wild.

1. The "innocent global variable" trap

You have a global struct my_state g_state and you read it from a poller. The poller is on one thread. The write happens on another (via a message or an RPC handler). Without a memory fence, the read might be stale. Without an atomic, the read might be torn. Globals are shared state. Shared state needs synchronization. The framework gives you spdk_spinlock; use it.
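
For the narrow case of a value that is written once and then only read, C11 atomics are enough to avoid both the stale and the torn read. This sketch swaps in a release/acquire flag rather than the spdk_spinlock the text recommends, and all names are hypothetical. It assumes exactly one call to toy_publish; anything that is updated repeatedly needs a lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct toy_config {
    int queue_depth;
    int timeout_ms;
};

static struct toy_config toy_cfg;   /* payload, written before publishing */
static atomic_bool toy_cfg_ready;   /* publication flag, starts false */

/* Writer side: fill in the payload, then publish it with a release store
 * so the payload writes are visible to anyone who sees the flag set. */
static void toy_publish(int queue_depth, int timeout_ms)
{
    toy_cfg.queue_depth = queue_depth;
    toy_cfg.timeout_ms = timeout_ms;
    atomic_store_explicit(&toy_cfg_ready, true, memory_order_release);
}

/* Reader side: copy the payload out only after an acquire load has
 * observed the flag. Returns false if nothing was published yet. */
static bool toy_read(struct toy_config *out)
{
    if (!atomic_load_explicit(&toy_cfg_ready, memory_order_acquire)) {
        return false;
    }
    *out = toy_cfg;
    return true;
}
```

A poller can call toy_read on every iteration without blocking, and it either gets nothing or gets the complete configuration.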

2. The "I called send_msg to the right thread, so I'm safe"

No. spdk_thread_send_msg is asynchronous. The function returns before the message is delivered. If your code continues to use a resource that the message is supposed to free, you'll race the message handler. Send-and-forget is fine when nothing you do next depends on the message having run. Send-and-immediately-use is not. The convention for "I need a completion" is to send a follow-up message back to the requester.
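
The follow-up-message convention, reduced to a toy with one-slot mailboxes. All names are hypothetical, and a real implementation would use the thread's message ring rather than a single slot. The requester learns that the work is finished only when the reply lands on its own loop.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_thread;

struct toy_request {
    struct toy_thread *reply_to;   /* thread that asked for the release */
    bool released;                 /* set on the owner when the work is done */
    bool acked;                    /* set on the requester when the reply lands */
};

enum toy_kind { TOY_NONE, TOY_RELEASE, TOY_RELEASE_DONE };

/* A one-slot mailbox per thread keeps the sketch short. */
struct toy_thread {
    enum toy_kind kind;
    struct toy_request *req;
};

/* Asynchronous by construction: it only fills the mailbox and returns. */
static void toy_send(struct toy_thread *to, enum toy_kind kind,
                     struct toy_request *req)
{
    to->kind = kind;
    to->req = req;
}

/* One loop iteration on self: handle whatever is in the mailbox. */
static void toy_poll(struct toy_thread *self)
{
    enum toy_kind kind = self->kind;
    struct toy_request *req = self->req;

    self->kind = TOY_NONE;
    self->req = NULL;
    if (kind == TOY_RELEASE) {
        req->released = true;                             /* work, on the owner */
        toy_send(req->reply_to, TOY_RELEASE_DONE, req);   /* follow-up message */
    } else if (kind == TOY_RELEASE_DONE) {
        req->acked = true;                                /* requester may move on */
    }
}
```

Right after toy_send returns, neither flag is set. Code on the requester that depends on the release belongs in the TOY_RELEASE_DONE branch, not on the line after the send.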

3. The "I'm in a CGo callback, so I'm on a Go thread" trap

spdk_get_thread returns the spdk_thread associated with the current pthread (via thread-local storage). If you're in a CGo callback running on a Go runtime pthread, the pthread is not a reactor pthread, and spdk_get_thread returns NULL. Calling any SPDK API that needs an spdk_thread will assert or return an error. From a CGo callback, you're not in the framework. The safe pattern is: in CGo, just marshal data and return. Use SPDK's RPC framework for any actual work.

4. The "poller is on a busy loop and the timer never fires"

You have a busy poller that does while (!done) { poll_something(); }. You're waiting for an event that will set done = true. The event is supposed to arrive via spdk_thread_send_msg. But the reactor is in your busy poller; it never returns to the loop top; the message is never processed; done is never set; your poller spins forever. Never busy-wait for a message in a poller. Return from the poller and let the message run on the next iteration.

5. The "I unregistered a poller, but it's still firing"

spdk_poller_unregister is asynchronous. The poller's state goes to UNREGISTERED, but the actual free happens on the next reactor iteration. If your code expects the poller to be gone immediately, you'll be surprised. Don't reuse the poller pointer after unregister. The framework sets it to NULL for you (it's passed by reference), and you should treat that NULL as the contract.
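
The by-reference contract is easy to model. In this hypothetical toy, the loop owns at most one poller; unregister marks it and clears the caller's pointer, and the memory is released on the next iteration.

```c
#include <stdlib.h>

enum toy_state { TOY_ACTIVE, TOY_UNREGISTERED };

struct toy_poller {
    enum toy_state state;
};

/* A loop that owns at most one poller, to keep the sketch short. */
struct toy_loop {
    struct toy_poller *slot;
};

static struct toy_poller *toy_poller_register(struct toy_loop *loop)
{
    struct toy_poller *p = calloc(1, sizeof(*p));

    if (p != NULL) {
        p->state = TOY_ACTIVE;
        loop->slot = p;
    }
    return p;
}

/* Takes the caller's pointer by reference: marks the poller and clears
 * the pointer, but leaves the memory for the loop to release later. */
static void toy_poller_unregister(struct toy_poller **pp)
{
    if (pp == NULL || *pp == NULL) {
        return;   /* nothing to do, or already unregistered */
    }
    (*pp)->state = TOY_UNREGISTERED;
    *pp = NULL;
}

/* One loop iteration: release a poller that was unregistered earlier.
 * Returns 1 if something was released, 0 otherwise. */
static int toy_loop_iterate(struct toy_loop *loop)
{
    if (loop->slot != NULL && loop->slot->state == TOY_UNREGISTERED) {
        free(loop->slot);
        loop->slot = NULL;
        return 1;
    }
    return 0;
}
```

Between the unregister call and the next iteration the object still exists, but the caller no longer has a way to reach it. That gap is why a saved copy of the pointer is a bug waiting to happen.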

6. The "I'm in an interrupt handler, so I can't call send_msg"

Actually you can. The framework's spdk_thread_send_critical_msg exists for exactly this. It uses an atomic CAS instead of a ring, and the message runs first on the next reactor iteration. Use critical messages from signal handlers, not regular messages. Regular messages can allocate from the mempool, which can take a lock; critical messages never allocate.
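
The "CAS instead of a ring" idea can be shown with C11 atomics. This is a toy single-slot model, not the SPDK implementation: the names are hypothetical, and an int event code stands in for the function pointer a real critical message would carry.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* The single critical slot. 0 means empty, anything else is a pending
 * event code. File-scope atomics start out zeroed. */
static atomic_int toy_pending;

/* Poster side, e.g. a signal handler. One compare-and-swap, no
 * allocation, no lock. Returns false if the slot is already occupied. */
static bool toy_post_critical(int code)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&toy_pending, &expected, code);
}

/* Loop side: called first on each iteration. Atomically takes the
 * pending code and empties the slot. Returns 0 if nothing was pending. */
static int toy_take_critical(void)
{
    return atomic_exchange(&toy_pending, 0);
}
```

The trade-off of a single slot is visible in the return value of toy_post_critical: a second post before the loop has taken the first one is refused rather than queued.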

7. The "channel is alive but the device isn't"

You acquired a channel for io_device X. Then someone called spdk_io_device_unregister. The unregister is deferred until all references to X are released. Your channel is still valid; the framework is just refusing to create new ones. The moment you put, the device is freed. If you try to get a new channel for X in the meantime, you'll get NULL with an error log. Don't do that.
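
The deferred teardown is reference counting with one extra flag. A toy sketch with hypothetical names, in which the device doubles as its own channel handle to keep it short:

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_device {
    int refs;             /* channels currently held */
    bool unregistering;   /* unregister was requested */
    bool freed;           /* models the final teardown having run */
};

/* Returns a handle, or NULL once unregister has started. */
static struct toy_device *toy_get_channel(struct toy_device *d)
{
    if (d->unregistering) {
        return NULL;
    }
    d->refs++;
    return d;
}

/* Requests teardown. It happens now only if no channel is held. */
static void toy_unregister(struct toy_device *d)
{
    d->unregistering = true;
    if (d->refs == 0) {
        d->freed = true;
    }
}

/* Drops one reference. The last put after an unregister request runs
 * the deferred teardown. */
static void toy_put_channel(struct toy_device *d)
{
    d->refs--;
    if (d->refs == 0 && d->unregistering) {
        d->freed = true;
    }
}
```

An existing handle stays usable after toy_unregister, new gets fail, and the final put is what triggers the teardown.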

8. The "I'm using a Go runtime mutex from a C poller"

Go's sync.Mutex is implemented in Go, with the Go scheduler. If you take it in C and a Go goroutine tries to release it, the release happens on whatever pthread the Go scheduler chose. There's no contract with the SPDK threading model. Don't take Go locks from C. Don't take C locks from Go. Use a Go-only channel for the handoff if you need synchronization.

9. The "I called poller_unregister from a non-thread context"

The function at lib/thread/thread.c:1836 does thread = spdk_get_thread(); if (!thread) { assert(false); return; }. If you're calling this from a pthread that isn't a reactor, the assert fires. Poller lifecycle operations must happen on the thread that owns the poller. If you need to unregister a poller on a different thread, send a message to the owning thread asking it to do the unregister.

10. The "spdk_thread_send_msg aborts on enqueue failure"

At lib/thread/thread.c:1452, the enqueue failure path is abort. Not return, not log: abort. The ring is 65536 slots. The only way to fill it is for the target thread to be completely starved (its reactor is blocked). If you ever see "msg could not be enqueued" in a log, the target reactor is wedged. Look for a blocking poller on the target reactor.
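
Why a full ring points at a starved consumer can be seen with a toy bounded ring. The names are hypothetical and the ring is only 4 slots, against the 65536 cited above for the real one. The sketch counts failed enqueues as a function of how often the consumer gets to run.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 4   /* tiny on purpose */

struct toy_ring {
    size_t produced;   /* messages enqueued so far */
    size_t consumed;   /* messages drained so far */
};

/* Returns false when every slot is occupied. */
static bool toy_enqueue(struct toy_ring *r)
{
    if (r->produced - r->consumed == TOY_SLOTS) {
        return false;
    }
    r->produced++;
    return true;
}

/* One consumer iteration: drain everything. Returns how many were drained. */
static size_t toy_drain(struct toy_ring *r)
{
    size_t n = r->produced - r->consumed;

    r->consumed = r->produced;
    return n;
}

/* Sends total messages, letting the consumer run after every
 * poll_every sends (0 means the consumer never runs).
 * Returns the number of enqueues that failed. */
static size_t toy_failed_sends(size_t total, size_t poll_every)
{
    struct toy_ring r = { 0, 0 };
    size_t failed = 0;
    size_t i;

    for (i = 1; i <= total; i++) {
        if (!toy_enqueue(&r)) {
            failed++;
        }
        if (poll_every != 0 && i % poll_every == 0) {
            toy_drain(&r);
        }
    }
    return failed;
}
```

As long as the consumer runs at least once per TOY_SLOTS sends, nothing fails. With a consumer that never runs, everything after the first TOY_SLOTS messages fails, and in the real framework the first such failure is the abort.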

What to take away

Five rules. They look simple. They are simple. The challenge is that "the right thread" is an instance of a thread, not a property of a pthread. The framework's spdk_thread is the unit of identity. Your code has to think in terms of spdk_threads, not in terms of pthreads, cores, or Go goroutines.

Every bug you ever debug with SPDK threading will reduce to one of these five rules. When you see the bug, find the rule. When you write the code, name the rule you're enforcing. When you review someone else's code, ask which rule each call site satisfies.

You're now done with Layer 2. The next layer, 3.1 (JSON-RPC over a Unix socket), is the first layer built on top of this model. Every RPC handler runs on an spdk_thread. Every channel was acquired on the right thread. No poller blocks. The rules are about to become relevant in a new way.