Layer 2 · Threading & reactor

The reactor poll loop.

A reactor is a thread that loops forever, polling the lightweight threads scheduled on its core and running their registered callbacks. There is one reactor per CPU core in the application's core mask. In the default poll mode the loop never voluntarily gives up the CPU. This single design choice is the reason SPDK gets its numbers.

~15 min read · 1 diagram · prerequisite: Layer 0.1, Layer 0.3

On this page

What a reactor is
One reactor per core
The poll loop, line by line
How pollers get registered
Cooperative scheduling: why you must behave
How reactors are started
What happens at shutdown
Edge cases & what trips people up

What a reactor is

A reactor is a thread that loops forever, calling spdk_thread_poll() for every lightweight thread currently scheduled on its core. It's not a class, not a process, not a pthread in the usual sense — it is a control flow. A reactor is the thing that owns a core for as long as the application runs.

The minimum viable reactor in pseudocode:

while (running) {
    for each spdk_thread in this reactor's thread list:
        spdk_thread_poll(thread, 0, now);
}

That's it. The reactor does not block on user-space primitives. It does not wait for I/O with a syscall. It does not sleep. It runs a tight loop, checking every lightweight thread on its core for ready work — pending messages, expired timers, busy pollers that need to fire.
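The pseudocode above can be made concrete with a small, SPDK-free model. This is only a sketch under stated assumptions: toy_thread, toy_poll_one and toy_reactor_run_once are hypothetical names, and the poll callback stands in for spdk_thread_poll().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an spdk_thread: a poll callback plus its state. */
struct toy_thread {
    int (*poll)(struct toy_thread *t, uint64_t now); /* returns >0 if it did work */
    int pending;                                     /* units of work left */
};

/* Example poll callback: complete one unit of work per call, if any is left. */
static int toy_poll_one(struct toy_thread *t, uint64_t now)
{
    (void)now;
    if (t->pending == 0) {
        return 0; /* idle */
    }
    t->pending--;
    return 1;     /* busy */
}

/* One pass of the reactor loop: poll every thread once, never block.
 * Returns how many threads reported work on this pass. */
static int toy_reactor_run_once(struct toy_thread *threads, size_t n, uint64_t now)
{
    int busy = 0;

    for (size_t i = 0; i < n; i++) {
        if (threads[i].poll(&threads[i], now) > 0) {
            busy++;
        }
    }
    return busy;
}
```

Calling toy_reactor_run_once() in a while loop gives the shape of the real thing: no sleep, no wait, only repeated passes over the thread list.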

One reactor per core

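SPDK's event framework creates exactly one reactor per CPU core that's in the application core mask. The mask is set at startup via --cpumask or --lcores. Which lightweight threads run on which reactor is then up to the scheduler, and the default is the static one: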

spdk_v26_01_migration/lib/event/scheduler_static.c · full file The static scheduler — the default, one-reactor-per-core

The "static" scheduler is a stub. It performs no balancing at all. Each lightweight thread is permanently parked on the lcore that spdk_thread_create() chose for it, and the lcore is permanently associated with a single reactor. The decision to use a real scheduler (like dynamic or gscheduler) only matters when you want threads to migrate between cores for load balancing. The static scheduler is the right answer for most apps.

static const struct spdk_scheduler scheduler = {
    .name = "static",
    .init = init_static,
    .deinit = deinit_static,
    .balance = balance_static,
    .set_opts = set_opts_static,
};
SPDK_SCHEDULER_REGISTER(scheduler);

The balance_static function at lib/event/scheduler_static.c:45 is the entire "rebalancing" pass: it just resets every thread to its initial_lcore and disables further scheduling. With static scheduling, this function never actually runs in a healthy application. Treat it as a recovery routine for when you switch to a different scheduler and back.

Each reactor is a struct spdk_reactor allocated at spdk_reactors_init() time, one slot per logical core on the machine, even if the core isn't in the active mask. Inactive cores get flags.is_valid = false and a no-op reactor slot. Here's the actual struct:

spdk_v26_01_migration/include/spdk_internal/event.h · lines 44-82 struct spdk_reactor — the per-core state

struct spdk_reactor {
    /* Lightweight threads running on this reactor */
    TAILQ_HEAD(, spdk_lw_thread)    threads;
    uint32_t                        thread_count;

    /* Logical core number for this reactor. */
    uint32_t                        lcore;

    struct {
        uint32_t                    is_valid : 1;
        uint32_t                    reserved : 31;
    } flags;

    uint64_t                        tsc_last;

    struct spdk_ring                *events;
    int                             events_fd;

    /* Each bit of cpuset indicates whether a reactor probably
     * requires event notification */
    struct spdk_cpuset              notify_cpuset;

    bool                            in_interrupt;
    /* ... interrupt-mode fields, see Layer 9 ... */
} __attribute__((aligned(SPDK_CACHE_LINE_SIZE)));

Three things to notice:

TAILQ_HEAD(, spdk_lw_thread) threads: the list of lightweight threads currently bound to this reactor (each spdk_lw_thread is the reactor-side handle for one spdk_thread). A reactor may have zero, one, or many threads. The reactor walks this list in _reactor_run() and calls spdk_thread_poll() on each one.
struct spdk_ring *events: an MPSC ring buffer for spdk_events (the cross-reactor message primitive). The events_fd is a Linux eventfd that the reactor reads to be told "there's an event for you."
__attribute__((aligned(SPDK_CACHE_LINE_SIZE))): the entire struct is cache-line aligned. This matters because the structs of two reactors never share a cache line, so a write to reactor A's state does not invalidate the line that reactor B is reading its own state from (no false sharing). (Even more important: the thread_count is read on every loop iteration, so it has to stay hot in L1.)
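To see what the alignment attribute buys, here is a minimal sketch using the GCC/Clang attribute on a made-up struct. The 64-byte line size, toy_reactor, and toy_same_cache_line are assumptions for illustration, not SPDK definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_LINE 64 /* assumed line size; SPDK uses SPDK_CACHE_LINE_SIZE */

/* Hypothetical per-core state, padded out to a whole number of cache lines. */
struct toy_reactor {
    uint32_t lcore;
    uint32_t thread_count;
    uint64_t tsc_last;
} __attribute__((aligned(TOY_CACHE_LINE)));

/* Returns 1 if the two addresses fall in the same cache line, else 0. */
static int toy_same_cache_line(const void *a, const void *b)
{
    return ((uintptr_t)a / TOY_CACHE_LINE) == ((uintptr_t)b / TOY_CACHE_LINE);
}

/* One slot per core, as in the text. Adjacent slots start on separate lines. */
static struct toy_reactor toy_reactors[4];
```

The attribute rounds sizeof(struct toy_reactor) up to a multiple of the alignment, so array neighbours can never straddle a shared line, while the fields of a single slot stay together.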

The poll loop, line by line

Here is the entire reactor loop, verbatim. This is the single most important function in the framework:

spdk_v26_01_migration/lib/event/reactor.c · lines 988-1036 reactor_run() — the entire control flow of a reactor

static int
reactor_run(void *arg)
{
    struct spdk_reactor  *reactor = arg;
    char                 thread_name[32];
    uint64_t             last_sched = 0;

    SPDK_NOTICELOG("Reactor started on core %u\n", reactor->lcore);

    /* Rename the POSIX thread because the reactor is tied to the
     * POSIX thread in the SPDK event library. */
    snprintf(thread_name, sizeof(thread_name), "reactor_%u", reactor->lcore);
    _set_thread_name(thread_name);

    reactor->trace_id = spdk_trace_register_owner(OWNER_TYPE_REACTOR, thread_name);
    reactor->tsc_last = spdk_get_ticks();

    while (1) {
        if (spdk_unlikely(reactor->in_interrupt)) {
            reactor_interrupt_run(reactor);
        } else {
            _reactor_run(reactor);
        }

        /* Periodically check rusage for voluntary / involuntary
         * context switches (debug aid) */
        if (g_framework_context_switch_monitor_enabled) {
            if ((reactor->last_rusage + g_rusage_period) < reactor->tsc_last) {
                get_rusage(reactor);
                reactor->last_rusage = reactor->tsc_last;
            }
        }

        /* Periodically run the dynamic scheduler's balancing */
        if (spdk_unlikely(g_scheduler_period_in_tsc > 0 &&
                          (reactor->tsc_last - last_sched) > g_scheduler_period_in_tsc &&
                          reactor == g_scheduling_reactor &&
                          !g_scheduling_in_progress)) {
            last_sched = reactor->tsc_last;
            g_scheduling_in_progress = true;
            _reactors_scheduler_gather_metrics(NULL, NULL);
        }

        if (g_reactor_state != SPDK_REACTOR_STATE_RUNNING) {
            break;
        }
    }

    /* ... teardown omitted for now, see shutdown section below ... */
}

Read this loop with care. Every line earns its place.

Line 1001–1002: the POSIX thread is renamed via prctl(PR_SET_NAME) to reactor_N. If you ever top -H an SPDK process, this is what you see in the COMMAND column. This is also what shows up in /proc/<pid>/task/<tid>/comm.
Line 1004: registers the reactor as a trace owner. This is what makes spdk_top (see Layer 9.1) able to color per-core activity.
Line 1008: while (1). Forever. There is no usleep, no poll(NULL, 0, X), no sched_yield(). The reactor is pinned and runs until somebody sets g_reactor_state to something other than RUNNING.
Line 1010–1014: the only two real bodies of the loop. reactor_interrupt_run() is the interrupt-mode variant (uses an spdk_fd_group / epoll_wait); _reactor_run() is the poll-mode variant (spins calling spdk_thread_poll on every thread on the core). The default is poll mode. Interrupt mode is opt-in via --interrupt-mode.
Line 1016–1021: when the context-switch monitor is enabled, the reactor calls getrusage(RUSAGE_THREAD) once a second and logs the delta of voluntary and involuntary context switches. If you see involuntary context switches climbing, the kernel is preempting your reactor. That's a sign somebody set the CPU affinity wrong.
Line 1023–1031: dynamic scheduler tick. This is the only path that triggers balance() on schedulers like dynamic. The static scheduler sets g_scheduler_period_in_tsc = 0, so this branch is never taken. One reactor is designated the "scheduling reactor" (g_scheduling_reactor); only it runs the periodic balance. Other reactors see reactor != g_scheduling_reactor and skip.
Line 1033–1035: the exit condition. When spdk_app_stop() (or SIGINT) sets g_reactor_state = SPDK_REACTOR_STATE_EXITING, the reactor breaks out of the loop and falls into teardown.
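The context-switch check described above is easy to reproduce outside SPDK. The following is a sketch, not the framework's get_rusage(): the struct and function names are hypothetical, and only getrusage() with RUSAGE_THREAD (a Linux extension) is a real API.

```c
#define _GNU_SOURCE /* RUSAGE_THREAD is a Linux extension */
#include <sys/resource.h>

#ifndef RUSAGE_THREAD
#define RUSAGE_THREAD 1 /* Linux value, for builds where the header hides it */
#endif

/* Hypothetical snapshot of one thread's context-switch counters. */
struct toy_ctx_switches {
    long voluntary;   /* the thread blocked or yielded on its own */
    long involuntary; /* the kernel preempted the thread */
};

/* Read the calling thread's counters. Returns 0 on success, -1 on error. */
static int toy_read_ctx_switches(struct toy_ctx_switches *out)
{
    struct rusage ru;

    if (getrusage(RUSAGE_THREAD, &ru) != 0) {
        return -1;
    }
    out->voluntary = ru.ru_nvcsw;
    out->involuntary = ru.ru_nivcsw;
    return 0;
}

/* Involuntary switches between two snapshots of the same thread. */
static long toy_involuntary_delta(const struct toy_ctx_switches *prev,
                                  const struct toy_ctx_switches *cur)
{
    return cur->involuntary - prev->involuntary;
}
```

On a correctly pinned and isolated reactor thread, both deltas should stay close to zero between samples.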

The actual work happens inside _reactor_run():

spdk_v26_01_migration/lib/event/reactor.c · lines 951-985 _reactor_run() — the per-iteration body

static void
_reactor_run(struct spdk_reactor *reactor)
{
    struct spdk_thread    *thread;
    struct spdk_lw_thread *lw_thread, *tmp;
    uint64_t               now;
    int                    rc;

    event_queue_run_batch(reactor);

    /* If no threads are present on the reactor,
     * tsc_last gets outdated. Update it to track
     * thread execution time correctly. */
    if (spdk_unlikely(TAILQ_EMPTY(&reactor->threads))) {
        now = spdk_get_ticks();
        reactor->idle_tsc += now - reactor->tsc_last;
        reactor->tsc_last = now;
        return;
    }

    TAILQ_FOREACH_SAFE(lw_thread, &reactor->threads, link, tmp) {
        thread = spdk_thread_get_from_ctx(lw_thread);
        rc = spdk_thread_poll(thread, 0, reactor->tsc_last);

        now = spdk_thread_get_last_tsc(thread);
        if (rc == 0) {
            reactor->idle_tsc += now - reactor->tsc_last;
        } else if (rc > 0) {
            reactor->busy_tsc += now - reactor->tsc_last;
        }
        reactor->tsc_last = now;

        reactor_post_process_lw_thread(reactor, lw_thread);
    }
}

Three things to internalize here.

Event queue is drained first (event_queue_run_batch). This processes any spdk_events that other reactors or threads sent to this one. Events are the cross-reactor communication primitive: when reactor A wants reactor B to do something, A does spdk_event_allocate(B_lcore, fn, arg1, arg2), spdk_event_call(), and the event lands in reactor_B->events. event_queue_run_batch pulls them out and runs them.
Then it walks the thread list. For every lightweight thread on this core, it calls spdk_thread_poll() with max_msgs = 0 (use the default batch size) and reactor->tsc_last, the tick recorded when the previous thread finished. spdk_thread_poll() returns 0 if nothing happened (idle) and a positive value if it did work (busy). The reactor adds the elapsed time to either idle_tsc or busy_tsc; this is the per-core utilization that spdk_top plots.
Then reactor_post_process_lw_thread() at lib/event/reactor.c:921 runs. This is the thread lifecycle bookkeeping: it checks spdk_thread_is_exited() and tears down threads that have completed spdk_thread_exit(). It also handles migration: if a thread is marked for reschedule (lw_thread->resched = true), it gets removed from this reactor's list and the scheduler places it on another one. For the static scheduler, the resched branch is never hit.
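The idle/busy bookkeeping in the listing reduces to a few lines. This sketch mirrors the branch structure of _reactor_run() on a hypothetical toy_stats struct; toy_busy_percent is an illustrative helper, not something the source shows.

```c
#include <stdint.h>

/* Hypothetical subset of the reactor's time counters. */
struct toy_stats {
    uint64_t tsc_last;
    uint64_t busy_tsc;
    uint64_t idle_tsc;
};

/* Charge the ticks since tsc_last to busy or idle, based on the poll result. */
static void toy_account(struct toy_stats *s, int rc, uint64_t now)
{
    if (rc == 0) {
        s->idle_tsc += now - s->tsc_last;
    } else if (rc > 0) {
        s->busy_tsc += now - s->tsc_last;
    }
    s->tsc_last = now; /* advanced even when rc < 0, as in the listing */
}

/* Busy share in percent; 0 if nothing has been recorded yet. */
static unsigned toy_busy_percent(const struct toy_stats *s)
{
    uint64_t total = s->busy_tsc + s->idle_tsc;

    return total == 0 ? 0 : (unsigned)((s->busy_tsc * 100) / total);
}
```

For example, 300 busy ticks followed by 100 idle ticks gives a utilization of 75 percent.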

How pollers get registered

A poller is a function you want called repeatedly. There are two flavors: timed pollers (every N microseconds) and active pollers (as fast as the reactor can fire them, also called "busy" pollers). Both live on an spdk_thread, not directly on a reactor. The reactor just walks the thread list and asks each thread to run its pollers.

From a user's perspective, you register a poller on the current thread:

  spdk_poller_register(my_periodic_fn, my_arg, 1000 /* µs */);

Internally, this calls into poller_register() at lib/thread/thread.c:1707, which:

Pulls the current spdk_thread from thread-local storage (tls_thread).
Allocates a struct spdk_poller with the function, argument, and period.
Converts the period from microseconds to TSC ticks (so the reactor can compare against its own tsc_last without doing division in the hot loop).
Calls thread_insert_poller() at lib/thread/thread.c:955, which either inserts the poller into the active_pollers TAILQ (period = 0) or into the timed_pollers red-black tree (period > 0).

The reactor's role is then trivial: walk the thread, ask the thread to fire any due pollers, return.
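Two of the registration steps above are pure arithmetic and can be sketched without SPDK: converting the period to ticks once, up front, and picking the container from the period. The names below are hypothetical, and the split into whole seconds plus remainder is one assumed way to avoid overflow, not necessarily what thread.c does.

```c
#include <stdint.h>

#define TOY_USEC_PER_SEC 1000000ULL

/* Convert a period in microseconds to timer ticks at `hz` ticks per second.
 * Splitting into whole seconds and a remainder keeps the multiply from
 * overflowing for large periods. */
static uint64_t toy_usec_to_ticks(uint64_t period_us, uint64_t hz)
{
    uint64_t secs = period_us / TOY_USEC_PER_SEC;
    uint64_t rem = period_us % TOY_USEC_PER_SEC;

    return secs * hz + (rem * hz) / TOY_USEC_PER_SEC;
}

enum toy_poller_kind { TOY_ACTIVE, TOY_TIMED };

/* Period 0 means "run on every pass"; anything else is a timed poller. */
static enum toy_poller_kind toy_poller_kind_for(uint64_t period_us)
{
    return period_us == 0 ? TOY_ACTIVE : TOY_TIMED;
}
```

With a 2 GHz tick rate, the 1000 µs period from the registration example becomes 2,000,000 ticks, and the hot loop only ever compares tick counts.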

spdk_v26_01_migration/lib/thread/thread.c · lines 1119-1182 thread_poll() — what each lightweight thread does on each reactor tick

static int
thread_poll(struct spdk_thread *thread, uint32_t max_msgs, uint64_t now)
{
    uint32_t msg_count;
    struct spdk_poller *poller, *tmp;
    spdk_msg_fn critical_msg;
    int rc = 0;

    thread->tsc_last = now;

    /* 1. Run any critical message first (e.g. signal handler) */
    critical_msg = thread->critical_msg;
    if (spdk_unlikely(critical_msg != NULL)) {
        critical_msg(NULL);
        thread->critical_msg = NULL;
        rc = 1;
    }

    /* 2. Drain the message queue */
    msg_count = msg_queue_run_batch(thread, max_msgs);
    if (msg_count) {
        rc = 1;
    }

    /* 3. Run all active (busy) pollers */
    TAILQ_FOREACH_REVERSE_SAFE(poller, &thread->active_pollers,
                   active_pollers_head, tailq, tmp) {
        int poller_rc;

        poller_rc = thread_execute_poller(thread, poller);
        if (poller_rc > rc) {
            rc = poller_rc;
        }
        if (thread->num_pp_handlers) {
            thread_run_pp_handlers(thread);
        }
    }

    /* 4. Run all expired timed pollers */
    poller = thread->first_timed_poller;
    while (poller != NULL) {
        int timer_rc = 0;

        if (now < poller->next_run_tick) {
            break;
        }
        /* ... remove from tree, run, reinsert ... */
    }

    return rc;
}

This is what runs on every reactor iteration, for every lightweight thread, in this exact order:

Critical message. If a spdk_thread_send_critical_msg() arrived (typically from a signal handler), run it first, ahead of everything else queued on the thread. It does not interrupt a poller that is already running; it is picked up at the top of the next poll.
Message queue. spdk_thread_send_msg() is the normal cross-thread communication; messages get dequeued and run here, up to SPDK_MSG_BATCH_SIZE (currently 8) per poll of the thread.
Active pollers. Walk the TAILQ in reverse, using the _SAFE iterator so that a poller can be unregistered during the walk, and fire every busy poller. The return value of each poller is SPDK_POLLER_BUSY (1) or SPDK_POLLER_IDLE (0); the thread passes the highest value up to the reactor for utilization tracking.
Timed pollers. The first_timed_poller is a cached RB_MIN() of the timed poller tree. If its next_run_tick is <= now, fire it. Repeat for the next minimum. Skip out as soon as we hit one whose next_run_tick is in the future.

The post-poller handlers hook (thread_run_pp_handlers) lets you register one-shot functions to run after the current poller finishes but before the reactor moves on. Used internally by bdev to coalesce completions.
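The batching rule in the message-queue step is worth seeing in isolation, because it is what keeps a flood of messages from starving the pollers. A minimal single-threaded sketch follows; toy_msg_queue and its functions are hypothetical, and the real queue is a lock-free ring rather than this array.

```c
#include <stddef.h>

#define TOY_MSG_BATCH 8   /* mirrors the batch size quoted above */
#define TOY_QUEUE_CAP 64  /* arbitrary capacity for the sketch */

typedef void (*toy_msg_fn)(void *ctx);

/* Hypothetical single-threaded FIFO of (function, context) pairs. */
struct toy_msg_queue {
    toy_msg_fn fn[TOY_QUEUE_CAP];
    void *ctx[TOY_QUEUE_CAP];
    size_t head, count;
};

/* Returns 0 on success, -1 if the queue is full. */
static int toy_send_msg(struct toy_msg_queue *q, toy_msg_fn fn, void *ctx)
{
    size_t tail;

    if (q->count == TOY_QUEUE_CAP) {
        return -1;
    }
    tail = (q->head + q->count) % TOY_QUEUE_CAP;
    q->fn[tail] = fn;
    q->ctx[tail] = ctx;
    q->count++;
    return 0;
}

/* Run at most max_msgs messages (0 selects the default batch).
 * Returns the number of messages run. */
static size_t toy_run_batch(struct toy_msg_queue *q, size_t max_msgs)
{
    size_t limit = (max_msgs == 0 || max_msgs > TOY_MSG_BATCH) ? TOY_MSG_BATCH : max_msgs;
    size_t ran = 0;

    while (ran < limit && q->count > 0) {
        toy_msg_fn fn = q->fn[q->head];
        void *ctx = q->ctx[q->head];

        q->head = (q->head + 1) % TOY_QUEUE_CAP;
        q->count--;
        fn(ctx);
        ran++;
    }
    return ran;
}

/* Example message: bump an int counter. */
static void toy_bump(void *ctx)
{
    (*(int *)ctx)++;
}
```

With 20 messages queued, three consecutive polls run 8, 8, and 4 of them; the pollers get a turn between each batch.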

Cooperative scheduling: why you must behave

Now you have the picture: the reactor runs a tight loop, and on every iteration it asks every thread to do whatever it wants to do — fire a poller, drain a message, the lot. There is no preemption. If a poller takes 10 milliseconds, the reactor does not switch to a different thread; it sits there and waits. Every other lightweight thread on the same core is blocked behind that one slow poller.

flowchart LR
subgraph Core2["CPU core 2 — reactor_2"]
  direction TB
  R2["reactor_2 loop"] --> T1["spdk_thread 'nvmf_tgt'"]
  R2 --> T2["spdk_thread 'bdev_poll'"]
  R2 --> T3["spdk_thread 'rpc'"]
  T1 -.->|blocked| Slow["slow_poller (10 ms)"]
  T2 -.->|blocked| Slow
  T3 -.->|blocked| Slow
end

fig. 1: one reactor, three threads, one bad poller

fig. 1 One reactor on core 2 has three threads: an nvmf target, a bdev poller, and an RPC handler. If slow_poller is busy-looping for 10 ms, the reactor doesn't preempt it — every other thread on the same core waits 10 ms. That's why a single bad poller takes down the whole core.

This is why a single bdev module can starve a whole core. It is also why the framework exposes per-core utilization in spdk_top: you need to see this happening, because it won't be obvious from the application's perspective. The RPC will simply be slow.
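One common way to "behave" is to cut long work into bounded slices and keep the progress in the poller's argument. This is a sketch of that pattern, not code from SPDK: the job struct, the 64-item budget, and the TOY_ return codes (which mirror SPDK_POLLER_IDLE and SPDK_POLLER_BUSY) are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_POLLER_IDLE 0
#define TOY_POLLER_BUSY 1
#define TOY_ITEMS_PER_CALL 64 /* hypothetical budget; tune to your latency target */

/* Hypothetical long-running job: sum a large array. */
struct toy_sum_job {
    const uint32_t *data;
    size_t len;
    size_t next;   /* index of the first unprocessed element */
    uint64_t sum;
};

/* Poller body: do a bounded slice of the job, then return to the reactor. */
static int toy_sum_poller(void *arg)
{
    struct toy_sum_job *job = arg;
    size_t end;

    if (job->next >= job->len) {
        return TOY_POLLER_IDLE; /* nothing left to do */
    }
    end = job->next + TOY_ITEMS_PER_CALL;
    if (end > job->len) {
        end = job->len;
    }
    for (; job->next < end; job->next++) {
        job->sum += job->data[job->next];
    }
    return TOY_POLLER_BUSY;
}
```

A 1000-element job finishes in 16 short calls instead of one long one, and every other thread on the core gets polled between slices.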

How reactors are started

Reactors don't start themselves. spdk_reactors_start() at lib/event/reactor.c:1097 is the orchestrator:

spdk_v26_01_migration/lib/event/reactor.c · lines 1097-1135 spdk_reactors_start() — one pthread per reactor, then the local one

void
spdk_reactors_start(void)
{
    struct spdk_reactor *reactor;
    uint32_t i, current_core;
    int rc;

    g_rusage_period = (CONTEXT_SWITCH_MONITOR_PERIOD * spdk_get_ticks_hz()) / SPDK_SEC_TO_USEC;
    g_reactor_state = SPDK_REACTOR_STATE_RUNNING;
    g_stopping_reactors = false;

    current_core = spdk_env_get_current_core();
    SPDK_ENV_FOREACH_CORE(i) {
        if (i != current_core) {
            reactor = spdk_reactor_get(i);
            if (reactor == NULL) {
                continue;
            }

            rc = spdk_env_thread_launch_pinned(reactor->lcore, reactor_run, reactor);
            if (rc < 0) {
                SPDK_ERRLOG("Unable to start reactor thread on core %u\n", reactor->lcore);
                assert(false);
                return;
            }
        }
        spdk_cpuset_set_cpu(&g_reactor_core_mask, i, true);
    }

    /* Start the main reactor */
    reactor = spdk_reactor_get(current_core);
    assert(reactor != NULL);
    reactor_run(reactor);

    spdk_env_thread_wait_all();

    g_reactor_state = SPDK_REACTOR_STATE_SHUTDOWN;
}

The structure is subtle:

Line 1105: flip the global state to RUNNING. From this moment, any reactor that checks g_reactor_state will keep looping.
Lines 1110–1125: for every core in the active mask except the one we're currently on, call spdk_env_thread_launch_pinned(). This is the env layer's call (DPDK underneath by default): it gets a pthread pinned to that specific CPU running reactor_run. By the time the loop finishes, every other core's reactor has been launched.
Line 1128–1130: now the current pthread calls reactor_run() directly. It doesn't spawn itself a child — it just turns into a reactor. This is why the pthread that called spdk_app_start() ends up named reactor_N where N is whatever core spdk_app_start() happened to be on.
Line 1132: spdk_env_thread_wait_all() blocks the current pthread until every other reactor thread has exited. It is reached only after the current pthread's own reactor_run() has returned, so the current pthread (which is itself a reactor) is not waited for. Once the wait completes, it falls out the bottom of spdk_reactors_start() and returns from spdk_app_start().
Line 1134: state machine moves to SHUTDOWN. This is the post-mortem state; the application is about to exit.

From the application's perspective, spdk_app_start() blocks until shutdown. From the OS's perspective, every reactor is a pinned pthread, each with name reactor_N, and one of them (the original main thread) is the one that's actually inside spdk_app_start().

What happens at shutdown

Shutdown is initiated by spdk_app_stop() at lib/event/app.c:1111. That function sends a message to the app thread, which kicks off subsystem finalization, which eventually calls spdk_reactors_stop():

spdk_v26_01_migration/lib/event/reactor.c · lines 1137-1175 spdk_reactors_stop() / _reactors_stop() — telling every reactor to break the loop

static void
_reactors_stop(void *arg1, void *arg2)
{
    uint32_t i;
    int rc;
    struct spdk_reactor *reactor;
    struct spdk_reactor *local_reactor;
    uint64_t notify = 1;

    g_reactor_state = SPDK_REACTOR_STATE_EXITING;
    local_reactor = spdk_reactor_get(spdk_env_get_current_core());

    SPDK_ENV_FOREACH_CORE(i) {
        if (local_reactor == NULL ||
            spdk_cpuset_get_cpu(&local_reactor->notify_cpuset, i)) {
            reactor = spdk_reactor_get(i);
            assert(reactor != NULL);
            rc = write(reactor->events_fd, &notify, sizeof(notify));
            if (rc < 0) {
                SPDK_ERRLOG("failed to notify event queue for reactor(%u): %s.\n",
                            i, spdk_strerror(errno));
                continue;
            }
        }
    }
}

void
spdk_reactors_stop(void *arg1)
{
    spdk_for_each_reactor(nop, NULL, NULL, _reactors_stop);
}

The trick: we can't just set a global flag and expect every reactor to break the loop on its next iteration. A reactor in interrupt mode might be asleep in epoll_wait and would not look at the flag until something woke it. So the shutdown sequence writes to the events_fd of each reactor that needs notification, the eventfd that an interrupt-mode reactor is blocked on. (A poll-mode reactor is already spinning and re-checks the state on its own.) The write wakes the reactor, the reactor drains its event queue (the nop), the event queue is empty, the loop iterates, and now g_reactor_state != RUNNING, so the reactor breaks.

Important detail: the actual state change happens on line 1146, inside _reactors_stop(), which is passed to spdk_for_each_reactor() as the completion callback and so runs only after the nop has visited every reactor. By the time the other reactors get woken up and reach their loop top, the state is already EXITING. If you ever set the state somewhere else, you'll have a race where some reactors see RUNNING for one more iteration. Don't do that.
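The wakeup mechanism itself is plain Linux: an eventfd registered with epoll. Here is a minimal single-threaded sketch of the write-then-wake handshake; the toy_wakeup names are hypothetical, and SPDK wraps the same idea in its own fd-group code rather than calling these functions.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wakeup channel: an eventfd registered with an epoll instance. */
struct toy_wakeup {
    int event_fd;
    int epoll_fd;
};

/* Returns 0 on success, -1 on failure. */
static int toy_wakeup_init(struct toy_wakeup *w)
{
    struct epoll_event ev = { .events = EPOLLIN };

    w->event_fd = eventfd(0, EFD_NONBLOCK);
    if (w->event_fd < 0) {
        return -1;
    }
    w->epoll_fd = epoll_create1(0);
    if (w->epoll_fd < 0) {
        close(w->event_fd);
        return -1;
    }
    ev.data.fd = w->event_fd;
    if (epoll_ctl(w->epoll_fd, EPOLL_CTL_ADD, w->event_fd, &ev) != 0) {
        close(w->epoll_fd);
        close(w->event_fd);
        return -1;
    }
    return 0;
}

/* Sender side: add 1 to the eventfd counter, which makes it readable. */
static int toy_wakeup_notify(struct toy_wakeup *w)
{
    uint64_t one = 1;

    return write(w->event_fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Receiver side: wait up to timeout_ms, then drain the counter.
 * Returns the number of notifications consumed (0 on timeout or error). */
static uint64_t toy_wakeup_wait(struct toy_wakeup *w, int timeout_ms)
{
    struct epoll_event ev;
    uint64_t count = 0;

    if (epoll_wait(w->epoll_fd, &ev, 1, timeout_ms) <= 0) {
        return 0;
    }
    if (read(w->event_fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        return 0;
    }
    return count;
}
```

Note that several notifications coalesce into one readable event: the read returns their sum and resets the counter, which is why a single wake is enough to make the receiver re-check its state.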

After the loop breaks, each reactor runs teardown:

spdk_v26_01_migration/lib/event/reactor.c · lines 1038-1072 reactor teardown — drain remaining threads, then exit

    TAILQ_FOREACH(lw_thread, &reactor->threads, link) {
        thread = spdk_thread_get_from_ctx(lw_thread);
        if (spdk_thread_is_running(thread)) {
            if (!spdk_thread_is_app_thread(thread)) {
                SPDK_ERRLOG("spdk_thread_exit() was not called on thread '%s'\n",
                            spdk_thread_get_name(thread));
            }
            spdk_set_thread(thread);
            spdk_thread_exit(thread);
        }
    }

    while (!TAILQ_EMPTY(&reactor->threads)) {
        TAILQ_FOREACH_SAFE(lw_thread, &reactor->threads, link, tmp) {
            thread = spdk_thread_get_from_ctx(lw_thread);
            spdk_set_thread(thread);
            if (spdk_thread_is_exited(thread)) {
                _reactor_remove_lw_thread(reactor, lw_thread);
                spdk_thread_destroy(thread);
            } else {
                if (spdk_unlikely(reactor->in_interrupt)) {
                    reactor_interrupt_run(reactor);
                } else {
                    spdk_thread_poll(thread, 0, 0);
                }
            }
        }
    }

The teardown is itself a loop. For each thread still on the reactor, the reactor calls spdk_thread_exit() (if it wasn't already exited), then keeps polling the thread until it reports SPDK_THREAD_STATE_EXITED. A thread won't reach EXITED until all its pollers are unregistered, all its I/O channels are released, and all its in-flight messages have been processed. So a thread that "won't die" almost always means: a poller wasn't unregistered, or an spdk_io_channel wasn't put, or there's a pending spdk_thread_send_msg() that hasn't been delivered.
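The exit conditions above amount to a small state machine. The sketch below models it with three counters; the struct, the state names, and both functions are hypothetical simplifications of what the thread library tracks.

```c
#include <stdbool.h>

enum toy_thread_state { TOY_RUNNING, TOY_EXITING, TOY_EXITED };

/* Hypothetical bookkeeping for what still ties a thread down. */
struct toy_exiting_thread {
    enum toy_thread_state state;
    unsigned pollers;      /* registered pollers */
    unsigned io_channels;  /* I/O channels not yet released */
    unsigned pending_msgs; /* messages queued but not yet run */
};

/* Request exit; the thread only starts winding down here. */
static void toy_thread_exit(struct toy_exiting_thread *t)
{
    if (t->state == TOY_RUNNING) {
        t->state = TOY_EXITING;
    }
}

/* Called on every teardown pass: promote to EXITED once nothing is left.
 * Returns true when the thread can be destroyed. */
static bool toy_thread_try_finish(struct toy_exiting_thread *t)
{
    if (t->state == TOY_EXITING &&
        t->pollers == 0 && t->io_channels == 0 && t->pending_msgs == 0) {
        t->state = TOY_EXITED;
    }
    return t->state == TOY_EXITED;
}
```

A thread with one leftover poller stays in TOY_EXITING on every pass, which is exactly the "won't die" symptom described above.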

Edge cases & what trips people up

This is the section that pays for the rest of the page. The reactor model is simple in the abstract and full of sharp edges in practice.

1. What happens if a reactor falls behind

The reactor is a tight loop with no catch-up. If a poller on the reactor runs for 1 ms, the reactor is busy for that 1 ms, and any timers that should have fired during it will fire late. The reactor doesn't "burst" the missed pollers on the next iteration; it just keeps going. A periodic poller with a 100 µs period can drift to a 1 ms effective period under load. You will not see a stack trace; you will see tail latency. The fix is to profile with spdk_top and find the slow poller.
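The drift can be reproduced with a toy simulation. It assumes, as the "no burst" description implies, that a timed poller is rescheduled relative to the time it actually ran; the names and the tick values are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical timed poller: fires when now >= next_run_tick. */
struct toy_timed_poller {
    uint64_t period_ticks;
    uint64_t next_run_tick;
    unsigned fired;
};

/* One reactor pass at time `now`. The poller runs at most once and is
 * rescheduled relative to `now`, so missed periods are not replayed. */
static void toy_timed_poll(struct toy_timed_poller *p, uint64_t now)
{
    if (now >= p->next_run_tick) {
        p->fired++;
        p->next_run_tick = now + p->period_ticks;
    }
}

/* Simulate passes that are `pass_ticks` apart over `total_ticks`.
 * Returns how many times the poller fired. */
static unsigned toy_count_fires(uint64_t period_ticks, uint64_t pass_ticks,
                                uint64_t total_ticks)
{
    struct toy_timed_poller p = { period_ticks, 0, 0 };

    assert(pass_ticks > 0); /* a zero-length pass would never advance time */
    for (uint64_t now = 0; now < total_ticks; now += pass_ticks) {
        toy_timed_poll(&p, now);
    }
    return p.fired;
}
```

With a period of 100 ticks over 10,000 ticks, fast passes (10 ticks apart) give 100 firings, while passes stretched to 1,000 ticks by a slow neighbour give only 10: the effective period has silently become ten times longer.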

2. What happens if a poller never returns

A poller that does while (1) hangs the reactor. The kernel will not preempt it. The whole core is dead. Worse: if your application accepts incoming TCP connections on that core, the kernel TCP state machine will still work, but the SPDK-side handlers never run. Clients will see "connection accepted but no response." This is the failure mode of an infinite loop in a poller. The only recovery is to SIGKILL the process and start over. Test your pollers with timeouts.

3. What happens if a poller blocks on a foreign mutex

Imagine your poller calls into a Go runtime (via a CGo shim) that holds a Go mutex. The Go scheduler is cooperative, but if the goroutine that owns the mutex is on a different OS thread, your poller will spin or block. In the best case, the kernel preempts the pthread and you see involuntary context switches climbing in the rusage log. In the worst case, you deadlock the reactor permanently. Never call out of a poller to a system that might block on something you don't own.

4. What happens at shutdown if a poller doesn't unregister

Look at the teardown loop again. The reactor keeps polling a thread until the thread reports SPDK_THREAD_STATE_EXITED. A thread can't reach that state while it has registered pollers. The teardown logs the poller name as a warning and does free it, but this is a sign of a leaked resource. If you have a long-running poller that you conditionally register but forget to unregister, your shutdown will stall on it. The warning you get is "active_poller %s still registered at thread exit" at lib/thread/thread.c:413. Read that log line. It is telling you exactly what to fix.

5. Why you can't safely allocate huge amounts of memory from a poller

A poller that calls malloc(1<<30) might trigger a page fault, and the kernel might decide to swap. The kernel will happily schedule another thread while your swap is in progress. The reactor will see a context switch. Your tail latency will spike. Pre-allocate large buffers in startup, not in the hot path. This is a big reason SPDK has mempool abstractions — see Layer 0.1 for the "hugepages for DMA memory" rationale.
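A minimal sketch of the "pre-allocate at startup" pattern follows. It uses plain malloc for the one-time allocation and is not SPDK's mempool API: toy_pool and its functions are hypothetical, and a real DMA pool would come from hugepage-backed memory instead.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size buffer pool: allocate once at startup, never after. */
struct toy_pool {
    unsigned char *slab;  /* backing memory for all buffers */
    void **free_list;     /* stack of free buffers */
    size_t free_count;
    size_t capacity;
};

/* Startup path: allowed to block. Returns 0 on success, -1 on failure. */
static int toy_pool_init(struct toy_pool *p, size_t count, size_t buf_size)
{
    p->slab = malloc(count * buf_size);
    p->free_list = malloc(count * sizeof(void *));
    if (p->slab == NULL || p->free_list == NULL) {
        free(p->slab);
        free(p->free_list);
        return -1;
    }
    memset(p->slab, 0, count * buf_size); /* touch every page now, not in the hot path */
    for (size_t i = 0; i < count; i++) {
        p->free_list[i] = p->slab + i * buf_size;
    }
    p->free_count = p->capacity = count;
    return 0;
}

/* Hot path: O(1), no allocation. Returns NULL when the pool is exhausted. */
static void *toy_pool_get(struct toy_pool *p)
{
    return p->free_count ? p->free_list[--p->free_count] : NULL;
}

/* Hot path: return a buffer obtained from toy_pool_get(). */
static void toy_pool_put(struct toy_pool *p, void *buf)
{
    if (p->free_count < p->capacity) {
        p->free_list[p->free_count++] = buf;
    }
}

/* Shutdown path: release everything in one go. */
static void toy_pool_destroy(struct toy_pool *p)
{
    free(p->slab);
    free(p->free_list);
}
```

A poller that runs out of buffers gets NULL back immediately and can retry on a later pass, instead of stalling the core inside the allocator.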

6. Why a reactor can't safely do file I/O

read() on a regular file descriptor can sleep in the kernel. The kernel will sleep your pthread. When it wakes up, you have a context switch you didn't budget for. Worse, the file I/O is a perfect storm of preemption: your pthread is descheduled, another thread (maybe even another reactor on a different core) is scheduled, your cache line gets invalidated, your spdk_io_channel shared state is in flight. If you absolutely need to do file I/O, do it on a dedicated thread that is allowed to block. SPDK gives you the abstraction: spdk_thread (see 2.2).

7. The "I'm sure it's just one syscall" trap

Every well-intentioned "it's just a single syscall, no harm done" is a potential thread-yielding call. gettimeofday() goes through the vDSO and is usually fine. clock_gettime() with CLOCK_REALTIME or CLOCK_REALTIME_COARSE is fine for the same reason. getpid() is fine. sysinfo() is fine. stat() on a path is not fine. open() is not fine. malloc() is sometimes fine and sometimes not. The rule of thumb: if it can block, don't call it from a poller. Use TSC ticks instead of clock reads whenever possible.

8. The diskengine client never sees a reactor

Your Go code in Client.Call:43 talks JSON-RPC over a Unix socket. The C side (lib/event/reactor.c:558 spdk_event_call()) routes the request to a reactor, but the Go side has no concept of "which reactor am I on"; it just sees a synchronous RPC reply. This is by design. The threading model is internal to SPDK; the JSON-RPC API is the abstraction boundary. The consequence: you can never block the Go side waiting for an SPDK resource that's only safe to use on a specific thread. The RPC framework guarantees the response is computed on a thread that's safe for the RPC's handler.

What to take away

The reactor is the most important idea in the whole codebase, and it's also the simplest. A thread per core. A loop. A list of pollers. No preemption, no sleeping, no kernel scheduler in your way. The price is a strict set of rules about what you can do from a poller; the reward is that, when you follow the rules, every microsecond of CPU time goes to your application.

The next page — 2.2 — spdk_thread — digs into the next layer down: the lightweight thread that the reactor is actually looping over. The reactor owns a core; the spdk_thread is the unit of work that gets to run on it.