JSON-RPC over a Unix socket.
Every SPDK app — spdk_raid, nvmf_tgt, vhost,
the JSON config server in spdk_init — exposes its entire
control plane through one mechanism: a JSON-RPC 2.0 server bound to a
Unix domain socket. This page is the protocol. Framing, dispatch,
concurrency, error model, the lot. The next page (3.2)
is one full end-to-end trace.
- The wire format — what actually goes on the wire
- The Unix socket lifecycle: bind, listen, accept, poll
- Method registration — the registration list and the lookup
- The dispatch path — recv → parse → lookup → invoke
- Concurrency model — one RPC per connection, reactor-thread dispatch
- Error handling — the 5 standard error codes + one custom
- SPDK vs the JSON-RPC 2.0 spec — where it deviates
- Edge cases — what breaks, what surprises
The wire format — what actually goes on the wire
A JSON-RPC request is a single JSON object, and the message ends where that JSON value ends. There is no length prefix, no Content-Type header, no HTTP envelope: just bytes. The transport is a stream of such values back to back.
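Because the only frame boundary is the end of a JSON value, a reader has to parse to find it. The following is a minimal Python sketch of that idea, not SPDK's parser: `split_json_stream` is a hypothetical helper that peels complete values off the front of a buffer and returns whatever is left over.

```python
import json


def split_json_stream(buffer: str):
    """Split a buffer of back-to-back JSON values into (values, leftover)."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < len(buffer) and buffer[pos].isspace():
            pos += 1
        if pos == len(buffer):
            break
        try:
            value, end = decoder.raw_decode(buffer, pos)
        except json.JSONDecodeError:
            # Incomplete (or malformed) tail: keep it and wait for more bytes.
            # This sketch does not distinguish the two cases.
            break
        values.append(value)
        pos = end
    return values, buffer[pos:]
```

Feeding it two complete objects followed by a truncated third yields the two parsed values and leaves the truncated text as leftover, ready to be extended by the next read.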
A canonical request from diskengine's Client.Call in
internal/spdkclient/client.go looks like this:
{
"jsonrpc": "2.0",
"id": 17,
"method": "bdev_lvol_create",
"params": {
"lvol_name": "vol-0001",
"size_in_mib": 1024,
"thin_provision": true,
"lvs_name": "lvs0"
}
}
The four fields: jsonrpc is the protocol version string
(always "2.0"), id is a client-chosen token
(the Go client uses an atomic.Uint64), method
is the registered handler name, and params is a JSON
object that the handler will decode.
A success response from SPDK looks like this:
{
"jsonrpc": "2.0",
"id": 17,
"result": "bd56a4e6-2b22-4ee1-9e2f-7c1c8b8d2f11"
}
The result field is whatever the handler wrote into the
response. For bdev_lvol_create, the C handler
writes the new lvol's UUID as a string.
A failure response from SPDK:
{
"jsonrpc": "2.0",
"id": 17,
"error": {
"code": -32602,
"message": "Invalid parameters"
}
}
The code follows the JSON-RPC 2.0 spec for the standard
pre-defined errors, with a few SPDK-specific additions (see
Error handling). message is
freeform and may be a strerror() string from the bdev
layer.
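The request and response shapes above can be exercised from a small client. This is a sketch in Python rather than the diskengine Go client; `RpcError`, `build_request`, `read_one_value`, and `call` are hypothetical names, and the socket is assumed to be an already connected stream socket.

```python
import json
import socket


class RpcError(Exception):
    """Raised when the response carries an "error" member."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def build_request(req_id, method, params=None):
    req = {"jsonrpc": "2.0", "id": req_id, "method": method}
    if params is not None:
        req["params"] = params
    return json.dumps(req).encode("utf-8")


def read_one_value(sock):
    """Keep reading until the buffered bytes form one complete JSON value."""
    buf = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("socket closed before a full response arrived")
        buf += chunk
        try:
            return json.loads(buf.decode("utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # value not complete yet, read more


def call(sock, req_id, method, params=None):
    sock.sendall(build_request(req_id, method, params))
    resp = read_one_value(sock)
    if "error" in resp:
        raise RpcError(resp["error"]["code"], resp["error"]["message"])
    return resp["result"]
```

A caller would connect a `socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)` to the RPC socket path and pass it to `call`. The sketch assumes exactly one response per request on the connection, which matches the serial model described later on this page.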
The Unix socket lifecycle: bind, listen, accept, poll
The control plane is bound to a Unix domain socket (family
AF_UNIX, type SOCK_STREAM). The path is
conventional: /var/tmp/spdk<app>.sock in most setups
(or whatever the app's -r / --rpc-socket
option says). The higher-level wrapper
spdk_rpc_server_listen creates a sibling .lock
file to ensure only one SPDK process can own the socket at a time.
Inside spdk_jsonrpc_server_listen at
lib/jsonrpc/jsonrpc_server_tcp.c:11, the
listen socket is created with SOCK_NONBLOCK | SOCK_CLOEXEC,
bound, and put into listen(2) with a backlog of 512. It
is not added to any epoll set. The poll loop below drives
accept/recv/send directly.
The poll loop is a single function:
spdk_jsonrpc_server_poll at
lib/jsonrpc/jsonrpc_server_tcp.c:389. It
runs on a reactor thread (see 2.1)
and does four things in order:
Crucially, this is not epoll. It's a plain non-blocking poll over the listen socket + the connection list. Each iteration processes every active connection. The reactor schedules this poll at a fixed tick rate (typically 1 ms), which is what bounds RPC latency in SPDK.
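To make the "non-blocking poll, no epoll" idea concrete, here is a small Python sketch. It illustrates the pattern only and does not reproduce the exact order of operations inside spdk_jsonrpc_server_poll; `make_listener` and `poll_once` are hypothetical names, and the backlog and buffer size are taken from the figures quoted on this page.

```python
import socket


def make_listener(path, backlog=512):
    """Create a non-blocking AF_UNIX stream listener bound to path."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.setblocking(False)
    srv.bind(path)
    srv.listen(backlog)
    return srv


def poll_once(listener, conns):
    """One tick: try to accept one client, then try recv on every connection.

    Returns a list of (conn, data) pairs received during this tick.
    """
    try:
        conn, _ = listener.accept()
        conn.setblocking(False)
        conns.append(conn)
    except BlockingIOError:
        pass  # nobody is waiting to connect

    received = []
    for conn in list(conns):
        try:
            data = conn.recv(32 * 1024)
        except BlockingIOError:
            continue  # nothing to read on this connection right now
        if data:
            received.append((conn, data))
        else:
            # Peer closed the socket.
            conn.close()
            conns.remove(conn)
    return received
```

Every call touches every connection whether or not it has data, which is the cost of skipping epoll. With a handful of control-plane clients that cost is small, and the loop stays trivial to reason about.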
Method registration — the registration list and the lookup
Every handler is registered in one of two ways:
the SPDK_RPC_REGISTER(name, fn, state_mask) macro for the
high-level wrapper or, if you need to bypass the wrapper,
a direct call to spdk_jsonrpc_register_method(name, fn, state_mask) on the
low-level layer.
The SPDK_RPC_REGISTER macro expands to an
__attribute__((constructor(1000))) function, i.e. an
ELF constructor that runs before main(). This
is how SPDK plugins get their RPCs into the registry without the
application explicitly calling register for each one.
Registration appends to a global singly-linked list
(g_rpc_methods in lib/rpc/rpc.c). There is
no hash table. Lookup at dispatch time is
SLIST_FOREACH — linear, O(n). The comment in
lib/rpc/rpc.c:243 admits it:
/* TODO: use a hash table or sorted list */. For a few
hundred RPCs this is fine. For thousands, it would matter.
Each entry is a struct spdk_rpc_method with a name, a
function pointer, a state mask, and some alias/deprecated metadata
(see lib/rpc/rpc.c:30).
The dispatch path — recv → parse → lookup → invoke
This is the heart of the page. The path is:
sequenceDiagram
    participant Client
    participant Poll as spdk_jsonrpc_server_poll
    participant Recv as conn_recv
    participant Parse as jsonrpc_parse_request
    participant Handler as jsonrpc_handler
    participant Reg as g_rpc_methods
    Client->>Poll: connect(AF_UNIX)
    Poll->>Poll: accept() → conn fd
    loop every reactor tick
        Client->>Recv: send(bytes)
        Recv->>Parse: jsonrpc_parse_request
        Parse->>Parse: spdk_json_parse (twice)
        Parse->>Handler: jsonrpc_server_handle_request
        Handler->>Reg: _get_rpc_method(method)
        Reg-->>Handler: m (struct spdk_rpc_method*)
        Handler->>Handler: m->func(request, params)
        Handler-->>Client: response bytes (queued, drained on next poll)
    end
fig. 1 The reactor-thread poll loop, end to end. The handler runs synchronously on the reactor thread; the response is queued and drained on the next poll tick.
Step by step:
1. Recv. jsonrpc_server_conn_recv at lib/jsonrpc/jsonrpc_server_tcp.c:267 does a non-blocking recv into a per-connection buffer (conn->recv_buf, 32 KiB) and then calls jsonrpc_parse_request in a loop until the parser says "no more complete values."
2. Parse. jsonrpc_parse_request at lib/jsonrpc/jsonrpc_server.c:171 does the two-pass trick: first a dry-run parse to find the value's end-offset, then a real parse that produces an spdk_json_val tree. The dry run returns SPDK_JSON_PARSE_INCOMPLETE if the request is not yet fully buffered; in that case we return 0 and the outer do while (rc > 0) exits without consuming any bytes.
3. Decode the request object. parse_single_request at lib/jsonrpc/jsonrpc_server.c:85 uses a decoder table to pull out the four JSON-RPC fields: jsonrpc, method, params, id. It rejects any request whose jsonrpc isn't "2.0" with an SPDK_JSONRPC_ERROR_INVALID_REQUEST error.
4. Dispatch to the wrapper layer. jsonrpc_server_handle_request at lib/jsonrpc/jsonrpc_server_tcp.c:226 is a one-liner: request->conn->server->handle_request(request, method, params). The handle_request function pointer was set when the server was created; for the standard SPDK path, it's jsonrpc_handler in lib/rpc/rpc.c.
5. Method lookup. jsonrpc_handler at lib/rpc/rpc.c:103 walks g_rpc_methods and finds the first entry whose name matches the requested method. If none is found, it sends a SPDK_JSONRPC_ERROR_METHOD_NOT_FOUND error.
6. Invoke. m->func(request, params) is the actual handler. This is what SPDK_RPC_REGISTER("bdev_lvol_create", rpc_bdev_lvol_create, SPDK_RPC_RUNTIME) plumbed in. The handler runs synchronously on the reactor thread.
7. Response. The handler is responsible for eventually calling spdk_jsonrpc_end_result or spdk_jsonrpc_send_error_response. Either of those puts the response on the connection's send queue.
8. Send. On the next poll tick, jsonrpc_server_conn_send at lib/jsonrpc/jsonrpc_server_tcp.c:334 drains the send queue via a non-blocking send.
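Steps 3 through 6 (decode, lookup, invoke) can be condensed into one function. The sketch below is Python, uses a dict where SPDK walks a linked list, and invents the names `make_error` and `handle_value` plus the error message strings; only the numeric codes come from this page.

```python
INVALID_REQUEST = -32600
METHOD_NOT_FOUND = -32601


def make_error(req_id, code, message):
    return {"jsonrpc": "2.0", "id": req_id,
            "error": {"code": code, "message": message}}


def handle_value(value, methods):
    """Validate one parsed JSON value, look the method up, invoke it."""
    # Decode: the top level must be an object with jsonrpc == "2.0"
    # and a string method name. Arrays (batches) fail this check.
    if (not isinstance(value, dict)
            or value.get("jsonrpc") != "2.0"
            or not isinstance(value.get("method"), str)):
        return make_error(None, INVALID_REQUEST, "Invalid request")

    req_id = value.get("id")

    # Lookup: a dict here for brevity; the real registry is a linear list.
    func = methods.get(value["method"])
    if func is None:
        return make_error(req_id, METHOD_NOT_FOUND, "Method not found")

    # Invoke: runs synchronously, like a handler on the reactor thread.
    result = func(value.get("params"))
    return {"jsonrpc": "2.0", "id": req_id, "result": result}
```

Unlike a real SPDK handler, `func` here returns its result directly. In SPDK the handler writes the response itself and may do so later from a completion callback, which is why the response and send steps are separate.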
Concurrency model — one RPC per connection, reactor-thread dispatch
There are three concurrency rules that govern the entire protocol, and they are not always obvious:
| Rule | What it means | Where it's enforced |
|---|---|---|
| One RPC at a time per connection. | A connection is busy until its response is fully sent. The next request on the same connection is not parsed until the previous response is in the send queue. | conn->outstanding_requests in jsonrpc_parse_request |
| Handlers run on the reactor thread. | Every m->func() invocation happens inside spdk_jsonrpc_server_poll, which itself runs on a reactor. The handler cannot block on a syscall; it must use SPDK's async APIs. | spdk_rpc_server_accept is called from the app framework's poll |
| SPDK is single-threaded at the framework level. | Even with multiple reactors, bdev / nvmf / blob operations are funneled through a single app thread. Concurrent writes to the same bdev or lvstore can race. | Documented in diskengine's coord.go |
The diskengine Go client sidesteps the first rule by using one
connection per in-flight RPC. Look at
createJsonCodec at
diskengine/diskengine/internal/spdkclient/client.go:120:
the encoder/decoder pair wraps a single net.Conn, and
Call at client.go:43 does
encoder.Encode(request); decoder.Decode(response)
synchronously. In practice, if the Go client is used concurrently
from multiple goroutines, each goroutine needs its own
Client (or a goroutine-safe wrapper) — and indeed
diskengine's callers do exactly that.
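One way to build the "goroutine-safe wrapper" mentioned above is to hold a lock across the whole encode-then-decode exchange. The following Python sketch shows the shape of that wrapper; `SerializedClient` and its `transport` callable are hypothetical, and the diskengine client may be structured differently.

```python
import itertools
import threading


class SerializedClient:
    """Allow many threads to share one connection, one RPC at a time."""

    def __init__(self, transport):
        # transport: callable taking a request dict and returning the
        # response dict read from the same connection (assumed blocking).
        self._transport = transport
        self._lock = threading.Lock()
        self._ids = itertools.count(1)

    def call(self, method, params=None):
        # The lock covers both the write and the read, so a second caller
        # can never interleave bytes or steal another caller's response.
        with self._lock:
            req = {"jsonrpc": "2.0", "id": next(self._ids), "method": method}
            if params is not None:
                req["params"] = params
            resp = self._transport(req)
        if resp.get("id") != req["id"]:
            raise RuntimeError("response id does not match request id")
        return resp
```

The trade-off is throughput: callers queue behind the lock. Giving each caller its own connection avoids the queue at the cost of more sockets on the server's connection list.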
Error handling — the 5 standard error codes + one custom
JSON-RPC 2.0 reserves a small set of pre-defined error codes. SPDK uses all of them and adds one custom code:
| Code | Name | SPDK meaning | Origin |
|---|---|---|---|
| -32700 | Parse error | JSON itself is malformed (unterminated string, missing comma, etc.) | JSON-RPC 2.0 spec |
| -32600 | Invalid request | Top-level is not an object/array, batch arrays, missing method, wrong jsonrpc version | JSON-RPC 2.0 spec |
| -32601 | Method not found | No registered method matches | JSON-RPC 2.0 spec |
| -32602 | Invalid params | Decoder for the request struct rejected an input: type mismatch, missing required field, etc. | JSON-RPC 2.0 spec |
| -32603 | Internal error | Handler ran but failed (e.g. spdk_json_decode_object failed inside the handler body) | JSON-RPC 2.0 spec |
| -1 | Invalid state | Method is registered only for SPDK_RPC_STARTUP but server is in SPDK_RPC_RUNTIME state, or vice versa | SPDK custom (SPDK_JSONRPC_ERROR_INVALID_STATE) |
In addition, handlers can return arbitrary negative-errno codes as
the code field. You'll see things like
-ENOENT, -EINVAL, -ENOMEM in
the wild. JSON-RPC 2.0 reserves the range -32768 to -32000 for
pre-defined errors and leaves every other integer available for
application-defined codes. SPDK's raw -errno values fall outside the
reserved range, so they are non-standard rather than a strict spec
violation, and the message string usually carries the
strerror() text.
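A client that wants to log something readable has to handle both families of codes. Here is a small Python sketch of such a classifier; `describe_error` is a hypothetical helper, and the errno names it prints are those of the machine running the client, which is assumed to match the SPDK host.

```python
import errno
import os

STANDARD = {
    -32700: "Parse error",
    -32600: "Invalid request",
    -32601: "Method not found",
    -32602: "Invalid params",
    -32603: "Internal error",
}


def describe_error(code):
    """Classify an error code as a JSON-RPC standard code or a -errno."""
    if code in STANDARD:
        return ("jsonrpc", STANDARD[code])
    name = errno.errorcode.get(-code)
    if name is not None:
        return ("errno", f"{name}: {os.strerror(-code)}")
    return ("unknown", str(code))
```

One wrinkle the sketch exposes: on Linux EPERM is 1, so a code of -1 is ambiguous between the invalid-state code in the table above and -EPERM. The sketch reports it as an errno; a real client would need the message string to tell the two apart.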
The state-mask enforcement is the most subtle. Methods are registered with a bitmask of when they're allowed to run:
- SPDK_RPC_STARTUP (bit 0): only callable before framework_start_init
- SPDK_RPC_RUNTIME (bit 1): only callable after the framework is running
Most lvstore RPCs are SPDK_RPC_RUNTIME only. The
enforcement is at lib/rpc/rpc.c:125 — if
the bitmask doesn't include the current state, the handler is
replaced with an error response. bdev_lvol_create, for
instance, is SPDK_RPC_RUNTIME, so calling it before the
framework is up returns:
{
"jsonrpc": "2.0",
"id": 17,
"error": {
"code": -1,
"message": "Method may only be called after framework is initialized using framework_start_init RPC."
}
}
SPDK vs the JSON-RPC 2.0 spec — where it deviates
Strictly speaking, SPDK's protocol is "JSON-RPC 2.0-ish." Here are the places where it does its own thing:
- Batch requests are explicitly rejected. Per the spec, a request may be a JSON array of objects. SPDK's parse_single_request at lib/jsonrpc/jsonrpc_server.c:264 treats SPDK_JSON_VAL_ARRAY_BEGIN at the top level as SPDK_JSONRPC_ERROR_INVALID_REQUEST. The comment in the source: "Got batch array (not currently supported)".
- Notifications are silently dropped. Per the spec, a request with no id is a "notification": the server must not reply. SPDK honors the spirit but, since spdk_jsonrpc_end_result at lib/jsonrpc/jsonrpc_server.c:375 calls skip_response when id is missing or null, the request is parsed, dispatched, and the response is written into the request struct and then immediately freed without being sent. From the outside, the request appears to have been accepted and silently dropped.
- Non-standard error codes: see the previous section. Negative errno values are common; the spec reserves only a small range.
- Version string is not optional. Per the spec, jsonrpc should be present and equal to "2.0". SPDK requires it: parse_single_request jumps to invalid if it's missing or wrong. Some other JSON-RPC servers are lenient. SPDK is not.
- Unix-domain socket only by default in the higher layer. The low-level server API supports AF_INET too, but diskengine uses Unix exclusively. The spdk_rpc_server_listen wrapper hardcodes AF_UNIX.
Edge cases — what breaks, what surprises
These are the things the docs do not tell you, and which have all been observed in production.
1. Client disconnects mid-RPC
If the client closes the socket after sending the request but
before reading the response, the handler still runs. When the
response is queued for send, the conn's
send() returns -1 with
EPIPE, the connection is closed, and the request
struct is freed via jsonrpc_free_request in
lib/jsonrpc/jsonrpc_server_tcp.c:321. The
handler itself never knows the client went away. This is normally
fine — SPDK's idempotency model means the operation can complete in
the background — but for lvol create / delete, the lvstore state
will diverge from what the client thinks. There is no
application-level cancel.
2. Handler blocks (does a synchronous syscall)
If the handler does a blocking syscall — e.g. a read()
on a real file descriptor, or a synchronous usleep() —
the entire reactor thread stalls. The next reactor tick is
delayed, every other connection on that reactor stops making
progress, and the 1 ms latency budget is blown. This is the #1
cause of "RPC latency spikes that have nothing to do with my
code." The fix: never block. Use the bdev channel's
submit_request pattern and have the handler return
immediately, with the response written from a poller.
3. Concurrent RPCs from the same client
See the concurrency model above. The protocol is strictly per-connection
serial. The Go client's
json.Encoder.Encode writes one JSON value per
Encode call, and
json.Decoder.Decode reads one. If two goroutines
encode on the same encoder, the bytes interleave and you get
invalid JSON. If two goroutines decode on the same decoder, the
first one will pick up the response meant for the second. The
client is single-goroutine-by-design.
4. Large params — the 32 KiB recv buffer and the 32 MiB send buffer
The connection's recv buffer is fixed at
SPDK_JSONRPC_RECV_BUF_SIZE = 32 * 1024 bytes (see
lib/jsonrpc/jsonrpc_internal.h:15). If
a single request is larger than that, the parser will not see a
complete value, recv will be called again next tick,
and eventually the request lands. But while the request is being
received, no other request on the same connection can be parsed
(per the one-RPC-per-connection rule). For params in the
hundreds-of-KiB range, this is a measurable stall.
The send buffer is bounded by
SPDK_JSONRPC_SEND_BUF_SIZE_MAX = 32 * 1024 * 1024
(32 MiB). The growth is doubling —
jsonrpc_server_write_cb at
lib/jsonrpc/jsonrpc_server.c:134 will
realloc the buffer if the next write doesn't fit. If a handler
tries to write a > 32 MiB response, the error log fires and the
response is dropped. (You'd have to be returning ~32 million
characters in a single result to hit this. It happens.)
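The doubling-with-a-cap policy is easy to get subtly wrong at the boundary, so here is the arithmetic as a Python sketch. `grow_capacity` is a hypothetical function that models the policy described above; it is not a translation of jsonrpc_server_write_cb, and the starting capacity used in the example is an assumption.

```python
SEND_BUF_SIZE_MAX = 32 * 1024 * 1024  # 32 MiB cap quoted above


def grow_capacity(capacity, used, incoming, limit=SEND_BUF_SIZE_MAX):
    """Return the capacity needed to append `incoming` bytes, or None.

    Capacity doubles until the write fits. None means the write would
    push the buffer past the limit and the response must be dropped.
    """
    needed = used + incoming
    if needed > limit:
        return None
    new_capacity = max(capacity, 1)
    while new_capacity < needed:
        # Clamp so the final doubling never overshoots the limit.
        new_capacity = min(new_capacity * 2, limit)
    return new_capacity
```

For example, a 4096-byte buffer holding 4000 bytes grows to 8192 to accept a 200-byte write, while any write that would bring the total above 32 MiB returns None.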
5. Malformed request — what the client sees
If the request is malformed (bad JSON), the server returns
SPDK_JSONRPC_ERROR_PARSE_ERROR and closes the
connection. Look at lib/jsonrpc/jsonrpc_server.c:246:
"Can't recover from parse error (no guaranteed resync point in
streaming JSON). Return an error to indicate that the connection
should be closed." This means a single malformed request from your Go
client permanently kills the connection, and the next call on it will
fail with a transport error such as connection reset. The diskengine Go client
connects lazily but does not reconnect: ensureConnected
at diskengine/diskengine/internal/spdkclient/client.go:128
only opens the socket once per Client instance, so
on a transport error the upper layer needs to discard the
Client and create a new one.
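The "discard and recreate" responsibility of the upper layer can be captured in a thin wrapper. This Python sketch shows one possible shape; `ReconnectingCaller` and the `connect` factory are hypothetical and do not describe diskengine's actual code.

```python
class ReconnectingCaller:
    """Drop the underlying client after a transport error."""

    def __init__(self, connect):
        # connect: hypothetical factory returning an object with
        # a call(method, params) method bound to a fresh connection.
        self._connect = connect
        self._client = None

    def call(self, method, params=None):
        if self._client is None:
            self._client = self._connect()
        try:
            return self._client.call(method, params)
        except OSError:
            # The connection is unusable. Forget it so the next call
            # reconnects, but do not retry here: the RPC may already
            # have run on the server.
            self._client = None
            raise
```

The wrapper deliberately re-raises instead of retrying. As the earlier edge cases note, a handler runs to completion even if the client goes away, so a blind retry of something like an lvol create could execute twice.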
6. Socket closed during handler execution
The handler runs to completion regardless. The completion (spdk_jsonrpc_end_result) queues the response, and the next poll tick tries to send. If the conn is closed by then, the response is freed and never sent. There is no way to know from the handler that the client has disconnected. If you need "RPC-was-delivered" semantics, the protocol does not give them to you.
7. The lock file gets stale
The flock is released when the owning process exits, so a clean kill of the
SPDK process releases it. A hard kill (kill -9 of the parent
process group) also releases it. The pathological case is
filesystem-level: if the lock file is on NFS and the NFS server
goes away, you can get a stale lock. flock is an advisory lock, and
on NFS Linux emulates it with fcntl byte-range locks, whose semantics
there are famously weak. Mitigation: keep the lock file on a local
filesystem (the conventional path /var/tmp is fine for this as long as it is not a network mount).
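The lock-file behavior can be reproduced with a few lines of Python on Linux. This is a sketch of the general flock pattern, assuming an exclusive non-blocking lock on a sibling file; `acquire_lock` is a hypothetical helper and the open flags are illustrative.

```python
import fcntl
import os


def acquire_lock(lock_path):
    """Try to take an exclusive, non-blocking flock on lock_path.

    Returns the open file descriptor on success (keep it open for as
    long as the lock should be held), or None if another holder exists.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

Closing the descriptor, or the process exiting for any reason, releases the lock. The file itself stays on disk, which is harmless: a leftover lock file with no holder does not block the next process.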
8. State mask mis-match — startup vs runtime
Methods like framework_start_init are
SPDK_RPC_STARTUP only. If you call them after the
framework is up, you get a clean INVALID_STATE error.
The check compares the method's mask against the server's current
state, so a startup-only method called during normal operation
fails the same way every time and retrying will not help. The error message
tells you the rule; the documentation does not.
What to take away
The JSON-RPC surface looks simple, but every part of it has consequences:
- Framing is by JSON value. No length prefix. You must parse the value to know where it ends. This is why the server does a two-pass parse.
- One RPC per connection. Pipeline at your own risk. The Go client is single-goroutine per connection by design.
- Reactor-thread dispatch. The handler runs on the reactor. Blocking = stalled reactor = every other RPC on this reactor stops.
- Error code negotiation is not standard. Use the message string, not the code, for actionable details.
- Malformed requests kill the connection. The server can't resync a JSON stream, so it closes.
On the next page we trace one of these RPCs — bdev_lvol_create
— from a Go call site all the way through the C dispatch path,
looking at the actual code that runs at each step.