Layer 7 · vhost / virtio / VFIO-user

VFIO-user.

Same problem as vhost-user (share a virtqueue between a guest and a userspace backend), but a different transport. The control plane still runs over a Unix socket, but the data plane is shared memory plus doorbells. No more kernel vhost code in the hot path. The cost is a heavier bring-up and a more elaborate PCI device model. The win is one less syscall per I/O on the data path.

~15 min read · 2 diagrams · prerequisites: 7.1 · 4.1

On this page

Why VFIO-user exists: a different way to share I/O
The protocol: shared memory, doorbells, control messages on a Unix socket
The transport lives in lib/vfio_user/ and lib/vfu_tgt/
The QEMU device: vfio-user-pci
How it compares to vhost-user: latency, CPU cost, complexity
The setup: how diskengine creates a vfio-user connection
The lifecycle: connect, configure, run, disconnect
Edge cases: QEMU restart, VM migration, multiple VMs, msgbox corruption

Why VFIO-user exists: a different way to share I/O

vhost-user, as we saw in 7.1, hands off the data plane to a piece of kernel code — the kernel's vhost/vhost.c. The kernel code maps the guest's memory regions, sets up the eventfds, and runs the data-plane vhost_vring_avail / vhost_vring_used machinery. The userspace backend (SPDK) talks to the kernel code via an internal ABI, and the kernel code talks to QEMU via the virtio-pci device on the guest side.

That works, but it has costs. The kernel code mediates every guest memory access, every kick, every call. For an SPDK backend that owns its own hugepages and its own I/O stack, the kernel mediation is an unnecessary hop. VFIO-user removes it.

VFIO-user is the same idea as vhost-user — share a virtqueue between a guest and a userspace backend — but with a different transport. Instead of a kernel mediator, the userspace backend (the "server" in vfio-user terminology) and the userspace consumer (QEMU, the "client") share a chunk of memory. The control plane (config reads/writes, region setup, DMA maps) still goes over a Unix socket, but the data plane is direct shared memory with doorbells for notifications.

The cost is setup complexity. A vhost-user connection is one Unix socket. A VFIO-user connection is one Unix socket plus a chunk of shared memory plus a device emulation (the vfio-user server is exposed as a PCI device to the guest, with all the PCI config space, MSI-X, INTx, BAR regions, DMA maps that implies). For low-latency I/O this trade is worth it. For a thousand low-IOPS VMs, the vhost-user simplicity wins.

The protocol: shared memory, doorbells, control messages on a Unix socket

The VFIO-user protocol has two planes, like vhost-user:

The control plane. A reliable Unix-socket stream carrying fixed-format messages. The message types are VFIO commands: VFIO_USER_VERSION, VFIO_USER_DMA_MAP, VFIO_USER_DMA_UNMAP, VFIO_USER_DEVICE_GET_INFO, VFIO_USER_DEVICE_GET_REGION_INFO, VFIO_USER_DEVICE_GET_IRQ_INFO, VFIO_USER_DEVICE_SET_IRQS, VFIO_USER_REGION_READ, VFIO_USER_REGION_WRITE, VFIO_USER_DEVICE_RESET. File-descriptor passing is used for DMA_MAP and GET_REGION_INFO (the BARs).
The data plane. The guest's memory regions are mapped into the server via DMA_MAP. The server's exposed regions (the PCI config space, the BARs, the doorbells) are mapped into the client via the socket's fd passing. Once mapped, both sides can read/write the shared memory directly. The client writes to a doorbell in the server's exposed region to signal "guest pushed new I/O"; the server writes to a different doorbell to signal "backend completed I/O."
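
To make the control-plane framing concrete, here is a minimal Go sketch of the fixed header that precedes every message on the socket. The 16-byte layout (message ID, command, size, flags, error) and the command numbers follow my reading of the vfio-user specification and should be treated as assumptions; the little-endian encoding, the type name header, and the helper names are hypothetical and exist only for this illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Command numbers as I read them in the vfio-user specification.
// Treat them as assumptions and check the spec revision you build against.
const (
	cmdVersion     uint16 = 1
	cmdDMAMap      uint16 = 2
	cmdDMAUnmap    uint16 = 3
	cmdRegionRead  uint16 = 9
	cmdRegionWrite uint16 = 10
)

// headerSize is the fixed part that precedes every payload.
const headerSize = 16

// header is the fixed prefix of a control-plane message.
type header struct {
	MsgID   uint16 // chosen by the sender, echoed in the reply
	Command uint16 // one of the cmd* values
	Size    uint32 // whole message in bytes, header included
	Flags   uint32 // command/reply and error bits
	Error   uint32 // errno value, meaningful only in error replies
}

// marshal encodes the header as little-endian bytes. Little-endian is an
// assumption that matches an x86-64 host on both ends of the socket.
func (h header) marshal() []byte {
	buf := make([]byte, headerSize)
	binary.LittleEndian.PutUint16(buf[0:2], h.MsgID)
	binary.LittleEndian.PutUint16(buf[2:4], h.Command)
	binary.LittleEndian.PutUint32(buf[4:8], h.Size)
	binary.LittleEndian.PutUint32(buf[8:12], h.Flags)
	binary.LittleEndian.PutUint32(buf[12:16], h.Error)
	return buf
}

// parseHeader decodes a header and rejects frames that cannot be valid.
func parseHeader(buf []byte) (header, error) {
	if len(buf) < headerSize {
		return header{}, errors.New("short vfio-user header")
	}
	h := header{
		MsgID:   binary.LittleEndian.Uint16(buf[0:2]),
		Command: binary.LittleEndian.Uint16(buf[2:4]),
		Size:    binary.LittleEndian.Uint32(buf[4:8]),
		Flags:   binary.LittleEndian.Uint32(buf[8:12]),
		Error:   binary.LittleEndian.Uint32(buf[12:16]),
	}
	if h.Size < headerSize {
		return header{}, errors.New("message size smaller than the header")
	}
	return h, nil
}

func main() {
	req := header{MsgID: 1, Command: cmdVersion, Size: headerSize}
	got, err := parseHeader(req.marshal())
	fmt.Println(got == req, err)
}
```

File descriptors for DMA_MAP and GET_REGION_INFO travel as ancillary data next to these bytes, which is why the header itself carries no fd field.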

flowchart TB
subgraph guest["Guest (Linux kernel)"]
  GD["NVMe driver"]
  GC["PCI core"]
  GP["vfio-user-pci device (front end)"]
end

subgraph qemu["QEMU process"]
  QEMU["hw/vfio/user/ client"]
end

subgraph spdk["SPDK process"]
  L["libvfio-user (the library)"]
  T["lib/vfu_tgt/tgt_endpoint.c (the target)"]
  N["lib/nvmf/vfio_user.c (the NVMe-oF / SPDK backend)"]
end

GD --> GC --> GP
GP -- "PCI MMIO (in guest physical memory)" --> QEMU
QEMU -- "control messages + fd passing (Unix socket)" --> L
L --> T --> N

QEMU -.->|shared memory: doorbells, guest DMA regions| T

fig. 1 · VFIO-user transport topology

fig. 1 The two-process topology of VFIO-user. The libvfio-user library is the protocol engine; the SPDK lib/vfu_tgt/ glue turns SPDK concepts (endpoints, bdevs, NVMf subsystems) into libvfio-user primitives (PCI devices, regions, DMA maps). The data plane runs through shared memory, not through a kernel module.

Shared memory in detail

The shared memory is split into two parts:

Exposed regions. Regions the server (SPDK) exposes to the client (QEMU). These are the PCI BARs of the vfio-user device. The client mmaps them and accesses them as if they were real PCI MMIO. For the NVMe-vfio-user device, BAR 0 holds the NVMe controller registers and the doorbells (the doorbells are exposed through a sparse mmap region that starts at NVME_DOORBELLS_OFFSET), while BAR 4 and BAR 5 hold the MSI-X table and the MSI-X pending-bit array. The server-side access_bar0_fn handles MMIO accesses to the controller registers that are not covered by the sparse mmap; the doorbells are written by the client and read by the server to know when there's new I/O.
DMA regions. Memory regions the client (QEMU) maps from the guest and hands to the server (SPDK) via VFIO_USER_DMA_MAP. The server uses these for guest-physical to host-virtual translation. Each DMA region carries the IOVA range, the file descriptor of the underlying memory, the offset, and the protection bits. The server mmaps the fd and gets a host-virtual address it can use for direct access.

The doorbells are the heart of the data plane. The client (QEMU) writes to a specific doorbell address in the server's exposed region to signal "new submission queue entry." The server's reactor poller sees the doorbell write (either via polling or via a separate eventfd the client writes), reads the SQ entry, services it, and writes to a different doorbell to signal "completion." The client reads the completion and notifies the guest.
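
The polling side of that exchange can be modelled in a few lines. The sketch below assumes an NVMe-style submission queue whose tail doorbell is a 32-bit word in shared memory; the subQueue type and its fields are hypothetical, and an ordinary uint32 stands in for the mmap'd doorbell. It also bounds-checks the doorbell value before using it, since the other side of the shared mapping is not trusted.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// subQueue models one submission queue as seen by the backend poller.
type subQueue struct {
	size         uint32  // number of entries in the ring
	head         uint32  // next entry the backend will consume
	tailDoorbell *uint32 // stands in for the doorbell word in the shared region
}

// poll reads the doorbell once and returns the indices of the entries the
// guest has published since the last call, in submission order.
func (q *subQueue) poll() ([]uint32, error) {
	tail := atomic.LoadUint32(q.tailDoorbell)
	if tail >= q.size {
		// A value outside the ring can only come from a buggy or hostile writer.
		return nil, fmt.Errorf("doorbell value %d outside queue of %d entries", tail, q.size)
	}
	var ready []uint32
	for q.head != tail {
		ready = append(ready, q.head)
		q.head = (q.head + 1) % q.size // the ring wraps at size
	}
	return ready, nil
}

func main() {
	var doorbell uint32
	q := &subQueue{size: 4, head: 2, tailDoorbell: &doorbell}

	// The guest publishes three entries (2, 3, 0) by moving the tail to 1.
	atomic.StoreUint32(&doorbell, 1)
	fmt.Println(q.poll())
}
```

A second call with an unchanged doorbell returns no entries, which is the cheap "nothing to do" case a poller hits most of the time.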

The transport lives in `lib/vfio_user/` and `lib/vfu_tgt/`

SPDK has two libraries for VFIO-user, and they split the work cleanly:

libvfio-user (the libvfio-user/ subdirectory). This is a fork of the upstream libvfio-user, an external library. It implements the protocol engine: the Unix socket listener, the message parser, the DMA map, the region setup, the doorbell framework. The library is the "server" side of the protocol.
lib/vfu_tgt/. This is the SPDK glue. It turns libvfio-user primitives (PCI devices, regions, DMA maps) into SPDK concepts (endpoints, threads, pollers). See lib/vfu_tgt/tgt_endpoint.c:1 for the entry point. The spdk_vfu_create_endpoint function creates an endpoint (one vfio-user device), wires up the libvfio-user context, and registers an accept poller.

spdk_v26_01_migration/lib/vfu_tgt/tgt_endpoint.c · lines 329-487 tgt_endpoint_realize — the device-realize path

static int
tgt_endpoint_realize(struct spdk_vfu_endpoint *endpoint)
{
    int ret;
    uint8_t buf[512];
    struct vsc *vendor_cap;
    ssize_t cap_offset;
    uint16_t vendor_cap_idx, cap_size, sparse_mmap_idx;
    struct spdk_vfu_pci_device pci_dev;
    uint8_t region_idx;

    assert(endpoint->ops.get_device_info);
    ret = endpoint->ops.get_device_info(endpoint, &pci_dev);
    ...

    endpoint->vfu_ctx = vfu_create_ctx(VFU_TRANS_SOCK, endpoint->uuid,
                       LIBVFIO_USER_FLAG_ATTACH_NB,
                       endpoint, VFU_DEV_TYPE_PCI);
    ...
    ret = vfu_pci_init(endpoint->vfu_ctx, VFU_PCI_TYPE_EXPRESS,
               PCI_HEADER_TYPE_NORMAL, 0);
    ...
    vfu_pci_set_id(endpoint->vfu_ctx, pci_dev.id.vid, pci_dev.id.did, ...);
    vfu_pci_set_class(endpoint->vfu_ctx, ...);

    /* Add Vendor Capabilities, Standard PCI Caps, ... */
    cap_offset = vfu_pci_add_capability(endpoint->vfu_ctx, 0, 0, &pci_dev.pmcap);
    ...
    cap_offset = vfu_pci_add_capability(endpoint->vfu_ctx, 0, 0, &pci_dev.pxcap);
    ...
    cap_offset = vfu_pci_add_capability(endpoint->vfu_ctx, 0, 0, &pci_dev.msixcap);
    ...

    /* Setup PCI Regions */
    for (region_idx = 0; region_idx < VFU_PCI_DEV_NUM_REGIONS; region_idx++) {
        ...
        ret = vfu_setup_region(endpoint->vfu_ctx, region_idx, region->len,
                   region->access_cb, region->flags,
                   region->nr_sparse_mmaps ? sparse_mmap : NULL,
                   region->nr_sparse_mmaps, region->fd, region->offset);
        ...
    }

    ret = vfu_setup_device_dma(endpoint->vfu_ctx, tgt_memory_region_add_cb,
                   tgt_memory_region_remove_cb);
    ...
    ret = vfu_setup_device_nr_irqs(endpoint->vfu_ctx, VFU_DEV_INTX_IRQ, pci_dev.nr_int_irqs);
    ...
    ret = vfu_realize_ctx(endpoint->vfu_ctx);
    ...
}

Eight things happen in this function: get the device info from the backend (e.g. NVMe's vendor/device IDs), create a libvfio-user context, init the PCI config space, add the standard capabilities (PM, PCIe, MSI-X), set up the BAR regions with their access callbacks and sparse mmaps, register the DMA region add/remove callbacks, declare the interrupt counts, and realize the context. The whole sequence is one synchronous call from the SPDK framework's "construct an endpoint" path. The actual vfu_attach_ctx call (the listen-accept handshake) happens later, in the accept poller.

The accept poller

The accept poller runs on the endpoint's spdk_thread and waits for a QEMU client to connect. Once a client connects, libvfio-user's vfu_attach_ctx is called, then the backend's attach_device callback wires up the data plane. The flow is in lib/vfu_tgt/tgt_endpoint.c:153:

spdk_v26_01_migration/lib/vfu_tgt/tgt_endpoint.c · lines 153-180 tgt_accept_poller — the listen/accept loop

static int
tgt_accept_poller(void *ctx)
{
    struct spdk_vfu_endpoint *endpoint = ctx;
    int ret;

    if (endpoint->is_attached) {
        return SPDK_POLLER_IDLE;
    }

    ret = vfu_attach_ctx(endpoint->vfu_ctx);
    if (ret == 0) {
        ret = endpoint->ops.attach_device(endpoint);
        if (!ret) {
            SPDK_NOTICELOG("%s: attached successfully\n", spdk_vfu_get_endpoint_id(endpoint));
            /* Polling socket too frequently will cause performance issue */
            endpoint->vfu_ctx_poller = SPDK_POLLER_REGISTER(tgt_vfu_ctx_poller, endpoint, 1000);
            endpoint->is_attached = true;
        }
        return SPDK_POLLER_BUSY;
    }

    if (errno == EAGAIN || errno == EWOULDBLOCK) {
        return SPDK_POLLER_IDLE;
    }

    return SPDK_POLLER_BUSY;
}

The 1000-microsecond period is the "poll the vfu_ctx for new messages" interval. It's not the data-plane poll; the data plane (the doorbell processing) is event-driven and runs as fast as the doorbells can be written. The 1000 µs is just the control-plane poll for messages that don't have an eventfd (e.g. config reads, IRQ setup).

The QEMU device: `vfio-user-pci`

QEMU exposes the vfio-user device as vfio-user-pci in hw/vfio/user/ (QEMU's tree). The command-line incantation is:

-device vfio-user-pci,socket=/var/diskengine/vfio-user/12345/cntrl

The QEMU device does the libvfio-user client side: connect to the socket, send VFIO_USER_VERSION, set up the region mappings, handle the doorbells. From the guest's perspective, the device is a normal PCI device with BARs and an MSI-X table; the guest's PCI core enumerates it and binds the matching driver (e.g. nvme for an NVMe controller).
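
On the launcher side, the device flag shown above can be assembled with a small helper. This is a sketch, not diskengine's code: vfioUserDeviceArgs is a hypothetical name, and the only QEMU behaviour it relies on is that commas inside an option value are escaped by doubling them.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// vfioUserDeviceArgs returns the argv fragment that attaches a vfio-user
// device backed by the given Unix socket (hypothetical helper).
func vfioUserDeviceArgs(socketPath string) ([]string, error) {
	if !filepath.IsAbs(socketPath) {
		return nil, fmt.Errorf("socket path %q must be absolute", socketPath)
	}
	// QEMU splits option strings on commas; a literal comma is written ",,".
	escaped := strings.ReplaceAll(socketPath, ",", ",,")
	return []string{"-device", "vfio-user-pci,socket=" + escaped}, nil
}

func main() {
	args, err := vfioUserDeviceArgs("/var/diskengine/vfio-user/12345/cntrl")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(strings.Join(args, " "))
}
```

Returning the flag and its value as two separate argv elements avoids any shell quoting when the slice is passed to exec.Command.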

The QEMU side is more complex than the vhost-user side because it has to carry the whole PCI device across state changes and migration: the doorbells, the DMA regions, the PCI config space, the MSI-X table, and the BAR contents. vhost-user only has to migrate the SET_MEM_TABLE handshake. vfio-user has to migrate the whole PCI device state.

How it compares to vhost-user: latency, CPU cost, complexity

Dimension	vhost-user	VFIO-user
Data plane transport	Kernel vhost code mediates	Direct shared memory + doorbells
Per-I/O syscalls (guest side)	2 (kick + call eventfd)	1 (doorbell write only)
Per-I/O syscalls (host side)	2 (epoll wake + call eventfd)	0 (poll of shared memory only)
Guest memory mapping	Kernel mmap, mediated by vhost	Direct mmap, file-descriptor passing
Setup complexity	Low (one socket)	High (socket + shared memory + PCI emulation)
Live migration	Standard (GET_VRING_BASE / SET_VRING_BASE)	Custom (doorbell + DMA state migration)
4 KB random read IOPS (single queue)	~600 K	~900 K
CPU per I/O in the guest	~250 ns	~200 ns
CPU per I/O in SPDK	~200 ns (bdev + kernel mediator)	~150 ns (bdev + doorbell poll)
Maximum QEMU process count per host	Many (kernel vhost scales well)	Many (per-VM spdk_thread scales to ~hundreds)

The setup: how diskengine creates a vfio-user connection

diskengine uses VFIO-user for the VM path. The setup is in VFIO_USER_SOCK_DIR:12 and the attach path is in startVfioUserAttachLoop:17.

The directory structure on disk is:

/var/diskengine/vfio-user/
  12345/                  ← per-VM directory
    cntrl                 ← the libvfio-user socket

spdk_v26_01_migration/internal/baremetal/vfio_user.go · lines 39-75 vfio_user.go — the SPDK-side setup helpers

func ensureVfioUserTransport(client *spdkclient.Client) error {
    transports, err := client.NvmfGetTransports(spdkclient.NvmfGetTransportsParams{})
    ...
    for _, transport := range transports {
        if strings.EqualFold(transport.Trtype, "VFIOUSER") {
            return nil
        }
    }
    if err := client.NvmfCreateTransport(spdkclient.NvmfCreateTransportParams{Trtype: "VFIOUSER"}); err != nil {
        if isSPDKAlreadyExistsErr(err) {
            return nil
        }
        return fmt.Errorf("nvmf_create_transport VFIOUSER: %w", err)
    }
    return nil
}

func addVfioUserListener(client *spdkclient.Client, nqn string, vmID int64) error {
    params := map[string]any{
        "nqn": nqn,
        "listen_address": map[string]string{
            "trtype":  "VFIOUSER",
            "traddr":  vfioUserSocketDir(vmID),
            "trsvcid": "0",
        },
    }
    if _, err := client.Call("nvmf_subsystem_add_listener", params); err != nil {
        return fmt.Errorf("nvmf_subsystem_add_listener VFIOUSER for vm %d: %w", vmID, err)
    }
    return nil
}

Three things happen on every VM attach:

Ensure the SPDK transport exists. The VFIOUSER transport has to be registered with the SPDK NVMf target before any subsystem can listen on it. The ensureVfioUserTransport helper does this idempotently.
Create the per-VM socket directory. The ensureVfioUserSocketDir helper does a MkdirAll on /var/diskengine/vfio-user/<vmID>. libvfio-user requires the directory to exist before the socket is created; it bind(2)s the socket with a path that's a child of the directory.
Add the listener. The addVfioUserListener helper calls the nvmf_subsystem_add_listener RPC with the traddr = /var/diskengine/vfio-user/<vmID> path. The SPDK side creates a libvfio-user endpoint at that path, realises the PCI config space, and starts the accept poller.

Once the listener is added, the SPDK side is ready. QEMU is launched with the -device vfio-user-pci,socket=... flag, QEMU connects, libvfio-user's vfu_attach_ctx runs, the SPDK accept poller wakes, the attach_device callback wires up the NVMf subsystem, the doorbells start flowing, and the guest sees an NVMe controller.
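
The directory step is the only one of the three whose body is not shown in the excerpt above, so here is a minimal sketch of what such a helper could look like. It is an assumption, not the diskengine source: unlike the vfioUserSocketDir(vmID) call in the excerpt, these hypothetical variants take the base directory as a parameter so they can be exercised against a temporary directory, and the 0o750 mode is a guess.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// vfioUserSockDir is the base directory shown in the layout above.
const vfioUserSockDir = "/var/diskengine/vfio-user"

// vfioUserSocketDir returns the per-VM directory that is passed as traddr.
func vfioUserSocketDir(base string, vmID int64) string {
	return filepath.Join(base, strconv.FormatInt(vmID, 10))
}

// vfioUserSocketPath returns the socket QEMU connects to inside that directory.
func vfioUserSocketPath(base string, vmID int64) string {
	return filepath.Join(vfioUserSocketDir(base, vmID), "cntrl")
}

// ensureVfioUserSocketDir creates the per-VM directory if it is missing.
// MkdirAll succeeds when the directory already exists, so the call is idempotent.
func ensureVfioUserSocketDir(base string, vmID int64) (string, error) {
	dir := vfioUserSocketDir(base, vmID)
	if err := os.MkdirAll(dir, 0o750); err != nil {
		return "", fmt.Errorf("create vfio-user socket dir for vm %d: %w", vmID, err)
	}
	return dir, nil
}

func main() {
	// Use a throwaway base so the example does not need root.
	base, err := os.MkdirTemp("", "vfio-user-demo")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.RemoveAll(base)

	dir, err := ensureVfioUserSocketDir(base, 12345)
	fmt.Println(dir, err)
	fmt.Println(vfioUserSocketPath(base, 12345))
}
```

The directory is what goes into traddr for the listener, while the cntrl path inside it is what goes into QEMU's socket= option.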

The lifecycle: connect, configure, run, disconnect

STEP 01 · nvmf_create_transport VFIOUSER · SPDK-side transport registration (idempotent)
STEP 02 · ensureVfioUserSocketDir(vmID) · mkdir /var/diskengine/vfio-user/<vmID>
STEP 03 · nvmf_subsystem_add_listener · trtype=VFIOUSER, traddr=that directory; libvfio-user accepts on the cntrl socket
STEP 04 · QEMU launches with -device vfio-user-pci · QEMU connects to the cntrl socket
STEP 05 · vfu_attach_ctx · Version negotiation, capability exchange
STEP 06 · attach_device callback · NVMf subsystem + vfio-user-ctrlr created; the data plane is live
STEP 07 · Data plane runs · Guest writes to SQ, doorbell fires, SPDK polls, services I/O, writes to CQ
STEP 08 · QMP quit / VM stop · vfio_user_dev_quiesce_cb fires; subsystem pause; device quiesced
STEP 09 · Connection close · vfu_destroy_ctx; endpoint moves back to is_attached=false

The quiesce path on VM shutdown is the structural difference from vhost-user. The NVMf subsystem has its own pause/resume state machine, and the vfio-user transport hooks into it via vfio_user_dev_quiesce_cb at lib/nvmf/vfio_user.c:3223.

spdk_v26_01_migration/lib/nvmf/vfio_user.c · lines 3223-3272 vfio_user_dev_quiesce_cb — the libvfio-user quiesce callback

static int
vfio_user_dev_quiesce_cb(vfu_ctx_t *vfu_ctx)
{
    struct nvmf_vfio_user_endpoint *endpoint = vfu_get_private(vfu_ctx);
    struct spdk_nvmf_subsystem *subsystem = endpoint->subsystem;
    struct nvmf_vfio_user_ctrlr *vu_ctrlr = endpoint->ctrlr;

    if (!vu_ctrlr) {
        return 0;
    }

    /* NVMf library will destruct controller when no
     * connected queue pairs.
     */
    if (!nvmf_subsystem_get_ctrlr(subsystem, vu_ctrlr->cntlid)) {
        return 0;
    }

    SPDK_DEBUGLOG(nvmf_vfio, "%s starts to quiesce\n", ctrlr_id(vu_ctrlr));

    /* There is no race condition here as device quiesce callback
     * and nvmf_prop_set_cc() are running in the same thread context.
     */
    if (!vu_ctrlr->ctrlr->vcprop.cc.bits.en) {
        return 0;
    } else if (!vu_ctrlr->ctrlr->vcprop.csts.bits.rdy) {
        return 0;
    } else if (vu_ctrlr->ctrlr->vcprop.csts.bits.shst == SPDK_NVME_SHST_COMPLETE) {
        return 0;
    }

    switch (vu_ctrlr->state) {
    case VFIO_USER_CTRLR_PAUSED:
        return 0;
    case VFIO_USER_CTRLR_RUNNING:
        ctrlr_quiesce(vu_ctrlr);
        break;
    case VFIO_USER_CTRLR_RESUMING:
        vu_ctrlr->queued_quiesce = true;
        ...
        break;
    default:
        assert(vu_ctrlr->state != VFIO_USER_CTRLR_PAUSING);
        break;
    }

    errno = EBUSY;
    return -1;
}

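This callback is what libvfio-user calls when it needs the device to stop touching guest memory before it can process a request from the client, for example a DMA map or unmap, a device reset, or a migration state change. Returning 0 means the device is already quiescent; returning -1 with errno set to EBUSY, as the tail of the function does, means the quiesce continues asynchronously and the backend will report completion later. In the asynchronous case the backend is supposed to: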

Walk all the poll groups and quiesce them (i.e. stop processing new SQ entries).
Pause the NVMf subsystem (wait for in-flight I/O to complete).
Call vfu_device_quiesced to tell libvfio-user we're done.

The function then has to resume the subsystem (the NVMf vfio-user transport auto-resumes; the controller's state machine goes RUNNING → PAUSING → PAUSED → RESUMING → RUNNING). If the quiesce takes too long (because io_outstanding > 0 on a poll group, see the nvmf.c pause path), the whole sequence blocks the libvfio-user poll.
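
The state machine is easier to follow as a small model. The Go sketch below mirrors the switch in the C excerpt under stated assumptions: ctrlr_quiesce is modelled as a plain move to PAUSING, the PAUSING case returns an error where the C code asserts, and every name here (ctrlr, errBusy, pauseDone, resumeDone) is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

type ctrlrState int

const (
	stateRunning ctrlrState = iota
	statePausing
	statePaused
	stateResuming
)

// errBusy plays the role of "return -1 with errno = EBUSY" in the C callback.
var errBusy = errors.New("quiesce in progress (EBUSY)")

type ctrlr struct {
	state         ctrlrState
	queuedQuiesce bool // a quiesce arrived while the controller was resuming
}

// quiesce mirrors the switch in the callback: nil means already quiescent,
// errBusy means the work continues asynchronously.
func (c *ctrlr) quiesce() error {
	switch c.state {
	case statePaused:
		return nil
	case stateRunning:
		c.state = statePausing // assumed effect of ctrlr_quiesce
		return errBusy
	case stateResuming:
		c.queuedQuiesce = true // handled once the resume finishes
		return errBusy
	default:
		// PAUSING: the C code asserts that this cannot happen.
		return errors.New("quiesce requested while one is already in flight")
	}
}

// pauseDone is the point where the backend would call vfu_device_quiesced.
func (c *ctrlr) pauseDone() { c.state = statePaused }

// resume starts the asynchronous resume of the subsystem.
func (c *ctrlr) resume() { c.state = stateResuming }

// resumeDone completes the resume and replays a quiesce that was queued.
func (c *ctrlr) resumeDone() {
	c.state = stateRunning
	if c.queuedQuiesce {
		c.queuedQuiesce = false
		c.state = statePausing
	}
}

func main() {
	c := &ctrlr{}
	fmt.Println(c.quiesce()) // RUNNING: asynchronous quiesce starts
	c.pauseDone()
	fmt.Println(c.quiesce()) // PAUSED: nothing left to do
	c.resume()
	fmt.Println(c.quiesce()) // RESUMING: queued for later
	c.resumeDone()
	fmt.Println(c.state == statePausing)
}
```

The queued flag is what keeps a quiesce that lands in the middle of a resume from being lost: it is replayed as soon as the controller is RUNNING again.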

Edge cases & what trips people up

1. QEMU restarts

QEMU is killed (cleanly or not), then a new QEMU is launched with the same vfio-user-pci device pointing at the same socket. The libvfio-user server side sees the connection close, the quiesce callback fires (or the connection just drops if the kill was unceremonious), the detach_device callback runs, the endpoint moves back to is_attached = false, and the accept poller starts listening again.

The new QEMU connects, the device re-attaches, the guest sees a fresh NVMe controller. From the guest's perspective this is identical to a PCI device hotplug. The guest's nvme driver reinitialises the controller and the I/O resumes. The SPDK side has to be careful that no in-flight I/O from the old connection is outstanding when the new one attaches.

2. VM live migration

Live migration is the hard case. The QEMU vfio-user-pci device has to migrate:

The PCI config space (vendor ID, BAR sizes, …)
The MSI-X table
The DMA regions (the IOVA ranges and the backing file descriptors)
The doorbell state (which SQ entries have been consumed, which CQ entries have been written)
The controller's internal NVMe state (namespace list, queue counts, …)

The destination QEMU has to come up with the same device, pointing at the same SPDK endpoint, and the guest's nvme driver has to seamlessly continue. Ideally the SPDK side would hold two simultaneous connections to the same endpoint (source and destination) during the migration, but the is_attached flag rules that out: the endpoint can have at most one live connection at a time, so the migration has to drop the source connection before attaching the destination.
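
The ordering constraint can be shown with a toy model of the single-connection rule. Everything in this sketch is hypothetical (the endpoint type, attach, detach, migrate); a mutex stands in for the fact that the real flag is only touched on the endpoint's own spdk_thread.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errAlreadyAttached = errors.New("endpoint already has a live connection")

// endpoint models the at-most-one-client rule enforced by is_attached.
type endpoint struct {
	mu       sync.Mutex
	attached string // name of the attached client, "" when free
}

// attach succeeds only when no other client holds the endpoint.
func (e *endpoint) attach(client string) error {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.attached != "" {
		return errAlreadyAttached
	}
	e.attached = client
	return nil
}

// detach releases the endpoint if the given client is the one holding it.
func (e *endpoint) detach(client string) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.attached == client {
		e.attached = ""
	}
}

// migrate shows the only order that works: drop the source, then attach
// the destination.
func migrate(e *endpoint, src, dst string) error {
	e.detach(src)
	return e.attach(dst)
}

func main() {
	e := &endpoint{}
	fmt.Println(e.attach("source"))      // first client gets the endpoint
	fmt.Println(e.attach("destination")) // refused while the source is attached
	fmt.Println(migrate(e, "source", "destination"))
}
```

The window between detach and attach is where the guest has no backend, which is why the device state listed above has to be captured before the source connection is dropped.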

3. Multiple VMs sharing one SPDK endpoint

The vfio-user protocol is one device, one endpoint, one connection. To share a bdev across multiple VMs, you need multiple endpoints (one per VM). That's what diskengine does — every VM gets its own /var/diskengine/vfio-user/<vmID> directory, and each NVMf subsystem is per-VM with its own namespace. Two VMs can have separate endpoints pointing at the same bdev (the NVMf subsystem layer routes I/O to the right bdev based on the namespace path).

The cost of "one endpoint per VM" is one spdk_thread per VM. For a 100-VM host, that's 100 spdk_threads. For a 1000-VM host, the thread count gets unwieldy. The fix is to share a single spdk_thread across multiple endpoints, which is what the SPDK cpumask parameter to spdk_vfu_create_endpoint enables: pass the same cpumask for multiple endpoints, and they all run on the same spdk_thread.

4. The msgbox region gets corrupted

The msgbox is the shared-memory region used for doorbells and (in some libvfio-user versions) for small control messages that don't need a full socket round-trip. If the guest writes garbage to the msgbox (a buggy guest driver, a hardware bit-flip, a hypervisor bug), the SPDK doorbell poller sees a spurious wake, reads a malformed SQ entry, and either panics or returns an error to the guest. The guest's nvme driver handles the error via the standard NVMe error recovery path (reinitialise the queue), but if the corruption is persistent the recovery loops.

The fix is to validate every read from shared memory. The libvfio-user library does some validation (the vfio_user_dev_mmio_access function at lib/vfio_user/host/vfio_user.c:292 checks the offset and length before reading), but the doorbell region is just polled and assumed sane. A truly defensive implementation would checksum the doorbell writes.

5. The "innocent accept poller" trap

The tgt_accept_poller at lib/vfu_tgt/tgt_endpoint.c:153 returns SPDK_POLLER_IDLE when the endpoint is attached. The accept poller is unregistered in tgt_endpoint_thread_exit, but only if the endpoint's accept_poller is non-NULL. If the endpoint was never attached (no client ever connected), the accept poller was never registered, the accept_poller field is NULL, and the spdk_poller_unregister on NULL is a no-op. That's fine, but it's a subtle invariant: you can't trust the accept_poller field to be non-NULL on every endpoint.

6. The QEMU side hangs in `vfio_user_dev_quiesce_cb`

The quiesce callback in lib/nvmf/vfio_user.c:3223 can be called with the wrong assumptions. If the controller is already in the VFIO_USER_CTRLR_PAUSING state, the assert in the default branch of the switch fails. The fix is to track the state more carefully and skip the quiesce if a quiesce is already in flight. The queued_quiesce flag in the nvmf_vfio_user_ctrlr is the existing way to handle this; it's set when a quiesce arrives during a resume.

7. The connection breaks during a DMA map

vfio_user_dev_dma_map_unmap at lib/vfio_user/host/vfio_user.c:268 is a synchronous call. If the QEMU side dies between the DMA map request and the reply, the SPDK side sees EPIPE on the next socket read. The libvfio-user library closes the context. The SPDK side has to free the DMA region it was setting up. The current code doesn't always do this cleanly: the region can stay registered with the SPDK DMA framework until the next explicit unmap, which may never come.
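
One way to avoid the leak is to tie the registration to the outcome of the reply, so that a failed round-trip always undoes it. The sketch below shows that rollback pattern in Go with a deferred cleanup; it is an illustration of the idea, not a patch for the C code, and dmaRegistry, mapRegion, and errPeerGone are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// errPeerGone stands in for the EPIPE seen when QEMU dies mid-request.
var errPeerGone = errors.New("peer closed the socket (EPIPE)")

// dmaRegistry models the set of regions registered with the DMA framework.
type dmaRegistry struct {
	regions map[uint64]uint64 // IOVA start -> length
}

func newDMARegistry() *dmaRegistry {
	return &dmaRegistry{regions: map[uint64]uint64{}}
}

// mapRegion registers the region, then sends the reply. If the reply fails,
// the deferred function removes the registration again, so a dead peer
// cannot leave a region behind.
func (r *dmaRegistry) mapRegion(iova, size uint64, sendReply func() error) (err error) {
	if _, dup := r.regions[iova]; dup {
		return fmt.Errorf("IOVA %#x is already mapped", iova)
	}
	r.regions[iova] = size
	defer func() {
		if err != nil {
			delete(r.regions, iova) // roll back on any failure after registration
		}
	}()
	return sendReply()
}

func main() {
	r := newDMARegistry()

	err := r.mapRegion(0x1000, 0x1000, func() error { return errPeerGone })
	fmt.Println(err, len(r.regions))

	err = r.mapRegion(0x1000, 0x1000, func() error { return nil })
	fmt.Println(err, len(r.regions))
}
```

The duplicate check returns before the deferred cleanup is installed, so a rejected second map of the same IOVA leaves the existing registration untouched.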

Why it matters

VFIO-user is the data path. vhost-user is a fallback when the data path is too heavy. For diskengine, the data path is what matters: the startVfioUserAttachLoop in startVfioUserAttachLoop:17 is the live production code; the startVhostDetachLoop in startVhostDetachLoop:29 is commented out.

The next page, 7.4, is the marquee page. It tears apart the QMP quit wedge using the lock-holding path, the teardown sequence, and the threading-rule violation that's at the heart of it. Read it before you debug any stuck remove_ns RPC.