I'm trying to move all interrupts onto cores 0-3 so that the remaining cores stay free for high-speed, low-latency virtualization.
I wrote a quick script to set the IRQ affinity of everything to 0-3:
#!/bin/bash
while IFS= read -r LINE; do
    echo "0-3 -> \"$LINE\""
    sudo bash -c "echo 0-3 > \"$LINE\""
done <<< "$(find /proc/irq/ -name smp_affinity_list)"
This seems to work for the USB and network devices, but not for the NVMe devices. They all produce this error:
bash: line 1: echo: write error: Input/output error
They stubbornly keep generating interrupts spread evenly across nearly all of my cores.
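To see exactly which IRQs accept the new affinity and which reject the write, a small variation of the script can be used. This is just a sketch; it assumes the registered handler shows up as a subdirectory of /proc/irq/<n> (e.g. nvme0q1), which is how I labelled the owner here:

#!/bin/bash
# Sketch: try to pin every IRQ to cores 0-3 and report which ones refuse the write.
for f in /proc/irq/*/smp_affinity_list; do
    irq=$(basename "$(dirname "$f")")
    # The registered handler name appears as a subdirectory, e.g. /proc/irq/81/nvme0q1
    owner=$(find "/proc/irq/$irq" -mindepth 1 -maxdepth 1 -type d -printf '%f ' 2>/dev/null)
    if echo 0-3 | sudo tee "$f" > /dev/null 2>&1; then
        echo "IRQ $irq ($owner): set to 0-3"
    else
        echo "IRQ $irq ($owner): write rejected"
    fi
done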
If I check the current affinity of those devices:
$ cat /proc/irq/81/smp_affinity_list
0-1,16-17
$ cat /proc/irq/82/smp_affinity_list
2-3,18-19
$ cat /proc/irq/83/smp_affinity_list
4-5,20-21
$ cat /proc/irq/84/smp_affinity_list
6-7,22-23
...
It seems that "something" has taken complete control of spreading these IRQs across the cores and won't let me change it.
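For what it's worth, the kernel also exposes the affinity it actually applied. A loop like the following (a sketch, assuming the per-IRQ handler directories such as nvme0q1 exist and that effective_affinity_list is available, which it should be on kernels newer than about 4.15) prints the requested and effective values side by side for the NVMe queue vectors:

for d in /proc/irq/*/nvme*; do
    [ -e "$d" ] || continue
    irqdir=$(dirname "$d")
    # Compare what was requested with what the kernel actually applied
    printf 'IRQ %s (%s): requested=%s effective=%s\n' \
        "$(basename "$irqdir")" "$(basename "$d")" \
        "$(cat "$irqdir/smp_affinity_list")" \
        "$(cat "$irqdir/effective_affinity_list")"
done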
Getting these moved is absolutely critical, because I do heavy IO in VMs on those cores and the NVMe drives generate a ton of interrupt load. This isn't Windows; I should be able to decide what my machine does.
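The per-CPU counters backing that claim can be watched live with something like:

# Refresh the per-CPU interrupt counts for the nvme queues every second
watch -n1 'grep -i nvme /proc/interrupts'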
What is controlling the IRQ affinity of these devices, and how do I override it?
I'm using a Ryzen 3950X CPU on a Gigabyte Aorus X570 Master motherboard, with 3 NVMe drives attached to the motherboard's M.2 ports.
(Update: I'm now on a 5950X and still have exactly the same problem.)
Kernel: 5.12.2-arch1-1
lspci -v
NVMe-related output:
01:00.0 Non-Volatile memory controller: Phison Electronics Corporation E12 NVMe Controller (rev 01) (prog-if 02 [NVM Express])
Subsystem: Phison Electronics Corporation E12 NVMe Controller
Flags: bus master, fast devsel, latency 0, IRQ 45, NUMA node 0, IOMMU group 14
Memory at fc100000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [80] Express Endpoint, MSI 00
Capabilities: [d0] MSI-X: Enable+ Count=9 Masked-
Capabilities: [e0] MSI: Enable- Count=1/8 Maskable- 64bit+
Capabilities: [f8] Power Management version 3
Capabilities: [100] Latency Tolerance Reporting
Capabilities: [110] L1 PM Substates
Capabilities: [128] Alternative Routing-ID Interpretation (ARI)
Capabilities: [200] Advanced Error Reporting
Capabilities: [300] Secondary PCI Express
Kernel driver in use: nvme
04:00.0 Non-Volatile memory controller: Phison Electronics Corporation E12 NVMe Controller (rev 01) (prog-if 02 [NVM Express])
Subsystem: Phison Electronics Corporation E12 NVMe Controller
Flags: bus master, fast devsel, latency 0, IRQ 24, NUMA node 0, IOMMU group 25
Memory at fbd00000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [80] Express Endpoint, MSI 00
Capabilities: [d0] MSI-X: Enable+ Count=9 Masked-
Capabilities: [e0] MSI: Enable- Count=1/8 Maskable- 64bit+
Capabilities: [f8] Power Management version 3
Capabilities: [100] Latency Tolerance Reporting
Capabilities: [110] L1 PM Substates
Capabilities: [128] Alternative Routing-ID Interpretation (ARI)
Capabilities: [200] Advanced Error Reporting
Capabilities: [300] Secondary PCI Express
Kernel driver in use: nvme
05:00.0 Non-Volatile memory controller: Phison Electronics Corporation E12 NVMe Controller (rev 01) (prog-if 02 [NVM Express])
Subsystem: Phison Electronics Corporation E12 NVMe Controller
Flags: bus master, fast devsel, latency 0, IRQ 40, NUMA node 0, IOMMU group 26
Memory at fbc00000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [80] Express Endpoint, MSI 00
Capabilities: [d0] MSI-X: Enable+ Count=9 Masked-
Capabilities: [e0] MSI: Enable- Count=1/8 Maskable- 64bit+
Capabilities: [f8] Power Management version 3
Capabilities: [100] Latency Tolerance Reporting
Capabilities: [110] L1 PM Substates
Capabilities: [128] Alternative Routing-ID Interpretation (ARI)
Capabilities: [200] Advanced Error Reporting
Capabilities: [300] Secondary PCI Express
Kernel driver in use: nvme
$ dmesg | grep -i nvme
[ 2.042888] nvme nvme0: pci function 0000:01:00.0
[ 2.042912] nvme nvme1: pci function 0000:04:00.0
[ 2.042941] nvme nvme2: pci function 0000:05:00.0
[ 2.048103] nvme nvme0: missing or invalid SUBNQN field.
[ 2.048109] nvme nvme2: missing or invalid SUBNQN field.
[ 2.048109] nvme nvme1: missing or invalid SUBNQN field.
[ 2.048112] nvme nvme0: Shutdown timeout set to 10 seconds
[ 2.048120] nvme nvme1: Shutdown timeout set to 10 seconds
[ 2.048127] nvme nvme2: Shutdown timeout set to 10 seconds
[ 2.049578] nvme nvme0: 8/0/0 default/read/poll queues
[ 2.049668] nvme nvme1: 8/0/0 default/read/poll queues
[ 2.049716] nvme nvme2: 8/0/0 default/read/poll queues
[ 2.051211] nvme1n1: p1
[ 2.051260] nvme2n1: p1
[ 2.051577] nvme0n1: p1 p2