This was on an SMP system? These issues are much more pronounced on a NUMA
system. There the locality of the device may be a prime issue.
Yes. The issue is even worse if the submission comes from a remote node.
For example, take a system with a SCSI controller on node 2, where I/O is
submitted on node 1 and completed on node 2. In that case the
cacheline has to be transferred across the NUMA interconnect.
However, you cannot avoid running the completion on the node where the
device sits. The device has all sorts of control structures, and if you
handled the completion on node 1, lots of cachelines that contain device
state would have to be transferred to node 1.
I think it is better to leave things as they are, or to have the I/O
submission relocated to the node of the device.
I think that is the right approach. This will also help in cases where I/O
devices can only be accessed from a certain node (NUMA device address
restrictions on some systems may not allow remote cacheline access).
Right.
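
To make the "relocate the submission to the node of the device" idea
concrete, here is a rough userspace sketch. It is not the kernel-side change
discussed above, only an illustration under assumptions: it assumes a Linux
system where sysfs exposes numa_node for the PCI device and cpulist for each
node, and the helper names (parse_cpulist, device_numa_node,
bind_to_device_node) are made up for this example. It pins the calling thread
to the CPUs local to the device before it submits I/O.

```python
import os


def parse_cpulist(text):
    """Parse a kernel cpulist string such as "0-3,8-11" into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A single CPU ("5") has no upper bound, so reuse the lower one.
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def device_numa_node(pci_addr):
    """Return the NUMA node sysfs reports for a PCI device, or -1 if unknown."""
    path = f"/sys/bus/pci/devices/{pci_addr}/numa_node"
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return -1


def bind_to_device_node(pci_addr):
    """Pin the calling thread to the device's node; return False if not possible."""
    node = device_numa_node(pci_addr)
    if node < 0:
        return False  # no locality information, leave affinity untouched
    try:
        with open(f"/sys/devices/system/node/node{node}/cpulist") as f:
            cpus = parse_cpulist(f.read())
    except OSError:
        return False
    if not cpus:
        return False  # memory-only node, nothing to bind to
    os.sched_setaffinity(0, cpus)  # pid 0 means the calling thread
    return True
```

A submitting thread would call bind_to_device_node() with the controller's
PCI address (a placeholder here, since the thread names no specific device)
before issuing I/O. Submission and completion then run on the same node, and
the request cachelines do not have to cross the interconnect.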