These are changes to uio_pci_generic.c to allow better use of the driver by
non-privileged processes.
1. Add back old code which allowed interrupt re-enablement through uio fd.
2. Translate PCI bards to uio mmap regions, to allow mmap through uio fd.
3. Allow devices which support MSI or MSI-X, but not IRQ.
Signed-off-by: Tom Lyon <pugs@cisco.com>
---
checkpatch.pl is happy with this one.
--- linux-2.6.33/drivers/uio/uio_pci_generic.c 2010-02-24 10:52:17.000000000 -0800
+++ mylinux-2.6.33/drivers/uio/uio_pci_generic.c 2010-04-19 14:57:21.000000000 -0700
@@ -14,9 +14,10 @@
* # ls -l /sys/bus/pci/devices/0000:00:19.0/driver
* .../0000:00:19.0/driver -> ../../../bus/pci/drivers/uio_pci_generic
*
- * Driver won't bind to devices which do not support the Interrupt Disable Bit
- * in the command register. All devices compliant to PCI 2.3 (circa 2002) and
- * all compliant PCI Express devices should support this bit.
+ * Driver won't bind to devices which do not support MSI, MSI-x, or the
+ * Interrupt Disable Bit in the command register. All devices compliant
+ * to PCI 2.3 (circa 2002) and all compliant PCI Express devices should
+ * support one of these.
*/
#include <linux/device.h>
@@ -27,7 +28,7 @@
#define DRIVER_VERSION "0.01.0"
#define DRIVER_AUTHOR "Michael S. Tsirkin <mst@redhat.com>"
-#define DRIVER_DESC "Generic UIO driver for PCI 2.3 devices"
+#define DRIVER_DESC "Generic UIO driver for PCIe & PCI 2.3 devices"
struct uio_pci_generic_dev {
struct uio_info info;
@@ -41,6 +42,39 @@ to_uio_pci_generic_dev(struct uio_info *
return container_of(info, struct uio_pci_generic_dev, info);
}
+/* Read/modify/write command register to disable interrupts.
+ * Note: we could cache the value and optimize the read if there was a way to
+ * get notified of user changes to command register through sysfs.
+ * */
+static void irqtoggle(struct uio_pci_generic_dev *gdev, int irq_on)
+{
+ struct pci_dev *pdev = gdev->pdev;
+ unsigned long flags;
+ u16 ...Since it's common for drivers to need configuration cycles for device control, the above 2 are not sufficient for generic devices. And if you fix the above, you won't need irqcontrol, which IMO we are better off saving for stuff like eventfd mapping. Also - this modifies a kernel/userspace interface in a way that makes an operation that was always safe previously potentially unsafe. Also, BAR regions could be less than 1 page in size, mapping these to unpriveledged process is a security problem. Also, for a generic driver, we likely need write combining support in the interface. Also, io space often can not be mmaped. We need read/write If the device supports MSI or MSI-X, it can perform PCI writes upstream, and MSI-X vectors are controlled through memory. So with MSI-X + mmap to an unpriveledged process you can easily cause the device to modify any memory. With MSI it's hard to be sure, maybe some devices might make guarantees not to do writes except for MSI, but there's no generic way to declare that: bus master needs to be enabled for MSI to work, and once bus master is enabled, nothing seems to prevent the device from corrupting So the patch doesn't look like generic enough or safe enough for users I have in mind (virtualization). What users/devices do you have in mind? For virtualization, I've been thinking about unpriviledged access and msi as well, and here's a plan I thought might work: - add a uio_iommu character device that controls an iommu domain - uio_iommu would make sure iommu is programmed in a safe way - use irqcontrol to bind pci device to iommu domain - allow config cycles through uio fd, but force bus master to 0 unless device is bound to a domain - for sub-page regions, and io, we can't allow mmap to an unpriveledged process. extend irqcontrol to allow read/write and range-check the operations - for msi/msix, drivers use multiple vectors. One idea is to map them by binding an eventfd to a vector. This approach has the ...
That's right. porttype == UIO_PORT_X86 is only there for information purposes. Userspace then knows that it cannot map this but has to use things like inb(), outb() and friends after getting access rights with ioperm()/iopl(). "start" and "size" gives userspace the information needed to do this. Thanks, Hans --
So that fails in the declared purpose of allowing an unpriveledged userspace driver, as inb/outb are priveledged operations. -- MST --
Michael, et al - sorry for the delay, but I've been digesting the comments and researching new approaches. I think the plan for V4 will be to take things entirely out of the UIO framework, and instead have a driver which supports user mode use of "well-behaved" PCI devices. I would like to use read and write to support access to memory regions, IO regions, or PCI config space. Config space is a bitch because not everything is safe to read or write, but I've come up with a table driven approach which can be run-time extended for non-compliant devices (under root control) which could then enable non-privileged users. For instance, OHCI 1394 devices use a dword in config space which is not formatted as a PCI capability, root can use sysfs to enable access: echo <offset> <readbits> <writebits> > /sys/dev/pci/devices/xxxx:xx:xx.x/<yyy>/config_permit A "well-behaved" PCI device must have memory BARs >= 4K for mmaping, must have separate memory space for MSI-X that does not need mmaping by the user driver, must support the PCI 2.3 interrupt masking, and must not go totally crazy with PCI config space (tg3 is real ugly, e1000 is fine). Again, my primary usage model is for direct user-level access to network devices, not for virtualization, but I think both will work. So, I will go outside UIO because: 1 - it doesn't allow reads and writes to sub-drivers, just irqcontrol 2 - it doesn't have ioctls 3 - it has its own interrupt model which doesn't use eventfds 4 - it's ugly doing the new stuff and maintaining backwards compat. I hereby solicit comments on the name and location for the new driver. Michael - some of your comments below imply you didn't look at the companion changes to uio.c, which had the eventfd interrupts and effectively the same iommu handling - but see my inline comments below. Given that many system platforms don't have it, doesn't seem like a big deal. Yes, will "virtualize" this in the driver. User level will not be allowed to The code already requires iommu ...
I suspect that without mmap and (to a lesser extent) write-combining, --
How about page aligned BARs? --
