[PATCH 6/6 v3] PCI: document the change

Previous thread: [PATCH 5/6 v3] PCI: reserve bus range for SR-IOV by Zhao, Yu on Saturday, September 27, 2008 - 1:28 am. (1 message)

Next thread: [GIT PULL] m32r updates by Hirokazu Takata on Saturday, September 27, 2008 - 3:39 am. (1 message)
From: Zhao, Yu
Date: Saturday, September 27, 2008 - 1:28 am

Create a how-to for SR-IOV users and device driver developers.

Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Alex Chiang <achiang@hp.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>

---
 Documentation/DocBook/kernel-api.tmpl |    2 +
 Documentation/PCI/pci-iov-howto.txt   |  227 +++++++++++++++++++++++++++++++++
 2 files changed, 229 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/PCI/pci-iov-howto.txt

diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl
index b7b1482..c6ceb39 100644
--- a/Documentation/DocBook/kernel-api.tmpl
+++ b/Documentation/DocBook/kernel-api.tmpl
@@ -239,6 +239,7 @@ X!Ekernel/module.c
      </sect1>

      <sect1><title>PCI Support Library</title>
+!Iinclude/linux/pci.h
 !Edrivers/pci/pci.c
 !Edrivers/pci/pci-driver.c
 !Edrivers/pci/remove.c
@@ -251,6 +252,7 @@ X!Edrivers/pci/hotplug.c
 -->
 !Edrivers/pci/probe.c
 !Edrivers/pci/rom.c
+!Edrivers/pci/iov.c
      </sect1>
      <sect1><title>PCI Hotplug Support Library</title>
 !Edrivers/pci/hotplug/pci_hotplug_core.c
diff --git a/Documentation/PCI/pci-iov-howto.txt b/Documentation/PCI/pci-iov-howto.txt
new file mode 100644
index 0000000..ff1969e
--- /dev/null
+++ b/Documentation/PCI/pci-iov-howto.txt
@@ -0,0 +1,227 @@
+               PCI Express Single Root I/O Virtualization HOWTO
+                       Copyright (C) 2008 Intel Corporation
+
+
+1. Overview
+
+1.1 What is SR-IOV
+
+Single Root I/O Virtualization (SR-IOV) is a PCI Express Extended
+capability which makes one physical device appear as multiple virtual
+devices. The physical device is referred to as Physical Function while
+the virtual devices are referred to as Virtual Functions. Allocation
+of Virtual Functions can be dynamically controlled by Physical Function
+via ...
From: Matthew Wilcox
Date: Wednesday, October 1, 2008 - 9:07 am

Why do you need to do this?  Thus far, all the documentation has been

I don't think this section actually helps a software developer use


Wouldn't it be more useful to have the iov/N directories be a symlink to

We already have tools to set the MAC and VLAN parameters for network

I think a better interface would put the 'notify' into the struct
pci_driver.  That would make 'notify' a bad name .... how about
'virtual'?  There's also no documentation for the second parameter to


I'm not 100% convinced about this API.  The assumption here is that the
driver will do it, but I think it should probably be in the core.  The
driver probably wants to be notified that the PCI core is going to
create a virtual function, and would it please prepare to do so, but I'm
not convinced this should be triggered by the driver.  How would the

I think we'd be better off having the driver create its own sysfs

From my reading of the SR-IOV spec, this isn't how it's supposed to
work.  The device is supposed to be a fully functional PCI device that
on demand can start peeling off virtual functions; it's not supposed to
boot up and initialise all its virtual functions at once.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--

From: Dong, Eddie
Date: Monday, October 13, 2008 - 5:23 pm

The main concern here is that a VF may be disabled, such as when the PF
enters D3 state or undergoes a reset and is thus unplugged, but user won't

Do you mean Ethtool? If yes, it is impossible for SR-IOV since the

Our concern is that the PF driver may set up a default state when it is
loaded so that SR-IOV can work without any user-level configuration, but
of course the driver won't dynamically change it.

The spec defines that we either enable all VFs or disable all of them.
Per-VF enabling is not supported.
Is this your concern?

Thanks, eddie
--

From: Matthew Wilcox
Date: Monday, October 13, 2008 - 6:08 pm

If we're relying on the user to reconfigure virtual functions on return

I don't think ethtool has that ability; ip(8) can set mac addresses and
vconfig(8) sets vlan parameters.

The device driver already has to be aware of SR-IOV.  If it's going to
support the standard tools (and it damn well ought to), then it should

Let me try to explain this a bit better.

The user decides they want a new ethernet virtual function.  In the
scheme as you have set up:

1. User communicates to ethernet driver "I want a new VF"
2. Ethernet driver tells PCI core "create new VF".

I propose:

1. User tells PCI core "I want a new VF on PCI device 0000:01:03.0"
2. PCI core tells driver "User wants a new VF"

My scheme gives us a unified way of creating new VFs, yours requires each
driver to invent a way for the user to tell them to create a new VF.
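
The proposed flow could be sketched roughly as follows. This is a userspace
mock, not kernel code: struct mock_pci_driver, the virtfn_prepare callback
and mock_core_request_vfs() are all hypothetical names invented for
illustration, and the limit of 4 VFs in the example driver is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

struct mock_pci_dev;

/* Hypothetical stand-in for struct pci_driver with one extra callback. */
struct mock_pci_driver {
	const char *name;
	/* Core asks the driver to prepare for nr_virtfn VFs.
	 * Returns 0 to accept, nonzero to refuse. */
	int (*virtfn_prepare)(struct mock_pci_dev *dev, int nr_virtfn);
};

/* Hypothetical stand-in for struct pci_dev. */
struct mock_pci_dev {
	const char *slot;		/* e.g. "0000:01:03.0" */
	struct mock_pci_driver *driver;
	int total_vfs;			/* upper bound advertised by the device */
	int num_vfs;			/* VFs currently requested */
};

/* Step 1: the user reaches the core through one unified entry point.
 * Step 2: the core notifies whichever driver is bound to the device. */
int mock_core_request_vfs(struct mock_pci_dev *dev, int nr_virtfn)
{
	int ret;

	assert(dev != NULL);
	if (nr_virtfn < 0 || nr_virtfn > dev->total_vfs)
		return -1;
	if (!dev->driver || !dev->driver->virtfn_prepare)
		return -1;	/* driver has no SR-IOV knowledge */

	ret = dev->driver->virtfn_prepare(dev, nr_virtfn);
	if (ret)
		return ret;	/* driver refused; state is unchanged */

	dev->num_vfs = nr_virtfn;
	return 0;
}

/* Example driver: accepts up to 4 VFs (an arbitrary limit for the mock). */
static int mock_eth_virtfn_prepare(struct mock_pci_dev *dev, int nr_virtfn)
{
	(void)dev;
	return nr_virtfn <= 4 ? 0 : -1;
}

struct mock_pci_driver mock_eth_driver = {
	"mock_eth", mock_eth_virtfn_prepare
};
```

The point of the sketch is only that the request enters through the core,
so every driver gets the same user-facing interface and merely supplies a
callback.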

I don't think that's true.  The spec requires you to enable all the
VFs from 0 to NumVFs, but NumVFs can be lower than TotalVFs.  At least,
that's how I read it.

But no, that isn't my concern.  My concern is that you've written a
driver here that seems to be a stub driver.  That doesn't seem to be
how SR-IOV is supposed to work; it's supposed to be a fully-functional
driver that has SR-IOV knowledge added to it.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--

From: Dong, Eddie
Date: Monday, October 13, 2008 - 7:31 pm

No. That is the concern: we don't put that configuration under the VF
nodes because it would disappear.

OK, if it has a VF parameter, we will look into the details.
BTW, the SR-IOV patch is not only for network devices; some other devices
such as IDE will use the same code base as well, and we imagine it could have other
If the user needs a new VF, the VF must already be enabled or exist in the
OS. Otherwise, we need to disable all VFs first and then change NumVFs to
re-enable VFs.


Yes, but setting NumVFs can only occur when VFs are disabled.
The following is from the spec:

NumVFs may only be written while VF Enable is Clear. If NumVFs is
written when VF Enable is Set, the results are undefined.
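
A minimal sketch of what that rule implies for changing the VF count,
written as a userspace mock rather than real config-space access. The
struct and function names are hypothetical; only the fields VF Enable,
TotalVFs and NumVFs come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mock of the three SR-IOV capability fields discussed in this thread. */
struct mock_sriov_cap {
	bool vf_enable;		/* VF Enable bit */
	uint16_t total_vfs;	/* TotalVFs: maximum the device supports */
	uint16_t num_vfs;	/* NumVFs: how many VFs VF Enable exposes */
};

/* Raw write: only legal while VF Enable is clear, per the quoted text. */
static void mock_write_numvfs(struct mock_sriov_cap *cap, uint16_t nr)
{
	assert(!cap->vf_enable);
	cap->num_vfs = nr;
}

/*
 * Change the number of VFs. Because NumVFs cannot be written while VFs
 * are enabled, every change goes through disable -> write -> re-enable,
 * which tears down all existing VFs along the way.
 */
int mock_sriov_set_numvfs(struct mock_sriov_cap *cap, uint16_t nr)
{
	if (nr > cap->total_vfs)
		return -1;		/* NumVFs may not exceed TotalVFs */

	cap->vf_enable = false;		/* all current VFs go away here */
	mock_write_numvfs(cap, nr);
	if (nr > 0)
		cap->vf_enable = true;	/* VFs 0..nr-1 appear together */
	return 0;
}
```

So NumVFs can be lower than TotalVFs, but growing from N to N+1 VFs
still means disabling the existing N first.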

Yes, it is a full-featured driver as long as the PF keeps resources for
itself, for example when not all queues are assigned to VFs.

Thx, eddie
--

From: Yu Zhao
Date: Monday, October 13, 2008 - 7:14 pm

Neither ip(8) nor vconfig(8) can set the MAC address and VLAN parameters for a VF when

As Eddie said, we have two problems here:
1) The user has to set device-specific parameters of a VF when he wants
to use this VF with KVM (assign this device to a KVM guest). In this case,
the VF driver is not loaded in the host environment, so operations which
are implemented as driver callbacks (e.g. set_mac_address()) are not
supported.
2) For security reasons, some SR-IOV devices prohibit the VF driver from
configuring the VF via its own register space. Instead, the configuration
must be done through the PF which the VF is associated with. This means
the PF driver has to receive the parameters that are used to configure its
VFs. These parameters obviously cannot be passed by traditional tools
without modification for SR-IOV.
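
To illustrate point 2, here is a rough userspace sketch of a PF driver
holding the per-VF parameters itself, so they can be set even when no VF
driver is loaded in the host. Everything here (struct mock_pf, the
mock_pf_set_vf_*() helpers, the table size of 8) is hypothetical and not
part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MOCK_MAX_VFS	8	/* arbitrary table size for the mock */
#define MOCK_ETH_ALEN	6	/* bytes in an Ethernet MAC address */

/* Per-VF settings owned by the PF driver, indexed by VF number. */
struct mock_pf {
	int num_vfs;
	uint8_t vf_mac[MOCK_MAX_VFS][MOCK_ETH_ALEN];
	uint16_t vf_vlan[MOCK_MAX_VFS];
};

/* Set the MAC address of VF number vf through the PF. */
int mock_pf_set_vf_mac(struct mock_pf *pf, int vf,
		       const uint8_t mac[MOCK_ETH_ALEN])
{
	assert(pf != NULL && mac != NULL);
	assert(pf->num_vfs <= MOCK_MAX_VFS);
	if (vf < 0 || vf >= pf->num_vfs)
		return -1;		/* no such VF behind this PF */
	memcpy(pf->vf_mac[vf], mac, MOCK_ETH_ALEN);
	return 0;
}

/* Set the VLAN ID of VF number vf through the PF. */
int mock_pf_set_vf_vlan(struct mock_pf *pf, int vf, uint16_t vlan_id)
{
	assert(pf != NULL);
	assert(pf->num_vfs <= MOCK_MAX_VFS);
	if (vf < 0 || vf >= pf->num_vfs)
		return -1;
	if (vlan_id > 4094)
		return -1;		/* 4095 is reserved in 802.1Q */
	pf->vf_vlan[vf] = vlan_id;
	return 0;
}
```

Whatever interface ends up delivering these values, it has to name both
the PF and the VF index, which is what existing per-netdev tools lack.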
--

From: Matthew Wilcox
Date: Monday, October 13, 2008 - 9:01 pm

I suspect what you want to do is create, then configure the device in

I think that idea also covers this point.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--

From: Dong, Eddie
Date: Monday, October 13, 2008 - 9:18 pm

That is not true. Remember that the created VFs will be destroyed whether
the cause is a PF power event or a reset performed for error recovery.
So what we want is:

Config, create, assign, and then deassign and destroy and then
Sorry, can you explain a little bit more? The SR-IOV patch won't define
what kind of entries should or should not be created; we leave it to the
network subsystem to decide what to do. Same for the disk subsystem, etc.

Thx, eddie
--

From: Matthew Wilcox
Date: Monday, October 13, 2008 - 9:46 pm

No entries should be created.  This needs to be not SR-IOV specific.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--

From: Anirban Chakraborty
Date: Thursday, October 16, 2008 - 10:48 pm

I think we need to cover both scenarios here, virtualization and
non-virtualization. In the absence of virtualization, the VF and PF
drivers should be identical. In this context, how does the PF driver
allocate a VF? Is dynamic allocation of VFs possible, or does it have
to allocate all the VFs that the device supports when the PF driver
loads? Also, will the probe function be called for the VFs, or does
the PF driver handle only the probe for the physical function? In a
virtualization context things get a bit more complex, as the VF
driver in the guest would like to treat the VF as a physical function, but
that may not be possible from the device perspective, as the control
registers may well be shared between the VF and the PF.
I would think that VF allocation is the job of the SR PCIM. The PCIM may
well ask the PF driver to configure a VF upon user request.

Thanks much,

--

From: Yu Zhao
Date: Monday, October 13, 2008 - 9:06 pm

^^^

Can you please elaborate on this?

--

From: Yu Zhao
Date: Saturday, November 15, 2008 - 5:38 am

Yes, putting the callback function in the 'pci_driver' is better. It looks
like 'virtual' is not very descriptive (and it's an adjective, while the
other callbacks are verbs). Any other candidates?

Thanks,
Yu


--
