You may recall http://lkml.org/lkml/2006/9/29/268, wherein I described network device enumeration and naming challenges, and several possible fixes. Of these, Fix #1 (fix the PCI device list to be sorted breadth-first) has been implemented in the kernel, and Fix #3 (system board routing rules) has been implemented on Dell PowerEdge 10G and 11G servers (11G begins selling RSN). However, these have not been completely satisfactory. In particular, it keeps getting harder and harder to route PCI-Express lanes to guarantee the same ordering between a depth-first and breadth-first walk, and it turns out that isn't sufficient anyhow.

Problem: Users expect on-motherboard NICs to be named eth0..ethN. This can be difficult to achieve.

Ethernet device names are initially assigned by the kernel, and may be changed by udev or nameif in userspace. The initial name assigned by the kernel is in monotonically increasing order, starting with eth0. In this instance, the enumeration directly leads to an assigned name.

Complications:

1) Devices are discovered, and presented to the kernel for name assignment, based on several factors:

a) the kernel hotplug mechanism emits events for udev to catch, to load the appropriate driver for a given device. The kernel emits these events in some ordering, tied to the depth-first PCI bus walk. Therefore the order in which userspace catches these events and starts to load a given device driver is tied to the depth-first bus walk. There is no guarantee within PCI-Express hardware topology of any ordering to the discovery of devices. To ease this complication, SMBIOS 2.6 includes a mechanism for BIOS to specify its expected ordering of devices, for naming purposes. Tools such as biosdevname use this information.

b) udev may run modprobes in parallel. It guarantees that the events and modprobes are begun in order, but makes no guarantee that one event's modprobe ...
I would classify this as a bug, especially the fact that udev doesn't undo a failed rename, so you end up with ethX_rename. Virtual devices using the same MAC address trigger this reliably unless you add exceptions to the udev rules. You state that it only operates on one device at a time. If that is correct, I'm not sure why the _rename suffix is used at all instead of simply trying to assign the final name, which would avoid this problem. --
This is handled in most cases. Virtual interfaces claiming a configured name and created before the "hardware" interface are not
How? The kernel assigns the names, and the configured names may conflict. So you possibly cannot rename a device to the target name when that name is already taken. I don't see how to avoid this.
Thanks,
Kay
--
I don't remember the exact circumstances, but I've seen it quite a few
Sure, you can't rename it when the name is taken. But what udev apparently does when renaming a device is:
- rename eth0 to eth0_rename
- rename eth0_rename to eth2
- rename returns -EEXIST: udev keeps eth0_rename
What it could do is:
- rename eth0 to eth2
- rename returns -EEXIST: device at least still has a proper name
Alternatively it should unroll the rename and hope that the old name is still free. But I don't see why the _rename step would do any good; assuming only a single device is handled at a time, it can't prevent clashes.
--
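To make the difference between the two behaviours concrete, here is a minimal sketch in Python. It is a toy model, not udev's code: the name table is a plain dict, `NameTaken` stands in for the kernel returning -EEXIST, and the function names are made up for illustration.

```python
class NameTaken(Exception):
    """Stands in for the kernel refusing a rename with -EEXIST."""


def rename(namespace, old, new):
    """Rename interface `old` to `new` in a toy table (name -> device)."""
    if new in namespace:
        raise NameTaken(new)
    namespace[new] = namespace.pop(old)


def rename_via_temp(namespace, old, new):
    """Two-step rename as described above: on failure the device is
    left stranded on the intermediate '<old>_rename' name."""
    temp = old + "_rename"
    rename(namespace, old, temp)
    try:
        rename(namespace, temp, new)
    except NameTaken:
        return temp
    return new


def rename_with_rollback(namespace, old, new):
    """Same two steps, but unroll on failure and hope the old name is
    still free; only if that also fails is the device left on the
    intermediate name."""
    temp = old + "_rename"
    rename(namespace, old, temp)
    try:
        rename(namespace, temp, new)
        return new
    except NameTaken:
        try:
            rename(namespace, temp, old)
            return old
        except NameTaken:
            return temp
```

With a table of `{"eth0": "nic-a", "eth2": "veth"}`, asking either function to rename eth0 to eth2 fails, but only the rollback variant leaves the device with a proper name.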
Any particular reason the MAC addresses are the same? This came up a while ago with the 'dnet' device in the thread "Dave DNET ethernet controller". If the MAC address isn't a UUID for the device, then *what* is? If there isn't one, then certainly udev can't be blamed for getting ordering or names wrong, because there's nothing to use to actually match up the device to a name, uniquely. Note that combinations including bus IDs or device positions in the bus don't work for any type of hotplug case, because you can plug another adapter into the same location but it's a different adapter. Either people want (a) a name assigned to a specific device (which implies a UUID like a MAC address stored on that device somewhere accessible to the driver at plug/boot time), or they want (b) to assign a name to a *position* on the PCI or USB or firewire or whatever bus, or they (c) don't care about this at all. The answer is really 'all of the above'. Most of the people Matt cares about are probably in the (b) camp. But most desktop/laptop users are in the (a) camp because they use hotplug so much. --
> If the MAC address isn't a UUID for the device, then *what* is?
MAC is technically per system if desired (eg old Sun boxes) and that is quite valid by IEEE 802.3. In that case you need MAC + topology. If you are running DECnet your system runs on assigned MAC addresses so you also have to be careful to use the EPROM MAC (if one exists which is
I'd argue the fundamental problem is that I can do this
    ln -s /dev/sda /dev/thebigdiskunderthefridge
but cannot
    ln -s /dev/eth0 /dev/ethernet/slot0
and the SIOCGIF/SIF BSD style ioctl interface doesn't do pathnames or file handles of network devices.
Anyone feel up to putting all the network devices into dev space and fixing the ioctls ;)
--
Sometimes (I was referring to virtual devices) there may not be
I agree that udev can't do anything useful in that case. I would prefer it if it didn't even try, though, instead of messing with the names and leaving a bunch of _rename devices around. Sure, I can add a rule to disable it, but that shouldn't be necessary.
Generally, I'm wondering whether it should touch virtual network devices at all, since the MAC addresses are often not persistent, sometimes not unique, and the name might have already been chosen explicitly by the administrator when creating the device. Currently there are some rules to ignore a couple of known virtual device types. Are there actually cases where renaming virtual devices is desired? Otherwise a more future-proof way than blacklisting each type individually would be to add some attribute informing udev that the device has no unique key and should be ignored.
--
I have seen systems (I think they were Sun boxes) where the _machine_ had a MAC address, and it used that same MAC on all interfaces. this is convenient for some things, but not for others. what's unique and reproducible is the discovery order
David Lang
--
Or even PCI. /me pats his laptop that reassigns PCI device ids randomly every 3rd or so boot. --
Also bear in mind that a module completing init() does not necessarily mean that the interfaces have been created. If the driver requires firmware, it will call out to userspace, and may not register the interface until well afterwards. One could even construct a pathological case where only a virtual device was registered, and userspace was required to add logical interfaces
Well, the obvious fix to this is to make sure the names are always
Actually udev handles this by using a temporary name. When renaming eth0->eth1 it actually uses an intermediate name first. This allows it to simultaneously swap eth0<->eth1 since one unblocks the other (actually both unblock each other). There is a failure case where two devices both end up trying to get the same name, in which case one will lock with a "_rename" name.
There was an early debate in Ubuntu when we first wrote this code about using later names (eth2, eth3, etc.) but we realised that just hides the problem (and it happens again if you plug in a pccard or something that wants eth2). Since this is always a bug, making the problem visible was a "good
While this works for PCI slots, it already doesn't scale to other buses. For example, what slot number is the pccard slot? If you have two different pccard devices, would they get assigned the same name? (udev currently assigns them different names.)
Now consider USB. Would the device name change depending on which USB port you plugged it into? Or is USB just a single slot, in which case what happens when you have two USB ethernet devices? The Apple USB Ethernet device in my iPhone is not the USB Wireless adapter I own; both have very different networking configurations.
I quite liked the idea of /dev/eth0, then we could just use symlinks.
Scott
-- 
Scott James Remnant
scott@ubuntu.com
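The swap and the failure case described above can be modelled in a few lines of Python. This is a simplified, sequential sketch: real udev handles one event per device and may run them in parallel, so the two explicit phases here are an assumption made for clarity, and `apply_renames` is a made-up name.

```python
def apply_renames(namespace, wanted):
    """Toy model of renaming through intermediate names.

    namespace: current name -> device
    wanted:    current name -> desired name
    Returns a new table; the input is not modified.
    """
    table = dict(namespace)

    # Phase 1: park every device that needs a new name on an
    # intermediate "<old>_rename" name, freeing its old name.
    parked = {}
    for old, new in wanted.items():
        if old == new:
            continue
        temp = old + "_rename"
        table[temp] = table.pop(old)
        parked[temp] = new

    # Phase 2: move each parked device to its final name if it is free.
    # A device whose target is taken stays visible as "<old>_rename".
    for temp, new in parked.items():
        if new not in table:
            table[new] = table.pop(temp)

    return table
```

Swapping `{"eth0": "eth1", "eth1": "eth0"}` succeeds because each device vacates its name before the other one needs it. If both devices want eth2, the first gets it and the second is left as eth1_rename, which is the visible-failure behaviour described above.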
actually biosdevname handles this already, using eth_pccard_X.Y where
we would obviously need a solution. eth_usb_{something} perhaps.
--
Matt Domsch
Linux Technology Strategist, Dell Office of the CTO
linux.dell.com & www.dell.com/linux
--
Right, but having biosdevname chase each new bus that comes along sounds iffy. I'd prefer /dev/net/by-name symlinks, if at all possible. But that's a lot of code that I'm not prepared to write. Bill --
Not to mention that All The World Is Not x86
Scott
-- 
Scott James Remnant
scott@ubuntu.com
My thoughts on the subject; from someone who is not
particularly qualified to have opinions.
Reading over your post, I searched for a single sentence describing
the problem you're trying to solve. What I came up with was
this:
Perhaps a little magic in the udev rule that creates the
z70_persistent-net-rules file would solve the basic problem.
It could sort the nics by mac address when creating the
names. It need only run when the z70 file does not exist.
I presume this would produce consistent results in most cases
and it feels technically feasible; although I am not
fully qualified to make that judgment.
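
As a rough illustration of the sort-by-MAC idea, here is a sketch in Python. It only computes a MAC-to-name mapping; it does not read or write any udev rules file, and the function names and the "eth" prefix are assumptions for the example. Duplicate MACs are rejected because they cannot be ordered unambiguously.

```python
def mac_key(mac):
    """Turn 'aa:bb:cc:dd:ee:ff' into bytes so MACs sort numerically."""
    return bytes(int(part, 16) for part in mac.split(":"))


def names_by_mac(macs, prefix="eth"):
    """Assign prefix0, prefix1, ... to NICs in ascending MAC order."""
    normalized = [mac.lower() for mac in macs]
    if len(set(normalized)) != len(normalized):
        raise ValueError("duplicate MAC addresses cannot be ordered")
    ordered = sorted(normalized, key=mac_key)
    return {mac: f"{prefix}{index}" for index, mac in enumerate(ordered)}
```

The result is stable across boots as long as the set of NICs does not change; adding or replacing a card can shift every name after it.
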
Rather than put the onus on udev to make the above
change, Dell could just run a little program at first
boot that munges the z70 file as desired. (It could then
force a reboot; I forget if this would be needed.)
I imagine Dell boots the boxes once at the factory,
but if not then the user has to suffer with a longer
boot process at first boot. Because this is driven
by Dell, Dell would know exactly what nic has what
name. And Dell knows what nics are on the mobo and
what are not, and so can control the mac address sort
order as desired.
The other solution that screams out at me is to ditch
those legacy BIOSes and go to something like LinuxBIOS.
Again, I'm not really qualified, but it sure feels like
there's an answer in this approach.
The other point that struck me was that sometimes, it seems,
users want persistence in the naming of their network devices
and sometimes they want device names based on bus position.
The sucky thing is that symlinks and nics don't mix well
and so it seems impossible to satisfy both the above
requirements at the same time. This is an area that
IMHO could be better addressed by the Linux community.
Karl <kop@meme.com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein
--
Nearly all Dell systems running Linux in the world were not factory-installed with that OS. This isn't something I can simply patch in our factories. It needs to be fixed as far upstream as
Well, there is no "mac address sort" anywhere. (nor is that really a
It's not a BIOS problem. BIOS can inform the OS of what it thinks about hardware location, names, etc. And our PowerEdge (9G and newer) servers do - using SMBIOS 2.6 standard features we added (types 9, 10, and 41) to the specification - exactly to allow such. Now something needs to use that information.
That something today is biosdevname, correct.
--
Matt Domsch
Linux Technology Strategist, Dell Office of the CTO
linux.dell.com & www.dell.com/linux
--
I dispute this statement. I have several hundred servers that have the on-motherboard NICs as the last ones. anyone who's been making the assumption you describe will have been running into problems for many years.
not everyone uses udev. I compile the necessary drivers into the kernel
this approach causes serious problems in a few cases, including
1. a NIC goes bad and you replace it. now all the configs change
2. you reinstall a box and its interface names change.
David Lang
--
I agree it's not a valid assumption. People seem to want two things with names:
1) that devices be named deterministically
2) that the determinism doesn't change on a per-platform or per-configuration-of-a-platform basis.
This tends to mean they want the onboard devices named first, then the add-in devices named. But not necessarily. I would hope to have a deterministic naming method that would work for most people by
Right. These cases are only deterministic because they start from a known state; change or remove that state, and you're back to non-deterministic.
--
Matt Domsch
Linux Technology Strategist, Dell Office of the CTO
linux.dell.com & www.dell.com/linux
--
From: Matt Domsch <Matt_Domsch@dell.com>
I learned a long time ago that eth0 et al. have zero meaning.
If the system firmware folks gave us topology information with respect
to these things, we could export something that tools such as
NetworkManager, iproute2, etc. could use.
For example, if we were told that PCI device "domain:bus:dev:fn" has
string label "Onboard Ethernet 0" then we could present that to the
user.
Changing how the actual network device name is determined is going to
have zero traction.
So, please, put mapping tables into the ACPI or similar and then
programs can go:
	for_each_network_device(name) {
		fd = open(name);
		label = get_system_label(fd, name);
		present_to_user(label, name);
	}
This "get_system_label()" thing can be an ethtool ioctl, some
rtnetlink call, or similar. In the kernel, a generic routine would
exist for major bus types to make the mapping translation, and drivers
would call these.
For PCI it might take the PCI device pointer and try to fish
out a string from the ACPI layer.
For OpenFirmware we might just simply give the full device path,
or a matching device alias name.
That's the only model which allows a smooth transition and
no major infrastructure changes.
I guess it's easier to spew about MAC addresses and other
irrelevant topics than try to solve this problem properly. :-)
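
A userspace view of that model might look roughly like the Python sketch below. Everything in it is hypothetical: no such label table or get_system_label() interface exists today, the PCI addresses and label strings are invented, and the lookup is a plain dict standing in for whatever ethtool ioctl or rtnetlink call would carry the firmware string.

```python
# Hypothetical firmware-provided table: PCI "domain:bus:dev.fn" -> label.
EXAMPLE_LABELS = {
    "0000:03:00.0": "Onboard Ethernet 0",
    "0000:03:00.1": "Onboard Ethernet 1",
}


def get_system_label(pci_address, labels):
    """Return the firmware label for a device, or None if there is none."""
    return labels.get(pci_address)


def present_to_user(devices, labels):
    """Build display strings for a mapping of interface name -> PCI address.

    The kernel name is never changed; the label is only shown next to it.
    Devices without a label fall back to the bare kernel name.
    """
    lines = []
    for name in sorted(devices):
        label = get_system_label(devices[name], labels)
        lines.append(f"{label} ({name})" if label else name)
    return lines
```

The point of the sketch is that eth0 and friends stay exactly as the kernel assigned them; tools only gain an extra string to show the user.
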
--
What about things like USB network adapters where the topology is not fixed? Presumably we would want to use some sort of unique identifier, and the MAC comes to mind. Of course, then you run into the problem of how to deal with duplicate MACs. Chris --
USB devices do have a serial number field in the descriptors, but that only sometimes gets populated with sensible values. More often than not it's just zeros. But worth checking if the MAC isn't set yet. Dan --
Your wish is my command.
DMTF SMBIOS 2.6 specification http://www.dmtf.org/standards/smbios/ contains changes which provide this for PCI devices. Specifically, Type 9 ("System Slots") was extended to include the PCI domain/bus/device/function for each slot. Type 10 ("On Board Devices Information") could not be extended, thus it was deprecated, and new Type 41 ("Onboard Devices Extended Information") was created to be extensible and now includes PCI domain/bus/device/function information. Both Type 9 and Type 41 include a String field which hopefully has a more descriptive value, such as "Onboard Ethernet Broadcom 5808 NIC 1" in the case of some Dell servers.
Shipping Dell 10G (and very soon 11G) server BIOS includes this information. biosdevname can use this to report device names.
Some HP systems have a vendor-specific SMBIOS extension to provide a
While I'd be happy for NetworkManager to present these SMBIOS-provided human-parsable names when available, the names aren't terribly meaningful in a programmatic fashion. The users I've encountered are looking for a programmatic way to say:
The first LOM is my management/admin NIC.
The second LOM is my bulk traffic NIC.
The first add-in card is my backup NIC.
meaning we still need a translation from "how I want to use a NIC" to "which NIC should I plug the cable into". The SMBIOS names don't completely solve this.
Hence my desire of having a way to have multiple alternate names for the same interface. One such name would be the full SMBIOS string. Another would be a bus topology name. A third could be a "how do I use it" name. Analogous to devices represented in /dev using symlinks for these other names. I don't care if it's symlinks in /dev or some other mechanism.
Thanks,
Matt
--
Matt Domsch
Linux Technology Strategist, Dell Office of the CTO
linux.dell.com & www.dell.com/linux
--
nm-applet could support some sort of "named" adapters, though I'd rather have this done with udev rules (or something like that) so that the NIC's common name would be consistent in both the CLI and in the GUI. The only reason nm-applet does what it does now (pulling VID/PID and dropping stupid words like "Corporation") is so the user has *some* clue what NIC they are about to touch; using "eth0" and "eth1" and "eth2" isn't very helpful. But the distinction between "Intel Gigabit Ethernet" and "D-Link 10/100 USB Adapter" is quite a bit easier to grasp at a glance. --
ACPI added _PLD (Physical Device Location) back in 3.0, ISTR. However, searching my archives, I have yet to see a single instance of its use in the field. ACPI also supplies the slot number stuff, which is exported via the existing pci_slot driver. cheers, Len Brown, Intel Open Source Technology Center --
David, would you be opposed to the additional device names being done
as device nodes in userspace, as several people suggested?
/sys/devices/*/net/ifindex already exports the netlink device index.
It would be trivial to add a /sys/devices/*/net/dev file, with
<major>:<minor> for a device, where <minor> = ifindex.
udev could then maintain /dev/net/by-{mac,path,...} as symlinks
to /dev/net/$kernelname.
Tools such as iproute's 'ip' could then be extended to look up their
'dev' argument by /dev path, resolve the symlink to name, get the device node, and
open the socket with the minor number / index (as normal).
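
The symlink-to-name step could be as small as the following Python sketch. It assumes the proposed (not currently existing) layout in which /dev/net/<kernelname> nodes exist and the by-* entries are symlinks to them; `resolve_ifname` is a made-up helper, and the later steps (reading the device node and opening the socket by index) are not shown.

```python
import os


def resolve_ifname(dev_argument, dev_net_dir="/dev/net"):
    """Map a 'dev' argument to a kernel interface name.

    A plain name such as 'eth0' is returned unchanged. A path such as
    '/dev/net/by-mac/00:11:22:33:44:55' is followed through its symlinks
    and must end at a node directly inside dev_net_dir.
    """
    if not dev_argument.startswith("/"):
        return dev_argument

    target = os.path.realpath(dev_argument)
    root = os.path.realpath(dev_net_dir)
    if os.path.dirname(target) != root:
        raise ValueError(f"{dev_argument} does not resolve into {dev_net_dir}")
    return os.path.basename(target)
```

Because plain names pass through untouched, existing command lines would keep working while path-style arguments become an optional extra.
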
Thanks,
Matt
--
Matt Domsch
Linux Technology Strategist, Dell Office of the CTO
linux.dell.com & www.dell.com/linux
--
My idea as a user, having configured some servers: from the kernel's point of view, there should be no preference. If users
The problem here is the monotonically increasing order. I never rename ethX back to the monotonic ethX numbering. IMHO, renaming eth0 to eth1 sounds redundant. I rename ethX to lan, wan, wlan, remote, lan0, lan1, ...
--
