Re: network interface *name* alias support?

From: Jan Engelhardt
Date: Friday, May 23, 2008 - 9:31 am

It is up to the user what name an interface gets. As such, you can 
encode all information you need into it, limited only by the 
maximum name length. Where is the problem?

If you personally cannot associate ethX with a hardware port, you can
rename it to something more meaningful. I have done that with server boxen,
where igb0 and igb1 denoted the Intel E1000 in the PCI slot,
iet0 the on-board Intel E100, and bcm0/bcm1 the two on-board (it's a
Tyan S2892) Broadcom ethernets; and all is well.

It is a bit of a pity that Linux by default calls all its Ethernet 
devices just "eth", quite unlike BSD/Solaris. Only very few (Ethernet) 
drivers use non-ethX naming, namely raX for Ralink before it was merged, 
wlanX for ndiswrapper it seems, and athX for madwifi.
--

From: Kok, Auke
Date: Friday, May 23, 2008 - 10:14 am

FWIW you can just use ethtool to determine the slot address quickly in userspace.
There's no real need to do this in the kernel.

# ethtool -i eth0
driver: e1000e
version: 0.2.0
firmware-version: 1.3-0
bus-info: 0000:00:19.0

If you really want to make this structured, then a hal plugin seems a logical
place to implement this, since it already does device renaming. Having aliases
seems to be a bit of a nightmare and might confuse a lot of userspace
programs/scripts.
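
A minimal sketch of that userspace lookup, assuming ethtool is installed and prints the "key: value" lines shown above. The helper names parse_ethtool_info and bus_info are made up for illustration.

```python
import subprocess


def parse_ethtool_info(text):
    """Turn `ethtool -i` output ("key: value" lines) into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only: bus-info values contain colons.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def bus_info(ifname):
    """Run `ethtool -i <ifname>` and return its bus-info field, or None."""
    out = subprocess.run(
        ["ethtool", "-i", ifname],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_ethtool_info(out).get("bus-info")
```

Calling bus_info("eth0") on the machine above should give the 0000:00:19.0 address, which a script can then map to whatever label it likes.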

Auke
--

From: Rick Jones
Date: Friday, May 23, 2008 - 10:44 am

And if it happens to be in a hotplug slot today, with a suitable hotplug
module (term?) such as acpiphp loaded, you can then map that to a more
human-friendly slot number/name.  In the future, once Alex Chiang's PCI slot
patches make it to mainline, it will be possible even with non-hotplug slots.

netperf omni tests and a couple other tools try to find such mappings today.

rick jones
--

From: Jon Masters
Date: Friday, May 23, 2008 - 12:06 pm

Yep, that's all great until the bus topology changes underneath you.
There is a need for alias support, because it will allow distributions
to assign a name based upon the *slot ordering specified by the vendor*
and therefore allow a consistent slot number no matter what hotplug
happens, what devices are added or removed, which devices are on-board
vs. in cards, and even (eventually) for non-PCI cards.

In the case of Fedora, right now, we have files:

ifcfg-eth<whatever>

These bind to an interface based on the MAC address. If you swap out the
card, you lose. If you pull out the disks from the machine and put them
into another similar machine, you lose. If you put the disks from the
machine into a less similar machine, but one that still has multiple
network interfaces, you lose.

Some enterprise distributions actually have to play with "bfsort" PCI
enumeration orderings in order to ensure that network devices come up in
a reliable order...this is not how we should (in the longer term) be
determining what order the vendor thinks those cards should be in. This
is why vendors have a DMI extension that allows them to specify this
without being concerned with PCI bus orderings, or anything else.

My intention is to also allow for:

ifcfg-slot_<whatever>

Where the configuration is based entirely upon what vendor <XYZ> says is
the first, second, or third card. Then, those who want to use the older
names can continue to do so, but those who prefer to base their
configuration upon the order the vendor states, can do so.

Jon.


--

From: Jon Masters
Date: Friday, May 23, 2008 - 12:11 pm

I'm aware there are other ways to achieve this than having udev assign
an alias on boot, but I think the alias approach is particularly clean.

Jon.


--

From: Jan Engelhardt
Date: Friday, May 23, 2008 - 1:46 pm

While it's gone now, openSUSE had support for ifcfg-bus-pci-0000:00:19.0
in versions prior to 10.3. I suggest you kindly ask them to reinstate it,
because with Fedora it is probably not going to happen that
bus-pci-.. gets added in the first place.

--

From: Jon Masters
Date: Friday, May 23, 2008 - 1:55 pm

Yes but *that isn't what I'm talking about* :)

That doesn't imply the physical ordering of the devices on the back of
the machine, does it? How do I know which device is labeled "0" on the
back of the machine, and in the vendor documentation? The answer is,
because they added a DMI extension to tell us this information. So let's
please stop thinking about physical bus ordering and instead view this
as a simple problem of wanting to add an alias based on what the vendor
reports should be the ordering of the devices in the system :)

Jon.


--

From: Thomas Graf
Date: Friday, May 23, 2008 - 3:54 pm

I'd propose to extend the netlink configuration interface, e.g. introduce a
new netlink attribute IFLA_SLOT which can be provided to select the device
to be changed based on the slot number instead of the name/ifindex. That
would also make it trivial to write a small app using RTM_GETLINK to
translate a slot number to the corresponding interface name.
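
A rough sketch of the translation step, assuming the reply to an RTM_GETLINK dump has already been read into a buffer. IFLA_SLOT does not exist in the kernel; it is only the attribute proposed above, so the lookup below takes the attribute number as a parameter rather than inventing one. The function names are illustrative.

```python
import struct

RTM_NEWLINK = 16      # message type of each link in an RTM_GETLINK reply
IFLA_IFNAME = 3       # attribute carrying the interface name
NLMSG_HDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")   # family, pad, type, index, flags, change
RTATTR = struct.Struct("=HH")          # len, type


def align4(n):
    """Netlink pads messages and attributes to 4-byte boundaries."""
    return (n + 3) & ~3


def parse_links(buf):
    """Yield (ifindex, {attr_type: payload}) for each RTM_NEWLINK in buf."""
    off = 0
    while off + NLMSG_HDR.size <= len(buf):
        msg_len, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(buf, off)
        if msg_len < NLMSG_HDR.size:
            break  # malformed header; stop rather than loop forever
        if msg_type == RTM_NEWLINK:
            body = off + NLMSG_HDR.size
            _fam, _type, index, _fl, _chg = IFINFOMSG.unpack_from(buf, body)
            attrs = {}
            pos, end = body + IFINFOMSG.size, off + msg_len
            while pos + RTATTR.size <= end:
                rta_len, rta_type = RTATTR.unpack_from(buf, pos)
                if rta_len < RTATTR.size:
                    break
                attrs[rta_type] = buf[pos + RTATTR.size:pos + rta_len]
                pos += align4(rta_len)
            yield index, attrs
        off += align4(msg_len)


def ifname_where(buf, attr_type, payload):
    """Return the name of the first link whose attribute equals payload."""
    for _index, attrs in parse_links(buf):
        if attrs.get(attr_type) == payload:
            return attrs.get(IFLA_IFNAME, b"").rstrip(b"\0").decode()
    return None
```

If a slot attribute were ever added, ifname_where(buf, IFLA_SLOT, slot_bytes) would give the name to hand to existing tools; the same walk works today for any attribute the kernel already sends.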
--

From: Jon Masters
Date: Friday, May 23, 2008 - 9:25 pm

I guess that would also work quite nicely for what I want to do, but the
problem is that this will require either:

*). The kernel decodes the DMI extension directly.
*). We can first inform each device which slot it is in (set the slot).

My intention is to implement whatever seems reasonable, and my reason
for asking is that I am not a networking maintainer, so I want to know
what seems reasonable :)

Cheers,

Jon.


--

From: Jan Engelhardt
Date: Friday, May 23, 2008 - 9:53 pm

Why are we even looking at slot numbers? I do not think there is any
guarantee that the order of slots as a human would recognize them on
the board must always correspond to a monotonically increasing linear
function.
--

From: Matt Domsch
Date: Friday, May 23, 2008 - 10:16 pm

The guarantee comes from the SMBIOS tables describing the slot
physically, including the label on the motherboard for it, as well as
the new SMBIOS table bits in the 2.6 spec that provide the linkage
between a PCI domain/bus/device/function to slot (or embedded)
mapping.  New type 41, and extended type 9, can provide this linkage.
Dell late-model servers implement this in their BIOS.
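
To make the linkage concrete, here is a sketch that decodes one type 41 (Onboard Devices Extended Information) structure into a label and a PCI address. It assumes the field layout from the SMBIOS 2.6 specification and that the raw bytes of the structure, including its trailing string set, have already been obtained; parse_type41 is a made-up name.

```python
import struct


def parse_type41(raw):
    """Decode one SMBIOS type 41 structure (formatted area + string set)."""
    stype, length = raw[0], raw[1]
    if stype != 41 or length < 11:
        raise ValueError("not a type 41 structure")
    # Offsets 4..10: reference designation (string index), device type,
    # device type instance, segment group, bus, device/function.
    ref_idx, dev_type, instance, segment, bus, devfn = struct.unpack_from(
        "<BBBHBB", raw, 4)
    # Strings follow the formatted area: NUL-terminated, 1-based indices.
    strings = raw[length:].split(b"\0")
    label = strings[ref_idx - 1].decode("ascii", "replace") if ref_idx else ""
    return {
        "label": label,
        "enabled": bool(dev_type & 0x80),   # bit 7 is the enabled flag
        "type": dev_type & 0x7F,            # 5 means Ethernet
        "instance": instance,
        "pci": "%04x:%02x:%02x.%x" % (segment, bus, devfn >> 3, devfn & 7),
    }
```

The "pci" string is in the same domain:bus:device.function form that ethtool and sysfs use, so it can be matched against an interface from userspace.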

Just to throw a wrench in, look at how udev handles disks presently.
The same physical device is represented in at least 6 different ways:
/dev/disk/by-{id,label,path,uuid,edd} and /dev/sdX.  There was much
confusion at first when the /dev/hda IDE driver device names changed
to /dev/sda with the advent of libata.  People used these alternate
naming schemes to circumvent the problem.  The by-label and by-uuid
names didn't change.  Only the tools that hard-coded /dev/hda needed
to change.

Conceptually I'm looking for the same thing.  The kernel uses the
names ethN for most ethernet type devices.  However, there might be
logical names we would want to assign (public, private, dmz, ...), or
some form of BIOS-assigned (Gb1, Gb2 to match the label printed on the
chassis), or some form of physical placement names (eth_embedded1,
eth_embedded2, eth_slot1_1 and eth_slot1_2 for a multiport card),
etc.  Right now network devices have essentially one name; yes, you
can change it, at the peril of breaking all the tools that assume your
network cards are ethN, just as there was breakage for tools that
assumed disks were /dev/hda.  But you can't have multiple names.

In Fedora 10 rawhide, I'm prepared to change the names of the network
devices from ethN to eth_s0_1 (first embedded NIC) very early in the
process and try to find what all breaks.  But it would be really nice
to be able to assign these other types of names to a device as well,
ideally without breaking tools that are counting on the ethN names.

Any options for doing so would be appreciated.

Thanks,
Matt


-- 
Matt Domsch
Linux ...
--

From: James Chapman
Date: Saturday, May 24, 2008 - 2:15 am

I can see why netdev name aliases might be useful but there are 
potential usability issues. One example is a user who tries to name a 
device eth_s0_1 and gets an error that the device already exists 
(because an alias already exists with that name), yet ifconfig etc 
doesn't list a device with that name. Confusion ensues. Also, kernel 
logs will use the real name in messages, which makes it harder for the 
user to locate messages of her device if she only knows it by its 
aliased name.

Wouldn't it be better to fix any applications that can't handle renamed 
devices?

-- 
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development

--

From: David Woodhouse
Date: Saturday, May 24, 2008 - 2:33 am

_Are_ there any such applications? Other than NetworkManager crapping
itself when the device name is too long, I'm not aware of any.

-- 
dwmw2

--

From: James Chapman
Date: Saturday, May 24, 2008 - 3:37 am

I think pppd is one such app (multilink and radius features may break), 
though this thread has only been concerned with eth devices so far. I'm 
sure I used a command line tool recently that was checking device names 
for eth%d patterns but I can't remember what it was now. :(

-- 
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development

--

From: Patrick McHardy
Date: Saturday, May 24, 2008 - 1:31 pm

iptraf uses device names to determine the device type:

char ifaces[][6] =
     { "lo", "eth", "sl", "ppp", "ippp", "plip", "fddi", "isdn", "dvb",
     "pvc", "hdlc", "ipsec", "sbni", "tr", "wvlan", "wlan", "sm2", "sm3",
     "pent", "lec", "brg", "tun", "tap", "cipcb", "tunl", "vlan", "ath",
     "ra"
};

and

         if (strncmp(ifname, "eth", 3) == 0)
             result = LINK_ETHERNET;
         else if (strncmp(ifname, "ath", 3) == 0)
             result = LINK_ETHERNET;
         else if (strncmp(ifname, "plip", 4) == 0)
             result = LINK_PLIP;
...
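
For comparison, a sketch of a name-independent check. It assumes sysfs is mounted and reads the ARPHRD_* hardware type the kernel exports in /sys/class/net/<name>/type; this is not what iptraf does, and link_type is a hypothetical helper.

```python
import os

# A few ARPHRD_* values from <linux/if_arp.h>.
ARPHRD_NAMES = {1: "ethernet", 512: "ppp", 772: "loopback"}


def link_type(ifname, sysfs_root="/sys/class/net"):
    """Classify an interface by its hardware type, not by its name."""
    with open(os.path.join(sysfs_root, ifname, "type")) as f:
        arphrd = int(f.read().strip())
    return ARPHRD_NAMES.get(arphrd, "unknown (%d)" % arphrd)
```

A renamed device such as bcm0 would still be reported as ethernet, because the name never enters the decision.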
--

From: Jan Engelhardt
Date: Saturday, May 24, 2008 - 1:54 pm

Run `iptraf -u`, and it works with any name.
--

From: Patrick McHardy
Date: Saturday, May 24, 2008 - 8:07 pm

Thanks for the hint. It's stupid nevertheless.

--

From: David Miller
Date: Sunday, May 25, 2008 - 5:17 am

From: Patrick McHardy <kaber@trash.net>

I think it's stupid too.

Physical geography information for a device is available
to userspace already.  If tools want to present that to
the user in a suitable interface, fine.  But what is
being proposed here is not necessary to implement that.
--

From: Matt Domsch
Date: Tuesday, May 27, 2008 - 12:03 pm

OK, I'm just trying to understand how you would see this "feature"
being implemented in userspace.  Advice welcome.

I keep looking at my analogue: disk devices.

Disks have a hard-coded association: a device node with number
(8,0) means "the first SCSI disk device" to the kernel, regardless of
what the name of the file that implements the device node is (probably
/dev/sda, but not necessarily), or whether there are symlinks pointing at
that file.  The kernel only cares about the linkage between the device
node and the driver that accepts read/write/ioctl/etc. to it.

Network devices have no such thing that I can tell.  I get at the
device names (as presently assigned) by reading /proc/net/dev (I'd be
happy to be told of a more correct way - this is what net-tools uses.)
The moment I've finished reading this though, another process can come
along and change these device names.  Now every ioctl() my code makes
could fail because the name (in struct ifreq) is the handle used for
such calls.  One could argue it's a rare thing to change device
names...
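
For reference, a sketch of that /proc/net/dev approach, assuming the usual layout of two header lines followed by one "name: counters" line per device. The result is only a snapshot, which is exactly the race described above. interface_names is an illustrative name.

```python
def interface_names(text):
    """Extract interface names from the contents of /proc/net/dev."""
    names = []
    for line in text.splitlines()[2:]:      # first two lines are headers
        name, sep, _counters = line.partition(":")
        if sep:
            names.append(name.strip())
    return names
```

A caller would pass in open("/proc/net/dev").read(); any name in the returned list may already be stale by the time it is used in a struct ifreq.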

This still leaves us the problem of wanting perhaps several naming
policies: by logical use, by physical geography, by kernel enumeration
name, etc.  Every tool that interacts with device names would need to
be modified to get at these "new names" to make use of them,
which are then translated (in userspace) to the current matching
kernel name, which is then used to make the ioctl() calls.  And we'll
have to persist these "new names" (assuming they aren't always
computable - e.g. the udev persistent net names rules today -
certainly we'd have to do that for any logical use naming policy; 
agreed the persistence mapping would have to exist in userspace).

Something like that?

-- 
Matt Domsch
Linux Technology Strategist, Dell Office of the CTO
linux.dell.com & www.dell.com/linux
--

From: Jan Engelhardt
Date: Tuesday, May 27, 2008 - 2:49 pm

The "correct" way seems to be sending off netlink messages (look at
the iproute2 code -- if you dare), or for shell scripts perhaps just
globbing up /sys/class/net/*.
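
For the sysfs route, a sketch in Python rather than shell, assuming sysfs is mounted at /sys. list_interfaces is an illustrative name; the directory check skips plain files such as bonding_masters that can appear alongside the interfaces.

```python
import os


def list_interfaces(sysfs_root="/sys/class/net"):
    """Return interface names by listing sysfs, as a shell glob would."""
    return sorted(
        name for name in os.listdir(sysfs_root)
        # isdir() follows symlinks, so it works whether the entries are
        # real directories or links into /sys/devices.
        if os.path.isdir(os.path.join(sysfs_root, name))
    )
```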

And that's probably where it already ends. I like your analogy to
disk devices: the kernel keeps exactly one association (namely,
kdev_t to the device driver), so it seems sane to do the same for
network devices.

Make a /dev/net directory, and let udev populate it with symlinks in
the fashion of "bus-pci-0000:02.9 -> eth0". Note the potential
symlink loop, which should be of no concern since readlink(2)
dereferences it exactly once.

The catch: you cannot use "bus-pci-0000:02.9" as a device name
when directly talking to the kernel, it needs to be resolved to eth0
first. But then again, opening /dev/sda1 is also a resolution
procedure (finding the kdev_t for sda1), though it is in the kernel.
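
A sketch of the userspace resolution step, assuming udev had populated a directory of alias symlinks as described. No such layout exists today; the /dev/net default and the resolve_alias name are purely illustrative.

```python
import os


def resolve_alias(alias, alias_dir="/dev/net"):
    """Map an alias symlink to the kernel name it points at.

    readlink is applied exactly once, so "bus-pci-... -> eth0" yields
    "eth0" even if nothing named eth0 exists in alias_dir.
    """
    path = os.path.join(alias_dir, alias)
    if not os.path.islink(path):
        return alias            # not an alias: assume it is a kernel name
    return os.path.basename(os.readlink(path))
```

The returned string is what would go into struct ifreq or a netlink request; the alias itself never reaches the kernel.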

I am reminded of Solaris, which has device nodes for network
interfaces in /dev, though I am not aware how they are actually used.
(And granted, having to plumb it first is not as straightforward
compared to just-using-them in Linux). Though, when it is a device
node, userspace does not need to have knowledge of /dev/net and
readlink(2) it itself, as the kernel will auto-follow symlinks when
open(2)ed.
--

From: Thomas Graf
Date: Tuesday, May 27, 2008 - 3:11 pm

Regardless of whether you identify the link by name or slot, you
should translate the name/slot to ifindex and use netlink requests
to manipulate links. The ifindex is not going to change and won't
be reused by new links until the (large) counter overflows. Therefore
the chance of modifying the wrong link is close to zero.
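
A minimal sketch of that translation using the name/index helpers in Python's socket module (wrappers around if_nametoindex(3) and if_indextoname(3)). The function names are illustrative, and the netlink request itself is not shown.

```python
import socket


def to_ifindex(name):
    """Translate a kernel interface name to its ifindex, or None."""
    try:
        return socket.if_nametoindex(name)
    except OSError:
        return None


def current_name(ifindex):
    """Look up the name an ifindex has right now (it may have been renamed)."""
    try:
        return socket.if_indextoname(ifindex)
    except OSError:
        return None
```

Resolving the name once and holding on to the index means a later rename changes what current_name() returns but not which link is addressed.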
--

From: Stephen Hemminger
Date: Saturday, May 24, 2008 - 11:12 am

On Sat, 24 May 2008 00:25:55 -0400

If it is a physical device, /sys/class/net/ethX/device is a link to
the actual device entry in /sys.

In newer kernels the contents of /sys/class/net are just symlinks.
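
A sketch of following that link from userspace, assuming sysfs is mounted at /sys. device_address is an illustrative name; for a PCI NIC the result should be the same address ethtool reports as bus-info.

```python
import os


def device_address(ifname, sysfs_root="/sys/class/net"):
    """Return the bus address behind an interface, or None if it is virtual."""
    link = os.path.join(sysfs_root, ifname, "device")
    if not os.path.exists(link):
        return None             # e.g. lo, bridges, tunnels have no device link
    # The last path component of the resolved link is the bus address.
    return os.path.basename(os.path.realpath(link))
```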
--
