I've been spending LOTS of time investigating various device sources, to answer some questions I've had, like:
- Why does NetBSD/arm have no bus_space_mmap(9)?
- Why is tty locking messy?
- Why does sys/dev/wscons have so many #ifdef's? (Module-unfriendly!)
After absorbing myself in this for 3 days now, I think I've figured out almost all of the problems I've had and how I can fix them. Before going directly to the answer, let me summarize the problems I've found:
a) Device enumeration is unstable / unpredictable
dk(4) is a pseudo device, and its instances are numbered in the order they're created. This is fine when you manually / explicitly add wedges(4) by using "dkctl addwedge". This is not fine if I have a gpt(4) disk label which has ordered partitions. I expect disks to be created in the order I write in the gpt(4) disk label. It's annoying that the numbering changes when I add a new disk. Same for raidframe(4).
b) Consistent device topology management is missing
The reason why NetBSD/arm has no bus_space_mmap(9) has turned out to be the fact that we have no consistent (MI) way to manage the physical address space of devices. NetBSD/mips has a working bus_space_mmap(9) in sys/arch/mips/mips/bus_space_alignstride_chipdep.c. It defines address windows and manages them by itself. Who wants to reimplement it on all cpus/ports/platforms? Consider that physical address space is a pretty simple concept: a single linear address space. And we already manage a (kind of) tree of devices in autoconf(9). Do we want to manage such a topology in many places? No.
c) Control / data flow is unclear
I've never remembered what wscons command/device to use to configure wscons to add a screen, load a font, or change the encoding. It's a total mess. I don't know how the ioctl I send via a wscons command is delivered to the device. Same for data. Even by looking at sys/dev/wscons. Why is it so complicated? Our tty locking code has so many hacks. See grep XXX sys/kern/tty*. And we have to fix all serial ...
The good news: dyoung@ and I started prototyping pmem(9), which provides MI physical address space management. The bad news: We have had no time to continue on this for more than a year now.
Christoph
What have you achieved? Masao -- Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
http://www.netbsd.org/~cegger/pmem2.diff May not cleanly apply against -current. Christoph
Ah. You provide an API for MD code. (I thought you wanted to provide a new API for drivers, and was about to write a complaint. :P) I wonder if we want to make device memory allocation very smart, using vmem(9). It's done at a very low level; very probably handled in bootstrap code. We don't want to introduce unnecessary dependencies between kernel subsystems.
Masao
--
Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
I don't think this is a problem by itself. With devfs I would normally expose the symlinks or so based on the label or UUID in the GPT, but that functionality is missing. Joerg
Identifying a device by id would be useful, but I don't like those being exposed as paths, like /devices/iommu@f,e0000000/sbus@f,e0001000/..., which is too complicated IMO. What I'm thinking of is to have a file showing device class specific information, like disk0.info, which would work like procfs.
    find /dev/mainbus0 -name 'disk*.info' -print | \
    while read f; do
        grep -q 'the-guid-i-am-looking-for' $f && echo found: $f
    done
Masao
--
Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
ioctl would be called via the <device>.ctl file in devfs. devfs can look up the device_t instance by the opened vnode. device_t points to its device class data, which in turn points to the [bc]devsw entry. This means that the device major can go. device_t will be more important because it represents the device nodes shown in the devfs tree. I think the power hooks added by pmf(9) (?) should move out of there, because they make device_t's responsibility ambiguous.
What I'm thinking is to make device_t "inherit" behaviors, like
- bus - where devices and bridges attach
- bridge - owns address windows, bus is attached
- addressable - has bus_addr_t (parent is also addressable)
- device or pseudo - real device (like azalia) / device function (like audio)
Probably ifnet could be merged with device_t.
Masao
--
Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
I already pointed this out on port-i386@, but theoretically this sounds like a good thing for the ACPI side of things where "pseudo" (ACPI) devices need to be matched with "real" devices. As an example: one thing that holds back the ACPI CPU code I am working on is that I need to be sure that e.g. cpu3 that attaches to acpi0 is the same cpu3 that has attached to mainbus0. So:
    /dev/acpi0/cpu3 -> /dev/mainbus0/cpu3
No idea how this works in practice (frankly, I find autoconfiguration scary).
- Jukka.
PS. Another example is Grégoire Sutre's work with ACPI display devices; http://mail-index.netbsd.org/tech-kern/2010/01/20/msg006983.html
Well, the answer to that is simple: there should only be one device. Any design that doesn't produce that result can be thrown out the window without further delay.
--
Quentin Garnier - cube@cubidou.net - cube@NetBSD.org
"See the look on my face from staying too long in one place [...] every time the morning breaks I know I'm closer to falling" KT Tunstall, Saving My Face, Drastic Fantastic, 2007.
In the above example it would be "acpicpu3 at acpi0" and "cpu3 at mainbus0". But as you know quite well what is involved, I am merely pointing out that the current situation holds back many possibilities. And noting that I don't have the competency to do anything about it. - Jukka.
On Sun, Mar 07, 2010 at 06:43:49PM +0900, Masao Uebayashi wrote:
You're barking up the wrong tree. What's annoying is not that the numbering changes. It is that the numbering is relevant to the use of the device. I expect dk(4) devices to be given names (be it real names or GUIDs), and I expect to be able to use that whenever I currently have to use a string of the form "dkN".
code.
What kind of user do you talk about here? If it's the end user, then
Wrong. Device numbers should be irrelevant to anything but operations
This has nothing to do with what devfs is about. If your idea of devfs is that the user should know the whole device path to access a hard drive, you have strange ideas about simplicity. Besides, imagine you move said hard drive from one port to the other (or on to another, say, faster controller); the ultimate idea of devfs is that the device node for the hard drive doesn't change.
Not that full, explicit device paths aren't something useful to expose one way or another to the userland. It's just not what devfs is about,
Again, users shouldn't have to care about device numbering. With your idea of numbering, the way to access a device should change depending on which USB port I put my usb key drive in? I fail to see how this is better than what we have now.
Are you really just discovering that wscons needs a lot of love? It's old news. The problem is that nobody wants to deal with that mess and the ensuing binary compatibility nightmare.
Really?
It seems to me that you are really confused, about a number of things. Out of those, the most important is what the user experience should be, so let me be clear on this: the end user should never, ever, ever deal with monstrosities like a full device path. And device paths are not devfs, okay?
--
Quentin Garnier - cube@cubidou.net - cube@NetBSD.org
"See the look on my face from staying too long in one place [...] every time the morning breaks I know I'm closer to ...
Is it just me who finds this whole idea ironic?
Has everyone forgotten how to set up their own kernel? Is everyone now
booting GENERIC? (Or just making a copy of GENERIC, with a few patches
without understanding what they are editing?)
The whole point being that if you boot a kernel, in which you have
configured the whole system to connect anything anywhere, you should not
be surprised if the device enumeration might seem random.
If you want predictable device enumeration, you can have that, and have
been able to have that for over twenty years...
The line
wd* at atabus? drive ? flags 0x0000
(to use one example) says that match any wd type disk to any unit number
on any atabus, without doing any closer matching. I.e. kind of unpredictable.
The asterisks and question marks mean exactly that. If you want
predictable matching that stays the same at every boot, no matter what
hardware you put on the system, you write explicit lines in the config
instead.
Jeezuz! How have we fallen to these lows? Trying to make a filesystem
that shows the hardware configuration, with absurd, long and silly
paths, which is pretty useless anyway, since if we just move the disk
the slightest, we lose it anyway.
For basically no gain in functionality, a lot of new mess to deal with
when managing the system, and a lot of work...
I can see a point in having a way to express a specific disk, based on a
disk label instead of the hardware, since that would actually be useful.
The idea suggested by Masao looks to me like a lot of cruft that will
break away even farther from the original simplicity of Unix, for no
actual gain.
But I guess I'm a grumpy old fart, who thinks so already anyway.
NetBSD... A system that used to be better...
(Do I need to say that I agree with Quentin?)
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt@softjar.se || Reading murder ...
How exactly does hard-wiring a kernel help with some of the issues described here? Say you have two USB drives, and plug them in a different order in different ports (which defeats all config(5)
Yes, it is random, and should be considered as such. That doesn't mean, however, that it is impossible to somehow locate devices in a constant way regardless of how they attached. I know that people have
There is a lot to be gained from providing a useful binary distribution of NetBSD. That includes a kernel that people don't have to play with in order to make it useful. Grumpy old farts will always compile their own kernel and do their own thing, but fortunately I don't think it is a goal for anybody in the NetBSD community to be useful only to grumpy old farts.
Are you still positive about that? I am certainly not advocating the status quo.
--
Quentin Garnier - cube@cubidou.net - cube@NetBSD.org
"See the look on my face from staying too long in one place [...] every time the morning breaks I know I'm closer to falling" KT Tunstall, Saving My Face, Drastic Fantastic, 2007.
I often run GENERIC kernels. In fact, the only reason I'm not running it on the machine I'm on now is because for some reason one of the devices I wanted was commented out (spdmem). Given what I need the machine to do, GENERIC works fine, and I'd rather spend my time on more useful things (like spouting random
I definitely agree that this is an important criterion to keep in mind. It's easy to get distracted by things that look cool, even if they don't work. I know I've gone down that path many times.
eric
Imagine if I want to use a USB disk as / on my DELL OptiPlex 745. The device
tree of that machine looks like:
/mainbus0
/pci0
/puhb0
/agp0
/ppb0
/pci0
/vge0
/ukphy0
/vga0
/wsdisplay0
/drm0
/uhci0
/azalia0
/ppb0
/pci0
/ppb1
/pci0
/uhci1
/uhci2
/uhci3
/uhci4
/ppb2
/pci0
/ichlpcib0
/isa0
/lpt0
/com0
/piixide0
/atabus0
/wd0
/atabus1
/atapibus0
/cd0
/ichsmb0
/piixide1
/atabus0
/atabus1
How do you write a kernel config which can always identify my USB disk as
sd0a, even if I plug in random devices?
Masao
--
Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
It would help if you started by showing where your disk would be in the device tree. Then I can tell you what (more or less) you need in your config file. USB, or whatever else, is no magic. You can specify explicitly where your disk is, and have it show up with a specific device number even with other devices attached anywhere.
I seem to remember that way back there was even a tool (in pkgsrc?) which extracted your current device setup, and created a config file from that, so that you would always get the same enumeration, no matter what else showed up on the machine.
The point is, the config file totally and exactly describes your hardware setup. Your suggestion would simply mean that this information would be duplicated in the file system. The config file has the additional "feature" of actually making the device appear with the same name, even if you move it around, by just changing the config file. Everything else in the system will not have to be told after that. And the names exposed, and referred to, are simple and short, even though you do have the full tree described in the config file.
Someone else mentioned that the problem has grown for the simple reason that hardware configurations change much more often now than in the past. I would agree with that. However, for vital pieces of the hardware, the setup normally doesn't change that much (such as the disks normally used by the system).
So, the device configuration and enumeration is only random so far as that if you tell the system that it is okay to give a device a random number, it will actually possibly do that. Otherwise it is totally predictable.
If you, on the other hand, do move your disk around (be that by using USB and different ports and hubs, or different controllers), neither the old config, nor your new solution will help. The disk will change identity (or path) (well, with the old config, it might actually keep its identity, but that's a chancy proposition at ...
"More or less" doesn't meet my criteria. Consider mission critical use cases. My devfs is not meant only for hobbyists.
"More or less", because I don't have all the details. If you were to post the dmesg from your booting, I could give you the exact thing. Are you sure your USB disk shows up as sd? Looking at the config file, I would have thought it would match wd. If it is wd, then the config should have something along these lines:
    wd0 at umass0
    umass0 at uhub0 port 0 configuration 0 interface 0
    uhub0 at usb0
    usb0 at uhci0
    uhci0 at pci1 dev 1 function 0
    pci1 at ppb0 bus 0
    ppb0 at pci0 dev 0 function 0
    pci0 at mainbus 0 bus 0
Obviously I've thrown in a bunch of "0" here, where there probably should be something else, as well as a "1", since you have two pci buses involved already at this point. That's the "more or less" part. Now, if you don't understand the concept based on this, then I don't think putting correct numbers in here is going to help much more either.
The basic idea though, is that this will always cause the same disk to be wd0, and no other disk will ever become that. No matter what hardware you add, or where.
Johnny
What you really don't seem to understand is that this answers only half of the contract. Put the drive in another USB port and it doesn't show up as wd0. The idea was that only that disk would show up as wd0, and would always show up as wd0. (Incidentally, wd@umass is very rare. I think it was only some old Archos.)
--
Quentin Garnier - cube@cubidou.net - cube@NetBSD.org
"See the look on my face from staying too long in one place [...] every time the morning breaks I know I'm closer to falling" KT Tunstall, Saving My Face, Drastic Fantastic, 2007.
But you miss my other half point. In Masao's original idea, he complained about device enumeration being random, and wanted it moved out into the filename namespace. But if you move the device to another port, it will move just as much within the file system, so he didn't solve anything. If you want to be able to refer to a disk in the scenario where you actually move it around, you need some other solution.
My answer was only intended to show that device enumeration does not randomly depend on whether you add/remove other devices, which is what Masao was claiming. His original claim, and reason for his proposed solution, is basically wrong.
The problem you are highlighting is another one, and one which I agree it would be nice to have a solution to. But the only solution I can come up with is to be able to refer to disks by their name in the disk label, or something similar, which is unique per disk, and has no relationship at all with how they are attached to the system. Something like:
    wd0 at umass? label="foobar"
But, as I said, this is another problem, which Masao hasn't at all addressed. His solution to his random device enumeration problem is simply a solution to a non-problem.
I hope I made myself clear, since I sometimes seem to not be able to express clearly enough what I mean.
Johnny
Your answer only says that device enumeration is deterministic. Nobody said that it wasn't. I know autoconf(9) hasn't aged very well, but it's not that bad. Yet.
--
Quentin Garnier - cube@cubidou.net - cube@NetBSD.org
"See the look on my face from staying too long in one place [...] every time the morning breaks I know I'm closer to falling" KT Tunstall, Saving My Face, Drastic Fantastic, 2007.
I've had problems with SATA drive enumeration due to weird BIOS issues. I'd rather have had my configuration driven by what was on the disk. --Steve Bellovin, http://www.cs.columbia.edu/~smb
Masao said exactly that. Or rather, that it wasn't possible to get the
same device number for a specific device facing other changes in the
hardware configuration.
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
So, you want to be able to mount a disk by the label:
    $ mount -t msdosfs -o label "foobar" /external_disk_foobar
or, if you know the UUID
    $ mount -t msdosfs -o uuid 3478374923723423 ~/thumb_drive
What I'm asking is, why does the "device node" need to be deterministic and why is this a 'devfs' problem? The "special" argument to mount(8) does not really need to be a device node, it could find the right one on its own by checking hw.disknames and scanning the disklabels..
iain
That would be equally acceptable. As long as it's not just mount, but also fsck, and whatever else that deals with disks on a low level
The device node only needs to be deterministic to the point that you can predictably get the same configuration on every boot. /etc/fstab is a typical example of a critical piece that depends on this. Think boot time - your system starts by doing some fsck on the disks in fstab, and then mounts them. If suddenly another disk gets the id of your root disk, the bootup will fail miserably until you fix fstab. And having this fixed in the config/kernel seems like a much easier proposition than fixing all the potential tools that need to be aware of this. Even tools that you might not know about, or that even originate totally outside of NetBSD. Since there is no really standardized library to access the raw devices, most programs simply just open a device node. How do you, at that point, make it find the right device node, with a disk that matches a
Yes. But it is not only mount that would need to be fixed.
Johnny
At Wed, 10 Mar 2010 08:56:36 +0000 (GMT), Iain Hibbert <plunky@rya-online.net> wrote:
Yes, something like that, using fs_volname of course. I've wanted this kind of feature for decades. And of course all the other filesystem tools should have this interface as well. It's no good if it's not uniformly usable. newfs and tunefs need to be able to set and change fs_volname to start with. Disk tools could be made to work with disk label names too for added fun, but let us not confuse fs_volname with pack names, disklabel names, etc.
Naturally this should not replace the use of the device file, but rather be added in addition to it, as an optional way to specify the ultimate device used to access the filesystem. In fact I'd much rather see lots of work go into this feature than into anything even remotely related to devfs.
BTW, we don't want to end up with the horrid mess some GNU/Linux systems now use when their kernel config's specify root=LABEL=xxx -- I think we
I think UUID's, as I understand them so far (fs_id, right?), are really too fragile, too meaningless and difficult to read, and too dangerous, to use for this purpose. They are not actually unique, to start with, so labelling them so is just plain wrong. Search google for Russell Coker's discussion on Label vs. UUID.
Filesystem volume names can be said to have many of the same problems, except to start with we know and understand that they're not unique right off the bat, and we can assign human meaning to them and make them memorable.
Let's at least get filesystem access by volume names working right, then we can go on to think about other things, if they still seem worthwhile.
--
Greg A. Woods Planix, Inc. <woods@planix.com> +1 416 218 0099 http://www.planix.com/
While I understand usefulness of human-readable labels, I don't think it should be handled in kernel. Because labels are arbitrary. They are not ensured to be unique. I think labels should be resolved by some name service. It's not different than /etc/hosts -> IP address. Masao
At Thu, 11 Mar 2010 10:22:29 +0900, Masao Uebayashi <uebayasi@gmail.com> wrote:
The fs_id value is _NOT_ going to be any more unique than the fs_volname value. The fs_id value is also not guaranteed to be unique to start with, especially not across the operational lifetime of a filesystem. There are a plethora of ways the fs_id can be duplicated, and just about as many ways for it to get lost (or changed without change control) too.
Sure, labels are arbitrary -- at least to the machine. They are not, necessarily, arbitrary to the human who creates them though. In any case the label doesn't have to be _guaranteed_ to be unique to be useful to both the human and the machine.
Also, the filesystem identifier doesn't have to be a meaningless lengthy string of impossible to memorize sequences of digits to be useful to the system either -- a human created, human meaningful, label can be just as
Sorry, but I'm flabbergasted! What the heck does that mean in this context of filesystem identification? Do you really want to add more complexity, goo, and mess, and places for errors to happen by adding a translation layer? First off, there's really nowhere to store your magical mappings.
K.I.S.S. Please! We do have a place to store a human readable/meaningful filesystem identifier. Let the human provide this label. If the system finds duplicate labels then tell the human which devices have conflicting labels and where those filesystem were last mounted and let the human decide which device should be used. (i.e. the labels do need to be unique for a successful automatic initialisation of the system, but there needs to be a manual way to work around them not being unique regardless of what data they consist of)
In my opinion the fs_id value is truly useless anywhere outside of the on-disk storage of a single filesystem copy where its sole valid use is (IIUC) to help to match valid backup superblock copies. The fact I'm not even sure it's safe or sane to ...
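
A minimal sketch of the duplicate-label policy described above, written in Python. The device names, the `volumes` mapping and the function name `resolve_label` are all hypothetical; nothing here is an existing NetBSD interface. It returns the one device carrying a label, and otherwise reports every candidate so that a human can decide.

```python
def resolve_label(label, volumes):
    """Return the single device whose volume label equals `label`.

    volumes -- dict mapping a device name to its volume label
               (hypothetical data, e.g. gathered by scanning disks)
    Raises LookupError when no device or more than one device matches,
    listing the conflicting devices in the message.
    """
    matches = sorted(dev for dev, name in volumes.items() if name == label)
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"no filesystem labelled {label!r}")
    # Ambiguous: do not guess, name every candidate instead.
    raise LookupError(f"label {label!r} is ambiguous: {', '.join(matches)}")
```

Refusing to pick among duplicates keeps automatic boot safe while leaving the manual override to the administrator, which is the behaviour argued for in the message.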
I want to simplify the path namespace. I want labels and other "referential" information to be accessed via files, like procfs does.
    # cat wd0/.info
Masao
One of the problems is that even a long-term user like you has to know the full detailed dmesg and analyze it. That doesn't meet my goals. Imagine admins hot-swapping multiple disks/NICs on mission critical servers.
Masao
And you have to disable configuration of other PCI buses to prevent unwanted USB devices from appearing. You also have to rebuild the kernel. Even with all of this done, your system only "more or less" works.
Masao
Not sure what you mean here. If you don't want "unknown" devices to appear, then just don't have wildcarded devices in the config. If you want "unknown" devices to actually appear, then you have to have the wildcard entries in there. But they will not get assigned to numbers for which you have explicit entries in the config. So they will be assigned "unused" numbers. No kernel rebuilding is necessary.
Maybe you should state more explicitly what your scenario is, and what you expect to happen?
When the system boots up, I assume you want some set of devices to always get the same enumerations, no matter what other hardware might/might not exist. This is done by explicitly naming those devices in the config file.
Devices which are more "unknown" can either be accepted, and accessed by the system, if you keep wildcarded devices around in the config. Exactly what number gets assigned to each device as it shows up will be "kind of random". But since these are devices not normally expected, they can't really be predictable anyway.
Or if you never want the system to accept totally unknown devices, just remove all the wildcarded device entries in the config. That way, if someone plugs in a new disk, or whatever, it will not be accessible by the system.
Any other scenario you had in mind?
Oh, and notice how the kernel is never rebuilt. You build the kernel once, with the configuration you expect, and then you just run it the whole time.
Johnny
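
The allocation rule described above can be modelled in a few lines of Python. This is a simplified sketch, not how autoconf(9) is implemented: the `pinned` table stands in for explicit config lines, the location strings are made up, and wildcard matches are assumed to take the lowest free number.

```python
def assign_units(pinned, arrivals):
    """Assign unit numbers to devices of one driver (say "wd").

    pinned   -- dict mapping a location string to a fixed unit number,
                standing in for explicit config entries (hypothetical)
    arrivals -- locations in the order the devices are discovered
    Returns a dict mapping each location to its unit number.
    """
    reserved = set(pinned.values())
    units = {}
    next_free = 0
    for loc in arrivals:
        if loc in pinned:
            # Explicit entry: the unit number never depends on ordering.
            units[loc] = pinned[loc]
            continue
        # Wildcard match: skip numbers reserved by explicit entries
        # and numbers already handed out.
        while next_free in reserved or next_free in units.values():
            next_free += 1
        units[loc] = next_free
        next_free += 1
    return units
```

With `pinned = {"atabus1 drive 0": 0}` that disk is unit 0 whatever else attaches first; with an empty `pinned` table the numbering simply follows discovery order.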
You built non-GENERIC in the first place. Masao
Two things come to mind here:
1) Hot-swapping disks and so on has nothing to do with interpreting dmesg, or setting up a configuration. The configuration should already have been done, and be working.
2) As I mentioned before - I know I have seen a program which will spit out the config necessary to actually get a static setup in place, based on the current configuration. So, if you have managed to get a setup that is correct just now, you can basically "snapshot" it, and you'll get the same setup every time after that. And it does not take an "expert" to just use that program.
So I can't say that this should be a problem.
Basically, we already today have a way of getting a predictable device enumeration, which is repeatable, even in the face of other random changes to the hardware. So I don't see the point in why you want to change this. And moving it into a filesystem makes it awkward, more difficult in some ways, and in short is just a bunch of work that gives nothing. Wouldn't it be better to spend that energy on something that actually will buy us something?
And of course I have to know the full details. Just as I would have to know the full details to know the path in your filesystem, if that were to reflect the hardware configuration. How would you know where to find your disk in the file system if you didn't know exactly all the buses and instances that lay between the root and your disk?
Hmm, I just realized that I didn't completely follow how your file system design will even solve which controller gets which path. Maybe it was in your original mail, but I have forgotten that detail in that case. If you have two disk controllers on one bus, how do you decide which is "0", and which is "1"?
Johnny
Physical location has to be known by drivers in some way. Bus drivers
are responsible to probe devices & enumerate them precisely.
Otherwise those buses and their children are not predictable. What
I've in mind is like:
/dev/.../pci0/pcislot0/isp0/...
/dev/.../pci0/pcislot1/isp0/...
/dev/.../pci0/pcislot2/isp0/...
/dev/.../pci0/pcislot3/isp0/...
or
/dev/.../pci0/isp0/...
/dev/.../pci0/isp1/...
/dev/.../pci0/isp2/...
/dev/.../pci0/isp3/...
Masao
That program was recently retired from pkgsrc because it hasn't really worked for ages, was marked as for NetBSD 1.5 only, and no one has cared since then.
Joerg
Yikes. I AM an old fart. :-) Since I don't use it I was just digging through old memories. 1.5 sounds about right. Johnny
You'd need to put the UUID in the kernel config.
I'd go further and say that we should be able to supply a set of device properties (such as drvctl -p prints) to the kernel. Let us match a device by its intrinsic properties (MAC address, serial number, and/or GUID), and set the unit number according to the device property. Quentin is right that this *only* helps us to fix the unit number, but I think that in itself is an important, *feasible* step forward. Dave -- David Young OJC Technologies dyoung@ojctech.com Urbana, IL * (217) 278-3933
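
As a rough illustration of matching on intrinsic properties, the Python sketch below looks a device up in a rule table and returns the unit number to fix it at. The property keys ("mac-address", "serial-number") and the rule format are assumptions made for the example; they are not taken from drvctl(8) output or from any existing config syntax.

```python
def unit_for(props, rules):
    """Return the unit number pinned for a device, or None.

    props -- dict of intrinsic device properties (hypothetical keys)
    rules -- list of (match, unit) pairs; `match` is a dict whose
             every key/value must be present in `props`
    The first matching rule wins. None means no rule applies and the
    device is left to ordinary wildcard numbering.
    """
    for match, unit in rules:
        if all(props.get(key) == value for key, value in match.items()):
            return unit
    return None
```

For example, a rule `({"mac-address": "00:11:22:33:44:55"}, 0)` would keep that network interface at unit 0 regardless of which slot it sits in. As noted in the message, this only fixes the unit number.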
One thing that I think is problematic about trying to do that, is that you might sometimes need to attach a device (allocate the unit number) in order to discover its intrinsic properties. It can't always be done in the attach routine because you might have to wait for a query (or several) to return. For that reason, we should consider that the dv_xname is not necessarily a useful tag. (I say "device" rather than disk because I know that Bluetooth controllers work this way - you can't get the BDADDR until it is up and running) I have never used wedges but, for the disk case, would it not be better to make a method of configuring a dk in advance, so that whenever a disk appears with the correct parameters it will already be mapped to the dk you expect? (perhaps a daemon could handle it) Then you know that /dev/dk3 is your USB stick and will never be anything else.. iain
I don't think it has to be or should be in the kernel. Basically, /dev/dk3 gets created or is used by the kernel. A daemon is notified (*cough* udevd) and that scans the device properties, finds the UUID and creates /dev/uuid/2345324523453245. It also finds the label and creates /dev/label/my-usb-stick. The latter is what you put in /etc/fstab. Joerg
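
The daemon's job as outlined above can be sketched in Python as follows. This is an illustration only: `publish_links` and its arguments are hypothetical names, the UUID and label are assumed to have been read from the disk already, and the demo works on a temporary directory instead of a real /dev.

```python
import os
import tempfile

def publish_links(dev_root, node, uuid=None, label=None):
    """Create uuid/<uuid> and label/<label> symlinks to an existing node.

    dev_root -- directory playing the role of /dev
    node     -- name of the device node inside dev_root, e.g. "dk3"
    Returns the list of symlink paths that were created.
    """
    created = []
    for subdir, name in (("uuid", uuid), ("label", label)):
        if not name:
            continue
        directory = os.path.join(dev_root, subdir)
        os.makedirs(directory, exist_ok=True)
        link = os.path.join(directory, name)
        if os.path.lexists(link):
            os.remove(link)  # replace a stale link from an earlier attach
        # Relative target, so the tree still works if dev_root moves.
        os.symlink(os.path.join("..", node), link)
        created.append(link)
    return created

def demo():
    """Build a throwaway tree and resolve the label link back to dk3."""
    with tempfile.TemporaryDirectory() as root:
        open(os.path.join(root, "dk3"), "w").close()
        publish_links(root, "dk3", uuid="2345324523453245",
                      label="my-usb-stick")
        target = os.path.realpath(os.path.join(root, "label", "my-usb-stick"))
        return os.path.basename(target)
```

Here `demo()` resolves label/my-usb-stick back to the dk3 node, which is the indirection an /etc/fstab entry would rely on.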
What if udevd is on /dev/uuid/2345324523453245 ? Dave -- David Young OJC Technologies dyoung@ojctech.com Urbana, IL * (217) 278-3933
The boot loader has a separate mechanism to pass down what is booted from. That should be good enough for getting root mounted. Joerg
What do you mean? How can you mount / on /dev/uuid/2345324523453245? Masao -- Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
I agree completely. In fact, we could do that now, with a simple rc.d script. --Steve Bellovin, http://www.cs.columbia.edu/~smb
Sorry, I got confused - in your method, what is dk3 needed for? What I suggested then was a daemon that waits for a disk device to appear, then it can probe the disk and configure the appropriate dk(4) device. If it determines that the device is your USB stick then it configures it as dk3. Otherwise, just put it as e.g. dk7. The admin knows that /dev/dk3 is normal and can arrange permissions accordingly so that you can access it, but dk7 can be restricted (that's for the paranoid admin).
Then, I'm not sure why /dev/uuid/* and /dev/label/* would be necessary? (sure, they would be 'nice' to have)
iain
It is still the device, the rest are just symlinks. Joerg
do you propose to do away with sd0a then? iain
That's a very good question. Right now you can decide whether you want to use the disklabel approach or wedges. I don't think we need both. I think having the full disk (raw) device ((r)sd0d) and compat symlinks to the wedges is good enough and would help eliminate quite a bit of glue in the existing drivers.
Joerg
And now anyone who can jack around with the userspace daemon process can cause you to mount a filesystem you didn't intend to mount. I think discovery of the identifiers used to mount devices needs to be in the kernel. We can do that already for RAIDframe and GPT; why back away from it now? Thor
Ah. I've been unaware of that. Thanks for pointing it out. Although I once said mknod /dev/id/... should be run in userland, now I believe it should be in-kernel. It's so simple. What I don't want is to dig into not-truly-unique strings like labels. That makes devfs responsible for resolving conflicts, which in turn leads to some configuration thing, which I definitely want to avoid.
Masao
--
Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
You need some kind of persistent state *somewhere*, to support chmod, chown, mv, rm, etc. Or are you proposing to break those? That idea strikes me as a pretty crippling regression.
/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
My devfs doesn't do anything that complicated. Mine is more or less procfs + kauth. Masao
Wow, that sucks. Not being able to change permissions (and less importantly, mv or rm the device files) would definitely be a problem. eric
Could you show me use cases where it sucks? I need more use cases. Masao
That was my own reaction too, but, y'know what? What Uebayashi-san suggests is just fine as a research experiment, and, if it succeeds there, on the road to production use it can grow such things. NetBSD is still a decent framework for minor OS research experiments like that, and I think that's as it should be. Of course, anyone who proposes to put it into NetBSD's main released tree without support for such things should be shouted down. Vociferously. And thoroughly. Lack of chmod/chown/etc in /dev would be a total showstopper (for me, at the very least) for production use. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mouse@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Could you tell me the scenario in which it sucks? Masao
On Wed, 10 Mar 2010 04:38:03 +0900
I program microcontrollers with a serial programmer. I use a serial
connection to the target microcontroller for debugging. So I want to
be able to read/write the serial port device node (e.g. /dev/tty03
or /dev/ttyU0) directly. But I don't want to grant other users access to
my serial devices. So I chown the device node to user jkunz and make it
readable/writable by that user only.
The Linux devfs solved this problem with an init-script, that changed
ownership and modes after each reboot. Looked a bit awkward to me when
I had to deal with it.
Non-persistent ownership and modes of device nodes is a show stopper.
--
tschüß,
Jochen
Homepage: http://www.unixag-kl.fh-kl.de/~jkunz/
That's one of the scenarios I've run into, though for me it's at least as often been something other than a serial line - on my scanner machine, for example, /dev/scanner is a symlink to /dev/uk0, and /dev/uk0 is mode 600 owner mouse. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mouse@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
I also have the same requirements, for various kinds of devices
(serial, USB, even some disk devices)
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
Is it acceptable for you to do such things by some layering? I don't know how to do that exactly yet, but the point is you need a little A bit? I thought udevfs's config file is totally unacceptable for mission critical embedded purposes. Why do we have to learn a new Sure. Masao
On Wed, 10 Mar 2010 11:53:43 +0900
Masao Uebayashi <uebayasi@gmail.com> wrote:
Anything else but "chown jkunz.users /dev/ttyU0" is awkward to me.
So what ever solution for this problem is introduced (udevd,
rc.script, ...), I have to learn a special way to configure device
node ownership and modes on devfs.
There could be some devd(8), that listens for ownership and modes
on devfs and stores it on disk. At next boot it could reconstruct
I talked about the long gone Linux devfs, not udev. But it doesn't
matter. Anything other than chmod(1) and chown(8) needs to be learned.
So it doesn't matter what new stuff I have to learn to get it done.
BTW: My day job is to program an embedded system running Linux.
A (small) part of my work is to fight with udevd. And those systems
are used in mission critical applications.
--
tschüß,
Jochen
Homepage: http://www.unixag-kl.fh-kl.de/~jkunz/
Fair enough. After some thinking, providing a "traditional" view and persistent bits turns out to be not that difficult. /dev has a few reserved directories (like /dev/id). You have no freedom there. Any access other than that goes to devfsd. It has knowledge equivalent to sys/arch/*/conf/majors.* as reference. And it tracks mknod(2), rename(2), etc. per mount point. When you do mknod(/dev/wd0a); rename(/dev/wd0a, /dev/woah0a); open(/dev/woah0a), devfsd resolves it by using DBs, converts it to something like /dev/default/wd0a, and passes it back to the kernel. You have to shut down cleanly, otherwise you lose the DB. Are you happy about that? :) Masao
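[Editor's sketch] The mknod/rename/open sequence above can be illustrated with a tiny in-memory alias table standing in for devfsd's DB. This is only a sketch under stated assumptions: the names alias_table, alias_mknod, alias_rename, and alias_resolve are hypothetical, and mapping every new node to /dev/default/<basename> is an assumption taken from the single example in the message, not a description of any existing devfsd.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ALIASES 16
#define NAME_LEN    64

/* One user-visible name and the canonical node it refers to. */
struct alias {
    char visible[NAME_LEN];   /* e.g. "/dev/woah0a" */
    char canonical[NAME_LEN]; /* e.g. "/dev/default/wd0a" */
};

/* Hypothetical stand-in for devfsd's per-mount database. */
struct alias_table {
    struct alias entries[MAX_ALIASES];
    int count;
};

/* Record mknod(path): the name initially maps to /dev/default/<basename>. */
int alias_mknod(struct alias_table *t, const char *path)
{
    assert(t != NULL && path != NULL);
    if (t->count == MAX_ALIASES)
        return -1;
    const char *slash = strrchr(path, '/');
    const char *base = (slash != NULL) ? slash + 1 : path;
    struct alias *a = &t->entries[t->count++];
    snprintf(a->visible, sizeof(a->visible), "%s", path);
    snprintf(a->canonical, sizeof(a->canonical), "/dev/default/%s", base);
    return 0;
}

/* Record rename(from, to): only the visible name changes. */
int alias_rename(struct alias_table *t, const char *from, const char *to)
{
    assert(t != NULL && from != NULL && to != NULL);
    for (int i = 0; i < t->count; i++) {
        struct alias *a = &t->entries[i];
        if (strcmp(a->visible, from) == 0) {
            snprintf(a->visible, sizeof(a->visible), "%s", to);
            return 0;
        }
    }
    return -1;
}

/* Resolve open(path): return the canonical name, or NULL if unknown. */
const char *alias_resolve(const struct alias_table *t, const char *path)
{
    assert(t != NULL && path != NULL);
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].visible, path) == 0)
            return t->entries[i].canonical;
    }
    return NULL;
}
```

Because the table lives only in memory here, it also shows why a clean shutdown would matter: anything not written back to disk is gone after a crash.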
This seems like the wrong approach to me. I was in contact with mjf@ when he designed his devfs, and I think his approach was not the best but was reasonable. We do not want any sort of static major number definition; everything should be dynamic. There should be some sort of config file which describes what devfsd should do when it receives an event from the kernel, e.g. if a USB key with UUID abc is inserted, create the /dev/usb_work_key device. AFAIK the last version of devfsd was able to handle dynamic major numbers and configure devices according to a config file (proplist). -- Regards. Adam
Dynamic major might make sense in transition. But it's not the goal. My devfs looks up the device instance via struct device. dev_t will be no longer used. devfsd maps major/minor recorded in filesystem to struct device instances. I don't see how dynamic major helps here... Masao
Date: Wed, 10 Mar 2010 21:21:21 +0900
From: Masao Uebayashi <uebayasi@gmail.com>
Message-ID: <70f62c5e1003100421s5c54035bkdee5917165b0104d@mail.gmail.com>
| dev_t will be no longer used.
I'm not sure if something that blatant (unqualified) is actually what
you meant to say, but if it was, you cannot do that.
dev_t (however poorly designed you might think it to be) is a part of
the kernel/application API that has been there forever, and is not
going away any time soon.
Stuff like "find -x" needs to keep on working (that uses dev_t), and for
that matter, doing a backup of a device tree using cpio, then restoring
it later (however insane it really is to use cpio for this kind of
purpose) needs to keep on working.
How the kernel actually associates between drivers and code that needs
to access the drivers is a whole different issue, and if your plan is
just to replace usages like
(*bdevsw[major(dev)].d_strategy)(...);
with something different, then that might be OK, but both the dev_t
and the minor() and major() macros to interpret it simply have to
remain.
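[Editor's sketch] To make the distinction above concrete, here is a toy version of a device number and a switch table indexed by its major part. Everything in it is an assumption for illustration: toy_dev_t, the TOY_* macros, the 8-bit/8-bit split, and the two drivers are hypothetical and do not reflect NetBSD's real dev_t encoding or its bdevsw/cdevsw structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy device number: 8-bit major, 8-bit minor (NOT NetBSD's real layout). */
typedef uint32_t toy_dev_t;

#define TOY_MAJOR(d)        ((unsigned)(((d) >> 8) & 0xff))
#define TOY_MINOR(d)        ((unsigned)((d) & 0xff))
#define TOY_MAKEDEV(ma, mi) ((toy_dev_t)((((ma) & 0xff) << 8) | ((mi) & 0xff)))

/* One entry per driver, in the spirit of a bdevsw/cdevsw row. */
struct toy_devsw {
    const char *name;
    int (*d_open)(unsigned unit);
};

/* Hypothetical drivers: "null" accepts any unit, "disk" has 4 units. */
int null_open(unsigned unit) { (void)unit; return 0; }
int disk_open(unsigned unit) { return unit < 4 ? 0 : -1; }

const struct toy_devsw toy_switch[] = {
    { "null", null_open },   /* major 0 */
    { "disk", disk_open },   /* major 1 */
};
#define TOY_NSWITCH (sizeof(toy_switch) / sizeof(toy_switch[0]))

/* The dispatch step: major selects the driver, minor selects the unit. */
int toy_open(toy_dev_t dev)
{
    unsigned maj = TOY_MAJOR(dev);
    if (maj >= TOY_NSWITCH)
        return -1;                       /* no such driver */
    assert(toy_switch[maj].d_open != NULL);
    return toy_switch[maj].d_open(TOY_MINOR(dev));
}
```

The table lookup inside toy_open is the part a kernel could replace with something else; the number itself and the macros that split it are the part applications see.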
Part of what I am seeing when reading this discussion is that it doesn't
appear as if any two participants have any real idea what the others are
talking about - everyone is focusing only upon their pet need or desire,
and no-one is really looking at the big picture.
I have no real opinion on how all this should be done, just two hints
for how a solution to whatever problem actually exists should be
investigated. First don't start with, and certainly don't concentrate
on, disks - they're way too easy (complicated sure, but it isn't hard
to come up with solutions that seem to work for disks). Probably even
network interfaces (not that we really treat them as devices anyway),
what you need to make work properly are devices like tape units, cd
readers & writers (with nothing loaded in them), serial line interfaces
(com ports, or tty devices), line printer interfaces (parallel ...
The only property of dev_t that userland really cares is that it is a number and that it is unique per device. That is fulfilled as long as I don't disagree on this. Joerg
Date: Wed, 10 Mar 2010 15:41:44 +0100
From: Joerg Sonnenberger <joerg@britannica.bec.de>
Message-ID: <20100310144144.GB23857@britannica.bec.de>
| The only property of dev_t that userland really cares is that it is a
| number and that it is unique per device.
For the vast majority of userland that's right (and what's more,
temporally unique - the same number can mean something entirely
different tomorrow, and generally nothing will care).
That is, except for cpio (and similar) - that actually has a portable
(ie: defined) format that includes the ability to store devices.
Now no-one sane would expect to be able to take a device from one
system to another (that is, a name of a device and its dev_t) and have
it work (usefully) on another system, so the sole practical use of
this ability is backup/restore (a function for which cpio is particularly
useless, but for which it is used nevertheless). A shorter term
variant of the same thing is "cpio -p" to make a backup copy of a
filesystem (and all its device nodes, etc) - a function for which cpio
is not quite so useless.
For dump/restore we could alter the format in which devices are
represented, so that they could be correctly restored, no matter
what we do with them, but for cpio we don't really have that option,
and people do want to be able to get back their owner/modes for
device files, and have the right names apply to the right devices with
the right access permissions for the appropriate users - and (aside
from the name, user id, and modes) all that's available to indicate
what device is the dev_t.
kre
dev_t was, and is, a kludge, to deal with devices in the relatively primitive filesystem Unix used back in its early days (well, I think they might have been ints then, rather than dev_t, but the difference between the two is trivial). It's good enough for most purposes and its problems have been relatively minor so far, so it's survived, but there's nothing sacred about it. Everything starts somewhere. I would never go near Uebayashi-san's devfs in any of the incarnations described so far on a production system. But I think it's high time someone started thinking about, and experimenting with, alternatives to traditional device nodes and device numbers, and I'm glad to see this happening. If and when this gets to the point of being contemplated for production use, that is the time to worry about compatibility with historical practices and decide whether historical compatibility must be maintained or an incompatibility is acceptable. There have been flag days before and there will be again; find -x might need to keep working (or might not; if filesystem mounting changes sufficiently it may no longer make sense, not that changes to filesystem mounting seem to me to be part of this). There is nothing that says it has to continue working with exactly the same dev_t-based implementation it traditionally has - and even if it does, that needs dev_ts only for st_dev fields, not for special device nodes in the filesystem; the use of the same thing for both is a historical accident that I see no particular need to preserve. find -x cares about dev_t only in the sense that it equates "a.st_dev==b.st_dev" with "a and b are on the same filesystem"; something like indices into an array of mount points, or event-of-mounting serial numbers, would work Not given devfs. With a devfs, backing up device nodes makes about as A valid point. But it could also be that nobody has thought radically enough to come up with the interface that _is_ better - somewhat a la "your idea is crazy, ...
On Wed, 10 Mar 2010 10:22:40 -0500 (EST)
Seconded. If you rework it, do it thoroughly. Wipe everything and start
from zero.
Something else comes to my mind: Kernel configuration and devfs
configuration interact closely. E.g. you can give the device
enumeration order in the kernel configuration by "nailing down"
devices. Now those symbolic kernel devices like com(4) need to be
assigned to a device node name in /dev.
Why separate those two? There should be a single configuration file
that configures kernel options like what device to search where and
what device node to assign to it. (+ permissions and ownership etc.)
This file is used to get the kernel default configuration at compile
time. Now this file should be passed to the kernel at boottime
optionally. Thus making the kernel reconfigurable at reboot. In
addition the in-core version of that file must be runtime alterable.
This way you can en-/disable device drivers at runtime, probably
resulting in the (un)load of a kernel module and creation or deletion
of device nodes in /dev. The current kernel configuration can be dumped
to a file and passed to the kernel at next boot...
If you chmod(1) or chown(8) a device node in /dev, devfs updates the
in-core kernel configuration as chmod(2) and chown(2) get down to devfs.
At (clean) reboot kernel configuration gets dumped and reloaded.
Et voila, devfs with persistent permissions without a devfsd(8).
--
tschüß,
Jochen
Homepage: http://www.unixag-kl.fh-kl.de/~jkunz/
chmod(2) / chown(2) are OK. devfsd(8) also needs to track rename(2) done in /dev. (I never do that myself, but people want it...) Masao
I missed when Jochen wrote this, so I'll comment now. This might sound tempting, but I don't think it is a good idea. Keeping track of changes and trying to retain them over reboots is risky. And the mappings need to be able to handle complex things, such as several names pointing to the same device. And people using totally different names. So renames, chmod, chown, unlink and mknods all need to be tracked. So, what we have basically done, at that point, is to reimplement what we already have, but in a more complex way. All for the sake of getting a default entry in there for a virgin system? (Or when would this actually be helpful?) In fact, even more complex - what do we do if someone removed a device entry, for which a device exists? Do we keep track of it in that database, marked as deleted then perhaps? Otherwise it would be recreated at next boot? What about a new kernel? Should we wipe the database? That might not be the right thing to do. Should we keep it? That might also be right - after all, this is a new kernel... We might have added some devices. Should they turn up or not? Nah, I don't see any gains. Only losses. The current entries in /dev are working better than this, in combination with MAKEDEV, which you can run if there is something you do want to add which is missing, with default values. After that, you can fool around with them, and modify them to your heart's content, without anything unexpected happening under your nose when you didn't expect it. Johnny
You're only pointing out that "managing static things statically" is easy. Everyone already knows that. What we're talking about is what we don't have now. Masao
Speaking of tracking state... I've found that keeping track of state
in devfsd is very wrong. It duplicates what filesystems already do.
So what we need for emulating the "traditional" view is a way to proxy
those state bits nicely (probably to tmpfs).
Speaking of persistency, I have come to think it's totally *not* worth it in devfs.
So users have two options:
- Traditional /dev
- Fine grained access control
- Persistent
- Relying on UFS (or whatever)
- Static configuration
- New /dev
- Simplified access control
- Volatile
- Dynamic configuration
Masao
On Fri, 12 Mar 2010 00:35:24 +0900 I'm wondering if it would be too hack-ish to make devfs file backed (at least optionally, in case of early boot or read only rootfs). For example: mount -t devfs /etc/devfs.db /dev So it could be persistent without a userland process. Or is this something that would be complicated? -- NetBSD - Simplicity is prerequisite for reliability
How is that different from mounting devfs and calling mtree next? Joerg
fwiw, that's what I've always been assuming devfs would be doing. It needs *some* place to store the persistent state, regardless of how mtree won't handle renames and whiteouts; otherwise pretty close. eric
On Thu, 11 Mar 2010 18:18:54 +0100 Atomicity (or at least on-line updates to the database). This could also enable you to do mount -t devfs /etc/devfs-bindchroot.db /var/chroot/bind/dev for example. So you can have different permissions at a different location. -- NetBSD - Simplicity is prerequisite for reliability
I don't think a static database will cut it. What happens when someone attaches a new USB stick and devfs generates a bunch of new nodes? What ownership and permissions should they get? Eduardo
Presumably the database holds, among other things, information specifying whether new nodes will appear in such a case, and, if so, what ownership and modes they'll have. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mouse@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
My (current) idea is to expose all devices as perm 0000, then let devfs promote those nodes. As joerg said, this is kind of mtree(8), in that:
- it should use well-established syntax
- it calls mknod(2) internally
It's not like mtree(8) in that:
- it can't hard-code paths.
We'll probably end up with some patterns, but let's not re-invent a new syntax. Masao
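[Editor's sketch] One way to picture "expose as 0000, then promote by pattern" is a small rule table matched with fnmatch(3), which reuses shell glob syntax rather than inventing a new one. This is a sketch only: the rule set, the names promote_rule and promoted_mode, and the choice of shell globs are all assumptions; the message does not say which established syntax would be used.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <sys/types.h>

/* A promotion rule: device names matching `pattern` get `mode`. */
struct promote_rule {
    const char *pattern;  /* shell glob, as understood by fnmatch(3) */
    mode_t mode;
};

/* Hypothetical policy; first match wins. */
const struct promote_rule promote_rules[] = {
    { "ttyU*",    0600 },
    { "dk[0-9]*", 0640 },
    { "null",     0666 },
};
#define N_PROMOTE_RULES (sizeof(promote_rules) / sizeof(promote_rules[0]))

/* Return the mode a node should be promoted to, or 0000 if no rule matches. */
mode_t promoted_mode(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < N_PROMOTE_RULES; i++) {
        if (fnmatch(promote_rules[i].pattern, name, 0) == 0)
            return promote_rules[i].mode;
    }
    return 0000;  /* stay inaccessible by default */
}
```

Defaulting to 0000 means a device nobody wrote a rule for stays invisible to ordinary users, which matches the "expose first, promote later" ordering.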
At Fri, 12 Mar 2010 00:35:24 +0900, Masao Uebayashi <uebayasi@gmail.com> wrote: Indeed -- I do agree with that much at least! I've had diskless systems running for a long while now (since 2003) where /dev is created by init(8) on every boot (by running /sbin/MAKEDEV, as I've renamed it). In the extremely rare cases where I've wanted to change permissions or similar on a device node I can just use the normal commands: chmod 666 /dev/tty001 and if I want to make such a change persistent across boots I just add that exact same command to /etc/rc.local. There's no magic needed. I think the only key feature necessary is that devfs handle the normal permissions and ownership changes, but to do so of course with no more persistence than tmpfs, md, or mfs. -- Greg A. Woods Planix, Inc. <woods@planix.com> +1 416 218 0099 http://www.planix.com/
This wouldn't work very well for hot-plug devices.
As I understand it, nodes would be created at plug time, and removed at unplug
time (correct me if I'm wrong). So you would need to run your chmod
when your e.g. USB device is plugged in (which is also the time at which you
know where it will show up in the device space).
Linux udev can handle this, and it's useful (I've had to do such
special setups at work a few times already).
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
At Fri, 12 Mar 2010 20:22:25 +0100, Manuel Bouyer <bouyer@antioche.eu.org> wrote: Hmmm.... well, we have had "hot plug" devices of a sort ever since 1.6 or earlier (when I began using MFS /dev)..... The only magic trick there is to be able to predict all the possible major and minor numbers at the time you write your MAKEDEV script, or at least be able to update that script as necessary. In the past this has been sufficient, e.g. with SCSI probe and scan detecting new devices. However even that kind of magic really isn't truly necessary. Indeed without devfs it could be as easy as the kernel simply spitting out a message saying that "a device at major N, minor Y" was available to be used (when it was detected), and then leaving it entirely up to the user, or some agent of the user (e.g. a script monitoring for such messages), to run "mknod" as appropriate, and perhaps adjust permissions and ownerships at the same time, possibly even updating /etc/MAKEDEV.local. In fact I've wanted the kernel to tell me what major/minor number(s) to use for new SCSI devices, though to some extent the way MAKEDEV is written to use unit numbers, it works well enough. Obviously there are other ways for the kernel to notify userland of such events as device attach/detach besides having a script monitor /dev/console output or kernel syslog messages. Perhaps kqueue() monitoring /dev itself is sufficient, though perhaps then only for a "flat" file tree in /dev. So, with a devfs implementation that creates the new /dev file node automatically, the agent script could still be responsible for changing permissions and ownerships as desired. I.e. no magic for persistence of filesystem metadata is necessary in devfs so long as there are ways to monitor for and handle events that indicate changes have happened in the live state of the devfs filesystem. -- Greg A. Woods Planix, Inc. <woods@planix.com> +1 416 218 0099 http://www.planix.com/
First, a note - I asked a Linux person I work with why the penguins switched from devfs to udevd. He said that it was a question of pulling relatively complex policy issues out of the kernel into userland, the stance being that things like "users in group pix should be able to access any USB scanner or camera devices that may appear" do not belong in the kernel. I'm sure this forms an argument of some sort for NetBSD's purposes, but I'm not sure which way. For your use cases, yes, perhaps. My use cases too, most of them at least. But there are other use cases (some of them reasonable, even :) which the traditional /dev does not support well, such as the "I want the disk with UUID xyz to appear at some fixed place regardless of whether it's on SCSI, USB, firewire, bluetooth, or what" one that's been mentioned upthread. Those use cases, the ones /dev does not handle well, are what are driving devfs. It may be that a devfs is not a good way to handle them. But /dev definitely is not; I don't see much alternative but to keep trying various things until someone finds something better. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mouse@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Now it's obvious that we need an explicit "switch" to select which one we use: either "static" (legacy, current behavior) or "dynamic" (devfs, hot-plug friendly, but backward incompatible and not standards-conformant). Masao
OK, this is something like config exploring buses with the whole tree image as a recipe. IIUC this is exactly what ACPI needs. You build a whole tree from the ACPI tables, then enter configure(), build cfdata on-the-fly and give it to *_attach(). Bus drivers may have to be changed to pass their subtrees to config_found()... For permissions, they're probably going to be per-mount (== per-view). We should concentrate on physical topology / connection during configure(). Masao
On Wed, Mar 10, 2010 at 10:22:40AM -0500, der Mouse wrote: > Everything starts somewhere. I would never go near Uebayashi-san's > devfs in any of the incarnations described so far on a production > system. But I think it's high time someone started thinking about, and > experimenting with, alternatives to traditional device nodes and device > numbers, and I'm glad to see this happening. ...twelve years ago. http://www.eecs.harvard.edu/syrah/vino/ Apart from the general problems with devfs as a concept (which I've blathered plenty about in this and other discussions) based on that experience there are some pertinent things I can say here: (1) dev_t cannot go away, because a fairly fundamental guarantee in Unix is that two files are the same if stat returns the same (st_dev, st_ino) pair for each. Violate this semantic at your own risk. (2) As Joerg (I think) already noted, it is perfectly sufficient to just number devices as they're attached. There is no particular need to give these numberings semantic significance, or make them persistent across reboot. (Although for nfsd you need to check where your NFS file handles are coming from.) (3) It is also necessary that device nodes continue to appear as device nodes to stat (S_IFBLK, S_IFCHR, etc.) because assorted regrettable things happen if e.g. disk partitions appear to be regular files. Given this, by far the path of least resistance is to fill st_rdev with the same dev_t value already generated. > With that in mind, I'd say that the more radical Uebayashi-san's devfs > is, the less like past (failed) attempts at devfses it is, the more > likely it is to turn out to be a better way. Eliminating (this use of) > dev_t is an example. As the foregoing implies, VINO's devfs had no dev_t, or at least, no semantic dev_t. I would still call it a failure; however, building it did point out at least two important points in addition to the ones above. ... oh, why not. (1) Attaching a device into devfs and ...
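[Editor's sketch] The (st_dev, st_ino) guarantee cited in the first point (1) above, and the weaker st_dev-only test that find -x relies on, look like this with plain POSIX stat(2). The function names same_file and same_filesystem are made up for this illustration; the struct stat fields are standard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Two paths name the same file exactly when stat() reports the same
 * (st_dev, st_ino) pair for both. Returns false if either stat() fails.
 */
bool same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    assert(a != NULL && b != NULL);
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return false;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/*
 * The weaker test used for "stay on this filesystem" traversal:
 * only st_dev is compared. Returns false if either stat() fails.
 */
bool same_filesystem(const char *a, const char *b)
{
    struct stat sa, sb;

    assert(a != NULL && b != NULL);
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return false;
    return sa.st_dev == sb.st_dev;
}
```

Note that neither function looks at st_rdev: st_dev identifies the filesystem a node lives on, which is why it only needs to be a unique number and need not carry any major/minor meaning.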
This dev_t does not have to correspond, though, to anything else in the Oh, they probably shouldn't appear to be ordinary files. (I'm not convinced they can't be; those "regrettable things" could be looked upon as things needing fixing upon switching paradigms.) They need not, however, be traditional character or block device "files". Indeed, I can't offhand see any reason why userland has to even be able to tell whether two of them are the same or not (though it can help at the human level in some cases); as long as opening one connects you to the correct driver, they could be pretty much anything. stat() returning an st_rdev is another of those implementation details that is not necessary but which people have trouble letting go of because they're not willing to bite big enough bullets. procfs and kernfs are examples of filesystems which illustrate that it's possible to have non-"device" entities in the filesystem which, when opened, connect to specialized code. Doing this with a devfs might even involve creating a new type of filesystem entity (S_IFDEV, Only at a very general level, the level of "new stuff appearing in the filesystem", but at that level open(,O_CREAT,) also qualifies. So do other calls; perhaps most relevantly here, consider mknod() - some of the ideas mentioned upthread have involved a userland daemon that That actually does not follow. Attempting to look up the name (as opposed to doing something with an existing name) could be what triggers the load. Of course, that means that the name exists in some sense, but that sense does not have to be one that's visible to userland (while you may want an administrative interface that lets you see them, it is in no way essential). /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mouse@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
On Thu, Mar 11, 2010 at 10:52:53PM -0500, der Mouse wrote: > > (1) dev_t cannot go away, because a fairly fundamental guarantee in > > Unix is that two files are the same if stat returns the same (st_dev, > > st_ino) pair for each. > > This dev_t does not have to correspond, though, to anything else in the > system. Not really, no, but it may as well be the same as what's in st_rdev. > > (3) It is also necessary that device nodes continue to appear as > > device nodes to stat (S_IFBLK, S_IFCHR, etc.) > > No, actually. See below. > > > because assorted regrettable things happen if e.g. disk partitions > > appear to be regular files. > > Oh, they probably shouldn't appear to be ordinary files. (I'm not > convinced they can't be; those "regrettable things" could be looked > upon as things needing fixing upon switching paradigms.) In the best case it's like when naive Linux users first encounter /proc/kcore. The biggest obvious real problem is that you'll probably end up with an extra copy of each disk on your backup tapes. You also get programs that know to avoid device nodes tripping on various special semantic properties some devices have, like blocking for carrier opening ttys or rewinding tapes. These issues could probably be fixed with attributes of some kind, but "I'm a device" is after all exactly the right attribute... Anyhow, I tried it and the other guys on the project made me revert :-) > procfs and kernfs are examples of filesystems which illustrate that > it's possible to have a non-"device" entities in the filesystem which, > when opened, connect to specialized code. Oh sure, and sometime I should write up VINO's kernfs too (it was not a failure) but these work out somewhat differently in practice. The files in procfs and kernfs are for the most part semantically equivalent to real files even when they're virtual or dynamically generated. Devices frequently have other properties. > Doing this with a devfs might even ...
If there still is an st_rdev. I see no particular reason that needs to Disagree. Writing to real files does not, for example, change the system hostname or alter a process's registers. In fact, that sounds a lot like the kind of dangers that inhere in In terms of the end state achieved, neither do I. But there can be value in that programs that haven't been ported are more likely to misbehave if they see a "name" (by which I mean S_IFCHR and S_IFBLK) they think they know the semantics of but with different semantics than In some respects. But lurking under all this has been doing away with st_rdev, which for some programs is a radical enough departure that a new name is deserved. (Others won't care, but I suspect most of them I'm not sure I'd call a filesystem a "foreign object". If that's fair, then the filesystem namespace is _all_ "foreign object"s, and the I'm not sure how fair it is to call it a "proxy object", any more than an S_IFREG inode is a proxy for the big array of bytes (stored elsewhere on the disk) that make up the file's contents. Alternatively, they're all proxies, and the adjective becomes pretty Well, I'm not sure I'd call it "non-devfs", in that you're basically creating either one devfs per device or a devfs which exports only one device per mount, depending on how much of the device-specific part you Well, they don't, really, but not automounting them doesn't solve any Not all that unsolved. I've used at least two automounters, each of which solved it well enough for their purposes. Device automounter config *is* unsolved - in exactly the same way that devfs config is I don't see any real difference between an automounter mounting devices into /dev individually and a devfs making devices appear under a devfs mount. It would even be just a relatively trivial bit of coding to The same thing you would have loaded for the same name in a "touch the Well, there may be value in its not appearing in readdir() output. /~\ The ...
On Sat, Mar 13, 2010 at 08:02:51AM -0500, der Mouse wrote: >>> [st_dev] does not have to correspond, though, to anything else in >>> the system. >> Not really, no, but it may as well be the same as what's in st_rdev. > > If there still is an st_rdev. I see no particular reason that needs to > be preserved. No, except that it is somewhat useful to be able to identify a device node (or at least distinguish it from others) and plenty of existing code expects the st_rdev field to exist. Patching all that is only worthwhile if it accomplishes some purpose, which it wouldn't really. > > The files in procfs and kernfs are for the most part semantically > > equivalent to real files even when they're virtual or dynamically > > generated. Devices frequently have other properties. > > Disagree. Writing to real files does not, for example, change the > system hostname or alter a process's registers. > > In fact, that sounds a lot like the kind of dangers that inhere in > writing to devices indiscriminately, doesn't it? Yes... and no. There's another sense in which /kern/hostname is the same as /etc/passwd: both are text files that affect the system configuration. Changes to both also have immediate operational effects on the running system. The fact that one is not preserved across reboots is a negligible difference from the perspective of some program that might randomly open either. Unexpectedly opening a tty without being prepared to hang indefinitely waiting for carrier-detect is a different class of problem. Many devices also are not like regular files in that you cannot read back what you write to them; /kern/hostname is again a regular file by that standard. I'm not saying that it might not be useful to tag /kern/hostname somehow (and /etc/passwd too) so that certain classes of programs, like say mail delivery tools, can categorically refuse to write to them. But that's kind of a different issue from marking devices... > >> [...] devfs might even ...
If we do away with device numbers - I think that was mentioned - what point does st_rdev have? What meaningful values could you put there? The only fundamental purpose they (as used here) serve is to handle the mapping between filesystem entities and driver instances, and that's And the device driver is part of the conceptual entity that a device Where it is stored is an implementation detail. (And, if it's autoloaded, the driver itself may very well be in the filesystem. Depending on what the driver does, the object, if any, backing it also Depends on exactly what you include under the "devfs" name. There is no devfs in the sense of something being passed to vfs_attach(), but it seems to me that that, like the existence of vfs_attach() at all, is an implementation detail; there is code (mostly in the automounter) that performs the functions we have been attributing to a devfs, and in that I'm not convinced "too many levels of indirection" is fair - and, even if it is, I'm not convinced individual device mounts aren't approximately as bad in that regard. The main purpose it serves, it seems to me, is to collect the relevant code together. Deciding whether to implement the loose conceptual devfs as individual automounted device nodes or a single devfs mount strikes me as a bit of a toss-up; either can perform the fundamental function of a dynamic mapping between filesystem-namespace strings and device drivers. I'm perfectly willing to accept your experience that the single-devfs-mount has operational problems, but, pending someone trying it, I don't believe that the other way doesn't. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mouse@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
On Sun, Mar 14, 2010 at 04:43:18PM -0400, der Mouse wrote: > >> In some respects. But lurking under all this has been doing away > >> with st_rdev, which [...] > > Well, no, we're doing away with a specific interpretation of the > > contents of st_rdev. Getting rid of st_rdev itself doesn't serve much > > further purpose. > > If we do away with device numbers - I think that was mentioned - what > point does st_rdev have? What meaningful values could you put there? > The only fundamental purpose they (as used here) serve is to handle the > mapping between filesystem entities and driver instances, and that's > something that can, and maybe even should, be done differently. There's one other purpose, which is determining if two device inodes encountered in the FS namespace refer to the same device or not. This is not a completely useless property, and if you're going to have some kind of device identity anyway (as is needed for filling in st_dev) st_rdev is a natural place to put it for the device node itself. > >> I'm not sure how fair it is to call it a "proxy object", any more > >> than an S_IFREG inode is a proxy for the big array of bytes (stored > >> elsewhere on the disk) that make up the file's contents. > > But that big array is part of the conceptual entity that the inode > > represents. > > And the device driver is part of the conceptual entity that a device > inode represents. No, it's not, because the device inode belongs to a file system and the driver does not. > > The driver pointed to by a device special file is not part of > > anything in the filesystem. > > Where it is stored is an implementation detail. Yes, which is why it's not *part* of the filesystem. > >> Well, I'm not sure I'd call it "non-devfs", in that you're basically > >> creating either one devfs per device or a devfs which exports only > >> one device per mount, depending on how much of the device-specific > >> part you consider to be part of the ...
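The "do these two device inodes refer to the same device?" test mentioned above can be sketched in userland with stat(2). This is only an illustration under today's interpretation of st_rdev; the helper names same_device_node() and same_device_path() are made up for this example, and a devfs that stores some other kind of device identity in st_rdev would only need the equality comparison to keep working.

```c
#define _XOPEN_SOURCE 700	/* request POSIX/XSI declarations */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Return true if both stat results describe device nodes of the same
 * kind (character or block) that carry the same st_rdev identity.
 * Hypothetical helper, not an existing API.
 */
bool
same_device_node(const struct stat *a, const struct stat *b)
{
	assert(a != NULL && b != NULL);

	bool a_chr = S_ISCHR(a->st_mode), a_blk = S_ISBLK(a->st_mode);
	bool b_chr = S_ISCHR(b->st_mode), b_blk = S_ISBLK(b->st_mode);

	if (!(a_chr || a_blk) || !(b_chr || b_blk))
		return false;		/* not device nodes at all */
	if (a_chr != b_chr || a_blk != b_blk)
		return false;		/* char vs. block mismatch */
	return a->st_rdev == b->st_rdev;
}

/* Convenience wrapper: stat both paths, then compare. False on error. */
bool
same_device_path(const char *p1, const char *p2)
{
	struct stat s1, s2;

	if (stat(p1, &s1) == -1 || stat(p2, &s2) == -1)
		return false;
	return same_device_node(&s1, &s2);
}
```

A caller would use same_device_path("/dev/wd0a", "/some/chroot/dev/wd0a") to notice that two nodes in different places name the same device, which is the property being argued for.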
Sweet jesus. Talk about brittle solutions... Clean shutdown to survive... Yeah, that we can guarantee... Or maybe not... :-( And the extra overhead seems just excessive! Johnny
I'm a little confused, here. How can chmod and chown on /dev/wd0a do anything useful if /dev/wd0a just gets redirected to (say) /dev/default/wd0a? Removing access helps in only a few cases, because someone wishing to bypass the removal can go directly to /dev/default/wd0a. And granting access doesn't help either, because the access will fail on /dev/default/wd0a even if it doesn't on /dev/wd0a. Well, yes. But research efforts are like that. Robustness is pretty much necessary for production use but not for the stage this appears to be at. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mouse@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
I'm not a researcher. I'm an engineer. I like steady move & feasible project. I think everyone agrees that having /dev/id is useful even on its own. So that would be the first step. If people complain about the inability to rename(2) in /dev/id... I should quit using NetBSD. (Fortunately no one has.) Masao
On Thu, Mar 11, 2010 at 01:36:41AM +0900, Masao Uebayashi wrote: > > Well, yes. But research efforts are like that. Robustness is pretty > > much necessary for production use but not for the stage this appears to > > be at. > > I'm not a researcher. I'm an engineer. I like steady move & > feasible project. I am a researcher, and my core area of interest is exactly this kind of problem. If you are looking for a feasible project that can be relied on to move forward, my honest best recommendation is to pick something else. :-| -- David A. Holland dholland@netbsd.org
Perhaps it should be a hard link instead then? Unfortunately, then you The "DB" (whether it's an actual database, or a file on a filesystem, or whatever) shouldn't care about clean shutdown most of the time, only if you happen to crash in the middle of changing things. The contents and permissions of /dev aren't going to be changing much in most normal operation so it seems like optimizing the "DB" to be in a safe state most of the time, even if it makes changes slower, would solve this problem. eric
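One common way to keep such a "DB" in a safe state at all times, at the cost of slower updates, is to write the new contents to a temporary file, flush it, and rename(2) it over the old one. The sketch below assumes the DB is a single flat file; the function name db_replace() and the ".tmp" suffix are invented for illustration and are not part of any proposal in this thread.

```c
#define _XOPEN_SOURCE 700	/* request POSIX/XSI declarations */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Replace the file at 'path' with 'len' bytes from 'buf' so that a
 * crash leaves either the complete old file or the complete new one.
 * Returns 0 on success, -1 on failure.  (For full durability the
 * containing directory should be fsync'ed as well; omitted here.)
 */
int
db_replace(const char *path, const void *buf, size_t len)
{
	char tmp[1024];
	int fd, n;

	assert(path != NULL && buf != NULL);

	n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;

	fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd == -1)
		return -1;

	/* Get the new contents onto stable storage before publishing. */
	if (write(fd, buf, len) != (ssize_t)len || fsync(fd) == -1) {
		close(fd);
		unlink(tmp);
		return -1;
	}
	if (close(fd) == -1 || rename(tmp, path) == -1) {
		unlink(tmp);
		return -1;
	}
	return 0;
}
```

Because rename(2) replaces the target atomically, a reader after an unclean shutdown never sees a half-written file, which is the "safe most of the time" property described above.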
Anyone who can meddle with a root-run process can do a lot worse than that (to start with, mounting that filesystem directly). /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mouse@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Not if the system is running in secure mode. Thor
This is already a problem with dkctl. And anyway, jacking around with the userspace daemon is unnecessarily complicated: if you have sufficient access to do that, you probably have sufficient access to just change the symlink. eric
I want to be able to tell the kernel to mount a device reliably identified by some kind of unique, symbolic name. I want to be able to load a list of permissible such names into the kernel while it's running insecure, and restrict mounting to those and only those when it's running secure. Relying on a userspace daemon for naming makes that impossible. Thor
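The policy described here (load permitted names while insecure, enforce them once secure) can be sketched as a small frozen allow-list. Everything below is hypothetical: struct mount_policy, policy_add(), policy_raise() and policy_may_mount() are illustrative names, the fixed table sizes are arbitrary, and a real kernel implementation would tie the 'secure' flag to securelevel rather than to a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define POLICY_MAX_NAMES	16
#define POLICY_NAME_LEN		64

struct mount_policy {
	char	names[POLICY_MAX_NAMES][POLICY_NAME_LEN];
	int	count;
	bool	secure;		/* once true, the list is frozen */
};

/* Add a permitted symbolic name; only allowed while still insecure. */
int
policy_add(struct mount_policy *p, const char *name)
{
	assert(p != NULL && name != NULL);

	if (p->secure || p->count >= POLICY_MAX_NAMES ||
	    strlen(name) >= POLICY_NAME_LEN)
		return -1;
	strcpy(p->names[p->count++], name);
	return 0;
}

/* One-way transition to secure mode. */
void
policy_raise(struct mount_policy *p)
{
	assert(p != NULL);
	p->secure = true;
}

/* While insecure anything goes; once secure only listed names pass. */
bool
policy_may_mount(const struct mount_policy *p, const char *name)
{
	assert(p != NULL && name != NULL);

	if (!p->secure)
		return true;
	for (int i = 0; i < p->count; i++)
		if (strcmp(p->names[i], name) == 0)
			return true;
	return false;
}
```

The point of the sketch is that the check lives next to the list it consults; if name resolution happened in a userspace daemon instead, the list could not vouch for what the name finally resolves to.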
I don't get it. What kind of devices are you talking about? If the environment is static, you can still use the same identifier as before. If it is not, why do you believe that the device you are dealing with is the one you hoped it is? Joerg
That's a matter for the kernel to decide -- not one for some userspace program which could be tampered with by any process running with euid 0. At least, that is how I would strongly prefer it to be. Thor
But what's to stop someone from mounting a new file system over /bin? Or are you talking about secure_level 2? --Steve Bellovin, http://www.cs.columbia.edu/~smb
I'm talking about trying to build policies which provide some of the guarantees we only provide at securelevel 2 now, but allow more flexibility to do things the administrator's decided ahead of time the system should be allowed to do. Doing this right is not trivial (it may require a signature binding the contents of a medium to its UUID, etc.) but it's certainly not impossible either. Causing all binding of names to devices to run forcibly through a userspace daemon *will* make such enhancements impossible. That would suck. Thor
I think that Joerg's proposal doesn't prevent you from doing what you want, though I don't think it helps, either. He suggested that /dev/uuid and /dev/label just have symlinks to the usual device file, so no user-level daemons would be involved. Those who have your security needs will mount on /dev/usualstuff; those who have topologically confused configurations would use /dev/label/whatever. Many folks will mix and match -- a typical laptop, with only one hard drive, could have / on /dev/usual, while USB sticks and external hard drives would be referenced via the /dev/label symlink. --Steve Bellovin, http://www.cs.columbia.edu/~smb
> I think that Joerg's proposal doesn't prevent you from doing what you want, though I don't think it helps, either. He suggested that /dev/uuid and /dev/label just have symlinks to the usual device file, so no user-level daemons would be involved. He said it has to be done in userland daemon. :) Masao
The userland daemon creates the symlinks but not the device files, I thought. --Steve Bellovin, http://www.cs.columbia.edu/~smb
On Tue, 9 Mar 2010 22:52:17 -0500 That's also my understanding, but since this system is also very simple, even a kernel implementation would probably be nice to do this, in which case users of such a system could add an entry in their /etc/fstab to mount the uuid fs under /dev/uuid/ ? -- Matt
Yes. That's exactly same view with me. Masao
So if you want to lock things down, why not just change the /dev mount to be read-only? Then bump the securelevel, and whoever the daemon is running as won't be able to change anything. eric
I don't understand why the intrinsic properties cannot be found out in
We don't need a second mechanism to handle dk(4), do we? If dk3 should
attach to the volume with GUID 60708090-a0b0-c0d0-e0f0-01020304050, let
the device properties say so:
<plist version="1.0">
<dict>
<key>device-driver</key>
<string>dk</string>
<key>device-unit</key>
<integer>0x3</integer>
<key>guid</key>
<string>60708090-a0b0-c0d0-e0f0-01020304050</string>
</dict>
</plist>
Dave
--
David Young OJC Technologies
dyoung@ojctech.com Urbana, IL * (217) 278-3933
Well, in the case of bt3c(4) it needs to load firmware before you can talk to it and find out the BDADDR. So, you also need to access the disk before it configures.. I don't think the boot up sequence can handle this scenario as yet? In that case, the firmware is loaded when the device is enabled (/etc/rc), not during autoconfig. But if you want to rewrite the autoconfig mechanism so that each xxx_attach() function is called in its own kernel thread so that devices can wait until it's safe to load the firmware then I'm all for it.. when do you plan to allocate the unit number though? iain
I guess that you could split bt3c(4) into upper & lower drivers. The upper driver's responsibility is to load the firmware and to match and attach an instance of the lower driver. Dave -- David Young OJC Technologies dyoung@ojctech.com Urbana, IL * (217) 278-3933
Usually GUID is recorded in partition table. You're viewing things in reverse order... Masao -- Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
I don't see a problem. Let the kernel read the partition table, iterate over the partitions, extract properties from each partition, try to match a dk to each partition by properties (e.g., guid(dk3) == guid(partition 7 at sd0)). If there is a match, take the dk unit number from the matching property list (e.g., dk3). If there is no match, choose a unit number that is used by neither a device_t or a configuration properties list. Dave -- David Young OJC Technologies dyoung@ojctech.com Urbana, IL * (217) 278-3933
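The matching rule described above can be written down as a short, self-contained sketch. It is not kernel code: struct dk_config stands in for the configured device-properties lists, the 'attached' array stands in for the set of existing device_t units, and dk_choose_unit() is a name invented for this illustration. GUIDs are compared as plain strings, which is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* One configured properties list: "dk<unit> belongs to this GUID". */
struct dk_config {
	int		unit;
	const char	*guid;
};

/*
 * Pick the dk unit for a partition with the given GUID.
 *  1. If a configured properties list matches the GUID, use its unit.
 *  2. Otherwise use the lowest unit that is neither attached already
 *     nor reserved by any configured properties list.
 * Returns -1 if no unit is available.
 */
int
dk_choose_unit(const struct dk_config *cfg, int ncfg,
    const bool *attached, int nunits, const char *guid)
{
	assert(cfg != NULL && attached != NULL && guid != NULL);

	for (int i = 0; i < ncfg; i++)
		if (strcmp(cfg[i].guid, guid) == 0)
			return cfg[i].unit;

	for (int unit = 0; unit < nunits; unit++) {
		bool reserved = false;

		if (attached[unit])
			continue;
		for (int i = 0; i < ncfg; i++)
			if (cfg[i].unit == unit)
				reserved = true;
		if (!reserved)
			return unit;
	}
	return -1;
}
```

With a configuration that reserves unit 3 for one GUID, an unknown partition never lands on dk3 even when dk3 is currently free, which is what makes /dev/dk3 a stable handle for the configured volume.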
That way you teach lots of knowledge into dk(4). That's what I don't like to do. Now you pass GUID from kernel config, what is the point to have the predefined unit number 3? Masao -- Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
The code providing DKWEDGE_METHOD_GPT already has the knowledge. I don't think that the knowledge has to move from there. All that dk(4) has to do is to match device-properties lists, and for that it can use The point is to make the device node, /dev/dk3, a reliable handle for the volume. Dave -- David Young OJC Technologies dyoung@ojctech.com Urbana, IL * (217) 278-3933
What if you want to mount a NIC as /? You'll fix all drivers? All of you say that lookup-by-ID works in your way. It's possible, because ID is unique. What I'm talking about is the best design for doing it. Now raidframe(4) already does it itself, why do you have same logic in raidframe(4) and dk(4)? I think dk(4) does too many things. That means you have to re-implement same logic in many places. That also means users have to learn all devices' behavior. The point is you can't rely on device unit numbers of pseudo devices. Masao
Of course you have to fix drivers. Drivers don't extract the device Drivers have to know how to extract properties such as MAC address from their devices. I don't think that we can avoid that. If drivers record the properties that they extract under standard keys, then we can match We can discard the pseudo-devices concept, if need be. We cannot rely on any device's unit numbers, now, if it can change slot/port/chassis. If we extend the set of "locators" to include intrinsic device properties such as MAC address, volume GUID, and serial number, then we can establish a permanent correspondence between a device unit and a physical device. Dave -- David Young OJC Technologies dyoung@ojctech.com Urbana, IL * (217) 278-3933
OK, you want to match device by ID. Like:
fxp* macaddr xx:xx:xx:xx:xx:xx
That might make sense.
What doesn't make sense there is to *fix* device unit number. Device
unit number will be no longer used after devfs, because we lookup
"A library function" is unacceptable to me. This is a substantial
design of device(9) API. This should be a *primitive*.
Device probes, configures, and extracts properties from the real
device. Just before leaving attach(), it *puts* its ID in a
well-known place so that device(9) can lookup these IDs later.
Anything more than this is unacceptable to me. autoconf(9) is already
too complex. I got *huge* frustration to understand it, that's why
In what sense?
As I explained in the first post, a pseudo device has a strict definition;
it has no parent in terms of physical topology. It may have parents
in terms of components. I've very carefully investigated those. I
strictly differentiate them. Please re-read the first post in this
I wonder if we can assume serial numbers are unique.
And again, device unit is no more.
Masao
Remember the cost to fix drivers to extract IDs in match(). Now I see no value doing this. Masao
If a device has no parent, just attach it at root (similar to mainbus*), with parent == NULL, or even pseudo* at root, and pseudo-dev* at pseudo? It is a frustration when building a 'software' device that there are some differences between the methodology of configuration, and it is not possible to pass configuration arguments from userland into the device attach routine.. I think the "pseudo-device" abstraction is unnecessary iain
Could you show one (or more) real example(s) / scenario(s)? That would help to understand problems & clarify requirements... Masao
Well, a line discipline which takes serial IO and converts it into a soft
device which interacts with the rest of the system. In particular example,
dev/bluetooth/btuart.c does that for a bluetooth device. The open routine
is called from the TIOCSLINED ioctl code and does:
	cfdata = malloc(sizeof(struct cfdata), M_DEVBUF, M_WAITOK);
	for (unit = 0; unit < btuart_cd.cd_ndevs; unit++)
		if (device_lookup(&btuart_cd, unit) == NULL)
			break;
	cfdata->cf_name = btuart_cd.cd_name;
	cfdata->cf_atname = btuart_cd.cd_name;
	cfdata->cf_unit = unit;
	cfdata->cf_fstate = FSTATE_STAR;
	dev = config_attach_pseudo(cfdata);
	if (dev == NULL) {
		free(cfdata, M_DEVBUF);
		splx(s);
		return EIO;
	}
	sc = device_private(dev);
	sc->sc_tp = tp;
here, we must find the device softc and insert some information after
attach has finished (tp == tty pointer) because there is no way to pass
that to the btuart_attach() routine.
There was a thread recently regarding extending this driver,
http://archive.netbsd.se/?ml=netbsd-tech-kern&a=2010-01&t=12251898
with Kiyohara wishing to pass some additional configuration that would be
possible with eg
	dev = config_found(NULL, &arg, ...);
and would also solve the (very slight) race condition, and moreover it
would not require malloc of cfdata.
The "parent" argument is mostly unused by autoconfig anyway..
iain
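The difference being asked for (hand the argument to attach, instead of patching the softc afterwards) can be shown with a userland model. Nothing here is the autoconf(9) API: struct softc, soft_attach() and example_attach() are stand-ins invented for this sketch, and the unit table is just an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a driver softc; 'tp' plays the role of the tty pointer. */
struct softc {
	bool	in_use;
	void	*tp;
};

typedef void (*attach_fn)(struct softc *, void *aux);

/*
 * Find the first free unit, mark it used and run the attach routine
 * with the caller's argument, so the softc is fully initialised by
 * the time attach returns.  Returns the unit number, or -1 if every
 * unit is taken.
 */
int
soft_attach(struct softc *units, int nunits, attach_fn attach, void *aux)
{
	int unit;

	assert(units != NULL && attach != NULL);

	for (unit = 0; unit < nunits; unit++)
		if (!units[unit].in_use)
			break;
	if (unit == nunits)
		return -1;

	units[unit].in_use = true;
	attach(&units[unit], aux);
	return unit;
}

/* Example attach routine: receives the tty pointer as its argument. */
void
example_attach(struct softc *sc, void *aux)
{
	sc->tp = aux;
}
```

In this model there is no window between "attach finished" and "sc_tp assigned", which is the (very slight) race mentioned above; whether the real fix is an extra argument to config_attach_pseudo() or something like config_found(NULL, ...) is the open question in the thread.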
On Thu, Mar 11, 2010 at 03:33:27PM +0000, Iain Hibbert wrote: You can use a static cfdata_t, you know. The malloc'ing was a mistake made initially and carried upon since. -- Quentin Garnier - cube@cubidou.net - cube@NetBSD.org "See the look on my face from staying too long in one place [...] every time the morning breaks I know I'm closer to falling" KT Tunstall, Saving My Face, Drastic Fantastic, 2007.
On Thu, Mar 11, 2010 at 03:33:27PM +0000, Iain Hibbert wrote: > > Could you show one (or more) real example(s) / senario(s)? That would > > help to understand problems & clarify requirements... > > Well, a line discipline which takes serial IO and converts it into a soft > device which interacts with the rest of the system. Line disciplines are a bad example, because they're a prehistoric kind of hacked-up bus attachment and as such ought to be rototilled out of existence. -- David A. Holland dholland@netbsd.org
Well, line discipline is a solution to a problem, which is that we want a 'device' in the kernel but the device is not directly accessible and communicates to us through a serial protocol. You can say it's a bad idea all you like, but unless you suggest an alternative solution that doesn't help to remove it. One alternative is to move the translator out of the kernel, eg instead of using the pppd(8) which needs complicated hooks, import userland ppp(8) as per FreeBSD which IIRC provides a tap(4) interface. The argument against that is probably not as strong as it once was as even embedded devices these days can be several orders of magnitude faster than the computers that were prevalent when pppd(8) was written. But then, data rates have improved also - pppd(8) runs on my uhso(4) dongle at up to 180KiB/s and I expect there would still be objections to removing it. Any other solutions you would like to propose? iain
On Fri, Mar 12, 2010 at 09:00:11AM +0000, Iain Hibbert wrote: >>>> Could you show one (or more) real example(s) / senario(s)? That would >>>> help to understand problems & clarify requirements... >>> >>> Well, a line discipline which takes serial IO and converts it into a soft >>> device which interacts with the rest of the system. >> >> Line disciplines are a bad example, because they're a prehistoric kind >> of hacked-up bus attachment and as such ought to be rototilled out of >> existence. > > Well, line discipline is a solution to a problem, which is that we want a > 'device' in the kernel but the device is not directly accessible and > communicates to us through a serial protocol. > > You can say its a bad idea all you like, but unless you suggest an > alternative solution that doesn't help to remove it. I did; bus attachments. That is, instead of just having "com* at pci*" or whatever and all the tty stuff being a legacy blob layer you'd do something like this: attach com at pci with ... attach sl at com with ... attach ppp at com with ... attach tty at com with ... and then connect things up on the fly at runtime using whatever suitable device control tools. This is not necessarily that different from line disciplines in practice (maybe, maybe not), but it's a lot cleaner structurally and it allows this stuff to share common infrastructure with the rest of the device tree. Whatever that infrastructure might be in the long run. -- David A. Holland dholland@netbsd.org
If you pay a little more respect to engineers, you'll find this is almost same as Iain's saying and what I wrote in the first mail. Masao
On Sun, Mar 14, 2010 at 03:33:19PM +0900, Masao Uebayashi wrote: > > I did; bus attachments. > > If you pay a little more respect to engineers, you'll find this is > almost same as Iain's saying and what I wrote in the first mail. huh? he asked me what I meant, I said what I meant... -- David A. Holland dholland@netbsd.org
Although I have 0 knowledge & have no time to learn tty/line disc at the moment, I fully support to fix those *now*. You need struct device. You understand how data/control flow. I think it's perfectly reasonable to make it a device as a "function". Masao
you would set this in stone. It would be easy to extend the current attach_pseudo() function by an "attach args" argument. (An interface attribute should be passed too for consistency because this is used as a qualifier for the opaque "attach args" elsewhere.) I think it hasn't been done just because no one needed it.

For the future, I'd think that the currently unstructured "attach args" needs to be split into 3 parts:

1. Information about the device type, what is needed by child driver's "match" function to select the right driver. This is qualified by the interface attribute.
2. If the hardware supports it, information about the individual instance, as Ethernet HW address, or UUID for disk partitions, to allow drivers to recognize a device after temporary disconnection. This is qualified by the child device type (which can have multiple attachments).
3. everything else: handles, cookies, whatever needed for parent-child communication

For (1), it would make sense to make it a proplist, and pass it to drvctl, along with locator information, to support on-demand loading I think it makes sense, because it allows to limit use of the interface-attribute-less "root" to a minimum. There is a reason that many/most ports use just one "mainbus" at root.

best regards Matthias
------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzende des Aufsichtsrats: MinDir'in Baerbel Brumme-Bothe Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender), Dr. Ulrich Krafft (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. ...
That doesn't make the pseudo-device abstraction necessary. I've sometimes wondered why pseudo-devices weren't handled by creating a pseudo-bus for them to attach at; I did that once, and it was no more than an afternoon's work to have pseudo0 attach at mainbus0 and then my device attach at pseudo0. (I did this because I needed a struct device, but there's no reason it couldn't have been done instead of creating pseudo-devices as we know them today.) /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mouse@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
If we are taking the "interface attribute" abstraction serious this doesn't work easily: A device doesn't attach at a device but at an interface attribute (which is usually provided by a device but can come out of the blue in the pseudo-device case). The "mainbus" and the interfaces provided by it are platform That's what attach_pseudo does... I think it makes sense to have some way to create a device hierarchy without pretending that it is connected to the platform's physical main bus. best regards Matthias
I probably could have done it that way, but it looked easier to do it as a pseudo-bus, and for the purposes at the time, the method didn't mainbus0 does not necessarily correspond to any physical bus. There is other precedent for things in the autoconf tree that do not correspond to anything physical, such as wsdisplay. (Or, more precisely, corresponds to the same physical thing as something else.) /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mouse@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Well, not physical in a sense that you can look at the wires, but commonly it has a number of interface attributes which match the capabilities of the the host bridge and some more platform Yes, devices in the autoconf tree are connected by interfaces (named by interface attributes) which are primarily APIs. Sometimes they correspond to some physical bus semantics, sometimes it is just a software abstraction. Anyway - the interface attributes of "mainbus" are fixed by the platform which makes it a bad choice to attach random MI things at. best regards Matthias
The question is, why do you need device unit number, when we can lookup struct device directly? cfdriver_t -> device_t is a back reference. The only case it makes sense I can think of is device detachment, but that's handled by reference counting. Masao -- Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
On Mon, 08 Mar 2010 01:56:28 +0100
The problem is: Over twenty years ago a hardware reconfiguration was an
infrequent and intrusive task. It required a power down and probably
rewiring the NPR grant chain on your UniBus backplane with a wire wrap
tool... System configuration was static in those good, old days where
Unix machines were administered by a professional sysadmin and cost a
fortune.
Today we have all sorts of hot plug devices. SCSI, SAS, FibreChannel,
(e)SATA, USB, FireWire, PCcard, ExpressCard, hot pluggable PCI(-Express),
Bluetooth, ... System configuration is very dynamic today and every
user is its own Root. We need a better way to deal with this.
Linux had a devfs and dropped it. Now it has udevd(8). Most likely the
penguins had a reason for this. udevd(8) gives the user land control
over device enumeration. Maybe no bad idea. (Disclaimer: I don't like
Linux.)
BTW: OSF/1 aka DEC-Unix aka Tru64-Unix did something like Linux +
udevd(8) over 10 years ago.
--
tschüß,
Jochen
Homepage: http://www.unixag-kl.fh-kl.de/~jkunz/
I took a little glance at OpenSolaris/FreeBSD devfs man pages, and quickly stopped. They're all overly complicated. Those who complain about redundant device paths exposed should look at other implementations. I don't really like to have /etc/devfsd.conf and bikeshed its format. Exposing device tree works, because NetBSD's hardware device abstraction is Good point. We're hopelessly behind. Masao -- Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
Surely there are mailing list messages or something that outline that reason? (Not that I have any idea where they'd be, but don't we have I think it probably is no bad idea. I don't like Linux either, but I don't think it's so irremediably disastrous that there's nothing at all Another reason to think that it's likely worth trying. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTML mouse@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
It is more like: Linux had a devfs and [dropped] it. Now it has udevd(8). Most likely the penguins had a reason for this. Linux had udevd(8) and reintroduced devfs. Now it has udevd(8) and some kind of devfs. Most likely the penguins had a reason for this. - Jukka. http://lwn.net/Articles/331818/
So did SGI, and it was a disaster. If you're going to break the common Unix idiom (single directory full of nodes for devices) you'd better be prepared to replace it with something that's very, very easy and intuitive for experienced Unix administrators to learn.
This page has links to various bits about the move from devfs to udev. http://www.kernel.org/pub/linux/utils/kernel/hotplug/udev.html --Dan
At Sun, 7 Mar 2010 20:50:03 +0000, Quentin Garnier <cube@cubidou.net> wrote: Indeed. This needs carving in stone somewhere, since folks seem to Indeed. -- Greg A. Woods Planix, Inc. <woods@planix.com> +1 416 218 0099 http://www.planix.com/
Careful here: does "end user" mean grandma clicking through KDE, or an admin figuring out why one disk of his raid component disappeared? More precise

I thought this was *one* of the things that devfs was supposed to do. The two features being:

1) provide a way to see detailed information about how devices are laid out

and, the one relevant for this discussion

2) provide stable names for devices that don't change if they happen to be laid out differently today vs. yesterday.

with #2 possibly provided by a userland script that parses the structure provided by #1, plus whatever additional information it needs, and creates symlinks (or otherwise causes device nodes to appear in the right paths)

#2 == simplicity, #1 == transparency and (low level) control

Perhaps I'm muddling the base feature requirements with various ideas for implementations?

eric
Well, even if the admin gives proper names to all the disks and such, yes, there is one moment when they'll be faced with details of the device tree. But to me it's a brief part of the setup, not really of the use. The click-through user has their administrative tasks proxied by things like hald(8), but I expect said proxy to see as little gory details as a human admin as well. There's nothing complicated about what devfs is for: having all the relevant device nodes in /dev, and only those. Anything else is an implementation detail. Neither #1 or #2 are part of the immediate goal of devfs. I don't see why #1 should be a part of a devfs implementation, and #2 is certainly something nice to have, but it goes beyond the initial intent of a devfs. -- Quentin Garnier - cube@cubidou.net - cube@NetBSD.org "See the look on my face from staying too long in one place [...] every time the morning breaks I know I'm closer to falling" KT Tunstall, Saving My Face, Drastic Fantastic, 2007.
> What kind of user do you talk about here?

This is a good question. When I speak on netbsd lists, I have following in mind:

a) Desktop users
They use NetBSD for browsing www, reading/writing mails, playing videos, doing math, studying, text processing, etc. They use a variety of machines, typically notebooks/netbooks and commonly seen x86 platforms.

b) Server users
They use NetBSD for mission critical, high-performance servers like high-load network servers, I/O servers, etc. They use multi-cpu high-performance machines with multiple network / disk interfaces.

c) Embedded users
They use NetBSD for production like routers, printers, cellphones, various home electronics, robots, factory machines, cars, trains, ships, airplanes, submarines, spaceships. They want small and reliable systems.

d) Hobbyists
They run NetBSD on old machines like VAX, Alpha, m68k, SH, PowerPC, MIPS, some production NAS or routers. Most of them are slower than others. They like to hack NetBSD source code.

* I'm a little biased to c), because I'm it myself, but I think all of these

There're some devfs implementations around, and AFAIK there's no standard, right? I came up with my design by myself. So my devfs follows the design of my devfs.

The overall intent is to concentrate the information into the device tree, where we can identify *all* the instances of devices, including ones that don't have any IDs like GUID or MAC address. Each node exactly matches device::dv_xname.

I don't want to make this more complex, like give drivers freedom to decide how they look like. That leads to lots of code added around drivers (xxx_register / xxx_deregister), like mjf's proposal did. My devfs doesn't do that for consistency and simplicity of the implementation.

As pointed out by all of you, the device tree of my devfs can't lookup device IDs. That could be easily realized ...
Lookup-by-ID should be symlink, like:

	/dev/id/guid/25892e17-80f6-415f-9c65-7395632f0223 -> /dev/mainbus0/.../wd0/disk0/gpt0
	/dev/id/ieee802mac/00-b0-d0-86-bb-f7 -> /dev/mainbus0/.../bge0
	/dev/id/ipv4/172.16.0.1 -> /dev/mainbus0/.../bge0/ether0/net0

Masao -- Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
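As a rough illustration of the /dev/id layout shown above, the link name for a given ID can be built from two components, the ID kind and the ID value. The function devid_link_path() is hypothetical and only constructs the name; creating the link itself (for example with symlink(2), pointing at the topology path) and deciding which ID kinds exist are left open, as they are in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "/dev/id/<kind>/<id>" into buf.  Components that are empty,
 * contain '/', or are "." / ".." are refused so that an ID read from
 * hardware or from a disk label cannot escape /dev/id/<kind>/.
 * Returns 0 on success, -1 on a bad component or a too-small buffer.
 */
int
devid_link_path(char *buf, size_t len, const char *kind, const char *id)
{
	const char *parts[2];
	int n;

	assert(buf != NULL && kind != NULL && id != NULL);

	parts[0] = kind;
	parts[1] = id;
	for (int i = 0; i < 2; i++) {
		const char *s = parts[i];

		if (s[0] == '\0' || strchr(s, '/') != NULL ||
		    strcmp(s, ".") == 0 || strcmp(s, "..") == 0)
			return -1;
	}

	n = snprintf(buf, len, "/dev/id/%s/%s", kind, id);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

For the first example above, devid_link_path(buf, sizeof buf, "guid", "25892e17-80f6-415f-9c65-7395632f0223") yields the link name, and the link target would be whatever topology path the device currently has.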
At the risk of being a wet blanket: On Sun, Mar 07, 2010 at 06:43:49PM +0900, Masao Uebayashi wrote: > I've been spending LOTS of time to investigate various devicess sources, to > understand some questions I've had, like: > > - Why NetBSD/arm has no bus_space_mmap(4)? hellifIknow; > - Why tty locking is messy? because ttys are messy, which is because they haven't had a big rototill in some twenty years or more; > - Why sys/dev/wscons has so many #ifdef's? (Modular unfriendly!) dunno; > - How dk(4) is enumerated? in the order found; > a) Device enumeration is unstable / unpredictable > > dk(4) is a pseudo device, and its instances are numbered in the order it's > created. This is fine when you manually / explicitly add wedges(4) by using > "dkctl addwedge". This is not fine, if I have a gpt(4) disk label which has > ordered partitions. I expect disks to be created in the order I write in > the gpt(4) disk label. It's annoying the numbering changes when I add a new > disk. Same for raidframe(4). Why doesn't gpt(4) create the wedges in that order? If it did that they'd come out numbered the way you'd expect. Having the numbering change when you add a new disk is unavoidable. See further notes below. > b) Consistent device topology management is missing > > The reason why NetBSD/arm has no bus_space_mmap(9) has turned out to be the > fact that we have no consistent (MI) way to manage physical address space of > devices. NetBSD/mips has a working bus_space_mmap(9) in > sys/arch/mips/mips/bus_space_alignstride_chipdep.c. It defines address > windows and manage it by itself. > > Who wants to reimplement it on all cpus/ports/platforms? > Considering physical address space is a pretty much simple concept > - a single linear address space. Except when it isn't really; consider for example NUMA systems. I think there have also been systems where different CPUs see a different physical address space view. Whether any ...
In my devfs, devices are enumerated in the local connection.

    /dev/mainbus0/.../piixide0/ata0/wd0
    /dev/mainbus0/.../piixide0/ata1/wd0
    /dev/mainbus0/.../piixide1/ata0/wd0

See the original post. I showed the reversed pseudobus, which I've found
very powerful. More examples:

    /dev/mainbus0/.../screen0/vt100emul0
    /dev/mainbus0/.../screen0/wsmuxout0
    /dev/mainbus0/.../kbd0/wsmuxin0
    /dev/mainbus0/.../mouse0/wsmuxin0
    /dev/pseudobus0/wsmux0/wsmuxout0 -> /dev/mainbus0/.../screen0/wsmuxout0
    /dev/pseudobus0/wsmux0/wsmuxin0 -> /dev/mainbus0/.../kbd0/wsmuxin0
    /dev/pseudobus0/wsmux0/wsmuxin1 -> /dev/mainbus0/.../mouse0/wsmuxin0

Here screen0 has two children, vt100emul0 and wsmuxout0. wsmuxout0 *joins*
wsmux0. kbd0's child wsmuxin0 joins wsmux0 too.

When kbd0 receives a character, it delivers it to wsmuxin0, which in turn
delivers it to wsmux0, which in turn delivers it to wsmuxout0, then finally
screen0. screen0 sends the received character to its child vt100emul0.

With this in place, what multi-head support looks like is pretty
straightforward. bridge(4) + tap(4) + some ether(4) would look exactly
the same.

Masao

--
Masao Uebayashi / Tombi Inc. / Tel: +81-90-9141-4635
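The delivery chain described above (kbd0 to wsmuxin0 to wsmux0 to wsmuxout0 to screen0 to vt100emul0) can be modelled in a few lines. This is only an illustrative sketch, not code from any devfs prototype: the `Node` class, its `targets` list, and `build_example()` are hypothetical names, and the elided "..." part of each path is simply left out, so nodes hang directly off mainbus0.

```python
class Node:
    """Hypothetical device-tree node; not an existing NetBSD structure."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.targets = []  # nodes that received characters are forwarded to
        if parent is not None:
            parent.children.append(self)

    def path(self):
        # Build the /dev path by walking up to the root.
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/dev/" + "/".join(reversed(parts))

    def receive(self, ch, trace):
        # Record the visit, then pass the character on.
        trace.append(self.name)
        for target in self.targets:
            target.receive(ch, trace)


def build_example():
    """Wire up the kbd0 -> wsmux0 -> screen0 example (assumed topology)."""
    mainbus0 = Node("mainbus0")
    pseudobus0 = Node("pseudobus0")
    kbd0 = Node("kbd0", mainbus0)
    screen0 = Node("screen0", mainbus0)
    kbd_in = Node("wsmuxin0", kbd0)        # child of kbd0
    mux_out = Node("wsmuxout0", screen0)   # child of screen0
    emul = Node("vt100emul0", screen0)     # child of screen0
    wsmux0 = Node("wsmux0", pseudobus0)

    kbd0.targets = [kbd_in]
    kbd_in.targets = [wsmux0]      # wsmuxin0 "joins" wsmux0
    wsmux0.targets = [mux_out]     # wsmuxout0 "joins" wsmux0
    mux_out.targets = [screen0]
    screen0.targets = [emul]
    return {"kbd0": kbd0, "wsmuxin0": kbd_in, "screen0": screen0}
```

Calling `receive("a", trace)` on the kbd0 node fills `trace` with the node names in the order the character passes through them, which makes the control flow visible in one place. Adding a second screen under this model would just mean appending another wsmuxout node to `wsmux0.targets`.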
No device unit number? You've lost me. Please fill in the blank,
Actually, that sounds great to me. Then we can, as you suggested at
the top of this thread, create the ether(4) pseudo-device that is
analogous to audio(4). Let us attach a particular ether(4) instance to
an ethernet h/w instance according to the h/w's properties.
Take fxp(4) for an example. Rename it fxphw(4). Let it attach an
ether(4) instance at its ether attribute, using an optional 'basename'
attach argument of 'fxp', so that the ether(4) instance knows that it
should take its customary name, fxp0 (or whatever).
I think that an added benefit of breaking things down this way is that
we can attach >1 ether(4) to a single h/w instance, which makes sense
with those NICs that support >1 unicast address. Maybe we can attach
vlan(4) to the h/w backend's ether attribute, too.
Another added benefit of breaking things down this way is that we may be
able to get rid of the problematic "network" class in PMF.
You seemed to have in mind attaching at fxp0 an ether0, and attaching at
ether0 a net0. What is net0's role?
BTW, I have considered previously that splitting the WLAN drivers into
hardware backends and pseudo-device frontends makes a lot of sense as
you consider the possibility to operate more than one 802.11 station on
a single hardware adapter. It would probably look something like this:
    rtwhw0
      |
      +---net80211arb0
            |
            +---rtw0
            |
            +---rtw1
rtw0 and rtw1 are instances of net80211 state machines. They command
net80211arb0 to send packets and to pass back received packets meeting
certain criteria ("received on channel y, BSSID x, destinations {p, q,
r}"). net80211arb0 will arbitrate access to the hardware. rtwhw0 will
We cannot.
Dave
--
David Young OJC Technologies
dyoung@ojctech.com Urbana, IL * (217) 278-3933
This looks somewhat shortsighted. "ethernet" is no longer an interface in
a technical sense. It is perhaps just a tag put on protocols which use
48-bit MAC addresses and can be bridged to other protocols of that kind.
But then, where do you draw the line? FDDI can be bridged to ethernet as
well, so would you call it "ether"?

Besides that, I don't see how this could solve any real-world problem
which a trivial shell script can't deal with.

best regards
Matthias
It looks to me like the FDDI frame format differs from ethernet's;
however, both frames carry 48-bit source and destination addresses.
The code in fddi_input() and in fddi_output() resembles the code in
ether_input() and in ether_output(); perhaps we can extract some of the
common code into ieee802like_input() and _output() for re-use?

Maybe fxp should have an ieee802like interface for the bridge to use?
Is that what you have in mind? If not, can you please be more specific
about your concerns?

Dave

--
David Young OJC Technologies
dyoung@ojctech.com Urbana, IL * (217) 278-3933
In the devfs world, traditional device names (/dev/xxxN or ifconfig xxxN)
are provided as a short alias. Basically devfs internally walks the whole
tree, counts the base device name you requested ("fxp", "sd",

Actually, I don't know. I'm not familiar with networking. The basic idea is
that if those devices share some ioctl()'s, they should have a superclass.

Masao
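The alias lookup mentioned at the start of this message (walk the tree, count nodes with the requested base name) might look roughly like the sketch below. The description in the mail is cut off, so the details here are assumptions: the walk is depth-first in attach order, alias "xxxN" means the (N+1)-th node whose name is "xxx" plus a unit number, and the tree is a plain nested dict with the elided "..." path components omitted. The names `TREE`, `walk`, and `resolve_alias` are hypothetical.

```python
import re

# Assumed example topology, taken from the piixide/ata/wd paths in this thread.
TREE = {
    "mainbus0": {
        "piixide0": {"ata0": {"wd0": {}}, "ata1": {"wd0": {}}},
        "piixide1": {"ata0": {"wd0": {}}},
    }
}


def base_name(name):
    """Strip the trailing unit number: 'wd0' -> 'wd'."""
    return re.sub(r"\d+$", "", name)


def walk(tree, prefix=""):
    """Yield (path, name) for every node, depth-first, in insertion order."""
    for name, children in tree.items():
        path = prefix + "/" + name
        yield path, name
        yield from walk(children, path)


def resolve_alias(tree, alias):
    """Map a traditional name like 'wd1' to a full devfs path, or None."""
    match = re.fullmatch(r"(.*?)(\d+)", alias)
    if match is None:
        raise ValueError("alias must end in a unit number: %r" % alias)
    base, unit = match.group(1), int(match.group(2))
    count = 0
    for path, name in walk(tree):
        if base_name(name) == base:
            if count == unit:
                return "/dev" + path
            count += 1
    return None
```

Under these assumptions, `resolve_alias(TREE, "wd1")` picks the wd0 under piixide0/ata1, because it is the second "wd" node met during the walk. Note that such aliases still shift when a device is attached earlier in walk order; only the full paths stay stable.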
