The enclosure misc device is really just a library providing sysfs
support for physical enclosure devices and their components.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
---

See the additional ses patch for SCSI enclosure services users of this.

---
 drivers/misc/Kconfig      |   10 +
 drivers/misc/Makefile     |    1 +
 drivers/misc/enclosure.c  |  449 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/enclosure.h |  120 ++++++++++++
 4 files changed, 580 insertions(+), 0 deletions(-)
 create mode 100644 drivers/misc/enclosure.c
 create mode 100644 include/linux/enclosure.h

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index b5e67c0..c6e5c09 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -11,6 +11,7 @@ menuconfig MISC_DEVICES
 
 	  If you say N, all options in this submenu will be skipped and disabled.
 
+
 if MISC_DEVICES
 
 config IBM_ASM
@@ -232,4 +233,13 @@ config ATMEL_SSC
 
 	  If unsure, say N.
 
+config ENCLOSURE_SERVICES
+	tristate "Enclosure Services"
+	default n
+	help
+	  Provides support for intelligent enclosures (bays which
+	  contain storage devices). You also need either a host
+	  driver (SCSI/ATA) which supports enclosures
+	  or a SCSI enclosure device (SES) to use these services.
+
 endif # MISC_DEVICES
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 87f2685..de9f1f5 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_SONY_LAPTOP) += sony-laptop.o
 obj-$(CONFIG_THINKPAD_ACPI) += thinkpad_acpi.o
 obj-$(CONFIG_FUJITSU_LAPTOP) += fujitsu-laptop.o
 obj-$(CONFIG_EEPROM_93CX6) += eeprom_93cx6.o
+obj-$(CONFIG_ENCLOSURE_SERVICES) += enclosure.o
\ No newline at end of file
diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
new file mode 100644
index 0000000..e4683cd
--- /dev/null
+++ b/drivers/misc/enclosure.c
@@ -0,0 +1,449 @@
+/*
+ * Enclosure Services
+ *
+ * Copyright (C) 2008 James ...
Hi James. Nitpicking only. Style purists would request you to put return type and function --
It is now ... plus there are a few other entries in this file with

I tend to prefer the function name always at the beginning of the line.
But even style purists have to agree it's better than trying to futilely
squash all the arguments on separate lines because the return complex

Yes, added.

James

---

From: James Bottomley <James.Bottomley@HansenPartnership.com>
Date: Sun, 3 Feb 2008 15:40:56 -0600
Subject: [SCSI] enclosure: add support for enclosure services

The enclosure misc device is really just a library providing sysfs
support for physical enclosure devices and their components.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
---
 drivers/misc/Kconfig      |    9 +
 drivers/misc/Makefile     |    1 +
 drivers/misc/enclosure.c  |  486 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/enclosure.h |  120 +++++++++++
 4 files changed, 616 insertions(+), 0 deletions(-)
 create mode 100644 drivers/misc/enclosure.c
 create mode 100644 include/linux/enclosure.h

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index b5e67c0..ca68480 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -232,4 +232,13 @@ config ATMEL_SSC
 
 	  If unsure, say N.
 
+config ENCLOSURE_SERVICES
+	tristate "Enclosure Services"
+	default n
+	help
+	  Provides support for intelligent enclosures (bays which
+	  contain storage devices). You also need either a host
+	  driver (SCSI/ATA) which supports enclosures
+	  or a SCSI enclosure device (SES) to use these services.
+
 endif # MISC_DEVICES
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 87f2685..639c755 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_SONY_LAPTOP) += sony-laptop.o
 obj-$(CONFIG_THINKPAD_ACPI) += thinkpad_acpi.o
 obj-$(CONFIG_FUJITSU_LAPTOP) += fujitsu-laptop.o
 obj-$(CONFIG_EEPROM_93CX6) += eeprom_93cx6.o
+obj-$(CONFIG_ENCLOSURE_SERVICES) += enclosure.o
diff --git ...
On Sun, 03 Feb 2008 18:16:51 -0600

This looks a little odd. We don't take a ref on the object after looking
it up, so what prevents some other thread of control from freeing or

Probably "non atomic context" would be more accurate.

It would be less fuss if this were to test cb before doing the kzalloc().

See, right now, someone who found this enclosure_device via

hrm, we do this conversion about 1e99 times in the kernel and we have to go

So if an application does write(fd, "foo", 3) it won't work? They have to
do write(fd, "foo\n", 4)

Nice looking driver.
--
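The write(fd, "foo", 3) versus write(fd, "foo\n", 4) question can be illustrated with a small user-space sketch. This is not code from the patch: `token_matches` is a hypothetical helper, and it only shows one way a store routine could accept a value with or without a single trailing newline or NUL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): compare a write buffer of
 * 'count' bytes against 'token', tolerating one trailing '\n' or '\0'.
 * The buffer is treated as raw bytes, not as a C string.
 */
bool token_matches(const char *buf, size_t count, const char *token)
{
	size_t len = strlen(token);

	/* Drop a single trailing newline or NUL, as echo or printf may add. */
	if (count > 0 && (buf[count - 1] == '\n' || buf[count - 1] == '\0'))
		count--;

	/* After trimming, the lengths and the bytes must match exactly. */
	return count == len && memcmp(buf, token, len) == 0;
}
```

With a check of this shape, both `echo foo > attr` and a bare three-byte write would be accepted, while longer strings such as "foobar" would still be rejected. In-kernel code would more likely rely on an existing helper such as sysfs_streq() where one is available.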
The use case is for enclosure destruction, so the free should never
"should" to me means you don't have to do this but ought to. I'll add a
I just followed precedent ;-P
There doesn't seem to be a define for this maximum length, so 40 is the
No ... it's designed for echo; however, I'll add a check for '\0' which
OK ... I was just following precedent again, but I can make them
Thanks,
James
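
The reference-on-lookup concern raised in the review can be sketched in user space as follows. This is an illustration only, under stated assumptions: `struct encl`, `encl_add`, `encl_find` and `encl_put` are hypothetical stand-ins (the patch itself uses the class device reference count), and a pthread mutex stands in for the kernel mutex. The point shown is that the reference is taken before the list lock is dropped, so the object cannot be freed between lookup and use.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct enclosure_device. */
struct encl {
	const void *dev;	/* key the lookup matches on */
	int refcount;		/* protected by list_lock in this sketch */
	struct encl *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct encl *encl_list;

/* Register an object; the list holds the initial reference. */
void encl_add(struct encl *e)
{
	pthread_mutex_lock(&list_lock);
	e->refcount = 1;
	e->next = encl_list;
	encl_list = e;
	pthread_mutex_unlock(&list_lock);
}

/* Look up by device; on success the caller owns one extra reference. */
struct encl *encl_find(const void *dev)
{
	struct encl *e;

	pthread_mutex_lock(&list_lock);
	for (e = encl_list; e; e = e->next) {
		if (e->dev == dev) {
			e->refcount++;	/* taken before the lock is dropped */
			break;
		}
	}
	pthread_mutex_unlock(&list_lock);
	return e;	/* NULL if nothing matched */
}

/*
 * Drop a reference and return the remaining count. Unlinking and
 * freeing at zero are left out of this sketch.
 */
int encl_put(struct encl *e)
{
	int left;

	pthread_mutex_lock(&list_lock);
	left = --e->refcount;
	pthread_mutex_unlock(&list_lock);
	return left;
}
```

Every successful `encl_find()` must be paired with an `encl_put()`, which mirrors the documentation change in the incremental diff that follows.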
---
Here's the incremental diff.
diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
index 42e6e43..6fcb0e9 100644
--- a/drivers/misc/enclosure.c
+++ b/drivers/misc/enclosure.c
@@ -39,7 +39,8 @@ static struct class enclosure_component_class;
*
* Looks through the list of registered enclosures to see
* if it can find a match for a device. Returns NULL if no
- * enclosure is found.
+ * enclosure is found. Obtains a reference to the enclosure class
+ * device which must be released with class_device_put().
*/
struct enclosure_device *enclosure_find(struct device *dev)
{
@@ -48,6 +49,7 @@ struct enclosure_device *enclosure_find(struct device *dev)
mutex_lock(&container_list_lock);
list_for_each_entry(edev, &container_list, node) {
if (edev->cdev.dev == dev) {
+ class_device_get(&edev->cdev);
mutex_unlock(&container_list_lock);
return edev;
}
@@ -66,8 +68,9 @@ EXPORT_SYMBOL_GPL(enclosure_find);
* Loops over all the enclosures calling the function.
*
* Note, this function uses a mutex which will be held across calls to
- * @fn, so it must have user context, and @fn should not sleep or
- * otherwise cause the mutex to be held for indefinite periods
+ * @fn, so it must have non atomic context, and @fn may (although it
+ * should not) sleep or otherwise cause the mutex to be held for
+ * indefinite periods
*/
int enclosure_for_each_device(int (*fn)(struct enclosure_device *, void *),
void *data)
@@ -107,14 +110,11 @@ enclosure_register(struct device *dev, const char *name, int components,
...

Who is the target audience/user of those facilities?
a) The kernel itself needing to read/write SES pages?
b) A user space application using sysfs to read/write
   SES pages?

At the moment SES device management is done via
an application (user-space) and a user-space library
used by the application and /dev/sgX to send SCSI
commands to the SES device.

One could have a very good argument to not bloat
the kernel with this but leave it to a user-space
application and a library to do all this and
communicate with the SES device via the kernel's /dev/sgX.

   Luben
--
That depends on the enclosure integration, but right at the moment, it Not an application so much as a user. The idea of sysfs is to allow I must have missed that when I was looking for implementations; what's the URL? But, if we have non-scsi enclosures to integrate, that makes it harder for a user application because it has to know all the implementations. A sysfs framework on the other hand is a universal known thing for the The same thing goes for other esoteric SCSI infrastructure pieces like cd changers. On the whole, given that ATA is asking for enclosure management in kernel, it makes sense to consolidate the infrastructure and a ses ULD is a very good test bed. James --
Exactly the same argument stands for a user-space
application with a user-space library.
This is the classical case of where it is better to
do this in user-space as opposed to the kernel.
The kernel provides capability to access the SES
device. The user space application and library
provide interpretation and control. Thus if the
enclosure were upgraded, one doesn't need to
upgrade their kernel in order to utilize the new
capabilities of the SES device. Plus upgrading
a user-space application is a lot easier than
the kernel (and no reboot necessary).
Consider another thing: vendors would really like
unprecedented access to the SES device in the enclosure
so as your ses/enclosure code keeps state it would
get out of sync when vendor user-space enclosure
applications access (and modify) the SES device's
pages.
You can test this yourself: submit a patch
that removes SES /dev/sgX support; advertise your
I'm not aware of any GPLed ones. That doesn't
necessarily mean that the best course of action is
to bloat the kernel. You can move your ses/enclosure
stuff to a user space application library
So does the kernel. And as I pointed out above, it
is a lot easier to upgrade a user-space application and
library than it is to upgrade a new kernel and having
What is wrong with exporting the SES device as /dev/sgX
and having a user-space application and library to
do all this?
Luben
--
No, think again ... it's easy for SES based enclosures because they have a SCSI transport. We have no transport for SGPIO based enclosures nor for any of the other more esoteric ones. That's not to say it can't be done, but it does mean that it can't be How do you transport the enclosure commands over /dev/sgX? Only SES has SCSI command encapsulation ... the rest won't even be SCSI targets ... James --
I guess the same could be said for STGT and SCST, right? LOL, no seriously, this is unnecessary kernel bloat, But it would be trivial exercise to show that an inconsistent state can be had by modifying pages of the SES device directly from userspace bypassing I've none at the moment, plus I don't think you'd be the point of contact for a user-space SES library. Unless of course you've already started something up on sourceforge. Really, such an effort already exists: it is called Yes, for which the transport layer, implements the scsi device node for the SES device. It doesn't really matter if the SCSI commands sent to the SES device go over SGPIO or FC or SAS or Bluetooth or I2C, etc, the transport layer can implement that and present the /dev/sgX node. Case in point: the protocol FW running on the ASIC provides this capability so really the LLDD would only see the pure SCSI SES or processor device and register that with the kernel. At which point no new kernel bloat is required. Your code doesn't quite do that at the moment as it actually goes further in to read and present SES pages. Ideally it would simply provide capability for transport layers to register a SCSI device of type SES, or processor. Architecturally, the LLDD/transport layer would register the SGPIO device on one end with the SGPIO layer and on the other end as a SCSI SES/processor device. After that What is the protocol of those "rest" that you talk about? At any rate, this capability lies in the kernel providing a _device node_ -- not quite what your patch is doing. Luben --
I don't think so ... if you actually look at the code, you'll see it But it does matter if the enclosure device doesn't speak SCSI. SGPIO isn't a SCSI protocol ... it's a general purpose serial bus protocol. It's pretty simple and register based, but it might (or might not) be Yes, it provides a glue between the enclosure services and the SES That's possible, but none of these layers exist yet ... although I think So your idea is to provide a separate interface per enclosure in kernel? Sure ... like I said patches welcome. I just did a common in-kernel interface that abstracts common enclosure services. James --
No, you know very well what I mean. By the same logic you're preaching to include your solution part of the kernel, you can also apply to Enclosure management isn't as simple as you're portraying it here. The enclosure management device speaks either SES or SAF-TE. The transport I see. You've just discovered SGPIO -- good for you. At any rate, I told you already that what is needed is not what you've provided but a _device node_ exported by the kernel, either a processor or You're jumping over layers here. And that's not what's needed. What is needed is an instrumentation by the kernel to present the enclosure device as a device node to user space so that general utilities tools, like for example sg_ses(8) can work with it. Your solution seems to be interpreting SES in the kernel and this is really unwieldy. Just provide the device node to user space. You shouldn't have to interpret SES in the kernel as you're currently doing since the kernel itself Vendors prefer to abstract the transport protocol in HW/FW so that to ULD the device is of type processor or enclosure. This makes it easy for the vendor (in terms of support) and for the customer (in terms of You should separate "what it talks" by "how it No. My idea is to provide a device node of type processor or enclosure for each enclosure. Expanders already do this by providing a virtual phy which is "connected" to the SES-2 device, so a device of type SES is discovered and registered with the kernel as any other device on the domain. Then the enclosure can be controlled via sg_ses(8) from user space. Host adapters already do this in their FW, since it is there where they do the transport protocol abstraction as well. You can help AHCI (if this really is needed of course) to achieve the same thing by providing code so that a LLDD/etc can register a processor/enclosure device accessible via the ASIC. At any rate the actual protocol to talk to the enclosure device is abstracted ...
Ah, but it's not ... the current patch is merely exporting an interface. The debate in STGT vs SCST is not whether to export an interface but where to draw the line. You could also argue in the same vein that sd is redundant because a filesystem could talk directly to the device via /dev/sgX (in fact OSD based filesystems already do this). The argument is true, but misses the bigger picture that the interfaces exported by sd are more portable Look, just read the spec; SGPIO is a bus for driving enclosures ... it Wrong ... we don't export non-SCSI devices as SCSI (with the single and rather annoying exception of ATA via SAT). James --
"draw the line" -- I see. BTW, what is wrong with "exporting the interface"? What is wrong if both implementations are in the kernel and then let the users and distros decide which one they like best and use more? It'll not be the first time this has happened in the kernel. Both are actively maintained. It seems highly arbitrary to say: "X is in the kernel, Y is not. If you want Y, just forget about it and fix X." Give people choice at config time. Yes, I've mentioned this thing before on this list. Oh, maybe 3 years ago. This is why I had wanted for transport protocols to export ... (oh, let's not get this off topic). It isn't quite the same thing. It's like comparing I thought Serial General Purpose Input Output (SGPIO) was a method to serialize general purpose That's true. And this is why I mentioned a couple of emails ago to simply export a sgpio device node *IF* this is what is needed. Of course devices that use SGPIO abstract it away for their functional purpose, e.g. enclosures, LED, etc, and provide a more general way to control it -- highly hardware specific on one side. Your abstraction currently deals with "SES" devices and I'd rather leave that to user-space. Alternatively, which I presume is what you're thinking, a HW specific core would be using your "abstraction" to provide some unified access to raw features, and that "unified access" isn't defined anywhere, and would likely not be. Alternatively that "unified access" is things like SES and SAF-TE, which is what vendors prefer to export, or they prefer to drive this directly via other means. That is, I fail to see the kernel bloat, for things that aren't necessary in the kernel. If you want your abstraction to fly, it first needs a common usage model to abstract, and the latter is missing _from the kernel_. Unless I don't know the details and you've been I didn't say you should do that. I had already mentioned that vendors export such controls as either enclosure or ...
Exactly, so the first patch in this series (a while ago now) was a common usage model abstraction of enclosures, and the second was an implementation in terms of SES. I will do one in terms of SGPIO as You can do it in user space as well. It's just a bit difficult to get information out of a SES enclosure without using it, and getting some of the information is a requirement of the abstraction. James --
^^^^^^^^^^^ The vendor would've abstracted that away most commonly You missed my point. Your abstraction is redundant and arbitrary -- it is not based on any known, in-practice, usage model, already in place that needs a better, common way of doing XYZ, and therefore needs an abstraction. Luben --
On Mon, 4 Feb 2008 18:01:36 -0800 (PST)

Hi,
I apologize for taking so long to review this patch. I obviously agree
wholeheartedly with Luben. The problem I ran into while trying to design
an enclosure management interface for the SATA devices is that there is
all this vendor defined stuff. For example, for the AHCI LED protocol,
the only "defined" LED is 'activity'. For LED2 and LED3 it is up to
hardware vendors to define these. For SGPIO there's all kinds of ways
for hw vendors to customize. I felt that it was going to be a
maintenance nightmare to have to keep track of various vendors'
enclosure implementations in the ahci driver, and that it'd be better to
just have user space libraries take care of that. Plus, that way a
vendor doesn't have to get a patch into the kernel to get their new
spiffy wizzy bang blinky lights working (think of how long it takes
something to even get into a vendor kernel, which is what these guys
care about...). So I'm still not sold on having an enclosure abstraction
in the kernel - at least for the SATA controllers.

Kristen
--
Correct me if I'm wrong, but didn't the original AHCI enclosure patch
expose activity LEDs via sysfs?

I'm not saying there aren't a lot of non standard pieces that need to be
activated by direct commands or other user activated protocol. I am
saying there are a lot of standard pieces that we could do with showing
in a uniform manner.

The pieces I think are absolutely standard are

     1. Actual enclosure presence (is this device in an enclosure)
     2. Activity LED, this seems to be a feature of every enclosure.

I also think the following are reasonably standard (based on the fact
that most enclosure standards recommend but don't require this):

     3. Locate LED (for locating the device). Even if you only have an
        activity LED, this is usually done by flashing the activity LED
        in a well defined pattern.
     4. Fault. This is the least standardised of the lot, but does seem
        to be present in about every enclosure implementation.

All I've done is standardise these four pieces ... the services actually
take into account that it might not be possible to do certain of these
(like fault).

James
--
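
One way to picture "standardise these four pieces" while allowing that some of them may be unavailable is a per-slot callback table in which any entry may be absent. The sketch below is a user-space illustration under stated assumptions, not the interface from the patch: `struct slot`, `struct slot_ops`, the `slot_set_*` wrappers and `demo_set_locate` are all hypothetical names, and `-ENOTSUP` is an arbitrary choice of error for a missing service.

```c
#include <errno.h>
#include <stddef.h>

enum encl_setting { ENCL_OFF = 0, ENCL_ON = 1 };

struct slot;

/* Hypothetical callback table; any entry may be NULL if unsupported. */
struct slot_ops {
	int (*set_active)(struct slot *s, enum encl_setting v);
	int (*set_locate)(struct slot *s, enum encl_setting v);
	int (*set_fault)(struct slot *s, enum encl_setting v);
};

/* Hypothetical per-slot state. */
struct slot {
	int number;	/* slot index within the enclosure */
	int present;	/* point 1: is a device in this slot? */
	int locate;	/* last locate setting applied by the backend */
	const struct slot_ops *ops;
};

/* Each wrapper reports -ENOTSUP when the backend lacks the service. */
int slot_set_active(struct slot *s, enum encl_setting v)
{
	if (!s->ops || !s->ops->set_active)
		return -ENOTSUP;
	return s->ops->set_active(s, v);
}

int slot_set_locate(struct slot *s, enum encl_setting v)
{
	if (!s->ops || !s->ops->set_locate)
		return -ENOTSUP;
	return s->ops->set_locate(s, v);
}

int slot_set_fault(struct slot *s, enum encl_setting v)
{
	if (!s->ops || !s->ops->set_fault)
		return -ENOTSUP;
	return s->ops->set_fault(s, v);
}

/* Example backend that implements only the locate service. */
int demo_set_locate(struct slot *s, enum encl_setting v)
{
	s->locate = (v == ENCL_ON);
	return 0;
}
```

A backend that can only drive a locate LED would fill in `set_locate` and leave the other entries NULL; callers then see a uniform error for fault or activity instead of needing to know which enclosure type they are talking to.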
On Tue, 12 Feb 2008 12:45:35 -0600 You are sort of wrong. We exposed a sysfs entry to enable software controlled activity LED, then the driver was responsible for turning it I understand what you are trying to do - I guess I just doubt the value you've added by doing this. I think that there's going to be so much customization that system vendors will want to add, that they are going to wind up adding a custom library regardless, so standardising those few things won't buy us anything. --
It depends ... if you actually have a use for the customisations, yes.
If you just want the basics of who (what's in the enclosure), what
(activity) and where (locate) then I think it solves your problem almost
entirely.

So, entirely as a straw horse, tell me what else your enclosures provide
that I haven't listed in the four points. The SES standards too provide
a huge range of things that no-one ever seems to implement (temperature,
power, fan speeds etc).

I think the users of enclosures fall into these categories

85% just want to know where their device actually is (i.e. that sdc is
    in enclosure slot 5)
50% like watching the activity lights
30% want to be able to have a visual locate function
20% want a visual failure indication (the other 80% rely on some OS
    notification instead)

When you add up the overlapping needs, you get about 90% of people happy
with the basics that the enclosure services provide. Could there be
more ... sure; should there be more ... I don't think so ... that's what
value add the user libraries can provide.

James
--
On Tue, 12 Feb 2008 13:28:15 -0600 I don't think I'm arguing whether or not your solution may work, what I am arguing is really a more philosophical point. Not "can we do it this way", but "should we do it this way". I am of the opinion that management belongs in userspace. I also am of the opinion that if you can successfully accomplish something in user space, you should. I also believe that even if you provide this basic interface, all system vendors are going to provide libraries on top of that to customize it, so you've not added much value to just a simple message passing interface. So, I'm happy to defer to Jeff's judgement call here - I just want to do what's right for our customers and get an enclosure management interface for SATA exposed, preferably in time for the 2.6.26 merge window. If he prefers your design, I'll disagree, but commit to his decision and try to get this to work for SATA. If he'd rather see something along the lines of what I proposed, then since it is 100% self contained in the SATA subsystem, it shouldn't impact whatever you want to do in the SCSI subsystem. Jeff? --
I'm not necessarily arguing against that. However, what you're providing is slightly more than just a userspace tap into the enclosure. You're adding a file to display and control the enclosure state (sw_activity). This constitutes an ad-hoc sysfs interface. I'm not telling you not to do it, but I am pleading that if we have to have all these sysfs interfaces, let's at least do it in a uniform way. Enclosures are such nasty beasts, that even the job of getting a tap into them is problematic, so if we have a different tap infrastructure for every different enclosure type and connection it's still going to be James --
Hw abstraction is still kernel's job. That's why we have leds exported in sysfs... let vendors have their libraries, but let's put the 'everyone does these' stuff in kernel. -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html --
Which is already the case without the SES kernel bloat.
Case in point, the excellent user-space application
"lsscsi" would clearly show which device is SES.
And the excellent user-space application "sg_ses" could
So that means that it needs a kernel representation?
If this indeed were the case, for every "feature" of every
type of device (not only SCSI) then the kernel itself would
And none of this means that it needs a kernel representation.
1. You're not "standardizing" any known, in-practice,
kernel representation, that is already in practice and
thusly needs a kernel representation.
2. The kernel itself is not using nor needing this
"representation" in order to function properly (the kernel).
Leaving control of SES devices to user-space makes both
the kernel and the vendors happy. All the kernel needs
to do is expose the SES device to user-space as it currently
does. It makes it so much easier both to vendors and to
the kernel to stay out of unnecessary representations.
Vendors may choose to distribute their own applications
to control their hardware, as long as the kernel exposes
an SES device and provides functionality, as opposed to
policy of any kind.
Luben
--
The keep-it-in-user-space arguments seem fairly compelling to me. Especially as we've pushed whole i/o subsystems out to user space (iscsi, stgt, talked about fcoe, a lot of dm control, etc). The functionality seems to align with Doug's sg/lsscsi utility chain as well. Granted, the new utility would have to be designed in such a way that it can incorporate vendor "hardware handlers". But, if James has a somewhat common implementation already for a kernel implementation, I'm sure that can be the starting point for lsscsi. So, the main question I believe is being asked is: - Do we need to represent this via the object/sysfs tree or can an outside utility be depended upon to show it ? Note that I am not supporting: "Vendors may choose to distribute their own applications". For this to become truly useful, there needs to be a common tool/method that presents common features in a common manner, regardless of whether it is in kernel or not. -- james s --
I don't disagree with that, but the fact is that there isn't such a tool. It's also a fact that the enterprise is reasonably unhappy with the lack of an enclosure management infrastructure, since it's something they got on all the other unix systems. I think a minimal infrastructure in-kernel does just about everything the enterprise wants ... and since it's stateless, they can always use direct connect tools in addition. However, I'm happy to be proven wrong ... anyone on this thread is welcome to come up with a userland enclosure infrastructure. Once it does everything the in-kernel one does (which is really about the minimal possible set), I'll be glad to erase the in-kernel one. James --
yeah, but... putting something new in, only to pull it later, is a bad paradigm for adding new mgmt interfaces. Believe me, I've felt users' pain in the reverse flow : driver-specific stuff that then has to migrate to upstream interfaces, complicated by different pull points by different distros. You can migrate a management interface, but can you really remove/pull one out ? Isn't it better to let the lack of an interface give motivation to create the "right" interface, once the "right way" is determined - which is what I thought we were discussing ? or is this simply that there is no motivation until something exists, that people don't like, thus they become motivated ? -- james s --
That depends on the result. I agree that migration will be a pain, so I suppose I set the bar a bit low; the user tool needs to be a bit more compelling; plus I'll manage the interface transition ... if there is Well ... I did learn the latter from Andrew, so I thought I'd try it. It's certainly true that the enclosure problem has been an issue for over a decade, so there doesn't seem to be anything motivating anyone to solve it. I wouldn't have bothered except that I could see ad-hoc in-kernel sysfs solutions beginning to appear. At least this way they can all present a unified interface. James --
If this is true, and if no one quickly volunteers to do the utility, then I agree with what you are doing. -- james s --
And I agree wholeheartedly with Kristen. Luben --
