* Differentiate between bay devices in dock stations and others:
- When an ACPI_NOTIFY_EJECT_REQUEST appears, just signal uevent to
userspace (that is when the optional eject button on a bay device is
pressed/pulled) giving the possibility to unmount file systems and to
clean up. Also, only send uevent in case we get an EJECT_REQUEST
without doing anything else. In other cases, you'll get an add/remove
event because libata attaches/detaches the device.
- In case of a dock event, which in turn signals an
ACPI_NOTIFY_EJECT_REQUEST, immediately detach the device, because it
may already have been gone
* In case of an ACPI_NOTIFY_DEVICE/BUS_CHECK, evaluate _STA to check if
the device has been plugged or unplugged. If plugged, hotplug it, if
unplugged, just signal event to userspace
(initial patch by Matthew Garrett <mjg59@srcf.ucam.org>)
* Call ACPI _EJ0 for detached devices
Signed-off-by: Holger Macht <hmacht@suse.de>
---
Changes regarding the previous patch:
* Make sure kobj does not go away outside locking
* Coding style cleanups
* Call _EJ0 for detached devices
diff --git a/drivers/ata/libata-acpi.c b/drivers/ata/libata-acpi.c
index dbf6ca7..e0e49fd 100644
--- a/drivers/ata/libata-acpi.c
+++ b/drivers/ata/libata-acpi.c
@@ -118,8 +118,40 @@ static void ata_acpi_associate_ide_port(struct ata_port *ap)
ap->pflags |= ATA_PFLAG_INIT_GTM_VALID;
}
+static void ata_acpi_eject_device(acpi_handle handle)
+{
+ struct acpi_object_list arg_list;
+ union acpi_object arg;
+
+ arg_list.count = 1;
+ arg_list.pointer = &arg;
+ arg.type = ACPI_TYPE_INTEGER;
+ arg.integer.value = 1;
+
+ if (ACPI_FAILURE(acpi_evaluate_object(handle, "_EJ0",
+ &arg_list, NULL)))
+ printk(KERN_ERR "Failed to evaluate _EJ0!\n");
+}
+
+static void ata_acpi_detach_device(struct ata_port *ap, struct ata_device *dev)
+{
+ if (dev)
+ dev->flags |= ATA_DFLAG_DETACH;
+ else {
+ struct ata_link *tlink;
+ struct ata_device ...Acked-by: Tejun Heo <htejun@gmail.com> -- tejun --
This might look nicer (and safer) if it used the
union acpi_object arg = {
.foo = bar,
...
};
It would be better if the printk were to self-identify where it is
coming from. If your kernel just blurts "Failed to evaluate _EJ0!"
it's a bit of a head-scratcher.
I guess the significance of `dev==0' is known to those who are steeped
in ata_acpi_handle_hotplug() knowledge, but it's pretty inscrutable to
the occasional visitor. Some comments would help.
--
ata_port_printk() would be better. Thanks. -- tejun --
Hm. This deadlocks my Thinkpad when I pull a device from the internal bay. -- Matthew Garrett | mjg59@srcf.ucam.org --
Can you please give a bit more details? Does your userspace react on the eject request? I tested with Thinkpads, too. - Holger --
No, I'm not doing anything with userspace - just pulling the drive out directly. I'm looking into it now. -- Matthew Garrett | mjg59@srcf.ucam.org --
This was one of my test scenarios. Just a hint for debugging, before my patch, HAL accessed /dev/sr0 directly after removing the device, and that caused the lockup. Hopefully that doesn't bite us again... - Holger --
Running with init=/bin/bash, I still get it. The only kernel output I get is libata telling me that the device is disabled, and the machine is dead after that. -- Matthew Garrett | mjg59@srcf.ucam.org --
Does this fix the problem?
---
drivers/ata/libata-acpi.c | 166 +++++++++++++++++++++++++++++++---------------
1 file changed, 115 insertions(+), 51 deletions(-)
diff --git a/drivers/ata/libata-acpi.c b/drivers/ata/libata-acpi.c
index dbf6ca7..1ac3e8f 100644
--- a/drivers/ata/libata-acpi.c
+++ b/drivers/ata/libata-acpi.c
@@ -118,12 +118,63 @@ static void ata_acpi_associate_ide_port(struct ata_port *ap)
ap->pflags |= ATA_PFLAG_INIT_GTM_VALID;
}
-static void ata_acpi_handle_hotplug(struct ata_port *ap, struct ata_device
- *dev, u32 event)
+static void ata_acpi_eject_device(acpi_handle handle)
+{
+ struct acpi_object_list arg_list;
+ union acpi_object arg;
+
+ arg_list.count = 1;
+ arg_list.pointer = &arg;
+ arg.type = ACPI_TYPE_INTEGER;
+ arg.integer.value = 1;
+
+ if (ACPI_FAILURE(acpi_evaluate_object(handle, "_EJ0",
+ &arg_list, NULL)))
+ printk(KERN_ERR "Failed to evaluate _EJ0!\n");
+}
+
+/* @ap and @dev are the same as ata_acpi_handle_hotplug() */
+static void ata_acpi_detach_device(struct ata_port *ap, struct ata_device *dev)
+{
+ if (dev)
+ dev->flags |= ATA_DFLAG_DETACH;
+ else {
+ struct ata_link *tlink;
+ struct ata_device *tdev;
+
+ ata_port_for_each_link(tlink, ap)
+ ata_link_for_each_dev(tdev, tlink)
+ tdev->flags |= ATA_DFLAG_DETACH;
+ }
+
+ ata_port_freeze(ap);
+ ata_port_schedule_eh(ap);
+}
+
+/**
+ * ata_acpi_handle_hotplug - ACPI event handler backend
+ * @ap: ATA port ACPI event occurred
+ * @dev: ATA device ACPI event occurred (can be NULL)
+ * @event: ACPI event which occurred
+ * @is_dock_event: boolean indicating whether the event was a dock one
+ *
+ * All ACPI bay / device realted events end up in this function. If
+ * the event is port-wide @dev is NULL. If the event is specific to a
+ * device, @dev points to it.
+ *
+ * Hotplug (as opposed to unplug) notification is always handled as
+ * port-wide while unplug only kills the target device on device-wide
+ * event.
+ *
+ * ...No, I'm seeing an identical hang. Mainline works fine. -- Matthew Garrett | mjg59@srcf.ucam.org --
Maybe you could also try this on top of Tejun's patch?
--- linux-2.6/drivers/ata/libata-acpi.c.orig 2008-05-29 16:13:00.000000000 +0200
+++ linux-2.6/drivers/ata/libata-acpi.c 2008-05-29 16:13:04.000000000 +0200
@@ -237,10 +237,8 @@
spin_unlock_irqrestore(ap->lock, flags);
- if (wait) {
+ if (wait)
ata_port_wait_eh(ap);
- ata_acpi_eject_device(handle);
- }
if (kobj && !is_dock_event) {
sprintf(event_string, "BAY_EVENT=%d", event);
--
No luck there either, I'm afraid. -- Matthew Garrett | mjg59@srcf.ucam.org --
That means device isn't being detached or something afterwards triggers hotplug event on the detached port. Can you please sprinkle printks around and see what ata_acpi_handle_hotplug() does? Thanks. -- tejun --
Got it. It works if I lose the ata_port_freeze() from ata_acpi_detach_device. -- Matthew Garrett | mjg59@srcf.ucam.org --
Tejun, any idea? This is in for quite some time now. Maybe there's some involved... Regards, Holger --
Does changing it to ata_port_schedule_eh() make any difference? -- tejun --
Finally found a system to reproduce the freeze... And yes, removing ata_port_freeze() and only doing ata_port_schedule_eh() in the case where the device has already gone works and prevents the freeze. Also the docking case works. If this change is not only for finding out what's going wrong, I'm going resend the patch with your documentation additions, ok? Regards, Holger --
The last sentence misses the one important word 'race'. - Holger --
<naivë question> Do we handle the possible fact that _EJ0 might want the port (and maybe the device) to be active when it is called AND that _EJ0 might attempt to shutdown the port or the device itself? </naivë question> Please ignore if it is something not even worth answering. -- "One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie." -- The Silicon Valley Tarot Henrique Holschuh --
Will this patch or the other bay/dock related patches you send in the past days allow me to undock my laptop and still be able to suspend/resume without locking the laptop up? And without having to run any userspace scripts? My dock is not a simple port replicator, it has an USB hub and an ATA bay in it. My tests showed that I need to 'echo 1 > /sys/.../scsi/.../eject' or something like that before I can take the laptop out of the docking station. If I don't do that and try to access the cdrom in the bay (or even rescaning the scsi bus) after I have taken the laptop out of the dock it results in a hard lockup. That in itself would not be a problem, because it's just a simple command that I can do in an acpid script. But far worse is that even if I do that, the computer locks up when I resume from suspend. I run 2.6.24.7 which locks up every time I resume. In recent versions from git it has somehow improved, there are situations where it doesn't lock up, but there are still a few left where it does (I don't remember the exact actions/commands I have to take). tom --
Which kernel versions did you already try? But yes, those patches will most likely help. But not sure about the resume issues... Regards, Holger --
Matthew, any comments? It would be nice if you and Holger could work together to produce a single patch[set]... right now I have patches from both of you, and I was sorta waiting on the thread to die down to see if competing patches might merge into a single set --
I think we were waiting for feedback from Tejun as to why removing the port freeze call fixed the hang I was seeing? Beyond that, Holger's latest patch looked good to me. -- Matthew Garrett | mjg59@srcf.ucam.org --
My suspicion is just that calling ata_port_freeze() on an already frozen port is not good ;-) I'll resend the merged patch. Regards, Holger --
Sorry, was off for the last week. The difference between freezing and scheduling EH is that the former immediately aborts all in-flight commands and resets the port while the latter waits till the in-flight commands finish or time out (EH scheduling kicks fast-drain and the timeout is reduced to three seconds). TF-based ATA controllers are very sensitive to how the registers are accessed and sometimes lock up the whole machine when they are not happy by indefinitely holding the PCI bus. This could have been the case if IOs were in flight when the dock event occurred. Were they? Thanks. -- tejun --
I'd stopped hal, so I can't imagine that userspace was causing any to be generated at that point. -- Matthew Garrett | mjg59@srcf.ucam.org --
Ah... okay. Stupid me. libata EH always resets a frozen port to un-freeze it even if it's unoccupied to listen for hotplug events. So, if dock notifies device removal after the actual device is gone && the port is frozen as a result, libata EH will try to reset the port after the device is gone and in this case the controller locks up the whole machine for that. If schedule_eh is used, libata EH just removes the device and does nothing else and the controller is happy. This isn't too safe tho. There can be other things which can trigger port reset. ie. hotplug request from userland, in-flight IOs at the time of dock removal, etc... Maybe we need to implement a flag to indicate that the port is dead and shouldn't be accessed in any way. Thanks. -- tejun --
* Differentiate between bay devices in dock stations and others:
- When an ACPI_NOTIFY_EJECT_REQUEST appears, just signal uevent to
userspace (that is when the optional eject button on a bay device is
pressed/pulled) giving the possibility to unmount file systems and to
clean up. Also, only send uevent in case we get an EJECT_REQUEST
without doing anything else. In other cases, you'll get an add/remove
event because libata attaches/detaches the device.
- In case of a dock event, which in turn signals an
ACPI_NOTIFY_EJECT_REQUEST, immediately detach the device, because it
may already have been gone
* In case of an ACPI_NOTIFY_DEVICE/BUS_CHECK, evaluate _STA to check if
the device has been plugged or unplugged. If plugged, hotplug it, if
unplugged, just signal event to userspace
(initial patch by Matthew Garrett <mjg59@srcf.ucam.org>)
* Call ACPI _EJ0 for detached devices
Signed-off-by: Holger Macht <hmacht@suse.de>
---
Previous patch differences:
* Remove one call to ata_port_freeze
* Add some documentation from Tejun
diff --git a/drivers/ata/libata-acpi.c b/drivers/ata/libata-acpi.c
index dbf6ca7..3ff8b14 100644
--- a/drivers/ata/libata-acpi.c
+++ b/drivers/ata/libata-acpi.c
@@ -118,12 +118,62 @@ static void ata_acpi_associate_ide_port(struct ata_port *ap)
ap->pflags |= ATA_PFLAG_INIT_GTM_VALID;
}
-static void ata_acpi_handle_hotplug(struct ata_port *ap, struct ata_device
- *dev, u32 event)
+static void ata_acpi_eject_device(acpi_handle handle)
+{
+ struct acpi_object_list arg_list;
+ union acpi_object arg;
+
+ arg_list.count = 1;
+ arg_list.pointer = &arg;
+ arg.type = ACPI_TYPE_INTEGER;
+ arg.integer.value = 1;
+
+ if (ACPI_FAILURE(acpi_evaluate_object(handle, "_EJ0",
+ &arg_list, NULL)))
+ printk(KERN_ERR "Failed to evaluate _EJ0!\n");
+}
+
+/* @ap and @dev are the same as ata_acpi_handle_hotplug() */
+static void ata_acpi_detach_device(struct ata_port *ap, struct ata_device *dev)
+{
+ if ...Acked-by: Matthew Garrett <mjg@redhat.com> -- Matthew Garrett | mjg59@srcf.ucam.org --
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing | Re: Git in a Nutshell guide |
| John Benes | Re: master has some toys |
| Matthias Lederhofer | [PATCH 4/7] introduce GIT_WORK_TREE to specify the work tree |
| Alexander Sulfrian | [RFC/PATCH] RE: git calls SSH_ASKPASS even if DISPLAY is not set |
| Junio C Hamano | Re: Rss produced by git is not valid xml? |
| Linux Kernel Mailing List | iSeries: fix section mismatch in iseries_veth |
| Linux Kernel Mailing List | ixbge: remove TX lock and redo TX accounting. |
| Linux Kernel Mailing List | ixgbe: fix several counter register errata |
| Linux Kernel Mailing |
