eHEA resources that are allocated via H_CALLs have a unique identifier each.
These identifiers are necessary to free the resources. A reboot notifier
is used to free all eHEA resources before the indentifiers get lost, i.e
before kexec starts a new kernel.Signed-off-by: Jan-Bernd Themann <themann@de.ibm.com>
---
drivers/net/ehea/ehea.h | 2 +-
drivers/net/ehea/ehea_main.c | 21 +++++++++++++++++++++
2 files changed, 22 insertions(+), 1 deletions(-)diff --git a/drivers/net/ehea/ehea.h b/drivers/net/ehea/ehea.h
index 4b4b74e..f78e5bf 100644
--- a/drivers/net/ehea/ehea.h
+++ b/drivers/net/ehea/ehea.h
@@ -40,7 +40,7 @@
#include <asm/io.h>#define DRV_NAME "ehea"
-#define DRV_VERSION "EHEA_0079"
+#define DRV_VERSION "EHEA_0080"/* eHEA capability flags */
#define DLPAR_PORT_ADD_REM 1
diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index 0a7e789..f0319f1 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -33,6 +33,9 @@
#include <linux/if.h>
#include <linux/list.h>
#include <linux/if_ether.h>
+#include <linux/notifier.h>
+#include <linux/reboot.h>
+
#include <net/ip.h>#include "ehea.h"
@@ -3295,6 +3298,20 @@ static int __devexit ehea_remove(struct of_device *dev)
return 0;
}+static int ehea_reboot_notifier(struct notifier_block *nb,
+ unsigned long action, void *unused)
+{
+ if (action == SYS_RESTART) {
+ ehea_info("Reboot: freeing all eHEA resources");
+ ibmebus_unregister_driver(&ehea_driver);
+ }
+ return NOTIFY_DONE;
+}
+
+static struct notifier_block ehea_reboot_nb = {
+ .notifier_call = ehea_reboot_notifier,
+};
+
static int check_module_parm(void)
{
int ret = 0;
@@ -3351,6 +3368,8 @@ int __init ehea_module_init(void)
if (ret)
goto out;+ register_reboot_notifier(&ehea_reboot_nb);
+
ret = ibmebus_register_driver(&ehea_driver);
if (ret) {
ehea_error("failed registering ...
applied to #upstream-fixes
-
How do you plan to support kdump?
cheers
--=20
Michael Ellerman
OzLabs, IBM Australia Development Labwwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
When kexec is fully supported kdump should work out of the box
as for any other ethernet card (if you load the right eth driver).
There's nothing specific to kdump you have to handle in
ethernet device drivers.
Hope I didn't miss anything here...Gruss / Regards
Christoph R-
Perhaps. When we kdump the kernel does not call the reboot notifiers, so
the code Jan-Bernd just added won't get called. So the eHEA resources
won't be freed. When the kdump kernel tries to load the eHEA driver what
will happen?cheers
--=20
Michael Ellerman
OzLabs, IBM Australia Development Labwwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
Good point.
If the device driver tries to allocate resources again (in the kdump
kernel),
which have been allocated before (in the crashed kernel) the hcalls will
fail because from the hypervisor view the resources are still in use.
Currently there's no method to find out the resource handles for these
HEA resources allocated by the crashed kernel within the hypervisor...So we have to trigger a explicit deregister in the hypervisor before the
driver
is started again.
How do you recommend we should trigger this in the kdump process?
Is placing a hook into a ppc_md.machine_kexec be an option?Gruss / Regards
Christoph R.-
So the hypervisor can't allocate more resources, because they're already
allocated, but it can't free the ones that are allocated because it
doesn't know what they are? I don't think I understand.If that's really the way it works then eHEA is more or less broken for
kdump I'm afraid.cheers
--=20
Michael Ellerman
OzLabs, IBM Australia Development Labwwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
We think we have a way to workaround this, but let me first try to
explain the base problem.DD allocates HEA resources and gets firmware_handles for these resources.
To free the resources DD needs to use exactly these handles.
There's no generic firmware call "clean out all resources".
Allocating the same resources twice does not work.So a new kernel can't free the resources allocated by an old kernel,
because the numeric values of the handles aren't known anymore.Potential Solution:
Hea driver cleanup function hooks into ppc_md.machine_crash_shutdown
and frees all firmware resources at shutdown time of the crashed kernel.
crash_kexec continues and loads new kernel.
The new kernel restarts the HEA driver within kdump kernel, which will work
because resources have been freed before.Michael, would this work?
Gruss / Regards
Christoph R.-
How many possible handles are there?
If the handles are lost, is the only way to clear out the HEA resources
This means the crashed kernel now has to be trusted to shut down and
free up the resources. Isn't trusting the crashing kernel in this way
-
Well, there's no simple answer to this. I'm not working on firmware.
Depends on system configuration, between
I would hope that if the cleanup routine only does hcalls
and does not change any kernel memory areas, then the risk to damage
anything
else in kernel should be pretty small. This should allow to catch most
cases,
but as always you can imagine situations where the kernel memory is brokenIs ppc_md.machine_crash_shutdown the right hook?
Gruss/Regards
Christoph R-
| Greg KH | [GIT PATCH] driver core patches against 2.6.24 |
| Andi Kleen | [PATCH x86] [0/16] Various i386/x86-64 changes |
| Vladislav Bolkhovitin | Re: Integration of SCST in the mainstream Linux kernel |
| Tarkan Erimer | Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3 |
git: | |
| Gerrit Renker | [PATCH 0/37] dccp: Feature negotiation - last call for comments |
| Jarek Poplawski | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Natalie Protasevich | [BUG] New Kernel Bugs |
| Arjan van de Ven | Re: [GIT]: Networking |
