[PATCH 1/2] ACPI, APEI: Make APEI core configurable built-in instead of module

Previous thread: linux-next: build failure after merge of the ext3 tree by Stephen Rothwell on Monday, March 1, 2010 - 5:36 pm. (4 messages)

Next thread: Disadvantage of using yaffs checkpointing ? by Shivdas Gujare on Monday, March 1, 2010 - 7:01 pm. (1 message)
From: Huang Ying
Date: Monday, March 1, 2010 - 6:55 pm

PCIE AER optionally depends on APEI HEST tabling parsing. If AER is
built-in, HEST should be built-in or not configured at all instead of
module. It is hard to express this elegantly in Kconfig. It is better
to make APEI code part configurable built-in instead of module.

On the other hand, APEI core code is used for hardware error
processing. It may run in very bad condition. It is reasonable to keep
it built-in if enabled.

Signed-off-by: Huang Ying <ying.huang@intel.com>
---
 Documentation/kernel-parameters.txt |   10 ++++----
 drivers/acpi/apei/Kconfig           |    2 +-
 drivers/acpi/apei/apei-base.c       |   42 +++++------------------------------
 drivers/acpi/apei/apei-internal.h   |    4 +--
 drivers/acpi/apei/einj.c            |    2 +-
 drivers/acpi/apei/hest.c            |   34 +++++++++++++++++++++++-----
 6 files changed, 42 insertions(+), 52 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index b41a486..212d63f 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -350,11 +350,6 @@ and is between 256 and 4096 characters. It is defined in the file
 			not play well with APC CPU idle - disable it if you have
 			APC and your system crashes randomly.
 
-	apei.hest_disable= [ACPI]
-			Disable Hardware Error Source Table (HEST) support,
-			corresponding firmware-first mode error processing
-			logic will be disabled.
-
 	apic=		[APIC,X86-32] Advanced Programmable Interrupt Controller
 			Change the output verbosity whilst booting
 			Format: { quiet (default) | verbose | debug }
@@ -847,6 +842,11 @@ and is between 256 and 4096 characters. It is defined in the file
 	hd=		[EIDE] (E)IDE hard drive subsystem geometry
 			Format: <cyl>,<head>,<sect>
 
+	hest_disable	[ACPI]
+			Disable Hardware Error Source Table (HEST) support,
+			corresponding firmware-first mode error processing
+			logic will be disabled.
+
 	highmem=nn[KMG]	[KNL,BOOT] forces the highmem zone to ...
From: Huang Ying
Date: Monday, March 1, 2010 - 6:55 pm

Now, a dedicated HEST tabling parsing code is used for PCIE AER
firmware_first setup. It is rebased on general HEST tabling parsing
code of APEI. The firmware_first setup code is moved from PCI core to
AER driver too, because it is only AER related.

Signed-off-by: Huang Ying <ying.huang@intel.com>
---
 drivers/acpi/Makefile              |    1 -
 drivers/acpi/hest.c                |  135 ------------------------------------
 drivers/pci/pcie/aer/aerdrv.h      |    6 ++
 drivers/pci/pcie/aer/aerdrv_acpi.c |   70 +++++++++++++++++++
 drivers/pci/pcie/aer/aerdrv_core.c |    2 +
 drivers/pci/probe.c                |    8 --
 include/acpi/acpi_hest.h           |   12 ---
 7 files changed, 78 insertions(+), 156 deletions(-)
 delete mode 100644 drivers/acpi/hest.c
 delete mode 100644 include/acpi/acpi_hest.h

diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index c00d683..3ba7feb 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -20,7 +20,6 @@ obj-y				+= acpi.o \
 # All the builtin files are in the "acpi." module_param namespace.
 acpi-y				+= osl.o utils.o reboot.o
 acpi-y				+= atomicio.o
-acpi-y				+= hest.o
 
 # sleep related files
 acpi-y				+= wakeup.o
diff --git a/drivers/acpi/hest.c b/drivers/acpi/hest.c
deleted file mode 100644
index 4bb18c9..0000000
--- a/drivers/acpi/hest.c
+++ /dev/null
@@ -1,135 +0,0 @@
-#include <linux/acpi.h>
-#include <linux/pci.h>
-
-#define PREFIX "ACPI: "
-
-static inline unsigned long parse_acpi_hest_ia_machine_check(struct acpi_hest_ia_machine_check *p)
-{
-	return sizeof(*p) +
-		(sizeof(struct acpi_hest_ia_error_bank) * p->num_hardware_banks);
-}
-
-static inline unsigned long parse_acpi_hest_ia_corrected(struct acpi_hest_ia_corrected *p)
-{
-	return sizeof(*p) +
-		(sizeof(struct acpi_hest_ia_error_bank) * p->num_hardware_banks);
-}
-
-static inline unsigned long parse_acpi_hest_ia_nmi(struct acpi_hest_ia_nmi *p)
-{
-	return sizeof(*p);
-}
-
-static inline unsigned long ...
From: Hidetoshi Seto
Date: Tuesday, March 2, 2010 - 1:09 am

The aer_init() will be called for root ports, but not for end point
devices or so on.  So please remain the firmware_first setup code in
PCI core.  Otherwise endpoint drivers will get success on call of
pci_enable_pcie_error_reporting() regardless of the firmware first.


Thanks,
H.Seto

--

From: Huang Ying
Date: Tuesday, March 2, 2010 - 2:13 am

Or we can call firmware_first setup code in
pci_enable_pcie_error_reporting(), because

1. I think AER related code should be put in drivers/pci/pcie/aer
instead of PCI core or drivers/acpi, if it is possible.

2. pci_setup_device is called so early, so that it is hard to do some
HEST related initialization (such as checking bad format) before it.

Best Regards,
Huang Ying


--

From: Hidetoshi Seto
Date: Tuesday, March 2, 2010 - 4:04 am

I understands the feeling, but before agreeing with your
proposal, I'd like to have an answer of a question:

 - Is it necessary to setup the firmware_first flag
   for an endpoint even if the endpoint's driver never
   call pci_enable_pcie_error_reporting()?

According to the current implementation, there are no
driver referring the firmware_first flag other than that
it owns.  However I guess that the flag will be necessary
for AER driver (i.e. aerdrv_core) in near future, because
we can use the flag to determine whether the AER driver
can check the device or not, when it is required to walk
pci bus hierarchy to find an erroneous device.

For example, assume that there are 2 endpoints under a same
root port.  One is (likely on-board) "firmware first" endpoint,
with driver which does not call pci_enable_pcie_error_reporting()
(because of no interest in AER, or just not implemented yet,
anyway).  The other is (likely card seated on a slot) not
firmware first, with better driver which can handle it's AER.
If my understanding is correct and if everything goes well,
errors on one should be reported via APEI while the other should
be reported via AER driver.

We could call firmware_first setup for all endpoint from aer_init(),
but it will need another care for hot-plugged endpoints.


Thanks,
H.Seto

--

From: Huang Ying
Date: Tuesday, March 2, 2010 - 6:43 pm

Yes. I think this should be supported. How about something as follow?

struct pci_dev {
	...
	unsigned int __firmware_first:2;
	...
};

int pcie_aer_get_firmware_first(struct pci_dev *dev)
{
	if (!dev->__firmware_first)
		aer_set_firmware_first(dev);
	return dev->__firmware_first & 0x1;
}

Then we use pcie_aer_get_firmware_first() instead of dev->firmware_first
directly.

Best Regards,
Huang Ying


--

From: Hidetoshi Seto
Date: Tuesday, March 2, 2010 - 7:30 pm

Looks reasonable.  I think the following is more straightforward:

struct pci_dev {
	...
	unsigned int __firmware_first_valid:1;
	unsigned int __firmware_first:1;
	...
};

int pcie_aer_get_firmware_first(struct pci_dev *dev)
{
	if (!dev->__firmware_first_valid)
		aer_set_firmware_first(dev);
	return dev->__firmware_first;
}


Thanks,
H.Seto

--

Previous thread: linux-next: build failure after merge of the ext3 tree by Stephen Rothwell on Monday, March 1, 2010 - 5:36 pm. (4 messages)

Next thread: Disadvantage of using yaffs checkpointing ? by Shivdas Gujare on Monday, March 1, 2010 - 7:01 pm. (1 message)