PCIE AER optionally depends on APEI HEST tabling parsing. If AER is
built-in, HEST should be built-in or not configured at all instead of
module. It is hard to express this elegantly in Kconfig. It is better
to make APEI code part configurable built-in instead of module.
On the other hand, APEI core code is used for hardware error
processing. It may run in very bad condition. It is reasonable to keep
it built-in if enabled.
Signed-off-by: Huang Ying <ying.huang@intel.com>
---
Documentation/kernel-parameters.txt | 10 ++++----
drivers/acpi/apei/Kconfig | 2 +-
drivers/acpi/apei/apei-base.c | 42 +++++------------------------------
drivers/acpi/apei/apei-internal.h | 4 +--
drivers/acpi/apei/einj.c | 2 +-
drivers/acpi/apei/hest.c | 34 +++++++++++++++++++++++-----
6 files changed, 42 insertions(+), 52 deletions(-)
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index b41a486..212d63f 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -350,11 +350,6 @@ and is between 256 and 4096 characters. It is defined in the file
not play well with APC CPU idle - disable it if you have
APC and your system crashes randomly.
- apei.hest_disable= [ACPI]
- Disable Hardware Error Source Table (HEST) support,
- corresponding firmware-first mode error processing
- logic will be disabled.
-
apic= [APIC,X86-32] Advanced Programmable Interrupt Controller
Change the output verbosity whilst booting
Format: { quiet (default) | verbose | debug }
@@ -847,6 +842,11 @@ and is between 256 and 4096 characters. It is defined in the file
hd= [EIDE] (E)IDE hard drive subsystem geometry
Format: <cyl>,<head>,<sect>
+ hest_disable [ACPI]
+ Disable Hardware Error Source Table (HEST) support,
+ corresponding firmware-first mode error processing
+ logic will be disabled.
+
highmem=nn[KMG] [KNL,BOOT] forces the highmem zone to ...Now, a dedicated HEST tabling parsing code is used for PCIE AER
firmware_first setup. It is rebased on general HEST tabling parsing
code of APEI. The firmware_first setup code is moved from PCI core to
AER driver too, because it is only AER related.
Signed-off-by: Huang Ying <ying.huang@intel.com>
---
drivers/acpi/Makefile | 1 -
drivers/acpi/hest.c | 135 ------------------------------------
drivers/pci/pcie/aer/aerdrv.h | 6 ++
drivers/pci/pcie/aer/aerdrv_acpi.c | 70 +++++++++++++++++++
drivers/pci/pcie/aer/aerdrv_core.c | 2 +
drivers/pci/probe.c | 8 --
include/acpi/acpi_hest.h | 12 ---
7 files changed, 78 insertions(+), 156 deletions(-)
delete mode 100644 drivers/acpi/hest.c
delete mode 100644 include/acpi/acpi_hest.h
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index c00d683..3ba7feb 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -20,7 +20,6 @@ obj-y += acpi.o \
# All the builtin files are in the "acpi." module_param namespace.
acpi-y += osl.o utils.o reboot.o
acpi-y += atomicio.o
-acpi-y += hest.o
# sleep related files
acpi-y += wakeup.o
diff --git a/drivers/acpi/hest.c b/drivers/acpi/hest.c
deleted file mode 100644
index 4bb18c9..0000000
--- a/drivers/acpi/hest.c
+++ /dev/null
@@ -1,135 +0,0 @@
-#include <linux/acpi.h>
-#include <linux/pci.h>
-
-#define PREFIX "ACPI: "
-
-static inline unsigned long parse_acpi_hest_ia_machine_check(struct acpi_hest_ia_machine_check *p)
-{
- return sizeof(*p) +
- (sizeof(struct acpi_hest_ia_error_bank) * p->num_hardware_banks);
-}
-
-static inline unsigned long parse_acpi_hest_ia_corrected(struct acpi_hest_ia_corrected *p)
-{
- return sizeof(*p) +
- (sizeof(struct acpi_hest_ia_error_bank) * p->num_hardware_banks);
-}
-
-static inline unsigned long parse_acpi_hest_ia_nmi(struct acpi_hest_ia_nmi *p)
-{
- return sizeof(*p);
-}
-
-static inline unsigned long ...The aer_init() will be called for root ports, but not for end point devices or so on. So please remain the firmware_first setup code in PCI core. Otherwise endpoint drivers will get success on call of pci_enable_pcie_error_reporting() regardless of the firmware first. Thanks, H.Seto --
Or we can call firmware_first setup code in pci_enable_pcie_error_reporting(), because 1. I think AER related code should be put in drivers/pci/pcie/aer instead of PCI core or drivers/acpi, if it is possible. 2. pci_setup_device is called so early, so that it is hard to do some HEST related initialization (such as checking bad format) before it. Best Regards, Huang Ying --
I understands the feeling, but before agreeing with your proposal, I'd like to have an answer of a question: - Is it necessary to setup the firmware_first flag for an endpoint even if the endpoint's driver never call pci_enable_pcie_error_reporting()? According to the current implementation, there are no driver referring the firmware_first flag other than that it owns. However I guess that the flag will be necessary for AER driver (i.e. aerdrv_core) in near future, because we can use the flag to determine whether the AER driver can check the device or not, when it is required to walk pci bus hierarchy to find an erroneous device. For example, assume that there are 2 endpoints under a same root port. One is (likely on-board) "firmware first" endpoint, with driver which does not call pci_enable_pcie_error_reporting() (because of no interest in AER, or just not implemented yet, anyway). The other is (likely card seated on a slot) not firmware first, with better driver which can handle it's AER. If my understanding is correct and if everything goes well, errors on one should be reported via APEI while the other should be reported via AER driver. We could call firmware_first setup for all endpoint from aer_init(), but it will need another care for hot-plugged endpoints. Thanks, H.Seto --
Yes. I think this should be supported. How about something as follow?
struct pci_dev {
...
unsigned int __firmware_first:2;
...
};
int pcie_aer_get_firmware_first(struct pci_dev *dev)
{
if (!dev->__firmware_first)
aer_set_firmware_first(dev);
return dev->__firmware_first & 0x1;
}
Then we use pcie_aer_get_firmware_first() instead of dev->firmware_first
directly.
Best Regards,
Huang Ying
--
Looks reasonable. I think the following is more straightforward:
struct pci_dev {
...
unsigned int __firmware_first_valid:1;
unsigned int __firmware_first:1;
...
};
int pcie_aer_get_firmware_first(struct pci_dev *dev)
{
if (!dev->__firmware_first_valid)
aer_set_firmware_first(dev);
return dev->__firmware_first;
}
Thanks,
H.Seto
--
