[PATCH 01/12] ACPI, IO memory pre-mapping and atomic accessing

From: Huang Ying
Date: Monday, April 26, 2010 - 11:41 pm

This patchset adds APEI (ACPI Platform Error Interface) support to
the Linux kernel.

Patches 01-06 include the basic APEI HEST parsing and EINJ support.
They have been posted to the mailing list before and received some
reviews and ACKs. Changes since the last post are as follows.

- Make the APEI infrastructure and HEST parsing configurable as
  built-in instead of as modules, to solve some dependency issues.

- Some minor fixes, such as printk

Patches 07-10 are newly added to support GHES (Generic Hardware Error
Source) corrected memory errors. On some machines, Machine Check cannot
report the physical address for some corrected memory errors, but GHES
can. So this simplified GHES is implemented and posted first.

Patches 11-12 are newly added to support using ERST for persistent
storage of MCE. This can improve fatal MCE reporting.

[PATCH 01/12] ACPI, IO memory pre-mapping and atomic accessing
[PATCH 02/12] ACPI, APEI, APEI supporting infrastructure
[PATCH 03/12] ACPI, APEI, HEST table parsing
[PATCH 04/12] ACPI, APEI, EINJ support
[PATCH 05/12] ACPI, APEI, Document for APEI
[PATCH 06/12] ACPI, APEI, PCIE AER, use general HEST table parsing in AER firmware_first setup
[PATCH 07/12] ACPI Hardware Error Device (PNP0C33) support
[PATCH 08/12] Unified UUID/GUID definition
[PATCH 09/12] ACPI, APEI, UEFI Common Platform Error Record (CPER) header
[PATCH 10/12] ACPI, APEI, Generic Hardware Error Source memory error support
[PATCH 11/12] ACPI, APEI, Error Record Serialization Table (ERST) support
[PATCH 12/12] ACPI, APEI, Use ERST for persistent storage of MCE

--

From: Huang Ying
Date: Monday, April 26, 2010 - 11:41 pm

Some ACPI IO accesses need to be done in atomic context. For example,
APEI ERST operations may be used for permanent storage in the hardware
error handler. That is, they may be called in atomic contexts such as
IRQ or NMI. ERST and EINJ implement their operations via IO
memory/port accesses. But the IO memory access method provided by
ACPI (acpi_read/acpi_write) maps the IO memory while it is being
accessed, so it cannot be used in atomic context. To solve this, the
IO memory should be pre-mapped during EINJ/ERST initialization. A
linked list records which memory areas have been mapped; when memory
is accessed in the hardware error handler, the linked list is searched
for the mapped virtual address that corresponds to the given physical
address.
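
The pre-map-then-look-up scheme described above can be illustrated with
a small user-space sketch. It is not the code from atomicio.c: the names
premap, premap_add and premap_lookup are hypothetical, the "mapping" is
just a caller-supplied buffer where a kernel implementation would use
ioremap(), and the synchronization that a real list shared with an NMI
handler would need is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pre-mapped region: a physical range and the virtual base it maps to. */
struct premap {
	uint64_t paddr;		/* physical start address */
	uint64_t size;		/* length of the region in bytes */
	uint8_t *vaddr;		/* virtual address corresponding to paddr */
	struct premap *next;
};

/* Head of the list; in this sketch entries are only added up front. */
static struct premap *premap_list;

/*
 * "Process context" step: record a mapping. A kernel implementation
 * would create the mapping here; this sketch is handed the buffer.
 */
static void premap_add(struct premap *m, uint64_t paddr, uint64_t size,
		       uint8_t *vaddr)
{
	assert(size > 0);
	m->paddr = paddr;
	m->size = size;
	m->vaddr = vaddr;
	m->next = premap_list;
	premap_list = m;
}

/*
 * "Atomic context" step: translate a physical range to a virtual
 * address by walking the list. Nothing is allocated or mapped here.
 * Returns NULL unless [paddr, paddr + len) lies inside one region.
 */
static uint8_t *premap_lookup(uint64_t paddr, uint64_t len)
{
	const struct premap *m;

	for (m = premap_list; m; m = m->next) {
		if (paddr >= m->paddr && len <= m->size &&
		    paddr - m->paddr <= m->size - len)
			return m->vaddr + (paddr - m->paddr);
	}
	return NULL;
}
```

The range check is written with subtractions so that it cannot overflow
for addresses near the top of the 64-bit space. A lookup that fails
returns NULL rather than falling back to mapping on demand, since
mapping is exactly what the atomic path must avoid.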

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 drivers/acpi/Makefile   |    1 
 drivers/acpi/atomicio.c |  360 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/acpi/atomicio.h |   10 +
 3 files changed, 371 insertions(+)
 create mode 100644 drivers/acpi/atomicio.c
 create mode 100644 include/acpi/atomicio.h

--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -19,6 +19,7 @@ obj-y				+= acpi.o \
 
 # All the builtin files are in the "acpi." module_param namespace.
 acpi-y				+= osl.o utils.o reboot.o
+acpi-y				+= atomicio.o
 acpi-y				+= hest.o
 
 # sleep related files
--- /dev/null
+++ b/drivers/acpi/atomicio.c
@@ -0,0 +1,360 @@
+/*
+ * atomicio.c - ACPI IO memory pre-mapping/post-unmapping, then
+ * accessing in atomic context.
+ *
+ * This is used for NMI handler to access IO memory area, because
+ * ioremap/iounmap can not be used in NMI handler. The IO memory area
+ * is pre-mapped in process context and accessed in NMI handler.
+ *
+ * Copyright (C) 2009-2010, Intel Corp.
+ *	Author: Huang Ying <ying.huang@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License ...

--

From: Huang Ying
Date: Monday, April 26, 2010 - 11:41 pm

APEI stands for ACPI Platform Error Interface, which allows errors
(for example from the chipset) to be reported to the operating system.
This improves NMI handling especially. In addition it supports error
serialization and error injection.

For more information about APEI, please refer to ACPI Specification
version 4.0, chapter 17.

This patch provides some common functions used by more than one APEI
table, mainly the framework of the interpreter for EINJ and ERST.

A machine-readable language is defined for EINJ and ERST for the OS to
execute, and so to drive the firmware to carry out the corresponding
functions. The machine languages for EINJ and ERST are compatible, so
a common framework is defined for them.
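
To make the idea of a shared interpreter concrete, here is a minimal
user-space sketch of a table-driven instruction runner. Everything in
it is an assumption made for illustration: the opcode names and their
numbering, the insn and exec_ctx structures, and run_insns() are
hypothetical and are not taken from apei-base.c or from the ACPI
instruction encoding. Plain variables stand in for IO registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for this sketch; not the ACPI-defined encoding. */
enum { OP_NOOP, OP_READ_REG, OP_WRITE_REG, OP_WRITE_VALUE, OP_COUNT };

/* One table entry: what to do, which "register", and an immediate. */
struct insn {
	uint8_t op;
	uint64_t *reg;		/* stands in for an IO memory/port register */
	uint64_t value;
};

/* Interpreter state shared by all instructions of one run. */
struct exec_ctx {
	uint64_t var;		/* last value read, or value to be written */
};

typedef int (*insn_fn)(struct exec_ctx *ctx, const struct insn *in);

static int do_noop(struct exec_ctx *ctx, const struct insn *in)
{
	(void)ctx;
	(void)in;
	return 0;
}

static int do_read_reg(struct exec_ctx *ctx, const struct insn *in)
{
	ctx->var = *in->reg;
	return 0;
}

static int do_write_reg(struct exec_ctx *ctx, const struct insn *in)
{
	*in->reg = ctx->var;
	return 0;
}

static int do_write_value(struct exec_ctx *ctx, const struct insn *in)
{
	(void)ctx;
	*in->reg = in->value;
	return 0;
}

/* Dispatch table; each table type could supply its own handler set. */
static const insn_fn handlers[OP_COUNT] = {
	[OP_NOOP]	 = do_noop,
	[OP_READ_REG]	 = do_read_reg,
	[OP_WRITE_REG]	 = do_write_reg,
	[OP_WRITE_VALUE] = do_write_value,
};

/* Run entries in order; stop at an unknown opcode or handler error. */
static int run_insns(struct exec_ctx *ctx, const struct insn *tab, size_t n)
{
	size_t i;

	assert(ctx != NULL);
	for (i = 0; i < n; i++) {
		int rc;

		if (tab[i].op >= OP_COUNT || !handlers[tab[i].op])
			return -1;
		rc = handlers[tab[i].op](ctx, &tab[i]);
		if (rc)
			return rc;
	}
	return 0;
}
```

Keeping the handlers in a table indexed by opcode is what lets two
consumers with compatible instruction sets share one runner: each only
has to provide the handlers it supports.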

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 drivers/acpi/Kconfig              |    2 
 drivers/acpi/Makefile             |    2 
 drivers/acpi/apei/Kconfig         |    9 
 drivers/acpi/apei/Makefile        |    3 
 drivers/acpi/apei/apei-base.c     |  593 ++++++++++++++++++++++++++++++++++++++
 drivers/acpi/apei/apei-internal.h |   95 ++++++
 6 files changed, 704 insertions(+)
 create mode 100644 drivers/acpi/apei/Kconfig
 create mode 100644 drivers/acpi/apei/Makefile
 create mode 100644 drivers/acpi/apei/apei-base.c
 create mode 100644 drivers/acpi/apei/apei-internal.h

--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -360,4 +360,6 @@ config ACPI_SBS
 	  To compile this driver as a module, choose M here:
 	  the modules will be called sbs and sbshc.
 
+source "drivers/acpi/apei/Kconfig"
+
 endif	# ACPI
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -67,3 +67,5 @@ processor-y			+= processor_idle.o proces
 processor-$(CONFIG_CPU_FREQ)	+= processor_perflib.o
 
 obj-$(CONFIG_ACPI_PROCESSOR_AGGREGATOR) += acpi_pad.o
+
+obj-$(CONFIG_ACPI_APEI)		+= apei/
--- /dev/null
+++ b/drivers/acpi/apei/Kconfig
@@ -0,0 +1,9 @@
+config ACPI_APEI
+	bool "ACPI Platform Error Interface (APEI)"
+	depends on ...

--

From: Huang Ying
Date: Monday, April 26, 2010 - 11:41 pm

Add documentation for APEI, including kernel parameters and the EINJ
debug file system interface.
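
The error type values that einj.txt lists under available_error_type
are single bits, so a set of supported types can be carried in one
32-bit mask. The following user-space sketch decodes such a mask into
"value, description" lines. The names einj_type_names,
einj_type_to_name and einj_print_types are hypothetical and are not
taken from einj.c, and the output format is only an approximation of
what the debugfs file prints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct einj_type_name {
	uint32_t bit;
	const char *name;
};

/*
 * Error type bits. The PCI Express entries are ordered non-fatal
 * (bit 7) then fatal (bit 8), matching the other groups; this is the
 * order I believe the ACPI 4.0 EINJ error type definition uses.
 */
static const struct einj_type_name einj_type_names[] = {
	{ 0x00000001, "Processor Correctable" },
	{ 0x00000002, "Processor Uncorrectable non-fatal" },
	{ 0x00000004, "Processor Uncorrectable fatal" },
	{ 0x00000008, "Memory Correctable" },
	{ 0x00000010, "Memory Uncorrectable non-fatal" },
	{ 0x00000020, "Memory Uncorrectable fatal" },
	{ 0x00000040, "PCI Express Correctable" },
	{ 0x00000080, "PCI Express Uncorrectable non-fatal" },
	{ 0x00000100, "PCI Express Uncorrectable fatal" },
	{ 0x00000200, "Platform Correctable" },
	{ 0x00000400, "Platform Uncorrectable non-fatal" },
	{ 0x00000800, "Platform Uncorrectable fatal" },
};

#define N_TYPES (sizeof(einj_type_names) / sizeof(einj_type_names[0]))

/* Return the description for a single error type bit, or NULL. */
static const char *einj_type_to_name(uint32_t bit)
{
	size_t i;

	for (i = 0; i < N_TYPES; i++)
		if (einj_type_names[i].bit == bit)
			return einj_type_names[i].name;
	return NULL;
}

/* Print one line per bit set in mask; return the number of lines. */
static int einj_print_types(FILE *out, uint32_t mask)
{
	size_t i;
	int n = 0;

	assert(out != NULL);
	for (i = 0; i < N_TYPES; i++) {
		if (mask & einj_type_names[i].bit) {
			fprintf(out, "0x%08x\t%s\n",
				(unsigned int)einj_type_names[i].bit,
				einj_type_names[i].name);
			n++;
		}
	}
	return n;
}
```

Bits in the mask that have no entry in the table are silently skipped
by einj_print_types(); a stricter caller could compare the mask against
the OR of all known bits first.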

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 Documentation/acpi/apei/einj.txt    |   49 ++++++++++++++++++++++++++++++++++++
 Documentation/kernel-parameters.txt |    5 +++
 2 files changed, 54 insertions(+)
 create mode 100644 Documentation/acpi/apei/einj.txt

--- /dev/null
+++ b/Documentation/acpi/apei/einj.txt
@@ -0,0 +1,49 @@
+			APEI Error INJection
+			~~~~~~~~~~~~~~~~~~~~
+
+EINJ provides a hardware error injection mechanism. It is very useful
+for debugging and testing of other APEI and RAS features.
+
+To use EINJ, make sure the following options are enabled in your
+kernel configuration:
+
+CONFIG_DEBUG_FS
+CONFIG_ACPI_APEI
+CONFIG_ACPI_APEI_EINJ
+
+The user interface of EINJ is in debug file system, under the
+directory apei/einj. The following files are provided.
+
+- available_error_type
+  Reading this file returns the error injection capability of the
+  platform, that is, which error types are supported. The error type
+  definition is as follows; the left field is the error type value,
+  the right field is the error description.
+
+    0x00000001	Processor Correctable
+    0x00000002	Processor Uncorrectable non-fatal
+    0x00000004	Processor Uncorrectable fatal
+    0x00000008  Memory Correctable
+    0x00000010  Memory Uncorrectable non-fatal
+    0x00000020  Memory Uncorrectable fatal
+    0x00000040	PCI Express Correctable
+    0x00000080	PCI Express Uncorrectable non-fatal
+    0x00000100	PCI Express Uncorrectable fatal
+    0x00000200	Platform Correctable
+    0x00000400	Platform Uncorrectable non-fatal
+    0x00000800	Platform Uncorrectable fatal
+
+  The format of the file contents is as above, except that only the
+  available error type lines are present.
+
+- error_type
+  This file is used to set the error type value. The error type value
+  is defined in "available_error_type" description.
+
+- ...

--

From: Huang Ying
Date: Monday, April 26, 2010 - 11:41 pm

EINJ provides a hardware error injection mechanism. This is useful for
debugging and testing of other APEI and RAS features.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 drivers/acpi/apei/Kconfig  |    7 
 drivers/acpi/apei/Makefile |    1 
 drivers/acpi/apei/einj.c   |  485 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 493 insertions(+)
 create mode 100644 drivers/acpi/apei/einj.c

--- a/drivers/acpi/apei/Kconfig
+++ b/drivers/acpi/apei/Kconfig
@@ -7,3 +7,10 @@ config ACPI_APEI
 	  especially. In addition it supports error serialization and
 	  error injection.
 
+config ACPI_APEI_EINJ
+	tristate "APEI Error INJection (EINJ)"
+	depends on ACPI_APEI && DEBUG_FS
+	help
+	  EINJ provides a hardware error injection mechanism, it is
+	  mainly used for debugging and testing the other parts of
+	  APEI and some other RAS features.
--- a/drivers/acpi/apei/Makefile
+++ b/drivers/acpi/apei/Makefile
@@ -1,3 +1,4 @@
 obj-$(CONFIG_ACPI_APEI)		+= apei.o
+obj-$(CONFIG_ACPI_APEI_EINJ)	+= einj.o
 
 apei-y := apei-base.o hest.o
--- /dev/null
+++ b/drivers/acpi/apei/einj.c
@@ -0,0 +1,485 @@
+/*
+ * APEI Error INJection support
+ *
+ * EINJ provides a hardware error injection mechanism, this is useful
+ * for debugging and testing of other APEI and RAS features.
+ *
+ * For more information about EINJ, please refer to ACPI Specification
+ * version 4.0, section 17.5.
+ *
+ * Copyright 2009 Intel Corp.
+ *   Author: Huang Ying <ying.huang@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version
+ * 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * ...

--

From: Huang Ying
Date: Monday, April 26, 2010 - 11:41 pm

HEST describes error sources in detail, communicating operational
parameters (i.e. severity levels, masking bits, and threshold values)
to the OS as necessary. It also allows the platform to report error
sources for which the OS would typically not implement support (for
example, chipset-specific error registers).

HEST information may be needed by other subsystems. For example, the
HEST PCIE AER error source information describes whether a PCIE root
port works in "firmware first" mode; this is needed by the general
PCIE AER error subsystem. So a public HEST table parsing interface is
provided.
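
One common shape for such a public parsing interface is a walker that
visits each error source entry and hands it to a caller-supplied
callback, so that a subsystem like PCIE AER can pick out only the
entries it cares about. The sketch below shows that shape in user
space. It is an assumption about the general technique, not the
interface declared in include/acpi/apei.h: the entry header, the
per-type size table, and the names parse_sources and count_type are
all hypothetical and much simpler than real HEST structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified entry header; real entries are larger. */
struct src_hdr {
	uint16_t type;
	uint16_t source_id;
};

/* Hypothetical total entry size in bytes for each known type. */
static const size_t src_len[] = { 8, 12, 4 };
#define N_SRC_TYPES (sizeof(src_len) / sizeof(src_len[0]))

typedef int (*src_fn)(const struct src_hdr *hdr, void *data);

/*
 * Walk count entries packed back to back in buf[0..len) and call fn
 * on each one. Returns 0 on success, -1 if the table is malformed,
 * or the first nonzero value returned by fn.
 */
static int parse_sources(const uint8_t *buf, size_t len, unsigned int count,
			 src_fn fn, void *data)
{
	size_t off = 0;
	unsigned int i;

	assert(buf != NULL && fn != NULL);
	for (i = 0; i < count; i++) {
		struct src_hdr hdr;
		size_t elen;
		int rc;

		if (len - off < sizeof(hdr))
			return -1;
		memcpy(&hdr, buf + off, sizeof(hdr)); /* no unaligned read */
		if (hdr.type >= N_SRC_TYPES)
			return -1;
		elen = src_len[hdr.type];
		if (elen < sizeof(hdr) || elen > len - off)
			return -1;
		rc = fn(&hdr, data);
		if (rc)
			return rc;
		off += elen;
	}
	return 0;
}

/* Example callback: count the entries whose type equals want. */
struct count_arg {
	uint16_t want;
	unsigned int hits;
};

static int count_type(const struct src_hdr *hdr, void *data)
{
	struct count_arg *a = data;

	if (hdr->type == a->want)
		a->hits++;
	return 0;
}
```

Because the entry size depends on the entry type, the walker has to
reject unknown types instead of skipping them: without a known size it
cannot find where the next entry starts.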

Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 drivers/acpi/apei/Makefile |    2 
 drivers/acpi/apei/hest.c   |  173 +++++++++++++++++++++++++++++++++++++++++++++
 include/acpi/apei.h        |   13 +++
 3 files changed, 187 insertions(+), 1 deletion(-)
 create mode 100644 drivers/acpi/apei/hest.c
 create mode 100644 include/acpi/apei.h

--- a/drivers/acpi/apei/Makefile
+++ b/drivers/acpi/apei/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_ACPI_APEI)		+= apei.o
 
-apei-y := apei-base.o
+apei-y := apei-base.o hest.o
--- /dev/null
+++ b/drivers/acpi/apei/hest.c
@@ -0,0 +1,173 @@
+/*
+ * APEI Hardware Error Source Table support
+ *
+ * HEST describes error sources in detail; communicates operational
+ * parameters (i.e. severity levels, masking bits, and threshold
+ * values) to Linux as necessary. It also allows the BIOS to report
+ * non-standard error sources to Linux (for example, chipset-specific
+ * error registers).
+ *
+ * For more information about HEST, please refer to ACPI Specification
+ * version 4.0, section 17.3.2.
+ *
+ * Copyright 2009 Intel Corp.
+ *   Author: Huang Ying <ying.huang@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version
+ * 2 as published by the Free Software Foundation;
+ *
+ * This program is distributed in ...

--

From: Andi Kleen
Date: Tuesday, May 11, 2010 - 4:25 am

I read all the code again and it looks all good to go for .35 to me.

Reviewed-by: Andi Kleen <ak@linux.intel.com>

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.
--