[C/R ARM v2][PATCH 0/3] Linux Checkpoint-Restart - ARM port

From: Christoffer Dall
Date: Monday, April 26, 2010 - 2:43 pm

This series contains two preparatory patches for an ARM port of the
checkpoint-restart code, followed by a third patch implementing the
architecture-specific parts of c/r.

The preparatory patches consist of a partial syscall trace implementation
for ARM and an eclone implementation for ARM. The syscall trace
implementation provides only the needed functionality for c/r.

The user space code is covered by a separate patch, which adds support
for cross-compilation, extracts headers for ARM and provides an eclone
implementation for ARM.

The kernel patches presented here are based on the ckpt-v21-rc6 patch set.

---

CHANGELOG:

[2010-Apr-08] v2:
	- Systrace implementation now inspects process state to get the
	  system call number thereby avoiding extra work on system calls.
	- Removed __user attribute on long type in eclone implementation
	- Better check for architecture versions across C/R
	- Improved checking of user space ABI settings across C/R
	- Code simplifications

[2010-Mar-22] v1:
	- Initial version
	- Systrace implementation modified the system call entry path to
	  store the system call number globally in memory.
	- ARM implementation lightly tested
--

From: Christoffer Dall
Date: Monday, April 26, 2010 - 2:43 pm

Introduces a few of the system call inspection functions for ARM. The
current motivation is checkpoint restart, but the general interface
requirements are met, making it possible for a debugger or tracer to
obtain information about the system call status of another process.

The patch is in part based on the following proposal from Roland McGrath:
https://patchwork.kernel.org/patch/32101/

Compared to other architectures, the code implementing syscall_get_nr is
somewhat involved. This is because the system call number is not stored in
any global location and because the ARM ABI exists in multiple versions,
each passing the number differently.
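
As a rough illustration of why the lookup depends on the ABI, here is a
minimal user space sketch, not the code from this patch. It assumes the
usual ARM Linux conventions: OABI encodes the number in the SWI immediate
on top of a 0x900000 base, while EABI and Thumb issue the SWI with a zero
or unused immediate and pass the number in r7. The struct regs_sketch type
and the sketch_syscall_nr() helper are hypothetical names introduced only
for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for this sketch, mirroring the ARM Linux ABI. */
#define SWI_IMM_MASK      0x00ffffffu  /* 24-bit immediate of an ARM-mode SWI */
#define OABI_SYSCALL_BASE 0x00900000u  /* base added to OABI syscall numbers */

/* Hypothetical, simplified register snapshot (not the kernel's pt_regs). */
struct regs_sketch {
	uint32_t r7;  /* EABI and Thumb: system call number */
	int thumb;    /* non-zero if the task was executing Thumb code */
};

/*
 * Return the system call number, given the SWI instruction the task
 * trapped on and its saved registers.
 */
static long sketch_syscall_nr(uint32_t swi_instr,
			      const struct regs_sketch *regs)
{
	uint32_t imm;

	assert(regs != NULL);

	/* Thumb SWIs only carry an 8-bit immediate; the number is in r7. */
	if (regs->thumb)
		return (long)regs->r7;

	imm = swi_instr & SWI_IMM_MASK;

	/* EABI issues "swi 0" and passes the number in r7. */
	if (imm == 0)
		return (long)regs->r7;

	/* OABI encodes base + number in the immediate; strip the base. */
	return (long)(imm ^ OABI_SYSCALL_BASE);
}
```

The instruction word itself still has to be fetched from the traced
task's memory first, which is what makes the in-kernel version longer
than this sketch.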

Changelog[v2]:
	- Get the system call number by inspecting the process instead of
	  storing the system call number globally on entry to each system
	  call.

Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Christoffer Dall <christofferdall@christofferdall.dk>
Acked-by: Oren Laadan <orenl@cs.columbia.edu>
---
 arch/arm/include/asm/syscall.h |  133 ++++++++++++++++++++++++++++++++++++++++
 1 files changed, 133 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/include/asm/syscall.h

diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
new file mode 100644
index 0000000..49cb10e
--- /dev/null
+++ b/arch/arm/include/asm/syscall.h
@@ -0,0 +1,133 @@
+/*
+ * syscall.h - Linux syscall interfaces for ARM
+ *
+ * Copyright (c) 2010 Christoffer Dall
+ *
+ * This file is released under the GPLv2.
+ * See the file COPYING for more details.
+ */
+
+#ifndef _ASM_ARM_SYSCALLS_H
+#define _ASM_ARM_SYSCALLS_H
+
+#include <linux/highmem.h>
+#include <linux/pagemap.h>
+#include <linux/memory.h>
+#include <asm/unistd.h>
+
+static inline int get_swi_instruction(struct task_struct *task,
+				      struct pt_regs *regs,
+				      unsigned long *instr)
+{
+	struct page *page = NULL;
+	unsigned long instr_addr;
+	unsigned long *ptr;
+	int ret;
+
+	instr_addr = regs->ARM_pc - ...
From: Christoffer Dall
Date: Monday, April 26, 2010 - 2:43 pm

In addition to doing everything that the clone() system call does, the
eclone() system call:

	- allows additional clone flags (31 of 32 bits in the flags
	  parameter to clone() are in use)

	- allows user to specify a pid for the child process in its
	  active and ancestor pid namespaces.

Eclone is needed for restarting a process from a checkpoint. See more
in Documentation/eclone and refer to the original LKML posting:
http://lkml.org/lkml/2009/11/11/361

The new system call for ARM has number 366.

Changelog[v2]:
	- Removed __user attribute on long type

Cc: rmk@arm.linux.org.uk
Cc: libc-ports <libc-ports@sourceware.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Christoffer Dall <christofferdall@christofferdall.dk>
Acked-by: Oren Laadan <orenl@cs.columbia.edu>
---
 arch/arm/include/asm/unistd.h  |    1 +
 arch/arm/kernel/calls.S        |    1 +
 arch/arm/kernel/entry-common.S |    6 ++++++
 arch/arm/kernel/sys_arm.c      |   39 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/arch/arm/include/asm/unistd.h b/arch/arm/include/asm/unistd.h
index dd2bf53..8dcb42a 100644
--- a/arch/arm/include/asm/unistd.h
+++ b/arch/arm/include/asm/unistd.h
@@ -392,6 +392,7 @@
 #define __NR_rt_tgsigqueueinfo		(__NR_SYSCALL_BASE+363)
 #define __NR_perf_event_open		(__NR_SYSCALL_BASE+364)
 #define __NR_recvmmsg			(__NR_SYSCALL_BASE+365)
+#define __NR_eclone			(__NR_SYSCALL_BASE+366)
 
 /*
  * The following SWIs are ARM private.
diff --git a/arch/arm/kernel/calls.S b/arch/arm/kernel/calls.S
index 37ae301..80047c8 100644
--- a/arch/arm/kernel/calls.S
+++ b/arch/arm/kernel/calls.S
@@ -375,6 +375,7 @@
 		CALL(sys_rt_tgsigqueueinfo)
 		CALL(sys_perf_event_open)
 /* 365 */	CALL(sys_recvmmsg)
+		CALL(sys_eclone_wrapper)
 #ifndef syscalls_counted
 .equ syscalls_padding, ((NR_syscalls + 3) & ~3) - NR_syscalls
 #define syscalls_counted
diff --git a/arch/arm/kernel/entry-common.S ...
From: Christoffer Dall
Date: Monday, April 26, 2010 - 2:43 pm

Implements architecture-specific requirements for checkpoint/restart on
ARM. The changes are confined almost entirely to c/r-related code. Most of
the work is done in arch/arm/checkpoint.c, which implements checkpointing
of the CPU and the necessary fields of the thread_info struct.

The following restrictions are enforced:
----------------------------------------

The CPU architecture (given by cpu_architecture()) is checkpointed and
verified against the CPU architecture on restart. We require that the
restart architecture must be at least as new as the checkpoint
architecture.
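
The rule above amounts to a single ordered comparison. The following is a
minimal sketch of it, not the code from this patch: enum arch_sketch and
sketch_check_arch() are hypothetical names, the enum only assumes that a
larger value means a newer architecture, and rejecting an unknown
architecture is an assumption of this example.

```c
/* Hypothetical ordering: a larger value means a newer architecture. */
enum arch_sketch {
	ARCH_SKETCH_UNKNOWN = 0,
	ARCH_SKETCH_V5,
	ARCH_SKETCH_V6,
	ARCH_SKETCH_V7,
};

/* Return 0 if the restart may proceed, -1 if it must be refused. */
static int sketch_check_arch(enum arch_sketch ckpt, enum arch_sketch restart)
{
	/* Assumption: an unknown architecture cannot be compared safely. */
	if (ckpt == ARCH_SKETCH_UNKNOWN || restart == ARCH_SKETCH_UNKNOWN)
		return -1;

	/* The restart side must be at least as new as the checkpoint side. */
	return restart >= ckpt ? 0 : -1;
}
```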

We checkpoint whether the system is running with CONFIG_MMU or not and
require the same configuration for the system on which we restore the
process. As discussed in the original post of these patches, it should be
possible to checkpoint a non-mmu process and restart it on an mmu system.
However, the implementation and testing is left for someone with knowledge
about both configurations. (See
https://lists.linux-foundation.org/pipermail/containers/2010-March/023996.html)

Obviously, processes using the old ARM ABI cannot be restarted on kernels
configured with CONFIG_AEABI and without CONFIG_OABI_COMPAT. The same goes
for restarting processes using AEABI on kernels configured without
CONFIG_AEABI. Unfortunately, if the kernel on which we checkpoint is
configured with CONFIG_OABI_COMPAT there is no way of knowing which ABI the
process actually uses. Therefore, we raise warnings on restart whenever in
doubt and continue with the restart process optimistically.
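
The ABI rules can be read as a small decision table. Below is one possible
sketch of that table, not the code from this patch. The names struct
abi_config_sketch, enum abi_verdict and sketch_check_abi() are
hypothetical, and treating the "unknown ABI" case as safe when the restart
kernel can run both ABIs is an interpretation made for this example.

```c
/* Hypothetical model of the two kernel options discussed above. */
struct abi_config_sketch {
	int aeabi;        /* models CONFIG_AEABI */
	int oabi_compat;  /* models CONFIG_OABI_COMPAT */
};

enum abi_verdict { ABI_OK, ABI_WARN, ABI_FAIL };

static enum abi_verdict sketch_check_abi(struct abi_config_sketch ckpt,
					 struct abi_config_sketch restart)
{
	/* Which user space ABIs can the restart kernel run? */
	int restart_runs_oabi = !restart.aeabi || restart.oabi_compat;
	int restart_runs_eabi = restart.aeabi;

	/* Checkpoint kernel without AEABI: the process must use the old ABI. */
	if (!ckpt.aeabi)
		return restart_runs_oabi ? ABI_OK : ABI_FAIL;

	/* AEABI without OABI_COMPAT: the process must use AEABI. */
	if (!ckpt.oabi_compat)
		return restart_runs_eabi ? ABI_OK : ABI_FAIL;

	/* The checkpoint kernel ran both ABIs, so the process ABI is unknown. */
	if (restart_runs_oabi && restart_runs_eabi)
		return ABI_OK;

	/* In doubt: warn and continue optimistically. */
	return ABI_WARN;
}
```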

Other:
------
Regarding ThumbEE, the thumbee_state field of the thread_info is stored
in checkpoints when CONFIG_ARM_THUMBEE is set; otherwise 0 is stored. If
a value other than 0 was checkpointed and CONFIG_ARM_THUMBEE is not set
on the restore system, the restore is aborted. Feedback on this
implementation is very welcome.
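
A minimal sketch of the ThumbEE handling described above, not the code
from this patch. The helper names and the have_thumbee flag (standing in
for CONFIG_ARM_THUMBEE) are hypothetical.

```c
#include <stdint.h>

/* Value to store in the checkpoint image for the thumbee_state field. */
static uint32_t sketch_save_thumbee(int have_thumbee, uint32_t thumbee_state)
{
	/* Without ThumbEE support there is no state to save, so store 0. */
	return have_thumbee ? thumbee_state : 0;
}

/* Return 0 on success, -1 if the restore must be aborted. */
static int sketch_restore_thumbee(int have_thumbee, uint32_t saved,
				  uint32_t *thumbee_state)
{
	if (!have_thumbee) {
		/* A non-zero saved value needs ThumbEE support to restore. */
		return saved == 0 ? 0 : -1;
	}

	*thumbee_state = saved;
	return 0;
}
```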

Added support for the sys_checkpoint and sys_restart syscalls on ARM:
__NR_checkpoint         367
__NR_restart            ...
From: Oren Laadan
Date: Sunday, June 20, 2010 - 2:34 pm

Applied.

--