Re: [PATCH 1/3] Add prctl commands PR_GET_TSC and PR_SET_TSC

Previous thread: [PATCH 13/13] Get rid of ipc_lock_down() by Nadia.Derbey on Friday, April 11, 2008 - 9:17 am. (1 message)

Next thread: [PATCH 2/3] x86: Implement prctl PR_GET_TSC and PR_SET_TSC by Erik Bosman on Friday, April 11, 2008 - 9:55 am. (23 messages)
From: Erik Bosman
Date: Friday, April 11, 2008 - 9:54 am

This patch adds prctl commands that make it possible
to deny the execution of timestamp counter instructions in userspace.
If this is not implemented on a specific architecture,
prctl will return -EINVAL.

	Signed-off-by: Erik Bosman <ejbosman@cs.vu.nl>
---
 include/linux/prctl.h |    6 ++++++
 kernel/sys.c          |   13 ++++++++++++-
 2 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/include/linux/prctl.h b/include/linux/prctl.h
index 3800639..5c80b19 100644
--- a/include/linux/prctl.h
+++ b/include/linux/prctl.h
@@ -67,4 +67,10 @@
 #define PR_CAPBSET_READ 23
 #define PR_CAPBSET_DROP 24

+/* Get/set the process' ability to use the timestamp counter instruction */
+#define PR_GET_TSC 25
+#define PR_SET_TSC 26
+# define PR_TSC_ENABLE		1	/* allow the use of the timestamp counter */
+# define PR_TSC_SIGSEGV		2	/* throw a SIGSEGV instead of reading the TSC */
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index a626116..6a0cc71 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -67,6 +67,12 @@
 #ifndef SET_ENDIAN
 # define SET_ENDIAN(a,b)	(-EINVAL)
 #endif
+#ifndef GET_TSC_CTL
+# define GET_TSC_CTL(a)		(-EINVAL)
+#endif
+#ifndef SET_TSC_CTL
+# define SET_TSC_CTL(a)		(-EINVAL)
+#endif

 /*
  * this is where the system-wide overflow UID and GID are defined, for
@@ -1737,7 +1743,12 @@ asmlinkage long sys_prctl(int option, unsigned long arg2, unsigned long arg3,
 #else
 			return -EINVAL;
 #endif
-
+		case PR_GET_TSC:
+			error = GET_TSC_CTL(arg2);
+			break;
+		case PR_SET_TSC:
+			error = SET_TSC_CTL(arg2);
+			break;
 		default:
 			error = -EINVAL;
 			break;
--

From: Andi Kleen
Date: Sunday, April 13, 2008 - 2:07 am

You forgot to state why you need that.

-Andi
--

From: Erik Bosman
Date: Sunday, April 13, 2008 - 2:44 am

I'm using it for deterministic replay. With this trap it is
possible to emulate the instruction using ptrace and knowing
the outcome. Deterministic replay can be useful, amongst other
things, for debugging and security (instructing your debugger
to undo instructions for example, to see what happened before
a fault.)

Without this trap, full emulation would have to be used to catch
the instruction, leading to bad performance.

Without the timestamp counter, the only instruction leading
to non-determinism (that I'm aware of) is the CPUID instruction
that returns on which core it runs, but that doesn't seem to
be used that much.

Erik Bosman
--

From: Andi Kleen
Date: Sunday, April 13, 2008 - 9:47 am

Ok that should be in the changelog.

BTW x86 CPUs are not fully deterministic. For example, there are a few errata
that can lead to differing EFLAGS (generally for instructions with undefined
flags output) based on random internal pipeline conditions.

In my experience even simulators claiming to be fully deterministic
are not always. e.g. I remember trying to use instruction counts
on Simics to reproduce an issue for a scripted boot setup (with no user input),

There's also RDPMC, but by default the kernel does not enable that
for ring 3. And if you go for oddities there are the random number
generator instructions on VIA CPUs which will obviously not 
be repeatable.

-Andi
--

From: H. Peter Anvin
Date: Sunday, April 13, 2008 - 12:29 pm

There have been calls for an RDPMC counter which exposes true CPU cycles 
(varying with frequency, as opposed to wall time.)  And anything I/O -- 
including the RNG -- is obviously off.

I think what Erik is trying to do is to make it possible to disable as 
many of these in the kernel as possible; I/O is easy, it's off by 
default; RDTSC and RDPMC can be disabled in the kernel, and I think even 
XSTORE can be disabled.

	-hpa
--

From: Andi Kleen
Date: Sunday, April 13, 2008 - 1:50 pm

I'm a little sceptical it will work reliably, but we'll see.

There's also LSL (forgot that earlier). It is used by vgetcpu()
because it's faster than CPUID or RDTSCP. The kernel sets up
a magic segment which has different limits for different CPUs.

-Andi

--

From: H. Peter Anvin
Date: Sunday, April 13, 2008 - 1:53 pm

Yeah, that's unlikely to be found in code that isn't actively looking 
for it, though.

	-hpa
--
