Re: [patch 21/24] perfmon: Intel architectural PMU support (x86)

Previous thread: [patch 20/24] perfmon: system calls interface by eranian on Wednesday, November 26, 2008 - 1:42 am. (22 messages)

Next thread: [patch 22/24] perfmon: AMD64 processor support (x86) by eranian on Wednesday, November 26, 2008 - 1:42 am. (2 messages)
From: eranian
Date: Wednesday, November 26, 2008 - 1:42 am

This patch adds Intel architectural PMU support (version 1, 2, and 3).

Signed-off-by: Stephane Eranian <eranian@gmail.com>
--

Index: o3/arch/x86/perfmon/Makefile
===================================================================
--- o3.orig/arch/x86/perfmon/Makefile	2008-11-25 18:09:47.000000000 +0100
+++ o3/arch/x86/perfmon/Makefile	2008-11-25 18:21:33.000000000 +0100
@@ -3,3 +3,4 @@
 # Contributed by Stephane Eranian <eranian@hpl.hp.com>
 #
 obj-$(CONFIG_PERFMON)			+= perfmon.o
+obj-$(CONFIG_X86_PERFMON_INTEL_ARCH)	+= perfmon_intel_arch.o
Index: o3/arch/x86/perfmon/perfmon_intel_arch.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ o3/arch/x86/perfmon/perfmon_intel_arch.c	2008-11-25 18:22:00.000000000 +0100
@@ -0,0 +1,628 @@
+/*
+ * This file contains the Intel architectural perfmon v1, v2, v3
+ * description tables.
+ *
+ * Architectural perfmon was introduced with Intel Core Solo/Duo
+ * processors.
+ *
+ * Copyright (c) 2006-2007 Hewlett-Packard Development Company, L.P.
+ * Contributed by Stephane Eranian <eranian@hpl.hp.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ * 02111-1307 USA
+ */
+#include <linux/kprobes.h>
+#include <linux/perfmon_kern.h>
+#include <asm/msr.h>
+#include <asm/apic.h>
+
+static u64 enable_mask[PFM_MAX_PMCS];
+static u16 max_enable;
+static int ...
From: Thomas Gleixner
Date: Wednesday, November 26, 2008 - 7:55 am

in arch/x86/include/asm/intel_arch_perfmon.h we have already:

union cpuid10_eax {
        struct {
                unsigned int version_id:8;
                unsigned int num_counters:8;
                unsigned int bit_width:8;
                unsigned int mask_length:8;
        } split;
        unsigned int full;
};


A) static








So we have:

   set->used_pmcs and enable_mask and max_enable.

Why can set->used_pmcs contain bits which are not in the enable_mask
in the first place ? Why does the arch code not tell the generic code
which pmcs are available so we can avoid all this mask, weight
whatever magic ?

We store the same information in slightly different incarnations in
various places and then we need to mingle them all together to get to

Why are the counters enabled at all when an overflow is pending, which
stopped the counters anyway ?

Thanks,

	tglx
--

From: stephane eranian
Date: Wednesday, November 26, 2008 - 8:45 am

Thomas,


Ok, some explanations are needed here.

Both perfmon_intel_arch.c and perfmon_amd64.c are supposed to be kernel modules.
They are hardcoded right now to make the patch simpler. Have PMU description be
modules is a key features because it allows adding new processor support without
necessarily patch the core kernel or rebooting. This has been working
nicely on Itanium.
With the introduction of Intel architectural Perfmon (IA32 SDM chapter
18), this becomes
possible on Intel X86 as well. on AMD64 processor, this is not really
an issue as they
all use the same architected PMU, except for family 10h which has nice
extensions.

In anycase, the idea is to encapsulate as much as possible code
related into a PMU model
into each module. That is why you are seing some redundancy.

There is a difference between enable_mask and used_pmcs. The used_pmcs
bitmasks shows
all the config registers in use. Whereas enable_mask shows the all
config registers which have
start/stop capabilities. For the basic AMD64 PMU (4 counters)
used_pmcs and enable_mask
are equivalent, but that is not the case on Barcelona once we support
IBS and sampling. So

Because used_pmcs is part of generic code and enable_mask is a x86 construct.
As I said above, for now, I could drop enable_mask.
The arch code already export the list of available pmcs and pmds in
impl_pmcs and impl_pmds.

There is a difference between impl_pmcs and used_pmcs. The former list
everything that is
available. The latter shows what we are currently using. We may be
using fewer registers than
what is available, and we use this information to avoid
saving/restoring MSR on context switch
Because on Intel and AMD64, counters are not automatically frozen on interrupt.
On Intel X86, they can be configured to do so, but it is an all or
nothing setting.
I am not using this option because we would then have a problem with the NMI
watchdog given that it is also using a counter.
--

From: Thomas Gleixner
Date: Wednesday, November 26, 2008 - 9:10 am

Understood. If we need that in the near future then it's ok to keep
it, it just did not make any sense from the current code.

But I think you should do this once when you set up the context and
keep that as a separate mask. Right now you evaluate enable_mask and


Well, my question was: why do we have to stop the counters when an
overflow is pending already ?

The overflow pending is set inside of stop_save() and cleared
somewhere else. 

stop_save() is called from pfm_arch_stop() and
pfm_arch_ctxswout_thread(). The first thing it does is to disable the
counters.

Now at some points the counters are obviously reenabled for this
context, but why are they reenabled _before_ the pending overflow has
been resolved ? For N counters that N * 2 wrmsrl() overhead.

Thanks,

	tglx
--

From: Andi Kleen
Date: Wednesday, November 26, 2008 - 9:45 am

To install the new processor in the first place you have 
to reboot anyways.

Also typically at least for new families (which tend to be the
only ones with radically new performance counters) it's typically
needed to change some things in the core kernel anyways. So 

It becomes possible, but without having to use any modules. It should
just work.

Probably even without it worked -- at least if you limit yourself to 
family 6 -- because the register layout all stayed the same too.

That said having modular PMUs is probably a good thing for
distribution kernels, but there is really no need for any 
code compromises just to avoid a core kernel patch now and then.

-Andi
--

From: stephane eranian
Date: Monday, December 1, 2008 - 8:09 pm

Well, I need more than eax. We could rewrite this union to include
eax, edx,  So I propose we call it union cpuid10 and define it as:

union cpuid_eax {
       struct {
               unsigned int version_id:8;
               unsigned int num_counters:8;
               unsigned int bit_width:8;
               unsigned int mask_length:8;
       } split_eax;
       struct {
              unsigned int num_counters:5;
              unsigned int bit_width:8;
              unsigned int reserved:19;
       } split_edx;
       unsigned int full;
}
--

Previous thread: [patch 20/24] perfmon: system calls interface by eranian on Wednesday, November 26, 2008 - 1:42 am. (22 messages)

Next thread: [patch 22/24] perfmon: AMD64 processor support (x86) by eranian on Wednesday, November 26, 2008 - 1:42 am. (2 messages)