Re: [PATCH V2 2/5] ara virt interface of perf to support kvm guest os statistics collection in guest os

From: Zhang, Yanmin
Date: Monday, June 21, 2010 - 2:31 am

The 2nd patch is to change the definition of perf_event to facilitate
perf attr copy when a hypercall happens.

Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>

---

--- linux-2.6_tip0620/include/linux/perf_event.h	2010-06-21 15:19:52.821999849 +0800
+++ linux-2.6_tip0620perfkvm/include/linux/perf_event.h	2010-06-21 16:53:49.283999849 +0800
@@ -188,7 +188,10 @@ struct perf_event_attr {
 	__u64			sample_type;
 	__u64			read_format;
 
-	__u64			disabled       :  1, /* off by default        */
+	union {
+		__u64		flags;
+		struct {
+			__u64	disabled       :  1, /* off by default        */
 				inherit	       :  1, /* children inherit it   */
 				pinned	       :  1, /* must always be on PMU */
 				exclusive      :  1, /* only group on PMU     */
@@ -217,6 +220,8 @@ struct perf_event_attr {
 				mmap_data      :  1, /* non-exec mmap data    */
 
 				__reserved_1   : 46;
+		};
+	};
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
@@ -465,12 +470,6 @@ enum perf_callchain_context {
 # include <asm/local64.h>
 #endif
 
-struct perf_guest_info_callbacks {
-	int (*is_in_guest) (void);
-	int (*is_user_mode) (void);
-	unsigned long (*get_guest_ip) (void);
-};
-
 #ifdef CONFIG_HAVE_HW_BREAKPOINT
 #include <asm/hw_breakpoint.h>
 #endif
@@ -753,6 +752,20 @@ struct perf_event {
 
 	perf_overflow_handler_t		overflow_handler;
 
+	/*
+	 * pointers used by kvm perf paravirt interface.
+	 *
+	 * 1) Used in host kernel and points to host_perf_shadow which
+	 * has information about guest perf_event
+	 */
+	void				*host_perf_shadow;
+	/*
+	 * 2) Used in guest kernel and points to guest_perf_shadow which
+	 * is used as a communication area with host kernel. Host kernel
+	 * copies overflow data to it when an event overflows.
+	 */
+	void				*guest_perf_shadow;
+
 #ifdef CONFIG_EVENT_TRACING
 	struct ftrace_event_call	*tp_event;
 	struct event_filter		*filter;
@@ -838,6 +851,16 @@ struct perf_output_handle {
 	int				sample;
 };
 ...

We cannot allow a guest to pin a counter.

The other flags are also problematic.  I'd like to see virt-specific 
flags (probably we'll only need kernel/user and nested_hv for nested 
virtualization).

Something that is worrying is that we don't expose group information.  


It's strange to see both guest and host parts in the same patch.  

Perhaps we can have a batch read interface which will read many counters 
at once.  This would reduce the number of exits.  Also adjust the 

-- 
error compiling committee.c: too many arguments to function

--

From: Zhang, Yanmin
Date: Monday, June 21, 2010 - 7:08 pm

These flags are used by generic perf codes. To match with host kernel, we wish
Right. host kernel will reset it to 0 before create perf_event for guest os.
Here we couldn't delete the flag as it's used by perf generic codes.
I need separate the patch a little better. All definitions in include/linux/perf_event.h
How about to add more comments around struct guest_perf_attr->flags that
guest os developers should take a look at include/linux/perf_event.h?
It's a little hard to split the patches if they change the same file. Perhaps
It's a good idea. But that will touch many perf generic codes which causes it's hard


--


One way to do that and retain type safety is to have

     struct perf_client {
           struct perf_client_ops *ops;
           ...
     }

The client (kvm) can do

    struct kvm_perf_client {
          struct perf_client pc;
          // kvm specific stuff
    };

the callbacks receive struct perf_client and use container_of to reach 

With git, it's easy (once you're used to it):

   # go back one commit:
   git reset HEAD^
   # selectively add bits:
   git add -p
   # commit first patch
   git commit -s
   # selectively add bits:
   git add -p
   # commit second patch

I'm talking about the guest/host interface.  So you have one vmexit and 
many host perf calls.

-- 
error compiling committee.c: too many arguments to function

--

From: Zhang, Yanmin
Date: Tuesday, June 22, 2010 - 2:25 am

I understood what you were speaking. I mean, perf generic codes operate perf_event
one by one. At low layer, we just know one perf_event before calling hypercall to
vmexit to host kernel.



--


Ah.

We might fix that by having the perf ops work on guest memory, and add a 
commit that transfers them to the host.  But this is complicated and can 
be left for later.

-- 
error compiling committee.c: too many arguments to function

--