Re: [PATCH 6/6] ftrace: take advantage of variable length entries

Previous thread: [PATCH 5/6] ftrace: make work with new ring buffer by Steven Rostedt on Monday, September 29, 2008 - 8:02 pm. (1 message)

Next thread: [PATCH 3/6] ring_buffer: add paranoid check for buffer page by Steven Rostedt on Monday, September 29, 2008 - 8:02 pm. (1 message)
From: Steven Rostedt
Date: Monday, September 29, 2008 - 8:02 pm

[Empty message]
From: Pekka Paalanen
Date: Tuesday, September 30, 2008 - 10:33 am

Hi Steven,

This is great news. I have some questions below.


On Mon, 29 Sep 2008 23:02:42 -0400

I've read this paragraph over and over, and I can't really understand
"only storing the size of the ... event". Maybe you are trying to say
that now we are not consuming more ring space than the actual size of
the event?

Oh! I misunderstood also the last sentence, it's not generating extra

I'm just wondering, does this hold for both 32- and 64-bit kernels?
Long type is quite common in the structs. I realize this is going


This is different style than above, missing the struct
trace_mmiotrace_map intermediate step. Looks like a bug,
since struct mmiotrace_map is not the first field in

If we have variable length entries, doesn't that mean that
print_entry::buf will always hold the full nil-terminated message,
and that we could simply remove all the trace_seq_print_cont()
calls?

In fact, TRACE_FLAG_CONT will never be set. I didn't dig up the
other patches to verify, but trace_vprintk() isn't setting it

On the mmiotrace part, I don't see anything wrong here apart
from that one bug.


Thanks.

-- 
Pekka Paalanen
http://www.iki.fi/pq/
--

From: Steven Rostedt
Date: Tuesday, September 30, 2008 - 10:45 am

I believe 88 was what I came up with on x86_64. It was just a size to
pick out of the blue and has no real meaning. I'm not sure anyone uses it



This patch is bringing us to the stage to get rid of the CONT flag.

Thanks, I'll add a patch to my coming queue to handle this.

-- Steve

--

From: Steven Rostedt
Date: Tuesday, September 30, 2008 - 9:42 pm

Bah, I lied. The new locking design took a bit more time to do than 
I expected, and I forgot to add this fix. I'll add it tomorrow.

Thanks,

-- Steve

--

From: Ingo Molnar
Date: Wednesday, October 1, 2008 - 12:54 am

hm, there's a significant amount of type casts, see the grep below.

The ringbuffer becoming type-opaque has exactly these kinds of dangers, 
and as i suggested a few days ago, please think about a debug mode that 
stores the record type in the trace entry and validates it on extraction 
or something like that.

That might even be a robustness feature: the tracer should not crash if 
the trace buffer gets corrupted. ftrace had that invariant before, i 
think we should try to keep as many aspects of it as possible.

	Ingo

------------->
ring_buffer.c:		page = (struct buffer_page *)virt_to_page(addr);
ring_buffer.c:	cpu_buffer->reader_page = (struct buffer_page *)virt_to_page(addr);
ring_buffer.c:static void rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer);
ring_buffer.c:			page = (struct buffer_page *)virt_to_page(addr);
trace_boot.c:	struct trace_boot *field = (struct trace_boot *)entry;
trace.c:static DEFINE_PER_CPU(struct trace_array_cpu, global_trace_cpu);
trace.c:static DEFINE_PER_CPU(struct trace_array_cpu, max_data);
trace.c:		cont = (struct trace_field_cont *)ent;
trace.c:		struct ftrace_entry *field = (struct ftrace_entry *)entry;
trace.c:			(struct ctx_switch_entry *)entry;
trace.c:		struct special_entry *field = (struct special_entry *)entry;
trace.c:		struct stack_entry *field = (struct stack_entry *)entry;
trace.c:		struct print_entry *field = (struct print_entry *)entry;
trace.c:		struct ftrace_entry *field = (struct ftrace_entry *)entry;
trace.c:			(struct ctx_switch_entry *)entry;
trace.c:		struct special_entry *field = (struct special_entry *)entry;
trace.c:		struct stack_entry *field = (struct stack_entry *)entry;
trace.c:		struct print_entry *field = (struct print_entry *)entry;
trace.c:		struct ftrace_entry *field = (struct ftrace_entry *)entry;
trace.c:			(struct ctx_switch_entry *)entry;
trace.c:		struct special_entry *field = (struct special_entry *)entry;
trace.c:		struct print_entry *field = (struct print_entry *)entry;
trace.c:		struct ...
From: Steven Rostedt
Date: Wednesday, October 1, 2008 - 7:52 am

Although the ring buffer is type opaque, that isn't the cause of this 
issue. Ftrace still adds the types as needed. But since we no longer have 
the big "one type for all" union entry, we need to be a bit more careful. 
All types start with a one byte type id (giving us 255 types to use for 
ftrace, zero being saved).

The following patch fixes this bug and addresses the issue.

-- Steve

--

From: Steven Rostedt
Date: Wednesday, October 1, 2008 - 7:52 am

The mmiotrace map had a bug that would typecast the entry from
the trace to the wrong type.

This patch adds a macro to assign all entries of ftrace using the type
of the variable and checking the entry id. The typecasts are now done
in the macro for only those types that it knows about, which should
be all the types that are allowed to be read from the tracer.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 kernel/trace/trace.c           |   85 ++++++++++++++++++++++++++++-------------
 kernel/trace/trace.h           |   42 ++++++++++++++++++++
 kernel/trace/trace_mmiotrace.c |   14 ++++--
 3 files changed, 112 insertions(+), 29 deletions(-)

Index: linux-tip.git/kernel/trace/trace.c
===================================================================
--- linux-tip.git.orig/kernel/trace/trace.c	2008-10-01 10:41:27.000000000 -0400
+++ linux-tip.git/kernel/trace/trace.c	2008-10-01 10:41:59.000000000 -0400
@@ -1350,7 +1350,9 @@ print_lat_fmt(struct trace_iterator *ite
 	}
 	switch (entry->type) {
 	case TRACE_FN: {
-		struct ftrace_entry *field = (struct ftrace_entry *)entry;
+		struct ftrace_entry *field;
+
+		trace_assign_type(field, entry);
 
 		seq_print_ip_sym(s, field->ip, sym_flags);
 		trace_seq_puts(s, " (");
@@ -1363,8 +1365,9 @@ print_lat_fmt(struct trace_iterator *ite
 	}
 	case TRACE_CTX:
 	case TRACE_WAKE: {
-		struct ctx_switch_entry *field =
-			(struct ctx_switch_entry *)entry;
+		struct ctx_switch_entry *field;
+
+		trace_assign_type(field, entry);
 
 		T = field->next_state < sizeof(state_to_char) ?
 			state_to_char[field->next_state] : 'X';
@@ -1384,7 +1387,9 @@ print_lat_fmt(struct trace_iterator *ite
 		break;
 	}
 	case TRACE_SPECIAL: {
-		struct special_entry *field = (struct special_entry *)entry;
+		struct special_entry *field;
+
+		trace_assign_type(field, entry);
 
 		trace_seq_printf(s, "# %ld %ld %ld\n",
 				 field->arg1,
@@ -1393,7 +1398,9 @@ print_lat_fmt(struct trace_iterator *ite
 		break;
 	}
 	case ...
From: Ingo Molnar
Date: Wednesday, October 1, 2008 - 10:42 am

ah, the -rt type trick ;-)


could you rename it to trace_type() - that way the initialization could 

[ NOTE: there's namespace clash with a local symbol in ./trace_irqsoff.c 
  and a trace_type enum as well ]

	Ingo
--

From: Ingo Molnar
Date: Wednesday, October 1, 2008 - 10:47 am

ah, it cannot be done that way, as 'field' has to be known to the 
compiler.

so your patch is fine as-is, the extra line is an acceptable cost, as we 
get:

- the type filter (trying to cast an object outside of the narrow list 
  of trace entry types will cause a failed build - i.e. most of the 
  practical dangers of C type casting are avoided!)

- the runtime entry type checker

	Ingo
--

Previous thread: [PATCH 5/6] ftrace: make work with new ring buffer by Steven Rostedt on Monday, September 29, 2008 - 8:02 pm. (1 message)

Next thread: [PATCH 3/6] ring_buffer: add paranoid check for buffer page by Steven Rostedt on Monday, September 29, 2008 - 8:02 pm. (1 message)