Tracking Historical Performance

Submitted by Jeremy
on January 23, 2008 - 1:03pm

"I'd like to send a small update on my progress on the Performance Tracker project," noted Erik Cederstrand on the FreeBSD -current mailing list. He continued, "I now have a small setup of a server and a slave chugging along, currently collecting data. I'm following CURRENT and collecting results from super-smack and unixbench." The project performs regular benchmarks of the FreeBSD -current source tree using Unixbench and Super Smack, allowing you to chart the results over time. Erik highlighted an example of a visible change in performance when the generic kernel moved from the 4BSD scheduler to the ULE scheduler on October 19th, 2007.

Kris Kennaway responded favorably, then noted, "one suggestion I have is that as more metrics are added it becomes important for an 'at a glance' overview of changes so we can monitor for performance improvements and regressions among many workloads." He went on to suggest, "at some point the ability to annotate the data will become important (e.g. 'We understand the cause of this, it was r1.123 of foo.c, which was corrected in r1.124. The developer responsible has been shot.')" Erik agreed with both recommendations, and noted that he would continue to work in that direction.


From: Erik Cederstrand <erik@...>
Subject: Performance Tracker project update
Date: Jan 23, 12:48 am 2008

Hi

I'd like to send a small update on my progress on the Performance
Tracker project.

I now have a small setup of a server and a slave chugging along,
currently collecting data. I'm following CURRENT and collecting results
from super-smack and unixbench.

The project still needs some work, but there's a temporary web interface
to the data here: http://littlebit.dk:5000/plot/. Apart from the
plotting it's possible to compare two dates and see the files that have
changed. Error bars are 3*standard deviation, for the points with
multiple measurements.

Of interest is e.g. super-smack (select-key, 1 client) right when the
GENERIC kernel was moved from the 4BSD to ULE scheduler on Oct. 19.
Unixbench (arithmetic test, float) also has a significant jump on Oct. 3.

The setup of the slave is documented roughly on the page but I'll be
writing a full report and documentation over the next month.

Comments are very welcome but please follow up on performance@.

Thanks,
Erik
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
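
The error bars Erik describes (3 times the standard deviation, only for
points with multiple measurements) could be computed roughly as follows.
This is a minimal sketch, not the tracker's actual code: the function
name, the sample values, and the choice of the sample (n-1) standard
deviation are all assumptions made for illustration.

```python
import statistics


def error_bar(samples, k=3.0):
    """Return (mean, half_width) for one plotted point.

    half_width is k times the sample standard deviation (an assumption;
    the thread does not say which estimator is used). A point with a
    single measurement gets no error bar, represented here as None.
    """
    mean = statistics.mean(samples)
    if len(samples) < 2:
        return mean, None
    return mean, k * statistics.stdev(samples)


# Hypothetical results from three runs of one benchmark on one date.
mean, half_width = error_bar([100.0, 102.0, 98.0])
```

A point would then be drawn at mean with a bar from mean - half_width to
mean + half_width.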


From: Kris Kennaway <kris@...>
Subject: Re: Performance Tracker project update
Date: Jan 23, 7:44 am 2008

Erik Cederstrand wrote:
> Hi
>
> I'd like to send a small update on my progress on the Performance
> Tracker project.
>
> I now have a small setup of a server and a slave chugging along,
> currently collecting data. I'm following CURRENT and collecting results
> from super-smack and unixbench.
>
> The project still needs some work, but there's a temporary web interface
> to the data here: http://littlebit.dk:5000/plot/. Apart from the
> plotting it's possible to compare two dates and see the files that have
> changed. Error bars are 3*standard deviation, for the points with
> multiple measurements.
>
> Of interest is e.g. super-smack (select-key, 1 client) right when the
> GENERIC kernel was moved from the 4BSD to ULE scheduler on Oct. 19.
> Unixbench (arithmetic test, float) also has a significant jump on Oct. 3.
>
> The setup of the slave is documented roughly on the page but I'll be
> writing a full report and documentation over the next month.
>
> Comments are very welcome but please follow up on performance@.

This is coming along very nicely indeed!

One suggestion I have is that as more metrics are added it becomes
important for an "at a glance" overview of changes so we can monitor for
performance improvements and regressions among many workloads.

One way to do this would be a matrix of each metric with its change
compared to recent samples. e.g. you could do a Student's t comparison
of today's numbers with those from yesterday, or from a week ago, and
colour-code those that show a significant deviation from "no change".
This might be a bit noisy on short timescales, so you could aggregate
data into larger bins and compare e.g. moving 1-week aggregates.
Fluctuations on short timescales won't stand out, but if there is a real
change then it will show up less than a week later.

These significant events could also be graphed themselves and/or a
history log maintained (or automatically annotated on the individual
graphs) so historical changes can also be pinpointed.

At some point the ability to annotate the data will become important
(e.g. "We understand the cause of this, it was r1.123 of foo.c, which
was corrected in r1.124. The developer responsible has been shot.")

Kris

P.S. If I understand correctly, the float test shows a regression? The
metric is calculations/second, so higher = better?
_______________________________________________
freebsd-performance@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to "freebsd-performance-unsubscribe@freebsd.org"
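
Kris's idea of comparing moving 1-week aggregates with a t-test can be
sketched as below. This is an illustration under stated assumptions, not
code from the project: it uses Welch's unequal-variance variant of the
t-test rather than the classic Student's form, the function names and
data layout are hypothetical, and the cutoff of 2.0 on |t| is a rough
stand-in for a proper significance lookup.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for b vs. a."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    diff = statistics.mean(b) - statistics.mean(a)
    if va + vb == 0:
        # Both samples are constant: any difference is infinitely "clear".
        t = 0.0 if diff == 0 else math.copysign(math.inf, diff)
        return t, len(a) + len(b) - 2
    t = diff / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def compare_weeks(daily, window=7, threshold=2.0):
    """Compare the latest `window` days with the `window` days before them.

    `daily` is a list of per-day result lists, oldest first (hypothetical
    layout). Returns (t, df, flag), where flag is "up", "down" or
    "no change"; the threshold on |t| is an assumed rough cutoff.
    """
    if len(daily) < 2 * window:
        raise ValueError("need at least two full windows of data")
    previous = [x for day in daily[-2 * window:-window] for x in day]
    recent = [x for day in daily[-window:] for x in day]
    t, df = welch_t(previous, recent)
    if abs(t) < threshold:
        flag = "no change"
    else:
        flag = "up" if t > 0 else "down"
    return t, df, flag


# Hypothetical data: a flat week followed by a week with higher scores.
flat_week = [[100.0, 101.0, 99.0] for _ in range(7)]
fast_week = [[110.0, 111.0, 109.0] for _ in range(7)]
t_shift, df_shift, flag_shift = compare_weeks(flat_week + fast_week)
t_flat, df_flat, flag_flat = compare_weeks(flat_week + flat_week)
```

Running this once per metric and colour-coding by flag would give the kind
of matrix Kris describes. Whether "up" is good or bad depends on the
metric, so each one would also need a higher-is-better marker.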


From: Erik Cederstrand <erik@...>
Subject: Re: Performance Tracker project update
Date: Jan 23, 9:49 am 2008

Kris Kennaway wrote:
>
> This is coming along very nicely indeed!
>
> One suggestion I have is that as more metrics are added it becomes
> important for an "at a glance" overview of changes so we can monitor for
> performance improvements and regressions among many workloads.
>
> One way to do this would be a matrix of each metric with its change
> compared to recent samples. e.g. you could do a Student's t comparison
> of today's numbers with those from yesterday, or from a week ago, and
> colour-code those that show a significant deviation from "no change".
> This might be a bit noisy on short timescales, so you could aggregate
> data into larger bins and compare e.g. moving 1-week aggregates.
> Fluctuations on short timescales won't stand out, but if there is a real
> change then it will show up less than a week later.

I agree that there's a need for an overview and some sort of
notification. I've been collecting historical data to get a baseline for
the statistics and I'll try to see what I can do over the next weeks.

> These significant events could also be graphed themselves and/or a
> history log maintained (or automatically annotated on the individual
> graphs) so historical changes can also be pinpointed.
>
> At some point the ability to annotate the data will become important
> (e.g. "We understand the cause of this, it was r1.123 of foo.c, which
> was corrected in r1.124. The developer responsible has been shot.")

There's a field in the database for this sort of thing. I just think it
needs some sort of authentication. That'll have to wait a bit.

> P.S. If I understand correctly, the float test shows a regression? The
> metric is calculations/second, so higher = better?

The documentation on Unixbench is scarce, but I would think so.

BTW if anyone's interested my SVN repo is online at:

svn://littlebit.dk/website/trunk (Pylons project)
svn://littlebit.dk/tracker/trunk (sh/Python scripts for running the
server and slaves)

Be careful with your eyes - this is my first attempt at both shell
scripting and Python :-)

Erik