Jeff Roberson, the primary developer of FreeBSD's revived ULE scheduler [story], has committed a new Python/Tkinter tool, schedgraph, to the FreeBSD-CURRENT tree. Schedgraph will assist with scheduler testing and refinement, and will help developers study application load and the corresponding system behavior. In the initial commit message, Jeff gives a simple description:
"Schedgraph takes input from files produces by ktrdump -ct when KTR_SCHED is compiled into the kernel. The output represents the states of each thread with colored line segments as well as colored points for non-state scheduler events. Each line segment and point is clickable to obtain extra detail."
Jeff includes a screenshot and sample data. Robert Watson follows up with a link to pointers on getting KTR working. Jeff has also been very busy committing more fixes to the ULE scheduler, a couple of which solve long-standing bugs. Read on for details.
From: Jeff Roberson [email blocked]
To: freebsd-current
Subject: schedgraph.py
Date: 2004-12-26 0:31:19

I took a break from working on VFS to implement a tool that will help me further refine the ULE scheduler. This may also be interesting in understanding application behavior under load, analyzing lock contention and preemption in the kernel, etc.

To use the tool, you will need to define KTR_SCHED in KTR_COMPILE and KTR_MASK. I'd also bump entries up to 32768 or larger so you can grab a few seconds of data. Run your workload, and then capture the data with 'ktrdump -ct > ktr.out'. Then you simply run python schedgraph.py ktr.out. This requires a recent version of python and ports/x11-toolkits/py-tkinter.

Here's a screenshot from a recent run:

http://www.chesapeake.net/~jroberson/schedgraph.jpg

I also have some sample data at http://www.chesapeake.net/~jroberson/smp.out.gz if you want to play with the tool without capturing data.

The configuration page acts as a legend so you can understand the colors. The least obvious feature of the display is that the background color changes according to the cpu that the thread is executing on. In the screenshot I posted, light grey is cpu 0 and dark grey is cpu 1. You can also click on any event for greater detail. For events that exist on two threads, you may click on the corresponding thread's name in the event popup to change to that event.

Feedback welcome, patches for new features are even more welcome.

Cheers,
Jeff

---------- Forwarded message ----------
Date: Sun, 26 Dec 2004 00:13:07 +0000 (UTC)
From: Jeff Roberson [email blocked]
To: src-committers, cvs-src, cvs-all
Subject: cvs commit: src/tools/sched schedgraph.py

jeff 2004-12-26 00:13:07 UTC

FreeBSD src repository

Added files:
  tools/sched schedgraph.py
Log:
  - Add 'schedgraph' a scheduler trace visualization tool written with python and tkinter. Schedgraph takes input from files produces by ktrdump -ct when KTR_SCHED is compiled into the kernel. The output represents the states of each thread with colored line segments as well as colored points for non-state scheduler events. Each line segment and point is clickable to obtain extra detail.

Revision  Changes   Path
1.1       +1209 -0  src/tools/sched/schedgraph.py (new)
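The capture steps Jeff describes can be wrapped in a few lines of Python. The following is only a minimal sketch: the commands `ktrdump -ct` and `python schedgraph.py ktr.out` come from the email, while the helper names `capture_trace` and `schedgraph_command` and the `runner` parameter are made up for illustration. It assumes a FreeBSD kernel already built with KTR_SCHED enabled as described above.

```python
import subprocess
from pathlib import Path

# Command quoted in Jeff's email. It is only useful on a FreeBSD kernel
# that was built with KTR_SCHED in KTR_COMPILE and KTR_MASK.
KTRDUMP_CMD = ["ktrdump", "-ct"]


def capture_trace(out_path="ktr.out", runner=subprocess.run):
    """Write the output of 'ktrdump -ct' to out_path and return the path.

    'runner' is a hypothetical hook so the function can be exercised
    without a real ktrdump binary; by default it is subprocess.run.
    """
    out = Path(out_path)
    with out.open("wb") as fh:
        # Equivalent to the shell redirection 'ktrdump -ct > ktr.out'.
        runner(KTRDUMP_CMD, stdout=fh, check=True)
    return out


def schedgraph_command(trace_path, script="schedgraph.py", python="python"):
    """Build the argv for 'python schedgraph.py <trace>' without running it."""
    return [python, script, str(trace_path)]
```

On a suitably configured machine, one would run the workload, call `capture_trace()`, and then pass the result of `schedgraph_command(...)` to `subprocess.run`. Keeping the second step as a plain argument list makes it easy to inspect before launching the Tk window.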
From: Julian Elischer [email blocked]
Subject: Re: schedgraph.py
Date: 2004-12-26 6:30:02

Jeff Roberson wrote:

> I took a break from working on VFS to implement a tool that will help me
> further refine the ULE scheduler. This may also be interesting in
> understanding application behavior under load, analyzing lock contention
> and preemption in the kernel, etc.
> [...]

robert watson has been doing some work in a similar (but not as detailed) vein. he has some code that produces some quite nice graphs of threads and cpus using KTR.
From: Robert Watson [email blocked]
Subject: Re: schedgraph.py
Date: 2004-12-26 11:02:48

On Sat, 25 Dec 2004, Jeff Roberson wrote:

> To use the tool, you will need to define KTR_SCHED in KTR_COMPILE and
> KTR_MASK. I'd also bump entries up to 32768 or larger so you can grab a
> few seconds of data. Run your workload, and then capture the data with
> 'ktrdump -ct > ktr.out'. Then you simply run python schedgraph.py
> ktr.out. This requires a recent version of python and
> ports/x11-toolkits/py-tkinter.

Great!

For those who need a little more hand-holding getting KTR running, here's a URL to try:

http://www.watson.org/~robert/freebsd/netperf/ktr/

It's been my hope people would start producing more post-processing tools -- KTR can collect some really great data that's just sitting there waiting to be mined. I'd be interested in seeing post-processing tools for locking as well. This looks like a great tool that will be really helpful in understanding behavior and performance.

Thanks!

Robert N M Watson
From: Jeff Roberson [email blocked]
Subject: Re: schedgraph.py
Date: 2004-12-26 18:17:41

On Sun, 26 Dec 2004, Robert Watson wrote:

>
> On Sat, 25 Dec 2004, Jeff Roberson wrote:
>
> > To use the tool, you will need to define KTR_SCHED in KTR_COMPILE and
> > KTR_MASK. I'd also bump entries up to 32768 or larger so you can grab a
> > few seconds of data. Run your workload, and then capture the data with
> > 'ktrdump -ct > ktr.out'. Then you simply run python schedgraph.py
> > ktr.out. This requires a recent version of python and
> > ports/x11-toolkits/py-tkinter.
>
> Great!
>
> For those who need a little more hand-holding getting KTR running, here's
> a URL to try:
>
> http://www.watson.org/~robert/freebsd/netperf/ktr/
>
> It's been my hope people would start producing more post-processing tools
> -- KTR can collect some really great data that's just sitting there
> waiting to be mined. I'd be interested in seeing post-processing tools
> for locking as well. This looks like a great tool that will be really
> helpful in understanding behavior and performance.

Well, if you want to display contention, you can filter on that using the configuration menu, and display only contested locks. This tool could be easily extended to add events for picking up and dropping mutexes as well. It would only require a new KTR and a regexp to match it.
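To give a feel for the "new KTR and a regexp" extension Jeff mentions, here is a rough Python sketch of the regexp half. The message format it matches is entirely hypothetical: it is not taken from the kernel sources or from schedgraph.py, and the names `LOCK_EVENT` and `parse_lock_event` are invented for this example. A real extension would have to match whatever text the new KTR tracepoint actually emits.

```python
import re

# HYPOTHETICAL trace message for a mutex pick-up/drop event, e.g.
#   mtx acquire: td0 lock "Giant"
# The real wording would be defined by the new KTR tracepoint.
LOCK_EVENT = re.compile(
    r'mtx (?P<op>acquire|release): (?P<thread>\S+) lock "(?P<lock>[^"]+)"'
)


def parse_lock_event(message):
    """Return a dict with op, thread and lock, or None if no match."""
    match = LOCK_EVENT.search(message)
    return match.groupdict() if match else None
```

For example, `parse_lock_event('mtx acquire: td0 lock "Giant"')` yields a dictionary with the operation, thread and lock name, while unrelated scheduler messages return `None` and can be handed to other matchers.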
Most of the changes were just
Most of the changes were just cleaning up after some commits that I wasn't around to take care of during the 5.3 release. The few bug fixes probably had little impact, but all of this paves the way for some performance work that needs to be done on the load balancer.
I've already committed a patch that improved performance by 15% on super-smack with 4BSD as a result of using schedgraph. I found some cases where we were doing extra context switches that weren't needed.
Re: Most of the changes were just
Great news Jeff, thanks for the update!
thanks
I'm very excited about ULE being back on track. I'm just a desktop user (well, mainly; I also maintain a relatively low-traffic server for http and ftp), not exactly the target market of FreeBSD, but I have come to like it very much (I found it easier to learn than other Unix-like OSes, and it is more fun!). Especially under heavy load (portupgrade -a, or encoding TV captures into AVIs with mencoder/avidemux2), I found interactivity with ULE very good. I could watch movies (even xvid) without any glitches, and sometimes I even had to check whether portupgrade had stalled (at an ncurses options screen)!
So I'm grateful for your work. I would be happy to have ULE back on STABLE (I know about the commenting out stuff, but it's too risky for me). Thanks.