Re: Poor interactive performance with I/O loads with fsync()ing

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Ingo Molnar
Date: Wednesday, March 17, 2010 - 2:37 am

* Nick Piggin <npiggin@suse.de> wrote:


A call-graph profile will show the precise reason for IO latencies, and their 
relatively likelihood.

It's really simple to do it with a recent kernel. Firstly, enable 
CONFIG_BLK_DEV_IO_TRACE=y, CONFIG_EVENT_PROFILE=y:

  Kernel performance events and counters (PERF_EVENTS) [Y/?] y
    Tracepoint profiling sources (EVENT_PROFILE) [Y/n/?] y
    Support for tracing block IO actions (BLK_DEV_IO_TRACE) [N/y/?] y

(boot into this kernel)

Then build perf via:

  cd tools/perf/
  make -j install

and then capture 10 seconds of the DB workload:

  perf record -f -g -a -e block:block_rq_issue -c 1 sleep 10

  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.251 MB perf.data (~10977 samples) ]

and look at the call-graph output:

  perf report

# Samples: 5
#
# Overhead          Command      Shared Object  Symbol
# ........  ...............  .................  ......
#
    80.00%        kjournald  [kernel.kallsyms]  [k] perf_trace_block_rq_issue
                  |
                  --- perf_trace_block_rq_issue
                      scsi_request_fn
                     |          
                     |--50.00%-- __blk_run_queue
                     |          cfq_insert_request
                     |          elv_insert
                     |          __elv_add_request
                     |          __make_request
                     |          generic_make_request
                     |          submit_bio
                     |          submit_bh
                     |          sync_dirty_buffer
                     |          journal_commit_transaction
                     |          kjournald
                     |          kthread
                     |          kernel_thread_helper
                     |          
                      --50.00%-- __generic_unplug_device
                                generic_unplug_device
                                blk_unplug
                                blk_backing_dev_unplug
                                sync_buffer
                                __wait_on_bit
                                out_of_line_wait_on_bit
                                __wait_on_buffer
                                wait_on_buffer
                                journal_commit_transaction
                                kjournald
                                kthread
                                kernel_thread_helper

    20.00%               as  [kernel.kallsyms]  [k] perf_trace_block_rq_issue
                         |
                         --- perf_trace_block_rq_issue
                             scsi_request_fn
                             __generic_unplug_device
                             generic_unplug_device
                             blk_unplug
                             blk_backing_dev_unplug
                             page_cache_async_readahead
                             generic_file_aio_read
                             do_sync_read
                             vfs_read
                             sys_read
                             system_call_fastpath
                             0x39f8ad4930


This (very simple) example had 80% of the IO in kjournald and 20% of it in 
'as'. The precise call-paths of IO issues are visible.

For general scheduler context-switch events you can use:

  perf record -f -g -a -e context-switches -c 1 sleep 10

see 'perf list' for all events.

Thanks,

	Ingo
--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
Re: Poor interactive performance with I/O loads with fsync ..., Ingo Molnar, (Wed Mar 17, 2:37 am)
Re: Poor interactive performance with I/O loads with fsync ..., Hans-Peter Jansen, (Sun Apr 11, 4:43 pm)