I've just noticed collectl is reporting inconsistent data for dm disks and the disks they're built from. The bottom line is that /proc/diskstats appears to be wrong, and I say this for two reasons: I know collectl is correctly reporting the data it reads, and I have also confirmed that iostat reports the same numbers as collectl.
Consider the following data snapshot while writing a large file to /tmp:
### RECORD 5 >>> poker <<< (1237225871.002) (Mon Mar 16 13:51:11 2009) ###
# DISK STATISTICS (/sec)
#       <---------reads---------><---------writes---------><--------averages--------> Pct
#Name   KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
sda          0      0    0    0    3204   7171   10  320     320     2    31      8    7
sdb          0      0    0    0       0      0    0    0       0     0     0      0    0
dm-0         0      0    0    0   28952      0 7238    4       4   267     4      0    7
dm-1         0      0    0    0       0      0    0    0       0     0     0      0    0
hda          0      0    0    0       0      0    0    0       0     0     0      0    0
Here's another sample, this time the KBs are closer:
### RECORD 10 >>> poker <<< (1237225998.002) (Mon Mar 16 13:53:18 2009) ###
# DISK STATISTICS (/sec)
#       <---------reads---------><---------writes---------><--------averages--------> Pct
#Name   KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
sda          0      0    0    0   38912   8000   77  505     505   141  1864     12  100
sdb          0      0    0    0       0      0    0    0       0     0     0      0    0
dm-0         0      0    0    0   32256      0 8064    4       4 17853  2251      0  100
dm-1         0      0    0    0       0      0    0    0       0     0     0      0    0
hda          0      0    0    0       0      0    0    0       0     0     0      0    0
This time KBs and util sort of agree, though I'd expect the dm KBs to be >= the device KBs.
-mark
queues
The values are KB/sec, not absolute values. The difference could be data sitting in the I/O queues waiting to be written. When you write to dm-0, the data is counted and passed directly to the I/O queue, where it has to wait until the disk is ready, which can take a long time if your load is seeky. There is also feedback: if the disk queue grows too big, the writers are throttled. You could sample the values less often to smooth out the buffering oscillations.
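For reference, here is a rough sketch of how a per-second rate like the ones above falls out of two /proc/diskstats snapshots. It is not collectl's or iostat's actual code. The function names are made up for illustration, and the counter values in the two snapshots are hypothetical round numbers rather than data from the poker machine. The field positions follow the kernel's documented layout (writes completed, writes merged and sectors written are the 8th, 9th and 10th whitespace-separated fields), and sectors are counted in 512-byte units.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts 512-byte sectors


def parse_diskstats(text):
    """Return {device: (writes_completed, writes_merged, sectors_written)}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or short lines
        stats[fields[2]] = (int(fields[7]), int(fields[8]), int(fields[9]))
    return stats


def write_rates(before, after, interval_sec):
    """Turn two counter snapshots into per-second write rates."""
    rates = {}
    for dev, (ios1, merged1, sectors1) in after.items():
        if dev not in before:
            continue
        ios0, merged0, sectors0 = before[dev]
        rates[dev] = {
            "kb_per_sec": (sectors1 - sectors0) * SECTOR_BYTES / 1024 / interval_sec,
            "merged_per_sec": (merged1 - merged0) / interval_sec,
            "ios_per_sec": (ios1 - ios0) / interval_sec,
        }
    return rates


# Hypothetical snapshots taken one second apart (values invented for illustration).
SNAP_T0 = """\
   8    0 sda 1000 200 50000 3000 500 40000 800000 90000 0 5000 93000
 253    0 dm-0 1200 0 50000 3500 40500 0 324000 400000 0 5000 403500
"""
SNAP_T1 = """\
   8    0 sda 1000 200 50000 3000 508 40248 802048 90500 0 5500 93500
 253    0 dm-0 1200 0 50000 3500 40756 0 326048 401000 0 5500 404500
"""

rates = write_rates(parse_diskstats(SNAP_T0), parse_diskstats(SNAP_T1), 1.0)
for dev, r in rates.items():
    print(dev, r)
```

On a live system the two snapshots would come from reading /proc/diskstats twice with a sleep in between. Because each value is a delta over the interval, a longer interval averages out short bursts of queueing.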
The thing is the disk
The thing is the disk performance rates as seen by the application are all consistent and steady. If you look at the data more closely, the I/Os reported for the dm disks are approximately the sum of the merges and the actual I/Os on the underlying disk. Is that what's happening? But aren't the I/Os the number of actual reads/writes, whereas the merges are counting pages?
I also see really small I/O sizes for the dm disks, on the order of 4K, which doesn't sound right at all since I'm actually using a load generator doing 1MB writes. I understand they're actually broken into something smaller, but 4K?
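The arithmetic behind both observations can be checked against the write columns of the two records above. This is only a sketch: the numbers are copied from records 5 and 10, while the dictionary layout and the function name are made up for illustration.

```python
# Write-side values copied from collectl records 5 and 10 (per-second rates).
SAMPLES = {
    "record 5": {
        "sda": {"kbytes": 3204, "merged": 7171, "ios": 10},
        "dm-0": {"kbytes": 28952, "merged": 0, "ios": 7238},
    },
    "record 10": {
        "sda": {"kbytes": 38912, "merged": 8000, "ios": 77},
        "dm-0": {"kbytes": 32256, "merged": 0, "ios": 8064},
    },
}


def summarize(record):
    """Compare sda's merged + issued writes with dm-0's writes, plus average sizes."""
    sda, dm = record["sda"], record["dm-0"]
    return {
        "sda_merged_plus_ios": sda["merged"] + sda["ios"],
        "dm_ios": dm["ios"],
        "sda_avg_kb_per_io": sda["kbytes"] / sda["ios"],
        "dm_avg_kb_per_io": dm["kbytes"] / dm["ios"],
    }


SUMMARY = {name: summarize(rec) for name, rec in SAMPLES.items()}
for name, s in SUMMARY.items():
    print(name, s)
```

For record 5 this gives 7181 merged-plus-issued writes on sda against 7238 writes on dm-0, and for record 10 it gives 8077 against 8064. In both records the dm-0 average works out to exactly 4 KB per write.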
And finally the queue depth - can close to 20K requests be sitting in a dm queue? That one doesn't feel right either.
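One way to sanity-check the queue depth is Little's law: average queue length is roughly arrival rate times average wait. The sketch below assumes the Wait column is the average time per request in milliseconds, queueing included, and that the interval was close to steady state. Neither assumption is confirmed here, and the function name is made up for illustration.

```python
def littles_law_qlen(ios_per_sec, wait_ms):
    """Estimate average queue length as arrival rate times average wait."""
    return ios_per_sec * wait_ms / 1000.0


# (write IOs/sec, Wait in ms, reported QLen) copied from records 5 and 10.
RECORDS = {
    "record 5": {"sda": (10, 31, 2), "dm-0": (7238, 4, 267)},
    "record 10": {"sda": (77, 1864, 141), "dm-0": (8064, 2251, 17853)},
}

ESTIMATES = {
    name: {dev: littles_law_qlen(ios, wait) for dev, (ios, wait, _qlen) in devs.items()}
    for name, devs in RECORDS.items()
}

for name, devs in RECORDS.items():
    for dev, (_ios, _wait, reported) in devs.items():
        print(f"{name} {dev}: estimated {ESTIMATES[name][dev]:.1f}, reported {reported}")
```

In record 10 the estimates (about 144 for sda and about 18152 for dm-0) land within a few percent of the reported QLen. In record 5 they do not, which may simply reflect the rounding of Wait to whole milliseconds or an interval that was not in steady state.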
-mark