Here we want to rate limit in the block layer. I would think the I/O
scheduler is the place where we are in a much better position to do
this kind of limiting.
Also, we are changing the behavior of the application by adding sleeps
to it during request submission. Moreover, we will prevent requests
from being merged, since throttled requests are held back instead of
being submitted.
Since the bulk of write submission is done by background kernel
threads, and we throttle based on the limits of current, we will end up
throttling these threads and not the actual processes that generated
the I/O.
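
To illustrate the last point, here is a toy model of the accounting
problem. It is a sketch in Python, not kernel code, and every name in it
(Task, BlockLayer, PageCache, simulate) is made up for illustration. It
only assumes what is described above: buffered writes dirty pages
without submitting I/O, a background thread submits the I/O later, and
the limiter charges whichever task is doing the submission.

```python
from collections import Counter


class Task:
    """Stand-in for a task; only the name matters in this model."""

    def __init__(self, name):
        self.name = name


class BlockLayer:
    """Toy block layer that charges each request to the submitting task."""

    def __init__(self):
        self.charged = Counter()

    def submit_bio(self, current, nr_pages):
        # The limiter is keyed on the submitting context ("current"),
        # not on the task that originally dirtied the pages.
        self.charged[current.name] += nr_pages


class PageCache:
    """Toy page cache holding (owner, nr_pages) entries until writeback."""

    def __init__(self):
        self.dirty = []

    def buffered_write(self, task, nr_pages):
        # A buffered write only dirties pages; nothing is submitted here.
        self.dirty.append((task.name, nr_pages))

    def writeback(self, flusher, blk):
        # Later, a background thread submits the I/O on everyone's behalf.
        for _owner, nr_pages in self.dirty:
            blk.submit_bio(flusher, nr_pages)
        self.dirty.clear()


def simulate():
    blk = BlockLayer()
    cache = PageCache()
    app_a, app_b, flusher = Task("app_a"), Task("app_b"), Task("flusher")

    cache.buffered_write(app_a, 100)  # heavy writer
    cache.buffered_write(app_b, 10)   # light writer
    cache.writeback(flusher, blk)

    return dict(blk.charged)
```

In this model simulate() returns {'flusher': 110}: all 110 pages are
charged to the flusher, and neither app_a nor app_b is ever charged, so
a limit set on app_a would have no effect on its buffered writes.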
--