Why am I reading all over the place that Linux AIO only works with O_DIRECT?
Is it out of date? :-)
I admit I haven't even _tried_ buffered files with Linux AIO due to
the evil propaganda.
To read an indirect block, you have to allocate memory: another
callback after you've slept waiting for memory to be freed up.
Then you allocate a request: another callback while you wait for the
request queue to drain.
Then you submit the request: that's the callback you mentioned,
waiting for the result.
But then triple, double, single indirect blocks: each of the above
steps repeated.
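To make the shape of that concrete, here is a toy user-space sketch of the continuation chain. It is not kernel code: every name in it (read_state, read_continue, run_indirect_read) is made up for illustration, and it pessimistically assumes each step had to wait, so each step costs one callback.

```c
#include <assert.h>

/* Hypothetical sketch, not kernel code: all names are invented. */
enum step { STEP_ALLOC_MEM, STEP_ALLOC_REQ, STEP_SUBMIT, STEP_DONE };

/* Everything a blocking call would keep on its stack has to live here. */
struct read_state {
    int levels_left;   /* block lookups still to do (3 = triple indirect) */
    enum step next;    /* where to resume on the next callback */
    int callbacks;     /* how many times we were resumed */
};

/* One continuation: do a single step, then record where to resume. */
void read_continue(struct read_state *st)
{
    st->callbacks++;
    switch (st->next) {
    case STEP_ALLOC_MEM:            /* memory became available */
        st->next = STEP_ALLOC_REQ;
        break;
    case STEP_ALLOC_REQ:            /* request queue had room */
        st->next = STEP_SUBMIT;
        break;
    case STEP_SUBMIT:               /* I/O for this level completed */
        st->levels_left--;
        st->next = st->levels_left > 0 ? STEP_ALLOC_MEM : STEP_DONE;
        break;
    case STEP_DONE:
        break;
    }
}

/* Toy event loop: assumes every step slept, and counts the resumptions. */
int run_indirect_read(int levels)
{
    struct read_state st = { levels, STEP_ALLOC_MEM, 0 };

    assert(levels > 0);
    while (st.next != STEP_DONE)
        read_continue(&st);
    return st.callbacks;
}
```

Under those assumptions, run_indirect_read(3) goes through nine callbacks for one triple-indirect lookup, and that is before any locking or journalling enters the picture.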
In the case of writing, another group of steps for bitmap blocks,
inode updates, and heaven knows how fiddly it gets with ordered
updates to the journal, synchronised with other writes.
Plus every little mutex / rwlock is another place where you need those
callback functions. We don't even _have_ an async mutex facility in
the kernel. So every user of a mutex has to be changed to use
waitqueues or something. No more lockdep checking, no more RT
priority inheritance.
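For illustration only, a minimal single-threaded sketch of what an "async mutex" might look like: instead of sleeping, the caller hands over a callback that runs once the lock is granted. Nothing like this exists in the kernel as far as I know; async_mutex, lock_cb, count_grant and MAX_WAITERS are all invented names, and a real version would need its own internal locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 8   /* arbitrary limit for this sketch */

typedef void (*lock_cb)(void *arg);

/* Hypothetical async mutex: waiters are queued callbacks, not tasks. */
struct async_mutex {
    bool locked;
    lock_cb waiters[MAX_WAITERS];
    void *args[MAX_WAITERS];
    size_t head, count;     /* FIFO ring of pending callbacks */
};

/* Returns true if cb ran immediately, false if it was queued. */
bool async_mutex_lock(struct async_mutex *m, lock_cb cb, void *arg)
{
    if (!m->locked) {
        m->locked = true;
        cb(arg);            /* uncontended: continue straight away */
        return true;
    }
    assert(m->count < MAX_WAITERS);
    size_t tail = (m->head + m->count) % MAX_WAITERS;
    m->waiters[tail] = cb;
    m->args[tail] = arg;
    m->count++;
    return false;
}

/* Hands the lock directly to the oldest waiter, if there is one. */
void async_mutex_unlock(struct async_mutex *m)
{
    assert(m->locked);
    if (m->count == 0) {
        m->locked = false;
        return;
    }
    lock_cb cb = m->waiters[m->head];
    void *arg = m->args[m->head];
    m->head = (m->head + 1) % MAX_WAITERS;
    m->count--;
    cb(arg);                /* lock stays held; ownership moves to cb */
}

/* Example callback: counts how many times the lock was granted. */
void count_grant(void *arg)
{
    ++*(int *)arg;
}
```

Note that the "owner" here is a callback rather than a task, which is why owner-based machinery such as lockdep checking and priority inheritance has nothing to hang on to.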
There are a _lot_ of places that can sleep on the way to a trivial
file I/O, and quite a lot of state to be passed along the continuation
functions.
It's possible but by no means obvious that it's better.
I think people have mostly given up on that approach due to how
much it complicates all the filesystem code, and how much goodness
there is in being able to call things which can sleep when you look at
all the different places. It seemed like a good idea for a while.
And it's not _that_ certain that it would be faster at high
loads after all the work.
A compromise where just a few synchronisation points are made async is
ok. But then it's a compromise... so you still need a multi-threaded
caller to keep the queues full in all situations.
For specific filesystems, you could do it. readahead() on directories
is not an unreasonable thing to add on.
Doing it generically is not likely. It's not about blocking, it's about the
fact that directories don't always consist of data blocks on the store
organised similarly to a file. For example NFS, CIFS, or (I'm not
sure), maybe even reiserfs/btrfs?
-- Jamie