Incidentally, I was writing an external-memory radix sort some time ago,
and it turned out that writing to 256 files at once is much faster with
O_DIRECT than through the page cache, very likely because the page cache
flushes dirty pages in essentially random order. Tweaking VM parameters
and the block device queue size helped, but only a little.
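
For illustration, here is a minimal sketch of what per-bucket O_DIRECT
output can look like. It is not the code from the sort mentioned above.
The names (struct bucket, bucket_open, bucket_put, bucket_close) and the
sizes (4096-byte alignment, 1 MiB buffer per file) are assumptions made up
for the example. The idea is that each of the 256 output files gets its
own aligned buffer, which is written out in one large sequential chunk
when it fills up. The unaligned tail is padded to a full block and the
file is truncated back to its real length afterwards.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on Linux */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* assumption: plain buffered I/O elsewhere */
#endif

/* Assumed values: 4096 covers common logical block sizes. */
enum { ALIGN = 4096, BUF_SIZE = 1 << 20 };

struct bucket {                 /* hypothetical per-file output state */
    int fd;
    unsigned char *buf;         /* ALIGN-aligned, BUF_SIZE bytes */
    size_t used;                /* bytes currently buffered */
    off_t total;                /* real payload length so far */
};

/* Write len bytes, retrying on EINTR and short writes. */
static int write_all(int fd, const unsigned char *p, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

int bucket_open(struct bucket *b, const char *path, int use_direct)
{
    void *p;

    assert(BUF_SIZE % ALIGN == 0);
    b->fd = open(path, O_WRONLY | O_CREAT | O_TRUNC |
                       (use_direct ? O_DIRECT : 0), 0644);
    if (b->fd < 0)
        return -1;
    /* O_DIRECT needs the user buffer aligned as well as the length. */
    if (posix_memalign(&p, ALIGN, BUF_SIZE) != 0) {
        close(b->fd);
        return -1;
    }
    b->buf = p;
    b->used = 0;
    b->total = 0;
    return 0;
}

int bucket_put(struct bucket *b, const void *data, size_t len)
{
    const unsigned char *src = data;

    while (len > 0) {
        size_t n = BUF_SIZE - b->used;
        if (n > len)
            n = len;
        memcpy(b->buf + b->used, src, n);
        b->used += n;
        b->total += n;
        src += n;
        len -= n;
        if (b->used == BUF_SIZE) {      /* full buffer: one big write */
            if (write_all(b->fd, b->buf, BUF_SIZE) < 0)
                return -1;
            b->used = 0;
        }
    }
    return 0;
}

int bucket_close(struct bucket *b)
{
    int rc = 0;
    /* Pad the tail to a whole block, then cut the file to its real size. */
    size_t padded = (b->used + ALIGN - 1) / ALIGN * ALIGN;

    memset(b->buf + b->used, 0, padded - b->used);
    if (padded > 0 && write_all(b->fd, b->buf, padded) < 0)
        rc = -1;
    if (rc == 0 && ftruncate(b->fd, b->total) < 0)
        rc = -1;
    if (close(b->fd) < 0)
        rc = -1;
    free(b->buf);
    return rc;
}
```

A radix-sort pass would keep an array of 256 such buckets and call
bucket_put() on the one selected by the current key byte. Passing
use_direct = 0 gives the page-cache variant for comparison. Note that
some filesystems refuse O_DIRECT, in which case open() fails with EINVAL.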
Have a nice fortnight
--
Martin `MJ' Mares <mj@ucw.cz> http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Linux vs. Windows is a no-WIN situation.