Re: O_DIRECT question

To: Andrea Arcangeli <andrea@...>
Cc: Denis Vlasenko <vda.linux@...>, Bill Davidsen <davidsen@...>, Michael Tokarev <mjt@...>, Linus Torvalds <torvalds@...>, Viktor <vvp01@...>, Aubrey <aubreylee@...>, Hua Zhong <hzhong@...>, Hugh Dickins <hugh@...>, Linux-kernel <linux-kernel@...>
Date: Tuesday, January 30, 2007 - 2:50 pm

Andrea Arcangeli wrote:

Then I'll restore the lkml to the cc list.


It should return the number of bytes successfully written before the 
error, giving you the location of the first error.  Using smaller 
individual writes (preferably issued in parallel) also allows the 
problem spot to be isolated.
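
As a rough illustration of the point above, here is a minimal Python sketch. The helper name write_in_chunks and the 4096-byte chunk size are hypothetical and chosen only for this example, and the writes are issued one after another rather than in parallel. It reports how many bytes were accepted before the first failure, which gives the offset of the problem spot to within one chunk.

```python
import os

def write_in_chunks(fd, data, chunk_size=4096):
    """Write data in small pieces.

    Returns (bytes_written, error). error is None on success, otherwise
    the OSError raised by the first failing write; bytes_written is then
    the offset at which that write started.
    """
    view = memoryview(data)
    written = 0
    while written < len(view):
        try:
            # os.write returns the number of bytes actually accepted,
            # which may be less than the size of the slice passed in.
            n = os.write(fd, view[written:written + chunk_size])
        except OSError as exc:
            return written, exc
        if n == 0:
            break  # no progress; stop instead of looping forever
        written += n
    return written, None
```

A smaller chunk_size narrows down the failing region at the cost of more system calls.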


When you are writing a transaction log, you do; you don't need much 
data, but you do need to be sure it has hit the disk before continuing. 
You certainly aren't writing many MB across a dozen write() calls and 
only then caring to make sure it is all flushed in an unknown order.  When 
order matters, you cannot use fsync, which is one of the reasons why 
databases use O_DIRECT; they care about the ordering.
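
A minimal sketch of the ordered transaction-log pattern described above, under stated assumptions. It opens the log with O_DSYNC rather than O_DIRECT, because O_DIRECT needs aligned buffers and is not supported by every filesystem; that is a substitution made for this example, not what the databases mentioned above do. The names open_log, append_record and log_records are hypothetical.

```python
import os

def open_log(path):
    # O_DSYNC: each write() returns only once the data has been written
    # through to the storage device. (Assumption for this sketch; the
    # message above discusses O_DIRECT and O_SYNC.)
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND | os.O_DSYNC
    return os.open(path, flags, 0o600)

def append_record(fd, record):
    # Loop until the whole record has been accepted; os.write may
    # accept fewer bytes than requested.
    view = memoryview(record)
    while len(view) > 0:
        n = os.write(fd, view)
        view = view[n:]

def log_records(path, records):
    fd = open_log(path)
    try:
        for rec in records:
            # The next record is not issued until this one has returned,
            # so records reach the disk in the order given. An OSError
            # here identifies exactly which record failed.
            append_record(fd, rec)
    finally:
        os.close(fd)
```

Because each small write is synchronous, a failure is reported by the call that caused it, instead of surfacing later from a single flush that covers many writes.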


I meant it is not a good idea to use fsync as you can't properly handle 
errors.


Throughput is nowhere near perfect, as the pipeline is stalled for quite 
some time.  The pipe fills up quickly while dd is blocked on the sync 
write, which then blocks tar until all 16 MB have hit the disk.  Only 
then does dd go back to reading from the tar pipe, allowing it to 
continue.  During the time it takes tar to archive another 16 MB of 
data, the write queue is empty.  The only time that the tar process gets 
to continue running while data is written to disk is in the small time 
it takes for the pipe ( 4 KB isn't it? ) to fill up.


No, semantics have nothing to do with performance.  Semantics deals with 
the state of the machine after the call, not how quickly it got there. 
Semantics is a question of correct operation, not optimal.

With both O_DIRECT and O_SYNC, the machine state is essentially the same 
after the call: the data has hit the disk.  Aside from the performance 
difference, the application cannot tell the difference between O_DIRECT 
and O_SYNC, so if that performance difference can be resolved by 
changing the implementation, Linus can be happy and get rid of O_DIRECT.



Messages in current thread:
Re: O_DIRECT question, Phillip Susi, (Tue Jan 30, 2:50 pm)
Re: O_DIRECT question, Andrea Arcangeli, (Tue Jan 30, 3:57 pm)
Re: O_DIRECT question, Phillip Susi, (Tue Jan 30, 7:07 pm)
Re: O_DIRECT question, Michael Tokarev, (Wed Jan 31, 5:37 am)
Re: O_DIRECT question, Andrea Arcangeli, (Tue Jan 30, 10:28 pm)
Re: O_DIRECT question, Andrea Arcangeli, (Tue Jan 30, 4:06 pm)