Re: POHMELFS high performance network filesystem. Transactions, failover, performance.

To: Evgeniy Polyakov <johnpol@...>
Cc: Jeff Garzik <jeff@...>, <linux-kernel@...>, <netdev@...>, <linux-fsdevel@...>
Date: Wednesday, May 14, 2008 - 9:35 am

> > What is your opinion of the Paxos algorithm?

For writes, Paxos is actually more or less optimal (in the non-failure 
cases, at least).  Reads are trickier, but there are ways to keep that 
fast as well.  FWIW, Ceph extends basic Paxos with a leasing mechanism to 
keep reads fast, consistent, and distributed.  It's only used for cluster 
state, though, not file data.

I think the larger issue with Paxos is that I've yet to meet anyone who 
wants their data replicated 3 ways (this despite newfangled 1TB+ disks not 
having enough bandwidth to actually _use_ the data they store).
Similarly, if only 1 out of 3 replicas is surviving, most people want to 
be able to read their data, while Paxos demands a majority to ensure it is 
correct.  (This is why Paxos is typically used only for critical cluster 
configuration/state, not regular data.)
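
To make the majority arithmetic concrete, here is a minimal sketch in Python. It is not taken from Ceph or POHMELFS; the function names (majority, has_quorum, tolerated_failures) are made up for illustration, and it models only the counting rule, not the protocol itself.

```python
def majority(n_replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return n_replicas // 2 + 1


def has_quorum(n_replicas: int, n_alive: int) -> bool:
    """True if the surviving replicas can still form a majority."""
    return n_alive >= majority(n_replicas)


def tolerated_failures(n_replicas: int) -> int:
    """How many replicas may fail while a majority remains."""
    return n_replicas - majority(n_replicas)


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(f"{n} replicas: majority={majority(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
    # With 3 replicas and only 1 survivor there is no majority,
    # so a strict majority rule cannot serve the data.
    print("3 replicas, 1 alive, quorum:", has_quorum(3, 1))
```

Under this rule 2-way replication tolerates no failures at all, which is why 3 replicas is the practical minimum, and a lone survivor out of 3 cannot form a quorum.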

sage
--

Messages in current thread:
Re: POHMELFS high performance network filesystem. Transactio..., Sage Weil, (Wed May 14, 9:35 am)
Re: POHMELFS high performance network filesystem. Transactio..., Evgeniy Polyakov, (Wed May 14, 11:00 am)