Hi Sage.
On Sat, Jun 14, 2008 at 09:27:55PM -0700, Sage Weil (sage@newdream.net) wrote:
Yes, not only writepage, but any request - if it sends sequest and then
receives reply (i.e. doing send/recv sequence without ability to do
something else in between or allow other users to do sends or receives
into the same socket), then it is synchronous. If it only sends, and
someone else receives, it is possible to send multiple requests from
different users who do reads or writes or lookups or whatever and
asynchronously in different thread receive replies not in particular
order, so this approach I call asynchronous.
Yes, POHMELFS does writing that way.
Not exactly. Transaction in a nutshell is a wrapper on top of command
(or multiple commands if needed like in writing), which contains all
information needed to perform appropriate action. When user calls read()
or 'ls' or write() or whatever, POHMELFS creates transaction for that
operation and tries to perform it (if operation is not cached, in that
case nothing actually happens). When transaction is submitted, it
becomes part of the failover state machine which will check if data has
to be read from different server or written to new one or dropped.
original caller may not even know from which server its data will be
received. If request sending failed in the middle, the whole transaction
will be redirected to new one. It is also possible to redo transaction
against different server, if server sent us error (like I'm busy), but
this functionality was dropped in previous release iirc, this can be
resurrected though. Having generic transaction tree callers do not
bother about how to store theirs requests, how to wait for results and
how to complete them - transactions do it for them. It is not rocket
science, but extrmely effective and simple way to help rule out
asynchronous machinery.
That was somewhat old approach, currently inode numbers and things like
open-by-inode or NFS style open-by-cookie are not used. I tried to
describe caching bits in docuementation I ent, although its a bit rough
and likely incomplete :) Feel free to ask if there are some white areas
there.
--
Evgeniy Polyakov
--