Re: [2/3] POHMELFS: Documentation.

To: Evgeniy Polyakov <johnpol@...>
Cc: Jamie Lokier <jamie@...>, <linux-kernel@...>, <netdev@...>, <linux-fsdevel@...>
Date: Sunday, June 15, 2008 - 11:17 pm

On Sun, 15 Jun 2008, Evgeniy Polyakov wrote:

Right...


Well, I must still be misunderstanding you :(.  It sounded like you were 
saying other network filesystems take the socket exclusively for the 
duration of an entire operation (i.e., only a single RPC call outstanding 
with the server at a time).  And I'm pretty sure that isn't the case...

Which means I'm still confused as to how POHMELFS's transactions are 
fundamentally different here from, say, NFS's use of RPC.  In both cases, 
multiple requests can be in flight, and the server is free to reply to 
requests in any order.  And in the case of a timeout, RPC requests are 
resent (to the same server.. let's ignore failover for the moment).  Am I 
missing something?  Or giving NFS too much credit here?
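To make the comparison concrete, here is a minimal sketch of the model I have in mind for both cases: several requests in flight on one connection, replies matched by request id in whatever order they arrive, and anything past its deadline resent to the same server. The names (RpcClient, submit, on_reply, resend_expired) and the send callback are hypothetical and are not taken from the NFS or POHMELFS code.

```python
import itertools


class RpcClient:
    """Toy client: many requests in flight, replies matched by id."""

    def __init__(self, send, timeout=5.0):
        self._send = send                  # hypothetical transport: send(xid, payload)
        self._timeout = timeout
        self._next_xid = itertools.count(1)
        self._pending = {}                 # xid -> [payload, deadline]

    def submit(self, payload, now):
        """Send a request without waiting for earlier ones to complete."""
        xid = next(self._next_xid)
        self._pending[xid] = [payload, now + self._timeout]
        self._send(xid, payload)
        return xid

    def on_reply(self, xid, result):
        """Complete whichever request the server answered, in any order."""
        entry = self._pending.pop(xid, None)
        if entry is None:
            return None                    # late or duplicate reply, ignore it
        return (entry[0], result)

    def resend_expired(self, now):
        """Resend every request whose deadline has passed (same server)."""
        resent = []
        for xid, entry in self._pending.items():
            if entry[1] <= now:
                entry[1] = now + self._timeout
                self._send(xid, entry[0])
                resent.append(xid)
        return resent
```

If POHMELFS transactions do something beyond this (for example resending to a different server), that would be the part I am missing.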



I see.  And if the inode drops out of the client cache, and is later 
reopened, the st_ino seen by an application may change?  st_ino isn't used 
for much, but I wonder if that would impact a large cp or rsync's ability 
to preserve hard links.
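The concern is with tools that detect hard links by remembering (st_dev, st_ino) pairs as they walk the tree. The sketch below shows that general technique only; it is not the actual cp or rsync code, and find_hard_link_groups is a made-up name.

```python
import os
import stat


def find_hard_link_groups(paths):
    """Group paths that refer to the same inode, keyed by (st_dev, st_ino)."""
    seen = {}
    for path in paths:
        st = os.lstat(path)
        # Only non-directories with more than one link are candidates.
        if st.st_nlink > 1 and not stat.S_ISDIR(st.st_mode):
            seen.setdefault((st.st_dev, st.st_ino), []).append(path)
    return [group for group in seen.values() if len(group) > 1]
```

If the st_ino reported for a file changes between the first and second time the tool stats one of its names, the two names land under different keys and the link is silently copied as two separate files.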



Not if the server waits for the cache invalidation to be acked before 
applying the update.  That is, treat the client's cached copy as a lease 
or read lock.  I believe this is how NFSv4 delegations behave, and it's 
how Ceph metadata leases (dentries, inode contents) and file access 
capabilities (which control sync vs async file access) behave.  I'm not 
all that familiar with Samba, but my guess is that its leases are broken 
synchronously as well.
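Roughly, the ordering I mean looks like the sketch below: the server sends revocations, holds the update back, and applies it only once every lease holder has acked. This is a toy model under my own assumptions (one pending update per key, no lease timeouts, hypothetical names throughout), not the NFSv4 or Ceph implementation.

```python
class LeaseServer:
    """Toy server: an update is applied only after all leases are acked."""

    def __init__(self):
        self.values = {}
        self.leases = {}    # key -> set of client ids holding a lease
        self.pending = {}   # key -> (new value, clients still to ack)
        self.outbox = []    # (client, key) revocation messages "sent"

    def read(self, client, key):
        """Return (granted, value); no new leases while an update waits."""
        if key in self.pending:
            return (False, None)
        self.leases.setdefault(key, set()).add(client)
        return (True, self.values.get(key))

    def write(self, writer, key, value):
        """Apply at once if nobody else holds a lease, else revoke first."""
        holders = self.leases.get(key, set()) - {writer}
        if not holders:
            self.values[key] = value
            return True
        self.pending[key] = (value, set(holders))
        for client in sorted(holders):
            self.outbox.append((client, key))
        return False        # update deferred until the acks arrive

    def ack_revoke(self, client, key):
        """Drop the client's lease; apply the update after the last ack."""
        self.leases.get(key, set()).discard(client)
        if key not in self.pending:
            return False
        value, waiting = self.pending[key]
        waiting.discard(client)
        if waiting:
            return False
        del self.pending[key]
        self.values[key] = value
        return True
```

With this ordering no client can be reading its cached copy after the update has become visible, because the update does not become visible until the cached copies are gone.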


That's half of it... ideally, though, the client would have a reference to 
the real object as well, so that the original foo.txt would be removed.  
I.e. not only avoid doing the wrong thing, but also do the right thing.

I have yet to come up with a satisfying solution there.  Doing a d_drop on 
dentry lease revocation gets me most of the way there (Ceph's path 
generation could stop when it hits an unhashed dentry and make the request 
path relative to an inode), but the problem I'm coming up against is that 
there is no explicit communication of the CWD between the VFS and fs 
(well, that I know of), so the client doesn't know when it needs a real 
reference to the directory (and I'm not especially keen on taking 
references for _all_ cached directory inodes).  And I'm not really sure 
how .. is supposed to behave in that context.
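For what it's worth, the path generation idea is simple enough to sketch. This is a userspace toy, not kernel code: Dentry here is a stand-in with a "hashed" flag that a d_drop would clear, and build_request_path is a hypothetical name.

```python
class Dentry:
    """Stand-in for a cached dentry; 'hashed' is cleared by a d_drop."""

    def __init__(self, name, ino, parent=None):
        self.name = name
        self.ino = ino
        self.parent = parent
        self.hashed = True


def build_request_path(dentry):
    """Walk toward the root, stopping at the first unhashed dentry.

    Returns (base_ino, relative_path): the path is relative to the inode
    of the dentry where the walk stopped (the root if none was unhashed).
    """
    parts = []
    d = dentry
    while d.parent is not None and d.hashed:
        parts.append(d.name)
        d = d.parent
    return d.ino, "/".join(reversed(parts))
```

This only works if the client still holds a reference to the inode it stops at, which is exactly the part I don't know how to arrange for the CWD case.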

Anyway...

sage
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Messages in current thread:
[0/3] POHMELFS high performance network filesystem. First st..., Evgeniy Polyakov, (Fri Jun 13, 12:37 pm)
[2/3] POHMELFS: Documentation., Evgeniy Polyakov, (Fri Jun 13, 12:41 pm)
Re: [2/3] POHMELFS: Documentation., Jamie Lokier, (Fri Jun 13, 10:15 pm)
Re: [2/3] POHMELFS: Documentation., Evgeniy Polyakov, (Sat Jun 14, 2:56 am)
Re: [2/3] POHMELFS: Documentation., Sage Weil, (Sun Jun 15, 12:27 am)
Re: [2/3] POHMELFS: Documentation., Evgeniy Polyakov, (Sun Jun 15, 1:57 am)
Re: [2/3] POHMELFS: Documentation., Sage Weil, (Sun Jun 15, 12:41 pm)
Re: [2/3] POHMELFS: Documentation., Evgeniy Polyakov, (Sun Jun 15, 1:50 pm)
Re: [2/3] POHMELFS: Documentation., Sage Weil, (Sun Jun 15, 11:17 pm)
Re: [2/3] POHMELFS: Documentation., Evgeniy Polyakov, (Mon Jun 16, 6:20 am)
Re: [2/3] POHMELFS: Documentation., Trond Myklebust, (Sat Jun 14, 2:45 pm)
Re: [2/3] POHMELFS: Documentation., Evgeniy Polyakov, (Sat Jun 14, 3:25 pm)
Re: [2/3] POHMELFS: Documentation., Jeff Garzik, (Sat Jun 14, 5:49 am)
[1/3] POHMELFS: VFS trivial change., Evgeniy Polyakov, (Fri Jun 13, 12:40 pm)