I am re-sending this after help from Ian and git-bisect. To me it's a
show-stopper: I cannot find an acceptable workaround that I can implement.
The problem: upgrading to 2.6.23-rc4 from 2.6.22 causes several autofs
mounts to fail silently - they just do not appear when they should.
I believe it's caused by the NFS change that forces multiple mounts from
different directories under the same server side filesystem to have the same
mount options by default, otherwise it returns EBUSY.
For example, if a server has a filesystem /a and exports /a/x and /a/y
(maybe with rw or ro), a client must now mount /a/x and /a/y with the same
mount options.
Since in my setup they are managed by autofs, and the autofs map is managed
by NIS, there is no way I could easily work around it.
If we have to live with this regression, I want to hear some suggestions
about how to fix them realistically. Thanks.
By the way, I am not sure if I did the bisect right, but FWIW, git-bisect
says:
c98451bdb2f3e6d6cc1e03adad641e9497512b49 is first bad commit
commit c98451bdb2f3e6d6cc1e03adad641e9497512b49
Author: Frank van Maarseveen <frankvm@frankvm.com>
Date: Mon Jul 9 22:25:29 2007 +0200
NLM: fix source address of callback to client
Use the destination address of the original NLM request as the
source address in callbacks to the client.
Signed-off-by: Frank van Maarseveen <frankvm@frankvm.com>
2fec08debe51c20423a88b1a0d4281c683ba5daf M include
-----Original Message-----
From: Hua Zhong [mailto:hzhong@gmail.com]
Sent: Wednesday, August 29, 2007 1:59 PM
To: 'Linux Kernel Mailing List'
Subject: regression of autofs for current git?
Hi,
I am wondering if this is a known issue, but I just built the current git
and several autofs mounts mysteriously disappeared. Restarting autofs could
fix some, but then lose others. 2.6.22 was fine.
Is there anything I could check other than bisect? (It may take some time
for me to get to it)
Thanks for your ...
-
Which is better than having it fail silently, or giving you a mount with
the wrong mount options.
If you need to mount the same filesystem with incompatible mount options
on the same client, then there is a new mount option "nosharecache", which
enables it. The new option is there in order to make it damned clear to
sysadmins that this is a dangerous thing to do: mounts which don't share
the same superblock also don't share the same data and attribute caches.
Any file or directory which appears in both mounts had better only be used
by one application at a time or be using an appropriate locking scheme.
Trond
-
Well, it depends on how you define "better".
In this particular scenario, the maps read as follows:
tools -fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,mountvers=3,nfsvers=3,actimeo=600 fs1.domain.com:/a/tools
share -fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,mountvers=3,nfsvers=3 fs1.domain.com:/a/share
The only difference is in the actimeo (I don't even know what it means).
Is this enough to fail a mount?
More importantly, it is a regression. My understanding is that unless
absolutely necessary we do not introduce a "feature" that breaks working
I guess the question is what should be the default.
I'll convey this to our system admin (fortunately we are not a very big
company), but I am just not 100% sure this is a well-thought-out change,
because I believe many people will be impacted once 2.6.23 is out.
Shouldn't we give users some time to fix their config before we enforce
this, for example with some kernel warnings?
-
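To see at a glance which options differ between two map entries like the ones above, a small script can help. This is only a sketch: it compares the option strings textually, and it makes no claim about which differences the NFS client actually treats as incompatible. The function names are made up for this illustration.

```python
def parse_map_options(option_field):
    """Split an autofs option field such as '-fstype=nfs,udp,rw' into a dict."""
    options = {}
    for item in option_field.lstrip("-").split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        # Flags without a value (udp, rw, intr, ...) are stored as True.
        options[key] = value if sep else True
    return options


def diff_options(first, second):
    """Return {option: (value_in_first, value_in_second)} for every mismatch."""
    a, b = parse_map_options(first), parse_map_options(second)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Option fields copied from the two map entries quoted in the thread.
tools = ("-fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,"
         "mountvers=3,nfsvers=3,actimeo=600")
share = ("-fstype=nfs,udp,rw,intr,nosuid,nodev,rsize=8192,wsize=8192,"
         "mountvers=3,nfsvers=3")

print(diff_options(tools, share))  # → {'actimeo': ('600', None)}
```

Running the same comparison over every pair of entries that point at one server would show which maps are likely to trip over the new check.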
"better" as in: "I now have a chance to notice, when my 'read-only Yes. The default values for acregmin, acregmax, acdirmin, acdirmax are not 600. If /a/tools and /a/share are on the same filesystem on the server, then the NFS client should warn you that you are about to do something that may result in cache coherency problems instead of silently allowing it, and then leaving you to debug the coherency issue. If you know what you are doing, then there is an option which allows you Your turn to define what you mean by "working"? In my book that means "a setup that doesn't include unexpected or unintended behaviour". Not being able to notice cache coherency failures on a file that is mounted in two different places with two different sets of mount options counts as "unexpected behaviour". Not being able to notice that your mount options have been overridden by the kernel also counts as "unexpected behaviour". Trond -
There are two disjoint directories. I am wondering why there would be
cache coherency issues in this case? Is this Linux NFS implementation
specific or
"working" as in "I can mount the directory and do my work". And there has
Fine. These are all very nice theories, but I just want to report this
regression and hope it won't cause any big problems for any users out
there.
-
How is the NFS client to know that these directories are disjoint, or that no-one will ever create a hard link from one directory to another? To my knowledge, the only way to ensure this is to put them on different disk partitions. I don't know if all Unix systems have this issue, but I have been told That is too narrow a definition: the minimum should be "everyone can mount their directories and do their work". Your particular setup may be safe, but that is why we have overrides: the default should be for the Your choice. Trond -
Every engineer in our organization mounts it too. No problem until now. It's not very conservative to suddenly change default behavior and break autofs mounts. There is not even one kernel message that "_tells_ user why No. I have no other choice as I explained before. Hua -
I believe I've already explained why that isn't a sufficient metric. No it doesn't. It reports an error code to the caller. If autofs is failing silently, then that is a bug in autofs: mount will report the error to the user. Trond -
If so, and since that's obviously what people _expect_ to happen, why not make that the default, with the "consistent" behaviour being the one that needs an explicit option. Just out of curiosity - Hua, is this NFSv2? Especially there, cache "consistency" is largely a joke anyway, so defaulting to some annoying careful mode is doubly ridiculous. Linus -
It's v3 as can be seen from the autofs maps I posted. These directories are used mostly as read-only and get pulled in via our build system. We do not actually write to them often, if at all. I don't think this setup is uncommon, and I am worried that once people start using -
The majority of "nfs sucks" complaints result from the general lack of understanding by sysadmins of the nfs caching model. I'd be very sceptical of any claim that most sysadmins "expect" broken cache consistency as a result of mounting the same filesystem with different NFSv2 has a close-to-open caching model which works fine as long as you don't break the underlying assumptions. See my comment above. Trond -
Actually, yes, it looks like I'm not logging mount errors at the correct log level. Oops. Ian -
Wrong(tm). autofs AND mounting at the command line just say:
mount.nfs: /mnt is already mounted or busy
Which has an actual information value of about 1%.
In my case I moved an NFS-exported directory inside another NFS-exported
directory months ago and placed a symlink where the directory was (on the
server side). It never occurred to me that that was "wrong"(tm). Now I can
only mount one of the two mounts and the other just says "busy".
After reading this I could fix my case easily. I just erased the "deeper"
mount and symlinked the directory from the other mount.
But YOU HAVE TO KNOW THAT YOU DID SOMETHING WRONG. Just getting a "Busy"
leaves you with question marks flying around your head!
See you
--
Real Programmers consider "what you see is what you get" to be just as bad
a concept in Text Editors as it is in women. No, the Real Programmer wants
a "you asked for it, you got it" text editor -- complicated, cryptic,
powerful, unforgiving, dangerous.
-
If we're going to send a message to sysadmins, we shouldn't force them to go through a git bisection search and a lkml discussion to receive it! Is there at least some way in which the kernel can detect this situation and emit a friendly printk which guides people to a friendly document? -
There are already error codes being passed back to the mount syscall. The problem here is that unlike the mount utility, autofs isn't passing that information on to the user. Trond -
No, Trond. That commit gets reverted or fixed. It's a regression, and your
theories that it's "better" that way are obviously broken.
It's obviously broken because you seem to say that you know better, even
though you also admit that:
"How is the NFS client to know that these directories are disjoint, or
that no-one will ever create a hard link from one directory to another?
To my knowledge, the only way to ensure this is to put them on different
disk partitions."
the point being that you just disallowed people from doing things that are
sane but _potentially_ dangerous. That's not how we work. The UNIX way is
to give people rope - if you cannot *prove* that what they are doing is
wrong, then you damn well better not disallow it.
No regressions, Trond. Especially not for stuff that used to work, was
used, and that could be sanely expected to work (which this *definitely*
sounds like).
Please send in a fix. If the fix involves making "nosharecache" the
default, then that is better than making policy decisions like this in the
kernel. The kernel should do what the user asks and not put in unnecessary
roadblocks.
Hua - that said, I don't actually see why the commit you bisected to has
anything to do with the issue being discussed. Can you double-check that
it's literally that particular commit that breaks for you (you could try
just reverting that commit).
Linus
-
I will double check that tomorrow. Thanks. :-) I'm happy I'll still be able -
It did not. The previous behaviour was to always silently override the This is _not_ a kernel policy decision. The kernel is simply informing the user that it cannot fulfil the mount request as specified. Exactly why do you think that NFS should be any different from other filesystems when it comes to this? AFAIK, every other filesystem will give you an EBUSY if you try to mount a partition with -oro if you are already mounting somewhere else with -orw. Every filesystem will give you an EBUSY if you try to mount the partition with -oacl if it is mounted somewhere else with -onoacl. The reason: exactly the same as NFS, the caches cannot remain consistent when you try to mount two different super blocks that both refer to the same underlying filesystem. Trond -
..so it still worked for any sane setup, at least. You broke that. Hua gave good reasons for why he cannot use the current kernel. It's a regression. In other words, the new behaviour is *worse* than the behaviour you consider to be the incorrect one. Linus -
So you are saying that it is acceptable for the kernel to decide unilaterally to override mount options? Why aren't we doing that for any other filesystem than NFS? Trond -
How hard is it to acknowledge the following little word: "regression" It's simple. You broke things. You may want to fix them, but you need to fix them in a way that does not break user space. Linus -
Trond has a point Linus. What he "broke" is, for example, a ro mount being mounted as rw. That *could* be a very serious security (etc.etc.) problem which he just fixed. Anything depending on read-only not being enforced will cease to work, of course, and that is what a few people complain about(!). If ext3 in some rare case (which would still mean it hit a few thousand users) failed to remember that a file had been marked read-only and allowed writes to it, wouldn't we want to fix that too? It would cause regressions, but we'd fix it, right? mount passes back the error code on a failed mount. autofs passes that error along too (when people configure syslog correctly). In short; when these serious mistakes are made and caught, the admin sees an error in his logs. This is not wrong. This is good. -- / jakob -
I don't dispute that the new code does something good.
But it changes existing behaviour. When we add NEW BEHAVIOUR, we don't add
it to old interfaces when that breaks old user mode! We add a new flag
saying "I want the new behaviour".
This is not rocket science, guys. This is very basic kernel behaviour. The
kernel exists only to serve user space, and that means that there is no
more important thing to do than to make sure you don't break existing
No. What he broke was a working and sane setup.
The fact that he may *also* have broken insane setups is totally
irrelevant. Don't go off on some tangent that has nothing to do with the
Stop blathering. Of course we fix security holes. But we don't break
things that don't need breaking. This wasn't a security hole. You are
making up irrelevant arguments that have nothing to do with this
regression.
If you want new behaviour, you add a new flag saying you want new
behaviour. You don't just start behaving differently from what you've
always done before (and what *other* UNIXes do, for that matter).
Besides, even *if* it was a matter of somebody doing a mount with "rw",
when the previous mount was "ro", returning EBUSY is still the wrong thing
to do! If the user asks for a new mount that is read-write, he should just
get it - ie we should not re-use the old client handles, and we should do
what Solaris apparently does, namely to just make it a totally different
mount.
In other words, it should (as I already mentioned once) have used
"nosharecache" by default, which makes it all work.
Then, people who want to re-use the caches (which in turn may mean that
everything needs to have the same flags), THOSE PEOPLE, who want the NEW
SEMANTICS (errors and all) should then use a "sharecache" flag.
Bullshit. "Seeing the error in his logs" doesn't help anything. The
problem wasn't the lack of error, the problem was that it was a new and
unnecessary error in the first place. Logging it doesn't make ...
On Fri, Aug 31, 2007 at 01:07:56AM -0700, Linus Torvalds wrote:
It does not have "nothing" to do with the regression. Some setups which
worked more by accident than by design earlier on were broken by the fix.
This could have been avoided, I agree, but the breakage was caused
*part* of it wasn't a security hole. The other half very much was.
Sure, given that Trond (or whomever) has the time it takes to go and
implement all of this, there's no need to screw anyone.
Assuming he's on a schedule and this will have to wait, I agree with him
that it makes the most sense to play it safe security/consistency-wise
rather than
It makes troubleshooting possible, which addresses *the* major complaint
from *one* of the *two* people who complained about this.
--
/ jakob
-
Well, it's not a "fix" if it breaks other setups.
It's especially not a fix since the whole requirement that all the flags
be exactly the same is totally brain-dead in the first place. We *have*
that kind of mount already, and it has nothing to do with NFS: it's called
a "bind" mount.
So if you want an identical mount, with cache coherency and tying the two
mount-points together (requiring that they have the same mount flags),
then that has absolutely *nothing* to do with NFS. The VFS layer does that
No, the fix was simply wrong. It was done the wrong way, and it broke
things it shouldn't have broken.
Let's put it this way: if I create a patch that stops the system from
booting, I sure as hell fix a potential security hole, don't I? Does that
make my patch a "fix"?
I disagree. Either that thing gets fixed before 2.6.23, or the commit that
introduced the broken behaviour gets reverted. We've had this policy of
"regressions are fixed" for a long time, and we're not suddenly changing
it.
This is *not* a security hole. In order to make it a security hole, you
need to be root in the first place. So what you call a security hole is
really no different from root installing a bad SUID binary. It's simply
not the kernel's place to then say "SUID binaries will not work, because
it's a potential security hole".
See? So stop calling this a security hole. It's certainly a misfeature,
but:
- it's a misfeature that people are used to, and has been around forever.
- there are bound to be ways to fix it that don't break existing users.
- the requirement that all flags be the same for a mount to the same NFS
  directory is *particularly* stupid, since there are better ways to do
  that than go through NFS!
so I really don't see why people excuse the new behaviour.
Linus
-
On Fri, Aug 31, 2007 at 09:43:29AM -0700, Linus Torvalds wrote:
Non-root users can write to places where root might believe they cannot
write because he might be under the mistaken assumption that ro means ro.
I am under the impression that that could have implications in some
setups.
Sure, they're used to it, but I doubt they are aware of it.
We can certainly agree that a nicer fix would be nicer :)
--
/ jakob
-
So, the right thing to do (tm) is to make them aware without breaking
their setup. Log any detected inconsistencies in the dmesg buffer and to
syslog. If the sysadmin is not competent enough to notice, too bad.
Cheers
Martin
------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
-
That would be a major change in existing semantics. The default has been
"sharecache" ever since Al Viro introduced the "sget()" function some 6
or 7 years ago. The problem was that we never advertised the fact that
the kernel was overriding your mount options, and so sysadmins were
(rightly IMO) complaining that they should _know_ when the client does
this.
The list of known problems with a "nosharecache" default is nasty too:
- file and directory attribute and data caching breaks.
Applications will see stale data in cases where they otherwise
would not expect it.
- the existing dcache and icache issues when a file is renamed
or deleted on the server are now extended to also include the
case where the rename or deletion occurs on an alias in another
directory on the client itself. In particular, sillyrename will
break.
- file locking breaks (the server knows that the client holds
locks on one file, whereas the client thinks it holds locks on
several).
- the NFSv4 delegation model breaks: the client will be using
OPEN when it could use cached opens. More importantly, when
performing an operation that requires it to return the
delegation on the aliased file, it won't know until the server
sends it a callback.
...and of course, the amount of unnecessary traffic to the server
increases. I'm not aware of any sane way of dealing with those issues,
and I doubt Solaris has a solution for them either.
Trond
-
All of this won't happen when server foo exports /bar and a client mounts /bar/x and /bar/y separately: there must be a shared subtree or hard-links between files within them, right? An obvious (but disruptive) server side workaround is to export the subtrees with different fsid= but that would give the same list of problems as above, right? IMHO I'd only consider returning EBUSY when trying to mount _exactly_ the same directory with different flags, not for arbitrary subtrees. The client should preferably not be bothered with server side disk partitioning (at least not beyond the obvious such as df output). -- Frank -
That is utterly inconsistent and confusing too.
If you have a filesystem "/foo" exported on the server "remote", then
why should
mount -oro remote:/foo
mount -orw remote:/foo/a
be allowed, but
mount -oro remote:/foo
mount -orw remote:/foo
be forbidden? The caching problems are the same. Telling the admin that
one is safe and the other is not, is just messing with his mind.
Trond
-
I'm not arguing to forbid the second case but confronting the sysadmin
there with nosharecache is much less likely to harm existing setups than
the first case.
Let's consider the most likely intention. The first case is probably used
as:
mount -oro remote:/foo <path>/foo
mount -orw remote:/foo/a <path>/foo/a
and I don't see a real issue with that, sharecache or not. Ditto with:
mount -oro remote:/foo/a <path>/a
mount -orw remote:/foo/b <path>/b
These are all typical use cases, without multiple views on the same tree.
But
mount -oro remote:/foo /foo1
mount -orw remote:/foo /foo2
is strange and much less likely.
--
Frank
-
Perhaps sharing could be the default on NFSv4 and non-sharing for 2 & 3? After all, NFSv4 is supposed to be able to handle local caching on disk. David -
Hua explained already that seeing the error is not the same as fixing the error: he cannot fix it because NFS implies other systems we _must_ co-operate with. -- Frank -
I think there are two reasons. First, I have no problem with the new behavior if it didn't cause a regression. I am not sure about the history of other filesystems, but NFS has had the old behavior for ages, and people get used to it. Second, NFS is actually special as this particular setup is very common and you'll get into this situation far too easily, as from the server you could export two directories within a filesystem as if they were two filesystems. Very few people actually want to mount the same local filesystem multiple times, but under NFS this is the norm. Last but not the least, NFS is often controlled by central corporate policies (autofs/nis), and has to work with various clients. For example, it's not possible to add "nosharecache" to auto.auto as almost nobody -
This all came about due to complaints about not being able to mount the
same server file system with different options, most commonly ro vs. rw,
which I think was due to the shared super block changes some time ago.
And, to some extent, I have to plead guilty for not complaining enough
about this default in the beginning, which is basically unacceptable for
sure.
We have seen breakage in Fedora with the introduction of the patches and
this is typical of it. It also breaks amd and admins have no way of
altering this that I'm aware of (help us here Ion).
I understand Trond's concerns but the fact remains that other Unixes allow
this behaviour but don't assert cache coherency, and many sysadmins don't
realize this. So the broken behaviour is expected to work and we can't
simply stop allowing it unless we want to attend a public hanging with us
as the participants.
There is no question that the new behaviour is worse and this change is
unacceptable as a solution to the original problem.
I really think that reversing the default, as has been suggested,
documenting the risk in the mount.nfs man page and perhaps issuing a
warning from the kernel is a better way to handle this. At least we will
be doing more to raise public awareness of the issue than others.
Ian
-
I can only second that. Changing the default behaviour in this way is
really bad. Not that I am disagreeing with the technical reasons, but the
change breaks working setups. And -EBUSY is not very helpful as a message
here. It does not matter that the user tools may handle the breakage
incorrectly. The users (admins) had working setups for years. And they
were obviously working "good enough".
And one should not forget that there will be a considerable time until
"nosharecache" will trickle down into distributions.
If the situation stays this way, quite a few people will not be able to
move beyond 2.6.22 for some time. For example, I am working for a company
that operates some Linux "clusters" at a few German automotive companies.
For certain reasons everything there is based on automounter maps (both
autofs and amd style). We have almost zero influence on that setup. The
maps are a mess - we will run into the sharecache problem. At the same
time I am trying to fight the notorious "system turns into frozen molasses
on moderate I/O load". There may be some interesting developments coming
after 2.6.22. Not good :-(
What I would like to see done for the situation at hand is:
- make "nosharecache" the default for the foreseeable future
- log any attempt to mount option-inconsistent NFS filesystems to dmesg
  and syslog (apparently the NFS client is able to detect them :-). Do
  this regardless of the "nosharecache" option. This way admins will at
  least be made aware of the situation.
- In a year or so we can talk about making the default safe. With proper
  advertising.
Just my 0.02.
Cheers
Martin
------------------------------------------------------
Martin Knoblauch
email: k n o b i AT knobisoft DOT de
www: http://www.knobisoft.de
-
The best I can do given the constraints appears to be to have the kernel first look for a superblock that matches both the fsid and the user-specified mount options, and then spawn off a new superblock if that search fails. The attached patch does just that. Note that this is not the same as specifying nosharecache everywhere since nosharecache will never attempt to match an existing superblock. Finally, for the record: I still feel very uncomfortable about not being able to report the state of the client setup back to the sysadmin. AFAIK, the only way to do so is to stat the mountpoints, and compare the device ids. Trond
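The lookup order described above, next to the other behaviours discussed in this thread, can be sketched as a toy model. This is not the kernel code and does not mirror its data structures: the class names, the policy labels and the representation of options as a plain set of strings are all assumptions made for illustration.

```python
import errno
from dataclasses import dataclass


@dataclass(eq=False)  # compare superblocks by identity, not by field values
class Superblock:
    fsid: str
    options: frozenset


class NfsClientModel:
    """Toy model of superblock selection under four hypothetical policies."""

    def __init__(self, policy):
        self.policy = policy
        self.superblocks = []

    def mount(self, fsid, options):
        options = frozenset(options)
        same_fs = [sb for sb in self.superblocks if sb.fsid == fsid]
        if self.policy == "always-share" and same_fs:
            # Old behaviour as described in the thread: reuse the existing
            # superblock and silently ignore the newly requested options.
            return same_fs[0]
        if self.policy == "ebusy" and same_fs:
            # Behaviour that triggered the report: differing options fail.
            if same_fs[0].options == options:
                return same_fs[0]
            raise OSError(errno.EBUSY, "options differ from existing mount")
        if self.policy == "match-or-new":
            # Patch behaviour: share only when fsid AND options both match.
            for sb in same_fs:
                if sb.options == options:
                    return sb
        # "nosharecache" and every fall-through case: new superblock.
        sb = Superblock(fsid, options)
        self.superblocks.append(sb)
        return sb


if __name__ == "__main__":
    for policy in ("always-share", "ebusy", "match-or-new", "nosharecache"):
        client = NfsClientModel(policy)
        client.mount("fs1:/a", {"rw", "actimeo=600"})
        try:
            client.mount("fs1:/a", {"rw"})
            client.mount("fs1:/a", {"rw"})
            outcome = f"{len(client.superblocks)} superblock(s)"
        except OSError as exc:
            outcome = errno.errorcode[exc.errno]
        print(f"{policy}: {outcome}")
```

In this model the three mounts end up on 1 superblock with "always-share", fail with EBUSY under "ebusy", use 2 superblocks with "match-or-new" and 3 with "nosharecache", which is the difference between the patch and a blanket nosharecache that the note above points out.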
I think this is probably acceptable to get roughly the old behaviour, but
I still think it's a bit stupid. What happens at "mount -o remount,..."
time?
The fact is, the whole "match the fsid and user mount options, and re-use
the mount" sounds like it's trying to solve a problem that doesn't need
solving. If the user really wants to duplicate the mount, he really should
be using a bind-mount instead.
In other words, let's assume that the user has /some/nfs/mount mounted
over NFS, and wants to re-mount it (or even just a subset of it) somewhere
else, the sane thing to do is not to mount it again, but to just do
mount --bind /some/nfs/mount/subdir /new/mount/place
instead. That *guarantees* that the low-level filesystem uses the same
flags, and it also means that things like re-mounting have sane and
well-defined semantics, and will fail or succeed predictably.
In contrast, if a user wants to create a new NFS mount, it really should
be independent of the old one, because that's (a) what other systems do,
and (b) also makes the semantics of re-mounting it with other flags be
clear and unambiguous (ie the remount has nothing what-so-ever to do with
the independent NFS mount).
See? This is why I think "nosharecache" should just be the default,
because that's the behaviour that simply does not have any subtle issues.
The *special* case should be the "sharecache" case, and 99% of the time
that one should likely be done with a "--bind" mount.
(I don't really see the point of _ever_ doing anything but a bind mount,
but maybe there are reasons to try to share at a NFS layer that I don't
Hua, does this fix things for you? If it gets rid of the regression, I can
certainly live with it, but as per above, I don't really think this makes
Well, not only don't I see that as being horribly wrong, I actually think
that the sysadmin should know what his mount setup is, even without having
to ask. But since he *can* ask, using easy and standard ...
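The check Trond mentions (stat the mountpoints and compare the device ids) can be sketched from user space as follows. It rests on the assumption, taken from his remark, that mounts sharing a superblock report the same device id; the helper name is made up for this illustration.

```python
import os
from collections import defaultdict


def group_by_device(mountpoints):
    """Group paths by the st_dev value that stat() reports for each one."""
    groups = defaultdict(list)
    for path in mountpoints:
        groups[os.stat(path).st_dev].append(path)
    return dict(groups)


if __name__ == "__main__":
    import sys

    # Usage: pass the mountpoints to compare as command-line arguments.
    for dev, paths in group_by_device(sys.argv[1:]).items():
        print(f"dev {os.major(dev)}:{os.minor(dev)} -> {', '.join(paths)}")
```

Mountpoints printed on the same line would then be the ones sharing caches; mountpoints on different lines would be independent of each other.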
I agree for the cases where you can use bind mounts, however you can't
always do that. Consider the fairly common setup where /foo, /foo/a,
/foo/b are all on the same filesystem on the server, but only /foo/a and
/foo/b are exported.
There can be plenty of files that have hard links in both directories, but
because you cannot mount the parent, /foo, you will not be able to ensure
that these common files are cached to the same inode (which they need to
be).
IOW: with this scenario, you can't ensure that local posix semantics hold
(i.e. that if my client is the only user, then the filesystem will behave
as if it were a posix filesystem). That would be a major
(a) I'm not sure that is true: see (b).
(b) You gain remount clarity at the expense of local posix filesystem
correctness.
Trond
-
What about a superset? What about two intersecting sets? Bind mounts aren't quite it for this problem, and in any case your suggestion of: -
That helps one case, yes, but what about a superset? What about two sets
that might intersect but for which you don't have the common root to hand?
The current NFS code deals with all these problems by attempting to share
the dentry sets. Superblocks can now have multiple roots and we graft
trees together automatically when we discover one is a subset of another.
The case I came up with was this:
mount home:/home/fred /home/fred
mount home:/home/jim /home/jim
To effect these, the NFS mount process looks up "/home/fred" or
"/home/jim" directly rather than looking up "/" and path walking. However,
the NFS client in the kernel may note that both Fred's and Jim's home
directories reside on the same NFS volume.
You cannot use a bind mount here because there's nothing to bind from.
Then, should, say, this happen:
mount home:/home /mnt
You'll probably end up with three roots in the NFS superblock. Following
with an ls of /home, say, would then populate the dentries for /home -
including those for fred and jim, and the code would splice in the
dentries now rooted at /home/fred and /home/jim.
You can't do that with bind mounts as far as I know because I don't
believe that you can go up the tree (rootwards) from the apparent root of
a vfsmount.
So bind mounts aren't quite it for this problem, and in any case your
suggestion of:
mount --bind /some/nfs/mount/subdir /new/mount/place
doesn't help with the automounter case particularly well. The automounter
*could* probe to see if the server stuff is common with an already
existing mount, but there would then be a race, and it doesn't help with
the homedir example I gave above either.
You might think "well, start by mounting '/' somewhere and then bind
mounting subdirs of it", but that doesn't work if you can't mount "/" or
"/home", and might go spectacularly wrong if the server has a symlink in
the path that you
Yeah, that's probably necessary, if annoying. However, local caching can
The reason I added ...
The much more trivial case is
mount -o ro server:/usr/bin /usr/share/bin
mount server:/usr/tmp /usr/share/tmp
and now tell me any reasonable reason why this should fail? (Replace
"-o ro" with any other attributes).
Quite frankly, if the above two mounts fail - just because /usr/bin and
/usr/tmp happen to be on the same filesystem on the server - then the
implementation is more than just buggy - it's a pure piece of shit.
And quite frankly, as far as I can tell, that was exactly what the NFS
changes that are being discussed did. They failed the equivalent of the
second mount, because it didn't have the same flags as the first one.
I'm just saying that the whole "require all mount flags to be identical,
and error out if they are not" is pure and utter CRAP.
So anything that does that - for *any* reason what-so-ever - is just
broken. If you require identical mount-time flags, that absolutely has to
be a special case (like using "--bind", or perhaps using a special option
like "sharecache").
It really is that simple. I don't know how anybody could possibly ever
dispute that.
As far as I can tell, the current situation in NFS is "reasonably ok", but
I already asked Trond about what happens with "remount" with the "same
mount options imply sharecache" code that he did, and afaik, I never got
an answer.
In other words, let's change the above two commands to the following three
commands:
mount server:/usr/bin /usr/share/bin
mount server:/usr/tmp /usr/share/tmp
mount -o remount,ro /usr/share/bin
and I'm claiming that if the above fails (or remounts /usr/share/tmp as
read-only too), then it's also obvious CRAP (replace "ro" with any other
possible attribute - whether cache timeouts or similar)
See? It really is that simple. The obvious mount usage above absolutely
*has* to work, and anything that breaks it is crap, crap, crap.
And that was exactly what apparently happened here, and I really don't see
why anybody has the *gall* to ...
This patch fixes the problem for me, thanks.

Is this patch changing the behavior of "sharecache" to "try-to-share-cache-if-possible", or adding a third behavior? If the user explicitly asks for "-o sharecache", does he get an error back if the mount

There has never been a 'sharecache' flag as far as the kernel is concerned. The default behaviour has always been to share.

Trond

It's not about the default (for which backward compatibility is most important, and this patch is perfectly fine), but about the case where the user explicitly asks for "sharecache". In that case, if for any reason the cache cannot be shared, I am not sure whether he should get an error back.

I for one agree with Ian and Linus that changing the default to nosharecache might be the best thing to do, but since I am now able to use the latest kernel, I am very happy already.

I'm glad I read the whole thread, because when I saw it earlier and didn't respond, this was the question I had, why not replace the error

Since clients may not know the server setup, and it may change for policy or error recovery reasons, I think this patch is needed. The cases I think are common are:

1 - single export, multiple client mounts
      export /base - rw
      mount /base/share - ro     [ client enforces r/o or not ]
      mount /base/upload - rw

2 - export parts of a filesystem (/base)   [ server enforces access ]
      export /base/share - ro    [ hopefully really r/o on client ]
      export /base/upload - rw   [ should work for write ]

3 - mount the same f/s with different permissions on client
      export /base - rw
      mount /base on point1 - rw [ hopefully really r/w ]
      mount /base on point2 - ro [ hopefully r/o ]
    I consider this *really* bad practice, but I have seen it in enough places to know others don't agree. It assumes the client will protect the r/o data.

4 - export f/s and part of f/s
      export /base/ - ro
      export /base/upload - rw
    clients may mount one or both, with the upload directory as part of base

-- 
Bill Davidsen <davidsen@tmr.com>
"We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot

I think Al Viro probably has the right idea as to how to fix this: Move the R/O R/W flag into vfsmount and count the number of R/W vfsmounts in the superblock.

I never quite finished implementing the patch to do this, but I can go back and revisit it.

David

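The proposal above (a per-mount R/O flag plus a count of R/W mounts on the shared superblock) can be sketched as a toy model in Python. This is an illustration only, not the unfinished patch: the class names `SuperBlock` and `VfsMount`, the `rw_mounts` counter and the `remount` method are all hypothetical. The usage lines replay the three-command remount example given earlier in the thread.

```python
from dataclasses import dataclass


@dataclass
class SuperBlock:
    """Hypothetical stand-in for a superblock shared by several mounts."""
    name: str
    rw_mounts: int = 0  # number of attached mounts that are read-write

    @property
    def read_only(self) -> bool:
        # The superblock is effectively read-only only when no mount can write.
        return self.rw_mounts == 0


@dataclass
class VfsMount:
    """Hypothetical stand-in for a vfsmount; the R/O flag lives here."""
    sb: SuperBlock
    mountpoint: str
    read_only: bool = False

    def __post_init__(self) -> None:
        if not self.read_only:
            self.sb.rw_mounts += 1

    def remount(self, read_only: bool) -> None:
        # Flip only this mount's flag and keep the superblock count in step.
        if read_only == self.read_only:
            return
        self.sb.rw_mounts += -1 if read_only else 1
        self.read_only = read_only

    def may_write(self) -> bool:
        return not self.read_only


sb = SuperBlock("server:/usr")
bin_mnt = VfsMount(sb, "/usr/share/bin")
tmp_mnt = VfsMount(sb, "/usr/share/tmp")
bin_mnt.remount(read_only=True)  # mount -o remount,ro /usr/share/bin
```

In this model the remount affects /usr/share/bin alone; /usr/share/tmp stays writable because the flag is no longer a property of the shared superblock.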
I think Al had a good idea there; that is nice and clean. What about bind mounts, will that just fall out?

-- 
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979

I don't see that it should be a problem since the vfsmount is copied.

David

But what about mounting with a different protocol, tcp vs udp for example?

Ian

I was referring specifically to the R/O / R/W variants of the same mount.

Any flag variation that varies the way the NFS client talks to the NFS server must either result in a new superblock or be ignored.

David

We currently ignore remount requests that attempt to change the NFS mount parameters. This is not new behaviour, BTW: it has always been the case, and nobody has ever requested it.

The ro flag is different, and I agree that it should be moved to the vfsmount structure. I'm hoping Dave Hansen's patches will be ready for merging soon...

Trond

Yes, I only mentioned it because I'm aware of it. I've not paid much attention to it because there haven't been any complaints so far and it's been a long time.

Ian

With the patch that Linus merged, we will fork off a new superblock.

Trond

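The rule discussed in the last few messages (share a superblock when the server-facing options match, fork a new one when they differ, instead of failing) can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the `SuperBlockTable` class, the `WIRE_OPTIONS` tuple and the `(server, fsid)` key are hypothetical, and "proto" is used only because tcp vs udp is the example raised in the thread. The real set of options the kernel compares is not given here.

```python
# Options assumed, for this sketch only, to change how the client talks to
# the server. "proto" (tcp vs udp) is the example raised in the thread.
WIRE_OPTIONS = ("proto",)


class SuperBlockTable:
    """Toy model of the lookup that decides between sharing and forking."""

    def __init__(self):
        self._by_key = {}

    def get(self, server_fs, options):
        # Two mounts share a superblock only when they refer to the same
        # server filesystem and agree on every server-facing option.
        wire = tuple((name, options.get(name)) for name in WIRE_OPTIONS)
        key = (server_fs, wire)
        if key not in self._by_key:
            # Mismatch with everything seen so far: fork a new superblock
            # rather than returning an error to the caller.
            self._by_key[key] = {"fs": server_fs, "wire": dict(wire)}
        return self._by_key[key]

    def __len__(self):
        return len(self._by_key)


table = SuperBlockTable()
home_fs = ("home", "fsid-1")  # hypothetical (server, filesystem id) pair
fred = table.get(home_fs, {"proto": "tcp"})
jim = table.get(home_fs, {"proto": "tcp"})
mnt = table.get(home_fs, {"proto": "udp"})
```

Here `fred` and `jim` end up on one shared superblock, while the udp mount gets its own, with the caching consequences described for "nosharecache" at the top of the thread.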
This does not have any relation with the mount problem, assuming commit and comment do match.

-- 
Frank

That's right. The commits we're discussing here are (I believe):

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=75180df2ed...
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=275a5d24bf...

The latter being the one returning EBUSY for the option mismatch and the former the addition of the "nosharecache" option.

Ian
