s/mm-commits/lkml/
OK, thanks. This leads to another question which I forgot to ask.
This patch has a lot of complications because it tries to preserve the
current behaviour: we release the bprm->file when all VM_EXECTUABLE vmas
are unmapped. Q: is this so important/useful? I don't think this is very
common case, and I don't quite understand why it is critical to release
the file. To unmount fs after starting the app? One can always copy
the file before execing, or do "/lib/ld-linus.so application" and then
unmap the vmas.Well, we only need down_read(mmap_sem) for the very short time. /proc/pid/maps
Sorry, I was wrong.
mmput() has to release ->exe_file if it is called when exec fails before the
first do_mmmap(MAP_EXECUTABLE). This also means that it is not completely
trivial to set ->exe_file before exec_mmap(), it can fail. This is solvable,
but I'm not sure we should do this.Still, the accounting looks a little bit fragile to me. flush_old_exec()
increments ->f_count but sets ->num_exe_file_vmas = 0 because we know that
the next elf_map() will bump ->num_exe_file_vmas and thus "sync" 2 counters.
But I don't see how to do better if we really want to release the file when
VM_EXECUTABLE disappears.Oleg.
--
Yes. While most programs don't need this it is still very important for
some critical programs to be able to unmap the executable and thereby
allow unmounting the filesystem. Unfortunately, I don't have a confirmed
specific example for you. A wild guess: some distro install or live CDsSo, rather than what ld-linux.so does now:
$ /lib/ld-linux.so.2 /bin/sleep 100 &
[1] 12645
$ cat /proc/12645/maps
08048000-0804b000 r-xp 00000000 03:04 606421 /bin/sleep
0804b000-0804c000 rw-p 00003000 03:04 606421 /bin/sleep
80000000-8001c000 r-xp 00000000 03:04 688355 /lib/ld-2.7.so
8001c000-8001e000 rw-p 0001b000 03:04 688355 /lib/ld-2.7.sohave ld-linux.so copy the mmap'd executable areas and then unmap the
originals. So it would look roughly like:0804c000-0804f000 r-xp 00000000 03:04 606421
0804f000-08050000 rw-p 00003000 03:04 606421
80000000-8001c000 r-xp 00000000 03:04 688355 /lib/ld-2.7.so
8001c000-8001e000 rw-p 0001b000 03:04 688355 /lib/ld-2.7.soThen there'd be no need to have the extra reference counting this patch
adds.I think these approaches could subtly break existing userspace
applications which don't already use these techniques. Furthermore, I
wonder if some applications may wish to unmount 'everything'. This means
there may be no mount that's acceptable to pin by either copying to or
using a modified ld-linux.so.Fixing the problem in userspace with these techniques also requires a
non-trivial audit of userspace. There could easily be two tasks that
have little or no apparent relation to each other. One does the unmap
trick and the other expects to be able to unmount. The first would then
need to be modified to cp the executable to a suitable location or
utilize a modified ld-linux.so.I'd be happy to submit a patch removing the extra reference counting if
there's a way to avoid breaking userspace or if there's consensus thatOK. I'll post a patch to remove the spinlock and replace it with
OK, I'll leave it unless somethi...
Err, I was double wrong. It _is_ trivial to set ->exe_file before exec_mmap(),
flush_old_exec:
+ get_file(bprm->file);
+ set_mm_exe_file(bprm->mm, bprm->file);
retval = exec_mmap(bprm->mm);
if (retval)
goto mmap_failed;bprm->mm = NULL; /* We're using it now */
If exec_mmap() fails, the caller (do_execve) has to mmput(bprm->mm)
anyway, and this imply set_mm_exe_file(NULL). This way set_mm_exe_file()
doesn't need any locking.Not that this is relly important, but still.
However. I didn't notice this patch plays with #ifdef CONFIG_PROC_FS.
Without CONFIG_PROC_FS we seem to leak bprm->file, I'd suggest to moveThanks for your answers ;)
Oleg.
--
On Wed, 2008-01-30 at 14:06 +0300, Oleg Nesterov wrote:
There are portions in there to deal with the presence or lack of proc
fs. In include/linux/mm_types.h for example. Also, I placed some of the
functions in fs/proc/base.c and avoided the need for #ifdefs in the .cGood catch. Also I noticed that I was still setting the exe_file field
in fork.c regardles of CONFIG_PROC_FS.So based on your feedback I'm working on 3 patches:
1 - fix the two CONFIG_PROC_FS issues
a) breaks compilation when CONFIG_PROC_FS is not defined
b) struct file leak when CONFIG_PROC_FS is not defined
2 - reuse mmap semaphore
3 - move the mm initialization bits in the exec path to close the window
where userspace could see a "NULL" exe_file.I will post each after it passes testing.
Thanks again!
Cheers,
-Matt Helsley--
| Rafael J. Wysocki | [Bug #10493] mips BCM47XX compile error |
| Ingo Molnar | [patch 02/13] syslets: add syslet.h include file, user API/ABI definitions |
| Greg KH | [GIT PATCH] driver core patches against 2.6.24 |
| Andrea Arcangeli | [PATCH 00 of 11] mmu notifier #v16 |
git: | |
| David Miller | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Gerrit Renker | [PATCH 27/37] dccp: Integration of dynamic feature activation - part 2 (server side) |
| Linus Torvalds | Re: [GIT]: Networking |
| Mark Lord | Re: [BUG] New Kernel Bugs |
