On Mon, Oct 4, 2010 at 5:38 PM, Daichi GOTO <daichi@ongs.co.jp> wrote:
Ok, weird. fstat on the file didn't show anything nasty when I
ran the app, and deleting the file in /tmp let the server run for a
while before dying, instead of dying right away as it did on the
second try.
On Linux (RHEL 4.8):
Window 1:
$ ls -l /tmp/lockfile
ls: /tmp/lockfile: No such file or directory
$ ./test_fcntl
Window 2:
$ ls -l /tmp/lockfile
--wxr-s--T 1 garrcoop eng 0 Oct 4 19:49 /tmp/lockfile
$ ./test_fcntl
test_fcntl: fcntl: Resource temporarily unavailable
Ok. This (EAGAIN) matches the Linux requirements specified in the
manpage [1] I found, as well as the POSIX manpage [2]. The author is
wrong about fcntl removing the file at exit though:
$ ls -l /tmp/lockfile
--wxr-s--T 1 garrcoop eng 0 Oct 4 19:49 /tmp/lockfile
The file descriptor is closed though, so I can remove it at will:
$ rm /tmp/lockfile
$ ls -l /tmp/lockfile
ls: /tmp/lockfile: No such file or directory
Following through the same process on FreeBSD...
Window 1:
$ ls -l /tmp/lockfile
ls: /tmp/lockfile: No such file or directory
$ ./test_fcntl
Window 2:
$ ls -l /tmp/lockfile
-rwsr-x--- 1 garrcoop wheel 0 Oct 4 20:14 /tmp/lockfile
$ ./test_fcntl
test_fcntl: fcntl: Resource temporarily unavailable
Well, lookie here! It locked as expected :).
$ ls -l /tmp/lockfile
-rwsr-x--- 1 garrcoop wheel 0 Oct 4 20:14 /tmp/lockfile
$ rm /tmp/lockfile
$ ls -l /tmp/lockfile
ls: /tmp/lockfile: No such file or directory
So basic fcntl(2) locking behaves the same on both systems, and
something else is going on in the application's locking code that
needs to be resolved.
That aside, after modifying the test app a bit, I'm confused by
the value of l_pid...
Window 1:
$ ./test_fcntl
My pid: 5372
Window 2:
$ ./test_fcntl
My pid: 5373
test_fcntl: fcntl: Resource temporarily unavailable
PID=1 has the lock
Huh...? init has the file locked...? WTF?!
Applying Occam's Razor, I did a bit more reading, and it turns
out that l_pid is only populated when you call fcntl(2) with F_GETLK:
negative, l_start means end edge of the region. >>> The l_pid and l_sysid
fields are only used with F_GETLK to return the process ID of the process
holding a blocking lock and the system ID of the system that owns that
process. Locks created by the local system will have a system ID of
zero. <<< After a successful F_GETLK request, the value of l_whence is
SEEK_SET.
Thus, after fixing the test app I'm getting a sensible value:
Window 1:
$ ./test_fcntl
My pid: 5394
Window 2:
$ ./test_fcntl
My pid: 5395
test_fcntl: fcntl[1]: Resource temporarily unavailable
PID=5394 has the lock
Linux operates in the same manner:
Window 1:
$ ./test_fcntl
My pid: 17861
Window 2:
$ ./test_fcntl
My pid: 17963
test_fcntl: fcntl[1]: Resource temporarily unavailable
PID=17861 has the lock
That is what I would expect, because I'm not using anything exotic
with fcntl(2) / open(2).
I suspect mozc isn't properly initializing / calling fcntl(2), or
the author used a non-POSIX extension that is implementation-dependent
and doesn't realize it (the Linux manpage has a pretty fat set of
warnings about POSIX compatibility at the top). The developer might
also want to pass O_EXCL in the flags to open(2), unless they want to
lock specific sections of the file.
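A minimal sketch of the O_EXCL approach, for comparison. This is only an
illustration with a hypothetical helper name (create_lockfile) and a mode
I picked myself. With O_CREAT | O_EXCL, open(2) fails with EEXIST if the
file already exists, so the file's existence is the lock. The trade-off is
that the owner has to unlink the file when done, and a crashed process
leaves a stale lock file behind.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Atomically create path as a lock file.
 * Returns an open descriptor on success, or -1 with errno set
 * (EEXIST if the file is already there, i.e. someone else "holds" it).
 */
int
create_lockfile(const char *path)
{
	return (open(path, O_WRONLY | O_CREAT | O_EXCL, 0600));
}
```

The holder would call close(2) and unlink(2) on the path to release it.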
Verified on UFS2 with SUJ. Test app attached.
[...]
No problem :).
Cheers,
-Garrett
[1] http://linux.die.net/man/2/fcntl
[2] http://www.opengroup.org/onlinepubs/009695399/functions/fcntl.html