Cyrus mmap vs lseek/write usage - (WAS: BUG: mmapfile/writev spurious zero bytes (x86_64/not i386, bisected, reproducable))

On Tue, Jun 17, 2008 at 09:03:17PM -0700, Linus Torvalds wrote:

Portability[tm].

It actually does use MAP_SHARED already, but only for reading.
Writing is all done with seeks and writes, followed by a map
"refresh", which is really just a munmap/mmap if the file has
extended past the "SLOP" (between 8 and 16 k after the end of
the file length at last mapping).
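Roughly, the refresh logic looks like the sketch below.  This is
not the actual Cyrus code: the struct, the function names and the
exact rounding are my assumptions, chosen so that the slack past
the end of the file comes out between 8 and 16 k as described.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

#define SLOP (8 * 1024)   /* assumed slack unit; gives 8-16k past EOF */

/* Hypothetical handle for a read-only map; not the Cyrus structure. */
struct ro_map {
    const char *base;     /* start of the mapping, or NULL if unmapped */
    size_t maplen;        /* number of bytes currently mapped */
};

/* Add at least SLOP bytes of slack, then round up to a SLOP multiple. */
size_t padded_len(size_t filelen)
{
    return (filelen + 2 * SLOP - 1) / SLOP * SLOP;
}

/* Remap only when the file has grown past what is already mapped.
 * Returns 0 on success, -1 if mmap() fails. */
int ro_map_refresh(int fd, struct ro_map *m, size_t filelen)
{
    assert(m != NULL);

    if (m->base != NULL && filelen <= m->maplen)
        return 0;                     /* growth still fits in the slop */

    if (m->base != NULL)
        munmap((void *)m->base, m->maplen);

    size_t newlen = padded_len(filelen);
    void *p = mmap(NULL, newlen, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        m->base = NULL;
        m->maplen = 0;
        return -1;
    }
    m->base = p;
    m->maplen = newlen;
    return 0;
}
```

The point of the slack is that a file which grows by small appends
only needs a munmap/mmap once every 8 k or so, not after every
write.  Callers must still only read up to the real file length,
since touching pages wholly past the end of the file is not safe.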

I can't find the place right now where the reasoning behind
this was explained to me.  The theory, I think, was that mmap write
support is unreliable across systems, but read support is pretty
good (except on HP-UX, for which there is map_stupidshared.c).


I'm not actually a maintainer for Cyrus, I just write patches to
keep our mail servers working.  I'll pass this on.


Yeah, indeed.

I suspect the response from the Cyrus side might include a
small dose of "POSIX says it's valid to do this, and making
it work is the kernel programmers' lookout".

Ahh - I found the explanation in doc/internal/hacking in
the Cyrus source tree.  While 'ack' is a nice tool, it
doesn't check files with no extension by default.  Ho hum:

- map_refresh and map_free

  - In many cases, it is far more effective to read a file via the operating
    system's mmap facility than it is via the traditional read() and
    lseek() system calls.  To this end, Cyrus provides an operating system
    independent wrapper around the mmap() services (or lack thereof) of the
    operating system.

  - Cyrus currently only supports read-only memory maps, all writes back
    to a file need to be done via the more traditional facilities.  This is
    to enable very low-performance support for operating systems which do not
    provide an mmap() facility via a fake userspace mmap.

  - To create a map, simply call map_refresh on the map (details are in
    lib/map.h).  To free it, call map_free on the same map.

  - Despite the fact that the maps are read-only, it is often useful to open
    the file descriptors O_RDWR, especially if the file descriptors could
    possibly be used for writing elsewhere in the code. Some operating
    systems REQUIRE file descriptors that are mmap()ed to be opened
    O_RDWR, so just do it.
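To make the "fake userspace mmap" idea concrete, here is a minimal
sketch of what such a fallback could look like: the "map" is just
a heap copy of the file that gets re-read on every refresh.  All
names here (fake_map, fake_map_refresh, fake_map_free,
fake_map_demo) are made up for illustration and are not the
map_refresh/map_free interface from lib/map.h.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a map on systems without mmap(). */
struct fake_map {
    char *base;   /* malloc'd copy of the file, or NULL */
    size_t len;   /* valid bytes in base */
};

/* Re-read the first filelen bytes of fd into the heap buffer.
 * Every refresh copies the data again, hence the low performance. */
int fake_map_refresh(int fd, struct fake_map *m, size_t filelen)
{
    assert(m != NULL);
    char *buf = realloc(m->base, filelen ? filelen : 1);
    if (buf == NULL)
        return -1;
    m->base = buf;
    m->len = 0;
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return -1;
    while (m->len < filelen) {
        ssize_t n = read(fd, buf + m->len, filelen - m->len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;              /* file is shorter than the caller said */
        m->len += (size_t)n;
    }
    return 0;
}

void fake_map_free(struct fake_map *m)
{
    free(m->base);
    m->base = NULL;
    m->len = 0;
}

/* Usage: open O_RDWR, write the traditional way, then refresh the "map". */
int fake_map_demo(const char *path)
{
    static const char msg[] = "hello";
    struct fake_map m = { NULL, 0 };
    int ok = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;
    if (write(fd, msg, 5) == 5 && fake_map_refresh(fd, &m, 5) == 0)
        ok = (m.len == 5 && memcmp(m.base, msg, 5) == 0) ? 0 : -1;
    fake_map_free(&m);
    close(fd);
    unlink(path);
    return ok;
}
```

A copy like this can never see writes made through the file
descriptor until the next refresh, which is presumably why the
documented interface keeps maps read-only and sends every write
through the traditional calls.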

If I was God in the Cyrus world (woot) I suspect I'd 
provide some sort of OS independent wrapper around the 
various write functions, using mmap where appropriate,
while still working via lseek/write for those systems 
without mmap support.
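Something along these lines, as a sketch only.  Nothing like this
exists in Cyrus as far as I know; store_at, store_mmap,
store_seek_write, store_demo and the HAVE_MMAP_WRITE switch are
all hypothetical names.  The mmap path grows the file with
ftruncate() when needed, and neither path attempts msync/fsync.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback path: position with lseek(), then write() until done. */
int store_seek_write(int fd, off_t off, const void *buf, size_t len)
{
    const char *p = buf;
    if (lseek(fd, off, SEEK_SET) == (off_t)-1)
        return -1;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* mmap path: grow the file if needed, map a page-aligned window
 * covering [off, off+len), and memcpy into it.  fd must be O_RDWR. */
int store_mmap(int fd, off_t off, const void *buf, size_t len)
{
    struct stat sb;
    if (len == 0)
        return 0;
    if (fstat(fd, &sb) == -1)
        return -1;
    if (sb.st_size < off + (off_t)len &&
        ftruncate(fd, off + (off_t)len) == -1)
        return -1;
    long pagesz = sysconf(_SC_PAGESIZE);
    if (pagesz <= 0)
        return -1;
    off_t start = off - (off % pagesz);      /* mmap offsets are page-aligned */
    size_t delta = (size_t)(off - start);
    char *win = mmap(NULL, len + delta, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, start);
    if (win == MAP_FAILED)
        return -1;
    memcpy(win + delta, buf, len);
    return munmap(win, len + delta);
}

/* The wrapper callers would use; the switch is a hypothetical
 * configure-time decision. */
int store_at(int fd, off_t off, const void *buf, size_t len)
{
#ifdef HAVE_MMAP_WRITE
    return store_mmap(fd, off, buf, len);
#else
    return store_seek_write(fd, off, buf, len);
#endif
}

/* Usage: write via one path, patch via the other, read the result back. */
int store_demo(const char *path)
{
    char got[11];
    int ok = -1;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return -1;
    if (store_seek_write(fd, 0, "hello world", 11) == 0 &&
        store_mmap(fd, 6, "Cyrus", 5) == 0 &&
        pread(fd, got, sizeof got, 0) == (ssize_t)sizeof got)
        ok = memcmp(got, "hello Cyrus", 11) == 0 ? 0 : -1;
    close(fd);
    unlink(path);
    return ok;
}
```

Choosing the path in one place means a given build never mixes
mmap stores and write() calls on the same file, while systems
without usable mmap write support keep the lseek/write behaviour
they have today.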

(Added CC: Ken Murchison, on the grounds that he actually 
is God in the Cyrus world)

Thanks for the good explanation.  I'll have a look at the
Cyrus code and see just how tricky that would actually be
(even just doing the skiplist, index and cache code would
hit 99% of the cases where files are both mmaped and
written concurrently).

Bron.

--
Messages in current thread:
Cyrus mmap vs lseek/write usage - (WAS: BUG: mmapfile/writ ..., Bron Gondwana, (Tue Jun 17, 10:11 pm)