On Saturday 13 February 2010 14:39:52 Ilari Liusvaara wrote:
That is still racy. The real problem is that the file is mmap()ed,
and git then first computes the SHA1 of that buffer, next it
compresses it.[*]
Due to the last sentence in the following snippet from mmap(2):
MAP_PRIVATE
Create a private copy-on-write mapping. Updates to the map-
ping are not visible to other processes mapping the same
file, and are not carried through to the underlying file.
It is unspecified whether changes made to the file after the
mmap() call are visible in the mapped region.
This is racy despite the use of MAP_PRIVATE: the mapped contents can
change at any time.
AFAICS there are only two possible solutions:
* Copy the file (possibly block-by-block) as we go, to make sure that
the data we SHA1 is the same we compress.
* Unpack and re-hash the compressed data to verify that the SHA1 is
correct. In case of failure either retry (but you could have to do
this infinitely often if the user just hates you!) or abort.
(Of course, in neither case does the user have any sort of guarantee
about what data ended up in the repository, but he never had that, we
only try to ensure repo consistency.)
[*] The "do we have this" check actually happens before the
compression, and that arm is thus race-free.
--
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html