Re: [PATCH] don't use mmap() to hash files

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Dmitry Potapov
Date: Sunday, February 14, 2010 - 12:06 pm

On Sun, Feb 14, 2010 at 9:10 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:

"much more likely" is not a very qualitative characteristic... I would
prefer to see numbers. My gut feeling is that it is not a problem in
real use cases where Git is used as VCS and not storage for huge media
files.  In fact, we do not use mmap() on Windows or MacOS at all, and
I have not heard that users of those platforms suffered much more from
inability to store huge files than those who use Linux.  I do not want
to say that there is no difference, but it may not as large as you try
to portray. In any case, my patch was to let people to test it and to
see what impact it has and come up with some numbers.

BTW, probably, it is not difficult to stream a large file in chunks (and
it may be even much faster, because we work on CPU cache), but I suspect
it will not resolve all issues with huge files, because eventually we
need to store them in a pack file. So we need to develop some strategy
how to deal with them.

One way to deal with them is to stream directly into a separate pack.
Still, it does not resolve all problems, because each pack file should
be mapped into a memory, and this may be a problem for 32-bit system
(or even 64-bit systems where a sysadmin set limit on amount virtual
memory available a single program).

The other way to handle huge files is to split them into chunks.
http://article.gmane.org/gmane.comp.version-control.git/120112

Maybe there are other approaches. I heard some people tried to do
something about it, but i have never interested in big files to look
at this issue closely.


I believe it did, or I did not understand your question.

Dmitry
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
[PATCH] don't use mmap() to hash files, Dmitry Potapov, (Sat Feb 13, 6:18 pm)
Re: [PATCH] don't use mmap() to hash files, Junio C Hamano, (Sat Feb 13, 6:37 pm)
Re: mmap with MAP_PRIVATE is useless, Junio C Hamano, (Sat Feb 13, 6:53 pm)
Re: [PATCH] don't use mmap() to hash files, Johannes Schindelin, (Sat Feb 13, 6:53 pm)
Re: [PATCH] don't use mmap() to hash files, Junio C Hamano, (Sat Feb 13, 7:00 pm)
Re: mmap with MAP_PRIVATE is useless, Paolo Bonzini, (Sat Feb 13, 7:11 pm)
Re: [PATCH] don't use mmap() to hash files, Dmitry Potapov, (Sat Feb 13, 7:18 pm)
Re: [PATCH] don't use mmap() to hash files, Dmitry Potapov, (Sat Feb 13, 7:42 pm)
[PATCH v2] don't use mmap() to hash files, Dmitry Potapov, (Sat Feb 13, 8:05 pm)
Re: [PATCH] don't use mmap() to hash files, Junio C Hamano, (Sat Feb 13, 8:14 pm)
Re: [PATCH] don't use mmap() to hash files, Jakub Narebski, (Sun Feb 14, 4:07 am)
Re: [PATCH] don't use mmap() to hash files, Thomas Rast, (Sun Feb 14, 4:14 am)
Re: [PATCH] don't use mmap() to hash files, Junio C Hamano, (Sun Feb 14, 4:46 am)
Re: [PATCH] don't use mmap() to hash files, Paolo Bonzini, (Sun Feb 14, 4:55 am)
Re: [PATCH] don't use mmap() to hash files, Johannes Schindelin, (Sun Feb 14, 11:10 am)
Re: [PATCH] don't use mmap() to hash files, Dmitry Potapov, (Sun Feb 14, 12:06 pm)
Re: [PATCH] don't use mmap() to hash files, Johannes Schindelin, (Sun Feb 14, 12:22 pm)
Re: [PATCH] don't use mmap() to hash files, Johannes Schindelin, (Sun Feb 14, 12:28 pm)
Re: [PATCH] don't use mmap() to hash files, Dmitry Potapov, (Sun Feb 14, 12:55 pm)
Re: [PATCH] don't use mmap() to hash files, Dmitry Potapov, (Sun Feb 14, 12:56 pm)
Re: [PATCH] don't use mmap() to hash files, Avery Pennarun, (Sun Feb 14, 4:13 pm)
Re: [PATCH] don't use mmap() to hash files, Zygo Blaxell, (Sun Feb 14, 4:52 pm)
Re: [PATCH] don't use mmap() to hash files, Nicolas Pitre, (Sun Feb 14, 9:16 pm)
Re: [PATCH] don't use mmap() to hash files, Avery Pennarun, (Sun Feb 14, 10:01 pm)
Re: [PATCH] don't use mmap() to hash files, Nicolas Pitre, (Sun Feb 14, 10:05 pm)
Re: [PATCH] don't use mmap() to hash files, Nicolas Pitre, (Sun Feb 14, 10:48 pm)
Re: [PATCH] don't use mmap() to hash files, Paolo Bonzini, (Mon Feb 15, 12:48 am)
Re: [PATCH] don't use mmap() to hash files, Dmitry Potapov, (Mon Feb 15, 5:23 am)
Re: [PATCH] don't use mmap() to hash files, Dmitry Potapov, (Mon Feb 15, 5:25 am)
Re: [PATCH] don't use mmap() to hash files, Avery Pennarun, (Mon Feb 15, 12:19 pm)
Re: [PATCH] don't use mmap() to hash files, Nicolas Pitre, (Mon Feb 15, 12:29 pm)
16 gig, 350,000 file repository, Bill Lear, (Thu Feb 18, 1:11 pm)
Re: 16 gig, 350,000 file repository, Nicolas Pitre, (Thu Feb 18, 1:58 pm)
Re: 16 gig, 350,000 file repository, Erik Faye-Lund, (Fri Feb 19, 2:27 am)
Re: 16 gig, 350,000 file repository, Bill Lear, (Mon Feb 22, 3:20 pm)
Re: 16 gig, 350,000 file repository, Nicolas Pitre, (Mon Feb 22, 3:31 pm)