On Wed, 12 Dec 2007, Nicolas Pitre wrote:

OK, scrap that. When I returned to the computer this morning, the repack was completed... with a 1.3GB pack instead.

So... The gcc repo apparently really needs a large window to efficiently compress those large objects. But when those large objects are already well deltified and you repack again with a large window, somehow the memory allocator is way more involved, probably even more so when there are several threads in parallel amplifying the issue, and things probably get to a point of no return with regard to memory fragmentation after a while.

So... my conclusion is that the glibc allocator has fragmentation issues with this workload, given the notable difference with the Google allocator, which itself might not be completely immune to fragmentation issues of its own. And because the gcc repo requires a large window of big objects to get good compression, you're better off not using 4 threads to repack it with -a -f. The fact that the size of the source pack has such an influence is probably only because the increased usage of the delta base object cache is playing a role in the global memory allocation pattern, allowing for the bad fragmentation issue to occur.

If you could run one last test with the mallinfo patch I posted, without the pack.windowmemory setting, and add the reported values along with those from top, then we could formally conclude that memory fragmentation is the issue.

So I don't think Git itself is actually bad. The gcc repo most certainly constitutes a nasty use case for memory allocators, but I don't think there is much we can do about it besides possibly implementing our own memory allocator with active defragmentation where possible (read memcpy) at some point to give glibc's allocator some chance to breathe a bit more.

In the meantime you might have to use only one thread and lots of memory to repack the gcc repo, or find the perfect memory allocator to be used with Git.
After all, packing the whole gcc history to around 230MB is quite a stunt, but it requires sufficient resources to achieve it. Fortunately, like Linus said, such a wholesale repack is not something that most users have to do anyway.

Nicolas
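
[Editorial sketch, not part of the original message and not the mallinfo patch referred to above.] The kind of comparison being asked for, allocator counters set against what top reports, might look roughly like the following. It assumes glibc (for `mallinfo()` / `mallinfo2()`); the struct `heap_snapshot` and the functions `take_heap_snapshot`, `print_heap_snapshot` and `run_fragmentation_demo` are hypothetical names made up for this illustration. The demo allocates many small blocks, frees every other one, and shows that the freed bytes stay with the allocator because the surviving blocks pin the heap. That is a toy stand-in for fragmentation, not a reproduction of the repack workload.

```c
#include <assert.h>
#include <malloc.h>   /* mallinfo()/mallinfo2(): glibc-specific, assumed here */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical summary of the allocator counters of interest. */
struct heap_snapshot {
    size_t from_system; /* arena + hblkhd: bytes malloc obtained from the OS */
    size_t in_use;      /* uordblks + hblkhd: bytes in live allocations */
    size_t free_held;   /* fordblks: free bytes malloc still holds on to */
};

struct heap_snapshot take_heap_snapshot(void)
{
#if __GLIBC_PREREQ(2, 33)
    struct mallinfo2 mi = mallinfo2();  /* size_t fields */
#else
    struct mallinfo mi = mallinfo();    /* int fields, may wrap on big heaps */
#endif
    struct heap_snapshot s;
    s.from_system = (size_t)mi.arena + (size_t)mi.hblkhd;
    s.in_use = (size_t)mi.uordblks + (size_t)mi.hblkhd;
    s.free_held = (size_t)mi.fordblks;
    return s;
}

void print_heap_snapshot(const char *label, const struct heap_snapshot *s)
{
    printf("%s: from system %zu, in use %zu, free but held %zu\n",
           label, s->from_system, s->in_use, s->free_held);
}

/*
 * Allocate `count` blocks of `size` bytes, snapshot the heap, free every
 * other block, and snapshot again. The surviving blocks sit between the
 * freed ones, so the holes cannot be handed back to the OS.
 * Returns 0 on success, -1 if an allocation failed.
 */
int run_fragmentation_demo(size_t count, size_t size,
                           struct heap_snapshot *before,
                           struct heap_snapshot *after)
{
    assert(before != NULL && after != NULL);

    void **blocks = calloc(count, sizeof(*blocks));
    if (!blocks)
        return -1;

    for (size_t i = 0; i < count; i++) {
        blocks[i] = malloc(size);
        if (!blocks[i]) {
            while (i > 0)
                free(blocks[--i]);
            free(blocks);
            return -1;
        }
        memset(blocks[i], 0xab, size);  /* touch the memory so it is backed */
    }
    *before = take_heap_snapshot();

    for (size_t i = 0; i < count; i += 2) {
        free(blocks[i]);
        blocks[i] = NULL;
    }
    *after = take_heap_snapshot();

    for (size_t i = 0; i < count; i++)
        free(blocks[i]);                /* free(NULL) is a no-op */
    free(blocks);
    return 0;
}
```

Calling `run_fragmentation_demo(10000, 1024, &before, &after)` and printing both snapshots should show `in_use` dropping while `free_held` grows by a similar amount. If the resident size reported by top stays close to `from_system` while `in_use` is much smaller, that points at memory held by the allocator rather than by live objects. As far as I know, the glibc man page notes that `mallinfo()` only covers the main arena, so numbers from a multi-threaded run need to be read with that caveat.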
