Hi, Jan!

On Sat, 9 Feb 2008, Jan Holesovsky wrote:

Here is perhaps another optimization which wasn't done because git is fast enough on moderately-sized repositories: IIRC git-clone (and git-fetch for sure) over the native (smart) protocol recreates the pack, even though it would sometimes be better and simpler to just copy (transfer) an existing pack. But this would need a multi-pack "extension". (It should work right now without a transport protocol extension; the receiver must only be aware of the need to split the resulting pack, and index them all.)

I hope that would work better...

If I remember correctly, fetching _into_ a shallow clone works correctly, as does deepening the depth of a shallow clone. What is not implemented AFAIK, but should not be too hard, would be to allow pushing from a shallow clone to a full clone. This way you would have a network of full clones (functioning as centres to publish your work) and shallow + few-branches repos (working repositories).

I don't know if that would be enough. For better support git would need to exchange graft-like information, and use the union of restrictions to get the correct commits. Perhaps it would be best to mail the 'shallow clone' author...

You can configure separate 'remote's for the same repository with different heads. This would work both for pull and for push.

I think the solution proposed by Marco Costalba, namely creating an "archive" repository and a "live" repository, joining them if needed by grafts, similarly to how the Linux kernel has a live repo and a historical import repo, would be a good alternative to shallow or lazy clone. There would be an "archive" repo (or repos), read only, with the whole history, very tightly packed with kept packs, with all branches and all tags, and a "live" repo, with only current history (a year, or since a major API change, or from today, or something like that), with only the important branches (or repos, each containing the set of branches important for a team).
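To make the grafts idea a bit more concrete, here is a rough sketch in Python (not git code) of what a graft does to history traversal. It assumes the usual graft line layout of "<commit> <parent> [<parent> ...]"; the commit names (L1..L3, A1..A2), the dictionaries and the function names are all made up for illustration, where real entries would be full SHA-1 ids. Skipping '#' lines is just a convenience of this sketch.

```python
def parse_grafts(text):
    """Parse graft lines of the form '<commit> <parent> [<parent> ...]'."""
    grafts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments (sketch convenience)
        commit, *parents = line.split()
        grafts[commit] = parents
    return grafts


def apply_grafts(parents, grafts):
    """Return a copy of the parent map with grafted parents overriding real ones."""
    joined = dict(parents)
    joined.update(grafts)
    return joined


def history(parents, tip):
    """List every commit reachable from tip, following first parents first."""
    seen, stack, order = set(), [tip], []
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        order.append(commit)
        stack.extend(reversed(parents.get(commit, [])))
    return order


# Hypothetical toy data: "live" repo starts at root commit L1,
# "archive" repo ends at commit A2.
live = {"L3": ["L2"], "L2": ["L1"], "L1": []}
archive = {"A2": ["A1"], "A1": []}

# One graft line gives the live root L1 the archive tip A2 as its parent.
grafts = parse_grafts("L1 A2\n")
joined = apply_grafts({**archive, **live}, grafts)

print(history(live, "L3"))    # live history only
print(history(joined, "L3"))  # live history continued into the archive
```

The point is only that the live repo stays small by itself, and a single extra line is enough to make a history walk continue into the archive when both sets of objects are available.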
There would be a prepared graft file to join the two histories, if you have to examine the full history. Hopefully the repo would be smaller.

Sidenote: due to (from what I have read) heavy use of topic branches in OOo development, Subversion would have to be used with the svnmerge extension, or together with SVK, to make working with it not a complete pain.

In CVS you could have ad-hoc modules, and ad-hoc partial checkouts (so called 'modules'), but that plays merry hell with whole-tree, atomic, recoverable-state commits. In Git you have to carefully plan the boundaries between submodules / subprojects. An additional advantage is that you would have clearer boundaries, and better modularity usually leads to better code.

Comparing Subversion and Git directly is a bit stupid: they promote different workflows. From what I've read, Git, with its ability to very easily create branches, with easy _merging_ of branches, and the ability to easily create _private_ branches (testing branches), has much in common with the chosen OOo SCM workflow. Playing to the strengths of Subversion because that is what you are used to, because of the limits of previously used tools, is not smart. But if you have to, then you have to. Git would hopefully get lazy clone support from your effort.

But perhaps it would be possible (if additional work) to prepare two repositories: the first the same as in Subversion (and the same as now in CVS), the second one "how it should be done with Git". I hope that the ability to work with submodules (with the ability to not clone / checkout modules if not needed), i.e. "svn:externals done right", to paraphrase the SVN slogan, would be one of the reasons to choose Git over Subversion.

For example, what is the size of a full checkout (all version-control managed files)? If for example it is 0.5 GB, it would be hard to go below 0.5 GB or so with the pack size, even with compression of the objects themselves in the pack file.

Such large repositories, like Mozilla, GCC, or now OpenOffice.org, test the limits of Git.
Perhaps snapshot-based distributed SCMs cannot deal sensibly with such large projects; I hope this is not the case.

I wonder if the packv4 improvements, whose development stalled (if I understand correctly) because they didn't bring as much improvement, and because what we have now was good enough for the projects so far, would help with the OpenOffice.org repository...

P.S. From what I have read OOo uses CVS + some branch DB; does your importer make use of this branch info database?

--
Jakub Narebski
Poland
