Re: clarify git clone --local --shared --reference

From: Shawn O. Pearce
Date: Tuesday, June 5, 2007 - 10:11 pm

Brandon Casey <casey@nrlssc.navy.mil> wrote:

Well, you can repack, but only if you account for everything.
The easiest way to do this is push every branch from the --shared
repos to the source repository, repack the source repository, then
you can run `git prune-packed` in the --shared repos to remove
loose objects that the source repository now has.
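The push, repack, prune-packed sequence above can be sketched as a small script. This is only an illustration under assumptions: the repository names, the throwaway identity, and the refs/remotes/shared/ namespace used as the push target are all made up here (pushing into a separate namespace avoids updating a branch that the source has checked out).

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)

# A source repository with one commit, and a --shared clone of it.
git init -q "$work/source"
echo "base" > "$work/source/file.txt"
git -C "$work/source" add file.txt
git -C "$work/source" commit -q -m "base"
git clone -q --shared "$work/source" "$work/shared"

# New work in the clone creates loose objects that only the clone has.
echo "change" >> "$work/shared/file.txt"
git -C "$work/shared" commit -q -a -m "change made in the shared clone"

# 1. Push every branch of the clone to the source (hypothetical namespace).
git -C "$work/shared" push -q origin 'refs/heads/*:refs/remotes/shared/*'

# 2. Repack the source, which now references the clone's objects too.
git -C "$work/source" repack -a -d -q

# 3. Remove loose objects in the clone that the source's pack now contains.
git -C "$work/shared" prune-packed

# Report how many loose objects remain in the clone.
git -C "$work/shared" count-objects
```

With several --shared clones, step 1 would be repeated in each clone (each with its own namespace) before the source is repacked.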

You can account for the refs by hand when you run pack-objects
by hand, but it's horribly difficult compared to the push and then
repack I just described.  I think that long-lived --shared isn't that
common of a workflow; most people use --shared for short-term things.
For example contrib/continuous uses --shared when it clones the
repository to create a temporary build area.
 

Yes.  Recent versions of Git avoid copying the objects into a
--shared repository if at all possible.  This makes fetches from the
source repository into the --shared repository very, very fast, and
uses no additional disk.
 

Alternates are followed as many as 5 deep.  So you can do something like
this:

	git clone --shared source share1
	git clone --shared share1 share2
	git clone --shared share2 share3
	git clone --shared share3 share4
	git clone --shared share4 share5
	git clone --shared share5 corrupt

I think corrupt is corrupt; its chain of alternates back to source is
six links long, so it doesn't have access to the source anymore and
therefore is missing 90%+ of the object database.  To help nested
alternates work, objects/info/alternates should always contain absolute
paths; git-clone stores them as absolute paths by default, but you could
also set them up by hand.  The other repositories should however be intact
and usable, but you cannot clone from share5.
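To see what git-clone records for a nested --shared clone, one can look at the alternates file directly. The following is a minimal sketch, assuming throwaway repositories under a temporary directory; the names share1 and share2 follow the example above, everything else is made up for illustration.

```shell
set -e
# Throwaway identity so the commit works in a clean environment (assumption).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

work=$(mktemp -d)

# source <- share1 <- share2: a two-level chain of alternates.
git init -q "$work/source"
git -C "$work/source" commit -q --allow-empty -m "initial"
git clone -q --shared "$work/source" "$work/share1"
git clone -q --shared "$work/share1" "$work/share2"

# Each clone lists the object directory of the repository it was cloned from.
cat "$work/share2/.git/objects/info/alternates"

# share2 can still read the commit, which physically lives in source.
git -C "$work/share2" cat-file -t HEAD
```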

Normal fetch/push/pull will work fine against any of those working
repos, as they are all using the normal Git object transport methods,
which means we copy objects unless they are available to us already
(see above).


Yes, they are.  I don't think we have a limit on the number of
alternates you are allowed to have.  However each additional
alternate adds some cost to starting up any given Git process.
The more alternates you have (or the more deeply nested they are)
the slower Git will initialize itself.  For 1 or 2 alternates it's
within the fork+exec noise of any good UNIX system; for 50 alternates
I think you would notice it.


Yes, but that has very high risk.  If developer Joe Smith quits and
then the administrator runs `rm -rf /home/jsmith`, everyone is hosed as
they can no longer access the objects that were originally created
by Joe.  Then the administrator is off looking for backup tapes,
assuming he has them and they are valid.  One nice property of Git
(really any DVCS) is that the data is automatically backed up by
every developer participating in the project.  It's unlikely you
will lose the project that way.

Also this scheme doesn't really work well for packing.  I don't
think we'll pack the loose objects that we borrow from the other
developers, and Git packfiles are a major performance improvement
for all Git operations.  Plus they are very small, so they save a
lot of disk.

You might find that it takes up less total disk to have everyone
keep a complete (non --shared) copy of the project, but repack
regularly, than to have everyone using alternates against each
other and nobody repacking.
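As a rough way to check this on a given repository, one can compare `git count-objects -v` before and after a repack. The sketch below uses a made-up three-commit repository, so the numbers it prints say nothing about a real project; it only shows which commands are involved.

```shell
set -e
# Throwaway identity so commits work in a clean environment (assumption).
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

repo=$(mktemp -d)/project
git init -q "$repo"

# A few commits, each leaving loose objects behind.
for i in 1 2 3; do
    echo "revision $i" > "$repo/file.txt"
    git -C "$repo" add file.txt
    git -C "$repo" commit -q -m "revision $i"
done

echo "before repack:"
git -C "$repo" count-objects -v

# Pack everything reachable into one pack and drop the redundant loose objects.
git -C "$repo" repack -a -d -q

echo "after repack:"
git -C "$repo" count-objects -v
```

The `size` and `size-pack` lines of the two reports are the ones to compare for disk usage.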
 

Yes, exactly.  ;-)

In my day-job repositories I have about 150 MiB of blobs that
are very common across a number of Git repositories.  I've made a
single repo that has all of those packed, and then set that up as an
alternate for everything else.  It saves a huge chunk of disk for us.
But that common-blob.git thing that I created never gets changed,
and never gets repacked.  It's sort of a "historical archive" for us.
Works very nicely.  Alternates have their uses...
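A setup along these lines could be wired up by hand roughly as follows. This is a sketch under assumptions, not the actual day-job configuration: the paths, the single sample blob, and the project repository are invented for illustration, and the alternate is registered by appending an absolute path to objects/info/alternates.

```shell
set -e
work=$(mktemp -d)

# The archive repository that holds the common blobs (name from the text above).
common="$work/common-blob.git"
git init -q --bare "$common"

# Store one sample blob, pack it, and drop the loose copy.
blob=$(echo "shared asset" | git -C "$common" hash-object -w --stdin)
echo "$blob" | git -C "$common" pack-objects -q objects/pack/pack > /dev/null
git -C "$common" prune-packed

# A separate (hypothetical) project repository that borrows from the archive.
git init -q "$work/project"
echo "$common/objects" >> "$work/project/.git/objects/info/alternates"

# The project can read the blob without holding its own copy.
git -C "$work/project" cat-file -p "$blob"
git -C "$work/project" count-objects
```

Because the archive never changes and is never repacked, the borrowed pack stays valid for every repository that lists it as an alternate.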

-- 
Shawn.
