Using git as a general backup mechanism (was Re: Using GIT to store /etc)

To: <git@...>
Date: Tuesday, December 12, 2006 - 6:49 pm

This discussion reminds me of a use of git I've had in the back of my 
head to try out for a while. Right now I'm doing my local snapshot 
backups using the rsync-with-hard-links scheme 
(http://www.mikerubel.org/computers/rsync_snapshots/ if you're not 
familiar with it). This is nice in that the contents of files that don't 
change are only stored once on the backup disk. But it is less than 
optimal in that a file that changes even a little bit is stored from 
scratch.
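For anyone who hasn't seen the hard-link trick, here is a rough Python sketch of the idea rather than the actual rsync scripts from that page. The function name `snapshot` and its arguments are made up for illustration, and it compares file contents where rsync by default compares size and modification time.

```python
import filecmp
import os
import shutil


def snapshot(source_dir, previous_snapshot, new_snapshot):
    """Copy source_dir into new_snapshot, hard-linking unchanged files.

    Hypothetical helper: previous_snapshot may be None for the first run.
    """
    for root, _dirs, files in os.walk(source_dir):
        rel = os.path.relpath(root, source_dir)
        dst_dir = os.path.normpath(os.path.join(new_snapshot, rel))
        os.makedirs(dst_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(dst_dir, name)
            old = None
            if previous_snapshot is not None:
                old = os.path.normpath(os.path.join(previous_snapshot, rel, name))
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                # Unchanged file: both snapshots share one copy on disk.
                os.link(old, dst)
            else:
                # New or changed file: a complete fresh copy, however
                # small the change was.
                shutil.copy2(src, dst)
```

The `else` branch is the weakness described above: a one-byte change to a large file costs the full file size again in the new snapshot.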

What would be great for this would be to store each day's backup as a 
git revision; with a periodic repack, this would be much more 
space-efficient than the rsync hard links.
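To make that concrete, a minimal sketch of what the nightly job might run, assuming the backup directory is already a git working tree. The helper names and the "repack on Sundays" schedule are my own assumptions; the git subcommands and flags themselves (`add -A`, `commit --allow-empty -m`, `repack -a -d`) are standard in current git.

```python
import subprocess
from datetime import date


def daily_backup_commands(snapshot_date):
    """Return the git commands (as argv lists) for one day's backup.

    Hypothetical helper; the Sunday repack is an arbitrary choice.
    """
    commands = [
        # Stage additions, modifications and deletions.
        ["git", "add", "-A"],
        # Record the day's state; --allow-empty keeps the job from
        # failing on days when nothing changed.
        ["git", "commit", "--allow-empty", "-m",
         "backup " + snapshot_date.isoformat()],
    ]
    if snapshot_date.weekday() == 6:  # Monday is 0, Sunday is 6
        # Periodic repack: put everything in one pack so similar
        # revisions of a file can be stored as deltas.
        commands.append(["git", "repack", "-a", "-d"])
    return commands


def run_daily_backup(backup_dir, snapshot_date):
    """Run the commands inside backup_dir, stopping at the first failure."""
    for argv in daily_backup_commands(snapshot_date):
        subprocess.run(argv, cwd=backup_dir, check=True)
```

Copying the files into `backup_dir` beforehand (with rsync or anything else) is left out of the sketch.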

The problem is that while that would give me a very efficient backup 
scheme, the repository would still grow over time. In rsync land, I 
solve the disk space issue by keeping two weeks' worth of daily 
snapshots, then six months' worth of weekly snapshots, then two years' 
worth of monthly snapshots; files that change daily have a constant 
number of revisions stored in my backups, and older files drop off the 
backup disk as they age.
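The thinning rule can be sketched like this. The cutoffs are approximations I picked for illustration (14 days, 26 weeks, 730 days, all measured from today), and keeping the oldest snapshot in each week or month is likewise an assumption rather than what my scripts necessarily do.

```python
from datetime import date, timedelta

DAILY_DAYS = 14          # two weeks of daily snapshots
WEEKLY_DAYS = 26 * 7     # roughly six months of weekly snapshots
MONTHLY_DAYS = 2 * 365   # roughly two years of monthly snapshots


def snapshots_to_keep(snapshot_dates, today):
    """Return the sorted snapshot dates that survive the thinning policy."""
    keep = set()
    weeks_seen = set()
    months_seen = set()
    for day in sorted(snapshot_dates):
        age = (today - day).days
        if age < 0 or age >= MONTHLY_DAYS:
            continue  # in the future, or too old: drops off the disk
        if age < DAILY_DAYS:
            keep.add(day)
        elif age < WEEKLY_DAYS:
            week = tuple(day.isocalendar()[:2])  # (ISO year, ISO week)
            if week not in weeks_seen:
                weeks_seen.add(week)
                keep.add(day)
        else:
            month = (day.year, day.month)
            if month not in months_seen:
                months_seen.add(month)
                keep.add(day)
    return sorted(keep)
```

However long the backups have been running, the number of surviving snapshots is bounded, which is what keeps the disk usage flat.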

Given that there's no way (or is there?) to delete revisions from the 
*beginning* of a git revision history, the only approach that seems to 
come close right now is to give up on the "daily then weekly then 
monthly" thing, which is probably fine given the space savings of delta 
compression. Instead, I'd periodically make a shallow clone of the backup 
repository that fetches all but the first N revisions; once the shallow 
clone is made, the original gets deleted and the clone becomes the new 
backup repo.
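A sketch of that rotation, with hypothetical helper names and paths. It relies on `git clone --depth`, and on passing the source as a `file://` URL, since git ignores `--depth` when cloning from a plain local path. Note that `--depth` counts revisions to keep from the tip, so "all but the first N" has to be turned into a number of recent revisions to retain.

```python
import os
import shutil
import subprocess
from pathlib import Path


def shallow_clone_command(old_repo, new_repo, keep_revisions):
    """Build the argv for a shallow clone keeping the newest revisions."""
    # file:// forces the normal transport so that --depth is honoured.
    url = Path(old_repo).resolve().as_uri()
    return ["git", "clone", "--depth", str(keep_revisions), url, str(new_repo)]


def rotate_backup_repo(old_repo, new_repo, keep_revisions):
    """Clone shallowly, then replace the original with the clone.

    Destructive: the original repository is deleted once the clone
    has succeeded.
    """
    subprocess.run(shallow_clone_command(old_repo, new_repo, keep_revisions),
                   check=True)
    shutil.rmtree(old_repo)
    os.replace(new_repo, old_repo)
```

After the swap, the clone's `origin` remote still points at the old location, which is now the clone itself, so it would need removing or updating. And each rotation copies every retained object, which is the inefficiency I'm complaining about next.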

But it would sure be more efficient to be able to "shallow-ize" an 
existing repository. That would be useful for things other than backups, 
too, e.g. the recent request for some way to track just the current 
version of the kernel code rather than its revision history. If there 
were a shallowize command, you could do something like "git pull; git 
shallowize --depth 1" to track the latest revision without keeping the 
history locally.

Anyone think that sounds like an interesting thing to explore?

-Steve

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Messages in current thread:
Using git as a general backup mechanism (was Re: Using GIT t..., Steven Grimm, (Tue Dec 12, 6:49 pm)
Re: Using git as a general backup mechanism, Junio C Hamano, (Tue Dec 12, 7:43 pm)
Re: Using git as a general backup mechanism, Steven Grimm, (Thu Dec 14, 7:33 pm)
Re: Using git as a general backup mechanism, Junio C Hamano, (Thu Dec 14, 8:33 pm)
Re: Using git as a general backup mechanism (was Re: Using G..., Johannes Schindelin, (Tue Dec 12, 6:57 pm)
Re: Using git as a general backup mechanism (was Re: Using G..., Johannes Schindelin, (Tue Dec 12, 8:01 pm)