Hello all,

I have several (mostly private) git repositories which I'm using for various purposes, including versioned backup of moderate amounts of data, and I'm trying to work out the cheapest way of having them backed up remotely. Since I'm not doing collaborative development, the git hosting options I've found aren't a good fit - for this amount of data they tend to assume you must have a huge project with many contributors, and charge accordingly.

It looks like the cheapest option from a pure storage and data-transfer point of view would be S3, so I'm looking at the best way to use it with git. So far, the options I've found are either using jgit, which I've never used but which appears to have a native S3 transport, or using one of the FUSE options to mount S3 as a filesystem.

I'm not particularly happy about the idea of using jgit, since it would require Java on all the machines I might want to use it with, and it would mean learning to use a different command for fetch and push. It does have the bonus that it's possible to publish repositories for read access via dumb HTTP, though. On the other hand, I'm concerned about the FUSE option because a) I'm not sure how reliable it is, and b) I'm concerned that the abstraction might leak if, for example, git assumes that it is accessing a local filesystem and acts differently.

Does anyone have any remarks about these options? Is there a better option - how difficult would it be to add native S3 support to git? Are there any other options for more git-friendly remote storage at a comparable price? Or maybe I should just give up, spend more and get a Linode; then I'd have the flexibility to do whatever I want with it.

Thanks for your time, Aneurin Price --
There are also tools such as curlftpfs, which let you mount any FTP account. However, git will be slow on such a mount. There are some .git options. Maybe it's worth a try - I don't know exactly. There should be FUSE-like filesystems for Amazon S3 as well. Google shows some hits, but I don't know which one works best. Good luck. Marc Weber --
Hi, since this is only a backup, I believe you can just sync your .git folder one-way, with your local copy being the authoritative one and the S3 copy just following it. -- Cheers, Ray Chuan --
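A minimal sketch of the one-way sync suggested above, assuming the s3cmd command-line tool is installed and configured. The function names, repository path, bucket and prefix are hypothetical, chosen only for illustration; nothing here is a tested backup procedure.

```python
import subprocess
from pathlib import Path


def build_sync_command(repo_dir, bucket, prefix):
    """Build an s3cmd command that mirrors repo_dir/.git to S3, one-way.

    repo_dir, bucket and prefix are placeholders supplied by the caller.
    """
    git_dir = Path(repo_dir) / ".git"
    # Trailing slash: upload the directory's contents, not the directory itself.
    source = f"{git_dir.as_posix()}/"
    target = f"s3://{bucket}/{prefix.strip('/')}/"
    # --delete-removed drops remote objects that no longer exist locally,
    # so the S3 copy simply follows the local (authoritative) copy.
    return ["s3cmd", "sync", "--delete-removed", source, target]


def backup(repo_dir, bucket, prefix):
    """Run the sync; raises CalledProcessError if s3cmd exits non-zero."""
    subprocess.run(build_sync_command(repo_dir, bucket, prefix), check=True)
```

Calling `backup("/home/me/project", "my-backups", "project.git")` would run the sync for that (hypothetical) repository. It is probably safest to run it while no git command is writing to the repository, so that the uploaded copy is consistent.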
I guess it might exceed your budget, but you could use a small EC2 instance backed by an EBS volume. The instance would have git installed. When you need to push, fire up the instance, push to git running on that instance, then shut down the instance and snapshot the EBS volume to S3. Hmm, maybe that's over-engineered. :-) j. --
I'm not really familiar with Amazon S3 _or_ the current transport code, but from a cursory examination of both, it seems that it would be fairly easy to add support for another transport. And it might be an even better idea to just add generic support for invoking an external helper to perform all the heavy lifting. Basically, all the abstraction is already pre-cooked in the form of the rsync protocol support. I would just cut'n'paste that and replace the rsync magic with simple calls to an external helper along some sensible simple API, then code up an easy wrapper for S3 there. Or just add S3 API support directly to core Git - it doesn't seem to be licence-encumbered. It should take just a couple of hours including debugging, if you just copy the existing rsync support functions. Another idea might be to actually use the rsync protocol support itself. ;-) There seems to be some sort of commercial rsync-S3 interface, though I can't tell from their terribly strange pricing policy how expensive it is to use in practice. -- Petr "Pasky" Baudis If you can't see the value in jet powered ants you should turn in your nerd card. -- Dunbal (464142) --
Thanks for the replies everyone. The external helper approach sounds like the most interesting option, if not necessarily the most practical. I've also discovered s3cmd (http://s3tools.org/s3cmd), which seems to be widely packaged and could probably serve nicely as that wrapper. If I can manage to get those couple of hours free at some point I'll give it a go. Is this something that might be a mergeable feature? Thanks again, Aneurin Price --
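To make the external-helper idea from this thread a little more concrete, here is a rough sketch of what a thin wrapper around s3cmd might look like. The three-operation interface (`list`, `get`, `put`) and all function names are assumptions made for illustration; this is not an existing git interface, and git itself would still need the transport-side changes discussed above in order to call such a helper.

```python
import subprocess


def s3_url(bucket, key=""):
    """Join a bucket name and an object key into an s3:// URL."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def helper_command(operation, bucket, remote_path, local_path=None):
    """Translate one hypothetical helper operation into an s3cmd invocation."""
    url = s3_url(bucket, remote_path)
    if operation == "list":
        # List objects under a prefix, e.g. to discover refs or packs.
        return ["s3cmd", "ls", url]
    if operation in ("get", "put") and local_path is None:
        raise ValueError(f"{operation} needs a local path")
    if operation == "get":
        # --force overwrites an existing local file.
        return ["s3cmd", "get", "--force", url, local_path]
    if operation == "put":
        return ["s3cmd", "put", local_path, url]
    raise ValueError(f"unknown operation: {operation}")


def run_helper(operation, bucket, remote_path, local_path=None):
    """Run the s3cmd command and return its exit status."""
    command = helper_command(operation, bucket, remote_path, local_path)
    return subprocess.run(command).returncode
```

For example, `run_helper("put", "my-backups", "project.git/objects/pack/x.pack", "/tmp/x.pack")` would upload one pack file to a hypothetical bucket. Keeping the command construction separate from its execution makes the mapping easy to check without touching S3.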
