Re: [SoC RFC] libsvn-fs-git: A git backend for the subversion filesystem

Previous thread: global hooks by Victor Bogado da Silva Lins on Wednesday, March 19, 2008 - 9:04 am. (6 messages)

Next thread: Two bugs with renaming by John Goerzen on Wednesday, March 19, 2008 - 4:21 pm. (9 messages)
From: Bryan Donlan
Date: Tuesday, March 18, 2008 - 9:08 pm

Hello,

I'm planning to apply for the git summer of code project. My proposal
is based on the project idea of a subversion gateway for git,
implemented with a new subversion filesystem layer. A draft of my
proposal follows; I'd appreciate any comments/questions on it before
the application period proper begins.

Thanks,

Bryan Donlan

=== Project Goals ===

I propose to implement a subversion filesystem driver (libsvn-fs-git) that
uses a git repository as its backing store. Commits will be supported either
directly in the git repository, or in the corresponding subversion repository,
and automatically mirrored to the other side as appropriate.

I intend to support the following:
* Full or near full (possibly forbidding modification of the toplevel
trunk/ branches/ tags/ structure) read/write access from subversion
* svnadmin create/dump/load to convert existing subversion repositories
* Support for wrapping a pre-existing git repository and presenting it
as a subversion repository
* Support for mapping git branches and tags onto subversion branches
and tags (and vice versa)
* Support for syncing svn:executable with git file mode information
* Representation of git merge data using svk:merge and/or svn:mergeinfo
* Syncing .gitignore and svn:ignore data

As both subversion and git are written in C, this driver will also be in C.

Here are some tentative milestones:
* Read-only access from SVN to the master branch (no trunk/ etc layout)
  = Conversion of git commit information into svn revprops
  = git mode/.gitignore -> svn property conversion here?
* Read-write access from SVN to the master branch
  = Map svn usernames to git full name/email according to a configuration map
    - how should git commits with names unknown to svn be handled? Just pass
      them through, spaces and <@> as well?
  = Bidirectional svn:executable and svn:ignore conversion.
  = Copyfrom and file property information needs to be recorded
  = Test importing a largish repository (without ...
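
[Editor's sketch] The username mapping in the read-write milestone could be driven by a plain text file. The proposal does not specify a format, so this example assumes the "login = Full Name <email>" layout used by git-svn's authors file. The names parse_author_line, copy_trimmed and struct git_ident are hypothetical, as is the sample address.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for the git side of one mapping entry. */
struct git_ident {
    char name[128];
    char email[128];
};

/* Copy [begin, end) into dst with surrounding blanks removed.
 * Returns -1 if the trimmed text is empty or does not fit. */
static int copy_trimmed(char *dst, size_t dstlen,
                        const char *begin, const char *end)
{
    size_t n;

    while (begin < end && (*begin == ' ' || *begin == '\t'))
        begin++;
    while (end > begin && (end[-1] == ' ' || end[-1] == '\t'))
        end--;
    n = (size_t)(end - begin);
    if (n == 0 || n >= dstlen)
        return -1;
    memcpy(dst, begin, n);
    dst[n] = '\0';
    return 0;
}

/* Parse one line of the assumed form "login = Full Name <email>".
 * Returns 0 on success, -1 on a malformed or oversized line. */
int parse_author_line(const char *line, char *login, size_t login_len,
                      struct git_ident *id)
{
    const char *eq, *lt, *gt;

    assert(line != NULL && login != NULL && id != NULL);

    eq = strchr(line, '=');
    if (eq == NULL)
        return -1;
    lt = strchr(eq + 1, '<');
    if (lt == NULL)
        return -1;
    gt = strchr(lt + 1, '>');
    if (gt == NULL)
        return -1;

    if (copy_trimmed(login, login_len, line, eq) != 0)
        return -1;
    if (copy_trimmed(id->name, sizeof id->name, eq + 1, lt) != 0)
        return -1;
    if (copy_trimmed(id->email, sizeof id->email, lt + 1, gt) != 0)
        return -1;
    return 0;
}
```

The reverse direction (a git identity with no svn login) is the open question raised above and is deliberately not handled here.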
From: Sam Vilain
Date: Wednesday, March 19, 2008 - 9:31 pm

This seems like a large milestone.  Can you break this up any more?

For instance, your design notes on storing the necessary mapping
information are good.  How about a separate milestone of having a test
suite for the library functions you make for accessing that information.

I would be tempted to check the protocol -
http://svn.collab.net/repos/svn/trunk/subversion/libsvn_ra_svn/protocol
- and make milestones for each request type that the protocol allows
for.  Perhaps there is a more relevant list that you can find, such as
groups of tests in the back-end test suite that ships with Subversion.
Even taking the list of svn sub-commands, and deciding which fit into

Meh.  Just ignore them, and set revprops with all of the git committer

An honourable notion, but I'd steer away from worrying about
self-hosting, if it is irrelevant to the task at hand.  Focus more on
finding a good test suite to check that you support all the operations.
Eg, can the test suite bundled with the Subversion project run against

I don't like this word "guess".  It might be dangerous to not
deterministically or repeatably answer a request.  If any random
decisions were made, or information derived based on things that might
change, then it should be stored in your mapping information branch.  In

Whew!  That's a lot of big milestones, but it's your summer ... :)

I think the merging thing is a nice-to-have, and doing it would just
prove that you can use the metadata that you have collected well.

One thing I like about your approach is that the tracking branch itself

AFAIK the interface for libgit is not yet finalized, so bear in mind the
application will possibly need porting work for each release.

Sam.
--

From: Shawn O. Pearce
Date: Wednesday, March 19, 2008 - 9:56 pm

Very cool.  Have you had a chance to look at the prototype python
implementation of an SVN server that Julian Phillips started?

  http://git.q42.co.uk/w/git_svn_server.git

I'm just curious what your take is regarding this approach.  Why
would you choose to construct libsvn-fs-git over a standalone server?
There are several advantages and drawbacks to both approaches.
I am not advocating one over the other, but want to make sure you have

That's probably the only sane way to go about it; disallow read/write
on the top level, map whatever branch "HEAD" points to in Git to the
trunk/, put the other branches in branches/ and the tags under tags/.
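
[Editor's sketch] The layout described above can be expressed as a small ref-to-path mapping. This is only an illustration of the rule as stated (HEAD's branch becomes trunk/, other branches go under branches/, tags under tags/); the function name git_ref_to_svn_path and its signature are hypothetical and not part of git or Subversion.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: map a git ref name to a path in the svn layout.
 * head_ref is the ref HEAD points to, for example "refs/heads/master".
 * Returns 0 on success, -1 if the ref has no mapping or out is too small.
 */
int git_ref_to_svn_path(const char *ref, const char *head_ref,
                        char *out, size_t outlen)
{
    static const char heads[] = "refs/heads/";
    static const char tags[] = "refs/tags/";
    int n;

    assert(ref != NULL && head_ref != NULL && out != NULL);

    if (strcmp(ref, head_ref) == 0)
        n = snprintf(out, outlen, "trunk");
    else if (strncmp(ref, heads, sizeof(heads) - 1) == 0)
        n = snprintf(out, outlen, "branches/%s", ref + sizeof(heads) - 1);
    else if (strncmp(ref, tags, sizeof(tags) - 1) == 0)
        n = snprintf(out, outlen, "tags/%s", ref + sizeof(tags) - 1);
    else
        return -1;  /* e.g. refs/remotes/...: not exposed in this sketch */

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Note that a branch name containing a slash would show up as nested directories under branches/; whether that is acceptable is a design decision this sketch does not settle.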

These are gravy.  Sure they are going to be difficult to make work,
but people can limp by without them.  Most users who want an SVN
client to speak to a Git repository are trying to do so from a
platform that does not honor executable bits (hi Windows!) and
telling users to edit the funny ".gitignore" file to alter ignore
lists is something they can work around without too much trouble
if they are already able to modify and commit files.

Though their clients won't provide the proper ignore support out

I think you may have underestimated the challenges associated with
linking "libgit.a" (which is _not_ a library) with SVN.  Critical
routines within libgit that you want to be able to invoke will do
not so nice things like leak massive amounts of memory or cause
your process to terminate if the function is fed an invalid input.

Most of the C code of Git is designed for single-shot execution.
We leak memory like mad because it is more efficient to load up what
we need, exit, and let the OS just return the pages to the free pool.

I don't think you'd want to put a copy of the tree inside of a tree,
as this can then get out of sync with changes made directly through
git, plus you run into issues about connecting the two histories
together in a meaningful way.

I would suggest having the root directory of the SVN tree be built
on the ...
From: Harvey Harrison
Date: Wednesday, March 19, 2008 - 11:18 pm

Why not just copy the rev_map format git-svn already uses? It's pretty
efficient.
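
[Editor's sketch] For readers unfamiliar with that format: as far as I understand it, a git-svn rev_map file is a sequence of fixed 24-byte records, each a 4-byte big-endian svn revision number followed by a 20-byte SHA-1, appended in increasing revision order so that lookups can use binary search. Treat that layout as an assumption to verify against git-svn itself. The function names below are hypothetical, and the code works on an in-memory buffer rather than a file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REV_MAP_RECORD 24   /* assumed: 4-byte revision + 20-byte SHA-1 */

/* Decode a 4-byte big-endian integer. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Binary-search `nrec` records, sorted by revision, for `rev`.
 * Returns a pointer to the 20-byte SHA-1 inside `map`, or NULL if absent.
 */
const unsigned char *rev_map_lookup(const unsigned char *map, size_t nrec,
                                    uint32_t rev)
{
    size_t lo = 0, hi = nrec;

    assert(map != NULL || nrec == 0);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const unsigned char *rec = map + mid * REV_MAP_RECORD;
        uint32_t r = read_be32(rec);

        if (r == rev)
            return rec + 4;     /* skip the revision field */
        if (r < rev)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}
```

Fixed-size records are what make this cheap: record i starts at byte offset 24 * i, so no index is needed.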

Harvey

--

From: Julian Phillips
Date: Thursday, March 20, 2008 - 2:22 am

You might need to get svn:eol-style working to prevent the svn client from 

Since you have to explicitly enable revprop editing in the subversion 
repository by enabling a hook script, I should think that this was 
definitely something that could be left at the bottom of the TODO list ...

Though you do need to be able to convert commit info into the appropriate 
revprops (e.g. commit msg -> svn:log revprop)
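
[Editor's sketch] A minimal illustration of that conversion, filling the three standard revprops svn:author, svn:date and svn:log. The structs and the function commit_to_revprops are hypothetical and do not use the real git or Subversion APIs. Passing the git name through unchanged as svn:author is an assumption; the thread leaves that mapping open. The date uses the UTC timestamp form Subversion stores in svn:date.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical, simplified view of the git commit fields needed. */
struct commit_info {
    const char *author_name;  /* passed through unchanged (assumption) */
    time_t      author_time;  /* seconds since the epoch */
    const char *message;      /* full commit message */
};

/* Hypothetical holder for the three standard revision properties. */
struct svn_revprops {
    char        author[128];  /* svn:author */
    char        date[32];     /* svn:date, e.g. 2008-03-20T02:22:00.000000Z */
    const char *log;          /* svn:log, borrowed from the commit */
};

/* Returns 0 on success, -1 if a field does not fit or the time is invalid. */
int commit_to_revprops(const struct commit_info *c, struct svn_revprops *rp)
{
    struct tm *utc;
    int n;

    assert(c != NULL && rp != NULL);

    n = snprintf(rp->author, sizeof rp->author, "%s", c->author_name);
    if (n < 0 || (size_t)n >= sizeof rp->author)
        return -1;

    /* gmtime() is not thread-safe; a real driver would need a safe variant. */
    utc = gmtime(&c->author_time);
    if (utc == NULL)
        return -1;
    if (strftime(rp->date, sizeof rp->date,
                 "%Y-%m-%dT%H:%M:%S.000000Z", utc) == 0)
        return -1;

    rp->log = c->message;     /* commit msg -> svn:log */
    return 0;
}
```

Git records timestamps with one-second resolution, so the microsecond field is always zero here.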

-- 
Julian

  ---
Often statistics are used as a drunken man uses lampposts -- for support
rather than illumination.
--

From: Bryan Donlan
Date: Friday, March 21, 2008 - 10:02 pm

[Empty message]
From: Johannes Schindelin
Date: Saturday, March 22, 2008 - 4:35 am

Hi,


My preference is to have single replies, possibly changing the subject 

As I said on IRC yesterday, I think that such a libgit.a would be nice, 
_but_

- a lot of git programs expect to be one-shot, and libgit.a shows that,

- not many people will help you with your effort, but just ignore it and 
  actively introduce things that do not help libification (at least that's 
  my experience),

- unless you have a proper need for such a library, I do not think there 
  is enough motivation to actually get it to completion.

I once thought that libification would be nice, and important, but as I do 
not need it myself, I reversed my opinion.

Ciao,
Dscho

--


On Sat, Mar 22, 2008 at 6:35 AM, Johannes Schindelin

I would use it for my pyrite work, although it will be some time before I
could contribute to such an effort.  I expect it would be useful for
anyone who wants to make a language binding that uses native
git underneath.

Just so you know *someone* will use it.

Thanks,
Govind.
--


Hi,


I know people would use it.  My point was: those people that want to use 
it have the best starting point to make it happen, because they (should) 
actually care about libification.

Ciao,
Dscho

--


On Sat, Mar 22, 2008 at 7:35 AM, Johannes Schindelin

All right. If I do end up having to recreate (thread-safe,
multiple-git-dir-safe) logic for my project, I'll try to keep in mind
--
