I think the Git format is tighter in terms of compression,
and simpler in terms of understanding and writing code. I have
personally written the code to read and write the Git repository
format in both C and Java, and in both cases it falls out in just
a few hundred lines of code (assuming you have libz handy to do
the compression/decompression for you).
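To give a feel for how little code is involved, here is a minimal
Python sketch of just the loose object encoding (no packfiles, no
index). It assumes the usual layout of a "<type> <size>\0" header
followed by the raw data, all deflated with zlib. The function names
are made up for illustration and are not from any Git source.

```python
import hashlib
import zlib


def encode_loose_object(kind, data):
    """Return (object id, compressed bytes) for one loose object."""
    # Header is "<type> <size>" followed by a NUL byte, then the raw data.
    header = kind.encode("ascii") + b" " + str(len(data)).encode("ascii") + b"\0"
    store = header + data
    # The id is computed over the uncompressed header + data.
    object_id = hashlib.sha1(store).hexdigest()
    return object_id, zlib.compress(store)


def decode_loose_object(compressed):
    """Inverse of encode_loose_object: return (type, raw data)."""
    store = zlib.decompress(compressed)
    header, _, data = store.partition(b"\0")
    kind, _, size = header.partition(b" ")
    if int(size) != len(data):
        raise ValueError("size in header does not match payload length")
    return kind.decode("ascii"), data
```

On disk such an object is stored under objects/, using the first two
hex digits of the id as the directory name and the rest as the file
name.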
The Git format is completely safe with regard to parallel
modification of a repository, which is good for shared repositories
that might have multiple people pushing into them at once.
Git's format is also safe with regard to *any* update.
You literally cannot destroy the repository during an update.
It's impossible. You'd have to physically destroy the storage device.
(OK, that's overstating it a bit, but it is really hard.)
The point Keith was making was that the Git format is "add-only".
Once something has been stored, we NEVER modify it again. This
bypasses any sort of possible problems that can occur with partial
modifications caused by a process aborting in the middle of a change.
I think hg modifies files as it goes, which could cause some issues
when a writer is aborted. I'm sure they have thought about the
problem and tried to make it safe, but there isn't anything safer
than just leaving the damn thing alone. :)
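A rough Python sketch of the "add-only" idea, under my own
assumptions: write the new data to a temporary file, then hard-link
it into place under its final name, and never touch a name that
already exists. store_add_only is a hypothetical helper for
illustration, not a description of Git's actual code path.

```python
import os
import tempfile


def store_add_only(directory, name, data):
    """Store data under directory/name unless that name already exists.

    Returns True if a new file was created, False if it was already there.
    An existing file is never opened for writing.
    """
    final_path = os.path.join(directory, name)
    if os.path.exists(final_path):
        return False  # already stored; leave it alone
    # Write everything to a temporary file in the same directory first.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="tmp_")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        try:
            # link() fails if the final name exists, so a concurrent
            # writer can never be overwritten.
            os.link(tmp_path, final_path)
        except FileExistsError:
            return False
        return True
    finally:
        # The temporary name is removed in every case.
        os.unlink(tmp_path)
```

If the writer dies halfway through, the worst case is a stray
temporary file; nothing that was already stored has been modified.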
Yes. By a huge margin. Git's *fast*. Ignore anything from a year
or two ago.
No clue.
It's a good way to stage the stuff in your next commit. By that I
mean you edit some code. Then you look at what differs between the
index and your working directory. You decide "this hunk is good, it
passed the tests, I want to commit that, so toss it into the index".
Now that hunk isn't different anymore.
When it comes time to commit, all of your already reviewed stuff is
staged in the index. You just need to issue a commit and supply
the message. But you can leave modified stuff in the working
directory, even for files that were already updated in the index.
This really helps during a merge. Only the stuff which Git could
not merge for you is seen as different between the index and the
working directory; all of the stuff that Git merged for you is
already staged in the index. So you can focus on the conflicts,
and stage their resolutions into the index as you go. This makes
it easier to work through larger merges where more than 1 or 2
files contain conflicts.
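As a toy model of the above (whole files rather than hunks, and
plain dicts rather than anything resembling Git's real index
format), think of three path-to-content maps: the last commit, the
index, and the working directory. The function names here are
invented for illustration.

```python
def unstaged(index, worktree):
    """Paths whose working-directory content differs from the index."""
    paths = set(index) | set(worktree)
    return sorted(p for p in paths if index.get(p) != worktree.get(p))


def staged(head, index):
    """Paths whose index content differs from the last commit."""
    paths = set(head) | set(index)
    return sorted(p for p in paths if head.get(p) != index.get(p))


def stage(index, worktree, path):
    """Return a new index with path copied from the working directory."""
    new_index = dict(index)
    if path in worktree:
        new_index[path] = worktree[path]
    else:
        new_index.pop(path, None)  # file was deleted in the worktree
    return new_index


head = {"a.c": "v1", "b.c": "v1"}
index = dict(head)
worktree = {"a.c": "v2", "b.c": "v2"}

index = stage(index, worktree, "a.c")  # a.c is reviewed, stage it
print(unstaged(index, worktree))       # ['b.c']
print(staged(head, index))             # ['a.c']

worktree["a.c"] = "v3"                 # keep editing a.c after staging
print(unstaged(index, worktree))       # ['a.c', 'b.c']
print(staged(head, index))             # ['a.c']
```

The merge case is the same picture: cleanly merged paths are already
in the index, so only the conflicted ones show up as unstaged.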
A LONG time ago, like in the very first version Linus offered out
to the public, we computed the identity of an object using the
SHA-1 hash of the *compressed* data. This is sensitive to the
compression settings used, and was not the best idea as a result.
It was very quickly changed to compute the identity of the object
using the SHA-1 hash of the raw (user) data, removing any dependence
on the compression routine to always yield the same result for the
same input.
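A small Python sketch of why that matters. It hashes the same data
both ways at every zlib compression level; for simplicity it hashes
the bare data and leaves out any header Git adds before hashing, so
the ids here are not real Git object ids.

```python
import hashlib
import zlib

raw = b"int main(void) { return 0; }\n" * 200


def id_from_raw(data):
    """Identity based on the uncompressed bytes."""
    return hashlib.sha1(data).hexdigest()


def id_from_compressed(data, level):
    """Identity based on the zlib output at a given compression level."""
    return hashlib.sha1(zlib.compress(data, level)).hexdigest()


# Round-trip through every compression level, then hash the raw bytes.
ids_raw = {
    id_from_raw(zlib.decompress(zlib.compress(raw, level)))
    for level in range(1, 10)
}
# Hash the compressed stream itself at every level.
ids_compressed = {id_from_compressed(raw, level) for level in range(1, 10)}

print(len(ids_raw))         # always 1: the raw bytes never change
print(len(ids_compressed))  # can be more than 1, depending on zlib
```

Hashing the raw data gives one stable id no matter how the bytes
were compressed; hashing the compressed stream ties the id to the
compressor and its settings.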
We haven't had a change since then. We have added some new
compression options which are just that, options. If you use them,
older Git binaries won't necessarily recognize the repository data,
but these are off by default and can be enabled on a per-repository
basis. E.g. if you are only using newer Git on a given system you
can enable the newer compression features on all of the repositories
on that system.
Git can use hardlinks if you ask it to. We only use them for the
repository files, not for the user's actual source files.
Git has its own native transport (git-push, git-fetch) which can
move data between two Git repositories via local filesystem access,
SSH, HTTP, FTP, and rsync (the latter two are read-only transports).
Yes, a Python clone of Git is conceivable. Indeed, there is a
pure Java clone in progress (jgit) for an Eclipse plugin (egit).
If you wanted to rewrite Git in Python, knock yourself out.
But we've ported all of our Python to C, as it's just faster.
No clue. I know multiple heads in one Git repository works
*awesome*. Especially on large repositories (>10k files) as the time
required to start a new branch is only the time needed to update the
files in the working directory which don't have the correct version.
Usually that's a small fraction (fewer than 200) of the files and thus
it's very fast to switch to a new branch of development, and switch back.
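To make the cost argument concrete, here is a toy Python sketch. It
assumes each branch can be described as a map from path to some
version id (plain strings here) and only works out which files would
need touching; paths_to_update is a made-up name, not Git's
implementation.

```python
def paths_to_update(current, target):
    """Return (paths to rewrite, paths to delete) when switching branches."""
    # Only files whose version differs in the target need rewriting.
    changed = [p for p, oid in target.items() if current.get(p) != oid]
    # Files that do not exist in the target need removing.
    removed = [p for p in current if p not in target]
    return sorted(changed), sorted(removed)


# 10,000 files, of which the other branch changes 150.
current = {"src/file%05d.c" % i: "v1" for i in range(10000)}
target = dict(current)
for i in range(150):
    target["src/file%05d.c" % i] = "v2"

changed, removed = paths_to_update(current, target)
print(len(changed), len(removed))  # 150 0
```

The work in the working directory scales with the number of files
that differ between the two branches, not with the size of the
repository.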
On a decent UNIX system (and my Mac OS X PowerBook doesn't really
count) flipping branches in git-gui is almost immediate. You pick
the branch in the menu and *wham* it's switched. And that's my
PowerBook, which as I said, doesn't quite count as a good UNIX
system...
--
Shawn.