Re: repo.or.cz wishes?

To: <git@...>
Date: Sunday, August 26, 2007 - 7:59 pm

Hi,

  I've just finally killed the HTTP auth for project administration that
was destroying everyone's lives, and added support for resetting
forgotten passwords, two main things that seemed to be the popular nits
of the repo.or.cz audience.

  So now I wonder, what is the thing you miss most there? Any cool stuff
repo.or.cz could (preferably easily) do and doesn't?

  And please don't ask for smaller roundtrip times of requests for
administrator assistance. ;-)

  Thanks,

-- 
				Petr "Pasky" Baudis
Ever try. Ever fail. No matter. // Try again. Fail again. Fail better.
		-- Samuel Beckett
-
To: Petr Baudis <pasky@...>
Cc: <git@...>
Date: Tuesday, August 28, 2007 - 5:20 am

Hi,


Ah, I was reminded again about a pretty obscure bug with http transport: 
http-fetch uses all alternates and http-alternates it finds.  However, the 
alternates are wrong, since they are for the ssh:// protocol.

So here comes my wish: could you install http-alternates for all forks 
(and preferably include the http-alternates of a forked project, so that a 
fork of a fork works)?

Thank you very, very much,
Dscho

-
To: Johannes Schindelin <Johannes.Schindelin@...>
Cc: <git@...>
Date: Tuesday, August 28, 2007 - 5:24 am

Hi,


  Yes, I already did this last week.

-- 
				Petr "Pasky" Baudis
Early to rise and early to bed makes a male healthy and wealthy and dead.
                -- James Thurber
-
To: Petr Baudis <pasky@...>
Cc: <git@...>
Date: Tuesday, August 28, 2007 - 5:52 am

Hi,


Thank you for responding _that_ quickly :-)

Ciao,
Dscho

-
To: Petr Baudis <pasky@...>
Cc: <git@...>
Date: Monday, August 27, 2007 - 3:49 pm

Hello Petr,

I have two wishes (even though I don't use repo.or.cz regularly).

- Searching for a SHA-1 id.
  OK, I know how to create a URL for that, but that's inconvenient.

- When looking at the forks of, say, git.git, I want the "main" project shown,
  too.  That is, http://repo.or.cz/w/git.git?a=forks should include a
  link to http://repo.or.cz/w/git.git.

After rereading, these may be more gitweb wishes than repo.or.cz ones, but
anyhow ...

Best regards
Uwe

-- 
Uwe Kleine-König
-
To: Petr Baudis <pasky@...>
Cc: <git@...>
Date: Monday, August 27, 2007 - 4:35 am

Hi,


I wonder if this is repo.or.cz specific, but we recently had a problem 
where the blobs of a project went away, and the forked project still 
relied on them.  Any ideas how to solve that issue?

Ciao,
Dscho

-
To: Johannes Schindelin <Johannes.Schindelin@...>
Cc: <git@...>
Date: Tuesday, August 28, 2007 - 6:19 am

Hi,


  I actually *don't* want any objects to ever go away from a project. I
just have no idea how to prevent it but still have some sane packing
behaviour.

  I was already thinking about this a few years ago and even ended up
with some patches in progress, but never finished them...

  (What I've found in my IRC logs:)

22:03 < pasky> I do run git repack -a -d but as far as I understand, that should be fine
22:03 < gitster> Huh?  "repack -a -d" within git.git?
22:04 < gitster> What happens if an object only on 'pu' was already packed (you run prune-packed, don't you?), pu is rewound/rebased, and then you run "repack -a -d" again?
22:04 < gitster> And that object was borrowed by one of the forks?
22:05 < pasky> gitster: according to git-repack documentation no objects should be removed
22:05 < pasky> gitster: only redundant ones
22:05 < pasky> maybe the documentation is using some obscure definition of "redundant" I'm not familiar with
22:05 < gitster> redundant within that repository.  In short, my worry comes from the fact that I do not know what you are doing to ensure that reachability extends to forks.
22:06 < pasky> gitster: I'm relying on git to ensure it for me
22:06 < pasky> gitster: if I can't do that, something is seriously badly broken
22:07 < pasky> gitster: well "redundant" means duplicate
22:07 < pasky> gitster: does "unreachable" mean "duplicate"?
22:09 < pasky> in short, dropping unreachable objects in the process of git-repack -a -d is absolutely ludicruous, especially the way git-repack -a -d is documented now
22:22 < pasky> gitster: I'd still like to know if git-repack -a -d can remove unreachable objects too, whether it should, if it's a bug/feature and why is it not documented...
22:49 < gitster> documentation fixes probably is needed but we have warned about pruning/repacking alternates for a long time.
22:50  * gitster was away for lunch.
22:51 < pasky> gitster:...
To: Petr Baudis <pasky@...>
Cc: <git@...>
Date: Tuesday, August 28, 2007 - 7:06 am

Hi,


Hmm.  I think you'll have to hack pack-objects, or you'll have to create 
dummy refs from the dangling objects/blobs.

The latter is probably easier, since it just involves a "git fsck 
--lost-found" and putting the found things into a 
"refs/I-dont-wanna-lose-you/" refspace.  But it will be certainly more 
expensive.

The upside of this method would be that you have an integrity test for 
free.
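
The dummy-refs idea might look roughly like the sketch below. It only
translates the "dangling <type> <sha1>" lines that git fsck prints into
commands for `git update-ref --stdin` (available in newer git versions); the
function name is invented here, and the ref namespace is simply the one named
above.

```shell
# Hypothetical helper: read `git fsck --lost-found` output on stdin and emit
# one "create" command per dangling object, in the format that
# `git update-ref --stdin` accepts.
dangling_to_refs() {
    while read -r state type sha; do
        # Skip anything that is not a "dangling <type> <sha1>" line.
        [ "$state" = dangling ] || continue
        printf 'create refs/I-dont-wanna-lose-you/%s %s\n' "$sha" "$sha"
    done
}

# Possible use, with A standing for the borrowed-from repository:
#   git --git-dir=A fsck --lost-found | dangling_to_refs |
#       git --git-dir=A update-ref --stdin
```

Only the tips of unreachable history are reported as dangling, which should
be enough: once a tip has a ref, everything it reaches is reachable again.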

Putting all objects into a single database could easily break the current 
packfile size limit, and it will be suboptimal for cloning: chances are 
that objects delta well against objects which are not in the same 
(forked)project, and therefore the server is more likely to get 
non-reusable deltas.

Ciao,
Dscho

-
To: Johannes Schindelin <Johannes.Schindelin@...>
Cc: <git@...>
Date: Tuesday, August 28, 2007 - 7:48 am

That is possible as well, but I like the method I've proposed better
since if there are any objects that _really_ aren't referenced anymore,

The overhead for fetching over HTTP might be insane.

-- 
				Petr "Pasky" Baudis
Early to rise and early to bed makes a male healthy and wealthy and dead.
                -- James Thurber
-
To: Johannes Schindelin <Johannes.Schindelin@...>
Cc: Petr Baudis <pasky@...>, <git@...>
Date: Tuesday, August 28, 2007 - 12:10 am

It's not isolated to repo.or.cz.

It's what happens when you use `git clone --shared A B` and the
repository A that you borrow from removes objects that B depends
upon.  That is exactly what repo.or.cz is doing, and it is exactly
what I'm doing at day-job with some large repositories.

You more or less cannot repack A without making sure it takes B's
refs into account when it generates the new packfile.

But that's actually a problem because A doesn't have B's ODB, and
some objects that are reachable through B's refs won't be available
to A for traversal or packing.  Not to mention that you may not
want to stuff B's objects into A (at day-job I don't).

This is partly why Junio doesn't push to repo.or.cz/git.git but
instead pushes to alt-git.git and why he doesn't allow forking
of alt-git.git.  This way his `pu` branch doesn't lose objects
and cause a fork to suffer data loss after a repack of git.git.

At day-job I have a hard rule that you cannot even push into an A,
let alone rewind a branch in it or delete a branch from it.  All of
my A's that my B's fork from are strictly read-only and are managed
by me and me alone.  I know what I'm doing...  I think.... ;-)


So it really comes down to a rule like the following:

  If `git clone --shared A B` is used then either:
    - Never repack A;
    - Never delete/rewind a commit in A.

  If A must be repacked then:
    1) Hardlink all loose objects/packs from A to all B;
    2) `git --git-dir=A repack -a -d`
    3) `git --git-dir=B repack -a -d -l`
    4) `git --git-dir=B prune`

Ain't easy, is it?  That --shared flag to git-clone is playing
with fire.  If you don't want to get burnt, don't use it.  Maybe
we should rename it --light-a-match-and-burn-me-please.  ;-)

Forks on repo.or.cz are cool, but they can cause problems if the
root project is allowed to rewind or delete branches.

-- 
Shawn.
-
To: Shawn O. Pearce <spearce@...>
Cc: Johannes Schindelin <Johannes.Schindelin@...>, Petr Baudis <pasky@...>, <git@...>
Date: Tuesday, August 28, 2007 - 7:19 am

This has been discussed before, and it wouldn't be *that* hard to have
"git clone --shared" create a backpointer from B to A, so that
"git-prune" could also search the B's refs and not prune anything that

Why don't you even allow people to push into A?  That should be

This is morally the same, but it makes the hardlink step easier (only
one pack to link from A to B), and by using git-gc it makes it
conceptually easier for people to understand what's going on.

git --git-dir=A gc
ln A/.git/objects/pack/* B/.git/objects
git --git-dir=B gc --prune
git --git-dir=A prune

... and I don't think this is all that scary at all.  :-)

					- Ted
-
To: Theodore Tso <tytso@...>
Cc: Johannes Schindelin <Johannes.Schindelin@...>, Petr Baudis <pasky@...>, <git@...>
Date: Wednesday, August 29, 2007 - 12:15 am

Not if I already have a pointer from B to A's refs.  repo.or.cz
also has this same pointer:

	git clone --shared A B
	ln -s A/refs B/refs/forkee

If we then do the backpointer we'll get a circular loop between these
two repositories and the output of git-ls-remote will be horrid to
look at.  It will also be confusing when you push, as we'll try to
match "refs/forkee/refs/forker/refs/forkee/heads/master"... ;-)

The reason I (and repo.or.cz) create the pointer from B to A is
so that push and fetch can see that B already has the objects in

Push new things yes.  Rewind a branch or delete a branch, no.
I actually don't allow pushing into A because it just doesn't make

No, it won't work.

The problem is that during the first `git --git-dir=A gc` call
you are deleting packfiles that may contain objects that B needs.
*poof*.  They are gone.  B cannot traverse its object list to gc
itself during the third command.  That's why you have to do the
hardlinking of *everything* stored in A (even if it is not reachable)
into B before you can gc A.

The shorter and safest approach is the following, but it will cost
a lot of disk space and IO while running:

  git --git-dir=B repack -a -d    ; # yes, really no -l !
  git --git-dir=A repack -a -d
  git --git-dir=B repack -a -d -l ; # yes, now use -l

The first repack of B will copy everything from A into B, so that
if the repack of A removes the object it will reside in B and B's
own repack can keep the object.  But if A still has the object then
B won't copy it during the final repack of B.

You have to use repack here and not gc as gc defaults to including
the -l flag.

-- 
Shawn.
-
To: Shawn O. Pearce <spearce@...>
Cc: Theodore Tso <tytso@...>, Johannes Schindelin <Johannes.Schindelin@...>, Petr Baudis <pasky@...>, <git@...>
Date: Wednesday, August 29, 2007 - 1:11 pm

Now, this doesn't work well with packed refs, I'm afraid.

So I suspect that if we really want to support something like this, we'd 
need to do more than just avoid the recursion when you cross-link.

		Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Theodore Tso <tytso@...>, Johannes Schindelin <Johannes.Schindelin@...>, Petr Baudis <pasky@...>, <git@...>
Date: Friday, August 31, 2007 - 10:58 pm

No, it doesn't work well.  So I actually also avoid packing A's refs.
Which is yet another reason why my A's don't allow pushing, that
way nobody goes nuts and creates a ton of refs in there.  With only

Yes.  I've been thinking about trying to better share the ODB and
the ref database between repositories, but it has been low priority
for me.

I rely on this ref symlinking/alternate ODB trick a lot at day-job
to help me cope with an ugly situation I created across a number of
repositories.  Most of our codebase came from one Git repository,
but has been refactored and split into about 10 different Git
repositories.  I did that refactoring by just cloning and deleting
the uninteresting content, so each repository actually has a huge
block of its history in common with the other 9.

One such A is "common-crap.git" that is the shared common history.
Since it's strictly history, nobody changes that repository, and
everyone borrows objects from it.  This reduces my common working
set by about 900MiB, as the history lives in only one packfile and
not in 10.

There are obviously other ways to deal with this:

 - start the 10 repositories over again and use info/grafts to
   reinsert the old history when/if required;

 - just hardlink the same .keep'd packfile into the 10 repositories,
   since it is held by .keep it won't be touched during repack.

So one reason it has been low priority for me to improve upon is
because there's more than one way to solve the problem, and the
particular solution I have settled upon may not be the best solution
for anyone.

Though I think we can all agree that repo.or.cz's use of forks
is increasingly more popular, and one of the more powerful social
features of git.  Better supporting it out of the box by making it
easier to set up and manage can only be a good thing for our users.

-- 
Shawn.
-
To: Shawn O. Pearce <spearce@...>
Cc: Johannes Schindelin <Johannes.Schindelin@...>, Petr Baudis <pasky@...>, <git@...>
Date: Wednesday, August 29, 2007 - 7:13 am

But "git-gc" without the --prune doesn't delete any objects.  So it
should always be safe to use git-gc even if there are repositories
that are relying on that repo's ODB.  It's only if you use git-gc
--prune that you could get in trouble.  It might delete some
packfiles containing objects needed by B, but only after consolidating
all of the objects into a single packfile that contains all of the
objects that had always been in A's ODB.

So I don't see why this wouldn't work.

						- Ted
-
To: Theodore Tso <tytso@...>
Cc: Johannes Schindelin <Johannes.Schindelin@...>, Petr Baudis <pasky@...>, <git@...>
Date: Friday, August 31, 2007 - 5:09 pm

Yes, it does delete objects.  Even without --prune.  That is
because git-gc is running `git-repack -a -d -l`.  repack -a means
repack all objects reachable from the current refs.  The -d means
delete the packfiles that existed when the repack started, as it
is assumed that all needed (reachable) objects were copied into
the new output packfile(s).  The -d also means delete any loose
objects that are now packed (git-prune-packed).

Yet there may be objects in A that A cannot reach anymore (deleted
or rewound branch) but that B needs and B does not have a copy of.
If these objects were in one of the prior packfiles of A and is

But when we repack we don't repack everything in A's ODB, we only
repack the things that A can reach.  If A cannot reach something
because a branch was rewound or deleted it won't survive the repack.

It only works if A cannot delete a branch or rewind a branch.
In other words, once an object is stored in A's ODB it must always
be reachable from A's refs.

-- 
Shawn.
-
To: Shawn O. Pearce <spearce@...>
Cc: Theodore Tso <tytso@...>, Johannes Schindelin <Johannes.Schindelin@...>, Petr Baudis <pasky@...>, <git@...>
Date: Wednesday, August 29, 2007 - 1:08 am

Two things to watch out for are (1) packed refs won't be
protected with this trick, and (2) symrefs in refs/ hierarchy
will point at wrong place if you did this.  The latter hopefully
won't be a problem because the trick being discussed is only to
add reachability and not _using_ the borrowed refs for anything
(iow, this makes B/refs/forkee/remote/origin/HEAD incorrectly
point at refs/remotes/origin/master, but what it really should
point at is B/refs/forkee/remote/origin/master).

-
To: Junio C Hamano <gitster@...>
Cc: Shawn O. Pearce <spearce@...>, Theodore Tso <tytso@...>, Johannes Schindelin <Johannes.Schindelin@...>, <git@...>
Date: Wednesday, August 29, 2007 - 5:58 am

BTW gitweb actually uses refs/forkee/ to add funny ref tags to commits,
which was completely unintended but is actually in the end quite handy
(though the tags should be modified to look less confusing).

-- 
				Petr "Pasky" Baudis
Early to rise and early to bed makes a male healthy and wealthy and dead.
                -- James Thurber
-
To: Theodore Tso <tytso@...>
Cc: Shawn O. Pearce <spearce@...>, Petr Baudis <pasky@...>, <git@...>
Date: Tuesday, August 28, 2007 - 7:32 am

Hi,


It _is_ hard, if you want to keep any kind of safe permissions.  Think 

Nope:

# ls-remote separates the SHA-1 and the ref name with a tab
for b in $(git ls-remote /that/other/repo | cut -f2)
do
	git push /that/other/repo ":$b"
done

Ciao,
Dscho

-
To: Johannes Schindelin <Johannes.Schindelin@...>
Cc: Theodore Tso <tytso@...>, Petr Baudis <pasky@...>, <git@...>
Date: Wednesday, August 29, 2007 - 12:20 am

Well, at day-job I use contrib/hooks/update-paranoid to deny all
push access into my A's (/that/other/repo).  But that could just
as easily be configured to allow branch creation and branch update
(fast-forward) but no rewind or delete.
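
For illustration only, the "create and fast-forward, but no rewind or delete"
policy could be sketched as a plain update hook like the one below. This is
not how update-paranoid itself is written or configured; the function names
are invented, and the only git facts relied upon are that the update hook
receives the ref name, old SHA-1 and new SHA-1, and that an all-zero id
stands for "no object".

```shell
# An all-zero id means the ref does not exist on that side of the update.
is_zero() {
    case $1 in
        *[!0]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Classify an update from <old> to <new> as create, delete,
# fast-forward or rewind.
classify_update() {
    old=$1 new=$2
    if is_zero "$new"; then
        echo delete
    elif is_zero "$old"; then
        echo create
    elif git merge-base --is-ancestor "$old" "$new" 2>/dev/null; then
        echo fast-forward
    else
        echo rewind
    fi
}

# Succeed only for the updates the policy allows.
allow_update() {
    case "$(classify_update "$1" "$2")" in
        create|fast-forward) return 0 ;;
        *) return 1 ;;
    esac
}

# In hooks/update one would end with something like:
#   allow_update "$2" "$3" || { echo "denied: $1" >&2; exit 1; }
```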

When I symlink A's refs into B I also don't allow B to update,
create, rewind or delete the symlinked refs via push.  This way
you can't do something weird to A like upload new objects into B's
ODB but then change A's refs to point to objects that A's own ODB
doesn't have.

Hmm, I wonder if Pasky handles that correctly on repo.or.cz...

-- 
Shawn.
-
To: Shawn O. Pearce <spearce@...>
Cc: Johannes Schindelin <Johannes.Schindelin@...>, Theodore Tso <tytso@...>, <git@...>
Date: Wednesday, August 29, 2007 - 5:54 am

I don't handle it at all, but if you don't have permissions to modify A
you simply won't be able to do anything weird to A. If you have the
permissions, I'm still not sure if Git will keep symlinked refs over
ref updates; if so, hey, you had the permissions for A and it's your
responsibility if you screw up.

-- 
				Petr "Pasky" Baudis
Early to rise and early to bed makes a male healthy and wealthy and dead.
                -- James Thurber
-
To: Shawn O. Pearce <spearce@...>
Cc: Petr Baudis <pasky@...>, <git@...>
Date: Tuesday, August 28, 2007 - 4:25 am

Hi,



But maybe on repo.or.cz we can do something about it?  Like using a global 
object database, or actually teaching git about _splitting_ object 
databases into parts: something along the lines of "we repack the forked 
projects before the forkees.  That way, the objects will be still there 
when the forkee "lost" them, but will be repacked into the forked 
project's database".

Ciao,
Dscho

-
To: Petr Baudis <pasky@...>
Cc: <git@...>
Date: Sunday, August 26, 2007 - 10:40 pm

I'd like to see the service mirrored, including potentially a repository
which contains the meta/auth information (sans password hashes, perhaps)
for admins on accounts.

This of course opens a large can of worms when it comes to achieving
decentralization of the service as a whole, however I think those
questions will be best answered once the information is available for
setting up mirrors.

I've also got hardware, bandwidth and some tuits.

Sam.
-
To: Petr Baudis <pasky@...>
Cc: <git@...>, Jakub Narebski <jnareb@...>
Date: Sunday, August 26, 2007 - 8:16 pm

Just a minor nit, but how about dropping the "git+" from the
Push URL?

Jakub was also talking about support in gitweb for specifying
the location of submodules.  It would be nice if admins could
set this information, wherever it ends up getting stored.

skimo
-
To: <skimo@...>
Cc: <git@...>, Jakub Narebski <jnareb@...>
Date: Sunday, August 26, 2007 - 8:41 pm

I'm a major proponent of the "git+" - it's just the correct thing to
specify. ssh:// by itself means secure _shell_, and that's not what the
URL means - ssh is literally just a transport layer for the git
protocol. This is not my invention but fairly standard thing which
plenty of people use, and it makes it possible to select proper protocol
handlers and so on, should something generic crunch on the URL. I've

Hmm, this shouldn't be very hard to do if the support will get into
guess I'll wait for the gitweb side.

-- 
				Petr "Pasky" Baudis
Early to rise and early to bed makes a male healthy and wealthy and dead.
                -- James Thurber
-
To: Petr Baudis <pasky@...>
Cc: Sven Verdoolaege <skimo@...>, <git@...>
Date: Monday, August 27, 2007 - 5:58 pm

Is it now possible to _upload_ an SSH key, instead of copy'n'pasting it
when creating a repository/account?

Would it be reasonable to limit repository name length, or at least

First, it is in the context of git, so one can say that "git+" is
implied. Documentation mentions only "ssh://". That said I prefer
"git+ssh://" to "ssh://" alone.

Second, IIRC Linus prefers scp-like syntax for SSH protocol, namely
"[user]@host:/path/to/repo", so perhaps that one should be used

The problem is that repo.or.cz needs support for that in gitweb, while
gitweb in turn needs support for that in git. This needs git consensus
on how to specify object database location (or just gitdir) for
submodules, to have later submodule support in gitweb.

-- 
Jakub Narebski
Poland
-
To: Jakub Narebski <jnareb@...>
Cc: Petr Baudis <pasky@...>, <git@...>
Date: Tuesday, August 28, 2007 - 4:49 am

What would be the use of that (outside of gitweb) ?

skimo
-
To: Sven Verdoolaege <skimo@...>
Cc: Petr Baudis <pasky@...>, <git@...>
Date: Tuesday, August 28, 2007 - 5:56 pm

For the hypothetical (planned?) future '--recurse-submodules' option
to git-diff family, git-ls-tree and git-ls-files, git-fetch and git-push
(but I think not git-pull), perhaps git-log (besides what it supports
by the way of git-diff-tree), maybe even git-status and git-commit.
Gitweb support is a special case of git-ls-tree support, of sorts.

BTW I think this would need complementing "lightweight checkout" support,
i.e. core.gitdir (overridable by GIT_DIR and --git-dir) and perhaps
core.objectdir (overridable by GIT_OBJECT_DIRECTORY) in the subproject
config itself.

-- 
Jakub Narebski
Poland
-
To: Jakub Narebski <jnareb@...>
Cc: Petr Baudis <pasky@...>, <git@...>
Date: Wednesday, August 29, 2007 - 3:32 am

Ah... you're talking about bare repositories, right?
For non-bare repos, I'd assume you would only recurse for those
submodules that you have actually checked out in your working tree.

skimo
-
To: <skimo@...>
Cc: Petr Baudis <pasky@...>, <git@...>
Date: Wednesday, August 29, 2007 - 7:12 pm

For bare repositories, and for repositories which move around (change
path at least once). Gitweb treats a repository as if it were bare...

-- 
Jakub Narebski
Poland
-
To: Petr Baudis <pasky@...>
Cc: <skimo@...>, <git@...>, Jakub Narebski <jnareb@...>
Date: Monday, August 27, 2007 - 2:23 pm

I'd say "only", not "major".


No it isn't, and no it doesn't.

It makes no sense what-so-ever.

"ssh://" is the *protocol*. What is actually done over the protocol is 
specified by the program.

This is not at all git specific. Try running "ssh" vs "scp" some day, and 
you'll notice the exact same thing: they both use the ssh _protocol_, but 
no, your statement that "ssh://" by itself means "secure _shell_" is total 
and utter garbage.

It means nothing at all of the kind. 

"ssh://" means the ssh protocol. It is that unambiguous, and that simple. 
Saying "git+ssh://" is totally idiotic, always has been, and always will 
be.

It's as stupid as it would be to require people to say

	scp cp+ssh://host/filename .

and nobody sane would *ever* advocate something that stupid. It's not how 
it's done.

So why do you continue to advocate "git+ssh://", when nobody else does, 
and several people have asked you not to?

And yes, I realize that SVN does it. SVN for some unfathomable reason uses 
"svn+ssh://", but let's face it, the SVN developers have neither taste nor 
brains. They don't know any better.

			Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Petr Baudis <pasky@...>, <skimo@...>, <git@...>, Jakub Narebski <jnareb@...>
Date: Monday, August 27, 2007 - 4:05 pm

Really?

What does `ssh://what.the.hell.org/some/file' per se mean?

SSH is a protocol, but rather in the sense similar to TLS, not to HTTP.
If it has some addressable objects, which could be referred to by the
path part of the URL, they should be the programs to execute at the
remote server, i.e., in our case the path to the GIT client binary,
and certainly not the name of the repository, which has nothing to do
with the SSH protocol.

(Just for completeness: I do not advocate using git+ssh, but your arguments
against it look somewhat illogical.)

				Have a nice fortnight
-- 
Martin `MJ' Mares                          <mj@ucw.cz>   http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Only dead fish swim with the stream.
-
To: Martin Mares <mj@...>
Cc: Petr Baudis <pasky@...>, <skimo@...>, <git@...>, Jakub Narebski <jnareb@...>
Date: Monday, August 27, 2007 - 6:27 pm

So what does "http://what.the.hell.org/some/file" mean?

Does it mean that you have to start a web browser? Should we make that be

	git+http://what.the.hell.org/some/file

to make it clear that we're doing "git work" over the "http" protocol?


What does *that* mean? A protocol is a protocol. Your argument that 
protocols are "different" is pointless. Some protocols are usable for git, 
others aren't. OF COURSE different protocols are different. They are 
different in different ways.

Git uses URL's to say how to access something, which includes a protocol, 
an optional host, and a location within the host. It's quite obvious what 
they mean, and it's *also* obvious that the meaning is git-specific. 

Here's what it boils down to:

 - do you think it is sensible to write

	git clone git+file:///some/directory
	git clone git+http://host/directory
	git clone git+rsync://host/directory

   when cloning from the local filesystem, over http, or over rsync 
   respectively? The first one, btw, actually uses the "git protocol". The 
   two others do not, but since a user shouldn't care, it would be really 
   stupid to try to make some internal implementation detail show up in 
   the URL scheme.

 - if you really think that the above is sensible, then explain why.

 - if you think that is TOTALLY IDIOTIC, then explain why "ssh://" is so 
   magically special that it would somehow make sense to say "git+" for 
   it?

As to your TLS example: if we were to do "git over TLS", it would make 
perfect sense to use either "tls://" (although "gits://" might be more 
natural, not because tls is wrong, but because people have gotten used to 
"https://") if we were to have a "secure git" port. Or maybe we'd use the 
same port number that we already have assigned for git, and just add some 
"use TLS to authenticate/encrypt", and use "tls://" for that. It makes 
perfect sense.

In short: you should just ask yourself: what is the most natural thing for 
a *user* to type to "git c...
To: Linus Torvalds <torvalds@...>, Petr Baudis <pasky@...>, <git@...>
Cc: Martin Mares <mj@...>, Sven Verdoolaege <skimo@...>
Date: Monday, August 27, 2007 - 7:16 pm

I like the gits:// idea for "git over TLS", and I'm against "tls://". I wonder
if it would be hard to implement "git over TLS"? We could resurrect the patch
which allowed push over the git protocol, only restricting pushing to the gits
protocol.

-- 
Jakub Narebski
Poland
-
To: Jakub Narebski <jnareb@...>
Cc: Linus Torvalds <torvalds@...>, Petr Baudis <pasky@...>, <git@...>, Martin Mares <mj@...>, Sven Verdoolaege <skimo@...>
Date: Tuesday, August 28, 2007 - 4:27 am

Hi,


I really have to wonder what the benefits are.  git:// does not need 
authentication, it is fetch-only, and you can (and should!) verify the 
integrity with git-fsck anyway.

So all TLS would do is waste bandwidth and CPU cycles.

Ciao,
Dscho

-
To: Johannes Schindelin <Johannes.Schindelin@...>
Cc: Jakub Narebski <jnareb@...>, Linus Torvalds <torvalds@...>, Petr Baudis <pasky@...>, <git@...>, Martin Mares <mj@...>, Sven Verdoolaege <skimo@...>
Date: Tuesday, August 28, 2007 - 4:30 am

It isn't fetch-only since 4b3b1e1e488fe83a8a889ff26cf88355692b6a8c
(though you have to enable it with a config option).

-Peff
-
To: Jeff King <peff@...>
Cc: Jakub Narebski <jnareb@...>, Linus Torvalds <torvalds@...>, Petr Baudis <pasky@...>, <git@...>, Martin Mares <mj@...>, Sven Verdoolaege <skimo@...>
Date: Tuesday, August 28, 2007 - 5:17 am

Hi,


Sorry for not mentioning it, but IMHO it does not really matter for this 
discussion, or does it?

Ciao,
Dscho

-
To: Johannes Schindelin <Johannes.Schindelin@...>
Cc: Jakub Narebski <jnareb@...>, Linus Torvalds <torvalds@...>, Petr Baudis <pasky@...>, <git@...>, Martin Mares <mj@...>, Sven Verdoolaege <skimo@...>
Date: Tuesday, August 28, 2007 - 5:28 am

Sorry, I have not been keeping up with this thread, so I may be
confused. But I thought you were saying that there is no point to
git-daemon over TLS, since git-daemon is purely for fetching public
data. My point is that it's not, and thus there might be some value to
it (though really, I would have no interest in implementing it -- its
functionality would be a duplicate of git over ssh, which works fine).

-Peff
-
To: Jeff King <peff@...>
Cc: Johannes Schindelin <Johannes.Schindelin@...>, Jakub Narebski <jnareb@...>, Petr Baudis <pasky@...>, <git@...>, Martin Mares <mj@...>, Sven Verdoolaege <skimo@...>
Date: Tuesday, August 28, 2007 - 6:45 pm

There's possibly another reason: using TLS for validating not the *client* 
or encrypting the data, but in order to be able to trust the *server* in 
the face of man-in-the-middle attacks etc.

A lot of people think of authentication as a way to verify the identity of 
the client. But it's equally valid as a way to verify that the server you 
talk to is the one you _expected_ to talk to.

[ That said, I'd also actually like to support encrypted git repositories, 
  at least on a pack-file basis. I realize that people should probably use 
  whole-disk encryption on their laptops etc regardless, but I really can 
  see the point of wanting to secure your repository history even if you 
  might not care enough to secure everything else - including necessarily 
  the last checked-out version. I could well imagine the repo history 
  being considered much more critical than any particular checked-out 
  state.

  I could also imagine just having a bare repository (encrypted) on hand, 
  to get access to it *if*needed*. I suspect I'd have used something like 
  that back when I worked at Transmeta if it had been available - not 
  necessarily have anything checked out, but just knowing that I *could* 
  get to it if I needed to ]

			Linus

-
To: Linus Torvalds <torvalds@...>
Cc: Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>, Jakub Narebski <jnareb@...>
Date: Monday, August 27, 2007 - 6:58 pm

This is also useful for foreign SCM support; the idea of supporting
svn+ssh:// "directly" with git remote and the likes.

I don't usually write git+ssh://, but I do consider it to be the form
which is more in the spirit of application interoperability.  It says

The scheme is bad because it doesn't integrate with other applications.
Seeing the URI in a web page they have no way of knowing which
application or port this tls:// URI refers to.  It's not *universal*.

This is fine for URIs passed into git, but bad if you want to link to it
from elsewhere.

Sam.
-
To: <git@...>
Cc: Sam Vilain <sam@...>, Linus Torvalds <torvalds@...>, Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, Jakub Narebski <jnareb@...>
Date: Tuesday, August 28, 2007 - 8:26 am

I'm not so sure that what you do with a connection is a relevant part of a 
URL.  What about this:

 $ ssh somehost mycommand

If I wrote that as a URL should it be

 mycommand+ssh://somehost/

?  Obviously not.  The protocol part tells you how to talk, not what 
should be done with it.  The very fact that the first thing git does is run 
git-daemon (or whatever) demonstrates that you are talking "ssh" protocol 
_not_ "git+ssh" protocol.



Andy

-- 
Dr Andy Parkins, M Eng (hons), MIET
andyparkins@gmail.com
-
To: Sam Vilain <sam@...>
Cc: Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>, Jakub Narebski <jnareb@...>
Date: Monday, August 27, 2007 - 7:17 pm

..and by that logic, you should add "git+" to *everything*, not just ssh.

Which simply isn't practical or sane - only damn annoying.

			Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>, Jakub Narebski <jnareb@...>
Date: Monday, August 27, 2007 - 7:30 pm

Why annoying and impractical, if you don't ever have to specify it
unless you want to write a URI which is portable between applications?

Sam.
-
To: Sam Vilain <sam@...>
Cc: Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>, Jakub Narebski <jnareb@...>
Date: Monday, August 27, 2007 - 7:34 pm

Sure. I'm perfectly happy to make connect.c just ignore any "git+" prefix, 
and let people do it.
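
A minimal sketch of what "just ignore any git+ prefix" could look like, in Python rather than the C of connect.c. The name strip_git_prefix is hypothetical and this is not the actual git code, only an illustration of the idea.

```python
def strip_git_prefix(url: str) -> str:
    """Return url with a leading "git+" dropped from its scheme, if present."""
    scheme, sep, rest = url.partition("://")
    # Only touch real URLs whose scheme starts with "git+".
    if sep and scheme.lower().startswith("git+"):
        return scheme[len("git+"):] + sep + rest
    return url


for url in ("git+ssh://host/repo.git", "ssh://host/repo.git", "host:repo.git"):
    print(url, "->", strip_git_prefix(url))
```

With something like this, git+ssh:// and ssh:// would be treated identically, and the short forms keep working unchanged.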

What I object to is:
 - the totally *idiotic* notion that "ssh" is somehow different
 - encouraging people to actually *use* that inconvenient format

The fact is, nobody really cares. We've happily used the non-"git+" forms 
for over two years, and there has never *ever* been a case of actual 
confusion. So allowing the "git+" prefix everywhere may be _logical_, but 
it's still totally idiotic and user-unfriendly.

		Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Sam Vilain <sam@...>, Martin Mares <mj@...>, <skimo@...>, <git@...>, Jakub Narebski <jnareb@...>
Date: Monday, August 27, 2007 - 8:50 pm

My logic for using git+ssh is that this URL really denotes a resource
accessed over the *git* *protocol*, merely _wrapped up_ in ssh session
for the authentication and security. But I admit that your point that
this is just exposing an implementation detail is valid.

I still like the naming git+ssh and think it is logical, but I don't
really care that much (and fail to see why you *hate* it so
intensely). So I'll sleep on it and if no further compelling positive
argument pops in my head, I'll just change it to ssh. (I don't want to
use the host:path notation because I want to make clear it's ssh and
people need to get their usernames and ssh keys in order before

I've never heard of anybody getting confused by the git+ssh syntax
either, and I honestly can't remember the several people that asked me
not to use it (besides one mail of yours complaining about it, not even
directed at me). But I may have just forgotten.

-- 
				Petr "Pasky" Baudis
Early to rise and early to bed makes a male healthy and wealthy and dead.
                -- James Thurber
-
To: Linus Torvalds <torvalds@...>
Cc: Sam Vilain <sam@...>, Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>
Date: Monday, August 27, 2007 - 7:27 pm

Not exactly. You can browse using http:// and file:// protocols,
rsync:// is simply rsync, while ssh:// (or git+ssh://) can be limited
using git-shell.

-- 
Jakub Narebski
Poland
-
To: Jakub Narebski <jnareb@...>
Cc: Sam Vilain <sam@...>, Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>
Date: Monday, August 27, 2007 - 7:38 pm

Bullshit. You carefully left out "git://", since that doesn't fit your 
"argument".

The fact is, all git URL's make sense for *git*, not necessarily for 
anything else. They may have incidental meanings outside of git, but 
certainly nothing that is really *sensible*.

And that is how it was designed to be. The URL's are for *git*, not for 
other uses. If you want to do cross-SCM tools, you need to let them know 
it's a "git" thing whether it's browsable or not, so the argument that ssh 
is something "different" is bogus crapola.

Just face it, ssh is in no way different from any of the other git URL 
specifiers. 

			Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Jakub Narebski <jnareb@...>, Sam Vilain <sam@...>, Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>
Date: Monday, August 27, 2007 - 8:38 pm

That throws out the *U* of *URL*, which stands for Uniform. If you
have to say whether a URL is a "git" URL or some other kind of URL,
it's no longer a Uniform Resource Locator.

However, to quote RFC 3986:
   Although many URI schemes are named after protocols, this does not
   imply that use of these URIs will result in access to the resource
   via the named protocol.  URIs are often used simply for the sake of
   identification.  Even when a URI is used to retrieve a representation
   of a resource, that access might be through gateways, proxies,
   caches, and name resolution services that are independent of the
   protocol associated with the scheme name.

Perhaps git:// URLs shouldn't be used at all, and we should just use
file://, ssh://, http://, etc., since the scheme is usually there to
explicitly denote the protocol we want to use, and the fact we're
using that protocol to do git work is implicit in the 'git' at the
start of the command-line. It makes sense to me that both 'git clone
<URL> foo' and 'rsync <URL> foo' would both work roughly the same,
assuming both git and rsync support the URL's scheme.


Dave.
-
To: David Symonds <dsymonds@...>
Cc: Linus Torvalds <torvalds@...>, Jakub Narebski <jnareb@...>, Sam Vilain <sam@...>, Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>
Date: Tuesday, August 28, 2007 - 1:44 am

Huh?  Do you think a URL "http:" ceases to be a URL if there is


It doesn't to me.  Not least of all because rsync does not work using
URLs to start with.  Can all of the U-is-Universal-flouting people
read the manual pages of the utilities they are advertising?

It would make the discussion less surreal.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
-
To: David Kastrup <dak@...>
Cc: Linus Torvalds <torvalds@...>, Jakub Narebski <jnareb@...>, Sam Vilain <sam@...>, Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>
Date: Tuesday, August 28, 2007 - 2:06 am

Well, my original point, which I carefully disguised with my
misunderstanding of git:// URLs, was that the URL should be enough to
locate the resource (say, a git repo); you shouldn't restrict what
those resources are used for, so that the URLs *can* be for more than
just git, because they are Uniform, and they only *locate* a resource,

Sorry, I meant something like wget. Those keys are close together, heh.


Dave.
-
To: David Symonds <dsymonds@...>
Cc: David Kastrup <dak@...>, Linus Torvalds <torvalds@...>, Jakub Narebski <jnareb@...>, Sam Vilain <sam@...>, Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>
Date: Tuesday, August 28, 2007 - 10:31 am

There is still the question though - "located by _whom_?"   IMHO, the  
answer to that question determines the _different contexts_ for these  
URLs.  When I'm specifying a URL on a git command line (e.g. "git  
clone /url/to/some/repo"), it does seem redundant to say the least to  
have to always prefix the underlying protocol with "git+". (I think  
Linus had a good point here, in that if we were to say "git clone  
git+ssh://...", we should also say "git clone git+rsync://..." etc.  
for the sake of consistency.)

On the other hand, if we were to get to a point where we would like to  
be able to feed a URL to a 3rd-party tool, say, konqueror, and have it  
recognize that it is a git resource and access as such, then under  
_that_ context, I would agree that it is definitely necessary to have  
the protocol somehow indicate that it's "git'ish", be it "git+ssh" or  
"gits" or whatever.

-- 
Jing Xue

-
To: David Symonds <dsymonds@...>
Cc: Linus Torvalds <torvalds@...>, Jakub Narebski <jnareb@...>, Sam Vilain <sam@...>, Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>
Date: Monday, August 27, 2007 - 8:46 pm

No it doesn't. It's no less uniform just because not every program in the
world understands what to do with a particular URL. It's still uniform
in that it follows the tried and tested scheme of "protocol://host/path".
The fact that firefox, opera, konqueror and IE don't know what to do
with it means absolutely nothing.
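
As a small illustration of that uniformity, a generic URL parser can split any of these into scheme, host and path without knowing what the scheme means. The sketch below uses Python's standard urllib.parse; the helper name describe and the example.org URLs are made up for the example.

```python
from urllib.parse import urlsplit


def describe(url: str) -> tuple:
    """Split a URL into (scheme, host, path) using only the generic syntax."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.path)


for url in (
    "git://example.org/repo.git",
    "git+ssh://example.org/repo.git",
    "http://example.org/repo.git",
):
    print(describe(url))
```

The parser has no idea how to fetch from git:// or git+ssh://, yet it takes the URL apart the same way as it does for http://.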

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
-
To: Andreas Ericsson <ae@...>
Cc: Linus Torvalds <torvalds@...>, Jakub Narebski <jnareb@...>, Sam Vilain <sam@...>, Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>
Date: Monday, August 27, 2007 - 8:54 pm

What I then went on to say was that *assuming* your different programs
understand the scheme, the URL should be the same for them all. *That*
is the uniformity that should be maintained, so that you don't use
git://path/to/something for git, and file://path/to/something for
everything else.


Dave.
-
To: David Symonds <dsymonds@...>
Cc: Linus Torvalds <torvalds@...>, Jakub Narebski <jnareb@...>, Sam Vilain <sam@...>, Martin Mares <mj@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>
Date: Monday, August 27, 2007 - 9:08 pm

Well, if all URL's were the same we wouldn't need them in the first
place. The only "same" about them is the notation we use to describe
what's the protocol, host and path part, and that *is* the same.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
-
To: Martin Mares <mj@...>
Cc: Linus Torvalds <torvalds@...>, Petr Baudis <pasky@...>, <skimo@...>, <git@...>, Jakub Narebski <jnareb@...>
Date: Monday, August 27, 2007 - 5:27 pm

Not to advocate either way (me being completely new to git), but as  
far as ssh is concerned, I don't think that the addressable objects  
necessarily have to be executables.

Quoting RFC4251:
"The Secure Shell (SSH) Protocol is a protocol for secure remote login  
and other secure network services over an insecure network."

That reads rather vague to me.
-- 
Jing Xue

-
To: Linus Torvalds <torvalds@...>
Cc: Petr Baudis <pasky@...>, <skimo@...>, <git@...>, Jakub Narebski <jnareb@...>
Date: Monday, August 27, 2007 - 3:09 pm

You could find a better example.

scp doesn't accept URL syntax at all. So, no, it doesn't know about
cp+ssh://, but doesn't know ssh:// either.

-- 
Matthieu
-
To: Linus Torvalds <torvalds@...>
Cc: Petr Baudis <pasky@...>, <skimo@...>, <git@...>, Jakub Narebski <jnareb@...>
Date: Monday, August 27, 2007 - 2:58 pm

Chuckles...

I'd rather see hostname:/path/to/file on the page as it tends to
be even shorter.
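
For reference, a rough sketch of how the short hostname:/path form can be told apart from a URL or a local path. The rule assumed here is the one in the git documentation: the scp-like syntax is only recognized when there is no slash before the first colon. The function name is_scp_like is hypothetical.

```python
def is_scp_like(address: str) -> bool:
    """Guess whether address uses the short host:path form."""
    if "://" in address:
        return False  # a full URL such as ssh://host/path
    colon = address.find(":")
    slash = address.find("/")
    # Needs a non-empty host part, and no slash before the first colon.
    return colon > 0 and (slash == -1 or colon < slash)


for address in ("hostname:/path/to/file", "ssh://hostname/path", "./dir:name"):
    print(address, "->", is_scp_like(address))
```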
-