Re: [PATCH] Add a message explaining that automatic GC is about to start

To: <git@...>
Date: Tuesday, October 16, 2007 - 1:06 pm

* Tue 2007-10-16 Michael Witten <mfwitten AT MIT.EDU>

Spaces are guaranteed to be interpreted correctly in all environments. TABs
are the source of too many problems.

Jari

-- 
Welcome to FOSS revolution: we fix and modify until it shines

-
To: Jari Aalto <jari.aalto@...>
Cc: <git@...>
Date: Tuesday, October 16, 2007 - 3:20 pm

No.

Tabs are 8 spaces wide. Live with it. It's the only sane solution.

The fact is, people do mix the two. No ifs, buts or maybes about it. Even 
in the absence of any actual *spaces*, the size of a tab matters, since 
you can - and do - have two separately indented things (the initial 
indentation, and then things like comments being indented separately).

The only sane solution is the one the kernel and git have always used: 
tabs are 8 spaces wide, and anybody who disagrees can go screw themselves. 
If you don't have 8-character tabs, you *will* get odd indentation.

And no, the answer is not to say "don't use tabs at all" and replace them 
by spaces. The answer is *also* not "tabs are just for initial code 
indents", because not only will most sane editors never even show the 
difference, it's simply not how people work. So such a rule about 
invisible things doesn't work.

People who want to be contrary, and have a 2-character-wide tab only have 
themselves to blame. It's THEIR problem, not something that is even worth 
trying to address. 

If there are problems with people having small screens, that is damn well 
not about TAB, it's about code being way too deeply indented, and smaller 
indents are absolutely *not* the answer - they are part of the damn 
problem to begin with.

The fact that some projects have encouraged bad coding style and *insane* 
tab values is not a git problem. We should teach people to do *better*, 
not become worse just because others have done idiotic things.

				Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Jari Aalto <jari.aalto@...>, <git@...>
Date: Thursday, October 18, 2007 - 12:36 am

Unfortunately, it leads to some problems. For example, you can type:
git blame alloc.c

2c1cbec1 (Linus Torvalds     2007-04-16 22:10:19 -0700 21) #define DEFINE_ALLOCATOR(name, type)				\
855419f7 (Linus Torvalds     2006-06-19 10:44:15 -0700 22) static unsigned int name##_allocs;				\
100c5f3b (Linus Torvalds     2007-04-16 22:11:43 -0700 23) void *alloc_##name##_node(void)					\
855419f7 (Linus Torvalds     2006-06-19 10:44:15 -0700 24) {								\
855419f7 (Linus Torvalds     2006-06-19 10:44:15 -0700 25) 	static int nr;						\

and see that the end of line 23 does not look right. Because of that,
I prefer tabs for initial code indents and spaces in other places. Of
course, my preferences are irrelevant when it comes to someone else's
project, and I can easily use whatever style it takes to get things
done. It is just that "use tabs elsewhere and everything will be fine
as long as you have the standard tab setting" is not exactly correct.
The rest is people's preferences and habits...
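
The effect can be reproduced outside git. The following Python sketch is not part of the original message; it assumes 8-column tab stops, copies the two macro lines from the excerpt above, and uses a made-up 59-character filler in place of the blame annotation.

```python
def backslash_column(line, prefix="", tab_width=8):
    """Column of the trailing backslash once tabs are expanded."""
    return (prefix + line).expandtabs(tab_width).index("\\")

# Lines 21 and 23 of the excerpt: text, then hard tabs, then the backslash
define_line = "#define DEFINE_ALLOCATOR(name, type)" + "\t" * 4 + "\\"
node_line = "void *alloc_##name##_node(void)" + "\t" * 5 + "\\"

# Hypothetical stand-in for the blame annotation (same length on every line)
prefix = "x" * 59

plain = [backslash_column(l) for l in (define_line, node_line)]
blamed = [backslash_column(l, prefix) for l in (define_line, node_line)]
print(plain)   # [64, 64]
print(blamed)  # [120, 128]
```

Without the prefix both backslashes land in column 64; with it they end up in different columns, which is the misalignment visible on line 23.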

Dmitry
-
To: Linus Torvalds <torvalds@...>
Cc: Jari Aalto <jari.aalto@...>, <git@...>
Date: Tuesday, October 16, 2007 - 4:56 pm

It is insane to *require* disciplined people to use tabs for more than
code indents.
If you insist on using tabs all over the place - fine with me.
But do not frown upon me and other disciplined people because we use
spaces to make sure our arguments to a function call are properly
aligned in a tab=10,tab=8,tab=2 environment.

The argument "tabs are always 8 spaces properly aligned" is just
to reach the lowest common denominator among developers.
And frankly there are some that do better than that.

The root cause is the stupid editors that do not make it
easy to be disciplined, and that's where the errors come from
and all the stupid rules like the above.

	Sam
-
To: Linus Torvalds <torvalds@...>
Cc: Jari Aalto <jari.aalto@...>, <git@...>
Date: Tuesday, October 16, 2007 - 3:36 pm

Actually, part of the mess with tabs is due to the fact they're not
exactly 8 spaces wide, but any width that ends at a multiple of 8
characters from the start of the line. So 0 <= n < 8 spaces and a tab
is still 8 spaces.
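
To make the rule concrete, here is a small Python sketch (not from the thread; `next_tab_stop` is a name made up for illustration) that computes where a tab lands when tab stops sit at every multiple of 8 columns. It is cross-checked against Python's built-in `str.expandtabs`.

```python
TAB_WIDTH = 8  # the conventional terminal default discussed in this thread

def next_tab_stop(column, tab_width=TAB_WIDTH):
    """Return the column the cursor reaches after a tab typed at `column`."""
    return column + (tab_width - column % tab_width)

# A tab typed after 0..7 characters always lands on column 8.
for n in range(8):
    assert next_tab_stop(n) == 8
    assert len((" " * n + "\t").expandtabs(TAB_WIDTH)) == 8

# Number of columns the tab itself consumes in each case
print([next_tab_stop(n) - n for n in range(8)])  # [8, 7, 6, 5, 4, 3, 2, 1]
```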

Anyways, it's maybe just simpler to run indent before sending patches.

Mike
-
To: Mike Hommey <mh@...>
Cc: Jari Aalto <jari.aalto@...>, <git@...>
Date: Tuesday, October 16, 2007 - 3:47 pm

Umm.. That's the definition of "tab width".

The tab width is 8. Not "0 < n <= 8". Not "depends on where you are". The 
tab width is 8.

The whole history of tab is that it comes from mechanical "tab stops" that 
you could set, and that were independent of the text - pressing the tab 
key would move to the next tab stop.

Now, those tab stops were movable, and in fact, I think lots of terminals 
still support setting those tab stops dynamically (ie you can send control 
sequences to set their "tab stops" to different points, exactly like an 
old mechanical typewriter).

But when it comes to computers, 8-character wide tab stops is the 
de-facto standard. It's what every single terminal defaults to. It's the 
only thing that some printers/terminals support. Anything else is by 
definition non-standard.

			Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Mike Hommey <mh@...>, Jari Aalto <jari.aalto@...>, <git@...>
Date: Tuesday, October 16, 2007 - 4:32 pm

Read better before replying, and I'm sure you'll agree with Mike ...

-- 
Matthieu
-
To: Mike Hommey <mh@...>
Cc: Jari Aalto <jari.aalto@...>, <git@...>
Date: Tuesday, October 16, 2007 - 3:51 pm

Side note: one reason you *have* to use 8-character wide tab stops if you 
want to be sane is that while your editor may have alternate tab-stops, 
when you look at the sources any other way or on any other setup, the 
default is *always* going to be that 8-character wide tab-stop.

Do a "git cat-file -p :Makefile", and it will default to using "less". 
Have you added "-x2" to your LESS environment variable? Has everybody else? 
Not likely.

Or what happens when you just cat it straight, without any less at all? 

In short: using anything but 8-char wide tab-stops is INSANE, because it 
will inevitably show the same source code in different ways depending 
on which editor or other environment you use.

In contrast, if you just accept that 8-wide tabs are a fact, you never see 
any of these issues. Everything "just works".

			Linus
-
To: Linus Torvalds <torvalds@...>
Cc: <git@...>
Date: Tuesday, October 16, 2007 - 4:18 pm

I'm reading two different ideas here, and it seems like you're
conflating the two — and, in the process, telling some pretty smart
people (smarter than me, anyhow) to go fuck themselves.

If a project uses tabs, your statement regarding 8-char-width tabs makes
sense; you need some rule by which you can assume others are viewing the
same thing you are.

But you then dismiss out of hand the option of using all spaces; Python
has been getting along perfectly well for quite some time by following
this rule, and my experience with the language leads me to believe it's
the wiser of the choices.  Questions over tab width simply *go away*;
you pick an indentation level (Python uses 4) and stick with it.

I'm not arguing that git should switch to all spaces; projects tend to
become set in their ways, and consistency can be valuable.  I'm merely
pointing out that all-spaces is a quite *sane* option, even if it's one
git doesn't choose.


-
To: Tom Tobin <korpios@...>
Cc: <git@...>
Date: Tuesday, October 16, 2007 - 7:05 pm

I do indeed. I don't think it's sensible. And I did think I already 
answered that issue by talking about how most editors don't even support 
it or show the difference between tabs and spaces.

For example, the editor I use - microemacs - supports tabs just fine. It 
does auto-indentation etc. But it does it with hard-tabs by default, so 
now you have to have some editor-specific setup for that particular 
project if you ever want to do anything else.

And that's really what it boils down to. Everybody supports 8-character 
hardtabs (and usually by default). They may support other things *too*, 
but any time you move away from that standard behaviour, you'll most 
likely find something that doesn't support the alternatives.

So yes, the answer really is: "git uses 8-character hard-tabs, live with 
it". 

		Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Tom Tobin <korpios@...>, <git@...>
Date: Tuesday, October 16, 2007 - 7:51 pm

On Tue, 16 Oct 2007 16:05:34 -0700 (PDT)

Unfortunately most editors are totally confused about the difference
between tab size and indentation level.  Visual Studio, probably the
most commonly used development environment on Windows, by default uses
TAB characters that are 4 spaces wide, and users are recommended
not to change that because a lot of existing Windows source code
and examples use those settings.

Two years ago, when I last looked at it, Eclipse, a very commonly used
development environment, managed to confuse tabs and indentation and
make it almost impossible to write Java or C code with a tab size of 8
with a different indentation level.  The Eclipse 3 betas did see some
improvement there, I think it got possible to do the right thing in
Java at least, but the normal text editor and C editor lagged behind.
But it was still a big mess and it was much too easy for someone to get
a tab size which is not 8.  Hopefully this has been fixed by now, but I
wouldn't bet any significant amount of money on it.

Nedit (which runs on Linux) has a very confusing settings dialog with
terms such as "tab spacing", "emulated tabs".  I guess emulated tabs
means the indentation level, but guess how easy that is to mess up.

gedit can control the tab width, but has no setting at all for
configuring the indentation level.  Guess what people do when they want
a 4 space indentation level?  Yes, right, change the tab size to 4.
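
The distinction between tab size and indentation level can be shown with a short Python sketch. It is an illustration added here, not something from the message: `make_indent` is a hypothetical helper that builds a 4-column indentation level on top of 8-column hard tabs.

```python
def make_indent(level, indent_width=4, tab_width=8):
    """Whitespace for `level` indents, using a hard tab for every full tab_width."""
    columns = level * indent_width
    return "\t" * (columns // tab_width) + " " * (columns % tab_width)

for level in range(4):
    print(level, repr(make_indent(level)))
# 0 ''
# 1 '    '
# 2 '\t'
# 3 '\t    '

# Viewed with a tab size of 4, level 2 collapses onto level 1.
assert len(make_indent(2).expandtabs(4)) == len(make_indent(1).expandtabs(4))
```

With the tab size left at 8, every level keeps its own width; changing the tab size to 4 instead of changing the indentation level is what makes levels 1 and 2 indistinguishable.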

A former colleague who used visual slickedit usually produced code
with tab size 4.  I think I've gotten the same crap from ultra edit 32
users.

And so on...  Mercifully, _all_ of these editors have a setting to use
spaces instead of tabs, and telling people to turn on that setting is
the absolutely easiest way of making things "just work".  Yes, I know,
the correct answer is to tell people to always use tab size 8, and I
frequently and loudly do that.  But at the same time, perfect is the
enemy of good.  It's much easier to explain "tabs will act dif...
To: Christer Weinigel <christer@...>
Cc: Tom Tobin <korpios@...>, <git@...>
Date: Tuesday, October 16, 2007 - 8:45 pm

One issue may well be that Windows programmers also probably don't work 
very much with patches, do they?

One reason for *really* wanting to use hard-tabs is that it makes patches 
look better, exactly because diffs contain that one extra (or two, in the 
case of old-style context diffs) character at the beginning of the line.

Which means that while all-space indents look fine, *mixing* styles 
definitely does not. In particular, a two-character indent (which 
hopefully nobody uses, but people are crazy) will be totally unreadable as 
a patch if you have the (fairly common, at least in UNIX projects) style 
of using spaces for less-than-eight-character-indents and tabs for the 
full 8 characters.

(In particular, a 3-level and 4-level indent will look *identical* in such 
a project, when using context diffs).
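
The parenthetical above can be checked with a few lines of Python. This sketch is an added illustration, assuming 8-column tab stops, a one-character prefix for unified diffs, and a two-character prefix for old-style context diffs; `visible_indent` is a made-up helper name.

```python
def visible_indent(indent, diff_prefix, tab_width=8):
    """Column where the code starts after the diff prefix and indentation."""
    return len((diff_prefix + indent).expandtabs(tab_width))

level3 = " " * 6   # three 2-character indents, written as spaces
level4 = "\t"      # four 2-character indents, written as one hard tab

for name, prefix in (("source", ""), ("unified diff", "+"), ("context diff", "+ ")):
    print(name, visible_indent(level3, prefix), visible_indent(level4, prefix))
# source 6 8
# unified diff 7 8
# context diff 8 8
```

In the context-diff case both levels start in column 8, so the two indentation depths cannot be told apart.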

And sure, you can use all-spaces-everywhere, but that just isn't what any 
normal UNIX editors are set up for by default. In contrast, under UNIX, I 
can pretty much guarantee that hard-tab indents look at least reasonable 
in any editor.

And if you have an editor that shows hard-tabs as 4-character indents, 
generally you can work with it. You may have odd indentation, and people 
may complain about your patches not lining up, and yes, it would be up to 
*you* to understand that 8-wide tabs are the normal and default. But you 
can certainly work with a source base that uses a single hard-tab for 
indentation.

In contrast, if you use spaces (or worse - mixing), things really look 
ugly as sin, to the point of actually being unworkable.

In short:

 - if the project has the rule that an indentation is "one hard-tab", then 
   at least everybody can *work* with that project. Different people may 
   see things laid out slightly differently, but it's generally not a 
   horrible disaster, especially if you aim to use block comments indented 
   with the code (like we *mostly* do both in the kernel and in git)

 - all-space and all-tabs just leads to problems. Y...
To: <git@...>
Cc: Linus Torvalds <torvalds@...>
Date: Tuesday, October 16, 2007 - 11:08 pm

It's unreasonable not to list that anywhere.

mfwitten
-
To: Michael Witten <mfwitten@...>
Cc: <git@...>
Date: Tuesday, October 16, 2007 - 11:29 pm

Heh.

I was sure we had a "CodingStyle", but it turns out that no, we don't, and 
yes, the 8-tab assumption is implicit in (a) the kernel rules (which git 
started out following for obvious reasons, and which *does* have 
documentation making this very explicit indeed) and (b) those few places 
where you can actually see it in the result.

So maybe it should be made explicit. You can see the effect right now by 
doing

	git grep -1 '	 ' *.c

(again, that regex is a "tab+space", although it's not obvious) and then 
looking for places where we line up things in ways that simply wouldn't 
have worked if it wasn't an 8-wide tab, ie things like

	...
	check_all_attr = xrealloc(check_all_attr,
				  sizeof(*check_all_attr) * attr_nr);
	..
	read_tree_recursive(args->tree, args->base, plen, 0,
			    args->pathspec, write_zip_entry);
	..

where the arguments wouldn't line up for anything but 8-char-wide tabs.

(But the code is certainly *readable* with other tab sizes, so it's not 
like this makes it impossible to work if somebody has a 4-space tab, it 
just means that such people can get odd effects - but they may not even 
realize that others see things line up!)
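
As an added illustration (not part of the original message), this Python sketch takes the first example above and reports the column of the first argument on each line for two tab widths. The helper name `argument_columns` is hypothetical.

```python
first = "\tcheck_all_attr = xrealloc(check_all_attr,"
cont = "\t\t\t\t  sizeof(*check_all_attr) * attr_nr);"

def argument_columns(tab_width):
    """Columns of the first argument on each of the two lines."""
    # Column just after the opening parenthesis on the first line
    a = first.expandtabs(tab_width).index("(") + 1
    # Width of the leading whitespace on the continuation line
    b = len(cont.expandtabs(tab_width)) - len(cont.lstrip())
    return a, b

print(argument_columns(8))  # (34, 34)
print(argument_columns(4))  # (30, 18)
```

With 8-wide tabs the continuation line starts exactly under the first argument; with 4-wide tabs it falls twelve columns short, yet the code remains readable.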

		Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 3:17 am

I'm late in this game. But it's too classic a debate to miss the fun.


Yes, all-space would look fine in patches. It'll look better than all  
tabs for tables and ascii formula and diagrams in comments, as one  
prepended character could screw up the tabs (depending on the  
content), rendering them totally unreadable. In all-space case,  
things just shift to the right by one character column.

I believe the indentation convention for ruby is 2 spaces. It looks  

But all-space would look perfect in any editor as the authors  
intended, including the tables and ascii arts, as long as it's using  
monospace font. It's easy to set up all-space editing on all platforms  
(Windows, Mac, *nix). It's also much easier to enforce. I've used a  
pre-commit hook to check for tabs in the source and reject them if a tab  



But I still haven't seen any compelling arguments against the "all  
space" case, other than "people will screw it up into mixed spaces",  
which is really a straw man, as many multi-platform projects enforced  
the all-space policy easily by using a pre-commit hook in  
maintainers' repository.

The only downside of all-space is a moderate space bloat in source,  
which is insignificant, all things considered.

I agree that "8-character tabs are the gold standard", only for the  
tabstop==8 part but not the indent==tab part. For me the question is:  
is it really so unreasonable to just say "all-space is the holy grail"?

__Luke



-
To: Luke Lu <git@...>
Cc: <git@...>
Date: Wednesday, October 17, 2007 - 5:09 am

Overhead!

If you use 8 spaces instead of one tab,
that's using up 7x more space!

Consider:

     # calculates the extra space required to
     # use the given number of spaces/tab.
     size()
     {
         count=`grep -RIo "\`printf \"\t\"\`" . | wc -l`;
         perl -e "print $count*$(($1-1))/1024/1024 . \" MB\n\"";
     }

     Then in a git working tree:

         size 8; # 1.28701210021973 MB
         size 4; # 0.551576614379883 MB

     In a linux kernel working tree:

         size 8; # 61.4902725219727 MB
         size 4; # 26.3529739379883 MB

Conclusion:

     Yikes!


I hate tabs, but I can't argue with that!

Michael Witten
-
To: Michael Witten <mfwitten@...>
Cc: Luke Lu <git@...>, <git@...>
Date: Wednesday, October 17, 2007 - 6:21 am

As already pointed out, this isn't the true waste.  Run the following
Ruby script to determine the true waste:

------------ cut here -----------
TabWidth = 8
Tab = "\t".ord  # byte value of the TAB character

actual_size = 0
expanded_size = 0
ARGF.each_line do |line|
  width = 0
  line.each_byte do |byte|
    # A tab advances to the next multiple of TabWidth; any other byte is one column
    width += (byte == Tab) ? (TabWidth - (width % TabWidth)) : 1
  end
  actual_size += line.bytesize
  expanded_size += width
end
puts(expanded_size - actual_size)
------------ cut here -----------

This will give you the actual space waste.  Run it like so:

% ruby space-waste.rb /usr/src/linux/**/*.[ch]

(or in a similar manner that doesn't fail due to going over the
maximum command-line limit).

According to this calculation the waste is 47808782 bytes, or about
45.6 MiB, for 8-spaces-wide tabs.
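
The gap between the crude estimate and this one can be seen on a tiny input. The Python sketch below is an added illustration under the same 8-column assumption; the function names and the sample text are made up. The crude count charges 7 extra characters per tab, while the exact count expands each tab only to the next tab stop.

```python
def crude_waste(text, tab_width=8):
    """Upper bound: every tab is assumed to expand to tab_width spaces."""
    return text.count("\t") * (tab_width - 1)

def exact_waste(text, tab_width=8):
    """Extra characters needed once each tab is expanded to its real width."""
    return sum(len(line.expandtabs(tab_width)) - len(line)
               for line in text.splitlines())

sample = "\tint x;\t\t/* comment */\n\t\treturn x;\n"
print(crude_waste(sample), exact_waste(sample))  # 35 29
```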
-
To: Nikolai Weibull <now@...>
Cc: <git@...>
Date: Wednesday, October 17, 2007 - 7:23 am

I concede my calculation was crude.

Interestingly, modifying my calculation to look
for tabs at the beginning of the line gives a
similar result:

     # calculates the extra space required to
     # use the given number of spaces/tab.
     size()
     {
         count=`grep -RIo "^\`printf \"\t\"\`" . | wc -l`;
         perl -e "print $count*$(($1-1))/1024/1024 . \" MiB\n\"";
     }

     size 8; => 49.7416791915894 MiB

and for git:
	
     size 8; => 1.25082969665527 MiB


Anyway, thanks for the neat script.

mfwitten
-
To: Michael Witten <mfwitten@...>
Cc: <git@...>
Date: Wednesday, October 17, 2007 - 6:03 am

First, the overhead is not a simple x4 or x8 conversion in size, but  
it's the upper bound. Given that, let's look at the percentage of the  
overhead: my git working tree is 56MB after gc, so the overhead is  

Now, compile the kernel, do a du in the tree and report back  
percentages of the overhead.

Disk is cheap (1GB costs less than half a dollar), people's  
productivity/time is not. The overhead argument is compelling, not!

__Luke
  
-
To: Luke Lu <git@...>
Cc: Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 11:53 am

But we also established that an all-space model is not stable, because any 

Hell no, it's not.

More importantly, I can guarantee that certain developers will refuse to 
be part of such a project with such an idiotic design that eats disk-space 

Hey, you start your own project, and you can enforce whatever idiotic rules 
you want to. 

But in the meantime, all-tab indentations are equally good, and are the 
de facto rule. So *you* are the one who should show compelling arguments 
for changing, and so far you haven't shown any.

Really: what is the problem with hardtabs? Absolutely none.

			Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 2:05 pm

Hi,


Yes.  Me, for one.

But heck, _everyone_ is free to fork.  That is one of the missions of git: 
"fork!".  You can maintain your tab-less fork, until people flock to you, 
deciding to use your repo instead of Junio's, or Shawn's.  If enough 
people decide, you will have more followers than the others.

Ciao,
Dscho

-
To: Linus Torvalds <torvalds@...>
Cc: Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 8:32 pm

In what way does an all-space model cause people to accidentally add
tabs, but an all-tab model does not cause people to accidentally add
spaces?

-Peff
-
To: Jeff King <peff@...>
Cc: Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 8:59 pm

It happens. We do de-spacification in the kernel occasionally when it is 
an annoyance. Usually it shows up in patches, though - exactly because 
code which adds spaces instead of tabs won't line up correctly in the 
diff.

So it doesn't matter *which* one you use (all spaces or all tabs) in that 
sense. But clearly tabs are *way* more common at least in any UNIX 
project, and tabs really do have the advantage of being smaller.

And smaller *is* faster. Do something like this on the kernel:

	GIT_PAGER= time git grep sched_fair

and then do the same thing with the kernel sources blown up by 20% by 
de-tabification. Guess which one is 20% slower?

And whoever said that disk space doesn't matter doesn't know what he is 
talking about. Disk space most *definitely* matters. Do the above test 
with a cold-cache case, and think what 20% more IO does to you (or 20% 
less disk cache).

But no, the size issues are secondary, I'm not claiming anything else. 
Although I do suspect that historically, they have been primary, and have 
been the thing that has resulted in the fact that tabs are so commonly 
used.

			Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 10:45 pm

You have made this claim several times, and I really don't understand
it. If I have 8 spaces, then a diff line will have either " ", "+", or
"-" followed by 8 spaces. If I use a hard tab, then the tab will end up
only taking up 7 spaces because of the nature of tabs.

This might matter if I'm comparing non-diff code to diff code. But in a
diff, _everything_ is indented by exactly one space, so it all lines up.

Yes, I agree with that (even with an all-tabs policy, there are still
mangled and incorrect patches that come in -- and the maintainer rejects
or fixes them).

Which was what I was trying to point out with my question (though I was
also curious to hear your answer): all-space versus all-tab is largely a
matter of preference. And that means that people who want git to change

I was about to tell you that you're full of it, but there really is a
slowdown:

$ cd linux-2.6
$ GIT_PAGER= time git grep sched_fair >/dev/null
0.34user 0.94system 0:01.30elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7548minor)pagefaults 0swaps

$ find . -name .git -prune -o -type f | xargs perl -pi -e 's/\t/        /g'
$ git-commit -a -m de-tabify
$ git-repack -a -d
$ GIT_PAGER= time git grep sched_fair >/dev/null
0.42user 1.06system 0:01.54elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+7591minor)pagefaults 0swaps

It's actually about 16%.
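
The message does not say which of the reported numbers the 16% refers to. As an assumption, the Python sketch below (added for illustration) compares both the user+system CPU time and the elapsed time from the two runs above.

```python
def slowdown_percent(before, after):
    """Relative increase of `after` over `before`, in percent."""
    return (after - before) / before * 100

cpu = slowdown_percent(0.34 + 0.94, 0.42 + 1.06)   # user + system seconds
elapsed = slowdown_percent(1.30, 1.54)             # wall-clock seconds
print(f"CPU time: {cpu:.0f}%  elapsed: {elapsed:.0f}%")
# CPU time: 16%  elapsed: 18%
```

Under that reading the figure matches the CPU time, while the wall-clock slowdown comes out slightly higher.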


Gah, I can't believe I've not only been sucked into a tab vs spaces
discussion, but now I've actually wasted time doing a performance
comparison on it.

As an aside, that commit was enough to trigger a "git-gc --auto", which
was my first experience with it. It's actually kind of annoying
(especially since I was about to repack -a -d).

-Peff
-
To: Jeff King <peff@...>
Cc: Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 11:03 pm

if the code uses a tab and the patch uses 8 spaces the two will not line 
up in the diff because in the diff output the tab is 'only' 7 spaces.

using one or the other isn't the problem, it's the mixing of the two.

David Lang
-
To: <david@...>
Cc: Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 11:00 pm

Yes, obviously. The people who advocate mixing really _are_ objectively
wrong. But I was talking about all-spaces versus all-tabs.

-Peff
-
To: Jeff King <peff@...>
Cc: <david@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 11:32 pm

If you really are all one-or-the-other, then everything is obviously fine, 
and spaces have somewhat stronger guarantees (I say "somewhat", because 
the line-up-guarantee of all-spaces is only guaranteed with fixed-width 
fonts, and hard-tab indents often look nicer in printouts, and are 
generally much more flexible in just how wide you make the indent *look*, 
ie hard-tabs at least *allow* people to see the indents in different 
ways, even if that will potentially mess up any alignment).

But some mixing is inevitable, and at least in UNIX, the tendency is for 
tabs, not spaces, by default, so tabs have a much higher chance of 
*staying* mostly tabs, while anybody who uses spaces pretty much *will* 
get tabs inserted by just about any programming editor that isn't set up 
for python.

So you always get _some_ amount of mixing, exactly because most editors 
won't show you the difference, and what people aren't aware of, they don't 
think about. There's no getting away from that, unless you actually 
enforce it with hooks (and in a distributed environment, even that isn't 
really going to fly, is it?).

And if you *do* decide to enforce it with hooks, you now have issues like 
the fact that some files mustn't do it (autoconvert tabs to spaces in a 
Makefile, and it just stops working!), and others have somewhat subtle 
issues forcing your converter to be somewhat knowledgeable (trivial 
example: strings that are spread across multiple lines in C..)
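
As a rough illustration of the core check such a hook might perform, the Python sketch below scans a unified diff for added lines containing tabs while skipping Makefiles. It is not from the thread: the function name and the exemption list are hypothetical, and a real hook would still need to obtain the diff itself (for example from `git diff --cached`) and deal with the multi-line-string cases mentioned above. It also treats any line starting with `+++ ` as a file header, which is a simplification.

```python
import os

# Hypothetical exemption list: tabs are part of the syntax in these files
EXEMPT_NAMES = {"Makefile", "makefile", "GNUmakefile"}

def added_lines_with_tabs(diff_text):
    """Yield (path, line) for added lines in a unified diff that contain a tab."""
    path = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # "+++ b/path" names the file the following hunks apply to
            path = line[4:].split("\t")[0]
            if path.startswith("b/"):
                path = path[2:]
        elif line.startswith("+") and "\t" in line[1:]:
            if path and os.path.basename(path) not in EXEMPT_NAMES:
                yield path, line[1:]

example_diff = (
    "--- a/Makefile\n+++ b/Makefile\n@@ -1 +1,2 @@\n all:\n+\tcc foo.c\n"
    "--- a/foo.c\n+++ b/foo.c\n@@ -1 +1,2 @@\n int x;\n+\tint y;\n"
)
print(list(added_lines_with_tabs(example_diff)))  # [('foo.c', '\tint y;')]
```

The same structure works for the opposite policy (rejecting leading spaces) by changing the test on the added line.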

In general, if you do enforce it (which I personally think is not likely a 
good idea, but hey, it's up to the project), I'd *still* suggest going the 
way of enforcing hard-tabs, not spaces. As mentioned, space does matter, 
but hardtabs really are "friendlier", and if you're a vi user, you can do 
a :set tabstop=4 and if that's what you're used to, it will all look 
better to you.

In contrast, all-spaces just sucks. It really has no redeeming values.

		Linus
-
To: Jeff King <peff@...>
Cc: <david@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 12:17 am

btw, don't get me wrong. I think 8-char tabstops are much better, and 
would much prefer to really teach people not to indent too deeply (because 
six-deep indents really look ugly as hell with 8-char indents).

So setting tabstops to smaller values is not something I think is good 
practice, but at least that way you can keep your dirty perversions to 
yourself, and don't have to admit to the world that you molest dogs and 
small children and use an inferior tab-stop.

The rest of the world might notice occasionally when you don't hide your 
non-indentation tab-uses well enough, of course, but keep to block 
comments and spaces for non-indentation, and you'll be reasonably safe.

However, if I see people actually having indentations six+ deep, I'll know 
that (a) you're likely a small-tab-misusing-deviant and (b) a horrible 
programmer. And then the tab-deviancy is the smaller problem.

(Yes, we do have some cases of six+ deep indentations in the kernel. I'm 
happy to say that they are fairly rare, and yes, I'm personally convinced 
that the 8-wide indent level is part of it. It really *does* end up 
encouraging people to split things up and write nested cases as separate 
functions etc, because it simply becomes so obvious when things are going 
south..)

			Linus
-
To: Jeff King <peff@...>
Cc: Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 12:52 am

Well, you actually touched every file in the tree, and there are about 
22K of them.  With this, plus the tree objects leading to them, your commit 
certainly did create an unusual amount of loose objects.  Repacking them 
will inevitably take a while.


Nicolas
-
To: Nicolas Pitre <nico@...>
Cc: Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 12:54 am

Yes, I know. I wasn't complaining so much about the speed, but rather
the behavior of "git-gc" running while I was in the middle of trying to
accomplish something else (I hadn't seen it before, because I generally
keep my repos fairly packed).

-Peff
-
To: Nicolas Pitre <nico@...>
Cc: Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 12:55 am

But if you are trying to say that my case is pathological, then yes.

-Peff
-
To: Jeff King <peff@...>
Cc: Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 11:13 pm

Yes. 

You're missing the fact that some people have problems with editors.

So they add a line, and they add *that* line with the wrong kind of 
indentation. And it shows up among the other lines like this (here the 
whole patch is indented):

	diff --git a/kernel/sched.c b/kernel/sched.c
	index 92721d1..1ecb164 100644
	--- a/kernel/sched.c
	+++ b/kernel/sched.c
	@@ -127,6 +127,7 @@ static inline u32 sg_div_cpu_power(const struct sched_group *sg, u32 load)
	 static inline void sg_inc_cpu_power(struct sched_group *sg, u32 val)
	 {
	 	sg->__cpu_power += val;
	+        wrong indentation here.
	 	sg->reciprocal_cpu_power = reciprocal_value(sg->__cpu_power);
	 }
	 #endif

and so you see the fact that somebody messed up in the patch itself.

It actually more often goes the other way: somebody may have messed up 
earlier, but did so *consistently* so it wasn't obvious when looking at 
the patch. And then somebody fixes one line, and now that one fixed line 
is indented correctly but differently.

When it gets *too* bad, we just reindent the whole file, but more 
commonly when I notice it in a diff, I just edit that particular region 
or even just the diff itself in-place.

Generally, it seldom comes to even that. Doing a

	git grep '        ' -- '*.c'

(that's now eight spaces) returns quite a lot of lines, and it's generally 
not worth worrying about (not all of them are indentation - people do use 
spaces for lining things up etc - but a lot of it really is just indents 

I didn't even time it, and I called it at 20% without even counting any 
tabs. Why? Because it's inevitable!

It so happens that "grep" has a lot of really clever heuristics, so that 
it is actually better at passing over characters that it knows cannot 
start the pattern you are searching for, so timing "grep" is actually 
quite complex in the general case. So I bet that if you had grepped for 
something that started with a space, you'd probably have found a bigger 
slowdown. 

But ig...
To: Linus Torvalds <torvalds@...>
Cc: Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 11:23 pm

Oh. I thought you meant "if you have an all-spaces policy, your diffs
will not look good." But what you meant was "when people screw up your
policy by mixing tabs and spaces, your diffs will not look good." And
that applies equally, whether they are screwing up your all-tabs or
all-spaces setup (and I remain convinced that no matter what your
policy, people _will_ screw it up).


Yes. I wondered whether the increased size would really matter here, or
if it would get lost in the noise of program startup and other

No, it's not a waste. In the grand scheme of things, I don't actually
care that much about the result, but hey, I think I may be the only

It would have been nicer if it said something like "Your repository
looks crufty. Running git-gc --auto..." using whatever terms users would
be comfortable with. Instead, it just started with "Counting objects"
and a long wait. I happen to know what that means, but I'm not sure how
a git newbie would react (though it looked _much_ nicer because of
Nico's recent terser progress patches).

-Peff
-
To: Jeff King <peff@...>
Cc: Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 12:41 am

---


And as an added bonus, we can tell people how to turn off automatic GC
and how to invoke it by hand.

 builtin-gc.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/builtin-gc.c b/builtin-gc.c
index 23ad2b6..b1159d6 100644
--- a/builtin-gc.c
+++ b/builtin-gc.c
@@ -211,6 +211,10 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		prune = 0;
 		if (!need_to_gc())
 			return 0;
+		fprintf(stderr, "Packing your repository for optimum "
+			"performance. If you would rather run\n"
+			"\"git gc\" by hand, run \"git config gc.auto 0\" "
+			"to disable automatic cleanup.\n");
 	}
 
 	if (pack_refs && run_command_v_opt(argv_pack_refs, RUN_GIT_CMD))
-- 
1.5.3.rc2.4.g726f9

-
To: <koreth@...>
Cc: Jeff King <peff@...>, Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 9:52 am

I'm not sure telling the users how to disable it every time it shows  
up is a good idea.  gc --auto exists for the naive user, and  
suggesting they turn it off each time it happens will just result  
in...  people turning it off, leading back to the performance issues  
that caused the feature to be installed in the first place.  Perhaps  
a message more along the lines of "To avoid this, run "git gc"  
manually on a regular basis.  See 'git help gc' for more information."

~~ Brian
-
To: Brian Gernhardt <benji@...>
Cc: <koreth@...>, Jeff King <peff@...>, Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 10:21 am

This is indeed a good point.

And for those who start repacking manually, the automatic repacking 
will very rarely trigger, reducing the need for turning automatic 
repacking off anyway.


Nicolas
-
To: Brian Gernhardt <benji@...>
Cc: Jeff King <peff@...>, Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 10:16 am

That's a good point. Jeff / Shawn, do you agree with that? I'll come up 
with an alternate patch if so.

-Steve
-
To: Steven Grimm <koreth@...>
Cc: Brian Gernhardt <benji@...>, Jeff King <peff@...>, Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 8:16 pm

Arrgh.  I already have the original patch in this thread in my master
so I can't rewind it.  But yes, the argument Brian is making above
makes a lot of sense and I like his proposed message even better
than what I've already applied.

A patch against spearce/master to revert the prior message and insert
something that is perhaps more reasonable would be most appreciated.

-- 
Shawn.
-
To: Shawn O. Pearce <spearce@...>
Cc: Steven Grimm <koreth@...>, Brian Gernhardt <benji@...>, <git@...>
Date: Thursday, October 18, 2007 - 9:12 pm

The previous message had too much of a "boy, you should
really turn off this annoying gc" flair to it. Instead,
let's make sure the user understands what is happening, that
they can run it themselves, and where to find more info.

Suggested by Brian Gernhardt.

Signed-off-by: Jeff King <peff@peff.net>
---


Geez, you really _are_ the maintainer now, prodding your minions to
write trivial patches for you. :) I don't see any point in reverting the
other patch separately, since we can just improve the message.

I tried not to use the word "avoid" since I think we don't want to imply
that auto-gc sucks. It doesn't, but some people might prefer to run it
manually, and we should let them know it's an option. I'm open to
wording improvements.

 builtin-gc.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin-gc.c b/builtin-gc.c
index f99b212..3a2ca4f 100644
--- a/builtin-gc.c
+++ b/builtin-gc.c
@@ -206,9 +206,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		if (!need_to_gc())
 			return 0;
 		fprintf(stderr, "Packing your repository for optimum "
-			"performance. If you would rather run\n"
-			"\"git gc\" by hand, run \"git config gc.auto 0\" "
-			"to disable automatic cleanup.\n");
+			"performance. You may also\n"
+			"run \"git gc\" manually. See "
+			"\"git help gc\" for more information.\n");
 	} else {
 		/*
 		 * Use safer (for shared repos) "-A" option to
-- 
1.5.3.4.1249.g895be-dirty
-
To: Jeff King <peff@...>
Cc: Steven Grimm <koreth@...>, Brian Gernhardt <benji@...>, <git@...>
Date: Thursday, October 18, 2007 - 9:24 pm

Heh.  But didn't I just post a different trivial patch to the

I agree.  No point in pissing in the snow multiple times over a
simple language change.  I was perhaps a little too aggressive in
applying Steven's first patch.  Which I also now see git-am actually
split the From line incorrectly and doesn't actually show Steven's

I think what you have is many times better.  It doesn't tell the
user that they can prevent having this activate at the wrong time
by just running git-gc every so often, but if the message (and
the subsequent packing itself) is annoying they'll read the manual

-- 
Shawn.
-
To: Shawn O. Pearce <spearce@...>
Cc: Steven Grimm <koreth@...>, Brian Gernhardt <benji@...>, <git@...>
Date: Thursday, October 18, 2007 - 9:26 pm

Yes, I tried many wordings of "this is annoying and you want to avoid
it," but explaining the situation takes way too much time for such a
commonly seen message. And I think some people will actually prefer it
that way.

BTW, the git-gc manpage needs some cleanup. Patches to follow.

-Peff
-
To: Steven Grimm <koreth@...>
Cc: Brian Gernhardt <benji@...>, Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 2:08 pm

Yes, that seems reasonable. I think the most important thing is that
they realize that "git-gc" is responsible for what is happening.  That
should allow them to find more information in the documentation if they
want (and Brian's suggestion points directly to the documentation, which
is great).

-Peff
-
To: <koreth@...>
Cc: Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 1:01 am

I like it, especially with the new progress patches.

-Peff
-
To: Jeff King <peff@...>
Cc: <koreth@...>, Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 1:11 am

Me too.  It's in my tree now.  :-)

-- 
Shawn.
-
To: Jeff King <peff@...>
Cc: Linus Torvalds <torvalds@...>, Luke Lu <git@...>, Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 12:44 am

Sigh... Forgot to add:

Signed-off-by: Steven Grimm <koreth@midwinter.com>

-
To: Linus Torvalds <torvalds@...>
Cc: <git@...>
Date: Wednesday, October 17, 2007 - 2:25 pm

Damn unix developers!  They just can't be controlled!

... seriously now.  You're trying on one hand to enforce a particular
indentation rule (use tabs for indentations, assume tabs are 8
characters wide, use spaces for partial indentation) — which assumes
unix developers *can* follow a project's rules for coding style — and
yet you're arguing *against* all-spaces because unix developers *can't*
follow rules.


Interesting how you waver between "certain developers" and "me".  I'm
convinced at this point that your argument comes down to "I can't use my
favorite text editor with all-spaces, therefore all-spaces sucks".

As for *disk space*?  When we can measure cheap drives in sizable

Yeah, can you believe some projects actually *survive* with an

Problems have been outlined, but since everything for you comes down to
"anything that comes between me and microemacs sucks", rational
discussion breaks down.

Thank goodness the git community (not to mention the Linux community!)
is larger than you; they exist in no small part due to your programming
skill and initial open-sourcing, but certainly in *spite* of your
personality otherwise.


-
To: Tom Tobin <korpios@...>
Cc: <git@...>
Date: Wednesday, October 17, 2007 - 2:54 pm

Well, let's put it this way: that "sample" is the one that started the 
project.

I got to pick the license. Are you going to argue about that too? I got to 
pick the way I wrote the code. Are you going to continue arguing about 
that?

The fact is, I don't see the people arguing for spaces having actually 

Umm. And I've *told* you that.

The whole point is:
 - every single damn editor out there can handle tabs.
 - it's the default
 - end of story.


That disk-space translates into memory usage too, and into just being 
technically the *inferior* choice.

How hard is that to accept? If you have a choice between a technically 
better solution, and a technically worse one, why are you arguing for the 

Hey, I'm not saying that others shouldn't use spaces. I'm saying that 
*git* should not, the same way the Linux kernel does not and will not.

Why? Because tabs are better. You (or anybody else) have simply never 
given any argument against that very simple argument. You try to push an 

Don't talk about "rational discussion", since you don't even *have* any.

The starting point for any rational decision would be to explain why 
changing tabs to spaces would actually improve anything at all. And you 
have yet to show *any* such argument, while I've shown arguments to the 
reverse.

One big one being: the person who started the project and still actually 
*does* something for it actually cares.

In contrast, your argument seems to be "I've not actually done anything, 
but I want to paint the bikeshed pink".

		Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 3:47 pm

Possibly because they do not want to cooperate :-) Anyway, it is mostly
a Unix <-> the rest issue. In the Unix world all tools by default use
8 spaces per tab and although most of them can be told otherwise, nobody
bothers. That way at least all code looks the same, regardless of spaces
or tabs. Many editors are smart enough to create sequences
N<TAB>+M<SPACE> for initial indentation to get consistent usage, but even
if it isn't consistent, patch and diff can be told to handle it.

Outside the Unix world there appears to be little standard, so you receive
files with any combination of tab distance, using tabs vs. spaces, etc.
Most often I have to re-indent them before reading :-(

I guess the most ideal situation is to use only tabs for initial indentation
and spaces elsewhere, so changing the tab distance gets consistent layout.
The drawback is that there is no room for `half indentation', so style
conventions have to take care of that.
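
The column arithmetic behind this can be sketched as follows. `visual_width` is a hypothetical helper written for illustration only; it computes the column at which text would continue after a given leading string, assuming tab stops every `tab_width` columns.

```c
#include <stddef.h>

/*
 * Hypothetical helper: the column reached after rendering the string s
 * when tab stops are placed every tab_width columns.
 */
size_t visual_width(const char *s, size_t tab_width)
{
	size_t col = 0;

	for (; *s; s++) {
		if (*s == '\t')
			col += tab_width - (col % tab_width); /* advance to next tab stop */
		else
			col++;
	}
	return col;
}
```

With `tab_width` 8, one tab followed by four spaces (a half indentation) ends at column 12, halfway between one tab (8) and two tabs (16). With `tab_width` 4, the same string ends at column 8, exactly where two tabs end, so the half step is no longer distinguishable from a full one.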

Still, the main developers take the decision.  If you don't like it, don't
cooperate or fork.

	--- Jan

-
To: Linus Torvalds <torvalds@...>
Cc: <git@...>
Date: Wednesday, October 17, 2007 - 3:33 pm

The way you *wrote* the code is different from deciding how the code
*should* be written — or is your code set in stone?  (Funny, considering
that we're talking about a revision control system here.)

As for the license bit: yes, you certainly *did* get to pick the
license.  What's your point?  We wouldn't even be having this discussion

Oh, here you pull out the big stick: if you haven't already done
anything for git, your ideas (as for what to, hmm, do for git) are

Every single damn editor out there can handle spaces.

The "default" is project-by-project.

Since you're BD in git-land, yes, your say-so is ultimately

That disk space translates into memory usage exactly *how*?  Compiled
code?  Or the in-memory text while you're editing?  The former can't be
the issue, and the latter is trivial.

And, of course, this still comes up against the *benefits* of
all-spaces.  Benefits which have been mentioned by several people;




Because, once again, being new makes one incorrect, doesn't it?

You've essentially demonstrated that git's "benevolent" dictator is an
asshole, and even worse, an irrational asshole.  It's one thing to deal
with a community member like that; when it's the BD, I think I'll move
along elsewhere.  Congratulations.


-
To: Tom Tobin <korpios@...>
Cc: Linus Torvalds <torvalds@...>, <git@...>
Date: Wednesday, October 17, 2007 - 5:08 pm

Hi,


Tom, you have to contribute way more code than you did, in order to be 
taken seriously with statements as these.

Ciao,
Dscho

-
To: Tom Tobin <korpios@...>
Cc: Linus Torvalds <torvalds@...>, <git@...>
Date: Wednesday, October 17, 2007 - 3:48 pm

If you can't overcome your dislike for tabs and accept that this is the 
coding style for the Git project then please go away.

Can this discussion, rational or not, just stop now?


Nicolas
-
To: Tom Tobin <korpios@...>
Cc: <git@...>
Date: Wednesday, October 17, 2007 - 3:53 pm

I notice how you didn't even list them. Why? Because they don't exist? 

So here's the deal: I claim that "use hard-tabs, and accept that they are 
8 characters wide" is a provably working situation. For lots of *large* 
projects. I'm not some odd-ball person here, I bet that if you go and look 
at any sourceforge entry that is written in C (which is the language we're 
debating here), you'll find that the ones that use hard-tabs (even if they 
use spaces for smaller indents) are the vast majority.

So what's your point? You're pushing something that is provably odd-ball, 
since almost nobody uses it, and you cannot even state what the huge 
advantages are, and you claim that I'm the one that ignores them, when it 
is *you* who have refused to acknowledge that there are reasons to *not* 
do it (one big reason being that there are current existing and 
productive developers that definitely do *not* want to change - and no, 
it wasn't just me, either).

Your arguments make no sense. So *of*course* they don't sway me.

And you know what? I don't much care if you aren't swayed by mine. It's to 
some degree a matter of taste, and the fact is, if you don't like the 
current git model, you can go away and play with your own model. It *is* 
open source, after all. 

Put another way: if you cannot respect the wishes of the people who have 
done the work, then I damn well have no reason what-so-ever to respect 
*yours*. 

		Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 5:21 pm

This is bloody ridiculous, but...

On Wed, 17 Oct 2007 12:53:58 -0700 (PDT)

I think you would get a lot less argument if you weren't so damn
self-righteous about it.  If you'd just said "it's a matter of taste and
hard tabs 8 spaces wide are the way we prefer to do it and that's the
way we want to keep doing it" I'd not have any problem with it.

But when you start claiming that there are no reasons to use all
spaces, and that it doesn't solve any problems (it definitely does),
and that all editors work with hard tabs at 8 (which they don't), at
least I get a bit frustrated with you.  It's very non-respectful of you
to claim that everyone is stupid that doesn't agree with you and is
just asking for silly arguments.  Yes, I know, you're an opinionated
bastard and proud of it, but maybe you should tone that down a bit.

(At least I got quite insulted by your Git talk at Google. People
are not stupid just because they don't agree with you, most of the time
they just have different preferences or different goals.)

  /Christer (trying to escape from Lilliput)


-
To: Christer Weinigel <christer@...>
Cc: Linus Torvalds <torvalds@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 6:11 pm

Hi,


Why is it that in this thread, people whom I have heard _nothing_ of 
before seem to think this would be a good time to let their opinion be 
heard?  Now, _that_ is what I call righteous.  Show nothing but opinion.

Ciao,
Dscho

-
To: Johannes Schindelin <Johannes.Schindelin@...>
Cc: Linus Torvalds <torvalds@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 7:17 pm

On Wed, 17 Oct 2007 23:11:37 +0100 (BST)

I'm not that invisible am I?

    cd linux-2.6.22.7
    find . -type f | xargs egrep -il "weinigel" | wc -l
    17

Ahh, confirmation, I really do exist.  The earliest LSM entry I can find
for myself is from 1995:

http://www.ibiblio.org/pub/linux/kernel/patches/scanners/hpscan-0.1.1.5.lsm

Mmm, nostalgia. :-)

But yes, I haven't contributed anything to git so far, I mostly lurk
on the mailing list and try to keep up with what's happening to git.
In the beginning I sent a few ideas to the list (which quite sadly were
ignored), but unfortunately I've been too busy with paying work to do much
with git after that.  And at work we use Perforce (yuk) or Subversion
(a lot nicer than Perforce) and privately I use CVS just because I
started using CVS ten years ago and haven't bothered to change.  And
when I'm free I prefer to hack on hardware or device drivers instead.

The reason I wrote my first mail in this thread is that I work in a
different environment than Linus does and wanted to share that
experience.  I usually work with embedded programming, where people use
lots of different editors and in mixed environments.  Some people use
Visual Studio on Windows because that's what they use for host
programming, a lot of embedded development is done in the (almost
invariably sucky) IDE that comes with the compilers for the embedded
CPU, such as Microchip's MPLAB or IAR Workbench, Eclipse is becoming
quite popular for C development, Slickedit is also popular on Windows,
a colleague prefers nedit for some strange reason, and so on. In such a
heterogeneous environment the easiest way to make sure that people see
the same indentation in all editors is to just tell them to use spaces
for indentation, and I think that every editor I just mentioned has a
setting to do that automatically.  Microemacs is the odd one out in
that it doesn't support it.  And my employers haven't really been
paying me to go on a crusade for the holy TAB is 8 spaces cause, the...
To: Christer Weinigel <christer@...>
Cc: Linus Torvalds <torvalds@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 7:44 pm

Hi,


File or directory not found.

We are not Linux specific here.  Besides, I was talking about _git_ source 

I wonder why you have to leave your hideout and comment on the source code 
of _git_, then.  I mean, why do _you_ care?  (It should have become 
apparent to you that _we_ care, so it looks even more like you wanted to 
dictate a policy on git, which you have no business with.)

Puzzled, and a little unnerved,
Dscho

-
To: Johannes Schindelin <Johannes.Schindelin@...>
Cc: Linus Torvalds <torvalds@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 8:31 pm

On Thu, 18 Oct 2007 00:44:22 +0100 (BST)

You do not have a sense of humor, do you?  And you elided the paragraph

Where did I say that I'm not interested in git?  If I weren't I
wouldn't have been reading this mailing list since it was first
created.  The reason I don't use git is a lack of time.  Professionally
I usually have no need to use git since my employer doesn't use it.
On my own time, I haven't even had time and energy to switch away from
CVS, even though I ought to have done that years ago.  I dabble a bit
with git, mercurial, or bitkeeper (or did at least) when I need to
access something out on the net which is stored in such a repository,
but I haven't switched any of my projects over to using anything else.

I'm definitely interested in git, and it's interesting to read the
mailing list to see where git is going.  And I'd definitely like git to
move in a direction where it's usable for me, or even better, where I
can recommend it to my (often Windows-only) colleagues.  

And once again you cut out the part where I said that git is Junio's
baby, he decides what goes into the main git tree, and if he says 8
wide hard tabs, that it is.  I haven't argued against that at all.  So
I'm wondering why you are trying to project things I haven't
said on me.

  /Christer (pissing contests are silly, so why the hell am I getting
             involved in one?)
-
To: Christer Weinigel <christer@...>
Cc: Johannes Schindelin <Johannes.Schindelin@...>, Linus Torvalds <torvalds@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 2:02 am

Because the right to an opinion is so fundamental to us that when someone
claims yours is stupid for reasons you (or me, or anyone) find unfair, we
stand up and say so. It's not a sign of being stupid. It's a sign of being
protective about a very basic democratic right, and oss people are usually
very protective about those. Freedom of everything comes easier to people
than "force christianity on the world, but keep your sources open!".

Linus is definitely not the most polite of people out there. Had he been
less than top-ranked in the meritocracy he would have been utterly and
totally annihilated on several mailing lists a really long time ago. At
the same time, he's entitled to his opinion the same way everyone is, and
he's entitled to express it any way he wants. When he's *really* off the
mark, people will tell him so. Or ignore him totally, like an embarrassing
grandfather who's just gotten too drunk at the Christmas dinner and started
fondling his grandson's girlfriend. However, the same goes for everyone else,
and Linus *is* fond of telling people "so". If nothing else, it saves
time to head off dead ends right at the start.

Linus isn't sacrosanct, just enough of a corner-stone that people will keep
listening to him even if he's wildly offending.

tabs-vs-spaces is not, I feel, a discussion that has a great impact in
the world, so being "really off the mark" is not really possible there.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
-
To: Christer Weinigel <christer@...>
Cc: Johannes Schindelin <Johannes.Schindelin@...>, Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 7:53 pm

I suspect it works quite well in practice.

But we've had to tweak the xdiff code before, and the hash calculations 
for bucket size limits. If somebody actually points out a problem case, we 

In general, *any* situation where you have tons of character sequences 
that are the same (and here it's not the characters *themselves* that have 
to be the same - it's the *sequence* that has to be the same, so it's not 
about repeating the same character over and over per se: it's about 
repeating a certain block of characters many many times in the source 
code) will be problematic for pretty much any similarity analysis.

Why? Because you just have a lot of the same sequence, and to get a good 
delta you want to find common "sequences of these sequences" (call them 
supersequences) in order to find the biggest common chunk.

So the badly performing cases for any delta algorithm (and I do want to 
point out that this has nothing what-so-ever to do with the particular one 
that git uses) tend to be exactly the ones where you have lots and lots 
of smaller chunks that match in two files, and that then makes it costlier 
to find the *bigger* chunks that are built up of those smaller chunks.

And generally you tend to have two situations: you either (a) take *much* 
longer to find the common areas (they are often quadratic or worse 
algorithms) or (b) you decide to ignore chunks that are so common that 
they don't really add any real information when it comes to finding truly 
common chunks. Where that second choice generally means that you can miss 
some cases where you *could* have found a good match for deltification.

In fact, usually you have a combination of the above two effects: certain 
deltas may be more expensive to find but there is also a limit that kicks 
in and means that you never spend *too* much time on finding them if the 
pattern space is not amenable to it.
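
The second choice, ignoring chunks that are too common, can be sketched with a toy block index. This is not git's or xdiff's actual code: the names (`chunk_index`, `chunk_index_build`), the constants, and the hash function are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK        4    /* bytes per indexed chunk (illustrative) */
#define NBUCKETS     64   /* hash table size (illustrative) */
#define BUCKET_LIMIT 3    /* max offsets remembered per bucket */

struct chunk_index {
	size_t offsets[NBUCKETS][BUCKET_LIMIT];
	size_t count[NBUCKETS];   /* offsets stored in each bucket */
	size_t dropped;           /* chunks ignored because a bucket was full */
};

/* Simple multiplicative hash over one chunk; any reasonable hash would do. */
static unsigned chunk_hash(const unsigned char *p)
{
	unsigned h = 0;
	size_t i;

	for (i = 0; i < BLOCK; i++)
		h = h * 31 + p[i];
	return h % NBUCKETS;
}

/* Index every non-overlapping BLOCK-byte chunk of src. */
void chunk_index_build(struct chunk_index *idx,
		       const unsigned char *src, size_t len)
{
	size_t off;

	memset(idx, 0, sizeof(*idx));
	for (off = 0; off + BLOCK <= len; off += BLOCK) {
		unsigned h = chunk_hash(src + off);

		if (idx->count[h] < BUCKET_LIMIT)
			idx->offsets[h][idx->count[h]++] = off;
		else
			idx->dropped++;  /* too common to be informative */
	}
}
```

Indexing a buffer of 40 spaces with these constants yields ten identical chunks, of which three are remembered and seven are dropped. The limit keeps lookups cheap, at the price that a match starting at one of the dropped offsets can no longer be found through the index.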

Would lots of spaces be such a pattern? I personally doubt it would really 
matter. In general, source c...
To: Christer Weinigel <christer@...>
Cc: Tom Tobin <korpios@...>, <git@...>
Date: Wednesday, October 17, 2007 - 6:03 pm

Hey, fair enough. 

I'm not very tolerant of people who haven't actually done anything, and 
then come in and say things should be done certain ways. The fact is, code 
talks, bullshit walks. 

And bullshit should most definitely not be encouraged. People like that 
should be discouraged *immediately*. I'm not interested in bikeshed 
painters, I think they should be told so forcefully enough that they 
either shut up or go away. 

Maybe it's a character flaw. I'll respect people who do something 
interesting, but re-implementing CVS (and badly, at that) or talking about 
syntactic changes to other people's projects is not going to fill me with 
respect.

In short, I'll give respect when somebody is shown to be *worth* that 
respect. But respect really has to be earned, not just "assumed", 
otherwise it's pointless.

			Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Christer Weinigel <christer@...>, Tom Tobin <korpios@...>, <git@...>
Date: Thursday, October 18, 2007 - 2:25 am

I seem to remember somebody who started out reimplementing Unix, and
badly, at that.  With the help of many other people, this thing
finally became portable to more than i386 and more than AT drives.
Took quite a long time.  The ancient floppy disk code still has the
renown of being one of the worst pieces of drivers all around.


Cooperation is easier if one is rather of the opinion that disrespect
has to be earned: otherwise you alienate potential contributors before
they even had a chance to contribute.

This difference in approach is one of the things I can't help admiring
in Junio.  And I don't think it is to the detriment of git.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
-
To: Tom Tobin <korpios@...>
Cc: <git@...>
Date: Wednesday, October 17, 2007 - 3:44 pm

They sure as hell automatically become a lot more relevant than YOUR 

Yes, and they'll start mixing spaces and tabs when they auto-indent.

		Linus
-
To: Tom Tobin <korpios@...>
Cc: Linus Torvalds <torvalds@...>, <git@...>
Date: Wednesday, October 17, 2007 - 4:31 pm

You are doing Junio a harsh injustice here.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
-
To: Tom Tobin <korpios@...>
Cc: Linus Torvalds <torvalds@...>, <git@...>
Date: Wednesday, October 17, 2007 - 3:52 pm

Think compression.  Worse yet in my opinion is the bandwidth spike that
the larger tarball would create.  Estimates earlier in the thread put
the difference at upwards of 40MB.  Disk space may be cheap, but highly
available redundant mirrored disk space is not, and neither is

I think what Linus is trying to say here is this:  THE "BENEFITS" DO

Lovely.  Personal attacks are such an effective way to get your point
across.  You, sir, are a tool.  Good riddance.

-JE



-