Re: Worrisome bug trend

Previous thread: Replacement for cvs2cl, for generating ChangeLog by Simon Josefsson on Tuesday, February 27, 2007 - 4:41 am. (19 messages)

Next thread: How do get a specific version of a particular file? by Theodore Ts'o on Tuesday, February 27, 2007 - 5:34 am. (31 messages)
From: Junio C Hamano
Date: Tuesday, February 27, 2007 - 5:31 am

I was reviewing the bugs we fixed since v1.5.0 and noticed
almost all of them are ancient ones.  We do have a small number of
bugs introduced by recently added commands and options, but I
see quite a few that are from 2005.

I take that as a sign that git hasn't been exercised well and
yet more ancient bugs are sleeping, waiting to be triggered, not
as a sign that we are very careful and adding only a small amount
of risky new code in the releases.

Which is kind of depressing...

The following table shows each bug fixed since v1.5.0, and the
commit that introduced the bug.  Many bugs are attributed to the
first commit that introduced the feature.

20276889 (daemon socksetup() does not set FD_CLOEXEC)
	a87e8be2 Jul 13 2005

de6f0def (no-trivial-merge)
	6ea23343 Mar 18 2006

f4421325 (blame with missing parameter)
	cee7f245 Oct 19 2006

185c975f (trust_executable_bit not trusting too much)
	3e09cdfd Oct 11 2005

256c3fe6 (rev-list commit encoding)
	52883fbd Dec 25 2006

75b62b48 (combine-diff broken cast)
	e702496e Aug 23 2006 (memcpy->hashcpy)
	funny thing is that another similar cast is correct.

8ab40a20 (show-ref --verify)
	26cdd1e7 Dec 17 2006

c06d2daa (format-patch filename length)
	0acfc972 Jul  5 2005

50892777 (diff --git a//etc/inittab)
	65056021 Apr 28 2006 (first built-in diff)

ab242f80 rerere (find_conflict skips adjacent)
	658f3650 Dec 20 2006 (inception, C rewrite)

12891727 rerere (find_conflict uses symlinks)
	658f3650 Dec 20 2006 (inception, C rewrite)
	8389b52b Jan 28 2006 (original Perl version)

ffa84ffb (pack-object fixed arglen)
	8d1d8f83 Sep 06 2006 

308efc10 (merge-index symlink handling)
	54dd99a1 Dec 02 2005

17cd29b2 (merge-recursive symlink handling)
	3af244ca Jul 27 2006

4fc970c4 (diff --cc symlink while merging)
	ea726d02 Jan 28 2006 (teach --cc to diff-files)

4e5104c1 (git-remote command did not like dots in name)
	e194cd1e Jan 03 2007

34fc5cef (mailinfo choke with too long a line)
	c5f7674a Jul 16 2005 ...
From: Randal L. Schwartz
Date: Tuesday, February 27, 2007 - 8:09 am

>>>>> "Junio" == Junio C Hamano <junkio@cox.net> writes:

Junio> Which is kind of depressing...

Maybe if you looked at who has been reporting the bugs, you'd find a different
story.  It's quite possible that the "inner circle" all used git in a
homogeneous way, not performing every possible advertised operation, but now
that git is being used by more people, older bugs are getting revealed because
people really are using it out there in some nicely unique (or perhaps
boneheaded :) ways.

Any quick stats on diversity of bug submitters?

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
-

From: Junio C Hamano
Date: Tuesday, February 27, 2007 - 1:22 pm

I do not know how others noticed them but I am guessing most of
them were found out by hitting a breakage in real life use.
Annotated list of reporters and context of discovery is attached
at the end.

Observations.

 - There are a handful of 'missing argument validation and erroring
   out' bugs.  People in the know tend not to make the kind of mistake
   that triggers these bugs, so it is a sign that git is being used
   by a wider population that these bugs are surfacing now.

 - A few leaks and hardwired limits are signs that the code was
   not used for heavy-duty settings back when it was written,
   but it is now.

 - Tracking of symlinks was added quite early by explicit need to
   support them in the kernel archive, and the bug was there
   ever since and nobody noticed.  Maybe the need for tracking
   symlinks in SCM was real but they never got changed by more
   than one party (not needing merges).

 - There are a few "diff --cc" fixes.  Maybe not as many people
   perform "interesting" merges as the effort spent on writing
   the --cc code.

 - rerere's skipping of adjacent paths was there from the
   beginning of the C rewrite.  Either rerere is not as widely
   used, or conflicting merges are not frequent enough in real life
   to trigger this, or maybe a little bit of both.
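
As an illustration of the "hardwired limits" and "missing validation"
classes of bug mentioned in the list above, here is a minimal sketch in C.
It is not code from git: join_path() and its selfcheck are hypothetical
names invented for this example. It shows one way to fill a fixed-size
buffer while reporting an error instead of silently truncating or
overflowing when the input is longer than the author expected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not from git: build "<dir>/<name>" in buf.
 * Returns 0 on success, -1 if the result would not fit in bufsz bytes
 * (including the terminating NUL) or if formatting failed.
 */
int join_path(char *buf, size_t bufsz, const char *dir, const char *name)
{
	int n = snprintf(buf, bufsz, "%s/%s", dir, name);

	/* snprintf returns the length it wanted to write, so a value
	 * at or beyond bufsz means the output was truncated. */
	if (n < 0 || (size_t)n >= bufsz)
		return -1;
	return 0;
}

void join_path_selfcheck(void)
{
	char buf[8];

	assert(join_path(buf, sizeof(buf), "abc", "def") == 0);
	assert(strcmp(buf, "abc/def") == 0);
	/* "abcd/efgh" needs 10 bytes with the NUL; must be rejected. */
	assert(join_path(buf, sizeof(buf), "abcd", "efgh") == -1);
}
```

The point of the sketch is only that the over-long case is an explicit,
testable error path rather than an assumption about input size.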

----------------------------------------------------------------

20276889 (daemon socksetup() does not set FD_CLOEXEC)
	a87e8be2 Jul 13 2005

Alex Julliard.

f4421325 (blame with missing parameter)
	cee7f245 Oct 19 2006

Tommi Kyntola.

256c3fe6 (rev-list commit encoding)
	52883fbd Dec 25 2006

Fredrik Kuivinen.

8ab40a20 (show-ref --verify)
	26cdd1e7 Dec 17 2006

Dmitry V Levin.

c06d2daa (format-patch filename length)
	0acfc972 Jul  5 2005

Robin Rosenberg, presumably by noticing breakage while dealing
with a foreign SCM import.

ffa84ffb (pack-object fixed arglen)
	8d1d8f83 Sep 06 2006 

Roland Dreier, by noticing breakage.

4e5104c1 (git-remote command did not like dots in ...
From: Johannes Schindelin
Date: Tuesday, February 27, 2007 - 1:30 pm

Hi,


... while working on some diff --no-index code. (Strangely enough.)

Ciao,
Dscho

-

From: Alexandre Julliard
Date: Tuesday, February 27, 2007 - 1:41 pm

... by noticing breakage when upgrading to 1.5.0.

-- 
Alexandre Julliard
julliard@winehq.org
-

From: Dmitry V. Levin
Date: Tuesday, February 27, 2007 - 2:30 pm

On Tue, Feb 27, 2007 at 12:22:43PM -0800, Junio C Hamano wrote:

... while playing with new utilities (introduced after v1.4.4.x).


-- 
ldv
From: Andy Parkins
Date: Tuesday, February 27, 2007 - 8:36 am

Maybe not.  It's my feeling that the traffic on the git list has taken a 
significant jump lately.  There have been a lot of new users asking 
questions.

Could it be that more users equals more bug reports?  Could it also be that 
new users tend to use git (let's put this politely) "outside of the box"?

The users popping up on the list will be a small percentage of those actually 
using/adopting git.  I think it's great, not depressing.  I'm a relatively 
new user of git, and just in the time I've been lurking the bugs/issues that 
affect me have dropped significantly.

As the philosopher says: don't worry, be happy. :-)



Andy

-- 
Dr Andy Parkins, M Eng (hons), MIET
andyparkins@gmail.com
-

From: Linus Torvalds
Date: Tuesday, February 27, 2007 - 9:00 am

I'd say that it's good news. I'd be a lot more worried if there is a big 
rash of *new* bugs being introduced, rather than small and subtle *old* 
bugs being fixed.

There were a number of "December, 2006" bugs there, and I'd worry more 
about those. The old bugs were all fairly obscure (face it, nobody 
actually uses SCM's to track symlinks, because symlinks are not 
historically tracked by most SCM's).

And the _really_ old bugs (eg the mailinfo one) were either features that 
you'd never use on Unix anyway (trust_executable_bit) and thus haven't 
gotten any testing, or were about over-long buffers that mostly wouldn't 
realistically trigger in practice (ie lack of coverage).

Finding bugs is good. Some of it may well be due to just having more 
users. And much of it is probably because everybody has bugs - but 
absolutely none of the bugs on that list looked even remotely like a "we'd 
screw up the repository". They were all pretty much harmless details.

		Linus
-

From: Johannes Schindelin
Date: Tuesday, February 27, 2007 - 1:00 pm

Hi,


I tend to agree with all the answers that this trend is to be expected.

Especially since we seem to attract more and more users who are unable or 
unwilling to fix the bugs themselves (up until recently, most bug reports 
seemed to me to be accompanied by patches).

So, it does not appear worrisome to me.

However, I would like to see people thinking about how to teach "sparse" 
to catch those kinds of errors so that we can actually learn in an 
efficient way from our mistakes.

For example, I refuse to believe that an error like checking int against 
ssize_t cannot be found by "sparse".
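
The thread does not show the offending code, so the following is only an
illustrative sketch of the general class of mistake: comparing the result
of a read()/write()-style call against a length of a different integer
type. io_complete() is a hypothetical helper, not a git function. It
assumes a POSIX system where ssize_t comes from <sys/types.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t (POSIX) */

/*
 * Hypothetical helper, not from git: returns 1 only when a
 * read()/write()-style result `got` reports exactly `want` bytes.
 */
int io_complete(ssize_t got, size_t want)
{
	if (got < 0)
		return 0;	/* error return; never treat it as a length */
	return (size_t)got == want;
}

void io_complete_selfcheck(void)
{
	assert(io_complete(5, 5) == 1);
	assert(io_complete(3, 5) == 0);	/* short transfer */
	/* A bare "(size_t)got == want" would wrongly accept this pair,
	 * because -1 converts to the largest size_t value. */
	assert(io_complete(-1, (size_t)-1) == 0);
}
```

Checking the sign before converting keeps the comparison well defined
regardless of how wide each type is on a given platform.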

Ciao,
Dscho


-

From: Sam Vilain
Date: Tuesday, February 27, 2007 - 1:25 pm

No! It's a sign that there aren't enough tests :)

Maybe investigate the coverage of the test suite?

Sam.
-

From: Junio C Hamano
Date: Tuesday, February 27, 2007 - 1:42 pm

I know we cover most of the success (expected) cases for things
we care about, but at the same time I personally find that tests
for failure cases (insane input, dataset expected to fail) are
missing.

We do not need investigation.  We need a volunteer.

And perhaps a new patch/feature acceptance criterion that
requires both expected behaviour and expected failure tests, but
I am lazy ;-).
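
To make the distinction between expected-behaviour and expected-failure
tests concrete, here is a small sketch. git's own test suite consists of
shell scripts; this C example only illustrates the idea and is not taken
from it. parse_count() and its selftest are hypothetical names, and the
rule "non-negative decimal integer that fits in an int" is an assumption
made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical parser, not from git: accept a non-negative decimal
 * integer that fits in an int.  Returns 0 and stores the value in *out
 * on success, -1 on any malformed or out-of-range input.
 */
int parse_count(const char *s, int *out)
{
	char *end;
	long v;

	if (!s || !*s)
		return -1;	/* missing or empty argument */
	errno = 0;
	v = strtol(s, &end, 10);
	if (errno == ERANGE || *end != '\0' || v < 0 || v > INT_MAX)
		return -1;	/* overflow, trailing garbage, or negative */
	*out = (int)v;
	return 0;
}

void parse_count_selftest(void)
{
	int n = 0;

	/* Expected behaviour. */
	assert(parse_count("42", &n) == 0 && n == 42);

	/* Expected failures: insane input must be rejected. */
	assert(parse_count("", &n) == -1);
	assert(parse_count("12x", &n) == -1);
	assert(parse_count("-3", &n) == -1);
	assert(parse_count("99999999999999999999", &n) == -1);
}
```

Only the first assertion covers the success path; the remaining four are
the kind of failure-case coverage described above as missing.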

-

From: Johannes Schindelin
Date: Tuesday, February 27, 2007 - 1:44 pm

Hi,



I think that's okay. Many, many new features and bug fixes come with 
tests. I think that we do not have _few_ tests. At least not compared to 
other projects (especially commercial ones...).

Ciao,
Dscho

-

From: Robin Rosenberg
Date: Tuesday, February 27, 2007 - 2:07 pm

When bugs get fixed and reappear, that's the time to start worrying. That old 
bugs get fixed is a very good sign. It means Git is being tested by users 
that care. Git is very feature rich and considering that, it's amazing that 
it isn't completely bug-ridden with 10K known unfixed bugs. That would be 
depressing, the current state isn't.

-- robin
-
