How to efficiently blame an entire repo?

From: Jay Soffian
Date: Thursday, April 29, 2010 - 4:12 pm

Let's say you've got a repo with ~ 40K files and 35K commits.
Well-packed .git is about 800MB.

You want to find out how many lines of code a particular group of
individuals has contributed to HEAD.

The naive solution is to run git blame on all 40K files, grepping for
just the authors you want.
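
For concreteness, the naive approach might be sketched as below. This is an illustration under assumptions, not a tuned tool: it assumes a Git recent enough to support `git blame --line-porcelain`, and the function names `count_blame_authors` and `blame_repo` are made up for this example. It runs one blame process per file, so on 40K files it will be slow.

```python
import os
import subprocess
from collections import Counter


def count_blame_authors(porcelain):
    """Count blamed lines per author name in `git blame --line-porcelain` output.

    With --line-porcelain every blamed line carries its own "author <name>"
    header. File content lines start with a TAB, so they can never be
    mistaken for a header, and "author-mail"/"author-time" do not match
    the prefix "author " (note the trailing space).
    """
    counts = Counter()
    for line in porcelain.split("\n"):
        if line.startswith("author "):
            counts[line[len("author "):]] += 1
    return counts


def blame_repo(repo=".", rev="HEAD"):
    """Blame every file in `rev` and return total line counts per author."""
    listing = subprocess.run(
        ["git", "ls-tree", "-r", "-z", "--name-only", rev],
        cwd=repo, capture_output=True, check=True,
    ).stdout
    totals = Counter()
    for raw_path in filter(None, listing.split(b"\0")):
        result = subprocess.run(
            ["git", "blame", "--line-porcelain", rev, "--", os.fsdecode(raw_path)],
            cwd=repo, capture_output=True,
            encoding="utf-8", errors="replace",
        )
        if result.returncode != 0:
            # Entries that blame cannot handle (e.g. submodules) are skipped.
            continue
        totals.update(count_blame_authors(result.stdout))
    return totals


# Hypothetical usage: sum the lines attributed to a group of authors.
# wanted = {"Alice Example", "Bob Example"}
# totals = blame_repo("/path/to/repo")
# print(sum(n for name, n in totals.items() if name in wanted))
```

Counting per author in one pass, rather than grepping for specific names, means the blame results can be reused for any group of authors without blaming again.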

Possibly a step up from that is first using log --name-status
--author=... to find just the files which have been touched by those
authors and then blaming only those files.
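
A rough sketch of that pre-filtering step follows, with the same caveats: the helper names `touched_paths` and `files_to_blame` are hypothetical, and it uses `--name-only` rather than `--name-status` simply because only the paths are needed. It assumes paths contain no newlines, and it will miss a file that the authors touched only under an earlier name, before a rename.

```python
import subprocess


def touched_paths(listing):
    """Return the set of non-empty lines (paths) in a newline-separated listing."""
    return {line for line in listing.split("\n") if line}


def _git(repo, *args):
    """Run git in `repo` with path quoting disabled and return its stdout."""
    return subprocess.run(
        ["git", "-c", "core.quotePath=false", *args],
        cwd=repo, capture_output=True, check=True,
        encoding="utf-8", errors="replace",
    ).stdout


def files_to_blame(repo, authors, rev="HEAD"):
    """List files present in `rev` that any of `authors` has ever touched.

    Several --author options are OR-ed together by git log, and the empty
    --format= suppresses the commit headers so only paths are printed.
    """
    author_args = ["--author=" + a for a in authors]
    log = _git(repo, "log", "--name-only", "--format=", *author_args, rev)
    tracked = _git(repo, "ls-tree", "-r", "--name-only", rev)
    # Files the authors touched may since have been deleted; keep only
    # those that still exist in the tree being blamed.
    return sorted(touched_paths(log) & touched_paths(tracked))
```

Only the files returned by `files_to_blame` would then be passed to git blame, which helps most when the authors of interest touched a small fraction of the tree.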

I guess the next step up would be parsing the diff hunks output by log
-p, but then you're basically re-implementing blame, I think.

Am I missing a clever solution?

j.

Messages in current thread:
How to efficiently blame an entire repo?, Jay Soffian, (Thu Apr 29, 4:12 pm)
Re: How to efficiently blame an entire repo?, Avery Pennarun, (Fri Apr 30, 12:45 pm)
Re: How to efficiently blame an entire repo?, Jay Soffian, (Fri Apr 30, 1:16 pm)
Re: How to efficiently blame an entire repo?, Jeff King, (Fri Apr 30, 2:21 pm)