Re: QGit: Shrink used memory with custom git log format

Previous thread: [PATCH 1/2] builtin-apply: rename "whitespace" variables and fix styles by Junio C Hamano on Friday, November 23, 2007 - 9:24 pm. (8 messages)

Next thread: Re: [RFC/PATCH] git-help: add new options -w (for web) and -i (for info) by Jakub Narebski on Saturday, November 24, 2007 - 1:52 am. (1 message)
From: Marco Costalba
Date: Saturday, November 24, 2007 - 1:14 am

Hi all,

   I have pushed a patch series to

git://git.kernel.org/pub/scm/qgit/qgit4.git

that changes the format of git log used to read data from a git repository.

Now, instead of --pretty=raw, a custom --pretty=format is given.
This shrinks the loaded data by 30% (17MB less on the Linux tree) and gives a
good speed-up when you are low on memory (especially on big repos).
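
To make the idea concrete, here is a minimal sketch of reading such a custom format. The format string, the Commit struct and the function names are assumptions made up for illustration; they are not the format qgit actually uses. The sketch assumes each field is terminated by a NUL byte (%x00) and that git puts a newline between records.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical format, NOT taken from qgit: sha, parents, author name,
// author timestamp and subject, each terminated by a NUL byte:
//   git log --pretty=format:%H%x00%P%x00%an%x00%at%x00%s%x00
struct Commit {
    std::string sha;
    std::vector<std::string> parents;
    std::string author;
    long long authorTime = 0;
    std::string subject;
};

constexpr std::size_t kFieldsPerCommit = 5;

// Split the buffer at NUL bytes; anything after the last NUL is ignored.
static std::vector<std::string> splitFields(const std::string& buf) {
    std::vector<std::string> out;
    std::size_t start = 0;
    for (;;) {
        std::size_t end = buf.find('\0', start);
        if (end == std::string::npos) break;
        out.push_back(buf.substr(start, end - start));
        start = end + 1;
    }
    return out;
}

std::vector<Commit> parseLog(const std::string& buf) {
    std::vector<std::string> fields = splitFields(buf);
    // This sketch assumes complete output, i.e. no truncated record.
    assert(fields.size() % kFieldsPerCommit == 0);
    std::vector<Commit> commits;
    for (std::size_t i = 0; i + kFieldsPerCommit <= fields.size();
         i += kFieldsPerCommit) {
        Commit c;
        c.sha = fields[i];
        // The newline between records ends up in front of the next sha.
        if (!c.sha.empty() && c.sha[0] == '\n') c.sha.erase(0, 1);
        std::istringstream parents(fields[i + 1]);
        for (std::string p; parents >> p;) c.parents.push_back(p);
        c.author = fields[i + 2];
        c.authorTime = std::strtoll(fields[i + 3].c_str(), nullptr, 10);
        c.subject = fields[i + 4];
        commits.push_back(c);
    }
    return commits;
}
```

Only the fields the viewer needs are kept in memory, which is where the saving over --pretty=raw would come from.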

Next step _would_ be to load log message body on demand (another 50%
reduction) but this has two drawbacks:

(1) Text search/filter on log message would be broken

(2) Slower to browse through revisions because for each revision an
additional git-rev-list/git-log command would have to be executed to read
the body

The second point is made worse by the fact that it is not possible to
keep a command running and "open", as is possible for example with
git-diff-tree --stdin, and feed it additional revision sha's when needed.
Avoiding the burden of starting a new process each time a log
message has to be read for a given sha would make the answer much quicker,
especially on lesser OS's.

There is indeed a git-rev-list --stdin option, but it behaves differently
from git-diff-tree --stdin and is not suitable for this.

Marco
-

From: Shawn O. Pearce
Date: Monday, November 26, 2007 - 6:52 pm

There was a proposed patch for git-cat-file that would let you run
it in a --stdin mode; the git-svn folks wanted this to speed up
fetching raw objects from the repository.  That may help as you
could get commit bodies (in raw format - not reencoded format!)
quite efficiently.
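
As a rough sketch of the reading side, assume the long-running process answers each object name with the framing that later git releases document for git cat-file --batch: a header line "<sha> <type> <size>", then <size> bytes of raw content, then a newline, or "<sha> missing" for an unknown object. The RawObject struct and readBatchObject function are names made up for this example, and spawning the process and writing sha's to its stdin are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical holder for one raw object read from the batch stream.
struct RawObject {
    std::string sha;
    std::string type;
    std::string data;
};

// Reads one response from the stream. Returns false on end of input,
// on a malformed header, or when the object is reported as missing.
bool readBatchObject(std::istream& in, RawObject& obj) {
    std::string header;
    if (!std::getline(in, header)) return false;

    std::istringstream hs(header);
    std::string second;
    if (!(hs >> obj.sha >> second)) return false;
    if (second == "missing") return false;

    std::size_t size = 0;
    if (!(hs >> size)) return false;
    obj.type = second;

    // The body is raw bytes, so read exactly <size> of them.
    obj.data.assign(size, '\0');
    if (size > 0) {
        if (!in.read(&obj.data[0], static_cast<std::streamsize>(size)))
            return false;
        assert(static_cast<std::size_t>(in.gcount()) == size);
    }

    // Each object is followed by a single newline.
    char lf = 0;
    if (!in.get(lf)) return false;
    return lf == '\n';
}
```

Because the size is announced up front, the reader never has to guess where a commit body ends, and one process can serve any number of requests.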

Otherwise I think what you really want here is a libgit that you can
link into your process and that can efficiently inflate an object
on demand for you.  Like the work Luiz did this past
summer for GSoC.  Lots of downsides to that current tree though...
like die() kills the GUI...

-- 
Shawn.
-

From: Johannes Schindelin
Date: Tuesday, November 27, 2007 - 3:48 am

Hi,


But then, die() calls die_routine, which you can override.  And C++ has 
this funny exception mechanism which just begs to be used here.  The only 
thing you need to add is a way to flush all singletons like the object 
array.
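
A minimal sketch of that control flow follows. Everything here is a stand-in written for illustration: die(), setDieRoutine() and readObject() only mimic the idea of a replaceable die routine and are not git's actual code. The sketch shows the control flow only and does nothing about memory or file handles held by the interrupted code.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Stand-in for the library side (hypothetical): a replaceable die routine.
using DieRoutine = void (*)(const char* fmt, va_list ap);

static void defaultDie(const char* fmt, va_list ap) {
    std::vfprintf(stderr, fmt, ap);
    std::fputc('\n', stderr);
    std::exit(128);  // this is what would take the GUI down
}

static DieRoutine dieRoutine = defaultDie;

void setDieRoutine(DieRoutine r) {
    assert(r != nullptr);
    dieRoutine = r;
}

void die(const char* fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    try {
        dieRoutine(fmt, ap);
    } catch (...) {
        va_end(ap);  // tidy up the argument list before the exception leaves
        throw;
    }
    va_end(ap);
    std::abort();  // a die routine is not supposed to return
}

// GUI side: turn a fatal error into an exception.
struct GitError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

static void throwingDie(const char* fmt, va_list ap) {
    char buf[1024];
    std::vsnprintf(buf, sizeof buf, fmt, ap);
    throw GitError(buf);
}

// Hypothetical library call that always fails.
void readObject(const char* sha) { die("unable to read %s", sha); }

// The caller survives the failure and gets the message instead.
std::string tryRead(const char* sha) {
    try {
        readObject(sha);
        return "ok";
    } catch (const GitError& e) {
        return e.what();
    }
}
```

A GUI would install throwingDie once at startup with setDieRoutine(throwingDie) and wrap each library call as tryRead does.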

Ciao,
Dscho

-

From: Marco Costalba
Date: Tuesday, November 27, 2007 - 5:36 am

On Nov 27, 2007 11:48 AM, Johannes Schindelin


I would think libgit is overkill for this.

You probably would not use libgit just to add a single feature, but to
completely change the interface with git, because the required work is
heavy on both the git side and the qgit side. You would probably want to
run the libgit-linked part in a separate thread to avoid GUI soft locks
during slow processing; now, because the executed git command is a
different process from qgit, the OS scheduler takes care of this 'for
free'.

Marco
-

From: Jan Hudec
Date: Tuesday, November 27, 2007 - 12:19 pm

Unfortunately, exceptions won't really work. Why? Because to use exceptions,
you need to have exception-safe code. That is, the code needs to free any
allocated resources when it's aborted by an exception. And git code is not
exception-safe. Given the lack of destructors in C, it means registering all
resource allocations in some kind of pool, so they can be freed en masse in
case of failure. Then you can also use longjmp for die (for C they really
behave the same).
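
The pool-plus-longjmp idea can be sketched as below. All names (poolAlloc, poolFreeAll, dieJump, runGuarded, failingOperation) are hypothetical and only illustrate the technique; nothing here is taken from git. Every allocation is linked into a global list, the die replacement jumps back to a saved point, and the caller frees the whole pool whether the operation finished or was aborted.

```cpp
#include <cassert>
#include <csetjmp>
#include <cstddef>
#include <cstdlib>

// Each allocation carries a header that links it into the pool.
struct alignas(std::max_align_t) Header {
    Header* next;
};

static Header* poolHead = nullptr;
static std::size_t poolCount = 0;

// Allocate n bytes and register the block in the pool.
void* poolAlloc(std::size_t n) {
    Header* h = static_cast<Header*>(std::malloc(sizeof(Header) + n));
    if (!h) return nullptr;
    h->next = poolHead;
    poolHead = h;
    ++poolCount;
    return h + 1;  // user memory starts right after the header
}

// Free everything that was registered, en masse.
void poolFreeAll() {
    while (poolHead) {
        Header* next = poolHead->next;
        std::free(poolHead);
        poolHead = next;
    }
    poolCount = 0;
}

static std::jmp_buf dieTarget;
static bool dieTargetSet = false;

// Replacement for die(): jump back to the guard instead of exiting.
[[noreturn]] void dieJump() {
    assert(dieTargetSet);
    std::longjmp(dieTarget, 1);
}

// Hypothetical library operation that allocates and then fails.
void failingOperation() {
    poolAlloc(64);
    poolAlloc(128);
    dieJump();
}

// Runs op; returns true if it finished, false if it died.
// The pool is empty afterwards in both cases.
bool runGuarded(void (*op)()) {
    dieTargetSet = true;
    if (setjmp(dieTarget) != 0) {
        poolFreeAll();
        dieTargetSet = false;
        return false;
    }
    op();
    poolFreeAll();
    dieTargetSet = false;
    return true;
}
```

Note that longjmp skips C++ destructors, so the code between the guard and the jump has to stay plain C in style, as it does here.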

-- 
						 Jan 'Bulb' Hudec <bulb@ucw.cz>
-

From: Johannes Schindelin
Date: Wednesday, November 28, 2007 - 5:01 am

Hi,


Sorry, I just assumed that you can read my mind (or alternatively remember 
what I suggested a few months ago, namely to "override" xmalloc(), 
xcalloc(), xrealloc() and xfree() (probably you need to create the 
latter)).

Ciao,
Dscho

-

From: jhud7196
Date: Wednesday, November 28, 2007 - 8:53 am

That sounds like the easiest (but not necessarily easy) direction towards
the goal. A thread-local or global pool (I don't think git is currently
reentrant anyway) would do. Also, file handles would have to be taken care
of, and everything checked for direct use of malloc, calloc, strdup and
other libc functions.

Then die could longjmp out to a specified buffer and could be safely
overridden to throw an exception for C++ apps.

--
                                         - Jan Hudec <bulb@ucw.cz>

-
