Re: Another bench on gitweb (also on gitweb caching)

To: Bruno Cesar Ribas <ribas@...>
Cc: <git@...>, Petr Baudis <pasky@...>
Date: Monday, February 11, 2008 - 8:44 pm

Bruno Cesar Ribas <ribas@c3sl.ufpr.br> writes:


Could you please not mix English and your native language
(Portuguese?) in the examples you show? Mixing two languages in one
identifier name (unless it is "ref" in br too) is especially bad
form... TIA.

Besides, what I'm more interested in is the script used to generate
those 1000 projects...
 

Are those the results of running gitweb as a standalone script, or of
your script running git-for-each-ref?

Besides, I'd rather see the results of running ApacheBench. On Linux it
usually comes with the installed Apache, and it is invoked as 'ab'.
Instead of adding artificial load, your tests could use concurrent
requests, and more than 1 request, to get a better average.
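
To make the "concurrent requests, more than 1 request" point concrete:
with ApacheBench this is roughly 'ab -n 100 -c 10 <url>' (100 requests,
10 at a time). Below is a minimal Python sketch of the same idea, not a
replacement for ab. The function names and the URL in the usage note are
made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def timed_call(fetch):
    """Run fetch() once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    fetch()
    return time.perf_counter() - start


def bench(fetch, requests=100, concurrency=10):
    """Call fetch() `requests` times, up to `concurrency` at once.

    Returns the request count plus mean and max time per request.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        timings = list(pool.map(lambda _: timed_call(fetch), range(requests)))
    return {"requests": len(timings), "mean": mean(timings), "max": max(timings)}
```

Usage would be something like the following, where the URL is a
placeholder for wherever gitweb is installed:

    from urllib.request import urlopen
    bench(lambda: urlopen("http://localhost/gitweb.cgi").read())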
 

Below are my thoughts about caching information for gitweb:

First, the basis of any optimisation is finding the bottlenecks.
I think it was posted here some time ago that the pages causing the
most load are the projects list and the feeds.

Kernel.org even runs a modified version of gitweb, with some caching
support; Cgit (a git web interface in C) also has caching support.


Because gitweb outputs relative times on the projects list page and
the project summary page, it is unfortunately not easy to simply cache
the HTML output: one would have to either give up relative times, or
rewrite them from relative to absolute, either on the server (in
gitweb) or on the client (in JavaScript). So perhaps it would be better
to cache the generated (costly to obtain) information instead, like
the last-changed time for projects.

Or we can, for example, assume (i.e. do that if the appropriate gitweb
feature is set) that projects are bare repositories that are pushed to,
and that git-update-server-info is run on repository update (for
example for the HTTP transport), and stat $GIT_DIR/info/refs and/or
$GIT_DIR/objects/info/packs instead of running git-for-each-ref.
Of course the column would then be called something like "Last Update"
instead of "Last Change".
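
A minimal sketch of the stat idea, in Python rather than gitweb's Perl
just to keep it short. It assumes git-update-server-info has rewritten
one or both files on the last push; the function name is made up.

```python
import os


def last_update(git_dir):
    """Return the newest mtime (seconds since the epoch) of the files
    that git-update-server-info maintains, or None if neither exists."""
    candidates = [
        os.path.join(git_dir, "info", "refs"),
        os.path.join(git_dir, "objects", "info", "packs"),
    ]
    mtimes = []
    for path in candidates:
        try:
            mtimes.append(os.stat(path).st_mtime)
        except OSError:
            # File missing or unreadable: update-server-info was
            # probably never run for this repository.
            pass
    return max(mtimes) if mtimes else None
```

This costs at most two stat calls per project and spawns no git process,
but it reports when the repository was last pushed to, not the date of
the newest commit.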

The "Last Update" information is especially easy because it can be
invalidated / updated externally, by the update / post-receive hook,
outside gitweb. So gitweb doesn't need to implement a cache
invalidation mechanism for this.

We can store the lastref / lastchange information in the repository
config, for example under a "gitweb.lastref" key. We can store it in a
gitweb-wide config, for example in a $projectroot/gitwebconfig file,
under a "gitweb.<project>.lastref" key. Or we can store it as a hash
initializer in some sourced Perl file, read from gitweb_config.perl
(this I think can be done even now without touching gitweb code at
all); we can use Data::Dumper to save such information.
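
For the hash-initializer variant, the file does not have to be written
by Perl. Below is a rough Python sketch of what a hook could generate.
The variable name %lastchange, the function names and the project names
are all made up; gitweb itself knows nothing about them, so
gitweb_config.perl would have to source the file and use the hash.

```python
def perl_quote(text):
    """Quote text as a Perl single-quoted string literal."""
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def render_perl_hash(name, mapping):
    """Render {project: epoch_seconds} as a Perl hash initializer.

    The trailing "1;" makes the file return a true value when sourced.
    """
    lines = [f"our %{name} = ("]
    for key in sorted(mapping):
        lines.append(f"    {perl_quote(key)} => {int(mapping[key])},")
    lines.append(");")
    lines.append("1;")
    return "\n".join(lines) + "\n"
```

A hook would write the returned text to a temporary file and rename it
over the old one, so that gitweb never reads a half-written file.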

The possibilities are many.

-- 
Jakub Narebski
Poland
ShadeHawk on #git