>>> I posted separately about those. And I've been mulling about whether

I think I agree with you, but not as strongly. Certainly, having any kind of effective caching (heck, just comparing the timestamp of the relevant ref(s) with the If-Modified-Since: header) will help kernel.org enormously.

But as soon as there's a push, particularly a release push, that invalidates *all* of the popular pages *and* the thundering herd arrives. The result is that all of the popular "what's new?" summary pages get fetched 15 times in parallel and, because the front end doesn't serialize them, populating the caches can be a painful process involving a lot of repeated work.

I tend to agree that for the basic project summary pages, generating them preemptively as static pages out of the push script seems best. ("find /usr/src/linux -type d -print | wc -l" is 1492. Dear me. Oh! There is no per-directory shortlog page; that simplifies things. But there *should* be.)

The only tricky thing is the "n minutes/hours/days ago" timestamps. Basically, you want to generate a half-formatted, indefinitely-cacheable page that contains them as absolute timestamps, and have a system for regenerating the fully-formatted page from that (and the current time).

The ideas that people have been posting seem excellent. Give a page two timeouts. If a GET arrives before the first timeout, and no prerequisites have changed, it's served directly from cache. If it arrives after the second timeout, or the prerequisites have changed, it blocks until the page is regenerated. But if it arrives between those two times, it serves the stale data and starts generating fresh data in the background.

So for the fully-formed timestamps, the first timeout is when the next human-readable timestamp on the page ticks over. But the second timeout can be past that by, say, 5% of the timeout value. It's okay to display "3 hours ago" until 12 minutes past the 4 hour mark.
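
To make the two-timeout rule concrete, here is a rough sketch in Python. The names (`Action`, `decide`, `deadlines`, `slack_percent`) are made up for illustration and are not gitweb code. It also assumes that a prerequisite change always forces a blocking regeneration, which is only one reading of the rule.

```python
from enum import Enum


class Action(Enum):
    SERVE_CACHED = "serve directly from cache"
    SERVE_STALE_AND_REFRESH = "serve stale page, regenerate in background"
    BLOCK_AND_REGENERATE = "block until the page is regenerated"


def deadlines(event_time, next_tick_age, slack_percent=5):
    """Return (first_timeout, second_timeout) in seconds.

    event_time     -- absolute time of the commit shown on the page
    next_tick_age  -- age at which its "n hours ago" label next changes
    slack_percent  -- how far past the tick stale output is tolerated
                      (5 is the example figure; an assumption, not a rule)
    """
    first = event_time + next_tick_age
    second = first + next_tick_age * slack_percent // 100
    return first, second


def decide(now, first_timeout, second_timeout, prereqs_changed):
    """Choose how to answer a GET arriving at time `now`."""
    # Assumption: changed prerequisites (e.g. a moved ref) always block.
    if prereqs_changed or now >= second_timeout:
        return Action.BLOCK_AND_REGENERATE
    if now < first_timeout:
        return Action.SERVE_CACHED
    return Action.SERVE_STALE_AND_REFRESH


# A commit labelled "3 hours ago" ticks over to "4 hours ago" at 4 hours.
first, second = deadlines(event_time=0, next_tick_age=4 * 3600)
print(first, second)                               # 14400 15120
print(decide(4 * 3600 + 600, first, second, False).value)
```

With these numbers the second timeout is 720 seconds (12 minutes) past the 4 hour mark, so a request at 4h10m still gets the "3 hours ago" page while a fresh one is built behind it.
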
It might be okay to allow even the prerequisites to be slightly stale when serving old data; it's okay if it takes 30 seconds for the kernel.org web page to notice that Linus pushed. But on my office gitweb, I'm not sure that it's okay to take 30 seconds to notice that *I* just pushed.

(I'm also not sure about consistency issues. If I link from one page that shows the new release to another, it would be a bit disconcerting if it disappeared.)

The nasty problem with built-in caching is that you need a whole cache reclaim infrastructure; it would be so much nicer to let Squid deal with that whole mess. But it can't deal with anything other than fully rendered HTML.
