Re: Thoughts about memory requirements in traversals [Was: Re: [RFC] Submodules in GIT]

From: Linus Torvalds
Date: Saturday, December 2, 2006 - 7:25 pm

On Sun, 3 Dec 2006, Josef Weidendorfer wrote:

You're missing the big issue.

The issue is that a cache like that would ABSOLUTELY SUCK.

You could speed up the non-common operations with it, but:

 - any changes would become a LOT more expensive to do, because they all 
   need to update every single object they add (ie a "commit" would now 
   have to add backpointers TO EVERY SINGLE BLOB).

   Imagine what this does to something like the kernel, where a commit 
   reaches 22,000 files!

   You can do it at a finer granularity (ie do just the direct backlinks 
   and only do the "tree->blob" and "tree->tree" things rather than the 
   full commit reachability), but it's still going to be MUCH more painful 
   than what we do now.

 - the cache would be a lot bigger than the current pack-files, and it 
   would be fragile as hell to boot. Because it needs to get rewritten for 
   every operation, it gets corrupted much more easily, and that's 
   ignoring things like race conditions, so it would now need a ton of 
   locking that git simply doesn't do at all.

 - everything would basically slow down.

 - you couldn't do shared object databases AT ALL, because backpointers 
   wouldn't work. The whole _reason_ you can share object databases is the 
   same reason we can't have backpointers: objects are immutable and never 
   change depending on circumstances.
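
The write-cost argument in the list above can be sketched with a toy model.
This is only an illustration, assuming a flat one-level tree and a made-up
text format for trees and commits; `put`, `commit` and `add_backpointers`
are hypothetical helpers, not git functions. Only the object naming (SHA-1
over "<type> <size>\0" followed by the content) follows what git itself does.

```python
import hashlib


def object_id(kind, payload):
    """Name an object by hashing its type, length and content."""
    header = f"{kind} {len(payload)}\0".encode()
    return hashlib.sha1(header + payload).hexdigest()


def put(store, kind, payload):
    """Store an object if absent. Returns (id, number of objects written)."""
    oid = object_id(kind, payload)
    if oid in store:
        return oid, 0          # immutable: identical content is never rewritten
    store[oid] = payload
    return oid, 1


def commit(store, files, parent=""):
    """Write one commit over a flat tree (toy format, not git's real one)."""
    written = 0
    blob_ids = []
    lines = []
    for name in sorted(files):
        bid, n = put(store, "blob", files[name])
        written += n
        blob_ids.append(bid)
        lines.append(f"{name} {bid}")
    tid, n = put(store, "tree", "\n".join(lines).encode())
    written += n
    cid, n = put(store, "commit", f"tree {tid}\nparent {parent}\n".encode())
    written += n
    return cid, blob_ids, written


def add_backpointers(backrefs, commit_id, blob_ids):
    """Hypothetical cache: record, for every blob, each commit that reaches it."""
    for bid in blob_ids:
        backrefs.setdefault(bid, set()).add(commit_id)
    return len(blob_ids)       # one cache update per reachable blob


store, backrefs = {}, {}
files = {f"file{i}.c": f"contents {i}\n".encode() for i in range(22000)}
c1, blobs1, w1 = commit(store, files)
add_backpointers(backrefs, c1, blobs1)

files["file0.c"] = b"one-line fix\n"
c2, blobs2, w2 = commit(store, files, parent=c1)
updates = add_backpointers(backrefs, c2, blobs2)

print(w2)        # 3: new blob, new tree, new commit
print(updates)   # 22000: one backpointer update per blob the commit reaches
```

With forward pointers only, a one-file change writes three objects in this
model. The hypothetical backpointer cache has to touch an entry for each of
the 22,000 blobs the commit reaches. And because `store` is keyed purely by
content, two repositories could share it as-is, whereas `backrefs` depends on
which commits each repository happens to have.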

The _only_ downside of the current situation is literally the 24 or 28 
bytes per object that we look at. For most operations, we don't even look 
at that many objects, so it's really the worst-case things.


Right. If the project is totally read-only, the cache would work well.

For real development, it would SUCK. It would make things like "git reset" 
very expensive indeed, for example (you'd have to unwind the whole cache: 
either regenerating it - which would take minutes - or being very careful 
indeed and being able to always remove objects properly and keeping track 
of them 100%).

IOW, it's nasty nasty nasty. And it doesn't really even help anything but 
a case that we actually already handle really well (I spent a lot of 
effort on making the memory footprint minimal).

But it does mean that you do NOT want to traverse a hundred different 
projects "as if" they were one. That's really the only thing it means.
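
A rough sketch of where that per-object cost comes from, assuming a walk
that has to remember every object it has already seen. The graph, the names
in it, and the one-million-objects-per-project figure are made up for
illustration; the 28 bytes is the figure quoted in this mail.

```python
BYTES_PER_OBJECT = 28          # per-object figure from the text (24 or 28)


def reachable(graph, roots):
    """Walk a DAG of object ids, visiting each object exactly once."""
    seen = set()               # kept for the whole walk: shared objects recur
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(graph.get(oid, ()))
    return seen


def traversal_overhead(graph, roots):
    """Bytes of bookkeeping held until the traversal finishes."""
    return len(reachable(graph, roots)) * BYTES_PER_OBJECT


# Two commits whose trees share blob_a: six distinct objects in total.
graph = {
    "commit2": ["commit1", "tree2"],
    "commit1": ["tree1"],
    "tree2": ["blob_a", "blob_b"],
    "tree1": ["blob_a"],
}
print(traversal_overhead(graph, ["commit2"]))   # 168

# Hypothetical sizes: one project versus a hundred walked "as if" one.
objects_per_project = 1_000_000
print(objects_per_project * BYTES_PER_OBJECT)         # 28000000
print(100 * objects_per_project * BYTES_PER_OBJECT)   # 2800000000
```

The `seen` set cannot be emptied partway through, since dropping an entry
would make the walk revisit anything reachable by a second path. Walking
each project on its own keeps the peak at one project's worth of entries.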

And since you can do submodules as independent projects, and you SHOULD do 
them that way for tons of other reasons _anyway_, even that isn't a reason 
to screw up all the _wonderful_ properties of the git object database.

So what I'm trying to say is that the immutable non-backpointer nature of 
the git database is what makes it so WONDERFUL. It's efficient, it's 
dense, it's stable, and it allows us all the clever things we do. But it 
means that we do end up always spending 28 bytes per object, and we can 
never throw those 28 bytes away during a single "traversal" run.

		Linus