login
Header Space

 
 

Re: [PATCH] Change 'Deltifying objects' to 'Delta compressing objects'

Previous thread: Re: git on afs (fwd) by Brandon Casey on Thursday, October 18, 2007 - 6:48 pm. (1 message)

Next thread: [PATCH 1/2] Documentation/git-gc: explain --auto in description by Jeff King on Thursday, October 18, 2007 - 10:01 pm. (1 message)
To: <git@...>
Cc: Nicolas Pitre <nico@...>
Date: Thursday, October 18, 2007 - 8:45 pm

Recently I was referred to the Grammar Police as the git-pack-objects
progress message 'Deltifying %u objects' is considered to be not
proper English to at least some small but vocal segment of the
English speaking population.  Techncially we are applying delta
compression to these objects at this stage, so the new term is
slightly more acceptable to the Grammar Police but is also just
as correct.

Signed-off-by: Shawn O. Pearce &lt;spearce@spearce.org&gt;
---

 To go on top of your 'np/progress' series that is currently in next.
 I think it might keep us out of trouble with certain individuals...

 builtin-pack-objects.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 15d3750..f79fc0b 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -1718,7 +1718,7 @@ static void prepare_pack(int window, int depth)
 	if (nr_deltas &amp;&amp; n &gt; 1) {
 		unsigned nr_done = 0;
 		if (progress)
-			start_progress(&amp;progress_state, "Deltifying objects",
+			start_progress(&amp;progress_state, "Delta compressing objects",
 					nr_deltas);
 		qsort(delta_list, n, sizeof(*delta_list), type_size_sort);
 		ll_find_deltas(delta_list, n, window+1, depth, &amp;nr_done);
-- 
1.5.3.4.1249.g895be
-
To: Shawn O. Pearce <spearce@...>
Cc: <git@...>, Nicolas Pitre <nico@...>
Date: Thursday, October 18, 2007 - 10:12 pm

Boo. I _like_ "deltifying". Sure, it's probably not in the dictionary,
but that's how languages change: saying "delta compressing" all the time
will get awkward, so people invent a new word using existing rules to
explain a common phenomenon.

Anyway, if you want to please the Grammar Police, should it not be
"Delta-compressing"?  "Delta" is not an adverb here, but rather the
phrase acts as a compound verb (i.e., the two words work in place of a
single verb). Although "Delta-compressing objects" just looks stupid.

-Peff
-
To: Jeff King <peff@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>
Date: Thursday, October 18, 2007 - 10:45 pm

Yet that progress display isn't solely about "delta compressing".  It 
also includes the search for best object match in order to keep the 
smallest delta possible.

I'd like to add my own boo, but since I'm not a native speaker I won't 
argue too much.


Nicolas
-
To: Nicolas Pitre <nico@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>
Date: Thursday, October 18, 2007 - 10:59 pm

In fact, isn't that progress meter _solely_ about finding the best
matches? The actual deltification (that is, the creation of the deltas
and writing of them to the packfile happens during the writing phase --
unless, of course, we've cached the deltas during the search phase).

Perhaps one of:

  Finding deltas
  Finding delta candidates
  Matching objects

or something similar (though I don't especially like any of them, I
think you get the idea).

-Peff
-
To: Jeff King <peff@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:11 pm

Well, to find which combination is the smallest you actually have to 

I think we might sidestep the issue entirely by remaining somewhat vague 
and simply saying "compressing objects" for that phase.  This is the 
part responsible for the reduction of a Git repository from 3GB down to 
200MB anyway.


Nicolas
-
To: Nicolas Pitre <nico@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:15 pm

OK. I liked it a little more specific, but perhaps users really don't
care what's going on. And it seems there is some support of simply
"compressing", so that's probably reasonable.

I think that is going to make the statistics line doubly confusing,
though, since we never even use the word "delta" (or any wacky
verbifications based on it), and then they get a line about numbers of
new and reused deltas.

-Peff
-
To: Jeff King <peff@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:27 pm

Well, at least the statistics are real, and we're not inventing anything 
here.


Nicolas
-
To: Jeff King <peff@...>
Cc: <git@...>, Nicolas Pitre <nico@...>
Date: Thursday, October 18, 2007 - 10:21 pm

My eyes have gotten used to "Deltifying" but I have to admit that
in my early Git days I thought it looked damn odd.  Today I'm far
too familiar with Git to really notice this as a problem now.

I really don't have an opinion either way.  Actually I think the
entire progress system of git-pack-objects should be reduced even
further so that users aren't exposed to the internal phases we
are going through.  Most of them just don't care.  They just

Yea, that is just stupid.  And is why I didn't use a - in my patch.


*1*: Yes, yes, I know that's hard to predict and impossible in
the really important cases (like network transfer of not yet
compressed data).  So just telling them that we are working and
have recently done something useful towards finishing our goal is
more than enough.  Its like putting mirrors on the wall in a lobby;
folks won't notice how long it takes for the elevator to arrive as
they are too busy preening themselves.

-- 
Shawn.
-
To: Shawn O. Pearce <spearce@...>
Cc: Jeff King <peff@...>, <git@...>
Date: Thursday, October 18, 2007 - 10:49 pm

Well, with my latest patches in that area, the typical progress on 
screen has been cut in half.  And the different phases are intertaining.


Nicolas
-
To: Nicolas Pitre <nico@...>
Cc: Jeff King <peff@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:17 pm

Yup.  Your patches were a big improvement.  But I'm now sitting here
wondering if we shouldn't just allow a progress meter to overwrite
the prior one.  Then you only see the current task and progress,
or the final output if we have nothing further to say about that.

Hmmph.  Maybe something like this:

I like how it comes out in the end, but it really screws with the
sideband protocol.  Like horribly.  I got a whole ton of "remote:
remote: remote: remote: remote: remote:" during a remote clone.

diff --git a/progress.c b/progress.c
index 7629e05..099fc14 100644
--- a/progress.c
+++ b/progress.c
@@ -37,8 +37,6 @@ static void clear_progress_signal(void)
 
 static int display(struct progress *progress, unsigned n, int done)
 {
-	char *eol;
-
 	if (progress-&gt;delay) {
 		if (!progress_update || --progress-&gt;delay)
 			return 0;
@@ -55,18 +53,26 @@ static int display(struct progress *progress, unsigned n, int done)
 	}
 
 	progress-&gt;last_value = n;
-	eol = done ? ", done.   \n" : "   \r";
-	if (progress-&gt;total) {
+	if (done) {
+		size_t c = strlen(progress-&gt;title);
+		while (c--)
+			fputc(' ', stderr);
+		fputs("            ", stderr);
+		for (; n &gt; 0; n /= 10)
+			fputs("  ", stderr);
+		fputc('\r', stderr);
+		return 1;
+	} else if (progress-&gt;total) {
 		unsigned percent = n * 100 / progress-&gt;total;
 		if (percent != progress-&gt;last_percent || progress_update) {
 			progress-&gt;last_percent = percent;
 			fprintf(stderr, "%s: %3u%% (%u/%u)%s", progress-&gt;title,
-				percent, n, progress-&gt;total, eol);
+				percent, n, progress-&gt;total, "   \r");
 			progress_update = 0;
 			return 1;
 		}
 	} else if (progress_update) {
-		fprintf(stderr, "%s: %u%s", progress-&gt;title, n, eol);
+		fprintf(stderr, "%s: %u%s", progress-&gt;title, n, "   \r");
 		progress_update = 0;
 		return 1;
 	}

-- 
Shawn.
-
To: Shawn O. Pearce <spearce@...>
Cc: Jeff King <peff@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:38 pm

And then you've lost some diagnostic clue (the absolute numbers) about 
the actual number of objects that were listed for "deltification" for 
example.

And imagine that you see the progress moving slowly because the remote 
server is a NSLU2, but it says 80%.  Then you go for a coffee and the 
progress says 20% when you return because it now has moved to a 
different phase.  Rather counter intuitive.


Nicolas
-
To: Nicolas Pitre <nico@...>
Cc: Jeff King <peff@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:45 pm

Leave the "Total" line.  Add to it the number of objects we had to

Yea, I didn't consider that.  That's where you need to show the
number of steps and which one you are on, so the meter looks
more like:

	Step 1/3: Counting objects: .... \r
	Step 2/4: Compressing objects: ... \r
	Step 3/3: Writing objects: .... \r

only all smashed into one line of course, so only the most recent
one is being displayed.

-- 
Shawn.
-
To: Shawn O. Pearce <spearce@...>
Cc: Nicolas Pitre <nico@...>, Jeff King <peff@...>, <git@...>
Date: Friday, October 19, 2007 - 7:44 am

Looks good, although as Nicholas pointed out you might not know in  
advanced how many steps there are.

So I still think that Nicholas' original point is valid is here, and  
if you overwrite old progress bars with new ones then you may falsely  
give the impression of "going backwards".

Cheers,
Wincent



-
To: Shawn O. Pearce <spearce@...>
Cc: Jeff King <peff@...>, <git@...>
Date: Friday, October 19, 2007 - 12:02 am

Yet you might not know in advance how many steps there'll be.  You might 
or might not have the deltification phase (I simply can't let that term 
go...), pack indexing also have 1 or 2 steps, and if objects are 
unpacked instead then you have only one step.

Given the asynchronous nature of the sideband messages, I think that 
could only create messed up displays.  Some messages are terminated with 
\n and others with \r.


Nicolas
-
To: Nicolas Pitre <nico@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:41 pm

Obligatory me too: I totally agree with Nicolas here.

-Peff
-
To: Shawn O. Pearce <spearce@...>
Cc: <git@...>, Nicolas Pitre <nico@...>
Date: Thursday, October 18, 2007 - 10:34 pm

OK, I will confess I found it a little odd at first, but I think it's a
straightforward and playful extension of the language, which is
something I like. But you know, we have the corporate git customers to

On a similar note, some complaints with progress meters, even after
recent patches:
  - When fetching, one progress meter says "Indexing" which, while
    technically true, is almost certainly blocking on "Downloading". In
    fact, it is not clear from the existing messages exactly _when_ we
    are downloading, and when we are just computing, which is something
    I think a user might want to know. Objections to changing this
    (though perhaps index-pack will need to be told when it is
    downloading and when it is just indexing)? Objections to a
    throughput indicator?
  - Running git-gc, we now get something like:
      Counting objects: 62317, done.
      Deltifying objects: 100% (18042/18042), done.
      Writing objects: 100% (62317/62317), done.
      Total 62317 (delta 43861), reused 61404 (delta 43036)
      Pack pack-32f8ac40c1a5ec146e45c657cb16f53fdd354095 created.
      Removing unused objects 100%...
      Done.
    Can we get rid of total statistics (I think this is useful for some
    power users, but perhaps there should be a verbosity level), the
    name of the pack file (same deal), and the totally useless "Done."?

-Peff
-
To: Jeff King <peff@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:01 pm

Agreed for the pack name.  Certainly no one cares.

Maybe the "Removing unused objects" should use the common progress 
infrastructure?  It could even use the delayed interface, just like when 
checking out files, so no progress at all is displayed when that 
operation completes within a certain delay.  And the removal of unused 
objects is usually quick.

But I like the statistics.  They might be pretty handy to diagnoze 
performance issues on remote servers for example.


Nicolas
-
To: Nicolas Pitre <nico@...>, Jeff King <peff@...>
Cc: <git@...>
Date: Friday, October 19, 2007 - 1:03 am

Discussion on the list tonight came to the conclusion that showing
the name of the packfile we just created during git-repack is not
a very useful message for any end-user.  For the really technical
folk who need to have the name of the newest packfile they can use
something such as `ls -t .git/objects/pack | head -2` to find the
most recently created packfile.

Signed-off-by: Shawn O. Pearce &lt;spearce@spearce.org&gt;
---

 Nicolas Pitre &lt;nico@cam.org&gt; wrote:
 &gt; On Thu, 18 Oct 2007, Jeff King wrote:
 &gt; &gt;     Can we get rid of total statistics (I think this is useful for some
 &gt; &gt;     power users, but perhaps there should be a verbosity level), the
 &gt; &gt;     name of the pack file (same deal), and the totally useless "Done."?
 &gt;
 &gt; Agreed for the pack name.  Certainly no one cares.

 This makes it so.

 git-repack.sh |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/git-repack.sh b/git-repack.sh
index e72adc4..7220635 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -83,9 +83,6 @@ for name in $names ; do
 	fullbases="$fullbases pack-$name"
 	chmod a-w "$PACKTMP-$name.pack"
 	chmod a-w "$PACKTMP-$name.idx"
-	if test "$quiet" != '-q'; then
-	    echo "Pack pack-$name created."
-	fi
 	mkdir -p "$PACKDIR" || exit
 
 	for sfx in pack idx
-- 
1.5.3.4.1249.g895be
-
To: Nicolas Pitre <nico@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:07 pm

Are you volunteering (I think you know the progress code best)?

They are by far the most useful of the three lines I mentioned, but I
just wonder if they are a bit meaningless and cluttery for light users.
We can always cut the others and see how it looks.

-Peff
-
To: Jeff King <peff@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:24 pm

Frankly, I think effort should be spent on the refs update display at 
this point.  Something that looks like:

* refs/heads/origin: fast forward to branch 'master' of git://gi
t.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
  old..new: 66ffb04..4fa4d23

[ note that I arbitrarily cut the long line before the 80th column to 
  show the effect within an email ]

You usually get long lines that gets wrapped, so that means 3 lines of 
screen space for one updated branches.  Is the "66ffb04..4fa4d23" 
information really useful?  Might someone ever care?


Nicolas
-
To: Nicolas Pitre <nico@...>
Cc: Jeff King <peff@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:33 pm

The reason its formatted the way it is today is someone can grab
that line into a copy-paste buffer and throw it onto a "git-log"
or "gitk" command line with ease to see what new stuff has come in.

Me, I just use the reflog if I care (`origin@{1}..origin`) to see
what a fetch changed in the tracking branch.

However I *don't* need the remote branch name or the remote URL,
especially if we are storing it into a tracking branch.  That's most
likely coming from a configured remote that the user fetches from
frequently.  I don't think about Linus' URL, I think about the fact
that in my linux-2.6 repository his directory is my origin remote.

Maybe something like this would be more useful:

 * origin: fast-forwarded: 66ffb04..4fa4d23

Or if you are using refs/remotes style tracking branches:

 * origin/master: fast-forwarded: 66ffb04..4fa4d23
 * origin/pu: forcing update: 66ffb04..4fa4d23

Too terse?  Yea, probably.  But it is a whole lot shorter.


My other pet peeve here is the display from send-pack and
receive-pack when you push a ref.  Hello?

  updating 'refs/heads/master'
    from de61e42b539ffbd28d2a2ba827bb0eb79767057b
    to   d7e56dbc4f60f6bd238e8612783541d89f006fb7
  ...
  refs/heads/master: de61e42b539ffbd28d2a2ba827bb0eb79767057b -&gt; d7e56dbc4f60f6bd238e8612783541d89f006fb7

That's like 4 too many SHA-1 strings for the average user.

-- 
Shawn.
-
To: Nicolas Pitre <nico@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:32 pm

I have used it occasionally when tracking repos to see what new commits
have happened. Usually I use a separate branch to mark "what I've seen"
(i.e., fetch, gitk origin..master, pull), but if it's a branch that I'm
not actively tracking, the display is useful.

What is really useless in that line is the fact that _every_ ref is
going to have the name of the remote, even though we only support
fetching from one remote at a time. Perhaps something like:

Fetching from git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
 * refs/heads/origin: fast forward to branch 'master'

although that URL is almost a line by itself. :)

Also, why do we abbreviate "refs/heads/master" from the remote, but we
don't abbreviate refs/heads/origin for the local? Maybe something like:

  * local heads/origin -&gt; remote heads/master (fast forward)

or for separate remote

  * local remotes/origin/master -&gt; remote heads/master (fast forward)

Thoughts?

-Peff
-
To: Jeff King <peff@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:50 pm

Maybe we should have a shortcut notation for &lt;ref&gt;@{1}..&lt;ref&gt; instead?
I end up using that all the time since the fetch result has long 
scrolled off the screen when I want to look at what was fetched.

Usully this llooks like:

	git pull
	git log @{1}..



That looks fine for a push.  I'd say "remote foo -&gt; local bar" for a 
fetch.


Nicolas
-
To: Nicolas Pitre <nico@...>
Cc: Jeff King <peff@...>, <git@...>
Date: Friday, October 19, 2007 - 1:09 am

The fetch side is easy to change on top of the db/fetch-pack series
that is currently in next.

What's hard is the remote side.  receive-pack on the remote side
doesn't have the local's refname but it prints its own message upon
a successful update.

-- 
Shawn.
-
To: Nicolas Pitre <nico@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:56 pm

I must confess to never using reflogs in that case...for some reason
they just never come to mind. But now that you and Shawn mention them,
there's really no reason to leave the hash..hash for the fetch.

As for a shortcut notation, what about allowing '..' notation inside a
reflog. I.e., &lt;ref&gt;@{a..b} is the same as &lt;ref&gt;@{a}..&lt;ref&gt;@{b}? So you
could perhaps do origin/master@{1..}?

I'm not sure if there are syntactic issues with parsing out the '..' (or

Yes, I was tempted to suggest that. I think it might need some context,

Agreed.

-Peff
-
To: Jeff King <peff@...>
Cc: Nicolas Pitre <nico@...>, Shawn O. Pearce <spearce@...>, <git@...>
Date: Friday, October 19, 2007 - 12:21 am

I'd love it, but the way our current SHA1 parser works, I don't think it 
can really do it.

Basically, we currently assume that a SHA1 expression always expands to a 
*single* SHA1.

And then, on top of that SHA1 expression parser, we then have a totally 
separate logic (which is *not* part of the SHA1 expression parser itself) 
that handles the "a..b" and "a...b" cases.

In other words, all the magic "head@{xyz}" logic is all in sha1_name.c, 
but that never handles any ranges at all.

And then the range handling is a separate thing in revision.c and 
builtin-rev-parse.c.

So I think your syntax is wonderful, but it would require moving the 
complex range handling into sha1_name.c, and would also require that file 
to get a more complex interface (right now all the sha1_name.c routines 
just take the "fill in this one SHA1 hash" approach, but ".." and "..." 
means that you have multiple objects *and* you can mark one of them as 

See above: I don't think we have syntax problems per se: it's just that 
right now the "complex" parser (the one that knows about ^, ~, and @{x} 
etc) simply cannot do anything but a single SHA1 due to internal interface 
issues.

		Linus
-
To: Linus Torvalds <torvalds@...>
Cc: Nicolas Pitre <nico@...>, Shawn O. Pearce <spearce@...>, <git@...>
Date: Friday, October 19, 2007 - 12:29 am

Ah, right. I hadn't thought of that. While it would be a nice
convenience feature, it's probably not worth the deep internal hackery
that would be required.

-Peff
-
To: Jeff King <peff@...>
Cc: Linus Torvalds <torvalds@...>, Shawn O. Pearce <spearce@...>, <git@...>
Date: Friday, October 19, 2007 - 7:58 am

What about a preprocessor that could match &lt;1&gt;@{&lt;2&gt;..&lt;3&gt;} in the 
argument list and substitute that with &lt;1&gt;@{&lt;2&gt;}..&lt;1&gt;@{&lt;3&gt;} before it is 
actually parsed?


Nicolas
-
To: Nicolas Pitre <nico@...>
Cc: Jeff King <peff@...>, Shawn O. Pearce <spearce@...>, <git@...>
Date: Friday, October 19, 2007 - 3:00 pm

Could be done, but I almost think it would be better to just make the 
sha1_name.c interfaces take some extended data structure which allows 
looking up multiple SHA1's at the same time.

Sure, we'd leave the "simple" interfaces around for users that simply want 
just one SHA1 value, but that would allow us to remove duplicate code from 
revision.c and rev-parse.c. And it would allow us to generally make the 
SHA1 parsing fancier: there may well be other expressions that are worth 
doing.

		Linus
-
To: Nicolas Pitre <nico@...>
Cc: Jeff King <peff@...>, Shawn O. Pearce <spearce@...>, <git@...>
Date: Friday, October 19, 2007 - 5:57 pm

Btw, I knew I had wanted this in the past, but I had forgotten why. I now 
remembered.

The thing is, sometimes you want an expression for "all the parents of X". 
We don't have that, and again, the current internal implementation of 
sha1_name.c makes it essentially impossible to do within that interface 
(ie you can do it on *top* of that interface in revision.c, but it cannot 
be a general SHA1 expression).

So wouldn't it be nice to have a "commit^*" expression to go with 
"commit^" and "commit^2" etc? One that would name all the parents. It's 
useful, for example, for saying that you still want to see that commit, 
but not any of its parents:

	git log commit^*..

could basically work to show that commit (and all subsequent commits), but 
not the commits leading up to it. Right now, you can't easily say that in 
the git "sha1 expression algebra".

There are some other cases where you'd like to have things expand to more 
than one commit. We currently do those with special flags, like --all, but 
in many ways it would be really nice to be able to do SHA1 operations on 
them. If we were to make the SHA1 arithmetic able to handle multiple 
SHA1's (instead of just one), I could see us being able to do things like

	git diff {master,pu}:Makefile

as a way of saying

	git diff master:Makefile pu:Makefile

which already works - simply because we could make the ":name" be able to 
operate on multiple commit SHA1's and turn them into multiple blob (or 
tree) SHA1's.

(The above may not sound very useful, but

	git diff {ORIG_HEAD...MERGE_HEAD}:file

would essentially expand to "base version of file", ORIG_HEAD:file and 
MERGE_HEAD:file, and we could fairly easily teach diff to do a nice 
three-way diff for things like this! So it does have potential to be a 
reasonably powerful model)

			Linus
-
To: Jeff King <peff@...>
Cc: <git@...>, Nicolas Pitre <nico@...>
Date: Thursday, October 18, 2007 - 10:47 pm

Yes!  I agree entirely.  This is actually not very difficult.

I think the only time we run `git-index-pack --stdin` is from within
git-fetch-pack and git-receive-pack.  These are the only two points
where index-pack's stdin is attached to a network socket and not
to a file.  Its also where you'd want this to say "Transferring",
"Uploading" or "Downloading".

Really the important one to change here is probably the call in
fetch-pack.c as that is the most visible and most time consuming
operation for the average user (think git-clone on a large project).

The same change probably should also be made for unpack-objects as
fetch-pack/receive-pack may have chosen to use that if the object

Yea.  I keep forgetting to write a patch to do this.  I've had much
the same thought as you.

The verbosity should probably be controlled like merge-recursive's
is, but should default to not showing the "Total" line or the "Pack
.. created" line.  For the average user there isn't any valuable
information in either line.

I also think that the progress meter of git-prune-packed should be
fixed to use the standard progress meter system.  And maybe also
be delayed so it doesn't trip if its going to be very quick.

-- 
Shawn.
-
To: Jeff King <peff@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>, Nicolas Pitre <nico@...>
Date: Thursday, October 18, 2007 - 10:19 pm

This is not very considerate to non-native speakers, though, who might
not grasp the neogolism.

Perhaps just "compressing" if it gets awkward.

Sam.
-
To: Sam Vilain <sam@...>
Cc: Shawn O. Pearce <spearce@...>, <git@...>, Nicolas Pitre <nico@...>
Date: Thursday, October 18, 2007 - 10:25 pm

I would have thought it would be easier for non-native speakers, since
it has a precise meaning which I assumed we defined in a glossary (but
now I see that we don't, though that can be corrected). But of course,
I'm a native speaker (and AIUI, so are you), so perhaps we should wait
for somebody else to comment.

But I do agree that making sense to non-native speakers should be a
priority when phrasing things like this.

-Peff
-
To: Sam Vilain <sam@...>
Cc: Jeff King <peff@...>, Shawn O. Pearce <spearce@...>, <git@...>, Nicolas Pitre <nico@...>
Date: Thursday, October 18, 2007 - 10:24 pm

Forward thinking, that's probably most sensible, since git 4.7 might
not use delta compression, but maybe wavelet compression, or other
scheme entirely. Using deltas is an implementation detail, after all.


Dave.
-
To: David Symonds <dsymonds@...>
Cc: Sam Vilain <sam@...>, Shawn O. Pearce <spearce@...>, <git@...>, Nicolas Pitre <nico@...>
Date: Thursday, October 18, 2007 - 10:36 pm

Git already uses two types of compression (zlib on all objects, deltas
between objects in packfiles). So just saying "compressing" is actually
a bit ambiguous, and I think noting that what we are _actually_ doing
right now is delta compression is worthwhile.

-Peff
-
To: Jeff King <peff@...>
Cc: David Symonds <dsymonds@...>, Shawn O. Pearce <spearce@...>, <git@...>, Nicolas Pitre <nico@...>
Date: Thursday, October 18, 2007 - 10:55 pm

True.  But then, zlib compression is also delta compressing and huffman
coding.  Git's delta compression is quite similar; it just uses a larger
window than the sliding 64k gzip one.

Sam.`
-
To: David Symonds <dsymonds@...>
Cc: Sam Vilain <sam@...>, Jeff King <peff@...>, <git@...>, Nicolas Pitre <nico@...>
Date: Thursday, October 18, 2007 - 10:34 pm

Heh.  I just remembered that my packv4 topic had a phase it called
"Dictifying objects".  Which ran before "Deltifying objects".  It was
just another form of compression, but it was also timeconsuming
and an entirely different loop so it got its own progress meter.

I'm leaning to making it just say "Compressing objects".  Simple,
to the point, reasonably describes the action, and most people will
understand what it means: "Oh, time to go get coffee if that number
there is reeeealy big..."  :-)

-- 
Shawn.
-
To: Shawn O. Pearce <spearce@...>
Cc: David Symonds <dsymonds@...>, Sam Vilain <sam@...>, Jeff King <peff@...>, <git@...>
Date: Thursday, October 18, 2007 - 11:02 pm

I think this is sensible.


Nicolas
-
Previous thread: Re: git on afs (fwd) by Brandon Casey on Thursday, October 18, 2007 - 6:48 pm. (1 message)

Next thread: [PATCH 1/2] Documentation/git-gc: explain --auto in description by Jeff King on Thursday, October 18, 2007 - 10:01 pm. (1 message)
speck-geostationary