[FYI] pack idx format

From: Junio C Hamano
Date: Wednesday, February 15, 2006 - 1:39 am

This is still WIP but if anybody is interested...  Once done, it
should become Documentation/technical/pack-format.txt.

The reason I started doing this is to prototype this one:

	<7v4q3453qu.fsf@assigned-by-dhcp.cox.net>

-- >8 --

Idx file:

The idx file maps each object name SHA1 to an offset into the
corresponding pack file.  It begins with the 'first-level
fan-out' table, which is followed by the main part of the index.
The main part is a table whose entries are sorted by their
object name SHA1.  The file ends with some trailer information.

The main part is a table of 24-byte entries, and each entry is:

	offset : 4-byte network byte order integer.
	SHA1   : 20-byte object name SHA1.

The data for the named object begins at byte offset "offset" in
the corresponding pack file.
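
As a rough illustration of the entry layout just described, here
is a minimal sketch in Python.  The helper name parse_idx_entry
is made up for this example and is not anything in git itself.

```python
import struct

ENTRY_SIZE = 24  # 4-byte offset followed by 20-byte SHA1


def parse_idx_entry(buf, pos=0):
    """Decode one 24-byte main-table entry starting at buf[pos].

    Hypothetical helper for illustration only.
    Returns (offset_into_pack, sha1_bytes).
    """
    # ">I" is a big-endian (network byte order) unsigned 32-bit integer.
    (offset,) = struct.unpack_from(">I", buf, pos)
    sha1 = bytes(buf[pos + 4:pos + ENTRY_SIZE])
    return offset, sha1
```

The returned offset is where the data for that object begins in
the pack file.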

Before this main table, at the beginning of the idx file, there
is a table of 256 4-byte network byte order integers.  This is
called the "first-level fan-out".  The N-th entry of this table
records the offset into the main index of the first object whose
object name SHA1 starts with byte N+1, which is also the number
of objects whose name starts with a byte less than or equal to N.
fanout[255] points at the end of the main index.  The offset is
expressed in units of 24 bytes, i.e. in entries.
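
One way a reader could use the fan-out table is to narrow a
binary search to the entries that share the first byte of the
name being looked up.  The sketch below assumes the whole idx
file is held in memory as a bytes object.  The name find_offset
is invented for this example, and this is not necessarily how
git's own lookup is written.

```python
import struct

FANOUT_SIZE = 256 * 4  # 256 entries, 4 bytes each
ENTRY_SIZE = 24        # 4-byte offset followed by 20-byte SHA1


def find_offset(idx, sha1):
    """Return the pack offset recorded for sha1, or None if absent.

    Hypothetical helper; idx is the raw idx file, sha1 is 20 raw bytes.
    """
    fanout = struct.unpack_from(">256I", idx, 0)
    first = sha1[0]
    # Entries whose name starts with `first` occupy the half-open
    # range [fanout[first - 1], fanout[first]) of the main table.
    lo = fanout[first - 1] if first > 0 else 0
    hi = fanout[first]
    while lo < hi:
        mid = (lo + hi) // 2
        pos = FANOUT_SIZE + mid * ENTRY_SIZE
        name = idx[pos + 4:pos + ENTRY_SIZE]
        if name == sha1:
            return struct.unpack_from(">I", idx, pos)[0]
        if name < sha1:
            lo = mid + 1
        else:
            hi = mid
    return None
```

With fanout[0] = 2 as in the example that follows, a name
starting with 00 is searched for only among the first two
entries of the main table.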

Example:

	idx
	    +--------------------------------+
	    | fanout[0] = 2                  |-.
	    +--------------------------------+ |
	    | fanout[1]                      | |
	    +--------------------------------+ |
	    | fanout[2]                      | |
	    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
	    | fanout[255]                    | |
	    +--------------------------------+ |
main	    | offset                         | |
index	    | object name 00XXXXXXXXXXXXXXXX | |
table	    +--------------------------------+ | 
	    | offset                         | |
	    | object name 00XXXXXXXXXXXXXXXX | |
	    +--------------------------------+ |
	  .-| offset                         |<+
	  | | object name 01XXXXXXXXXXXXXXXX |
	  | +--------------------------------+
	  | | offset                         |
	  | | object name 01XXXXXXXXXXXXXXXX |
	  | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	  | | offset                         |
	  | | object name FFXXXXXXXXXXXXXXXX |
	  | +--------------------------------+
trailer	  | | packfile checksum              |
	  | +--------------------------------+
	  | | idxfile checksum               |
	  | +--------------------------------+
          .-------.      
                  |
Pack file entry: <+

     packed object header:
        1-byte size extension bit (MSB)
               type (next 3-bit)
               size0 (lower 4-bit)
        n-byte sizeN (as long as MSB is set, each 7-bit)
                size0..sizeN form 4+7+7+..+7 bit integer, size0
                is the least significant part, and sizeN is the
                most significant part.
     packed object data:
        If it is not DELTA, then deflated bytes (the size above
                is the size before compression).
        If it is DELTA, then
          20-byte base object name SHA1 (the size above is the
                size of the delta data that follows).
          delta data, deflated.
