Linux: Beginner's Guide To Git

Submitted by Jeremy on April 15, 2005 - 8:36am.

Tony Luck offered a beginner's guide to git on the git mailing list. He explains:

"If the mechanics of git still have you stumped, then here are a couple of examples of using the really low level tools to do some basic operations. In real life you would have higher level wrappers around these tools so that you don't have to remember to update .git/HEAD with the new SHA1 after you do a commit ... but knowing what the low-level tools do may help you understand what is going on."

In an interesting thread on how to handle renames, Linus Torvalds repeatedly explained, "git is not an SCM. it's a filesystem designed to _host_ an SCM". Combining this simple knowledge with Tony's guide is helpful in understanding how things work, and why.


From: tony luck [email blocked]
To:  git
Subject: git for beginners
Date: 	Thu, 14 Apr 2005 16:19:00 -0700

If the mechanics of git still have you stumped, then here are a couple of
examples of using the really low level tools to do some basic operations.
In real life you would have higher level wrappers around these tools so
that you don't have to remember to update .git/HEAD with the new SHA1 after
you do a commit ... but knowing what the low-level tools do may help you
understand what is going on.

-Tony

$ # How to create your own project from scratch
$ mkdir tmp
$ cd tmp
$ # Create initial empty git filesystem
$ init-db 
defaulting to private storage area
$ # Look around ... lots of directories, but no files yet
$ ls -l .git
total 4
drwxr-xr-x  258 aegl aegl 4096 Apr 14 14:54 objects
$ ls .git/objects/
00  09  12  1b  24  2d  36  3f  48  51  5a  63  6c  75  7e  87  90  99  a2  ab  b4  bd  c6  cf  d8  e1  ea  f3  fc
01  0a  13  1c  25  2e  37  40  49  52  5b  64  6d  76  7f  88  91  9a  a3  ac  b5  be  c7  d0  d9  e2  eb  f4  fd
02  0b  14  1d  26  2f  38  41  4a  53  5c  65  6e  77  80  89  92  9b  a4  ad  b6  bf  c8  d1  da  e3  ec  f5  fe
03  0c  15  1e  27  30  39  42  4b  54  5d  66  6f  78  81  8a  93  9c  a5  ae  b7  c0  c9  d2  db  e4  ed  f6  ff
04  0d  16  1f  28  31  3a  43  4c  55  5e  67  70  79  82  8b  94  9d  a6  af  b8  c1  ca  d3  dc  e5  ee  f7
05  0e  17  20  29  32  3b  44  4d  56  5f  68  71  7a  83  8c  95  9e  a7  b0  b9  c2  cb  d4  dd  e6  ef  f8
06  0f  18  21  2a  33  3c  45  4e  57  60  69  72  7b  84  8d  96  9f  a8  b1  ba  c3  cc  d5  de  e7  f0  f9
07  10  19  22  2b  34  3d  46  4f  58  61  6a  73  7c  85  8e  97  a0  a9  b2  bb  c4  cd  d6  df  e8  f1  fa
08  11  1a  23  2c  35  3e  47  50  59  62  6b  74  7d  86  8f  98  a1  aa  b3  bc  c5  ce  d7  e0  e9  f2  fb
$ find . -type f
$
$ # create some files ready to check in ... use a subdirectory just to show
$ # how subdirs work
$ mkdir src
$ cat > src/hello.c
#include <stdio.h>

main()
{
	printf("Hello, world!\n");
}
$ cat > Makefile
hello: src/hello.c
	cc -o hello -O src/hello.c
$ # Now add these two files to the cache, ready for checkin (use the
$ # "--add" option because these files are new)
$ update-cache --add Makefile src/hello.c
$ # save a tree listing these files
$ write-tree
eab75ce51622aa312bb0b03572d43769f420c347
$ # commit the change. We tell it the SHA1 of the tree we just made
$ commit-tree eab75ce51622aa312bb0b03572d43769f420c347
Committing initial tree eab75ce51622aa312bb0b03572d43769f420c347
First revision of the hello system.
0107d57e748b2f01601adb6749a03aed7b3f5a84
$ # Save the SHA1 for that changeset ... we need it later
$ echo 0107d57e748b2f01601adb6749a03aed7b3f5a84 > .git/HEAD
$ 
$ # Take a look at the changeset
$ cat-file commit 0107d57e748b2f01601adb6749a03aed7b3f5a84
tree eab75ce51622aa312bb0b03572d43769f420c347
author Tony Luck [email blocked] Thu Apr 14 14:57:27 2005
committer Tony Luck [email blocked] Thu Apr 14 14:57:27 2005

First revision of the hello system.
$ # And dig into the tree we saved
$ ls-tree eab75ce51622aa312bb0b03572d43769f420c347
100664	blob	3a7a1c51dbc62797d6c903203de44cc6a734c05c	Makefile
40000	tree	ba103f91defa4b3885b826d6630a055f27800398	src
$ # see that git automatically made a tree for the "src" subdir, look at it
$ ls-tree ba103f91defa4b3885b826d6630a055f27800398
100664	blob	522fff361ad5c07351479ea8504b7c370d189524	hello.c
$ # This blob is our src/hello.c file
$ cat-file blob 522fff361ad5c07351479ea8504b7c370d189524
#include <stdio.h>

main()
{
	printf("Hello, world!\n");
}
$ # Look at all the files we have now, a blob for each file, a
$ # pair of tree objects for the directory and subdir, and a lone
$ # changeset.
$ find .git -type f
.git/index
.git/objects/01/07d57e748b2f01601adb6749a03aed7b3f5a84
.git/objects/ba/103f91defa4b3885b826d6630a055f27800398
.git/objects/ea/b75ce51622aa312bb0b03572d43769f420c347
.git/objects/52/2fff361ad5c07351479ea8504b7c370d189524
.git/objects/3a/7a1c51dbc62797d6c903203de44cc6a734c05c
.git/HEAD
$ # Now make a change
$ ed src/hello.c
59
$i
	return 0;
.
w
70
q
$ # We need to tell .git/index which file(s) are going to be
$ # in this changeset. Don't need "--add" option because we are
$ # changing a file that already exists
$ update-cache src/hello.c 
$ # Now we can write a new tree incorporating the change
$ write-tree
8f5ba0203e31204c5c052d995a5b4449226bcfb5
$ # and finally create a changeset ... this time we tell commit that
$ # the parent of this change is the previous change
$ commit-tree 8f5ba0203e31204c5c052d995a5b4449226bcfb5 -p `cat .git/HEAD`
Fix hello program to return successful exit code.
5403689e0c29607f57da8f751d4ba40637134e87
$ # save the new changeset in .git/HEAD
$ echo 5403689e0c29607f57da8f751d4ba40637134e87 > .git/HEAD 
$ # walk the tree from HEAD to the new version of hello.c
$ cat-file commit 5403689e0c29607f57da8f751d4ba40637134e87
tree 8f5ba0203e31204c5c052d995a5b4449226bcfb5
parent 0107d57e748b2f01601adb6749a03aed7b3f5a84
author Tony Luck [email blocked] Thu Apr 14 15:00:34 2005
committer Tony Luck [email blocked] Thu Apr 14 15:00:34 2005

Fix hello program to return successful exit code.
$ ls-tree 8f5ba0203e31204c5c052d995a5b4449226bcfb5
100664	blob	3a7a1c51dbc62797d6c903203de44cc6a734c05c	Makefile
40000	tree	77dc2cb94930017f62b55b9706cbadda8c90f650	src
$ ls-tree 77dc2cb94930017f62b55b9706cbadda8c90f650
100664	blob	8a6a2a7261742c6f69adaa8c876045e721ffff22	hello.c
$ cat-file blob 8a6a2a7261742c6f69adaa8c876045e721ffff22
#include <stdio.h>

main()
{
	printf("Hello, world!\n");
	return 0;
}
$ 
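For readers retracing this first session with a current git release, the following is a rough sketch of the same steps using present-day plumbing names (`git update-index`, `git write-tree`, `git commit-tree`, `git update-ref`, `git cat-file`, `git ls-tree`). These names, the scratch directory, and the placeholder identity are assumptions that do not appear in the mail above, and the object IDs it produces should not be expected to match the ones in the transcript.

```shell
#!/bin/sh
# Sketch only: present-day equivalents of the low-level steps in the mail.
set -e

# Placeholder identity so commit-tree does not depend on local configuration.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

# Work in a scratch directory (hypothetical location).
work=$(mktemp -d)
cd "$work"
git init -q .

# Recreate the two files from the mail.
mkdir src
printf '#include <stdio.h>\n\nmain()\n{\n\tprintf("Hello, world!\\n");\n}\n' > src/hello.c
printf 'hello: src/hello.c\n\tcc -o hello -O src/hello.c\n' > Makefile

# Stage both new files in the index (the "cache" of the mail).
git update-index --add Makefile src/hello.c

# Record the index as a tree object, then wrap that tree in a commit.
tree=$(git write-tree)
first=$(echo "First revision of the hello system." | git commit-tree "$tree")

# Point the current branch at the commit instead of echoing into .git/HEAD.
git update-ref HEAD "$first"

# Change the file, restage it (no --add needed), and commit with a parent.
printf '#include <stdio.h>\n\nmain()\n{\n\tprintf("Hello, world!\\n");\n\treturn 0;\n}\n' > src/hello.c
git update-index src/hello.c
tree2=$(git write-tree)
second=$(echo "Fix hello program to return successful exit code." |
    git commit-tree "$tree2" -p "$first")
git update-ref HEAD "$second"

# Walk from the new commit down to its tree, as the mail does.
git cat-file commit "$second"
git ls-tree "$tree2"
```

Using `git update-ref` rather than writing to `.git/HEAD` directly is a choice made for this sketch, since in a current repository `HEAD` normally names a branch rather than holding a commit ID itself.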


$ # Now an example of getting started with a pre-existing project
$ # download www.kernel.org/pub/linux/kernel/people/torvalds/sparse.git
$ # then ...
$ ls -l sparse.git
total 8
-rw-rw-r--    1 aegl aegl   41 Apr 14 14:50 HEAD
drwxr-xr-x  258 aegl aegl 4096 Apr 12 21:33 objects
$ # set up environment so that git sees these objects when we are elsewhere
$ export SHA1_FILE_DIRECTORY=`pwd`/sparse.git/objects
$ # make a directory to work in
$ mkdir sparse
$ cd sparse
$ # Still need a local .git for the cache file
$ mkdir .git
$ # Now look at the most recent commit, to find the topmost tree
$ cat-file commit `cat ../sparse.git/HEAD`
tree 67607f05a66e36b2f038c77cfb61350d2110f7e8
parent 9c59995fef9b52386e5f7242f44720a7aca287d7
author Christopher Li [email blocked] Sat Apr  2 09:30:09 PST 2005
committer Linus Torvalds [email blocked] Thu Apr  7 20:06:31 2005

[PATCH] static declear

This patch add static declare to make sparse happy of checking itself.
$ # load up that tree into our cache (.git/index)
$ read-tree 67607f05a66e36b2f038c77cfb61350d2110f7e8
$ # and checkout all the files
$ checkout-cache -a
$ # quick as a flash, our directory is full of files
$ ls
allocate.c       compat-linux.c    example.c     inline.c     memops.c       scope.c       symbol.h          tokenize.c
allocate.h       compat-mingw.c    expand.c      lib.c        obfuscate.c    scope.h       target.c          validation
bitmap.h         compat-solaris.c  expression.c  lib.h        parse.c        show-parse.c  target.h
cgcc             compile.c         expression.h  LICENSE      parse.h        simplify.c    test-lexing.c
check.c          compile.h         FAQ           linearize.c  pre-process.c  sort.c        test-linearize.c
compat           compile-i386.c    flow.c        linearize.h  ptrlist.c      storage.c     test-parsing.c
compat-cygwin.c  cse.c             flow.h        liveness.c   ptrlist.h      storage.h     test-sort.c
compat.h         evaluate.c        ident-list.h  Makefile     README         symbol.c      token.h
$ 
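The second session can be approximated with a current git release as well. The sketch below assumes `GIT_OBJECT_DIRECTORY` as the present-day counterpart of `SHA1_FILE_DIRECTORY`, and `git read-tree` and `git checkout-index` as the successors of `read-tree` and `checkout-cache`. Rather than downloading anything, it builds a small stand-in repository (the `source` directory and its `README` are hypothetical) to play the role of sparse.git.

```shell
#!/bin/sh
# Sketch only: check out files from an object store that lives elsewhere.
set -e

# Placeholder identity so commit-tree does not depend on local configuration.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Example Author" GIT_COMMITTER_EMAIL="author@example.com"

top=$(mktemp -d)
cd "$top"

# Stand-in for the pre-existing project (hypothetical content).
git init -q source
cd source
printf 'stand-in file\n' > README
git update-index --add README
tree=$(git write-tree)
head=$(echo "Initial stand-in commit." | git commit-tree "$tree")
cd "$top"

# A directory to work in, with its own .git to hold the index.
git init -q work

# Make git look for objects in the stand-in repository while we are elsewhere.
export GIT_OBJECT_DIRECTORY="$top/source/.git/objects"
cd work

# Look at the most recent commit to find the topmost tree.
found=$(git cat-file commit "$head" | sed -n '1s/^tree //p')

# Load that tree into the index, then check out all the files.
git read-tree "$found"
git checkout-index -a
ls
```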





April 15, 2005 - 12:49pm
Anonymous (not verified)

Wow, this looks very intuitive and easy to use. Where can I sign up to use it?

Indeed...

April 15, 2005 - 7:22pm
bmc (not verified)

Seems very obfuscated when you don't even understand what it's used for, or where it could be of any help. Sorry, but the explanations haven't made it through to my brain :(

Anyway, wouldn't it be a good idea to prefix all the commands with "git"? The update-* names live in a very crowded place...

Are you dense?

April 15, 2005 - 2:11pm
Anonymous (not verified)

Like the article says, git is not an SCM, it's a filesystem. Complaining that it is not easy to use is like saying ext2 is not easy to use. Several high-level SCM tools built on git are already being developed.


April 15, 2005 - 7:19pm
bmc (not verified)

Right, this seems like (and Linus said it was) a filesystem. I'm curious to know whether similar "filesystems" existed previously. In short, what value does git add over ext2? (Note that I'm not saying that git and ext2 share the same functionality; I'm just trying to make the question clear.)

Is git an additional layer between the filesystem (like ext2) and the so-called high-level SCM? If so, how do other existing SCMs handle this? Do they have similar "abstraction" layers? If not, what is the need for git? If yes, why didn't Linus take one (maybe they're just too badly coded, or python coded, or not modular enough to only keep the git-like stuff, or Linus didn't like the way they handle braces in their sources, or some maintainer stole a pen from him at the last FOSDEM, or whatever you'd like)?

There may be answers to these questions somewhere on the git ML, or anywhere else, but let's try to make it a bit clearer, please... I'm no computer science beginner, but still a newbie when it comes to those SHA / filesystems / SCM mixes. I'm probably brainless, but others with slightly more capacity should be interested too :)

Abstract for those of you who are busy people: explain git's functionality and compare it to other filesystems / SCM abstraction layers / SCM high-level tools. Thanks!

comparison b/w git and traditional fs

April 15, 2005 - 9:32pm
Anonymous (not verified)

First impressions: git seems to be an FS where you address files by content rather than by name. It has nothing to do with an SCM per se. It is just that an SCM should be content-addressable as well, and hence it should be easy for an SCM to use git as its FS. Ext2 gets you to a byte stream from the filename. Git gets you to a byte stream from tree/blob objects. These objects are defined purely by their content.

I hope someone will correct me if I have made any mistakes.
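The idea in the comment above, that an object's name is derived from its content, can be sketched in a few lines of shell. This assumes the naming scheme used by present-day git for blobs (SHA-1 over a `blob <size>` header, a NUL byte, and the file contents) and a `sha1sum` utility on the path. The April 2005 tools discussed in the article may have computed IDs differently, and the helper names `blob_id` and `store_blob` are hypothetical.

```shell
#!/bin/sh
# Sketch of content addressing, assuming present-day git's blob naming scheme.

# blob_id FILE: print the 40-hex-digit name derived purely from FILE's bytes.
blob_id() {
    size=$(wc -c < "$1" | tr -d ' ')
    { printf 'blob %s\000' "$size"; cat "$1"; } | sha1sum | cut -d ' ' -f 1
}

# store_blob FILE DIR: copy FILE to DIR/xx/yyyy..., where xx is the first two
# hex digits of its name (the fan-out layout seen under .git/objects).
# Unlike git, this sketch stores the raw bytes without a header or compression.
store_blob() {
    id=$(blob_id "$1")
    prefix=$(printf '%s' "$id" | cut -c 1-2)
    rest=$(printf '%s' "$id" | cut -c 3-)
    mkdir -p "$2/$prefix"
    cp "$1" "$2/$prefix/$rest"
    printf '%s\n' "$id"
}

# Demo: two files with identical content get the same name, so the store
# ends up holding a single object for both.
demo=$(mktemp -d)
printf 'hello world\n' > "$demo/a.txt"
cp "$demo/a.txt" "$demo/b.txt"
store_blob "$demo/a.txt" "$demo/objects"
store_blob "$demo/b.txt" "$demo/objects"
find "$demo/objects" -type f
```

The file names `a.txt` and `b.txt` play no part in the resulting object name, which is the contrast with ext2 drawn in the comment: the lookup key is the content, not the path.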

Clearcase is not distributed

April 27, 2005 - 2:22pm

It's very much a monolithic, centralized system. Yes, MultiSite exists, but heaven forbid you try to do distributed development with it. git is specifically NOT centralized.

Of course git != SCM

April 16, 2005 - 5:18am
Anonymous (not verified)

If git were an SCM, Larry could sue Linus, as Linus is tainted by the first BitKeeper license he accepted.

sure

April 16, 2005 - 6:13am
Steve Yee (not verified)

I've thought so too. Larry will sue Linus and Linux, thus Linux becomes proprietary software and there's no free Linux any more ...

Furthermore

April 16, 2005 - 10:00am
Anonymous (not verified)

By focusing on data, and not fretting about interfaces, Linus
a) avoids personal distraction
b) gathers interest so that arbitrary, existing SCM tools can play.
What is desperately needed is a way to communicate irrespective of tool, license, and philosophy.

Why help Linus?

April 16, 2005 - 11:06am
Anonymous (not verified)

Linus is blaming Tridge, so why would anyone want to help him?

Everyone told him about the BitKeeper issues in the early stages. He ignored them, so everyone should ignore him now.

Re: Why help Linus?

April 16, 2005 - 11:55pm
Larry Reaves (not verified)

Some computer geek from Finland just released version 0.0.1 of some operating system called Linux. While not yet as fully functional as the available alternatives, it has the potential to become something great. Why would anyone help him?

Please keep in mind that this is about creating good software, not politics. Acting childish doesn't accomplish anything. And while I never did agree with Linus' choice to use BK for kernel development, I believe he has the right to choose what SCM to use for HIS project. While Linux is very much a community effort, keep in mind that without Linus, there would be no Linux.

Why doesn't Linus use Visual SourceSafe?

April 19, 2005 - 4:12pm
Anonymous (not verified)

I mean, he can email Billy and ask for a couple of copies of VSS. I'm sure Billy will be glad to help in the development of Linux...

It will be perfect as soon as they solve the following

April 20, 2005 - 9:11am
Anonymous (not verified)

1) major downtime
2) major losses to virus/worms/etc.
3) It seems to lose lines.
4) Unintuitive interface
5) does not scale.
6) is totally unusable.

But you are probably right. I think that MS would more than gladly offer up whatever to hook into Linux.

Updated Note on Getting Sparse Tree

May 1, 2005 - 11:54pm
Brandon Philips (not verified)

cg-init rsync://rsync.kernel.org/pub/scm/devel/sparse/sparse.git

that will sync the .git directory right on up.
