Hello all,

one of the common remarks made about git is that since it tracks tree contents, it's not the best-suited tool to track a bunch of independent files which happen to be in the same directory.

I've found myself in the situation of wanting to track my changes to one or more 'single' files in a directory (e.g. $HOME), and deciding to use antiquated, clumsy, slow and inefficient but file-based RCS (yes, you read that right) over git. In other situations (e.g. for my UserJS folder) I ended up using git, but I didn't like the idea of having things such as tags refer to all of my UserJS projects instead of the single file they were intended for, or having to put 'filename: ' at the beginning of commit messages just because the history was shared.

So today I decided to start hacking at a git-based but file-oriented content tracker, which I decided to name Zit.

The principle is extremely simple: when you choose to start tracking a file with Zit,

    zit track file

Zit will create a directory .zit.file to hold a git repository tracking the single file .zit.file/file, which is just a hard link to file.

The reason for using .zit.file as a non-bare repository rather than just a GIT_DIR is that it allows things such as 'git status' to ignore everything else. A possible alternative could have been to use .zit.file as the GIT_DIR and create an all-encompassing .zit.file/info/exclude, but the general idea of having this kind of detached GIT_DIR felt less robust (or maybe I just forgot some export).

I also don't like the idea of the hard link, first of all because of portability problems, and secondly because there are far too many ways the hard link could break somewhere along the way. For example, I haven't tested any fancy git commands on my sample zit implementation, and I'm not sure checking out some older version would actually work.

If anybody is interested in trying out my quick hack for the idea, there's a git repository for Zit at git://git.oblomov.eu/zit ...
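The setup described above can be sketched in a few lines of shell. This is only an illustration under stated assumptions, not the actual Zit code: the function name zit_track_sketch is hypothetical, the file is assumed to live in the current directory, git is assumed to be recent enough to support "git -C", and the sketch only stages the file rather than committing it.

```shell
# Hypothetical sketch of the "zit track file" idea; not the real zit script.
# Assumes FILE lives in the current directory and git supports "git -C".
zit_track_sketch() {
    file=$1
    repo=".zit.$file"

    # The repository directory doubles as the work tree (non-bare repo).
    mkdir -p "$repo" || return 1

    # Hard link: both names refer to the same inode, so edits made
    # in place through either name are visible through the other.
    ln "$file" "$repo/$file" || return 1

    # Create the repository and stage the single tracked file.
    git -C "$repo" init -q || return 1
    git -C "$repo" add -- "$file"
}
```

Calling zit_track_sketch notes.txt would leave a .zit.notes.txt/ directory next to notes.txt, with the file staged in its own repository.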
It sounds interesting. I have some single files that I would like to track using git, zit seems to be a good solution. -- Felipe --
Why not use one .zit repo and track each file on its own branch? -- Duy --
So your proposal is to have a single .zit repo which is actually a git repo and where each additional tracked file becomes its own branch, and zit would take care of switching from branch to branch when zit commands are called? I think this solution would have a number of problems, apart from being generally quite messy. First of all, moving a file and its history somewhere else means toying around with the history of a much wider repo, whereas the current approach would mean just moving the .zit.file dir together with the file (modulo hardlinks). Non-linear histories for a single file would be more complex to handle, too. And publishing just the history of one file would be damn complex. -- Giuseppe "Oblomov" Bilotta --
I don't know if switching is necessary. With one file per branch, the

The history should be linear.

A git (or zit) repository is just a container for git branches. Each branch contains only one file. Moving a file's history is equivalent to "git push" + "git branch -D". Something like this (not tested, and assuming src and dst are sibling directories):

    git init dst
    cd src
    git push ../dst local-branch:remote-branch
    git branch -D local-branch

-- Duy --
Looks a little too clumsy for my taste. Also, I don't like the idea of having to enforce linear history for files, or getting rid of the index. I would like zit to be as lightweight a wrapper for git as possible, retaining the whole functionality. -- Giuseppe "Oblomov" Bilotta --
git breaks hard links, mind you! (Just in case you check out older versions and you wonder why your "real" file is not updated). But there's a recent patch by Dscho floating around that takes care of the hard link case. -- Hannes --
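The effect can be reproduced without git at all. The sketch below assumes the file is replaced by putting a new file at the path (here via write-then-rename) rather than being rewritten in place; the helper names replace_by_rename and still_linked are made up for this illustration and are not git code.

```shell
# Hypothetical helper: replace PATH with new CONTENT by writing a
# temporary file and renaming it over the old name. The path ends up
# pointing at a new inode, so any other hard link keeps the old data.
replace_by_rename() {
    path=$1
    content=$2
    new="$path.new.$$"
    printf '%s\n' "$content" > "$new" && mv -f "$new" "$path"
}

# Report whether two names still refer to the same file.
# "-ef" is not strictly POSIX, but bash and most other shells support it.
still_linked() {
    if [ "$1" -ef "$2" ]; then
        echo linked
    else
        echo broken
    fi
}
```

After replace_by_rename is applied to the copy inside the repository directory, the "real" file outside it still holds the old contents, which is exactly the surprise described above.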
I feared that the hardlink choice was not the best one. I would definitely prefer finding a solution that didn't depend on hardlinks: not only would there be no worry about breaking them, it'd also be more portable. -- Giuseppe "Oblomov" Bilotta --
Hi, Yep, I still want to work on it; it breaks on one of Junio's machines. Ciao, Dscho --
On Fri, Oct 24, 2008 at 7:44 PM, Johannes Schindelin wrote:

Well, it's not needed by Zit anymore, but someone else was asking about it on the ml recently, too 8-)

-- Giuseppe "Oblomov" Bilotta --
Hi!

This sounds great and would seem very useful to manage my ~/bin/ directory which contains a set of unrelated one-file-tools that

If you have many files you want to track in a single directory (like ~/bin/), all those additional directories will quickly feel like clutter. If you track every file, it will even double the number of things you see with an "ls -a".

If you decide against a shared repository, maybe you want to consider not using ".zit.file/", but ".zit/file/" as the repository? This would reduce the clutter to a single directory, just like with ".git". And moving files around wouldn't be that complicated.

jlh --
Right. I'll give that a shot. -- Giuseppe "Oblomov" Bilotta --
By the way, RCS, which I use for version control of single files, uses both approaches: it can store 'file,v' alongside 'file' (just like your '.zit.file/' or '.file.git/'), but it can also store files on a per-directory basis in an 'RCS/' subdirectory (the proposed '.zit/file/' or '.zit/file.git/' solution).

By the way, it would be nice to have a VC interface for Emacs for Zit...

-- Jakub Narebski Poland ShadeHawk on #git --
Indeed, there's no particular reason why both solutions shouldn't be available. I'll think about implementing it this way:

    $ zit init

will indicate that we want to track many files, and thus it will create a .zit directory under which RCS files will be available.

    $ zit track somefile

will start tracking somefile by setting up .zit/somefile.git if .zit is available or .somefile.git otherwise.

The only problem then is priority. When looking for a file's repo, do we look at .file.git first, or .zit/file.git? How does RCS behave in

I'm afraid someone else will have to take care of that, since Emacs is not really something I use.

-- Giuseppe "Oblomov" Bilotta --
rcsintro(1) states:

    If you don't want to clutter your working directory with RCS files, create a subdirectory called RCS in your working directory, and move all your RCS files there. RCS commands will look *first* into that directory to find

I'll try to hack it using contrib/emacs/vc-git.el as a base...

-- Jakub Narebski Poland --
Cool. I pushed changes to this end to git.oblomov.eu/zit: now zit will look for .zit/file.git first, then for .file.git; if neither is found, and .zit/ exists, the repo is set to .zit/file.git, otherwise it's set to .file.git

You can either manually mkdir .zit, or use zit init that does exactly

Cool, thanks.

-- Giuseppe "Oblomov" Bilotta --
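The lookup order just described can be written down as a small shell function. This is a sketch of the rule as stated in the message, not the code that was pushed; the function name zit_repo_for is hypothetical.

```shell
# Hypothetical sketch of the repository lookup order described above.
# Prints the repository path that would be used for FILE.
zit_repo_for() {
    file=$1
    dir=$(dirname -- "$file")
    base=$(basename -- "$file")

    if [ -d "$dir/.zit/$base.git" ]; then
        # 1. An existing per-directory repository wins.
        printf '%s\n' "$dir/.zit/$base.git"
    elif [ -d "$dir/.$base.git" ]; then
        # 2. Otherwise an existing per-file repository is used.
        printf '%s\n' "$dir/.$base.git"
    elif [ -d "$dir/.zit" ]; then
        # 3. Nothing exists yet, but .zit/ does: create the repo there.
        printf '%s\n' "$dir/.zit/$base.git"
    else
        # 4. Fall back to a per-file repository next to the file.
        printf '%s\n' "$dir/.$base.git"
    fi
}
```

This mirrors the RCS behaviour quoted earlier, where the shared subdirectory is consulted first.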
I am not opposed to the wish to track a single file (but I have to say I am not personally in need of such a feature), but I have to wonder from the technical point of view if one-repo-per-file is the right approach. Running "git init" in an empty directory consumes about 100k of disk space on the machine I am typing this on, and you should be able to share most of it (except one 41-byte file that is the branch tip ref) when you track many files inside a single directory by using a single repository, one branch per file (or "one set of branches per file") model. --
the reason to use separate repos is to ease the work involved if you need to move that file (and its repo) elsewhere.

with the git directory being under .zit, would it be possible to link the things that are necessary together?

hmm, looking at this in more detail.

about 44K of disk space is used by the .sample hook files, so those can be removed

the remaining 56K is mostly directories eating up a disk block

find . -ls
 200367    4 drwxr-xr-x   7 dlang    users        4096 Oct 24 12:00 .
 200368    4 drwxr-xr-x   4 dlang    users        4096 Oct 24 12:00 ./refs
 200369    4 drwxr-xr-x   2 dlang    users        4096 Oct 24 12:00 ./refs/heads
 200370    4 drwxr-xr-x   2 dlang    users        4096 Oct 24 12:00 ./refs/tags
 200371    4 drwxr-xr-x   2 dlang    users        4096 Oct 24 12:00 ./branches
 200372    4 drwxr-xr-x   2 dlang    users        4096 Oct 24 12:00 ./hooks
 200373    4 drwxr-xr-x   2 dlang    users        4096 Oct 24 12:00 ./info
1798469    4 -rw-r--r--   1 dlang    users         240 Oct 24 12:00 ./info/exclude
1600716    4 -rw-r--r--   1 dlang    users          58 Oct 24 12:00 ./description
 200374    4 drwxr-xr-x   4 dlang    users        4096 Oct 24 12:00 ./objects
 200375    4 drwxr-xr-x   2 dlang    users        4096 Oct 24 12:00 ./objects/pack
 200376    4 drwxr-xr-x   2 dlang    users        4096 Oct 24 12:00 ./objects/info
1600717    4 -rw-r--r--   1 dlang    users          23 Oct 24 12:00 ./HEAD
1600719    4 -rw-r--r--   1 dlang    users          92 Oct 24 12:00 ./config

how many of these are _really_ necessary?

tags, info, hooks, branches, and description could probably be skipped for the common zit case, as long as they can be created as needed. If git has problems with these not existing, would it make sense to make git survive if they are missing and create them if needed?

the objects directory will eat up more space as revisions are checked in (and more sub-directories are created), would it make sense to have a config option to do a flat ...
I was slowly writing a reply but it seems David beat me to it, so here go a couple of additional comments.

Precisely. The one-repo-per-file is just the simplest and most flexible solution. But yes, I have to admit I hadn't looked into disk

Exactly. I'm setting up zit to prepare its repos to a more compact

For starters, I'm wondering if setting core.preferSymlinkRefs would be

It seems that tags, hooks, branches and description can be done without. info contains exclude, which is rather essential, and this is something that could be shared across repositories. Also, we could spare a block by removing info, moving exclude to the .git dir and setting

This is probably the biggest remaining waste of space. Typical zit usage will generate a rather small number of objects, so flattening the object store for the repo wouldn't be a bad idea. Is that possible?

-- Giuseppe "Oblomov" Bilotta --
is it? by default everything in this file is commented out. And with you only adding files explicitly, why would it ever need to exclude anything? David Lang --
Ahem. Yes. I've got a patch ready for zit that gets rid of them.

Zit does

    echo "*" > $GIT_DIR/info/exclude

and yes it sucks to use a whole block for a file that only contains one character. Suggestions welcome.

The reason why we want the exclude is that when you do

    zit status somefile

you don't want every other file in the directory to come up as 'not tracked'.

-- Giuseppe "Oblomov" Bilotta --
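A small sketch of the exclude trick in isolation, using plain git in an ordinary repository. The function names are hypothetical, and the use of "git add -f" is an assumption of this sketch (a file matching the "*" pattern is otherwise refused by "git add"); it is not taken from the Zit source.

```shell
# Hypothetical demo: set up a repo in DIR that ignores everything
# except what is added explicitly, mirroring the echo "*" trick above.
zit_exclude_demo() {
    dir=$1
    git -C "$dir" init -q || return 1
    # info/ normally comes from the init templates; create it if missing.
    mkdir -p "$dir/.git/info" || return 1
    echo '*' > "$dir/.git/info/exclude"
}

# Every path matches "*", so explicitly tracked files are added with -f
# (an assumption of this sketch).
zit_add_forced() {
    git -C "$1" add -f -- "$2"
}
```

With this in place, "git status" in that directory reports only the file that was added explicitly; the other files in the directory no longer show up as untracked.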
Yes, the file pointed at by the config key core.excludesfile is read too, so we could have it point at $GIT_DIR/zitexclude, which would allow us to spare a block. The greatest space saving would be achieved by a core.excludepattern or similar key, which would allow us to get rid of the exclude file altogether. -- Giuseppe "Oblomov" Bilotta --
Well, with all zit repositories in a '.zit/' directory (similar to RCS/) you could point core.excludesfile to a _common_ '.zit/excludes', since the pattern doesn't change from zit repository to zit repository. You could even use a per-user ~/.zitignore (I'm not sure if git expands '~' in paths; there was some patch for it, but was it accepted?) or a system-wide /usr/lib/zitignore or /usr/libexec/zitignore file. -- Jakub Narebski Poland --
System-wide means maximum space saving, but it requires a system administrator to install Zit, and considering that one of the things I love about Zit now is that it is self-contained, I would rather not depend on anything system-wide anyway. The user .zitignore file is probably the best approach: we can create it ourselves (usually), and even if Git doesn't expand the pathname itself, we can just use an absolute path. I'll go that way. -- Giuseppe "Oblomov" Bilotta --
First, an absolute path to ~/.zitignore is a bit fragile: what if the layout of users' home directories changes, for example because an increasing number of users requires some fan-out (/home/nick -> /home/2/nick)? Second, ~/.zitignore looks like something that the user can change; if you install zit, it can install libexec/zitignore somewhere... or just use .zit/excludes (with a 'do not edit' comment perhaps...). -- Jakub Narebski Poland --
(Actually, I just found another interesting thing about the config, in that it stores the path to the work tree. This is not a problem, though, because zit_setup() sets GIT_WORK_TREE.)

As I said, I don't like depending on stuff that needs to be installed. For example, what about user (non-system) installs? The libexec (or whatever) solution would have the same problem as the ~/.zitignore solution, with the moving $HOME.

I guess this leaves the .zit/ solution as the most robust one, although it's not the most space-effective, especially if you have many directories, each with a single tracked file. On the plus side, going for the .zit/ solution and dropping support for .somefile.git/ means some significant code simplification.

-- Giuseppe "Oblomov" Bilotta --
I just had what's probably a silly thought. how close is a zit setup to a subproject setup? David Lang --
Honestly, I haven't the slightest idea how they work. My understanding, which could be completely wrong, is that they are full-fledged git repositories, and that additional metadata at the top level takes care of understanding what ref is needed for each toplevel project. If this is true, using them wouldn't simplify zit, but rather make it more complex (and space intensive). -- Giuseppe "Oblomov" Bilotta --
