Hello, I'd like to track my huge /etc directory using Git, but I feel uncomfortable doing it. First, the idea is to see and check what the system config tools do. Second, it will be useful to track my own configurations that I need to customize. But as I said, I'm a bit worried about creating a git repository as root, even though I trust git, which does a really good job. Are there any tricks or pitfalls that I should know about before starting? Could anyone share their experience? Any advice is welcome. Thanks -- Francis -
I found git to be unsuitable for /etc maintenance, which is really a shame. The problem is with permissions, see this thread: http://thread.gmane.org/gmane.comp.version-control.git/56221/focus=56372 -- martin
Hello, Indeed that seems a major drawback for this kind of usage. Did you find an alternative to git in this case ? thanks -- Francis -
No, and I did not look anywhere, but I know of no other VCS that can adequately track permissions. The solution, IMHO, is to enhance git with a configuration option/policy that enables users to choose which permission bits should be stored and restored. If I had the time, I'd dig into this right away, since I cannot imagine that it's a difficult endeavour. It might be more difficult to get the patch past Junio than to actually write and test it. :) -- martin
Has anyone checked out metastore? http://repo.or.cz/w/metastore.git ... there's an XML error in there somewhere, so it's not loading the 'main' page, but http://repo.or.cz/w/metastore.git?a=shortlog should work. It looks like it could work.... any thoughts on this? -- Thomas Harning Jr. -
This looks interesting, though I guess getfacl/setfacl and getfattr/setfattr can pretty much do the same job, especially if you can call them from shell scripts/hooks (except for mtime). Or did I misunderstand something?

The problem with metadata getting corrupted, which Nicolas reported, may well have to do with the use of a single file. It may be worth considering a shadow hierarchy of files, each containing the metadata. E.g. for a project with foo, bar/foo and bar/baz files, you might have

  .metastore/foo
  .metastore/bar/.<uniqueid>.dir
  .metastore/bar/foo
  .metastore/bar/baz

and each file could just be an rfc822-style file:

  Owner: root
  Group: root
  Mode: 4754
  Mtime: 1234567890
  Fattr-<key1>: <value1>
  Fattr-<key2>: <value2>

This would be my approach, which should probably be a little better at preventing corruption. Anyway, this *really* should go into git itself! -- martin
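A minimal sketch of the rfc822-style record and shadow layout proposed above, in Python. The function names (format_record, parse_record, shadow_path) are made up for illustration; only the field names and the .metastore prefix come from the message. This is not how metastore itself stores its data.

```python
import posixpath


def format_record(owner, group, mode, mtime, fattrs=None):
    """Render one metadata record using the field names from the proposal."""
    lines = [
        f"Owner: {owner}",
        f"Group: {group}",
        f"Mode: {mode:04o}",   # octal, e.g. 4754
        f"Mtime: {mtime}",
    ]
    # Extended attributes go last, sorted so the output is stable and diffs cleanly.
    for key in sorted(fattrs or {}):
        lines.append(f"Fattr-{key}: {fattrs[key]}")
    return "\n".join(lines) + "\n"


def parse_record(text):
    """Parse a record produced by format_record back into a dict."""
    record = {"fattrs": {}}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition(": ")
        if name.startswith("Fattr-"):
            record["fattrs"][name[len("Fattr-"):]] = value
        elif name == "Mode":
            record["mode"] = int(value, 8)
        elif name == "Mtime":
            record["mtime"] = int(value)
        else:
            record[name.lower()] = value
    return record


def shadow_path(path, root=".metastore"):
    """Map a tracked path to its (hypothetical) shadow metadata file."""
    return posixpath.join(root, path)
```

One record per file keeps a corrupted record from affecting its neighbours, which is the point made against the single-file store.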
Hi, Then the tool is corrupt. Introducing a shadow hierarchy, as you propose, No. Git is a source code management system. Everything else that you can do with it is a bonus, a second class citizen. Should we really try to support your use case, we will invariably affect the primary use case. Ciao, Dscho -
also sprach Johannes Schindelin <Johannes.Schindelin@gmx.de> [2007.09.15.16

I thought git was primarily a content tracker... so it all comes down to how to define content, doesn't it? But either way, we need not discuss that because that definition depends a lot on context and purpose and thus cannot be answered once and for all.

I understand that for the primary use case, tracking nothing more than +x makes sense and should not be interfered with. This is why I was proposing a policy-based approach. The primary use case is unaffected, it's the default policy. Someone may choose to track other mode bits or file/inode attributes, according to one of several policies available with git, or even a custom policy. In that case, the repository needs to be appropriately configured.

The reason why I say this should be done inside git rather than with hooks and an external tool such as metastore is quite simple: git knows about every content entity in any tree of a repo and already has a data node for each object. Rather than introducing a parallel object database (shadow hierarchy or single file), it would make a lot more sense and be way more robust to attach additional information to these object nodes, wouldn't it?

So with "appropriately configured" above, I meant that one should be able to say

  git-config core.track all

or

  git-config core.track mode+attr

or the default:

  git-config core.track 7666

(read that as a umask, which masks out everything but the three x bits. I made it 7666 instead of 7677 because core.umask and core.sharedrepository then override the group and world bits if needed) and have git do the right thing, rather than expecting those who want to track more than the executable bit to assemble a brittle set of hooks and metadata collectors+applicators and hope it all works.

I understand also that this is not top priority for git, which is why I said earlier in the thread that the real difficulty might be to get Junio to accep...
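To make the umask-style reading of the proposed default concrete, here is a small Python sketch. Note that core.track is only the option proposed in this message, not an existing git setting, and the function names are made up for illustration.

```python
def tracked_bits(mode, track_mask=0o7666):
    """Return the mode bits that survive a umask-style track mask.

    Bits set in track_mask are ignored; with 0o7666 only the three
    execute bits remain.
    """
    return mode & ~track_mask & 0o7777


def modes_differ(mode_a, mode_b, track_mask=0o7666):
    """Two modes count as different only if their tracked bits differ."""
    return tracked_bits(mode_a, track_mask) != tracked_bits(mode_b, track_mask)
```

With the default mask, 0644 versus 0600 is not a change, while 0755 versus 0700 is; a mask of 0 would track every bit.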
Configuration options only apply to the local aspects of the repository. That is, when you clone a repository, you don't get the configuration options from it, in general. And changing configuration options on a repository does not have any effect on the content it contains. So Git doesn't have any way to represent owners or groups, and they would need to be represented carefully in order to make sense across multiple computers. If you're adding support for metadata-as-content (for more than "is this a script?"), you should be able to cover all of the common cases of extended stuff, like AFS-style ACLs. And if you want to allow meaningful development with this mechanism (as opposed to just archival of a sequence of states of a live system), the normal case will be that the metadata beyond +x is manipulated by ordinary users in some way other than modifying their working directory. So the normal case here will be like working on a filesystem that doesn't support symlinks or an executable bit when this is important content. -Daniel *This .sig left intentionally blank* -
Sure they are. Just like git-commit figures out your email address if user.email is missing from git-config, or core.sharedRepository or core.umask deal with permissions only when you tell them to, you'd have to enable core.track or else git would just do what it

Ideally, git should be able to store an open-ended number of

... and yet, we support symlinks and executable files. But anyway, I really don't understand what you're trying to say. -- martin
I haven't followed the discussion at all I must admit (I wrote metastore as a quick hack to store some extended metadata and it works for my purposes as long as I don't do anything fancy). But I agree, if any changes were made to git, I'd advocate adding arbitrary attributes to files (much like xattrs) in name=value pairs, then any extended metadata could be stored in those attributes and external scripts/tools could use them in some way that makes sense...and also make sure to only update them when it makes sense. -- David Härdeman -
So where would those metadata be stored in your opinion? -- martin
I'm not sufficiently versed in the internals of git to have an informed opinion :) -- David Härdeman -
My theory was that we would provide an API for getting the "current state" listing with all of the filenames and matching contents, and leave it up to metastore to put things in the filesystem; in the other direction, metastore would build up this state, and we'd store it.

People who are using this in practice would set a config option to delegate the "working tree" filesystem I/O to metastore, while other people could interact with the state as files describing the state, and could therefore specify operations that are impossible or prohibited on the filesystems that their development is done on.

(This would effectively be like giving people a convenient way of setting attributes on entries in a tar file, such that they can edit it to represent a state that they can't necessarily create in their own filesystems, and version controlling that; but more convenient, since the file contents are represented as file contents and the attributes are plain text in a listing of some sort) -Daniel *This .sig left intentionally blank*
I think we have something like a length count for file names in index and/or tree. We could just put the (sorted) attributes after a NUL byte in the file name and include them in the count. It would also make those artificially longer file names work more or less when sorting them for deltification. However, this requires implementing _policies_: it must be possible to specify per repository exactly what will and what won't get tracked, or one will get conflicts that are not necessary or appropriate. -- David Kastrup, Kriemhildstr. 15, 44793 Bochum -
Or perhaps the index format could be extended to include a new field for value=name pairs instead of overloading the name field. But as I said, I have no idea how feasible it would be to change git to I think the opposite approach would be better. Let git provide set/get/delete attribute operations and leave it at that. Then external programs can do what they want with that data and add/remove/modify tags as necessary (and also include the smarts to not, e.g. remove the permissions on all files if the git repo is checked out to a FAT fs). -- David Härdeman -
You need more than that. You need to be able to log, blame etc on the attributes. One of the big annoyances of Subversion properties is being unable to find out when or why a property value was changed.

I still don't see why the attributes need to be stored in git directly - particularly if you are going to use an external program to actually apply any settings - why not store the attributes as normal file (or files) of some sort tracked by git? You could use any number of methods - e.g. use an sqlite database stored in the root of your tree, or a .<name>.props file alongside each path that you have properties for. You could even write a system that uses such a method and was then SCM agnostic, allowing you to keep your attribute tracking system if/when something better than git comes along - or simply share it with less-fortunate souls stuck in an inferior system. -- Julian
Hi, git show :<filename> (Note the ":") Hth, Dscho -
I like that idea. -- martin
On Tue, 2 Oct 2007, David Kastrup wrote:
In which case you should not be able to manipulate them (as you could not test the result) and any commits could not affect them, meaning they'd just stay unchanged. -- martin
two problems with this
1. you do want to be able to manipulate them
1a. how do you reconcile a conflict during a merge?
2. git is a series of snapshots, what does it mean to 'stay unchanged'?
David Lang -
How could there be a conflict if you can't make local changes

In simple terms, let (content,A,B) be an object with content "content" and extended attributes A,B, and B cannot be represented locally, but a new object is committed with a change to attribute A (content2,A2), then the result is (content2,A2,B), as B simply comes from the (corresponding object of the) parent. Or am I totally misunderstanding? -- martin
it's very possible that I am misunderstanding, but do we really want to have to go back to the parent to duplicate things when creating a new commit? and aren't you supposed to be able to have more than one parent? if you do, which one would you use? David Lang -
You win. :) -- martin
don't underestimate the usefulness of the ability to archive and restore snapshots of a live system. just that ability would be wonderful to have. the ability to checkout a copy of things elsewhere and tinker with it would be better, but the lack of that doesn't eliminate the utility by any means. -
Hi, [speaking mostly to the proponents of git-as-a-backup-tool] While at it, you should invent a fallback what to do when the owner is not present on the system you check out on. And a fallback when checking out on a filesystem that does not support owners. And a fallback when a non-root user uses it. Oh, and while you're at it (you said that it would be nice not to restrict git in any way: "it is a content tracker") support the Windows style "Group-or-User-or-something:[FRW]" ACLs. Looking forward to your patches, Dscho -
also sprach Johannes Schindelin <Johannes.Schindelin@gmx.de> [2007.09.16.00

Like rsync, git would use numerical UIDs (which are always present) by default, but could be told to try to map account names.

If the filesystem does not support owners, chown() would not exist. I actually tend to think of things the other way around: instead of a fallback when chown() does not work (what would such a fallback be other than not chown()ing?), it would only try chown() if such

That's easy, Unix already provides you with that "fallback": pack up

Provided we find a way to implement this in an extensible manner, this should not be hard to do. I can't do it since I don't have access to a Windows machine. Your statement does catch me off-guard though. Does git now officially target Windows? -- martin
There's a problem. You need to know that the functionality is missing and not try to read attributes back, but instead consider them unchanged. Nothing

But if you tar that up again, the owners will be different. But you don't

Official git works in cygwin. There is also a port to msys, which is not official in the sense that it is not merged into mainline. -- Jan 'Bulb' Hudec <bulb@ucw.cz>
This is a good consideration. One way of implementing this seems to be to iterate over all file attributes recorded in the object cache (or metastore) and try to apply each. For every attribute that was properly applied to the worktree, a note is attached to the object's data in the index. Tools identifying differences between index and

As per my above suggestion, this would solve itself. Untarring as non-root simply means that the chmod/chown/whatever calls would fail or not be tried at all. Thus, they would not be recorded in the index and later commits would never consider changes to these attributes.

One could probably simplify the implementation such that failure to chmod/chown/whatever a single file would make the attribute be ignored when worktree and index are compared. Then, it would all boil down to a combination of configuration and functionality: the attributes the user wants to have tracked (configuration) and those which can be applied to the worktree when logically and'ed result in the final mask of attributes to consider when identifying changes. -- martin
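The "logically and'ed" mask described above could look roughly like this in Python. This is only a sketch: the attribute names and both function names are hypothetical, and nothing like it exists in git.

```python
def effective_attributes(wanted, applied_ok):
    """Attributes to compare: those the user configured AND those that
    could actually be applied to the worktree."""
    return set(wanted) & set(applied_ok)


def changed_attributes(index_attrs, worktree_attrs, considered):
    """Names of considered attributes whose values differ between the
    index record and the worktree."""
    return {
        name
        for name in considered
        if index_attrs.get(name) != worktree_attrs.get(name)
    }
```

For a non-root user who asked for mode, owner and group but could only apply mode, an owner mismatch between index and worktree would simply not be reported.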
but this can be handled by a local config option. yes, you have to be careful, but it's not that hard. David Lang -
git has pre-commit hooks that could be used to gather the permission
information and store it into a file.
git now has the ability to define custom merge strategies for specific file
types, which could be used to handle merges for the permission files.
what git lacks the ability to do is to deal with special cases on
checkout.
the handling of gitattributes came really close, but there are two
problems remaining.
1. whatever is trying to write the files with the correct permissions
needs to be able to query the permission store before files are
written. This needs to either be an API call into git to retrieve the
information for any file when it's written, or the ability to define a
specific file to be checked out first so that it can be used for
everything else.
2. the ability to specify a custom routine/program to write the file out
(assuming that it's being written to a filesystem not a pipe). this
routine would be responsible for querying the permission store and
doing 'the right thing' when the file is written during a checkout
there are some significant advantages of having the permission store be
just a text file.
1. it doesn't require a special API to a new datastore in git
2. when working in an environment that doesn't allow for implementing the
permissions (either a filesystem that can't store the permissions or
when not working as root so that you can't set the ownership) the file
can just be written and then edited with normal tools.
3. normal merge tools do a reasonable job of merging them.
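As a rough illustration of a plain-text permission store, here is a Python sketch that a pre-commit hook could call. The "mode uid gid path" line format and the function names are assumptions for this example, not an existing git or metastore format.

```python
import os
import stat


def snapshot_permissions(root):
    """Return 'mode uid gid path' lines for everything under root, sorted by path."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip the repository's own metadata directory.
        dirnames[:] = sorted(d for d in dirnames if d != ".git")
        for name in dirnames + sorted(filenames):
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: describe symlinks, do not follow them
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            mode = stat.S_IMODE(st.st_mode)
            lines.append(f"{mode:04o} {st.st_uid} {st.st_gid} {rel}")
    # One line per path, sorted, so ordinary line-based merge tools work.
    return sorted(lines, key=lambda line: line.split(" ", 3)[3])


def parse_permissions(text):
    """Parse the store back into {path: (mode, uid, gid)}."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        mode, uid, gid, path = line.split(" ", 3)
        entries[path] = (int(mode, 8), int(uid), int(gid))
    return entries
```

Because the store is ordinary text, it can also be edited by hand on a system where the permissions themselves cannot be set.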
however to do this git would need to gain the ability to say 'this
filename is special, it must be checked out before any other file is
checked out' (either on a per-directory or per-repository level)
if this is acceptable then altering the routines that write the files to
have the additional option of calling a different routine based on the
settings in .gitattributes seems relatively simple. there should alre...

You seem to be forgetting about the index. Git never writes trees directly to the filesystem, but always with an intermediate step in the index. So the API actually exists -- simply read from the index. -- Jan 'Bulb' Hudec <bulb@ucw.cz>
Ok, this sounds promising.
looking into one approach here.
assume for the moment that at write time an external program gets called.
this program reads the file contents from stdin and gets its other
information from git as command line parameters
parameters I can think it would need are
path to write the file to
length of file
name of the permission file
id of the commit this is part of (possibly)
how does this program access the contents of the permission file in the
index?
David Lang
-

I'd rather not implement it at such a low level where a true "checkout" happens. For one thing, I am afraid that the special casing will affect the normal codepath too much and would make it into a maintenance nightmare.

But more importantly, if you are switching between commits (this includes switching branches, checking out a different commit to a detached HEAD, or pulling/merging updates your HEAD and updates your work tree), and the contents of a path does not change between the original commit and the switched-to commit, you may still have to "checkout" the external information for that path if your "permission information file" is different between these two commits. To the underlying checkout aka "two tree merge" operation, that kind of change is invisible and it should stay so for performance reasons, not to harm the normal operation. IOW, I do not want the core level to even know about the existence of "permission information file", even if the code that implements it is well isolated, ifdefed out or made conditional based on some config variable.

I however think your idea to have extra "permission information file" is very interesting. What would be more palatable, than mucking with the core level git, would be to have an external command that takes two tree object names that tells it what the old and new trees our work tree is switching between, and have that command to:

- inspect the diff-tree output to find out what was checked out and might need its permission information tweaked;
- inspect the differences between the "permission information file" in these trees to find out what was _not_ checked out, but still needs its permission information tweaked.
- tweak whatever external information you are interested in expressing in your "permission information file" in the work tree for the paths it discovered in the above two steps. This step may involve actions specific to projects and call hook scripts with <path, info from "pe...
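The first two steps of the external command outlined above can be sketched in Python as a pure function. It assumes the caller has already obtained the list of paths from diff-tree and parsed both versions of the "permission information file" into dictionaries; the function name and data shapes are hypothetical.

```python
def paths_needing_update(checked_out, old_perms, new_perms):
    """Return the sorted paths whose permission information must be (re)applied.

    checked_out: paths that the tree switch rewrote (from diff-tree output)
    old_perms / new_perms: {path: info} parsed from the permission
    information file in the old and new tree
    """
    # Step 1: paths that were rewritten and have recorded information.
    rewritten = {path for path in checked_out if path in new_perms}
    # Step 2: paths whose content did not change but whose recorded
    # information did, so the checkout never touched them.
    info_changed = {
        path for path, info in new_perms.items() if old_perms.get(path) != info
    }
    return sorted(rewritten | info_changed)
```

The third step, actually tweaking the work tree, would then run over the returned list only.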
some of this duplicates thoughts from other messages in this thread.
apologies for the duplication, but I want to be clear the response to
Junio's concerns here as well
as I understand it, at this point you already choose between three
options.
1. write to a file (and set the write bit if needed)
2. write to stdout
3. write to a pager program
I am suggesting adding
4. write to a .gitattributes defined program and pass it some parameters.
(and only if the .gitattributes tell you to)
this should be a very small change to the codepath
or am I missing something major here?
if this program can get the contents of the permission file out of the
index, then the requirement I listed before to make sure the permission
file gets written before anything else goes away, and the only requirement
I had not thought of this condition.
however, I think this may be easier than you are thinking
we have two conditions.
1. the permission file hasn't changed.
Solution: do nothing
2. the permission file has changed
Solution: set all the permissions to match the new file
this could be done by using .gitattributes to specify a different program
for checking out the permission file, and that program goes through the
file and sets the permissions on everything. yes this is a bit inefficient
compared to diffing the two permission files and only touching the files
that have changed, but is the efficiency at this point that critical? if
so then instead of feeding the program the contents of the new file you
could feed it the diff between the old and the new file.
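The two conditions above could be handled by something like the following Python sketch. It assumes a hypothetical "mode path" line format for the permission file and only sets mode bits, since changing ownership would need root.

```python
import os


def apply_permission_file(root, old_text, new_text):
    """Re-apply every mode in new_text if the permission file changed.

    Returns the list of paths that were touched.
    """
    if old_text == new_text:
        return []  # condition 1: the permission file hasn't changed
    touched = []
    for line in new_text.splitlines():
        if not line.strip():
            continue
        mode, path = line.split(" ", 1)
        # condition 2: set everything to match the new file, changed or not
        os.chmod(os.path.join(root, path), int(mode, 8))
        touched.append(path)
    return touched
```

This is the simple but less efficient variant; feeding the function a diff instead would only change which lines it iterates over.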
in theory you could do this for any file, and it would be a win for some
files (a large file that has a few changes to it would possibly be more
efficient to modify in place than to re-write), but I'm not sure the
results would be worth the complications. if .gitattributes gains the
ability to specify the program to be used to write the file, it could also
gain the ability to specify feeding th...

I do not think we are choosing any option in the codepath at all. What I mean by the normal "checkout" is what checkout_entry in entry.c does. There is no other option than (1) above. I would want to see an extremely good justification if you need to touch that codepath to implement this fringe use case. I do not think there is anything that writes file contents to stdout/pager other than "git cat-file" or "git show"; I do not think they are what you have in mind when talking about managing the files under /etc. So unfortunately I do not understand the rest of the discussion you made in your message. -
Ok, I thought that there was common code for these different uses. could you re-read the rest of the logic based on the change being done in checkout_entry?

if you are unwilling to have any changes made to the checkout_entry code then the only remaining question is what you think of Daniel's suggestion to have a hook to replace check_updates()? if it's not acceptable either then we are down to doing a post-checkout trigger.

one concern I have with that approach is how to deal with partial checkouts. if a user checks out one file how can the post-checkout trigger know if it's looking at the correct permissions file as opposed to one left over from something else? can/should it go and read the file from the index instead of reading the file on the filesystem? (I don't like this because it leads to non-obvious behavior), or can/should there be a config option to say that whenever any file is checked out the permissions file needs to be checked out as well.

a post checkout trigger is useful in enough different situations that the answers to the above questions don't eliminate the usefulness of the trigger, they just map out the pitfalls of using it. David Lang -
Post-checkout trigger is something I can say I can live with without looking at the actual patch, but that does not mean it would be a better approach at all. I would not be able to answer the first question right now; that needs a patch to prove that it can be done with a well contained set of changes that results in maintainable code. I haven't tried to assess the potential extent of damage needed to checkout_entry(), and I have never been interested in this "keeping track of /etc in place" topic myself. It is unlikely I'll try to come up with such a patch on my own to support it at such a low level near the core. Somebody who cares about that feature needs to take the initiative of doing that work before we can discuss and decide, although old-timers including myself can help spot potential issues. So while I admit I am skeptical, consider me neither willing nor unwilling at this point. -
you cannot answer the question in the affirmative, but you could say that any changes in that area would be completely unacceptable to you (and for a while it sounded like you were saying exactly that). in which case any

this is reasonable. thanks for pointing me so clearly at the routine that needs to be modified. David Lang -
I tend to disagree. It's far from a waste of time. While, as I said, I am skeptical that such a patch would be small impact, if it helps people's needs, somebody will pick it up and carry forward, even if that somebody is not me. It can then mature out of tree and later could be merged. We simply do not know unless somebody tries. And I am quite happy that you seem to be motivated enough to see how it goes. On the other hand, the experiment could fail and you may end up with a patch that is too messy to be acceptable, in which case you might feel it a waste of time, but I do not think it is a waste even in such a case. We would learn what works and what doesn't, and we can bury "keeping track of /etc" topic to rest. I also need to rant here a bit. Fortunately we haven't had this problem too many times on this list, but sometimes people say "Here is my patch. If this is accepted I'll add documentation and tests". I rarely reply to such patches without sugarcoating my response, but my internal reaction is, "Don't you, as the person who proposes that change, believe in your patch deeply enough to be willing to perfect it, in order to make it suitable for consumption by the general public, whether it is included in my tree or not? A change that even you do not believe in yourself has very little chance of benefitting the general public, so thanks but no thanks, I'll pass." -
There's certainly the possibility that a changeset could consist of some patches that make the index/filesystem handling more clear, some patches that make the tree/index handling more clear, and some patches that allow a hook to replace one of these entirely. Things can be a lot more acceptable if the intrusive changes are improvements for the maintainability of the normal case, and the special case code is no longer intrusive at all. -Daniel *This .sig left intentionally blank* -
Very well said. -
this is perfectly acceptable to me. I was trying to make very sure that
this topic fell in this category.
there are other topics that come up repeatedly that do get (and deserve)
automatic rejections ('patch to explicitly record renames' for example).
and while I didn't think that 'managing /etc' was in the same category,
sometimes that category is defined as much by the opinions and goals of
the core team as it is by technical considerations.
there's a huge difference between 'this patch is rejected because we think
the implementation is bad' and 'this patch is rejected because we disagree
with the fundamental goal of the patch'. effort spent on a patch rejected
for the first reason is never a complete waste (if nothing else it can
serve as an example of how not to do things for future developers ;-) but
effort spent on a patch that's rejected for the second reason is usually
a waste, and as such I make it a point to discuss the objective and basic
I hope that my questions did not seem to fall into this category.
David Lang
-

Not at all. -
Why not have the command also responsible for creating the files that need to be created (calling back into git to read their contents)? That way, there's no window where they've been created without their metadata, and there's more that the core git doesn't have to worry about. I could see the program getting the index, the target tree, and the directory to put files in, and being told to do the whole 2-way merge (except, perhaps, updating the index to match the tree, which git could do afterwards). As far as git would be concerned, it would mostly be like a bare repository. -Daniel *This .sig left intentionally blank* -
my initial thoughts were to have git do all its normal work and hook into git at the point where it's writing the file out (where today it chooses between writing the data to a file on disk, piping to stdout, or piping to a pager) by adding the option to pipe into a different program that would deal with the permission stuff. this program would only have to write the file and set the permissions, it wouldn't have to know anything about git other than where to find the permissions it needs to know.

it sounds like you are suggesting that the hook be much earlier in the process, and instead of one copy of git running and calling many copies of the writing program, you would have one copy of the writing program that would call many copies of git.

I'll admit that my initial reaction is that it's probably a lot more expensive to do all the calls into git. git just has a lot more complex

if this functionality does shift to earlier in the process, how much of the git logic needs to be duplicated in this program? if this program needs to do the merge, won't it have to duplicate the merge logic, including the .gitattributes checking for custom merge calls?

I have been thinking primarily in terms of doing a complete checkout, overwriting all files, and secondarily how to do a checkout of just a few files, but again where all files selected overwrite the existing files. I wasn't thinking of the fact that git optimizes the checkout and avoids writing a file that didn't change. this changes things slightly

prior to this I was thinking that the permission file needed to be handled differently because writing it out needed to avoid doing any circular references where you would need to check the contents of it to write it out. it now appears as if what really needs to happen is that if the permission file changes a different program needs to be called when it's written out than when the other files are written out. by itself this isn't hard as .gitattribu...
A lot of the git commands are actually currently shell scripts that call back to git, so that's not too different. The reason to have a single copy of the writing program is that it would be able to get the whole set of differences that need to be handled, and first pick out the metadata file, process it to figure out the writing instructions once, figure out the changes in the writing instructions, and figure out the changes in the This is two-way merge, not three-way merge. The basic concept is that you're in state A, and you want to be in state B. Rather than writing out all of state B, you write out all of state B that's different from state A. Think of taking a diff of two big trees and then applying it as a patch, instead of copying the new tree onto the old tree; the benefit is that stuff that doesn't change doesn't get rewritten, and the diff is blazingly fast, given how we store our information. 3-way merge will be handled by git, and not in a live /etc directory anyway (that is, you'd want to fix up the metadata files as plain text files, not as metadata bits on a checked out directory; otherwise, you'll be trying to put conflict markers in mode bits, and that's clearly not While we're at it, you probably don't even want to write the permission file to the live filesystem. It's just one more thing that could leak information, and changes to the permissions of files that you record by committing the live filesystem would presumably be done by changing the permissions of files in the filesystem, not by changing the text file. (Of course, you could check out the same commits as ordinary source, with developer-owned 644 files and a 644 "permissions" file, and there you'd have the permissions file appear in the work tree, and you could edit it You probably want to be able to keep local uncommitted changes. People like to be able to have things slightly different in their particular deployment from the way things are in the repository, for stuff...
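A minimal sketch of the two-way idea described above, assuming each state is just a mapping from path to some content id (a hypothetical stand-in for what git keeps in a tree or the index). The function name `two_way_plan` and the sample data are made up for illustration and are not part of git; the point is only that paths identical in A and B never get touched.

```python
def two_way_plan(old, new):
    """Return (to_write, to_delete) needed to move from state `old` to `new`.

    Both arguments map a path to an opaque content id. Paths whose id is
    the same in both states are left alone, so nothing unchanged is rewritten.
    """
    to_write = sorted(p for p, blob in new.items() if old.get(p) != blob)
    to_delete = sorted(p for p in old if p not in new)
    return to_write, to_delete


# Hypothetical states A and B of a small /etc-like tree.
state_a = {"passwd": "a1", "fstab": "b2", "motd": "c3"}
state_b = {"passwd": "a1", "fstab": "b9", "hosts": "d4"}

to_write, to_delete = two_way_plan(state_a, state_b)
print("write:", to_write)
print("delete:", to_delete)
```

Here `passwd` is identical in both states, so it appears in neither list; a metadata-aware writer would only be asked to handle `fstab`, `hosts`, and the removal of `motd`.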
I'm still a little unclear on how much work this program would then have to do. it's probably my lack of understanding that's making this sound so what would this program be given? it sounds like it would be called once for the entire tree checkout. would it be handed just the start and end commits and query git for everything else it needs? it sounds like there is more than this, you refer to git fully crafting the new index. so would this program be accessing an old and new index and do the comparison between the two? or would git feed it a list of what's changed and then have it query git right, we don't want conflict markers on mode bits or other ACL type the permissions and ACLs can be queried directly from the filesystem, so I don't see any security problems with writing the permission file to the filesystem. changing the permissions would be done by changing the files themselves (when you are running as root on a filesystem that supports the changes, otherwise it would need to fall back to writing the file and getting the changes there, but that should be able to be a local config option) I don't like the idea of having a file that doesn't appear on the local right, and the same thing if the filesystem doesn't support something in if so this means that the permission changing program definitely needs to operate on the diff of the permission file, not on the absolute file. this complicates things slightly, but it shouldn't be too bad. changing topic slightly. I know git has pre-commit hooks, but I've never needed to use them. at what point can you hook in? can you define a hook that runs when you do a git-add? or only when you do a git-commit? the reason I'm asking is to try and figure out when and how to create the permissions file. when I was thinking in terms of dealing with the permissions as a single big block it wasn't that bad to say that at git-commit time you have to scan every file and check its permissions ...
Reading over your thoughts, I get this uneasy feeling about such a permissions file, because it stores redundant information, and redundant information has a tendency to get out of sync. If we cannot attach attributes to objects in the git database, then I understand the need for such a metastore. But I don't think it should be checked out and visible, or maybe we should think of it not in terms of a file anyway, but a metastore. Or how do you want to resolve the situation when a user might edit the file, changing a mode from 644 to 640, while in the filesystem, it was changed by other means to 600? .gitattributes is a different story since it stores git-specific attributes, which are present nowhere else in the checkout. I still maintain it would be best if git allowed extra data to be attached to object nodes. When you start thinking about cherry-picking or even simple merges, I think that makes most sense. And we don't need conflict markers, we could employ an iterative merge process as e.g. git-rebase uses: "a conflict has been found in the file mode of ... ... 2750 vs. 2755 ... please set the file mode as it should be and do git-merge I'd much rather see something like `git-attr chmod 644 file-in-index` to make this change, rather than a file, which introduces the potential for syntax errors. -- martin; (greetings from the heart of the sun.) \____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck "to me, vi is zen. to use vi is to practice zen. every command is a koan. profound to the user, unintelligible to the uninitiated. you discover truth everytime you use it." -- reddy ät lion.austin.ibm.com spamtraps: madduck.bogus@madduck.net
each local repository would need to be configured to either recreate the permissions file at checkin time or to use the permission file and ignore the actual permissions on the file. while I agree that it would be ideal to store this data inside git, I'm more interested in getting a functional implementation, and given the reluctance of the git core team to allow any changes to support this use-case anything that can be done to minimize the changes needed to and there's nothing to prevent the checkin hook from running such a first make this usable, then if it starts getting used widely (which would not at all surprise me, many distros are looking for good options for doing this sort of thing, I wouldn't be surprised to see several of them start using git if it did the job well) things can be moved from external scripts and storage to internal capabilities as appropriate. David Lang -
I'd like to point out the following two posts, as I think they are relevant to this thread: [PATCH] example hook script to save/restore file permissions/ownership http://marc.info/?l=git&m=118953004817642&w=2 [PATCH] post_merge hook, related documentation, and tests http://marc.info/?l=git&m=118953004730496&w=2 The hook script above runs in a pre-commit hook to write out file metadata to a file in the repository. It can then be run from the post-merge hook (patch above) to restore permissions. Running it from a post-checkout hook may be more appropriate, but post-merge seems to work well for my purposes. The script handles merge conflicts and (in my testing) does the right thing. I'm using it now to track metadata for not just /etc, but an entire linux image. It will handle merge conflicts by recognizing that the metadata file had a conflict, and will direct the user to resolve the conflict and reset working dir perms before allowing a commit. -JE -
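For readers who want to see the general shape of such a save/restore step, here is a minimal sketch. It is not the script from the patch linked above; the function names, the `mode uid gid path` line format, and the file name used in the usage note are all assumptions made for illustration. It handles regular files and directories only, and does not cope with symlinks or paths containing newlines.

```python
import os
import stat


def save_metadata(root, paths, out_file):
    """Write one 'mode uid gid path' line per tracked path (pre-commit side)."""
    with open(out_file, "w") as fh:
        for rel in sorted(paths):
            st = os.lstat(os.path.join(root, rel))
            mode = stat.S_IMODE(st.st_mode)  # permission bits incl. setuid/setgid/sticky
            fh.write(f"{mode:04o} {st.st_uid} {st.st_gid} {rel}\n")


def restore_metadata(root, in_file, restore_owner=False):
    """Re-apply the recorded modes (post-merge or post-checkout side)."""
    with open(in_file) as fh:
        for line in fh:
            mode, uid, gid, rel = line.rstrip("\n").split(" ", 3)
            path = os.path.join(root, rel)
            if restore_owner:
                # Changing ownership normally needs root; chown first because
                # it may clear setuid/setgid bits, then chmod puts them back.
                os.chown(path, int(uid), int(gid))
            os.chmod(path, int(mode, 8))
```

A pre-commit hook could call `save_metadata` on the tracked paths and add the resulting file (say, a hypothetical `.metadata`) to the commit, and the post-merge hook could call `restore_metadata` on it. Since the metadata is a plain text file, a conflict in it shows up as an ordinary merge conflict.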
Well, you misread me or what I said was confusing or both. I was suggesting the total opposite. Let git do all its normal work, and then call your hook to munge the work tree in any way you want. -
so you are saying, have git write everything out as-is and then call a program afterwards to do things? essentially a post-checkout hook? such a hook is useful in many situations, and would allow for the workflow where you have /etc, /etc.git, and write scripts to move things back and forth between them. so I do think that this is a capability that would be useful to git overall. however, for the specific use-case of maintaining /etc I don't think that it's as good as having a hook at write time. David Lang -
I think he was replying to me, not you. I was suggesting that git stop at the index, and let him take care of deciding how the index relates to the work tree. That is, he'd get called instead of check_updates() in unpack-trees. (And we might have to funnel more code paths through this function, so that checkout-index does what read-tree -m would do, wrt changes to the filesystem). -Daniel *This .sig left intentionally blank* -
Doing this atomically involves creating the file in question by specifying the permissions on the creat system call already, and possibly wrap seteuid calls and similar around it for getting the right file/ownership. However, it is not really necessary to do this atomically: instead one can rather create the file using safe permissions (600) at first, then do fchown and fchmod (or chown/chmod) at some point in time afterwards as required. -- David Kastrup, Kriemhildstr. 15, 44793 Bochum -
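A minimal sketch of the non-atomic variant described above: create the file with safe permissions (600), write it, then adjust ownership and mode on the open descriptor. The function name and its parameters are hypothetical; the calls used (`os.open`, `os.fchown`, `os.fchmod`) are the Python wrappers for the corresponding system calls.

```python
import os


def write_then_set_mode(path, data, mode, uid=-1, gid=-1):
    """Create `path` as 0600, write `data`, then apply owner and final mode."""
    # O_EXCL: refuse to reuse an existing file whose permissions we did not choose.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
        fh.flush()
        if uid != -1 or gid != -1:
            # Usually requires root; -1 leaves that id unchanged.
            os.fchown(fh.fileno(), uid, gid)
        # Set the mode last, so the file is never more open than 0600
        # until it has its final content and owner.
        os.fchmod(fh.fileno(), mode)
```

Until the final `fchmod`, the file exists with mode 600, which is safe against leaking content but may be too restrictive for whatever reads it during that window.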
the problem with this in /etc is if you write the wrong file as 600 you can cause lots of nasty problems to the system during the window. for some files/directories you will want to write the file to a temp name and then move the file atomically to the final location. git itself shouldn't need to worry about this, the external write routine I'm talking about is the correct place for this (at least until all the bugs get worked out and everyone is comfortable that everything is good, and doesn't impact the core git code badly) David Lang -
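A minimal sketch of the write-to-a-temp-name-then-move approach, assuming the temp file can live in the same directory as the target (rename is only atomic within one filesystem). The function name `atomic_write` and the `.tmp-` prefix are made up for illustration; this is one possible shape for the external write routine, not something git provides.

```python
import os
import tempfile


def atomic_write(path, data, mode):
    """Write `data` under a temp name, set its mode, then rename over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates the file with mode 0600 in the target directory.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fchmod(fh.fileno(), mode)  # final mode is set before the file is visible
            os.fsync(fh.fileno())
        # Readers see either the old file or the complete new one, never a
        # half-written file or one with the wrong permissions.
        os.replace(tmp, path)
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

Ownership could be set with `os.fchown` at the same point as the mode if the routine runs as root. The design choice is that all the risky steps happen on a name nothing else reads, so the window described above never exists for the real path.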
Hi, Would they be acceptable for you? If so, go ahead. If not, don't. Hth, Dscho -
frankly, unless I am willing to fork git (which I am not) it matters a whole lot less if such a change is acceptable to me than if it is acceptable to the maintainers. if it's not acceptable to the maintainers as a concept then it's not worth going to the effort of producing the patches as they will just be rejected. David Lang -
I also think such a configuration option would be cool. Not only for tracking /etc or /home but also for example for "web applications" (for example in PHP). In that case file and directory permissions can be as important as the source code tracked and it is a pain to chmod (and sometimes chown) all files to different values after each checkout. Not to mention the potential race. Thanks, Grzegorz Kulewski -
>>>>> "Grzegorz" == Grzegorz Kulewski <kangur@polcom.net> writes: Grzegorz> Not only for tracking /etc or /home but also for example for "web Grzegorz> applications" (for example in PHP). In that case file and directory Grzegorz> permissions can be as important as the source code tracked and it is pain to Grzegorz> chmod (and sometimes chown) all files to different values after each Grzegorz> checkout. Not speaking about potential race. Uh, works just fine for me to manage my web site content. Th
