Hello,

I'd like to track my huge /etc directory using Git, but I'm feeling
uncomfortable doing it.

First, the idea is to see and check what system config tools do.
Second, it will be useful to track my own configurations that I need
to customize.

But as I said, I'm a bit worried about creating a git repository as root,
even if I trust git, which does a really good job.

Are there any tricks or pitfalls that I should know about before starting?
Could anyone share their experience?

Any advice is welcome.

thanks
--
Francis
-
I found git to be unsuitable for /etc maintenance, which is really
a shame. The problem is with permissions; see this thread:
http://thread.gmane.org/gmane.comp.version-control.git/56221/focus=56372

--
martin; (greetings from the heart of the sun.)
\____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

windows 2000: designed for the internet.
the internet: designed for unix.

spamtraps: madduck.bogus@madduck.net
-
Hello,
Indeed, that seems to be a major drawback for this kind of usage.
Did you find an alternative to git in this case?
thanks
--
Francis
-
No, and I did not look anywhere, but I know of no other VCS that can
adequately track permissions.

The solution, IMHO, is to enhance git with a configuration
option/policy which enables users to choose which permission
bits should be stored and restored. If I had the time, I'd dig into
this right away, since I cannot imagine that it's a difficult
endeavour. It might be more difficult to get the patch past Junio
than to actually write and test it. :)

--
martin
-
Has anyone checked out metastore? http://repo.or.cz/w/metastore.git
... there's an XML error in there somewhere, so it's not loading the
'main' page, but http://repo.or.cz/w/metastore.git?a=shortlog should
work.

It looks like it could work... any thoughts on this?
--
Thomas Harning Jr.
-
This looks interesting, though I guess getfacl/setfacl and
getfattr/setfattr can pretty much do the same job, especially if you
can call them from shell scripts/hooks (except for mtime). Or did
I misunderstand something?

The problem with metadata getting corrupted, which Nicolas reported,
may well have to do with the use of a single file. It may be worth
considering a shadow hierarchy of files, each containing the
metadata, e.g. for a project with foo, and bar/foo and bar/baz
files, you might have

  .metastore/foo
  .metastore/bar/.<uniqueid>.dir
  .metastore/bar/foo
  .metastore/bar/baz

and each file could just be an rfc822-style file:

  Owner: root
  Group: root
  Mode: 4754
  Mtime: 1234567890
  Fattr-<key1>: <value1>
  Fattr-<key2>: <value2>

This would be my approach, which should probably be a little better
at preventing corruption.

Anyway, this *really* should go into git itself!
--
martin
-
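The per-file record format proposed in the message above can be illustrated
in a few lines of Python. This is only a sketch under assumptions: the
.metastore directory name and the Owner/Group/Mode/Mtime keys come from the
message, while the helper names (shadow_path, format_record, parse_record)
are made up for illustration. Owners are written as numeric IDs here, and the
Fattr-* lines and the .<uniqueid>.dir entries are left out.

```python
import os
import stat
import tempfile
from email.parser import Parser


def shadow_path(root, relpath):
    """Hypothetical location of the metadata record for root/relpath."""
    return os.path.join(root, ".metastore", relpath)


def format_record(path):
    """Render one file's metadata as rfc822-style 'Key: value' lines."""
    st = os.lstat(path)
    fields = [
        ("Owner", str(st.st_uid)),  # numeric IDs; name mapping is left out
        ("Group", str(st.st_gid)),
        ("Mode", format(stat.S_IMODE(st.st_mode), "04o")),
        ("Mtime", str(int(st.st_mtime))),
    ]
    return "".join(f"{key}: {value}\n" for key, value in fields)


def parse_record(text):
    """Parse a record produced by format_record back into a dict."""
    message = Parser().parsestr(text, headersonly=True)
    return dict(message.items())


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as handle:
        os.chmod(handle.name, 0o640)
        print(format_record(handle.name), end="")
```

Because each record is a small text file of its own, a damaged record would
affect one path only, which is the robustness argument made in the message.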
Hi,
Then the tool is corrupt. Introducing a shadow hierarchy, as you propose,
No. Git is a source code management system. Everything else that you can
do with it is a bonus, a second class citizen. Should we really try to
support your use case, we will invariably affect the primary use case.

Ciao,
Dscho
-
also sprach Johannes Schindelin <Johannes.Schindelin@gmx.de> [2007.09.15.16=
I thought git was primarily a content tracker... so it all comes
down to how to define content, doesn't it? But either way, we need
not discuss that because that definition depends a lot on context
and purpose and thus cannot be answered once and for all.

I understand that for the primary use case, tracking nothing more
than +x makes sense and should not be interfered with. This is why
I was proposing a policy-based approach. The primary use case is
unaffected, it's the default policy. Someone may choose to track
other mode bits or file/inode attributes, according to one of
several policies available with git, or even a custom policy. In
that case, the repository needs to be appropriately configured.

The reason why I say this should be done inside git rather than with
hooks and an external tool such as metastore is quite simple: git
knows about every content entity in any tree of a repo and already
has a data node for each object. Rather than introducing a parallel
object database (shadow hierarchy or single file), it would make
a lot more sense and be way more robust to attach additional
information to these object nodes, wouldn't it?

So with "appropriately configured" above, I meant that one should be
able to say

  git-config core.track all

or

  git-config core.track mode+attr

or the default:

  git-config core.track 7666

(read that as a umask, which masks out everything but the three
x bits. I made it 7666 instead of 7677 because core.umask and
core.sharedrepository then override the group and world bits if
needed)

and have git do the right thing, rather than expecting those who
want to track more than the executable bit to assemble a brittle set
of hooks and metadata collectors+applicators and hope it all works.

I understand also that this is not top priority for git, which is
why I said earlier in the thread that the real difficulty might be
to get Junio to accep...
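To make the umask reading of the proposed value concrete, here is a small
Python sketch of the arithmetic. Note that core.track is only a proposal in
this thread and not an existing git option; the function names below are
made up for illustration.

```python
import stat

# Hypothetical default from the proposal: mask out everything but the x bits.
DEFAULT_TRACK_MASK = 0o7666


def tracked_bits(mode, track_mask=DEFAULT_TRACK_MASK):
    """Return the permission bits that survive the umask-style mask."""
    return stat.S_IMODE(mode) & ~track_mask & 0o7777


def modes_differ(mode_a, mode_b, track_mask=DEFAULT_TRACK_MASK):
    """Two modes count as changed only if a tracked bit differs."""
    return tracked_bits(mode_a, track_mask) != tracked_bits(mode_b, track_mask)


if __name__ == "__main__":
    # With the default mask, only the three execute bits are compared.
    print(oct(tracked_bits(0o4754)))    # 0o110
    print(modes_differ(0o644, 0o600))   # False
    # A mask of 0 would correspond to tracking every mode bit.
    print(oct(tracked_bits(0o4754, 0)))  # 0o4754
```

Under this reading, 0644 and 0600 compare equal with the default mask, so
the primary use case would see no new differences.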
Configuration options only apply to the local aspects of the repository.
That is, when you clone a repository, you don't get the configuration
options from it, in general. And changing configuration options on a
repository does not have any effect on the content it contains. So

Git doesn't have any way to represent owners or groups, and they would
need to be represented carefully in order to make sense across multiple
computers. If you're adding support for metadata-as-content (for more than
"is this a script?"), you should be able to cover all of the common cases
of extended stuff, like AFS-style ACLs. And if you want to allow
meaningful development with this mechanism (as opposed to just archival of
a sequence of states of a live system), the normal case will be that the
metadata beyond +x is manipulated by ordinary users in some way other than
modifying their working directory. So the normal case here will be like
working on a filesystem that doesn't support symlinks or an executable bit
when this is important content.

	-Daniel
*This .sig left intentionally blank*
-
Sure they are. Just like git-commit figures out your email address
if user.email is missing from git-config, or core.sharedRepository
or core.umask deal with permissions only when you tell them to,
you'd have to enable core.track or else git would just do what it

Ideally, git should be able to store an open-ended number of

... and yet, we support symlinks and executable files. But anyway,
I really don't understand what you're trying to say.

--
martin
-
I haven't followed the discussion at all I must admit (I wrote metastore
as a quick hack to store some extended metadata and it works for my
purposes as long as I don't do anything fancy). But I agree, if any
changes were made to git, I'd advocate adding arbitrary attributes to
files (much like xattrs) in name=value pairs, then any extended metadata
could be stored in those attributes and external scripts/tools could use
them in some way that makes sense...and also make sure to only update
them when it makes sense.

--
David Härdeman
-
So where would that metadata be stored, in your opinion?

--
martin
-
I'm not sufficiently versed in the internals of git to have an informed
opinion :)

--
David Härdeman
-
My theory was that we would provide an API for getting the "current state"
listing with all of the filenames and matching contents, and leave it up
to metastore to put things in the filesystem; in the other direction,
metastore would build up this state, and we'd store it.

People who are using this in practice would set a config option to
delegate the "working tree" filesystem I/O to metastore, while other
people could interact with the state as files describing the state, and
could therefore specify operations that are impossible or prohibited on
the filesystems that their development is done on.

(This would effectively be like giving people a convenient way of setting
attributes on entries in a tar file, such that they can edit it to
represent a state that they can't necessarily create in their own
filesystems, and version controlling that; but more convenient, since the
file contents are represented as file contents and the attributes are
plain text in a listing of some sort)

	-Daniel
*This .sig left intentionally blank*
I think we have something like a length count for file names in index
and/or tree. We could just put the (sorted) attributes after a NUL
byte in the file name and include them in the count. It would also
make those artificially longer file names work more or less when
sorting them for deltification.

However, this requires implementing _policies_: it must be possible to
specify per repository exactly what will and what won't get tracked,
or one will get conflicts that are not necessary or appropriate.

--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
-
Or perhaps the index format could be extended to include a new field for
name=value pairs instead of overloading the name field.

But as I said, I have no idea how feasible it would be to change git to

I think the opposite approach would be better. Let git provide
set/get/delete attribute operations and leave it at that. Then external
programs can do what they want with that data and add/remove/modify tags
as necessary (and also include the smarts to not, e.g., remove the
permissions on all files if the git repo is checked out to a FAT fs).

--
David Härdeman
-
You need more than that. You need to be able to log, blame etc on the
attributes. One of the big annoyances of Subversion properties is being
unable to find out when or why a property value was changed.

I still don't see why the attributes need to be stored in git directly -
particularly if you are going to use an external program to actually apply
any settings - why not store the attributes as a normal file (or files) of
some sort tracked by git? You could use any number of methods - e.g. use
an sqlite database stored in the root of your tree, or a .<name>.props
file alongside each path that you have properties for. You could even
write a system that uses such a method and was then SCM agnostic, allowing
you to keep your attribute tracking system if/when something better than
git comes along - or simply share it with less-fortunate souls stuck in an
inferior system.

--
Julian

---
A strong conviction that something must be done is the parent of many
bad measures.
		-- Daniel Webster
> On Tue, 2 Oct 2007, David H
Hi,
  git show :<filename>

(Note the ":")

Hth,
Dscho
-
I like that idea.
--
martin
-
On Tue, 2 Oct 2007, David Kastrup wrote:
> David H
In which case you should not be able to manipulate them (as you
could not test the result) and any commits could not affect them,
meaning they'd just stay unchanged.

--
martin
-
two problems with this
1. you do want to be able to manipulate them
1a. how do you reconcile a conflict during a merge?
2. git is a series of snapshots, what does it mean to 'stay unchanged'?
David Lang
-
How could there be a conflict if you can't make local changes
In simple terms, let (content,A,B) be an object with content
"content" and extended attributes A,B, and B cannot be represented
locally, but a new object is committed with a change to attribute
A (content2,A2), then the result is (content2,A2,B), as B simply
comes from the (corresponding object of the) parent.

Or am I totally misunderstanding?

--
martin
-
it's very possible that I am misunderstanding, but do we really want to
have to go back to the parent to duplicate things when creating a new
commit?

and aren't you supposed to be able to have more than one parent? if you
do, which one would you use?

David Lang
-
You win. :)
--
martin
-
don't underestimate the usefulness of the ability to archive and restore
snapshots of a live system. just that ability would be wonderful to have.

the ability to checkout a copy of things elsewhere and tinker with it
would be better, but the lack of that doesn't eliminate the utility by any
means.
-
Hi,
[speaking mostly to the proponents of git-as-a-backup-tool]
While at it, you should invent a fallback for what to do when the owner is
not present on the system you check out on. And a fallback when checking out
on a filesystem that does not support owners.

And a fallback when a non-root user uses it.

Oh, and while you're at it (you said that it would be nice not to restrict
git in any way: "it is a content tracker") support the Windows style
"Group-or-User-or-something:[FRW]" ACLs.

Looking forward to your patches,
Dscho
-
also sprach Johannes Schindelin <Johannes.Schindelin@gmx.de> [2007.09.16.00=
Like rsync, git would use numerical UIDs (which are always present)
by default, but could be told to try to map account names.

If the filesystem does not support owners, chown() would not exist.
I actually tend to think of things the other way around: instead of
a fallback when chown() does not work (what would such a fallback be
other than not chown()ing?), it would only try chown() if such

That's easy, Unix already provides you with that "fallback": pack up

Provided we find a way to implement this in an extensible manner,
this should not be hard to do. I can't do it since I don't have
access to a Windows machine.

Your statement does catch me off-guard though. Does git now
officially target Windows?

--
martin
-
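The rsync-like owner handling described at the start of the message above
(numeric UIDs by default, account-name mapping on request) could look
roughly like this in Python. This is a sketch, not anything git implements:
resolve_uid and its parameters are hypothetical, and pwd is the standard
Unix-only password database module.

```python
import pwd


def resolve_uid(recorded_uid, recorded_name=None, map_names=False,
                lookup=pwd.getpwnam):
    """Pick the UID to chown to: numeric by default, by name on request."""
    if map_names and recorded_name is not None:
        try:
            return lookup(recorded_name).pw_uid
        except KeyError:
            # Unknown account on this system: fall back to the number.
            pass
    return recorded_uid


if __name__ == "__main__":
    # Default policy: the recorded number is used as-is.
    print(resolve_uid(1234, "alice"))
    # Name mapping requested, but the account does not exist here.
    print(resolve_uid(1234, "no-such-account-xyz", map_names=True))
```

The lookup parameter exists only so the mapping can be exercised without
depending on the accounts present on the local machine.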
There's a problem. You need to know that the functionality is missing and not
try to read attributes back, but instead consider them unchanged. Nothing

But if you tar that up again, the owners will be different. But you don't

Official git works in cygwin. There is also a port to msys, which is
not official in the sense that it is not merged into mainline.

--
Jan 'Bulb' Hudec <bulb@ucw.cz>
This is a good consideration. One way of implementing this seems to
be to iterate over all file attributes recorded in the object cache
(or metastore) and try to apply each. For every attribute that was
properly applied to the worktree, a note is attached to the object's
data in the index. Tools identifying differences between index and

As per my above suggestion, this would solve itself. Untarring as
non-root simply means that the chmod/chown/whatever calls would fail
or not be tried at all. Thus, they would not be recorded in the
index and later commits would never consider changes to these
attributes.

One could probably simplify the implementation such that failure to
chmod/chown/whatever a single file would make the attribute be
ignored when worktree and index are compared. Then, it would all
boil down to a combination of configuration and functionality: the
attributes the user wants to have tracked (configuration) and those
which can be applied to the worktree, when logically and'ed, result in
the final mask of attributes to consider when identifying changes.

--
martin
-
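The "logically and'ed" mask from the message above can be written down
directly. The following Python sketch assumes attributes are identified by
simple names such as "mode" or "owner"; those names and all three functions
are illustrative only, and the root check is a deliberate simplification.

```python
import os


def applicable_here():
    """Attributes this process can plausibly apply (simplified guess)."""
    attrs = {"mode", "mtime"}
    if os.geteuid() == 0:  # simplification: only root may chown freely
        attrs |= {"owner", "group"}
    return attrs


def effective_attributes(configured, applicable):
    """Configured AND applicable: the attributes worth comparing."""
    return set(configured) & set(applicable)


def changed_attributes(index_meta, worktree_meta, considered):
    """Names in 'considered' whose values differ between index and worktree."""
    return {
        name for name in considered
        if index_meta.get(name) != worktree_meta.get(name)
    }


if __name__ == "__main__":
    considered = effective_attributes({"mode", "owner", "group"},
                                      applicable_here())
    print(sorted(considered))
```

With this mask, an owner difference on a non-root checkout is simply never
reported, which matches the behaviour argued for in the message.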
but this can be handled by a local config option. yes, you have to be
careful, but it's not that hard.

David Lang
-
git has pre-commit hooks that could be used to gather the permission
information and store it into a file.

git now has the ability to define custom merge strategies for specific file
types, which could be used to handle merges for the permission files.

what git lacks the ability to do is to deal with special cases on
checkout.

the handling of gitattributes came really close, but there are two
problems remaining.

1. whatever is trying to write the files with the correct permissions
   needs to be able to query the permission store before files are
   written. This needs to either be an API call into git to retrieve the
   information for any file when it's written, or the ability to define a
   specific file to be checked out first so that it can be used for
   everything else.

2. the ability to specify a custom routine/program to write the file out
   (assuming that it's being written to a filesystem not a pipe). this
   routine would be responsible for querying the permission store and
   doing 'the right thing' when the file is written during a checkout

there are some significant advantages of having the permission store be
just a text file.

1. it doesn't require a special API to a new datastore in git

2. when working in an environment that doesn't allow for implementing the
   permissions (either a filesystem that can't store the permissions or
   when not working as root so that you can't set the ownership) the file
   can just be written and then edited with normal tools.

3. normal merge tools do a reasonable job of merging them.

however to do this git would need to gain the ability to say 'this
filename is special, it must be checked out before any other file is
checked out' (either on a per-directory or per-repository level)

if this is acceptable then altering the routines that write the files to
have the additional option of calling a different routine based on the
settings in .gitattributes seems relatively simple. there should alre...
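The first idea in the message above, a pre-commit step that gathers
permissions into a plain text file, might be sketched in Python as follows.
The one-line-per-path format ("mode uid gid path") and the function name are
assumptions made for illustration; a real hook would also have to obtain the
list of tracked paths, for instance from git ls-files, and would need a plan
for path names containing newlines.

```python
import os
import stat
import tempfile


def collect_permissions(root, paths):
    """Return text with one 'mode uid gid path' line per path, sorted."""
    lines = []
    for rel in sorted(paths):
        st = os.lstat(os.path.join(root, rel))
        mode = stat.S_IMODE(st.st_mode)
        lines.append(f"{mode:04o} {st.st_uid} {st.st_gid} {rel}\n")
    return "".join(lines)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for name, mode in (("hosts", 0o644), ("shadow", 0o600)):
            path = os.path.join(root, name)
            open(path, "w").close()
            os.chmod(path, mode)
        print(collect_permissions(root, ["shadow", "hosts"]), end="")
```

Sorting the paths keeps the file stable between commits, so ordinary
line-based diff and merge tools see only real permission changes.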
You seem to be forgetting about the index. Git never writes trees directly to
the filesystem, but always with an intermediate step in the index. So the API
actually exists -- simply read from the index.

--
Jan 'Bulb' Hudec <bulb@ucw.cz>
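For a helper program, "simply read from the index" can be done with the
git show :<filename> form mentioned elsewhere in this thread. A minimal
Python sketch follows; the function names are made up, and the example file
name .permissions is only a placeholder for whatever the permission file
would be called.

```python
import subprocess


def index_show_command(path):
    """Build the git command that prints the index version of a path."""
    # The leading ":" asks git for the blob recorded in the index.
    return ["git", "show", f":{path}"]


def read_from_index(path, cwd=None):
    """Return the staged contents of 'path' as bytes (needs a git repo)."""
    result = subprocess.run(
        index_show_command(path),
        cwd=cwd,
        check=True,            # raise if git reports an error
        capture_output=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(index_show_command(".permissions"))
```

Reading the index copy rather than the working-tree copy sidesteps the
ordering problem of having to check the permission file out first.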
Ok, this sounds promising.
looking into one approach here.
assume for the moment that at write time an external program gets called.
this program reads the file contents from stdin and gets its other
information from git as command line parameters

parameters I can think it would need are

  path to write the file to
  length of file
  name of the permission file
  id of the commit this is part of (possibly)

how does this program access the contents of the permission file in the
index?

David Lang
-
I'd rather not implement it at such a low level where a true
"checkout" happens. For one thing, I am afraid that the special
casing will affect the normal codepath too much and would make
it into a maintenance nightmare. But more importantly, if you
are switching between commits (this includes switching branches,
checking out a different commit to a detached HEAD, or
pulling/merging updates your HEAD and updates your work tree),
and the contents of a path does not change between the original
commit and the switched-to commit, you may still have to
"checkout" the external information for that path if your
"permission information file" is different between these two
commits. To the underlying checkout aka "two tree merge"
operation, that kind of change is invisible and it should stay
so for performance reasons, not to harm the normal operation.
IOW, I do not want the core level to even know about the
existence of "permission information file", even if the code that
implements it is well isolated, ifdefed out or made conditional
based on some config variable.

I however think your idea to have extra "permission information
file" is very interesting. What would be more palatable, than
mucking with the core level git, would be to have an external
command that takes two tree object names that tells it what the
old and new trees our work tree is switching between, and have
that command:

 - inspect the diff-tree output to find out what were checked
   out and might need their permission information tweaked;

 - inspect the differences between the "permission information
   file" in these trees to find out what were _not_ checked out,
   but still need their permission information tweaked.

 - tweak whatever external information you are interested in
   expressing in your "permission information file" in the work
   tree for the paths it discovered in the above two steps.
   This step may involve actions specific to projects and call
   hook scripts with <path, info from "pe...
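The first two steps of the external command outlined above amount to a set
computation. Here is a hedged Python sketch, assuming the permission file
uses one "mode uid gid path" line per path (a made-up format) and that the
list of checked-out paths has already been obtained from the diff-tree
output; neither function is part of git.

```python
def parse_permission_file(text):
    """Map path -> (mode, uid, gid) from 'mode uid gid path' lines."""
    entries = {}
    for line in text.splitlines():
        mode, uid, gid, path = line.split(" ", 3)
        entries[path] = (mode, uid, gid)
    return entries


def paths_to_tweak(checked_out, old_perms, new_perms):
    """Paths whose permissions must be (re)applied after a switch."""
    # Step 1: paths that were rewritten and have recorded permissions.
    tweak = {path for path in checked_out if path in new_perms}
    # Step 2: paths not rewritten, but whose recorded permissions changed.
    tweak |= {
        path for path, meta in new_perms.items()
        if old_perms.get(path) != meta
    }
    return sorted(tweak)


if __name__ == "__main__":
    old = parse_permission_file("0644 0 0 passwd\n0600 0 0 shadow\n")
    new = parse_permission_file(
        "0644 0 0 passwd\n0640 0 42 shadow\n0644 0 0 hosts\n"
    )
    print(paths_to_tweak(["hosts", "motd"], old, new))
```

In the example, shadow is picked up even though its contents did not change
between the two trees, which is exactly the case the message warns about.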
some of this duplicates thoughts from other messages in this thread.
apologies for the duplication, but I want to be clear in the response to
Junio's concerns here as well

as I understand it, at this point you already choose between three
options.

1. write to a file (and set the write bit if needed)
2. write to stdout
3. write to a pager program

I am suggesting adding

4. write to a .gitattributes defined program and pass it some parameters.
   (and only if the .gitattributes tell you to)

this should be a very small change to the codepath

or am I missing something major here?

if this program can get the contents of the permission file out of the
index, then the requirement I listed before to make sure the permission
file gets written before anything else goes away, and the only requirement

I had not thought of this condition.

however, I think this may be easier than you are thinking

we have two conditions.

1. the permission file hasn't changed.
   Solution: do nothing

2. the permission file has changed
   Solution: set all the permissions to match the new file

this could be done by using .gitattributes to specify a different program
for checking out the permission file, and that program goes through the
file and sets the permissions on everything. yes this is a bit inefficient
compared to diffing the two permission files and only touching the files
that have changed, but is the efficiency at this point that critical? if
so then instead of feeding the program the contents of the new file you
could feed it the diff between the old and the new file.

in theory you could do this for any file, and it would be a win for some
files (a large file that has a few changes to it would possibly be more
efficient to modify in place than to re-write), but I'm not sure the
results would be worth the complications. if .gitattributes gains the
ability to specify the program to be used to write the file, it could also
gain the ability to specify feeding th...
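The two conditions in the message above (permission file unchanged: do
nothing; changed: set everything to match the new file) translate into very
little code. The Python sketch below assumes a "mode uid gid path" line
format and hypothetical function names; it applies only the mode, because
changing owners would require root.

```python
import os
import stat
import tempfile


def apply_permissions(root, perm_text):
    """chmod every listed path under root; return the paths touched."""
    applied = []
    for line in perm_text.splitlines():
        mode, _uid, _gid, path = line.split(" ", 3)
        # Ownership is ignored in this sketch: chown needs privileges.
        os.chmod(os.path.join(root, path), int(mode, 8))
        applied.append(path)
    return applied


def on_checkout(root, old_text, new_text):
    """Condition 1: unchanged, do nothing. Condition 2: apply all."""
    if old_text == new_text:
        return []
    return apply_permissions(root, new_text)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "motd")
        open(target, "w").close()
        os.chmod(target, 0o600)
        print(on_checkout(root, "", "0644 0 0 motd\n"))
        print(oct(stat.S_IMODE(os.lstat(target).st_mode)))
```

Feeding the program a diff of the two permission files instead, as the
message suggests, would only change which lines reach apply_permissions.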
I do not think we are choosing any option in the codepath at
all.

What I mean by the normal "checkout" is what checkout_entry in
entry.c does. There is no other option than (1) above. I would
want to see an extremely good justification if you need to touch
that codepath to implement this fringe use case.

I do not think there is anything that writes file contents to
stdout/pager other than "git cat-file" or "git show"; I do not
think they are what you have in mind when talking about managing
the files under /etc. So unfortunately I do not understand the
rest of the discussion you made in your message.
-
Ok, I thought that there was common code for these different uses. could
you re-read the rest of the logic based on the change being done in
checkout_entry?

if you are unwilling to have any changes made to the checkout_entry code
then the only remaining question is what you think of Daniel's suggestion to
have a hook to replace check_updates()?

if it's not acceptable either then we are down to doing a post-checkout
trigger.

one concern I have with that approach is how to deal with partial
checkouts. if a user checks out one file how can the post-checkout trigger
know if it's looking at the correct permissions file as opposed to one
left over from something else? can/should it go and read the file from the
index instead of reading the file on the filesystem? (I don't like this
because it leads to non-obvious behavior), or can/should there be a config
option to say that whenever any file is checked out the permissions file
needs to be checked out as well.

a post checkout trigger is useful in enough different situations that the
answers to the above questions don't eliminate the usefulness of the
trigger, they just map out the pitfalls of using it.

David Lang
-
Post-checkout trigger is something I can say I can live with
without looking at the actual patch, but that does not mean it
would be a better approach at all.

I would not be able to answer the first question right now; that
needs a patch to prove that it can be done with a well contained
set of changes that results in maintainable code.

I haven't tried to assess the potential extent of damage needed
to checkout_entry(), and I have never been interested in this
"keeping track of /etc in place" topic myself. It is unlikely
I'll try to come up with such a patch on my own to support it at
such a low level near the core. Somebody who cares about that
feature needs to take the initiative of doing that work before
we can discuss and decide, although old-timers including myself
can help spot potential issues.

So while I admit I am skeptical, consider me neither willing nor
unwilling at this point.
-
you cannot answer the question in the affirmative, but you could say that
any changes in that area would be completely unacceptable to you (and for
a while it sounded like you were saying exactly that). in which case any

this is reasonable. thanks for pointing me so clearly at the routine that
needs to be modified.

David Lang
-
I tend to disagree. It's far from a waste of time. While, as I
said, I am skeptical that such a patch would be small impact, if
it helps people's needs, somebody will pick it up and carry
forward, even if that somebody is not me. It can then mature
out of tree and later could be merged. We simply do not know
unless somebody tries. And I am quite happy that you seem to be
motivated enough to see how it goes.

On the other hand, the experiment could fail and you may end up
with a patch that is too messy to be acceptable, in which case
you might feel it a waste of time, but I do not think it is a
waste even in such a case. We would learn what works and what
doesn't, and we can lay the "keeping track of /etc" topic to rest.

I also need to rant here a bit.
Fortunately we haven't had this problem too many times on this
list, but sometimes people say "Here is my patch. If this is
accepted I'll add documentation and tests". I rarely reply to
such patches without sugarcoating my response, but my internal
reaction is, "Don't you, as the person who proposes that change,
believe in your patch deeply enough to be willing to perfect it,
in order to make it suitable for consumption by the general
public, whether it is included in my tree or not? A change that
even you do not believe in yourself has very little chance of
benefitting the general public, so thanks but no thanks, I'll
pass."
-
There's certainly the possibility that a changeset could consist of some
patches that make the index/filesystem handling more clear, some patches
that make the tree/index handling more clear, and some patches that allow
a hook to replace one of these entirely. Things can be a lot more
acceptable if the intrusive changes are improvements for the
maintainability of the normal case, and the special case code is no longer
intrusive at all.

	-Daniel
*This .sig left intentionally blank*
-
Very well said.
-
this is perfectly acceptable to me. I was trying to make very sure that
this topic fell in this category.

there are other topics that come up repeatedly that do get (and deserve)
automatic rejections ('patch to explicitly record renames' for example).
and while I didn't think that 'managing /etc' was in the same category,
sometimes that category is defined as much by the opinions and goals of
the core team as it is by technical considerations.

there's a huge difference between 'this patch is rejected because we think
the implementation is bad' and 'this patch is rejected because we disagree
with the fundamental goal of the patch'. effort spent on a patch rejected
for the first reason is never a complete waste (if nothing else it can
serve as an example of how not to do things for future developers ;-) but
effort spent on a patch that's rejected for the second reason is usually
a waste, and as such I make it a point to discuss the objective and basic

I hope that my questions did not seem to fall into this category.

David Lang
-
Not at all.
-
Why not have the command also responsible for creating the files that need
to be created (calling back into git to read their contents)? That way,
there's no window where they've been created without their metadata, and
there's more that the core git doesn't have to worry about.

I could see the program getting the index, the target tree, and the
directory to put files in, and being told to do the whole 2-way merge
(except, perhaps, updating the index to match the tree, which git could do
afterwards). As far as git would be concerned, it would mostly be like a
bare repository.

	-Daniel
*This .sig left intentionally blank*
-
my initial thoughts were to have git do all its normal work and hook into
git at the point where it's writing the file out (where today it chooses
between writing the data to a file on disk, piping to stdout, or piping
to a pager) by adding the option to pipe into a different program that
would deal with the permission stuff. this program would only have to
write the file and set the permissions, it wouldn't have to know anything
about git other than where to find the permissions it needs to know.

it sounds like you are suggesting that the hook be much earlier in the
process, and instead of one copy of git running and calling many copies of
the writing program, you would have one copy of the writing program that
would call many copies of git.

I'll admit that my initial reaction is that it's probably a lot more
expensive to do all the calls into git. git just has a lot more complex

if this functionality does shift to earlier in the process, how much of
the git logic needs to be duplicated in this program?

if this program needs to do the merge, won't it have to duplicate the
merge logic, including the .gitattributes checking for custom merge calls?

I have been thinking primarily in terms of doing a complete checkout,
overwriting all files, and secondarily how to do a checkout of just a few
files, but again where all files selected overwrite the existing files.

I wasn't thinking of the fact that git optimizes the checkout and avoids
writing a file that didn't change.

this changes things slightly

prior to this I was thinking that the permission file needed to be handled
differently because writing it out needed to avoid doing any circular
references where you would need to check the contents of it to write it
out.

it now appears as if what really needs to happen is that if the permission
file changes a different program needs to be called when it's written out
than when the other files are written out. by itself this isn't hard as
.gitattribu...
A lot of the git commands are actually currently shell scripts that call
back to git, so that's not too different. The reason to have a single copy
of the writing program is that it would be able to get the whole set of
differences that need to be handled, and first pick out the metadata file,
process it to figure out the writing instructions once, figure out the
changes in the writing instructions, and figure out the changes in the

This is two-way merge, not three-way merge. The basic concept is that
you're in state A, and you want to be in state B. Rather than writing out
all of state B, you write out all of state B that's different from state
A. Think of taking a diff of two big trees and then applying it as a
patch, instead of copying the new tree onto the old tree; the benefit is
that stuff that doesn't change doesn't get rewritten, and the diff is
blazingly fast, given how we store our information.

3-way merge will be handled by git, and not in a live /etc directory
anyway (that is, you'd want to fix up the metadata files as plain text
files, not as metadata bits on a checked out directory; otherwise, you'll
be trying to put conflict markers in mode bits, and that's clearly not

While we're at it, you probably don't even want to write the permission
file to the live filesystem. It's just one more thing that could leak
information, and changes to the permissions of files that you record by
committing the live filesystem would presumably be done by changing the
permissions of files in the filesystem, not by changing the text file.

(Of course, you could check out the same commits as ordinary source, with
developer-owned 644 files and a 644 "permissions" file, and there you'd
have the permissions file appear in the work tree, and you could edit it

You probably want to be able to keep local uncommitted changes. People
like to be able to have things slightly different in their particular
deployment from the way things are in the repository, for stuff...
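
The two-way update described above (go from state A to state B by touching
only what differs) can be sketched roughly as follows. This is an
illustration only, not git's unpack-trees code: the dict-based tree
representation and the name `two_way_changes` are assumptions made for the
example.

```python
from typing import Dict, List, Tuple

# Hypothetical tree representation: path -> (mode, content id).
Tree = Dict[str, Tuple[int, str]]


def two_way_changes(old: Tree, new: Tree) -> List[Tuple[str, str]]:
    """Return the (action, path) steps that turn state `old` into state `new`.

    Paths that are identical in both states produce no step, so unchanged
    files are never rewritten.
    """
    steps = []
    # Anything present only in the old state has to go away.
    for path in sorted(old.keys() - new.keys()):
        steps.append(("delete", path))
    for path in sorted(new):
        if path not in old:
            steps.append(("create", path))
        elif old[path] != new[path]:
            old_mode, old_id = old[path]
            new_mode, new_id = new[path]
            # A content change needs a rewrite; a mode-only change needs chmod.
            steps.append(("rewrite" if old_id != new_id else "chmod", path))
    return steps


state_a = {"passwd": (0o644, "a1"), "shadow": (0o640, "b2"), "motd": (0o644, "c3")}
state_b = {"passwd": (0o644, "a1"), "shadow": (0o600, "b2"), "hosts": (0o644, "d4")}
changes = two_way_changes(state_a, state_b)
```

Here `passwd` is untouched, `shadow` only needs its mode changed, and only
`hosts` and `motd` involve creating or deleting a file.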
I'm still a little unclear on how much work this program would then have
to do. it's probably my lack of understanding that's making this sound

so what would this program be given?
it sounds like it would be called once for the entire tree checkout
would it be handed just the start and end commits and query git for
everything else it needs?

it sounds like there is more than this, you refer to git fully crafting
the new index.

so would this program be accessing an old and new index and do the
comparison between the two?

or would git feed it a list of what's changed and then have it query git

right, we don't want conflict markers on mode bits or other ACL type

the permissions and ACLs can be queried directly from the filesystem, so
I don't see any security problems with writing the permission file to the
filesystem.

changing the permissions would be done by changing the files themselves
(when you are running as root on a filesystem that supports the changes,
otherwise it would need to fall back to writing the file and getting the
changes there, but that should be able to be a local config option)

I don't like the idea of having a file that doesn't appear on the local

right, and the same thing if the filesystem doesn't support something in

if so this means that the permission changing program definitely needs to
operate on the diff of the permission file, not on the absolute file. this
complicates things slightly, but it shouldn't be too bad.

changing topic slightly.
I know git has pre-commit hooks, but I've never needed to use them.
at what point can you hook in?
can you define a hook that runs when you do a git-add? or only when you do
a git-commit?

the reason I'm asking is to try and figure out when and how to create the
permissions file. when I was thinking in terms of dealing with the
permissions as a single big block it wasn't that bad to say that at
git-commit time you have to scan every file and check its permissions ...
Reading over your thoughts, I get this uneasy feeling about such
a permissions file, because it stores redundant information, and
redundant information has a tendency to get out of sync. If we
cannot attach attributes to objects in the git database, then
I understand the need for such a metastore. But I don't think it
should be checked out and visible, or maybe we should think of it
not in terms of a file anyway, but a metastore. Or how do you want
to resolve the situation when a user might edit the file, changing
a mode from 644 to 640, while in the filesystem, it was changed by
other means to 600?

.gitattributes is a different story since it stores git-specific
attributes, which are present nowhere else in the checkout.

I still maintain it would be best if git allowed extra data to be
attached to object nodes. When you start thinking about
cherry-picking or even simple merges, I think that makes most sense.
And we don't need conflict markers, we could employ an iterative
merge process as e.g. git-rebase uses:

"a conflict has been found in the file mode of ...
... 2750 vs. 2755 ...
please set the file mode as it should be and do git-merge

I'd much rather see something like `git-attr chmod 644
file-in-index` to make this change, rather than a file, which
introduces the potential for syntax errors.

-- 
martin; (greetings from the heart of the sun.)
\____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

"to me, vi is zen. to use vi is to practice zen. every command is
a koan. profound to the user, unintelligible to the uninitiated.
you discover truth everytime you use it."
-- reddy ät lion.austin.ibm.com

spamtraps: madduck.bogus@madduck.net
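
The conflict case raised above (one side sets 640, the other 600, both
starting from 644) is an ordinary three-way merge applied to a mode rather
than to text. A minimal sketch, assuming a hypothetical helper `merge_mode`
that is not part of git:

```python
from typing import Optional


def merge_mode(base: int, ours: int, theirs: int) -> Optional[int]:
    """Three-way merge of one file mode. None means a real conflict."""
    if ours == theirs:
        return ours      # both sides agree (or neither changed)
    if ours == base:
        return theirs    # only the other side changed the mode
    if theirs == base:
        return ours      # only our side changed the mode
    return None          # both changed it differently: ask the user


# Edited to 640 in the permissions file, changed to 600 on the filesystem.
result = merge_mode(0o644, 0o640, 0o600)
```

A `None` result is the point at which a tool would stop and ask for the
mode to be set by hand, as in the 2750 vs. 2755 message above.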
each local repository would need to be configured to either recreate the
permissions file at checkin time or to use the permission file and ignore
the actual permissions on the file.

while I agree that it would be ideal to store this data inside git, I'm
more interested in getting a functional implementation, and given the
reluctance of the git core team to allow any changes to support this
use-case anything that can be done to minimize the changes needed to

and there's nothing to prevent the checkin hook from running such a

first make this usable, then if it starts getting used widely (which
would not at all surprise me, many distros are looking for good options
for doing this sort of thing, I wouldn't be surprised to see several of
them start using git if it did the job well) things can be moved from
external scripts and storage to internal capabilities as appropriate.

David Lang
-
I'd like to point out the following two posts, as I think they are
relevant to this thread:

[PATCH] example hook script to save/restore file permissions/ownership
http://marc.info/?l=git&m=118953004817642&w=2

[PATCH] post_merge hook, related documentation, and tests
http://marc.info/?l=git&m=118953004730496&w=2

The hook script above runs in a pre-commit hook to write out file
metadata to a file in the repository. It can then be run from the
post-merge hook (patch above) to restore permissions. Running it from a
post-checkout hook may be more appropriate, but post-merge seems to work
well for my purposes. The script handles merge conflicts and (in my
testing) does the right thing. I'm using it now to track metadata for
not just /etc, but an entire linux image.

It will handle merge conflicts by recognizing that the metadata file had
a conflict, and will direct the user to resolve the conflict and reset
working dir perms before allowing a commit.

-JE
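
The save/restore idea behind such a hook pair can be sketched as below.
This is not the script from the patch above. The tab-separated line format
and the names `snapshot` and `restore` are assumptions for illustration,
and paths containing tabs or newlines are not handled.

```python
import os
import stat
import tempfile


def snapshot(root: str) -> list:
    """Return 'mode<TAB>uid<TAB>gid<TAB>path' lines for regular files under root."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip the repository itself and keep the output order stable.
        dirnames[:] = sorted(d for d in dirnames if d != ".git")
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if not stat.S_ISREG(st.st_mode):
                continue  # this sketch records regular files only
            rel = os.path.relpath(full, root)
            lines.append(f"{stat.S_IMODE(st.st_mode):o}\t{st.st_uid}\t{st.st_gid}\t{rel}")
    return lines


def restore(root: str, lines: list) -> None:
    """Re-apply recorded ownership (only when running as root) and modes."""
    for line in lines:
        mode, uid, gid, rel = line.split("\t", 3)
        full = os.path.join(root, rel)
        if os.geteuid() == 0:
            # chown first: it can clear setuid/setgid bits, chmod puts them back.
            os.chown(full, int(uid), int(gid))
        os.chmod(full, int(mode, 8))


with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "secret.conf")
    with open(path, "w") as fh:
        fh.write("key = value\n")
    os.chmod(path, 0o640)
    recorded = snapshot(root)        # what a pre-commit hook would store
    os.chmod(path, 0o600)            # modes drift, e.g. after a checkout
    restore(root, recorded)          # what a post-merge hook would re-apply
    restored_mode = stat.S_IMODE(os.lstat(path).st_mode)
```

In a real setup the lines from `snapshot` would be written to a tracked
file at commit time and read back by `restore` after a merge or checkout.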
-
Well, you misread me or what I said was confusing or both. I
was suggesting the total opposite. Let git do all its normal
work, and then call your hook to munge the work tree in any way
you want.
-
so you are saying, have git write everything out as-is and then call a
program afterwards to do things? essentially a post-checkout hook?

such a hook is useful in many situations, and would allow for the workflow
where you have /etc, /etc.git, and write scripts to move things back and
forth between them.

so I do think that this is a capability that would be useful to git
overall.

however, for the specific use-case of maintaining /etc I don't think that
it's as good as having a hook at write time.

David Lang
-
I think he was replying to me, not you. I was suggesting that git stop at
the index, and let him take care of deciding how the index relates to the
work tree. That is, he'd get called instead of check_updates() in
unpack-trees. (And we might have to funnel more code paths through this
function, so that checkout-index does what read-tree -m would do, wrt
changes to the filesystem).

-Daniel
*This .sig left intentionally blank*
-
Doing this atomically involves creating the file in question by
specifying the permissions on the creat system call already, and
possibly wrap seteuid calls and similar around it for getting the
right file/ownership.

However, it is not really necessary to do this atomically: instead one
can rather create the file using safe permissions (600) at first, then
do fchown and fchmod (or chown/chmod) at some point in time afterwards
as required.

--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
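
A minimal sketch of the "create with 600, then open up" sequence described
above, using Python's wrappers for the same system calls. The function name
`write_then_open_up` and its parameters are assumptions for illustration.

```python
import os
import stat
import tempfile


def write_then_open_up(path: str, data: bytes, mode: int,
                       uid: int = -1, gid: int = -1) -> None:
    """Create `path` as 0600, write it, then set owner and final mode."""
    # O_EXCL: fail instead of reusing an existing file with looser permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
        fh.flush()
        if uid != -1 or gid != -1:
            os.fchown(fh.fileno(), uid, gid)   # -1 leaves that id unchanged
        os.fchmod(fh.fileno(), mode)           # widen only after the write


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "hosts")
    write_then_open_up(target, b"127.0.0.1 localhost\n", 0o644)
    final_mode = stat.S_IMODE(os.stat(target).st_mode)
    with open(target, "rb") as fh:
        content = fh.read()
```

The file is never readable by anyone but its owner until the final
`fchmod`, which is the property this approach relies on.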
-
the problem with this in /etc is if you do the wrong file as 600 you can
cause lots of nasty problems to the system during the window. for some
files/directories you will want to write the file to a temp name and then
move the file atomically to the final location.

git itself shouldn't need to worry about this, the external write routine
I'm talking about is the correct place for this (at least until all the
bugs get worked out and everyone is comfortable that everything is good,
and doesn't impact the core git code badly)

David Lang
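
The temp-name-then-rename variant mentioned above could look roughly like
this. It is a sketch under stated assumptions: `atomic_install` is a
hypothetical name, and the rename is only atomic when the temporary file
lives on the same filesystem as the target, which is why it is created in
the target's directory.

```python
import os
import stat
import tempfile


def atomic_install(path: str, data: bytes, mode: int) -> None:
    """Replace `path` so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates the file with mode 0600 in the target directory.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fchmod(fh.fileno(), mode)   # final mode set before it is visible
            os.fsync(fh.fileno())
        os.replace(tmp, path)              # rename(2) within one filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)                 # do not leave the temp file behind
        raise


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "resolv.conf")
    with open(target, "w") as fh:
        fh.write("nameserver 192.0.2.1\n")
    atomic_install(target, b"nameserver 192.0.2.53\n", 0o644)
    installed_mode = stat.S_IMODE(os.stat(target).st_mode)
    with open(target, "rb") as fh:
        installed = fh.read()
    leftovers = os.listdir(d)
```

Because the file only appears under its final name once it already has the
right contents and mode, there is no window in which a daemon sees a 600
file where it expects a readable one.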
-
Hi,
Would they be acceptable for you? If so, go ahead. If not, don't.
Hth,
Dscho
-
frankly, unless I am willing to fork git (which I am not) it matters a
whole lot less if such a change is acceptable to me than if it is
acceptable to the maintainers.

if it's not acceptable to the maintainers as a concept then it's not worth
going to the effort of producing the patches as they will just be
rejected.

David Lang
-
I also think such a configuration option would be cool.
Not only for tracking /etc or /home but also, for example, for "web
applications" (for example in PHP). In that case file and directory
permissions can be as important as the source code tracked and it is a pain
to chmod (and sometimes chown) all files to different values after each
checkout. Not to mention the potential race.

Thanks,
Grzegorz Kulewski
-
>>>>> "Grzegorz" == Grzegorz Kulewski <kangur@polcom.net> writes:
Grzegorz> Not only for tracking /etc or /home but also for example for "web
Grzegorz> applications" (for example in PHP). In that case file and directory
Grzegorz> permissions can be as important as the source code tracked and it is pain to
Grzegorz> chmod (and sometimes chown) all files to different values after each
Grzegorz> checkout. Not speaking about potential race.

Uh, works just fine for me to manage my web site content. The point is
that I treat git for what it is... a source code management system.
And then I have a Makefile that "installs" my source code into the live
directory, with the right modes during installation.

Why does everyone keep wanting "work dir == live dir". Ugh! The work dir is
the *source*... it gets *copied* into your live dir *somehow*. And *that* is
where the meta information needs to be. In that "somehow".

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
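
The "somehow" step described above, copying from the work dir into the live
dir with explicit modes, can be sketched as follows. This is not the
Makefile from the article. The manifest format and the name `install` are
assumptions for illustration, and the plain copy-then-chmod here has the
same short window discussed elsewhere in this thread.

```python
import os
import shutil
import stat
import tempfile


def install(work_dir: str, live_dir: str, manifest: dict) -> None:
    """Copy each source file to its destination and apply the listed mode.

    `manifest` maps a path relative to work_dir to a
    (path relative to live_dir, mode) pair.
    """
    for src, (dest, mode) in manifest.items():
        target = os.path.join(live_dir, dest)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copyfile(os.path.join(work_dir, src), target)
        os.chmod(target, mode)   # the metadata lives in the install step


with tempfile.TemporaryDirectory() as work, tempfile.TemporaryDirectory() as live:
    with open(os.path.join(work, "index.html"), "w") as fh:
        fh.write("<p>hello</p>\n")
    with open(os.path.join(work, "db.conf"), "w") as fh:
        fh.write("password = example\n")
    install(work, live, {
        "index.html": ("htdocs/index.html", 0o644),
        "db.conf": ("private/db.conf", 0o600),
    })
    page_mode = stat.S_IMODE(os.stat(os.path.join(live, "htdocs", "index.html")).st_mode)
    conf_mode = stat.S_IMODE(os.stat(os.path.join(live, "private", "db.conf")).st_mode)
```

With this arrangement the repository holds only contents plus the manifest,
and the work tree's own permissions never matter.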
-
Hello,
Interesting. Could you show us what this Makefile actually looks like?
How would you create a repo to track /etc? I'm thinking of importing
this directory by using tar, do you think that's correct?

thanks.
--
Francis
-
>>>>> "Francis" == Francis Moreau <francis.moro@gmail.com> writes:
Francis> Interesting. Could you show us what this makefile actually looks ?
In fact, I wrote a magazine article about it. :)
http://www.stonehenge.com/merlyn/LinuxMag/col38.html
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
-
the problem is that at checkin you need to do the reverse process. the
other tools that you use on the system work on the live dir, not the 'work
dir', so it's only a 'work dir' in that git requires it as a staging step
between the repository and the place where it's going to be used.

David Lang
-
david> the problem is that at checkin you need to do the reverse process. the
david> other tools that you use on the system work on the live dir, not the
david> 'work dir', so it's only a 'work dir' in that git requires it as an
david> staging step between the repository and the place where it's going to
david> be used.

Eh? Are we still talking about a "website", or "/etc"? I'm talking about the
website case. I don't do *anything* to the live site. When I want to add a
file, I add it to my dev repo, possibly modifying my Makefile, and then spit
it out on my staging server. (You *do* have one of those, right?) Once I
know it's good, I push it to the live repo, and then "go live" with it. I
*never* work on the files that are the result of "make install".

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
-
even when working on a website it can be relevant.
yes, when you are developing html you want to do it on a test server,
move it to staging, and then move to production. but it's also not
uncommon to have web based tools that allow other people to make some
changes as well (for example, a bank's website is mostly maintained by
their web development company, but the bank administrators want the
ability to change rate information instantly). sometimes this is
implemented by writing the info to a database and then querying that
database for every hit, but a far more efficient way is to store that data
in a file on the webserver, which can include modifying pages directly.

but yes, I was mostly thinking of /etc instead of the webserver when I
wrote that.

David Lang
-
Hi,
Why don't you just give it a try? Hack on git, make it work for what you
want to do, clean it up, make a nice patch series, post it here.

Then we'll talk.
Ciao,
Dscho
-
That's what bad design is all about, after all.
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
-
I use that tool. If you just have one branch, it works. With the
commit-hook, which also updates the metadata, you have current
permission tracking.

There is a lack of a checkout-hook, which sets the permissions, so you
have to remember to do a metastore -a after you checked out a revision.

But if you have several branches which fork the master branch and try to
rebase the branches on master, you get trouble, because the metadata gets
corrupted somehow. I will think about a solution on this sometime.

Nicolas
-
Note that having metastore run by a hook makes it unsuitable for /etc
versioning, because you may have short periods of time during which
s3kr3t files are readable by more people than they should be.

The sole sane way to do that would be to track permissions, acls,
whatever _in_ git. Though, I'm still not convinced that it is such a
good idea at all. I mean for source code you absolutely _don't_ want git
to track permissions (apart from the +x bit). You don't want git to
try to chown your files to "madcoder:madcoder" because I was the last
one committing. So that would mean that you sometimes want to track
permissions, sometimes not. So you need a bunch of tools to list files
whose permissions have to be tracked, and whose permissions don't need
to be.

I fear that you'll end up with quite a big bloat of git, for a use
case that is fairly limited.

-- 
·O·  Pierre Habouzit
··O  madcoder@debian.org
OOO  http://www.madism.org
I think it doesn't get bloated until you try to support the model of
tracking different stuff for different files in the same repo. If
you just track one set of data across all files in the repo, I don't
think it'll cause too much bloat.

-- 
`. `'` http://people.debian.org/~madduck - http://debiansystem.info
`- Debian - when you have better things to do than fixing systems

gentoo: the performance placebo.
Yeah but if the stuff is opaque to git, you'll definitely end up with
security issues, which makes it also a no-go for /etc versioning.

Note that I don't specifically care about git being able to deal with
/etc, I was just pointing out some issues I can see with it, but I'm
neither in favor of it nor against it.

-- 
·O·  Pierre Habouzit
··O  madcoder@debian.org
OOO  http://www.madism.org
With "opaque to git" do you mean "implemented outside git"?
I'd say if done properly inside git, the security issues could be
prevented.

-- 
`. `'` http://people.debian.org/~madduck - http://debiansystem.info
`- Debian - when you have better things to do than fixing systems

"this sentence contradicts itself -- no actually it doesn't."
-- douglas hofstadter
On Thu, Sep 13, 2007 at 02:11:13PM +0200, Francis Moreau <francis.moro@gmai=
you should not use git directly, see this project:
- VMiklos
Hello
I just gave it a quick look and the point raised by Martin (permission
issue) doesn't seem to be addressed. That said I'm not sure it will be
a real issue for my use case: I'm only interested in tracking file
history. But I must be sure to deactivate any checkout operations.

Another point is that I don't really see why I shouldn't use Git directly
instead of another tool since I'm interested in tracking file history
only.

The funny thing is that this tool is based on git/cogito but the scm
used to manage it is darcs.

thanks
--
Francis
-
They switched to git after running their heads too many times
against darcs walls.

-- 
martin; (greetings from the heart of the sun.)
\____ echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck

if god had meant for us to be naked,
we would have been born that way.

spamtraps: madduck.bogus@madduck.net
Since they are not using git as a source code management system nor
darcs as a versioned file system, it is not particularly funny.

It's like being proud when the neighbor watchmaker borrows a hammer:
"ah, I always told him that a hammer is the ultimate device for
repairing a clock".

--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
-
