What the "file" command thinks is hardly relevant here. "file" just
attempts to guess what the contents of a file might be, by applying
a simple set of heuristics. Your results only highlight the actual
problem: "git" is apparently unable to handle character sets properly
and instead produces a mix of encodings as output.
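
To illustrate why such a guess says little about mixed output, here
is a rough Python sketch of a content-sniffing heuristic. It is not
how "file" is actually implemented; the function name guess_encoding
and the labels it returns are made up for this example.

```python
def guess_encoding(data: bytes) -> str:
    """Hypothetical heuristic: guess a charset label from raw bytes."""
    # Pure 7-bit data is valid in ASCII, ISO-8859-1 and UTF-8 alike.
    if all(b < 0x80 for b in data):
        return "us-ascii"
    # A strict UTF-8 decode succeeds only if every multi-byte sequence
    # is well-formed.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Could be ISO-8859-1, another 8-bit charset, or a mix.
        # The bytes alone do not say which.
        return "unknown-8bit"


latin1_part = "ä".encode("iso-8859-1")   # b'\xe4'
utf8_part = "ä".encode("utf-8")          # b'\xc3\xa4'

print(guess_encoding(latin1_part))
print(guess_encoding(utf8_part))
print(guess_encoding(utf8_part + latin1_part))
```

In this sketch the mixed input ends up in the same bucket as the pure
ISO-8859-1 input, so the guess cannot tell you which part of the
output was in which charset.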
For software with proper multilingual support, that should have been
enough to make sure that all its output would be in iso-8859-1, too.
Obviously "git" doesn't fall into that category.
The loss happened long before you ran that command, when the
data was committed into "git".
Exactly.
The problem is not programs thinking the universe is UTF-8 only; it's
people mixing different charsets, in conjunction with programs not
caring about charsets at all.
Specifically, your non-UTF-8 single-charset environment was not broken
by git thinking everything was UTF-8 but, on the contrary, by some data
in the git repository actually being UTF-8 and git *not* thinking that.
And that problem is, I repeat, much older than UTF-8.
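
A small Python sketch of that difference, assuming a tool that copies
stored bytes to the terminal unchanged versus one that knows the
stored charset and converts. The function names pass_through and
convert are invented for this illustration and are not taken from git.

```python
def pass_through(stored: bytes, terminal_charset: str) -> str:
    """What the user sees when stored bytes are written out unchanged
    and the terminal interprets them in its own charset."""
    return stored.decode(terminal_charset, errors="replace")


def convert(stored: bytes, stored_charset: str, terminal_charset: str) -> str:
    """What the user sees when the tool knows the stored charset and
    re-encodes the text for the terminal."""
    text = stored.decode(stored_charset)
    out = text.encode(terminal_charset, errors="replace")
    return out.decode(terminal_charset)


# Hypothetical data that was committed as UTF-8.
stored = "Müller".encode("utf-8")

print(pass_through(stored, "iso-8859-1"))      # MÃ¼ller
print(convert(stored, "utf-8", "iso-8859-1"))  # Müller
```

The first call shows the breakage in an ISO-8859-1 environment; the
second is only possible if the charset of the data was recorded, or
at least correctly assumed, when it was stored.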
HTH
Tilman
--
Tilman Schmidt E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)