Re: [2.6 patch] UTF-8 fixes in comments

From: Willy Tarreau
Date: Tuesday, April 29, 2008 - 3:33 am

On Tue, Apr 29, 2008 at 11:10:14AM +0100, Alan Cox wrote:

The console yes (by default until I disabled it to restore correct
behaviour). The shell no, it was the one present on my machine and
has never been compiled with UTF-8 support, and should not have to.

If we say that starting with 2.6.24, we're explicitly breaking
compatibility with old userland, fine. But that was not explicitly
stated.

In my opinion, the problem is that when I press "é", the system sends
two chars to bash, which itself sends two chars to the terminal,
which only displays one and moves the cursor one step ahead. Then,
pressing backspace once sends one backspace all along, resulting in
the terminal blanking one displayed char, but the shell not being
aware that only half of it was removed. But if you look at how
control chars are handled, if you display ^H then press backspace,
you remove all of it. It's the terminal which adjusts the position
depending on the character length.
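
To make the mismatch concrete, here is a minimal Python sketch. It is not code from the kernel or from bash, and the function names are made up for illustration. It models the shell's line buffer as raw bytes and compares a byte-naive erase with one that also skips UTF-8 continuation bytes (those with the bit pattern 10xxxxxx).

```python
def erase_naive(buf: bytes) -> bytes:
    """Drop exactly one byte, as a shell without UTF-8 support would."""
    return buf[:-1]


def erase_utf8(buf: bytes) -> bytes:
    """Drop one whole UTF-8 sequence: continuation bytes plus the lead byte."""
    end = len(buf)
    # Step back over trailing continuation bytes (10xxxxxx).
    while end > 0 and (buf[end - 1] & 0xC0) == 0x80:
        end -= 1
    # Then drop the lead byte (or a plain ASCII byte).
    return buf[:max(end - 1, 0)]


line = "\u00e9".encode("utf-8")  # "é" is two bytes: 0xC3 0xA9
half = erase_naive(line)         # the lead byte 0xC3 is left behind
whole = erase_utf8(line)         # nothing is left behind
```

With the naive erase, the terminal has already blanked the glyph while the buffer still holds half of it, which is the situation described above.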

So in my opinion, when we send one backspace to the terminal to
remove one character, since there are two in the buffer, we
should not get back one full char. Ideally, the console driver
should send as many backspaces as needed to fix the multiple
characters that were emitted. It's not logical at all that if
we send 3 chars to a process with one key, sending a cancellation
of those chars only sends one backspace.
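
The proposal above can be sketched as follows. This is only an illustration of the idea, not how any console driver actually behaves; `BACKSPACE`, `backspaces_for` and `apply_naive` are hypothetical names, and ^H (0x08) is assumed as the erase byte.

```python
BACKSPACE = b"\x08"  # ^H; assumed erase byte for this illustration


def backspaces_for(key: str) -> bytes:
    """One erase byte for every byte the key originally emitted."""
    return BACKSPACE * len(key.encode("utf-8"))


def apply_naive(buf: bytes, data: bytes) -> bytes:
    """Byte-naive line editor: ^H drops one byte, anything else is appended."""
    for b in data:
        if b == 0x08:
            buf = buf[:-1]
        else:
            buf += bytes([b])
    return buf


typed = apply_naive(b"", "\u00e9".encode("utf-8"))       # buffer holds 2 bytes
after_one = apply_naive(typed, BACKSPACE)                # 1 byte still stuck
after_all = apply_naive(typed, backspaces_for("\u00e9")) # buffer is empty
```

A single backspace leaves the byte-naive buffer out of sync, while one backspace per emitted byte clears it.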

You see, that's really what I hate about this encoding. Every
stage relies on the next one to do the fixup. And of course, a
lot of combinations fail.


But at least, there is no feeling of having it working. You immediately
see if your tools are compliant or not.


I cannot imagine how one can believe that something which transcodes one
char as a series of 1-to-4 chars will be a painless move. A lot of code
is totally broken and was not before the move.
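
As a quick illustration of the 1-to-4 expansion, the short Python sketch below encodes four sample characters (picked arbitrarily for this example) and records how many bytes each one takes in UTF-8.

```python
# Sample characters: "A", "é", "€", and an emoji, written as escapes.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Number of bytes each character occupies once encoded as UTF-8.
lengths = [len(ch.encode("utf-8")) for ch in samples]

# Code that counts bytes instead of characters over-counts this string.
text = "".join(samples)
char_count = len(text)
byte_count = len(text.encode("utf-8"))
```

Any code that assumes one byte per character sees ten "chars" here where the user typed four.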


That's useful information, thanks. I was not aware of this.


Willy

--
