I am experiencing a rather intermittent and hard-to-reproduce issue in the tty
layer, and I am posting here to get ideas on how to debug it from those of you
who have delved into the tty internals. I suspect some kind of race is going on
or the echo is caught in the tty buffer when it gets flushed (and never makes it
to the tty). Heavy load (compiling, etc.) seems to make it more likely.
When a signal character (e.g. ctrl-c) is received, the tty ldisc and driver are
flushed, the character is echoed (e.g. "^C" if echoctl is on), and the signal is
issued. Because the flush happens first, the echo should always appear on the
tty, but sometimes it does not. What I am wondering is how the echo could get
"swallowed". The code I have been using to test this is:
#include <stdio.h>

int main(void)
{
    /* Emit a steady stream of output so the tty write path stays busy. */
    while (1) {
        printf("a");
        fflush(stdout);
    }
}
During the run, I simply hit ctrl-c to break out, and I've also tried
ctrl-s/ctrl-q (alternating), then ctrl-c when stopped. Normally, the echo is
displayed, but sometimes not.
Two things come to mind... First, there is this caveat in the code (n_tty.c,
write_chan()):
* Write function of the terminal device. This is serialized with
* respect to other write callers but not to termios changes, reads
* and other such events. We must be careful with N_TTY as the receive
* code will echo characters, thus calling driver write methods.
This talks specifically about echo during receive, so I am wondering if there is
some timing-sensitive thing going on.
Second, perhaps the echo is caught in the driver pipe when the flush is done,
and that's why the echo is getting eaten by the flush.
Would it make sense to flush_to_ldisc from within the ldisc before the buffer is
flushed to make sure the pipe is empty?
It can take quite a few tries before I see the issue, but I have seen it in
2.6.26. If anyone has some ideas, let me know.
Thanks, Joe
--
On Mon, 04 Aug 2008 16:03:02 -0600
If the output buffer is full then echoed characters/^C etc will vanish the way
n_tty implements its buffering internally. It's always worked that way.
Alan
--
But since the flush is done just prior, shouldn't the buffer be empty just
before the ^C is written? Or are you saying that the buffer could refill in the
meantime (between the flush and the ^C) if the chars are coming in too fast?
What about the order of the flushes? Currently, it is:
    n_tty_flush_buffer(tty);       (ldisc buf)
    tty_driver_flush_buffer(tty);  (driver buf)
Would it be better to reverse this order, flushing the driver buffer first so
characters do not then refill the ldisc buffer before the driver buffer can be
flushed?
-Thanks, Joe
--
One more observation: In Linux, try this:
    # cat > foo
    hi^Sthere^Q
    ^D
(in other words, during the cat into "foo", type "hi", hit ^S, then type
"there", then hit ^Q, then, on the next line, ^D to end the file)
Note that the "there" does not appear after hitting ^Q, but it does appear in
the file. So the characters were accepted, but they were not echoed (not even
saved for echo when the terminal is restarted). This behavior differs from that
of FreeBSD (just tried it for fun - haven't tried other Unixes yet).
I have noticed other times that the echo seems to get lost while the tty is
stopped. Not sure if all this is related, but something seems amiss. Thoughts?
Thanks, Joe
--
On Wed, 06 Aug 2008 14:17:29 -0600
It should certainly occur if the output buffer is full but that shouldn't be the
case for a few bytes. Agreed the current behaviour is unexpected and less than
desirable so hack away.
--
Will do. If you know, off-hand, which part of the code suppresses/holds output
while ^S is in effect, that would be of great help to me. I'm combing the tty
kernel code, but nothing sticks out to me yet.
Thanks, Joe
--
I see the problem clearly now. The driver does indeed reject writes when the tty
is stopped or full, and the ldisc throws them away in that case (only for echoes
or other ldisc-generated output, of course). This problem goes deeper in that
the column logic (for eraser, tabs, etc.) relies on the characters making it to
the tty, and there are many places where this is never checked/guaranteed.
I am working up an echo buffer (fifo) that would hold these characters until
they can be sent (if the write is not possible), but since that means they will
arrive at the tty later, their interleaving within the application output stream
is not guaranteed, which again is a problem for the column logic.
I'm still looking into it - not trivial for sure...
-Joe
--
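To make the echo-fifo idea concrete, here is a minimal user-space sketch. It is
not kernel code and not the eventual patch: struct toy_tty, struct echo_fifo,
toy_put_char(), echo_char(), echo_flush() and both buffer sizes are hypothetical
names chosen for illustration. The toy "driver" rejects writes while stopped or
full, and echoes that cannot be written are held in a ring buffer until a later
flush succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ECHO_BUF_SIZE 64    /* hypothetical size, not taken from any patch */
#define TTY_OUT_SIZE  256   /* capacity of the toy "terminal" */

/* Toy stand-in for a tty driver: rejects writes while stopped or full. */
struct toy_tty {
    bool stopped;               /* true after ^S, false after ^Q */
    char out[TTY_OUT_SIZE];     /* what actually reached the "terminal" */
    size_t out_len;
};

/* Ring buffer of echo characters that could not be written yet. */
struct echo_fifo {
    char buf[ECHO_BUF_SIZE];
    size_t head;                /* index of the oldest held char */
    size_t len;                 /* number of held chars */
};

/* Returns 1 if the char was accepted, 0 if the toy driver rejected it. */
static int toy_put_char(struct toy_tty *tty, char c)
{
    if (tty->stopped || tty->out_len >= TTY_OUT_SIZE - 1)
        return 0;
    tty->out[tty->out_len++] = c;
    tty->out[tty->out_len] = '\0';
    return 1;
}

/* Push held echoes to the driver, oldest first, stopping at a rejection. */
static void echo_flush(struct echo_fifo *f, struct toy_tty *tty)
{
    assert(f->len <= ECHO_BUF_SIZE);
    while (f->len > 0 && toy_put_char(tty, f->buf[f->head])) {
        f->head = (f->head + 1) % ECHO_BUF_SIZE;
        f->len--;
    }
}

/* Queue one echo char, then try to drain. False only if the fifo is full. */
static bool echo_char(struct echo_fifo *f, struct toy_tty *tty, char c)
{
    if (f->len == ECHO_BUF_SIZE)
        return false;           /* still lossy once the fifo itself fills */
    f->buf[(f->head + f->len) % ECHO_BUF_SIZE] = c;
    f->len++;
    echo_flush(f, tty);
    return true;
}
```

With this model, echoing "there" while stopped leaves the five characters in the
fifo, and calling echo_flush() after clearing the stopped flag delivers them in
order. It does not address the interleaving and column concerns raised above.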
First thought would be to simply fire all output through your buffer in n_tty.
It's not a performance critical bit of code and it would still be fast as the
usual case would be
    if (queue->length == 0)
        try_write_directly(...)
--
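A rough user-space sketch of that suggestion follows, under stated assumptions:
struct out_queue, struct toy_driver, queue_push() and queued_write() are
hypothetical, and try_write_directly() here is a toy with an invented
signature, not a kernel function. The point it illustrates is ordering: the
direct write is only attempted when nothing is queued, so later bytes never
overtake earlier ones.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_SIZE 128          /* hypothetical queue capacity */

struct out_queue {
    char buf[QUEUE_SIZE];
    size_t length;              /* bytes waiting, oldest at buf[0] */
};

/* Toy driver that accepts at most `room` more bytes right now. */
struct toy_driver {
    char out[256];
    size_t out_len;
    size_t room;
};

/* Write as much of s[0..n) as the toy driver accepts; returns the count. */
static size_t try_write_directly(struct toy_driver *d, const char *s, size_t n)
{
    size_t space = sizeof(d->out) - 1 - d->out_len;

    if (n > d->room)
        n = d->room;
    if (n > space)
        n = space;
    memcpy(d->out + d->out_len, s, n);
    d->out_len += n;
    d->out[d->out_len] = '\0';
    d->room -= n;
    return n;
}

/* Drain queued bytes first-in, first-out, as far as the driver allows. */
static void queue_push(struct out_queue *q, struct toy_driver *d)
{
    size_t n = try_write_directly(d, q->buf, q->length);

    memmove(q->buf, q->buf + n, q->length - n);
    q->length -= n;
}

/* Returns the number of bytes accepted (written directly or queued). */
static size_t queued_write(struct out_queue *q, struct toy_driver *d,
                           const char *s, size_t n)
{
    size_t done = 0;
    size_t keep;

    assert(q->length <= QUEUE_SIZE);
    if (q->length == 0)         /* usual case: nothing pending, go direct */
        done = try_write_directly(d, s, n);
    keep = n - done;
    if (keep > QUEUE_SIZE - q->length)
        keep = QUEUE_SIZE - q->length;
    memcpy(q->buf + q->length, s + done, keep);
    q->length += keep;
    return done + keep;
}
```

Something would still need to call queue_push() when the driver regains room,
and a full queue still drops or refuses data, so this only moves the limit.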
Only problem with that is that with a steady stream of output, you would likely
end up quickly filling the new buffer and be right back to the same issue of
losing echoes and other ldisc chars off the start of the buf. Also, the regular
output channel already has the mechanism to throttle itself and try again later
(causing the upstream process to wait as expected - example: ping foo.com; hit
^S; it stops pinging until ^Q...).
What I may try is moving the code that manipulates the column marker (mostly
"eraser") to happen at real output (i.e. when we know space is available), and
have all ldisc echoes/chars go into the buffer initially. That way, the regular
output stream and ldisc stream (out of the buffer) will be able to control the
column stuff properly and in sync. By knowing how many chars need to go out
(eraser is the only non-trivial case, and we can have it tell us the char string
before sending it), we can be sure there is space before trying and subsequently
changing the column.
BTW, I'll also need to call the new buffer push function periodically after the
output happens in case there are characters left to process. I suspect the right
place is in the write wakeup func or the poll func. If you know off hand, let me
know.
Thanks, Joe
--
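The "know the length, check for space, then write and move the column" idea can
be sketched in user space as below. Everything here is an assumption made for
illustration: struct toy_out, echo_len() and echo_emit() are hypothetical, and
the model is simplified to plain characters and "^X" pairs (no tab, newline or
eraser handling). A control character is emitted only if both bytes fit, and the
column changes only when bytes really go out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy output side: tracks free space and the cursor column. */
struct toy_out {
    char out[256];
    size_t out_len;
    size_t room;        /* bytes the "driver" will accept right now */
    unsigned column;    /* cursor column, updated only on real output */
};

/* Bytes needed to echo c: control chars echo as '^' plus a letter. */
static size_t echo_len(unsigned char c)
{
    return (c < 0x20 || c == 0x7f) ? 2 : 1;
}

/* Emit the echo for c only if the whole sequence fits; true on success. */
static bool echo_emit(struct toy_out *o, unsigned char c)
{
    size_t need = echo_len(c);

    assert(o != NULL);
    if (need > o->room || o->out_len + need >= sizeof(o->out))
        return false;           /* nothing written, column untouched */

    if (need == 2) {
        o->out[o->out_len++] = '^';
        o->out[o->out_len++] = (char)(c ^ 0x40);   /* 0x03 -> 'C' */
    } else {
        o->out[o->out_len++] = (char)c;
    }
    o->out[o->out_len] = '\0';
    o->room -= need;
    o->column += (unsigned)need;
    return true;
}
```

A caller that gets false back would leave the character in the echo buffer and
retry later, for example from whatever hook runs when the driver has room again.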
Attached is a patch that addresses the issue of lost character echoes during
program output and in a stopped tty.
The patch adds an "echo buffer" to N_TTY that handles all ldisc-generated output
(including echoed characters). The main thing this solves is the loss of
characters when they cannot be immediately written to the tty driver.
Highlights are:
* ^C (and other chars) are no longer lost when write buffer is full
  - (often happens with continuous program output)
* Character echoes are not lost when tty is in stopped state (e.g. ^S)
  - (e.g.: ^Q will cause held characters to be output)
* When echoing control char pairs (e.g. "^C"), ensure pair stays together
The echo buffer stores characters as well as operations that need to be done in
sync with character output (like management of the column position). This
allows it to play well with the interleaved program output.
I am currently testing the patch on a couple of systems, but it looks good so
far. Note that this patch currently applies cleanly to 2.6.26 and with offsets
to 2.6.27 (but I have not tested yet on the latter - should work, though).
-Joe
