Re: [PATCH] TTY: Fix loss of echoed characters (2nd follow-on PATCH attached)

Previous thread: latest -git: BUG: unable to handle kernel paging request (numaq_tsc_disable) by Vegard Nossum on Wednesday, August 20, 2008 - 8:28 am. (5 messages)

Next thread: Re: 2.6.26 Breaks USB Fax Modem by Frank Peters on Wednesday, August 20, 2008 - 9:06 am. (2 messages)
From: Joe Peterson
Date: Wednesday, August 20, 2008 - 8:36 am

Andrew, attached is a patch that fixes the loss of echoed characters in
two common cases: 1) tty output buffer is full due to lots of output and
2) tty is stopped.  It was discussed on the list here (and I had
supplied an earlier rev of this patch at the end of the thread last week
- no further comments were posted):

	http://marc.info/?l=linux-kernel&m=121788756631983&w=2

I have been testing on several machines for the past week or so, and
things look good.

					Thanks, Joe
From: Joe Peterson
Date: Tuesday, August 26, 2008 - 5:41 am

Attached is a follow-on patch.  It just does some code style cleanup and
minor code efficiency tweaks.  Please apply it after the main patch
(probably should be combined into one patch).

						Thanks, Joe
From: Joe Peterson
Date: Monday, September 8, 2008 - 9:11 am

Here is another follow-on patch.  It gets applied after and probably
should be consolidated (for final submission) with the following two:

	tty-fix-loss-of-echoed-characters.patch
	tty-minor-code-efficiency-and-style-cleanup.patch

This one fixes handling of some tab erasure cases and also improves
locking for the echo buffer.

						Thanks, Joe
From: Andrew Morton
Date: Monday, September 8, 2008 - 5:32 pm

On Mon, 08 Sep 2008 10:11:46 -0600

Taking a spinlock outside lock_kernel() isn't good, and is quite unusual.

- It might be ab/ba deadlockable (I didn't check) (I trust you always
  test with lockdep enabled?)

- will make Ingo unhappy when he applies his "make lock_kernel use
  mutex_lock" patch (if it's still around).

- will probably give the -rt guys conniptions.


swapping the above two lines would presumably be an easy fix, but one
wonders whether we still need lock_kernel() in there once you've added
this lock.

--

From: Alan Cox
Date: Tuesday, September 9, 2008 - 3:55 am

On Mon, 8 Sep 2008 17:32:50 -0700

lock_kernel can sleep. Its not allowed...

Alan
--

From: Andrew Morton
Date: Tuesday, September 9, 2008 - 10:43 am

That used to be the case, but the mutex_lock() version of the bkl got

yup.
--

From: Joe Peterson
Date: Tuesday, September 9, 2008 - 1:42 pm

OK, attached is a new version of this patch:

	tty-fix-echo-tab-erase-and-locking.patch

I have removed lock_kernel() from n_tty.c completely (which is a nice
benefit).  As we discussed, this is now easier due to my echo patch
isolating the column state code.  In its place, I created two new mutex
locks: "output_lock" (protects column state and and driver buffer space
left) and "echo_lock" (protects the echo buffer).  I could have used
just one to protect all of this, but there was no need to make the lock
that wide-ranging (e.g., it's OK to add stuff to the echo buffer while
regular output is happening).

All echo buffer operations take the echo lock, and all process_out*
functions take output lock.  The one function that takes both locks is
process_echoes (this is the one that erroneously took both the echo spin
lock and the BKL before).  Note that the process_output functions are at
least sometimes (and maybe all of the time) locked by tty_write_lock (as
Alan mentioned), but this does not protect echo from interfering with
regular output.

Let me know what you think of this patch.  I am using it now, and it
appears to work well so far.

						Thanks again, Joe


From: Andrew Morton
Date: Wednesday, September 10, 2008 - 4:39 pm

On Tue, 09 Sep 2008 14:42:12 -0600

Boy.  Has this been carefully tested with lockdep enabled?
--

From: Joe Peterson
Date: Thursday, September 11, 2008 - 5:53 am

Yes, I have the following in my kernel config:

CONFIG_KALLSYMS_ALL=y
CONFIG_RT_MUTEX_TESTER=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
CONFIG_LOCK_STAT=y
CONFIG_DEBUG_LOCKDEP=y
CONFIG_TRACE_IRQFLAGS=y
CONFIG_STACKTRACE=y

I have run the various tests that I used to find/investigate the echo
loss problem to begin with (i.e. that make output compete with echo).
Also some manual tests that fill the echo buffer during a stopped tty,
test the eraser interaction, etc.  Things look fine so far.

						-Joe
--

From: Joe Peterson
Date: Tuesday, September 9, 2008 - 6:00 am

Indeed - and, as Alan said, lock_kernel() can sleep (a nuance I had not
realized until looking more into the kernel locking mechanisms just
now).  Although I have seen no issues during testing (and I do have
lockdep in the kernel), you are 100% right.

I had wanted to keep from disturbing the locking situation in n_tty, but
maybe it is time to get rid of the BKL there.

My echo buffer patches actually isolate the tty column state stuff to
the output processing functions now anyway, so the BLK may not really be

I don't think this is a good idea either, since I don't want to spinlock
during the output processing, which calls the driver output func.  I
think a mutex is more appropriate anyway (and there are some already
defined and in use for tty write locking, etc.) - let me know if you
think otherwise.  I will play around with this and re-post a patch for
review soon.

						-Thanks, Joe
--

From: Alan Cox
Date: Tuesday, September 9, 2008 - 6:12 am

The driver output side can sleep, and it has to be able to sleep because
the drivers like USB need to be able to sleep.

If you have the column handling isolated and locked that is a big step
towards exterminating the BKL in the n_tty code. It also illustrates why
locking people always say "lock data not code".

Alan
--

From: Joe Peterson
Date: Tuesday, September 9, 2008 - 6:15 am

Well, it's isolated, but still locked with the BKL, which would be great
to get rid of.  A few questions for you, since you've worked with this
code (and kernel locking stuff) a lot longer than I:

1) Now that column state is confined to the process_out/echo funcs in
n_tty, would using tty_write_lock() (the defined atomic write lock
mutex) be a good replacement for lock_kernel(), even though interruptible?

2) To protect echo buffer operations, I would lean toward using a
separate echo lock mutex so it does not lock against non-echo-buffer
output.  Would nesting this with #1 be advisable?  Should it be
interruptable?

3) tty_write() mentions refers to ldisc use of the BKL.  If we change
this, are there any considerations for the tty_io or driver code?

					Thanks, Joe
--

From: Alan Cox
Date: Tuesday, September 9, 2008 - 6:19 am

That lock orders writes to the ldisc so you would often be called with it

I would start with a lock for each object you need to get the locking
right for. If you ever take more than one at once it must be in the same
order (a lock heirarchy) to avoid deadlocks.


Shouldn't be. n_tty is the last real user of the BKL in the ldisc layer.
I've been over the driver write methods already and they *should* now be
sorted and safe.

Alan
--

Previous thread: latest -git: BUG: unable to handle kernel paging request (numaq_tsc_disable) by Vegard Nossum on Wednesday, August 20, 2008 - 8:28 am. (5 messages)

Next thread: Re: 2.6.26 Breaks USB Fax Modem by Frank Peters on Wednesday, August 20, 2008 - 9:06 am. (2 messages)