Re: [PATCH 3/3] Add timeout feature

Previous thread: [PATCH 2/3] Remove XFS specific ioctl interfaces for freeze feature by Takashi Sato on Monday, September 8, 2008 - 4:53 am. (1 message)

Next thread: [git pull] Please pull powerpc.git merge branch by Paul Mackerras on Monday, September 8, 2008 - 4:56 am. (1 message)
From: Takashi Sato
Date: Monday, September 8, 2008 - 4:53 am

The timeout feature is added to "freeze ioctl" to solve a deadlock
when the freezer accesses a frozen filesystem. And new ioctl
to reset the timeout period is added to extend the timeout period.
For example, the freezer resets the timeout period to 10 seconds every 5
seconds.  In this approach, even if the freezer causes a deadlock by
accessing the frozen filesystem, it will be solved by the timeout
in 10 seconds and the freezer will be able to recognize that
at the next reset of timeout period.
o Freeze the filesystem
  int ioctl(int fd, int FIFREEZE, long *timeout_sec)
    fd: The file descriptor of the mountpoint
    FIFREEZE: request code for the freeze
    timeout_sec: the timeout period in seconds
             If it's 0 or 1, the timeout isn't set.
             This special case of "1" is implemented to keep
             the compatibility with XFS applications.
    Return value: 0 if the operation succeeds. Otherwise, -1

o Reset the timeout period
  int ioctl(int fd, int FIFREEZE_RESET_TIMEOUT, long *timeout_sec)
    fd:file descriptor of mountpoint
    FIFREEZE_RESET_TIMEOUT: request code for reset of timeout period
    timeout_sec: new timeout period in seconds
    Return value: 0 if the operation succeeds. Otherwise, -1
    Error number: If the filesystem has already been unfrozen,
                  errno is set to EINVAL.

Signed-off-by: Takashi Sato <t-sato@yk.jp.nec.com>
Signed-off-by: Masayuki Hamaguchi <m-hamaguchi@ys.jp.nec.com>
---
 drivers/md/dm.c             |    2 -
 fs/block_dev.c              |    2 +
 fs/buffer.c                 |   44 ++++++++++++++++++++++++---
 fs/ioctl.c                  |   71 ++++++++++++++++++++++++++++++++++++++++++--
 fs/super.c                  |   37 ++++++++++++++++++++++
 fs/xfs/xfs_fsops.c          |    2 -
 include/linux/buffer_head.h |    4 +-
 include/linux/fs.h          |    8 ++++
 8 files changed, 159 insertions(+), 11 deletions(-)

diff -uprN -X linux-2.6.27-rc5.org/Documentation/dontdiff ...
From: Christoph Hellwig
Date: Monday, September 8, 2008 - 10:11 am

And as with all previous posting I still fundamentally disagree about
the need of this functionality.  We don't need a timeout for freezing.

--

From: Ric Wheeler
Date: Thursday, September 25, 2008 - 2:06 pm

I agree with Christoph here, I think that the timeout is unneeded.

Regards,

Ric

--

From: Takashi Sato
Date: Friday, September 26, 2008 - 1:52 am

Hi,


I think that your concern is that the freezer cannot recognize the occurrence
of a timeout and it continues the backup process and the backup data is
corrupted finally.
If the freezer can recognize it by the unfreeze ioctl's errono, will your concern
 be solved?
If so, I will implement it.

Cheers, Takashi

--

From: Ric Wheeler
Date: Friday, September 26, 2008 - 3:58 am

I think that is certainly part a big part of my concern.

Also note that the timeout seems to be quite low relative to say the 
standard timeout for a SCSI device (30 seconds worst case).

In general, I am quite supportive of the patch series and think that 
this is a great addition.

Thanks!

Ric


--

From: Takashi Sato
Date: Monday, September 29, 2008 - 4:11 am

Hi Ric and Christoph,


Thank you for your comments.
Christoph, do you have any comments about this solution?

If it's OK, I will change the freeze patch so that the unfreeze ioctl sets
ETIMEDOUT to errno when the timeout occurs.

Cheers, Takashi
--

From: Valdis.Kletnieks
Date: Friday, September 26, 2008 - 5:35 am

On Fri, 26 Sep 2008 17:52:35 +0900, Takashi Sato said:


That would also address my concerns about merging patches 8 and 10 of the
other patch series (because patch 10 wouldn't be needed then)...

From: Christoph Hellwig
Date: Monday, September 29, 2008 - 7:13 am

What timeout should happen?  the freeze ioctl must not return until the
filesystem is a clean state and all writes are blocked.

--

From: Eric Sandeen
Date: Monday, September 29, 2008 - 7:36 am

The suggestion was that *UN*freeze would return ETIMEDOUT if the
filesystem had already unfrozen itself, I think.  That way you know that
the snapshot you just took is worthless, at least.

I'm still not really sold on the timeout, but I think the above was the
intent.

-Eric
--

From: Christoph Hellwig
Date: Monday, September 29, 2008 - 7:37 am

But why would the filesystem every unfreeze itself?  That defeats the
whole point of freezing it.

--

From: Eric Sandeen
Date: Monday, September 29, 2008 - 7:45 am

I agree.  Was just trying to clarify the above point.

But there have been what, 12 submissions now, with the unfreeze timeout
in place so it's a persistent theme ;)

Perhaps a demonstration of just how easy (or not easy) it is to deadlock
a filesystem by freezing the root might be in order, at least.

And even if it is relatively easy, I still maintain that it is the
administrator's role to not inflict damage on the machine being
administered.  There are a lot of potentially dangerous tools at root's
disposal; why this particular one needs a nanny I'm still not quite sure.

-Eric
--

From: jim owens
Date: Monday, September 29, 2008 - 3:08 pm

Eric Sandeen wrote:
 > Christoph Hellwig wrote:
 >> But why would the filesystem every unfreeze itself?  That defeats the
 >> whole point of freezing it.
 >
 > I agree.  Was just trying to clarify the above point.
 >
 > But there have been what, 12 submissions now, with the unfreeze timeout
 > in place so it's a persistent theme ;)
 >
 > Perhaps a demonstration of just how easy (or not easy) it is to deadlock
 > a filesystem by freezing the root might be in order, at least.
 >
 > And even if it is relatively easy, I still maintain that it is the
 > administrator's role to not inflict damage on the machine being
 > administered.  There are a lot of potentially dangerous tools at root's
 > disposal; why this particular one needs a nanny I'm still not quite sure.

Since this patch hit fsdev, there have been an equal number
of supporters and opponents of the timeout.

I'm not opposed to the timeout on the API, but I don't think
it is needed if we have a system configurable timeout (default
is no timeout) that can be changed by an admin.

My experience is that a timeout is not needed protect against
a stupid admin or against software bugs.

The justification for a timeout as far as I am concerned
is so the admin can log in and reset hung hardware.  If we
think there is no chance of forcing the external device that
went to sleep to respond so the system can continue to be used,
then I don't think a timeout has any valid use.

My timeout desire is based on some past SAN behavior and
I'm OK if people argue those devices should just be fixed.
But we argued the same thing and were ignored because bad
device behavior did not stop people from buying them.

jim
--

From: Pavel Machek
Date: Sunday, October 5, 2008 - 3:00 am

Can you docuument what administrator must not do for freezing to be
safe?

What if so much dirty data accumulates on frozen filesystem that
there's not enough memory for the unfreeze tool?

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--

From: Takashi Sato
Date: Thursday, October 9, 2008 - 3:12 am

Hi,


I think we need the timeout for the case someone dirties so much data
with mmap, hence the freeze process is swapped out and cannot unfreeze.

Cheers, Takashi
--

From: Christoph Hellwig
Date: Thursday, October 9, 2008 - 3:18 am

That is not supposed to happen.  That's why write blocks early on a
frozen filesystem (the shared mmap write path is currently missing the
check, but that's a rather small patch)

--

Previous thread: [PATCH 2/3] Remove XFS specific ioctl interfaces for freeze feature by Takashi Sato on Monday, September 8, 2008 - 4:53 am. (1 message)

Next thread: [git pull] Please pull powerpc.git merge branch by Paul Mackerras on Monday, September 8, 2008 - 4:56 am. (1 message)