Linux-Fsdevel Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Andrea Arcangeli <aarcange@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>
Cc: syzbot <syzbot+96cc7aba7e969b1d305c@syzkaller.appspotmail.com>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: INFO: task hung in pipe_read (2)
Date: Sat, 8 Aug 2020 10:01:21 +0900	[thread overview]
Message-ID: <e673cccb-1b67-802a-84e3-6aeea4513a09@i-love.sakura.ne.jp> (raw)
In-Reply-To: <20200807053148.GA10409@redhat.com>

On 2020/08/07 14:31, Andrea Arcangeli wrote:
>> Andrea? Comments? As mentioned, this is probably much too aggressive,
>> but I do think we need to limit the time that the kernel will wait for
>> page faults.
> 
> Why is pipe preventing to SIGKILL the task that is blocked on the
> mutex_lock? Is there any good reason for it or it simply has margin
> for improvement regardless of the hangcheck report? It'd be great if
> we can look into that before looking into the uffd specific bits.

It would be possible to use _killable version for this specific function, but

> 
> The hangcheck timer would have zero issues with tasks that can be
> killed, if only the pipe code could be improved to use mutex_lock_killable.
> 
> 		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */
> 		if (t->state == TASK_UNINTERRUPTIBLE)
> 			check_hung_task(t, timeout);
> 
> The hangcheck report is just telling us one task was in D state a
> little too long, but it wasn't fatal error and the kernel wasn't
> actually destabilized and the only malfunction reported is that a task
> was unkillable for too long.

use of killable waits disables ability to detect possibility of deadlock (because
lockdep can't check possibility of deadlock which involves actions in userspace), for
syzkaller process is SIGKILLed after 5 seconds while khungtaskd's timeout is 140 seconds.

If we encounter a deadlock in an unattended operation (e.g. some server process),
we don't have a method for resolving the deadlock. Therefore, I consider that
t->state == TASK_UNINTERRUPTIBLE check is a bad choice. Unless a sleep is neutral
(e.g. no lock is held, or obviously safe to sleep with that specific lock held),
sleeping for 140 seconds inside the kernel is a bad sign even if interruptible/killable.

> 
> Now if it's impossible to improve the pipe code so it works better not
> just for uffd, there's still no reason to worry: we could disable uffd
> in the pipe context. For example ptrace opts-out of uffds, so that gdb
> doesn't get stuck if you read a pointer that should be handled by the
> process that is under debug. I hope it won't be necessary but it
> wouldn't be a major issue, certainly it wouldn't risk breaking qemu
> (and non-cooperative APIs are privileged so it could still skip the
> timeout).

Can we do something like this?

  bool retried = false;
retry:
  lock();
  disable_fault();
  ret = access_memory_that_might_fault();
  enable_fault();
  if (ret == -EWOULDFAULT && !retried)
    goto retry_without_lock;
  if (ret == 0)
    ret = do_something();
  unlock();
  return ret;
retry_without_lock:
  unlock();
  ret = access_memory_that_might_fault();
  retried = true;
  goto retry;


  reply	other threads:[~2020-08-08  1:02 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-01  8:45 syzbot
2020-08-01 15:29 ` Tetsuo Handa
2020-08-01 17:39   ` Linus Torvalds
2020-08-07  5:31     ` Andrea Arcangeli
2020-08-08  1:01       ` Tetsuo Handa [this message]
2020-08-10 19:29         ` Andrea Arcangeli
2020-08-13  7:00           ` Tetsuo Handa
2020-08-13 11:20             ` Tetsuo Handa
2020-08-22  4:34               ` [RFC PATCH] pipe: make pipe_release() deferrable Tetsuo Handa
2020-08-22 16:30                 ` Linus Torvalds
2020-08-23  0:21                   ` Tetsuo Handa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e673cccb-1b67-802a-84e3-6aeea4513a09@i-love.sakura.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=dvyukov@google.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=syzbot+96cc7aba7e969b1d305c@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    --subject='Re: INFO: task hung in pipe_read (2)' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).