Linux-Fsdevel Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Miklos Szeredi <miklos@szeredi.hu>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: syzbot <syzbot+d6ec23007e951dadf3de@syzkaller.appspotmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Miklos Szeredi <mszeredi@redhat.com>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: kernel BUG at mm/hugetlb.c:LINE!
Date: Tue, 12 May 2020 17:04:51 +0200	[thread overview]
Message-ID: <CAJfpegtVca6H1JPW00OF-7sCwpomMCo=A2qr5K=9uGKEGjEp3w@mail.gmail.com> (raw)
In-Reply-To: <aa7812b8-60ae-8578-40db-e71ad766b4d3@oracle.com>

On Tue, Apr 7, 2020 at 12:06 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 4/5/20 8:06 PM, syzbot wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    1a323ea5 x86: get rid of 'errret' argument to __get_user_x..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=132e940be00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=8c1e98458335a7d1
> > dashboard link: https://syzkaller.appspot.com/bug?extid=d6ec23007e951dadf3de
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12921933e00000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=172e940be00000
> >
> > The bug was bisected to:
> >
> > commit e950564b97fd0f541b02eb207685d0746f5ecf29
> > Author: Miklos Szeredi <mszeredi@redhat.com>
> > Date:   Tue Jul 24 13:01:55 2018 +0000
> >
> >     vfs: don't evict uninitialized inode
> >
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=115cad33e00000
> > final crash:    https://syzkaller.appspot.com/x/report.txt?x=135cad33e00000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=155cad33e00000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+d6ec23007e951dadf3de@syzkaller.appspotmail.com
> > Fixes: e950564b97fd ("vfs: don't evict uninitialized inode")
> >
> > overlayfs: upper fs does not support xattr, falling back to index=off and metacopy=off.
> > ------------[ cut here ]------------
> > kernel BUG at mm/hugetlb.c:3416!
> > invalid opcode: 0000 [#1] PREEMPT SMP KASAN
> > CPU: 0 PID: 7036 Comm: syz-executor110 Not tainted 5.6.0-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > RIP: 0010:__unmap_hugepage_range+0xa26/0xbc0 mm/hugetlb.c:3416
> > Code: 00 48 c7 c7 60 37 35 88 e8 57 b4 a2 ff e9 b3 fd ff ff e8 cd 90 c6 ff 0f 0b e9 c4 f7 ff ff e8 c1 90 c6 ff 0f 0b e8 ba 90 c6 ff <0f> 0b e8 b3 90 c6 ff 83 8c 24 c0 00 00 00 01 48 8d bc 24 a0 00 00
> > RSP: 0018:ffffc900017779b0 EFLAGS: 00010293
> > RAX: ffff88808cf5c2c0 RBX: ffffffff8c641c08 RCX: ffffffff81ac50b4
> > RDX: 0000000000000000 RSI: ffffffff81ac58a6 RDI: 0000000000000007
> > RBP: 0000000020000000 R08: ffff88808cf5c2c0 R09: ffffed10129d8111
> > R10: ffffed10129d8110 R11: ffff888094ec0887 R12: 0000000000003000
> > R13: 0000000000000000 R14: 0000000020003000 R15: 0000000000200000
> > FS:  00000000013c0880(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000020000140 CR3: 0000000093554000 CR4: 00000000001406f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> >  __unmap_hugepage_range_final+0x30/0x70 mm/hugetlb.c:3507
> >  unmap_single_vma+0x238/0x300 mm/memory.c:1296
> >  unmap_vmas+0x16f/0x2f0 mm/memory.c:1332
> >  exit_mmap+0x2aa/0x510 mm/mmap.c:3126
> >  __mmput kernel/fork.c:1082 [inline]
> >  mmput+0x168/0x4b0 kernel/fork.c:1103
> >  exit_mm kernel/exit.c:477 [inline]
> >  do_exit+0xa51/0x2dd0 kernel/exit.c:780
> >  do_group_exit+0x125/0x340 kernel/exit.c:891
> >  __do_sys_exit_group kernel/exit.c:902 [inline]
> >  __se_sys_exit_group kernel/exit.c:900 [inline]
> >  __x64_sys_exit_group+0x3a/0x50 kernel/exit.c:900
> >  do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
> >  entry_SYSCALL_64_after_hwframe+0x49/0xb3
>
> This is not new and certainly not caused by commit e950564b97fd.

Sorry for replying late...

> hugetlbf only operates on huge page aligned and sized files/mappings.
> To make sure this happens, the mmap code contians the following to 'round
> up' length to huge page size:
>
>         if (!(flags & MAP_ANONYMOUS)) {
>                 audit_mmap_fd(fd, flags);
>                 file = fget(fd);
>                 if (!file)
>                         return -EBADF;
>                 if (is_file_hugepages(file))
>                         len = ALIGN(len, huge_page_size(hstate_file(file)));
>                 retval = -EINVAL;
>                 if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file)))
>                         goto out_fput;
>         } else if (flags & MAP_HUGETLB) {
>                 struct user_struct *user = NULL;
>                 struct hstate *hs;
>
>                 hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
>                 if (!hs)
>                         return -EINVAL;
>
>                 len = ALIGN(len, huge_page_size(hs));
>
> However, in this syzbot test case the 'file' is in an overlayfs filesystem
> created as follows:
>
> mkdir("./file0", 000)                   = 0
> mount(NULL, "./file0", "hugetlbfs", MS_MANDLOCK|MS_POSIXACL, NULL) = 0
> chdir("./file0")                        = 0
> mkdir("./file1", 000)                   = 0
> mkdir("./bus", 000)                     = 0
> mkdir("./file0", 000)                   = 0
> mount("\177ELF\2\1\1", "./bus", "overlay", 0, "lowerdir=./bus,workdir=./file1,u"...) = 0
>
> The routine is_file_hugepages() is just comparing the file ops to huegtlbfs:
>
>         if (file->f_op == &hugetlbfs_file_operations)
>                 return true;
>
> Since the file is in an overlayfs, file->f_op == ovl_file_operations.
> Therefore, length will not be rounded up to huge page size and we create a
> mapping with incorrect size which leads to the BUG.
>
> Because of the code in mmap, the hugetlbfs mmap() routine assumes length is
> rounded to a huge page size.  I can easily add a check to hugetlbfs mmap
> to validate length and return -EINVAL.  However, I think we really want to
> do the 'round up' earlier in mmap.  This is because the man page says:
>
>    Huge page (Huge TLB) mappings
>        For mappings that employ huge pages, the requirements for the arguments
>        of  mmap()  and munmap() differ somewhat from the requirements for map‐
>        pings that use the native system page size.
>
>        For mmap(), offset must be a multiple of the underlying huge page size.
>        The system automatically aligns length to be a multiple of the underly‐
>        ing huge page size.
>
> Since the location for the mapping is chosen BEFORE getting to the hugetlbfs
> mmap routine, we can not wait until then to round up the length.  Is there a
> defined way to go from a struct file * to the underlying filesystem so we
> can continue to do the 'round up' in early mmap code?

That's easy enough:

static inline struct file *real_file(struct file *file)
{
    return file->f_op != ovl_file_operations ? file : file->private_data;
}

But adding more filesystem specific code to generic code does not
sound like the cleanest way to solve this...

> One other thing I noticed with overlayfs is that it does not contain a
> specific get_unmapped_area file_operations routine.  I would expect it to at
> least check for and use the get_unmapped_area of the underlying filesystem?
> Can someone comment if this is by design?

Not sure.  What exactly is f_op->get_unmapped_area supposed to do?

> In the case of hugetlbfs, get_unmapped_area is even provided by most
> architectures.  So, it seems like we would like/need to be calling the correct
> routine.

Okay.

Thanks,
Miklos

  reply	other threads:[~2020-05-12 15:05 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-06  3:06 syzbot
2020-04-06 22:05 ` Mike Kravetz
2020-05-12 15:04   ` Miklos Szeredi [this message]
2020-05-12 18:11     ` Mike Kravetz
2020-05-15 22:15       ` Mike Kravetz
2020-05-18 11:12         ` Miklos Szeredi
2020-05-18 23:22           ` Mike Kravetz
2020-05-18 23:41     ` Colin Walters
2020-05-19  0:35       ` Mike Kravetz
2020-05-20 11:20         ` Miklos Szeredi
2020-05-20 17:27           ` Mike Kravetz
2020-05-22 10:05             ` Miklos Szeredi
2020-05-28  0:01               ` Mike Kravetz
2020-05-28  8:37                 ` [PATCH v2] ovl: provide real_file() and overlayfs get_unmapped_area() kbuild test robot
2020-05-28 21:01                   ` Mike Kravetz
2020-06-04  9:16                     ` Miklos Szeredi
2020-06-11  0:13                       ` Mike Kravetz
2020-06-11  0:37                         ` Al Viro
2020-06-11  1:36                           ` Matthew Wilcox
2020-06-11  2:17                             ` Al Viro
2020-06-11  2:31                               ` Mike Kravetz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJfpegtVca6H1JPW00OF-7sCwpomMCo=A2qr5K=9uGKEGjEp3w@mail.gmail.com' \
    --to=miklos@szeredi.hu \
    --cc=akpm@linux-foundation.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=mszeredi@redhat.com \
    --cc=syzbot+d6ec23007e951dadf3de@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=viro@zeniv.linux.org.uk \
    --subject='Re: kernel BUG at mm/hugetlb.c:LINE'\!'' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).