Linux-Fsdevel Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Al Viro <viro@zeniv.linux.org.uk>
To: Thiago Macieira <thiago.macieira@intel.com>
Cc: linux-fsdevel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: fcntl(F_DUPFD) causing apparent file descriptor table corruption
Date: Tue, 19 May 2020 22:45:20 +0100	[thread overview]
Message-ID: <20200519214520.GS23230@ZenIV.linux.org.uk> (raw)
In-Reply-To: <1645568.el9gB4U55B@tjmaciei-mobl1>

On Tue, May 19, 2020 at 02:03:03PM -0700, Thiago Macieira wrote:

> On my machine, /proc/sys/fs/nr_open is 1073741816 and I have 32 GB of RAM (if 
> the problem is related to memory consumption).
> 
> The problem only occurs when growing the table.
> 
> strace shows something like:
> fcntl(2, F_DUPFD, 1024)                 = 1024
> close(1024)                             = 0
> fcntl(2, F_DUPFD, 2048)                 = 2048
> close(2048)                             = 0
> fcntl(2, F_DUPFD, 4096)                 = 4096
> close(4096)                             = 0
> fcntl(2, F_DUPFD, 8192)                 = 8192
> close(8192)                             = 0
> fcntl(2, F_DUPFD, 16384)                = 16384
> close(16384)                            = 0
> fcntl(2, F_DUPFD, 32768)                = 32768
> close(32768)                            = 0
> fcntl(2, F_DUPFD, 65536)                = 65536
> close(65536)                            = 0
> fcntl(2, F_DUPFD, 131072)               = 131072
> close(131072)                           = 0
> fcntl(2, F_DUPFD, 262144)               = 262144
> close(262144)                           = 0
> fcntl(2, F_DUPFD, 524288)               = 524288
> close(524288)                           = 0
> fcntl(2, F_DUPFD, 1048576)              = 1048576
> close(1048576)                          = 0
> fcntl(2, F_DUPFD, 2097152)              = 2097152
> close(2097152)                          = 0
> fcntl(2, F_DUPFD, 4194304)              = 4194304
> close(4194304)                          = 0
> fcntl(2, F_DUPFD, 8388608)              = 8388608
> close(8388608)                          = 0
> fcntl(2, F_DUPFD, 16777216)             = 16777216
> close(16777216)                         = 0
> fcntl(2, F_DUPFD, 33554432)             = 33554432
> close(33554432)                         = 0
> fcntl(2, F_DUPFD, 67108864)             = 67108864
> close(67108864)                         = 0
> fcntl(2, F_DUPFD, 134217728)            = 134217728
> close(134217728)                        = 0
> fcntl(2, F_DUPFD, 536870912)            = 536870912
> close(536870912)                        = 0
> write(1, "success\n", 8)                = EBADF

Er...  It's going by powers of two, isn't it?  So where's
268435456 between 134217728 and 536870912?  And, assuming
it's a 64bit box, 536870912 pointers is 4Gb, which suggests
that we are running into 32bit overflow on calculating
some size...  Let's see -
static int expand_fdtable(struct files_struct *files, unsigned int nr)
...
        cur_fdt = files_fdtable(files);
        BUG_ON(nr < cur_fdt->max_fds);
        copy_fdtable(new_fdt, cur_fdt);
so we probably want copy_fdtable().
static void copy_fdtable(struct fdtable *nfdt, struct fdtable *ofdt)
{
        unsigned int cpy, set;

        BUG_ON(nfdt->max_fds < ofdt->max_fds);

        cpy = ofdt->max_fds * sizeof(struct file *);

Right, here's your multiplication overflow.

        set = (nfdt->max_fds - ofdt->max_fds) * sizeof(struct file *);
        memcpy(nfdt->fd, ofdt->fd, cpy);
... and here's not getting the things copied.  Which means that pointer
is left uninitialized and the damn thing might very well be a security
problem - you'd lucked out and ran into NULL, but had there been a pointer
to something, you would've gotten a memory corruptor.

The obvious fix would be to turn cpy and set into size_t - as in
ed fs/file.c <<'EOF'
/copy_fdtable/+2s/unsigned int/size_t/
w
q
EOF

On size_t overflow you would've failed allocation before getting to that
point - see sysctl_nr_open_max initializer.  Overflow in alloc_fdtable()
(nr is unsigned int there) also can't happen, AFAICS - the worst you
can get is 1U<<31, which will fail sysctl_nr_open comparison.

I really wonder about the missing couple of syscalls in your strace, though;
could you verify that they _are_ missing and see what the fix above does to
your testcase?

  reply	other threads:[~2020-05-19 21:45 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-19 21:03 Thiago Macieira
2020-05-19 21:45 ` Al Viro [this message]
2020-05-19 22:04   ` Al Viro
2020-05-19 22:18   ` Thiago Macieira
2020-05-19 22:28     ` Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200519214520.GS23230@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=thiago.macieira@intel.com \
    --cc=torvalds@linux-foundation.org \
    --subject='Re: fcntl(F_DUPFD) causing apparent file descriptor table corruption' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).