LKML Archive on lore.kernel.org
From: NeilBrown <neilb@suse.com>
To: Mike Marion <mmarion@qualcomm.com>, Ian Kent <raven@themaw.net>
Cc: autofs mailing list <autofs@vger.kernel.org>,
Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 3/3] autofs - fix AT_NO_AUTOMOUNT not being honored
Date: Wed, 29 Nov 2017 12:17:27 +1100
Message-ID: <87a7z5yjbs.fsf@notabene.neil.brown.name>
In-Reply-To: <20171128002935.GC27898@qualcomm.com>
On Tue, Nov 28 2017, Mike Marion wrote:
> On Tue, Nov 28, 2017 at 07:43:05AM +0800, Ian Kent wrote:
>
>> I think the situation is going to get worse before it gets better.
>>
>> On recent Fedora and kernel, with a large map and heavy mount activity
>> I see:
>>
>> systemd, udisksd, gvfs-udisks2-volume-monitor, gvfsd-trash,
>> gnome-settings-daemon, packagekitd and gnome-shell
>>
>> all go crazy consuming large amounts of CPU.
>
> Yep. I'm not even worried about the CPU usage as much (yet, I'm sure
> it'll be more of a problem as time goes on). We have pretty huge
> direct maps and our initial startup tests on a new host with the link vs
> file took >6 hours. That's not a typo. We worked with Suse engineering
> to come up with a fix, which should've been pushed here some time ago.
>
> Then, there's shutdowns (and reboots). They also took a long time (on
> the order of 20+min) because it would walk the entire /proc/mounts
> "unmounting" things. Also fixed now. That one had something to do in
> SMP code as if you used a single CPU/core, it didn't take long at all.
>
> Just got a fix for the suse grub2-mkconfig script to fix their parsing
> looking for the root dev to skip over fstype autofs
> (probe_nfsroot_device function).
>
>> The symlink change was probably the start, now a number of applications
>> go directly to the proc file system for this information.
>>
>> For large mount tables and many processes accessing the mount table
>> (probably reading the whole thing, either periodically or on change
>> notification) the current system does not scale well at all.
>
> We use Clearcase in some instances as well, and that's yet another thing
> adding mounts, and its startup is very slow, due to the size of
> /proc/mounts.
>
> It's definitely something that's more than just autofs and probably
> going to get worse, as you say.
If we assume that applications are going to want to read
/proc/self/mount* a lot, we probably need to make it faster.

I performed a simple experiment where I mounted 1000 tmpfs filesystems,
copied /proc/self/mountinfo to /tmp/mountinfo, then ran 4 for loops in
parallel, each catting one of these files to /dev/null 1000 times.
On a single CPU VM:
  /tmp/mountinfo: about 3 seconds per group of 1000 cats
  /proc/self/mountinfo: about 14 seconds per group of 1000 cats
On a 4 CPU VM:
  /tmp/mountinfo: 1.5 seconds
  /proc/self/mountinfo: 3.5 seconds
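
For anyone who wants to repeat the comparison, the timing part can be
sketched roughly as below. This is only an illustration, not the script
used for the numbers above: it runs in a single process rather than four
parallel loops, it does not mount the 1000 tmpfs filesystems, and the
helper names time_reads and snapshot are made up for the example.

```python
import os
import tempfile
import time


def time_reads(path, repeats=1000, bufsize=128 * 1024):
    """Read `path` to EOF `repeats` times and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        fd = os.open(path, os.O_RDONLY)
        try:
            # Keep reading until read() returns an empty result (EOF).
            while os.read(fd, bufsize):
                pass
        finally:
            os.close(fd)
    return time.perf_counter() - start


def snapshot(src, dst):
    """Copy `src` to `dst` using plain reads and return `dst`."""
    with open(src, "rb") as f, open(dst, "wb") as out:
        out.write(f.read())
    return dst


def compare(live="/proc/self/mountinfo"):
    """Time a static copy against the live file (Linux only)."""
    copy = snapshot(live, os.path.join(tempfile.gettempdir(), "mountinfo"))
    return {path: time_reads(path) for path in (copy, live)}
```

Calling compare() on a host with a large mount table should show the
same kind of gap between the static copy and the live file, though the
absolute numbers will depend on the machine.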
Using "perf record" it appears that most of the cost is repeated calls
to prepend_path, with a small contribution from the fact that each read
only returns 4K rather than the 128K that cat asks for.
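
The read-size effect can be checked without perf or strace. The sketch
below asks for 128K on every read() and records how many bytes each call
actually returned. The function names are invented for the example.

```python
import os


def read_sizes(path, bufsize=128 * 1024):
    """Return the byte count of each read() needed to reach EOF on `path`."""
    sizes = []
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, bufsize)
            if not chunk:
                break
            sizes.append(len(chunk))
    finally:
        os.close(fd)
    return sizes


def summarize(path):
    """Condense read_sizes() into a call count, total and largest read."""
    sizes = read_sizes(path)
    return {
        "reads": len(sizes),
        "bytes": sum(sizes),
        "largest": max(sizes, default=0),
    }
```

Comparing summarize("/proc/self/mountinfo") with summarize() on a copy
of the same data in /tmp should show whether the proc file needs many
more read() calls for the same number of bytes.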
If we could hang a cache off struct mnt_namespace and use it instead of
iterating the mount table - using rcu and ns->event to ensure currency -
we should be able to minimize the cost of this increased use of
/proc/self/mount*.
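
As a userspace model of the idea (not kernel code, and with no attempt
to model rcu), the cache could look something like this. FakeNamespace
and MountTableCache are hypothetical names; the "event" counter stands
in for ns->event and is assumed to be bumped on every mount table change.

```python
class FakeNamespace:
    """Stand-in for a mount namespace: a list of lines plus a change counter."""

    def __init__(self):
        self.mounts = []
        self.event = 0

    def mount(self, line):
        self.mounts.append(line)
        self.event += 1  # every change to the table bumps the counter

    def render(self):
        # The expensive step: walk the whole table and format each entry.
        return "".join(line + "\n" for line in self.mounts)


class MountTableCache:
    """Keep the rendered text and rebuild it only when the counter has moved."""

    def __init__(self, render, current_event):
        self._render = render
        self._current_event = current_event
        self._event = None
        self._text = None
        self.rebuilds = 0

    def get(self):
        # Sample the counter before rendering: if the table changes while
        # we render, the stored counter is stale and the next get() rebuilds.
        event = self._current_event()
        if self._text is None or event != self._event:
            self._text = self._render()
            self._event = event
            self.rebuilds += 1
        return self._text
```

With this shape, repeated readers pay for one walk of the table per
change rather than one walk per read.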
I suspect that the best approach would be to implement a cache at the
seq_file level.
One possible problem might be if applications assume that a read will
always return a whole number of lines (it currently does). To be
sure we remain safe, we would only be able to use the cache for
a read() syscall which reads the whole file.
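
A rough model of that rule, again in userspace and with made-up names:
serve_read uses the cache only when a single read starting at offset 0
is large enough to return everything, and otherwise falls back to a slow
path. line_aligned_read is a simplified model of the current
whole-lines behaviour, assuming the offset always sits at a line start.

```python
def line_aligned_read(text, offset, count):
    """Return up to `count` bytes from `offset`, trimmed to whole lines."""
    chunk = text[offset:offset + count]
    if offset + count >= len(text):
        return chunk  # reached the end, nothing to trim
    cut = chunk.rfind(b"\n")
    # If not even one whole line fits, return the partial chunk unchanged.
    return chunk[:cut + 1] if cut >= 0 else chunk


def serve_read(cached, offset, count, slow_read):
    """Use `cached` only for a read that returns the whole file at once."""
    if offset == 0 and count >= len(cached):
        return cached
    return slow_read(offset, count)
```

Small or partial reads therefore keep seeing whole lines from the slow
path, and only the large single-read case benefits from the cache.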
How big do people see /proc/self/mount* getting? What size reads
does 'strace' show the various programs using to read it?
Thanks,
NeilBrown