LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: David Hildenbrand <email@example.com>
To: Dan Williams <firstname.lastname@example.org>
Cc: Linux Kernel Mailing List <email@example.com>,
Arnd Bergmann <firstname.lastname@example.org>,
Greg Kroah-Hartman <email@example.com>,
"Michael S. Tsirkin" <firstname.lastname@example.org>,
Jason Wang <email@example.com>,
"Rafael J. Wysocki" <firstname.lastname@example.org>,
Andrew Morton <email@example.com>,
Hanjun Guo <firstname.lastname@example.org>,
Andy Shevchenko <email@example.com>,
Linux MM <firstname.lastname@example.org>
Subject: Re: [PATCH v2 1/3] /dev/mem: disallow access to explicitly excluded system RAM regions
Date: Wed, 25 Aug 2021 19:27:20 +0200 [thread overview]
Message-ID: <email@example.com> (raw)
On 25.08.21 19:07, Dan Williams wrote:
> On Wed, Aug 25, 2021 at 12:23 AM David Hildenbrand <firstname.lastname@example.org> wrote:
>> On 25.08.21 02:58, Dan Williams wrote:
>>> On Mon, Aug 16, 2021 at 7:25 AM David Hildenbrand <email@example.com> wrote:
>>>> virtio-mem dynamically exposes memory inside a device memory region as
>>>> system RAM to Linux, coordinating with the hypervisor which parts are
>>>> actually "plugged" and consequently usable/accessible. On the one hand, the
>>>> virtio-mem driver adds/removes whole memory blocks, creating/removing busy
>>>> IORESOURCE_SYSTEM_RAM resources, on the other hand, it logically (un)plugs
>>>> memory inside added memory blocks, dynamically either exposing them to
>>>> the buddy or hiding them from the buddy and marking them PG_offline.
>>>> virtio-mem wants to make sure that in a sane environment, nobody
>>>> "accidentially" accesses unplugged memory inside the device managed
>>>> region. After /proc/kcore has been sanitized and /dev/kmem has been
>>>> removed, /dev/mem is the remaining interface that still allows uncontrolled
>>>> access to the device-managed region of virtio-mem devices from user
>>>> There is no known sane use case for mapping virtio-mem device memory
>>>> via /dev/mem while virtio-mem driver concurrently (un)plugs memory inside
>>>> that region. So once the driver was loaded and detected the device
>>>> along the device-managed region, we just want to disallow any access via
>>>> /dev/mem to it.
>>>> Let's add the basic infrastructure to exclude some physical memory
>>>> regions completely from /dev/mem access, on any architecture and under
>>>> any system configuration (independent of CONFIG_STRICT_DEVMEM and
>>>> independent of "iomem=").
>>> I'm certainly on team "/dev/mem considered harmful", but this approach
>>> feels awkward. It feels wrong for being non-committal about whether
>>> CONFIG_STRICT_DEVMEM is in wide enough use that the safety can be
>>> turned on all the time, and the configuration option dropped, or there
>>> are users clinging onto /dev/mem where they expect to be able to build
>>> a debug kernel to turn all of these restrictions off, even the
>>> virtio-mem ones. This splits the difference and says some /dev/mem
>>> accesses are always disallowed for "reasons", but I could say the same
>>> thing about pmem, there's no sane reason to allow /dev/mem which has
>>> no idea about the responsibilities of properly touching pmem to get
>>> access to it.
>> For virtio-mem, there is no use case *and* access could be harmful; I
>> don't even want to allow if for debugging purposes. If you want to
>> inspect virtio-mem device memory content, use /proc/kcore, which
>> performs proper synchronized access checks. Modifying random virtio-mem
>> memory via /dev/mem in a debug kernel will not be possible: if you
>> really have to play with fire, use kdb or better don't load the
>> virtio-mem driver during boot, such that the kernel won't even be making
>> use of device memory.
>> I don't want people disabling CONFIG_STRICT_DEVMEM, or booting with
>> "iomem=relaxed", and "accidentally" accessing any of virtio-mem memory
>> via /dev/mem, while it gets concurrently plugged/unplugged by the
>> virtio-mem driver. Not even for debugging purposes.
> That sounds more an argument that all of the existing "kernel is using
> this region" cases should become mandatory exclusions. If unloading
> the driver removes the exclusion then that's precisely
> CONFIG_IO_STRICT_DEVMEM. Why is the virtio-mem driver more special
> than any other driver that expects this integrity guarantee?
Unloading the driver will only remove exclusion if the driver can be
unloaded cleanly -- if there is no memory added to Linux. Similar to
force-unbinding dax/kmem without offlining memory, the whole device
range will remain excluded.
(unloading the driver is only even implemented because there is no way
to not implement it; there is no sane use case for virtio-mem to do that)
There are 2 things that are relevant for virtio-mem memory in regards of
1. Kernel is currently using it (added virtio-mem memory). Don't allow
access. Pretty much like most other things we want to exclude, I agree.
2. Kernel is currently not using it (not yet added virtio-mem memory),
or not using it right now any more (removed virtio-mem memory). In
contrast to other devices (DIMM, PMEM, ...) there is no sane use case
for this memory, because the VM must not use it (as defined in the
I care about 2) a lot because I don't want people looking at
/proc/iomem, figuring out that there is something to map. And by the
time they try to map it via /dev/mem, virtio-mem emoved that memory, yet
a /dev/mem mapping happened and we have invalid memory access.
Mapping /dev/mem and accidentally being able to read/write virtio-mem
memory has to be forbidden in sane environments. Force unloading a
driver or preventing it from loading just to touch virtio-mem memory via
/dev/mem is not a sane environment, someone is explicitly is asking for
trouble, which is fine.
>> We disallow mapping to some other regions independent of
>> CONFIG_STRICT_DEVMEM already, so the idea to ignore CONFIG_STRICT_DEVMEM
>> is not completely new:
>> "Note that with PAT support enabled, even in this case there are
>> restrictions on /dev/mem use due to the cache aliasing requirements."
>> Maybe you even want to do something similar with PMEM now that there is
>> infrastructure for it and just avoid having to deal with revoking
>> /dev/mem mappings later.
> That would be like blocking writes to /dev/sda just because a
> filesytem might later be mounted on it. If the /dev/mem access is not
> actively colliding with other kernel operations what business does the
> kernel have saying no?
That the spec defines that that memory must not be read/written, because
there might not be any memory after all anymore backing the virtio-mem
device, or there is and the hypervisor will flag you as "malicious" and
eventually zap the VM. That's different to most physical devices I am
> I'm pushing on this topic because I am also considering an exclusion
> on PCI configuration access to the "DOE mailbox" since it can disrupt
> the kernel's operation, at the same time, root can go change PCI BARs
> to nonsensical values whenever it wants which is also in the category
> of "has no use case && could be harmful".
>> I think there are weird debugging/educational setups  that still
>> require CONFIG_STRICT_DEVMEM=n even with iomem=relaxed. Take a look at
>> lib/devmem_is_allowed.c:devmem_is_allowed(), it disallows any access to
>> (what's currently added as) System RAM. It might just do what people
>> want when dealing with system RAM that doesn't suddenly vanish , so I
>> don't ultimately see why we should remove CONFIG_STRICT_DEVMEM=n.
> Yes, I wanted to tease out more of your rationale on where the line
> should be drawn, I think a mostly unfettered /dev/mem mode is here to
I could most certainly be convinced to
a) Leave CONFIG_STRICT_DEVMEM=n untouched
b) Restrict what I propose to CONFIG_STRICT_DEVMEM=y.
I could even go ahead and require CONFIG_STRICT_DEVMEM for virtio-mem.
David / dhildenb
next prev parent reply other threads:[~2021-08-25 17:27 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-16 14:25 [PATCH v2 0/3] virtio-mem: disallow mapping virtio-mem memory via /dev/mem David Hildenbrand
2021-08-16 14:25 ` [PATCH v2 1/3] /dev/mem: disallow access to explicitly excluded system RAM regions David Hildenbrand
2021-08-25 0:58 ` Dan Williams
2021-08-25 7:23 ` David Hildenbrand
2021-08-25 17:07 ` Dan Williams
2021-08-25 17:27 ` David Hildenbrand [this message]
2021-08-16 14:25 ` [PATCH v2 2/3] virtio-mem: disallow mapping virtio-mem memory via /dev/mem David Hildenbrand
2021-08-16 14:25 ` [PATCH v2 3/3] kernel/resource: cleanup and optimize iomem_is_exclusive() David Hildenbrand
2021-08-23 19:14 ` [PATCH v2 0/3] virtio-mem: disallow mapping virtio-mem memory via /dev/mem David Hildenbrand
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--subject='Re: [PATCH v2 1/3] /dev/mem: disallow access to explicitly excluded system RAM regions' \
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).