LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Alexander Mihalicyn <alexander@mihalicyn.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Manfred Spraul <manfred@colorfullife.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Pavel Tikhomirov <ptikhomirov@virtuozzo.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.com>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Andrei Vagin <avagin@gmail.com>,
	Christian Brauner <christian@brauner.io>,
	alexander.mikhalitsyn@virtuozzo.com
Subject: Re: [PATCH 0/2] shm: omit forced shm destroy if task IPC namespace was changed
Date: Wed, 14 Jul 2021 20:42:39 +0300	[thread overview]
Message-ID: <CAJqdLrqUmY02fPPFzdq_cUM-U4vDHneQ-RDQbCpDErdsLcNNfQ@mail.gmail.com> (raw)
In-Reply-To: <CAJqdLrov0VBzHamSvMRKBHSJX+NROUx0TUsRD9U0zEWUn5XxDA@mail.gmail.com>

Hi Eric,

I've sent the RFC patch "shm: extend forced shm destroy to support
objects from", please take a look once time permits.

A few words about design:
1. In the exit_shm() function I'm using a temporary list to collect
shm's before destroying because I can't take semaphore (which protects
ns idr structure) under spinlock (which protects list on task)
2. I've to take IPC ns refcounter in exit_shm() to prevent possible
race with shm_exit_ns().
3. I haven't added list_del_init(&shp->shm_clist) into shmat() syscall
because I can't understand why this is safe. But maybe I've to think
more about that.
4. I need to remove the extra "ns" argument (todo).

Thanks,
Alex

On Mon, Jul 12, 2021 at 10:27 PM Alexander Mihalicyn
<alexander@mihalicyn.com> wrote:
>
> Hi Eric,
>
> On Mon, Jul 12, 2021 at 10:18 PM Eric W. Biederman
> <ebiederm@xmission.com> wrote:
> >
> > Alexander Mihalicyn <alexander@mihalicyn.com> writes:
> >
> > > Hello Manfred,
> > >
> > > On Sun, Jul 11, 2021 at 2:47 PM Manfred Spraul <manfred@colorfullife.com> wrote:
> > >>
> > >> Hi Alex,
> > >>
> > >>
> > >> Am Sonntag, 11. Juli 2021 schrieb Alexander Mihalicyn <alexander@mihalicyn.com>:
> > >> >
> > >> > Hi, Manfred,
> > >> >
> > >> > On Sun, Jul 11, 2021 at 12:13 PM Manfred Spraul
> > >> > <manfred@colorfullife.com> wrote:
> > >> > >
> > >> > > Hi,
> > >> > >
> > >> > >
> > >> > > Am Samstag, 10. Juli 2021 schrieb Alexander Mihalicyn <alexander@mihalicyn.com>:
> > >> > >>
> > >> > >>
> > >> > >> Now, using setns() syscall, we can construct situation when on
> > >> > >> task->sysvshm.shm_clist list
> > >> > >> we have shm items from several (!) IPC namespaces.
> > >> > >>
> > >> > >>
> > >> > > Does this imply that locking ist affected as well? According to the initial patch, accesses to shm_clist are protected by "the" IPC shm namespace rwsem. This can't work if the list contains objects from several namespaces.
> > >> >
> > >> > Of course, you are right. I've to rework this part -> I can add check into
> > >> > static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
> > >> > function and before adding new shm into task list check that list is empty OR
> > >> > an item which is present on the list from the same namespace as
> > >> > current->nsproxy->ipc_ns.
> > >> >
> > >> Ok. (Sorry, I have only smartphone internet, thus I could not check
> > >> the patch fully)
> > >>
> > >> > >> I've proposed a change which keeps the old behaviour of setns() but
> > >> > >> fixes double free.
> > >> > >>
> > >> > > Assuming that locking works, I would consider this as a namespace design question: Do we want to support that a task contains shm objects from several ipc namespaces?
> > >> >
> > >> > This depends on what we mean by "task contains shm objects from
> > >> > several ipc namespaces". There are two meanings:
> > >> >
> > >> > 1. Task has attached shm object from different ipc namespaces
> > >> >
> > >> > We already support that by design. When we doing a change of namespace
> > >> > using unshare(CLONE_NEWIPC) even with
> > >> > sysctl shm_rmid_forced=1 we not detach all ipc's from task!
> > >>
> > >> OK. Thus shm and sem have different behavior anyways.
> > >>
> > >> >
> > >> > 2. Task task->sysvshm.shm_clist list has items from different IPC namespaces.
> > >> >
> > >> > I'm not sure, do we need that or not. But I'm ready to prepare a patch
> > >> > for any of the options which we choose:
> > >> > a) just add exit_shm(current)+shm_init_task(current);
> > >> > b) prepare PATCHv2 with appropriate check in the newseg() to prevent
> > >> > adding new items from different namespace to the list
> > >> > c) rework algorithm so we can safely have items from different
> > >> > namespaces in task->sysvshm.shm_clist
> > >> >
> > >> Before you write something, let's wait what the others say. I don't
> > >> qualify AS shm expert
> > >>
> > >> a) is user space visible, without any good excuse
> > >
> > > yes, but maybe we decide that this is not so critical?
> > > We need more people here :)
> >
> > It is barely visible.  You have to do something very silly
> > to see this happening.  It is probably ok, but the work to
> > verify that nothing cares so that we can safely backport
> > the change is probably much more work than just updating
> > the list to handle shmid's for multiple namespaces.
> >
> >
> > >> c) is probably highest amount of Changes
> > >
> > > yep. but ok, I will prepare patches fast.
> >
> > Given that this is a bug I think c) is the safest option.
> >
> > A couple of suggestions.
> > 1) We can replace the test "shm_creator != NULL" with
> >    "list_empty(&shp->shm_clist)" and remove shm_creator.
> >
> >    Along with replacing "shm_creator = NULL" with
> >    "list_del_init(&shp->shm_clist)".
> >
> > 2) We can update shmat to do "list_del_init(&shp->shm_clist)"
> >    upon shmat.  The last unmap will still shm_destroy the
> >    shm segment as ns->shm_rmid_forced is set.
> >
> >    For a multi-threaded process I think this will nicely clean up
> >    the clist, and make it clear that the clist only cares about
> >    those segments that have been created but never attached.
> >
> > 3) Put a non-reference counted struct ipc_namespace in struct
> >    shmid_kernel, and use it to remove the namespace parameter
> >    from shm_destroy.
>
> Thanks for your detailed suggestions! ;)
> I will prepare a patch tomorrow for kernel + test what's happening with
> CRIU and will prepare a fix for it.
>
> >
> > I think that is enough to fix this bug with no changes in semantics,
> > no additional memory consumed, and an implementation that is easier
> > to read and perhaps a little faster.
> >
> > Eric
>
> Regards,
> Alex

  parent reply	other threads:[~2021-07-14 17:42 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-06 13:22 Alexander Mikhalitsyn
2021-07-06 13:22 ` [PATCH 1/2] shm: skip shm_destroy " Alexander Mikhalitsyn
2021-07-06 13:22 ` [PATCH 2/2] ipc: WARN if trying to remove ipc object which is absent Alexander Mikhalitsyn
2021-07-10  1:12 ` [PATCH 0/2] shm: omit forced shm destroy if task IPC namespace was changed Andrew Morton
2021-07-10 10:55   ` Alexander Mihalicyn
     [not found]     ` <CALgW_8VUk0us_umLncUv2DUMkOi3qixmT+YkHV4Dhpt_nNMZHw@mail.gmail.com>
2021-07-11 10:33       ` Alexander Mihalicyn
2021-07-11 11:46         ` Manfred Spraul
2021-07-12  9:54           ` Alexander Mihalicyn
2021-07-12 19:18             ` Eric W. Biederman
2021-07-12 19:27               ` Alexander Mihalicyn
2021-07-14 17:30                 ` [RFC PATCH] shm: extend forced shm destroy to support objects from several IPC nses Alexander Mikhalitsyn
2021-07-21  6:32                   ` Manfred Spraul
2021-07-22 18:46                   ` Manfred Spraul
2021-07-30 14:52                     ` Alexander Mihalicyn
2021-07-14 17:42                 ` Alexander Mihalicyn [this message]
2021-09-23 16:36               ` [PATCH 0/2] shm: omit forced shm destroy if task IPC namespace was changed Manfred Spraul
2021-09-24 16:45                 ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJqdLrqUmY02fPPFzdq_cUM-U4vDHneQ-RDQbCpDErdsLcNNfQ@mail.gmail.com \
    --to=alexander@mihalicyn.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.mikhalitsyn@virtuozzo.com \
    --cc=avagin@gmail.com \
    --cc=christian@brauner.io \
    --cc=dave@stgolabs.net \
    --cc=ebiederm@xmission.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=manfred@colorfullife.com \
    --cc=mhocko@suse.com \
    --cc=ptikhomirov@virtuozzo.com \
    --cc=vdavydov.dev@gmail.com \
    --subject='Re: [PATCH 0/2] shm: omit forced shm destroy if task IPC namespace was changed' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).