LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
From: Alexander Mihalicyn <alexander@mihalicyn.com>
To: Manfred Spraul <manfred@colorfullife.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Milton Miller <miltonm@bga.com>,
	Jack Miller <millerjo@us.ibm.com>,
	Pavel Tikhomirov <ptikhomirov@virtuozzo.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.com>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Andrei Vagin <avagin@gmail.com>,
	Christian Brauner <christian@brauner.io>
Subject: Re: [PATCH 0/2] shm: omit forced shm destroy if task IPC namespace was changed
Date: Sun, 11 Jul 2021 13:33:22 +0300	[thread overview]
Message-ID: <CAJqdLrofd76x_hziq7F3wY3jqZfE1LNZbQ8sD6MUFXbPHVcdVw@mail.gmail.com> (raw)
In-Reply-To: <CALgW_8VUk0us_umLncUv2DUMkOi3qixmT+YkHV4Dhpt_nNMZHw@mail.gmail.com>

Hi, Manfred,

On Sun, Jul 11, 2021 at 12:13 PM Manfred Spraul
<manfred@colorfullife.com> wrote:
>
> Hi,
>
>
> Am Samstag, 10. Juli 2021 schrieb Alexander Mihalicyn <alexander@mihalicyn.com>:
>>
>>
>> Now, using setns() syscall, we can construct situation when on
>> task->sysvshm.shm_clist list
>> we have shm items from several (!) IPC namespaces.
>>
>>
> Does this imply that locking ist affected as well? According to the initial patch, accesses to shm_clist are protected by "the" IPC shm namespace rwsem. This can't work if the list contains objects from several namespaces.

Of course, you are right. I've to rework this part -> I can add check into
static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
function and before adding new shm into task list check that list is empty OR
an item which is present on the list from the same namespace as
current->nsproxy->ipc_ns.

>
> From ipc/shm.c:
>
>  397                down_read(&shm_ids(ns).rwsem);
>  398                list_for_each_entry(shp, &task->sysvshm.shm_clist, shm_clist)
>  399                        shp->shm_creator = NULL;
>  400                /*
>  401                 * Only under read lock but we are only called on current
>  402                 * so no entry on the list will be shared.
>  403                 */
>  404                list_del(&task->sysvshm.shm_clist);
>  405                up_read(&shm_ids(ns).rwsem);
>
>
> Task A and B in same namespace
>
> - A: create shm object
>
> - A: setns()
>
> - in parallel B) does shmctl(IPC_RMID), A) does exit()

Yep.

>
>
>>
>>
>> So, semantics of setns() and unshare() is different here. We can fix
>> this problem by adding
>> analogical calls to exit_shm(), shm_init_task() into
>>
>> static void commit_nsset(struct nsset *nsset)
>> ...
>> #ifdef CONFIG_IPC_NS
>> if (flags & CLONE_NEWIPC) {
>>      exit_sem(me);
>> +   shm_init_task(current);
>> +  exit_shm(current);
>> }
>> #endif
>>
>> with this change semantics of unshare() and setns() will be equal in
>> terms of the shm_rmid_forced
>> feature.
>
> Additional advantage: exit_sem() and exit_shm() would appear as pairs, both in unshare () and setns().
>
>> But this may break some applications which using setns() and
>> IPC resources not destroying
>> after that syscall. (CRIU using setns() extensively and we have to
>> change our IPC ns C/R implementation
>> a little bit if we take this way of fixing the problem).
>>
>> I've proposed a change which keeps the old behaviour of setns() but
>> fixes double free.
>>
> Assuming that locking works, I would consider this as a namespace design question: Do we want to support that a task contains shm objects from several ipc namespaces?

This depends on what we mean by "task contains shm objects from
several ipc namespaces". There are two meanings:

1. Task has attached shm object from different ipc namespaces

We already support that by design. When we doing a change of namespace
using unshare(CLONE_NEWIPC) even with
sysctl shm_rmid_forced=1 we not detach all ipc's from task! Let see on
shm_exit() functio which is validly called
when we doing unshare():

if (shm_may_destroy(ns, shp)) { <--- (shp->shm_nattch == 0) &&
(ns->shm_rmid_forced || (shp->shm_perm.mode & SHM_DEST))
    shm_lock_by_ptr(shp);
    shm_destroy(ns, shp);
}

here all depends on shp->shm_nattch which will be non-zero if used
doing something like this:

int id = shmget(0xAAAA, 4096, IPC_CREAT|0700);
void *addr = shmat(id, NULL, 0); // <-- map shm to the task address space
unshare(CLONE_NEWIPC); // <--- task->sysvshm.shm_clist is cleared! But
shm 0xAAAA remains attached
id = shmget(0xBBBB, 4096, IPC_CREAT|0700); // <-- add item to the
task->sysvshm.shm_clist now it contains object only from new IPC
namespace
addr = shmat(id, NULL, 0);

So, this task->sysvshm.shm_clist list used only for shm_rmid_forced
feature. It doesn't affect any mm-related things like /proc/<pid>/maps
or something similar.

2. Task task->sysvshm.shm_clist list has items from different IPC namespaces.

I'm not sure, do we need that or not. But I'm ready to prepare a patch
for any of the options which we choose:
a) just add exit_shm(current)+shm_init_task(current);
b) prepare PATCHv2 with appropriate check in the newseg() to prevent
adding new items from different namespace to the list
c) rework algorithm so we can safely have items from different
namespaces in task->sysvshm.shm_clist

and, of course, prepare a test case with this particular bug
reproducer to prevent future degradations and increase coverage.
(I'm not publishing the reproducer program directly on the list at the
moment because it may be not fully safe.
But I think any of us already knows how to trigger the problem.)

>
> Does it work everywhere (/proc/{pid}, ...)?
> --
>    Manfred

Thanks,
Alex

  parent reply	other threads:[~2021-07-11 10:33 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-06 13:22 Alexander Mikhalitsyn
2021-07-06 13:22 ` [PATCH 1/2] shm: skip shm_destroy " Alexander Mikhalitsyn
2021-07-06 13:22 ` [PATCH 2/2] ipc: WARN if trying to remove ipc object which is absent Alexander Mikhalitsyn
2021-07-10  1:12 ` [PATCH 0/2] shm: omit forced shm destroy if task IPC namespace was changed Andrew Morton
2021-07-10 10:55   ` Alexander Mihalicyn
     [not found]     ` <CALgW_8VUk0us_umLncUv2DUMkOi3qixmT+YkHV4Dhpt_nNMZHw@mail.gmail.com>
2021-07-11 10:33       ` Alexander Mihalicyn [this message]
2021-07-11 11:46         ` Manfred Spraul
2021-07-12  9:54           ` Alexander Mihalicyn
2021-07-12 19:18             ` Eric W. Biederman
2021-07-12 19:27               ` Alexander Mihalicyn
2021-07-14 17:30                 ` [RFC PATCH] shm: extend forced shm destroy to support objects from several IPC nses Alexander Mikhalitsyn
2021-07-21  6:32                   ` Manfred Spraul
2021-07-22 18:46                   ` Manfred Spraul
2021-07-30 14:52                     ` Alexander Mihalicyn
2021-07-14 17:42                 ` [PATCH 0/2] shm: omit forced shm destroy if task IPC namespace was changed Alexander Mihalicyn
2021-09-23 16:36               ` Manfred Spraul
2021-09-24 16:45                 ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJqdLrofd76x_hziq7F3wY3jqZfE1LNZbQ8sD6MUFXbPHVcdVw@mail.gmail.com \
    --to=alexander@mihalicyn.com \
    --cc=akpm@linux-foundation.org \
    --cc=avagin@gmail.com \
    --cc=christian@brauner.io \
    --cc=dave@stgolabs.net \
    --cc=ebiederm@xmission.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=manfred@colorfullife.com \
    --cc=mhocko@suse.com \
    --cc=millerjo@us.ibm.com \
    --cc=miltonm@bga.com \
    --cc=ptikhomirov@virtuozzo.com \
    --cc=vdavydov.dev@gmail.com \
    --subject='Re: [PATCH 0/2] shm: omit forced shm destroy if task IPC namespace was changed' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).