From: Haggai Eran <haggaie@mellanox.com>
To: Yann Droneaud <ydroneaud@opteya.com>
Cc: Shachar Raindel <raindel@mellanox.com>,
	Sagi Grimberg <sagig@mellanox.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: CVE-2014-8159 kernel: infiniband: uverbs: unprotected physical memory access
Date: Tue, 14 Apr 2015 11:11:29 +0300
Message-ID: <552CCBB1.6000207@mellanox.com>
In-Reply-To: <1428931781.22575.232.camel@opteya.com>

On 13/04/2015 16:29, Yann Droneaud wrote:
> On Thursday, 02 April 2015 at 18:12 +0000, Haggai Eran wrote:
...
>>
>> I want to add that we would like to see users registering a very large
>> memory region (perhaps the entire process address space) for local
>> access, and then enabling remote access only to specific regions using
>> memory windows. However, this isn't supported yet by our driver.
> 
> In such a scheme, the registration must still be handled "manually":
> one has to create a memory window to get an rkey to exchange with a
> peer, so why would one want to register such a large memory region
> (the whole process space)?
> 
> I guess creating a memory window is faster than registering a memory
> region.
Right.

It takes time to create and fill the hardware's page tables. Using
memory windows allows you to reuse that work while still having more
granular control over the RDMA permissions. The larger MR can be
created with only local permissions, and the memory window can add
specific remote permissions to a smaller range. The memory window
reuses the page tables already created for the memory region.
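
To make that concrete, here is a rough sketch of the pattern using the
libibverbs memory window verbs (ibv_reg_mr, ibv_alloc_mw, ibv_bind_mw).
This is not taken from our driver or any real application; the helper
name and its parameters are placeholders:

/*
 * Sketch: register a large MR with local permissions only, then bind a
 * type 1 memory window over a sub-range to grant a peer remote write
 * access. "pd", "qp", the buffer and the sub-range are assumed to be
 * provided by the application.
 */
#include <stddef.h>
#include <stdint.h>
#include <infiniband/verbs.h>

static uint32_t expose_subrange(struct ibv_pd *pd, struct ibv_qp *qp,
				void *buf, size_t buf_len,
				size_t offset, size_t win_len)
{
	/* Large MR, local access only; IBV_ACCESS_MW_BIND allows memory
	 * windows to be bound over it later. */
	struct ibv_mr *mr = ibv_reg_mr(pd, buf, buf_len,
				       IBV_ACCESS_LOCAL_WRITE |
				       IBV_ACCESS_MW_BIND);
	if (!mr)
		return 0;

	struct ibv_mw *mw = ibv_alloc_mw(pd, IBV_MW_TYPE_1);
	if (!mw)
		return 0;

	/* The bind reuses the MR's page tables, so it is much cheaper than
	 * registering a new MR for the sub-range. It is posted on the QP's
	 * send queue; wait for its completion before giving the rkey to
	 * the peer. */
	struct ibv_mw_bind bind = {
		.wr_id = 1,
		.send_flags = IBV_SEND_SIGNALED,
		.bind_info = {
			.mr = mr,
			.addr = (uint64_t)(uintptr_t)buf + offset,
			.length = win_len,
			.mw_access_flags = IBV_ACCESS_REMOTE_WRITE,
		},
	};
	if (ibv_bind_mw(qp, mw, &bind))
		return 0;

	return mw->rkey;	/* exchanged with the peer instead of mr->rkey */
}

For a type 2 window the bind would instead be posted as an
IBV_WR_BIND_MW work request, but the idea is the same.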

> I'd rather say this is not an excuse to register a larger memory
> region (up to the whole process space, current and future) as it
> sounds like a surprising optimisation: let the HCA know about too many
> pages just to be sure it already knows some of them when the process
> wants to use them. It seems it would become difficult to handle if
> there are too many processes.
Are you worried about pinning too many pages? That is an issue we want
to solve with ODP :)

> 
> I would prefer creating a memory region to become costless (through ODP :).
I agree :)

> 
>>  Still, there are valid cases where you would want the results of an
>> mmap(0,...) call to be remotely accessible, when there is enough trust
>> between the local process and the remote process.
> 
> mmap(0, ...., fd) lets the kernel choose where to put the file in the
> process virtual memory space, so it may, may not, or may only partially
> end up in an ODP pre-registered memory region covering a range that is
> still unallocated/unused.
> 
> I don't think one wants that to happen.
I think that in some cases the benefit of allowing this outweighs the
risks. This is why it is an opt-in feature.

> 
>>  It may help a middleware communication library register a large
>> portion of the address space in advance, and still work with random
>> pointers given to it by another application module.
>>
> 
> But as said at the beginning of your message, the middleware would have
> to bind a memory window before posting a work request / exposing an
> rkey for the "random pointers".
> 
> So I fail to understand how ODP could be used when it comes to
> registering a memory region not yet backed by anything.

In this scenario, the middleware would first register the full address
space as an ODP memory region with local permissions only. When it wants
to provide remote access to some buffer, it would bind a memory window
over the ODP MR. This is possible with multiple processes because it
uses the virtual memory system without pinning. It won't cause random
mmap regions to be mapped for RDMA without the specific intent of the
application.

However, we currently don't have support for memory windows over ODP
MRs. Even if we did, there would be some performance penalty from
binding and invalidating memory windows, so some applications will
still need full process address space access for RDMA.
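
For illustration only, the registration step of that scenario would look
roughly like the sketch below. The helper and its parameters are
placeholders, and as noted above, binding a memory window on top of such
an MR is exactly the part that is still missing:

/*
 * Sketch: register a large range as an ODP MR with local access and no
 * remote permissions. Pages are not pinned; the HCA faults them in on
 * demand. Remote access would then be granted per buffer by binding a
 * memory window over this MR, as in the earlier sketch.
 */
#include <stddef.h>
#include <infiniband/verbs.h>

static struct ibv_mr *register_odp_range(struct ibv_pd *pd,
					 void *base, size_t span)
{
	return ibv_reg_mr(pd, base, span,
			  IBV_ACCESS_LOCAL_WRITE |
			  IBV_ACCESS_ON_DEMAND |
			  IBV_ACCESS_MW_BIND);
}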

Regards,
Haggai

