Netdev Archive on lore.kernel.org
* [PATCH v2 00/17] RDMA: Improve use of umem in DMA drivers
@ 2020-09-04 22:41 Jason Gunthorpe
2020-09-04 22:41 ` [PATCH v2 16/17] RDMA/qedr: Remove fbo and zbva from the MR Jason Gunthorpe
2020-09-09 18:38 ` [PATCH v2 00/17] RDMA: Improve use of umem in DMA drivers Jason Gunthorpe
0 siblings, 2 replies; 4+ messages in thread
From: Jason Gunthorpe @ 2020-09-04 22:41 UTC (permalink / raw)
To: Adit Ranadive, Ariel Elior, Potnuri Bharat Teja, David S. Miller,
Devesh Sharma, Doug Ledford, Faisal Latif, Gal Pressman,
GR-everest-linux-l2, Wei Hu(Xavier),
Jakub Kicinski, Leon Romanovsky, linux-rdma, Weihang Li,
Michal Kalderon, Naresh Kumar PBS, netdev, Lijun Ou,
VMware PV-Drivers, Selvin Xavier, Yossi Leybovich, Somnath Kotur,
Sriharsha Basavapatna, Yishai Hadas
Cc: Firas JahJah, Henry Orosco, Leon Romanovsky, Michael J. Ruhl,
Michal Kalderon, Miguel Ojeda, Shiraz Saleem
Most RDMA drivers rely on a linear table of DMA addresses organized in
some device-specific page size.
For a while now the core code has had the rdma_for_each_block() SG
iterator to help break a umem into DMA blocks for use in the device lists.
Improve on this by adding rdma_umem_for_each_dma_block(),
ib_umem_dma_offset() and ib_umem_num_dma_blocks().
Replace open-coded versions, or calls to fixed PAGE_SIZE APIs, in most of
the drivers with one of the above APIs.
Get rid of the really weird and duplicative ib_umem_page_count().
Fix two problems with ib_umem_find_best_pgsz(), and several problems
related to computing the wrong DMA list length if IOVA != umem->address.
At this point many of the drivers have a clear path to call
ib_umem_find_best_pgsz() and replace hardcoded PAGE_SIZE or PAGE_SHIFT
values when constructing their DMA lists.
This is the first series in an effort to modernize the umem usage in all
the DMA drivers.
v1: https://lore.kernel.org/r/0-v1-00f59ce24f1f+19f50-umem_1_jgg@nvidia.com
v2:
- Fix ib_umem_find_best_pgsz() to use IOVA not umem->addr
- Fix ib_umem_num_dma_blocks() to use IOVA not umem->addr
- Two new patches to remove wrong open coded versions of
ib_umem_num_dma_blocks() from EFA and i40iw
- Redo the mlx4 ib_umem_num_dma_blocks() to do less and be safer
until the whole thing can be moved to ib_umem_find_best_pgsz()
- Two new patches to delete calls to ib_umem_offset() in qedr and
ocrdma
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Jason Gunthorpe (17):
RDMA/umem: Fix ib_umem_find_best_pgsz() for mappings that cross a page
boundary
RDMA/umem: Prevent small pages from being returned by
ib_umem_find_best_pgsz()
RDMA/umem: Use simpler logic for ib_umem_find_best_pgsz()
RDMA/umem: Add rdma_umem_for_each_dma_block()
RDMA/umem: Replace for_each_sg_dma_page with
rdma_umem_for_each_dma_block
RDMA/umem: Split ib_umem_num_pages() into ib_umem_num_dma_blocks()
RDMA/efa: Use ib_umem_num_dma_pages()
RDMA/i40iw: Use ib_umem_num_dma_pages()
RDMA/qedr: Use rdma_umem_for_each_dma_block() instead of open-coding
RDMA/qedr: Use ib_umem_num_dma_blocks() instead of
ib_umem_page_count()
RDMA/bnxt: Do not use ib_umem_page_count() or ib_umem_num_pages()
RDMA/hns: Use ib_umem_num_dma_blocks() instead of opencoding
RDMA/ocrdma: Use ib_umem_num_dma_blocks() instead of
ib_umem_page_count()
RDMA/pvrdma: Use ib_umem_num_dma_blocks() instead of
ib_umem_page_count()
RDMA/mlx4: Use ib_umem_num_dma_blocks()
RDMA/qedr: Remove fbo and zbva from the MR
RDMA/ocrdma: Remove fbo from MR
.clang-format | 1 +
drivers/infiniband/core/umem.c | 45 +++++++-----
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 72 +++++++------------
drivers/infiniband/hw/cxgb4/mem.c | 8 +--
drivers/infiniband/hw/efa/efa_verbs.c | 9 ++-
drivers/infiniband/hw/hns/hns_roce_alloc.c | 3 +-
drivers/infiniband/hw/hns/hns_roce_mr.c | 49 +++++--------
drivers/infiniband/hw/i40iw/i40iw_verbs.c | 13 +---
drivers/infiniband/hw/mlx4/cq.c | 1 -
drivers/infiniband/hw/mlx4/mr.c | 5 +-
drivers/infiniband/hw/mlx4/qp.c | 2 -
drivers/infiniband/hw/mlx4/srq.c | 5 +-
drivers/infiniband/hw/mlx5/mem.c | 4 +-
drivers/infiniband/hw/mthca/mthca_provider.c | 8 +--
drivers/infiniband/hw/ocrdma/ocrdma.h | 1 -
drivers/infiniband/hw/ocrdma/ocrdma_hw.c | 5 +-
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 25 +++----
drivers/infiniband/hw/qedr/verbs.c | 52 +++++---------
drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c | 2 +-
.../infiniband/hw/vmw_pvrdma/pvrdma_misc.c | 9 ++-
drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c | 2 +-
drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c | 6 +-
drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c | 2 +-
drivers/net/ethernet/qlogic/qed/qed_rdma.c | 12 +---
include/linux/qed/qed_rdma_if.h | 2 -
include/rdma/ib_umem.h | 37 ++++++++--
include/rdma/ib_verbs.h | 24 -------
27 files changed, 170 insertions(+), 234 deletions(-)
--
2.28.0
* [PATCH v2 16/17] RDMA/qedr: Remove fbo and zbva from the MR
2020-09-04 22:41 [PATCH v2 00/17] RDMA: Improve use of umem in DMA drivers Jason Gunthorpe
@ 2020-09-04 22:41 ` Jason Gunthorpe
2020-09-06 8:01 ` [EXT] " Michal Kalderon
2020-09-09 18:38 ` [PATCH v2 00/17] RDMA: Improve use of umem in DMA drivers Jason Gunthorpe
1 sibling, 1 reply; 4+ messages in thread
From: Jason Gunthorpe @ 2020-09-04 22:41 UTC (permalink / raw)
To: Ariel Elior, David S. Miller, Doug Ledford, GR-everest-linux-l2,
Jakub Kicinski, linux-rdma, Michal Kalderon, netdev
zbva is always false, so fbo is never read.
A 'zero-based-virtual-address' is simply IOVA == 0, and the driver already
supports this.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/infiniband/hw/qedr/verbs.c | 4 ----
drivers/net/ethernet/qlogic/qed/qed_rdma.c | 12 ++----------
include/linux/qed/qed_rdma_if.h | 2 --
3 files changed, 2 insertions(+), 16 deletions(-)
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index 278b48443aedba..cca69b4ed354ea 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -2878,10 +2878,8 @@ struct ib_mr *qedr_reg_user_mr(struct ib_pd *ibpd, u64 start, u64 len,
mr->hw_mr.pbl_two_level = mr->info.pbl_info.two_layered;
mr->hw_mr.pbl_page_size_log = ilog2(mr->info.pbl_info.pbl_size);
mr->hw_mr.page_size_log = PAGE_SHIFT;
- mr->hw_mr.fbo = ib_umem_offset(mr->umem);
mr->hw_mr.length = len;
mr->hw_mr.vaddr = usr_addr;
- mr->hw_mr.zbva = false;
mr->hw_mr.phy_mr = false;
mr->hw_mr.dma_mr = false;
@@ -2974,10 +2972,8 @@ static struct qedr_mr *__qedr_alloc_mr(struct ib_pd *ibpd,
mr->hw_mr.pbl_ptr = 0;
mr->hw_mr.pbl_two_level = mr->info.pbl_info.two_layered;
mr->hw_mr.pbl_page_size_log = ilog2(mr->info.pbl_info.pbl_size);
- mr->hw_mr.fbo = 0;
mr->hw_mr.length = 0;
mr->hw_mr.vaddr = 0;
- mr->hw_mr.zbva = false;
mr->hw_mr.phy_mr = true;
mr->hw_mr.dma_mr = false;
diff --git a/drivers/net/ethernet/qlogic/qed/qed_rdma.c b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
index a4bcde522cdf9d..baa4c36608ea91 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_rdma.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
@@ -1520,7 +1520,7 @@ qed_rdma_register_tid(void *rdma_cxt,
params->pbl_two_level);
SET_FIELD(flags, RDMA_REGISTER_TID_RAMROD_DATA_ZERO_BASED,
- params->zbva);
+ false);
SET_FIELD(flags, RDMA_REGISTER_TID_RAMROD_DATA_PHY_MR, params->phy_mr);
@@ -1582,15 +1582,7 @@ qed_rdma_register_tid(void *rdma_cxt,
p_ramrod->pd = cpu_to_le16(params->pd);
p_ramrod->length_hi = (u8)(params->length >> 32);
p_ramrod->length_lo = DMA_LO_LE(params->length);
- if (params->zbva) {
- /* Lower 32 bits of the registered MR address.
- * In case of zero based MR, will hold FBO
- */
- p_ramrod->va.hi = 0;
- p_ramrod->va.lo = cpu_to_le32(params->fbo);
- } else {
- DMA_REGPAIR_LE(p_ramrod->va, params->vaddr);
- }
+ DMA_REGPAIR_LE(p_ramrod->va, params->vaddr);
DMA_REGPAIR_LE(p_ramrod->pbl_base, params->pbl_ptr);
/* DIF */
diff --git a/include/linux/qed/qed_rdma_if.h b/include/linux/qed/qed_rdma_if.h
index f464d85e88a410..aeb242cefebfa8 100644
--- a/include/linux/qed/qed_rdma_if.h
+++ b/include/linux/qed/qed_rdma_if.h
@@ -242,10 +242,8 @@ struct qed_rdma_register_tid_in_params {
bool pbl_two_level;
u8 pbl_page_size_log;
u8 page_size_log;
- u32 fbo;
u64 length;
u64 vaddr;
- bool zbva;
bool phy_mr;
bool dma_mr;
--
2.28.0
* RE: [EXT] [PATCH v2 16/17] RDMA/qedr: Remove fbo and zbva from the MR
2020-09-04 22:41 ` [PATCH v2 16/17] RDMA/qedr: Remove fbo and zbva from the MR Jason Gunthorpe
@ 2020-09-06 8:01 ` Michal Kalderon
0 siblings, 0 replies; 4+ messages in thread
From: Michal Kalderon @ 2020-09-06 8:01 UTC (permalink / raw)
To: Jason Gunthorpe, Ariel Elior, David S. Miller, Doug Ledford,
GR-everest-linux-l2, Jakub Kicinski, linux-rdma, netdev
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Saturday, September 5, 2020 1:42 AM
> zbva is always false, so fbo is never read.
>
> A 'zero-based-virtual-address' is simply IOVA == 0, and the driver already
> supports this.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Thanks,
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
* Re: [PATCH v2 00/17] RDMA: Improve use of umem in DMA drivers
2020-09-04 22:41 [PATCH v2 00/17] RDMA: Improve use of umem in DMA drivers Jason Gunthorpe
2020-09-04 22:41 ` [PATCH v2 16/17] RDMA/qedr: Remove fbo and zbva from the MR Jason Gunthorpe
@ 2020-09-09 18:38 ` Jason Gunthorpe
1 sibling, 0 replies; 4+ messages in thread
From: Jason Gunthorpe @ 2020-09-09 18:38 UTC (permalink / raw)
To: Adit Ranadive, Ariel Elior, Potnuri Bharat Teja, David S. Miller,
Devesh Sharma, Doug Ledford, Faisal Latif, Gal Pressman,
GR-everest-linux-l2, Wei Hu(Xavier),
Jakub Kicinski, Leon Romanovsky, linux-rdma, Weihang Li,
Michal Kalderon, Naresh Kumar PBS, netdev, Lijun Ou,
VMware PV-Drivers, Selvin Xavier, Yossi Leybovich, Somnath Kotur,
Sriharsha Basavapatna, Yishai Hadas
Cc: Firas JahJah, Henry Orosco, Leon Romanovsky, Michael J. Ruhl,
Michal Kalderon, Miguel Ojeda, Shiraz Saleem
On Fri, Sep 04, 2020 at 07:41:41PM -0300, Jason Gunthorpe wrote:
Applied to for-next with Leon's note. Thanks everyone
Jason