LKML Archive on lore.kernel.org
* [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap
@ 2021-08-09 14:29 John Garry
  2021-08-09 14:29 ` [PATCH v2 01/11] blk-mq: Change rqs check in blk_mq_free_rqs() John Garry
                   ` (11 more replies)
  0 siblings, 12 replies; 31+ messages in thread
From: John Garry @ 2021-08-09 14:29 UTC (permalink / raw)
  To: axboe, jejb, martin.petersen
  Cc: linux-block, linux-kernel, linux-scsi, kashyap.desai, hare,
	ming.lei, John Garry

Currently a full set of static requests are allocated per hw queue per
tagset when shared sbitmap is used.

However, only tagset->queue_depth number of requests may be active at
any given time. As such, only tagset->queue_depth number of static
requests are required.

The same goes for using an IO scheduler, which allocates a full set of
static requests per hw queue per request queue.

This series changes shared sbitmap support by using a shared tags per
tagset and request queue. Ming suggested something along those lines in
v1. But we'll keep the "shared sbitmap" name as it is familiar. In using
a shared tags, the static rqs also become shared, reducing the number of
sets of static rqs and so reducing memory usage.

Patch "blk-mq: Use shared tags for shared sbitmap support" is a bit big,
and could be broken down. But then maintaining the ability to bisect
becomes harder and each sub-patch would get more convoluted.

For the megaraid sas driver on my 128-CPU arm64 system with 1x SATA disk,
we save approx. 300MB(!) [370MB -> 60MB].

Baseline is 2112f5c1330a (for-5.15/block) loop: Select I/O scheduler ...

Changes since v1:
- Switch to use blk_mq_tags for shared sbitmap
- Add new blk_mq_ops init request callback
- Add some RB tags (thanks!)

John Garry (11):
  blk-mq: Change rqs check in blk_mq_free_rqs()
  block: Rename BLKDEV_MAX_RQ -> BLKDEV_DEFAULT_RQ
  blk-mq: Relocate shared sbitmap resize in blk_mq_update_nr_requests()
  blk-mq: Invert check in blk_mq_update_nr_requests()
  blk-mq-sched: Rename blk_mq_sched_alloc_{tags -> map_and_request}()
  blk-mq: Pass driver tags to blk_mq_clear_rq_mapping()
  blk-mq: Add blk_mq_tag_update_sched_shared_sbitmap()
  blk-mq: Add blk_mq_ops.init_request_no_hctx()
  scsi: Set blk_mq_ops.init_request_no_hctx
  blk-mq: Use shared tags for shared sbitmap support
  blk-mq: Stop using pointers for blk_mq_tags bitmap tags

 block/bfq-iosched.c      |   4 +-
 block/blk-core.c         |   2 +-
 block/blk-mq-debugfs.c   |   8 +--
 block/blk-mq-sched.c     |  92 +++++++++++++------------
 block/blk-mq-sched.h     |   2 +-
 block/blk-mq-tag.c       | 111 +++++++++++++-----------------
 block/blk-mq-tag.h       |  12 ++--
 block/blk-mq.c           | 144 +++++++++++++++++++++++----------------
 block/blk-mq.h           |   8 +--
 block/kyber-iosched.c    |   4 +-
 block/mq-deadline-main.c |   2 +-
 drivers/block/rbd.c      |   2 +-
 drivers/scsi/scsi_lib.c  |   6 +-
 include/linux/blk-mq.h   |  20 +++---
 include/linux/blkdev.h   |   5 +-
 15 files changed, 218 insertions(+), 204 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2 01/11] blk-mq: Change rqs check in blk_mq_free_rqs()
  2021-08-09 14:29 [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
@ 2021-08-09 14:29 ` John Garry
  2021-08-09 14:29 ` [PATCH v2 02/11] block: Rename BLKDEV_MAX_RQ -> BLKDEV_DEFAULT_RQ John Garry
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: John Garry @ 2021-08-09 14:29 UTC (permalink / raw)
  To: axboe, jejb, martin.petersen
  Cc: linux-block, linux-kernel, linux-scsi, kashyap.desai, hare,
	ming.lei, John Garry

The original code in commit 24d2f90309b23 ("blk-mq: split out tag
initialization, support shared tags") would check tags->rqs is non-NULL and
then dereference tags->rqs[].

Then in commit 2af8cbe30531 ("blk-mq: split tag ->rqs[] into two"), we
started to dereference tags->static_rqs[], but continued to check non-NULL
tags->rqs.

Check tags->static_rqs as non-NULL instead, which is more logical.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 2c4ac51e54eb..ae28f470893c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2348,7 +2348,7 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 {
 	struct page *page;
 
-	if (tags->rqs && set->ops->exit_request) {
+	if (tags->static_rqs && set->ops->exit_request) {
 		int i;
 
 		for (i = 0; i < tags->nr_tags; i++) {
-- 
2.26.2



* [PATCH v2 02/11] block: Rename BLKDEV_MAX_RQ -> BLKDEV_DEFAULT_RQ
  2021-08-09 14:29 [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
  2021-08-09 14:29 ` [PATCH v2 01/11] blk-mq: Change rqs check in blk_mq_free_rqs() John Garry
@ 2021-08-09 14:29 ` John Garry
  2021-08-09 14:29 ` [PATCH v2 03/11] blk-mq: Relocate shared sbitmap resize in blk_mq_update_nr_requests() John Garry
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: John Garry @ 2021-08-09 14:29 UTC (permalink / raw)
  To: axboe, jejb, martin.petersen
  Cc: linux-block, linux-kernel, linux-scsi, kashyap.desai, hare,
	ming.lei, John Garry

It is a bit confusing that there is BLKDEV_MAX_RQ and MAX_SCHED_RQ, as
the name BLKDEV_MAX_RQ would imply the maximum number of requests
always, which it is not.

Rename BLKDEV_MAX_RQ to BLKDEV_DEFAULT_RQ, matching its usage - that being
the default number of requests assigned when allocating a request queue.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-core.c       | 2 +-
 block/blk-mq-sched.c   | 2 +-
 block/blk-mq-sched.h   | 2 +-
 drivers/block/rbd.c    | 2 +-
 include/linux/blkdev.h | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 04477697ee4b..5d71382b6131 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -579,7 +579,7 @@ struct request_queue *blk_alloc_queue(int node_id)
 
 	blk_queue_dma_alignment(q, 511);
 	blk_set_default_limits(&q->limits);
-	q->nr_requests = BLKDEV_MAX_RQ;
+	q->nr_requests = BLKDEV_DEFAULT_RQ;
 
 	return q;
 
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 0f006cabfd91..2231fb0d4c35 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -606,7 +606,7 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 	 * Additionally, this is a per-hw queue depth.
 	 */
 	q->nr_requests = 2 * min_t(unsigned int, q->tag_set->queue_depth,
-				   BLKDEV_MAX_RQ);
+				   BLKDEV_DEFAULT_RQ);
 
 	queue_for_each_hw_ctx(q, hctx, i) {
 		ret = blk_mq_sched_alloc_tags(q, hctx, i);
diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
index 5246ae040704..1e46be6c5178 100644
--- a/block/blk-mq-sched.h
+++ b/block/blk-mq-sched.h
@@ -5,7 +5,7 @@
 #include "blk-mq.h"
 #include "blk-mq-tag.h"
 
-#define MAX_SCHED_RQ (16 * BLKDEV_MAX_RQ)
+#define MAX_SCHED_RQ (16 * BLKDEV_DEFAULT_RQ)
 
 void blk_mq_sched_assign_ioc(struct request *rq);
 
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 6d596c2c2cd6..8bae60f6fa75 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -836,7 +836,7 @@ struct rbd_options {
 	u32 alloc_hint_flags;  /* CEPH_OSD_OP_ALLOC_HINT_FLAG_* */
 };
 
-#define RBD_QUEUE_DEPTH_DEFAULT	BLKDEV_MAX_RQ
+#define RBD_QUEUE_DEPTH_DEFAULT	BLKDEV_DEFAULT_RQ
 #define RBD_ALLOC_SIZE_DEFAULT	(64 * 1024)
 #define RBD_LOCK_TIMEOUT_DEFAULT 0  /* no timeout */
 #define RBD_READ_ONLY_DEFAULT	false
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index b5c033cf5f26..56870a43ae4c 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -45,7 +45,7 @@ struct blk_stat_callback;
 struct blk_keyslot_manager;
 
 #define BLKDEV_MIN_RQ	4
-#define BLKDEV_MAX_RQ	128	/* Default maximum */
+#define BLKDEV_DEFAULT_RQ	128
 
 /* Must be consistent with blk_mq_poll_stats_bkt() */
 #define BLK_MQ_POLL_STATS_BKTS 16
-- 
2.26.2



* [PATCH v2 03/11] blk-mq: Relocate shared sbitmap resize in blk_mq_update_nr_requests()
  2021-08-09 14:29 [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
  2021-08-09 14:29 ` [PATCH v2 01/11] blk-mq: Change rqs check in blk_mq_free_rqs() John Garry
  2021-08-09 14:29 ` [PATCH v2 02/11] block: Rename BLKDEV_MAX_RQ -> BLKDEV_DEFAULT_RQ John Garry
@ 2021-08-09 14:29 ` John Garry
  2021-08-18  3:49   ` Ming Lei
  2021-08-09 14:29 ` [PATCH v2 04/11] blk-mq: Invert check " John Garry
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: John Garry @ 2021-08-09 14:29 UTC (permalink / raw)
  To: axboe, jejb, martin.petersen
  Cc: linux-block, linux-kernel, linux-scsi, kashyap.desai, hare,
	ming.lei, John Garry

For shared sbitmap, if the call to blk_mq_tag_update_depth() was
successful for any hctx when hctx->sched_tags is not set, then it would be
successful for all (due to the way in which blk_mq_tag_update_depth()
fails).

As such, there is no need to call blk_mq_tag_resize_shared_sbitmap() for
each hctx. So relocate the call to after the hctx iteration, under the
!q->elevator check, which is equivalent (to !hctx->sched_tags).

Signed-off-by: John Garry <john.garry@huawei.com>
---
 block/blk-mq.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index ae28f470893c..0e4825ca9869 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3624,8 +3624,6 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 		if (!hctx->sched_tags) {
 			ret = blk_mq_tag_update_depth(hctx, &hctx->tags, nr,
 							false);
-			if (!ret && blk_mq_is_sbitmap_shared(set->flags))
-				blk_mq_tag_resize_shared_sbitmap(set, nr);
 		} else {
 			ret = blk_mq_tag_update_depth(hctx, &hctx->sched_tags,
 							nr, true);
@@ -3643,9 +3641,13 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 	}
 	if (!ret) {
 		q->nr_requests = nr;
-		if (q->elevator && blk_mq_is_sbitmap_shared(set->flags))
-			sbitmap_queue_resize(&q->sched_bitmap_tags,
-					     nr - set->reserved_tags);
+		if (blk_mq_is_sbitmap_shared(set->flags)) {
+			if (q->elevator)
+				sbitmap_queue_resize(&q->sched_bitmap_tags,
+						     nr - set->reserved_tags);
+			else
+				blk_mq_tag_resize_shared_sbitmap(set, nr);
+		}
 	}
 
 	blk_mq_unquiesce_queue(q);
-- 
2.26.2



* [PATCH v2 04/11] blk-mq: Invert check in blk_mq_update_nr_requests()
  2021-08-09 14:29 [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
                   ` (2 preceding siblings ...)
  2021-08-09 14:29 ` [PATCH v2 03/11] blk-mq: Relocate shared sbitmap resize in blk_mq_update_nr_requests() John Garry
@ 2021-08-09 14:29 ` John Garry
  2021-08-18  3:53   ` Ming Lei
  2021-08-09 14:29 ` [PATCH v2 05/11] blk-mq-sched: Rename blk_mq_sched_alloc_{tags -> map_and_request}() John Garry
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: John Garry @ 2021-08-09 14:29 UTC (permalink / raw)
  To: axboe, jejb, martin.petersen
  Cc: linux-block, linux-kernel, linux-scsi, kashyap.desai, hare,
	ming.lei, John Garry

It's easier to read:

if (x)
	X;
else
	Y;

over:

if (!x)
	Y;
else
	X;

No functional change intended.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 block/blk-mq.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0e4825ca9869..42c4b8d1a570 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3621,18 +3621,18 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 		 * If we're using an MQ scheduler, just update the scheduler
 		 * queue depth. This is similar to what the old code would do.
 		 */
-		if (!hctx->sched_tags) {
-			ret = blk_mq_tag_update_depth(hctx, &hctx->tags, nr,
-							false);
-		} else {
+		if (hctx->sched_tags) {
 			ret = blk_mq_tag_update_depth(hctx, &hctx->sched_tags,
-							nr, true);
+						      nr, true);
 			if (blk_mq_is_sbitmap_shared(set->flags)) {
 				hctx->sched_tags->bitmap_tags =
 					&q->sched_bitmap_tags;
 				hctx->sched_tags->breserved_tags =
 					&q->sched_breserved_tags;
 			}
+		} else {
+			ret = blk_mq_tag_update_depth(hctx, &hctx->tags, nr,
+						      false);
 		}
 		if (ret)
 			break;
-- 
2.26.2



* [PATCH v2 05/11] blk-mq-sched: Rename blk_mq_sched_alloc_{tags -> map_and_request}()
  2021-08-09 14:29 [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
                   ` (3 preceding siblings ...)
  2021-08-09 14:29 ` [PATCH v2 04/11] blk-mq: Invert check " John Garry
@ 2021-08-09 14:29 ` John Garry
  2021-08-18  3:55   ` Ming Lei
  2021-08-09 14:29 ` [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping() John Garry
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: John Garry @ 2021-08-09 14:29 UTC (permalink / raw)
  To: axboe, jejb, martin.petersen
  Cc: linux-block, linux-kernel, linux-scsi, kashyap.desai, hare,
	ming.lei, John Garry

Function blk_mq_sched_alloc_tags() does the same as
__blk_mq_alloc_map_and_request(), so give it a similar name to be consistent.

Similarly rename label err_free_tags -> err_free_map_and_request.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 block/blk-mq-sched.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 2231fb0d4c35..b4d7ad9a7a60 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -515,9 +515,9 @@ void blk_mq_sched_insert_requests(struct blk_mq_hw_ctx *hctx,
 	percpu_ref_put(&q->q_usage_counter);
 }
 
-static int blk_mq_sched_alloc_tags(struct request_queue *q,
-				   struct blk_mq_hw_ctx *hctx,
-				   unsigned int hctx_idx)
+static int blk_mq_sched_alloc_map_and_request(struct request_queue *q,
+					      struct blk_mq_hw_ctx *hctx,
+					      unsigned int hctx_idx)
 {
 	struct blk_mq_tag_set *set = q->tag_set;
 	int ret;
@@ -609,15 +609,15 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 				   BLKDEV_DEFAULT_RQ);
 
 	queue_for_each_hw_ctx(q, hctx, i) {
-		ret = blk_mq_sched_alloc_tags(q, hctx, i);
+		ret = blk_mq_sched_alloc_map_and_request(q, hctx, i);
 		if (ret)
-			goto err_free_tags;
+			goto err_free_map_and_request;
 	}
 
 	if (blk_mq_is_sbitmap_shared(q->tag_set->flags)) {
 		ret = blk_mq_init_sched_shared_sbitmap(q);
 		if (ret)
-			goto err_free_tags;
+			goto err_free_map_and_request;
 	}
 
 	ret = e->ops.init_sched(q, e);
@@ -645,7 +645,7 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 err_free_sbitmap:
 	if (blk_mq_is_sbitmap_shared(q->tag_set->flags))
 		blk_mq_exit_sched_shared_sbitmap(q);
-err_free_tags:
+err_free_map_and_request:
 	blk_mq_sched_free_requests(q);
 	blk_mq_sched_tags_teardown(q);
 	q->elevator = NULL;
-- 
2.26.2



* [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping()
  2021-08-09 14:29 [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
                   ` (4 preceding siblings ...)
  2021-08-09 14:29 ` [PATCH v2 05/11] blk-mq-sched: Rename blk_mq_sched_alloc_{tags -> map_and_request}() John Garry
@ 2021-08-09 14:29 ` John Garry
  2021-08-09 17:00   ` kernel test robot
                     ` (2 more replies)
  2021-08-09 14:29 ` [PATCH v2 07/11] blk-mq: Add blk_mq_tag_update_sched_shared_sbitmap() John Garry
                   ` (5 subsequent siblings)
  11 siblings, 3 replies; 31+ messages in thread
From: John Garry @ 2021-08-09 14:29 UTC (permalink / raw)
  To: axboe, jejb, martin.petersen
  Cc: linux-block, linux-kernel, linux-scsi, kashyap.desai, hare,
	ming.lei, John Garry

Function blk_mq_clear_rq_mapping() will be used for shared sbitmap tags
in future, so pass a driver tags pointer instead of the tagset container
and HW queue index.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 block/blk-mq.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 42c4b8d1a570..0bb596f4d061 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2310,10 +2310,9 @@ static size_t order_to_size(unsigned int order)
 }
 
 /* called before freeing request pool in @tags */
-static void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set,
-		struct blk_mq_tags *tags, unsigned int hctx_idx)
+void blk_mq_clear_rq_mapping(struct blk_mq_tags *drv_tags,
+			     struct blk_mq_tags *tags)
 {
-	struct blk_mq_tags *drv_tags = set->tags[hctx_idx];
 	struct page *page;
 	unsigned long flags;
 
@@ -2322,7 +2321,7 @@ static void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set,
 		unsigned long end = start + order_to_size(page->private);
 		int i;
 
-		for (i = 0; i < set->queue_depth; i++) {
+		for (i = 0; i < drv_tags->nr_tags; i++) {
 			struct request *rq = drv_tags->rqs[i];
 			unsigned long rq_addr = (unsigned long)rq;
 
@@ -2346,8 +2345,11 @@ static void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set,
 void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 		     unsigned int hctx_idx)
 {
+	struct blk_mq_tags *drv_tags;
 	struct page *page;
 
+	drv_tags = set->tags[hctx_idx];
+
 	if (tags->static_rqs && set->ops->exit_request) {
 		int i;
 
@@ -2361,7 +2363,7 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 		}
 	}
 
-	blk_mq_clear_rq_mapping(set, tags, hctx_idx);
+	blk_mq_clear_rq_mapping(drv_tags, tags);
 
 	while (!list_empty(&tags->page_list)) {
 		page = list_first_entry(&tags->page_list, struct page, lru);
-- 
2.26.2



* [PATCH v2 07/11] blk-mq: Add blk_mq_tag_update_sched_shared_sbitmap()
  2021-08-09 14:29 [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
                   ` (5 preceding siblings ...)
  2021-08-09 14:29 ` [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping() John Garry
@ 2021-08-09 14:29 ` John Garry
  2021-08-18  7:28   ` Ming Lei
  2021-08-09 14:29 ` [PATCH v2 08/11] blk-mq: Add blk_mq_ops.init_request_no_hctx() John Garry
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: John Garry @ 2021-08-09 14:29 UTC (permalink / raw)
  To: axboe, jejb, martin.petersen
  Cc: linux-block, linux-kernel, linux-scsi, kashyap.desai, hare,
	ming.lei, John Garry

Put the functionality to update the sched shared sbitmap size in a common
function.

Since the same formula is always used to resize, and everything it needs
can be derived from the request queue argument, just pass the request
queue pointer.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 block/blk-mq-sched.c | 3 +--
 block/blk-mq-tag.c   | 6 ++++++
 block/blk-mq-tag.h   | 1 +
 block/blk-mq.c       | 3 +--
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index b4d7ad9a7a60..ac0408ebcd47 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -575,8 +575,7 @@ static int blk_mq_init_sched_shared_sbitmap(struct request_queue *queue)
 					&queue->sched_breserved_tags;
 	}
 
-	sbitmap_queue_resize(&queue->sched_bitmap_tags,
-			     queue->nr_requests - set->reserved_tags);
+	blk_mq_tag_update_sched_shared_sbitmap(queue);
 
 	return 0;
 }
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 86f87346232a..5f06ad6efc8f 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -634,6 +634,12 @@ void blk_mq_tag_resize_shared_sbitmap(struct blk_mq_tag_set *set, unsigned int s
 	sbitmap_queue_resize(&set->__bitmap_tags, size - set->reserved_tags);
 }
 
+void blk_mq_tag_update_sched_shared_sbitmap(struct request_queue *q)
+{
+	sbitmap_queue_resize(&q->sched_bitmap_tags,
+			     q->nr_requests - q->tag_set->reserved_tags);
+}
+
 /**
  * blk_mq_unique_tag() - return a tag that is unique queue-wide
  * @rq: request for which to compute a unique tag
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index 8ed55af08427..88f3c6485543 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -48,6 +48,7 @@ extern int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
 					unsigned int depth, bool can_grow);
 extern void blk_mq_tag_resize_shared_sbitmap(struct blk_mq_tag_set *set,
 					     unsigned int size);
+extern void blk_mq_tag_update_sched_shared_sbitmap(struct request_queue *q);
 
 extern void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool);
 void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0bb596f4d061..f14cc2705f9b 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3645,8 +3645,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 		q->nr_requests = nr;
 		if (blk_mq_is_sbitmap_shared(set->flags)) {
 			if (q->elevator)
-				sbitmap_queue_resize(&q->sched_bitmap_tags,
-						     nr - set->reserved_tags);
+				blk_mq_tag_update_sched_shared_sbitmap(q);
 			else
 				blk_mq_tag_resize_shared_sbitmap(set, nr);
 		}
-- 
2.26.2



* [PATCH v2 08/11] blk-mq: Add blk_mq_ops.init_request_no_hctx()
  2021-08-09 14:29 [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
                   ` (6 preceding siblings ...)
  2021-08-09 14:29 ` [PATCH v2 07/11] blk-mq: Add blk_mq_tag_update_sched_shared_sbitmap() John Garry
@ 2021-08-09 14:29 ` John Garry
  2021-08-18  7:38   ` Ming Lei
  2021-08-09 14:29 ` [PATCH v2 09/11] scsi: Set blk_mq_ops.init_request_no_hctx John Garry
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 31+ messages in thread
From: John Garry @ 2021-08-09 14:29 UTC (permalink / raw)
  To: axboe, jejb, martin.petersen
  Cc: linux-block, linux-kernel, linux-scsi, kashyap.desai, hare,
	ming.lei, John Garry

Add a variant of the init_request callback which does not pass a hctx_idx
argument.

This is important for shared sbitmap support: to introduce shared static
rqs, it must be ensured that the LLDD cannot assume that requests are
associated with a specific HW queue.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 block/blk-mq.c         | 15 ++++++++++-----
 include/linux/blk-mq.h |  7 +++++++
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index f14cc2705f9b..4d6723cfa582 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2427,13 +2427,15 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
 static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
 			       unsigned int hctx_idx, int node)
 {
-	int ret;
+	int ret = 0;
 
-	if (set->ops->init_request) {
+	if (set->ops->init_request)
 		ret = set->ops->init_request(set, rq, hctx_idx, node);
-		if (ret)
-			return ret;
-	}
+	else if (set->ops->init_request_no_hctx)
+		ret = set->ops->init_request_no_hctx(set, rq, node);
+
+	if (ret)
+		return ret;
 
 	WRITE_ONCE(rq->state, MQ_RQ_IDLE);
 	return 0;
@@ -3487,6 +3489,9 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set)
 	if (!set->ops->get_budget ^ !set->ops->put_budget)
 		return -EINVAL;
 
+	if (set->ops->init_request && set->ops->init_request_no_hctx)
+		return -EINVAL;
+
 	if (set->queue_depth > BLK_MQ_MAX_DEPTH) {
 		pr_info("blk-mq: reduced tag depth to %u\n",
 			BLK_MQ_MAX_DEPTH);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 22215db36122..c838b24944c2 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -357,6 +357,13 @@ struct blk_mq_ops {
 	 */
 	int (*init_request)(struct blk_mq_tag_set *set, struct request *,
 			    unsigned int, unsigned int);
+
+	/**
+	 * @init_request_no_hctx: Same as @init_request, but no hw queue index is passed.
+	 */
+	int (*init_request_no_hctx)(struct blk_mq_tag_set *set, struct request *,
+				    unsigned int);
+
 	/**
 	 * @exit_request: Ditto for exit/teardown.
 	 */
-- 
2.26.2



* [PATCH v2 09/11] scsi: Set blk_mq_ops.init_request_no_hctx
  2021-08-09 14:29 [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
                   ` (7 preceding siblings ...)
  2021-08-09 14:29 ` [PATCH v2 08/11] blk-mq: Add blk_mq_ops.init_request_no_hctx() John Garry
@ 2021-08-09 14:29 ` John Garry
  2021-08-09 14:29 ` [PATCH v2 10/11] blk-mq: Use shared tags for shared sbitmap support John Garry
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 31+ messages in thread
From: John Garry @ 2021-08-09 14:29 UTC (permalink / raw)
  To: axboe, jejb, martin.petersen
  Cc: linux-block, linux-kernel, linux-scsi, kashyap.desai, hare,
	ming.lei, John Garry

The hctx_idx argument is not used in scsi_mq_init_request(), so hook it
up as the blk_mq_ops.init_request_no_hctx callback instead.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/scsi/scsi_lib.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 7456a26aef51..6ea4d0847970 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1738,7 +1738,7 @@ static enum blk_eh_timer_return scsi_timeout(struct request *req,
 }
 
 static int scsi_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
-				unsigned int hctx_idx, unsigned int numa_node)
+				unsigned int numa_node)
 {
 	struct Scsi_Host *shost = set->driver_data;
 	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);
@@ -1856,7 +1856,7 @@ static const struct blk_mq_ops scsi_mq_ops_no_commit = {
 #ifdef CONFIG_BLK_DEBUG_FS
 	.show_rq	= scsi_show_rq,
 #endif
-	.init_request	= scsi_mq_init_request,
+	.init_request_no_hctx = scsi_mq_init_request,
 	.exit_request	= scsi_mq_exit_request,
 	.initialize_rq_fn = scsi_initialize_rq,
 	.cleanup_rq	= scsi_cleanup_rq,
@@ -1886,7 +1886,7 @@ static const struct blk_mq_ops scsi_mq_ops = {
 #ifdef CONFIG_BLK_DEBUG_FS
 	.show_rq	= scsi_show_rq,
 #endif
-	.init_request	= scsi_mq_init_request,
+	.init_request_no_hctx = scsi_mq_init_request,
 	.exit_request	= scsi_mq_exit_request,
 	.initialize_rq_fn = scsi_initialize_rq,
 	.cleanup_rq	= scsi_cleanup_rq,
-- 
2.26.2



* [PATCH v2 10/11] blk-mq: Use shared tags for shared sbitmap support
  2021-08-09 14:29 [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
                   ` (8 preceding siblings ...)
  2021-08-09 14:29 ` [PATCH v2 09/11] scsi: Set blk_mq_ops.init_request_no_hctx John Garry
@ 2021-08-09 14:29 ` John Garry
  2021-08-18  8:16   ` Ming Lei
  2021-08-09 14:29 ` [PATCH v2 11/11] blk-mq: Stop using pointers for blk_mq_tags bitmap tags John Garry
  2021-08-17  7:02 ` [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
  11 siblings, 1 reply; 31+ messages in thread
From: John Garry @ 2021-08-09 14:29 UTC (permalink / raw)
  To: axboe, jejb, martin.petersen
  Cc: linux-block, linux-kernel, linux-scsi, kashyap.desai, hare,
	ming.lei, John Garry

Currently we use separate sbitmap pairs and active_queues atomic_t for
shared sbitmap support.

However a full set of static requests is allocated per HW queue, which is
quite wasteful, considering that the total number of requests usable at
any given time across all HW queues is limited by the shared sbitmap depth.

As such, it is considerably more memory efficient in the case of shared
sbitmap to allocate a set of static rqs per tag set or request queue, and
not per HW queue.

So replace the sbitmap pairs and active_queues atomic_t with a shared
tags per tagset and request queue.

Continue to use the term "shared sbitmap" for now, as the meaning is known.

Signed-off-by: John Garry <john.garry@huawei.com>
---
 block/blk-mq-sched.c   | 77 ++++++++++++++++++++-----------------
 block/blk-mq-tag.c     | 65 ++++++++++++-------------------
 block/blk-mq-tag.h     |  4 +-
 block/blk-mq.c         | 86 +++++++++++++++++++++++++-----------------
 block/blk-mq.h         |  8 ++--
 include/linux/blk-mq.h | 13 +++----
 include/linux/blkdev.h |  3 +-
 7 files changed, 131 insertions(+), 125 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index ac0408ebcd47..1101a2e4da9a 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -522,14 +522,19 @@ static int blk_mq_sched_alloc_map_and_request(struct request_queue *q,
 	struct blk_mq_tag_set *set = q->tag_set;
 	int ret;
 
+	if (blk_mq_is_sbitmap_shared(q->tag_set->flags)) {
+		hctx->sched_tags = q->shared_sbitmap_tags;
+		return 0;
+	}
+
 	hctx->sched_tags = blk_mq_alloc_rq_map(set, hctx_idx, q->nr_requests,
-					       set->reserved_tags, set->flags);
+					       set->reserved_tags);
 	if (!hctx->sched_tags)
 		return -ENOMEM;
 
 	ret = blk_mq_alloc_rqs(set, hctx->sched_tags, hctx_idx, q->nr_requests);
 	if (ret) {
-		blk_mq_free_rq_map(hctx->sched_tags, set->flags);
+		blk_mq_free_rq_map(hctx->sched_tags);
 		hctx->sched_tags = NULL;
 	}
 
@@ -544,35 +549,39 @@ static void blk_mq_sched_tags_teardown(struct request_queue *q)
 
 	queue_for_each_hw_ctx(q, hctx, i) {
 		if (hctx->sched_tags) {
-			blk_mq_free_rq_map(hctx->sched_tags, hctx->flags);
+			if (!blk_mq_is_sbitmap_shared(q->tag_set->flags))
+				blk_mq_free_rq_map(hctx->sched_tags);
 			hctx->sched_tags = NULL;
 		}
 	}
 }
 
+static void blk_mq_exit_sched_shared_sbitmap(struct request_queue *queue)
+{
+	blk_mq_free_rq_map(queue->shared_sbitmap_tags);
+	queue->shared_sbitmap_tags = NULL;
+}
+
 static int blk_mq_init_sched_shared_sbitmap(struct request_queue *queue)
 {
 	struct blk_mq_tag_set *set = queue->tag_set;
-	int alloc_policy = BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags);
-	struct blk_mq_hw_ctx *hctx;
-	int ret, i;
+	struct blk_mq_tags *tags;
+	int ret;
 
 	/*
 	 * Set initial depth at max so that we don't need to reallocate for
 	 * updating nr_requests.
 	 */
-	ret = blk_mq_init_bitmaps(&queue->sched_bitmap_tags,
-				  &queue->sched_breserved_tags,
-				  MAX_SCHED_RQ, set->reserved_tags,
-				  set->numa_node, alloc_policy);
-	if (ret)
-		return ret;
+	tags = queue->shared_sbitmap_tags = blk_mq_alloc_rq_map(set, 0,
+					  set->queue_depth,
+					  set->reserved_tags);
+	if (!queue->shared_sbitmap_tags)
+		return -ENOMEM;
 
-	queue_for_each_hw_ctx(queue, hctx, i) {
-		hctx->sched_tags->bitmap_tags =
-					&queue->sched_bitmap_tags;
-		hctx->sched_tags->breserved_tags =
-					&queue->sched_breserved_tags;
+	ret = blk_mq_alloc_rqs(set, tags, 0, set->queue_depth);
+	if (ret) {
+		blk_mq_exit_sched_shared_sbitmap(queue);
+		return ret;
 	}
 
 	blk_mq_tag_update_sched_shared_sbitmap(queue);
@@ -580,12 +589,6 @@ static int blk_mq_init_sched_shared_sbitmap(struct request_queue *queue)
 	return 0;
 }
 
-static void blk_mq_exit_sched_shared_sbitmap(struct request_queue *queue)
-{
-	sbitmap_queue_free(&queue->sched_bitmap_tags);
-	sbitmap_queue_free(&queue->sched_breserved_tags);
-}
-
 int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 {
 	struct blk_mq_hw_ctx *hctx;
@@ -607,21 +610,21 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 	q->nr_requests = 2 * min_t(unsigned int, q->tag_set->queue_depth,
 				   BLKDEV_DEFAULT_RQ);
 
-	queue_for_each_hw_ctx(q, hctx, i) {
-		ret = blk_mq_sched_alloc_map_and_request(q, hctx, i);
+	if (blk_mq_is_sbitmap_shared(q->tag_set->flags)) {
+		ret = blk_mq_init_sched_shared_sbitmap(q);
 		if (ret)
-			goto err_free_map_and_request;
+			return ret;
 	}
 
-	if (blk_mq_is_sbitmap_shared(q->tag_set->flags)) {
-		ret = blk_mq_init_sched_shared_sbitmap(q);
+	queue_for_each_hw_ctx(q, hctx, i) {
+		ret = blk_mq_sched_alloc_map_and_request(q, hctx, i);
 		if (ret)
 			goto err_free_map_and_request;
 	}
 
 	ret = e->ops.init_sched(q, e);
 	if (ret)
-		goto err_free_sbitmap;
+		goto err_free_map_and_request;
 
 	blk_mq_debugfs_register_sched(q);
 
@@ -641,12 +644,12 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 
 	return 0;
 
-err_free_sbitmap:
-	if (blk_mq_is_sbitmap_shared(q->tag_set->flags))
-		blk_mq_exit_sched_shared_sbitmap(q);
 err_free_map_and_request:
 	blk_mq_sched_free_requests(q);
 	blk_mq_sched_tags_teardown(q);
+	if (blk_mq_is_sbitmap_shared(q->tag_set->flags))
+		blk_mq_exit_sched_shared_sbitmap(q);
+
 	q->elevator = NULL;
 	return ret;
 }
@@ -660,9 +663,13 @@ void blk_mq_sched_free_requests(struct request_queue *q)
 	struct blk_mq_hw_ctx *hctx;
 	int i;
 
-	queue_for_each_hw_ctx(q, hctx, i) {
-		if (hctx->sched_tags)
-			blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i);
+	if (blk_mq_is_sbitmap_shared(q->tag_set->flags)) {
+		blk_mq_free_rqs(q->tag_set, q->shared_sbitmap_tags, 0);
+	} else {
+		queue_for_each_hw_ctx(q, hctx, i) {
+			if (hctx->sched_tags)
+				blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i);
+		}
 	}
 }
 
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 5f06ad6efc8f..e97bbf477b96 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -27,10 +27,11 @@ bool __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 	if (blk_mq_is_sbitmap_shared(hctx->flags)) {
 		struct request_queue *q = hctx->queue;
 		struct blk_mq_tag_set *set = q->tag_set;
+		struct blk_mq_tags *tags = set->shared_sbitmap_tags;
 
 		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags) &&
 		    !test_and_set_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
-			atomic_inc(&set->active_queues_shared_sbitmap);
+			atomic_inc(&tags->active_queues);
 	} else {
 		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state) &&
 		    !test_and_set_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
@@ -61,10 +62,12 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 	struct blk_mq_tag_set *set = q->tag_set;
 
 	if (blk_mq_is_sbitmap_shared(hctx->flags)) {
+		struct blk_mq_tags *tags = set->shared_sbitmap_tags;
+
 		if (!test_and_clear_bit(QUEUE_FLAG_HCTX_ACTIVE,
 					&q->queue_flags))
 			return;
-		atomic_dec(&set->active_queues_shared_sbitmap);
+		atomic_dec(&tags->active_queues);
 	} else {
 		if (!test_and_clear_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
 			return;
@@ -510,38 +513,16 @@ static int blk_mq_init_bitmap_tags(struct blk_mq_tags *tags,
 	return 0;
 }
 
-int blk_mq_init_shared_sbitmap(struct blk_mq_tag_set *set)
-{
-	int alloc_policy = BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags);
-	int i, ret;
-
-	ret = blk_mq_init_bitmaps(&set->__bitmap_tags, &set->__breserved_tags,
-				  set->queue_depth, set->reserved_tags,
-				  set->numa_node, alloc_policy);
-	if (ret)
-		return ret;
-
-	for (i = 0; i < set->nr_hw_queues; i++) {
-		struct blk_mq_tags *tags = set->tags[i];
-
-		tags->bitmap_tags = &set->__bitmap_tags;
-		tags->breserved_tags = &set->__breserved_tags;
-	}
-
-	return 0;
-}
-
 void blk_mq_exit_shared_sbitmap(struct blk_mq_tag_set *set)
 {
-	sbitmap_queue_free(&set->__bitmap_tags);
-	sbitmap_queue_free(&set->__breserved_tags);
+	blk_mq_free_rq_map(set->shared_sbitmap_tags);
+	set->shared_sbitmap_tags = NULL;
 }
 
 struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 				     unsigned int reserved_tags,
-				     int node, unsigned int flags)
+				     int node, int alloc_policy)
 {
-	int alloc_policy = BLK_MQ_FLAG_TO_ALLOC_POLICY(flags);
 	struct blk_mq_tags *tags;
 
 	if (total_tags > BLK_MQ_TAG_MAX) {
@@ -557,9 +538,6 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 	tags->nr_reserved_tags = reserved_tags;
 	spin_lock_init(&tags->lock);
 
-	if (blk_mq_is_sbitmap_shared(flags))
-		return tags;
-
 	if (blk_mq_init_bitmap_tags(tags, node, alloc_policy) < 0) {
 		kfree(tags);
 		return NULL;
@@ -567,12 +545,10 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 	return tags;
 }
 
-void blk_mq_free_tags(struct blk_mq_tags *tags, unsigned int flags)
+void blk_mq_free_tags(struct blk_mq_tags *tags)
 {
-	if (!blk_mq_is_sbitmap_shared(flags)) {
-		sbitmap_queue_free(tags->bitmap_tags);
-		sbitmap_queue_free(tags->breserved_tags);
-	}
+	sbitmap_queue_free(tags->bitmap_tags);
+	sbitmap_queue_free(tags->breserved_tags);
 	kfree(tags);
 }
 
@@ -604,18 +580,25 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
 		if (tdepth > MAX_SCHED_RQ)
 			return -EINVAL;
 
+		if (blk_mq_is_sbitmap_shared(set->flags)) {
+			/* No point in allowing this to happen */
+			if (tdepth > set->queue_depth)
+				return -EINVAL;
+			return 0;
+		}
+
 		new = blk_mq_alloc_rq_map(set, hctx->queue_num, tdepth,
-				tags->nr_reserved_tags, set->flags);
+				tags->nr_reserved_tags);
 		if (!new)
 			return -ENOMEM;
 		ret = blk_mq_alloc_rqs(set, new, hctx->queue_num, tdepth);
 		if (ret) {
-			blk_mq_free_rq_map(new, set->flags);
+			blk_mq_free_rq_map(new);
 			return -ENOMEM;
 		}
 
 		blk_mq_free_rqs(set, *tagsptr, hctx->queue_num);
-		blk_mq_free_rq_map(*tagsptr, set->flags);
+		blk_mq_free_rq_map(*tagsptr);
 		*tagsptr = new;
 	} else {
 		/*
@@ -631,12 +614,14 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
 
 void blk_mq_tag_resize_shared_sbitmap(struct blk_mq_tag_set *set, unsigned int size)
 {
-	sbitmap_queue_resize(&set->__bitmap_tags, size - set->reserved_tags);
+	struct blk_mq_tags *tags = set->shared_sbitmap_tags;
+
+	sbitmap_queue_resize(&tags->__bitmap_tags, size - set->reserved_tags);
 }
 
 void blk_mq_tag_update_sched_shared_sbitmap(struct request_queue *q)
 {
-	sbitmap_queue_resize(&q->sched_bitmap_tags,
+	sbitmap_queue_resize(q->shared_sbitmap_tags->bitmap_tags,
 			     q->nr_requests - q->tag_set->reserved_tags);
 }
 
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index 88f3c6485543..c9fc52ee07c4 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -30,8 +30,8 @@ struct blk_mq_tags {
 
 extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
 					unsigned int reserved_tags,
-					int node, unsigned int flags);
-extern void blk_mq_free_tags(struct blk_mq_tags *tags, unsigned int flags);
+					int node, int alloc_policy);
+extern void blk_mq_free_tags(struct blk_mq_tags *tags);
 extern int blk_mq_init_bitmaps(struct sbitmap_queue *bitmap_tags,
 			       struct sbitmap_queue *breserved_tags,
 			       unsigned int queue_depth,
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4d6723cfa582..d3dd5fab3426 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2348,6 +2348,9 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 	struct blk_mq_tags *drv_tags;
 	struct page *page;
 
+	if (blk_mq_is_sbitmap_shared(set->flags))
+		drv_tags = set->shared_sbitmap_tags;
+	else
 		drv_tags = set->tags[hctx_idx];
 
 	if (tags->static_rqs && set->ops->exit_request) {
@@ -2377,21 +2380,20 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 	}
 }
 
-void blk_mq_free_rq_map(struct blk_mq_tags *tags, unsigned int flags)
+void blk_mq_free_rq_map(struct blk_mq_tags *tags)
 {
 	kfree(tags->rqs);
 	tags->rqs = NULL;
 	kfree(tags->static_rqs);
 	tags->static_rqs = NULL;
 
-	blk_mq_free_tags(tags, flags);
+	blk_mq_free_tags(tags);
 }
 
 struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
 					unsigned int hctx_idx,
 					unsigned int nr_tags,
-					unsigned int reserved_tags,
-					unsigned int flags)
+					unsigned int reserved_tags)
 {
 	struct blk_mq_tags *tags;
 	int node;
@@ -2400,7 +2402,8 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
 	if (node == NUMA_NO_NODE)
 		node = set->numa_node;
 
-	tags = blk_mq_init_tags(nr_tags, reserved_tags, node, flags);
+	tags = blk_mq_init_tags(nr_tags, reserved_tags, node,
+				BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags));
 	if (!tags)
 		return NULL;
 
@@ -2408,7 +2411,7 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
 				 GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
 				 node);
 	if (!tags->rqs) {
-		blk_mq_free_tags(tags, flags);
+		blk_mq_free_tags(tags);
 		return NULL;
 	}
 
@@ -2417,7 +2420,7 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
 					node);
 	if (!tags->static_rqs) {
 		kfree(tags->rqs);
-		blk_mq_free_tags(tags, flags);
+		blk_mq_free_tags(tags);
 		return NULL;
 	}
 
@@ -2859,8 +2862,14 @@ static bool __blk_mq_alloc_map_and_request(struct blk_mq_tag_set *set,
 	unsigned int flags = set->flags;
 	int ret = 0;
 
+	if (blk_mq_is_sbitmap_shared(flags)) {
+		set->tags[hctx_idx] = set->shared_sbitmap_tags;
+
+		return true;
+	}
+
 	set->tags[hctx_idx] = blk_mq_alloc_rq_map(set, hctx_idx,
-					set->queue_depth, set->reserved_tags, flags);
+					set->queue_depth, set->reserved_tags);
 	if (!set->tags[hctx_idx])
 		return false;
 
@@ -2869,7 +2878,7 @@ static bool __blk_mq_alloc_map_and_request(struct blk_mq_tag_set *set,
 	if (!ret)
 		return true;
 
-	blk_mq_free_rq_map(set->tags[hctx_idx], flags);
+	blk_mq_free_rq_map(set->tags[hctx_idx]);
 	set->tags[hctx_idx] = NULL;
 	return false;
 }
@@ -2877,11 +2886,11 @@ static bool __blk_mq_alloc_map_and_request(struct blk_mq_tag_set *set,
 static void blk_mq_free_map_and_requests(struct blk_mq_tag_set *set,
 					 unsigned int hctx_idx)
 {
-	unsigned int flags = set->flags;
-
 	if (set->tags && set->tags[hctx_idx]) {
-		blk_mq_free_rqs(set, set->tags[hctx_idx], hctx_idx);
-		blk_mq_free_rq_map(set->tags[hctx_idx], flags);
+		if (!blk_mq_is_sbitmap_shared(set->flags)) {
+			blk_mq_free_rqs(set, set->tags[hctx_idx], hctx_idx);
+			blk_mq_free_rq_map(set->tags[hctx_idx]);
+		}
 		set->tags[hctx_idx] = NULL;
 	}
 }
@@ -3348,6 +3357,21 @@ static int __blk_mq_alloc_rq_maps(struct blk_mq_tag_set *set)
 {
 	int i;
 
+	if (blk_mq_is_sbitmap_shared(set->flags)) {
+		int ret;
+
+		set->shared_sbitmap_tags = blk_mq_alloc_rq_map(set, 0,
+						  set->queue_depth,
+						  set->reserved_tags);
+		if (!set->shared_sbitmap_tags)
+			return -ENOMEM;
+
+		ret = blk_mq_alloc_rqs(set, set->shared_sbitmap_tags, 0,
+				       set->queue_depth);
+		if (ret)
+			goto out_free_sbitmap_tags;
+	}
+
 	for (i = 0; i < set->nr_hw_queues; i++) {
 		if (!__blk_mq_alloc_map_and_request(set, i))
 			goto out_unwind;
@@ -3359,6 +3383,11 @@ static int __blk_mq_alloc_rq_maps(struct blk_mq_tag_set *set)
 out_unwind:
 	while (--i >= 0)
 		blk_mq_free_map_and_requests(set, i);
+	if (blk_mq_is_sbitmap_shared(set->flags))
+		blk_mq_free_rqs(set, set->shared_sbitmap_tags, 0);
+out_free_sbitmap_tags:
+	if (blk_mq_is_sbitmap_shared(set->flags))
+		blk_mq_exit_shared_sbitmap(set);
 
 	return -ENOMEM;
 }
@@ -3492,6 +3521,9 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set)
 	if (set->ops->init_request && set->ops->init_request_no_hctx)
 		return -EINVAL;
 
+	if (set->ops->init_request && blk_mq_is_sbitmap_shared(set->flags))
+		return -EINVAL;
+
 	if (set->queue_depth > BLK_MQ_MAX_DEPTH) {
 		pr_info("blk-mq: reduced tag depth to %u\n",
 			BLK_MQ_MAX_DEPTH);
@@ -3541,23 +3573,11 @@ int blk_mq_alloc_tag_set(struct blk_mq_tag_set *set)
 	if (ret)
 		goto out_free_mq_map;
 
-	if (blk_mq_is_sbitmap_shared(set->flags)) {
-		atomic_set(&set->active_queues_shared_sbitmap, 0);
-
-		if (blk_mq_init_shared_sbitmap(set)) {
-			ret = -ENOMEM;
-			goto out_free_mq_rq_maps;
-		}
-	}
-
 	mutex_init(&set->tag_list_lock);
 	INIT_LIST_HEAD(&set->tag_list);
 
 	return 0;
 
-out_free_mq_rq_maps:
-	for (i = 0; i < set->nr_hw_queues; i++)
-		blk_mq_free_map_and_requests(set, i);
 out_free_mq_map:
 	for (i = 0; i < set->nr_maps; i++) {
 		kfree(set->map[i].mq_map);
@@ -3589,11 +3609,15 @@ void blk_mq_free_tag_set(struct blk_mq_tag_set *set)
 {
 	int i, j;
 
-	for (i = 0; i < set->nr_hw_queues; i++)
-		blk_mq_free_map_and_requests(set, i);
+	if (blk_mq_is_sbitmap_shared(set->flags)) {
+		struct blk_mq_tags *tags = set->shared_sbitmap_tags;
 
-	if (blk_mq_is_sbitmap_shared(set->flags))
+		blk_mq_free_rqs(set, tags, 0);
 		blk_mq_exit_shared_sbitmap(set);
+	}
+
+	for (i = 0; i < set->nr_hw_queues; i++)
+		blk_mq_free_map_and_requests(set, i);
 
 	for (j = 0; j < set->nr_maps; j++) {
 		kfree(set->map[j].mq_map);
@@ -3631,12 +3655,6 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 		if (hctx->sched_tags) {
 			ret = blk_mq_tag_update_depth(hctx, &hctx->sched_tags,
 						      nr, true);
-			if (blk_mq_is_sbitmap_shared(set->flags)) {
-				hctx->sched_tags->bitmap_tags =
-					&q->sched_bitmap_tags;
-				hctx->sched_tags->breserved_tags =
-					&q->sched_breserved_tags;
-			}
 		} else {
 			ret = blk_mq_tag_update_depth(hctx, &hctx->tags, nr,
 						      false);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index d08779f77a26..f595521a4b3d 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -54,12 +54,11 @@ void blk_mq_put_rq_ref(struct request *rq);
  */
 void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 		     unsigned int hctx_idx);
-void blk_mq_free_rq_map(struct blk_mq_tags *tags, unsigned int flags);
+void blk_mq_free_rq_map(struct blk_mq_tags *tags);
 struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
 					unsigned int hctx_idx,
 					unsigned int nr_tags,
-					unsigned int reserved_tags,
-					unsigned int flags);
+					unsigned int reserved_tags);
 int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 		     unsigned int hctx_idx, unsigned int depth);
 
@@ -334,10 +333,11 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 	if (blk_mq_is_sbitmap_shared(hctx->flags)) {
 		struct request_queue *q = hctx->queue;
 		struct blk_mq_tag_set *set = q->tag_set;
+		struct blk_mq_tags *tags = set->shared_sbitmap_tags;
 
 		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
 			return true;
-		users = atomic_read(&set->active_queues_shared_sbitmap);
+		users = atomic_read(&tags->active_queues);
 	} else {
 		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
 			return true;
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index c838b24944c2..e67068bf648c 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -232,13 +232,11 @@ enum hctx_type {
  * @flags:	   Zero or more BLK_MQ_F_* flags.
  * @driver_data:   Pointer to data owned by the block driver that created this
  *		   tag set.
- * @active_queues_shared_sbitmap:
- * 		   number of active request queues per tag set.
- * @__bitmap_tags: A shared tags sbitmap, used over all hctx's
- * @__breserved_tags:
- *		   A shared reserved tags sbitmap, used over all hctx's
  * @tags:	   Tag sets. One tag set per hardware queue. Has @nr_hw_queues
  *		   elements.
+ * @shared_sbitmap_tags:
+ *		   Shared sbitmap set of tags. Has @nr_hw_queues elements. If
+ *		   set, shared by all @tags.
  * @tag_list_lock: Serializes tag_list accesses.
  * @tag_list:	   List of the request queues that use this tag set. See also
  *		   request_queue.tag_set_list.
@@ -255,12 +253,11 @@ struct blk_mq_tag_set {
 	unsigned int		timeout;
 	unsigned int		flags;
 	void			*driver_data;
-	atomic_t		active_queues_shared_sbitmap;
 
-	struct sbitmap_queue	__bitmap_tags;
-	struct sbitmap_queue	__breserved_tags;
 	struct blk_mq_tags	**tags;
 
+	struct blk_mq_tags	*shared_sbitmap_tags;
+
 	struct mutex		tag_list_lock;
 	struct list_head	tag_list;
 };
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 56870a43ae4c..f5a039c251e3 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -467,8 +467,7 @@ struct request_queue {
 
 	atomic_t		nr_active_requests_shared_sbitmap;
 
-	struct sbitmap_queue	sched_bitmap_tags;
-	struct sbitmap_queue	sched_breserved_tags;
+	struct blk_mq_tags	*shared_sbitmap_tags;
 
 	struct list_head	icq_list;
 #ifdef CONFIG_BLK_CGROUP
-- 
2.26.2


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2 11/11] blk-mq: Stop using pointers for blk_mq_tags bitmap tags
  2021-08-09 14:29 [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
                   ` (9 preceding siblings ...)
  2021-08-09 14:29 ` [PATCH v2 10/11] blk-mq: Use shared tags for shared sbitmap support John Garry
@ 2021-08-09 14:29 ` John Garry
  2021-08-18  8:18   ` Ming Lei
  2021-08-17  7:02 ` [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
  11 siblings, 1 reply; 31+ messages in thread
From: John Garry @ 2021-08-09 14:29 UTC (permalink / raw)
  To: axboe, jejb, martin.petersen
  Cc: linux-block, linux-kernel, linux-scsi, kashyap.desai, hare,
	ming.lei, John Garry

Now that we use shared tags for shared sbitmap support, we don't require
the tags sbitmap pointers, so drop them.

This essentially reverts commit 222a5ae03cdd ("blk-mq: Use pointers for
blk_mq_tags bitmap tags").

Signed-off-by: John Garry <john.garry@huawei.com>
---
 block/bfq-iosched.c      |  4 ++--
 block/blk-mq-debugfs.c   |  8 +++----
 block/blk-mq-tag.c       | 50 ++++++++++++++++------------------------
 block/blk-mq-tag.h       |  7 ++----
 block/blk-mq.c           |  8 +++----
 block/kyber-iosched.c    |  4 ++--
 block/mq-deadline-main.c |  2 +-
 7 files changed, 35 insertions(+), 48 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 727955918563..91ba4eacafaa 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -6881,8 +6881,8 @@ static void bfq_depth_updated(struct blk_mq_hw_ctx *hctx)
 	struct blk_mq_tags *tags = hctx->sched_tags;
 	unsigned int min_shallow;
 
-	min_shallow = bfq_update_depths(bfqd, tags->bitmap_tags);
-	sbitmap_queue_min_shallow_depth(tags->bitmap_tags, min_shallow);
+	min_shallow = bfq_update_depths(bfqd, &tags->bitmap_tags);
+	sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, min_shallow);
 }
 
 static int bfq_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int index)
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 4b66d2776eda..4000376330c9 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -452,11 +452,11 @@ static void blk_mq_debugfs_tags_show(struct seq_file *m,
 		   atomic_read(&tags->active_queues));
 
 	seq_puts(m, "\nbitmap_tags:\n");
-	sbitmap_queue_show(tags->bitmap_tags, m);
+	sbitmap_queue_show(&tags->bitmap_tags, m);
 
 	if (tags->nr_reserved_tags) {
 		seq_puts(m, "\nbreserved_tags:\n");
-		sbitmap_queue_show(tags->breserved_tags, m);
+		sbitmap_queue_show(&tags->breserved_tags, m);
 	}
 }
 
@@ -487,7 +487,7 @@ static int hctx_tags_bitmap_show(void *data, struct seq_file *m)
 	if (res)
 		goto out;
 	if (hctx->tags)
-		sbitmap_bitmap_show(&hctx->tags->bitmap_tags->sb, m);
+		sbitmap_bitmap_show(&hctx->tags->bitmap_tags.sb, m);
 	mutex_unlock(&q->sysfs_lock);
 
 out:
@@ -521,7 +521,7 @@ static int hctx_sched_tags_bitmap_show(void *data, struct seq_file *m)
 	if (res)
 		goto out;
 	if (hctx->sched_tags)
-		sbitmap_bitmap_show(&hctx->sched_tags->bitmap_tags->sb, m);
+		sbitmap_bitmap_show(&hctx->sched_tags->bitmap_tags.sb, m);
 	mutex_unlock(&q->sysfs_lock);
 
 out:
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index e97bbf477b96..b84955ee0967 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -46,9 +46,9 @@ bool __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
  */
 void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool include_reserve)
 {
-	sbitmap_queue_wake_all(tags->bitmap_tags);
+	sbitmap_queue_wake_all(&tags->bitmap_tags);
 	if (include_reserve)
-		sbitmap_queue_wake_all(tags->breserved_tags);
+		sbitmap_queue_wake_all(&tags->breserved_tags);
 }
 
 /*
@@ -104,10 +104,10 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
 			WARN_ON_ONCE(1);
 			return BLK_MQ_NO_TAG;
 		}
-		bt = tags->breserved_tags;
+		bt = &tags->breserved_tags;
 		tag_offset = 0;
 	} else {
-		bt = tags->bitmap_tags;
+		bt = &tags->bitmap_tags;
 		tag_offset = tags->nr_reserved_tags;
 	}
 
@@ -153,9 +153,9 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
 						data->ctx);
 		tags = blk_mq_tags_from_data(data);
 		if (data->flags & BLK_MQ_REQ_RESERVED)
-			bt = tags->breserved_tags;
+			bt = &tags->breserved_tags;
 		else
-			bt = tags->bitmap_tags;
+			bt = &tags->bitmap_tags;
 
 		/*
 		 * If destination hw queue is changed, fake wake up on
@@ -189,10 +189,10 @@ void blk_mq_put_tag(struct blk_mq_tags *tags, struct blk_mq_ctx *ctx,
 		const int real_tag = tag - tags->nr_reserved_tags;
 
 		BUG_ON(real_tag >= tags->nr_tags);
-		sbitmap_queue_clear(tags->bitmap_tags, real_tag, ctx->cpu);
+		sbitmap_queue_clear(&tags->bitmap_tags, real_tag, ctx->cpu);
 	} else {
 		BUG_ON(tag >= tags->nr_reserved_tags);
-		sbitmap_queue_clear(tags->breserved_tags, tag, ctx->cpu);
+		sbitmap_queue_clear(&tags->breserved_tags, tag, ctx->cpu);
 	}
 }
 
@@ -343,9 +343,9 @@ static void __blk_mq_all_tag_iter(struct blk_mq_tags *tags,
 	WARN_ON_ONCE(flags & BT_TAG_ITER_RESERVED);
 
 	if (tags->nr_reserved_tags)
-		bt_tags_for_each(tags, tags->breserved_tags, fn, priv,
+		bt_tags_for_each(tags, &tags->breserved_tags, fn, priv,
 				 flags | BT_TAG_ITER_RESERVED);
-	bt_tags_for_each(tags, tags->bitmap_tags, fn, priv, flags);
+	bt_tags_for_each(tags, &tags->bitmap_tags, fn, priv, flags);
 }
 
 /**
@@ -462,8 +462,8 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
 			continue;
 
 		if (tags->nr_reserved_tags)
-			bt_for_each(hctx, tags->breserved_tags, fn, priv, true);
-		bt_for_each(hctx, tags->bitmap_tags, fn, priv, false);
+			bt_for_each(hctx, &tags->breserved_tags, fn, priv, true);
+		bt_for_each(hctx, &tags->bitmap_tags, fn, priv, false);
 	}
 	blk_queue_exit(q);
 }
@@ -498,19 +498,9 @@ int blk_mq_init_bitmaps(struct sbitmap_queue *bitmap_tags,
 static int blk_mq_init_bitmap_tags(struct blk_mq_tags *tags,
 				   int node, int alloc_policy)
 {
-	int ret;
-
-	ret = blk_mq_init_bitmaps(&tags->__bitmap_tags,
-				  &tags->__breserved_tags,
-				  tags->nr_tags, tags->nr_reserved_tags,
-				  node, alloc_policy);
-	if (ret)
-		return ret;
-
-	tags->bitmap_tags = &tags->__bitmap_tags;
-	tags->breserved_tags = &tags->__breserved_tags;
-
-	return 0;
+	return blk_mq_init_bitmaps(&tags->bitmap_tags, &tags->breserved_tags,
+				   tags->nr_tags, tags->nr_reserved_tags,
+				   node, alloc_policy);
 }
 
 void blk_mq_exit_shared_sbitmap(struct blk_mq_tag_set *set)
@@ -547,8 +537,8 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 
 void blk_mq_free_tags(struct blk_mq_tags *tags)
 {
-	sbitmap_queue_free(tags->bitmap_tags);
-	sbitmap_queue_free(tags->breserved_tags);
+	sbitmap_queue_free(&tags->bitmap_tags);
+	sbitmap_queue_free(&tags->breserved_tags);
 	kfree(tags);
 }
 
@@ -605,7 +595,7 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
 		 * Don't need (or can't) update reserved tags here, they
 		 * remain static and should never need resizing.
 		 */
-		sbitmap_queue_resize(tags->bitmap_tags,
+		sbitmap_queue_resize(&tags->bitmap_tags,
 				tdepth - tags->nr_reserved_tags);
 	}
 
@@ -616,12 +606,12 @@ void blk_mq_tag_resize_shared_sbitmap(struct blk_mq_tag_set *set, unsigned int s
 {
 	struct blk_mq_tags *tags = set->shared_sbitmap_tags;
 
-	sbitmap_queue_resize(&tags->__bitmap_tags, size - set->reserved_tags);
+	sbitmap_queue_resize(&tags->bitmap_tags, size - set->reserved_tags);
 }
 
 void blk_mq_tag_update_sched_shared_sbitmap(struct request_queue *q)
 {
-	sbitmap_queue_resize(q->shared_sbitmap_tags->bitmap_tags,
+	sbitmap_queue_resize(&q->shared_sbitmap_tags->bitmap_tags,
 			     q->nr_requests - q->tag_set->reserved_tags);
 }
 
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index c9fc52ee07c4..ba6502a9a5d8 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -11,11 +11,8 @@ struct blk_mq_tags {
 
 	atomic_t active_queues;
 
-	struct sbitmap_queue *bitmap_tags;
-	struct sbitmap_queue *breserved_tags;
-
-	struct sbitmap_queue __bitmap_tags;
-	struct sbitmap_queue __breserved_tags;
+	struct sbitmap_queue bitmap_tags;
+	struct sbitmap_queue breserved_tags;
 
 	struct request **rqs;
 	struct request **static_rqs;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index d3dd5fab3426..a98ba16f7a76 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1082,14 +1082,14 @@ static inline unsigned int queued_to_index(unsigned int queued)
 
 static bool __blk_mq_get_driver_tag(struct request *rq)
 {
-	struct sbitmap_queue *bt = rq->mq_hctx->tags->bitmap_tags;
+	struct sbitmap_queue *bt = &rq->mq_hctx->tags->bitmap_tags;
 	unsigned int tag_offset = rq->mq_hctx->tags->nr_reserved_tags;
 	int tag;
 
 	blk_mq_tag_busy(rq->mq_hctx);
 
 	if (blk_mq_tag_is_reserved(rq->mq_hctx->sched_tags, rq->internal_tag)) {
-		bt = rq->mq_hctx->tags->breserved_tags;
+		bt = &rq->mq_hctx->tags->breserved_tags;
 		tag_offset = 0;
 	} else {
 		if (!hctx_may_queue(rq->mq_hctx, bt))
@@ -1132,7 +1132,7 @@ static int blk_mq_dispatch_wake(wait_queue_entry_t *wait, unsigned mode,
 		struct sbitmap_queue *sbq;
 
 		list_del_init(&wait->entry);
-		sbq = hctx->tags->bitmap_tags;
+		sbq = &hctx->tags->bitmap_tags;
 		atomic_dec(&sbq->ws_active);
 	}
 	spin_unlock(&hctx->dispatch_wait_lock);
@@ -1150,7 +1150,7 @@ static int blk_mq_dispatch_wake(wait_queue_entry_t *wait, unsigned mode,
 static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx *hctx,
 				 struct request *rq)
 {
-	struct sbitmap_queue *sbq = hctx->tags->bitmap_tags;
+	struct sbitmap_queue *sbq = &hctx->tags->bitmap_tags;
 	struct wait_queue_head *wq;
 	wait_queue_entry_t *wait;
 	bool ret;
diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index 81e3279ecd57..d6f9f17e40b4 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -451,11 +451,11 @@ static void kyber_depth_updated(struct blk_mq_hw_ctx *hctx)
 {
 	struct kyber_queue_data *kqd = hctx->queue->elevator->elevator_data;
 	struct blk_mq_tags *tags = hctx->sched_tags;
-	unsigned int shift = tags->bitmap_tags->sb.shift;
+	unsigned int shift = tags->bitmap_tags.sb.shift;
 
 	kqd->async_depth = (1U << shift) * KYBER_ASYNC_PERCENT / 100U;
 
-	sbitmap_queue_min_shallow_depth(tags->bitmap_tags, kqd->async_depth);
+	sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, kqd->async_depth);
 }
 
 static int kyber_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int hctx_idx)
diff --git a/block/mq-deadline-main.c b/block/mq-deadline-main.c
index 6f612e6dc82b..9cdc80da28fe 100644
--- a/block/mq-deadline-main.c
+++ b/block/mq-deadline-main.c
@@ -554,7 +554,7 @@ static void dd_depth_updated(struct blk_mq_hw_ctx *hctx)
 
 	dd->async_depth = max(1UL, 3 * q->nr_requests / 4);
 
-	sbitmap_queue_min_shallow_depth(tags->bitmap_tags, dd->async_depth);
+	sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, dd->async_depth);
 }
 
 /* Called by blk_mq_init_hctx() and blk_mq_init_sched(). */
-- 
2.26.2



* Re: [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping()
  2021-08-09 14:29 ` [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping() John Garry
@ 2021-08-09 17:00   ` kernel test robot
  2021-08-09 17:10     ` John Garry
  2021-08-09 19:55   ` kernel test robot
  2021-08-18  4:02   ` Ming Lei
  2 siblings, 1 reply; 31+ messages in thread
From: kernel test robot @ 2021-08-09 17:00 UTC (permalink / raw)
  To: John Garry, axboe, jejb, martin.petersen
  Cc: kbuild-all, linux-block, linux-kernel, linux-scsi, kashyap.desai,
	hare, ming.lei, John Garry

[-- Attachment #1: Type: text/plain, Size: 2906 bytes --]

Hi John,

I love your patch! Perhaps something to improve:

[auto build test WARNING on block/for-next]
[also build test WARNING on v5.14-rc5 next-20210809]
[cannot apply to mkp-scsi/for-next ceph-client/for-linus scsi/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/John-Garry/blk-mq-Reduce-static-requests-memory-footprint-for-shared-sbitmap/20210809-223943
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: um-x86_64_defconfig (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/535293cab26a196d29b64d9ce8a7274bfd1806d8
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review John-Garry/blk-mq-Reduce-static-requests-memory-footprint-for-shared-sbitmap/20210809-223943
        git checkout 535293cab26a196d29b64d9ce8a7274bfd1806d8
        # save the attached .config to linux build tree
        make W=1 ARCH=um SUBARCH=x86_64

If you fix the issue, kindly add the following tag as appropriate:
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> block/blk-mq.c:2313:6: warning: no previous prototype for 'blk_mq_clear_rq_mapping' [-Wmissing-prototypes]
    2313 | void blk_mq_clear_rq_mapping(struct blk_mq_tags *drv_tags,
         |      ^~~~~~~~~~~~~~~~~~~~~~~


vim +/blk_mq_clear_rq_mapping +2313 block/blk-mq.c

  2311	
  2312	/* called before freeing request pool in @tags */
> 2313	void blk_mq_clear_rq_mapping(struct blk_mq_tags *drv_tags,
  2314				     struct blk_mq_tags *tags)
  2315	{
  2316		struct page *page;
  2317		unsigned long flags;
  2318	
  2319		list_for_each_entry(page, &tags->page_list, lru) {
  2320			unsigned long start = (unsigned long)page_address(page);
  2321			unsigned long end = start + order_to_size(page->private);
  2322			int i;
  2323	
  2324			for (i = 0; i < drv_tags->nr_tags; i++) {
  2325				struct request *rq = drv_tags->rqs[i];
  2326				unsigned long rq_addr = (unsigned long)rq;
  2327	
  2328				if (rq_addr >= start && rq_addr < end) {
  2329					WARN_ON_ONCE(refcount_read(&rq->ref) != 0);
  2330					cmpxchg(&drv_tags->rqs[i], rq, NULL);
  2331				}
  2332			}
  2333		}
  2334	
  2335		/*
  2336		 * Wait until all pending iteration is done.
  2337		 *
  2338		 * Request reference is cleared and it is guaranteed to be observed
  2339		 * after the ->lock is released.
  2340		 */
  2341		spin_lock_irqsave(&drv_tags->lock, flags);
  2342		spin_unlock_irqrestore(&drv_tags->lock, flags);
  2343	}
  2344	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 9523 bytes --]


* Re: [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping()
  2021-08-09 17:00   ` kernel test robot
@ 2021-08-09 17:10     ` John Garry
  0 siblings, 0 replies; 31+ messages in thread
From: John Garry @ 2021-08-09 17:10 UTC (permalink / raw)
  To: kernel test robot, axboe, jejb, martin.petersen
  Cc: kbuild-all, linux-block, linux-kernel, linux-scsi, kashyap.desai,
	hare, ming.lei

On 09/08/2021 18:00, kernel test robot wrote:
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot<lkp@intel.com>
> 
> All warnings (new ones prefixed by >>):
> 
>>> block/blk-mq.c:2313:6: warning: no previous prototype for 'blk_mq_clear_rq_mapping' [-Wmissing-prototypes]
>      2313 | void blk_mq_clear_rq_mapping(struct blk_mq_tags *drv_tags,
>           |      ^~~~~~~~~~~~~~~~~~~~~~~
> 
> 
> vim +/blk_mq_clear_rq_mapping +2313 block/blk-mq.c
> 
>    2311	
>    2312	/* called before freeing request pool in @tags */
>> 2313	void blk_mq_clear_rq_mapping(struct blk_mq_tags *drv_tags,
>    2314				     struct blk_mq_tags *tags)

I will fix this in a new version once the rest of the review is in.

Thanks


* Re: [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping()
  2021-08-09 14:29 ` [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping() John Garry
  2021-08-09 17:00   ` kernel test robot
@ 2021-08-09 19:55   ` kernel test robot
  2021-08-18  4:02   ` Ming Lei
  2 siblings, 0 replies; 31+ messages in thread
From: kernel test robot @ 2021-08-09 19:55 UTC (permalink / raw)
  To: John Garry, axboe, jejb, martin.petersen
  Cc: clang-built-linux, kbuild-all, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare, ming.lei, John Garry


Hi John,

I love your patch! Perhaps something to improve:

[auto build test WARNING on block/for-next]
[also build test WARNING on v5.14-rc5 next-20210809]
[cannot apply to mkp-scsi/for-next ceph-client/for-linus scsi/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/John-Garry/blk-mq-Reduce-static-requests-memory-footprint-for-shared-sbitmap/20210809-223943
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
config: hexagon-randconfig-r036-20210809 (attached as .config)
compiler: clang version 12.0.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/535293cab26a196d29b64d9ce8a7274bfd1806d8
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review John-Garry/blk-mq-Reduce-static-requests-memory-footprint-for-shared-sbitmap/20210809-223943
        git checkout 535293cab26a196d29b64d9ce8a7274bfd1806d8
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=hexagon 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> block/blk-mq.c:2313:6: warning: no previous prototype for function 'blk_mq_clear_rq_mapping' [-Wmissing-prototypes]
   void blk_mq_clear_rq_mapping(struct blk_mq_tags *drv_tags,
        ^
   block/blk-mq.c:2313:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   void blk_mq_clear_rq_mapping(struct blk_mq_tags *drv_tags,
   ^
   static 
   1 warning generated.


vim +/blk_mq_clear_rq_mapping +2313 block/blk-mq.c

  2311	
  2312	/* called before freeing request pool in @tags */
> 2313	void blk_mq_clear_rq_mapping(struct blk_mq_tags *drv_tags,
  2314				     struct blk_mq_tags *tags)
  2315	{
  2316		struct page *page;
  2317		unsigned long flags;
  2318	
  2319		list_for_each_entry(page, &tags->page_list, lru) {
  2320			unsigned long start = (unsigned long)page_address(page);
  2321			unsigned long end = start + order_to_size(page->private);
  2322			int i;
  2323	
  2324			for (i = 0; i < drv_tags->nr_tags; i++) {
  2325				struct request *rq = drv_tags->rqs[i];
  2326				unsigned long rq_addr = (unsigned long)rq;
  2327	
  2328				if (rq_addr >= start && rq_addr < end) {
  2329					WARN_ON_ONCE(refcount_read(&rq->ref) != 0);
  2330					cmpxchg(&drv_tags->rqs[i], rq, NULL);
  2331				}
  2332			}
  2333		}
  2334	
  2335		/*
  2336		 * Wait until all pending iteration is done.
  2337		 *
  2338		 * Request reference is cleared and it is guaranteed to be observed
  2339		 * after the ->lock is released.
  2340		 */
  2341		spin_lock_irqsave(&drv_tags->lock, flags);
  2342		spin_unlock_irqrestore(&drv_tags->lock, flags);
  2343	}
  2344	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org



* Re: [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap
  2021-08-09 14:29 [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
                   ` (10 preceding siblings ...)
  2021-08-09 14:29 ` [PATCH v2 11/11] blk-mq: Stop using pointers for blk_mq_tags bitmap tags John Garry
@ 2021-08-17  7:02 ` John Garry
  11 siblings, 0 replies; 31+ messages in thread
From: John Garry @ 2021-08-17  7:02 UTC (permalink / raw)
  To: axboe, jejb, martin.petersen
  Cc: linux-block, linux-kernel, linux-scsi, kashyap.desai, hare, ming.lei

On 09/08/2021 15:29, John Garry wrote:

Hi guys,

Any chance to review or test this?

Thanks!

> Currently a full set of static requests is allocated per hw queue per
> tagset when shared sbitmap is used.
> 
> However, only tagset->queue_depth number of requests may be active at
> any given time. As such, only tagset->queue_depth number of static
> requests are required.
> 
> The same goes for using an IO scheduler, which allocates a full set of
> static requests per hw queue per request queue.
> 
> This series changes shared sbitmap support by using a shared tags per
> tagset and request queue. Ming suggested something along those lines in
> v1. But we'll keep the "shared sbitmap" name as it is familiar. In using
> a shared tags, the static rqs also become shared, reducing the number of
> sets of static rqs, reducing memory usage.
> 
> Patch "blk-mq: Use shared tags for shared sbitmap support" is a bit big,
> and could be broken down. But then maintaining ability to bisect
> becomes harder and each sub-patch would get more convoluted.
> 
> For megaraid sas driver on my 128-CPU arm64 system with 1x SATA disk, we
> save approx. 300MB(!) [370MB -> 60MB]
> 
> Baseline is 2112f5c1330a (for-5.15/block) loop: Select I/O scheduler ...
> 
> Changes since v1:
> - Switch to use blk_mq_tags for shared sbitmap
> - Add new blk_mq_ops init request callback
> - Add some RB tags (thanks!)
> 
> John Garry (11):
>    blk-mq: Change rqs check in blk_mq_free_rqs()
>    block: Rename BLKDEV_MAX_RQ -> BLKDEV_DEFAULT_RQ
>    blk-mq: Relocate shared sbitmap resize in blk_mq_update_nr_requests()
>    blk-mq: Invert check in blk_mq_update_nr_requests()
>    blk-mq-sched: Rename blk_mq_sched_alloc_{tags -> map_and_request}()
>    blk-mq: Pass driver tags to blk_mq_clear_rq_mapping()
>    blk-mq: Add blk_mq_tag_update_sched_shared_sbitmap()
>    blk-mq: Add blk_mq_ops.init_request_no_hctx()
>    scsi: Set blk_mq_ops.init_request_no_hctx
>    blk-mq: Use shared tags for shared sbitmap support
>    blk-mq: Stop using pointers for blk_mq_tags bitmap tags
> 
>   block/bfq-iosched.c      |   4 +-
>   block/blk-core.c         |   2 +-
>   block/blk-mq-debugfs.c   |   8 +--
>   block/blk-mq-sched.c     |  92 +++++++++++++------------
>   block/blk-mq-sched.h     |   2 +-
>   block/blk-mq-tag.c       | 111 +++++++++++++-----------------
>   block/blk-mq-tag.h       |  12 ++--
>   block/blk-mq.c           | 144 +++++++++++++++++++++++----------------
>   block/blk-mq.h           |   8 +--
>   block/kyber-iosched.c    |   4 +-
>   block/mq-deadline-main.c |   2 +-
>   drivers/block/rbd.c      |   2 +-
>   drivers/scsi/scsi_lib.c  |   6 +-
>   include/linux/blk-mq.h   |  20 +++---
>   include/linux/blkdev.h   |   5 +-
>   15 files changed, 218 insertions(+), 204 deletions(-)
> 



* Re: [PATCH v2 03/11] blk-mq: Relocate shared sbitmap resize in blk_mq_update_nr_requests()
  2021-08-09 14:29 ` [PATCH v2 03/11] blk-mq: Relocate shared sbitmap resize in blk_mq_update_nr_requests() John Garry
@ 2021-08-18  3:49   ` Ming Lei
  0 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2021-08-18  3:49 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On Mon, Aug 09, 2021 at 10:29:30PM +0800, John Garry wrote:
> For shared sbitmap, if the call to blk_mq_tag_update_depth() was
> successful for any hctx when hctx->sched_tags is not set, then it would be
> successful for all (due to the nature in which blk_mq_tag_update_depth()
> fails).
> 
> As such, there is no need to call blk_mq_tag_resize_shared_sbitmap() for
> each hctx. So relocate the call until after the hctx iteration under the
> !q->elevator check, which is equivalent (to !hctx->sched_tags).
> 
> Signed-off-by: John Garry <john.garry@huawei.com>

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming



* Re: [PATCH v2 04/11] blk-mq: Invert check in blk_mq_update_nr_requests()
  2021-08-09 14:29 ` [PATCH v2 04/11] blk-mq: Invert check " John Garry
@ 2021-08-18  3:53   ` Ming Lei
  0 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2021-08-18  3:53 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On Mon, Aug 09, 2021 at 10:29:31PM +0800, John Garry wrote:
> It's easier to read:
> 
> if (x)
> 	X;
> else
> 	Y;
> 
> over:
> 
> if (!x)
> 	Y;
> else
> 	X;
> 
> No functional change intended.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming



* Re: [PATCH v2 05/11] blk-mq-sched: Rename blk_mq_sched_alloc_{tags -> map_and_request}()
  2021-08-09 14:29 ` [PATCH v2 05/11] blk-mq-sched: Rename blk_mq_sched_alloc_{tags -> map_and_request}() John Garry
@ 2021-08-18  3:55   ` Ming Lei
  0 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2021-08-18  3:55 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On Mon, Aug 09, 2021 at 10:29:32PM +0800, John Garry wrote:
> Function blk_mq_sched_alloc_tags() does the same as
> __blk_mq_alloc_map_and_request(), so give a similar name to be consistent.
> 
> Similarly rename label err_free_tags -> err_free_map_and_request.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming



* Re: [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping()
  2021-08-09 14:29 ` [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping() John Garry
  2021-08-09 17:00   ` kernel test robot
  2021-08-09 19:55   ` kernel test robot
@ 2021-08-18  4:02   ` Ming Lei
  2021-08-18 12:00     ` John Garry
  2 siblings, 1 reply; 31+ messages in thread
From: Ming Lei @ 2021-08-18  4:02 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On Mon, Aug 09, 2021 at 10:29:33PM +0800, John Garry wrote:
> Function blk_mq_clear_rq_mapping() will be used for shared sbitmap tags
> in future, so pass a driver tags pointer instead of the tagset container
> and HW queue index.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  block/blk-mq.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 42c4b8d1a570..0bb596f4d061 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2310,10 +2310,9 @@ static size_t order_to_size(unsigned int order)
>  }
>  
>  /* called before freeing request pool in @tags */
> -static void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set,
> -		struct blk_mq_tags *tags, unsigned int hctx_idx)
> +void blk_mq_clear_rq_mapping(struct blk_mq_tags *drv_tags,
> +			     struct blk_mq_tags *tags)
>  {
> -	struct blk_mq_tags *drv_tags = set->tags[hctx_idx];
>  	struct page *page;
>  	unsigned long flags;
>  
> @@ -2322,7 +2321,7 @@ static void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set,
>  		unsigned long end = start + order_to_size(page->private);
>  		int i;
>  
> -		for (i = 0; i < set->queue_depth; i++) {
> +		for (i = 0; i < drv_tags->nr_tags; i++) {
>  			struct request *rq = drv_tags->rqs[i];
>  			unsigned long rq_addr = (unsigned long)rq;
>  
> @@ -2346,8 +2345,11 @@ static void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set,
>  void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
>  		     unsigned int hctx_idx)
>  {
> +	struct blk_mq_tags *drv_tags;
>  	struct page *page;
>  
> +		drv_tags = set->tags[hctx_idx];

Indent.

> +
>  	if (tags->static_rqs && set->ops->exit_request) {
>  		int i;
>  
> @@ -2361,7 +2363,7 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
>  		}
>  	}
>  
> -	blk_mq_clear_rq_mapping(set, tags, hctx_idx);
> +	blk_mq_clear_rq_mapping(drv_tags, tags);

Maybe you can pass set->tags[hctx_idx] directly since there is only one
reference on it.

-- 
Ming



* Re: [PATCH v2 07/11] blk-mq: Add blk_mq_tag_update_sched_shared_sbitmap()
  2021-08-09 14:29 ` [PATCH v2 07/11] blk-mq: Add blk_mq_tag_update_sched_shared_sbitmap() John Garry
@ 2021-08-18  7:28   ` Ming Lei
  0 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2021-08-18  7:28 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On Mon, Aug 09, 2021 at 10:29:34PM +0800, John Garry wrote:
> Put the functionality to update the sched shared sbitmap size in a common
> function.
> 
> Since the same formula is always used to resize, and it can be derived
> from the request queue argument, just pass the request queue pointer.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming



* Re: [PATCH v2 08/11] blk-mq: Add blk_mq_ops.init_request_no_hctx()
  2021-08-09 14:29 ` [PATCH v2 08/11] blk-mq: Add blk_mq_ops.init_request_no_hctx() John Garry
@ 2021-08-18  7:38   ` Ming Lei
  2021-08-18  8:46     ` John Garry
  0 siblings, 1 reply; 31+ messages in thread
From: Ming Lei @ 2021-08-18  7:38 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On Mon, Aug 09, 2021 at 10:29:35PM +0800, John Garry wrote:
> Add a variant of the init_request function which does not pass a hctx_idx
> arg.
> 
> This is important for shared sbitmap support: when introducing shared
> static rqs, it must be ensured that the LLDD cannot assume that requests
> are associated with a specific HW queue.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  block/blk-mq.c         | 15 ++++++++++-----
>  include/linux/blk-mq.h |  7 +++++++
>  2 files changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index f14cc2705f9b..4d6723cfa582 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2427,13 +2427,15 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
>  static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
>  			       unsigned int hctx_idx, int node)
>  {
> -	int ret;
> +	int ret = 0;
>  
> -	if (set->ops->init_request) {
> +	if (set->ops->init_request)
>  		ret = set->ops->init_request(set, rq, hctx_idx, node);
> -		if (ret)
> -			return ret;
> -	}
> +	else if (set->ops->init_request_no_hctx)
> +		ret = set->ops->init_request_no_hctx(set, rq, node);

The only shared sbitmap user of SCSI does not use passed hctx_idx, not
sure we need such new callback.

If you really want to do this, just wondering why not pass '-1' as
hctx_idx in case of shared sbitmap?


Thanks,
Ming



* Re: [PATCH v2 10/11] blk-mq: Use shared tags for shared sbitmap support
  2021-08-09 14:29 ` [PATCH v2 10/11] blk-mq: Use shared tags for shared sbitmap support John Garry
@ 2021-08-18  8:16   ` Ming Lei
  2021-08-18 14:18     ` John Garry
  2021-08-23  8:55     ` John Garry
  0 siblings, 2 replies; 31+ messages in thread
From: Ming Lei @ 2021-08-18  8:16 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On Mon, Aug 09, 2021 at 10:29:37PM +0800, John Garry wrote:
> Currently we use separate sbitmap pairs and active_queues atomic_t for
> shared sbitmap support.
> 
> However a full set of static requests is used per HW queue, which is
> quite wasteful, considering that the total number of requests usable at
> any given time across all HW queues is limited by the shared sbitmap depth.
> 
> As such, it is considerably more memory efficient in the case of shared
> sbitmap to allocate a set of static rqs per tag set or request queue, and
> not per HW queue.
> 
> So replace the sbitmap pairs and active_queues atomic_t with a shared
> tags per tagset and request queue.

This idea looks good and the current implementation is simplified a bit
too.

> 
> Continue to use term "shared sbitmap" for now, as the meaning is known.

I guess shared tags is better.

> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> ---
>  block/blk-mq-sched.c   | 77 ++++++++++++++++++++-----------------
>  block/blk-mq-tag.c     | 65 ++++++++++++-------------------
>  block/blk-mq-tag.h     |  4 +-
>  block/blk-mq.c         | 86 +++++++++++++++++++++++++-----------------
>  block/blk-mq.h         |  8 ++--
>  include/linux/blk-mq.h | 13 +++----
>  include/linux/blkdev.h |  3 +-
>  7 files changed, 131 insertions(+), 125 deletions(-)
> 
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index ac0408ebcd47..1101a2e4da9a 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -522,14 +522,19 @@ static int blk_mq_sched_alloc_map_and_request(struct request_queue *q,
>  	struct blk_mq_tag_set *set = q->tag_set;
>  	int ret;
>  
> +	if (blk_mq_is_sbitmap_shared(q->tag_set->flags)) {
> +		hctx->sched_tags = q->shared_sbitmap_tags;
> +		return 0;
> +	}
> +
>  	hctx->sched_tags = blk_mq_alloc_rq_map(set, hctx_idx, q->nr_requests,
> -					       set->reserved_tags, set->flags);
> +					       set->reserved_tags);
>  	if (!hctx->sched_tags)
>  		return -ENOMEM;
>  
>  	ret = blk_mq_alloc_rqs(set, hctx->sched_tags, hctx_idx, q->nr_requests);
>  	if (ret) {
> -		blk_mq_free_rq_map(hctx->sched_tags, set->flags);
> +		blk_mq_free_rq_map(hctx->sched_tags);
>  		hctx->sched_tags = NULL;
>  	}
>  
> @@ -544,35 +549,39 @@ static void blk_mq_sched_tags_teardown(struct request_queue *q)
>  
>  	queue_for_each_hw_ctx(q, hctx, i) {
>  		if (hctx->sched_tags) {
> -			blk_mq_free_rq_map(hctx->sched_tags, hctx->flags);
> +			if (!blk_mq_is_sbitmap_shared(q->tag_set->flags))
> +				blk_mq_free_rq_map(hctx->sched_tags);
>  			hctx->sched_tags = NULL;
>  		}
>  	}
>  }
>  
> +static void blk_mq_exit_sched_shared_sbitmap(struct request_queue *queue)
> +{
> +	blk_mq_free_rq_map(queue->shared_sbitmap_tags);
> +	queue->shared_sbitmap_tags = NULL;
> +}
> +
>  static int blk_mq_init_sched_shared_sbitmap(struct request_queue *queue)
>  {
>  	struct blk_mq_tag_set *set = queue->tag_set;
> -	int alloc_policy = BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags);
> -	struct blk_mq_hw_ctx *hctx;
> -	int ret, i;
> +	struct blk_mq_tags *tags;
> +	int ret;
>  
>  	/*
>  	 * Set initial depth at max so that we don't need to reallocate for
>  	 * updating nr_requests.
>  	 */
> -	ret = blk_mq_init_bitmaps(&queue->sched_bitmap_tags,
> -				  &queue->sched_breserved_tags,
> -				  MAX_SCHED_RQ, set->reserved_tags,
> -				  set->numa_node, alloc_policy);
> -	if (ret)
> -		return ret;
> +	tags = queue->shared_sbitmap_tags = blk_mq_alloc_rq_map(set, 0,
> +					  set->queue_depth,
> +					  set->reserved_tags);
> +	if (!queue->shared_sbitmap_tags)
> +		return -ENOMEM;
>  
> -	queue_for_each_hw_ctx(queue, hctx, i) {
> -		hctx->sched_tags->bitmap_tags =
> -					&queue->sched_bitmap_tags;
> -		hctx->sched_tags->breserved_tags =
> -					&queue->sched_breserved_tags;
> +	ret = blk_mq_alloc_rqs(set, tags, 0, set->queue_depth);
> +	if (ret) {
> +		blk_mq_exit_sched_shared_sbitmap(queue);
> +		return ret;

There are two such patterns for allocate rq map and request pool
together, please put them into one helper(such as blk_mq_alloc_map_and_rqs)
which can return the allocated tags and handle failure inline. Also we may
convert current users into this helper.

>  	}
>  
>  	blk_mq_tag_update_sched_shared_sbitmap(queue);
> @@ -580,12 +589,6 @@ static int blk_mq_init_sched_shared_sbitmap(struct request_queue *queue)
>  	return 0;
>  }
>  
> -static void blk_mq_exit_sched_shared_sbitmap(struct request_queue *queue)
> -{
> -	sbitmap_queue_free(&queue->sched_bitmap_tags);
> -	sbitmap_queue_free(&queue->sched_breserved_tags);
> -}
> -
>  int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
>  {
>  	struct blk_mq_hw_ctx *hctx;
> @@ -607,21 +610,21 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
>  	q->nr_requests = 2 * min_t(unsigned int, q->tag_set->queue_depth,
>  				   BLKDEV_DEFAULT_RQ);
>  
> -	queue_for_each_hw_ctx(q, hctx, i) {
> -		ret = blk_mq_sched_alloc_map_and_request(q, hctx, i);
> +	if (blk_mq_is_sbitmap_shared(q->tag_set->flags)) {
> +		ret = blk_mq_init_sched_shared_sbitmap(q);
>  		if (ret)
> -			goto err_free_map_and_request;
> +			return ret;
>  	}
>  
> -	if (blk_mq_is_sbitmap_shared(q->tag_set->flags)) {
> -		ret = blk_mq_init_sched_shared_sbitmap(q);
> +	queue_for_each_hw_ctx(q, hctx, i) {
> +		ret = blk_mq_sched_alloc_map_and_request(q, hctx, i);
>  		if (ret)
>  			goto err_free_map_and_request;
>  	}
>  
>  	ret = e->ops.init_sched(q, e);
>  	if (ret)
> -		goto err_free_sbitmap;
> +		goto err_free_map_and_request;
>  
>  	blk_mq_debugfs_register_sched(q);
>  
> @@ -641,12 +644,12 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
>  
>  	return 0;
>  
> -err_free_sbitmap:
> -	if (blk_mq_is_sbitmap_shared(q->tag_set->flags))
> -		blk_mq_exit_sched_shared_sbitmap(q);
>  err_free_map_and_request:
>  	blk_mq_sched_free_requests(q);
>  	blk_mq_sched_tags_teardown(q);
> +	if (blk_mq_is_sbitmap_shared(q->tag_set->flags))
> +		blk_mq_exit_sched_shared_sbitmap(q);
> +
>  	q->elevator = NULL;
>  	return ret;
>  }
> @@ -660,9 +663,13 @@ void blk_mq_sched_free_requests(struct request_queue *q)
>  	struct blk_mq_hw_ctx *hctx;
>  	int i;
>  
> -	queue_for_each_hw_ctx(q, hctx, i) {
> -		if (hctx->sched_tags)
> -			blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i);
> +	if (blk_mq_is_sbitmap_shared(q->tag_set->flags)) {
> +		blk_mq_free_rqs(q->tag_set, q->shared_sbitmap_tags, 0);
> +	} else {
> +		queue_for_each_hw_ctx(q, hctx, i) {
> +			if (hctx->sched_tags)
> +				blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i);
> +		}
>  	}
>  }
>  
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index 5f06ad6efc8f..e97bbf477b96 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -27,10 +27,11 @@ bool __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
>  	if (blk_mq_is_sbitmap_shared(hctx->flags)) {
>  		struct request_queue *q = hctx->queue;
>  		struct blk_mq_tag_set *set = q->tag_set;
> +		struct blk_mq_tags *tags = set->shared_sbitmap_tags;
>  
>  		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags) &&
>  		    !test_and_set_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
> -			atomic_inc(&set->active_queues_shared_sbitmap);
> +			atomic_inc(&tags->active_queues);
>  	} else {
>  		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state) &&
>  		    !test_and_set_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
> @@ -61,10 +62,12 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
>  	struct blk_mq_tag_set *set = q->tag_set;
>  
>  	if (blk_mq_is_sbitmap_shared(hctx->flags)) {
> +		struct blk_mq_tags *tags = set->shared_sbitmap_tags;
> +
>  		if (!test_and_clear_bit(QUEUE_FLAG_HCTX_ACTIVE,
>  					&q->queue_flags))
>  			return;
> -		atomic_dec(&set->active_queues_shared_sbitmap);
> +		atomic_dec(&tags->active_queues);
>  	} else {
>  		if (!test_and_clear_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
>  			return;
> @@ -510,38 +513,16 @@ static int blk_mq_init_bitmap_tags(struct blk_mq_tags *tags,
>  	return 0;
>  }
>  
> -int blk_mq_init_shared_sbitmap(struct blk_mq_tag_set *set)
> -{
> -	int alloc_policy = BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags);
> -	int i, ret;
> -
> -	ret = blk_mq_init_bitmaps(&set->__bitmap_tags, &set->__breserved_tags,
> -				  set->queue_depth, set->reserved_tags,
> -				  set->numa_node, alloc_policy);
> -	if (ret)
> -		return ret;
> -
> -	for (i = 0; i < set->nr_hw_queues; i++) {
> -		struct blk_mq_tags *tags = set->tags[i];
> -
> -		tags->bitmap_tags = &set->__bitmap_tags;
> -		tags->breserved_tags = &set->__breserved_tags;
> -	}
> -
> -	return 0;
> -}
> -
>  void blk_mq_exit_shared_sbitmap(struct blk_mq_tag_set *set)
>  {
> -	sbitmap_queue_free(&set->__bitmap_tags);
> -	sbitmap_queue_free(&set->__breserved_tags);
> +	blk_mq_free_rq_map(set->shared_sbitmap_tags);
> +	set->shared_sbitmap_tags = NULL;
>  }
>  
>  struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
>  				     unsigned int reserved_tags,
> -				     int node, unsigned int flags)
> +				     int node, int alloc_policy)
>  {
> -	int alloc_policy = BLK_MQ_FLAG_TO_ALLOC_POLICY(flags);
>  	struct blk_mq_tags *tags;
>  
>  	if (total_tags > BLK_MQ_TAG_MAX) {
> @@ -557,9 +538,6 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
>  	tags->nr_reserved_tags = reserved_tags;
>  	spin_lock_init(&tags->lock);
>  
> -	if (blk_mq_is_sbitmap_shared(flags))
> -		return tags;
> -
>  	if (blk_mq_init_bitmap_tags(tags, node, alloc_policy) < 0) {
>  		kfree(tags);
>  		return NULL;
> @@ -567,12 +545,10 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
>  	return tags;
>  }
>  
> -void blk_mq_free_tags(struct blk_mq_tags *tags, unsigned int flags)
> +void blk_mq_free_tags(struct blk_mq_tags *tags)
>  {
> -	if (!blk_mq_is_sbitmap_shared(flags)) {
> -		sbitmap_queue_free(tags->bitmap_tags);
> -		sbitmap_queue_free(tags->breserved_tags);
> -	}
> +	sbitmap_queue_free(tags->bitmap_tags);
> +	sbitmap_queue_free(tags->breserved_tags);
>  	kfree(tags);
>  }
>  
> @@ -604,18 +580,25 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
>  		if (tdepth > MAX_SCHED_RQ)
>  			return -EINVAL;
>  
> +		if (blk_mq_is_sbitmap_shared(set->flags)) {
> +			/* No point in allowing this to happen */
> +			if (tdepth > set->queue_depth)
> +				return -EINVAL;
> +			return 0;
> +		}

The above looks wrong: it isn't unusual to see small queue depth hardware,
while we often have a scheduler queue depth of 2 * set->queue_depth.

> +
>  		new = blk_mq_alloc_rq_map(set, hctx->queue_num, tdepth,
> -				tags->nr_reserved_tags, set->flags);
> +				tags->nr_reserved_tags);
>  		if (!new)
>  			return -ENOMEM;
>  		ret = blk_mq_alloc_rqs(set, new, hctx->queue_num, tdepth);
>  		if (ret) {
> -			blk_mq_free_rq_map(new, set->flags);
> +			blk_mq_free_rq_map(new);
>  			return -ENOMEM;
>  		}
>  
>  		blk_mq_free_rqs(set, *tagsptr, hctx->queue_num);
> -		blk_mq_free_rq_map(*tagsptr, set->flags);
> +		blk_mq_free_rq_map(*tagsptr);
>  		*tagsptr = new;
>  	} else {
>  		/*
> @@ -631,12 +614,14 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
>  
>  void blk_mq_tag_resize_shared_sbitmap(struct blk_mq_tag_set *set, unsigned int size)
>  {
> -	sbitmap_queue_resize(&set->__bitmap_tags, size - set->reserved_tags);
> +	struct blk_mq_tags *tags = set->shared_sbitmap_tags;
> +
> +	sbitmap_queue_resize(&tags->__bitmap_tags, size - set->reserved_tags);
>  }
>  
>  void blk_mq_tag_update_sched_shared_sbitmap(struct request_queue *q)
>  {
> -	sbitmap_queue_resize(&q->sched_bitmap_tags,
> +	sbitmap_queue_resize(q->shared_sbitmap_tags->bitmap_tags,
>  			     q->nr_requests - q->tag_set->reserved_tags);
>  }
>  
> diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
> index 88f3c6485543..c9fc52ee07c4 100644
> --- a/block/blk-mq-tag.h
> +++ b/block/blk-mq-tag.h
> @@ -30,8 +30,8 @@ struct blk_mq_tags {
>  
>  extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
>  					unsigned int reserved_tags,
> -					int node, unsigned int flags);
> -extern void blk_mq_free_tags(struct blk_mq_tags *tags, unsigned int flags);
> +					int node, int alloc_policy);
> +extern void blk_mq_free_tags(struct blk_mq_tags *tags);
>  extern int blk_mq_init_bitmaps(struct sbitmap_queue *bitmap_tags,
>  			       struct sbitmap_queue *breserved_tags,
>  			       unsigned int queue_depth,
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 4d6723cfa582..d3dd5fab3426 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2348,6 +2348,9 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
>  	struct blk_mq_tags *drv_tags;
>  	struct page *page;
>  
> +	if (blk_mq_is_sbitmap_shared(set->flags))
> +		drv_tags = set->shared_sbitmap_tags;
> +	else
>  		drv_tags = set->tags[hctx_idx];

Here I guess you need to avoid calling ->exit_request() twice?

>  
>  	if (tags->static_rqs && set->ops->exit_request) {
> @@ -2377,21 +2380,20 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
>  	}
>  }
>  
> -void blk_mq_free_rq_map(struct blk_mq_tags *tags, unsigned int flags)
> +void blk_mq_free_rq_map(struct blk_mq_tags *tags)
>  {
>  	kfree(tags->rqs);
>  	tags->rqs = NULL;
>  	kfree(tags->static_rqs);
>  	tags->static_rqs = NULL;
>  
> -	blk_mq_free_tags(tags, flags);
> +	blk_mq_free_tags(tags);
>  }
>  
>  struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
>  					unsigned int hctx_idx,
>  					unsigned int nr_tags,
> -					unsigned int reserved_tags,
> -					unsigned int flags)
> +					unsigned int reserved_tags)
>  {
>  	struct blk_mq_tags *tags;
>  	int node;
> @@ -2400,7 +2402,8 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
>  	if (node == NUMA_NO_NODE)
>  		node = set->numa_node;
>  
> -	tags = blk_mq_init_tags(nr_tags, reserved_tags, node, flags);
> +	tags = blk_mq_init_tags(nr_tags, reserved_tags, node,
> +				BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags));
>  	if (!tags)
>  		return NULL;
>  
> @@ -2408,7 +2411,7 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
>  				 GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
>  				 node);
>  	if (!tags->rqs) {
> -		blk_mq_free_tags(tags, flags);
> +		blk_mq_free_tags(tags);
>  		return NULL;
>  	}
>  
> @@ -2417,7 +2420,7 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
>  					node);
>  	if (!tags->static_rqs) {
>  		kfree(tags->rqs);
> -		blk_mq_free_tags(tags, flags);
> +		blk_mq_free_tags(tags);
>  		return NULL;
>  	}
>  
> @@ -2859,8 +2862,14 @@ static bool __blk_mq_alloc_map_and_request(struct blk_mq_tag_set *set,
>  	unsigned int flags = set->flags;
>  	int ret = 0;
>  
> +	if (blk_mq_is_sbitmap_shared(flags)) {
> +		set->tags[hctx_idx] = set->shared_sbitmap_tags;
> +
> +		return true;
> +	}
> +
>  	set->tags[hctx_idx] = blk_mq_alloc_rq_map(set, hctx_idx,
> -					set->queue_depth, set->reserved_tags, flags);
> +					set->queue_depth, set->reserved_tags);
>  	if (!set->tags[hctx_idx])
>  		return false;
>  
> @@ -2869,7 +2878,7 @@ static bool __blk_mq_alloc_map_and_request(struct blk_mq_tag_set *set,
>  	if (!ret)
>  		return true;
>  
> -	blk_mq_free_rq_map(set->tags[hctx_idx], flags);
> +	blk_mq_free_rq_map(set->tags[hctx_idx]);
>  	set->tags[hctx_idx] = NULL;
>  	return false;
>  }
> @@ -2877,11 +2886,11 @@ static bool __blk_mq_alloc_map_and_request(struct blk_mq_tag_set *set,
>  static void blk_mq_free_map_and_requests(struct blk_mq_tag_set *set,
>  					 unsigned int hctx_idx)
>  {
> -	unsigned int flags = set->flags;
> -
>  	if (set->tags && set->tags[hctx_idx]) {
> -		blk_mq_free_rqs(set, set->tags[hctx_idx], hctx_idx);
> -		blk_mq_free_rq_map(set->tags[hctx_idx], flags);
> +		if (!blk_mq_is_sbitmap_shared(set->flags)) {

I remember you hate negative check, :-)

> +			blk_mq_free_rqs(set, set->tags[hctx_idx], hctx_idx);
> +			blk_mq_free_rq_map(set->tags[hctx_idx]);

We can add a helper such as blk_mq_free_map_and_rqs(); there seem to be
several instances of this pattern.

> +		}
>  		set->tags[hctx_idx] = NULL;
>  	}
>  }
> @@ -3348,6 +3357,21 @@ static int __blk_mq_alloc_rq_maps(struct blk_mq_tag_set *set)
>  {
>  	int i;
>  
> +	if (blk_mq_is_sbitmap_shared(set->flags)) {
> +		int ret;
> +
> +		set->shared_sbitmap_tags = blk_mq_alloc_rq_map(set, 0,
> +						  set->queue_depth,
> +						  set->reserved_tags);
> +		if (!set->shared_sbitmap_tags)
> +			return -ENOMEM;
> +
> +		ret = blk_mq_alloc_rqs(set, set->shared_sbitmap_tags, 0,
> +				       set->queue_depth);
> +		if (ret)
> +			goto out_free_sbitmap_tags;
> +	}
> +
>  	for (i = 0; i < set->nr_hw_queues; i++) {
>  		if (!__blk_mq_alloc_map_and_request(set, i))
>  			goto out_unwind;
> @@ -3359,6 +3383,11 @@ static int __blk_mq_alloc_rq_maps(struct blk_mq_tag_set *set)
>  out_unwind:
>  	while (--i >= 0)
>  		blk_mq_free_map_and_requests(set, i);
> +	if (blk_mq_is_sbitmap_shared(set->flags))
> +		blk_mq_free_rqs(set, set->shared_sbitmap_tags, 0);
> +out_free_sbitmap_tags:
> +	if (blk_mq_is_sbitmap_shared(set->flags))
> +		blk_mq_exit_shared_sbitmap(set);

Once a helper of blk_mq_alloc_map_and_rqs() is added, the above failure
handling can be simplified too.


-- 
Ming
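[Editor's note: the consolidation Ming suggests above can be sketched in a
small user-space model. The structure and function names below are toy
stand-ins chosen for illustration, not the actual blk-mq API; the point is
only the shape of the combined free helper.]

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for struct blk_mq_tags; only models the request pool. */
struct tags {
	void **static_rqs;	/* models tags->static_rqs */
	unsigned int depth;
};

/* models blk_mq_free_rqs(): release the static request pool */
static void free_rqs(struct tags *t)
{
	for (unsigned int i = 0; i < t->depth; i++)
		free(t->static_rqs[i]);
	free(t->static_rqs);
	t->static_rqs = NULL;
}

/* models blk_mq_free_rq_map(): release the tags structure itself */
static void free_rq_map(struct tags *t)
{
	free(t);
}

/* the proposed blk_mq_free_map_and_rqs(): both steps in one call */
static void free_map_and_rqs(struct tags *t)
{
	if (!t)
		return;
	free_rqs(t);
	free_rq_map(t);
}

/* matching allocation side, so the model is complete */
static struct tags *alloc_map_and_rqs(unsigned int depth)
{
	struct tags *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;
	t->static_rqs = calloc(depth, sizeof(void *));
	if (!t->static_rqs) {
		free(t);
		return NULL;
	}
	t->depth = depth;
	for (unsigned int i = 0; i < depth; i++)
		t->static_rqs[i] = malloc(64);
	return t;
}
```

With such a helper, each call site that currently pairs the two frees shrinks
to a single call, and the NULL check moves out of the callers.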


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 11/11] blk-mq: Stop using pointers for blk_mq_tags bitmap tags
  2021-08-09 14:29 ` [PATCH v2 11/11] blk-mq: Stop using pointers for blk_mq_tags bitmap tags John Garry
@ 2021-08-18  8:18   ` Ming Lei
  0 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2021-08-18  8:18 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On Mon, Aug 09, 2021 at 10:29:38PM +0800, John Garry wrote:
> Now that we use shared tags for shared sbitmap support, we don't require
> the tags sbitmap pointers, so drop them.
> 
> This essentially reverts commit 222a5ae03cdd ("blk-mq: Use pointers for
> blk_mq_tags bitmap tags").
> 
> Signed-off-by: John Garry <john.garry@huawei.com>

Reviewed-by: Ming Lei <ming.lei@redhat.com>

-- 
Ming


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 08/11] blk-mq: Add blk_mq_ops.init_request_no_hctx()
  2021-08-18  7:38   ` Ming Lei
@ 2021-08-18  8:46     ` John Garry
  0 siblings, 0 replies; 31+ messages in thread
From: John Garry @ 2021-08-18  8:46 UTC (permalink / raw)
  To: Ming Lei
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On 18/08/2021 08:38, Ming Lei wrote:
> On Mon, Aug 09, 2021 at 10:29:35PM +0800, John Garry wrote:
>> Add a variant of the init_request function which does not pass a hctx_idx
>> arg.
>>
>> This is important for shared sbitmap support, as it needs to be ensured for
>> introducing shared static rqs that the LLDD cannot think that requests
>> are associated with a specific HW queue.
>>
>> Signed-off-by: John Garry<john.garry@huawei.com>
>> ---
>>   block/blk-mq.c         | 15 ++++++++++-----
>>   include/linux/blk-mq.h |  7 +++++++
>>   2 files changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index f14cc2705f9b..4d6723cfa582 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -2427,13 +2427,15 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
>>   static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
>>   			       unsigned int hctx_idx, int node)
>>   {
>> -	int ret;
>> +	int ret = 0;
>>   
>> -	if (set->ops->init_request) {
>> +	if (set->ops->init_request)
>>   		ret = set->ops->init_request(set, rq, hctx_idx, node);
>> -		if (ret)
>> -			return ret;
>> -	}
>> +	else if (set->ops->init_request_no_hctx)
>> +		ret = set->ops->init_request_no_hctx(set, rq, node);

Hi Ming,

> The only shared sbitmap user of SCSI does not use passed hctx_idx, not
> sure we need such new callback.

Sure; actually most versions of the init_request callback don't use 
hctx_idx, or the numa_node arg.
> If you really want to do this, just wondering why not pass '-1' as
> hctx_idx in case of shared sbitmap?

Yeah, I did consider that. hctx_idx is an unsigned, and I generally 
don't like -1U - but that's no big deal. But I also didn't like how it 
relies on the driver init_request callback to check the value, which 
changes the semantics.

Obviously we don't add new versions of init_request for new block 
drivers which use shared sbitmap every day, and any new ones would get 
it right.

I suppose I can go the way you suggest - I just thought that this method 
was neat as well.

Thanks,
John


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping()
  2021-08-18  4:02   ` Ming Lei
@ 2021-08-18 12:00     ` John Garry
  2021-08-19  0:39       ` Ming Lei
  0 siblings, 1 reply; 31+ messages in thread
From: John Garry @ 2021-08-18 12:00 UTC (permalink / raw)
  To: Ming Lei
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

>>   
>> @@ -2346,8 +2345,11 @@ static void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set,
>>   void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
>>   		     unsigned int hctx_idx)
>>   {
>> +	struct blk_mq_tags *drv_tags;
>>   	struct page *page;
>>   
>> +		drv_tags = set->tags[hctx_idx];
> 

Hi Ming,

> Indent.

That's intentional, as we have this from a later patch:

void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags 
*tags, unsigned int hctx_idx)
{
	struct blk_mq_tags *drv_tags;
	struct page *page;

+	if (blk_mq_is_sbitmap_shared(set->flags))
+		drv_tags = set->shared_sbitmap_tags;
+	else
		drv_tags = set->tags[hctx_idx];

	...

	blk_mq_clear_rq_mapping(drv_tags, tags);

}

And it's just nice to not re-indent later.

> 
>> +
>>   	if (tags->static_rqs && set->ops->exit_request) {
>>   		int i;
>>   
>> @@ -2361,7 +2363,7 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
>>   		}
>>   	}
>>   
>> -	blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>> +	blk_mq_clear_rq_mapping(drv_tags, tags);
> 
> Maybe you can pass set->tags[hctx_idx] directly since there is only one
> reference on it.
> 

Again, intentional for a similar reason, as above.

Thanks,
John


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 10/11] blk-mq: Use shared tags for shared sbitmap support
  2021-08-18  8:16   ` Ming Lei
@ 2021-08-18 14:18     ` John Garry
  2021-08-23  8:55     ` John Garry
  1 sibling, 0 replies; 31+ messages in thread
From: John Garry @ 2021-08-18 14:18 UTC (permalink / raw)
  To: Ming Lei
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On 18/08/2021 09:16, Ming Lei wrote:
> On Mon, Aug 09, 2021 at 10:29:37PM +0800, John Garry wrote:
>> Currently we use separate sbitmap pairs and active_queues atomic_t for
>> shared sbitmap support.
>>
>> However a full set of static requests are used per HW queue, which is
>> quite wasteful, considering that the total number of requests usable at
>> any given time across all HW queues is limited by the shared sbitmap depth.
>>
>> As such, it is considerably more memory efficient in the case of shared
>> sbitmap to allocate a set of static rqs per tag set or request queue, and
>> not per HW queue.
>>
>> So replace the sbitmap pairs and active_queues atomic_t with a shared
>> tags per tagset and request queue.
> 
> This idea looks good and the current implementation is simplified a bit
> too.

Good, but you did hint at it :)

> 
>>
>> Continue to use term "shared sbitmap" for now, as the meaning is known.
> 
> I guess shared tags is better.

Yeah, agreed. My preference would be to change later, once things settle 
down.

As I see it, the only thing close to an ABI is the debugfs "flags" code, 
but that's debugfs, so not stable.

> 
>>
>> Signed-off-by: John Garry <john.garry@huawei.com>
>> ---
>>   block/blk-mq-sched.c   | 77 ++++++++++++++++++++-----------------
>>   block/blk-mq-tag.c     | 65 ++++++++++++-------------------
>>   block/blk-mq-tag.h     |  4 +-
>>   block/blk-mq.c         | 86 +++++++++++++++++++++++++-----------------
>>   block/blk-mq.h         |  8 ++--
>>   include/linux/blk-mq.h | 13 +++----
>>   include/linux/blkdev.h |  3 +-
>>   7 files changed, 131 insertions(+), 125 deletions(-)
>>
>> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c

...

>> +
>>   static int blk_mq_init_sched_shared_sbitmap(struct request_queue *queue)
>>   {
>>   	struct blk_mq_tag_set *set = queue->tag_set;
>> -	int alloc_policy = BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags);
>> -	struct blk_mq_hw_ctx *hctx;
>> -	int ret, i;
>> +	struct blk_mq_tags *tags;
>> +	int ret;
>>   
>>   	/*
>>   	 * Set initial depth at max so that we don't need to reallocate for
>>   	 * updating nr_requests.
>>   	 */
>> -	ret = blk_mq_init_bitmaps(&queue->sched_bitmap_tags,
>> -				  &queue->sched_breserved_tags,
>> -				  MAX_SCHED_RQ, set->reserved_tags,
>> -				  set->numa_node, alloc_policy);
>> -	if (ret)
>> -		return ret;
>> +	tags = queue->shared_sbitmap_tags = blk_mq_alloc_rq_map(set, 0,
>> +					  set->queue_depth,
>> +					  set->reserved_tags);
>> +	if (!queue->shared_sbitmap_tags)
>> +		return -ENOMEM;
>>   
>> -	queue_for_each_hw_ctx(queue, hctx, i) {
>> -		hctx->sched_tags->bitmap_tags =
>> -					&queue->sched_bitmap_tags;
>> -		hctx->sched_tags->breserved_tags =
>> -					&queue->sched_breserved_tags;
>> +	ret = blk_mq_alloc_rqs(set, tags, 0, set->queue_depth);
>> +	if (ret) {
>> +		blk_mq_exit_sched_shared_sbitmap(queue);
>> +		return ret;
> 
> There are two such patterns for allocating the rq map and request pool
> together; please put them into one helper (such as blk_mq_alloc_map_and_rqs())
> which can return the allocated tags and handle failure inline. Also we may
> convert the current users to this helper.

I'll have a look, but I will mention about "free" helper below

> 
>>   	}
>>   
>>   	blk_mq_tag_update_sched_shared_sbitmap(queue);
>> @@ -580,12 +589,6 @@ static int blk_mq_init_sched_shared_sbitmap(struct request_queue *queue)
>>   	return 0;
>>   }
>>   
>> -static void blk_mq_exit_sched_shared_sbitmap(struct request_queue *queue)
>> -{
>> -	sbitmap_queue_free(&queue->sched_bitmap_tags);
>> -	sbitmap_queue_free(&queue->sched_breserved_tags);
>> -}
>> -

...

>>   
>> -void blk_mq_free_tags(struct blk_mq_tags *tags, unsigned int flags)
>> +void blk_mq_free_tags(struct blk_mq_tags *tags)
>>   {
>> -	if (!blk_mq_is_sbitmap_shared(flags)) {
>> -		sbitmap_queue_free(tags->bitmap_tags);
>> -		sbitmap_queue_free(tags->breserved_tags);
>> -	}
>> +	sbitmap_queue_free(tags->bitmap_tags);
>> +	sbitmap_queue_free(tags->breserved_tags);
>>   	kfree(tags);
>>   }
>>   
>> @@ -604,18 +580,25 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
>>   		if (tdepth > MAX_SCHED_RQ)
>>   			return -EINVAL;
>>   
>> +		if (blk_mq_is_sbitmap_shared(set->flags)) {
>> +			/* No point in allowing this to happen */
>> +			if (tdepth > set->queue_depth)
>> +				return -EINVAL;
>> +			return 0;
>> +		}
> 
> The above looks wrong: it isn't unusual to see small queue depth
> hardware, while we often have a scheduler queue depth of 2 * set->queue_depth.

ok, I suppose you're right.

> 
>> +
>>   		new = blk_mq_alloc_rq_map(set, hctx->queue_num, tdepth,
>> -				tags->nr_reserved_tags, set->flags);
>> +				tags->nr_reserved_tags);
>>   		if (!new)
>>   			return -ENOMEM;
>>   		ret = blk_mq_alloc_rqs(set, new, hctx->queue_num, tdepth);
>>   		if (ret) {
>> -			blk_mq_free_rq_map(new, set->flags);
>> +			blk_mq_free_rq_map(new);
>>   			return -ENOMEM;
>>   		}
>>   
>>   		blk_mq_free_rqs(set, *tagsptr, hctx->queue_num);
>> -		blk_mq_free_rq_map(*tagsptr, set->flags);
>> +		blk_mq_free_rq_map(*tagsptr);
>>   		*tagsptr = new;
>>   	} else {
>>   		/*
>> @@ -631,12 +614,14 @@ int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
>>   
>>   void blk_mq_tag_resize_shared_sbitmap(struct blk_mq_tag_set *set, unsigned int size)
>>   {
>> -	sbitmap_queue_resize(&set->__bitmap_tags, size - set->reserved_tags);
>> +	struct blk_mq_tags *tags = set->shared_sbitmap_tags;
>> +
>> +	sbitmap_queue_resize(&tags->__bitmap_tags, size - set->reserved_tags);
>>   }
>>   
>>   void blk_mq_tag_update_sched_shared_sbitmap(struct request_queue *q)
>>   {
>> -	sbitmap_queue_resize(&q->sched_bitmap_tags,
>> +	sbitmap_queue_resize(q->shared_sbitmap_tags->bitmap_tags,
>>   			     q->nr_requests - q->tag_set->reserved_tags);
>>   }
>>   
>> diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
>> index 88f3c6485543..c9fc52ee07c4 100644
>> --- a/block/blk-mq-tag.h
>> +++ b/block/blk-mq-tag.h
>> @@ -30,8 +30,8 @@ struct blk_mq_tags {
>>   
>>   extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags,
>>   					unsigned int reserved_tags,
>> -					int node, unsigned int flags);
>> -extern void blk_mq_free_tags(struct blk_mq_tags *tags, unsigned int flags);
>> +					int node, int alloc_policy);
>> +extern void blk_mq_free_tags(struct blk_mq_tags *tags);
>>   extern int blk_mq_init_bitmaps(struct sbitmap_queue *bitmap_tags,
>>   			       struct sbitmap_queue *breserved_tags,
>>   			       unsigned int queue_depth,
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index 4d6723cfa582..d3dd5fab3426 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -2348,6 +2348,9 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
>>   	struct blk_mq_tags *drv_tags;
>>   	struct page *page;
>>   
>> +	if (blk_mq_is_sbitmap_shared(set->flags))
>> +		drv_tags = set->shared_sbitmap_tags;
>> +	else
>>   		drv_tags = set->tags[hctx_idx];
> 
> Here I guess you need to avoid a double ->exit_request()?

I'll check that doesn't occur, but I didn't think it did.

> 
>>   
>>   	if (tags->static_rqs && set->ops->exit_request) {
>> @@ -2377,21 +2380,20 @@ void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
>>   	}
>>   }
>>   
>> -void blk_mq_free_rq_map(struct blk_mq_tags *tags, unsigned int flags)
>> +void blk_mq_free_rq_map(struct blk_mq_tags *tags)
>>   {
>>   	kfree(tags->rqs);
>>   	tags->rqs = NULL;
>>   	kfree(tags->static_rqs);
>>   	tags->static_rqs = NULL;
>>   
>> -	blk_mq_free_tags(tags, flags);
>> +	blk_mq_free_tags(tags);
>>   }
>>   

...

>>   }
>> @@ -2877,11 +2886,11 @@ static bool __blk_mq_alloc_map_and_request(struct blk_mq_tag_set *set,
>>   static void blk_mq_free_map_and_requests(struct blk_mq_tag_set *set,
>>   					 unsigned int hctx_idx)
>>   {
>> -	unsigned int flags = set->flags;
>> -
>>   	if (set->tags && set->tags[hctx_idx]) {
>> -		blk_mq_free_rqs(set, set->tags[hctx_idx], hctx_idx);
>> -		blk_mq_free_rq_map(set->tags[hctx_idx], flags);
>> +		if (!blk_mq_is_sbitmap_shared(set->flags)) {
> 
> I remember you hate negative checks, :-)

Not always, but sometimes I think the code is harder to read with them.

> 
>> +			blk_mq_free_rqs(set, set->tags[hctx_idx], hctx_idx);
>> +			blk_mq_free_rq_map(set->tags[hctx_idx]);
> 
> We can add a helper such as blk_mq_free_map_and_rqs(); there seem to be
> several instances of this pattern.

ok, I can check, but I don't think it's useful in the blk-mq sched code 
as the tags and rqs are freed separately there, so I'm not sure how much 
we gain.

> 
>> +		}
>>   		set->tags[hctx_idx] = NULL;
>>   	}
>>   }
>> @@ -3348,6 +3357,21 @@ static int __blk_mq_alloc_rq_maps(struct blk_mq_tag_set *set)
>>   {
>>   	int i;
>>   
>> +	if (blk_mq_is_sbitmap_shared(set->flags)) {
>> +		int ret;
>> +
>> +		set->shared_sbitmap_tags = blk_mq_alloc_rq_map(set, 0,
>> +						  set->queue_depth,
>> +						  set->reserved_tags);
>> +		if (!set->shared_sbitmap_tags)
>> +			return -ENOMEM;
>> +
>> +		ret = blk_mq_alloc_rqs(set, set->shared_sbitmap_tags, 0,
>> +				       set->queue_depth);
>> +		if (ret)
>> +			goto out_free_sbitmap_tags;
>> +	}
>> +
>>   	for (i = 0; i < set->nr_hw_queues; i++) {
>>   		if (!__blk_mq_alloc_map_and_request(set, i))
>>   			goto out_unwind;
>> @@ -3359,6 +3383,11 @@ static int __blk_mq_alloc_rq_maps(struct blk_mq_tag_set *set)
>>   out_unwind:
>>   	while (--i >= 0)
>>   		blk_mq_free_map_and_requests(set, i);
>> +	if (blk_mq_is_sbitmap_shared(set->flags))
>> +		blk_mq_free_rqs(set, set->shared_sbitmap_tags, 0);
>> +out_free_sbitmap_tags:
>> +	if (blk_mq_is_sbitmap_shared(set->flags))
>> +		blk_mq_exit_shared_sbitmap(set);
> 
> Once a helper of blk_mq_alloc_map_and_rqs() is added, the above failure
> handling can be simplified too.
> 
> 

OK

Thanks a lot,
John
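[Editor's note: the failure handling discussed above follows the usual kernel
reverse-order unwind idiom — allocate the shared resource first, then the
per-queue resources, and on failure free whatever was allocated, in reverse.
Below is a self-contained user-space model of that flow; all names are toy
stand-ins, not the actual blk-mq functions.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_QUEUES 4

/* Toy resources: one shared allocation plus one per "hw queue". */
static void *shared;
static void *per_queue[NR_QUEUES];

/* Allocate queue i; fail_at lets the test inject a failure. */
static bool alloc_one(int i, int fail_at)
{
	if (i == fail_at)
		return false;
	per_queue[i] = malloc(16);
	return per_queue[i] != NULL;
}

/* Models the __blk_mq_alloc_rq_maps() flow with goto-based unwind. */
static int alloc_all(int fail_at)
{
	int i;

	shared = malloc(16);		/* shared resource first */
	if (!shared)
		return -1;

	for (i = 0; i < NR_QUEUES; i++) {
		if (!alloc_one(i, fail_at))
			goto out_unwind;
	}
	return 0;

out_unwind:
	while (--i >= 0) {		/* free queues allocated so far */
		free(per_queue[i]);
		per_queue[i] = NULL;
	}
	free(shared);			/* then the shared resource */
	shared = NULL;
	return -1;
}
```

Folding the shared-resource allocation and teardown into one helper, as
suggested in the review, would leave only the per-queue loop and a single
cleanup call here.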



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping()
  2021-08-18 12:00     ` John Garry
@ 2021-08-19  0:39       ` Ming Lei
  2021-08-19  7:32         ` John Garry
  0 siblings, 1 reply; 31+ messages in thread
From: Ming Lei @ 2021-08-19  0:39 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On Wed, Aug 18, 2021 at 01:00:13PM +0100, John Garry wrote:
> > > @@ -2346,8 +2345,11 @@ static void blk_mq_clear_rq_mapping(struct blk_mq_tag_set *set,
> > >   void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
> > >   		     unsigned int hctx_idx)
> > >   {
> > > +	struct blk_mq_tags *drv_tags;
> > >   	struct page *page;
> > > +		drv_tags = set->tags[hctx_idx];
> > 
> 
> Hi Ming,
> 
> > Indent.
> 
> That's intentional, as we have this from a later patch:
> 
> void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
> unsigned int hctx_idx)
> {
> 	struct blk_mq_tags *drv_tags;
> 	struct page *page;
> 
> +	if (blk_mq_is_sbitmap_shared(set->flags))
> +		drv_tags = set->shared_sbitmap_tags;
> +	else
> 		drv_tags = set->tags[hctx_idx];
> 
> 	...
> 
> 	blk_mq_clear_rq_mapping(drv_tags, tags);
> 
> }
> 
> And it's just nice to not re-indent later.

But this way is weird, and I don't think checkpatch.pl is happy with
it.

-- 
Ming


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping()
  2021-08-19  0:39       ` Ming Lei
@ 2021-08-19  7:32         ` John Garry
  2021-08-23  2:59           ` Ming Lei
  0 siblings, 1 reply; 31+ messages in thread
From: John Garry @ 2021-08-19  7:32 UTC (permalink / raw)
  To: Ming Lei
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On 19/08/2021 01:39, Ming Lei wrote:
>> That's intentional, as we have this from a later patch:
>>
>> void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
>> unsigned int hctx_idx)
>> {
>> 	struct blk_mq_tags *drv_tags;
>> 	struct page *page;
>>
>> +	if (blk_mq_is_sbitmap_shared(set->flags))
>> +		drv_tags = set->shared_sbitmap_tags;
>> +	else
>> 		drv_tags = set->tags[hctx_idx];
>>
>> 	...
>>
>> 	blk_mq_clear_rq_mapping(drv_tags, tags);
>>
>> }
>>
>> And it's just nice to not re-indent later.
> But this way is weird, and I don't think checkpatch.pl is happy with
> it.

There is the idea of trying not to remove/change code earlier in a series - 
I am taking it to an extreme! I can stop.

On another related topic, how about this change also:

---8<---
void blk_mq_clear_rq_mapping(struct blk_mq_tags *drv_tags,
			     struct blk_mq_tags *tags)
  {

+	/* There is no need to clear a driver tags own mapping */
+	if (drv_tags == tags)
+		return;
--->8---

Thanks,
John

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping()
  2021-08-19  7:32         ` John Garry
@ 2021-08-23  2:59           ` Ming Lei
  0 siblings, 0 replies; 31+ messages in thread
From: Ming Lei @ 2021-08-23  2:59 UTC (permalink / raw)
  To: John Garry
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On Thu, Aug 19, 2021 at 08:32:20AM +0100, John Garry wrote:
> On 19/08/2021 01:39, Ming Lei wrote:
> > > That's intentional, as we have this from a later patch:
> > > 
> > > void blk_mq_free_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
> > > unsigned int hctx_idx)
> > > {
> > > 	struct blk_mq_tags *drv_tags;
> > > 	struct page *page;
> > > 
> > > +	if (blk_mq_is_sbitmap_shared(set->flags))
> > > +		drv_tags = set->shared_sbitmap_tags;
> > > +	else
> > > 		drv_tags = set->tags[hctx_idx];
> > > 
> > > 	...
> > > 
> > > 	blk_mq_clear_rq_mapping(drv_tags, tags);
> > > 
> > > }
> > > 
> > > And it's just nice to not re-indent later.
> > But this way is weird, and I don't think checkpatch.pl is happy with
> > it.
> 
> There is the idea to try to not remove/change code earlier in a series - I
> am taking it to an extreme! I can stop.
> 
> On another related topic, how about this change also:
> 
> ---8<---
> void blk_mq_clear_rq_mapping(struct blk_mq_tags *drv_tags,
> 			     struct blk_mq_tags *tags)
>  {
> 
> +	/* There is no need to clear a driver tags own mapping */
> +	if (drv_tags == tags)
> +		return;
> --->8---

The change itself is correct; there is no need to clear the driver tags'
->rqs[] mapping since all request queues have been cleaned up by the time
the tagset is freed.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 10/11] blk-mq: Use shared tags for shared sbitmap support
  2021-08-18  8:16   ` Ming Lei
  2021-08-18 14:18     ` John Garry
@ 2021-08-23  8:55     ` John Garry
  1 sibling, 0 replies; 31+ messages in thread
From: John Garry @ 2021-08-23  8:55 UTC (permalink / raw)
  To: Ming Lei
  Cc: axboe, jejb, martin.petersen, linux-block, linux-kernel,
	linux-scsi, kashyap.desai, hare

On 18/08/2021 09:16, Ming Lei wrote:
>> ret = blk_mq_alloc_rqs(set, tags, 0, set->queue_depth);
>> +	if (ret) {
>> +		blk_mq_exit_sched_shared_sbitmap(queue);
>> +		return ret;
> There are two such patterns for allocating the rq map and request pool
> together; please put them into one helper (such as blk_mq_alloc_map_and_rqs())
> which can return the allocated tags and handle failure inline. Also we may
> convert the current users to this helper.
> 
>>   	}

Hi Ming,

Do you have a preference for the API:

int blk_mq_alloc_map_and_rqs(struct blk_mq_tag_set *set, struct 
blk_mq_tags **tags, unsigned int hctx_idx, unsigned int depth);

void blk_mq_free_map_and_requests(struct blk_mq_tag_set *set,
  struct blk_mq_tags **tags, unsigned int hctx_idx);


or

struct blk_mq_tags *blk_mq_alloc_map_and_rqs(struct blk_mq_tag_set *set,
unsigned int hctx_idx, unsigned int depth);

void blk_mq_free_map_and_requests(struct blk_mq_tag_set *set,
  struct blk_mq_tags *tags, unsigned int hctx_idx);


The advantage of the first is that we don't need the pattern:
blk_mq_free_map_and_requests(tags)
tags = NULL;

But then passing struct blk_mq_tags ** is a bit overly complicated.

Thanks,
John
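[Editor's note: the trade-off John is weighing can be seen in a stripped-down
user-space sketch. The struct and function names are toy stand-ins assumed for
illustration: the double-pointer form lets the free helper clear the caller's
pointer, while the returning form has the simpler signature but leaves the
"free then NULL" pattern at each call site.]

```c
#include <assert.h>
#include <stdlib.h>

struct tags { int depth; };	/* toy stand-in for struct blk_mq_tags */

/* Shape 1: out-parameter. The free helper can clear the caller's
 * pointer, so call sites never hold a dangling reference. */
static int alloc_map_and_rqs_out(struct tags **tags, int depth)
{
	*tags = malloc(sizeof(**tags));
	if (!*tags)
		return -1;
	(*tags)->depth = depth;
	return 0;
}

static void free_map_and_rqs_out(struct tags **tags)
{
	free(*tags);
	*tags = NULL;		/* caller's pointer cleared here */
}

/* Shape 2: return the pointer. Simpler signature, but every caller
 * must remember to NULL its own copy after freeing. */
static struct tags *alloc_map_and_rqs_ret(int depth)
{
	struct tags *t = malloc(sizeof(*t));

	if (t)
		t->depth = depth;
	return t;
}

static void free_map_and_rqs_ret(struct tags *t)
{
	free(t);
}
```

Either shape works; the real question is whether clearing the stale pointer
belongs inside the helper or at each call site.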

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2021-08-23  8:51 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-09 14:29 [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry
2021-08-09 14:29 ` [PATCH v2 01/11] blk-mq: Change rqs check in blk_mq_free_rqs() John Garry
2021-08-09 14:29 ` [PATCH v2 02/11] block: Rename BLKDEV_MAX_RQ -> BLKDEV_DEFAULT_RQ John Garry
2021-08-09 14:29 ` [PATCH v2 03/11] blk-mq: Relocate shared sbitmap resize in blk_mq_update_nr_requests() John Garry
2021-08-18  3:49   ` Ming Lei
2021-08-09 14:29 ` [PATCH v2 04/11] blk-mq: Invert check " John Garry
2021-08-18  3:53   ` Ming Lei
2021-08-09 14:29 ` [PATCH v2 05/11] blk-mq-sched: Rename blk_mq_sched_alloc_{tags -> map_and_request}() John Garry
2021-08-18  3:55   ` Ming Lei
2021-08-09 14:29 ` [PATCH v2 06/11] blk-mq: Pass driver tags to blk_mq_clear_rq_mapping() John Garry
2021-08-09 17:00   ` kernel test robot
2021-08-09 17:10     ` John Garry
2021-08-09 19:55   ` kernel test robot
2021-08-18  4:02   ` Ming Lei
2021-08-18 12:00     ` John Garry
2021-08-19  0:39       ` Ming Lei
2021-08-19  7:32         ` John Garry
2021-08-23  2:59           ` Ming Lei
2021-08-09 14:29 ` [PATCH v2 07/11] blk-mq: Add blk_mq_tag_update_sched_shared_sbitmap() John Garry
2021-08-18  7:28   ` Ming Lei
2021-08-09 14:29 ` [PATCH v2 08/11] blk-mq: Add blk_mq_ops.init_request_no_hctx() John Garry
2021-08-18  7:38   ` Ming Lei
2021-08-18  8:46     ` John Garry
2021-08-09 14:29 ` [PATCH v2 09/11] scsi: Set blk_mq_ops.init_request_no_hctx John Garry
2021-08-09 14:29 ` [PATCH v2 10/11] blk-mq: Use shared tags for shared sbitmap support John Garry
2021-08-18  8:16   ` Ming Lei
2021-08-18 14:18     ` John Garry
2021-08-23  8:55     ` John Garry
2021-08-09 14:29 ` [PATCH v2 11/11] blk-mq: Stop using pointers for blk_mq_tags bitmap tags John Garry
2021-08-18  8:18   ` Ming Lei
2021-08-17  7:02 ` [PATCH v2 00/11] blk-mq: Reduce static requests memory footprint for shared sbitmap John Garry

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).