LKML Archive on lore.kernel.org
* [PATCH v5 0/6] handle unexpected message from server
@ 2021-09-09 14:12 Yu Kuai
2021-09-09 14:12 ` [PATCH v5 1/6] nbd: don't handle response without a corresponding request message Yu Kuai
` (5 more replies)
0 siblings, 6 replies; 24+ messages in thread
From: Yu Kuai @ 2021-09-09 14:12 UTC (permalink / raw)
To: axboe, josef, ming.lei, hch
Cc: linux-block, linux-kernel, nbd, yukuai3, yi.zhang
This patch set fixes a possible client oops when the nbd server sends
an unexpected message to the client. For example, our syzkaller run
reported a use-after-free (UAF) in nbd_read_stat():
Call trace:
dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x144/0x1b4 lib/dump_stack.c:118
print_address_description+0x68/0x2d0 mm/kasan/report.c:253
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x134/0x2f0 mm/kasan/report.c:409
check_memory_region_inline mm/kasan/kasan.c:260 [inline]
__asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
__read_once_size include/linux/compiler.h:193 [inline]
blk_mq_rq_state block/blk-mq.h:106 [inline]
blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
nbd_read_stat drivers/block/nbd.c:670 [inline]
recv_work+0x1bc/0x890 drivers/block/nbd.c:749
process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
kthread+0x1d8/0x1e0 kernel/kthread.c:255
ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174
1) At first, a normal io is submitted and completed with a scheduler:
internal_tag = blk_mq_get_tag -> get tag from sched_tags
 blk_mq_rq_ctx_init
  sched_tags->rq[internal_tag] = sched_tags->static_rq[internal_tag]
...
blk_mq_get_driver_tag
 __blk_mq_get_driver_tag -> get tag from tags
 tags->rq[tag] = sched_tags->static_rq[internal_tag]
So both tags->rq[tag] and sched_tags->rq[internal_tag] point to the
request sched_tags->static_rq[internal_tag], even after the io has
finished.
2) The nbd server sends a reply with a random tag directly:
recv_work
 nbd_read_stat
  blk_mq_tag_to_rq(tags, tag)
   rq = tags->rq[tag]
3) If sched_tags->static_rq is freed:
blk_mq_sched_free_requests
 blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
  -> step 2) accesses rq before the rq mapping is cleared
  blk_mq_clear_rq_mapping(set, tags, hctx_idx);
  __free_pages() -> rq is freed here
4) Then nbd continues to use the freed request in nbd_read_stat().
Changes in v5:
- move patch 1 & 2 of v4 later in the series (now patch 4 & 5 in v5)
- add some comments in patch 5
Changes in v4:
- change the name of the patchset, since uaf is not the only problem
if the server sends an unexpected reply message.
- instead of adding a new interface, use blk_mq_find_and_get_req().
- add patch 5 to this series
Changes in v3:
- v2 didn't fix the problem completely, so add patches 3-4 to this series.
- modify descriptions.
- patch 5 is just a cleanup
Changes in v2:
- as Bart suggested, add a new helper function for drivers to get
request by tag.
Yu Kuai (6):
nbd: don't handle response without a corresponding request message
nbd: make sure request completion won't concurrent
nbd: check sock index in nbd_read_stat()
blk-mq: export two symbols to get request by tag
nbd: convert to use blk_mq_find_and_get_req()
nbd: don't start request if nbd_queue_rq() failed
block/blk-mq-tag.c | 5 +++--
block/blk-mq.c | 1 +
drivers/block/nbd.c | 51 ++++++++++++++++++++++++++++++++++++------
include/linux/blk-mq.h | 3 +++
4 files changed, 51 insertions(+), 9 deletions(-)
--
2.31.1
* [PATCH v5 1/6] nbd: don't handle response without a corresponding request message
2021-09-09 14:12 [PATCH v5 0/6] handle unexpected message from server Yu Kuai
@ 2021-09-09 14:12 ` Yu Kuai
2021-09-14 0:54 ` Ming Lei
2021-09-09 14:12 ` [PATCH v5 2/6] nbd: make sure request completion won't concurrent Yu Kuai
` (4 subsequent siblings)
5 siblings, 1 reply; 24+ messages in thread
From: Yu Kuai @ 2021-09-09 14:12 UTC (permalink / raw)
To: axboe, josef, ming.lei, hch
Cc: linux-block, linux-kernel, nbd, yukuai3, yi.zhang
While handling a response message from the server, nbd_read_stat() will
try to get the request by tag and then complete it. However, this is
problematic if nbd hasn't sent a corresponding request message yet:
t1                      t2
submit_bio
 nbd_queue_rq
  blk_mq_start_request
                        recv_work
                         nbd_read_stat
                          blk_mq_tag_to_rq
                         blk_mq_complete_request
  nbd_send_cmd
Thus add a new cmd flag 'NBD_CMD_INFLIGHT'. It is set once
nbd_send_cmd() succeeds and is checked in nbd_read_stat().
Note that this patch can't fix the case where blk_mq_tag_to_rq()
returns a freed request; that will be fixed in the following
patches.
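The idea can be modelled in userspace as follows. This is only a minimal sketch, not the driver code: `struct model_cmd`, `model_send()` and `model_reply()` are hypothetical names, a pthread mutex stands in for cmd->lock, and a bool stands in for the NBD_CMD_INFLIGHT bit.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct nbd_cmd; not the kernel type. */
struct model_cmd {
	pthread_mutex_t lock;	/* plays the role of cmd->lock */
	bool inflight;		/* plays the role of NBD_CMD_INFLIGHT */
};

/*
 * Models the send path: the command is marked in flight only if the send
 * succeeded (send_ret == 0). The caller passes in the send result.
 */
static int model_send(struct model_cmd *cmd, int send_ret)
{
	pthread_mutex_lock(&cmd->lock);
	if (send_ret == 0)
		cmd->inflight = true;
	pthread_mutex_unlock(&cmd->lock);
	return send_ret;
}

/*
 * Models the reply path: a reply is accepted only if a request was sent,
 * and accepting it clears the flag so a second reply is rejected as well.
 */
static int model_reply(struct model_cmd *cmd)
{
	int ret = 0;

	pthread_mutex_lock(&cmd->lock);
	if (!cmd->inflight)
		ret = -ENOENT;	/* nothing was sent for this command */
	else
		cmd->inflight = false;
	pthread_mutex_unlock(&cmd->lock);
	return ret;
}
```

In this model a reply that arrives before model_send() has succeeded gets -ENOENT, which corresponds to the "Suspicious reply" branch added to nbd_read_stat().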
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
drivers/block/nbd.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 5170a630778d..04861b585b62 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -126,6 +126,12 @@ struct nbd_device {
};
#define NBD_CMD_REQUEUED 1
+/*
+ * This flag will be set if nbd_queue_rq() succeeds, and will be checked and
+ * cleared in completion. Both setting and clearing of the flag are protected
+ * by cmd->lock.
+ */
+#define NBD_CMD_INFLIGHT 2
struct nbd_cmd {
struct nbd_device *nbd;
@@ -400,6 +406,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
if (!mutex_trylock(&cmd->lock))
return BLK_EH_RESET_TIMER;
+ __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
if (!refcount_inc_not_zero(&nbd->config_refs)) {
cmd->status = BLK_STS_TIMEOUT;
mutex_unlock(&cmd->lock);
@@ -729,6 +736,12 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
cmd = blk_mq_rq_to_pdu(req);
mutex_lock(&cmd->lock);
+ if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+ dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d (status %u flags %lu)",
+ tag, cmd->status, cmd->flags);
+ ret = -ENOENT;
+ goto out;
+ }
if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
@@ -829,6 +842,7 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved)
mutex_lock(&cmd->lock);
cmd->status = BLK_STS_IOERR;
+ __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
mutex_unlock(&cmd->lock);
blk_mq_complete_request(req);
@@ -964,7 +978,13 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
* returns EAGAIN can be retried on a different socket.
*/
ret = nbd_send_cmd(nbd, cmd, index);
- if (ret == -EAGAIN) {
+ /*
+ * Access to this flag is protected by cmd->lock, thus it's safe to set
+ * the flag after nbd_send_cmd() succeed to send request to server.
+ */
+ if (!ret)
+ __set_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+ else if (ret == -EAGAIN) {
dev_err_ratelimited(disk_to_dev(nbd->disk),
"Request send failed, requeueing\n");
nbd_mark_nsock_dead(nbd, nsock, 1);
--
2.31.1
* [PATCH v5 2/6] nbd: make sure request completion won't concurrent
2021-09-09 14:12 [PATCH v5 0/6] handle unexpected message from server Yu Kuai
2021-09-09 14:12 ` [PATCH v5 1/6] nbd: don't handle response without a corresponding request message Yu Kuai
@ 2021-09-09 14:12 ` Yu Kuai
2021-09-14 0:57 ` Ming Lei
2021-09-09 14:12 ` [PATCH v5 3/6] nbd: check sock index in nbd_read_stat() Yu Kuai
` (3 subsequent siblings)
5 siblings, 1 reply; 24+ messages in thread
From: Yu Kuai @ 2021-09-09 14:12 UTC (permalink / raw)
To: axboe, josef, ming.lei, hch
Cc: linux-block, linux-kernel, nbd, yukuai3, yi.zhang
Commit cddce0116058 ("nbd: Aovid double completion of a request")
tried to prevent nbd_clear_que() and recv_work() from completing a
request concurrently. However, the problem still exists:
t1                    t2                     t3
nbd_disconnect_and_put
 flush_workqueue
                      recv_work
                       blk_mq_complete_request
                        blk_mq_complete_request_remote -> this is true
                         WRITE_ONCE(rq->state, MQ_RQ_COMPLETE)
                          blk_mq_raise_softirq
                                             blk_done_softirq
                                              blk_complete_reqs
                                               nbd_complete_rq
                                                blk_mq_end_request
                                                 blk_mq_free_request
                                                  WRITE_ONCE(rq->state, MQ_RQ_IDLE)
  nbd_clear_que
   blk_mq_tagset_busy_iter
    nbd_clear_req
                                                   __blk_mq_free_request
                                                    blk_mq_put_tag
     blk_mq_complete_request -> complete again
There are three places where a request can be completed in nbd:
recv_work(), nbd_clear_que() and nbd_xmit_timeout(). Since they
all hold cmd->lock before completing the request, it's easy to
avoid the problem by setting and checking a cmd flag.
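A minimal userspace sketch of the "first caller wins" rule follows. It is an illustration under assumptions, not the driver code: `struct model_cmd` and `model_try_complete()` are hypothetical names, a pthread mutex stands in for cmd->lock, and a counter stands in for calling blk_mq_complete_request().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct nbd_cmd; not the kernel type. */
struct model_cmd {
	pthread_mutex_t lock;	/* plays the role of cmd->lock */
	bool inflight;		/* plays the role of NBD_CMD_INFLIGHT */
	int completions;	/* how many times the request was completed */
};

/*
 * Any completion path (reply, clear, timeout) calls this. Only the caller
 * that observes the flag set and clears it may complete the request.
 */
static bool model_try_complete(struct model_cmd *cmd)
{
	bool won;

	pthread_mutex_lock(&cmd->lock);
	won = cmd->inflight;	/* test ... */
	cmd->inflight = false;	/* ... and clear, both under the lock */
	pthread_mutex_unlock(&cmd->lock);

	if (won)
		cmd->completions++;	/* stands in for blk_mq_complete_request() */
	return won;
}
```

Because the test and the clear happen under the same lock, at most one of the three paths can see the flag set, so the completion counter in this model never exceeds one per submitted command.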
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
drivers/block/nbd.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 04861b585b62..550c8dc438ac 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -406,7 +406,11 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
if (!mutex_trylock(&cmd->lock))
return BLK_EH_RESET_TIMER;
- __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+ if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+ mutex_unlock(&cmd->lock);
+ return BLK_EH_DONE;
+ }
+
if (!refcount_inc_not_zero(&nbd->config_refs)) {
cmd->status = BLK_STS_TIMEOUT;
mutex_unlock(&cmd->lock);
@@ -842,7 +846,10 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved)
mutex_lock(&cmd->lock);
cmd->status = BLK_STS_IOERR;
- __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+ if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+ mutex_unlock(&cmd->lock);
+ return true;
+ }
mutex_unlock(&cmd->lock);
blk_mq_complete_request(req);
--
2.31.1
* [PATCH v5 3/6] nbd: check sock index in nbd_read_stat()
2021-09-09 14:12 [PATCH v5 0/6] handle unexpected message from server Yu Kuai
2021-09-09 14:12 ` [PATCH v5 1/6] nbd: don't handle response without a corresponding request message Yu Kuai
2021-09-09 14:12 ` [PATCH v5 2/6] nbd: make sure request completion won't concurrent Yu Kuai
@ 2021-09-09 14:12 ` Yu Kuai
2021-09-09 14:12 ` [PATCH v5 4/6] blk-mq: export two symbols to get request by tag Yu Kuai
` (2 subsequent siblings)
5 siblings, 0 replies; 24+ messages in thread
From: Yu Kuai @ 2021-09-09 14:12 UTC (permalink / raw)
To: axboe, josef, ming.lei, hch
Cc: linux-block, linux-kernel, nbd, yukuai3, yi.zhang
The sock on which the client sends a request in nbd_send_cmd() and the
sock on which it receives the reply in nbd_read_stat() should be the same.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
drivers/block/nbd.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 550c8dc438ac..6d8cbf8be231 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
ret = -ENOENT;
goto out;
}
+ if (cmd->index != index) {
+ dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
+ tag, index, cmd->index);
+ }
if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
--
2.31.1
* [PATCH v5 4/6] blk-mq: export two symbols to get request by tag
2021-09-09 14:12 [PATCH v5 0/6] handle unexpected message from server Yu Kuai
` (2 preceding siblings ...)
2021-09-09 14:12 ` [PATCH v5 3/6] nbd: check sock index in nbd_read_stat() Yu Kuai
@ 2021-09-09 14:12 ` Yu Kuai
2021-09-09 14:12 ` [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req() Yu Kuai
2021-09-09 14:12 ` [PATCH v5 6/6] nbd: don't start request if nbd_queue_rq() failed Yu Kuai
5 siblings, 0 replies; 24+ messages in thread
From: Yu Kuai @ 2021-09-09 14:12 UTC (permalink / raw)
To: axboe, josef, ming.lei, hch
Cc: linux-block, linux-kernel, nbd, yukuai3, yi.zhang
nbd has a defect: blk_mq_tag_to_rq() might return a freed request in
nbd_read_stat(). Fixing this inside the nbd driver would need a new
mechanism, which is rather complicated.
Thus use blk_mq_find_and_get_req() to replace blk_mq_tag_to_rq(). It
makes sure the returned request is not freed, and then we can do more
checking while 'cmd->lock' is held.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
block/blk-mq-tag.c | 5 +++--
block/blk-mq.c | 1 +
include/linux/blk-mq.h | 3 +++
3 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 86f87346232a..b4f66b75b4d1 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -200,8 +200,8 @@ struct bt_iter_data {
bool reserved;
};
-static struct request *blk_mq_find_and_get_req(struct blk_mq_tags *tags,
- unsigned int bitnr)
+struct request *blk_mq_find_and_get_req(struct blk_mq_tags *tags,
+ unsigned int bitnr)
{
struct request *rq;
unsigned long flags;
@@ -213,6 +213,7 @@ static struct request *blk_mq_find_and_get_req(struct blk_mq_tags *tags,
spin_unlock_irqrestore(&tags->lock, flags);
return rq;
}
+EXPORT_SYMBOL(blk_mq_find_and_get_req);
static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
{
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 08626cb0534c..5113aa3788a2 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -916,6 +916,7 @@ void blk_mq_put_rq_ref(struct request *rq)
else if (refcount_dec_and_test(&rq->ref))
__blk_mq_free_request(rq);
}
+EXPORT_SYMBOL(blk_mq_put_rq_ref);
static bool blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
struct request *rq, void *priv, bool reserved)
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 13ba1861e688..03e02990609d 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -637,4 +637,7 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio);
void blk_mq_hctx_set_fq_lock_class(struct blk_mq_hw_ctx *hctx,
struct lock_class_key *key);
+void blk_mq_put_rq_ref(struct request *rq);
+struct request *blk_mq_find_and_get_req(struct blk_mq_tags *tags,
+ unsigned int bitnr);
#endif
--
2.31.1
* [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-09 14:12 [PATCH v5 0/6] handle unexpected message from server Yu Kuai
` (3 preceding siblings ...)
2021-09-09 14:12 ` [PATCH v5 4/6] blk-mq: export two symbols to get request by tag Yu Kuai
@ 2021-09-09 14:12 ` Yu Kuai
2021-09-14 1:11 ` Ming Lei
2021-09-09 14:12 ` [PATCH v5 6/6] nbd: don't start request if nbd_queue_rq() failed Yu Kuai
5 siblings, 1 reply; 24+ messages in thread
From: Yu Kuai @ 2021-09-09 14:12 UTC (permalink / raw)
To: axboe, josef, ming.lei, hch
Cc: linux-block, linux-kernel, nbd, yukuai3, yi.zhang
blk_mq_tag_to_rq() is only guaranteed to return a valid request in the
following situation:
1) the client sends a request message to the server first
submit_bio
...
 blk_mq_get_tag
...
 blk_mq_get_driver_tag
...
 nbd_queue_rq
  nbd_handle_cmd
   nbd_send_cmd
2) the client receives a response message from the server
recv_work
 nbd_read_stat
  blk_mq_tag_to_rq
If step 1) is missing, blk_mq_tag_to_rq() will return a stale
request, which might be freed. Thus convert to use
blk_mq_find_and_get_req() to make sure the returned request is not
freed.
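The difference between a plain tag lookup and a "find and get" lookup can be sketched in userspace as below. This is a simplified model under assumptions, not the blk-mq implementation: all `model_*` names are hypothetical, a pthread mutex stands in for tags->lock, and a plain int stands in for the request refcount (the kernel uses an atomic refcount).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MODEL_NR_TAGS 4

/* Hypothetical stand-in for struct request. */
struct model_rq {
	int ref;	/* 0 means the request is not in use */
};

/* Hypothetical stand-in for struct blk_mq_tags. */
struct model_tags {
	pthread_mutex_t lock;			/* plays the role of tags->lock */
	struct model_rq *rqs[MODEL_NR_TAGS];	/* tag -> request mapping */
};

/*
 * Look up a request by tag and take a reference on it. Returns NULL if the
 * tag is out of range, the mapping was cleared, or the request is idle.
 */
static struct model_rq *model_find_and_get(struct model_tags *tags,
					   unsigned int tag)
{
	struct model_rq *rq = NULL;

	if (tag >= MODEL_NR_TAGS)
		return NULL;

	pthread_mutex_lock(&tags->lock);
	if (tags->rqs[tag] && tags->rqs[tag]->ref > 0) {
		rq = tags->rqs[tag];
		rq->ref++;
	}
	pthread_mutex_unlock(&tags->lock);
	return rq;
}

/* Clear the mapping under the same lock before the memory is released. */
static void model_clear_mapping(struct model_tags *tags, unsigned int tag)
{
	if (tag >= MODEL_NR_TAGS)
		return;

	pthread_mutex_lock(&tags->lock);
	tags->rqs[tag] = NULL;
	pthread_mutex_unlock(&tags->lock);
}

/* Drop the reference taken by model_find_and_get(). */
static void model_put(struct model_rq *rq)
{
	rq->ref--;
}
```

In this model a lookup that runs after model_clear_mapping() returns NULL instead of a stale pointer, and a lookup that succeeds holds a reference until model_put() is called.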
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
drivers/block/nbd.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 6d8cbf8be231..d298e2b9e6ee 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -729,12 +729,13 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
tag = nbd_handle_to_tag(handle);
hwq = blk_mq_unique_tag_to_hwq(tag);
if (hwq < nbd->tag_set.nr_hw_queues)
- req = blk_mq_tag_to_rq(nbd->tag_set.tags[hwq],
- blk_mq_unique_tag_to_tag(tag));
+ req = blk_mq_find_and_get_req(nbd->tag_set.tags[hwq],
+ blk_mq_unique_tag_to_tag(tag));
if (!req || !blk_mq_request_started(req)) {
dev_err(disk_to_dev(nbd->disk), "Unexpected reply (%d) %p\n",
tag, req);
- return ERR_PTR(-ENOENT);
+ ret = -ENOENT;
+ goto put_req;
}
trace_nbd_header_received(req, handle);
cmd = blk_mq_rq_to_pdu(req);
@@ -806,6 +807,14 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
out:
trace_nbd_payload_received(req, handle);
mutex_unlock(&cmd->lock);
+put_req:
+ /*
+ * It's safe to drop refcnt here because request completion won't be
+ * concurrent, thus if nbd_read_stat() succeeds, the request refcnt
+ * won't drop to zero here.
+ */
+ if (req)
+ blk_mq_put_rq_ref(req);
return ret ? ERR_PTR(ret) : cmd;
}
--
2.31.1
* [PATCH v5 6/6] nbd: don't start request if nbd_queue_rq() failed
2021-09-09 14:12 [PATCH v5 0/6] handle unexpected message from server Yu Kuai
` (4 preceding siblings ...)
2021-09-09 14:12 ` [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req() Yu Kuai
@ 2021-09-09 14:12 ` Yu Kuai
5 siblings, 0 replies; 24+ messages in thread
From: Yu Kuai @ 2021-09-09 14:12 UTC (permalink / raw)
To: axboe, josef, ming.lei, hch
Cc: linux-block, linux-kernel, nbd, yukuai3, yi.zhang
Currently, blk_mq_end_request() will be called if nbd_queue_rq()
fails, so starting the request in that situation is useless.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
drivers/block/nbd.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index d298e2b9e6ee..7a963c4ec0d1 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -943,7 +943,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
if (!refcount_inc_not_zero(&nbd->config_refs)) {
dev_err_ratelimited(disk_to_dev(nbd->disk),
"Socks array is empty\n");
- blk_mq_start_request(req);
return -EINVAL;
}
config = nbd->config;
@@ -952,7 +951,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
dev_err_ratelimited(disk_to_dev(nbd->disk),
"Attempted send on invalid socket\n");
nbd_config_put(nbd);
- blk_mq_start_request(req);
return -EINVAL;
}
cmd->status = BLK_STS_OK;
@@ -976,7 +974,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
*/
sock_shutdown(nbd);
nbd_config_put(nbd);
- blk_mq_start_request(req);
return -EIO;
}
goto again;
--
2.31.1
* Re: [PATCH v5 1/6] nbd: don't handle response without a corresponding request message
2021-09-09 14:12 ` [PATCH v5 1/6] nbd: don't handle response without a corresponding request message Yu Kuai
@ 2021-09-14 0:54 ` Ming Lei
0 siblings, 0 replies; 24+ messages in thread
From: Ming Lei @ 2021-09-14 0:54 UTC (permalink / raw)
To: Yu Kuai; +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On Thu, Sep 09, 2021 at 10:12:51PM +0800, Yu Kuai wrote:
> While handling a response message from server, nbd_read_stat() will
> try to get request by tag, and then complete the request. However,
> this is problematic if nbd haven't sent a corresponding request
> message:
>
> t1 t2
> submit_bio
> nbd_queue_rq
> blk_mq_start_request
> recv_work
> nbd_read_stat
> blk_mq_tag_to_rq
> blk_mq_complete_request
> nbd_send_cmd
>
> Thus add a new cmd flag 'NBD_CMD_INFLIGHT', it will be set in
> nbd_send_cmd() and checked in nbd_read_stat().
>
> Noted that this patch can't fix that blk_mq_tag_to_rq() might
> return a freed request, and this will be fixed in following
> patches.
>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Looks fine:
Reviewed-by: Ming Lei <ming.lei@redhat.com>
--
Ming
* Re: [PATCH v5 2/6] nbd: make sure request completion won't concurrent
2021-09-09 14:12 ` [PATCH v5 2/6] nbd: make sure request completion won't concurrent Yu Kuai
@ 2021-09-14 0:57 ` Ming Lei
2021-09-14 3:11 ` yukuai (C)
0 siblings, 1 reply; 24+ messages in thread
From: Ming Lei @ 2021-09-14 0:57 UTC (permalink / raw)
To: Yu Kuai; +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On Thu, Sep 09, 2021 at 10:12:52PM +0800, Yu Kuai wrote:
> commit cddce0116058 ("nbd: Aovid double completion of a request")
> try to fix that nbd_clear_que() and recv_work() can complete a
> request concurrently. However, the problem still exists:
>
> t1 t2 t3
>
> nbd_disconnect_and_put
> flush_workqueue
> recv_work
> blk_mq_complete_request
> blk_mq_complete_request_remote -> this is true
> WRITE_ONCE(rq->state, MQ_RQ_COMPLETE)
> blk_mq_raise_softirq
> blk_done_softirq
> blk_complete_reqs
> nbd_complete_rq
> blk_mq_end_request
> blk_mq_free_request
> WRITE_ONCE(rq->state, MQ_RQ_IDLE)
> nbd_clear_que
> blk_mq_tagset_busy_iter
> nbd_clear_req
> __blk_mq_free_request
> blk_mq_put_tag
> blk_mq_complete_request -> complete again
>
> There are three places where request can be completed in nbd:
> recv_work(), nbd_clear_que() and nbd_xmit_timeout(). Since they
> all hold cmd->lock before completing the request, it's easy to
> avoid the problem by setting and checking a cmd flag.
>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
> drivers/block/nbd.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 04861b585b62..550c8dc438ac 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -406,7 +406,11 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
> if (!mutex_trylock(&cmd->lock))
> return BLK_EH_RESET_TIMER;
>
> - __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
> + if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
> + mutex_unlock(&cmd->lock);
> + return BLK_EH_DONE;
> + }
> +
> if (!refcount_inc_not_zero(&nbd->config_refs)) {
> cmd->status = BLK_STS_TIMEOUT;
> mutex_unlock(&cmd->lock);
> @@ -842,7 +846,10 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved)
>
> mutex_lock(&cmd->lock);
> cmd->status = BLK_STS_IOERR;
> - __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
> + if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
> + mutex_unlock(&cmd->lock);
> + return true;
> + }
> mutex_unlock(&cmd->lock);
If this request has already been completed from another code path, ->status
shouldn't be updated here; it may have completed successfully.
--
Ming
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-09 14:12 ` [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req() Yu Kuai
@ 2021-09-14 1:11 ` Ming Lei
2021-09-14 3:11 ` yukuai (C)
0 siblings, 1 reply; 24+ messages in thread
From: Ming Lei @ 2021-09-14 1:11 UTC (permalink / raw)
To: Yu Kuai; +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On Thu, Sep 09, 2021 at 10:12:55PM +0800, Yu Kuai wrote:
> blk_mq_tag_to_rq() can only ensure to return valid request in
> following situation:
>
> 1) client send request message to server first
> submit_bio
> ...
> blk_mq_get_tag
> ...
> blk_mq_get_driver_tag
> ...
> nbd_queue_rq
> nbd_handle_cmd
> nbd_send_cmd
>
> 2) client receive respond message from server
> recv_work
> nbd_read_stat
> blk_mq_tag_to_rq
>
> If step 1) is missing, blk_mq_tag_to_rq() will return a stale
> request, which might be freed. Thus convert to use
> blk_mq_find_and_get_req() to make sure the returned request is not
> freed.
But NBD_CMD_INFLIGHT has been added to check whether the reply is
expected. Do we still need blk_mq_find_and_get_req() to cover
this issue? BTW, the request and its payload are pre-allocated, so there
isn't a real use-after-free.
Thanks,
Ming
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-14 1:11 ` Ming Lei
@ 2021-09-14 3:11 ` yukuai (C)
2021-09-14 6:44 ` Ming Lei
0 siblings, 1 reply; 24+ messages in thread
From: yukuai (C) @ 2021-09-14 3:11 UTC (permalink / raw)
To: Ming Lei; +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On 2021/09/14 9:11, Ming Lei wrote:
> On Thu, Sep 09, 2021 at 10:12:55PM +0800, Yu Kuai wrote:
>> blk_mq_tag_to_rq() can only ensure to return valid request in
>> following situation:
>>
>> 1) client send request message to server first
>> submit_bio
>> ...
>> blk_mq_get_tag
>> ...
>> blk_mq_get_driver_tag
>> ...
>> nbd_queue_rq
>> nbd_handle_cmd
>> nbd_send_cmd
>>
>> 2) client receive respond message from server
>> recv_work
>> nbd_read_stat
>> blk_mq_tag_to_rq
>>
>> If step 1) is missing, blk_mq_tag_to_rq() will return a stale
>> request, which might be freed. Thus convert to use
>> blk_mq_find_and_get_req() to make sure the returned request is not
>> freed.
>
> But NBD_CMD_INFLIGHT has been added for checking if the reply is
> expected, do we still need blk_mq_find_and_get_req() for covering
> this issue? BTW, request and its payload is pre-allocated, so there
> isn't real use-after-free.
Hi, Ming
Checking NBD_CMD_INFLIGHT relies on the request found by tag being valid,
not the other way round.
nbd_read_stat
 req = blk_mq_tag_to_rq()
 cmd = blk_mq_rq_to_pdu(req)
 mutex_lock(cmd->lock)
 checking NBD_CMD_INFLIGHT
The check doesn't have any effect on blk_mq_tag_to_rq().
Thanks,
Kuai
* Re: [PATCH v5 2/6] nbd: make sure request completion won't concurrent
2021-09-14 0:57 ` Ming Lei
@ 2021-09-14 3:11 ` yukuai (C)
0 siblings, 0 replies; 24+ messages in thread
From: yukuai (C) @ 2021-09-14 3:11 UTC (permalink / raw)
To: Ming Lei; +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On 2021/09/14 8:57, Ming Lei wrote:
> On Thu, Sep 09, 2021 at 10:12:52PM +0800, Yu Kuai wrote:
>> commit cddce0116058 ("nbd: Aovid double completion of a request")
>> try to fix that nbd_clear_que() and recv_work() can complete a
>> request concurrently. However, the problem still exists:
>>
>> t1 t2 t3
>>
>> nbd_disconnect_and_put
>> flush_workqueue
>> recv_work
>> blk_mq_complete_request
>> blk_mq_complete_request_remote -> this is true
>> WRITE_ONCE(rq->state, MQ_RQ_COMPLETE)
>> blk_mq_raise_softirq
>> blk_done_softirq
>> blk_complete_reqs
>> nbd_complete_rq
>> blk_mq_end_request
>> blk_mq_free_request
>> WRITE_ONCE(rq->state, MQ_RQ_IDLE)
>> nbd_clear_que
>> blk_mq_tagset_busy_iter
>> nbd_clear_req
>> __blk_mq_free_request
>> blk_mq_put_tag
>> blk_mq_complete_request -> complete again
>>
>> There are three places where request can be completed in nbd:
>> recv_work(), nbd_clear_que() and nbd_xmit_timeout(). Since they
>> all hold cmd->lock before completing the request, it's easy to
>> avoid the problem by setting and checking a cmd flag.
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>> drivers/block/nbd.c | 11 +++++++++--
>> 1 file changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>> index 04861b585b62..550c8dc438ac 100644
>> --- a/drivers/block/nbd.c
>> +++ b/drivers/block/nbd.c
>> @@ -406,7 +406,11 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
>> if (!mutex_trylock(&cmd->lock))
>> return BLK_EH_RESET_TIMER;
>>
>> - __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
>> + if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>> + mutex_unlock(&cmd->lock);
>> + return BLK_EH_DONE;
>> + }
>> +
>> if (!refcount_inc_not_zero(&nbd->config_refs)) {
>> cmd->status = BLK_STS_TIMEOUT;
>> mutex_unlock(&cmd->lock);
>> @@ -842,7 +846,10 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved)
>>
>> mutex_lock(&cmd->lock);
>> cmd->status = BLK_STS_IOERR;
>> - __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
>> + if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
>> + mutex_unlock(&cmd->lock);
>> + return true;
>> + }
>> mutex_unlock(&cmd->lock);
>
> If this request has completed from other code paths, ->status shouldn't be
> updated here, maybe it is done successfully.
Hi, Ming
Will change this in the next iteration.
Thanks,
Kuai
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-14 3:11 ` yukuai (C)
@ 2021-09-14 6:44 ` Ming Lei
2021-09-14 7:13 ` yukuai (C)
0 siblings, 1 reply; 24+ messages in thread
From: Ming Lei @ 2021-09-14 6:44 UTC (permalink / raw)
To: yukuai (C); +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On Tue, Sep 14, 2021 at 11:11:06AM +0800, yukuai (C) wrote:
> On 2021/09/14 9:11, Ming Lei wrote:
> > On Thu, Sep 09, 2021 at 10:12:55PM +0800, Yu Kuai wrote:
> > > blk_mq_tag_to_rq() can only ensure to return valid request in
> > > following situation:
> > >
> > > 1) client send request message to server first
> > > submit_bio
> > > ...
> > > blk_mq_get_tag
> > > ...
> > > blk_mq_get_driver_tag
> > > ...
> > > nbd_queue_rq
> > > nbd_handle_cmd
> > > nbd_send_cmd
> > >
> > > 2) client receive respond message from server
> > > recv_work
> > > nbd_read_stat
> > > blk_mq_tag_to_rq
> > >
> > > If step 1) is missing, blk_mq_tag_to_rq() will return a stale
> > > request, which might be freed. Thus convert to use
> > > blk_mq_find_and_get_req() to make sure the returned request is not
> > > freed.
> >
> > But NBD_CMD_INFLIGHT has been added for checking if the reply is
> > expected, do we still need blk_mq_find_and_get_req() for covering
> > this issue? BTW, request and its payload is pre-allocated, so there
> > isn't real use-after-free.
>
> Hi, Ming
>
> Checking NBD_CMD_INFLIGHT relied on the request founded by tag is valid,
> not the other way round.
>
> nbd_read_stat
> req = blk_mq_tag_to_rq()
> cmd = blk_mq_rq_to_pdu(req)
> mutex_lock(cmd->lock)
> checking NBD_CMD_INFLIGHT
The request and its payload are pre-allocated, and either req->ref or cmd->lock
can serve the same purpose here. Once cmd->lock is held, you can check whether
the cmd is inflight. If it isn't, just return -ENOENT. Is there any
problem with handling it this way?
Thanks,
Ming
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-14 6:44 ` Ming Lei
@ 2021-09-14 7:13 ` yukuai (C)
2021-09-14 7:46 ` Ming Lei
0 siblings, 1 reply; 24+ messages in thread
From: yukuai (C) @ 2021-09-14 7:13 UTC (permalink / raw)
To: Ming Lei; +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On 2021/09/14 14:44, Ming Lei wrote:
> On Tue, Sep 14, 2021 at 11:11:06AM +0800, yukuai (C) wrote:
>> On 2021/09/14 9:11, Ming Lei wrote:
>>> On Thu, Sep 09, 2021 at 10:12:55PM +0800, Yu Kuai wrote:
>>>> blk_mq_tag_to_rq() can only ensure to return valid request in
>>>> following situation:
>>>>
>>>> 1) client send request message to server first
>>>> submit_bio
>>>> ...
>>>> blk_mq_get_tag
>>>> ...
>>>> blk_mq_get_driver_tag
>>>> ...
>>>> nbd_queue_rq
>>>> nbd_handle_cmd
>>>> nbd_send_cmd
>>>>
>>>> 2) client receive respond message from server
>>>> recv_work
>>>> nbd_read_stat
>>>> blk_mq_tag_to_rq
>>>>
>>>> If step 1) is missing, blk_mq_tag_to_rq() will return a stale
>>>> request, which might be freed. Thus convert to use
>>>> blk_mq_find_and_get_req() to make sure the returned request is not
>>>> freed.
>>>
>>> But NBD_CMD_INFLIGHT has been added for checking if the reply is
>>> expected, do we still need blk_mq_find_and_get_req() for covering
>>> this issue? BTW, request and its payload is pre-allocated, so there
>>> isn't real use-after-free.
>>
>> Hi, Ming
>>
>> Checking NBD_CMD_INFLIGHT relied on the request founded by tag is valid,
>> not the other way round.
>>
>> nbd_read_stat
>> req = blk_mq_tag_to_rq()
>> cmd = blk_mq_rq_to_pdu(req)
>> mutex_lock(cmd->lock)
>> checking NBD_CMD_INFLIGHT
>
> Request and its payload is pre-allocated, and either req->ref or cmd->lock can
> serve the same purpose here. Once cmd->lock is held, you can check if the cmd is
> inflight or not. If it isn't inflight, just return -ENOENT. Is there any
> problem to handle in this way?
Hi, Ming
in nbd_read_stat:
1) get a request by tag first
2) get nbd_cmd by the request
3) hold cmd->lock and check if cmd is inflight
If we want to check if the cmd is inflight in step 3), we have to do
setp 1) and 2) first. As I explained in patch 0, blk_mq_tag_to_rq()
can't make sure the returned request is not freed:
nbd_read_stat
                                 blk_mq_sched_free_requests
                                  blk_mq_free_rqs
 blk_mq_tag_to_rq
 -> get rq before clear mapping
                                   blk_mq_clear_rq_mapping
                                   __free_pages -> rq is freed
 blk_mq_request_started -> UAF
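A minimal userspace sketch of steps 1) to 3) and of the stale mapping described above. Every name here (`model_rq`, `model_tags`, `model_free_rqs()`, `model_check_reply()`) is hypothetical and only models the kernel structures; the `freed` flag stands in for pages that have really been released, so the model itself never touches freed memory, and cmd->lock is omitted because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TAGS 4

/* Hypothetical stand-in for struct nbd_cmd. */
struct model_cmd {
	bool inflight;			/* models NBD_CMD_INFLIGHT */
};

/* Hypothetical stand-in for struct request with its embedded PDU. */
struct model_rq {
	bool freed;			/* set once the backing pages are released */
	struct model_cmd cmd;
};

struct model_tags {
	struct model_rq *rqs[NR_TAGS];	/* models tags->rqs[] */
};

/* Models blk_mq_tag_to_rq(): a plain table read with no validity check. */
static struct model_rq *model_tag_to_rq(struct model_tags *tags, unsigned int tag)
{
	assert(tags != NULL);
	return tag < NR_TAGS ? tags->rqs[tag] : NULL;
}

/*
 * Models freeing the scheduler requests. With clear_mapping set, stale
 * pointers are removed from the tag table first, as
 * blk_mq_clear_rq_mapping() does.
 */
static void model_free_rqs(struct model_tags *tags, struct model_rq *pool,
			   size_t n, bool clear_mapping)
{
	for (size_t i = 0; i < n; i++) {
		if (clear_mapping) {
			for (unsigned int t = 0; t < NR_TAGS; t++) {
				if (tags->rqs[t] == &pool[i])
					tags->rqs[t] = NULL;
			}
		}
		pool[i].freed = true;
	}
}

/*
 * Steps 1) to 3). Returns 0 if the reply is expected, -1 if the tag maps
 * to no request, -2 if the cmd is not inflight, and -3 if the lookup
 * returned a request that was already freed (the UAF case in the kernel).
 */
static int model_check_reply(struct model_tags *tags, unsigned int tag)
{
	struct model_rq *rq = model_tag_to_rq(tags, tag);	/* step 1 */

	if (!rq)
		return -1;
	if (rq->freed)
		return -3;
	/* steps 2 and 3: the cmd lives inside the request */
	return rq->cmd.inflight ? 0 : -2;
}
```

The point of the model is that the inflight check in step 3) can only be trusted once the pointer obtained in step 1) is known to be valid.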
Thanks,
Kuai
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-14 7:13 ` yukuai (C)
@ 2021-09-14 7:46 ` Ming Lei
2021-09-14 9:08 ` yukuai (C)
2021-09-14 9:19 ` yukuai (C)
0 siblings, 2 replies; 24+ messages in thread
From: Ming Lei @ 2021-09-14 7:46 UTC (permalink / raw)
To: yukuai (C); +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On Tue, Sep 14, 2021 at 03:13:38PM +0800, yukuai (C) wrote:
> On 2021/09/14 14:44, Ming Lei wrote:
> > On Tue, Sep 14, 2021 at 11:11:06AM +0800, yukuai (C) wrote:
> > > On 2021/09/14 9:11, Ming Lei wrote:
> > > > On Thu, Sep 09, 2021 at 10:12:55PM +0800, Yu Kuai wrote:
> > > > > blk_mq_tag_to_rq() can only ensure to return valid request in
> > > > > following situation:
> > > > >
> > > > > 1) client send request message to server first
> > > > > submit_bio
> > > > > ...
> > > > > blk_mq_get_tag
> > > > > ...
> > > > > blk_mq_get_driver_tag
> > > > > ...
> > > > > nbd_queue_rq
> > > > > nbd_handle_cmd
> > > > > nbd_send_cmd
> > > > >
> > > > > 2) client receive respond message from server
> > > > > recv_work
> > > > > nbd_read_stat
> > > > > blk_mq_tag_to_rq
> > > > >
> > > > > If step 1) is missing, blk_mq_tag_to_rq() will return a stale
> > > > > request, which might be freed. Thus convert to use
> > > > > blk_mq_find_and_get_req() to make sure the returned request is not
> > > > > freed.
> > > >
> > > > But NBD_CMD_INFLIGHT has been added for checking if the reply is
> > > > expected, do we still need blk_mq_find_and_get_req() for covering
> > > > this issue? BTW, request and its payload is pre-allocated, so there
> > > > isn't real use-after-free.
> > >
> > > Hi, Ming
> > >
> > > Checking NBD_CMD_INFLIGHT relied on the request founded by tag is valid,
> > > not the other way round.
> > >
> > > nbd_read_stat
> > > req = blk_mq_tag_to_rq()
> > > cmd = blk_mq_rq_to_pdu(req)
> > > mutex_lock(cmd->lock)
> > > checking NBD_CMD_INFLIGHT
> >
> > Request and its payload is pre-allocated, and either req->ref or cmd->lock can
> > serve the same purpose here. Once cmd->lock is held, you can check if the cmd is
> > inflight or not. If it isn't inflight, just return -ENOENT. Is there any
> > problem to handle in this way?
>
> Hi, Ming
>
> in nbd_read_stat:
>
> 1) get a request by tag first
> 2) get nbd_cmd by the request
> 3) hold cmd->lock and check if cmd is inflight
>
> If we want to check if the cmd is inflight in step 3), we have to do
> setp 1) and 2) first. As I explained in patch 0, blk_mq_tag_to_rq()
> can't make sure the returned request is not freed:
>
> nbd_read_stat
> blk_mq_sched_free_requests
> blk_mq_free_rqs
> blk_mq_tag_to_rq
> -> get rq before clear mapping
> blk_mq_clear_rq_mapping
> __free_pages -> rq is freed
> blk_mq_request_started -> UAF
If the above can happen, blk_mq_find_and_get_req() may not fix it either. I am just
wondering: why not take the following simpler approach to avoid the UAF?
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 5170a630778d..dfa5cce71f66 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work)
work);
struct nbd_device *nbd = args->nbd;
struct nbd_config *config = nbd->config;
+ struct request_queue *q = nbd->disk->queue;
struct nbd_cmd *cmd;
struct request *rq;
+ if (!percpu_ref_tryget(&q->q_usage_counter))
+ return;
+
while (1) {
cmd = nbd_read_stat(nbd, args->index);
if (IS_ERR(cmd)) {
@@ -813,6 +817,7 @@ static void recv_work(struct work_struct *work)
if (likely(!blk_should_fake_timeout(rq->q)))
blk_mq_complete_request(rq);
}
+ blk_queue_exit(q);
nbd_config_put(nbd);
atomic_dec(&config->recv_threads);
wake_up(&config->recv_wq);
Thanks,
Ming
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-14 7:46 ` Ming Lei
@ 2021-09-14 9:08 ` yukuai (C)
2021-09-14 9:12 ` yukuai (C)
2021-09-14 14:33 ` Ming Lei
2021-09-14 9:19 ` yukuai (C)
1 sibling, 2 replies; 24+ messages in thread
From: yukuai (C) @ 2021-09-14 9:08 UTC (permalink / raw)
To: Ming Lei; +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On 2021/09/14 15:46, Ming Lei wrote:
> On Tue, Sep 14, 2021 at 03:13:38PM +0800, yukuai (C) wrote:
>> On 2021/09/14 14:44, Ming Lei wrote:
>>> On Tue, Sep 14, 2021 at 11:11:06AM +0800, yukuai (C) wrote:
>>>> On 2021/09/14 9:11, Ming Lei wrote:
>>>>> On Thu, Sep 09, 2021 at 10:12:55PM +0800, Yu Kuai wrote:
>>>>>> blk_mq_tag_to_rq() can only ensure to return valid request in
>>>>>> following situation:
>>>>>>
>>>>>> 1) client send request message to server first
>>>>>> submit_bio
>>>>>> ...
>>>>>> blk_mq_get_tag
>>>>>> ...
>>>>>> blk_mq_get_driver_tag
>>>>>> ...
>>>>>> nbd_queue_rq
>>>>>> nbd_handle_cmd
>>>>>> nbd_send_cmd
>>>>>>
>>>>>> 2) client receive respond message from server
>>>>>> recv_work
>>>>>> nbd_read_stat
>>>>>> blk_mq_tag_to_rq
>>>>>>
>>>>>> If step 1) is missing, blk_mq_tag_to_rq() will return a stale
>>>>>> request, which might be freed. Thus convert to use
>>>>>> blk_mq_find_and_get_req() to make sure the returned request is not
>>>>>> freed.
>>>>>
>>>>> But NBD_CMD_INFLIGHT has been added for checking if the reply is
>>>>> expected, do we still need blk_mq_find_and_get_req() for covering
>>>>> this issue? BTW, request and its payload is pre-allocated, so there
>>>>> isn't real use-after-free.
>>>>
>>>> Hi, Ming
>>>>
>>>> Checking NBD_CMD_INFLIGHT relied on the request founded by tag is valid,
>>>> not the other way round.
>>>>
>>>> nbd_read_stat
>>>> req = blk_mq_tag_to_rq()
>>>> cmd = blk_mq_rq_to_pdu(req)
>>>> mutex_lock(cmd->lock)
>>>> checking NBD_CMD_INFLIGHT
>>>
>>> Request and its payload is pre-allocated, and either req->ref or cmd->lock can
>>> serve the same purpose here. Once cmd->lock is held, you can check if the cmd is
>>> inflight or not. If it isn't inflight, just return -ENOENT. Is there any
>>> problem to handle in this way?
>>
>> Hi, Ming
>>
>> in nbd_read_stat:
>>
>> 1) get a request by tag first
>> 2) get nbd_cmd by the request
>> 3) hold cmd->lock and check if cmd is inflight
>>
>> If we want to check if the cmd is inflight in step 3), we have to do
>> setp 1) and 2) first. As I explained in patch 0, blk_mq_tag_to_rq()
>> can't make sure the returned request is not freed:
>>
>> nbd_read_stat
>> blk_mq_sched_free_requests
>> blk_mq_free_rqs
>> blk_mq_tag_to_rq
>> -> get rq before clear mapping
>> blk_mq_clear_rq_mapping
>> __free_pages -> rq is freed
>> blk_mq_request_started -> UAF
>
> If the above can happen, blk_mq_find_and_get_req() may not fix it too, just
Hi, Ming
Why can't blk_mq_find_and_get_req() fix it? I currently can't think of any
scenario in which it would still have a problem.
> wondering why not take the following simpler way for avoiding the UAF?
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 5170a630778d..dfa5cce71f66 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work)
> work);
> struct nbd_device *nbd = args->nbd;
> struct nbd_config *config = nbd->config;
> + struct request_queue *q = nbd->disk->queue;
> struct nbd_cmd *cmd;
> struct request *rq;
>
> + if (!percpu_ref_tryget(&q->q_usage_counter))
> + return;
> +
We can't make sure freeze_queue is called before this, thus this approach
can't fix the problem, right?
nbd_read_stat
 blk_mq_tag_to_rq
                        elevator_switch
                         blk_mq_freeze_queue(q);
                         elevator_switch_mq
                          elevator_exit
                           blk_mq_sched_free_requests
 blk_mq_request_started -> UAF
Thanks,
Kuai
> while (1) {
> cmd = nbd_read_stat(nbd, args->index);
> if (IS_ERR(cmd)) {
> @@ -813,6 +817,7 @@ static void recv_work(struct work_struct *work)
> if (likely(!blk_should_fake_timeout(rq->q)))
> blk_mq_complete_request(rq);
> }
> + blk_queue_exit(q);
> nbd_config_put(nbd);
> atomic_dec(&config->recv_threads);
> wake_up(&config->recv_wq);
>
> Thanks,
> Ming
>
> .
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-14 9:08 ` yukuai (C)
@ 2021-09-14 9:12 ` yukuai (C)
2021-09-14 14:33 ` Ming Lei
1 sibling, 0 replies; 24+ messages in thread
From: yukuai (C) @ 2021-09-14 9:12 UTC (permalink / raw)
To: Ming Lei; +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On 2021/09/14 17:08, yukuai (C) wrote:
> On 2021/09/14 15:46, Ming Lei wrote:
>> On Tue, Sep 14, 2021 at 03:13:38PM +0800, yukuai (C) wrote:
>>> On 2021/09/14 14:44, Ming Lei wrote:
>>>> On Tue, Sep 14, 2021 at 11:11:06AM +0800, yukuai (C) wrote:
>>>>> On 2021/09/14 9:11, Ming Lei wrote:
>>>>>> On Thu, Sep 09, 2021 at 10:12:55PM +0800, Yu Kuai wrote:
>>>>>>> blk_mq_tag_to_rq() can only ensure to return valid request in
>>>>>>> following situation:
>>>>>>>
>>>>>>> 1) client send request message to server first
>>>>>>> submit_bio
>>>>>>> ...
>>>>>>> blk_mq_get_tag
>>>>>>> ...
>>>>>>> blk_mq_get_driver_tag
>>>>>>> ...
>>>>>>> nbd_queue_rq
>>>>>>> nbd_handle_cmd
>>>>>>> nbd_send_cmd
>>>>>>>
>>>>>>> 2) client receive respond message from server
>>>>>>> recv_work
>>>>>>> nbd_read_stat
>>>>>>> blk_mq_tag_to_rq
>>>>>>>
>>>>>>> If step 1) is missing, blk_mq_tag_to_rq() will return a stale
>>>>>>> request, which might be freed. Thus convert to use
>>>>>>> blk_mq_find_and_get_req() to make sure the returned request is not
>>>>>>> freed.
>>>>>>
>>>>>> But NBD_CMD_INFLIGHT has been added for checking if the reply is
>>>>>> expected, do we still need blk_mq_find_and_get_req() for covering
>>>>>> this issue? BTW, request and its payload is pre-allocated, so there
>>>>>> isn't real use-after-free.
>>>>>
>>>>> Hi, Ming
>>>>>
>>>>> Checking NBD_CMD_INFLIGHT relied on the request founded by tag is
>>>>> valid,
>>>>> not the other way round.
>>>>>
>>>>> nbd_read_stat
>>>>> req = blk_mq_tag_to_rq()
>>>>> cmd = blk_mq_rq_to_pdu(req)
>>>>> mutex_lock(cmd->lock)
>>>>> checking NBD_CMD_INFLIGHT
>>>>
>>>> Request and its payload is pre-allocated, and either req->ref or
>>>> cmd->lock can
>>>> serve the same purpose here. Once cmd->lock is held, you can check
>>>> if the cmd is
>>>> inflight or not. If it isn't inflight, just return -ENOENT. Is there
>>>> any
>>>> problem to handle in this way?
>>>
>>> Hi, Ming
>>>
>>> in nbd_read_stat:
>>>
>>> 1) get a request by tag first
>>> 2) get nbd_cmd by the request
>>> 3) hold cmd->lock and check if cmd is inflight
>>>
>>> If we want to check if the cmd is inflight in step 3), we have to do
>>> setp 1) and 2) first. As I explained in patch 0, blk_mq_tag_to_rq()
>>> can't make sure the returned request is not freed:
>>>
>>> nbd_read_stat
>>> blk_mq_sched_free_requests
>>> blk_mq_free_rqs
>>> blk_mq_tag_to_rq
>>> -> get rq before clear mapping
>>> blk_mq_clear_rq_mapping
>>> __free_pages -> rq is freed
>>> blk_mq_request_started -> UAF
>>
>> If the above can happen, blk_mq_find_and_get_req() may not fix it too,
>> just
>
> Hi, Ming
>
> Why can't blk_mq_find_and_get_req() fix it? I can't think of any
> scenario that might have problem currently.
>
>> wondering why not take the following simpler way for avoiding the UAF?
>>
>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>> index 5170a630778d..dfa5cce71f66 100644
>> --- a/drivers/block/nbd.c
>> +++ b/drivers/block/nbd.c
>> @@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work)
>> work);
>> struct nbd_device *nbd = args->nbd;
>> struct nbd_config *config = nbd->config;
>> + struct request_queue *q = nbd->disk->queue;
>> struct nbd_cmd *cmd;
>> struct request *rq;
>> + if (!percpu_ref_tryget(&q->q_usage_counter))
>> + return;
>> +
>
> We can't make sure freeze_queue is called before this, thus this approch
> can't fix the problem, right?
> nbd_read_stat
> blk_mq_tag_to_rq
> elevator_switch
> blk_mq_freeze_queue(q);
> elevator_switch_mq
> elevator_exit
> blk_mq_sched_free_requests
> blk_mq_request_started -> UAF
Hi, Ming
I forgot that if percpu_ref_tryget() succeeds here, blk_mq_freeze_queue()
will block until blk_queue_exit() in nbd_read_stat().
Thanks,
Kuai
>
> Thanks,
> Kuai
>
>> while (1) {
>> cmd = nbd_read_stat(nbd, args->index);
>> if (IS_ERR(cmd)) {
>> @@ -813,6 +817,7 @@ static void recv_work(struct work_struct *work)
>> if (likely(!blk_should_fake_timeout(rq->q)))
>> blk_mq_complete_request(rq);
>> }
>> + blk_queue_exit(q);
>> nbd_config_put(nbd);
>> atomic_dec(&config->recv_threads);
>> wake_up(&config->recv_wq);
>>
>> Thanks,
>> Ming
>>
>> .
>>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-14 7:46 ` Ming Lei
2021-09-14 9:08 ` yukuai (C)
@ 2021-09-14 9:19 ` yukuai (C)
2021-09-14 14:37 ` Ming Lei
1 sibling, 1 reply; 24+ messages in thread
From: yukuai (C) @ 2021-09-14 9:19 UTC (permalink / raw)
To: Ming Lei; +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On 2021/09/14 15:46, Ming Lei wrote:
> If the above can happen, blk_mq_find_and_get_req() may not fix it too, just
> wondering why not take the following simpler way for avoiding the UAF?
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 5170a630778d..dfa5cce71f66 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work)
> work);
> struct nbd_device *nbd = args->nbd;
> struct nbd_config *config = nbd->config;
> + struct request_queue *q = nbd->disk->queue;
> struct nbd_cmd *cmd;
> struct request *rq;
>
> + if (!percpu_ref_tryget(&q->q_usage_counter))
> + return;
> +
> while (1) {
> cmd = nbd_read_stat(nbd, args->index);
> if (IS_ERR(cmd)) {
> @@ -813,6 +817,7 @@ static void recv_work(struct work_struct *work)
> if (likely(!blk_should_fake_timeout(rq->q)))
> blk_mq_complete_request(rq);
> }
> + blk_queue_exit(q);
> nbd_config_put(nbd);
> atomic_dec(&config->recv_threads);
> wake_up(&config->recv_wq);
>
Hi, Ming
This approach is wrong.
If blk_mq_freeze_queue() has been called and nbd is waiting for all
requests to complete, percpu_ref_tryget() will fail here, and a deadlock
will occur because the requests can't be completed in recv_work().
Thanks,
Kuai
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-14 9:08 ` yukuai (C)
2021-09-14 9:12 ` yukuai (C)
@ 2021-09-14 14:33 ` Ming Lei
1 sibling, 0 replies; 24+ messages in thread
From: Ming Lei @ 2021-09-14 14:33 UTC (permalink / raw)
To: yukuai (C); +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On Tue, Sep 14, 2021 at 05:08:00PM +0800, yukuai (C) wrote:
> On 2021/09/14 15:46, Ming Lei wrote:
> > On Tue, Sep 14, 2021 at 03:13:38PM +0800, yukuai (C) wrote:
> > > On 2021/09/14 14:44, Ming Lei wrote:
> > > > On Tue, Sep 14, 2021 at 11:11:06AM +0800, yukuai (C) wrote:
> > > > > On 2021/09/14 9:11, Ming Lei wrote:
> > > > > > On Thu, Sep 09, 2021 at 10:12:55PM +0800, Yu Kuai wrote:
> > > > > > > blk_mq_tag_to_rq() can only ensure to return valid request in
> > > > > > > following situation:
> > > > > > >
> > > > > > > 1) client send request message to server first
> > > > > > > submit_bio
> > > > > > > ...
> > > > > > > blk_mq_get_tag
> > > > > > > ...
> > > > > > > blk_mq_get_driver_tag
> > > > > > > ...
> > > > > > > nbd_queue_rq
> > > > > > > nbd_handle_cmd
> > > > > > > nbd_send_cmd
> > > > > > >
> > > > > > > 2) client receive respond message from server
> > > > > > > recv_work
> > > > > > > nbd_read_stat
> > > > > > > blk_mq_tag_to_rq
> > > > > > >
> > > > > > > If step 1) is missing, blk_mq_tag_to_rq() will return a stale
> > > > > > > request, which might be freed. Thus convert to use
> > > > > > > blk_mq_find_and_get_req() to make sure the returned request is not
> > > > > > > freed.
> > > > > >
> > > > > > But NBD_CMD_INFLIGHT has been added for checking if the reply is
> > > > > > expected, do we still need blk_mq_find_and_get_req() for covering
> > > > > > this issue? BTW, request and its payload is pre-allocated, so there
> > > > > > isn't real use-after-free.
> > > > >
> > > > > Hi, Ming
> > > > >
> > > > > Checking NBD_CMD_INFLIGHT relied on the request founded by tag is valid,
> > > > > not the other way round.
> > > > >
> > > > > nbd_read_stat
> > > > > req = blk_mq_tag_to_rq()
> > > > > cmd = blk_mq_rq_to_pdu(req)
> > > > > mutex_lock(cmd->lock)
> > > > > checking NBD_CMD_INFLIGHT
> > > >
> > > > Request and its payload is pre-allocated, and either req->ref or cmd->lock can
> > > > serve the same purpose here. Once cmd->lock is held, you can check if the cmd is
> > > > inflight or not. If it isn't inflight, just return -ENOENT. Is there any
> > > > problem to handle in this way?
> > >
> > > Hi, Ming
> > >
> > > in nbd_read_stat:
> > >
> > > 1) get a request by tag first
> > > 2) get nbd_cmd by the request
> > > 3) hold cmd->lock and check if cmd is inflight
> > >
> > > If we want to check if the cmd is inflight in step 3), we have to do
> > > setp 1) and 2) first. As I explained in patch 0, blk_mq_tag_to_rq()
> > > can't make sure the returned request is not freed:
> > >
> > > nbd_read_stat
> > > blk_mq_sched_free_requests
> > > blk_mq_free_rqs
> > > blk_mq_tag_to_rq
> > > -> get rq before clear mapping
> > > blk_mq_clear_rq_mapping
> > > __free_pages -> rq is freed
> > > blk_mq_request_started -> UAF
> >
> > If the above can happen, blk_mq_find_and_get_req() may not fix it too, just
>
> Hi, Ming
>
> Why can't blk_mq_find_and_get_req() fix it? I can't think of any
> scenario that might have problem currently.
The principle behind blk_mq_find_and_get_req() is that if a request's
ref is grabbed, the queue's usage counter is guaranteed to be held as well,
but this approach isn't straightforward.
Yeah, it can fix the issue, but I don't think it is good to call it in
the fast path, because tags->lock is required.
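A rough userspace model of the lookup pattern being discussed: take the tag table lock, read the slot, and take a reference only if the request already has one. The names are hypothetical (`model_find_and_get_req()`, `model_ref_inc_not_zero()`), and a C11 `atomic_flag` spin lock stands in for tags->lock; this is a sketch of the idea, not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TAGS 4

struct model_rq {
	int tag;		/* tag this request currently owns */
	atomic_int ref;		/* 0 means the request is not in use */
};

struct model_tags {
	atomic_flag lock;	/* models tags->lock */
	struct model_rq *rqs[NR_TAGS];
};

static void model_tags_init(struct model_tags *tags)
{
	atomic_flag_clear(&tags->lock);
	for (size_t i = 0; i < NR_TAGS; i++)
		tags->rqs[i] = NULL;
}

/* Models refcount_inc_not_zero(): succeed only if a reference already exists. */
static bool model_ref_inc_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Returns the request with an extra reference held, or NULL. */
static struct model_rq *model_find_and_get_req(struct model_tags *tags,
					       unsigned int tag)
{
	struct model_rq *rq;

	assert(tag < NR_TAGS);
	while (atomic_flag_test_and_set(&tags->lock))
		;	/* spin: this lock is the fast-path cost mentioned above */
	rq = tags->rqs[tag];
	if (!rq || rq->tag != (int)tag || !model_ref_inc_not_zero(&rq->ref))
		rq = NULL;
	atomic_flag_clear(&tags->lock);
	return rq;
}
```

Every lookup pays for the lock and the atomic increment, which is the cost that makes this unattractive on a per-reply path.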
>
> > wondering why not take the following simpler way for avoiding the UAF?
> >
> > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > index 5170a630778d..dfa5cce71f66 100644
> > --- a/drivers/block/nbd.c
> > +++ b/drivers/block/nbd.c
> > @@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work)
> > work);
> > struct nbd_device *nbd = args->nbd;
> > struct nbd_config *config = nbd->config;
> > + struct request_queue *q = nbd->disk->queue;
> > struct nbd_cmd *cmd;
> > struct request *rq;
> > + if (!percpu_ref_tryget(&q->q_usage_counter))
> > + return;
> > +
>
> We can't make sure freeze_queue is called before this, thus this approch
> can't fix the problem, right?
> nbd_read_stat
> blk_mq_tag_to_rq
> elevator_switch
> blk_mq_freeze_queue(q);
> elevator_switch_mq
> elevator_exit
> blk_mq_sched_free_requests
> blk_mq_request_started -> UAF
No, blk_mq_freeze_queue() waits until ->q_usage_counter becomes zero, so
there won't be any concurrent nbd_read_stat() during an elevator switch
if ->q_usage_counter is grabbed in recv_work().
Thanks,
Ming
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-14 9:19 ` yukuai (C)
@ 2021-09-14 14:37 ` Ming Lei
2021-09-15 1:54 ` yukuai (C)
0 siblings, 1 reply; 24+ messages in thread
From: Ming Lei @ 2021-09-14 14:37 UTC (permalink / raw)
To: yukuai (C); +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On Tue, Sep 14, 2021 at 05:19:31PM +0800, yukuai (C) wrote:
> On 2021/09/14 15:46, Ming Lei wrote:
>
> > If the above can happen, blk_mq_find_and_get_req() may not fix it too, just
> > wondering why not take the following simpler way for avoiding the UAF?
> >
> > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > index 5170a630778d..dfa5cce71f66 100644
> > --- a/drivers/block/nbd.c
> > +++ b/drivers/block/nbd.c
> > @@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work)
> > work);
> > struct nbd_device *nbd = args->nbd;
> > struct nbd_config *config = nbd->config;
> > + struct request_queue *q = nbd->disk->queue;
> > struct nbd_cmd *cmd;
> > struct request *rq;
> > + if (!percpu_ref_tryget(&q->q_usage_counter))
> > + return;
> > +
> > while (1) {
> > cmd = nbd_read_stat(nbd, args->index);
> > if (IS_ERR(cmd)) {
> > @@ -813,6 +817,7 @@ static void recv_work(struct work_struct *work)
> > if (likely(!blk_should_fake_timeout(rq->q)))
> > blk_mq_complete_request(rq);
> > }
> > + blk_queue_exit(q);
> > nbd_config_put(nbd);
> > atomic_dec(&config->recv_threads);
> > wake_up(&config->recv_wq);
> >
>
> Hi, Ming
>
> This apporch is wrong.
>
> If blk_mq_freeze_queue() is called, and nbd is waiting for all
> request to complete. percpu_ref_tryget() will fail here, and deadlock
> will occur because request can't complete in recv_work().
No, percpu_ref_tryget() won't fail until ->q_usage_counter is zero, at which
point it is perfectly fine to do nothing in recv_work().
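A small single-threaded model of the counter semantics referred to here, assuming the usual percpu_ref behaviour: the owner holds an initial reference, starting a freeze kills the ref and drops that initial reference, percpu_ref_tryget() fails only once the count is zero, and percpu_ref_tryget_live() fails as soon as the ref is dead. The `model_ref` type and its helpers are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdbool.h>

struct model_ref {
	long count;	/* starts at 1: the owner's initial reference */
	bool dead;	/* set by kill, i.e. a freeze has started */
};

static void model_ref_init(struct model_ref *r)
{
	r->count = 1;
	r->dead = false;
}

/* Models percpu_ref_kill(): mark dead and drop the initial reference. */
static void model_ref_kill(struct model_ref *r)
{
	assert(!r->dead);
	r->dead = true;
	r->count--;
}

/* Models percpu_ref_tryget(): fails only once the count has reached zero. */
static bool model_ref_tryget(struct model_ref *r)
{
	if (r->count == 0)
		return false;
	r->count++;
	return true;
}

/* Models percpu_ref_tryget_live(): additionally fails once the ref is dead. */
static bool model_ref_tryget_live(struct model_ref *r)
{
	if (r->dead)
		return false;
	return model_ref_tryget(r);
}

static void model_ref_put(struct model_ref *r)
{
	assert(r->count > 0);
	r->count--;
}

/* The freeze is complete once the ref is dead and all users have left. */
static bool model_frozen(const struct model_ref *r)
{
	return r->dead && r->count == 0;
}
```

In this model, while an inflight request still holds a reference after the freeze has started, `model_ref_tryget()` keeps succeeding, so the reply for that request can still be handled; it fails only when nothing is left to complete.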
Thanks,
Ming
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-14 14:37 ` Ming Lei
@ 2021-09-15 1:54 ` yukuai (C)
2021-09-15 3:16 ` Ming Lei
0 siblings, 1 reply; 24+ messages in thread
From: yukuai (C) @ 2021-09-15 1:54 UTC (permalink / raw)
To: Ming Lei; +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On 2021/09/14 22:37, Ming Lei wrote:
> On Tue, Sep 14, 2021 at 05:19:31PM +0800, yukuai (C) wrote:
>> On 2021/09/14 15:46, Ming Lei wrote:
>>
>>> If the above can happen, blk_mq_find_and_get_req() may not fix it too, just
>>> wondering why not take the following simpler way for avoiding the UAF?
>>>
>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>>> index 5170a630778d..dfa5cce71f66 100644
>>> --- a/drivers/block/nbd.c
>>> +++ b/drivers/block/nbd.c
>>> @@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work)
>>> work);
>>> struct nbd_device *nbd = args->nbd;
>>> struct nbd_config *config = nbd->config;
>>> + struct request_queue *q = nbd->disk->queue;
>>> struct nbd_cmd *cmd;
>>> struct request *rq;
>>> + if (!percpu_ref_tryget(&q->q_usage_counter))
>>> + return;
>>> +
>>> while (1) {
>>> cmd = nbd_read_stat(nbd, args->index);
>>> if (IS_ERR(cmd)) {
>>> @@ -813,6 +817,7 @@ static void recv_work(struct work_struct *work)
>>> if (likely(!blk_should_fake_timeout(rq->q)))
>>> blk_mq_complete_request(rq);
>>> }
>>> + blk_queue_exit(q);
>>> nbd_config_put(nbd);
>>> atomic_dec(&config->recv_threads);
>>> wake_up(&config->recv_wq);
>>>
>>
>> Hi, Ming
>>
>> This apporch is wrong.
>>
>> If blk_mq_freeze_queue() is called, and nbd is waiting for all
>> request to complete. percpu_ref_tryget() will fail here, and deadlock
>> will occur because request can't complete in recv_work().
>
> No, percpu_ref_tryget() won't fail until ->q_usage_counter is zero, when
> it is perfectly fine to do nothing in recv_work().
>
Hi Ming
This approach is a good idea; however, we should not get q_usage_counter
in recv_work(), because it will block freezing the queue.
How about getting q_usage_counter in nbd_read_stat(), and putting it in the
error path or after request completion?
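The ordering proposed above can be sketched in userspace as follows. This is only a model under stated assumptions: `model_queue`, `model_event` and `model_recv_loop()` are hypothetical names, the blocking socket read is replaced by a scripted array of events, and the case where taking the reference fails is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_queue {
	long users;	/* models references on q->q_usage_counter */
};

enum model_event { EV_REPLY_OK, EV_REPLY_BAD, EV_SOCKET_CLOSED };

struct model_stats {
	size_t completed;		/* replies handled and completed */
	bool held_while_waiting;	/* a reference was held during a read */
};

static struct model_stats model_recv_loop(struct model_queue *q,
					   const enum model_event *events,
					   size_t n)
{
	struct model_stats st = { 0, false };

	assert(q != NULL && events != NULL);
	for (size_t i = 0; i < n; i++) {
		/* Blocking read of the next reply: no reference may be held here. */
		if (q->users != 0)
			st.held_while_waiting = true;
		if (events[i] == EV_SOCKET_CLOSED)
			break;

		q->users++;		/* get the reference once a reply has arrived */
		if (events[i] == EV_REPLY_BAD) {
			q->users--;	/* error path: put the reference and stop */
			break;
		}
		st.completed++;		/* look up and complete the request */
		q->users--;		/* put the reference after completion */
	}
	return st;
}
```

Because the reference is only held between receiving a reply and completing its request, a freeze never has to wait on a receiver that is idle in the socket read.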
Thanks
Kuai
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-15 1:54 ` yukuai (C)
@ 2021-09-15 3:16 ` Ming Lei
2021-09-15 3:36 ` yukuai (C)
0 siblings, 1 reply; 24+ messages in thread
From: Ming Lei @ 2021-09-15 3:16 UTC (permalink / raw)
To: yukuai (C); +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On Wed, Sep 15, 2021 at 09:54:09AM +0800, yukuai (C) wrote:
> On 2021/09/14 22:37, Ming Lei wrote:
> > On Tue, Sep 14, 2021 at 05:19:31PM +0800, yukuai (C) wrote:
> > > On 2021/09/14 15:46, Ming Lei wrote:
> > >
> > > > If the above can happen, blk_mq_find_and_get_req() may not fix it too, just
> > > > wondering why not take the following simpler way for avoiding the UAF?
> > > >
> > > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > > index 5170a630778d..dfa5cce71f66 100644
> > > > --- a/drivers/block/nbd.c
> > > > +++ b/drivers/block/nbd.c
> > > > @@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work)
> > > > work);
> > > > struct nbd_device *nbd = args->nbd;
> > > > struct nbd_config *config = nbd->config;
> > > > + struct request_queue *q = nbd->disk->queue;
> > > > struct nbd_cmd *cmd;
> > > > struct request *rq;
> > > > + if (!percpu_ref_tryget(&q->q_usage_counter))
> > > > + return;
> > > > +
> > > > while (1) {
> > > > cmd = nbd_read_stat(nbd, args->index);
> > > > if (IS_ERR(cmd)) {
> > > > @@ -813,6 +817,7 @@ static void recv_work(struct work_struct *work)
> > > > if (likely(!blk_should_fake_timeout(rq->q)))
> > > > blk_mq_complete_request(rq);
> > > > }
> > > > + blk_queue_exit(q);
> > > > nbd_config_put(nbd);
> > > > atomic_dec(&config->recv_threads);
> > > > wake_up(&config->recv_wq);
> > > >
> > >
> > > Hi, Ming
> > >
> > > This apporch is wrong.
> > >
> > > If blk_mq_freeze_queue() is called, and nbd is waiting for all
> > > request to complete. percpu_ref_tryget() will fail here, and deadlock
> > > will occur because request can't complete in recv_work().
> >
> > No, percpu_ref_tryget() won't fail until ->q_usage_counter is zero, when
> > it is perfectly fine to do nothing in recv_work().
> >
>
> Hi Ming
>
> This apporch is a good idea, however we should not get q_usage_counter
> in reccv_work(), because It will block freeze queue.
>
> How about get q_usage_counter in nbd_read_stat(), and put in error path
> or after request completion?
OK, it looks like I missed that nbd_read_stat() needs to wait for an incoming
reply first, so how about the following change, which splits nbd_read_stat()
into nbd_read_reply() and nbd_handle_reply()?
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 5170a630778d..477fe057fc93 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -683,38 +683,47 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
return 0;
}
-/* NULL returned = something went wrong, inform userspace */
-static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
+static int nbd_read_reply(struct nbd_device *nbd, int index,
+ struct nbd_reply *reply)
{
- struct nbd_config *config = nbd->config;
int result;
- struct nbd_reply reply;
- struct nbd_cmd *cmd;
- struct request *req = NULL;
- u64 handle;
- u16 hwq;
- u32 tag;
- struct kvec iov = {.iov_base = &reply, .iov_len = sizeof(reply)};
+ struct kvec iov = {.iov_base = reply, .iov_len = sizeof(*reply)};
struct iov_iter to;
- int ret = 0;
- reply.magic = 0;
+ reply->magic = 0;
- iov_iter_kvec(&to, READ, &iov, 1, sizeof(reply));
+ iov_iter_kvec(&to, READ, &iov, 1, sizeof(*reply));
result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
- if (result <= 0) {
- if (!nbd_disconnected(config))
+ if (result < 0) {
+ if (!nbd_disconnected(nbd->config))
dev_err(disk_to_dev(nbd->disk),
"Receive control failed (result %d)\n", result);
- return ERR_PTR(result);
+ return result;
}
- if (ntohl(reply.magic) != NBD_REPLY_MAGIC) {
+ if (ntohl(reply->magic) != NBD_REPLY_MAGIC) {
dev_err(disk_to_dev(nbd->disk), "Wrong magic (0x%lx)\n",
- (unsigned long)ntohl(reply.magic));
- return ERR_PTR(-EPROTO);
+ (unsigned long)ntohl(reply->magic));
+ return -EPROTO;
}
- memcpy(&handle, reply.handle, sizeof(handle));
+ return 0;
+}
+
+/* NULL returned = something went wrong, inform userspace */
+static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
+ struct nbd_reply *reply)
+{
+ struct nbd_config *config = nbd->config;
+ int result;
+ struct nbd_cmd *cmd;
+ struct request *req = NULL;
+ u64 handle;
+ u16 hwq;
+ u32 tag;
+ struct iov_iter to;
+ int ret = 0;
+
+ memcpy(&handle, reply->handle, sizeof(handle));
tag = nbd_handle_to_tag(handle);
hwq = blk_mq_unique_tag_to_hwq(tag);
if (hwq < nbd->tag_set.nr_hw_queues)
@@ -747,9 +756,9 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
ret = -ENOENT;
goto out;
}
- if (ntohl(reply.error)) {
+ if (ntohl(reply->error)) {
dev_err(disk_to_dev(nbd->disk), "Other side returned error (%d)\n",
- ntohl(reply.error));
+ ntohl(reply->error));
cmd->status = BLK_STS_IOERR;
goto out;
}
@@ -795,24 +804,36 @@ static void recv_work(struct work_struct *work)
work);
struct nbd_device *nbd = args->nbd;
struct nbd_config *config = nbd->config;
+ struct request_queue *q = nbd->disk->queue;
+ struct nbd_sock *nsock;
struct nbd_cmd *cmd;
struct request *rq;
while (1) {
- cmd = nbd_read_stat(nbd, args->index);
- if (IS_ERR(cmd)) {
- struct nbd_sock *nsock = config->socks[args->index];
+ struct nbd_reply reply;
- mutex_lock(&nsock->tx_lock);
- nbd_mark_nsock_dead(nbd, nsock, 1);
- mutex_unlock(&nsock->tx_lock);
+ if (nbd_read_reply(nbd, args->index, &reply))
break;
- }
+ if (!percpu_ref_tryget(&q->q_usage_counter))
+ break;
+
+ cmd = nbd_handle_reply(nbd, args->index, &reply);
+ if (IS_ERR(cmd)) {
+ blk_queue_exit(q);
+ break;
+ }
rq = blk_mq_rq_from_pdu(cmd);
if (likely(!blk_should_fake_timeout(rq->q)))
blk_mq_complete_request(rq);
+ blk_queue_exit(q);
}
+
+ nsock = config->socks[args->index];
+ mutex_lock(&nsock->tx_lock);
+ nbd_mark_nsock_dead(nbd, nsock, 1);
+ mutex_unlock(&nsock->tx_lock);
+
nbd_config_put(nbd);
atomic_dec(&config->recv_threads);
wake_up(&config->recv_wq);
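One detail worth checking in a refactor like this: once `reply` is a pointer parameter, as in nbd_read_reply(), rather than a local struct, `sizeof(reply)` is the size of the pointer, while `sizeof(*reply)` is the size of the reply header. A minimal sketch of the difference, using a hypothetical `model_reply` struct whose 16-byte layout is an assumption made for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reply header: two 32-bit fields plus an 8-byte handle. */
struct model_reply {
	uint32_t magic;
	uint32_t error;
	char handle[8];
};

/* Length a callee should read when it receives the reply by pointer. */
static size_t model_reply_len(const struct model_reply *reply)
{
	assert(reply != NULL);
	return sizeof(*reply);	/* size of the struct */
}

/* What sizeof(reply) yields once reply is a pointer: the pointer size. */
static size_t model_pointer_len(const struct model_reply *reply)
{
	return sizeof(reply);
}
```

Using the pointer size as an I/O length would cap the read below the full header, so the length passed to the iterator should match the `sizeof(*reply)` used for `iov_len`.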
Thanks,
Ming
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-15 3:16 ` Ming Lei
@ 2021-09-15 3:36 ` yukuai (C)
2021-09-15 3:46 ` Ming Lei
0 siblings, 1 reply; 24+ messages in thread
From: yukuai (C) @ 2021-09-15 3:36 UTC (permalink / raw)
To: Ming Lei; +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On 2021/09/15 11:16, Ming Lei wrote:
> On Wed, Sep 15, 2021 at 09:54:09AM +0800, yukuai (C) wrote:
>> On 2021/09/14 22:37, Ming Lei wrote:
>>> On Tue, Sep 14, 2021 at 05:19:31PM +0800, yukuai (C) wrote:
>>>> On 2021/09/14 15:46, Ming Lei wrote:
>>>>
>>>>> If the above can happen, blk_mq_find_and_get_req() may not fix it too, just
>>>>> wondering why not take the following simpler way for avoiding the UAF?
>>>>>
>>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>>>>> index 5170a630778d..dfa5cce71f66 100644
>>>>> --- a/drivers/block/nbd.c
>>>>> +++ b/drivers/block/nbd.c
>>>>> @@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work)
>>>>> work);
>>>>> struct nbd_device *nbd = args->nbd;
>>>>> struct nbd_config *config = nbd->config;
>>>>> + struct request_queue *q = nbd->disk->queue;
>>>>> struct nbd_cmd *cmd;
>>>>> struct request *rq;
>>>>> + if (!percpu_ref_tryget(&q->q_usage_counter))
>>>>> + return;
>>>>> +
>>>>> while (1) {
>>>>> cmd = nbd_read_stat(nbd, args->index);
>>>>> if (IS_ERR(cmd)) {
>>>>> @@ -813,6 +817,7 @@ static void recv_work(struct work_struct *work)
>>>>> if (likely(!blk_should_fake_timeout(rq->q)))
>>>>> blk_mq_complete_request(rq);
>>>>> }
>>>>> + blk_queue_exit(q);
>>>>> nbd_config_put(nbd);
>>>>> atomic_dec(&config->recv_threads);
>>>>> wake_up(&config->recv_wq);
>>>>>
>>>>
>>>> Hi, Ming
>>>>
>>>> This apporch is wrong.
>>>>
>>>> If blk_mq_freeze_queue() is called, and nbd is waiting for all
>>>> request to complete. percpu_ref_tryget() will fail here, and deadlock
>>>> will occur because request can't complete in recv_work().
>>>
>>> No, percpu_ref_tryget() won't fail until ->q_usage_counter is zero, when
>>> it is perfectly fine to do nothing in recv_work().
>>>
>>
>> Hi Ming
>>
>> This apporch is a good idea, however we should not get q_usage_counter
>> in reccv_work(), because It will block freeze queue.
>>
>> How about get q_usage_counter in nbd_read_stat(), and put in error path
>> or after request completion?
>
> OK, looks I missed that nbd_read_stat() needs to wait for incoming reply
> first, so how about the following change by partitioning nbd_read_stat()
> into nbd_read_reply() and nbd_handle_reply()?
Hi, Ming
The change looks good to me.
Do you want to send a patch to fix this?
Thanks,
Kuai
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 5170a630778d..477fe057fc93 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -683,38 +683,47 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
> return 0;
> }
>
> -/* NULL returned = something went wrong, inform userspace */
> -static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
> +static int nbd_read_reply(struct nbd_device *nbd, int index,
> + struct nbd_reply *reply)
> {
> - struct nbd_config *config = nbd->config;
> int result;
> - struct nbd_reply reply;
> - struct nbd_cmd *cmd;
> - struct request *req = NULL;
> - u64 handle;
> - u16 hwq;
> - u32 tag;
> - struct kvec iov = {.iov_base = &reply, .iov_len = sizeof(reply)};
> + struct kvec iov = {.iov_base = reply, .iov_len = sizeof(*reply)};
> struct iov_iter to;
> - int ret = 0;
>
> - reply.magic = 0;
> + reply->magic = 0;
> - iov_iter_kvec(&to, READ, &iov, 1, sizeof(reply));
> + iov_iter_kvec(&to, READ, &iov, 1, sizeof(*reply));
> result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
> - if (result <= 0) {
> - if (!nbd_disconnected(config))
> + if (result < 0) {
> + if (!nbd_disconnected(nbd->config))
> dev_err(disk_to_dev(nbd->disk),
> "Receive control failed (result %d)\n", result);
> - return ERR_PTR(result);
> + return result;
> }
>
> - if (ntohl(reply.magic) != NBD_REPLY_MAGIC) {
> + if (ntohl(reply->magic) != NBD_REPLY_MAGIC) {
> dev_err(disk_to_dev(nbd->disk), "Wrong magic (0x%lx)\n",
> - (unsigned long)ntohl(reply.magic));
> - return ERR_PTR(-EPROTO);
> + (unsigned long)ntohl(reply->magic));
> + return -EPROTO;
> }
>
> - memcpy(&handle, reply.handle, sizeof(handle));
> + return 0;
> +}
> +
> +/* NULL returned = something went wrong, inform userspace */
> +static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
> + struct nbd_reply *reply)
> +{
> + struct nbd_config *config = nbd->config;
> + int result;
> + struct nbd_cmd *cmd;
> + struct request *req = NULL;
> + u64 handle;
> + u16 hwq;
> + u32 tag;
> + struct iov_iter to;
> + int ret = 0;
> +
> + memcpy(&handle, reply->handle, sizeof(handle));
> tag = nbd_handle_to_tag(handle);
> hwq = blk_mq_unique_tag_to_hwq(tag);
> if (hwq < nbd->tag_set.nr_hw_queues)
> @@ -747,9 +756,9 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
> ret = -ENOENT;
> goto out;
> }
> - if (ntohl(reply.error)) {
> + if (ntohl(reply->error)) {
> dev_err(disk_to_dev(nbd->disk), "Other side returned error (%d)\n",
> - ntohl(reply.error));
> + ntohl(reply->error));
> cmd->status = BLK_STS_IOERR;
> goto out;
> }
> @@ -795,24 +804,36 @@ static void recv_work(struct work_struct *work)
> work);
> struct nbd_device *nbd = args->nbd;
> struct nbd_config *config = nbd->config;
> + struct request_queue *q = nbd->disk->queue;
> + struct nbd_sock *nsock;
> struct nbd_cmd *cmd;
> struct request *rq;
>
> while (1) {
> - cmd = nbd_read_stat(nbd, args->index);
> - if (IS_ERR(cmd)) {
> - struct nbd_sock *nsock = config->socks[args->index];
> + struct nbd_reply reply;
>
> - mutex_lock(&nsock->tx_lock);
> - nbd_mark_nsock_dead(nbd, nsock, 1);
> - mutex_unlock(&nsock->tx_lock);
> + if (nbd_read_reply(nbd, args->index, &reply))
> break;
> - }
>
> + if (!percpu_ref_tryget(&q->q_usage_counter))
> + break;
> +
> + cmd = nbd_handle_reply(nbd, args->index, &reply);
> + if (IS_ERR(cmd)) {
> + blk_queue_exit(q);
> + break;
> + }
> rq = blk_mq_rq_from_pdu(cmd);
> if (likely(!blk_should_fake_timeout(rq->q)))
> blk_mq_complete_request(rq);
> + blk_queue_exit(q);
> }
> +
> + nsock = config->socks[args->index];
> + mutex_lock(&nsock->tx_lock);
> + nbd_mark_nsock_dead(nbd, nsock, 1);
> + mutex_unlock(&nsock->tx_lock);
> +
> nbd_config_put(nbd);
> atomic_dec(&config->recv_threads);
> wake_up(&config->recv_wq);
>
>
> Thanks,
> Ming
>
> .
>
^ permalink raw reply [flat|nested] 24+ messages in thread
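The disagreement above turns on when percpu_ref_tryget() fails. In the
kernel, percpu_ref_tryget() succeeds as long as the count is nonzero,
while percpu_ref_tryget_live() also fails once the ref has been killed.
The following is a minimal userspace sketch of that difference, not
kernel code: struct model_ref and every model_* function are
hypothetical names, and the scenario assumes that starting a freeze
kills the ref and that new submitters use the live variant.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a percpu_ref in atomic mode. */
struct model_ref {
	atomic_long count;	/* starts at 1: the initial reference */
	atomic_bool dying;	/* set once the ref has been killed */
};

static void model_ref_init(struct model_ref *ref)
{
	atomic_init(&ref->count, 1);
	atomic_init(&ref->dying, false);
}

/* Models percpu_ref_tryget(): succeeds while the count is nonzero. */
static bool model_ref_tryget(struct model_ref *ref)
{
	long old = atomic_load(&ref->count);

	while (old > 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&ref->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Models percpu_ref_tryget_live(): also fails once the ref is killed. */
static bool model_ref_tryget_live(struct model_ref *ref)
{
	if (atomic_load(&ref->dying))
		return false;
	return model_ref_tryget(ref);
}

/* Drops one reference; returns true when the count reaches zero. */
static bool model_ref_put(struct model_ref *ref)
{
	long old = atomic_fetch_sub(&ref->count, 1);

	assert(old > 0);	/* dropping a reference nobody holds is a bug */
	return old == 1;
}

/* Models percpu_ref_kill(): mark dying, drop the initial reference. */
static void model_ref_kill(struct model_ref *ref)
{
	atomic_store(&ref->dying, true);
	model_ref_put(ref);
}

/* One request in flight, then a freeze starts. Returns 0 if each step
 * behaves as described in the thread, a negative step number otherwise. */
static int model_freeze_scenario(void)
{
	struct model_ref ref;

	model_ref_init(&ref);
	if (!model_ref_tryget_live(&ref))	/* request submitted: count 2 */
		return -1;
	model_ref_kill(&ref);			/* freeze starts: count 1 */
	if (model_ref_tryget_live(&ref))	/* new submitters are refused */
		return -2;
	if (!model_ref_tryget(&ref))		/* receiver can still pin: count 2 */
		return -3;
	model_ref_put(&ref);			/* request completes: count 1 */
	if (!model_ref_put(&ref))		/* receiver drops: count 0 */
		return -4;
	if (model_ref_tryget(&ref))		/* nothing in flight: tryget fails */
		return -5;
	return 0;
}
```

Calling model_freeze_scenario() walks the sequence discussed above: the
plain tryget keeps working for the receiver while a request is still in
flight, and only fails after the last reference is gone, when there is
nothing left for recv_work() to complete.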
* Re: [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req()
2021-09-15 3:36 ` yukuai (C)
@ 2021-09-15 3:46 ` Ming Lei
0 siblings, 0 replies; 24+ messages in thread
From: Ming Lei @ 2021-09-15 3:46 UTC (permalink / raw)
To: yukuai (C); +Cc: axboe, josef, hch, linux-block, linux-kernel, nbd, yi.zhang
On Wed, Sep 15, 2021 at 11:36:47AM +0800, yukuai (C) wrote:
> On 2021/09/15 11:16, Ming Lei wrote:
> > On Wed, Sep 15, 2021 at 09:54:09AM +0800, yukuai (C) wrote:
> > > On 2021/09/14 22:37, Ming Lei wrote:
> > > > On Tue, Sep 14, 2021 at 05:19:31PM +0800, yukuai (C) wrote:
> > > > > On 2021/09/14 15:46, Ming Lei wrote:
> > > > >
> > > > > > If the above can happen, blk_mq_find_and_get_req() may not fix it too, just
> > > > > > wondering why not take the following simpler way for avoiding the UAF?
> > > > > >
> > > > > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > > > > index 5170a630778d..dfa5cce71f66 100644
> > > > > > --- a/drivers/block/nbd.c
> > > > > > +++ b/drivers/block/nbd.c
> > > > > > @@ -795,9 +795,13 @@ static void recv_work(struct work_struct *work)
> > > > > > work);
> > > > > > struct nbd_device *nbd = args->nbd;
> > > > > > struct nbd_config *config = nbd->config;
> > > > > > + struct request_queue *q = nbd->disk->queue;
> > > > > > struct nbd_cmd *cmd;
> > > > > > struct request *rq;
> > > > > > + if (!percpu_ref_tryget(&q->q_usage_counter))
> > > > > > + return;
> > > > > > +
> > > > > > while (1) {
> > > > > > cmd = nbd_read_stat(nbd, args->index);
> > > > > > if (IS_ERR(cmd)) {
> > > > > > @@ -813,6 +817,7 @@ static void recv_work(struct work_struct *work)
> > > > > > if (likely(!blk_should_fake_timeout(rq->q)))
> > > > > > blk_mq_complete_request(rq);
> > > > > > }
> > > > > > + blk_queue_exit(q);
> > > > > > nbd_config_put(nbd);
> > > > > > atomic_dec(&config->recv_threads);
> > > > > > wake_up(&config->recv_wq);
> > > > > >
> > > > >
> > > > > Hi, Ming
> > > > >
> > > > > > This approach is wrong.
> > > > > >
> > > > > > If blk_mq_freeze_queue() is called and nbd is waiting for all
> > > > > > requests to complete, percpu_ref_tryget() will fail here, and a
> > > > > > deadlock will occur because requests can't complete in recv_work().
> > > >
> > > > No, percpu_ref_tryget() won't fail until ->q_usage_counter is zero, when
> > > > it is perfectly fine to do nothing in recv_work().
> > > >
> > >
> > > Hi Ming
> > >
> > > > This approach is a good idea; however, we should not get
> > > > q_usage_counter in recv_work(), because holding it there will block
> > > > queue freezing.
> > > >
> > > > How about getting q_usage_counter in nbd_read_stat(), and putting it
> > > > in the error path or after request completion?
> >
> > > OK, it looks like I missed that nbd_read_stat() needs to wait for an
> > > incoming reply first, so how about the following change, which
> > > partitions nbd_read_stat() into nbd_read_reply() and nbd_handle_reply()?
>
> Hi, Ming
>
> The change looks good to me.
>
> Do you want to send a patch to fix this?
I guess you may add an inflight check or a similar change in
nbd_read_stat(), so feel free to fold it into your series.
Thanks,
Ming
^ permalink raw reply [flat|nested] 24+ messages in thread
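The split into a read stage and a handle stage, together with the
inflight check mentioned above, can be sketched in userspace as follows.
This is only an illustration under stated assumptions: struct
model_queue, struct model_reply and all model_* functions are
hypothetical, the usage counter is a plain integer rather than a
percpu_ref, and replies come from an array instead of a socket. The
point it shows is that the receive loop holds no queue reference while
it waits for a reply, and takes one only around the handling of each
reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_TAGS 8

/* Hypothetical stand-ins for the request queue and the wire reply. */
struct model_queue {
	long usage;			/* models q->q_usage_counter */
	bool inflight[MODEL_NR_TAGS];	/* tags with a request in flight */
	bool recv_holds_ref;		/* true while the receiver pins the queue */
};

struct model_reply {
	unsigned int tag;
};

static bool model_queue_tryget(struct model_queue *q)
{
	if (q->usage <= 0)
		return false;
	q->usage++;
	q->recv_holds_ref = true;
	return true;
}

static void model_queue_put(struct model_queue *q)
{
	assert(q->usage > 0);
	q->usage--;
	q->recv_holds_ref = false;
}

/* Stage 1: wait for the next reply. No queue reference is held here,
 * so a freeze is never blocked by a receiver that is merely waiting. */
static int model_read_reply(struct model_queue *q,
			    const struct model_reply *src, size_t nr,
			    size_t *pos, struct model_reply *out)
{
	assert(!q->recv_holds_ref);
	if (*pos >= nr)
		return -1;	/* models the socket being closed */
	*out = src[(*pos)++];
	return 0;
}

/* Stage 2: match the reply to an in-flight request. A tag with no
 * request in flight is treated as an unexpected message. */
static int model_handle_reply(struct model_queue *q,
			      const struct model_reply *reply)
{
	if (reply->tag >= MODEL_NR_TAGS || !q->inflight[reply->tag])
		return -1;
	q->inflight[reply->tag] = false;
	q->usage--;		/* the completed request drops its own reference */
	return 0;
}

/* Receive loop: returns the number of requests completed before stopping. */
static int model_recv_loop(struct model_queue *q,
			   const struct model_reply *src, size_t nr)
{
	size_t pos = 0;
	int completed = 0;

	for (;;) {
		struct model_reply reply;

		if (model_read_reply(q, src, nr, &pos, &reply))
			break;
		if (!model_queue_tryget(q))
			break;
		if (model_handle_reply(q, &reply)) {
			model_queue_put(q);
			break;
		}
		completed++;
		model_queue_put(q);
	}
	return completed;
}
```

With three requests in flight (tags 1, 2 and 3) and replies arriving for
tags 1, 2, 5 and 3, the loop completes two requests and stops at the
reply for tag 5, which has no matching request; the request with tag 3
stays in flight and the receiver holds no reference when it exits.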
end of thread, other threads:[~2021-09-15 3:46 UTC | newest]
Thread overview: 24+ messages
2021-09-09 14:12 [PATCH v5 0/6] handle unexpected message from server Yu Kuai
2021-09-09 14:12 ` [PATCH v5 1/6] nbd: don't handle response without a corresponding request message Yu Kuai
2021-09-14 0:54 ` Ming Lei
2021-09-09 14:12 ` [PATCH v5 2/6] nbd: make sure request completion won't concurrent Yu Kuai
2021-09-14 0:57 ` Ming Lei
2021-09-14 3:11 ` yukuai (C)
2021-09-09 14:12 ` [PATCH v5 3/6] nbd: check sock index in nbd_read_stat() Yu Kuai
2021-09-09 14:12 ` [PATCH v5 4/6] blk-mq: export two symbols to get request by tag Yu Kuai
2021-09-09 14:12 ` [PATCH v5 5/6] nbd: convert to use blk_mq_find_and_get_req() Yu Kuai
2021-09-14 1:11 ` Ming Lei
2021-09-14 3:11 ` yukuai (C)
2021-09-14 6:44 ` Ming Lei
2021-09-14 7:13 ` yukuai (C)
2021-09-14 7:46 ` Ming Lei
2021-09-14 9:08 ` yukuai (C)
2021-09-14 9:12 ` yukuai (C)
2021-09-14 14:33 ` Ming Lei
2021-09-14 9:19 ` yukuai (C)
2021-09-14 14:37 ` Ming Lei
2021-09-15 1:54 ` yukuai (C)
2021-09-15 3:16 ` Ming Lei
2021-09-15 3:36 ` yukuai (C)
2021-09-15 3:46 ` Ming Lei
2021-09-09 14:12 ` [PATCH v5 6/6] nbd: don't start request if nbd_queue_rq() failed Yu Kuai