LKML Archive on lore.kernel.org
* xen-blkfront: weird behavior of "iostat" after VM live-migrate which xen-blkfront module has indirect descriptors
@ 2015-01-23  7:59 Ouyang Zhaowei (Charles)
  2015-01-23 11:15 ` Roger Pau Monné
  0 siblings, 1 reply; 4+ messages in thread
From: Ouyang Zhaowei (Charles) @ 2015-01-23  7:59 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: linux-kernel, suoben, liuyingdong, weiping.ding

Hi Roger,

We have been testing the indirect descriptor feature of the xen-blkfront module recently.
We found that after live-migrating a VM a couple of times, the "%util" reported by iostat stays at 100%, and several requests remain stuck in "avgqu-sz".
We have checked several more recent Linux releases, and it happens on Ubuntu 14.04, Ubuntu 14.10 and RHEL 7.0.

The iostat output looks like this:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.00    0.00    0.00  100.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
xvda              0.00     0.00    0.00    0.00     0.00     0.00     0.00     4.00    0.00    0.00    0.00   0.00 100.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

Could you tell us why this is happening? Is this a bug?

Thanks!

Ouyang Zhaowei
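
For anyone who wants to check the same symptom without iostat: iostat derives "%util" and "avgqu-sz" from the per-device counters in /proc/diskstats, where the ninth statistics field is the number of I/Os currently in progress. The C sketch below parses one such line and flags a device whose in-flight count is non-zero while its completed-I/O counters do not move between two samples. The struct, the function names and the sample values in the note after the code are hypothetical and only illustrate the idea; this is a heuristic under those assumptions, not a definitive test.

```c
#include <stdio.h>

/* Subset of one /proc/diskstats line (names here are illustrative). */
struct diskstats_sample {
    char name[32];                  /* device name, e.g. "xvda" */
    unsigned long reads_completed;  /* statistics field 1 */
    unsigned long writes_completed; /* statistics field 5 */
    unsigned long in_flight;        /* field 9: I/Os currently in progress */
    unsigned long io_ticks_ms;      /* field 10: ms spent doing I/O */
    unsigned long time_in_queue_ms; /* field 11: weighted ms doing I/O */
};

/* Parse "major minor name f1 ... f11"; returns 0 on success, -1 otherwise. */
static int parse_diskstats_line(const char *line, struct diskstats_sample *out)
{
    int n = sscanf(line,
                   "%*u %*u %31s %lu %*u %*u %*u %lu %*u %*u %*u %lu %lu %lu",
                   out->name, &out->reads_completed, &out->writes_completed,
                   &out->in_flight, &out->io_ticks_ms, &out->time_in_queue_ms);
    return n == 6 ? 0 : -1;
}

/*
 * Heuristic: requests are counted as in flight, yet nothing completed
 * between the two samples. On an otherwise idle disk this matches the
 * "avgqu-sz 4.00, %util 100.00, r/s 0, w/s 0" picture reported above.
 */
static int diskstats_looks_stuck(const struct diskstats_sample *before,
                                 const struct diskstats_sample *after)
{
    return after->in_flight > 0 &&
           after->reads_completed == before->reads_completed &&
           after->writes_completed == before->writes_completed;
}
```

As an illustration with made-up numbers, two samples of a line such as "202 0 xvda 1000 0 8000 500 2000 0 16000 900 4 100000 400000" taken one second apart, with the completion counters unchanged and the in-flight field still at 4, would be flagged by diskstats_looks_stuck().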



* Re: xen-blkfront: weird behavior of "iostat" after VM live-migrate which xen-blkfront module has indirect descriptors
  2015-01-23  7:59 xen-blkfront: weird behavior of "iostat" after VM live-migrate which xen-blkfront module has indirect descriptors Ouyang Zhaowei (Charles)
@ 2015-01-23 11:15 ` Roger Pau Monné
  2015-01-26  2:30   ` Ouyang Zhaowei (Charles)
  0 siblings, 1 reply; 4+ messages in thread
From: Roger Pau Monné @ 2015-01-23 11:15 UTC (permalink / raw)
  To: Ouyang Zhaowei (Charles)
  Cc: linux-kernel, suoben, liuyingdong, weiping.ding, xen-devel,
	David Vrabel, Konrad Rzeszutek Wilk, Boris Ostrovsky

Hello,

On 23/01/15 at 8.59, Ouyang Zhaowei (Charles) wrote:
> Hi Roger,
> 
> We have been testing the indirect descriptor feature of the xen-blkfront module recently.
> We found that after live-migrating a VM a couple of times, the "%util" reported by iostat stays at 100%, and several requests remain stuck in "avgqu-sz".
> We have checked several more recent Linux releases, and it happens on Ubuntu 14.04, Ubuntu 14.10 and RHEL 7.0.
> 
> The iostat output looks like this:
> 
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.00    0.00    0.00    0.00    0.00  100.00
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> xvda              0.00     0.00    0.00    0.00     0.00     0.00     0.00     4.00    0.00    0.00    0.00   0.00 100.00
> dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> Could you tell us why this is happening? Is this a bug?

It is a bug indeed, thanks for reporting it. The problem seems to be 
that blk_put_request (which is used to discard the old requests before 
requeuing them) doesn't update the queue statistics. The following 
patch solves the problem for me, could you try it and report back?

---
commit bb4317c051ca81a2906edb7ccc505cbd6d1d80c7
Author: Roger Pau Monne <roger.pau@citrix.com>
Date:   Fri Jan 23 12:10:51 2015 +0100

    xen-blkfront: fix accounting of reqs when migrating
    
    Current migration code uses blk_put_request in order to finish a request
    before requeuing it. This function doesn't update the statistics of the
    queue, which completely screws accounting. Use blk_end_request_all instead
    which properly updates the statistics of the queue.
    
    Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 5ac312f..aac41c1 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1493,7 +1493,7 @@ static int blkif_recover(struct blkfront_info *info)
 		merge_bio.tail = copy[i].request->biotail;
 		bio_list_merge(&bio_list, &merge_bio);
 		copy[i].request->bio = NULL;
-		blk_put_request(copy[i].request);
+		blk_end_request_all(copy[i].request, 0);
 	}
 
 	kfree(copy);
@@ -1516,7 +1516,7 @@ static int blkif_recover(struct blkfront_info *info)
 		req->bio = NULL;
 		if (req->cmd_flags & (REQ_FLUSH | REQ_FUA))
 			pr_alert("diskcache flush request found!\n");
-		__blk_put_request(info->rq, req);
+		__blk_end_request_all(req, 0);
 	}
 	spin_unlock_irq(&info->io_lock);
 

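To make the accounting argument concrete, here is a toy userspace model in C. It is not kernel code: every name in it is hypothetical. The assumption it encodes is that a request is counted as in flight when it starts, that only the "end" path (standing in for blk_end_request_all) drops that count, and that the "put" path (standing in for blk_put_request) frees the request without touching the statistics. The busy time and weighted queue time then grow whenever the in-flight count is non-zero, which is what iostat turns into "%util" and "avgqu-sz".

```c
#include <assert.h>

/* Toy per-disk statistics; field names are illustrative, not the kernel's. */
struct toy_disk_stats {
    unsigned int in_flight;         /* started but not yet accounted as done */
    unsigned long io_ticks_ms;      /* time during which in_flight was non-zero */
    unsigned long time_in_queue_ms; /* sum of in_flight * elapsed time */
};

/* A request enters the queue and is accounted as in flight. */
static void toy_start(struct toy_disk_stats *s) { s->in_flight++; }

/* Models the "end" path: the request is finished AND accounted as done. */
static void toy_end(struct toy_disk_stats *s)
{
    assert(s->in_flight > 0);
    s->in_flight--;
}

/* Models the "put" path: the request is released, statistics untouched. */
static void toy_put(struct toy_disk_stats *s) { (void)s; }

/* Advance time; the disk counts as busy whenever in_flight is non-zero. */
static void toy_tick(struct toy_disk_stats *s, unsigned long elapsed_ms)
{
    if (s->in_flight) {
        s->io_ticks_ms += elapsed_ms;
        s->time_in_queue_ms += (unsigned long)s->in_flight * elapsed_ms;
    }
}

/* iostat-style figures over an interval of interval_ms. */
static double toy_util_percent(const struct toy_disk_stats *s, unsigned long interval_ms)
{
    return 100.0 * (double)s->io_ticks_ms / (double)interval_ms;
}

static double toy_avgqu_sz(const struct toy_disk_stats *s, unsigned long interval_ms)
{
    return (double)s->time_in_queue_ms / (double)interval_ms;
}

/*
 * Start nr_reqs requests, discard them the way the recovery path would
 * (use_end = 0: "put", use_end = 1: "end"), then sit idle for idle_ms.
 */
static struct toy_disk_stats toy_recover_and_idle(int use_end, unsigned int nr_reqs,
                                                  unsigned long idle_ms)
{
    struct toy_disk_stats s = {0, 0, 0};
    unsigned int i;

    for (i = 0; i < nr_reqs; i++)
        toy_start(&s);
    for (i = 0; i < nr_reqs; i++) {
        if (use_end)
            toy_end(&s);
        else
            toy_put(&s);
    }
    toy_tick(&s, idle_ms);
    return s;
}
```

In this model, toy_recover_and_idle(0, 4, 1000) leaves four requests counted as in flight on an idle disk, so the derived utilisation stays at 100% with a queue size of 4, matching the report. toy_recover_and_idle(1, 4, 1000) ends with both figures at zero, which is the behaviour the patch aims for.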


* Re: xen-blkfront: weird behavior of "iostat" after VM live-migrate which xen-blkfront module has indirect descriptors
  2015-01-23 11:15 ` Roger Pau Monné
@ 2015-01-26  2:30   ` Ouyang Zhaowei (Charles)
  2015-01-30  8:37     ` Ouyang Zhaowei (Charles)
  0 siblings, 1 reply; 4+ messages in thread
From: Ouyang Zhaowei (Charles) @ 2015-01-26  2:30 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: linux-kernel, suoben, liuyingdong, weiping.ding, xen-devel,
	David Vrabel, Konrad Rzeszutek Wilk, Boris Ostrovsky


On 2015.1.23 19:15, Roger Pau Monné wrote:
> Hello,
> 
> On 23/01/15 at 8.59, Ouyang Zhaowei (Charles) wrote:
>> Hi Roger,
>>
>> We have been testing the indirect descriptor feature of the xen-blkfront module recently.
>> We found that after live-migrating a VM a couple of times, the "%util" reported by iostat stays at 100%, and several requests remain stuck in "avgqu-sz".
>> We have checked several more recent Linux releases, and it happens on Ubuntu 14.04, Ubuntu 14.10 and RHEL 7.0.
>>
>> The iostat output looks like this:
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.00    0.00    0.00    0.00    0.00  100.00
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>> xvda              0.00     0.00    0.00    0.00     0.00     0.00     0.00     4.00    0.00    0.00    0.00   0.00 100.00
>> dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
>> dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
>>
>> Could you tell us why this is happening? Is this a bug?
> 
> It is a bug indeed, thanks for reporting it. The problem seems to be 
> that blk_put_request (which is used to discard the old requests before 
> requeuing them) doesn't update the queue statistics. The following 
> patch solves the problem for me, could you try it and report back?
> 
> ---
> commit bb4317c051ca81a2906edb7ccc505cbd6d1d80c7
> Author: Roger Pau Monne <roger.pau@citrix.com>
> Date:   Fri Jan 23 12:10:51 2015 +0100
> 
>     xen-blkfront: fix accounting of reqs when migrating
>     
>     Current migration code uses blk_put_request in order to finish a request
>     before requeuing it. This function doesn't update the statistics of the
>     queue, which completely screws accounting. Use blk_end_request_all instead
>     which properly updates the statistics of the queue.
>     
>     Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> index 5ac312f..aac41c1 100644
> --- a/drivers/block/xen-blkfront.c
> +++ b/drivers/block/xen-blkfront.c
> @@ -1493,7 +1493,7 @@ static int blkif_recover(struct blkfront_info *info)
>  		merge_bio.tail = copy[i].request->biotail;
>  		bio_list_merge(&bio_list, &merge_bio);
>  		copy[i].request->bio = NULL;
> -		blk_put_request(copy[i].request);
> +		blk_end_request_all(copy[i].request, 0);
>  	}
>  
>  	kfree(copy);
> @@ -1516,7 +1516,7 @@ static int blkif_recover(struct blkfront_info *info)
>  		req->bio = NULL;
>  		if (req->cmd_flags & (REQ_FLUSH | REQ_FUA))
>  			pr_alert("diskcache flush request found!\n");
> -		__blk_put_request(info->rq, req);
> +		__blk_end_request_all(req, 0);
>  	}
>  	spin_unlock_irq(&info->io_lock);
>  
> 

Hi Roger,

Thanks for answering this question. Sure, I'll try this patch and test VM migration. So far the patch seems to have fixed the bug (after 10 migrations).
I'll keep testing and will let you know whether it holds up.

Regards,
Ouyang Zhaowei



* Re: xen-blkfront: weird behavior of "iostat" after VM live-migrate which xen-blkfront module has indirect descriptors
  2015-01-26  2:30   ` Ouyang Zhaowei (Charles)
@ 2015-01-30  8:37     ` Ouyang Zhaowei (Charles)
  0 siblings, 0 replies; 4+ messages in thread
From: Ouyang Zhaowei (Charles) @ 2015-01-30  8:37 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: linux-kernel, weiping.ding, xen-devel, David Vrabel,
	Konrad Rzeszutek Wilk, Boris Ostrovsky



On 2015.1.26 10:30, Ouyang Zhaowei (Charles) wrote:
> 
> On 2015.1.23 19:15, Roger Pau Monné wrote:
>> Hello,
>>
>> On 23/01/15 at 8.59, Ouyang Zhaowei (Charles) wrote:
>>> Hi Roger,
>>>
>>> We have been testing the indirect descriptor feature of the xen-blkfront module recently.
>>> We found that after live-migrating a VM a couple of times, the "%util" reported by iostat stays at 100%, and several requests remain stuck in "avgqu-sz".
>>> We have checked several more recent Linux releases, and it happens on Ubuntu 14.04, Ubuntu 14.10 and RHEL 7.0.
>>>
>>> The iostat output looks like this:
>>>
>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>            0.00    0.00    0.00    0.00    0.00  100.00
>>>
>>> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>>> xvda              0.00     0.00    0.00    0.00     0.00     0.00     0.00     4.00    0.00    0.00    0.00   0.00 100.00
>>> dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
>>> dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
>>>
>>> Could you tell us why this is happening? Is this a bug?
>>
>> It is a bug indeed, thanks for reporting it. The problem seems to be 
>> that blk_put_request (which is used to discard the old requests before 
>> requeuing them) doesn't update the queue statistics. The following 
>> patch solves the problem for me, could you try it and report back?

Hi Roger,

After nearly 1000 test migrations, the "%util" reported by iostat no longer gets stuck at 100%, so the patch appears to fix this bug.

Thanks

Ouyang Zhaowei

>>
>> ---
>> commit bb4317c051ca81a2906edb7ccc505cbd6d1d80c7
>> Author: Roger Pau Monne <roger.pau@citrix.com>
>> Date:   Fri Jan 23 12:10:51 2015 +0100
>>
>>     xen-blkfront: fix accounting of reqs when migrating
>>     
>>     Current migration code uses blk_put_request in order to finish a request
>>     before requeuing it. This function doesn't update the statistics of the
>>     queue, which completely screws accounting. Use blk_end_request_all instead
>>     which properly updates the statistics of the queue.
>>     
>>     Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>
>> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
>> index 5ac312f..aac41c1 100644
>> --- a/drivers/block/xen-blkfront.c
>> +++ b/drivers/block/xen-blkfront.c
>> @@ -1493,7 +1493,7 @@ static int blkif_recover(struct blkfront_info *info)
>>  		merge_bio.tail = copy[i].request->biotail;
>>  		bio_list_merge(&bio_list, &merge_bio);
>>  		copy[i].request->bio = NULL;
>> -		blk_put_request(copy[i].request);
>> +		blk_end_request_all(copy[i].request, 0);
>>  	}
>>  
>>  	kfree(copy);
>> @@ -1516,7 +1516,7 @@ static int blkif_recover(struct blkfront_info *info)
>>  		req->bio = NULL;
>>  		if (req->cmd_flags & (REQ_FLUSH | REQ_FUA))
>>  			pr_alert("diskcache flush request found!\n");
>> -		__blk_put_request(info->rq, req);
>> +		__blk_end_request_all(req, 0);
>>  	}
>>  	spin_unlock_irq(&info->io_lock);
>>  
>>
> 
> Hi Roger,
> 
> Thanks for answering this question. Sure, I'll try this patch and test VM migration. So far the patch seems to have fixed the bug (after 10 migrations).
> I'll keep testing and will let you know whether it holds up.
> 
> Regards,
> Ouyang Zhaowei
> 

