[LU-12946] Multipath path flapping issue Created: 07/Nov/19 Updated: 17/Feb/21 Resolved: 22/Nov/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.13.0, Lustre 2.14.0, Lustre 2.12.4 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Wang Shilong (Inactive) | Assignee: | Wang Shilong (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
The symptoms of this issue are high I/O latency seen on an external server from the multipath devices, without associated latency on the underlying physical disks. The following messages are also associated with this issue:

Oct 28 15:12:23 nvme1 kernel: device-mapper: multipath: Reinstating path 8:160.

Specifically, paths are being failed by the device-mapper module itself (e.g. this message) without any other associated error:

Oct 28 15:12:25 nvme1 kernel: device-mapper: multipath: Failing path 8:160.

and then being reinstated shortly thereafter by the tur checker:

Oct 28 15:12:28 nvme1 multipathd: 360001ff0b05e90000000002e8964000a: sdk - tur checker reports path is up

For a more definitive diagnosis, one can run the attached systemtap script (a sketch of such a script is included after the patch below). The output should look like this, the key being the return/error fields set to 3 and the tail of the backtrace being fail_path called by multipath_end_io:

blk_insert_cloned_request() return=3

The cause is a broken RedHat backport of this upstream patch (https://github.com/torvalds/linux/commit/86ff7c2a80cd357f6156a53b354f6a0b357dc0c9). Here is what I believe to be the fix, which prevents the BLK_MQ_RQ_QUEUE_DEV_BUSY state from causing the request to be prematurely completed in an error state:

--- kernel-3.10.0-957.21.3.el7/linux-3.10.0-957.21.3.el7.x86_64/drivers/md/dm-rq.c 2019-06-14 06:29:35.000000000 +0000
+++ kernel-3.10.0-957.21.3.el7.patched/linux-3.10.0-957.21.3.el7.x86_64/drivers/md/dm-rq.c 2019-10-28 00:16:55.949220284 +0000
@@ -477,7 +477,7 @@
clone->start_time = jiffies;
r = blk_insert_cloned_request(clone->q, clone);
- if (r != BLK_MQ_RQ_QUEUE_OK && r != BLK_MQ_RQ_QUEUE_BUSY)
+ if (r != BLK_MQ_RQ_QUEUE_OK && r != BLK_MQ_RQ_QUEUE_BUSY && r != BLK_MQ_RQ_QUEUE_DEV_BUSY)
/* must complete clone in terms of original request */
dm_complete_request(rq, r);
return r;
@@ -661,7 +661,7 @@
trace_block_rq_remap(clone->q, clone, disk_devt(dm_disk(md)),
blk_rq_pos(rq));
ret = dm_dispatch_clone_request(clone, rq);
- if (ret == BLK_MQ_RQ_QUEUE_BUSY) {
+ if (ret == BLK_MQ_RQ_QUEUE_BUSY || ret == BLK_MQ_RQ_QUEUE_DEV_BUSY) {
blk_rq_unprep_clone(clone);
tio->ti->type->release_clone_rq(clone);
tio->clone = NULL;
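
For reference, a minimal SystemTap sketch in the spirit of the attached script (this is not the attachment itself; the probe points, module name, and output format are assumptions based on the RHEL 7 kernel layout) could look like:

# Flag non-zero returns from blk_insert_cloned_request() and print the
# kernel backtrace whenever dm-multipath fails a path; on an affected
# system the backtrace should show fail_path() called from multipath_end_io().
probe kernel.function("blk_insert_cloned_request").return
{
	if ($return != 0)
		printf("blk_insert_cloned_request() return=%d\n", $return)
}

probe module("dm_multipath").function("fail_path")
{
	printf("fail_path()\n")
	print_backtrace()
}

Run it on the affected server with something like "stap -v pathflap.stp" ("pathflap.stp" is just an illustrative file name); kernel and dm-multipath debuginfo are required, and the fail_path probe may need adjusting if that function is inlined.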
|
| Comments |
| Comment by Gerrit Updater [ 07/Nov/19 ] |
|
Wang Shilong (wshilong@ddn.com) uploaded a new patch: https://review.whamcloud.com/36699 |
| Comment by Ruth Klundt (Inactive) [ 20/Nov/19 ] |
|
We see this with RHEL 7.7 and Lustre 2.10+. We confirmed the patch fixes the issue. Is there a RHEL bugzilla associated with this? Thanks. |
| Comment by Chris Hunter (Inactive) [ 21/Nov/19 ] |
|
There is a private BZ:
|
| Comment by Gerrit Updater [ 22/Nov/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36699/ |
| Comment by Olaf Faaland [ 22/Nov/19 ] |
|
Should this be backported to b2_12? |
| Comment by Peter Jones [ 22/Nov/19 ] |
|
Yes |
| Comment by Gerrit Updater [ 26/Nov/19 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36868 |
| Comment by Gerrit Updater [ 12/Dec/19 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36868/ |