Details
Type: Bug
Resolution: Unresolved
Priority: Minor
Affects versions: Lustre 2.12.0, Lustre 2.10.4, Lustre 2.10.5
Severity: 3
Description
replay-vbr test_7f fails while mounting an MDS. It is not clear when this test started failing with this error, but it appears that the test did not fail on MDS mount for two and a half months before it started failing again on July 14, 2018. All failures of this type since March 2018 are listed below.
Looking at the failure at https://testing.whamcloud.com/test_sets/5b253cd8-878f-11e8-9028-52540065bddc, the only sign of trouble in the test_log is when we try to mount the failover MDS:
Failing mds1 on trevis-4vm8
+ pm -h powerman --off trevis-4vm8
Command completed successfully
reboot facets: mds1
+ pm -h powerman --on trevis-4vm8
Command completed successfully
Failover mds1 to trevis-4vm7
12:30:33 (1531571433) waiting for trevis-4vm7 network 900 secs ...
12:30:33 (1531571433) network interface is UP
CMD: trevis-4vm7 hostname
mount facets: mds1
CMD: trevis-4vm7 test -b /dev/lvm-Role_MDS/P1
CMD: trevis-4vm7 e2label /dev/lvm-Role_MDS/P1
trevis-4vm7: e2label: No such file or directory while trying to open /dev/lvm-Role_MDS/P1
trevis-4vm7: Couldn't find valid filesystem superblock.
Starting mds1: -o loop /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1
CMD: trevis-4vm7 mkdir -p /mnt/lustre-mds1; mount -t lustre -o loop /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1
trevis-4vm7: mount: /dev/lvm-Role_MDS/P1: failed to setup loop device: No such file or directory
Start of /dev/lvm-Role_MDS/P1 on mds1 failed 32
replay-vbr test_7f: @@@@@@ FAIL: Restart of mds1 failed!
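To help triage other test sessions for the same signature, the three tell-tale messages in the excerpt above (the e2label error, the loop device setup error, and the final "Restart of ... failed!" line) can be matched mechanically. The following is only a rough sketch: the function name classify_log and the shape of its return value are made up for illustration and are not part of the Lustre test framework, and the patterns are taken solely from the log text quoted in this ticket.

```python
import re

# Patterns copied from the messages in the test_log excerpt in this ticket.
# classify_log and its return value are hypothetical helpers for triage only.
MISSING_DEV = re.compile(
    r"e2label: No such file or directory while trying to open (\S+)"
)
LOOP_FAIL = re.compile(
    r"mount: (\S+): failed to setup loop device: No such file or directory"
)
RESTART_FAIL = re.compile(
    r"(\S+) (test_\w+): @+ FAIL: Restart of (\w+) failed!"
)


def classify_log(text):
    """Return details of the failure if all three messages appear, else None."""
    dev = MISSING_DEV.search(text)
    loop = LOOP_FAIL.search(text)
    fail = RESTART_FAIL.search(text)
    if not (dev and loop and fail):
        return None
    return {
        "suite": fail.group(1),
        "test": fail.group(2),
        "facet": fail.group(3),
        "device": dev.group(1),
        # True when e2label and mount complained about the same device path.
        "same_device": dev.group(1) == loop.group(1),
    }
```

Passing the text of a test_log to classify_log returns None for logs that do not contain all three messages, so it can be used as a simple filter over a directory of downloaded logs.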
In all of the following cases, test 7g hangs when test 7f fails in this way:
2018-08-15 2.10.5 RC2 - fails in “test_7f.5 last”
https://testing.whamcloud.com/test_sets/a75d306e-a081-11e8-8ee3-52540065bddc
2018-08-02 2.10.4.14 - fails in “test_7f.5 last”
https://testing.whamcloud.com/test_sets/7405ad54-9645-11e8-a9f7-52540065bddc
2018-07-14 2.10.4.8 - fails in “test_7f.1 last”
https://testing.whamcloud.com/test_sets/5b253cd8-878f-11e8-9028-52540065bddc
2018-04-12 2.11.50.51 - fails in “test_7f.4 last”
https://testing.whamcloud.com/test_sets/37bad538-3e69-11e8-b45c-52540065bddc
2018-03-03 2.10.3.35 - fails in “test_7f.4 last”
https://testing.whamcloud.com/test_sets/f33a4326-1f0f-11e8-a6ca-52540065bddc
In the following test sessions, replay-vbr test 7e fails in the way described above and test 7f hangs:
2018-07-15 2.10.4.8 - fails in “test_7e.5 last”
https://testing.whamcloud.com/test_sets/0ca6ca46-87fc-11e8-b376-52540065bddc
2018-03-14 2.10.59 - fails in “test_7e.5 last”
https://testing.whamcloud.com/test_sets/d26f60b0-2809-11e8-b6a0-52540065bddc