[LU-9707] Failover: recovery-random-scale test_fail_client_mds: Restart of mds1 failed!
| Created: | 23/Jun/17 | Updated: | 20/Nov/19 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.0, Lustre 2.10.1, Lustre 2.11.0, Lustre 2.12.0, Lustre 2.10.5, Lustre 2.13.0, Lustre 2.12.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Environment: | Failover |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>.

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/55c95900-576f-11e7-9221-5254006e85c2.

The sub-test test_fail_client_mds failed with the following error:

Restart of mds1 failed!

test logs:

CMD: trevis-7vm12 hostname
mount facets: mds1
CMD: trevis-7vm12 test -b /dev/lvm-Role_MDS/P1
CMD: trevis-7vm12 e2label /dev/lvm-Role_MDS/P1
trevis-7vm12: e2label: No such file or directory while trying to open /dev/lvm-Role_MDS/P1
trevis-7vm12: Couldn't find valid filesystem superblock.
Starting mds1: -o loop /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1
CMD: trevis-7vm12 mkdir -p /mnt/lustre-mds1; mount -t lustre -o loop /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1
trevis-7vm12: mount: /dev/lvm-Role_MDS/P1: failed to setup loop device: No such file or directory
Start of /dev/lvm-Role_MDS/P1 on mds1 failed 32
recovery-random-scale test_fail_client_mds: @@@@@@ FAIL: Restart of mds1 failed! |
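For context, below is a minimal shell sketch of the mds1 start sequence that fails above, reconstructed from the log. Running the commands over ssh and the lvs/vgchange diagnostic step are assumptions added for illustration; the ENOENT from e2label and the loop-device failure both indicate that the /dev/lvm-Role_MDS/P1 device node is missing on trevis-7vm12 at that point, but the cause is not confirmed by the logs.

    # Hedged sketch, not the test framework's actual code: replay the start of
    # mds1 by hand on the failover node. The lvs/vgchange step is a diagnostic
    # assumption (check whether the lvm-Role_MDS volume group is visible and
    # activated); everything else mirrors the commands in the log above.
    ssh trevis-7vm12 'test -b /dev/lvm-Role_MDS/P1 && echo "device node present"'
    ssh trevis-7vm12 'lvs'                           # is the lvm-Role_MDS VG/LV visible at all?
    ssh trevis-7vm12 'vgchange -ay lvm-Role_MDS'     # activate its LVs if the VG is imported but inactive
    ssh trevis-7vm12 'e2label /dev/lvm-Role_MDS/P1'  # the step that fails with "No such file or directory"
    ssh trevis-7vm12 'mkdir -p /mnt/lustre-mds1; mount -t lustre -o loop /dev/lvm-Role_MDS/P1 /mnt/lustre-mds1'

If the device node only appears after the vgchange step, the failure would be in LVM activation on the failover node rather than in Lustre itself; that is a hypothesis for triage, not something the attached logs confirm.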
| Comments |
| Comment by James Casper [ 26/Sep/17 ] |
|
2.10.1: |
| Comment by James Nunez (Inactive) [ 14/Mar/19 ] |
|
We have a very similar failure with mmp test 6 and replay-single test 0a for 2.10.7 RC1. Logs are at https://testing.whamcloud.com/test_sets/8c397c98-429d-11e9-a256-52540065bddc and https://testing.whamcloud.com/test_sets/8d508356-429d-11e9-a256-52540065bddc, respectively. |
| Comment by James Nunez (Inactive) [ 19/Nov/19 ] |
|
Since this ticket is old, just an update that we are still seeing this issue with RHEL 7.7 servers and clients during failover testing. Here's a link to a recent failure: https://testing.whamcloud.com/test_sets/d9837c4a-07b6-11ea-bbc3-52540065bddc. |