[LU-10708] replay-single test_20b: Restart of mds1 failed!

| Created: | 23/Feb/18 | Updated: | 20/Nov/19 |
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.11.0, Lustre 2.12.0, Lustre 2.12.1, Lustre 2.12.3, Lustre 2.13.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | zfs |
| Environment: | Hard Failover |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
Description
replay-single test_20b - Restart of mds1 failed!

This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/7f078076-15ba-11e8-bd00-52540065bddc

test_20b failed with the following error:

Restart of mds1 failed!

test_logs:

== replay-single test 20b: write, unlink, eviction, replay (test mds_cleanup_orphans) ================ 19:46:25 (1519069585)
CMD: onyx-32vm7 lctl set_param -n os[cd]*.*MDT*.force_sync=1
CMD: onyx-32vm6 lctl set_param -n osd*.*OS*.force_sync=1
/mnt/lustre/f20b.replay-single
lmm_stripe_count:   1
lmm_stripe_size:    1048576
lmm_pattern:        raid0
lmm_layout_gen:     0
lmm_stripe_offset:  0
        obdidx           objid          objid            group
             0            4770         0x12a2                0
CMD: onyx-32vm7 /usr/sbin/lctl set_param -n mdt.lustre-MDT0000.evict_client 425b1455-3c86-3ef4-e5f3-8752f5bdb612
10000+0 records in
10000+0 records out
40960000 bytes (41 MB) copied, 1.08016 s, 37.9 MB/s
CMD: onyx-32vm7 lctl set_param -n osd*.*MDT*.force_sync=1
CMD: onyx-32vm7 /usr/sbin/lctl dl
Failing mds1 on onyx-32vm7
+ pm -h powerman --off onyx-32vm7
Command completed successfully
reboot facets: mds1
+ pm -h powerman --on onyx-32vm7
Command completed successfully
Failover mds1 to onyx-32vm8
19:46:42 (1519069602) waiting for onyx-32vm8 network 900 secs ...
19:46:42 (1519069602) network interface is UP
CMD: onyx-32vm8 hostname
mount facets: mds1
CMD: onyx-32vm8 lsmod | grep zfs >&/dev/null || modprobe zfs; zpool list -H lustre-mdt1 >/dev/null 2>&1 || zpool import -f -o cachefile=none -o failmode=panic -d /dev/lvm-Role_MDS lustre-mdt1
onyx-32vm8: cannot import 'lustre-mdt1': no such pool available
replay-single test_20b: @@@@@@ FAIL: Restart of mds1 failed!
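The failing step is the zpool import on the failover node: ZFS reports "no such pool available" for lustre-mdt1 when asked to import from /dev/lvm-Role_MDS. Below is a minimal, illustrative sketch of a retry-with-rescan approach, assuming the root cause is that the shared LVM devices are not yet activated/visible on the failover node at the moment the import runs. The pool name, device directory, and import options are taken from the log above; the retry loop and LVM rescan are an assumption for illustration, not the test framework's actual logic.

#!/bin/bash
# Illustrative only: retry the pool import after rescanning devices, in case
# the shared LVM devices are not yet visible on the failover node.
POOL=lustre-mdt1
DEVDIR=/dev/lvm-Role_MDS

for i in $(seq 1 10); do
    # If the pool is already imported, we are done.
    zpool list -H "$POOL" >/dev/null 2>&1 && exit 0

    # Make sure the shared VG/LVs are activated and visible (assumption:
    # the failover node sees the same shared storage as the primary).
    vgscan >/dev/null 2>&1
    vgchange -ay >/dev/null 2>&1
    udevadm settle

    # Show what ZFS can actually discover in the device directory, then retry.
    zpool import -d "$DEVDIR"
    zpool import -f -o cachefile=none -o failmode=panic -d "$DEVDIR" "$POOL" && exit 0

    sleep 5
done

echo "cannot import '$POOL' after retries" >&2
exit 1

If the import still fails after the rescan and retries, the devices themselves are likely missing on the failover node rather than merely slow to appear.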
Comments

Comment by James Nunez (Inactive) [ 09/May/18 ]
We see the same problem when remounting the MDS after a failover in recovery-mds-scale test_failover_mds. See the following logs: https://testing.hpdd.intel.com/test_sets/c57c0bda-527d-11e8-b9d3-52540065bddc

From the client test_log:

Failing mds1 on trevis-8vm7
+ pm -h powerman --off trevis-8vm7
Command completed successfully
reboot facets: mds1
+ pm -h powerman --on trevis-8vm7
Command completed successfully
Failover mds1 to trevis-8vm8
19:58:09 (1525550289) waiting for trevis-8vm8 network 900 secs ...
19:58:09 (1525550289) network interface is UP
CMD: trevis-8vm8 hostname
mount facets: mds1
CMD: trevis-8vm8 lsmod | grep zfs >&/dev/null || modprobe zfs; zpool list -H lustre-mdt1 >/dev/null 2>&1 || zpool import -f -o cachefile=none -d /dev/lvm-Role_MDS lustre-mdt1
trevis-8vm8: cannot import 'lustre-mdt1': no such pool available
recovery-mds-scale test_failover_mds: @@@@@@ FAIL: Restart of mds1 failed!
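When triaging one of these failures, it helps to capture what the failover node can actually see at the moment the import fails. A small sketch is below (the device directory comes from the logs above; the commands are standard ZFS/LVM tools run by hand, not part of the test suite), intended to be run on the failover node right after the "cannot import ... no such pool available" message.

#!/bin/bash
# Illustrative triage helper: record whether the shared devices and their
# ZFS labels are visible on the failover node at the time of the failure.
DEVDIR=/dev/lvm-Role_MDS

echo "== device nodes under $DEVDIR =="
ls -l "$DEVDIR" 2>&1

echo "== LVM view of the shared storage =="
pvs 2>&1
lvs 2>&1

echo "== pools ZFS can discover in $DEVDIR =="
zpool import -d "$DEVDIR" 2>&1

echo "== ZFS labels on each device =="
for dev in "$DEVDIR"/*; do
    echo "--- $dev ---"
    zdb -l "$dev" 2>&1 | head -20
done

An empty or missing $DEVDIR would point at the failover node not seeing the shared storage at all, while visible devices with intact labels would point at a race or import-path problem instead.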
Comment by James Nunez (Inactive) [ 17/Dec/18 ]
Similar failure for recovery-random-scale test_fail_client_mds at https://testing.whamcloud.com/test_sets/e3b58552-fea5-11e8-b837-52540065bddc

CMD: trevis-25vm11 hostname
mount facets: mds1
CMD: trevis-25vm11 lsmod | grep zfs >&/dev/null || modprobe zfs; zpool list -H lustre-mdt1 >/dev/null 2>&1 || zpool import -f -o cachefile=none -o failmode=panic -d /dev/lvm-Role_MDS lustre-mdt1
trevis-25vm11: cannot import 'lustre-mdt1': no such pool available
recovery-random-scale test_fail_client_mds: @@@@@@ FAIL: Restart of mds1 failed!
Comment by James Nunez (Inactive) [ 29/Apr/19 ]
Another similar failure with replay-single test 3c at https://testing.whamcloud.com/test_sets/4051fd66-682a-11e9-bd0e-52540065bddc