[LU-13221] conf-sanity test_112: FAIL: MDS start failed Created: 07/Feb/20  Updated: 28/Jul/22

Status: Reopened
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0, Lustre 2.15.0, Lustre 2.15.1
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-13813 conf-sanity test_112: can't put impor... Resolved
is related to LU-12818 replay-single test_70b and other test... Resolved
is related to LU-13184 conf-sanity test_112: problem creatin... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

Description

This issue was created by maloo for jianyu <yujian@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/94e11602-472f-11ea-a1c8-52540065bddc

test_112 failed with the following error:

Starting mds2:   /dev/mapper/mds2_flakey /mnt/lustre-mds2
CMD: trevis-6vm5 mkdir -p /mnt/lustre-mds2; mount -t lustre   /dev/mapper/mds2_flakey /mnt/lustre-mds2
trevis-6vm5: mount.lustre: mount /dev/mapper/mds2_flakey at /mnt/lustre-mds2 failed: No such file or directory
trevis-6vm5: Is the MGS specification correct?
trevis-6vm5: Is the filesystem name correct?
trevis-6vm5: If upgrading, is the copied client log valid? (see upgrade docs)
Start of /dev/mapper/mds2_flakey on mds2 failed 2
 conf-sanity test_112: @@@@@@ FAIL: MDS start failed 
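
For triage, a minimal sketch of how one might check the "Is the MGS specification correct?" hints from the failing MDS node. The device path is taken from the log above; the MGS NID is a placeholder to substitute for the real one:

tunefs.lustre --dryrun /dev/mapper/mds2_flakey  # print the fsname/mgsnode parameters stored on the target without changing anything
lctl ping <mgs_nid>                             # e.g. 10.9.6.5@tcp; confirms the MGS NID is reachable over LNet
dmesg | tail -n 50                              # the kernel log usually pinpoints where the -ENOENT came from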


VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
conf-sanity test_112 - MDS start failed



Comments
Comment by Jian Yu [ 10/May/20 ]

+1 on master branch:
https://testing.whamcloud.com/test_sets/27dd246a-6ef3-4cbd-a235-a89c5d117cb8

Comment by Andreas Dilger [ 26/Jan/21 ]

There is a patch, https://review.whamcloud.com/37393 "LU-13184 tests: wait for OST startup in test_112", which may fix this problem.
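
For reference, a minimal sketch of the wait-for-OST-startup approach, assuming the patch uses the existing test-framework.sh helpers (the authoritative change is in the review above):

start ost1 $(ostdevname 1) $OST_MOUNT_OPTS || error "unable to start OST1"
# Do not proceed until the MDS's OSC import to the OST reports FULL;
# wait_osc_import_state is an existing test-framework.sh helper.
wait_osc_import_state mds ost FULL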

Comment by James Nunez (Inactive) [ 15/Feb/21 ]

I'm reopening this ticket because we are still seeing this issue, or something that looks very much like it. See https://testing.whamcloud.com/test_sets/ea83e839-7f5d-48da-b284-2cba8c50d366 for the logs from an ARM client testing session.
ZFS, DNE: https://testing.whamcloud.com/test_sets/179cb35b-27de-485e-8a50-8a5dc64dd8cf

Comment by James Nunez (Inactive) [ 29/Sep/21 ]

For DNE testing on the master branch, we are seeing a similar error:

Starting mds2: -o localrecov  /dev/mapper/mds2_flakey /mnt/lustre-mds2
CMD: trevis-4vm3 mkdir -p /mnt/lustre-mds2; mount -t lustre -o localrecov  /dev/mapper/mds2_flakey /mnt/lustre-mds2
trevis-4vm3: mount.lustre: mount /dev/mapper/mds2_flakey at /mnt/lustre-mds2 failed: Input/output error
trevis-4vm3: Is the MGS running?
pdsh@trevis-79vm13: trevis-4vm3: ssh exited with exit code 5
Start of /dev/mapper/mds2_flakey on mds2 failed 5
 conf-sanity test_112: @@@@@@ FAIL: MDS start failed 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:6319:error()
  = /usr/lib64/lustre/tests/conf-sanity.sh:8470:test_112()

See the following for logs:
https://testing.whamcloud.com/test_sets/4fb3b12e-8f3b-4b75-8fc4-49e257fd3b44
https://testing.whamcloud.com/test_sets/60d02b30-1a14-4951-a6f3-7ec7d37e4d5f
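
Since this failure is EIO with the "Is the MGS running?" hint, rather than the original ENOENT, a minimal sketch for checking MGS liveness before the mount (assuming a combined MGS/MDT on mds1, as in these runs; the NID is a placeholder):

lctl dl | grep -i mgs          # on the MGS node: the MGS device should be listed as UP
lctl get_param mgs.MGS.live.*  # lists the filesystems the MGS is currently serving
lctl ping <mgs_nid>            # from the failing MDS node: verify LNet reachability to the MGS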
