[LU-11681] sanity test 65i fails with 'find /mnt/lustre failed' Created: 19/Nov/18  Updated: 17/Mar/21  Resolved: 07/Sep/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.0, Lustre 2.13.0, Lustre 2.12.1, Lustre 2.12.2, Lustre 2.12.3, Lustre 2.12.4
Fix Version/s: Lustre 2.13.0

Type: Bug Priority: Minor
Reporter: James Nunez (Inactive) Assignee: Lai Siyao
Resolution: Fixed Votes: 0
Labels: LTS12, dne, zfs
Environment: DNE/ZFS
Issue Links:
Duplicate
is duplicated by LU-11767 sanity test_60g: FAIL: mkdir failed Resolved
Related
is related to LU-10755 sanity test 409 fails with 'Fail to c... Open
is related to LU-11418 hung threads on MDT and MDT won't umount Resolved
is related to LU-13099 ll_set_inode()) Can not initialize in... Resolved
is related to LU-11907 sanity fails to cleanup d60g.sanity Resolved
Severity: 3

Description

sanity test 65i is failing on DNE configurations with ZFS. In the client test log at https://testing.whamcloud.com/test_sets/4ac91f66-e875-11e8-bfe1-52540065bddc, ‘lfs find’ fails with:

lfs find: llapi_semantic_traverse: Failed to open '/mnt/lustre/d60g.sanity/subdir2': No such file or directory (2)
error: find failed for /mnt/lustre.
 sanity test_65i: @@@@@@ FAIL: find /mnt/lustre failed 

In the console logs, the only indication of a problem is on the client (vm7) console:

 [ 3432.108946] Lustre: DEBUG MARKER: == sanity test 65i: various tests to set root directory striping ===================================== 23:16:02 (1542237362)
[ 3436.508827] LustreError: 662:0:(llite_lib.c:2390:ll_prep_inode()) new_inode -fatal: rc -2
[ 3438.077315] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity test_65i: @@@@@@ FAIL: find \/mnt\/lustre failed 

We’ve seen this test fail a total of three times since November 14, 2018.
https://testing.whamcloud.com/test_sets/da7b847a-ebb1-11e8-86c0-52540065bddc
https://testing.whamcloud.com/test_sets/5f393496-eba8-11e8-bfe1-52540065bddc



Comments
Comment by Andreas Dilger [ 03/Dec/18 ]

The sanity test_60g was added in patch https://review.whamcloud.com/33401 "LU-11418 llog: refresh remote llog upon -ESTALE", which landed on 2018-11-13.

It looks like the test leaves behind a broken directory stub. At a minimum, the test should be modified to try to remove its test directory at the end, so that any failure is localized to the test that introduced it. We may need to use "lfs rmentry" to delete the partially-created directory entry.

There were 9 failures in the past 4 weeks, so the failure rate isn't high, but this is something that could likely be addressed fairly easily.
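
The cleanup suggested above could look roughly like the sketch below. It is an illustration only, not code from sanity.sh or from any patch on this ticket. The helper name cleanup_test_dir and the fallback order are assumptions. The only Lustre-specific command is "lfs rmentry", which is attempted only when a plain "rm -rf" fails and the lfs binary is present.

```shell
#!/bin/bash
# Hypothetical helper (not part of the Lustre test framework):
# remove a test directory, falling back to "lfs rmentry" for
# entries that a normal recursive remove cannot delete.
cleanup_test_dir() {
    local dir=$1
    local name

    # Refuse an empty argument so we never run "rm -rf" on "".
    [ -n "$dir" ] || return 1

    # Normal path: a recursive remove is enough for a healthy tree.
    # "rm -rf" also succeeds if the directory is already gone.
    rm -rf -- "$dir" 2>/dev/null && return 0

    # Fallback: some entries may be dangling (listed in the directory
    # but failing with ENOENT on open). "lfs rmentry" deletes only the
    # name entry, so it should be reserved for such broken entries.
    command -v lfs >/dev/null 2>&1 || return 1
    ls -A "$dir" 2>/dev/null | while IFS= read -r name; do
        lfs rmentry "$dir/$name"
    done

    # Succeeds only if the directory is now empty.
    rmdir "$dir"
}
```

A test could call cleanup_test_dir on its own directory as its last step, so that a leftover stub is reported by the test that created it rather than by a later "lfs find" over the whole mount point.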

Comment by Lai Siyao [ 04/Dec/18 ]

Okay, I'll look into it soon.

Comment by Gerrit Updater [ 28/Dec/18 ]

Lai Siyao (lai.siyao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33926
Subject: LU-11681 lfsck: read LMV from bottom object
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 12a5b93757f61da46df72c56cb2bcbb157786e41

Comment by Gerrit Updater [ 28/Dec/18 ]

Lai Siyao (lai.siyao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33927
Subject: LU-11681 lfsck: misc fixes for dangling entry repair
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: eb81609cb56f4f3c452d6d2adcb3192a2008e09b

Comment by Gerrit Updater [ 28/Dec/18 ]

Lai Siyao (lai.siyao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33928
Subject: LU-11681 lfsck: misc fixes in inserting shard
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: abea2a3add2fd4d0022b76f53a5aacc6c25fdc96

Comment by Gerrit Updater [ 28/Dec/18 ]

Lai Siyao (lai.siyao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33929
Subject: LU-11681 llite: clear lsm if dir lost its LMV
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: d451f4ee49eec34f56a5bebc1795bf1f9123833e

Comment by Gerrit Updater [ 28/Dec/18 ]

Lai Siyao (lai.siyao@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33930
Subject: LU-11681 lmv: disable remote file statahead
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: cd9447f033d284d7bcbb0c60118af1605f740107

Comment by Minh Diep [ 02/Apr/19 ]

+1 on b2_12: https://testing.whamcloud.com/test_sets/503715b8-54bb-11e9-8e92-52540065bddc

Comment by Gerrit Updater [ 16/Apr/19 ]

James Nunez (jnunez@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34687
Subject: LU-11681 tests: stop running sanity 65i
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 430ec3ba2cc71fb99e8a2fc80adae4638f4f928f

Comment by Andreas Dilger [ 19/Jul/19 ]

There are a bunch of failures in sanity.sh test_60g with "mkdir: cannot create directory '/mnt/lustre/d60g.sanity/new': Input/output error" (linked here from LU-11767 as a duplicate):
https://testing.whamcloud.com/sub_tests/0db5baa0-a326-11e9-8dbe-52540065bddc
https://testing.whamcloud.com/sub_tests/cdb41b3e-a462-11e9-8fc1-52540065bddc
https://testing.whamcloud.com/sub_tests/249487e4-a635-11e9-8fc1-52540065bddc
https://testing.whamcloud.com/sub_tests/d9edd372-a7cd-11e9-9e3d-52540065bddc
https://testing.whamcloud.com/sub_tests/5e57126a-a876-11e9-8dbe-52540065bddc
https://testing.whamcloud.com/sub_tests/7c0f940e-a9ba-11e9-bade-52540065bddc
https://testing.whamcloud.com/sub_tests/7c3941de-aa0d-11e9-bade-52540065bddc
https://testing.whamcloud.com/sub_tests/714d0a40-aa16-11e9-8fc1-52540065bddc

It looks like these patches need to be refreshed.

Comment by Gerrit Updater [ 15/Aug/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33926/
Subject: LU-11681 lfsck: read LMV from bottom object
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ef12cbccb0a5c50da0bb9c4c0bf17e51df5ca91b

Comment by Gerrit Updater [ 15/Aug/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33927/
Subject: LU-11681 lfsck: misc fixes for dangling entry repair
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: a6dafda245a4ee4c5a86f585914cf0dde87e420e

Comment by Gerrit Updater [ 15/Aug/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33928/
Subject: LU-11681 lfsck: misc fixes in inserting shard
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 4becec906d3a997a8ff4a8c10e39633ff69bd738

Comment by Gerrit Updater [ 07/Sep/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33930/
Subject: LU-11681 lmv: disable remote file statahead
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 02b5a407081c88090f7237ef464a6f6c74139f67

Comment by Lai Siyao [ 07/Sep/19 ]

All patches landed.
