[LU-5130] Test failure sanity test_17n: destroy remote dir error 0 Created: 02/Jun/14  Updated: 18/Feb/15  Resolved: 22/Dec/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0
Fix Version/s: None

Type: Bug Priority: Critical
Reporter: Maloo Assignee: Di Wang
Resolution: Duplicate Votes: 0
Labels: None

Issue Links:
Related
is related to LU-5420 Failure on test suite sanity test_17m... Resolved
Severity: 3
Rank (Obsolete): 14157

 Description   

This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

This issue relates to the following test suite run:
https://maloo.whamcloud.com/test_sets/f8142678-e8e5-11e3-849b-52540035b04c
https://maloo.whamcloud.com/test_sets/92c4a4ee-e4fa-11e3-a294-52540035b04c

The sub-test test_17n failed with the following error:

rm: cannot remove `/mnt/lustre/d17n.sanity/remote_dir_0': Stale file handle
destroy remote dir error 0

Info required for matching: sanity 17n



 Comments   
Comment by Andreas Dilger [ 02/Jun/14 ]

Di, can you take a quick look at this? Is this a regression of some kind?

Comment by Di Wang [ 25/Jun/14 ]

This seems related to the LOD object initialization process, which unnecessarily loads the striping object. I will cook up a patch.

Comment by Di Wang [ 26/Jun/14 ]

http://review.whamcloud.com/10841

Comment by Andreas Dilger [ 30/Jun/14 ]

Di, is the http://review.whamcloud.com/10841 patch fixing a critical enough problem that it should land for 2.6.0, or is it just an optimization?

Comment by Di Wang [ 30/Jun/14 ]

Andreas: it is a fix for the remote dir unlink error, not just an optimization, so it should probably go into 2.6.0. The patch still has some problems according to the Maloo tests, which I am trying to figure out now. Thanks.

Comment by Jodi Levi (Inactive) [ 08/Jul/14 ]

Patch landed to Master.

Comment by Nathaniel Clark [ 22/Dec/14 ]

Recurrence on master:
https://testing.hpdd.intel.com/test_sets/500c0696-87e4-11e4-a70f-5254006e85c2
https://testing.hpdd.intel.com/test_sets/e01556ea-762b-11e4-8532-5254006e85c2
https://testing.hpdd.intel.com/test_sets/081395c6-6cb3-11e4-8bd3-5254006e85c2
https://testing.hpdd.intel.com/test_sets/8f143d96-6b37-11e4-b1b4-5254006e85c2

Comment by Di Wang [ 22/Dec/14 ]

I checked the console log; this should be related to LU-5420, i.e. a stale config log caused the OST not to be set up correctly.

21:30:38:Lustre: lustre-MDT0001: Recovery over after 0:04, of 5 clients 5 recovered and 0 were evicted.
21:30:38:LustreError: 3444:0:(lod_lov.c:821:validate_lod_and_idx()) lustre-MDT0001-mdtlov: bad idx: 7 of 32
21:30:38:Lustre: DEBUG MARKER: /usr/sbin/lctl mark  sanity test_17n: @@@@@@ FAIL: destroy remote dir error 0 
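
When triaging further occurrences, it may help to check a console log for the same `validate_lod_and_idx()` error before linking the run to LU-5420. The following is a minimal sketch, not part of the Lustre test framework: the function name `find_bad_idx` and the regular expression are assumptions based only on the log lines quoted above, and the message format may differ in other Lustre versions.

```python
import re

# Assumed pattern, derived from the single "bad idx" line quoted in this ticket.
BAD_IDX = re.compile(
    r"validate_lod_and_idx\(\)\)\s+(?P<dev>\S+):\s+"
    r"bad idx:\s+(?P<idx>\d+)\s+of\s+(?P<size>\d+)"
)

def find_bad_idx(lines):
    """Return (device, idx, size) for every 'bad idx' error found in lines."""
    hits = []
    for line in lines:
        m = BAD_IDX.search(line)
        if m:
            hits.append((m.group("dev"), int(m.group("idx")), int(m.group("size"))))
    return hits

# Sample input: the console log lines from the comment above.
sample = [
    "21:30:38:Lustre: lustre-MDT0001: Recovery over after 0:04, "
    "of 5 clients 5 recovered and 0 were evicted.",
    "21:30:38:LustreError: 3444:0:(lod_lov.c:821:validate_lod_and_idx()) "
    "lustre-MDT0001-mdtlov: bad idx: 7 of 32",
]

print(find_bad_idx(sample))
```

A non-empty result only indicates that the log carries the same error signature; it does not by itself confirm the stale config log as the root cause.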

Comment by Di Wang [ 22/Dec/14 ]

Duplicate of LU-5420.

Generated at Sat Feb 10 01:48:47 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.