[LU-2990] Failure on sanity test_24v: error in listing large dir Created: 19/Mar/13  Updated: 17/Jul/13  Resolved: 09/Apr/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Bug
Priority: Blocker
Reporter: Sarah Liu
Assignee: nasf (Inactive)
Resolution: Fixed
Votes: 0
Labels: MB, zfs
Environment:

server and client: lustre-master build# 1328 RHEL6
fstype is zfs


Attachments: Text File debug, File trace
Issue Links:
Duplicate
is duplicated by LU-3119 System hang when running sanity test ... Resolved
Related
is related to LU-3605 Sanity test suite aborts in test 24v Resolved
Severity: 3
Rank (Obsolete): 7284

 Description   

https://maloo.whamcloud.com/test_sessions/a84605f2-90c2-11e2-8311-52540035b04c

client console:

Lustre: DEBUG MARKER: == sanity test 24v: list directory with large files (handle hash collision, bug: 17560) == 10:15:47 (1363713347)
Lustre: DEBUG MARKER: cancel_lru_locks mdc start
Lustre: DEBUG MARKER: cancel_lru_locks mdc stop
Lustre: DEBUG MARKER: sanity test_24v: @@@@@@ FAIL: error in listing large dir
Lustre: 8261:0:(dir.c:463:ll_get_dir_page()) Page-wide hash collision: 5723140463788032
LustreError: 8261:0:(dir.c:594:ll_dir_read()) error reading dir [0x200000400:0x4:0x0] at 5723140463788032: rc -5
Lustre: 8261:0:(dir.c:463:ll_get_dir_page()) Page-wide hash collision: 5723140463788032
LustreError: 8261:0:(dir.c:594:ll_dir_read()) error reading dir [0x200000400:0x4:0x0] at 5723140463788032: rc -5
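
For readers decoding the console output above: kernel-style return codes such as `rc -5` are conventionally negated Linux errno values, so this one would read as EIO (I/O error). The snippet below is a minimal sketch under that assumption; `decode_rc` is a hypothetical helper name and not part of Lustre or its test suite.

```python
import errno
import os


def decode_rc(rc):
    """Map a kernel-style return code (negated errno) to (name, message).

    Assumes the logged rc follows the usual negative-errno convention.
    """
    code = -rc if rc < 0 else rc
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


# rc value taken from the ll_dir_read() error line in the console log
name, message = decode_rc(-5)
print(name, "-", message)
```

The message text returned by `os.strerror` varies by platform, so only the symbolic name is reliable across systems.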


 Comments   
Comment by Keith Mannthey (Inactive) [ 19/Mar/13 ]

The sanity error logs show ABORT and contain no output for test_24v: https://maloo.whamcloud.com/test_sets/aab26060-90c2-11e2-8311-52540035b04c

I cannot find sanity test_24v when searching in Maloo. There are results for test_24u and test_24w, but none for test_24v.

Do you have more info you can share? How did you run the test?

Comment by Sarah Liu [ 20/Mar/13 ]

Here are the debug log and trace from the client. The system actually hung, so I aborted the test run.

Comment by Jodi Levi (Inactive) [ 20/Mar/13 ]

Fan Yong,
Could you please have a look at this one?
Thank you!

Comment by Andreas Dilger [ 20/Mar/13 ]

We shouldn't get a hash collision with just 100k files. I wonder if something bad is happening with the hash mapping in the ZFS code?
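
A rough birthday-bound estimate illustrates why a collision at this scale looks suspicious. This is only a sketch: it assumes directory hashes behave like uniformly random 64-bit values, which is an assumption and not something confirmed in this ticket, and `collision_probability` is a hypothetical helper name.

```python
import math


def collision_probability(n_items, hash_bits):
    """Approximate P(at least one collision) among n_items hashes.

    Uses the birthday bound 1 - exp(-pairs / 2**hash_bits) and assumes
    the hashes are independent and uniformly distributed.
    """
    pairs = n_items * (n_items - 1) / 2
    # expm1 keeps precision when the probability is very small
    return -math.expm1(-pairs / 2.0 ** hash_bits)


# 100k entries, assumed 64-bit hash space
p = collision_probability(100_000, 64)
print(f"{p:.3e}")
```

Under these assumptions the chance of even one colliding pair is on the order of 1e-10, so a collision that actually shows up in testing points toward a problem in how hashes are generated or mapped, not toward bad luck.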

Comment by nasf (Inactive) [ 30/Mar/13 ]

This is the patch:

http://review.whamcloud.com/#change,5894

Comment by Peter Jones [ 09/Apr/13 ]

Landed for 2.4

Generated at Sat Feb 10 01:30:00 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.