[LU-3119] System hang when running sanity test 24x on DNE with ZFS Created: 05/Apr/13 Updated: 08/Apr/13 Resolved: 08/Apr/13 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Sarah Liu | Assignee: | WC Triage |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | dne, zfs | ||
| Environment: |
server and client: lustre-master build #1370 |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 7574 | ||||||||
| Description |
|
System hang when running sanity test_24x with 2 MDTs over ZFS.

MDS console shows:

Lustre: DEBUG MARKER: == sanity test 24x: cross rename/link should be failed == 11:47:46 (1365187666)
LustreError: 14710:0:(mdt_reint.c:944:mdt_reint_link()) Target directory [0x380000bd0:0x1ae82:0x0] is on another MDT
LNet: Service thread pid 14747 completed after 0.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).

Client console shows:

Lustre: DEBUG MARKER: == sanity test 24x: cross rename/link should be failed == 11:47:46 (1365187666)
Lustre: 8053:0:(dir.c:463:ll_get_dir_page()) Page-wide hash collision: 1706989648068149248
LustreError: 8053:0:(dir.c:594:ll_dir_read()) error reading dir [0x380000bd0:0x27db:0x0] at 1706989648068149248: rc -5
Lustre: 8053:0:(dir.c:463:ll_get_dir_page()) Page-wide hash collision: 1706989648068149248
LustreError: 8053:0:(dir.c:594:ll_dir_read()) error reading dir [0x380000bd0:0x27db:0x0] at 1706989648068149248: rc -5 |
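For readers unfamiliar with the "rc -5" notation in the client log, the following is a minimal sketch of how a negative kernel-style return code can be mapped to its errno name. It assumes Linux errno numbering; the helper name decode_rc is hypothetical and is not part of Lustre or its test suite.

```python
import errno
import os


def decode_rc(rc):
    """Map a negative kernel-style return code to (errno name, message).

    Hypothetical helper for reading logs; assumes Linux errno numbering.
    """
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # rc -5 from ll_dir_read() in the client log corresponds to EIO.
    for rc in (-5, -errno.EXDEV):
        name, message = decode_rc(rc)
        print(f"rc {rc}: {name} ({message})")
```

Under that assumption, the "rc -5" reported by ll_dir_read() is EIO (an I/O error while reading the directory), which is distinct from EXDEV, the cross-device error a cross-MDT rename or link would normally be expected to return.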
| Comments |
| Comment by Di Wang [ 06/Apr/13 ] |
|
I suspect this is already fixed. I tried DNE on ZFS (1 MDS/2 MDTs) and it works locally:

== sanity test 24x: cross rename/link should be failed == 20:34:56 (1390278896)
rename returned -1: Invalid cross-device link
rename returned -1: Invalid cross-device link
ln: creating hard link `/mnt/lustre/d0.sanity/d24/remote_dir/tgt_file1' => `/mnt/lustre/d0.sanity/d24/src_file': Invalid cross-device link
Resetting fail_loc on all nodes...done.
PASS 24x (1s)

Please try the latest master. |
| Comment by Andreas Dilger [ 08/Apr/13 ] |
|
Duplicate of |