[LU-4855] LBUG when doing ls on newly created stripe directory Created: 02/Apr/14  Updated: 02/Oct/14  Resolved: 28/Apr/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.6.0, Lustre 2.5.4

Type: Bug Priority: Critical
Reporter: Robert Read (Inactive) Assignee: Di Wang
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 13390

 Description   

Testing DNE2 with Lustre 2.5.57.

I created an 8-stripe directory, immediately ran "ls -la" on it, and hit the following LBUG. The filesystem was newly created and had ~120 MDTs.

LustreError: 1998:0:(fld_request.c:145:fld_rrb_scan()) cli-scratch-clilmv-ffff8803c3e18000: Can't find target by hash 1 (seq 0x480000400). Targets (3):
LustreError: 1998:0:(fld_request.c:156:fld_rrb_scan())   exp: 0xffff8803c35c2800 (5283e71c-0538-312f-c5a4-4dd348e9f08c), srv: 0x(null) (<null>), idx: 0
LustreError: 1998:0:(fld_request.c:156:fld_rrb_scan())   exp: 0xffff8803c34c8400 (5283e71c-0538-312f-c5a4-4dd348e9f08c), srv: 0x(null) (<null>), idx: 6
LustreError: 1998:0:(fld_request.c:163:fld_rrb_scan()) LBUG
Pid: 1998, comm: bash

Call Trace:
 [<ffffffffa0198895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa0198e97>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa07672a1>] fld_rrb_scan+0x251/0x260 [fld]
 [<ffffffffa0769c0a>] ? fld_cache_lookup+0x3a/0x1e0 [fld]
 [<ffffffffa07698a5>] fld_client_lookup+0x175/0x4a0 [fld]
 [<ffffffffa053e820>] ? lustre_swab_mdt_body+0x0/0x140 [ptlrpc]
 [<ffffffffa07a4c38>] lmv_fld_lookup+0x98/0x3c0 [lmv]
 [<ffffffff8128a55a>] ? strlcpy+0x4a/0x60
 [<ffffffffa0790417>] lmv_unpack_md+0x5f7/0xb20 [lmv]
 [<ffffffffa079094e>] lmv_unpackmd+0xe/0x10 [lmv]
 [<ffffffffa07f1036>] obd_unpackmd+0xe6/0x360 [mdc]
 [<ffffffffa07fe6e8>] mdc_get_lustre_md+0x3c8/0x13e0 [mdc]
 [<ffffffffa07874b8>] lmv_get_lustre_md+0x88/0x300 [lmv]
 [<ffffffffa053a2d5>] ? lustre_msg_buf+0x55/0x60 [ptlrpc]
 [<ffffffffa0561f56>] ? __req_capsule_get+0x166/0x6e0 [ptlrpc]
 [<ffffffffa08f11f6>] ll_prep_inode+0x306/0xde0 [lustre]
 [<ffffffffa05625d8>] ? req_capsule_server_get+0x18/0x20 [ptlrpc]
 [<ffffffffa07a3e15>] ? lmv_intent_lookup+0xcb5/0xd00 [lmv]
 [<ffffffffa08ffc00>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
 [<ffffffffa09006b1>] ll_lookup_it_finish+0x2c1/0xe20 [lustre]
 [<ffffffffa08ffc00>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
 [<ffffffffa08e1128>] ? ll_prep_md_op_data+0x1a8/0x490 [lustre]
 [<ffffffffa0901505>] ll_lookup_it+0x2f5/0xb50 [lustre]
 [<ffffffffa08ffc00>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
 [<ffffffffa0901dec>] ll_lookup_nd+0x8c/0x3e0 [lustre]
 [<ffffffff811a39ae>] ? d_alloc+0x13e/0x1b0
 [<ffffffff81198a85>] do_lookup+0x1a5/0x230
 [<ffffffff811993a4>] __link_path_walk+0x794/0xff0
 [<ffffffffa0807b6b>] ? mdc_lock_match+0xbb/0x170 [mdc]
 [<ffffffff81199eba>] path_walk+0x6a/0xe0
 [<ffffffff8119a0cb>] filename_lookup+0x6b/0xc0
 [<ffffffffa08c248f>] ? ll_file_data_put+0xbf/0xd0 [lustre]
 [<ffffffff8119b1f7>] user_path_at+0x57/0xa0
 [<ffffffff810ec17e>] ? rcu_start_gp+0x1be/0x230
 [<ffffffff8118ea40>] vfs_fstatat+0x50/0xa0
 [<ffffffff811aaa70>] ? mntput_no_expire+0x30/0x110
 [<ffffffff8118ebbb>] vfs_stat+0x1b/0x20
 [<ffffffff8118ebe4>] sys_newstat+0x24/0x50
 [<ffffffff810e2057>] ? audit_syscall_entry+0x1d7/0x200
 [<ffffffff810e1e4e>] ? __audit_syscall_exit+0x25e/0x290
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
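
The fld_rrb_scan() messages above can be cross-checked with a short script. The following is a minimal sketch, not part of the original report: it parses the hash, sequence, and target count from the logged lines and compares them with the target indices that were actually printed. The idea that the hash equals the sequence modulo the target count is an assumption made here for illustration; the ticket does not state it, although the logged values are consistent with it. The names `LOG`, `HEADER_RE`, `TARGET_RE`, and `summarize` are hypothetical.

```python
import re

# Log lines copied from the description of this ticket.
LOG = """\
LustreError: 1998:0:(fld_request.c:145:fld_rrb_scan()) cli-scratch-clilmv-ffff8803c3e18000: Can't find target by hash 1 (seq 0x480000400). Targets (3):
LustreError: 1998:0:(fld_request.c:156:fld_rrb_scan())   exp: 0xffff8803c35c2800 (5283e71c-0538-312f-c5a4-4dd348e9f08c), srv: 0x(null) (<null>), idx: 0
LustreError: 1998:0:(fld_request.c:156:fld_rrb_scan())   exp: 0xffff8803c34c8400 (5283e71c-0538-312f-c5a4-4dd348e9f08c), srv: 0x(null) (<null>), idx: 6
"""

# Matches the "Can't find target" header: hash, sequence (hex), target count.
HEADER_RE = re.compile(
    r"Can't find target by hash (\d+) \(seq (0x[0-9a-fA-F]+)\)\. Targets \((\d+)\)"
)
# Matches the index printed for each listed target.
TARGET_RE = re.compile(r"idx: (\d+)")


def summarize(log):
    """Extract the lookup parameters and listed target indices from the log."""
    header = HEADER_RE.search(log)
    if header is None:
        raise ValueError("no fld_rrb_scan header line found")
    wanted_hash = int(header.group(1))
    seq = int(header.group(2), 16)
    target_count = int(header.group(3))
    listed_idx = [int(idx) for idx in TARGET_RE.findall(log)]
    return {
        "wanted_hash": wanted_hash,
        "seq": seq,
        "target_count": target_count,
        "listed_idx": listed_idx,
        # Assumption: the hash is seq modulo the number of targets.
        "seq_mod_count": seq % target_count,
        "hash_present": wanted_hash in listed_idx,
    }


if __name__ == "__main__":
    for key, value in summarize(LOG).items():
        print(f"{key}: {value}")
```

Under that assumption, the requested hash matches the sequence modulo the reported target count, while the printed target indices do not include that value, which agrees with the "Can't find target" message. Note also that the header reports three targets but only two are listed in the log.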


 Comments   
Comment by Di Wang [ 03/Apr/14 ]

http://review.whamcloud.com/9877

Comment by Jodi Levi (Inactive) [ 28/Apr/14 ]

Patch landed to Master. Please reopen if more work is needed in this ticket.

Comment by James A Simmons [ 05/Sep/14 ]

Just hit this bug on the 2.5 branch. It will need a backport.

Comment by James A Simmons [ 05/Sep/14 ]

http://review.whamcloud.com/#/c/11780/
