[LU-6738] lfs fid2path with invalid fid triggers LBUG: lmv_fld_lookup()) ASSERTION( (fid_seq_in_fldb(fid_seq(fid)) || fid_seq_is_local_file(fid_seq(fid))) && fid_is_sane(fid) ) failed: [0x100190000:0x39b4fc:0x0] is insane!
Created: 17/Jun/15  Updated: 27/Jul/17  Resolved: 27/Jul/17

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Patrick Farrell (Inactive)
Assignee: WC Triage
Resolution: Won't Fix
Votes: 0
Labels: patch

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

On file systems with more than one MDT, lmv_find_target calls lmv_fld_lookup, which sanity checks the provided fid...

However, when this path is reached from lfs fid2path, the FID comes straight from userspace and has not yet been sanity checked. As a result, passing an invalid FID to fid2path trips the assertion in lmv_fld_lookup and causes an LBUG.

This is present in any version of Lustre which supports multiple MDTs.

The crash below is from a Cray system running 2.5, but the bug is present in current master.

Run this command from a client on a file system with multiple MDTs:

lfs fid2path /[fs_mount_point] [0x100190000:0x39b4b2:0x0]

And it will LBUG:

> 2015-06-11T18:11:48.877805+00:00 c1-0c0s1n1 LustreError: 20721:0:(lmv_fld.c:72:lmv_fld_lookup()) ASSERTION( (fid_seq_in_fldb(fid_seq(fid)) || fid_seq_is_local_file(fid_seq(fid))) && fid_is_sane(fid) ) failed: [0x100190000:0x39b4fc:0x0] is insane!
> 2015-06-11T18:11:48.903006+00:00 c1-0c0s1n1 LustreError: 20721:0:(lmv_fld.c:72:lmv_fld_lookup()) LBUG
> 2015-06-11T18:11:48.903020+00:00 c1-0c0s1n1 Pid: 20721, comm: lfs
> 2015-06-11T18:11:48.903043+00:00 c1-0c0s1n1 Call Trace:
> 2015-06-11T18:11:48.903055+00:00 c1-0c0s1n1 [<ffffffff81005e89>] try_stack_unwind+0x169/0x1b0
> 2015-06-11T18:11:48.903067+00:00 c1-0c0s1n1 [<ffffffff81004919>] dump_trace+0x89/0x440
> 2015-06-11T18:11:48.903078+00:00 c1-0c0s1n1 [<ffffffffa02ba8c7>] libcfs_debug_dumpstack+0x57/0x80 [libcfs]
> 2015-06-11T18:11:48.953415+00:00 c1-0c0s1n1 [<ffffffffa02bae27>] lbug_with_loc+0x47/0xc0 [libcfs]
> 2015-06-11T18:11:48.953429+00:00 c1-0c0s1n1 [<ffffffffa09c04a3>] lmv_fld_lookup+0x1d3/0x3c0 [lmv]
> 2015-06-11T18:11:48.953450+00:00 c1-0c0s1n1 [<ffffffffa09bb2e9>] lmv_iocontrol+0x8d9/0x3230 [lmv]
> 2015-06-11T18:11:48.978617+00:00 c1-0c0s1n1 [<ffffffffa088fd1f>] ll_fid2path+0x36f/0xbb0 [lustre]
> 2015-06-11T18:11:48.978634+00:00 c1-0c0s1n1 [<ffffffffa087678f>] ll_dir_ioctl+0x16df/0x5f00 [lustre]
> 2015-06-11T18:11:48.978652+00:00 c1-0c0s1n1 [<ffffffff811609fb>] do_vfs_ioctl+0x9b/0x510
> 2015-06-11T18:11:48.978667+00:00 c1-0c0s1n1 [<ffffffff81160ebf>] sys_ioctl+0x4f/0x80
> 2015-06-11T18:11:49.003830+00:00 c1-0c0s1n1 [<ffffffff81560d2b>] system_call_fastpath+0x16/0x1b
> 2015-06-11T18:11:49.003862+00:00 c1-0c0s1n1 [<00007f7dda2961c7>] 0x7f7dda2961c7
> 2015-06-11T18:11:49.003881+00:00 c1-0c0s1n1 Kernel panic - not syncing: LBUG
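
For readers unfamiliar with the notation, the bracketed value in the assertion message is a FID printed as [sequence:object ID:version], all in hexadecimal. The following is a minimal userspace sketch of parsing that notation, not code from lfs or the Lustre tree; struct sketch_fid and sketch_fid_parse() are hypothetical names introduced only for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace mirror of the three FID fields. */
struct sketch_fid {
	uint64_t seq;	/* sequence number */
	uint32_t oid;	/* object ID within the sequence */
	uint32_t ver;	/* version */
};

/*
 * Parse a string of the form "[0xSEQ:0xOID:0xVER]" into *fid.
 * Returns 0 on success, -1 if the string does not have that shape.
 * This only checks syntax; it says nothing about whether the FID
 * is valid for any particular file system.
 */
static int sketch_fid_parse(const char *str, struct sketch_fid *fid)
{
	char close = '\0';
	char extra;
	int n;

	n = sscanf(str, "[%" SCNx64 ":%" SCNx32 ":%" SCNx32 "%c%c",
		   &fid->seq, &fid->oid, &fid->ver, &close, &extra);

	/*
	 * Expect exactly four conversions: three numbers plus the
	 * closing bracket. A fifth conversion means trailing garbage.
	 */
	if (n != 4 || close != ']')
		return -1;

	return 0;
}
```

Parsing the FID from the assertion message this way yields seq 0x100190000, oid 0x39b4fc and ver 0x0. A syntactically well-formed FID can still be rejected by the kernel-side checks, which is the case this ticket is about.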

The fix is straightforward: sanity check the FID in ll_fid2path, before it reaches lmv_fld_lookup. Patch coming in a moment.
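
The general shape of such a check can be sketched as below. This is a self-contained illustration of validating user-supplied input at the entry point and returning an error instead of asserting later; it does not reproduce the uploaded patch or the real Lustre helpers. All names (sketch_fid, sketch_fid_is_resolvable, sketch_fid2path_entry) are hypothetical, and the single sequence bound is an assumption chosen for illustration, whereas the real assertion accepts several sequence classes.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a FID as received from userspace. */
struct sketch_fid {
	uint64_t seq;	/* sequence number */
	uint32_t oid;	/* object ID within the sequence */
	uint32_t ver;	/* version */
};

/*
 * ASSUMPTION: illustrative lower bound for sequences that a location
 * lookup can resolve. The real rules are more detailed than one bound.
 */
#define SKETCH_SEQ_LOOKUP_MIN 0x200000000ULL

/* Return true if the FID looks like one the lookup could resolve. */
static bool sketch_fid_is_resolvable(const struct sketch_fid *fid)
{
	return fid != NULL &&
	       fid->seq >= SKETCH_SEQ_LOOKUP_MIN &&
	       fid->ver == 0;
}

/*
 * Hypothetical entry point: reject bad input with -EINVAL before any
 * code that would assert on it gets to run.
 */
static int sketch_fid2path_entry(const struct sketch_fid *user_fid)
{
	if (!sketch_fid_is_resolvable(user_fid))
		return -EINVAL;

	/* ... the target lookup and path resolution would go here ... */
	return 0;
}
```

Under these assumed bounds, a FID with sequence 0x100190000 is rejected with -EINVAL at the entry point, so the caller gets an error back rather than a kernel panic.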



 Comments   
Comment by Gerrit Updater [ 17/Jun/15 ]

Patrick Farrell (paf@cray.com) uploaded a new patch: http://review.whamcloud.com/15328
Subject: LU-6738 llite: Sanity check fid in ll_fid2path
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: dfc09a11829e514c2ce2925a1ca02373581d64a1

Comment by Di Wang [ 17/Jun/15 ]

Hmm, it seems this problem has already been fixed on master by LU-4691: http://review.whamcloud.com/#/c/10866/12 . So we should probably either fix this in lmv_fld_lookup() or port that fix to 2_5. Thanks.

Comment by Patrick Farrell (Inactive) [ 18/Jun/15 ]

Oops - Thank you, Di, I missed that fix. Sloppy on my part.

I still think it would be good to sanity check the fid immediately after we get it from userspace, in ll_fid2path, but there's no crash any more in master. Thanks for taking a look.

Comment by Patrick Farrell (Inactive) [ 27/Jul/17 ]

Fixed elsewhere as noted by Di.
