[LU-1531] Accessing .lustre/fid/[0x1:0x0:0x0] triggers LBUG in osd_compat_objid_lookup() Created: 15/Jun/12 Updated: 06/Nov/12 Resolved: 06/Nov/12 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.3.0 |
| Fix Version/s: | Lustre 2.4.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | John Hammond | Assignee: | Richard Henwood (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Environment: |
|
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 6380 | ||||||||
| Description |
|
[sanity@r62-lustre lustre]$ cat .lustre/fid/[0x1:0x0:0x0]
LustreError: 2316:0:(osd_compat.c:383:osd_compat_objid_lookup()) ASSERTION( map ) failed:
Call Trace:
LustreError: dumping log to /tmp/lustre-log.1339790039.2316 |
| Comments |
| Comment by Andreas Dilger [ 15/Jun/12 ] |
|
This is code that was landed to master from orion in commit 4980567857699c7f902ebda336ea98fdc4b83100. It definitely shouldn't be possible for regular users to trigger an LASSERT(), or otherwise access invalid FIDs via .lustre, so the MDS needs to validate the FID is sane and allowed to be accessed before passing it down to the lower layers. |
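The validation described in this comment could be sketched roughly as follows. This is only an illustration, not the change that landed: `struct sketch_fid`, `sketch_fid_parse()`, `sketch_fid_check_user()` and the `SKETCH_SEQ_FIRST_USER` cutoff are hypothetical names and values chosen for the example. The idea is to reject malformed or reserved-range FIDs with EINVAL before they reach any lower-layer lookup that asserts.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a FID written as [seq:oid:ver]. */
struct sketch_fid {
	uint64_t seq;
	uint32_t oid;
	uint32_t ver;
};

/* Illustrative cutoff: sequences below this are treated as reserved.
 * The value is an assumption for this sketch, not quoted from the ticket. */
#define SKETCH_SEQ_FIRST_USER 0x200000400ULL

/* Parse "[0x1:0x0:0x0]" style text; returns 0 on success, -EINVAL otherwise. */
static int sketch_fid_parse(const char *s, struct sketch_fid *fid)
{
	int end = 0;

	if (sscanf(s, "[%" SCNx64 ":%" SCNx32 ":%" SCNx32 "]%n",
		   &fid->seq, &fid->oid, &fid->ver, &end) != 3)
		return -EINVAL;
	/* %n is stored only if the closing ']' matched; reject trailing junk. */
	if (end == 0 || s[end] != '\0')
		return -EINVAL;
	return 0;
}

/* Refuse FIDs a regular user should not reach, instead of asserting later. */
static int sketch_fid_check_user(const char *s)
{
	struct sketch_fid fid;
	int rc = sketch_fid_parse(s, &fid);

	if (rc != 0)
		return rc;
	if (fid.seq < SKETCH_SEQ_FIRST_USER || fid.ver != 0)
		return -EINVAL;
	return 0;
}
```

With a check of this shape, both the reserved FID from the description and a non-FID string such as PANTS would fail early with "Invalid argument", which matches the behaviour reported once the problem was fixed.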
| Comment by John Hammond [ 09/Jul/12 ] |
|
You can also get here using lfs fid2path, in fact by using the sample FID provided.
[root]# lfs fid2path /mnt/lustre PANTS
fid2path error: Invalid argument
LustreError: 2446:0:(osd_compat.c:381:osd_compat_objid_lookup()) ASSERTION( map ) failed:
Call Trace: |
| Comment by Richard Henwood (Inactive) [ 03/Oct/12 ] |
|
I couldn't reproduce this issue with a recent Master build (924).
# cat .lustre/fid/[0x1:0x0:0x0]
cat: .lustre/fid/[0x1:0x0:0x0]: Invalid argument |
| Comment by Richard Henwood (Inactive) [ 03/Oct/12 ] |
|
Seems like this issue has been fixed by |
| Comment by Andreas Dilger [ 03/Oct/12 ] |
|
Richard, it would probably be more appropriate to have closed this as "Fixed" rather than "Cannot Reproduce", since it was a real bug that was fixed with a code change. The "Cannot Reproduce" label indicates that a reported bug was not seen in later testing for whatever reason, and no change was made to the code to resolve the issue. In any case, I'm glad this problem is gone. |
| Comment by Richard Henwood (Inactive) [ 03/Oct/12 ] |
|
Correcting resolution message. This issue was a real problem that was /fixed/. |
| Comment by Richard Henwood (Inactive) [ 03/Oct/12 ] |
|
Double-checking whether this is actually fixed on Lustre 2.3 ... |
| Comment by Richard Henwood (Inactive) [ 03/Oct/12 ] |
|
Given Andreas's comment that this issue was introduced by an Orion commit, 2.3 should not be affected. I checked it anyway and couldn't reproduce it there. In any case, this issue is fixed. |
| Comment by Richard Henwood (Inactive) [ 05/Oct/12 ] |
|
This issue is present on Master.
lfs fid2path /mnt/lustre/ [0x1:0x2:0x0]
LustreError: 31926:0:(mdt_handler.c:2447:mdt_obj()) ASSERTION( lu_device_is_mdt(o->lo_dev) ) failed:
LustreError: 31926:0:(mdt_handler.c:2447:mdt_obj()) LBUG
Pid: 31926, comm: mdt00_000
Call Trace:
[<ffffffffa03d3905>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa03d3f17>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa0b851cf>] mdt_obj+0x5f/0x80 [mdt]
[<ffffffffa0b892e6>] mdt_object_find+0x66/0x170 [mdt]
[<ffffffffa0b8dcaa>] mdt_get_info+0x22a/0xa90 [mdt]
[<ffffffffa0b8943d>] ? mdt_unpack_req_pack_rep+0x4d/0x4c0 [mdt]
[<ffffffffa0b91322>] mdt_handle_common+0x932/0x1740 [mdt]
[<ffffffffa0b92205>] mdt_regular_handle+0x15/0x20 [mdt]
[<ffffffffa06ed7ac>] ptlrpc_server_handle_request+0x41c/0xe00 [ptlrpc]
[<ffffffffa03d465e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
[<ffffffffa06e4b87>] ? ptlrpc_wait_event+0xa7/0x2a0 [ptlrpc]
[<ffffffff810533f3>] ? __wake_up+0x53/0x70
[<ffffffffa06eed81>] ptlrpc_main+0xbf1/0x19e0 [ptlrpc]
[<ffffffffa06ee190>] ? ptlrpc_main+0x0/0x19e0 [ptlrpc]
[<ffffffff8100c14a>] child_rip+0xa/0x20
[<ffffffffa06ee190>] ? ptlrpc_main+0x0/0x19e0 [ptlrpc]
[<ffffffffa06ee190>] ? ptlrpc_main+0x0/0x19e0 [ptlrpc]
[<ffffffff8100c140>] ? child_rip+0x0/0x20
Kernel panic - not syncing: LBUG
Pid: 31926, comm: mdt00_000 Not tainted 2.6.32-279.5.1.el6_lustre.x86_64 #1
Call Trace:
[<ffffffff814fd58a>] ? panic+0xa0/0x168
[<ffffffffa03d3f6b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
[<ffffffffa0b851cf>] ? mdt_obj+0x5f/0x80 [mdt]
[<ffffffffa0b892e6>] ? mdt_object_find+0x66/0x170 [mdt]
[<ffffffffa0b8dcaa>] ? mdt_get_info+0x22a/0xa90 [mdt]
[<ffffffffa0b8943d>] ? mdt_unpack_req_pack_rep+0x4d/0x4c0 [mdt]
[<ffffffffa0b91322>] ? mdt_handle_common+0x932/0x1740 [mdt]
[<ffffffffa0b92205>] ? mdt_regular_handle+0x15/0x20 [mdt]
[<ffffffffa06ed7ac>] ? ptlrpc_server_handle_request+0x41c/0xe00 [ptlrpc]
[<ffffffffa03d465e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
[<ffffffffa06e4b87>] ? ptlrpc_wait_event+0xa7/0x2a0 [ptlrpc]
[<ffffffff810533f3>] ? __wake_up+0x53/0x70
[<ffffffffa06eed81>] ? ptlrpc_main+0xbf1/0x19e0 [ptlrpc]
[<ffffffffa06ee190>] ? ptlrpc_main+0x0/0x19e0 [ptlrpc]
[<ffffffff8100c14a>] ? child_rip+0xa/0x20
[<ffffffffa06ee190>] ? ptlrpc_main+0x0/0x19e0 [ptlrpc]
[<ffffffffa06ee190>] ? ptlrpc_main+0x0/0x19e0 [ptlrpc]
[<ffffffff8100c140>] ? child_rip+0x0/0x20 |
| Comment by Richard Henwood (Inactive) [ 06/Nov/12 ] |
|
A fix has landed in master: |