[LU-4719] mdt_dump_lmm crashes for directories created with a large stripe count. Created: 06/Mar/14  Updated: 29/Apr/14  Resolved: 20/Mar/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.5.0, Lustre 2.6.0, Lustre 2.4.2
Fix Version/s: Lustre 2.6.0, Lustre 2.5.2

Type: Bug Priority: Major
Reporter: James A Simmons Assignee: Oleg Drokin
Resolution: Fixed Votes: 0
Labels: None
Environment:

RHEL6.5 server and clients.


Severity: 3
Rank (Obsolete): 12973

 Description   

When mdt_dump_lmm is called for a directory created with a large stripe count, the kernel panics.

<4>[11496.962905] Pid: 48644, comm: mdt01_248 Not tainted 2.6.32-358.23.2.el6.atlas.x86_64 #1 Dell Inc. PowerEdge C6220 II/09N44V
<4>[11496.994466] RIP: 0010:[<ffffffffa0c998fa>] [<ffffffffa0c998fa>] mdt_dump_lmm+0x1aa/0x410 [mdt]
<4>[11497.023662] RSP: 0018:ffff88086d9fdc30 EFLAGS: 00010282
<4>[11497.043458] RAX: 0000000000000000 RBX: ffff8807baf1bff8 RCX: 5a5a5a595a5a5a5a
<4>[11497.064035] RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000040
<4>[11497.092925] RBP: ffff88086d9fdc80 R08: 0000000000004001 R09: 0000000000f40403
<4>[11497.113743] R10: ffff8807baf1b1f8 R11: ffffffffa0bb56d0 R12: 000000000000008b
<4>[11497.134012] R13: 0000000000000040 R14: 0000000000000000 R15: ffffffff00000000
<4>[11497.163174] FS: 00007f6f7f130700(0000) GS:ffff880044680000(0000) knlGS:0000000000000000
<4>[11497.192763] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>[11497.204072] CR2: ffff8807baf1c000 CR3: 0000000c4ad87000 CR4: 00000000001407e0
<4>[11497.232997] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[11497.253540] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>[11497.274148] Process mdt01_248 (pid: 48644, threadinfo ffff88086d9fc000, task ffff88086d9b6ae0)
<4>[11497.304009] Stack:
<4>[11497.313221] ffff88086d9e0000 00000096e6ec0000 0000000000000000 0000000000000000
<4>[11497.334450] <d> ffffc900500c5e28 ffff88086d9e0000 ffff88086c8f8548 ffff8806447f73a0
<4>[11497.363226] <d> ffff880be6ec0000 ffff8806447f7000 ffff88086d9fdd10 ffffffffa0c921c9
<4>[11497.393213] Call Trace:
<4>[11497.394912] [<ffffffffa0c921c9>] mdt_getattr_internal+0x669/0x13e0 [mdt]
<4>[11497.422925] [<ffffffffa0c9aa7c>] ? mdt_check_ucred+0x4c/0xa10 [mdt]
<4>[11497.443315] [<ffffffffa0c98d30>] mdt_getattr+0x160/0x900 [mdt]
<4>[11497.463294] [<ffffffffa0c86b27>] mdt_handle_common+0x647/0x16d0 [mdt]
<4>[11497.483521] [<ffffffffa06c9d3c>] ? lustre_msg_get_transno+0x8c/0x100 [ptlrpc]
<4>[11497.504148] [<ffffffffa0cc2a85>] mds_regular_handle+0x15/0x20 [mdt]
<4>[11497.532941] [<ffffffffa06d9558>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
<4>[11497.553835] [<ffffffffa03e25de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
<4>[11497.573882] [<ffffffffa03f3d9f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
<4>[11497.594049] [<ffffffffa06d08b9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
<4>[11497.622982] [<ffffffff81055c93>] ? __wake_up+0x53/0x70
<4>[11497.634054] [<ffffffffa06da8ee>] ptlrpc_main+0xace/0x1700 [ptlrpc]
<4>[11497.654205] [<ffffffffa06d9e20>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
<4>[11497.682781] [<ffffffff8100c0ca>] child_rip+0xa/0x20
<4>[11497.693726] [<ffffffffa06d9e20>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
<4>[11497.713792] [<ffffffffa06d9e20>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
<4>[11497.733980] [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
<4>[11497.753572] Code: 31 c0 8b 4b 14 44 89 e2 48 c7 c6 e8 ec cc a0 48 c7 c7 e0 30 cf a0 31 c0 e8 a4 89 75 ff 41 83 c4 01 44 3b 65 bc 7d 2a 48 83 c3 18 <48> 8b 43 08 48 85 c0 0f 85 d1 fe ff ff 48 8b 13 48 89 45 c8 48
<1>[11497.814662] RIP [<ffffffffa0c998fa>] mdt_dump_lmm+0x1aa/0x410 [mdt]



 Comments   
Comment by Peter Jones [ 06/Mar/14 ]

Oleg is working on the fix for this

Comment by Oleg Drokin [ 06/Mar/14 ]

The issue at hand is that mdt_dump_lmm assumes the striping passed in is the real one.
Yet when it's a directory, while the stripe count could be large, there are no real objects. This was OK when we allocated the biggest possible EA-sized buffer,
but now that LU-4008 shrinks that buffer, traversing this list of objects will walk out of the buffer, and if the next page is not mapped, we'll have this crash.

We just need to make sure the EA is not for a directory before printing objects.

I see lov_dump_lmm has somewhat similar code with no checks, so it needs to be verified separately to make sure it does not do the same thing.
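
A minimal sketch of the check described in this comment, for illustration only. The structures and the function `sketch_dumpable_objects` are hypothetical, simplified stand-ins, not the real Lustre definitions or the landed patch. The sketch returns zero object entries for a directory EA; as an extra defensive assumption that goes beyond what the comment asks for, it also clamps the count to what fits in the buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-object entry; NOT the real Lustre on-wire layout. */
struct sketch_obj {
	uint64_t id;
	uint64_t seq;
	uint32_t gen;
	uint32_t ost_idx;
};

/* Hypothetical striping EA header followed by the object entries. */
struct sketch_lmm {
	uint32_t magic;
	uint32_t pattern;
	uint16_t stripe_count;
	uint16_t layout_gen;
	struct sketch_obj objects[];
};

/*
 * Return how many object entries a dump routine may safely traverse.
 * A directory carries only a default layout: stripe_count may be large,
 * but no object entries follow the header, so the answer is 0.
 */
static size_t sketch_dumpable_objects(const struct sketch_lmm *lmm,
				      size_t buf_len, bool is_dir)
{
	size_t hdr = offsetof(struct sketch_lmm, objects);
	size_t fit;

	if (lmm == NULL || buf_len < hdr)
		return 0;

	/* The check proposed above: never walk objects for a directory. */
	if (is_dir)
		return 0;

	/* Assumed extra safeguard: never walk past the end of the buffer. */
	fit = (buf_len - hdr) / sizeof(struct sketch_obj);
	return lmm->stripe_count < fit ? lmm->stripe_count : fit;
}
```

A caller would loop from 0 up to the returned count instead of trusting `stripe_count` directly.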

Comment by Oleg Drokin [ 06/Mar/14 ]

Patch is in http://review.whamcloud.com/9520

Comment by Oleg Drokin [ 06/Mar/14 ]

Another observation: mdt_dump_lmm traverses the striping info even if it's not printed, so it's a pure waste of CPU time.

An extra patch is needed to compare the currently visible debug mask to the level passed into that function and return early if there's no overlap.
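
A minimal sketch of that early return, assuming a simple bitmask for the enabled debug levels. The names `sketch_debug_mask`, `SKETCH_D_INFO`, `SKETCH_D_ERROR` and `sketch_dump_stripes` are hypothetical and are not the Lustre debug API.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical debug level bits. */
#define SKETCH_D_INFO  0x1u
#define SKETCH_D_ERROR 0x2u

/* Hypothetical mask of levels that are currently visible. */
static unsigned int sketch_debug_mask = SKETCH_D_ERROR;

/*
 * Print one line per stripe and return the number of lines printed.
 * If the requested level does not overlap the visible mask, nothing
 * would be printed, so return before touching the stripe array.
 */
static int sketch_dump_stripes(unsigned int level, const uint32_t *ost_idx,
			       int count)
{
	int printed = 0;

	if ((sketch_debug_mask & level) == 0)
		return 0;

	for (int i = 0; i < count; i++) {
		printf("stripe %d on ost %u\n", i, (unsigned int)ost_idx[i]);
		printed++;
	}
	return printed;
}
```

With the mask set to `SKETCH_D_ERROR` only, a call at `SKETCH_D_INFO` returns immediately without iterating.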

Comment by Andreas Dilger [ 06/Mar/14 ]

I was just going to write the same thing - this function is mostly just overhead.

I pushed an updated version of the patch to skip iteration if nothing will be printed.

Comment by Peter Jones [ 20/Mar/14 ]

Landed for 2.6

Comment by James Nunez (Inactive) [ 18/Apr/14 ]

Patch for b2_5 at http://review.whamcloud.com/10021
