Details
-
Bug
-
Resolution: Not a Bug
-
Major
-
None
-
Lustre 2.4.1, Lustre 2.5.0
-
None
-
SLES11 SP1 and SLES11 SP3
-
3
-
12639
Description
When statahead is disabled, the client crashes with a kernel NULL pointer dereference (fault address 0xc) in ll_lookup_it. Console log from the affected node:
c0-0c0s3n1 Lustre: Lustre: Build Version: 2.4.1-trunk-1.0501.14514.14.1-abuild-RB-5.1UP01_2.4.1-2013-12-06-21:46
c0-0c0s3n1 BUG: unable to handle kernel NULL pointer dereference at 000000000000000c
c0-0c0s3n1 IP: [<ffffffffa0994ef5>] ll_lookup_it+0x605/0xb00 [lustre]
c0-0c0s3n1 PGD 838401067 PUD 827df6067 PMD 0
c0-0c0s3n1 Oops: 0000 [#1] SMP
c0-0c0s3n1 CPU 0
c0-0c0s3n1 Modules linked in: lmv mgc lustre lov osc mdc fid fld ptlrpc obdclass lvfs ipt_MASQUERADE ipt_LOG xt_state iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables x_tables binfmt_misc dvspn(P) dvsof(P) kgnilnd dvsutil(P) dvsipc(P) dvsipc_lnet(P) lnet libcfs dvsproc(P) bpmcdmod nic_compat dm_mod ahci libahci libata ehci_hcd usbcore scsi_mod usb_common igb kdreg gpcd_ari ipogif_ari kgni_ari hwerr(P) rca hss_os(P) heartbeat simplex(P) ghal_ari craytrace
c0-0c0s3n1 Pid: 7762, comm: rm Tainted: P 3.0.80-0.5.1_1.0501.7664-cray_ari_s #1 Cray Inc. Cascade/Cascade
c0-0c0s3n1 RIP: 0010:[<ffffffffa0994ef5>] [<ffffffffa0994ef5>] ll_lookup_it+0x605/0xb00 [lustre]
c0-0c0s3n1 RSP: 0018:ffff88082b68dd08 EFLAGS: 00010282
c0-0c0s3n1 RAX: 0000000000001e52 RBX: 0000000000001e52 RCX: ffff88082bcd1800
c0-0c0s3n1 RDX: ffff88082a516500 RSI: ffff880825601140 RDI: ffff88082bcd1800
c0-0c0s3n1 RBP: ffff88082b68dde8 R08: 0000000000000002 R09: 0000000000000000
c0-0c0s3n1 R10: 5a5a5a5a5a5a5a5a R11: 5a5a5a5a5a5a5a5a R12: 0000000000000000
c0-0c0s3n1 R13: ffff88082509b640 R14: 0000000000000000 R15: ffff880827e376c0
c0-0c0s3n1 FS: 00007f259e978700(0000) GS:ffff88087fa00000(0000) knlGS:0000000000000000
c0-0c0s3n1 FS: 00007f259e978700(0000) GS:ffff88087fa00000(0000) knlGS:0000000000000000
c0-0c0s3n1 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
c0-0c0s3n1 CR2: 000000000000000c CR3: 000000082b58d000 CR4: 00000000000406f0
c0-0c0s3n1 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
c0-0c0s3n1 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
c0-0c0s3n1 Process rm (pid: 7762, threadinfo ffff88082b68c000, task ffff8808257dc7a0)
c0-0c0s3n1 Stack:
c0-0c0s3n1 ffff88082b68ddb0 ffffffffa0990090 0000000000000000 0000000000000048
c0-0c0s3n1 0000000020000280 ffff88080064b8c0 ffff880827e376c0 ffff88080064b8c0
c0-0c0s3n1 0000000000000010 0000000000000000 0000000000000000 0000000000000000
c0-0c0s3n1 Call Trace:
c0-0c0s3n1 [<ffffffffa099547c>] ll_lookup_nd+0x8c/0x3e0 [lustre]
c0-0c0s3n1 [<ffffffff8114d83c>] d_alloc_and_lookup+0x4c/0x80
c0-0c0s3n1 [<ffffffff8114d96e>] __lookup_hash+0xfe/0x180
c0-0c0s3n1 [<ffffffff81152296>] do_unlinkat+0xa6/0x1c0
c0-0c0s3n1 [<ffffffff81152522>] sys_unlinkat+0x22/0x40
c0-0c0s3n1 [<ffffffff814d9aab>] system_call_fastpath+0x16/0x1b
c0-0c0s3n1 [<00007f259e4fad28>] 0x7f259e4fad27
c0-0c0s3n1 Code: 58 ff ff ff 8b 9a bc 03 00 00 4c 8b b2 a8 03 00 00 4c 8b 68 78 e8 7c e2 94 ff 39 c3 0f 85 a8 fa ff ff 4d 85 ed 0f 84 9f fa ff ff
c0-0c0s3n1 8b 46 0c 41 89 45 60 0f 1f 00 e9 8f fa ff ff 0f 1f 00 49 8b
c0-0c0s3n1 RIP [<ffffffffa0994ef5>] ll_lookup_it+0x605/0xb00 [lustre]
c0-0c0s3n1 RSP <ffff88082b68dd08>
c0-0c0s3n1 CR2: 000000000000000c
c0-0c0s3n1 ---[ end trace 4e73cb9a15a458fd ]---
c0-0c0s3n1 Kernel panic - not syncing: Fatal exception
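For triage, it can help to pull the faulting symbol and offset out of the oops text before looking it up in a debug build of the module. The following is a minimal sketch, not part of the original report: the helper names `parse_symbol_refs` and `faulting_location` are hypothetical, and the regular expressions assume the `symbol+0xOFFSET/0xSIZE [module]` layout seen in the log above.

```python
import re

# Matches kernel symbol references such as "ll_lookup_it+0x605/0xb00 [lustre]".
# The trailing "[module]" part is optional (core-kernel frames do not have it).
SYM_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[\w-]+)\])?"
)

# Matches the "IP: [<address>] symbol+0x.../0x..." line of an x86_64 oops.
# The word boundary keeps this from matching the later "RIP: 0010:[<...>]" line.
IP_RE = re.compile(r"\bIP: \[<(?P<address>[0-9a-f]+)>\]\s+" + SYM_RE.pattern)


def _to_ref(match):
    """Convert a regex match into a plain dict with integer offset and size."""
    return {
        "symbol": match.group("symbol"),
        "offset": int(match.group("offset"), 16),
        "size": int(match.group("size"), 16),
        "module": match.group("module"),
    }


def parse_symbol_refs(text):
    """Return every symbol+offset/size reference found in the oops text."""
    return [_to_ref(m) for m in SYM_RE.finditer(text)]


def faulting_location(text):
    """Return the symbol reference from the 'IP:' line, or None if absent."""
    m = IP_RE.search(text)
    if m is None:
        return None
    ref = _to_ref(m)
    ref["address"] = int(m.group("address"), 16)
    return ref


if __name__ == "__main__":
    sample = "c0-0c0s3n1 IP: [<ffffffffa0994ef5>] ll_lookup_it+0x605/0xb00 [lustre]"
    loc = faulting_location(sample)
    print("%s+0x%x in module %s" % (loc["symbol"], loc["offset"], loc["module"]))
```

With a `lustre.ko` that carries debug information, the extracted symbol and offset can then be given to gdb, for example `list *(ll_lookup_it+0x605)`, to find the corresponding source line. Whether that resolves correctly depends on the debug build matching the crashed client exactly.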
Reproducer:
> lctl set_param llite.*.statahead_max=0
> ls -l <lustre_dir>
=====> crash
The bug has been seen on both 2.4 and 2.5 clients, but goes away with the patch from LU-3270 (http://review.whamcloud.com/#change,6392).