[LU-1086] several crash triggered in key_fini related to a list corruption Created: 09/Feb/12  Updated: 27/Feb/12  Resolved: 27/Feb/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Critical
Reporter: Alexandre Louvet Assignee: Zhenyu Xu
Resolution: Duplicate Votes: 0
Labels: None
Environment:

lustre 2.1


Issue Links:
Duplicate
duplicates LU-1013 recovery-mds lu_object.c:116:lu_objec... Resolved
Severity: 3
Rank (Obsolete): 6466

 Description   

During the past 3 days, we hit several crashes with the following backtraces on 2 different MDS nodes:

 
crash> bt
PID: 13838  TASK: ffff88107c3ea0c0  CPU: 0   COMMAND: "jbd2/dm-1-8"
 #0 [ffff880fd3a4b740] machine_kexec at ffffffff81027a4b
 #1 [ffff880fd3a4b7a0] crash_kexec at ffffffff810a2db2
 #2 [ffff880fd3a4b870] oops_end at ffffffff81481730
 #3 [ffff880fd3a4b8a0] no_context at ffffffff81031d1b
 #4 [ffff880fd3a4b8f0] __bad_area_nosemaphore at ffffffff81031fa5
 #5 [ffff880fd3a4b940] bad_area_nosemaphore at ffffffff81032073
 #6 [ffff880fd3a4b950] __do_page_fault at ffffffff810326fd
 #7 [ffff880fd3a4ba70] do_page_fault at ffffffff8148373e
 #8 [ffff880fd3a4baa0] page_fault at ffffffff81480ac5
    [exception RIP: kmem_cache_free+123]
    RIP: ffffffff81146c5b  RSP: ffff880fd3a4bb50  RFLAGS: 00010086
    RAX: ffffeae3808b7d30  RBX: ffff88085645f000  RCX: 0000000000000000
    RDX: ffffea0000000000  RSI: ffffc90027daa01c  RDI: ffffc90027daa01c
    RBP: ffff880fd3a4bbb0   R8: 0000000000000000   R9: 5a5a5a5a5a5a5a5a
    R10: 5a5a5a5a5a5a5a5a  R11: 5a5a5a5a5a5a5a5a  R12: 0000000000000286
    R13: ffffc90027daa01c  R14: ffff88185d934500  R15: ffff880802c85560
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #9 [ffff880fd3a4bbb8] cfs_mem_cache_free at ffffffffa054887e [libcfs]
#10 [ffff880fd3a4bbc8] osc_key_fini at ffffffffa0876e11 [osc]
#11 [ffff880fd3a4bc18] key_fini at ffffffffa0610b89 [obdclass]
#12 [ffff880fd3a4bc48] keys_fini at ffffffffa0610ccf [obdclass]
#13 [ffff880fd3a4bc98] lu_context_fini at ffffffffa0610ddd [obdclass]
#14 [ffff880fd3a4bcb8] osd_trans_commit_cb at ffffffffa0aab6c2 [osd_ldiskfs]
#15 [ffff880fd3a4bd18] jbd2_journal_commit_transaction at ffffffffa00693a3 [jbd2]
#16 [ffff880fd3a4be68] kjournald2 at ffffffffa006ec28 [jbd2]
#17 [ffff880fd3a4bee8] kthread at ffffffff81079f36
#18 [ffff880fd3a4bf48] kernel_thread at ffffffff810041aa

or

crash> bt
PID: 18628  TASK: ffff88085b3a1180  CPU: 0   COMMAND: "jbd2/dm-19-8"
 #0 [ffff8807fd4e7740] machine_kexec at ffffffff81027a2b
 #1 [ffff8807fd4e77a0] crash_kexec at ffffffff810a3a52
 #2 [ffff8807fd4e7870] oops_end at ffffffff8147f680
 #3 [ffff8807fd4e78a0] no_context at ffffffff81031ddb
 #4 [ffff8807fd4e78f0] __bad_area_nosemaphore at ffffffff81032065
 #5 [ffff8807fd4e7940] bad_area_nosemaphore at ffffffff81032133
 #6 [ffff8807fd4e7950] __do_page_fault at ffffffff810327bd
 #7 [ffff8807fd4e7a70] do_page_fault at ffffffff8148168e
 #8 [ffff8807fd4e7aa0] page_fault at ffffffff8147ea15
    [exception RIP: kmem_cache_free+123]
    RIP: ffffffff811465eb  RSP: ffff8807fd4e7b50  RFLAGS: 00010086
    RAX: ffffeae380a76130  RBX: ffff88073ce4c000  RCX: 0000000000000000
    RDX: ffffea0000000000  RSI: ffffc9002fd2a01c  RDI: ffffc9002fd2a01c
    RBP: ffff8807fd4e7bb0   R8: 0000000000000000   R9: 5a5a5a5a5a5a5a5a
    R10: 5a5a5a5a5a5a5a5a  R11: 5a5a5a5a5a5a5a5a  R12: 0000000000000286
    R13: ffffc9002fd2a01c  R14: ffff882059a850c0  R15: ffff8817d6207c90
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #9 [ffff8807fd4e7bb8] cfs_mem_cache_free at ffffffffa04e087e [libcfs]
#10 [ffff8807fd4e7bc8] lov_key_fini at ffffffffa090f811 [lov]
#11 [ffff8807fd4e7c18] key_fini at ffffffffa05a7a39 [obdclass]
#12 [ffff8807fd4e7c48] keys_fini at ffffffffa05a7b7f [obdclass]
#13 [ffff8807fd4e7c98] lu_context_fini at ffffffffa05a7c8d [obdclass]
#14 [ffff8807fd4e7cb8] osd_trans_commit_cb at ffffffffa0a406c2 [osd_ldiskfs]
#15 [ffff8807fd4e7d18] jbd2_journal_commit_transaction at ffffffffa005927b [jbd2]
#16 [ffff8807fd4e7e68] kjournald2 at ffffffffa005eb48 [jbd2]
#17 [ffff8807fd4e7ee8] kthread at ffffffff8107ad36
#18 [ffff8807fd4e7f48] kernel_thread at ffffffff810041aa

In the second case, there are many __list_add corruption warnings in the dmesg log.

The first one (as far as I can see in the dmesg log buffer):

------------[ cut here ]------------
WARNING: at lib/list_debug.c:30 __list_add+0x8f/0xa0() (Tainted: G        W  ---------------- T)
Hardware name: bullx super-node
list_add corruption. prev->next should be next (ffffc9003197e01c), but was ffff880583a54ab8. (prev=ffff880583a54ab8).
Modules linked in: iptable_filter ip_tables cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fid(U) fld(U) lov(U) lquota(U) osc(U) fsfilt_ldiskfs(U) exportfs mgc(U) ldiskfs(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ipmi_devintf ipmi_si ipmi_msghandler nfs lockd fscache(T) nfs_acl auth_rpcgss sunrpc acpi_cpufreq freq_table rdma_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ib_cm(U) ib_sa(U) ipv6 ib_uverbs(U) ib_umad(U) mlx4_ib(U) mlx4_core(U) ib_mthca(U) ib_mad(U) ib_core(U) dm_round_robin dm_multipath usbhid hid ghes i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ehci_hcd uhci_hcd ioatdma lpfc scsi_transport_fc scsi_tgt hed sg igb dca ext4 jbd2 sd_mod crc_t10dif ahci megaraid_sas dm_mod [last unloaded: microcode]
Pid: 18758, comm: mdt_63 Tainted: G        W  ---------------- T 2.6.32-131.12.1.bl6.Bull.26.x86_64 #1
Call Trace:
 [<ffffffff810540b7>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff810541a6>] ? warn_slowpath_fmt+0x46/0x50
 [<ffffffff81267d3f>] ? __list_add+0x8f/0xa0
 [<ffffffffa05a9111>] ? lu_object_put+0x161/0x1f0 [obdclass]
 [<ffffffffa09e5c08>] ? mdt_getattr_name_lock+0xf08/0x1a40 [mdt]
 [<ffffffffa06c75bb>] ? __req_capsule_get+0x14b/0x6b0 [ptlrpc]
 [<ffffffffa069bb54>] ? lustre_msg_get_flags+0x34/0xa0 [ptlrpc]
 [<ffffffffa09e6cfa>] ? mdt_intent_getattr+0x32a/0x500 [mdt]
 [<ffffffffa09e01e7>] ? mdt_unpack_req_pack_rep+0x297/0x5d0 [mdt]
 [<ffffffffa04ef625>] ? cfs_hash_bd_lookup_intent+0xe5/0x130 [libcfs]
 [<ffffffffa069cf50>] ? lustre_swab_ldlm_intent+0x0/0x20 [ptlrpc]
 [<ffffffffa09e4790>] ? mdt_intent_policy+0x3c0/0x6b0 [mdt]
 [<ffffffff81042890>] ? fair_enqueue_task_fair+0x190/0x350
 [<ffffffffa0587521>] ? class_handle_hash+0xa1/0x280 [obdclass]
 [<ffffffffa0654afa>] ? ldlm_lock_enqueue+0x2da/0xa50 [ptlrpc]
 [<ffffffffa0673305>] ? ldlm_export_lock_get+0x15/0x20 [ptlrpc]
 [<ffffffffa04ee692>] ? cfs_hash_bd_add_locked+0x62/0x90 [libcfs]
 [<ffffffffa067b227>] ? ldlm_handle_enqueue0+0x447/0x1090 [ptlrpc]
 [<ffffffffa09dffa1>] ? mdt_unpack_req_pack_rep+0x51/0x5d0 [mdt]
 [<ffffffffa09e430a>] ? mdt_enqueue+0x4a/0x110 [mdt]
 [<ffffffffa09e0df5>] ? mdt_handle_common+0x8d5/0x1810 [mdt]
 [<ffffffffa06992d4>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
 [<ffffffffa09e1e05>] ? mdt_regular_handle+0x15/0x20 [mdt]
 [<ffffffffa06aa019>] ? ptlrpc_main+0xc79/0x19d0 [ptlrpc]
 [<ffffffff810017bc>] ? __switch_to+0x1ac/0x320
 [<ffffffffa06a93a0>] ? ptlrpc_main+0x0/0x19d0 [ptlrpc]
 [<ffffffff810041aa>] ? child_rip+0xa/0x20
 [<ffffffffa06a93a0>] ? ptlrpc_main+0x0/0x19d0 [ptlrpc]
 [<ffffffffa06a93a0>] ? ptlrpc_main+0x0/0x19d0 [ptlrpc]
 [<ffffffff810041a0>] ? child_rip+0x0/0x20
---[ end trace b8f1465c05250f4c ]---

The last one, just before the oops:

------------[ cut here ]------------
WARNING: at lib/list_debug.c:30 __list_add+0x8f/0xa0() (Tainted: G        W  ---------------- T)
Hardware name: bullx super-node
list_add corruption. prev->next should be next (ffffc9002fd2a01c), but was (null). (prev=ffff88179c9cd1b8).
Modules linked in: iptable_filter ip_tables cmm(U) osd_ldiskfs(U) mdt(U) mdd(U) mds(U) fid(U) fld(U) lov(U) lquota(U) osc(U) fsfilt_ldiskfs(U) exportfs mgc(U) ldiskfs(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ipmi_devintf ipmi_si ipmi_msghandler nfs lockd fscache(T) nfs_acl auth_rpcgss sunrpc acpi_cpufreq freq_table rdma_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ib_cm(U) ib_sa(U) ipv6 ib_uverbs(U) ib_umad(U) mlx4_ib(U) mlx4_core(U) ib_mthca(U) ib_mad(U) ib_core(U) dm_round_robin dm_multipath usbhid hid ghes i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ehci_hcd uhci_hcd ioatdma lpfc scsi_transport_fc scsi_tgt hed sg igb dca ext4 jbd2 sd_mod crc_t10dif ahci megaraid_sas dm_mod [last unloaded: microcode]
Pid: 18750, comm: mdt_55 Tainted: G        W  ---------------- T 2.6.32-131.12.1.bl6.Bull.26.x86_64 #1
Call Trace:
 [<ffffffff810540b7>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff810541a6>] ? warn_slowpath_fmt+0x46/0x50
 [<ffffffff81267d3f>] ? __list_add+0x8f/0xa0
 [<ffffffffa05a9111>] ? lu_object_put+0x161/0x1f0 [obdclass]
 [<ffffffffa09e5c08>] ? mdt_getattr_name_lock+0xf08/0x1a40 [mdt]
 [<ffffffffa06c75bb>] ? __req_capsule_get+0x14b/0x6b0 [ptlrpc]
 [<ffffffffa069bb54>] ? lustre_msg_get_flags+0x34/0xa0 [ptlrpc]
 [<ffffffffa09e6cfa>] ? mdt_intent_getattr+0x32a/0x500 [mdt]
 [<ffffffffa09e01e7>] ? mdt_unpack_req_pack_rep+0x297/0x5d0 [mdt]
 [<ffffffffa04ef5ab>] ? cfs_hash_bd_lookup_intent+0x6b/0x130 [libcfs]
 [<ffffffffa069cf50>] ? lustre_swab_ldlm_intent+0x0/0x20 [ptlrpc]
 [<ffffffffa09e4790>] ? mdt_intent_policy+0x3c0/0x6b0 [mdt]
 [<ffffffff81042890>] ? fair_enqueue_task_fair+0x190/0x350
 [<ffffffffa0587521>] ? class_handle_hash+0xa1/0x280 [obdclass]
 [<ffffffffa0654afa>] ? ldlm_lock_enqueue+0x2da/0xa50 [ptlrpc]
 [<ffffffffa0673305>] ? ldlm_export_lock_get+0x15/0x20 [ptlrpc]
 [<ffffffffa04ee692>] ? cfs_hash_bd_add_locked+0x62/0x90 [libcfs]
 [<ffffffffa067b227>] ? ldlm_handle_enqueue0+0x447/0x1090 [ptlrpc]
 [<ffffffffa09dffa1>] ? mdt_unpack_req_pack_rep+0x51/0x5d0 [mdt]
 [<ffffffffa09e430a>] ? mdt_enqueue+0x4a/0x110 [mdt]
 [<ffffffffa09e0df5>] ? mdt_handle_common+0x8d5/0x1810 [mdt]
 [<ffffffffa06992d4>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
 [<ffffffffa09e1e05>] ? mdt_regular_handle+0x15/0x20 [mdt]
 [<ffffffffa06aa019>] ? ptlrpc_main+0xc79/0x19d0 [ptlrpc]
 [<ffffffff810017bc>] ? __switch_to+0x1ac/0x320
 [<ffffffffa06a93a0>] ? ptlrpc_main+0x0/0x19d0 [ptlrpc]
 [<ffffffff810041aa>] ? child_rip+0xa/0x20
 [<ffffffffa06a93a0>] ? ptlrpc_main+0x0/0x19d0 [ptlrpc]
 [<ffffffffa06a93a0>] ? ptlrpc_main+0x0/0x19d0 [ptlrpc]
 [<ffffffff810041a0>] ? child_rip+0x0/0x20
---[ end trace b8f1465c05250f81 ]---

Alex.



 Comments   
Comment by Peter Jones [ 09/Feb/12 ]

Bobi

Could you please have a look into this one?

Thanks

Peter

Comment by Gregoire Pichon [ 10/Feb/12 ]

After reading the portions of code involved in the dmesg messages, I think this could be an occurrence of LU-1013.

When Bull integrated the patch for LU-685, the patch set did not have the LASSERT(cfs_list_empty(&top->loh_lru)); line in the lu_object_put() routine. So the Lustre code running at this customer site does not have this LASSERT.

This could explain why the bug appears as warnings from the __list_add() routine rather than an LBUG.
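
To illustrate the reasoning above, here is a minimal userspace sketch of one way such a warning can arise. It is not the Lustre or kernel code: struct list_node, list_add_tail_checked() and lru_add_guarded() are hypothetical stand-ins, and the assumption being modelled is that an entry is added to an LRU list a second time while it is still on that list. The guarded variant plays the role of the missing LASSERT by checking that the entry is unlinked first.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for a kernel-style doubly linked list node. */
struct list_node {
    struct list_node *next, *prev;
};

static void list_init(struct list_node *n) { n->next = n; n->prev = n; }

static int list_is_empty(const struct list_node *n) { return n->next == n; }

/*
 * Tail insert with a sanity check modelled on the "prev->next should be
 * next" warning. Like the warning in the log, it only reports the problem
 * and then links the entry anyway. Returns 1 if the check fired.
 */
static int list_add_tail_checked(struct list_node *entry, struct list_node *head)
{
    struct list_node *prev = head->prev;
    struct list_node *next = head;
    int corrupt = 0;

    if (prev->next != next) {
        printf("list_add corruption. prev->next should be next (%p), "
               "but was %p. (prev=%p).\n",
               (void *)next, (void *)prev->next, (void *)prev);
        corrupt = 1;
    }
    next->prev = entry;
    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
    return corrupt;
}

/* Hypothetical guard: refuse to add an entry that is still linked. */
static int lru_add_guarded(struct list_node *entry, struct list_node *head)
{
    if (!list_is_empty(entry))
        return -1;
    list_add_tail_checked(entry, head);
    return 0;
}

/* Add 'a' twice without the guard, then add 'b'. Returns 1 if the
 * check fired on the add of 'b'. */
static int demo_double_add(void)
{
    struct list_node lru, a, b;

    list_init(&lru);
    list_init(&a);
    list_init(&b);
    assert(list_is_empty(&lru));

    list_add_tail_checked(&a, &lru);  /* normal add */
    list_add_tail_checked(&a, &lru);  /* silent: leaves a.next == &a */
    return list_add_tail_checked(&b, &lru);
}

/* Same sequence with the guard. Returns the result of the second add. */
static int demo_guarded(void)
{
    struct list_node lru, a;

    list_init(&lru);
    list_init(&a);
    lru_add_guarded(&a, &lru);
    return lru_add_guarded(&a, &lru);
}
```

In this sketch the second add of the same entry passes the check but leaves the entry pointing at itself, so the warning only fires on the following add, and the reported prev->next value equals prev, the same pattern as in the first warning in the description. With the guard in place the second add is rejected before any pointer is modified.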

Comment by Peter Jones [ 10/Feb/12 ]

FanYong

Will this issue be addressed by your proposed fix for LU-1013? The crash seems to be in the same function...

Peter

Comment by nasf (Inactive) [ 11/Feb/12 ]

I do not yet know the root cause of LU-1086. However, the patch for LU-1013 not only resolves the LASSERT() failure reported in LU-1013, it also avoids a possible memory corruption: freeing an object that is still in use. Without the LU-1013 patch, an in-use object may remain on the LRU list and may then be freed by the purge() routine, which can cause many hard-to-diagnose memory corruption crashes.
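
The hazard described above can be sketched with a deliberately simplified model. Everything here is hypothetical (struct obj, obj_get(), purge() and the on_lru and freed flags are illustrative names, not Lustre structures), and "freeing" is only recorded in a flag so the example stays well defined.

```c
#include <stddef.h>

/* Hypothetical cached object: a reference count plus LRU membership. */
struct obj {
    int refcount;  /* number of current users */
    int on_lru;    /* 1 while the object sits on the LRU list */
    int freed;     /* set by purge() in place of a real free */
};

/*
 * Take a reference. With remove_from_lru == 0 the object is left on the
 * LRU even though it is now in use, which is the situation the comment
 * above warns about.
 */
static void obj_get(struct obj *o, int remove_from_lru)
{
    o->refcount++;
    if (remove_from_lru)
        o->on_lru = 0;
}

/*
 * Purge assumes that anything on the LRU is unused and frees it.
 * Returns how many objects were freed while still referenced.
 */
static int purge(struct obj *objs, size_t n)
{
    int in_use_freed = 0;

    for (size_t i = 0; i < n; i++) {
        if (!objs[i].on_lru)
            continue;
        if (objs[i].refcount > 0)
            in_use_freed++;
        objs[i].freed = 1;
        objs[i].on_lru = 0;
    }
    return in_use_freed;
}

/* Two idle objects on the LRU; one is then taken into use before purge. */
static int demo_purge(int remove_from_lru)
{
    struct obj objs[2] = { {0, 1, 0}, {0, 1, 0} };

    obj_get(&objs[0], remove_from_lru);
    return purge(objs, 2);
}
```

In this model, demo_purge(0) frees one object that still has a user, while demo_purge(1) frees none, because taking the reference also takes the object off the LRU.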

Comment by Oleg Drokin [ 16/Feb/12 ]

In fact LU-1017 is believed to be the proper fix for LU-1013 and some others.

Comment by Diego Moreno (Inactive) [ 17/Feb/12 ]

Thanks Oleg.

Since we started running with the patch from LU-1013, we have not hit this issue again. In any case, we'll integrate LU-1017 instead of LU-1013.

Regards,

Comment by Peter Jones [ 27/Feb/12 ]

Duplicate of LU-1017
