Details
Type: Bug
Resolution: Cannot Reproduce
Priority: Critical
Fix Version/s: None
Affects Version/s: Lustre 2.1.5
Labels: None
Environment: Linux 2.6.32-279.19.1.el6_lustre.x86_64 #1 SMP
Severity: 3
Rank (Obsolete): 9290
Description
Hello,
Our OST servers periodically crash with the following kernel panic:
general protection fault: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
CPU 9
Modules linked in: lmv(U) obdfilter(U) fsfilt_ldiskfs(U) exportfs ost(U) mgc(U) ldiskfs(U) jbd2 lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ksocklnd(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) autofs4 cpufreq_ondemand acpi_cpufreq freq_table mperf bonding 8021q garp stp llc ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx ses enclosure sg microcode serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core ib_mthca ib_mad ib_core igb dca ext3 jbd mbcache sd_mod crc_t10dif mpt2sas scsi_transport_sas raid_class ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 26235, comm: ldlm_bl_04 Not tainted 2.6.32-279.19.1.el6_lustre.x86_64 #1 Supermicro X8DTH-i/6/iF/6F/X8DTH
RIP: 0010:[<ffffffffa0976751>] [<ffffffffa0976751>] osc_lock_detach+0x51/0x1b0 [osc]
RSP: 0018:ffff88016223bd40 EFLAGS: 00010206
RAX: ffffffffa0993100 RBX: 5a5a5a5a5a5a5a5a RCX: 0000000000000000
RDX: 000000000000e4f9 RSI: ffff8801f08ead78 RDI: ffffffffa0993100
RBP: ffff88016223bd70 R08: 0000000000000000 R09: ffff88014f9a2400
R10: 5a5a5a5a5a5a5a5a R11: 5a5a5a5a5a5a5a5a R12: ffff8801f08ead78
R13: ffff8801f41a9b58 R14: ffff8801f41a9b58 R15: ffff88020cf45240
FS: 00007f01e69b9700(0000) GS:ffff8800282a0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000042b650 CR3: 0000000001a85000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ldlm_bl_04 (pid: 26235, threadinfo ffff88016223a000, task ffff8801ad57e080)
Stack:
ffff88016223bd40 ffff8801f08ead78 0000000000000000 ffff88022ce47c18
<d> ffff8801f41a9b58 ffff88020cf45240 ffff88016223bdc0 ffffffffa0976ab4
<d> ffffffffa04f944d 0000000000000000 ffff88016223bda0 ffff8801f41a9b58
Call Trace:
[<ffffffffa0976ab4>] osc_lock_cancel+0xa4/0x1b0 [osc]
[<ffffffffa04f944d>] ? cl_env_nested_get+0x5d/0xc0 [obdclass]
[<ffffffffa04ff225>] cl_lock_cancel0+0x75/0x160 [obdclass]
[<ffffffffa04fff0b>] cl_lock_cancel+0x13b/0x140 [obdclass]
[<ffffffffa0977bba>] osc_ldlm_blocking_ast+0x13a/0x380 [osc]
[<ffffffffa062a123>] ldlm_handle_bl_callback+0x123/0x2e0 [ptlrpc]
[<ffffffffa062a561>] ldlm_bl_thread_main+0x281/0x3d0 [ptlrpc]
[<ffffffff8105fa40>] ? default_wake_function+0x0/0x20
[<ffffffffa062a2e0>] ? ldlm_bl_thread_main+0x0/0x3d0 [ptlrpc]
[<ffffffff8100c0ca>] child_rip+0xa/0x20
[<ffffffffa062a2e0>] ? ldlm_bl_thread_main+0x0/0x3d0 [ptlrpc]
[<ffffffffa062a2e0>] ? ldlm_bl_thread_main+0x0/0x3d0 [ptlrpc]
[<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Code: fd 48 c7 c7 00 31 99 a0 e8 bd 60 b7 e0 49 8b 5c 24 28 48 85 db 0f 84 7f 00 00 00 49 c7 44 24 28 00 00 00 00 48 c7 c0 00 31 99 a0 <48> c7 83 70 01 00 00 00 00 00 00 49 c7 44 24 60 00 00 00 00 66
RIP [<ffffffffa0976751>] osc_lock_detach+0x51/0x1b0 [osc]
RSP <ffff88016223bd40>
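
For triage, the register dump above can be scanned mechanically. The following is a minimal sketch, not part of the original report: the helper names `parse_registers` and `repeated_byte_registers` are hypothetical, and the idea that a register filled with one repeated byte (such as 5a5a5a5a5a5a5a5a in RBX, R10 and R11) points at poisoned or already-freed memory is an assumption that this ticket does not confirm.

```python
import re

# Matches general-purpose register entries such as "RAX: ffffffffa0993100".
# Entries with a segment prefix (e.g. "RSP: 0018:...") are intentionally skipped.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}): ([0-9a-f]{16})\b")

# Register lines copied from the oops in this report.
SAMPLE = """\
RAX: ffffffffa0993100 RBX: 5a5a5a5a5a5a5a5a RCX: 0000000000000000
RDX: 000000000000e4f9 RSI: ffff8801f08ead78 RDI: ffffffffa0993100
RBP: ffff88016223bd70 R08: 0000000000000000 R09: ffff88014f9a2400
R10: 5a5a5a5a5a5a5a5a R11: 5a5a5a5a5a5a5a5a R12: ffff8801f08ead78
R13: ffff8801f41a9b58 R14: ffff8801f41a9b58 R15: ffff88020cf45240
"""


def parse_registers(oops_text):
    """Return {register name: 16-digit hex string} found in the oops text."""
    return dict(REG_RE.findall(oops_text))


def repeated_byte_registers(regs):
    """Return registers whose value is one byte repeated eight times.

    All-zero and all-ff values are ignored because 0 and -1 are ordinary.
    Treating other repeated bytes as a poison pattern is an assumption.
    """
    hits = {}
    for name, value in regs.items():
        byte = value[:2]
        if byte not in ("00", "ff") and value == byte * 8:
            hits[name] = value
    return hits


if __name__ == "__main__":
    suspicious = repeated_byte_registers(parse_registers(SAMPLE))
    for name in sorted(suspicious):
        print(name, suspicious[name])
```

Run against the full console log, a check like this only flags candidates for closer inspection; it does not identify which object was freed or by whom.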