Details
Bug
Resolution: Cannot Reproduce
Minor
None
Lustre 2.4.0
3
6169
Description
I hit this today on a few of the IONs for Sequoia. I was running a mount and umount of the filesystem in a loop; upon killing the script, I hit the crash below on some of the nodes:
LustreError: 8462:0:(llite_lib.c:543:client_common_fill_super()) cannot start close thread: rc -513
Unable to handle kernel paging request for data at address 0x00000010
Faulting instruction address: 0x80000000046825cc
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=68 Blue Gene/Q
Modules linked in: lmv(U) mgc(U) lustre(U) mdc(U) fid(U) fld(U) lov(U) osc(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) bgvrnic bgmudm
NIP: 80000000046825cc LR: 8000000003bc3acc CTR: 8000000004682590
REGS: c0000003c4a0eaa0 TRAP: 0300 Not tainted (2.6.32-220.23.3.bgq.18llnl.V1R1M2.bgq62_16.ppc64)
MSR: 0000000080029000 <EE,ME,CE> CR: 44088488 XER: 20000000
DEAR: 0000000000000010, ESR: 0000000000000000
TASK = c0000003e4fdf360[8462] 'mount.lustre' THREAD: c0000003c4a0c000 CPU: 3
GPR00: 0000000003060580 c0000003c4a0ed20 80000000046f07a0 c0000003ce1726f0
GPR04: c000000360b9f800 00000000000012e0 c0000003c4a0f0b0 00000000640a0000
GPR08: c000000360b9f954 0000000000000000 0000000000000001 8000000004684240
GPR12: 8000000003bfebb0 c000000000764c00 c0000003c4a68000 00000000000005c2
GPR16: 0000000000000001 0000000000002d20 0000000002000400 000000000000028a
GPR20: 0000000000000295 0000000000020000 80000000025220e0 0000000000000001
GPR24: 0000000000000000 c0000003ce16c138 8000000000b2424c 0000000040080000
GPR28: c0000003ce1726f0 8000000000b24248 80000000046eda40 c0000003c4a0ed20
NIP [80000000046825cc] .osc_import_event+0xa5c/0x26d0 [osc]
LR [8000000003bc3acc] .ptlrpc_deactivate_import+0x1fc/0x7d0 [ptlrpc]
Call Trace:
[c0000003c4a0ed20] [c000000360b9f8b0] 0xc000000360b9f8b0 (unreliable)
[c0000003c4a0ee60] [8000000003bc3acc] .ptlrpc_deactivate_import+0x1fc/0x7d0 [ptlrpc]
[c0000003c4a0ef30] [8000000003bc47b8] .ptlrpc_invalidate_import+0x1d8/0xef0 [ptlrpc]
[c0000003c4a0f0d0] [8000000004690238] .osc_precleanup+0x2a8/0x720 [osc]
[c0000003c4a0f190] [80000000024ac3a0] .class_cleanup+0x240/0x17a0 [obdclass]
[c0000003c4a0f310] [80000000024b3208] .class_process_config+0x20f8/0x4b00 [obdclass]
[c0000003c4a0f460] [80000000024b614c] .class_manual_cleanup+0x53c/0x1760 [obdclass]
[c0000003c4a0f5c0] [800000000691d6e4] .ll_put_super+0x2c4/0x800 [lustre]
[c0000003c4a0f750] [800000000691e3d8] .ll_fill_super+0x7b8/0xae20 [lustre]
[c0000003c4a0f900] [80000000024e1144] .lustre_fill_super+0x4c4/0x8e0 [obdclass]
[c0000003c4a0f9d0] [c0000000000d4508] .get_sb_nodev+0x84/0xe8
[c0000003c4a0fa80] [80000000024ba618] .lustre_get_sb+0x28/0x40 [obdclass]
[c0000003c4a0fb10] [c0000000000d2f14] .vfs_kern_mount+0x80/0x114
[c0000003c4a0fbc0] [c0000000000d3010] .do_kern_mount+0x58/0x130
[c0000003c4a0fc80] [c0000000000f12fc] .do_mount+0x84c/0x908
[c0000003c4a0fd70] [c0000000000f1470] .SyS_mount+0xb8/0x124
[c0000003c4a0fe30] [c000000000000580] syscall_exit+0x0/0x2c
Instruction dump:
419e00f4 801a0000 20b14000 7fa50040 41dd1868 801d0000 780907e1 40820108
eb7900c0 777b4008 4182097c e9390000 <e9290010> eb6901c0 2fbb0000 419e0d10
Kernel panic - not syncing: Fatal exception
Call Trace:
[c0000003c4a0e7d0] [c000000000008d1c] .show_stack+0x7c/0x184 (unreliable)
[c0000003c4a0e880] [c000000000431ef4] .panic+0x80/0x1ac
[c0000003c4a0e910] [c000000000019d40] .die+0x1a4/0x1bc
[c0000003c4a0e9b0] [c00000000001f95c] .bad_page_fault+0xb8/0xd4
[c0000003c4a0ea30] [c000000000014e4c] storage_fault_common+0x48/0x4c
--- Exception: 300 at .osc_import_event+0xa5c/0x26d0 [osc]
LR = .ptlrpc_deactivate_import+0x1fc/0x7d0 [ptlrpc]
[c0000003c4a0ed20] [c000000360b9f8b0] 0xc000000360b9f8b0 (unreliable)
[c0000003c4a0ee60] [8000000003bc3acc] .ptlrpc_deactivate_import+0x1fc/0x7d0 [ptlrpc]
[c0000003c4a0ef30] [8000000003bc47b8] .ptlrpc_invalidate_import+0x1d8/0xef0 [ptlrpc]
[c0000003c4a0f0d0] [8000000004690238] .osc_precleanup+0x2a8/0x720 [osc]
[c0000003c4a0f190] [80000000024ac3a0] .class_cleanup+0x240/0x17a0 [obdclass]
[c0000003c4a0f310] [80000000024b3208] .class_process_config+0x20f8/0x4b00 [obdclass]
[c0000003c4a0f460] [80000000024b614c] .class_manual_cleanup+0x53c/0x1760 [obdclass]
[c0000003c4a0f5c0] [800000000691d6e4] .ll_put_super+0x2c4/0x800 [lustre]
[c0000003c4a0f750] [800000000691e3d8] .ll_fill_super+0x7b8/0xae20 [lustre]
[c0000003c4a0f900] [80000000024e1144] .lustre_fill_super+0x4c4/0x8e0 [obdclass]
[c0000003c4a0f9d0] [c0000000000d4508] .get_sb_nodev+0x84/0xe8
[c0000003c4a0fa80] [80000000024ba618] .lustre_get_sb+0x28/0x40 [obdclass]
[c0000003c4a0fb10] [c0000000000d2f14] .vfs_kern_mount+0x80/0x114
[c0000003c4a0fbc0] [c0000000000d3010] .do_kern_mount+0x58/0x130
[c0000003c4a0fc80] [c0000000000f12fc] .do_mount+0x84c/0x908
[c0000003c4a0fd70] [c0000000000f1470] .SyS_mount+0xb8/0x124
[c0000003c4a0fe30] [c000000000000580] syscall_exit+0x0/0x2c
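For anyone trying to reproduce this, the mount/umount loop described above could be sketched roughly as follows. This is not the original script: the filesystem spec `mgsnode@o2ib:/testfs`, the mount point `/mnt/lustre`, and the helper names `run` and `mount_umount_loop` are all hypothetical placeholders. The `rc -513` in the first log line matches the kernel-internal `ERESTARTNOINTR` value, so the kill presumably has to land while `mount.lustre` is still inside the mount call; hitting that window may take many attempts.

```shell
#!/bin/sh
# Hypothetical reproducer sketch; FSSPEC and MNT are placeholders,
# not values taken from the original report.
FSSPEC="${FSSPEC:-mgsnode@o2ib:/testfs}"
MNT="${MNT:-/mnt/lustre}"

# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount and unmount the client the given number of times,
# stopping at the first failure.
mount_umount_loop() {
    mul_count="$1"
    mul_i=0
    while [ "$mul_i" -lt "$mul_count" ]; do
        run mount -t lustre "$FSSPEC" "$MNT" || return 1
        run umount "$MNT" || return 1
        mul_i=$((mul_i + 1))
    done
}

# Example (as root on a client node), interrupted with Ctrl-C mid-run:
#   mount_umount_loop 1000
```

Setting `DRY_RUN=1` prints the commands instead of executing them, which is a cheap way to check the spec and mount point before running the loop on a real node.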