Details
-
Bug
-
Resolution: Cannot Reproduce
-
Critical
-
None
-
Lustre 2.10.0
-
Soak performance test cluster. Test patch for
LU-9504. Lustre-reviews build 47608 - version=2.9.57_70_gfcd7bdd
-
3
-
9223372036854775807
Description
OSS failover completed 2017-05-23 02:45:41
While OST were remounting/recovering, MDT3 hit an LBUG:
10) was lost; in progress operations using this service will wait for recovery to complete May 23 02:45:06 soak-10 kernel: Lustre: Skipped 3 previous similar messages May 23 02:45:11 soak-10 chronyd[1310]: Source 204.11.201.12 replaced with 198.58.110.84 May 23 02:45:19 soak-10 kernel: LustreError: 4833:0:(osp_dev.c:755:osp_statfs()) ASSERTION( sfs->os_fprecreated <= OST_MAX_PRECREATE * 2 ) failed: last_created [0x540000401:0x2e43841:0x0], next_fid [0x540000401:0x2e43841:0x0], reserved 19 May 23 02:45:19 soak-10 kernel: LustreError: 4833:0:(osp_dev.c:755:osp_statfs()) LBUG May 23 02:45:19 soak-10 kernel: Pid: 4833, comm: mdt00_016 May 23 02:45:19 soak-10 kernel: #012Call Trace: May 23 02:45:19 soak-10 kernel: [<ffffffffa0bbe7ee>] libcfs_call_trace+0x4e/0x60 [libcfs] May 23 02:45:19 soak-10 kernel: [<ffffffffa0bbe87c>] lbug_with_loc+0x4c/0xb0 [libcfs] May 23 02:45:19 soak-10 kernel: [<ffffffffa14e7764>] osp_statfs+0x404/0x480 [osp] May 23 02:45:19 soak-10 kernel: [<ffffffffa1443251>] lod_statfs_and_check+0xa1/0x580 [lod] May 23 02:45:19 soak-10 kernel: [<ffffffffa1445154>] lod_alloc_rr.constprop.18+0x494/0x1000 [lod] May 23 02:45:19 soak-10 kernel: [<ffffffffa1448979>] lod_qos_prep_create+0x12b9/0x17f0 [lod] May 23 02:45:19 soak-10 kernel: [<ffffffffa1449428>] lod_prepare_create+0x298/0x3f0 [lod] May 23 02:45:19 soak-10 kernel: [<ffffffffa143df8e>] lod_declare_striped_create+0x1ee/0x970 [lod] May 23 02:45:19 soak-10 kernel: [<ffffffffa14404bc>] lod_declare_create+0x1fc/0x570 [lod] May 23 02:45:19 soak-10 kernel: [<ffffffffa14ab97f>] mdd_declare_create_object_internal+0xdf/0x2f0 [mdd] May 23 02:45:19 soak-10 kernel: [<ffffffffa149d2f3>] mdd_declare_create+0x53/0xe20 [mdd] May 23 02:45:19 soak-10 kernel: [<ffffffffa0bc6574>] ? libcfs_log_return+0x24/0x30 [libcfs] May 23 02:45:19 soak-10 kernel: [<ffffffffa14a10d9>] mdd_create+0x7c9/0x1330 [mdd] May 23 02:45:19 soak-10 kernel: [<ffffffffa1377302>] mdt_reint_open+0x2192/0x31a0 [mdt] May 23 02:45:19 soak-10 kernel: [<ffffffffa136c500>] mdt_reint_rec+0x80/0x210 [mdt] May 23 02:45:19 soak-10 kernel: [<ffffffffa134e46b>] mdt_reint_internal+0x5fb/0x9c0 [mdt] May 23 02:45:19 soak-10 kernel: [<ffffffffa134e992>] mdt_intent_reint+0x162/0x430 [mdt] May 23 02:45:19 soak-10 kernel: [<ffffffffa135970e>] mdt_intent_policy+0x43e/0xc70 [mdt] May 23 02:45:19 soak-10 kernel: [<ffffffffa0ee0167>] ldlm_lock_enqueue+0x387/0x970 [ptlrpc] May 23 02:45:19 soak-10 kernel: [<ffffffffa0f09ab3>] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc] May 23 02:45:19 soak-10 kernel: [<ffffffffa0bc9d87>] ? libcfs_debug_msg+0x57/0x80 [libcfs] May 23 02:45:19 soak-10 kernel: [<ffffffffa0f8ef62>] tgt_enqueue+0x62/0x210 [ptlrpc] May 23 02:45:19 soak-10 kernel: [<ffffffffa0f92225>] tgt_request_handle+0x915/0x1360 [ptlrpc] May 23 02:45:19 soak-10 kernel: [<ffffffffa0f3b3e3>] ptlrpc_server_handle_request+0x233/0xa90 [ptlrpc] May 23 02:45:19 soak-10 kernel: [<ffffffffa0f3f3f0>] ptlrpc_main+0xaa0/0x1dd0 [ptlrpc] May 23 02:45:19 soak-10 kernel: [<ffffffffa0f3e950>] ? ptlrpc_main+0x0/0x1dd0 [ptlrpc] May 23 02:45:19 soak-10 kernel: [<ffffffff810b0a4f>] kthread+0xcf/0xe0 May 23 02:45:19 soak-10 kernel: [<ffffffff810b0980>] ? kthread+0x0/0xe0 May 23 02:45:19 soak-10 kernel: [<ffffffff81697318>] ret_from_fork+0x58/0x90 May 23 02:45:19 soak-10 kernel: [<ffffffff810b0980>] ? kthread+0x0/0xe0 May 23 02:45:19 soak-10 kernel: May 23 02:45:19 soak-10 kernel: LustreError: dumping log to /tmp/lustre-log.1495507519.4833 May 23 02:45:31 soak-10 kernel: LustreError: 11-0: soaked-OST0015-osc-MDT0002: operation obd_ping to node 192.168.1.104@o2ib10 failed: rc = -107 May 23 02:45:31 soak-10 kernel: Lustre: soaked-OST0015-osc-MDT0002: Connection to soaked-OST0015 (at 192.168.1.104@o2ib10) was lost; in progress operations using this service will wait for recovery to complete May 23 02:45:31 soak-10 kernel: Lustre: Skipped 1 previous similar message May 23 02:45:43 soak-10 kernel: NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [mdt00_006:4716]
System is crash dumping atm. Will attach.
Attachments
Issue Links
- is related to
-
LU-9504 LBUG ptlrpc_handle_rs()) ASSERTION( lock != ((void *)0) ) failed
- Resolved