Details
-
Bug
-
Resolution: Fixed
-
Blocker
-
Lustre 2.4.0
-
None
-
4372
Description
This is the second time I have seen a crash like this. The first time was in test 118m, now in test 27m, but with the same assertion and stack.
It appears to be related to out-of-space (OOS) handling:
[10419.559628] Lustre: DEBUG MARKER: == sanity test 27m: create file while OST0 was full ==================== 22:00:08 (1349402408)
[10437.623359] LustreError: 16640:0:(vvp_io.c:1038:vvp_io_commit_write()) Write page 37435 of inode ffff88016e63bb20 failed -28
[10437.775336] LustreError: 12752:0:(osp_precreate.c:275:osp_precreate_send()) lustre-OST0000-osc-MDT0000: can't precreate: rc = -28
[10439.841663] LustreError: 12937:0:(lod_qos.c:1147:lod_alloc_qos()) can't declare new object on #0: -28
[10439.843142] LustreError: 12937:0:(lod_qos.c:1159:lod_alloc_qos()) Didn't find any OSTs?
[10439.844380] LustreError: 12937:0:(lod_qos.c:1163:lod_alloc_qos()) ASSERTION( nfound == stripe_cnt ) failed:
[10439.845971] LustreError: 12937:0:(lod_qos.c:1163:lod_alloc_qos()) LBUG
[10439.847156] Pid: 12937, comm: mdt00_004
[10439.848067]
[10439.848067] Call Trace:
[10439.848838] [<ffffffffa074d915>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[10439.850378] [<ffffffffa074df27>] lbug_with_loc+0x47/0xb0 [libcfs]
[10439.851372] [<ffffffffa0c39217>] lod_alloc_qos.clone.0+0x8e7/0x1170 [lod]
[10439.852475] [<ffffffffa0c3b303>] lod_qos_prep_create+0x693/0x18e4 [lod]
[10439.853557] [<ffffffffa0c36a8b>] lod_declare_striped_object+0x14b/0x920 [lod]
[10439.854839] [<ffffffffa0c37568>] lod_declare_object_create+0x308/0x4f0 [lod]
[10439.855969] [<ffffffffa06ffc4f>] mdd_declare_object_create_internal+0xaf/0x1d0 [mdd]
[10439.857238] [<ffffffffa0710aca>] mdd_create+0x39a/0x1550 [mdd]
[10439.858190] [<ffffffffa0b7bbc9>] mdt_reint_open+0x1079/0x1860 [mdt]
[10439.859199] [<ffffffffa071686e>] ? md_ucred+0x1e/0x60 [mdd]
[10439.860114] [<ffffffffa0b46655>] ? mdt_ucred+0x15/0x20 [mdt]
[10439.861049] [<ffffffffa0b660a1>] mdt_reint_rec+0x41/0xe0 [mdt]
[10439.862006] [<ffffffffa0b5f483>] mdt_reint_internal+0x4e3/0x7e0 [mdt]
[10439.863046] [<ffffffffa0b5fa4d>] mdt_intent_reint+0x1ed/0x500 [mdt]
[10439.864064] [<ffffffffa0b5b3fe>] mdt_intent_policy+0x38e/0x770 [mdt]
[10439.865125] [<ffffffffa022ddda>] ldlm_lock_enqueue+0x2ea/0x890 [ptlrpc]
[10439.866208] [<ffffffffa0254fc7>] ldlm_handle_enqueue0+0x4e7/0x1010 [ptlrpc]
[10439.867328] [<ffffffffa0b5b936>] mdt_enqueue+0x46/0x130 [mdt]
[10439.868292] [<ffffffffa0b4f1f2>] mdt_handle_common+0x932/0x1740 [mdt]
[10439.869535] [<ffffffffa0b500d5>] mdt_regular_handle+0x15/0x20 [mdt]
[10439.870577] [<ffffffffa0283743>] ptlrpc_server_handle_request+0x463/0xe70 [ptlrpc]
[10439.871790] [<ffffffffa074e66e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
[10439.873067] [<ffffffffa027c431>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc]
[10439.874154] [<ffffffff81051f73>] ? __wake_up+0x53/0x70
[10439.875000] [<ffffffffa02862ce>] ptlrpc_main+0xb8e/0x1960 [ptlrpc]
[10439.876019] [<ffffffffa0285740>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
[10439.877234] [<ffffffff8100c14a>] child_rip+0xa/0x20
[10439.878049] [<ffffffffa0285740>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
[10439.879049] [<ffffffffa0285740>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
[10439.880047] [<ffffffff8100c140>] ? child_rip+0x0/0x20
[10439.880889]
[10439.881508] Kernel panic - not syncing: LBUG
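
The sequence in the log (object declaration on OST #0 fails with -28, no OSTs are found, then ASSERTION( nfound == stripe_cnt ) fires) can be illustrated with a toy model. This is only a sketch under assumptions, not Lustre code and not the actual fix: alloc_stripes and declare_results are hypothetical names, and the real lod_alloc_qos() logic is not shown in this ticket. It demonstrates the general idea of returning -ENOSPC to the caller when too few targets are available, rather than asserting.

```python
import errno


def alloc_stripes(declare_results, stripe_cnt):
    """Toy model of stripe allocation (hypothetical, not Lustre code).

    declare_results[i] is the return code from declaring a new object
    on OST i: 0 on success, a negative errno (e.g. -28) on failure.
    Returns (rc, chosen_ost_indices).
    """
    chosen = []
    for idx, rc in enumerate(declare_results):
        if len(chosen) == stripe_cnt:
            break  # enough stripes found
        if rc == 0:
            chosen.append(idx)  # this OST accepted the declaration

    # An "assert len(chosen) == stripe_cnt" here would crash whenever
    # every OST is full. Reporting the shortfall as an error instead
    # lets the caller fail the create cleanly.
    if len(chosen) != stripe_cnt:
        return -errno.ENOSPC, []
    return 0, chosen


if __name__ == "__main__":
    # Single full OST, one stripe requested: mirrors the case in the log.
    print(alloc_stripes([-errno.ENOSPC], 1))
    # One of three OSTs full, two stripes requested.
    print(alloc_stripes([0, -errno.ENOSPC, 0], 2))
```

In this model the first call returns an ENOSPC error with no stripes, which corresponds to the point where the log shows the LBUG.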