Details
-
Bug
-
Resolution: Fixed
-
Minor
-
None
-
None
-
3
-
9223372036854775807
Description
This issue was created by maloo for Oleg Drokin <green@whamcloud.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/de8e19f6-88ee-11e6-ad53-5254006e85c2.
The sub-test test_0a failed with the following error:
test failed to respond and timed out
Now, this is with dm_flakey patch that might have introduced some unexpected failure behavior, but even so it highlights an assert that could happen as it turns out, so we better look at this in more details
Assertion was introcuded by FanYong's patch commit cecde8bdb4913fd4405d425b0bf3aead03181e9d for LU-8218
MDS crashed with:
15:13:56:[ 1124.877616] LustreError: 167-0: lustre-MDT0000-lwp-MDT0002: This client was evicted by lustre-MDT0000; in progress operations using this service will fail. 15:13:56:[ 1124.908734] LDISKFS-fs error (device dm-6): mb_free_blocks:1453: group 4, block 68664:freeing already freed block (bit 1560); block bitmap corrupt. 15:13:56:[ 1124.914666] Aborting journal on device dm-6-8. 15:13:56:[ 1124.978308] LDISKFS-fs (dm-6): Remounting filesystem read-only 15:13:56:[ 1124.982226] LDISKFS-fs warning (device dm-6): ldiskfs_mb_generate_buddy:761: group 4: 15957 blocks in bitmap, 15959 in bb, 16445 in gd, 1 pa's block bitmap corrupt 15:13:56:[ 1124.987219] LDISKFS-fs error (device dm-6) in osd_trans_stop:1834: IO failure 15:13:56:[ 1124.990016] LustreError: 6145:0:(osd_handler.c:1837:osd_trans_stop()) lustre-MDT0000: failed to stop transaction: rc = -5 15:13:56:[ 1124.992521] Lustre: lustre-MDT0000: Recovery over after 0:04, of 5 clients 5 recovered and 0 were evicted. 15:13:56:[ 1125.992942] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 4 sec 15:13:56:[ 1126.229054] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 4 sec 15:13:56:[ 1126.241517] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 4 sec 15:13:56:[ 1126.476588] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 4 sec 15:13:56:[ 1127.807784] format at osd_handler.c:1182:osd_fid_lookup doesn't end in newline 15:13:56:[ 1127.812857] LustreError: 800:0:(osd_handler.c:1182:osd_fid_lookup()) ASSERTION( tid->oii_ino == id->oii_ino && tid->oii_gen == id->oii_gen ) failed: OI mapping changed(2): 139/537564674 => 139/1570612830 15:13:56:[ 1127.815844] LustreError: 800:0:(osd_handler.c:1182:osd_fid_lookup()) LBUG 15:13:56:[ 1127.819571] Pid: 800, comm: mdt00_001 15:13:56:[ 1127.821424] 15:13:56:[ 1127.821424] Call Trace: 15:13:56:[ 1127.824770] [<ffffffffa06987d3>] libcfs_debug_dumpstack+0x53/0x80 [libcfs] 15:13:56:[ 1127.826840] [<ffffffffa0698d75>] lbug_with_loc+0x45/0xc0 [libcfs] 15:13:56:[ 1127.828862] [<ffffffffa0c250bb>] osd_fid_lookup+0x175b/0x1800 [osd_ldiskfs] 15:13:56:[ 1127.830930] [<ffffffffa0c251b5>] osd_object_init+0x55/0xf0 [osd_ldiskfs] 15:13:56:[ 1127.833042] [<ffffffffa07d68cf>] lu_object_alloc+0xdf/0x310 [obdclass] 15:13:56:[ 1127.835107] [<ffffffffa07d6ccc>] lu_object_find_try+0x16c/0x2b0 [obdclass] 15:13:56:[ 1127.837173] [<ffffffffa07d6ebc>] lu_object_find_at+0xac/0xe0 [obdclass] 15:13:56:[ 1127.839279] [<ffffffffa0e9bb65>] ? lod_index_lookup+0x25/0x30 [lod] 15:13:56:[ 1127.841285] [<ffffffffa0ef5f4f>] ? __mdd_lookup.isra.17+0x26f/0x450 [mdd] 15:13:56:[ 1127.843369] [<ffffffffa07d6f06>] lu_object_find+0x16/0x20 [obdclass] 15:13:56:[ 1127.845403] [<ffffffffa0dbff5b>] mdt_object_find+0x4b/0x170 [mdt] 15:13:56:[ 1127.847443] [<ffffffffa0dc3ab6>] mdt_getattr_name_lock+0x746/0x1900 [mdt] 15:13:56:[ 1127.849494] [<ffffffffa0dc4f20>] mdt_intent_getattr+0x2b0/0x480 [mdt] 15:13:56:[ 1127.851464] [<ffffffffa0dc8b3c>] mdt_intent_policy+0x5bc/0xbb0 [mdt] 15:13:56:[ 1127.853443] [<ffffffffa09ae1e7>] ldlm_lock_enqueue+0x387/0x970 [ptlrpc] 15:13:56:[ 1127.855381] [<ffffffffa09d7363>] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc] 15:13:56:[ 1127.857316] [<ffffffffa09fef40>] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc] 15:13:56:[ 1127.859274] [<ffffffffa0a57bf2>] tgt_enqueue+0x62/0x210 [ptlrpc] 15:13:56:[ 1127.861113] [<ffffffffa0a5c055>] tgt_request_handle+0x915/0x1320 [ptlrpc] 15:13:56:[ 1127.862983] [<ffffffffa0a07fdb>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc] 15:13:56:[ 1127.864882] [<ffffffffa0a05b98>] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] 15:13:56:[ 1127.866693] [<ffffffff810b8952>] ? default_wake_function+0x12/0x20 15:13:56:[ 1127.868447] [<ffffffff810af0b8>] ? __wake_up_common+0x58/0x90 15:13:56:[ 1127.870236] [<ffffffffa0a0c090>] ptlrpc_main+0xaa0/0x1de0 [ptlrpc] 15:13:56:[ 1127.871995] [<ffffffffa0a0b5f0>] ? ptlrpc_main+0x0/0x1de0 [ptlrpc] 15:13:56:[ 1127.873738] [<ffffffff810a5b8f>] kthread+0xcf/0xe0 15:13:56:[ 1127.875343] [<ffffffff810a5ac0>] ? kthread+0x0/0xe0 15:13:56:[ 1127.876914] [<ffffffff81646b98>] ret_from_fork+0x58/0x90 15:13:56:[ 1127.878478] [<ffffffff810a5ac0>] ? kthread+0x0/0xe0 15:13:56:[ 1127.879973] 15:13:56:[ 1127.883734] Kernel panic - not syncing: LBUG
Info required for matching: replay-single 0a