Details
- Type: Bug
- Resolution: Fixed
- Priority: Critical
- Version: Lustre 2.8.0
- Severity: 3
Description
Found this during a 24-hour failover test on hyperion.
2015-07-22 21:06:58 Lustre: DEBUG MARKER: ==== Checking the clients loads AFTER failover -- failure NOT OK
2015-07-22 21:08:08 LustreError: 6302:0:(update_recovery.c:726:update_is_committed()) lustre-MDT0001: master transno 4294990737 committed 4294990976
2015-07-22 21:08:08 LustreError: 6302:0:(update_recovery.c:738:update_is_committed()) LBUG
2015-07-22 21:08:08 Pid: 6302, comm: tgt_recov
2015-07-22 21:08:08
2015-07-22 21:08:08 Call Trace:
2015-07-22 21:08:08 [<ffffffffa04a7875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
2015-07-22 21:08:08 [<ffffffffa04a7e77>] lbug_with_loc+0x47/0xb0 [libcfs]
2015-07-22 21:08:08 [<ffffffffa08ed00e>] update_recovery_exec+0x1ebe/0x1f20 [ptlrpc]
2015-07-22 21:08:08 [<ffffffffa04b3c01>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
2015-07-22 21:08:08 [<ffffffffa08eeed7>] distribute_txn_replay_handle+0x387/0x10c0 [ptlrpc]
2015-07-22 21:08:08 [<ffffffffa04b3c01>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
2015-07-22 21:08:08 [<ffffffffa0824483>] ? target_recovery_overseer+0x93/0x320 [ptlrpc]
2015-07-22 21:08:08 [<ffffffffa082b39f>] target_recovery_thread+0x92f/0x24d0 [ptlrpc]
2015-07-22 21:08:08 [<ffffffffa082aa70>] ? target_recovery_thread+0x0/0x24d0 [ptlrpc]
2015-07-22 21:08:08 [<ffffffff8109e78e>] kthread+0x9e/0xc0
2015-07-22 21:08:08 [<ffffffff8100c28a>] child_rip+0xa/0x20
2015-07-22 21:08:08 [<ffffffff8109e6f0>] ? kthread+0x0/0xc0
2015-07-22 21:08:08 [<ffffffff8100c280>] ? child_rip+0x0/0x20
After checking the log, it seems the update log object is missing after recovery, so we probably need more synchronization around update llog creation.
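For anyone triaging similar console logs, the two numbers in the first LustreError line can be pulled out and compared with a small script. This is only a rough sketch, not part of Lustre: the helper name parse_transno_line and the regular expression are made up for illustration and assume the message format shown in the log above. It shows only that the master transno is lower than the committed transno reported by update_is_committed(), and makes no claim about the check that leads to the LBUG.

```python
import re

# Hypothetical pattern, assuming the message format printed by
# update_is_committed() in the console log above.
TRANSNO_RE = re.compile(
    r"(?P<target>\S+): master transno (?P<master>\d+) committed (?P<committed>\d+)"
)


def parse_transno_line(line):
    """Return (target, master_transno, committed_transno), or None if no match."""
    m = TRANSNO_RE.search(line)
    if m is None:
        return None
    return m.group("target"), int(m.group("master")), int(m.group("committed"))


line = (
    "2015-07-22 21:08:08 LustreError: 6302:0:(update_recovery.c:726:"
    "update_is_committed()) lustre-MDT0001: master transno 4294990737 "
    "committed 4294990976"
)

target, master, committed = parse_transno_line(line)
# Report the target, whether the master transno is at or below the
# committed transno, and the gap between the two values.
print(target, master <= committed, committed - master)
```

For the line above, the master transno is 239 below the committed transno on lustre-MDT0001.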
Issue Links
- is related to LU-6831 The ticket for tracking all DNE2 bugs (Reopened)