Details
- Bug
- Resolution: Fixed
- Minor
- Lustre 2.5.3, Lustre 2.8.0
- None
- 2
Description
LustreError: 11-0: hw_nb-OST0016-osc-MDT0000: Communicating with 10.151.26.55@o2ib, operation ost_connect failed with -114.
LustreError: 6488:0:(llog_cat.c:866:llog_cat_init_and_process()) hw_nb-OST0024-osc-MDT0000: llog_process() with cat_cancel_cb failed: rc = -5
LustreError: 6580:0:(osp_sync.c:874:osp_sync_thread()) ASSERTION( rc == 0 || rc == LLOG_PROC_BREAK ) failed: 0 changes, 0 in progress, 0 in flight: -5
LustreError: 6580:0:(osp_sync.c:874:osp_sync_thread()) LBUG
Pid: 6580, comm: osp-syn-36-0
Call Trace:
[<ffffffffa05cf895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa05cfe97>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa10d9243>] osp_sync_thread+0x753/0x7d0 [osp]
[<ffffffff81559b9e>] ? thread_return+0x4e/0x770
[<ffffffffa10d8af0>] ? osp_sync_thread+0x0/0x7d0 [osp]
Entering kdb (current=0xffff8803b5e04080, pid 6580) on processor 3 Oops: (null)
due to oops @ 0x0
kdba_dumpregs: pt_regs not available, use bt* or pid to select a different task
[3]kdb>
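For readers less familiar with the numeric return codes in the log above, here is a minimal Python sketch that pulls them out and maps them to Linux errno names. The helper name `decode_rc` and the small lookup table are illustrative assumptions, not part of Lustre; the table covers only the two codes seen in this log (on Linux, 5 is EIO and 114 is EALREADY).

```python
import re

# Linux errno numbers for the two codes that appear in the log above.
# This table is deliberately tiny; it is an illustration, not a full errno map.
LINUX_ERRNO = {5: "EIO", 114: "EALREADY"}

def decode_rc(log_text):
    """Return (rc, errno_name) pairs for negative return codes in the log text.

    Hypothetical helper: it only recognises the three phrasings used in
    this particular log ("failed with -N", "rc = -N", "in flight: -N").
    """
    codes = []
    for match in re.finditer(r"(?:failed with |rc = |in flight: )(-\d+)", log_text):
        rc = int(match.group(1))
        codes.append((rc, LINUX_ERRNO.get(-rc, "unknown")))
    return codes
```

Run against the log above, this would report the `ost_connect` failure as EALREADY and both the `llog_process()` failure and the value printed by the assertion as EIO.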
Issue Links
- is related to
  - LU-9068 Hardware problem resulting in bad blocks (Resolved)
  - LU-8252 MDS kernel panic after aborting journal (Resolved)
  - LU-7011 Kernel part of llog subsystem can do self-repairing in some cases (Resolved)
- is related to
  - LU-5056 osp_sync_thread()) ASSERTION( rc == 0 || rc == LLOG_PROC_BREAK ) failed: 6 changes, 8 in progress, 0 in flight: -5 (Resolved)
Well, the llog is corrupted in some strange way. Meanwhile, I've found that the llog contained 4 records with indices 61, 62, 63 and 64, while the llog itself contains only 3 records: 62, 63 and 64. Everything before those records is just garbage. I've fixed the llog manually, so it looks healthy now and contains those three records:
# lustre/utils/llog_reader cb2000d_9c396a65_fixed
Bit 0 of 3 not set
rec #62 type=1064553b len=64
rec #63 type=1064553b len=64
rec #64 type=1064553b len=64
Header size : 8192
Time : Fri Nov 7 09:00:21 2008
Number of records: 3
Target uuid :
-----------------------
#62 (064)ogen=0 name=0x3bf:1
#63 (064)ogen=0 name=0x419:1
#64 (064)ogen=0 name=0x448:1
That might help revive the MDS, with access to at least those plain llogs.
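As a rough way to sanity-check a repaired llog, the Python sketch below compares the `rec #N` lines in the llog_reader output above with its `Number of records` field. The function name `check_llog_reader_output` is hypothetical, and the parsing assumes only the output format shown in this comment; other llog_reader versions may print something different.

```python
import re

# Sample copied from the llog_reader output quoted above.
SAMPLE = """\
Bit 0 of 3 not set
rec #62 type=1064553b len=64
rec #63 type=1064553b len=64
rec #64 type=1064553b len=64
Header size : 8192
Time : Fri Nov 7 09:00:21 2008
Number of records: 3
"""

def check_llog_reader_output(text):
    """Summarise the record indices found in llog_reader output.

    Hypothetical helper: it relies on the 'rec #N' and
    'Number of records: N' phrasings seen in this ticket.
    """
    # Record indices in the order they are listed.
    indices = [int(n) for n in re.findall(r"\brec #(\d+)", text)]
    count_match = re.search(r"Number of records:\s*(\d+)", text)
    declared = int(count_match.group(1)) if count_match else None
    # True when every listed index is exactly one more than the previous one.
    contiguous = all(b - a == 1 for a, b in zip(indices, indices[1:]))
    return {
        "indices": indices,
        "declared": declared,
        "count_matches": declared == len(indices),
        "contiguous": contiguous,
    }

if __name__ == "__main__":
    print(check_llog_reader_output(SAMPLE))
```

For the fixed llog shown above, the listed indices are 62, 63 and 64, which agrees with the declared count of 3. A mismatch between the two would suggest the kind of damage described in this comment.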