Details
- Bug
- Resolution: Fixed
- Minor
- Lustre 2.4.0
- 3
- 8157
Description
When starting an MDT on a SPARC MDS, this assertion failure occurred:
Lustre: lustre-MDT0000: used disk, loading
Lustre: lustre-OST0000-osc-MDT0000: Init llog for 0 - catid 0x2:0:0
LustreError: 11309:0:(osp_sync.c:963:osp_sync_llog_init()) ASSERTION( lgh != ((void *)0) ) failed:
LustreError: 11309:0:(osp_sync.c:963:osp_sync_llog_init()) LBUG
Pid: 11309, comm: llog_process_th
Call Trace:
Kernel panic - not syncing: LBUG
Call Trace:
[0000000010181194] lbug_with_loc+0x94/0xc0 [libcfs]
[0000000010dc28c8] osp_sync_llog_init+0xa28/0xc00 [osp]
[0000000010dc6d78] osp_sync_init+0x1f8/0xbe0 [osp]
[0000000010daf51c] osp_device_alloc+0x4d7c/0x5c40 [osp]
[000000001033a500] class_setup+0x6e0/0xf00 [obdclass]
[000000001033da58] class_process_config+0x1738/0x5180 [obdclass]
[...]
According to the "catid" printed, I guess the FID of the log must be [1:2:0]. The problem is in the definition of struct ost_id:
struct ost_id {
        union {
                struct ostid {
                        __u64 oi_id;
                        __u64 oi_seq;
                } oi;
                struct lu_fid oi_fid;
        };
};
When fid_to_logid() assigns a 64-bit sequence number to oi_seq, which 32 bits land in f_oid and which in f_ver depends on the endianness of the MDS. On the big-endian SPARC MDS, FID_SEQ_LLOG ends up in f_ver, so ostid_id() returns 0 even though the log ID as a whole is nonzero. Together, these caused osp_sync_llog_init() to neither open nor re-create the log.
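The overlay described above can be reproduced outside Lustre with a minimal sketch. The types lu_fid_sketch and ost_id_sketch are simplified stand-ins, not the real Lustre definitions, and they assume struct lu_fid is laid out as a 64-bit f_seq followed by 32-bit f_oid and f_ver. The helpers host_is_big_endian(), overlay_on_host(), and overlay_model() are hypothetical names introduced only for this illustration. The value 1 stands for FID_SEQ_LLOG, as the FID [1:2:0] in the description suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct lu_fid (assumed field layout). */
struct lu_fid_sketch {
        uint64_t f_seq;
        uint32_t f_oid;
        uint32_t f_ver;
};

/* Simplified stand-in for struct ost_id: oi_seq overlays f_oid + f_ver. */
union ost_id_sketch {
        struct {
                uint64_t oi_id;
                uint64_t oi_seq;
        } oi;
        struct lu_fid_sketch oi_fid;
};

/* Returns 1 on a big-endian host (e.g. SPARC), 0 on a little-endian one. */
static int host_is_big_endian(void)
{
        const uint16_t probe = 1;

        return *(const unsigned char *)&probe == 0;
}

/* Store a 64-bit sequence in oi_seq and read it back through the FID view,
 * using the byte order of whatever machine runs this code. */
static void overlay_on_host(uint64_t seq, uint32_t *oid, uint32_t *ver)
{
        union ost_id_sketch id = { .oi = { .oi_id = 0, .oi_seq = 0 } };

        assert(oid && ver);
        id.oi.oi_seq = seq;
        *oid = id.oi_fid.f_oid;
        *ver = id.oi_fid.f_ver;
}

/* Host-independent model of the same overlay: f_oid is the 32-bit word at
 * the lower address, which holds the high half of oi_seq on big-endian
 * machines and the low half on little-endian ones. */
static void overlay_model(uint64_t seq, int big_endian,
                          uint32_t *oid, uint32_t *ver)
{
        uint32_t lo = (uint32_t)(seq & 0xffffffffu);
        uint32_t hi = (uint32_t)(seq >> 32);

        assert(oid && ver);
        *oid = big_endian ? hi : lo;
        *ver = big_endian ? lo : hi;
}
```

With a sequence of 1, overlay_model() puts the value in f_oid for the little-endian case and in f_ver for the big-endian case, which matches the behaviour reported on the SPARC MDS. overlay_on_host() should agree with the model for whichever byte order the machine actually uses.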
Issue Links
- is related to: LU-3302 ll_fill_super() Unable to process log: -2 (Resolved)