Details
- Bug
- Resolution: Cannot Reproduce
- Minor
- None
- Lustre 2.4.0
- 3
- 8082
Description
On a SPARC machine, "./llmount.sh" hit this assertion failure:
LustreError: 20452:0:(llog_osd.c:630:llog_osd_next_block()) ASSERTION( last_rec->lrh_index == tail->lrt_index ) failed:
LustreError: 20452:0:(llog_osd.c:630:llog_osd_next_block()) LBUG
Pid: 20452, comm: ll_mgs_0002
Call Trace:
Kernel panic - not syncing: LBUG
Call Trace:
[00000000103a3194] lbug_with_loc+0x94/0xc0 [libcfs]
[0000000010535fbc] llog_osd_next_block+0xb5c/0x1000 [obdclass]
[00000000104f39d0] llog_process_thread+0x2b0/0x13a0 [obdclass]
[00000000104f4cdc] llog_process_or_fork+0x21c/0x980 [obdclass]
[000000001090a140] mgs_steal_llog_for_mdt_from_client+0x5e0/0xae0 [mgs]
[000000001090b120] mgs_write_log_mdt+0xae0/0x3a60 [mgs]
[00000000109262f8] mgs_write_log_target+0x798/0x20a0 [mgs]
[00000000108ea624] mgs_handle_target_reg+0xd44/0x17c0 [mgs]
[00000000108edab8] mgs_handle+0xd18/0x22a0 [mgs]
[00000000106f5f60] ptlrpc_server_handle_request+0x980/0x16c0 [ptlrpc]
[00000000106fc130] ptlrpc_main+0xa10/0x1680 [ptlrpc]
[000000000042ad88] kernel_thread+0x30/0x48
[00000000103aea44] cfs_create_thread+0x24/0x60 [libcfs]
Press Stop-A (L1-A) to return to the boot prom
The lines in question in llog_osd_next_block() are:
    tail = (struct llog_rec_tail *)((char *)buf + rc -
                                    sizeof(struct llog_rec_tail));
    /* get the last record in block */
    last_rec = (struct llog_rec_hdr *)((char *)buf + rc -
                                       le32_to_cpu(tail->lrt_len));
    if (LLOG_REC_HDR_NEEDS_SWABBING(last_rec))
            lustre_swab_llog_rec(last_rec);
    LASSERT(last_rec->lrh_index == tail->lrt_index);
The le32_to_cpu() call above assumes that the data is little-endian. That assumption does not hold, because configuration logs (and at least OSP logs as well) are actually written in host endianness, which is big-endian on SPARC Linux.
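To make the failure mode concrete, here is a minimal standalone sketch of what happens when a 32-bit length written in big-endian byte order is decoded as little-endian. The helpers read_le32(), write_be32() and misread_len() are hypothetical names invented for this illustration and are not Lustre functions; read_le32() only mimics what le32_to_cpu() effectively does to an on-disk field.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode 4 raw bytes as a little-endian value (what a little-endian
 * decoder such as le32_to_cpu() effectively assumes about the field). */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Store a value in big-endian byte order, as a big-endian host would
 * when it writes the field in host endianness. */
static void write_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Length seen by a little-endian decoder for a tail that was written
 * big-endian: the value comes back byte-swapped. */
static uint32_t misread_len(uint32_t real_len)
{
    uint8_t raw[4];

    write_be32(raw, real_len);
    return read_le32(raw);
}
```

For a real record length of 64 bytes, misread_len(64) yields 0x40000000, so an expression of the form buf + rc - len would point far outside the chunk rather than at the last record header.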
It is not clear what the endianness rule should be. The comment above the definition of llog_rec_hdr requires little-endianness, while the LLOG_REC_HDR_NEEDS_SWABBING() calls and the log-writing code suggest host endianness (or adaptive endianness). Enforcing little-endianness requires more extensive changes, while host endianness makes it impossible to find the index of the last record in a chunk in O(1) time, since a record header must be read first to determine the endianness.
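As an illustration of the cost of the adaptive approach, the sketch below walks a chunk forward from its first record, checks each header's type field to decide whether the record needs swabbing, and only then trusts its length. Everything here is a simplified stand-in and not the Lustre code: struct demo_rec_hdr, the DEMO_OP_MAGIC and DEMO_OP_MASK values, and the demo_* functions are assumptions made for this example. The point is only that the last index is found in time linear in the number of records rather than in O(1).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed stand-ins for the llog magic and mask; illustrative values. */
#define DEMO_OP_MAGIC 0x10600000u
#define DEMO_OP_MASK  0xfff00000u

/* Simplified record header: four 32-bit fields, no padding. */
struct demo_rec_hdr {
    uint32_t len;
    uint32_t index;
    uint32_t type;
    uint32_t id;
};

static uint32_t demo_swab32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Test helper: write a header in host order, or byte-swapped if foreign. */
static void demo_put_rec(uint8_t *dst, uint32_t len, uint32_t index,
                         uint32_t type, int foreign)
{
    struct demo_rec_hdr h = { len, index, type, 0 };

    if (foreign) {
        h.len = demo_swab32(len);
        h.index = demo_swab32(index);
        h.type = demo_swab32(type);
    }
    memcpy(dst, &h, sizeof(h));
}

/* Walk the chunk record by record; return the index of the last valid
 * record, or -1 if the chunk does not start with a recognizable header. */
static int64_t demo_last_index(const uint8_t *buf, size_t size)
{
    size_t off = 0;
    int64_t last = -1;

    while (off + sizeof(struct demo_rec_hdr) <= size) {
        struct demo_rec_hdr h;

        memcpy(&h, buf + off, sizeof(h));
        if ((h.type & DEMO_OP_MASK) == DEMO_OP_MAGIC) {
            /* host byte order, use the fields as they are */
        } else if ((h.type & demo_swab32(DEMO_OP_MASK)) ==
                   demo_swab32(DEMO_OP_MAGIC)) {
            /* opposite byte order, swab before using the fields */
            h.len = demo_swab32(h.len);
            h.index = demo_swab32(h.index);
        } else {
            break; /* not a record header */
        }
        if (h.len < sizeof(h) || h.len > size - off)
            break; /* implausible length, stop scanning */
        last = (int64_t)h.index;
        off += h.len;
    }
    return last;
}
```

A little-endian-only format would avoid this scan, because the tail at the end of the chunk could be decoded without first inspecting any header.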
Issue Links
- is related to: LU-6968 Update the whole header in llog_cancel_rec() (Resolved)