Details
- Bug
- Resolution: Fixed
- Critical
- Lustre 2.15.0
- None
- 3
Description
MDS can crash in lod_fill_mirrors() if a default PFL layout is set on a directory, and the filesystem has sparse OST indices:
BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
IP: [<ffffffffc17a0e6e>] lod_fill_mirrors+0x17e/0x490 [lod]
Oops: 0000 [#1] SMP
CPU: 8 PID: 16061 Comm: mdt02_001 Kdump: loaded 3.10.0-1160.49.1.el7_lustre.x86_64 #1
Call Trace:
 lod_striped_create+0x3d7/0x690 [lod]
 lod_layout_change+0x3f/0x50 [lod]
 mdd_layout_change+0xaea/0xef0 [mdd]
 mdt_layout_change+0x2df/0x480 [mdt]
 mdt_intent_layout+0x8a0/0xe00 [mdt]
 mdt_intent_policy+0x435/0xd80 [mdt]
 ldlm_lock_enqueue+0x376/0x9b0 [ptlrpc]
 ldlm_handle_enqueue0+0xaa6/0x1630 [ptlrpc]
 tgt_enqueue+0x62/0x210 [ptlrpc]
 tgt_request_handle+0xaee/0x15f0 [ptlrpc]
 ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
 ptlrpc_main+0xb34/0x1470 [ptlrpc]
The sparse OST indices leave the lod_tgt_osts[] array with NULL entries, and those NULL pointers are dereferenced when checking for non-rotational OSTs.
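The kind of guard that avoids this class of crash can be sketched as follows. This is a minimal, self-contained illustration and not the Lustre patch: the types `tgt_desc` and `tgt_statfs`, the flag `STATE_NONROT`, and the function `count_nonrot()` are hypothetical stand-ins for the real lod structures, and the table is modeled as a plain array of pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for OS_STATE_NONROT; the real value is not assumed. */
#define STATE_NONROT 0x1u

/* Simplified, hypothetical models of the per-target descriptor and statfs. */
struct tgt_statfs { unsigned int state; };
struct tgt_desc   { struct tgt_statfs statfs; };

/*
 * Count how many of the referenced targets are non-rotational.
 * Slots in a sparse table may be NULL, and an index may lie past the
 * end of the table, so both cases are skipped instead of dereferenced.
 */
static int count_nonrot(struct tgt_desc *const *table, size_t table_len,
                        const unsigned int *indices, size_t n)
{
    int pref = 0;

    assert(table != NULL && indices != NULL);

    for (size_t j = 0; j < n; j++) {
        unsigned int idx = indices[j];
        const struct tgt_desc *tgt;

        if (idx >= table_len)
            continue;           /* index outside the table */

        tgt = table[idx];
        if (tgt == NULL)
            continue;           /* hole left by a sparse index */

        if (tgt->statfs.state & STATE_NONROT)
            pref++;
    }
    return pref;
}
```

With a six-slot table populated only at indices 0, 2 and 5, a call that references indices 0, 1, 2, 5 and 9 counts just the populated non-rotational entries and ignores the holes.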
(gdb) list *(lod_fill_mirrors+0x17e)
0xfe9e is in lod_fill_mirrors (/usr/src/lustre-exa-52/lustre/lod/lod_lov.c:768).
757     for (i = 0; i < lo->ldo_comp_cnt; i++, lod_comp++) {
758         int stale = !!(lod_comp->llc_flags & LCME_FL_STALE);
759         int preferred = !!(lod_comp->llc_flags & LCME_FL_PREF_WR);
760         int j;
761
762         pref = 0;
763         /* calculate component preference over all used OSTs */
764         for (j = 0; j < lod_comp->llc_stripes_allocated; j++) {
765             int idx = lod_comp->llc_ost_indices[j];
766             struct obd_statfs *osfs = &OST_TGT(lod,idx)->ltd_statfs;
767
768             if (osfs->os_state & OS_STATE_NONROT)
769                 pref++;
770         }
771
772         if (mirror_id_of(lod_comp->llc_id) == mirror_id) {
(gdb) p &((struct lod_tgt_desc *)0)->ltd_statfs + &((struct obd_statfs *)0)->os_state
$1 = 0xe8
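The gdb expression above adds the offset of ltd_statfs within struct lod_tgt_desc to the offset of os_state within struct obd_statfs. The result, 0xe8, equals the faulting address in the oops, which is consistent with OST_TGT(lod, idx) having returned NULL. The same arithmetic can be sketched in standalone C with `offsetof`. The structs `outer` and `inner` below are hypothetical, with made-up padding sizes, and do not reflect the real Lustre layouts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layouts; the padding sizes are invented for illustration. */
struct inner { char pad[0x18]; unsigned int state; };
struct outer { char pad[0x40]; struct inner stats; };

/*
 * Address that would be touched when reading base->stats.state through
 * a NULL base pointer: the offset of the embedded struct plus the
 * offset of the field inside it.
 */
static size_t nested_fault_offset(void)
{
    return offsetof(struct outer, stats) + offsetof(struct inner, state);
}
```

For these invented layouts the sum is 0x40 + 0x18 = 0x58. Comparing such a sum against the address reported in an oops is a quick way to tell which pointer in a chained access was NULL.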
Issue Links
- is related to LU-14996 select preferred mirror using non-rotational status (Resolved)