Lustre / LU-15513

crash in lod_fill_mirrors() with sparse OSTs + PFL

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.16.0
    • Lustre 2.15.0

    Description

      The MDS can crash in lod_fill_mirrors() if a default PFL layout is set on a directory and the filesystem has sparse OST indices:

      BUG: unable to handle kernel NULL pointer dereference at 00000000000000e8
      IP: [<ffffffffc17a0e6e>] lod_fill_mirrors+0x17e/0x490 [lod]
      Oops: 0000 [#1] SMP 
      CPU: 8 PID: 16061 Comm: mdt02_001 Kdump: loaded 3.10.0-1160.49.1.el7_lustre.x86_64 #1
      Call Trace:
       lod_striped_create+0x3d7/0x690 [lod]
       lod_layout_change+0x3f/0x50 [lod]
       mdd_layout_change+0xaea/0xef0 [mdd]
       mdt_layout_change+0x2df/0x480 [mdt]
       mdt_intent_layout+0x8a0/0xe00 [mdt]
       mdt_intent_policy+0x435/0xd80 [mdt]
       ldlm_lock_enqueue+0x376/0x9b0 [ptlrpc]
       ldlm_handle_enqueue0+0xaa6/0x1630 [ptlrpc]
       tgt_enqueue+0x62/0x210 [ptlrpc]
       tgt_request_handle+0xaee/0x15f0 [ptlrpc]
       ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
       ptlrpc_main+0xb34/0x1470 [ptlrpc]
      

      Sparse OST indices leave the lod_tgt_osts[] array sparse, so it contains NULL pointers that are dereferenced when checking for non-rotational OSTs.
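The failure mode can be illustrated in miniature. The sketch below is not the Lustre code or the fix that was landed: the `demo_*` names, the `DEMO_STATE_NONROT` flag, and the struct layouts are hypothetical stand-ins chosen only to show how a target table with holes could be walked without dereferencing an empty slot.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the real Lustre definitions. */
#define DEMO_STATE_NONROT 0x1u

struct demo_statfs {
	uint32_t os_state;
};

struct demo_tgt {
	struct demo_statfs statfs;
};

/*
 * Count the non-rotational targets referenced by a list of stripe indices.
 * A slot that is NULL (no target registered at that index) or an index
 * beyond the table is skipped instead of being dereferenced.
 */
static int demo_count_nonrot(struct demo_tgt *const *tgts, size_t tgt_count,
			     const uint32_t *indices, size_t nr_indices)
{
	int pref = 0;
	size_t j;

	for (j = 0; j < nr_indices; j++) {
		const struct demo_tgt *tgt;

		if (indices[j] >= tgt_count)
			continue;		/* index outside the table */
		tgt = tgts[indices[j]];
		if (tgt == NULL)
			continue;		/* hole in a sparse index space */
		if (tgt->statfs.os_state & DEMO_STATE_NONROT)
			pref++;
	}
	return pref;
}
```

Without the NULL check, the first hole in the table would be dereferenced at a small offset from address zero, which is the pattern the oops above reports.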

      (gdb) list *(lod_fill_mirrors+0x17e)
      0xfe9e is in lod_fill_mirrors (/usr/src/lustre-exa-52/lustre/lod/lod_lov.c:768).
      757             for (i = 0; i < lo->ldo_comp_cnt; i++, lod_comp++) {
      758                     int stale = !!(lod_comp->llc_flags & LCME_FL_STALE);
      759                     int preferred = !!(lod_comp->llc_flags & LCME_FL_PREF_WR);
      760                     int j;
      761
      762                     pref = 0;
      763                     /* calculate component preference over all used OSTs */
      764                     for (j = 0; j < lod_comp->llc_stripes_allocated; j++) {
      765                             int idx = lod_comp->llc_ost_indices[j];
      766                             struct obd_statfs *osfs = &OST_TGT(lod,idx)->ltd_statfs;
      767
      768                             if (osfs->os_state & OS_STATE_NONROT)
      769                                     pref++;
      770                     }
      771
      772                     if (mirror_id_of(lod_comp->llc_id) == mirror_id) {
      (gdb) p &((struct lod_tgt_desc *)0)->ltd_statfs + &((struct obd_statfs *)0)->os_state
      $1 = 0xe8
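The gdb expression above ties the faulting address (0xe8) to the sum of two member offsets, which is what a load through a NULL base pointer would produce. A minimal sketch of the same arithmetic follows, assuming hypothetical `sketch_*` structs whose layout is invented for illustration and does not match the real `lod_tgt_desc` or `obd_statfs`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for illustration only; not the Lustre structures. */
struct sketch_statfs {
	uint64_t os_blocks;
	uint32_t os_state;
};

struct sketch_tgt {
	char pad[40];
	struct sketch_statfs statfs;
};

/*
 * Address that reading tgt->statfs.os_state would touch if the target
 * descriptor lived at 'base'. With base == 0 (a NULL descriptor) the
 * result is just the sum of the two member offsets.
 */
static uintptr_t sketch_fault_addr(uintptr_t base)
{
	return base + offsetof(struct sketch_tgt, statfs) +
	       offsetof(struct sketch_statfs, os_state);
}
```

Calling `sketch_fault_addr(0)` gives the combined offset for these made-up structs; in the report, the equivalent sum for the real structures is the 0xe8 seen in the oops.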
      

            People

              bobijam Zhenyu Xu
              adilger Andreas Dilger