Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-5299

osd_start() LBUG when doing parallel mount of the same target

    XMLWordPrintable

Details

    • 3
    • 14785

    Description

      One of our customers hits a crash when mounting OST targets. Here is the system log and backtrace.

      LustreError: 5048:0:(obd_class.h:1008:obd_connect()) Device 35 not setup
      LustreError: 5048:0:(obd_config.c:619:class_cleanup()) Device 35 not setup
      LustreError: 5048:0:(obd_mount_server.c:1635:osd_start()) ASSERTION( obd->obd_lu_dev ) failed:
      LustreError: 5048:0:(obd_mount_server.c:1635:osd_start()) LBUG
      
      crash> bt
      PID: 5048 TASK: ffff88085be42b00 CPU: 7 COMMAND: "mount.lustre"
       #0 [ffff8807c8189ac0] machine_kexec at ffffffff81035f1b
       #1 [ffff8807c8189b20] crash_kexec at ffffffff810c0f12
       #2 [ffff8807c8189bf0] panic at ffffffff8150e884
       #3 [ffff8807c8189c70] lbug_with_loc at ffffffffa047deeb [libcfs]
       #4 [ffff8807c8189c90] server_fill_super at ffffffffa0717673 [obdclass]
       #5 [ffff8807c8189d70] lustre_fill_super at ffffffffa06e76f8 [obdclass]
       #6 [ffff8807c8189da0] get_sb_nodev at ffffffff81184bcf
       #7 [ffff8807c8189de0] lustre_get_sb at ffffffffa06df175 [obdclass]
       #8 [ffff8807c8189e00] vfs_kern_mount at ffffffff8118420b
       #9 [ffff8807c8189e50] do_kern_mount at ffffffff811843b2
      #10 [ffff8807c8189ea0] do_mount at ffffffff811a4572
      #11 [ffff8807c8189f20] sys_mount at ffffffff811a4c00
      #12 [ffff8807c8189f80] system_call_fastpath at ffffffff8100b072
          RIP: 00000030ce4e909a RSP: 00007fff59ccbb70 RFLAGS: 00010202
          RAX: 00000000000000a5 RBX: ffffffff8100b072 RCX: 00002b58cacf1006
          RDX: 0000000000408bbf RSI: 00007fff59cd4f28 RDI: 00000000012a3170
          RBP: 0000000000000000 R8: 00000000012a3190 R9: 0000000000000000
          R10: 0000000001000000 R11: 0000000000000206 R12: 000000000060db18
          R13: 000000000060db10 R14: 00000000012a3190 R15: 0000000000000000
          ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b
      

      Lustre release is lustre-2.4.2.
      The issue occured on several OSSs, after a first centralized mount of OSTs was interrupted (shine start ... Ctr^C) and relaunched.

      I have been able to reproduce the issue on lustre-2.4.2, by launching mount of the same OST target in parallel.

      # mntpoint=/mnt/fs1/ost/4
      # mntdev=/dev/mapper/mpathe
      # umount $mntpoint
      # for i in $(seq 1 10); do dd if=$mntdev of=/dev/null bs=4k count=1M & done ;
      # /bin/mount -t lustre -o errors=panic $mntdev $mntpoint & \
      /bin/mount -t lustre -o errors=panic $mntdev $mntpoint
      

      Launching read IOs on the device in backgroung may help reproduce the issue, since ldiskfs_get_sb() calls set_blocksize() (via get_sb_bdev() and sb_set_blocksize()), which serializes with IOs on the device.

         ...
          /* Prevent starting I/O or mapping the device */
          percpu_down_write(&bdev->bd_block_size_semaphore);
      

      This probably opens a larger window for a potential race between the two mount operations.

      This ticket is certainly related to LU-2134, which is currently unresolved.

      I have reproduced the issue on Lustre 2.5.60 with a sligthly different assertion.

      <3>[264188.026108] LustreError: 22665:0:(obd_class.h:954:obd_connect()) Device 1 not setup
      <3>[264188.033999] LustreError: 22665:0:(obd_config.c:615:class_cleanup()) Device 1 not setup
      <3>[264188.042166] LustreError: 22665:0:(obd_mount_server.c:1735:server_fill_super()) Unable to start osd on /dev/ram0: -19
      <3>[264188.052925] LustreError: 22665:0:(obd_mount.c:1335:lustre_fill_super()) Unable to mount  (-19)
      <6>[264189.997203] LDISKFS-fs (ram0): mounted filesystem with ordered data mode. quota=on. Opts: 
      <0>[264190.065537] LustreError: 22664:0:(obd_mount_server.c:1690:osd_start()) ASSERTION( obd ) failed: 
      <0>[264190.074554] LustreError: 22664:0:(obd_mount_server.c:1690:osd_start()) LBUG
      <4>[264190.081682] Pid: 22664, comm: mount.lustre
      <4>[264190.085933] 
      <4>[264190.085935] Call Trace:
      <4>[264190.090213]  [<ffffffffa04fd895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
      <4>[264190.097358]  [<ffffffffa04fde97>] lbug_with_loc+0x47/0xb0 [libcfs]
      <4>[264190.103765]  [<ffffffffa066e522>] server_fill_super+0x1082/0x1620 [obdclass]
      <4>[264190.111014]  [<ffffffffa0605a6e>] ? obd_zombie_is_idle+0x1e/0x70 [obdclass]
      <4>[264190.118183]  [<ffffffffa0639a68>] lustre_fill_super+0x1d8/0x530 [obdclass]
      <4>[264190.125268]  [<ffffffffa0639890>] ? lustre_fill_super+0x0/0x530 [obdclass]
      <4>[264190.132326]  [<ffffffff8118c7bf>] get_sb_nodev+0x5f/0xa0
      <4>[264190.137847]  [<ffffffffa06317b5>] lustre_get_sb+0x25/0x30 [obdclass]
      <4>[264190.144370]  [<ffffffff8118be1b>] vfs_kern_mount+0x7b/0x1b0
      <4>[264190.150108]  [<ffffffff8118bfc2>] do_kern_mount+0x52/0x130
      <4>[264190.155751]  [<ffffffff811acf9b>] do_mount+0x2fb/0x930
      <4>[264190.162326]  [<ffffffff811ad660>] sys_mount+0x90/0xe0
      <4>[264190.167536]  [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      <4>[264190.173699] 
      

      In both cases, there is a lack of synchronisation in osd_start() when osd is retrieved from class_name2obd(). Obd device may still be in initialization phase while second mount.lustre process reaches this portion of code.

      static int osd_start(struct lustre_sb_info *lsi, unsigned long mflags)
      {
              struct lustre_mount_data *lmd = lsi->lsi_lmd;
              struct obd_device        *obd;
              struct dt_device_param    p;
              char                      flagstr[16];
              int                       rc;
              ENTRY;
      
              CDEBUG(D_MOUNT,
                     "Attempting to start %s, type=%s, lsifl=%x, mountfl=%lx\n",
                     lsi->lsi_svname, lsi->lsi_osd_type, lsi->lsi_flags, mflags);
      
              sprintf(lsi->lsi_osd_obdname, "%s-osd", lsi->lsi_svname);
              strcpy(lsi->lsi_osd_uuid, lsi->lsi_osd_obdname);
              strcat(lsi->lsi_osd_uuid, "_UUID");
              sprintf(flagstr, "%lu:%lu", mflags, (unsigned long) lmd->lmd_flags);
      
              obd = class_name2obd(lsi->lsi_osd_obdname);
              if (obd == NULL) {
                      rc = lustre_start_simple(lsi->lsi_osd_obdname,
                                               lsi->lsi_osd_type,
                                               lsi->lsi_osd_uuid, lmd->lmd_dev,
                                               flagstr, lsi->lsi_lmd->lmd_opts,
                                               lsi->lsi_svname);
                      if (rc)
                              GOTO(out, rc);
                      obd = class_name2obd(lsi->lsi_osd_obdname);
                      LASSERT(obd);
              }
      
              rc = obd_connect(NULL, &lsi->lsi_osd_exp,
                               obd, &obd->obd_uuid, NULL, NULL);
              if (rc) {
                      obd->obd_force = 1;
                      class_manual_cleanup(obd);
                      lsi->lsi_dt_dev = NULL;
              }
      
              /* XXX: to keep support old components relying on lsi_srv_mnt
               *      we get this info from OSD just started */
              LASSERT(obd->obd_lu_dev);
              lsi->lsi_dt_dev = lu2dt_dev(obd->obd_lu_dev);
              LASSERT(lsi->lsi_dt_dev);
      
              dt_conf_get(NULL, lsi->lsi_dt_dev, &p);
      
              lsi->lsi_srv_mnt = p.ddp_mnt;
      
      out:
              RETURN(rc);
      }
      

      Additional data from the crash dumps is available if needed.

      Attachments

        Issue Links

          Activity

            People

              bfaccini Bruno Faccini (Inactive)
              pichong Gregoire Pichon
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: