Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-6155

osd_count_not_mapped() calls dbuf_hold_impl() without the lock

    XMLWordPrintable

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.8.0
    • None
    • 3
    • 17211

    Description

      Observed when testing with zfs-0.6.3-179-g3c832b8 with ZFS debugging enabled. The osd_count_not_mapped() function calls dbuf_hold_impl() without first taking the dn->dn_struct_rwlock lock.

      Taking the lock around dbuf_hold_impl() would solve the issue. But actually it would be far better if Lustre didn't call dbuf_hold_impl() at all. As the name implies this is an internal function and really never should have been exported in the first place.

      But leaving that issue aside taking a hold on a dbuf has the potential to be very expensive. If the dbuf isn't in the dbuf hash then we're going to be forced to read it from disk to populate the hash. This will be 100% wasted work in the case where the write is dirtying the entire dbuf. In effect it's converting what would just be a write in to a read-modify-write.

      Since this code is only being using to determine if a particular block is sparse for quota checking. I think it would preferable to make a worst case assumption and assume that it's always mapped. The exact accounting can't be strictly settled anyway until the txg is synced.

      INFO: task lctl:26256 blocked for more than 120 seconds.
            Tainted: P           ---------------
      2.6.32-431.29.2.el6_lustre.gffd1fc2.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      lctl          D 000000000000000a     0 26256  26251 0x00000080
       ffff88170777b5b8 0000000000000086 0000000000000000 ffff88170777b590
       ffffffff817d6e5f 0000000000000000 ffff88170777b558 ffffffff81010eb5
       ffff881ff4aba638 ffff88170777bfd8 000000000000fbc8 ffff881ff4aba638
      Call Trace:
       [<ffffffffa019d7ed>] spl_panic+0xdd/0xe0 [spl]
       [<ffffffffa0269800>] __dbuf_hold_impl+0x610/0x9e0 [zfs]
       [<ffffffffa0269c4d>] dbuf_hold_impl+0x7d/0xb0 [zfs]
       [<ffffffffa0f344ad>] osd_count_not_mapped+0x19d/0x370 [osd_zfs]
       [<ffffffffa0f3622f>] osd_declare_write_commit+0x59f/0x710 [osd_zfs]
       [<ffffffffa106f7eb>] ofd_commitrw_write+0x35b/0x1060 [ofd]
       [<ffffffffa1070ab3>] ofd_commitrw+0x5c3/0xae0 [ofd]
       [<ffffffffa1128a83>] echo_client_brw_ioctl+0xd03/0x1470 [obdecho]
       [<ffffffffa112b1db>] echo_client_iocontrol+0x69b/0x2aa0 [obdecho]
       [<ffffffffa08037fc>] class_handle_ioctl+0x15fc/0x2180 [obdclass]
       [<ffffffffa07ea2ab>] obd_class_ioctl+0x4b/0x190 [obdclass]
       [<ffffffff8119e972>] vfs_ioctl+0x22/0xa0
       [<ffffffff8119eb14>] do_vfs_ioctl+0x84/0x580
       [<ffffffff8119f091>] sys_ioctl+0x81/0xa0
       [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      

      Attachments

        Issue Links

          Activity

            People

              utopiabound Nathaniel Clark
              behlendorf Brian Behlendorf
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: