Test hang sanity test_132, test_133: umount ost

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.8.0
    • Affects Version/s: Lustre 2.6.0, Lustre 2.7.0, Lustre 2.5.3
    • 3
    • 14622

    Description

      This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

      This issue relates to the following test suite run:

      http://maloo.whamcloud.com/test_sets/e5783778-f887-11e3-b13a-52540035b04c.

      The sub-test test_132 failed with the following error:

      test failed to respond and timed out

      Info required for matching: sanity 132

          Activity

            pjones Peter Jones added a comment - Landed for 2.8

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13630/
            Subject: LU-5242 osd-zfs: umount hang in sanity 133g
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 9b704e4088d867851cdb011f0a2560b1e622555c

            gerrit Gerrit Updater added a comment -

            Nathaniel Clark (nathaniel.l.clark@intel.com) uploaded a new patch: http://review.whamcloud.com/13805
            Subject: LU-5242 osd-zfs: umount hang in sanity 133g
            Project: fs/lustre-release
            Branch: b2_5
            Current Patch Set: 1
            Commit: 817cf8a2e781d546508929a9f58b44561ae3361c

            bogl Bob Glossman (Inactive) added a comment - another seen on b2_5 with zfs: https://testing.hpdd.intel.com/test_sessions/435c3152-b816-11e4-9ecb-5254006e85c2

            adilger Andreas Dilger added a comment - I would agree with Alex on this. Deferring the unlink of small files will probably double or triple the total IO that the MDT is doing, because in addition to the actual dnode deletion it also needs to insert the dnode into the deathrow ZAP in one TXG and then delete it from the same ZAP in a different TXG. If a large number of objects are deleted at once (easily possible on the MDT), the deathrow ZAP may get quite large (and never shrink), and updates would become less efficient than if it were kept small.
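
            The IO argument above can be illustrated with a rough counting model. This is only a sketch, not osd-zfs code: the class and method names are hypothetical, and it assumes each dnode free, ZAP insert, and ZAP delete costs exactly one metadata update, which real ZFS does not guarantee.

```python
from dataclasses import dataclass, field


@dataclass
class UnlinkModel:
    """Toy counter of metadata updates (hypothetical, not the osd-zfs code)."""
    updates: int = 0
    deathrow: set = field(default_factory=set)  # stands in for the deathrow ZAP
    peak_deathrow: int = 0

    def unlink_direct(self, dnode: int) -> None:
        # Immediate unlink: only the dnode deletion itself.
        self.updates += 1

    def unlink_deferred(self, dnode: int) -> None:
        # First TXG: record the dnode in the deathrow ZAP.
        self.deathrow.add(dnode)
        self.updates += 1
        self.peak_deathrow = max(self.peak_deathrow, len(self.deathrow))

    def reap(self) -> None:
        # Later TXG: delete each queued dnode and remove its ZAP entry.
        while self.deathrow:
            self.deathrow.pop()
            self.updates += 2


def compare(n: int) -> tuple:
    """Return (direct updates, deferred updates, peak deathrow size)."""
    direct = UnlinkModel()
    for dnode in range(n):
        direct.unlink_direct(dnode)

    deferred = UnlinkModel()
    for dnode in range(n):
        deferred.unlink_deferred(dnode)
    deferred.reap()

    return direct.updates, deferred.updates, deferred.peak_deathrow


if __name__ == "__main__":
    print(compare(1000))
```

            Under these assumptions the deferred path performs three updates per object instead of one, and the deathrow set grows to the full batch size when many objects are unlinked before the reap runs.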

            isaac Isaac Huang (Inactive) added a comment - Thanks all. I'll first work on a patch without the small object optimization to get this bug fixed, then benchmark to decide whether to optimize the small object path.

            People

              Assignee: isaac Isaac Huang (Inactive)
              Reporter: maloo Maloo