    Description

      This issue was created by maloo for S Buisson <sbuisson@ddn.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/a8d333cd-1b69-4f00-9829-2590702b0c0e

      test_413a failed with the following error:

      Timeout occurred after 492 mins, last suite running was sanity
      

      The test is blocked for an unknown reason, as nothing is visible in the console of the client or server nodes. Last message in test log is:

      Mkdir (stripe_count 3) roundrobin:
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_413a - Timeout occurred after 492 mins, last suite running was sanity

          Activity

            [LU-14824] sanity test_413a: timeout
            nangelinas Nikitas Angelinas added a comment - +1 on master: https://testing.whamcloud.com/test_sets/46d8f39d-da3a-4039-bd02-d7c90196d7f9

            adilger Andreas Dilger added a comment -

            It is worthwhile to note that patch https://review.whamcloud.com/46734 "LU-15528 mdt: enqueue newly created object locks in TXN mode" and the later patch https://review.whamcloud.com/46733 "LU-15526 mdt: enable remote PDO lock" are about 10x faster (~150-170s vs. ~1100-3000s) when running sanity test_413a compared to unpatched systems:

            https://testing.whamcloud.com/search?server_file_system_type_id=00437f32-318d-11e1-9c6d-5254004bbbd3&test_set_script_id=f9516376-32bc-11e0-aaee-52540025f9ae&sub_test_script_id=44d5fa14-70d0-11e9-a6f2-52540065bddc&start_date=2022-03-07&end_date=2022-03-09&source=sub_tests#redirect
            gerrit Gerrit Updater added a comment - - edited

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/46774
            Subject: LU-14824 tests: reduce sanity test_413 ZFS test time
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 502cc9bb0ee94537a77e56f3888c25f11f8790a0

            adilger Andreas Dilger added a comment - - edited

            +1 on master: https://testing.whamcloud.com/test_sets/3720ae52-c898-40bd-9bb0-f41ea075568c

            Currently failing about 2.5% of runs, but 7.5% of ZFS runs.

            gerrit Gerrit Updater added a comment -

            "Lai Siyao <lai.siyao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45955
            Subject: LU-14824 test: collect debug logs on zfs system
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 1d583433b1a4ad23a99ecb85fe4dc6858edaef20

            adilger Andreas Dilger added a comment -

            I would prefer not to disable it if possible. I think one option to speed up the test for ZFS is to use larger DoM files, since this also reduces free inodes on a ZFS MDT, and should give the same behavior as creating a large number of inodes, unlike on ldiskfs.
            laisiyao Lai Siyao added a comment - - edited

            Andreas, all the failures are on ZFS systems, and I don't see anything special in the test logs. This test creates lots of files/directories and unlinks them after the test; it is stuck in the unlink phase. Should we just disable this test on ZFS systems?

            spitzcor Cory Spitz added a comment -

            Proposing for 2.15.0 given the recent activity with master.

            hornc Chris Horn added a comment - +1 on master https://testing.whamcloud.com/test_sets/712a4fd4-1460-412c-a436-f648b3e0fc3d
            scherementsev Sergey Cheremencev added a comment - +1 on master: https://testing.whamcloud.com/test_sets/014222ff-aefc-4677-8545-f8b4bf0975c2
            laisiyao Lai Siyao added a comment -

            This looks to be on zfs backend only, and the possible reason is slow striped directory mkdir. I'll look into the test scripts to see how to improve this.


            People

              laisiyao Lai Siyao
              maloo Maloo