
tasks hang with copy_file_range: ll_file_splice_read()

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major

    Description

      copy_file_range(2) hangs when reading from a Lustre file.

      Lustre does not implement the .copy_file_range file operation, so the VFS falls back to do_splice_direct()->splice_direct_to_actor()->do_splice_to()->ll_file_splice_read().
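
      For context, here is a minimal user-space sketch of the kind of call that reaches this path. It assumes glibc 2.27 or later for the copy_file_range() wrapper. The helper name copy_whole_file() is hypothetical and is not taken from the Lustre tree or its test suite.

```c
#define _GNU_SOURCE
#include <fcntl.h>      /* open() flags for callers of the helper */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy all of fd_in to fd_out with copy_file_range(2).
 * Returns the number of bytes copied, or -1 with errno set by the failing call.
 */
static ssize_t copy_whole_file(int fd_in, int fd_out)
{
	struct stat st;
	ssize_t total = 0;

	if (fstat(fd_in, &st) < 0)
		return -1;

	while (total < st.st_size) {
		/* NULL offsets: use and advance each descriptor's own file position. */
		ssize_t n = copy_file_range(fd_in, NULL, fd_out, NULL,
					    (size_t)(st.st_size - total), 0);
		if (n < 0)
			return -1;
		if (n == 0)	/* unexpected end of file, e.g. the source was truncated */
			break;
		total += n;
	}
	return total;
}
```

      Going by the description in this ticket, calling such a helper with a source descriptor on an affected Lustre client would be expected to block inside copy_file_range() rather than return an error.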

      Inside Lustre, the call chain continues as ll_file_splice_read()->ll_file_io_generic()->generic_file_splice_read()->ll_file_read_iter()->ll_file_io_generic().

      ll_file_io_generic() is thus entered twice by the same task and tries to take the LDLM lock a second time while still holding it, so the task hangs.
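
      The re-entrance described above can be illustrated with a toy model. This is only a sketch: every toy_* name is hypothetical, the lock is a plain flag and not the real LDLM API, and it assumes the lock is not re-entrant for the same task. Where the real code would block forever, the model returns -1.

```c
#include <stdbool.h>

/* Toy stand-in for a non-recursive lock; NOT the real LDLM interface. */
struct toy_lock {
	bool held;
};

/* Returns true if acquired, false where a real caller would block forever. */
static bool toy_lock_try(struct toy_lock *lk)
{
	if (lk->held)
		return false;
	lk->held = true;
	return true;
}

static void toy_lock_release(struct toy_lock *lk)
{
	lk->held = false;
}

static int toy_io_generic(struct toy_lock *lk, int depth);

/* Models generic_file_splice_read() -> ll_file_read_iter(). */
static int toy_read_iter(struct toy_lock *lk, int depth)
{
	return toy_io_generic(lk, depth + 1);
}

/*
 * Models ll_file_io_generic(): take the lock, then perform the I/O.
 * At depth 0 the I/O is a splice read, which re-enters via toy_read_iter().
 * Returns 0 on success, -1 where the real code would hang.
 */
static int toy_io_generic(struct toy_lock *lk, int depth)
{
	int rc = 0;

	if (!toy_lock_try(lk))
		return -1;	/* second acquisition by the same task */
	if (depth == 0)
		rc = toy_read_iter(lk, depth);
	toy_lock_release(lk);
	return rc;
}
```

      In this model, toy_io_generic(&lk, 0) stands for the splice read entry point and reports the self-deadlock, while toy_io_generic(&lk, 1) stands for a plain read that takes the lock only once and succeeds.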

          Activity

            [LU-13745] tasks hang with copy_file_range: ll_file_splice_read()
            pjones Peter Jones added a comment -

            The fix itself has landed for 2.14. All that remains tracked by this ticket is a test. Are there still plans to land that test imminently, or can we either abandon that changeset or move it to a new JIRA ticket?

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/40396/
            Subject: LU-13745 pcc: fall back normal splice read for detached file
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: cca45ad8aeaa8e124e9e48361bf7cff89a035f82

            gerrit Gerrit Updater added a comment -

            Yingjin Qian (qian@ddn.com) uploaded a new patch: https://review.whamcloud.com/40396
            Subject: LU-13745 pcc: fall back normal splice read for detached file
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 7b45c8d9c77a9c45862bd61ea05b8c46117cffa4

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/40366/
            Subject: LU-13745 tests: skip sanity test_426 for 4.15+
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: f8a8d3f83db67be9dcc724ff49757cce81b13a5e

            gerrit Gerrit Updater added a comment -

            John L. Hammond (jhammond@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/40366
            Subject: LU-13745 tests: skip sanity test_426 for 4.15+
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: a8a6a55a1e4e11ee3b61aa7e230d752b5c1a476a

            bruno Bruno Faccini (Inactive) added a comment -

            adilger, after rebasing my patch on top of change #40326, the same crash in sanity/test_426 still occurs with the Ubuntu client, because its kernel version is 4.15 and the sub-test is therefore not skipped.

            jhammond John Hammond added a comment -

            This is also failing on Ubuntu 18.04, which is using 4.15.0-72-generic. See https://testing.whamcloud.com/test_sets/bdfb8c4b-a6a2-493a-ab57-6a9923f96e7c.

            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/40326/
            Subject: LU-13745 tests: skip sanity test_426 for 4.18+
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 010425898fa4b2abc6325a8073e20cb994ce7947

            bruno Bruno Faccini (Inactive) added a comment - edited

            adilger, this may be of interest: my patch https://review.whamcloud.com/35856 has also failed 100% of the "test review-ldiskfs-ubuntu on CentOS 7.8/x86_64, Ubuntu 18.04/x86_64" stages I have attempted, with the same client-side sanity/test_426 crash for which you provided the significant stack in LU-14045. So this looks more like a kernel v4.x issue than an architecture-related one.

            wshilong Wang Shilong (Inactive) added a comment -

            adilger, that means the newly added test is helpful, and this is something we still need to fix.

            People

              bobijam Zhenyu Xu
              bobijam Zhenyu Xu
              Votes: 0
              Watchers: 11
