check truncated page in ->readpage()

Details

    Description

      I found that the end page calculation in filemap_get_read_batch() is off by one in 5.x kernels.

      When a read is submitted with end offset 1048575, the end page for the read is incorrectly calculated as 1024 when it should be 1023. As a result, readpage() is called on a page that lies beyond the stripe boundary and may not be covered by a DLM extent lock.
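
      The off-by-one can be illustrated with plain integer arithmetic. The sketch below is user-space C, not kernel code, and all function names in it are hypothetical. It assumes the bug amounts to an exclusive end index being used where an inclusive one is expected. The 1024-byte unit is chosen only because it reproduces the 1023 and 1024 figures quoted above; with 4096-byte pages the same read gives 255 and 256.

```c
#include <assert.h>

/* Index of the unit holding the last requested byte (inclusive bound).
 * Assumes count > 0. */
unsigned long last_page_inclusive(unsigned long long pos,
                                  unsigned long long count,
                                  unsigned long long unit)
{
    return (unsigned long)((pos + count - 1) / unit);
}

/* One past the last requested unit (exclusive bound), rounding up. */
unsigned long end_page_exclusive(unsigned long long pos,
                                 unsigned long long count,
                                 unsigned long long unit)
{
    return (unsigned long)((pos + count + unit - 1) / unit);
}

/* Number of pages visited when 'end' is treated as an inclusive index. */
unsigned long pages_if_inclusive(unsigned long first, unsigned long end)
{
    return end - first + 1;
}

/* Worked example: a read of bytes 0..1048575 (count = 1048576). */
void check_examples(void)
{
    assert(last_page_inclusive(0, 1048576, 1024) == 1023);
    assert(end_page_exclusive(0, 1048576, 1024) == 1024);
    assert(last_page_inclusive(0, 1048576, 4096) == 255);
    assert(end_page_exclusive(0, 1048576, 4096) == 256);

    /* Passing the exclusive bound where an inclusive one is expected
     * pulls in one page beyond the request. */
    assert(pages_if_inclusive(0, 1023) == 1024);
    assert(pages_if_inclusive(0, 1024) == 1025);
}
```

      The extra page is the one that can fall past the stripe boundary, outside the extent covered by the lock taken for the read.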

      In a rare race, filemap_get_read_batch() batches the page with index 1024 for read, but the page is later truncated and removed from the page cache because the lock protecting it is revoked. The page in the read path is then no longer covered by a DLM lock, which triggers an assertion in the code:

      LustreError: 14129:0:(osc_object.c:397:osc_req_attr_set()) uncovered page!
      Pid: 14129, comm: ptlrpcd_04_18 5.14.0-1038-oem #42-Ubuntu SMP Thu May 19 05:03:08 UTC 2022
      LustreError: 14129:0:(osc_object.c:411:osc_req_attr_set()) LBUG
      

      To work around this kernel bug, we can check in ->readpage() whether the page has been truncated and removed from the page cache, and if so return AOP_TRUNCATED_PAGE to the upper layer. The upper layer then retries batching pages, and the truncated page is not added to the new batch because it is no longer in the page cache.
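
      The control flow of the workaround can be modelled in user space. The sketch below is not the Lustre patch. Every type and function in it is a hypothetical stand-in, and MODEL_AOP_TRUNCATED_PAGE is a local placeholder for the kernel's AOP_TRUNCATED_PAGE return code. It assumes that truncation detaches a page from its mapping, so a page whose mapping no longer matches the file's mapping is treated as truncated.

```c
#include <stddef.h>

/* Placeholder for the kernel's AOP_TRUNCATED_PAGE; value is local to this model. */
enum { MODEL_AOP_TRUNCATED_PAGE = 0x80001 };

/* Hypothetical stand-ins for the address space and page structures. */
struct model_mapping { int id; };
struct model_page {
    struct model_mapping *mapping; /* NULL once removed from the page cache */
    unsigned long index;
};

/* Model of truncation: the page is detached from its mapping. */
void model_truncate_page(struct model_page *pg)
{
    pg->mapping = NULL;
}

/* Model of ->readpage(): refuse to read a page that was truncated after
 * it was batched, so the caller can retry the lookup. */
int model_readpage(struct model_mapping *file_mapping, struct model_page *pg)
{
    if (pg->mapping != file_mapping)
        return MODEL_AOP_TRUNCATED_PAGE;

    /* A real implementation would issue the read here. */
    return 0;
}
```

      In this model a caller that receives MODEL_AOP_TRUNCATED_PAGE would repeat its page lookup, and the detached page would not be found again.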

          Activity

            [LU-16412] check truncated page in ->readpage()
            cfaber Colin Faber made changes -
            Link New: This issue is related to DDN-5149 [ DDN-5149 ]
            pjones Peter Jones made changes -
            Link New: This issue is related to DDN-3986 [ DDN-3986 ]
            pjones Peter Jones made changes -
            Labels Original: LTS15
            pjones Peter Jones made changes -
            Link New: This issue is related to NCP-12 [ NCP-12 ]
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.15.3 [ 15998 ]

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50277/
            Subject: LU-16412 llite: check read page past requested
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: c6388ef80ff593936296f394a0154729578ac6eb

            lixi_wc Li Xi made changes -
            Link New: This issue is blocked by EX-6850 [ EX-6850 ]

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50277
            Subject: LU-16412 llite: check read page past requested
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: 3c35c311211e561dd83b665f081e0112044de98a

            pjones Peter Jones made changes -
            Labels New: LTS15
            adilger Andreas Dilger added a comment - - edited

            Patch for the upstream kernel submitted:
            https://lore.kernel.org/linux-fsdevel/20230208022400.28962-1-coolqyj@163.com/

            Accepted into the kernel and backported to the 6.1 and 5.15 stable trees:
            https://lore.kernel.org/stable/20230220133603.227781589@linuxfoundation.org/
            https://lore.kernel.org/stable/20230220133556.188276389@linuxfoundation.org/

            This patch also ends up improving fxmark benchmark performance by 13%, likely due to avoiding extraneous reads of pages not actually requested by the application:
            https://lore.kernel.org/linux-fsdevel/202302171032.69bd3cf7-yujie.liu@intel.com/

            People

              qian_wc Qian Yingjin
              qian_wc Qian Yingjin