
sanity test_69: read succeeded, expect -ENOENT

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.14.0

    Description

      This issue was created by maloo for S Buisson <sbuisson@ddn.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/2be4adce-b792-4432-8f8b-6caba6148d68

      test_69 failed with the following error:

      read succeeded, expect -ENOENT
      

      This error occurs only on aarch64 platforms. It seems the fail_loc value set on the server side has no effect.

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_69 - read succeeded, expect -ENOENT


        Activity

          [LU-13528] sanity test_69: read succeeded, expect -ENOENT
          pjones Peter Jones added a comment -

          Landed for 2.14


          gerrit Gerrit Updater added a comment -

          Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38526/
          Subject: LU-13528 llite: prevent MAX_DIO_SIZE 32-bit truncation
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 8cfd5be8b04bdaaa4a0c794392cc2e6835e103eb

          gerrit Gerrit Updater added a comment -

          Sebastien Buisson (sbuisson@ddn.com) uploaded a new patch: https://review.whamcloud.com/38526
          Subject: LU-13528 llite: prevent MAX_DIO_SIZE 32-bit truncation
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 0e0984dcdd4a1f0b3c0b619793c31131704519be

          sebastien Sebastien Buisson added a comment -

          After further investigation, it appears that a bug in the Lustre code on aarch64 platforms turns the O_DIRECT IO into a regular IO, which bypasses the expected fail_loc behavior in sanity test_69.

          The bug stems from the MAX_DIO_SIZE value being truncated to its lower 32 bits.
          Its definition is:

          #define MAX_DIO_SIZE ((MAX_MALLOC / sizeof(struct brw_page) * PAGE_SIZE) & \
          		      ~(DT_MAX_BRW_SIZE - 1))

          On 4kB PAGE_SIZE platforms, this goes unnoticed because the value fits into 32 bits. But on 64kB PAGE_SIZE platforms, the value can reach 1365GB.

          Patch https://review.whamcloud.com/36144 increased the size of struct brw_page to 32 bytes, which makes the value 1024GB (2^40 bytes) on these platforms. Its lower 32 bits are all zero, so the truncated value is 0. As a consequence, ll_direct_IO_impl() triggers a direct IO of 0 bytes, which in turn makes llite issue a normal, buffered IO.
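The arithmetic in the comment above can be checked with a small standalone C sketch. It is not Lustre code: every name prefixed with ex_ or EX_ is hypothetical, the 512MB allocation limit and the earlier 24-byte size of struct brw_page are assumptions chosen because they reproduce the 1365GB and 1024GB figures quoted above, and the DT_MAX_BRW_SIZE alignment mask is left out for simplicity.

```c
#include <stdint.h>

/* Assumed values for illustration only (not taken from the Lustre tree). */
#define EX_MAX_MALLOC  (512ULL << 20)  /* assumed 512MB allocation limit */
#define EX_PAGE_64K    (64ULL << 10)   /* 64kB PAGE_SIZE */

/*
 * Unmasked part of the MAX_DIO_SIZE expression, evaluated in 64 bits:
 * the number of brw_page entries that fit in one allocation, times the
 * page size.
 */
static uint64_t ex_dio_size(uint64_t max_malloc, uint64_t brw_page_size,
                            uint64_t page_size)
{
	return max_malloc / brw_page_size * page_size;
}

/* Keep only the lower 32 bits, mimicking the truncation described above. */
static uint32_t ex_low32(uint64_t value)
{
	return (uint32_t)value;
}
```

Under these assumptions, ex_dio_size(EX_MAX_MALLOC, 32, EX_PAGE_64K) is exactly 2^40 bytes (1024GB), and ex_low32() of that value is 0, which matches the 0-byte direct IO described above. With a 24-byte entry the 64-bit result is about 1365GB and its lower 32 bits are non-zero, which would explain why the truncation went unnoticed before the structure grew to 32 bytes.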

          People

            sebastien Sebastien Buisson
            maloo Maloo