Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.8.0
    • Lustre 2.8.0
    • 3

    Description

      When running on osd-ldiskfs, if the isize of the file is 4095, a read returns a length of 4096 because EOF is calculated incorrectly.
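The expected behaviour can be stated as simple arithmetic: a read of `len` bytes at `offset` should return at most `isize - offset` bytes. The sketch below only illustrates that clamp. `read_len` is a hypothetical helper, not the osd-ldiskfs code, whose actual calculation is not shown in this ticket.

```shell
#!/bin/sh
# Hypothetical helper (not from the Lustre tree): number of bytes a read
# should return once it is clamped at end of file.
# Usage: read_len ISIZE OFFSET LEN
read_len() {
    isize=$1 offset=$2 len=$3
    if [ "$offset" -ge "$isize" ]; then
        # Read starts at or beyond EOF: nothing to return.
        echo 0
    elif [ $((offset + len)) -gt "$isize" ]; then
        # Read crosses EOF: return only the bytes that exist.
        echo $((isize - offset))
    else
        # Read lies entirely inside the file.
        echo "$len"
    fi
}

# The case from the description: isize 4095, one 4096-byte read at offset 0.
read_len 4095 0 4096
```

For the case in the description this yields 4095, whereas the bug reported here returned the full 4096.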

          Activity

            [LU-7371] Wrong read length over isize

            jgmitter Joseph Gmitter (Inactive) added a comment

            Both patches have landed for 2.8.0.

            gerrit Gerrit Updater added a comment

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/17060/
            Subject: LU-7371 test: wrong read length over isize
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 7023698133970372031a16beac276e5e3e64cfbe

            gerrit Gerrit Updater added a comment

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/17020/
            Subject: LU-7371 osd-ldiskfs: fix wrong read length over isize
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 97c4c162be77d2ee9bad5d800c9b5803f252caa0

            gerrit Gerrit Updater added a comment

            Li Xi (lixi@ddn.com) uploaded a new patch: http://review.whamcloud.com/17060
            Subject: LU-7371 test: wrong read length over isize
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: bc9893da37ae60918b61e2d5fd84b4cb87ad3b82

            adilger Andreas Dilger added a comment - - edited

            I had originally thought dd if=/dev/zero of=$DIR/$tfile bs=4095 count=1 conv=sync would be enough to create the file at 4095 bytes, and that a read with bs=4096 would then trigger the bug. If that doesn't work, another option is to add an OBD_FAIL_OST_* check in the code to reproduce the original symptom.
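The check suggested above can be sketched as a small script: create a file of exactly 4095 bytes, read it back with a 4096-byte block, and count the bytes returned. This is an illustration only. `DIR` and `tfile` are assumed here to default to a temporary directory and an arbitrary file name, and another comment in this ticket notes that a client-side read may not expose the server-side length because the client already knows the file size.

```shell
#!/bin/sh
# Assumptions: DIR is a scratch directory and tfile a test file name;
# both fall back to illustrative defaults when unset.
DIR=${DIR:-$(mktemp -d)}
tfile=${tfile:-isize_4095}
f="$DIR/$tfile"

# Create a file of exactly 4095 bytes.
dd if=/dev/zero of="$f" bs=4095 count=1 2>/dev/null

# Read it back with one 4096-byte block and count the bytes delivered.
# tr strips the padding some wc implementations put around the number.
nread=$(dd if="$f" bs=4096 count=1 2>/dev/null | wc -c | tr -d ' ')

if [ "$nread" -eq 4095 ]; then
    echo "read length OK: $nread"
else
    echo "read length wrong: $nread (expected 4095)"
fi

rm -f "$f"
```

A filesystem that honours the file size should report a read length of 4095 here.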

            lixi Li Xi (Inactive) added a comment

            I tried to write a regression test. However, the Lustre client has a way to determine the file size on the client side, so it seems hard to reproduce the issue from the client side.

            I collected the following messages when running the following commands on Lustre without the patch (the Lustre client and all servers run on the same machine):

            dd if=/dev/zero of=file bs=4095 count=1
            sync
            echo 3 > /proc/sys/vm/drop_caches
            dd if=file of=/dev/null bs=1048576

            [root@server1 lustre]# grep tgt_brw_read /tmp/lustre.log | grep leaving
            00000020:00000001:0.0:1446642073.288612:0:10887:0:(tgt_handler.c:1915:tgt_brw_read()) Process leaving (rc=4096 : 4096 : 1000)
            [root@server1 lustre]# grep 4095 /tmp/lustre.log
            00020000:00000001:0.0:1446642073.282732:0:12771:0:(lov_offset.c:68:lov_stripe_size()) Process leaving (rc=4095 : 4095 : fff)
            00020000:00000001:0.0:1446642073.282733:0:12771:0:(lov_offset.c:68:lov_stripe_size()) Process leaving (rc=4095 : 4095 : fff)
            00000080:00000001:1.0:1446642073.288760:0:12771:0:(file.c:1302:ll_file_aio_read()) Process leaving (rc=4095 : 4095 : fff)
            00000080:00000001:1.0:1446642073.288762:0:12771:0:(file.c:1332:ll_file_read()) Process leaving (rc=4095 : 4095 : fff)
            00020000:00000001:1.0:1446642073.288856:0:12771:0:(lov_offset.c:68:lov_stripe_size()) Process leaving (rc=4095 : 4095 : fff)
            00020000:00000001:1.0:1446642073.288857:0:12771:0:(lov_offset.c:68:lov_stripe_size()) Process leaving (rc=4095 : 4095 : fff)
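The log shows the mismatch directly: the server-side tgt_brw_read() leaves with rc=4096, while the client-side ll_file_read() leaves with rc=4095. To compare such values across a longer log, a small filter along the following lines may help. `leaving_rc` is a hypothetical helper, and it assumes the "(file:line:func()) Process leaving (rc=...)" layout seen in the lines above.

```shell
#!/bin/sh
# Hypothetical helper: print "function rc" for every "Process leaving"
# trace line read from standard input. Lines in any other layout are skipped.
leaving_rc() {
    sed -n 's/.*:\([A-Za-z_0-9]*\)()) Process leaving (rc=\([0-9]*\) .*/\1 \2/p'
}

# Two lines copied from the log above: one server side, one client side.
leaving_rc <<'EOF'
00000020:00000001:0.0:1446642073.288612:0:10887:0:(tgt_handler.c:1915:tgt_brw_read()) Process leaving (rc=4096 : 4096 : 1000)
00000080:00000001:1.0:1446642073.288762:0:12771:0:(file.c:1332:ll_file_read()) Process leaving (rc=4095 : 4095 : fff)
EOF
```

In practice the filter would be fed from the dumped debug log, for example `leaving_rc < /tmp/lustre.log`.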

            jgmitter Joseph Gmitter (Inactive) added a comment

            Hi Alex,
            Can you take a look at this issue?
            Thanks.
            Joe

            gerrit Gerrit Updater added a comment

            Li Xi (lixi@ddn.com) uploaded a new patch: http://review.whamcloud.com/17020
            Subject: LU-7371 osd-ldiskfs: fix wrong read length over isize
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 669fc00eb6ad6a6e9c94e3814326c50c65a7e3af

            People

              bzzz Alex Zhuravlev
              lixi Li Xi (Inactive)