[LU-16579] llite: Fix the wrong ending offset calculation

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.16.0, Lustre 2.15.3
    • Affects Version/s: Lustre 2.16.0
    • Labels: None
    • Severity: 3

    Description

      For a single-stripe file, after applying https://review.whamcloud.com/c/fs/lustre-release/+/49226
      (Subject: LU-16338 readahead: clip readahead with kms),
      reads fall into a dead loop:
      https://testing.whamcloud.com/test_logs/8d73c4e7-0e6e-482c-b93c-bf7159706890/show_text

      00000080:00200000:1.0:1676614736.277040:0:21215:0:(rw.c:1981:ll_readpage()) pgno:260, cnt:1032192, pos:0
      00000080:00000001:1.0:1676614736.277041:0:21215:0:(vvp_io.c:1654:vvp_io_read_ahead()) Process entered
      00000080:00000001:1.0:1676614736.277041:0:21215:0:(vvp_io.c:1666:vvp_io_read_ahead()) Process leaving (rc=0 : 0 : 0)
      00000008:00000001:1.0:1676614736.277042:0:21215:0:(osc_io.c:83:osc_io_read_ahead()) Process entered
      00000008:00000001:1.0:1676614736.277042:0:21215:0:(osc_lock.c:1281:osc_obj_dlmlock_at_pgoff()) Process entered
      00000008:00000001:1.0:1676614736.277043:0:21215:0:(osc_request.c:3137:osc_match_base()) Process entered
      00000008:00000001:1.0:1676614736.277043:0:21215:0:(osc_request.c:3172:osc_match_base()) Process leaving (rc=4 : 4 : 4)
      00000008:00000001:1.0:1676614736.277044:0:21215:0:(osc_lock.c:1315:osc_obj_dlmlock_at_pgoff()) Process leaving (rc=18446620395345229440 : -123678364322176 : ffff8f83e3073680)
      00000008:00000001:1.0:1676614736.277045:0:21215:0:(osc_io.c:112:osc_io_read_ahead()) Process leaving (rc=0 : 0 : 0)
      00000080:00000001:1.0:1676614736.277045:0:21215:0:(rw.c:2012:ll_readpage()) Process leaving (rc=524289 : 524289 : 80001)
      00000080:00000001:1.0:1676614736.277046:0:21215:0:(rw.c:1873:ll_readpage()) Process entered
      00000080:00200000:1.0:1676614736.277046:0:21215:0:(rw.c:1981:ll_readpage()) pgno:260, cnt:1032192, pos:0
      00000080:00000001:1.0:1676614736.277047:0:21215:0:(vvp_io.c:1654:vvp_io_read_ahead()) Process entered
      00000080:00000001:1.0:1676614736.277047:0:21215:0:(vvp_io.c:1666:vvp_io_read_ahead()) Process leaving (rc=0 : 0 : 0)
      00000008:00000001:1.0:1676614736.277048:0:21215:0:(osc_io.c:83:osc_io_read_ahead()) Process entered
      00000008:00000001:1.0:1676614736.277048:0:21215:0:(osc_lock.c:1281:osc_obj_dlmlock_at_pgoff()) Process entered
      00000008:00000001:1.0:1676614736.277049:0:21215:0:(osc_request.c:3137:osc_match_base()) Process entered
      00000008:00000001:1.0:1676614736.277049:0:21215:0:(osc_request.c:3172:osc_match_base()) Process leaving (rc=4 : 4 : 4)
      00000008:00000001:1.0:1676614736.277050:0:21215:0:(osc_lock.c:1315:osc_obj_dlmlock_at_pgoff()) Process leaving (rc=18446620395345229440 : -123678364322176 : ffff8f83e3073680)
      00000008:00000001:1.0:1676614736.277051:0:21215:0:(osc_io.c:112:osc_io_read_ahead()) Process leaving (rc=0 : 0 : 0)
      00000080:00000001:1.0:1676614736.277051:0:21215:0:(rw.c:2012:ll_readpage()) Process leaving (rc=524289 : 524289 : 80001)
      

      The code that causes the dead loop:

      if (cl_offset(clob, vmpage->index) >= iter->count + iocb->ki_pos) {
              result = cl_io_read_ahead(env, io, vmpage->index, &ra);
              if (result < 0 || vmpage->index > ra.cra_end_idx) {
                      cl_read_ahead_release(env, &ra);
                      unlock_page(vmpage);
                      /* AOP_TRUNCATED_PAGE = 0x80001, the rc=80001 seen in the log */
                      RETURN(AOP_TRUNCATED_PAGE);
              }
      }
      
      

      After investigating, I found that the calculation of the offset beyond the end of the read is wrong: each time the read of a page finishes, @iter->count is advanced (iter->count - read_bytes), so iter->count + iocb->ki_pos no longer gives the end of the original read request.
      The wrong ending offset results in the dead loop described above.
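
The effect can be illustrated with a small standalone sketch. This is not Lustre code: `struct read_state`, `beyond_end_shrinking()`, `beyond_end_fixed()`, `consume()` and `SKETCH_PAGE_SIZE` are hypothetical names, and the model assumes that the position stays unchanged while the remaining count shrinks (as in the log above, where pos stays 0). Capturing the end offset once before reading is shown only as one possible way to avoid the problem, not necessarily what the landed patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumed page size for this sketch */

/* Hypothetical stand-in for the (iocb, iter) pair used in the check. */
struct read_state {
	uint64_t pos;   /* like iocb->ki_pos: assumed not to move during the read */
	uint64_t count; /* like iter->count: bytes still to be read */
};

/* Same shape as the check quoted above: the end is recomputed from the
 * current (already shrunk) count every time. */
static bool beyond_end_shrinking(const struct read_state *rs, uint64_t pgno)
{
	return pgno * SKETCH_PAGE_SIZE >= rs->count + rs->pos;
}

/* Alternative: compare against an end offset captured once, before any
 * page of the request has been read. */
static bool beyond_end_fixed(uint64_t end, uint64_t pgno)
{
	return pgno * SKETCH_PAGE_SIZE >= end;
}

/* Model finishing the read of nbytes: count shrinks, pos stays. */
static void consume(struct read_state *rs, uint64_t nbytes)
{
	assert(nbytes <= rs->count);
	rs->count -= nbytes;
}
```

With an 8-page read starting at offset 0, page 5 is inside the request. After 5 pages have been consumed, the shrinking check treats page 5 as beyond the end, while the check against the captured end does not.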

          Activity

            qian_wc Qian Yingjin added a comment - - edited

            The deadlock problem in the previous comment was solved by LU-16579.

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50278/
            Subject: LU-16579 llite: fix the wrong beyond read end calculation
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: 9c8a80bca738884e09affd66837b9e94508664d1

            pjones Peter Jones added a comment -

            Landed for 2.16

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/50065/
            Subject: LU-16579 llite: fix the wrong beyond read end calculation
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: ae356dc325877bd130ad94acc5f3610898de8a8a

            gerrit Gerrit Updater added a comment -

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50278
            Subject: LU-16579 llite: fix the wrong beyond read end calculation
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: 88e538a9c6f7444fde284960d56ed62d4f17cb3a

            qian_wc Qian Yingjin added a comment -

            Hi Patrick,

            None of them have been applied to ES5.2.

            qian_wc Qian Yingjin added a comment -

            I think it is an existing bug exposed by that change.

            gerrit Gerrit Updater added a comment -

            "Qian Yingjin <qian@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/50065
            Subject: LU-16579 llite: fix the wrong beyond read end calculation
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: c4afbee4c795d5305827f3ce533f3d04474a18eb

            adilger Andreas Dilger added a comment -

            Yingjin, since the LU-16338 patch has not landed yet, why not just fix this as part of that patch? Is this issue caused by this patch, or just an existing bug exposed by that change?

            People

              qian_wc Qian Yingjin
              qian_wc Qian Yingjin
              Votes: 0
              Watchers: 7
