
read-ahead more than file size for a small file

Details

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Minor

    Description

      The following test script reads more pages than the file contains:

      test_907()
      {
              local file=$DIR/$tfile

              $LCTL set_param -n llite.*.read_ahead_stats=0
              # create a small 16 KiB (4-page) file
              fallocate -l 16K $file || error "failed to fallocate $file"
              # drop cached OSC DLM locks so the next read goes back to the OST
              cancel_lru_locks osc
              #dd if=$file of=/dev/null bs=1M
              # open the file, read 1 MiB (well past EOF), then close it
              $MULTIOP $file or1048576c
              $LCTL get_param llite.*.read_ahead_stats
      }
      run_test 907 "read ahead for small file"
      

      The readahead page stats are as follows:

      == sanity test 907: read ahead for small file ============ 04:38:01 (1669196281)
      short read: 16384/1048576
      short read: 0/1032192
      llite.lustre-ffff938b7849d000.read_ahead_stats=
      snapshot_time             4011.320890492 secs.nsecs
      start_time                0.000000000 secs.nsecs
      elapsed_time              4011.320890492 secs.nsecs
      hits                      4 samples [pages]
      misses                    1 samples [pages]
      zero_size_window          4 samples [pages]
      failed_to_reach_end       1 samples [pages]
      failed_to_fast_read       1 samples [pages]
      readahead_pages           1 samples [pages] 255 255 255
      

      For this small file the read-ahead engine issued 255 pages, even though a 16 KiB file spans only four 4 KiB pages; that is clearly not reasonable.
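
      To put the numbers side by side, here is a tiny stand-alone calculation (illustrative only; the 4 KiB client page size is an assumption, not taken from the ticket):

      /* Illustrative arithmetic: pages actually covered by a 16 KiB file
       * versus the 255 pages reported by read_ahead_stats above. */
      #include <stdio.h>

      int main(void)
      {
              unsigned long long file_size = 16 * 1024;  /* fallocate -l 16K */
              unsigned long long page_size = 4096;       /* assumed page size */
              unsigned long long file_pages =
                      (file_size + page_size - 1) / page_size;

              printf("file covers %llu pages, readahead issued 255\n",
                     file_pages);
              return 0;
      }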

      Attachments

        Issue Links

          Activity

            [LU-16338] read-ahead more than file size for a small file
            qian_wc Qian Yingjin added a comment -

            I think we can close this ticket.

            The deadlock problem in the previous comment was solved by LU-16579.


            adilger Andreas Dilger added a comment -

            qian_wc, can this patch be closed, or is there still more work to do here?
            pjones Peter Jones added a comment -

            All patches seem to have landed for 2.16. can this be considered as complete?


            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49226/
            Subject: LU-16338 readahead: clip readahead with kms
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: b33808d3aebb06cf0660baee1186b2f944d9d06d

            qian_wc Qian Yingjin added a comment -

            For a single-stripe file, after applying https://review.whamcloud.com/c/fs/lustre-release/+/49226
            (Subject: LU-16338 readahead: clip readahead with kms),
            it results in an endless loop:
            https://testing.whamcloud.com/test_logs/8d73c4e7-0e6e-482c-b93c-bf7159706890/show_text

            00000080:00200000:1.0:1676614736.277040:0:21215:0:(rw.c:1981:ll_readpage()) pgno:260, cnt:1032192, pos:0
            00000080:00000001:1.0:1676614736.277041:0:21215:0:(vvp_io.c:1654:vvp_io_read_ahead()) Process entered
            00000080:00000001:1.0:1676614736.277041:0:21215:0:(vvp_io.c:1666:vvp_io_read_ahead()) Process leaving (rc=0 : 0 : 0)
            00000008:00000001:1.0:1676614736.277042:0:21215:0:(osc_io.c:83:osc_io_read_ahead()) Process entered
            00000008:00000001:1.0:1676614736.277042:0:21215:0:(osc_lock.c:1281:osc_obj_dlmlock_at_pgoff()) Process entered
            00000008:00000001:1.0:1676614736.277043:0:21215:0:(osc_request.c:3137:osc_match_base()) Process entered
            00000008:00000001:1.0:1676614736.277043:0:21215:0:(osc_request.c:3172:osc_match_base()) Process leaving (rc=4 : 4 : 4)
            00000008:00000001:1.0:1676614736.277044:0:21215:0:(osc_lock.c:1315:osc_obj_dlmlock_at_pgoff()) Process leaving (rc=18446620395345229440 : -123678364322176 : ffff8f83e3073680)
            00000008:00000001:1.0:1676614736.277045:0:21215:0:(osc_io.c:112:osc_io_read_ahead()) Process leaving (rc=0 : 0 : 0)
            00000080:00000001:1.0:1676614736.277045:0:21215:0:(rw.c:2012:ll_readpage()) Process leaving (rc=524289 : 524289 : 80001)
            00000080:00000001:1.0:1676614736.277046:0:21215:0:(rw.c:1873:ll_readpage()) Process entered
            00000080:00200000:1.0:1676614736.277046:0:21215:0:(rw.c:1981:ll_readpage()) pgno:260, cnt:1032192, pos:0
            00000080:00000001:1.0:1676614736.277047:0:21215:0:(vvp_io.c:1654:vvp_io_read_ahead()) Process entered
            00000080:00000001:1.0:1676614736.277047:0:21215:0:(vvp_io.c:1666:vvp_io_read_ahead()) Process leaving (rc=0 : 0 : 0)
            00000008:00000001:1.0:1676614736.277048:0:21215:0:(osc_io.c:83:osc_io_read_ahead()) Process entered
            00000008:00000001:1.0:1676614736.277048:0:21215:0:(osc_lock.c:1281:osc_obj_dlmlock_at_pgoff()) Process entered
            00000008:00000001:1.0:1676614736.277049:0:21215:0:(osc_request.c:3137:osc_match_base()) Process entered
            00000008:00000001:1.0:1676614736.277049:0:21215:0:(osc_request.c:3172:osc_match_base()) Process leaving (rc=4 : 4 : 4)
            00000008:00000001:1.0:1676614736.277050:0:21215:0:(osc_lock.c:1315:osc_obj_dlmlock_at_pgoff()) Process leaving (rc=18446620395345229440 : -123678364322176 : ffff8f83e3073680)
            00000008:00000001:1.0:1676614736.277051:0:21215:0:(osc_io.c:112:osc_io_read_ahead()) Process leaving (rc=0 : 0 : 0)
            00000080:00000001:1.0:1676614736.277051:0:21215:0:(rw.c:2012:ll_readpage()) Process leaving (rc=524289 : 524289 : 80001)
            

            The code lines causing the endless loop:

            if (cl_offset(clob, vmpage->index) >= iter->count + iocb->ki_pos) {
                    result = cl_io_read_ahead(env, io, vmpage->index, &ra);
                    if (result < 0 || vmpage->index > ra.cra_end_idx) {
                            cl_read_ahead_release(env, &ra);
                            unlock_page(vmpage);
                            /* AOP_TRUNCATED_PAGE = 0x80001 (the rc=524289 in
                             * the trace above); it tells the caller to drop
                             * the page and call ->readpage() again, so the
                             * same branch is taken on every retry. */
                            RETURN(AOP_TRUNCATED_PAGE);
                    }
            }
            

            I will investigate whether this bug existed without the patch.
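
            To make the retry behaviour concrete, here is a small userspace model (hypothetical values and helper names; it only mirrors the branch quoted above, not the real kernel read path) showing why the caller never makes progress once the readahead window stops covering the faulting page:

            /* Hypothetical model: as long as the quoted branch keeps
             * returning AOP_TRUNCATED_PAGE for the same page index, the
             * caller re-issues ->readpage() for that index forever. */
            #include <stdio.h>

            #define AOP_TRUNCATED_PAGE 0x80001  /* 524289, as in the trace */

            /* Stand-in for the quoted ll_readpage() branch. */
            static int readpage_model(unsigned long index, unsigned long ra_end_idx)
            {
                    if (index > ra_end_idx)
                            return AOP_TRUNCATED_PAGE;  /* "drop page, retry" */
                    return 0;
            }

            int main(void)
            {
                    unsigned long index = 260;      /* pgno:260 from the trace */
                    unsigned long ra_end_idx = 255; /* window that never reaches it */
                    int tries = 0;

                    /* The real caller has no retry cap; the cap here is only
                     * so this model terminates. */
                    while (readpage_model(index, ra_end_idx) == AOP_TRUNCATED_PAGE &&
                           ++tries < 5)
                            ;
                    printf("gave up after %d identical retries\n", tries);
                    return 0;
            }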


            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49224/
            Subject: LU-16338 readahead: add stats for read-ahead page count
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: cdcf97e17e73dfdd65c4e46bb30c4a07f5e710cf


            paf0186 Patrick Farrell added a comment -

            I think the logic at the top level is a bit weird - I have mixed feelings about it - but I think you have the right idea about clipping at the OSC layer.  Even if we change the higher level logic, clipping at the OSC layer seems like a good idea to me.


            qian_wc Qian Yingjin added a comment -

            Our read-ahead mechanism is quite aggressive, and some of the logic in the read-ahead code path is unreasonable.
            In @ll_readahead(), the readahead end index is already clipped to the kms, but the following @ll_read_ahead_pages() then increases the end index again to align with the optimal RPC size and @ria->ria_end_idx_min.

            One solution to this problem is to not let the llite layer grow the end index beyond the kms.
            Another solution is the patch above: clip the returned readahead range with the kms in the OSC layer.
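
            As a rough sketch of that second approach (a hypothetical helper, not the literal patch; only the end index, e.g. cra_end_idx, and the kms value come from the discussion above):

            /* Hypothetical helper: clamp a readahead end index so it never
             * extends past the last page covered by the known minimum size
             * (kms). */
            static pgoff_t ra_clip_end_to_kms(pgoff_t end_idx, __u64 kms)
            {
                    pgoff_t kms_last_idx;

                    if (kms == 0)
                            return 0;
                    kms_last_idx = (kms - 1) >> PAGE_SHIFT; /* last page inside kms */

                    return end_idx > kms_last_idx ? kms_last_idx : end_idx;
            }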


            "Qian Yingjin <qian@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49226
            Subject: LU-16338 readahead: clip readahead with kms
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 9404bfb6f825c28d692d9c833ccd2137c46a84d3


            "Qian Yingjin <qian@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49224
            Subject: LU-16338 readahead: add stats for read-ahead page count
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 34779e7c2f51d93e190c863682c79ecd0060b06c


            People

              Assignee: qian_wc Qian Yingjin
              Reporter: qian_wc Qian Yingjin
              Votes: 0
              Watchers: 6
