Lustre / LU-17070

sanity-flr test_200b: vvp_vmpage_error()) LBUG

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.16.0, Lustre 2.15.7
    • Affects Version/s: Lustre 2.16.0

    Description

      There's a periodic crash in sanity-flr in master that looks like this:

      [26047.097521] Lustre: DEBUG MARKER: == sanity-flr test 200b: racing IO, mirror extend and resync ========================================================== 09:51:50 (1693475510)
      [26047.201126] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre2
      [26047.214264] Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,flock trevis-23vm8@tcp:/lustre /mnt/lustre2
      [26047.252017] Lustre: Mounted lustre-client
      [26047.252911] Lustre: Skipped 1 previous similar message
      [26047.286711] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre3
      [26047.298657] Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,flock trevis-23vm8@tcp:/lustre /mnt/lustre3
      [26053.066268] LustreError: 11-0: lustre-OST0003-osc-ffff9e711d5ac000: operation ost_fallocate to node 10.240.38.66@tcp failed: rc = -524
      [26098.433176] Lustre: *** cfs_fail_loc=1423, val=0***
      [26098.437521] LustreError: 329308:0:(vvp_page.c:119:vvp_vmpage_error()) LBUG
      [26098.438976] Pid: 329308, comm: ptlrpcd_00_01 4.18.0-477.15.1.el8_8.x86_64 #1 SMP Fri Jun 2 08:27:19 EDT 2023
      [26098.440832] Call Trace TBD:
      [26098.441616] [<0>] libcfs_call_trace+0x6f/0xa0 [libcfs]
      [26098.442707] [<0>] lbug_with_loc+0x3f/0x70 [libcfs]
      [26098.443675] [<0>] vvp_page_completion_write+0x2f7/0x400 [lustre]
      [26098.445022] [<0>] cl_page_completion+0x170/0x430 [obdclass]
      [26098.446365] [<0>] osc_ap_completion.isra.34+0x138/0x3e0 [osc]
      [26098.447567] [<0>] osc_extent_finish+0x203/0x9f0 [osc]
      [26098.448577] [<0>] brw_interpret+0x1c3/0xdb0 [osc]
      [26098.449527] [<0>] ptlrpc_check_set+0x53a/0x1e70 [ptlrpc]
      [26098.450855] [<0>] ptlrpcd+0x856/0xa70 [ptlrpc]
      [26098.451782] [<0>] kthread+0x134/0x150
      [26098.452577] [<0>] ret_from_fork+0x35/0x40
      [26098.453420] Kernel panic - not syncing: LBUG

      Latest occurrence: https://testing.whamcloud.com/test_sets/1e9bca24-c969-4c92-a668-84678334a22a

      First occurrence in maloo on May 24: https://testing.whamcloud.com/test_sets/0c20e209-9122-4091-8244-e0c7b6ac8b2a

      The first-ever occurrence was in janitor, during special testing of the patch that added this test: https://review.whamcloud.com/c/fs/lustre-release/+/46413

      That patch landed in April 2023, so it is the likely culprit here.

          Activity

            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.15.7 [ 16821 ]

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/56988/
            Subject: LU-17070 lov: retry layout refresh if got old layouts
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: 356222ffb713c611af1bfa8379fe866a15da0439
            pjones Peter Jones made changes -
            Link New: This issue is related to EX-11137 [ EX-11137 ]

            gerrit Gerrit Updater added a comment -

            "Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/56988
            Subject: LU-17070 lov: retry layout refresh if got old layouts
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: 016e0aa36d03857c11de0a3413245ed605cb521c

            gerrit Gerrit Updater added a comment -

            "Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/56916
            Subject: LU-17070 lov: retry layout refresh if got old layouts
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: 86ca8d0d3a97533df9760cd3ed55521fea578d27
            adilger Andreas Dilger made changes -
            Summary Original: vvp_vmpage_error()) LBUG New: sanity-flr test_200b: vvp_vmpage_error()) LBUG
            yujian Jian Yu made changes -
            Link New: This issue is related to LU-18410 [ LU-18410 ]
            pjones Peter Jones made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            pjones Peter Jones added a comment -

            Merged for 2.16


            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/55061/
            Subject: LU-17070 lov: retry layout refresh if got old layouts
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 7974e41a26c22181be2818b3580756fa559d14d9

            People

              bobijam Zhenyu Xu
              green Oleg Drokin