Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.15.0
    • Labels: Upstream

    Description

      many racer runs time out; this was bisected to the patch:
      https://review.whamcloud.com/32158/
      Subject: LU-10948 llite: Introduce inode open heat counter
      Project: fs/lustre-release
      Branch: master
      Commit: 41d99c4902836b7265db946dfa49cf99381f0db4

          Activity

            [LU-14694] racer timeouts with LU-10948
            pjones Peter Jones added a comment -

            Landed for 2.15

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/43783/
            Subject: LU-14694 mdt: do not remove orphans at umount
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 0b94a058fe6732907d7aa3f5de0cbd7888d28ce1

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/43783/ Subject: LU-14694 mdt: do not remove orphans at umount Project: fs/lustre-release Branch: master Current Patch Set: Commit: 0b94a058fe6732907d7aa3f5de0cbd7888d28ce1

            gerrit Gerrit Updater added a comment -

            Alex Zhuravlev (bzzz@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43783
            Subject: LU-14694 mdt: do not remove orphans at umount
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 9f9e98a5e94e021c68882db5ab3d03484048b810

            bzzz Alex Zhuravlev added a comment -

            I think we need something like the above on its own, as umount with a just-evicted client is a legal case.

            bzzz Alex Zhuravlev added a comment -

            probably a slightly better option is to re-use the existing infrastructure to skip orphan removal:

            --- a/lustre/mdt/mdt_handler.c
            +++ b/lustre/mdt/mdt_handler.c
            @@ -6438,7 +6438,8 @@ static int mdt_export_cleanup(struct obd_export *exp)
                                            rc = mdt_ctxt_add_dirty_flag(&env, info, mfd);
             
                                    /* Don't unlink orphan on failover umount, LU-184 */
            -                       if (exp->exp_flags & OBD_OPT_FAILOVER) {
            +                       if (exp->exp_flags & OBD_OPT_FAILOVER ||
            +                               exp->exp_obd->obd_stopping) {
                                            ma->ma_valid = MA_FLAGS;
                                            ma->ma_attr_flags |= MDS_KEEP_ORPHAN;
                                    }
            
            bzzz Alex Zhuravlev added a comment -

            basically the problem is that:
            1) openlock is taken on directories
            2) the client doesn't cancel locks on umount:

                            /* obd_force == local only */
                            ldlm_cli_cancel_unused(obd->obd_namespace, NULL,
                                                   obd->obd_force ? LCF_LOCAL : 0, NULL);
            

            3) thus the MDT has to clean up those "lost" locks, i.e. close the directories
            4) a close may result in directory removal
            5) a directory can be striped, and thus needs all involved MDTs to be healthy
            6) MDTs are stopped one by one
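
            As a quick way to see steps 1 and 2 in practice, one can check how many MDC DLM locks the client is still holding right before umount. This is only an illustrative check (it assumes lctl is in PATH on the client; the parameter is the stock ldlm one, nothing specific to this ticket):

            # Number of DLM locks the client still holds in its MDC namespaces.
            # Open-handle locks from opencache show up here; with obd_force set,
            # umount cancels them only locally (LCF_LOCAL), so the MDT still has
            # to deal with them at export cleanup.
            lctl get_param ldlm.namespaces.*mdc*.lock_count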

            locally I "solved" the problem by disabling opencache for directories:

            --- a/lustre/mdt/mdt_open.c
            +++ b/lustre/mdt/mdt_open.c
            @@ -820,6 +820,9 @@ static int mdt_object_open_lock(struct mdt_thread_info *info,
                    if (req_is_replay(mdt_info_req(info)))
                            RETURN(0);
             
            +       if (S_ISDIR(lu_object_attr(&obj->mot_obj)))
            +               open_flags &= ~MDS_OPEN_LOCK;
            +
                    if (S_ISREG(lu_object_attr(&obj->mot_obj))) {
            

            but it looks like we've got a few real problems to solve here. The most serious one, IMO, is to handle such a close on the MDT in a better manner, so that umount doesn't get stuck indefinitely.

            bzzz Alex Zhuravlev added a comment -

            --- a/lustre/tests/racer.sh
            +++ b/lustre/tests/racer.sh
            @@ -180,6 +180,7 @@ test_1() {
                            lss_cleanup
                    fi
             
            +       cancel_lru_locks mdc
                    return $rrc
             }
             run_test 1 "racer on clients: ${CLIENTS:-$(hostname)} DURATION=$DURATION"
            

            this helps. It looks like open-handle locks are lost at umount, causing the MDT to close them at export cleanup.
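
            For context, cancel_lru_locks from test-framework.sh essentially clears the client's DLM LRU through lctl. A minimal hand-run sketch of the same idea (assuming lctl is available on the client) is:

            # Cancel all unused MDC DLM locks on the client; writing "clear" to
            # lru_size flushes the whole LRU, which is roughly what
            # "cancel_lru_locks mdc" does in test-framework.sh.
            lctl set_param -n ldlm.namespaces.*mdc*.lru_size=clear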

            bzzz Alex Zhuravlev added a comment -

            sanity/29 fails as well:

            == sanity test 29: IT_GETATTR regression  ============================================================ 11:52:13 (1621597933)
            Lustre: DEBUG MARKER: == sanity test 29: IT_GETATTR regression ============================================================ 11:52:13 (1621597933)
            striped dir -i1 -c2 -H all_char /mnt/lustre/d29
            first d29
            Lustre: DEBUG MARKER: first d29
            total 0
            -rw-r--r-- 1 root root 0 May 21 11:52 foo
            second d29
            Lustre: DEBUG MARKER: second d29
            total 0
            -rw-r--r-- 1 root root 0 May 21 11:52 foo
            done
            Lustre: DEBUG MARKER: done
             sanity test_29: @@@@@@ FAIL: CURRENT: 10 > 9 
            Lustre: DEBUG MARKER: sanity test_29: @@@@@@ FAIL: CURRENT: 10 > 9
              Trace dump:
              = ./../tests/test-framework.sh:6280:error()
              = sanity.sh:3398:test_29()
              = ./../tests/test-framework.sh:6583:run_one()
              = ./../tests/test-framework.sh:6630:run_one_logged()
              = ./../tests/test-framework.sh:6457:run_test()
              = sanity.sh:3410:main()
            Dumping lctl log to /tmp/ltest-logs/sanity.test_29.*.1621597935.log
            Dumping logs only on local client.
            Resetting fail_loc on all nodes...done.
            FAIL 29 (2s)
            
            bzzz Alex Zhuravlev added a comment -

            > Also, is this with opencache enabled, or just with the patch applied?

            this is the latest clean master

            bzzz Alex Zhuravlev added a comment -

            basically it's MDSCOUNT=2 REFORMAT=yes bash racer.sh
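
            For completeness, a reproduction sketch (assuming a built lustre-release tree; DURATION is the usual racer.sh knob and the value here is only an example):

            # Run racer against a freshly formatted filesystem with two MDTs.
            cd lustre/tests
            MDSCOUNT=2 REFORMAT=yes DURATION=300 bash racer.sh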

            People

              bzzz Alex Zhuravlev
              bzzz Alex Zhuravlev
              Votes: 0
              Watchers: 6
