
mount MDT takes very long with hsm enable

Details


    Description

      We observed that when mounting an MDT with HSM enabled, the mount command takes minutes, compared with seconds before. We saw this in the log:

      [53618.238941] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2; mount -t lustre -o localrecov  /dev/mapper/mds2_flakey /mnt/lustre-mds2
      [53618.624098] LDISKFS-fs (dm-6): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
      [53720.390690] Lustre: 1722736:0:(mdt_coordinator.c:1114:mdt_hsm_cdt_start()) lustre-MDT0001: trying to init HSM before MDD
      [53720.392834] LustreError: 1722736:0:(mdt_coordinator.c:1125:mdt_hsm_cdt_start()) lustre-MDT0001: cannot take the layout locks needed for registered restore: -2
      [53720.398049] LustreError: 1722741:0:(mdt_coordinator.c:1090:mdt_hsm_cdt_start()) lustre-MDT0001: Coordinator already started or stopping
      [53720.400681] Lustre: lustre-MDT0001: Imperative Recovery not enabled, recovery window 60-180
      [53720.424872] Lustre: lustre-MDT0001: in recovery but waiting for the first client to connect
      [53720.953893] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
      [53722.067555] Lustre: DEBUG MARKER:  
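
To locate the stall in a log like the one above, one option is to compare consecutive kernel timestamps and report the widest gap. The following is only a minimal sketch: the helper name `largest_gap` and the `SAMPLE` string are illustrative and not part of Lustre, and the sample messages are shortened copies of the lines quoted in this description (only the timestamps matter here).

```python
import re

# Matches lines of the form "[53618.238941] message text".
TS_LINE = re.compile(r"^\s*\[\s*(\d+\.\d+)\]\s*(.*)$")


def largest_gap(log_text):
    """Return (gap_seconds, message_before, message_after) for the widest
    gap between consecutive timestamped lines, or None if fewer than two
    timestamped lines are present."""
    entries = []
    for line in log_text.splitlines():
        match = TS_LINE.match(line)
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    if len(entries) < 2:
        return None
    # Pair each entry with its successor and pick the pair furthest apart.
    before, after = max(
        zip(entries, entries[1:]),
        key=lambda pair: pair[1][0] - pair[0][0],
    )
    return after[0] - before[0], before[1], after[1]


# Timestamps taken from the log in this ticket; message text is abbreviated.
SAMPLE = """\
[53618.238941] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2; mount -t lustre
[53618.624098] LDISKFS-fs (dm-6): mounted filesystem with ordered data mode.
[53720.390690] Lustre: (mdt_coordinator.c:1114:mdt_hsm_cdt_start()) trying to init HSM before MDD
[53720.392834] LustreError: (mdt_coordinator.c:1125:mdt_hsm_cdt_start()) cannot take the layout locks
[53720.398049] LustreError: (mdt_coordinator.c:1090:mdt_hsm_cdt_start()) Coordinator already started or stopping
[53720.400681] Lustre: lustre-MDT0001: Imperative Recovery not enabled
[53720.424872] Lustre: lustre-MDT0001: in recovery but waiting for the first client
[53720.953893] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
[53722.067555] Lustre: DEBUG MARKER:
"""

gap, msg_before, msg_after = largest_gap(SAMPLE)
print(f"largest gap: {gap:.1f} s")
print(f"  after:  {msg_before}")
print(f"  before: {msg_after}")
```

On the log in this ticket, the widest gap is about 102 seconds, between the LDISKFS mount message and the first `mdt_hsm_cdt_start()` message; everything else completes within roughly a second.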
      

      This seems related to LU-13920.

          Activity

            [LU-14399] mount MDT takes very long with hsm enable
            Deiter Alex Deiter made changes -
            Link New: This issue is related to EX-4820 [ EX-4820 ]
            bzzz Alex Zhuravlev made changes -
            Comment [ The patch that just landed fails on every run on my setup:
            {quote}
            ...
            Writing CONFIGS/mountdata
            start mds service on tmp.BKaRODHgLn
            Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
            Started lustre-MDT0000
             conf-sanity test_132: @@@@@@ FAIL: Can not take the layout lock
              Trace dump:
              = ./../tests/test-framework.sh:6389:error()
              = conf-sanity.sh:9419:test_132()
              = ./../tests/test-framework.sh:6693:run_one()
              = ./../tests/test-framework.sh:6740:run_one_logged()
              = ./../tests/test-framework.sh:6581:run_test()
              = conf-sanity.sh:9422:main()
            Dumping lctl log to /tmp/ltest-logs/conf-sanity.test_132.*.1642612854.log
            Dumping logs only on local client.
            FAIL 132 (84s)
            {quote}
            ]
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.15.0 [ 14791 ]
            Resolution New: Fixed [ 1 ]
            Status Original: Open [ 1 ] New: Resolved [ 5 ]
            spitzcor Cory Spitz made changes -
            Fix Version/s Original: Lustre 2.15.0 [ 14791 ]
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.15.0 [ 14791 ]
            Fix Version/s Original: Lustre 2.14.0 [ 14490 ]
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.14.0 [ 14490 ]
            pjones Peter Jones made changes -
            Assignee Original: WC Triage [ wc-triage ] New: Sergey Cheremencev [ scherementsev ]
            pjones Peter Jones made changes -
            Link New: This issue is related to LU-13920 [ LU-13920 ]
            mdiep Minh Diep created issue -

            People

              Assignee: scherementsev Sergey Cheremencev
              Reporter: mdiep Minh Diep
              Votes: 0
              Watchers: 9

              Dates

                Created:
                Updated:
                Resolved: