Details

    • Improvement
    • Resolution: Unresolved
    • Major
    • None
    • Lustre 2.14.0, Lustre 2.12.9, Lustre 2.16.1, Lustre 2.15.7

    Description

      Extend procfs so that lctl set_param can add FLDB entries:

      [root@tmp tests]# lctl set_param fld.srv-lustre-MDT0000.fldb=[0x0000000280000400-0x0000000290000400]:0:mdt
      fld.srv-lustre-MDT0000.fldb=[0x0000000280000400-0x0000000290000400]:0:mdt
      [root@tmp tests]# lctl set_param fld.srv-lustre-MDT0000.fldb=[0x00000002a0000400-0x00000002b0000400]:2:ost
      fld.srv-lustre-MDT0000.fldb=[0x00000002a0000400-0x00000002b0000400]:2:ost
      [root@tmp tests]# lctl get_param fld.*.fldb
      fld.srv-lustre-MDT0000.fldb=
      [0x000000000000000c-0x0000000100000000]:0:mdt
      [0x0000000200000002-0x0000000200000003]:0:mdt
      [0x0000000200000007-0x0000000200000008]:0:mdt
      [0x0000000200000400-0x0000000240000400]:0:mdt
      [0x0000000280000400-0x0000000290000400]:0:mdt
      [0x00000002a0000400-0x00000002b0000400]:2:ost
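
Because entries are written as free-form strings, it may be worth checking one before passing it to lctl set_param. The following is a minimal sketch and not part of the patch: the helper name fldb_entry_is_sane is hypothetical, and the rule that the start sequence must be strictly below the end is an assumption made for illustration.

```shell
# Hypothetical helper: check the shape "[0xSTART-0xEND]:INDEX:mdt|ost"
# and that START is below END (an assumed rule, not taken from the patch).
# Values at or above 2^63 would overflow the shell's signed arithmetic.
fldb_entry_is_sane() {
    entry=$1
    # Reject anything that does not match the printed entry format.
    printf '%s\n' "$entry" |
        grep -Eq '^\[0x[0-9a-fA-F]{1,16}-0x[0-9a-fA-F]{1,16}\]:[0-9]+:(mdt|ost)$' || return 1
    range=${entry%%\]*}   # "[0xSTART-0xEND"
    range=${range#\[}     # "0xSTART-0xEND"
    start=${range%-*}     # "0xSTART"
    end=${range#*-}       # "0xEND"
    # Shell arithmetic accepts the 0x prefix, so compare numerically.
    [ "$(($start))" -lt "$(($end))" ]
}
```

A possible use, with the parameter name taken from the description above: `fldb_entry_is_sane "$e" && lctl set_param fld.srv-lustre-MDT0000.fldb="$e"`.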
      

          Activity

            [LU-15437] setparam to modify fldb

            adilger Andreas Dilger added a comment

            Note also that if the on-disk seq_srv value is wrong, loading it triggers an LASSERT (which happened to me while I was testing things), which isn't helpful. It would be better to return an error and fail the mount:

            LustreError: 18161:0:(fid_handler.c:579:seq_server_init()) ASSERTION( lu_seq_range_is_sane(&seq->lss_space) ) failed: 
            [21797.990017] LustreError: 18161:0:(fid_handler.c:579:seq_server_init()) LBUG

            adilger Andreas Dilger added a comment (edited)

            It also appears that there is a way to change the allocation range for a target, by writing:

            mds# lctl get_param seq.srv-testfs-MDT0000.space
            [0x200000bd0-0x240000000]
            mds# lctl set_param seq.srv-testfs-MDT0000.space="[0x201000000-0x240000000]"
            seq.srv-testfs-MDT0000.space=[0x201000000-0x240000000]
            mds# lctl get_param seq.srv-testfs-MDT0000.space
            [0x201000000-0x240000000]
            

            This looks like it works, but in fact it does not affect sequence allocation for the current mount, nor does it modify the on-disk seq_srv file, regardless of whether the client and/or server are remounted:

            client# mount -t lustre centos7:/testfs /mnt/testfs2
            touch /mnt/testfs2/foo; lfs path2fid /mnt/testfs/foo
            [0x200000bd4:0x1:0x0]
            mds# umount /mnt/testfs-mds2; mount -t lustre /dev/mapper/mds1_flakey /mnt/testfs-mds2
            mds# lctl get_param seq.srv-testfs-MDT0000.space
            [0x2000013a0-0x240000000]
            client# umount /mnt/testfs2; mount -t lustre centos7:/testfs /mnt/testfs2
            rm /mnt/testfs2/foo; touch /mnt/testfs2/foo; lfs path2fid /mnt/testfs/foo
            [0x2000013a0:0x1:0x0]
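
The check done by eye in the transcript above (does a newly allocated FID fall inside the range that was just written?) can be sketched as a small helper. This is an illustration only: the function name fid_seq_in_range is hypothetical, inputs are assumed to be well-formed output from lfs path2fid and lctl get_param, and treating the end of the range as exclusive is an assumption.

```shell
# Hypothetical helper: succeed if the sequence of a FID "[0xSEQ:0xOID:0xVER]"
# lies inside a range "[0xSTART-0xEND]". The end is treated as exclusive,
# which is an assumption for this sketch.
fid_seq_in_range() {
    fid=$1
    range=$2
    seq=${fid#\[}       # drop the leading "["
    seq=${seq%%:*}      # keep only "0xSEQ"
    r=${range#\[}       # drop the leading "["
    r=${r%\]}           # drop the trailing "]"
    start=${r%-*}       # "0xSTART"
    end=${r#*-}         # "0xEND"
    [ "$(($seq))" -ge "$(($start))" ] && [ "$(($seq))" -lt "$(($end))" ]
}
```

With the values from the transcript, `fid_seq_in_range "[0x200000bd4:0x1:0x0]" "[0x201000000-0x240000000]"` fails, which matches the observation that the new range was not used for allocation.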
            
            adilger Andreas Dilger added a comment (edited)

            I realized that there is already a way to recover the FLDB file with the existing code, since patch http://review.whamcloud.com/7027 "LU-3565 mdt: a new param to allocate sequences" was landed in v2_4_54.

            The code is confusing because "lustre/fid/lproc_fid.c" implements seq.ctl-*.fldb for the SEQ controller, while a separate "lustre/fld/lproc_fld.c" implements fld.srv-*.fldb for the individual SEQ servers (which cannot be written without your patch).

            The other thing that is broken is that seq.ctl-*.fldb prints an entry like:

             [0x0000000200000400-0x0000000240000400]:0:mdt
            

            but reads an entry like this, with a closing ")" instead of "]":

             [0x0000000200000400-0x0000000240000400):0:mdt
            

            so an existing fldb backup cannot be restored directly. Entries must also be written one at a time and in seq order. I'm working on a patch to make the printed and parsed formats symmetrical.
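
Until the formats are symmetrical, a saved listing would need its closing bracket rewritten before it could be fed back. A rough sketch of that conversion follows. The function name fldb_print_to_write is hypothetical, and the only facts assumed about the accepted input are the ones stated in this comment: a closing ")" and seq order.

```shell
# Hypothetical helper: read printed FLDB entries "[start-end]:idx:type" on
# stdin and emit them as "[start-end):idx:type", sorted by start sequence.
# Lines that are not entries (such as the "fld.srv-...fldb=" header) are
# dropped. Text sorting is only correct when all sequences are printed at
# the same fixed width, as in the get_param output in the description.
fldb_print_to_write() {
    sed -n -E 's/^[[:space:]]*(\[0x[0-9a-fA-F]+-0x[0-9a-fA-F]+)\](:[0-9]+:(mdt|ost))[[:space:]]*$/\1)\2/p' |
        LC_ALL=C sort
}
```

Each output line could then be written one at a time, for example in a `while read -r entry` loop around lctl set_param. The exact parameter to write to should be checked against the code first.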


            "Alex Zhuravlev <bzzz@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/46042
            Subject: LU-15437 fld: extend procfs to allow fldb manipulations
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 8b938c541f844e0778e95ff6bf5d1095bb3b2c35


            People

              wc-triage WC Triage
              bzzz Alex Zhuravlev