
deprecate use of OST FID SEQ 0 for MDT0000

    Description

      Since Lustre 2.4.0 and DNE1, it has been possible to create OST objects using a different FID SEQ range for each MDT, to avoid contention during MDT object precreation.

      Objects that are created by MDT0000 are put into FID SEQ 0 (O/0/d*) on all OSTs and have a filename that is the decimal FID OID in ASCII. However, SEQ=0 objects are remapped to IDIF FID SEQ (0x100000000 | (ost_idx << 16)) so that they are unique across all OSTs.
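      The SEQ=0 to IDIF remapping described above can be sketched as follows. This is only an illustration, not the Lustre implementation: the function names are hypothetical, and the placement of the upper 16 bits of the 48-bit object ID into the low bits of the SEQ (and the 16-bit limit on the OST index) are assumptions beyond what the description states.

```python
FID_SEQ_IDIF = 0x100000000  # base of the IDIF SEQ range, as given in the description


def idif_seq(ost_idx: int, oid: int = 0) -> int:
    """Return the IDIF SEQ for a legacy SEQ=0 object on OST `ost_idx`."""
    if not 0 <= ost_idx < (1 << 16):
        # Assumption: the OST index occupies 16 bits of the SEQ.
        raise ValueError("ost_idx must fit in 16 bits")
    if not 0 <= oid < (1 << 48):
        raise ValueError("legacy object IDs are 48 bits wide")
    # Assumption: the upper 16 bits of the 48-bit object ID are folded
    # into the low 16 bits of the SEQ; the lower 32 bits stay in the OID.
    return FID_SEQ_IDIF | (ost_idx << 16) | (oid >> 32)


def idif_fid(ost_idx: int, oid: int) -> tuple:
    """Return a (seq, oid, ver) tuple for a legacy SEQ=0 object."""
    return (idif_seq(ost_idx, oid), oid & 0xFFFFFFFF, 0)


if __name__ == "__main__":
    seq, oid, ver = idif_fid(ost_idx=1, oid=1)
    print(f"[{seq:#x}:{oid:#x}:{ver:#x}]")  # [0x100010000:0x1:0x0]
```

      Because the OST index is part of the SEQ, object 1 on OST0000 and object 1 on OST0001 get different FIDs even though both live under O/0 on their respective OSTs.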

      Objects that are created by other MDTs (or MDT0000 after 2^48 objects are created in SEQ 0) use a unique SEQ in the FID_SEQ_NORMAL range (> 0x200000400), and use a filename that is the hexadecimal FID OID in ASCII.
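      The two naming conventions can be compared with a small sketch. The function name is hypothetical, the hexadecimal name is assumed to carry no "0x" prefix, and treating the boundary value 0x200000400 itself as a normal SEQ is an assumption (the description writes "> 0x200000400").

```python
FID_SEQ_OST_MDT0 = 0            # legacy SEQ used by MDT0000
FID_SEQ_NORMAL = 0x200000400    # start of the normal SEQ range (from the description)


def object_filename(seq: int, oid: int) -> str:
    """Return the on-disk object name for the two cases in the description."""
    if seq == FID_SEQ_OST_MDT0:
        # SEQ=0 objects: decimal OID in ASCII.
        return str(oid)
    if seq >= FID_SEQ_NORMAL:
        # Normal SEQ objects: hexadecimal OID in ASCII (no prefix assumed).
        return format(oid, "x")
    # Other reserved ranges are outside the scope of this sketch.
    raise ValueError(f"SEQ {seq:#x} is not handled by this sketch")


if __name__ == "__main__":
    print(object_filename(0, 255))            # 255
    print(object_filename(0x2C0000401, 255))  # ff
```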

      For compatibility with pre-DNE MDTs and OSTs, the use of SEQ=0 by MDT0000 has been kept until now, but there is no reason to keep this compatibility for new filesystems. It would be better to have MDT0000 assigned a "regular" FID SEQ range at startup, so that the SEQ=0 compatibility can eventually be removed. That would ensure OST objects have "proper and unique" FIDs, and avoid the complexity of mapping between the old SEQ=0 48-bit OID values and the IDIF FIDs.

      Older filesystems using SEQ=0 would eventually delete old objects in this range and/or could be forced to migrate to using new objects to clean up the remaining usage, if necessary.

      Attachments

        1. serial.txt
          778 kB
        2. stdout.txt
          484 kB

          Activity

            [LU-14692] deprecate use of OST FID SEQ 0 for MDT0000
            dongyang Dongyang Li added a comment -

            Alex, could you share the vmcore-dmesg from the crash?
            I wonder if the change to "normal SEQ" happened after replay_barrier; when the MDT starts again for recovery, it would then see the old IDIF SEQ from disk.

            bzzz Alex Zhuravlev added a comment -

            Not sure, but I hadn't seen the following problem (in conf-sanity test 84) before the last wave of landings, which includes LU-14692:

            LustreError: 343158:0:(osp_internal.h:530:osp_fid_diff()) ASSERTION( fid_seq(fid1) == fid_seq(fid2) ) failed: fid1:[0x2c0000401:0x2:0x0], fid2:[0x100010000:0x1:0x0]
            ...
            PID: 343158  TASK: ffff8b7824a605c0  CPU: 1   COMMAND: "tgt_recover_0"
             #0 [ffff8b783d273578] panic at ffffffff8f0b9786
                /tmp/kernel/kernel/panic.c: 299
             #1 [ffff8b783d2735f8] osp_create at ffffffffc1198895 [osp]
                /home/lustre/master-mine/lustre/osp/osp_internal.h: 529
             #2 [ffff8b783d273680] lod_sub_create at ffffffffc113534e [lod]
                /home/lustre/master-mine/lustre/include/dt_object.h: 2333
             #3 [ffff8b783d2736f0] lod_striped_create at ffffffffc112076b [lod]
                /home/lustre/master-mine/lustre/lod/lod_object.c: 6338
             #4 [ffff8b783d273760] lod_xattr_set at ffffffffc1128200 [lod]
                /home/lustre/master-mine/lustre/lod/lod_object.c: 5068
             #5 [ffff8b783d273810] mdd_create_object at ffffffffc0f76a93 [mdd]
                /home/lustre/master-mine/lustre/include/dt_object.h: 2832
             #6 [ffff8b783d273940] mdd_create at ffffffffc0f81f98 [mdd]
                /home/lustre/master-mine/lustre/mdd/mdd_dir.c: 2827
             #7 [ffff8b783d273a40] mdt_reint_open at ffffffffc1038328 [mdt]
                /home/lustre/master-mine/lustre/mdt/mdt_open.c: 1574
             #8 [ffff8b783d273bf8] mdt_reint_rec at ffffffffc102731f [mdt]
                /home/lustre/master-mine/lustre/mdt/mdt_reint.c: 3240
             #9 [ffff8b783d273c20] mdt_reint_internal at ffffffffc0ff6ef6 [mdt]
                /home/lustre/master-mine/libcfs/include/libcfs/libcfs_debug.h: 155
            #10 [ffff8b783d273c58] mdt_intent_open at ffffffffc1002982 [mdt]
                /home/lustre/master-mine/lustre/mdt/mdt_handler.c: 4826
            #11 [ffff8b783d273c98] mdt_intent_policy at ffffffffc0fffe79 [mdt]
                /home/lustre/master-mine/lustre/mdt/mdt_handler.c: 4971
            #12 [ffff8b783d273cf8] ldlm_lock_enqueue at ffffffffc08bdbdf [ptlrpc]
                /home/lustre/master-mine/lustre/ptlrpc/../../lustre/ldlm/ldlm_lock.c: 1794
            #13 [ffff8b783d273d60] ldlm_handle_enqueue0 at ffffffffc08e5046 [ptlrpc]
                /home/lustre/master-mine/lustre/ptlrpc/../../lustre/ldlm/ldlm_lockd.c: 1441
            #14 [ffff8b783d273dd8] tgt_enqueue at ffffffffc091fd1f [ptlrpc]
                /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/tgt_handler.c: 1446
            #15 [ffff8b783d273df0] tgt_request_handle at ffffffffc0926147 [ptlrpc]
                /home/lustre/master-mine/lustre/include/lu_target.h: 645
            #16 [ffff8b783d273e68] handle_recovery_req at ffffffffc08c8c3c [ptlrpc]
                /home/lustre/master-mine/lustre/ptlrpc/../../lustre/ldlm/ldlm_lib.c: 2418
            #17 [ffff8b783d273e98] target_recovery_thread at ffffffffc08d1300 [ptlrpc]
                /home/lustre/master-mine/lustre/ptlrpc/../../lustre/ldlm/ldlm_lib.c: 2677
            #18 [ffff8b783d273f10] kthread at ffffffff8f0d5199
                /tmp/kernel/kernel/kthread.c: 340

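            The check that fired in the assertion message can be modelled with a short sketch. This is not the osp_fid_diff() source, only a reconstruction of the condition printed in the log; the Fid type and fid_diff() name are hypothetical. The two FIDs are taken from the message: fid1 is in a normal SEQ, while fid2 has SEQ 0x100010000, which equals 0x100000000 | (1 << 16), the IDIF SEQ for OST index 1.

```python
from typing import NamedTuple


class Fid(NamedTuple):
    seq: int
    oid: int
    ver: int = 0


def fid_diff(fid1: Fid, fid2: Fid) -> int:
    """Return the OID distance between two FIDs in the same SEQ.

    Comparing OIDs across different sequences is meaningless, so a
    mismatch is rejected (the kernel code asserts instead of raising).
    """
    if fid1.seq != fid2.seq:
        raise ValueError(f"SEQ mismatch: {fid1.seq:#x} != {fid2.seq:#x}")
    return fid1.oid - fid2.oid


if __name__ == "__main__":
    normal = Fid(0x2C0000401, 0x2)  # fid1 from the assertion message
    idif = Fid(0x100010000, 0x1)    # fid2 from the assertion message
    try:
        fid_diff(normal, idif)
    except ValueError as err:
        print(err)
```

            Under this model, any code path that compares a FID from the newly assigned normal SEQ against one still recorded in the IDIF SEQ would trip the check.
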
            pjones Peter Jones added a comment -

            Landed for 2.16

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/45822/
            Subject: LU-14692 osp: deprecate IDIF sequence for MDT0000
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 6d2e7d191a7b27cde62b605dbed14488cfd4d410

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49720/
            Subject: LU-14692 tests: restore sanity/312 to always_except
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 8767d2e44110fc19e624e963d5ebc788409339d3

            gerrit Gerrit Updater added a comment -

            "Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49754
            Subject: LU-14692 tests: allow FID_SEQ_NORMAL for MDT0000
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: 6b69a998e14917656556e62c6a4e4f33f80e2b4b

            adilger Andreas Dilger added a comment -

            Alex, the "allow FID_SEQ_NORMAL for MDT0000" patch removes "always_except LU-9054 312", but I'm not sure why, since it doesn't look related to the FID SEQ at all. It should be added back.

            gerrit Gerrit Updater added a comment -

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49720
            Subject: LU-14692 tests: restore sanity/312 to always_except
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 73b154bd53f36da8907701077f2182c933364c62

            bzzz Alex Zhuravlev added a comment -

            With this patch landed, 100% of my local tests fail:

            == sanity test 312: make sure ZFS adjusts its block size by write pattern ========================================================== 05:05:02 (1674191102)
            1+0 records in
            1+0 records out
            4096 bytes (4.1 kB, 4.0 KiB) copied, 0.0159779 s, 256 kB/s
            1+0 records in
            1+0 records out
            16384 bytes (16 kB, 16 KiB) copied, 0.0165552 s, 990 kB/s
            1+0 records in
            1+0 records out
            65536 bytes (66 kB, 64 KiB) copied, 0.0225513 s, 2.9 MB/s
            1+0 records in
            1+0 records out
            262144 bytes (262 kB, 256 KiB) copied, 0.0189756 s, 13.8 MB/s
            1+0 records in
            1+0 records out
            1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0227387 s, 46.1 MB/s
            1+0 records in
            1+0 records out
            4096 bytes (4.1 kB, 4.0 KiB) copied, 0.0551142 s, 74.3 kB/s
            1+0 records in
            1+0 records out
            4096 bytes (4.1 kB, 4.0 KiB) copied, 0.029839 s, 137 kB/s
             sanity test_312: @@@@@@ FAIL: blksz error, actual 4096,  expected: 2 * 1 * 4096
              Trace dump:
              = ./../tests/test-framework.sh:6549:error()
              = sanity.sh:24840:test_312()
              = ./../tests/test-framework.sh:6887:run_one()
              = ./../tests/test-framework.sh:6937:run_one_logged()
              = ./../tests/test-framework.sh:6773:run_test()
              = sanity.sh:24863:main()

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/46293/
            Subject: LU-14692 tests: allow FID_SEQ_NORMAL for MDT0000
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: eaae4655567b16260237764dadb7ab57df8b0edd

            gerrit Gerrit Updater added a comment -

            "Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/46293
            Subject: LU-14692 tests: allow FID_SEQ_NORMAL for MDT0000
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: fd4b785de75608c0652500625e82e3668f8a9495

            People

              dongyang Dongyang Li
              adilger Andreas Dilger