Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.17.0

    Description

      Restriping a striped directory down to 1 stripe (merge) and then back up to 2 stripes (split) fails: the split is rejected with "Invalid argument".
      A demonstration using a modified sanity test_230o
      ( https://review.whamcloud.com/c/fs/lustre-release/+/57784 ):

      == sanity test 230o: dir split =========================== 23:39:23 (1736973563)
      lod.lustre-MDT0000-mdtlov.mdt_hash=crush
      lod.lustre-MDT0001-mdtlov.mdt_hash=crush
      mdt.lustre-MDT0000.enable_dir_restripe=1
      mdt.lustre-MDT0001.enable_dir_restripe=1
      total: 100 create in 0.04 seconds: 2278.42 ops/second
      total: 100 mkdir in 0.73 seconds: 137.12 ops/second
      Waiting 100s for 'crush'
      Updated after 2s: want 'crush' got 'crush'
      99 migrated when dir split 1 to 2 stripes
      Waiting 100s for 'crush,fixed'
      Updated after 3s: want 'crush,fixed' got 'crush,fixed'
      lt-lfs setdirstripe: dirstripe error on '/mnt/lustre/d230o.sanity': Invalid argument
      lt-lfs setdirstripe: cannot create dir '/mnt/lustre/d230o.sanity': Invalid argument
      lmv_stripe_count: 1 lmv_stripe_offset: 0 lmv_hash_type: crush,fixed
      mdtidx		 FID[seq:oid:ver]
           0		 [0x200002b11:0x3:0x0]		
       sanity test_230o: @@@@@@ FAIL: split d230o.sanity to 2 stripes failed 
        Trace dump:
        = ./../tests/test-framework.sh:7228:error()
        = sanity.sh:23951:test_230o()
        = ./../tests/test-framework.sh:7601:run_one()
        = ./../tests/test-framework.sh:7664:run_one_logged()
        = ./../tests/test-framework.sh:7467:run_test()
        = sanity.sh:23956:main()
      Dumping lctl log to /tmp/test_logs/1736973560/sanity.test_230o.*.1736973569.log
      Dumping logs only on local client.
      mdt.lustre-MDT0000.enable_dir_restripe=0
      mdt.lustre-MDT0001.enable_dir_restripe=0
      FAIL 230o (6s)
      

        Activity

          [LU-18639] dir split fails after dir merge
          pjones Peter Jones added a comment -

          Merged for 2.17


          gerrit Gerrit Updater added a comment -

          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/57784/
          Subject: LU-18639 dne: a correct check for dir split
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 103c1f560c8c8dde601aac1fafed137b19fa3429

          zam Alexander Zarochentsev added a comment -

          > I suspect the stripe count is being cached somewhere incorrectly and that needs to be fixed after the stripes are merged in the first part of the test.
          Yes, I think the same. I attempted an MDS failover and then repeated the restripe operation. Unfortunately, the server crashed in another place:

          [541942.910617] LustreError: 604077:0:(dt_object.h:2767:dt_insert()) ASSERTION( dt->do_index_ops ) failed: 
          [541942.913430] LustreError: 604077:0:(dt_object.h:2767:dt_insert()) LBUG
          [541942.915345] CPU: 0 PID: 604077 Comm: mdt00_002 Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-305.25.1.el8_4.x86_64 #1
          [541942.918826] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
          [541942.921179] Call Trace:
          [541942.922004]  dump_stack+0x5c/0x80
          [541942.923036]  lbug_with_loc.cold.10+0x5/0x68 [libcfs]
          [541942.924534]  lod_sub_insert+0x28c/0x360 [lod]
          [541942.925846]  lod_xattr_set_lmv.isra.57.constprop.72+0x60c/0xe00 [lod]
          [541942.927977]  lod_dir_striping_create_internal+0x395/0x800 [lod]
          [541942.929732]  lod_xattr_set+0x696/0x1420 [lod]
          [541942.931031]  ? kmem_cache_alloc+0x12e/0x270
          [541942.932301]  mdo_xattr_set+0xcc/0x5e0 [mdd]
          [541942.933563]  ? osd_trans_start+0x18b/0x5f0 [osd_ldiskfs]
          [541942.935202]  ? top_trans_start+0x353/0xa10 [ptlrpc]
          [541942.936897]  mdd_dir_layout_split+0x7c5/0xa40 [mdd]
          [541942.938573]  ? mdd_layout_change+0x215/0x17a0 [mdd]
          [541942.940190]  ? mdd_attr_get+0x37/0x100 [mdd]
          [541942.941475]  mdd_layout_change+0x215/0x17a0 [mdd]
          [541942.943034]  ? mdt_attr_get_complex+0xd4/0x8f0 [mdt]
          [541942.944676]  mdt_restripe_internal+0x730/0xac0 [mdt]
          [541942.946268]  mdt_restripe.isra.54+0x98c/0xbe0 [mdt]
          [541942.947861]  mdt_create+0xf0a/0x1540 [mdt]
          [541942.949175]  mdt_reint_create+0x35d/0x420 [mdt]
          [541942.950540]  mdt_reint_rec+0x11b/0x260 [mdt]
          [541942.951842]  mdt_reint_internal+0x4a5/0x800 [mdt]
          [541942.953250]  mdt_reint+0x5d/0x110 [mdt]
          [541942.954469]  tgt_request_handle+0x3f4/0x1950 [ptlrpc]
          [541942.956031]  ptlrpc_server_handle_request+0x2aa/0xcf0 [ptlrpc]
          [541942.957778]  ? lprocfs_counter_add+0x10f/0x180 [obdclass]
          [541942.959419]  ptlrpc_main+0xc5d/0x1530 [ptlrpc]
          [541942.960805]  ? ptlrpc_wait_event+0x510/0x510 [ptlrpc]
          [541942.962293]  kthread+0x116/0x130
          [541942.963290]  ? kthread_flush_work_fn+0x10/0x10
          [541942.964615]  ret_from_fork+0x22/0x40

          adilger Andreas Dilger added a comment -

          Sure, but when you do have a patch it can be fixed there. It is just a trivial comment in a man page, so not really worth its own patch.

          It looks from your debug log that the root of the problem is that the test is trying to split the directory 1->2 stripes, but the MDS thinks the directory already has 2 stripes for some reason and fails the check. I suspect the stripe count is being cached somewhere incorrectly and that needs to be fixed after the stripes are merged in the first part of the test.

          zam Alexander Zarochentsev added a comment - - edited

          adilger ,
          > Can you please fix this in your patch.
          This patch isn't for landing yet, as I have no fix for the problem yet.


          adilger Andreas Dilger added a comment -

          zam, I see that LMV_HASH_FIXED_FLAG is mentioned in one comment related to patch https://review.whamcloud.com/57117 ("LU-17810 dne: dir restripe without fixed hash flag"), but this value does not exist anywhere in master (as of 2.16.51). It looks like this was a typo and should be "LMV_HASH_FLAG_FIXED" instead. Can you please fix this in your patch.

          zam Alexander Zarochentsev added a comment - - edited

          The dir split fails in lod_dir_declare_layout_split():

          00000004:00000001:1.0:1736966398.352205:0:600631:0:(lod_object.c:9027:lod_dir_declare_layout_split()) Process entered
          00000100:00000001:3.0:1736966398.352206:0:597520:0:(client.c:2942:ptlrpc_free_committed()) Process leaving
          00000100:00000001:3.0:1736966398.352207:0:597520:0:(client.c:1676:after_reply()) Process leaving (rc=0 : 0 : 0)
          00000004:00000002:1.0:1736966398.352207:0:600631:0:(lod_object.c:9035:lod_dir_declare_layout_split()) restriping [0x200000402:0x2:0x0] 2 to 2
          00000004:00000001:1.0:1736966398.352208:0:600631:0:(lod_object.c:9056:lod_dir_declare_layout_split()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea)
          

          Please note that the dir stripe count is 1, not 2, at the moment.

          Extra debug messages come from this patch:

          diff --git a/lustre/lod/lod_object.c b/lustre/lod/lod_object.c
          index 95b8b556fd..7199c95e74 100644
          --- a/lustre/lod/lod_object.c
          +++ b/lustre/lod/lod_object.c
          @@ -9031,6 +9031,8 @@ static int lod_dir_declare_layout_split(const struct lu_env *env,
           
                  saved_count = lo->ldo_dir_stripes_allocated;
                  stripe_count = le32_to_cpu(lum->lum_stripe_count);
          +       CDEBUG(D_INODE, "restriping "DFID" %d to %d\n", 
          +               PFID(lod_object_fid(lo)), saved_count, stripe_count);
           
                  /* if the split target is overstriped, we need to put that flag in the
                   * current layout so it can allocate the larger number of stripes
          diff --git a/lustre/mdt/mdt_restripe.c b/lustre/mdt/mdt_restripe.c
          index bbcef6a641..871d5e568d 100644
          --- a/lustre/mdt/mdt_restripe.c
          +++ b/lustre/mdt/mdt_restripe.c
          @@ -181,6 +181,9 @@ int mdt_restripe_internal(struct mdt_thread_info *info,
                          RETURN(-EALREADY);
                  }
           
          +       CDEBUG(D_INODE, "restriping "DFID" from %d stripes to %d\n",
          +               PFID(mdt_object_fid(child)),
          +               lmv_stripe_count, le32_to_cpu(lum->lum_stripe_count));
                  if (le32_to_cpu(lum->lum_stripe_count) > lmv_stripe_count) {
                          /* split */
                          struct md_layout_change *mlc = &info->mti_mlc;
          

          gerrit Gerrit Updater added a comment -

          "Alexander Zarochentsev <alexander.zarochentsev@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/57784
          Subject: LU-18639 tests: sanity 230o improvement
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: a4bebf6cc4666b21e9b7940c3af68cf1520b0905

          People

            zam Alexander Zarochentsev
            zam Alexander Zarochentsev