Lustre / LU-12209

cannot create stripe dir: Stale file handle

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical
    • Affects Version/s: Lustre 2.10.7
    • Environment: CentOS 7.6, servers 2.10.7, clients 2.12 or 2.10
    • Severity: 3

    Description

      I'm facing a new issue on Oak (2.10.7 servers); I've tried with both 2.10 and 2.12 clients:

      As root:

      # cd /oak/stanford/groups/
      # lfs mkdir -i 1 caiwei
      lfs mkdir: dirstripe error on 'caiwei': Stale file handle
      lfs setdirstripe: cannot create stripe dir 'caiwei': Stale file handle
      
      # lfs getdirstripe .
      lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none
      

      Does that ring a bell? Oak is only using DNE v1 with statically striped directories. We never saw this before 2.10.7 (we recently upgraded Oak).

      A basic lctl dk doesn't show anything on the MDS, but I may have to enable specific debug flags to see more. No other traces found so far.

      Tried with 2.10 and 2.12 clients, with or without idmap.

      Thanks!

      Attachments

        1. oak-md1-s1-MDT1.dk.gz
          2.95 MB
          Stephane Thiell
        2. oak-md1-s2-MDT0.dk.gz
          10.15 MB
          Stephane Thiell
        3. sh-101-60.dk.gz
          363 kB
          Stephane Thiell

        Issue Links

          Activity

            pjones Peter Jones added a comment -

            Stephane

            You are correct. Yet another illustration of why it is confusing to have multiple patches tracked under the same Jira ticket spanning release boundaries.

            Peter

            sthiell Stephane Thiell added a comment -

            Peter, this patch (https://review.whamcloud.com/#/c/33401/ - LU-11418 llog: refresh remote llog upon -ESTALE) is already available in 2.12.0:

             

            commit 71f409c9b31b90fa432f1f46ad4e612fb65c7fcc
            Author: Lai Siyao <lai.siyao@intel.com>
            Date:   Wed Oct 17 13:29:53 2018 +0800
            
                LU-11418 llog: refresh remote llog upon -ESTALE
            

            But it's not included in 2.10.7 (that we're running on our Oak servers).
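
            A quick way to double-check which releases contain that commit, from a lustre-release clone (branch name assumed):

            $ git tag --contains 71f409c9b31b90fa432f1f46ad4e612fb65c7fcc   # release tags that include the fix
            $ git log --oneline b2_10 | grep 'LU-11418 llog'                # does the 2.10 maintenance branch have it?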

            pjones Peter Jones added a comment -

            Nice! Thanks all. sthiell, note that this fix is included in the upcoming 2.12.1.

            sthiell Stephane Thiell added a comment -

            Hi Lai,

            We restarted the servers with the patch this morning and the problem is now gone. Thanks!

            laisiyao Lai Siyao added a comment -

            This looks to be the same issue that was fixed by https://review.whamcloud.com/#/c/33401/; can you apply this patch on all MDSes and try again?
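
            For reference, a rough way to pull that change onto a 2.10 server tree (branch name and build options here are assumptions; adjust for your kernel):

            git clone git://git.whamcloud.com/fs/lustre-release.git
            cd lustre-release
            git checkout b2_10                                    # maintenance branch the 2.10.x servers are built from
            git cherry-pick 71f409c9b31b90fa432f1f46ad4e612fb65c7fcc
            sh autogen.sh && ./configure && make rpms             # server builds may need --with-linux=<kernel source>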

            sthiell Stephane Thiell added a comment -

            Thanks Patrick for this analysis. I see that obj->opo_stale = 1 is set only in osp_invalidate()...

            Because it's not impacting production, just new group creation, we won't fail over the MDT today (new groups can wait a bit). We have some interactive jobs running. But I'll try to find a good time during the weekend to do so. Let me know if you want me to grab more debug info before then.

            pfarrell Patrick Farrell (Inactive) added a comment -

            Stephane,

            Thanks for the more detailed logs.

            Here's the source of that ESTALE:

            00000040:00000001:18.0:1555692294.403870:0:22597:0:(llog_osd.c:322:llog_osd_declare_write_rec()) Process entered
            00000040:00000001:18.0:1555692294.403870:0:22597:0:(llog_osd.c:340:llog_osd_declare_write_rec()) Process leaving (rc=18446744073709551500 : -116 : ffffffffffffff8c)
            00000040:00000001:18.0:1555692294.403871:0:22597:0:(llog.c:960:llog_declare_write_rec()) Process leaving (rc=18446744073709551500 : -116 : ffffffffffffff8c)
            00000040:00000001:18.0:1555692294.403871:0:22597:0:(llog_cat.c:141:llog_cat_new_log()) Process leaving via out (rc=18446744073709551500 : -116 : 0xffffffffffffff8c) 
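
            The rc values above are just the errno printed as an unsigned 64-bit integer: 2^64 - 116 = 18446744073709551500, which is 0xffffffffffffff8c in hex, i.e. rc = -116 = -ESTALE. A quick check on any of the CentOS 7 nodes:

            # python -c 'import errno, os; print errno.ESTALE, os.strerror(errno.ESTALE)'
            116 Stale file handle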

            Looks to be out of osp_md_declare_write:

                     if (dt2osp_obj(dt)->opo_stale)
                            return -ESTALE;

            But I'm not sure of much more. I'm going to ask Lai to take a look at this; it's in the DNE area, as you noted.

             

            As for what would fix this... a failover/failback of MDT1 might do the trick. It kind of looks like there's confusion over the state of an object in memory, and I think that might clear it up.
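
            If you do that manually, it would be roughly the following (device and mount point names here are made up; use whatever your HA setup expects):

            # on the MDS currently serving MDT0001:
            umount /mnt/oak-MDT0001
            # then mount it on the failover partner (or back on the same node for a failback):
            mount -t lustre /dev/mapper/oak-mdt1 /mnt/oak-MDT0001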

            sthiell Stephane Thiell added a comment -

            This is done.
             
            Command issued on client sh-101-60 (10.9.101.60@o2ib4) running 2.12 was:

            [root@sh-101-60 ruthm]# lctl clear
            [root@sh-101-60 ruthm]# lfs mkdir -i 1 .testdir_mdt1
            lfs mkdir: dirstripe error on '.testdir_mdt1': Stale file handle
            lfs setdirstripe: cannot create stripe dir '.testdir_mdt1': Stale file handle
            

            Nothing else was running on this client.

            Client logs attached as sh-101-60.dk.gz

            MDT0 and MDT1 dk logs are attached as oak-md1-s2-MDT0.dk.gz and oak-md1-s1-MDT1.dk.gz.

            sthiell Stephane Thiell added a comment -

            Hi Patrick,

            OK, I will try (should be in an hour, I'm on my way to the office). But it looks like it is repeatable, though only when doing lfs mkdir -i 1 in a parent directory located on MDT0. See below.

            Creating a directory on MDT0 in a parent dir on MDT1 does work:

            [root@oak-rbh01 giocomo]# lfs getdirstripe .
            lmv_stripe_count: 0 lmv_stripe_offset: 1 lmv_hash_type: none
            
            [root@oak-rbh01 giocomo]# lfs mkdir -i 0 .testdir_mdt0
            [root@oak-rbh01 giocomo]# lfs getdirstripe .testdir_mdt0
            lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none
            [root@oak-rbh01 giocomo]# rmdir .testdir_mdt0
            

            But not the other way around:

            [root@oak-rbh01 giocomo]# cd ../ruthm
            [root@oak-rbh01 ruthm]# lfs getdirstripe .
            lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none
            [root@oak-rbh01 ruthm]# lfs mkdir -i 0 .testdir_mdt0
            [root@oak-rbh01 ruthm]# 
            [root@oak-rbh01 ruthm]# lfs mkdir -i 1 .testdir_mdt1
            error on LL_IOC_LMV_SETSTRIPE '.testdir_mdt1' (3): Stale file handle
            error: mkdir: create stripe dir '.testdir_mdt1' failed
            
            pfarrell Patrick Farrell (Inactive) added a comment -

            If it is this repeatable, can you get -1 debug on the client and the server? I know that may be a pain server side, but if possible it would be great.
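
            Something along these lines should do it on the client and on both MDS nodes (the debug buffer size is just a suggestion):

            lctl set_param debug=-1 debug_mb=1024    # full debug mask, larger trace buffer
            lctl clear
            # reproduce from the client:
            lfs mkdir -i 1 .testdir_mdt1
            # then on each node, dump and compress the trace:
            lctl dk > /tmp/$(hostname).dk
            gzip /tmp/$(hostname).dk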


            People

              Assignee: laisiyao Lai Siyao
              Reporter: sthiell Stephane Thiell
              Votes: 0
              Watchers: 4
