
Race condition between lprocfs_exp_setup() and lprocfs_free_per_client_stats() causes LBUG

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.3.0, Lustre 2.1.3
    • Environment: FEFS based on Lustre 1.8.5, RIKEN K-computer

    Description

      We have often seen the LBUG below, even though we have already applied the patch from LU-39 to FEFS.

              LASSERTF(atomic_read(&client_stat->nid_exp_ref_count) == 0,
                       "nid %s:count %d\n", libcfs_nid2str(client_stat->nid),
                       atomic_read(&client_stat->nid_exp_ref_count));
      

      Then, today, I finally found the race condition between lprocfs_exp_setup() and lprocfs_free_per_client_stats(). As far as I can see, the LBUG can occur in both Lustre 1.8 and Lustre 2.x, so I decided to report it here.

      The sequence of events is as follows.

      • The umount command is executed and obd_stopping is set to 1 in class_cleanup().
      • However, some ll_ost_xxx threads that have already passed the if-statement below in target_handle_connect() can go ahead anyway:
      int target_handle_connect(struct ptlrpc_request *req, svc_handler_t handler)
      {
      
              ...
              if (!target || target->obd_stopping || !target->obd_set_up) {
                      LCONSOLE_ERROR_MSG(0x137, "UUID '%s' is not available "
                                         " for connect (%s)\n", str,
                                         !target ? "no target" :
                                         (target->obd_stopping ? "stopping" :
                                         "not set up"));
                      GOTO(out, rc = -ENODEV);
              }
              ...
      
      }
      
      • Those threads then reach the code below in lprocfs_exp_setup():
      int lprocfs_exp_setup(struct obd_export *exp, lnet_nid_t *nid, int *newnid)
      {
              ...
              old_stat = lustre_hash_findadd_unique(obd->obd_nid_stats_hash,
                                                    nid, &new_stat->nid_hash);
              CDEBUG(D_INFO, "Found stats %p for nid %s - ref %d\n",
                     old_stat, libcfs_nid2str(*nid),
                     atomic_read(&new_stat->nid_exp_ref_count));
      
              /* we need to release old stats because lprocfs_exp_cleanup() hasn't
               * been and will never be called. */
              if (exp->exp_nid_stats) {
                      nidstat_putref(exp->exp_nid_stats);
                      exp->exp_nid_stats = NULL;
              }
              ...
      }
      
      • If exp->exp_nid_stats is not NULL, lustre_hash_findadd_unique() has incremented the refcount of old_stat and returned it, so old_stat's refcount is temporarily one higher than expected until nidstat_putref() is called.
      • If, at the same time, the umount is running lprocfs_free_client_stats() and handling the same stats as old_stat above, it hits the LBUG because the nid_stat refcount is one higher than expected.

      You can reproduce the same problem by adding some waiting code into that part of lprocfs_exp_setup() and then running umount; that is actually how I triggered the problem on purpose. A minimal userspace model of the interleaving is sketched below.
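
      The window can be modeled outside the kernel with plain pthreads and C11 atomics. This is only an illustration, not Lustre code: the two threads stand in for lprocfs_exp_setup() and lprocfs_free_per_client_stats(), and the usleep() calls play the role of the waiting code mentioned above to force the bad interleaving.

      /* build: gcc -std=c11 -pthread race_model.c -o race_model */
      #include <assert.h>
      #include <pthread.h>
      #include <stdatomic.h>
      #include <stdio.h>
      #include <unistd.h>

      static atomic_int nid_exp_ref_count;    /* stands in for the nid_stat refcount */

      static void *connect_thread(void *arg)  /* models lprocfs_exp_setup() */
      {
              (void)arg;
              /* lustre_hash_findadd_unique() found an existing stat: ref++ */
              atomic_fetch_add(&nid_exp_ref_count, 1);

              /* window between findadd and nidstat_putref(), widened on purpose */
              usleep(200 * 1000);

              /* nidstat_putref(): ref-- */
              atomic_fetch_sub(&nid_exp_ref_count, 1);
              return NULL;
      }

      static void *umount_thread(void *arg)   /* models lprocfs_free_per_client_stats() */
      {
              (void)arg;
              /* obd_stopping is already set; cleanup proceeds anyway */
              usleep(100 * 1000);

              /* the LASSERTF: fires here because the connect thread still
               * holds its temporary reference */
              assert(atomic_load(&nid_exp_ref_count) == 0);
              return NULL;
      }

      int main(void)
      {
              pthread_t c, u;

              pthread_create(&c, NULL, connect_thread, NULL);
              pthread_create(&u, NULL, umount_thread, NULL);
              pthread_join(c, NULL);
              pthread_join(u, NULL);
              printf("no race observed\n");
              return 0;
      }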

      I'm now writing a patch for this issue and will upload it soon, but it is for 1.8.8.
      I'm really sorry that I don't have a machine for Lustre 2.x X(

      Attachments

        Issue Links

          Activity

            [LU-1432] Race condition between lprocfs_exp_setup() and lprocfs_free_per_client_stats() causes LBUG

            In the "ISSUE Links" section of this ticket, LU-1592 is said to be a duplicate
            of this problem. I do not think that is correct. Can we remove the link?

            jaylan Jay Lan (Inactive) added a comment - In the "ISSUE Links" section of this ticket, LU-1592 is said to be a duplicate of this problem. I do not think that is correct. Can we remove the link?
            pjones Peter Jones added a comment -

            Landed for 2.1.3 and 2.3


            jaylan Jay Lan (Inactive) added a comment -

            Can you officially land the above patch? We have been running that patch in our production systems for ~2 weeks. I would like to see it landed. BTW, a port to the b2_1 branch would be appreciated.
            laisiyao Lai Siyao added a comment -

            Patch for master is at: http://review.whamcloud.com/#change,3244

            nozaki Hiroya Nozaki (Inactive) added a comment -

            Ah, the above comment was my mistake. I forgot to add one put. I'm really sorry.
            laisiyao Lai Siyao added a comment -

            Could you upload full log? I need to know where the export refcount is taken.


            nozaki Hiroya Nozaki (Inactive) added a comment -

            Hi, Lai.
            I've almost finished implementing what you said, but it seems that one extra refcount is being taken somewhere.
            Whenever umount runs, I can see the messages below:

            Jun 21 11:22:54.163835 rx200-040 [ERR.] FEFS 137-5: UUID 'fefs-OST0000_UUID' is not available  for connect (stopping)
            Jun 21 11:22:54.163846 rx200-040 [ERR.] FEFS 4485:0:(ldlm_lib.c:2094:target_send_reply_msg()) @@@ processing error (-19)  req@ffff810c37d53400 x1405183678491029/t0 o8-><?>@<?>:0/0 lens 368/0 e 0 st 0 to 0 dl 1340245474 ot 0 ref 1 fl Interpret:/0/0 rc -19/0
            

            I took a debug log for one of the export objects of fefs-OST0000_UUID.
            As you can see, there is no final put message anywhere. That is why I suspected class_export_get(), but I no longer think it is wrong, because your explanation was very reasonable. So now I should track down the one extra refcount get, right? If you spot anything odd in the log below at a glance, please let me know.

            thank you.

            88016:00000020:00000040:2:1340264041.083772:0:6336:0:(obd_config.c:1459:uuid_export_get()) GETting export ffff81063d211400 : new refcount 3
            88050:00002000:00000040:2:1340264041.083815:0:6336:0:(filter.c:1058:filter_init_server_data()) PUTting export ffff81063d211400 : new refcount 2
            90103:00000020:00000040:10:1340264047.516715:0:6078:0:(obd_config.c:1459:uuid_export_get()) GETting export ffff81063d211400 : new refcount 3
            90104:00000020:00000001:10:1340264047.516716:0:6078:0:(obd_config.c:1461:uuid_export_get()) Process leaving (rc=18446604462528205824 : -139611181345792 : ffff81063d211400)
            90105:00000001:00000001:10:1340264047.516719:0:6078:0:(class_hash.c:467:lustre_hash_lookup()) Process leaving (rc=18446604462528205824 : -139611181345792 : ffff81063d211400)
            90107:00010000:00080000:10:1340264047.516723:0:6078:0:(ldlm_lib.c:623:target_handle_reconnect()) connect export for UUID '5daed9bc-4c53-eeb7-ef59-cad4e3ca5ef7' at ffff81063d211400, cookie 0x1c8c621b54ad24a8
            90115:00002000:00100000:10:1340264047.516741:0:6078:0:(filter.c:2444:filter_connect_internal()) fefs-OST0000: cli 5daed9bc-4c53-eeb7-ef59-cad4e3ca5ef7/ffff81063d211400 ocd_connect_flags: 0xe3440478 ocd_version: 1080500 ocd_grant: 2097152 ocd_index: 0
            90119:00002000:00000020:10:1340264047.516753:0:6078:0:(filter_io.c:201:filter_grant_space_left()) fefs-OST0000: cli 5daed9bc-4c53-eeb7-ef59-cad4e3ca5ef7/ffff81063d211400 free: 17254846464 avail: 17254846464 grant 0 left: 17221083136 pending: 0
            90120:00002000:00000020:10:1340264047.516757:0:6078:0:(filter_io.c:261:filter_grant()) fefs-OST0000: cli 5daed9bc-4c53-eeb7-ef59-cad4e3ca5ef7/ffff81063d211400 wants: 2097152 current grant 0 granting: 2097152
            90121:00002000:00000020:10:1340264047.516760:0:6078:0:(filter_io.c:266:filter_grant()) fefs-OST0000: cli 5daed9bc-4c53-eeb7-ef59-cad4e3ca5ef7/ffff81063d211400 tot cached:0 granted:2097152 num_exports: 3
            90122:00002000:00000020:10:1340264047.516765:0:6078:0:(filter.c:2470:filter_connect_internal()) fefs-OST0000: cli 5daed9bc-4c53-eeb7-ef59-cad4e3ca5ef7/ffff81063d211400 ocd_grant: 2097152 want: 2097152 left: 17221083136
            90610:00010000:00000040:2:1340264056.529876:0:6078:0:(ldlm_lib.c:1032:target_handle_connect()) GETting export ffff81063d211400 : new refcount 4
            90617:00000020:00000040:2:1340264056.529886:0:6078:0:(obd_config.c:1526:nid_export_get()) GETting export ffff81063d211400 : new refcount 5
            90618:00000020:00000001:2:1340264056.529887:0:6078:0:(obd_config.c:1528:nid_export_get()) Process leaving (rc=18446604462528205824 : -139611181345792 : ffff81063d211400)
            90693:00000100:00000040:2:1340264056.530005:0:6078:0:(service.c:638:ptlrpc_server_drop_request()) PUTting export ffff81063d211400 : new refcount 4
            91015:00000020:00000080:13:1340264061.174390:0:6374:0:(obd_config.c:440:dump_exports()) fefs-OST0000: ffff81063d211400 5daed9bc-4c53-eeb7-ef59-cad4e3ca5ef7 192.168.128.38@o2ib 4 0 0: 0000000000000000
            91021:00000020:00000040:13:1340264061.174402:0:6374:0:(genops.c:1223:class_disconnect_export_list()) GETting export ffff81063d211400 : new refcount 5
            91022:00000020:00000040:13:1340264061.174404:0:6374:0:(genops.c:1241:class_disconnect_export_list()) GETting export ffff81063d211400 : new refcount 6
            91023:00000020:00080000:13:1340264061.174406:0:6374:0:(genops.c:1245:class_disconnect_export_list()) fefs-OST0000: disconnecting export at 192.168.128.38@o2ib (ffff81063d211400), last request at 1340264041
            91026:00002000:00000040:13:1340264061.174412:0:6374:0:(filter.c:2772:filter_disconnect()) GETting export ffff81063d211400 : new refcount 7
            91048:00000020:00000040:13:1340264061.174452:0:6374:0:(obd_config.c:1537:nid_export_put()) PUTting export ffff81063d211400 : new refcount 6
            91049:00000020:00000001:13:1340264061.174454:0:6374:0:(obd_config.c:1539:nid_export_put()) Process leaving (rc=18446604462528205824 : -139611181345792 : ffff81063d211400)
            91050:00000001:00000001:13:1340264061.174457:0:6374:0:(class_hash.c:365:lustre_hash_del()) Process leaving (rc=18446604462528205824 : -139611181345792 : ffff81063d211400)
            91051:00000020:00000040:13:1340264061.174460:0:6374:0:(lustre_handles.c:143:class_handle_unhash_nolock()) removing object ffff81063d211400 with handle 0x1c8c621b54ad24a8 from hash
            91053:00000020:00000040:13:1340264061.174464:0:6374:0:(obd_config.c:1470:uuid_export_put()) PUTting export ffff81063d211400 : new refcount 5
            91054:00000020:00000001:13:1340264061.174465:0:6374:0:(obd_config.c:1472:uuid_export_put()) Process leaving (rc=18446604462528205824 : -139611181345792 : ffff81063d211400)
            91055:00000001:00000001:13:1340264061.174468:0:6374:0:(class_hash.c:365:lustre_hash_del()) Process leaving (rc=18446604462528205824 : -139611181345792 : ffff81063d211400)
            91056:00000020:00000040:13:1340264061.174471:0:6374:0:(genops.c:976:class_unlink_export()) PUTting export ffff81063d211400 : new refcount 4
            91057:00000020:00000040:13:1340264061.174473:0:6374:0:(genops.c:1193:class_disconnect()) PUTting export ffff81063d211400 : new refcount 3
            91067:00002000:00000040:13:1340264061.174489:0:6374:0:(filter.c:2797:filter_disconnect()) PUTting export ffff81063d211400 : new refcount 2
            91070:00000020:00080000:13:1340264061.174494:0:6374:0:(genops.c:1250:class_disconnect_export_list()) disconnected export at 192.168.128.38@o2ib (ffff81063d211400): rc 0
            91071:00000020:00000040:13:1340264061.174496:0:6374:0:(genops.c:1251:class_disconnect_export_list()) PUTting export ffff81063d211400 : new refcount 1
            

            nozaki Hiroya Nozaki (Inactive) added a comment -

            I uploaded the patch just now. I know it may not be enough, but I'm happy if it helps you consider the issue.
            laisiyao Lai Siyao added a comment -

            Hmm, one possible fix for this issue is to serialize target obd device cleanup and connect operation. I'll consider it.

            As for the request not taking an export refcount in 2.x, it happens to be correct:

            • mdt_obd_connect() takes one extra export refcount.

            But fixing it the way I do is cleaner; I will fix it in 2.x later.


            nozaki Hiroya Nozaki (Inactive) added a comment -

            I added a new member, obd_conn_inprogress, to obd_device to count the number of threads that are connecting in target_handle_connect(), and changed the code as in the pseudo code below (a userspace model of the same idea follows it). So far it looks to be working well, so I'll upload the patch soon.

            target_handle_connect() {
                    ...
                    atomic_inc(&obd->obd_conn_inprogress);
                    /* do connect */
                    atomic_dec(&obd->obd_conn_inprogress);
                    ...
            }

            class_cleanup() {
                    ...
                    /* wait for in-flight connects to drain before tearing down */
                    while (atomic_read(&obd->obd_conn_inprogress) > 0) {
                            cfs_cond_resched();
                    }
                    lustre_hash_exit(obd->obd_nid_hash);
                    ...
            }
            
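            The same userspace model as in the description, with the obd_conn_inprogress counter applied, shows how the cleanup side now drains in-flight connects before it checks the refcount. Again, this is only a sketch of the idea, not the actual patch; sched_yield() plays the role of cfs_cond_resched().

            /* build: gcc -std=c11 -pthread fix_model.c -o fix_model */
            #include <assert.h>
            #include <pthread.h>
            #include <sched.h>
            #include <stdatomic.h>
            #include <stdio.h>
            #include <unistd.h>

            static atomic_int nid_exp_ref_count;
            static atomic_int obd_conn_inprogress;  /* the new counter */

            static void *connect_thread(void *arg)  /* models target_handle_connect() */
            {
                    (void)arg;
                    atomic_fetch_add(&obd_conn_inprogress, 1);

                    /* the same temporary nid_stat reference as before */
                    atomic_fetch_add(&nid_exp_ref_count, 1);
                    usleep(200 * 1000);
                    atomic_fetch_sub(&nid_exp_ref_count, 1);

                    atomic_fetch_sub(&obd_conn_inprogress, 1);
                    return NULL;
            }

            static void *umount_thread(void *arg)   /* models class_cleanup() */
            {
                    (void)arg;
                    usleep(100 * 1000);

                    /* wait until no connect is in progress ... */
                    while (atomic_load(&obd_conn_inprogress) > 0)
                            sched_yield();

                    /* ... so the refcount check no longer races with connect */
                    assert(atomic_load(&nid_exp_ref_count) == 0);
                    return NULL;
            }

            int main(void)
            {
                    pthread_t c, u;

                    pthread_create(&c, NULL, connect_thread, NULL);
                    pthread_create(&u, NULL, umount_thread, NULL);
                    pthread_join(c, NULL);
                    pthread_join(u, NULL);
                    printf("cleanup waited for the in-flight connect\n");
                    return 0;
            }

            One detail the model glosses over: in the real code the ordering of the counter increment against the obd_stopping check matters, because a connect that takes the counter only after class_cleanup() has finished waiting must still be rejected by the obd_stopping test.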

            People

              Assignee: laisiyao Lai Siyao
              Reporter: nozaki Hiroya Nozaki (Inactive)
              Votes: 0
              Watchers: 6
