Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-8397

take comma as separator of mgsnode's list

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.10.0
    • None
    • None
    • 3
    • 9223372036854775807

    Description

      mgsnode's list of nids is allowed to use comma as separator (see 36.15.3 of Lustre Operations Manual), however lustre_start_mgc() assumes that only colon can separate nids. Due to that nids after comma do not get into list of import's connections and those nids do not get involved.

      Attachments

        Issue Links

          Activity

            [LU-8397] take comma as separator of mgsnode's list

            I'm sorry - it turns out I was wrong when I said reverting this patch fixed our situation.  I think what happened is that I forgot to stop lnet before I uninstalled stock 2.9.0.  I then installed and tested the patched version of 2.9.0 - but the lnet kernel modules from stock 2.9.0 stayed loaded.  So I was using some weird combination of a patched and stock version of 2.9.0.  I've repeated the tests using a version of 2.9.0 with patch 7905 reverted and the situation is actually worse - with both single rail (tcp only) or dual rail (tcp and ib) I only see one entry for failover_nids in /proc/fs/lustre/mgc/MGC192.52.98.30@tcp/import.  I can repeat the tests and dump debug logs if that would help.  

            dvicker Darby Vicker added a comment - I'm sorry - it turns out I was wrong when I said reverting this patch fixed our situation.  I think what happened is that I forgot to stop lnet before I uninstalled stock 2.9.0.  I then installed and tested the patched version of 2.9.0 - but the lnet kernel modules from stock 2.9.0 stayed loaded.  So I was using some weird combination of a patched and stock version of 2.9.0.  I've repeated the tests using a version of 2.9.0 with patch 7905 reverted and the situation is actually worse - with both single rail (tcp only) or dual rail (tcp and ib) I only see one entry for failover_nids in /proc/fs/lustre/mgc/MGC192.52.98.30@tcp/import.  I can repeat the tests and dump debug logs if that would help.  
            yujian Jian Yu added a comment -

            Thank you very much Darby for the confirmation.

            yujian Jian Yu added a comment - Thank you very much Darby for the confirmation.
            dvicker Darby Vicker added a comment -

            This evening I compiled the 2.9.0 release with patch 7905 reverted.  I can confirm that this did allow me to failover our OST's properly.  Once the OST were mounted with the NIDS shown in the comment above we see both TCP NIDS in /proc/fs/lustre/mgc/MGC192.52.98.30@tcp/import and the MGC connects with the secondary MDS/MGS when we failover.  

            dvicker Darby Vicker added a comment - This evening I compiled the 2.9.0 release with patch 7905 reverted.  I can confirm that this did allow me to failover our OST's properly.  Once the OST were mounted with the NIDS shown in the comment above we see both TCP NIDS in /proc/fs/lustre/mgc/MGC192.52.98.30@tcp/import and the MGC connects with the secondary MDS/MGS when we failover.  
            yujian Jian Yu added a comment -

            Hi Bobi,

            I found the following codes in lustre_start_mgc() introduced by patch https://review.whamcloud.com/7509 for LU-3829 might have some issue:

                                    /*
                                     * LU-3829.
                                     * Here we only take the first mgsnid as its primary
                                     * serving mgs node, the rest mgsnid will be taken as
                                     * failover mgs node, otherwise they would be takens
                                     * as multiple nids of a single mgs node.
                                     */
                                    while (class_parse_nid(ptr, &nid, &ptr) == 0) {
                                            rc = do_lcfg(mgcname, nid, LCFG_ADD_UUID,
                                                         niduuid, NULL, NULL, NULL);
                                            if (rc == 0) {
                                                    i = 1;
                                                    break;
                                            }
                                    }
            
            

            For multiple NIDs separated by commas on one MGS node, the above codes would only add the first NID and then break the while loop. After lustre_start_simple() is performed, the remaining comma-separated MGS NID(s) would not be added because the following condition doesn't work:

                    /* Add any failover MGS nids */
                    i = 1;
                    while (ptr && ((*ptr == ':' ||
                           class_find_param(ptr, PARAM_MGSNODE, &ptr) == 0))) {
            
            

            ptr is previously assigned with ptr = lsi->lsi_lmd->lmd_mgs, and for comma-separated MGS NIDs, ptr now points to a ','. There is also no mgsnode= in the string, so the above condition does not work. All of the NIDs after the first comma would not be added.

            It seems this is the root cause of the latest issue reported by Darby in LU-8311. The ptr has a value of 192.52.98.30@tcp,10.148.0.30@o2ib:192.52.98.31@tcp,10.148.0.31@o2ib, but from the debug log in debug.log.ib_and_eth, I can only see the first NID 192.52.98.30@tcp was added.

            yujian Jian Yu added a comment - Hi Bobi, I found the following codes in lustre_start_mgc() introduced by patch https://review.whamcloud.com/7509 for LU-3829 might have some issue: /* * LU-3829. * Here we only take the first mgsnid as its primary * serving mgs node, the rest mgsnid will be taken as * failover mgs node, otherwise they would be takens * as multiple nids of a single mgs node. */ while (class_parse_nid(ptr, &nid, &ptr) == 0) { rc = do_lcfg(mgcname, nid, LCFG_ADD_UUID, niduuid, NULL, NULL, NULL); if (rc == 0) { i = 1; break ; } } For multiple NIDs separated by commas on one MGS node, the above codes would only add the first NID and then break the while loop. After lustre_start_simple() is performed, the remaining comma-separated MGS NID(s) would not be added because the following condition doesn't work: /* Add any failover MGS nids */ i = 1; while (ptr && ((*ptr == ':' || class_find_param(ptr, PARAM_MGSNODE, &ptr) == 0))) { ptr is previously assigned with ptr = lsi->lsi_lmd->lmd_mgs , and for comma-separated MGS NIDs, ptr now points to a ','. There is also no mgsnode= in the string, so the above condition does not work. All of the NIDs after the first comma would not be added. It seems this is the root cause of the latest issue reported by Darby in LU-8311 . The ptr has a value of 192.52.98.30@tcp,10.148.0.30@o2ib:192.52.98.31@tcp,10.148.0.31@o2ib , but from the debug log in debug.log.ib_and_eth, I can only see the first NID 192.52.98.30@tcp was added.
            pjones Peter Jones added a comment -

            Emoly

            Could you please take care of reviews etc for this patch?

            Thanks

            Peter

            pjones Peter Jones added a comment - Emoly Could you please take care of reviews etc for this patch? Thanks Peter

            Vladimir Saveliev (vladimir_saveliev@xyratex.com) uploaded a new patch: http://review.whamcloud.com/21308
            Subject: LU-8397 mgc: take comma as separator of mgsnode's list
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: d967d2d09409aaaf036abe3001457260aff15292

            gerrit Gerrit Updater added a comment - Vladimir Saveliev (vladimir_saveliev@xyratex.com) uploaded a new patch: http://review.whamcloud.com/21308 Subject: LU-8397 mgc: take comma as separator of mgsnode's list Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: d967d2d09409aaaf036abe3001457260aff15292

            People

              emoly.liu Emoly Liu
              vsaveliev Vladimir Saveliev
              Votes:
              0 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: