MGS: non-config logname received: params

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version: Lustre 2.5.4
    • Affects Version: Lustre 2.5.3
    • 3
    • 16260

    Description

      We are in the process of testing 2.5.3 plus our local patch stack. The current development tag is 2.5.3-0.13morrone (see github.com/chaos/lustre).

      Our MGS is printing the following new message to the console:

      Oct 21 15:23:43 zwicky-lcy-mds1 kernel: Lustre: lcy-OST0009-osc-MDT0000: Connection to lcy-OST0009 (at 10.1.1.180@o2ib9) was lost; in progress operations using this service will wait for recovery to complete
      Oct 21 15:24:09 zwicky-lcy-mds1 kernel: Lustre: MGS: non-config logname received: params
      Oct 21 15:24:09 zwicky-lcy-mds1 kernel: Lustre: Skipped 11 previous similar messages
      

      The "non-config logname received" message is the one that needs to be addressed.

      It appears to be correlated with OST startup.

          Activity

            [LU-5796] MGS: non-config logname received: params
            yujian Jian Yu added a comment -

            Patches were merged into Lustre b2_5 branch for 2.5.4 release.

            yujian Jian Yu added a comment -

            Here are the back-ported patches for Lustre b2_5 branch:
            http://review.whamcloud.com/12427
            http://review.whamcloud.com/12428


            morrone Christopher Morrone (Inactive) added a comment -

            Yes, the combination of http://review.whamcloud.com/10311 and http://review.whamcloud.com/10589 on b2_5 seems to have eliminated the "non-config logname" messages.

            yujian Jian Yu added a comment -

            Thank you very much, Andreas!
            This was fixed in LU-2059. I'll back-port http://review.whamcloud.com/10311 and http://review.whamcloud.com/10589 to Lustre b2_5 branch.


            adilger Andreas Dilger added a comment -

            Yu Jian, I think this bug was also fixed in master. Please check the git commit logs, "git blame", and/or Jira for a duplicate, and backport the fix to b2_5.

            yujian Jian Yu added a comment -

            The "non-config logname received" warning message was printed from mgs_llog_open() in lustre/mgs/mgs_handler.c:

                    logname = req_capsule_client_get(tsi->tsi_pill, &RMF_NAME);
                    if (logname) {
                            char *ptr = strchr(logname, '-');
                            int   len = (int)(ptr - logname);
            
                            if (ptr == NULL || len >= sizeof(mgi->mgi_fsname)) {
                                    LCONSOLE_WARN("%s: non-config logname received: %s\n",
                                                  tgt_name(tsi->tsi_tgt), logname);
                                    /* not error, this can be llog test name */
                            } else {
                                    //......
                            }
                    }
            

            This code was introduced by the following commit on the Lustre b2_5 branch:

            commit 93a6346f8b73f68cb5bc02a3c826ac0e5b4c236e
            Author: Mikhail Pershin <tappro@whamcloud.com>
            Date:   Thu Dec 13 22:07:52 2012 +0400
            
                LU-2145 server: use unified request handler for MGS
            

            I'll look into the code.
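
To make the condition quoted above concrete, here is a minimal standalone sketch of the same test: a log name is treated as a config log name only if it contains a '-' and the part before the dash fits in the fsname buffer. The helper name `is_config_logname` and the constant `FSNAME_BUF_SIZE` are hypothetical; the real size of `mgi->mgi_fsname` is not shown in this thread, so the value below is only a placeholder. Unlike the quoted snippet, the sketch checks for a missing dash before doing the pointer subtraction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder for sizeof(mgi->mgi_fsname); the real value is an assumption. */
#define FSNAME_BUF_SIZE 64

/*
 * Returns 1 when logname has the form "<fsname>-<suffix>" and the fsname
 * part is shorter than the buffer, 0 otherwise. A return value of 0
 * corresponds to the branch that prints "non-config logname received".
 */
static int is_config_logname(const char *logname)
{
        const char *dash;

        assert(logname != NULL);

        dash = strchr(logname, '-');
        if (dash == NULL)
                return 0;       /* no '-' at all, e.g. "params" */

        return (size_t)(dash - logname) < FSNAME_BUF_SIZE;
}
```

Under these assumptions, `is_config_logname("params")` returns 0 because the name contains no dash, which matches the console warning, while a name shaped like `lcy-MDT0000` returns 1.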


            morrone Christopher Morrone (Inactive) added a comment -

            The MGS message is not unique to OST connections. Any connection at all seems to trigger it. For instance, I just rebooted a bunch of client nodes to make them 2.5.3-1chaos based, and saw the messages. Here is a snippet from the console:

            Oct 23 18:56:20 zwicky-lcy-oss7 kernel: Lustre: lcy-OST0006: haven't heard from client b25f461c-463d-a2e2-24f2-54c135569e7c (at 192.168.121.132@o2ib2) in 232 seconds. I think it's dead, and I am evicting it. exp ffff8808206e6c00, cur 1414115780 expire 1414115630 last 1414115548
            Oct 23 18:56:20 zwicky-lcy-oss5 kernel: Lustre: lcy-OST0004: haven't heard from client b25f461c-463d-a2e2-24f2-54c135569e7c (at 192.168.121.132@o2ib2) in 232 seconds. I think it's dead, and I am evicting it. exp ffff8810304e6c00, cur 1414115780 expire 1414115630 last 1414115548
            Oct 23 18:56:20 zwicky-lcy-oss15 kernel: Lustre: lcy-OST000e: haven't heard from client b25f461c-463d-a2e2-24f2-54c135569e7c (at 192.168.121.132@o2ib2) in 232 seconds. I think it's dead, and I am evicting it. exp ffff8807fb866000, cur 1414115780 expire 1414115630 last 1414115548
            Oct 23 18:56:20 zwicky-lcy-oss1 kernel: Lustre: lcy-OST0000: haven't heard from client b25f461c-463d-a2e2-24f2-54c135569e7c (at 192.168.121.132@o2ib2) in 232 seconds. I think it's dead, and I am evicting it. exp ffff881013c25c00, cur 1414115780 expire 1414115630 last 1414115548
            Oct 23 18:56:20 zwicky-lcy-oss13 kernel: Lustre: lcy-OST000c: haven't heard from client b25f461c-463d-a2e2-24f2-54c135569e7c (at 192.168.121.132@o2ib2) in 232 seconds. I think it's dead, and I am evicting it. exp ffff880819aa7400, cur 1414115780 expire 1414115630 last 1414115548
            Oct 23 18:56:20 zwicky-lcy-oss10 kernel: Lustre: lcy-OST0009: haven't heard from client b25f461c-463d-a2e2-24f2-54c135569e7c (at 192.168.121.132@o2ib2) in 232 seconds. I think it's dead, and I am evicting it. exp ffff8810322dc000, cur 1414115780 expire 1414115630 last 1414115548
            Oct 23 18:56:20 zwicky-lcy-oss12 kernel: Lustre: lcy-OST000b: haven't heard from client b25f461c-463d-a2e2-24f2-54c135569e7c (at 192.168.121.132@o2ib2) in 232 seconds. I think it's dead, and I am evicting it. exp ffff88102b5b1800, cur 1414115780 expire 1414115630 last 1414115548
            Oct 23 18:56:21 zwicky-lcy-oss3 kernel: Lustre: lcy-OST0002: haven't heard from client b25f461c-463d-a2e2-24f2-54c135569e7c (at 192.168.121.132@o2ib2) in 233 seconds. I think it's dead, and I am evicting it. exp ffff88080aea4000, cur 1414115781 expire 1414115631 last 1414115548
            Oct 23 18:56:25 zwicky-lcy-mds1 kernel: Lustre: lcy-MDT0000: haven't heard from client b25f461c-463d-a2e2-24f2-54c135569e7c (at 192.168.121.132@o2ib2) in 237 seconds. I think it's dead, and I am evicting it. exp ffff88100d2fe400, cur 1414115785 expire 1414115635 last 1414115548
            Oct 23 19:00:23 zwicky-lcy-mds1 kernel: Lustre: MGS: non-config logname received: params
            Oct 23 19:00:23 zwicky-lcy-mds1 kernel: Lustre: Skipped 3 previous similar messages
            Oct 23 19:00:25 zwicky-lcy-mds1 kernel: Lustre: MGS: non-config logname received: params
            Oct 23 19:00:25 zwicky-lcy-mds1 kernel: Lustre: Skipped 32 previous similar messages
            Oct 23 19:00:27 zwicky-lcy-mds1 kernel: Lustre: MGS: non-config logname received: params
            Oct 23 19:00:27 zwicky-lcy-mds1 kernel: Lustre: Skipped 25 previous similar messages
            Oct 23 19:00:32 zwicky-lcy-mds1 kernel: Lustre: MGS: non-config logname received: params
            Oct 23 19:00:32 zwicky-lcy-mds1 kernel: Lustre: Skipped 42 previous similar messages

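
Because the console rate-limits repeated messages, the number of printed warning lines understates how often the MGS hit this path. The following is a rough sketch of tallying the real count from captured console text. The function name `count_warnings` is hypothetical, and the sketch assumes that every "Skipped N previous similar messages" line in the input refers to the "non-config logname received" warning, which holds for the excerpt above but may not hold for arbitrary logs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Counts occurrences of the warning in console text. Each printed warning
 * line counts once, and each "Skipped N previous similar messages" line
 * adds N (assumption: all such lines refer to this warning).
 */
static long count_warnings(const char *text)
{
        static const char warn[] = "non-config logname received";
        static const char skip[] = "Lustre: Skipped ";
        long total = 0;
        const char *p;

        assert(text != NULL);

        /* One hit per printed warning line. */
        for (p = text; (p = strstr(p, warn)) != NULL; p += sizeof(warn) - 1)
                total++;

        /* Add the suppressed count reported by each "Skipped" line. */
        for (p = text; (p = strstr(p, skip)) != NULL; p += sizeof(skip) - 1) {
                long n = 0;

                if (sscanf(p + sizeof(skip) - 1, "%ld", &n) == 1)
                        total += n;
        }

        return total;
}
```

Applied to the eight MGS lines in the excerpt above, this adds 4 printed warnings to 3 + 32 + 25 + 42 suppressed ones, for 106 occurrences within about ten seconds.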
            morrone Christopher Morrone (Inactive) added a comment - edited

            mount -t lustre -v zwicky-lcy-oss16/lcy-ost0 /mnt/lustre/local/lcy-OST000f
            arg[0] = /sbin/mount.lustre
            arg[1] = -v
            arg[2] = -o
            arg[3] = rw
            arg[4] = zwicky-lcy-oss16/lcy-ost0
            arg[5] = /mnt/lustre/local/lcy-OST000f
            source = zwicky-lcy-oss16/lcy-ost0 (zwicky-lcy-oss16/lcy-ost0), target = /mnt/lustre/local/lcy-OST000f
            options = rw
            checking for existing Lustre data: found
            mounting device zwicky-lcy-oss16/lcy-ost0 at /mnt/lustre/local/lcy-OST000f, flags=0x1000000 options=osd=osd-zfs,,mgsnode=10.1.1.169@o2ib9,param=failover.node=10.1.1.185@o2ib9,param=mgsnode=10.1.1.169@o2ib9,svname=lcy-OST000f,device=zwicky-lcy-oss16/lcy-ost0

            yujian Jian Yu added a comment -

            Hi Chris,

            Could you please mount the OST with "-v" option like "mount -v -t lustre /dev/xxx /mnt/xxx" and show the output here?
            I'll debug the issue by looking into the "options=" line.

            Thank you!

            pjones Peter Jones added a comment -

            Yu, Jian

            Could you please help with this one?

            Thanks

            Peter


            People

              Assignee: yujian Jian Yu
              Reporter: morrone Christopher Morrone (Inactive)