
Remove "Not available for connect" messages

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.4.0
    • Lustre 2.4.0
    • 5543

    Description

      Do these messages provide any real benefit?

      2012-11-13 09:54:49 LustreError: 137-5: lstest-MDT0000: Not available for connect from 172.20.4.132@o2ib500 (not set up)
      2012-11-13 09:54:49 LustreError: 137-5: lstest-MDT0000: Not available for connect from 172.20.17.141@o2ib500 (not set up)
      2012-11-13 09:54:50 LustreError: 137-5: lstest-MDT0000: Not available for connect from 172.20.17.65@o2ib500 (not set up)
      2012-11-13 09:54:50 LustreError: Skipped 4 previous similar messages
      2012-11-13 09:54:51 LustreError: 137-5: lstest-MDT0000: Not available for connect from 172.20.3.112@o2ib500 (not set up)
      2012-11-13 09:54:51 LustreError: Skipped 12 previous similar messages
      

      At first glance, it looks like a peer is trying to connect before the target is fully initialized. Why do we need to print this to the console?

          Activity

            [LU-2319] Remove "Not available for connect" messages

            adilger Andreas Dilger added a comment - This was fixed in patch http://review.whamcloud.com/6264 ("LU-1095 debug: quiet noisy console error messages") to use LCONSOLE_INFO() instead of LCONSOLE_ERROR_MSG().

            If it continues to be a problem, it could use the new CERROR_SLOW() macro from patch https://review.whamcloud.com/55439 ("LU-17432 libcfs: new CDEBUG_SLOW message type"), which does not print the error message the first few times it is hit.
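The "stay quiet for the first few hits" idea mentioned above can be sketched in plain user-space C. This is not the CERROR_SLOW() implementation from LU-17432; the struct, the function names, and the threshold are all hypothetical and only illustrate the general technique of suppressing a message until a condition has persisted.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-call-site state; not a Lustre structure. */
struct slow_msg_state {
	unsigned int hits;      /* suppressed hits so far */
	unsigned int threshold; /* hits to swallow before printing */
};

/* Return true only after the first 'threshold' hits have been swallowed. */
static bool slow_msg_should_print(struct slow_msg_state *st)
{
	if (st->hits < st->threshold) {
		st->hits++;
		return false;
	}
	return true;
}

/* Illustrative caller: report a refused connect only if it keeps happening. */
static bool report_not_available(struct slow_msg_state *st,
				 const char *target, const char *peer)
{
	if (!slow_msg_should_print(st))
		return false;

	fprintf(stderr, "%s: Not available for connect from %s\n",
		target, peer);
	return true;
}
```

With a threshold of 3, the first three refused connects are silent and the fourth is reported, so a brief window during target start-up would not reach the console while a persistent problem still would.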

            bzzz Alex Zhuravlev added a comment - Andreas, would you agree to turn this into CDEBUG()?

            adilger Andreas Dilger added a comment - Sorry, I was thinking of some other message then.

            prakash Prakash Surya (Inactive) added a comment - Will these messages really be shown if the target was misconfigured? From what I can tell, it's coming from here:

                    if (target->obd_stopping || !target->obd_set_up) {
                            cfs_spin_unlock(&target->obd_dev_lock);

                            deuuidify(str, NULL, &target_start, &target_len);
                            LCONSOLE_ERROR_MSG(0x137, "%.*s: Not available for connect "
                                               "from %s (%s)\n", target_len, target_start,
                                               libcfs_nid2str(req->rq_peer.nid),
                                               (target->obd_stopping ?
                                               "stopping" : "not set up"));
                            GOTO(out, rc = -ENODEV);
                    }

            So at first glance, I don't see how a configuration error would cause it.

            And the case where a client is trying to connect to a server but the server is not set up is exactly the case this message is for. Why is that important enough to make it to the console? I'd argue it should be handled silently, since that's normal during setup and teardown.
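To make the "handle it silently" suggestion concrete, here is a minimal user-space sketch of the same check with the console output made optional. It is not Lustre code: struct target_state, unavailable_reason(), check_connect(), and the quiet flag are hypothetical stand-ins for the obd_stopping / obd_set_up test quoted above, and fprintf() stands in for the console macro.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the two target flags tested in the excerpt. */
struct target_state {
	bool stopping;
	bool set_up;
};

/* Return the reason the target cannot accept connects, or NULL if it can. */
static const char *unavailable_reason(const struct target_state *t)
{
	if (t->stopping)
		return "stopping";
	if (!t->set_up)
		return "not set up";
	return NULL;
}

/*
 * Refuse the connect with -ENODEV when the target is unavailable.
 * When 'quiet' is true the refusal is not printed, which models
 * treating setup and teardown as normal, non-console events.
 */
static int check_connect(const struct target_state *t, const char *name,
			 const char *peer, bool quiet)
{
	const char *reason = unavailable_reason(t);

	if (reason == NULL)
		return 0;

	if (!quiet)
		fprintf(stderr, "%s: Not available for connect from %s (%s)\n",
			name, peer, reason);
	return -ENODEV;
}
```

The return code is the same either way, so the peer still sees the connect fail; only the console visibility changes.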

            adilger Andreas Dilger added a comment - Unlike some of the other messages, this one is potentially quite important if the client is trying to connect to a server but the server is not set up or the client is configured incorrectly.
            pjones Peter Jones added a comment -

            Alex

            Can you please triage and assign this one?

            Thanks

            Peter


            People

              bzzz Alex Zhuravlev
              prakash Prakash Surya (Inactive)