Use sysfs to fix up sptlrpc handling.

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • Lustre 2.12.0
    • None
    • Lustre with sptlrpc/gss enabled.
    • 3
    • 9223372036854775807

    Description

      Several problems exist in the GSS / sptlrpc code:

      1) lctl set_param -P does not work with sptlrpc (LU-7183).

      2) Specific MGS bindings for sptlrpc are broken (LU-9034 / LU-9086 / LU-9567).

          With the move to kobjects with sysfs, we could use the kobject instead.

      3) After the move to sysfs, we can use udev events instead of polling the proc files, as is currently done in, for example, svcgssd.
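
To illustrate the event-driven approach in item 3, here is a minimal sketch of decoding one kernel uevent message, which is a NUL-separated buffer starting with "ACTION@DEVPATH" followed by KEY=VALUE fields. The function name parse_uevent and the sample device path are hypothetical and not taken from Lustre; a real daemon would read these bytes from a NETLINK_KOBJECT_UEVENT socket or through libudev rather than polling.

```python
def parse_uevent(raw: bytes) -> dict:
    """Decode one kernel uevent datagram into a dict of its fields."""
    # Fields are NUL-separated; drop the empty piece after the final NUL.
    parts = [p for p in raw.split(b"\0") if p]
    if not parts or b"@" not in parts[0]:
        raise ValueError("not a kernel uevent message")

    # The first field has the form "<action>@<devpath>".
    action, devpath = parts[0].decode().split("@", 1)
    event = {"ACTION": action, "DEVPATH": devpath}

    # Remaining fields are KEY=VALUE pairs; anything else is ignored.
    for field in parts[1:]:
        key, sep, value = field.decode().partition("=")
        if sep:
            event[key] = value
    return event


if __name__ == "__main__":
    # Hypothetical payload: the path below is illustrative only.
    sample = b"change@/fs/lustre/example\0ACTION=change\0SEQNUM=42\0"
    print(parse_uevent(sample))
```

A daemon built this way would block on the socket and react only when a "change" event arrives, instead of rereading proc files on a timer.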

       

          Activity

            [LU-10937] Use sysfs to fix up sptlrpc handling.

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33760/
            Subject: LU-10937 sptlrpc: split sptlrpc_process_config()
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 0ff7d548eb7bf0c04836c7d6809cac163e0ffc2c

            James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/33760
            Subject: LU-10937 sptlrpc: split sptlrpc_process_config()
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 62c61bb1c4692f1c608047210befd9cb9dedc449

            I started a code review and noticed several potential bugs. These apply only to the lctl conf_param case. The first question: are the targets (the first field, before the first '.') limited to the following types?

            lctl conf_param lustre.srpc.****

            lctl conf_param lustre-OST0000.srpc.***

            lctl conf_param lustre-MDT0000.srpc.***

            for example.

            If that is the case, we can simply replace obdname2fsname() with server_name2fsname(). If server_name2fsname() returns an error, then we know the target is a file system.
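
The idea above can be sketched in user space as follows. This is an illustration of the classification logic only, not the kernel's server_name2fsname(); the function name classify_target and the assumed label shape (fsname, a dash, then MDT or OST and a four-digit hex index) are assumptions made for the example.

```python
import re

# Assumed shape of a server label suffix: "MDT" or "OST" plus a 4-digit hex index.
_LABEL_RE = re.compile(r"(MDT|OST)([0-9a-fA-F]{4})")


def classify_target(target: str):
    """Return (kind, fsname, index) for a conf_param target.

    kind is "mdt" or "ost" for a server target, or "filesystem" when the
    name cannot be split into "<fsname>-<label>".
    """
    # Split on the last dash, so the label is whatever follows it.
    fsname, dash, label = target.rpartition("-")
    match = _LABEL_RE.fullmatch(label)
    if not dash or not fsname or match is None:
        # No server label found: treat the whole string as a file system name.
        return ("filesystem", target, None)
    return (match.group(1).lower(), fsname, int(match.group(2), 16))


if __name__ == "__main__":
    for name in ("lustre", "lustre-OST0000", "lustre-MDT0000"):
        print(name, "->", classify_target(name))
```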

            The next bug I noticed is that if we supply an obd device as a target plus a direction, we don't validate the direction. The following should fail but doesn't:

            lctl conf_param lustre-MDT0000.srpc.flavor.default.cli2ost=skpi
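
A minimal sketch of the missing check might look like this. The four direction names are the ones sptlrpc rules use; the table saying which directions make sense for an MDT or OST target is an assumption for illustration, and check_direction is a hypothetical helper, not an existing Lustre function.

```python
ALL_DIRECTIONS = {"cli2mdt", "cli2ost", "mdt2mdt", "mdt2ost"}

# Assumption: a direction is valid for a target only if that target type
# is one end of the connection.
VALID_FOR_TARGET = {
    "mdt": {"cli2mdt", "mdt2mdt", "mdt2ost"},
    "ost": {"cli2ost", "mdt2ost"},
}


def check_direction(target: str, direction: str) -> None:
    """Raise ValueError if direction is unknown or does not fit the target."""
    if direction not in ALL_DIRECTIONS:
        raise ValueError(f"unknown direction: {direction}")

    if "-MDT" in target:
        kind = "mdt"
    elif "-OST" in target:
        kind = "ost"
    else:
        # File-system-wide rule: any known direction is accepted.
        return

    if direction not in VALID_FOR_TARGET[kind]:
        raise ValueError(f"direction {direction} is not valid for {target}")


if __name__ == "__main__":
    try:
        check_direction("lustre-MDT0000", "cli2ost")
    except ValueError as err:
        print("rejected:", err)
```

With a check of this kind in place, the lustre-MDT0000 plus cli2ost combination would be refused before the rule is stored.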

            Lastly, I noticed that the handling of the supplied network type can easily break. Consider a file system named test with Cray clients, routers in between that convert to o2ib1, and an InfiniBand storage back end. So if you do

            lctl conf_param test.srpc.flavor.o2ib1=skpi

            Does this filter so that only the server back end is updated to skpi? I noticed we don't really test which LNet network interface is in use when setting the rule. Is it valid to do a partial setup in this case?
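
To make the question concrete, the sketch below parses a rule of the form target.srpc.flavor.network[.direction]=flavor and tests whether a peer NID sits on the rule's network. It does not describe what Lustre actually does here, which is exactly what is being asked; SrpcRule, parse_rule, rule_applies_to_nid and the sample NIDs are all hypothetical.

```python
from typing import NamedTuple, Optional


class SrpcRule(NamedTuple):
    target: str
    network: str
    direction: Optional[str]
    flavor: str


def parse_rule(param: str) -> SrpcRule:
    """Split '<target>.srpc.flavor.<network>[.<direction>]=<flavor>'."""
    key, sep, flavor = param.partition("=")
    if not sep or not flavor:
        raise ValueError("missing '=<flavor>'")
    fields = key.split(".")
    if len(fields) not in (4, 5) or fields[1:3] != ["srpc", "flavor"]:
        raise ValueError("expected <target>.srpc.flavor.<network>[.<direction>]")
    direction = fields[4] if len(fields) == 5 else None
    return SrpcRule(fields[0], fields[3], direction, flavor)


def rule_applies_to_nid(rule: SrpcRule, nid: str) -> bool:
    """True if the NID ('<address>@<network>') is on the rule's network."""
    if rule.network == "default":
        return True  # 'default' is taken to match every network
    _, _, net = nid.partition("@")
    return net == rule.network


if __name__ == "__main__":
    rule = parse_rule("test.srpc.flavor.o2ib1=skpi")
    # Hypothetical NIDs: one on the InfiniBand back end, one on the Cray side.
    for nid in ("10.0.0.5@o2ib1", "27@gni"):
        print(nid, rule_applies_to_nid(rule, nid))
```

Under this reading, only connections on o2ib1 would pick up skpi, and the Cray side would keep its previous flavor, which is the partial setup the comment asks about.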

             

            simmonsja James A Simmons added a comment

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33311/
            Subject: LU-10937 mgc: restore mgc binding for sptlrpc
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: ca9300e53dc2b7bcaaa5482bb4234cce7d9a344e

            simmonsja James A Simmons added a comment - edited

            Developers from Cray reported to me that the band-aid fix that landed for LU-9567 can actually cause an MGS failover to crash the node. I looked into this this afternoon and figured out how to finally resolve LU-9034. This will be needed for the LTS release.

            James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/33311
            Subject: LU-10937 mgc: restore mgc binding for sptlrpc
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 2347c82412b29c0777f5907551164d5cc27b80d4

            People

              simmonsja James A Simmons
              simmonsja James A Simmons