Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-8915

Don't use linux list structure as user land arguments for lnet selftest

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • Lustre 2.17.0
    • Lustre 2.10.0
    • None
    • 3
    • 9223372036854775807

    Description

      As the upstream client looks like move out of staging issues are being brought up that need to be addressed. One issue is that the lnet selftest utility uses and passes linux kernel link list as arguments. A new proper user land interface has to be developed do we can stop using linux link list arguments as ioctl parameters.

      Attachments

        Issue Links

          Activity

            [LU-8915] Don't use linux list structure as user land arguments for lnet selftest
            pjones Peter Jones added a comment -

            As per discussion on the LWG call today, moving tickets that do not appear to be essential to fix version 2.17. If the fix lands before code freeze we will update the fix version to reflect that but we want to focus on activities on the critical path. Please speak up if you think that this issue definitely needs to be fixed before we could issue a 2.16 release.

            pjones Peter Jones added a comment - As per discussion on the LWG call today, moving tickets that do not appear to be essential to fix version 2.17. If the fix lands before code freeze we will update the fix version to reflect that but we want to focus on activities on the critical path. Please speak up if you think that this issue definitely needs to be fixed before we could issue a 2.16 release.

            yes

            simmonsja James A Simmons added a comment - yes
            pjones Peter Jones added a comment -

            Still more to come?

            pjones Peter Jones added a comment - Still more to come?

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49314/
            Subject: LU-8915 lnet: migrate LNet selftest group handling to Netlink
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: a6c2d277d09ce9b33d5407dd08da7ffc42066de0

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/49314/ Subject: LU-8915 lnet: migrate LNet selftest group handling to Netlink Project: fs/lustre-release Branch: master Current Patch Set: Commit: a6c2d277d09ce9b33d5407dd08da7ffc42066de0

            "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49314
            Subject: LU-8915 lnet: migrate LNet selftest group handling to Netlink
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 3a0d2dda2cdcb491f11b0ea4977a8ccb11bcf37e

            gerrit Gerrit Updater added a comment - "James Simmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/49314 Subject: LU-8915 lnet: migrate LNet selftest group handling to Netlink Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 3a0d2dda2cdcb491f11b0ea4977a8ccb11bcf37e

            More patches coming.

            simmonsja James A Simmons added a comment - More patches coming.
            pjones Peter Jones added a comment -

            Only patch seems to have landed for 2.16. Should reopen if more is to come

            pjones Peter Jones added a comment - Only patch seems to have landed for 2.16. Should reopen if more is to come

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/43298/
            Subject: LU-8915 lnet: migrate LNet selftest session handling to Netlink
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 15c82ae3d31969afd866ca825c2f6ab11ae36a8f

            gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/43298/ Subject: LU-8915 lnet: migrate LNet selftest session handling to Netlink Project: fs/lustre-release Branch: master Current Patch Set: Commit: 15c82ae3d31969afd866ca825c2f6ab11ae36a8f

            James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/43298
            Subject: LU-8915 lnet: migrate LNet selftest session handling to Netlink
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 3d1631ed0561651bdef83b138f086864f54c6a5d

            gerrit Gerrit Updater added a comment - James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/43298 Subject: LU-8915 lnet: migrate LNet selftest session handling to Netlink Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 3d1631ed0561651bdef83b138f086864f54c6a5d

            I was looking at this today in the context of patch https://review.whamcloud.com/38105 "LU-13344 lnet: stop using struct timeval" and came to the conclusion that we aren't really using "struct list_head" in the userspace interface, even though it looks that way.

            What is actually being done is that stuct list_head is being used in userspace to generate a linked list of reply buffers for requests, and the kernel is essentially using the ".next" field in lstcon_rpc_trans_interpreter() as a way of passing a series of userspace pointers to the kernel. The somewhat confusing part is that this is done within a list_for_each_entry() loop in the kernel, but that is actually walking a different list, which is OK.

            For the "head_up" pointer passed from userspace, each one has the "struct lstcon_rpc_ent" copied properly from userspace, but only the .next field is used from "struct list_head rpe_link" to point to the next "struct lstcon_rpc_ent" structure. The fact that all of these functions pass "struct list_head __user *head_up" as an argument is misleading and makes it look like they could be using list_for_each() in the kernel on this list, but that never happens. Also, since "rpe_link" is the first field in the structure, passing the "head_up" pointer (named result_up in the caller) as the function argument is really the same as passing the pointer to "struct lstcon_rpc_ent" itself.

            I think it would be possible to change all of these functions and structs in lnet/include/uapi/linux/lnet/lnetst.h from passing "struct list_head __user *" to passing "struct lstcon_rpc_ent __user *" as the argument with no functional difference in the code (even binary compatibility), since all of these pointers end up down in lstcon_rpc_trans_interpreter() at the end. Inside lstcon_rpc_trans_interpreter() instead of "ent = list_entry()" it would use "ent = container_of()" (which list_entry() uses internally these days anyway) to make it clear that it isn't treating this as a list_head. The rpe_link.next field is treated as a 64-bit field that holds a userspace pointer to the next "struct lstcon_rpc_ent", which is only ever accessed via copy_to_user() or compared against itself, which is fine.

            In summary, I think we can remove the use of "struct list_head" for from the lnetst.h user/kernel interface without even compatibility issues.

            adilger Andreas Dilger added a comment - I was looking at this today in the context of patch https://review.whamcloud.com/38105 " LU-13344 lnet: stop using struct timeval " and came to the conclusion that we aren't really using " struct list_head " in the userspace interface, even though it looks that way. What is actually being done is that stuct list_head is being used in userspace to generate a linked list of reply buffers for requests, and the kernel is essentially using the " .next " field in lstcon_rpc_trans_interpreter() as a way of passing a series of userspace pointers to the kernel. The somewhat confusing part is that this is done within a list_for_each_entry() loop in the kernel, but that is actually walking a different list, which is OK. For the " head_up " pointer passed from userspace, each one has the " struct lstcon_rpc_ent " copied properly from userspace, but only the .next field is used from " struct list_head rpe_link " to point to the next " struct lstcon_rpc_ent " structure. The fact that all of these functions pass " struct list_head __user *head_up " as an argument is misleading and makes it look like they could be using list_for_each() in the kernel on this list, but that never happens. Also, since " rpe_link " is the first field in the structure, passing the " head_up " pointer (named result_up in the caller) as the function argument is really the same as passing the pointer to " struct lstcon_rpc_ent " itself. I think it would be possible to change all of these functions and structs in lnet/include/uapi/linux/lnet/lnetst.h from passing " struct list_head __user * " to passing " struct lstcon_rpc_ent __user * " as the argument with no functional difference in the code (even binary compatibility), since all of these pointers end up down in lstcon_rpc_trans_interpreter() at the end. Inside lstcon_rpc_trans_interpreter() instead of " ent = list_entry() " it would use " ent = container_of() " (which list_entry() uses internally these days anyway) to make it clear that it isn't treating this as a list_head . The rpe_link.next field is treated as a 64-bit field that holds a userspace pointer to the next " struct lstcon_rpc_ent ", which is only ever accessed via copy_to_user() or compared against itself, which is fine. In summary, I think we can remove the use of " struct list_head " for from the lnetst.h user/kernel interface without even compatibility issues.

            People

              simmonsja James A Simmons
              simmonsja James A Simmons
              Votes:
              0 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

                Created:
                Updated: