Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.9.0
    • Lustre 2.8.0
    • 3
    • 17615

    Description

      The RQF_FLD_READ RPC reply format is defined with "generic_data" (RQF_GENERIC_DATA) as the second field, but in fact this is always struct lu_seq_range_array in the two places that I see it used. Since RQF_FLD_READ does not have any swabber associated with generic data, the client will not swab the FLD reply data.

      That doesn't cause a problem for current code, because it always assumes MDT0000 for any FID lookup, and we haven't tested DNE + PPC client systems yet, nor do we have mixed-endian MDS/OSS combinations. However, this will break for DNE + PPC clients and should be fixed.

      Attachments

        Issue Links

          Activity

            [LU-6284] FLD read is not swabbed correctly
            pjones Peter Jones added a comment -

            Landed for 2.9

            pjones Peter Jones added a comment - Landed for 2.9

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/22309/
            Subject: LU-6284 ptlrpc: comment for FLD_QUERY RPC reply swab
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 8d8a153e12437900a876293dedc3cc657810ee83

            gerrit Gerrit Updater added a comment - Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/22309/ Subject: LU-6284 ptlrpc: comment for FLD_QUERY RPC reply swab Project: fs/lustre-release Branch: master Current Patch Set: Commit: 8d8a153e12437900a876293dedc3cc657810ee83

            Here is the patch on master:
            http://review.whamcloud.com/22309

            yong.fan nasf (Inactive) added a comment - Here is the patch on master: http://review.whamcloud.com/22309

            I think the only thing that may be needed here is to add a comment at the swabbed function and at the struct definition that includes the explanation here about what is happening and why.

            adilger Andreas Dilger added a comment - I think the only thing that may be needed here is to add a comment at the swabbed function and at the struct definition that includes the explanation here about what is happening and why.
            yong.fan nasf (Inactive) added a comment - - edited

            I may found why we did not registered a swabber() for 'FLD_READ' RPC. In theory, we can do that. But from the implementation view, it is not easy to be done within our current "struct req_msg_field" framework. Because the sequence range array is not fixed length, instead, its length depends on the 'lu_seq_range' count, that is unknown when prepare the RPC buffer. Generally, for such flexible length RPC usage, there will be a field in the RPC layout to indicate the data length. But for 'FLD_READ' RPC, we have no way to indicate the length unless we add new length filed that will broken the on-wire RPC protocol.

            yong.fan nasf (Inactive) added a comment - - edited I may found why we did not registered a swabber() for 'FLD_READ' RPC. In theory, we can do that. But from the implementation view, it is not easy to be done within our current "struct req_msg_field" framework. Because the sequence range array is not fixed length, instead, its length depends on the 'lu_seq_range' count, that is unknown when prepare the RPC buffer. Generally, for such flexible length RPC usage, there will be a field in the RPC layout to indicate the data length. But for 'FLD_READ' RPC, we have no way to indicate the length unless we add new length filed that will broken the on-wire RPC protocol.

            If we have only ever supported little-endian servers, then the only clients that would need to swab would be PPC clients, so I don't think this should cause any problems...

            adilger Andreas Dilger added a comment - If we have only ever supported little-endian servers, then the only clients that would need to swab would be PPC clients, so I don't think this should cause any problems...

            Yes, the current implementation will transfer the buffer as little endian on the network, and convert it as host endian in RAM when handle it. Comparing with the registered swabber solution that is only called when necessary, current implementation is NOT efficient, anyway it can work. If we replace it with the registered swabber as other RPCs do, that may cause some interoperability trouble if the peer (different host endian) still uses range_array_{le,cpu}to{cpu,le}. I think that if there will be some other on-wire protocol changes in the future, then it will be good time-point to adjust it.

            yong.fan nasf (Inactive) added a comment - Yes, the current implementation will transfer the buffer as little endian on the network, and convert it as host endian in RAM when handle it. Comparing with the registered swabber solution that is only called when necessary, current implementation is NOT efficient, anyway it can work. If we replace it with the registered swabber as other RPCs do, that may cause some interoperability trouble if the peer (different host endian) still uses range_array_{le,cpu} to {cpu,le}. I think that if there will be some other on-wire protocol changes in the future, then it will be good time-point to adjust it.

            Hmm, it seems that range_array_le_to_cpu() would not be a proper function for swabbing the buffer, since this would convert the array to little endian instead of host endian. If this is registered as the swabbed function it would need to always swab the array if it is called, but it would only be called if swabbing is needed.

            adilger Andreas Dilger added a comment - Hmm, it seems that range_array_le_to_cpu() would not be a proper function for swabbing the buffer, since this would convert the array to little endian instead of host endian. If this is registered as the swabbed function it would need to always swab the array if it is called, but it would only be called if swabbing is needed.

            Honestly, I do not know the original reason for why not registered a swabber() as other RPCs do. Do you think it is good time to adjust that now?

            yong.fan nasf (Inactive) added a comment - Honestly, I do not know the original reason for why not registered a swabber() as other RPCs do. Do you think it is good time to adjust that now?

            Is there a reason we don't just register range_array_le_to_cpu() as the swabber for this buffer so that it is swabbed like normal RPC buffers as part of the RPC handling?

            adilger Andreas Dilger added a comment - Is there a reason we don't just register range_array_le_to_cpu() as the swabber for this buffer so that it is swabbed like normal RPC buffers as part of the RPC handling?

            People

              yong.fan nasf (Inactive)
              adilger Andreas Dilger
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: