LU-13136

(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server)

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.14.0, Lustre 2.12.5
    • Affects Version/s: Lustre 2.13.0
    • Labels: None
    • Environment: CentOS 7.6, Lustre client 2.13.0 from WC
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      Since we upgraded Sherlock compute nodes from Lustre client 2.12 LTS to Lustre client 2.13.0, we're seeing the following error messages on the 2.13 clients:

      Jan 10 17:45:03 sh-109-11.int kernel: LustreError: 363749:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server)  req@ffff8a4bdc2de300 x1652789462777344/t108684297018(108684297018) o101->oak-MDT0001-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 592/648 e 0 to 0 dl 1578707147 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''
      Jan 10 18:08:42 sh-109-11.int kernel: LustreError: 404571:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server)  req@ffff8a4bdd684800 x1652789475552640/t764228865102(764228865102) o101->oak-MDT0000-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 584/672 e 0 to 0 dl 1578708566 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''
      Jan 10 18:31:37 sh-109-11.int kernel: LustreError: 406707:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server)  req@ffff8a4fbb5d2400 x1652789596142720/t764553338818(764553338818) o101->oak-MDT0000-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 584/672 e 0 to 0 dl 1578709940 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''
      Jan 10 18:55:58 sh-109-11.int kernel: LustreError: 409784:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server)  req@ffff8a60041f1680 x1652789724785856/t764900612344(764900612344) o101->oak-MDT0000-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 584/672 e 0 to 0 dl 1578711402 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''
      Jan 10 19:21:06 sh-109-11.int kernel: LustreError: 412271:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server)  req@ffff8a4bc148a400 x1652789839725568/t765251846397(765251846397) o101->oak-MDT0000-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 584/672 e 0 to 0 dl 1578712910 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''
      Jan 11 00:25:18 sh-109-11.int kernel: LustreError: 363221:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server)  req@ffff8a49e8ebba80 x1652790011215872/t108687007176(108687007176) o101->oak-MDT0001-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 592/648 e 0 to 0 dl 1578731162 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''
      Jan 11 00:58:19 sh-109-11.int kernel: LustreError: 435607:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server)  req@ffff8a5317f8e300 x1652790028203136/t767649780647(767649780647) o101->oak-MDT0000-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 592/648 e 0 to 0 dl 1578733107 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''
      

      It is only happening with Oak which is still running Lustre 2.10.8 (servers), not with Fir which is running Lustre 2.12.3_4 (servers).

      It's unclear if this has a negative impact. Oak is accessible and seems to work as expected. The error messages are a bit annoying though.

      Thanks!
      Stephane

          Activity

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38188/
            Subject: LU-13136 dom: check read-on-open buffer presents in reply
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: de60bf29f4a4f6b1443850ce5797c23b4290f36e

            Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38188
            Subject: LU-13136 dom: check read-on-open buffer presents in reply
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 2df4913d85a3826636d27bbe6cf75a4c7dd21bfd

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37249/
            Subject: LU-13136 dom: check read-on-open buffer presents in reply
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 58bea527100b50abf3df2dbab0ed6d6b42b69d86

            Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37249
            Subject: LU-13136 dom: check read-on-open buffer presents in reply
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: a12e36d71a9d06cca4aad69de28e9af052f9168a

            tappro Mikhail Pershin added a comment - - edited

            Andreas, right, req_capsule_field_present() must be used there. The error message is just a complaint; I will prepare a patch. As for disabling read-on-open, that will not help: the feature is a server parameter, and the client was supposed to ignore old servers silently. We could use the mdc_dom_min_repsize option on the client to set a zero length for the niobuf, but that is only the 'extra data length'; the buffer still has a header, so its size is not zero. I think that also should be addressed in the patch: there is no sense in passing a header with an expected zero buffer length, the whole buffer size should be set to zero. This is better to keep as is; changing it would cause a protocol change and compatibility issues, while the benefit is mostly aesthetic.

            adilger Andreas Dilger added a comment

            Mike, could you please take a look? It appears that the niobuf_inline buffer was added as part of patch https://review.whamcloud.com/23011 "LU-10181 mdt: read on open for DoM files", so it makes sense that this is unlikely to work with 2.10 servers, but it shouldn't be spewing an error on the client.

            Mike,

            • is this message indicating any problem, or (as I suspect) just complaining that the read-on-open buffer is not available, since the 2.10.x server isn't sending any data?
            • is there some tunable parameter that could disable the read-on-open functionality temporarily?
            • can you please make a patch for the clients?

            At first glance, it may be enough to replace the call to req_capsule_has_field() in ll_dom_finish_open() with req_capsule_field_present(). My understanding is that the former checks whether the named field might be present in a particular message format, while the latter checks whether the field is actually present in the message being processed.
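The distinction drawn above between a format-level check and a message-level check can be sketched with a small toy model. This is not Lustre code: every name prefixed with toy_ is hypothetical, the field list is a shortened placeholder rather than the real LDLM_INTENT_OPEN layout, and the "present" test (field index below the packed buffer count) is an assumption made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical types, NOT the Lustre req_capsule API. */
struct toy_format {
	const char *name;
	const char *fields[8];  /* fields the format *may* carry, in order */
	size_t nfields;
};

struct toy_reply {
	const struct toy_format *fmt;
	size_t bufcount;        /* buffers the server actually packed */
};

/* Placeholder format; only 'niobuf_inline' is taken from the ticket. */
static const struct toy_format toy_intent_open = {
	.name = "TOY_INTENT_OPEN",
	.fields = { "ptlrpc_body", "dlm_rep", "mdt_body", "niobuf_inline" },
	.nfields = 4,
};

/* Return the position of 'field' in the format, or -1 if unknown. */
static int toy_field_index(const struct toy_format *fmt, const char *field)
{
	for (size_t i = 0; i < fmt->nfields; i++)
		if (strcmp(fmt->fields[i], field) == 0)
			return (int)i;
	return -1;
}

/* Format-level check: could this field ever appear in such a reply? */
static bool toy_has_field(const struct toy_reply *rep, const char *field)
{
	return toy_field_index(rep->fmt, field) >= 0;
}

/* Message-level check: did the server pack a buffer for this field? */
static bool toy_field_present(const struct toy_reply *rep, const char *field)
{
	int idx = toy_field_index(rep->fmt, field);

	return idx >= 0 && (size_t)idx < rep->bufcount;
}

/* Caller pattern: only touch the inline data when it is really there. */
static bool toy_should_read_inline(const struct toy_reply *rep)
{
	return toy_field_present(rep, "niobuf_inline");
}
```

In this model, a reply from an older server that packs only three buffers still passes toy_has_field() for 'niobuf_inline', because the format allows the field, but fails toy_field_present(), so a caller guarded by the second check skips the read quietly instead of asking for a buffer that was never sent.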

            People

              Assignee: tappro Mikhail Pershin
              Reporter: sthiell Stephane Thiell
              Votes: 0
              Watchers: 6