[LU-13136] (layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server) Created: 14/Jan/20 Updated: 19/Apr/20 Resolved: 28/Jan/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0 |
| Fix Version/s: | Lustre 2.14.0, Lustre 2.12.5 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Stephane Thiell | Assignee: | Mikhail Pershin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Environment: |
CentOS 7.6, Lustre client 2.13.0 from WC |
||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Since we upgraded Sherlock compute nodes from Lustre client 2.12 LTS to Lustre client 2.13.0, we're seeing the following error messages on the 2.13 clients:

Jan 10 17:45:03 sh-109-11.int kernel: LustreError: 363749:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server) req@ffff8a4bdc2de300 x1652789462777344/t108684297018(108684297018) o101->oak-MDT0001-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 592/648 e 0 to 0 dl 1578707147 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''
Jan 10 18:08:42 sh-109-11.int kernel: LustreError: 404571:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server) req@ffff8a4bdd684800 x1652789475552640/t764228865102(764228865102) o101->oak-MDT0000-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 584/672 e 0 to 0 dl 1578708566 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''
Jan 10 18:31:37 sh-109-11.int kernel: LustreError: 406707:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server) req@ffff8a4fbb5d2400 x1652789596142720/t764553338818(764553338818) o101->oak-MDT0000-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 584/672 e 0 to 0 dl 1578709940 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''
Jan 10 18:55:58 sh-109-11.int kernel: LustreError: 409784:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server) req@ffff8a60041f1680 x1652789724785856/t764900612344(764900612344) o101->oak-MDT0000-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 584/672 e 0 to 0 dl 1578711402 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''
Jan 10 19:21:06 sh-109-11.int kernel: LustreError: 412271:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server) req@ffff8a4bc148a400 x1652789839725568/t765251846397(765251846397) o101->oak-MDT0000-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 584/672 e 0 to 0 dl 1578712910 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''
Jan 11 00:25:18 sh-109-11.int kernel: LustreError: 363221:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server) req@ffff8a49e8ebba80 x1652790011215872/t108687007176(108687007176) o101->oak-MDT0001-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 592/648 e 0 to 0 dl 1578731162 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''
Jan 11 00:58:19 sh-109-11.int kernel: LustreError: 435607:0:(layout.c:2121:__req_capsule_get()) @@@ Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (server) req@ffff8a5317f8e300 x1652790028203136/t767649780647(767649780647) o101->oak-MDT0000-mdc-ffff8a63bc27e000@10.0.2.52@o2ib5:12/10 lens 592/648 e 0 to 0 dl 1578733107 ref 3 fl Complete:RPQU/4/0 rc 0/0 job:''

It is only happening with Oak which is still running Lustre 2.10.8 (servers), not with Fir which is running Lustre 2.12.3_4 (servers). It's unclear if this has a negative impact. Oak is accessible and seems to work as expected. The error messages are a bit annoying though. Thanks! |
| Comments |
| Comment by Andreas Dilger [ 15/Jan/20 ] |
|
Mike, could you please take a look. It appears that the niobuf_inline buffer was added as part of patch https://review.whamcloud.com/23011 " Mike,
At first glance, it may be enough to replace the call in ll_dom_finish_open() to req_capsule_has_field() with req_capsule_field_present(). My understanding is that the former checks whether the named field might be present in a particular message format, while the latter checks whether the field is actually present in the message being processed. |
| Comment by Mikhail Pershin [ 15/Jan/20 ] |
|
Andreas, right, req_capsule_field_present() must be used there. The error message is only a complaint; I will prepare a patch. As for disabling read-on-open, that will not help: the feature is a server parameter, and the client was supposed to ignore old servers silently. We could use the mdc_dom_min_repsize option on the client to set a zero niobuf length, but that is only the 'extra data length', and the buffer still has a header, so its size is not zero. |
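
Editorial illustration (not part of the ticket): the distinction drawn in the two comments above, between a field that the message format may contain and a field that the received message actually carries, can be sketched with a toy model. Everything below is hypothetical: the toy_* names, the struct layouts, and the placeholder field names are stand-ins and not the Lustre structures or API. It also assumes that "(7 of 7)" in the log means "field index 7 in a reply that carries 7 buffers", which is a reading of the log text rather than something stated in the ticket.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a message format: an ordered list of field names.
 * NOT the Lustre req_format; names and layout are hypothetical. */
struct toy_format {
    const char *name;
    const char *fields[8];
    size_t nfields;
};

/* Toy stand-in for a received message: only the number of buffers
 * the sender actually packed matters for this sketch. */
struct toy_msg {
    size_t bufcount;
};

/* Hypothetical format in which 'niobuf_inline' sits at index 7.
 * The other field names are placeholders, not the real layout. */
const struct toy_format toy_intent_open = {
    .name = "TOY_INTENT_OPEN",
    .fields = { "f0", "f1", "f2", "f3", "f4", "f5", "f6", "niobuf_inline" },
    .nfields = 8,
};

/* Return the index of a field in the format, or -1 if not declared. */
int toy_field_index(const struct toy_format *fmt, const char *field)
{
    for (size_t i = 0; i < fmt->nfields; i++)
        if (strcmp(fmt->fields[i], field) == 0)
            return (int)i;
    return -1;
}

/* "Might be present": the format declares the field at all. */
bool toy_format_has_field(const struct toy_format *fmt, const char *field)
{
    return toy_field_index(fmt, field) >= 0;
}

/* "Actually present": the format declares the field AND the message
 * carries a buffer at that index. */
bool toy_field_present(const struct toy_format *fmt,
                       const struct toy_msg *msg, const char *field)
{
    int idx = toy_field_index(fmt, field);

    return idx >= 0 && (size_t)idx < msg->bufcount;
}
```

Under these assumptions, a reply with only 7 buffers passes the format-level check for 'niobuf_inline' but fails the message-level check, so a caller guarding on the format-level check alone would go on to fetch a buffer that is not there.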
| Comment by Gerrit Updater [ 15/Jan/20 ] |
|
Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37249 |
| Comment by Gerrit Updater [ 28/Jan/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37249/ |
| Comment by Gerrit Updater [ 09/Apr/20 ] |
|
Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38188 |
| Comment by Gerrit Updater [ 19/Apr/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38188/ |