[LU-13382] Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN' Created: 24/Mar/20  Updated: 09/Apr/20  Resolved: 09/Apr/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.5
Fix Version/s: None

Type: Question/Request Priority: Major
Reporter: Jon Symon (Inactive) Assignee: Mikhail Pershin
Resolution: Duplicate Votes: 0
Labels: None
Environment:

Servers running lustre 2.10.5-1 on RHEL 7.5, clients are running lustre-client 2.13.0-1 on RHEL 8.1


Issue Links:
Related
is related to LU-13136 (layout.c:2121:__req_capsule_get()) @... Resolved
is related to LU-13438 Rhel8.1 / lustre-client 2.12.4-1 Open
Epic/Theme: client
Rank (Obsolete): 9223372036854775807

 Description   

We have just upgraded all of our clients to RHEL 8.1 with lustre client 2.13.0-1. Within hours of letting the users back onto the system we started seeing a number of the nodes reboot. These servers are showing a very large number of this error:

 

Wrong buffer for field 'niobuf_inline' (7 of 7) in format 'LDLM_INTENT_OPEN'

 

I have seen this described as a cosmetic error, but we are seeing a lot of them - in one case more than 21000 in a period of less than a few seconds. Can you please confirm if these messages are cosmetic?

Thanks

jon

 



 Comments   
Comment by Jon Symon (Inactive) [ 24/Mar/20 ]

I should add that these errors appear to be generated by one program compiled by one of our users.  End result is a kernel panic followed by a reboot of the node.

js

Comment by Peter Jones [ 24/Mar/20 ]

Mike

This looks to be a duplicate of LU-13136. Is this something that we should be planning to include in 2.12.5? Can you confirm whether there are any negative impacts when receiving this message?

Jon

I am puzzled as to why you are running 2.13 clients. This is not something tested or expected to work on RHEL 8.1, nor supported to interoperate with 2.10.x servers. I would recommend that you consider switching to a 2.12.x client.

Peter

Comment by Mikhail Pershin [ 26/Mar/20 ]

Yes, this is the same issue, though it didn't cause a panic and reboot in the past. It would be helpful to get the corresponding panic info if possible. As for 2.12.5, I think it would be useful to patch it too.

Comment by Andreas Dilger [ 09/Apr/20 ]

Close as duplicate of LU-13136.
