[LU-2951] lnet_try_match_md()) Matching packet from 12345-10.10.17.9@tcp, match 1429230968490588 length 65928 too big: 117674 left, 49386 allowed
Created: 12/Mar/13  Updated: 26/Mar/13  Resolved: 26/Mar/13

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Bug
Priority: Blocker
Reporter: Maloo
Assignee: Li Wei (Inactive)
Resolution: Fixed
Votes: 0
Labels: HB

Severity: 3
Rank (Obsolete): 7084

 Description   

This issue was created by maloo for Li Wei <liwei@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/9e87cad2-8b22-11e2-965f-52540035b04c.

The sub-test test_102ha failed with the following error:

test failed to respond and timed out

Info required for matching: sanity 102ha

The MDS was dropping requests, while the client was repeatedly reconnecting until Autotest timed out:

10:38:21:LNetError: 3022:0:(lib-ptl.c:190:lnet_try_match_md()) Matching packet from 12345-10.10.17.9@tcp, match 1429230968490588 length 65928 too big: 117674 left, 49386 allowed
10:39:14:Lustre: lustre-MDT0000: Client 8aa83ecb-c95c-6b87-53ec-3d81990549d0 (at 10.10.17.9@tcp) reconnecting
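When triaging similar failures, it can help to pull the numeric fields out of the LNetError line. The following is a minimal sketch, assuming the message layout is exactly as in the log excerpt above; the function name `parse_too_big` and the field names are hypothetical and are not part of Lustre or LNet.

```python
import re

# Log line copied from the excerpt above.
LINE = (
    "10:38:21:LNetError: 3022:0:(lib-ptl.c:190:lnet_try_match_md()) "
    "Matching packet from 12345-10.10.17.9@tcp, match 1429230968490588 "
    "length 65928 too big: 117674 left, 49386 allowed"
)

# Assumed message layout, inferred only from the single line above.
PATTERN = re.compile(
    r"Matching packet from (?P<peer>\S+), match (?P<match_bits>\d+) "
    r"length (?P<length>\d+) too big: (?P<left>\d+) left, "
    r"(?P<allowed>\d+) allowed"
)


def parse_too_big(line):
    """Return the fields of a 'too big' message as a dict, or None if absent."""
    m = PATTERN.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Everything except the peer identifier is a decimal integer.
    for key in ("match_bits", "length", "left", "allowed"):
        fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    print(parse_too_big(LINE))
```

Comparing `length` against `allowed` in the parsed result shows at a glance whether a request exceeded the size the receiver was willing to accept.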


 Comments   
Comment by Li Wei (Inactive) [ 12/Mar/13 ]

https://maloo.whamcloud.com/test_sets/d32b51c6-8b24-11e2-965f-52540035b04c

This time it was conf-sanity 58b.

Comment by Li Wei (Inactive) [ 12/Mar/13 ]

This is due to the recent request buffer shrinking effort, which leaves SETXATTR unable to accept requests carrying the maximum extended attribute size (64 KB). 49386 is exactly the current intended maximum size for SETXATTR, while 65928 is how much is needed.
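The arithmetic behind these numbers can be checked with a short sketch. It is a simplified model of the size comparison, not the LNet code; the constant and function names are hypothetical, and the only inputs are the values quoted in this issue.

```python
# Values quoted in this issue.
XATTR_MAX_BYTES = 64 * 1024   # maximum extended attribute size (64 KB)
REQUEST_BYTES = 65928         # "length" reported in the LNetError line
ALLOWED_BYTES = 49386         # "allowed" reported in the LNetError line


def fits(request_bytes, allowed_bytes):
    """Simplified model: a request is acceptable only if it fits the limit."""
    return request_bytes <= allowed_bytes


# How far the request exceeds the intended SETXATTR limit.
shortfall = REQUEST_BYTES - ALLOWED_BYTES

# Bytes in the request beyond the 64 KB attribute value itself;
# the issue does not say what these bytes consist of.
beyond_value = REQUEST_BYTES - XATTR_MAX_BYTES

if __name__ == "__main__":
    print("fits:", fits(REQUEST_BYTES, ALLOWED_BYTES))
    print("shortfall:", shortfall)
    print("bytes beyond 64 KB value:", beyond_value)
```

Under these assumptions the request is 16542 bytes over the limit, and the 64 KB attribute value alone (65536 bytes) already exceeds the 49386 bytes allowed, so no maximum-size SETXATTR request could fit.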

Comment by James Nunez (Inactive) [ 12/Mar/13 ]

https://maloo.whamcloud.com/test_sets/d89abf3a-884e-11e2-961a-52540035b04c

Another example of this issue for sanity test_102ha

Comment by Nathaniel Clark [ 13/Mar/13 ]

ZFS, conf-sanity test_61

https://maloo.whamcloud.com/test_sets/fba5bc00-8bb9-11e2-aa89-52540035b04c

Comment by Li Wei (Inactive) [ 13/Mar/13 ]

http://review.whamcloud.com/5703

Comment by Andreas Dilger [ 18/Mar/13 ]

It looks like this is mostly hitting review-zfs because ZFS has "large xattr" enabled by default, whereas ldiskfs currently does not. The bug is applicable to both, however.

We should really squash the last remaining ZFS test failures and enable ZFS testing for real, so we don't keep introducing failures like this that are not caught "because ZFS tests always fail"...

Comment by Li Wei (Inactive) [ 26/Mar/13 ]

The patch above has landed on master.
