[LU-3859] grant shrinker floods OST and produces a large load Created: 29/Aug/13 Updated: 11/Dec/18 Resolved: 11/Dec/18 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Alexey Lyashkov | Assignee: | WC Triage |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Environment: |
xyratex lustre (mostly 2.1 based) |
||
| Issue Links: |
|
||||||||||||||||
| Severity: | 3 | ||||||||||||||||
| Rank (Obsolete): | 10011 | ||||||||||||||||
| Description |
|
While investigating a series of LNet bugs, I dumped the RPC cache on an OST after a failure and found that most of the RPCs in the cache are grant shrink requests. This produces a large load on the routers and OSTs.
ffff8807b9cc3440 rpc_cache 1024 313 896 224 4k
# grep -c RQF_OST_SET_GRANT_INFO \&\ log
304
So 304 of the 313 RPCs in the cache are a flood from the grant shrinker. |
| Comments |
| Comment by Andreas Dilger [ 30/Aug/13 ] |
|
We don't actually have OBD_CONNECT_GRANT_SHRINK enabled on the clients in our version of Lustre. I agree that it shouldn't be flooding the server with these requests - grant shrinking should only happen occasionally, like on a ping. The default interval GRANT_SHRINK_INTERVAL is 1200s, but this would be sent from multiple clients if they are idle and not using the grant. It should only be a one-time event. |
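To put the numbers in this ticket side by side, here is a rough back-of-envelope sketch. It is not Lustre code: the function names are hypothetical, and the rate model assumes the worst case where every idle client sends one shrink request per interval, whereas the comment above expects a one-time event per client. Only the 1200s interval and the 304 of 313 count come from the ticket.

```python
# Back-of-envelope model only; these helpers are hypothetical, not Lustre APIs.

GRANT_SHRINK_INTERVAL_S = 1200  # default interval quoted in the comment above


def shrink_rpcs_per_second(idle_clients: int,
                           interval_s: int = GRANT_SHRINK_INTERVAL_S) -> float:
    """Worst-case request rate at one OST, assuming each idle client
    sends exactly one grant shrink request per interval."""
    if idle_clients < 0 or interval_s <= 0:
        raise ValueError("idle_clients must be >= 0 and interval_s > 0")
    return idle_clients / interval_s


def shrink_share(shrink_rpcs: int, total_rpcs: int) -> float:
    """Fraction of cached RPCs that are grant shrink requests."""
    if total_rpcs <= 0 or not 0 <= shrink_rpcs <= total_rpcs:
        raise ValueError("need 0 <= shrink_rpcs <= total_rpcs and total_rpcs > 0")
    return shrink_rpcs / total_rpcs


if __name__ == "__main__":
    # Counts reported in the description: 304 of 313 cached RPCs.
    print(f"shrink share: {shrink_share(304, 313):.1%}")
    # Client count here is an illustrative assumption, not from the ticket.
    print(f"rate for 1200 idle clients: {shrink_rpcs_per_second(1200):.2f} RPC/s")
```

Under this model even a large idle client population yields only about one request per second per OST at the default interval, so a cache that is roughly 97% shrink requests suggests something other than the periodic timer alone.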
| Comment by Andreas Dilger [ 11/Dec/18 ] |
|
No activity here in 5 years and this is likely fixed with |