[LU-16206] PCC crashes MDS: mdt_big_xattr_get()) ASSERTION( info->mti_big_lmm_used == 0 ) failed Created: 04/Oct/22 Updated: 04/Oct/22 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.14.0, Lustre 2.15.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Ellis Wilson | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Environment: |
Linux 5.4.0-1091-azure #96~18.04.1-Ubuntu SMP Tue Aug 30 19:15:32 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
||
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Reproducible on 2.15.1 and 2.14.0. Both clients and servers are running Ubuntu 18.04 as shown in Environment.

Steps to reproduce:
# confirm hsm is enabled
# setup pcc on client 0

Minutes later things resolve and the stuck command lines return. Examining the MDS, it crashed and rebooted. Relevant
[ 3266.211270] LustreError: 11458:0:(mdt_handler.c:960:mdt_big_xattr_get()) ASSERTION( info->mti_big_lmm_used == 0 ) failed: |
| Comments |
| Comment by Ellis Wilson [ 04/Oct/22 ] |
|
I've linked three potentially related bugs. The last one has a description that's particularly enlightening:

"This is result of inappropriate usage of mti_big_lmm buffer in various places. Originally it was introduced to be used for getting big LOV/LMV EA and passing them to reply buffers. Meanwhile it is widely used now for internal server needs. These cases should be distinguished and if there is no intention to return EA in reply then flag mti_big_lmm_used should not be set. Maybe it is worth to rename it as mti_big_lmm_keep to mark that is to be kept until reply is packed."

LASSERT(!info->mti_big_lmm_used);
rc = __mdt_stripe_get(info, o, ma, name); |
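To make the distinction in the quoted description concrete, here is a minimal, self-contained C sketch of a shared buffer guarded by a "keep until reply is packed" flag. This is not Lustre code: `struct req_info`, `big_get()`, `pack_reply()`, and `demo()` are hypothetical names invented for illustration, and the sketch returns an error where the real code asserts.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for per-request state; NOT the Lustre structure. */
struct req_info {
    char big_buf[64];
    bool big_buf_keep; /* true only while the buffer must survive until the reply is packed */
};

/*
 * Copy val into the shared buffer.
 * for_reply == true:  the contents will be returned in the reply, so mark them as kept.
 * for_reply == false: internal scratch use, so the flag is left clear.
 * Returns 0 on success, -1 if the buffer is still marked as kept
 * (the point at which the real code would hit its assertion).
 */
int big_get(struct req_info *info, const char *val, bool for_reply)
{
    if (info->big_buf_keep)
        return -1;
    strncpy(info->big_buf, val, sizeof(info->big_buf) - 1);
    info->big_buf[sizeof(info->big_buf) - 1] = '\0';
    info->big_buf_keep = for_reply;
    return 0;
}

/* Hypothetical reply-packing step: once the reply is packed, the buffer is free again. */
void pack_reply(struct req_info *info)
{
    info->big_buf_keep = false;
}

/* Walk through the sequence described in the quoted text. */
int demo(void)
{
    struct req_info info = {0};

    assert(big_get(&info, "internal", false) == 0);   /* scratch use, flag stays clear */
    assert(big_get(&info, "for reply", true) == 0);   /* reply use, flag is set */
    assert(big_get(&info, "second get", false) == -1); /* a second get before packing is rejected */
    pack_reply(&info);
    assert(big_get(&info, "next request", false) == 0); /* usable again after the reply */
    return 0;
}
```

Under these assumptions, the failure mode in the quoted description corresponds to an internal caller passing `for_reply = true` (or never reaching `pack_reply()`), which leaves the flag set for the next caller.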