[LU-14740] LustreError: 76942:0:(qsd_entry.c:243:qsd_refresh_usage()) $$$ failed to read disk usage, rc:-3 qsd:lustre-MDT0000 qtype:prj id:4294967295 enforced:0 granted: 0 pending:0 waiting:0 req:0 usage: 0 qunit:0 qtune:0 edquot:0 default:yes Created: 07/Jun/21  Updated: 14/Oct/21  Resolved: 16/Sep/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.15.0

Type: Bug Priority: Minor
Reporter: Wang Shilong (Inactive) Assignee: Wang Shilong (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-13845 Kernel crash on: lfs quota -u $(( (1<... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

On one of our systems, we frequently hit the following errors, which prevent us from using project quota:

[58249.501596] LustreError: 76942:0:(qsd_entry.c:243:qsd_refresh_usage()) $$$ failed to read disk usage, rc:-3  qsd:lustre-MDT0000 qtype:prj id:4294967295 enforced:0 granted: 0 pending:0 waiting:0 req:0 usage: 0 qunit:0 qtune:0 edquot:0 default:yes

[58259.041837] LustreError: 77077:0:(qsd_entry.c:243:qsd_refresh_usage()) $$$ failed to read disk usage, rc:-3  qsd:lustre-MDT0000 qtype:prj id:4294967295 enforced:0 granted: 0 pending:0 waiting:0 req:0 usage: 0 qunit:0 qtune:0 edquot:0 default:yes

[58268.119459] LustreError: 77077:0:(qsd_entry.c:243:qsd_refresh_usage()) $$$ failed to read disk usage, rc:-3  qsd:lustre-MDT0000 qtype:prj id:4294967295 enforced:0 granted: 0 pending:0 waiting:0 req:0 usage: 0 qunit:0 qtune:0 edquot:0 default:yes

The problem is caused by project ID 4294967295 (0xFFFFFFFF, i.e. -1 stored as an unsigned 32-bit value). This is outside the range the kernel can handle, so the kernel treats it as an invalid project ID.
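To illustrate why this ID is rejected, here is a minimal sketch in C. It assumes only that the all-ones 32-bit value is reserved as "invalid", which matches the behaviour described above. The names `SKETCH_INVALID_PROJID`, `sketch_projid_valid()` and `sketch_projid_from_signed()` are hypothetical and are not taken from the Lustre or kernel sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not the Lustre or kernel API. */

/* Assumption: the all-ones 32-bit value is reserved as "invalid".
 * (uint32_t)-1 == 4294967295 == 0xFFFFFFFF. */
#define SKETCH_INVALID_PROJID ((uint32_t)-1)

/* Returns true when projid is usable as a project ID under the
 * assumption stated above. */
static bool sketch_projid_valid(uint32_t projid)
{
	return projid != SKETCH_INVALID_PROJID;
}

/* Shows how a signed -1 (for example an "unset" marker) becomes
 * 4294967295 once it is stored in an unsigned 32-bit field. */
static uint32_t sketch_projid_from_signed(int32_t value)
{
	return (uint32_t)value;
}
```

Under this assumption, a check like `sketch_projid_valid()` applied before the ID is assigned to a file would keep 4294967295 from ever reaching the quota code.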



 Comments   
Comment by Gerrit Updater [ 07/Jun/21 ]

Wang Shilong (wshilong@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43939
Subject: LU-14740 llite: avoid project quota overflow
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8bb1bfdbc6f435873a33d7d3a131e625f6e157a9

Comment by Andreas Dilger [ 16/Jul/21 ]

It would be good to add a second patch to verify on the MDS/OSS that projid=-1 is not used for a file, to catch other bugs and/or older clients.

Comment by Wang Shilong (Inactive) [ 17/Jul/21 ]

Yup, that makes sense.

Comment by Gerrit Updater [ 19/Jul/21 ]

Wang Shilong (wshilong@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/44339
Subject: LU-14740 quota: reject invalid project id on server side
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 24572a3988f2d848182fc59d652b9fe7dee13815

Comment by Gerrit Updater [ 27/Jul/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43939/
Subject: LU-14740 llite: avoid project quota overflow
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 3ffa5d680f0092ae51ffa84bd94a9983f9a8c99e

Comment by Wang Shilong (Inactive) [ 28/Jul/21 ]

For b_es5_2:

https://review.whamcloud.com/#/c/44407/1

Comment by Gerrit Updater [ 10/Aug/21 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44339/
Subject: LU-14740 quota: reject invalid project id on server side
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d6a3e06cb0f0db57a2637d029b1ff3bfd1de3d7d
