[LU-13967] Tests fail/hang with “can't notify - lge_glbl_data is not set” Created: 15/Sep/20  Updated: 25/Sep/20  Resolved: 25/Sep/20

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.14.0
Fix Version/s: Lustre 2.14.0

Type: Bug Priority: Major
Reporter: James Nunez (Inactive) Assignee: Sergey Cheremencev
Resolution: Fixed Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

We are seeing the message

(qmt_entry.c:805:qmt_adjust_edquot_qunit_notify()) lustre-QMT0000: can't notify - lge_glbl_data is not set

in the MDS console log for many test suites starting on 19 Aug 2020. This message was introduced by "LU-11023 quota: quota pools for OSTs", patch commit 09f9fb3211cd.

Some tests, like sanity-pcc test 5 at https://testing.whamcloud.com/test_sets/b8d571f0-46fe-4651-a545-24c8142eebf8, fail or hang after this message is printed, but it’s not clear whether the message is actually warning of a real problem or whether the tests are failing/hanging for a different reason.

For the test session https://testing.whamcloud.com/test_sessions/42070a37-3493-47fe-a89f-5b0d4c869bb8, we see messages like the following seven times in the MDS console log:

[31796.726146] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  SKIP: sanity-sec test_33 need shared key feature for this test
[31796.769617] Lustre: format at qmt_entry.c:805:qmt_adjust_edquot_qunit_notify doesn't end in newline
[31796.769631] Lustre: format at qmt_entry.c:805:qmt_adjust_edquot_qunit_notify doesn't end in newline
[31796.769634] Lustre: 4891:0:(qmt_entry.c:805:qmt_adjust_edquot_qunit_notify()) lustre-QMT0000: can't notify - lge_glbl_data is not set[31796.773585] Lustre: format at qmt_entry.c:805:qmt_adjust_edquot_qunit_notify doesn't end in newline
[31796.773626] Lustre: format at qmt_entry.c:805:qmt_adjust_edquot_qunit_notify doesn't end in newline

We need to determine whether this message is a sign of a real problem or whether it should be suppressed.



 Comments   
Comment by Sergey Cheremencev [ 15/Sep/20 ]

This might happen if no slaves have enqueued global quota locks yet. I don't know of any problems this case could cause: we just can't reseed lqe_glbl_data with the new edquot or qunit because it doesn't exist at that moment. This case is valid only for quota report and release; quota acquire requires an enqueued lock.

I think this warning should be changed to CDEBUG(D_QUOTA, ...).
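To illustrate the effect of the proposed change, here is a minimal, self-contained sketch. It does not use the real libcfs macros: the `SK_D_*` mask bits, `sk_log()`, and `sk_notify_without_glbl_data()` are hypothetical stand-ins, and the sketch assumes that the active mask enables warnings but not quota debug messages unless quota debugging is switched on. The message string also ends in a newline here, since the console log above complains that the original format does not.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for debug mask bits; the values are illustrative only. */
#define SK_D_WARNING 0x1u
#define SK_D_QUOTA   0x2u

/*
 * Copy msg into out only if 'level' is enabled in 'mask'.
 * Returns 1 if the message was emitted, 0 if it was filtered out.
 */
static int sk_log(unsigned int mask, unsigned int level,
                  const char *msg, char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);
    out[0] = '\0';
    if ((mask & level) == 0)
        return 0;
    snprintf(out, outlen, "%s", msg);
    return 1;
}

/*
 * Model the notify path when no slave has enqueued a global quota lock yet.
 * use_cdebug == 0 models the original warning; use_cdebug == 1 models the
 * proposed quota-debug message.
 */
static int sk_notify_without_glbl_data(unsigned int mask, int use_cdebug,
                                       char *out, size_t outlen)
{
    unsigned int level = use_cdebug ? SK_D_QUOTA : SK_D_WARNING;

    return sk_log(mask, level,
                  "lustre-QMT0000: can't notify - lge_glbl_data is not set\n",
                  out, outlen);
}
```

With a mask of only `SK_D_WARNING`, the warning variant is emitted and the debug variant is filtered out; adding `SK_D_QUOTA` to the mask makes the debug variant visible again for anyone investigating quota behavior.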

 

Comment by Gerrit Updater [ 15/Sep/20 ]

Sergey Cheremencev (sergey.cheremencev@hpe.com) uploaded a new patch: https://review.whamcloud.com/39921
Subject: LU-13967 quota: change warning to cdebug
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 37f20c95b38bbfa36a605080d197f05dee97bafb

Comment by Gerrit Updater [ 25/Sep/20 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39921/
Subject: LU-13967 quota: change warning to cdebug
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 880218785f06cd11c488fa6b9dfd6bf14cc2451c

Comment by Peter Jones [ 25/Sep/20 ]

Landed for 2.14
