[LU-5521] grant errors during soak testing with message drop Created: 20/Aug/14  Updated: 05/Jun/15  Resolved: 03/Feb/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: Lustre 2.7.0

Type: Bug
Priority: Minor
Reporter: Johann Lombardi (Inactive)
Assignee: Johann Lombardi (Inactive)
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Related
Severity: 3
Rank (Obsolete): 15378

 Description   

We regularly notice the following error in the client logs during soak testing with message drop on routers:

LustreError: 7110:0:(osc_cache.c:1565:osc_enter_cache()) soaked-OST0001-osc-ffff88083651dc00: grant { dirty: 8192/8192 dirty_pages: 8192/4096729 dropped: 0 avail: 12132352, reserved: 0, flight: 9 }lru {in list: 1365510, left: 8194, waiters: 0 }try to reserve 4096.
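For triage it can help to pull the counters out of a message like the one above. The following is a minimal sketch, not part of Lustre: the function name `parse_grant_line` and the `grant.`/`lru.` key prefixes are hypothetical, and the regular expressions assume the exact layout of the line quoted in this ticket. No meaning is assigned to the counters beyond what the message itself prints.

```python
import re

# Log line copied verbatim from the description of this ticket.
LINE = (
    "LustreError: 7110:0:(osc_cache.c:1565:osc_enter_cache()) "
    "soaked-OST0001-osc-ffff88083651dc00: grant { dirty: 8192/8192 "
    "dirty_pages: 8192/4096729 dropped: 0 avail: 12132352, reserved: 0, "
    "flight: 9 }lru {in list: 1365510, left: 8194, waiters: 0 }"
    "try to reserve 4096."
)


def parse_grant_line(line):
    """Extract the counters from a grant dump line (hypothetical helper).

    Values printed as "a/b" become (a, b) tuples, plain numbers become ints.
    Keys are prefixed with the brace group they came from ("grant" or "lru").
    """
    fields = {}
    # Each "{ ... }" group holds "name: value" pairs; names may contain spaces.
    for group, body in re.findall(r"(grant|lru) \{([^}]*)\}", line):
        for key, value in re.findall(r"([a-z_ ]+): (\d+(?:/\d+)?)", body):
            if "/" in value:
                parsed = tuple(int(part) for part in value.split("/"))
            else:
                parsed = int(value)
            fields[f"{group}.{key.strip()}"] = parsed
    # The trailing "try to reserve N." part sits outside the brace groups.
    match = re.search(r"try to reserve (\d+)", line)
    if match:
        fields["reserve"] = int(match.group(1))
    return fields


if __name__ == "__main__":
    for name, value in parse_grant_line(LINE).items():
        print(name, value)
```

In the line above, the two numbers printed for `dirty` are equal (8192/8192) while `avail` is still non-zero, which is the kind of detail this helper makes easy to compare across many log lines.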


 Comments   
Comment by Johann Lombardi (Inactive) [ 01/Sep/14 ]

http://review.whamcloud.com/#/c/11716/

Comment by Oleg Drokin [ 24/Sep/14 ]

It seems that due to an omission, patch 11716 introduced a warning and I now get a ton of this in my kernel logs:
"format at osc_cache.c:1473:osc_enter_cache_try doesn't end in new line"
also
"format at osc_cache.c:1524:osc_enter_cache doesn't end in newline"

Hopefully somebody can make a patch soon.
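The warning quoted above is triggered when a debug message format string lacks a trailing newline. The sketch below is a hypothetical Python model of such a check, not the Lustre debug code itself: the function name `missing_newline_warning` and the sample format strings are assumptions, and only the wording of the warning is taken from the second message quoted in this comment.

```python
def missing_newline_warning(fmt, file, line, func):
    """Return a warning string if `fmt` lacks a trailing newline, else None.

    Hypothetical model of the check; the real one lives in the Lustre
    debug logging code and is not reproduced here.
    """
    if fmt.endswith("\n"):
        return None
    return f"format at {file}:{line}:{func} doesn't end in newline"


if __name__ == "__main__":
    # A format without the trailing newline produces the warning ...
    print(missing_newline_warning("try to reserve %d.", "osc_cache.c", 1524,
                                  "osc_enter_cache"))
    # ... and adding "\n" to the format silences it.
    print(missing_newline_warning("try to reserve %d.\n", "osc_cache.c", 1524,
                                  "osc_enter_cache"))
```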

Comment by Johann Lombardi (Inactive) [ 24/Sep/14 ]

hm, the patch looks sane to me ... strange

Comment by Oleg Drokin [ 01/Oct/14 ]

I reverted this patch due to all this message spam that was also affecting Cliff.
Also, LU-5656 seems to be caused by this patch too: ever since I reverted it, the problem seems to be gone (still testing for some more time to be 100% sure, but by this time it was already hitting in previous testing).

Comment by Johann Lombardi (Inactive) [ 01/Oct/14 ]

I have pushed a new version which should address the warning: http://review.whamcloud.com/12146
As for the umount issue, I have no clue for now how this patch could trigger it ...

Comment by Johann Lombardi (Inactive) [ 08/Oct/14 ]

The new version should fix the umount problem. Should be tested on the soak cluster shortly.

Comment by Gerrit Updater [ 03/Feb/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/12146/
Subject: LU-5521 grant: quiet message on grant waiting timeout
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: c2a0ab016383be46a8d04c00ff163eb6a4550f58

Comment by Peter Jones [ 03/Feb/15 ]

Landed for 2.7
