[LU-4538] Move unnecessary warnings under debug flags Created: 24/Jan/14  Updated: 15/Aug/14  Resolved: 15/Aug/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Patrick Farrell (Inactive) Assignee: Cliff White (Inactive)
Resolution: Won't Fix Votes: 0
Labels: patch

Severity: 3
Rank (Obsolete): 12405

 Description   

During normal testing on our systems, Cray sees this message logged extremely often:

(The mmstress test from the Linux Test Project, in particular, triggers it constantly.)

This LU proposes moving that message under the D_PAGE debug flag, rather than printing it unconditionally as a warning.

Similarly, we see the error message described below quite regularly, without any associated problems. I propose moving that message from LDLM_ERROR to LDLM_DEBUG.

The following error message always dumps a stack trace. The trace, and perhaps the message itself, look like debug information that does not indicate a real problem. These messages clutter the console log and mislead or confuse debugging efforts.

22:54:15 LustreError: 4906:0:(ldlm_lockd.c:433:ldlm_add_waiting_lock()) ### not waiting on destroyed lock (bug 5653) ns: filter-scratch-OST0001_UUID lock: ffff88038a52e7c0/0xaced045b46120b6d lrc: 1/0,0 mode: -/PW res: [0x1018066:0x0:0x0].0 rrc: 443 type: EXT [25559040->25604095] (req 25559040>25604095) flags: 0x74801000000020 nid: 58@gni remote: 0x67c48314886320c5 expref: 162 pid: 4347 timeout: 0 lvb_type: 0
22:54:15 Pid: 4906, comm: ldlm_cn02_025
22:54:15 Call Trace:
22:54:15 [<ffffffffa02bb8d5>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
22:54:15 [<ffffffffa069a5cb>] ldlm_add_waiting_lock+0x1db/0x310 [ptlrpc]
22:54:15 [<ffffffffa069bd58>] ldlm_server_completion_ast+0x548/0x6c0 [ptlrpc]
22:54:15 [<ffffffffa069b810>] ? ldlm_server_completion_ast+0x0/0x6c0 [ptlrpc]
22:54:15 [<ffffffffa066f0cc>] ldlm_work_cp_ast_lock+0xcc/0x200 [ptlrpc]
22:54:15 [<ffffffffa06b004c>] ptlrpc_set_wait+0x6c/0x860 [ptlrpc]
22:54:15 [<ffffffffa02d09b8>] ? cfs_hash_multi_bd_lock+0x68/0xc0 [libcfs]
22:54:15 [<ffffffffa06acd2a>] ? ptlrpc_prep_set+0xfa/0x2f0 [ptlrpc]
22:54:15 [<ffffffffa066f000>] ? ldlm_work_cp_ast_lock+0x0/0x200 [ptlrpc]
22:54:15 [<ffffffffa0671f3b>] ldlm_run_ast_work+0x1bb/0x470 [ptlrpc]
22:54:15 [<ffffffffa0672304>] ldlm_reprocess_all+0x114/0x300 [ptlrpc]
22:54:15 [<ffffffffa0694737>] ldlm_request_cancel+0x277/0x410 [ptlrpc]
22:54:15 [<ffffffffa0694a0d>] ldlm_handle_cancel+0x13d/0x240 [ptlrpc]
22:54:15 [<ffffffffa0699f29>] ldlm_cancel_handler+0x1e9/0x500 [ptlrpc]
22:54:15 [<ffffffffa06ca705>] ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
22:54:15 [<ffffffffa02bc4be>] ? cfs_timer_arm+0xe/0x10 [libcfs]
22:54:15 [<ffffffffa02cd2ef>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
22:54:15 [<ffffffffa06c1e19>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
22:54:15 [<ffffffff81051389>] ? __wake_up_common+0x59/0x90
22:54:15 [<ffffffffa06cba6d>] ptlrpc_main+0xaed/0x1730 [ptlrpc]
22:54:15 [<ffffffffa06caf80>] ? ptlrpc_main+0x0/0x1730 [ptlrpc]
22:54:15 [<ffffffff81096136>] kthread+0x96/0xa0
22:54:15 [<ffffffff8100c0ca>] child_rip+0xa/0x20
22:54:15 [<ffffffff810960a0>] ? kthread+0x0/0xa0
22:54:15 [<ffffffff8100c0c0>] ? child_rip+0x0/0x20

A patch will be forthcoming shortly.



 Comments   
Comment by Oleg Drokin [ 27/Jan/14 ]

Actually, I would not dismiss this message as a harmless warning; somebody needs to dig into it to see why a not-yet-granted lock is already destroyed when we first try to conflict with it.

Comment by Patrick Farrell (Inactive) [ 12/Feb/14 ]

A patch that moves the vvp_io message under D_PAGE and removes the stack trace, while leaving the LDLM_ERROR intact, is here:

http://review.whamcloud.com/9243

Comment by Cliff White (Inactive) [ 08/May/14 ]

It appears from Gerrit that we should leave this message in. Is it okay to close this bug, since the issue is being worked under LU-4591?

Comment by Cliff White (Inactive) [ 11/Jul/14 ]

I am going to close this issue; please reopen if you have further concerns.

Comment by Cliff White (Inactive) [ 15/Aug/14 ]

Please reopen if you have further questions.
