[LU-5691] hsm: Cannot find running request for cookie 0x539b6fc2 on fid=[0x200000bd0:0x5d57:0x0] Created: 30/Sep/14  Updated: 20/Jul/15  Resolved: 05/Nov/14

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0, Lustre 2.5.3
Fix Version/s: Lustre 2.7.0

Type: Bug Priority: Minor
Reporter: Frank Zago (Inactive) Assignee: James Nunez (Inactive)
Resolution: Fixed Votes: 0
Labels: patch
Environment: CentOS 6.5

Issue Links:
Related
Severity: 3
Rank (Obsolete): 15932

 Description   

After an error in the copytool, the MDS starts logging the following messages in the system log:

Jun 14 04:28:57 tassrv01 kernel: LustreError: 1535:0:(mdt_coordinator.c:1464:mdt_hsm_update_request_state()) tas01-MDT0000: Cannot find running request for cookie 0x539b6fc2 on fid=[0x200000bd0:0x5d57:0x0]
Jun 14 04:28:57 tassrv01 kernel: LustreError: 1535:0:(mdt_coordinator.c:1464:mdt_hsm_update_request_state()) Skipped 59 previous similar messages
Jun 14 04:28:57 tassrv01 kernel: LustreError: 1535:0:(mdt_coordinator.c:338:mdt_coordinator_cb()) tas01-MDT0000: Cannot cleanup timeouted request: [0x200000bd0:0x5d57:0x0] for cookie 0x539b6fc2 action=CANCEL
Jun 14 04:28:57 tassrv01 kernel: LustreError: 1535:0:(mdt_coordinator.c:338:mdt_coordinator_cb()) Skipped 59 previous similar messages
Jun 14 04:38:58 tassrv01 kernel: LustreError: 1535:0:(mdt_coordinator.c:1464:mdt_hsm_update_request_state()) tas01-MDT0000: Cannot find running request for cookie 0x539b6fc2 on fid=[0x200000bd0:0x5d57:0x0]
Jun 14 04:38:58 tassrv01 kernel: LustreError: 1535:0:(mdt_coordinator.c:1464:mdt_hsm_update_request_state()) Skipped 59 previous similar messages
Jun 14 04:38:58 tassrv01 kernel: LustreError: 1535:0:(mdt_coordinator.c:338:mdt_coordinator_cb()) tas01-MDT0000: Cannot cleanup timeouted request: [0x200000bd0:0x5d57:0x0] for cookie 0x539b6fc2 action=CANCEL
Jun 14 04:38:58 tassrv01 kernel: LustreError: 1535:0:(mdt_coordinator.c:338:mdt_coordinator_cb()) Skipped 59 previous similar messages

This continues indefinitely and generates an enormous number of messages, as the skipped-message counts below show:

LustreError: 2028:0:(mdt_coordinator.c:1465:mdt_hsm_update_request_state()) tas01-MDT0000: Cannot find running request for cookie 0x54249515 on fid=[0x200000404:0x15caa:0x0]
LustreError: 2028:0:(mdt_coordinator.c:1465:mdt_hsm_update_request_state()) Skipped 15979999 previous similar messages
LustreError: 2028:0:(mdt_coordinator.c:339:mdt_coordinator_cb()) tas01-MDT0000: Cannot cleanup timeouted request: [0x200000404:0x15caa:0x0] for cookie 0x54249515 action=CANCEL
LustreError: 2028:0:(mdt_coordinator.c:339:mdt_coordinator_cb()) Skipped 15979999 previous similar messages

To reproduce, shut down the copytool while it is processing requests, then restart it. Several such cycles may be needed before the messages appear.
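
To gauge how badly a given MDS is affected, the log lines quoted above can be tallied per cookie and FID. The following is a minimal sketch, not part of Lustre or of the proposed fix: the function name `summarize` and the `SAMPLE` list are hypothetical, the regular expressions are derived only from the messages shown in this report, and crediting each "Skipped N previous similar messages" line to the most recently seen cookie is an assumption about how the rate-limited messages group.

```python
import re
from collections import Counter

# Patterns derived from the messages quoted in this report only.
REQUEST_RE = re.compile(
    r"mdt_hsm_update_request_state\(\)\) (?P<target>\S+): "
    r"Cannot find running request for cookie (?P<cookie>0x[0-9a-f]+) "
    r"on fid=(?P<fid>\[[^\]]+\])"
)
SKIPPED_RE = re.compile(
    r"mdt_hsm_update_request_state\(\)\) "
    r"Skipped (?P<n>\d+) previous similar messages"
)


def summarize(lines):
    """Count 'Cannot find running request' events per (target, cookie, fid).

    Assumption: a 'Skipped N' line is credited to the last request seen.
    """
    counts = Counter()
    last_key = None
    for line in lines:
        m = REQUEST_RE.search(line)
        if m:
            last_key = (m["target"], m["cookie"], m["fid"])
            counts[last_key] += 1
            continue
        m = SKIPPED_RE.search(line)
        if m and last_key is not None:
            counts[last_key] += int(m["n"])
    return counts


# Two lines copied from the second log excerpt in this report.
SAMPLE = [
    "LustreError: 2028:0:(mdt_coordinator.c:1465:mdt_hsm_update_request_state()) "
    "tas01-MDT0000: Cannot find running request for cookie 0x54249515 "
    "on fid=[0x200000404:0x15caa:0x0]",
    "LustreError: 2028:0:(mdt_coordinator.c:1465:mdt_hsm_update_request_state()) "
    "Skipped 15979999 previous similar messages",
]

if __name__ == "__main__":
    for (target, cookie, fid), n in summarize(SAMPLE).most_common():
        print(f"{target} cookie={cookie} fid={fid} events={n}")
```

In practice the input would be the lines of the MDS system log rather than `SAMPLE`; a single cookie with a count in the millions points to the stuck request described here.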



 Comments   
Comment by Frank Zago (Inactive) [ 30/Sep/14 ]

Fix proposed: http://review.whamcloud.com/12142

Note that the commit message may need some rewording.

Comment by James Nunez (Inactive) [ 05/Nov/14 ]

Patch landed to master (pre-2.7)
