[LU-15385] Resending request on EINPROGRESS Created: 17/Dec/21  Updated: 23/Dec/21  Resolved: 23/Dec/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.6
Fix Version/s: None

Type: Bug Priority: Critical
Reporter: Mahmoud Hanafi Assignee: Peter Jones
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

Lustre 2.12.6


Severity: 2
Rank (Obsolete): 9223372036854775807

 Description   

We are seeing EINPROGRESS errors on the clients, and some threads deadlock on the client. I am not sure whether the two are related.
Evicting the client clears the deadlocked threads.

HUNG THREAD

('5851', 'usm3d.productio')
[<ffffffffc154586f>] osc_io_setattr_end+0xaf/0x280 [osc]
[<ffffffffc0f4cd78>] cl_io_end+0x58/0x140 [obdclass]
[<ffffffffc14defbb>] lov_io_end_wrapper+0xcb/0xd0 [lov]
[<ffffffffc14df3f7>] lov_io_call.isra.33+0x77/0x120 [lov]
[<ffffffffc14df4d2>] lov_io_end+0x32/0xa0 [lov]
[<ffffffffc0f4cd78>] cl_io_end+0x58/0x140 [obdclass]
[<ffffffffc0f4f40f>] cl_io_loop+0x9f/0x1d0 [obdclass]
[<ffffffffc16227d1>] cl_setattr_ost+0x221/0x380 [lustre]
[<ffffffffc15fdcc8>] ll_setattr_raw+0xa18/0xfa0 [lustre]
[<ffffffffa8289a4c>] notify_change+0x26c/0x440
[<ffffffffa826625e>] do_truncate+0x5e/0x90
[<ffffffffa826657a>] do_sys_ftruncate.constprop.12+0xea/0xf0
[<ffffffffa8004954>] do_syscall_64+0x74/0x160
[<ffffffffa88000b6>] entry_SYSCALL_64_after_hwframe+0x59/0xbe
[<ffffffffffffffff>] 0xffffffffffffffff

[Fri Dec 17 12:57:54 2021] LustreError: 3382:0:(client.c:1440:after_reply()) Skipped 14 previous similar messages
[Fri Dec 17 13:10:15 2021] LustreError: 3382:0:(client.c:1440:after_reply()) @@@ Resending request on EINPROGRESS  req@ffff9e532fcf2480 x1719326682210048/t0(0) o10->nbp17-OST000f-osc-ffff9e56ce3b9800@10.151.27.149@o2ib:6/4 lens 440/400 e 0 to 0 dl 1639776037 ref 1 fl Rpc:R/2/0 rc 0/-115
[Fri Dec 17 13:10:15 2021] LustreError: 3382:0:(client.c:1440:after_reply()) Skipped 14 previous similar messages
[Fri Dec 17 13:22:42 2021] LustreError: 3382:0:(client.c:1440:after_reply()) @@@ Resending request on EINPROGRESS  req@ffff9e551ea04040 x1719326682299840/t0(0) o10->nbp17-OST0008-osc-ffff9e56ce3b9800@10.151.27.146@o2ib:6/4 lens 440/400 e 0 to 0 dl 1639776785 ref 1 fl Rpc:R/2/0 rc 0/-115
[Fri Dec 17 13:22:42 2021] LustreError: 3382:0:(client.c:1440:after_reply()) Skipped 14 previous similar messages
[Fri Dec 17 13:35:20 2021] LustreError: 3382:0:(client.c:1440:after_reply()) @@@ Resending request on EINPROGRESS  req@ffff9e532fcf2480 x1719326682389376/t0(0) o10->nbp17-OST000f-osc-ffff9e56ce3b9800@10.151.27.149@o2ib:6/4 lens 440/400 e 0 to 0 dl 1639777543 ref 1 fl Rpc:R/2/0 rc 0/-115
[Fri Dec 17 13:35:20 2021] LustreError: 3382:0:(client.c:1440:after_reply()) Skipped 14 previous similar messages
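
For anyone triaging similar logs, the fields in the lines above can be pulled apart with a short script. This is only a sketch: `parse_resend_line` and `LINUX_EINPROGRESS` are hypothetical names, the regular expression is fitted to the lines quoted in this ticket rather than to every Lustre message format, and reading the trailing `rc a/b` pair as request status followed by reply status is an assumption. On Linux, errno 115 is EINPROGRESS, which matches the `-115` shown. Opcode 10 appears to be the OST punch (truncate) request, which would be consistent with the ftruncate call in the hung thread's stack, but that mapping is also stated here as an assumption.

```python
import re

# Assumption: the client runs Linux, where EINPROGRESS is errno 115.
LINUX_EINPROGRESS = 115

# Fitted to the "Resending request on EINPROGRESS" lines in this ticket:
# captures the opcode after "o", the target name before the first "@",
# and the two integers in the trailing "rc a/b" field.
LINE_RE = re.compile(
    r"\bo(?P<opc>\d+)->(?P<target>\S+?)@\S+.*"
    r"\brc (?P<rc>-?\d+)/(?P<reply_rc>-?\d+)"
)


def parse_resend_line(line):
    """Return opcode, target and rc fields from one log line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "opcode": int(m.group("opc")),
        "target": m.group("target"),
        "rc": int(m.group("rc")),
        "reply_rc": int(m.group("reply_rc")),
    }


if __name__ == "__main__":
    sample = (
        "LustreError: 3382:0:(client.c:1440:after_reply()) @@@ Resending "
        "request on EINPROGRESS  req@ffff9e532fcf2480 x1719326682210048/t0(0) "
        "o10->nbp17-OST000f-osc-ffff9e56ce3b9800@10.151.27.149@o2ib:6/4 "
        "lens 440/400 e 0 to 0 dl 1639776037 ref 1 fl Rpc:R/2/0 rc 0/-115"
    )
    fields = parse_resend_line(sample)
    print(fields)
    print("EINPROGRESS reply:", fields["reply_rc"] == -LINUX_EINPROGRESS)
```

Grouping the parsed `target` values across a full log would show whether the resends cluster on a few OSTs (here OST000f and OST0008), which is one way to narrow down which server to inspect.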
 


 Comments   
Comment by Peter Jones [ 20/Dec/21 ]

Mahmoud

Is this vanilla 2.12.6 or are any patches applied?

Peter

Comment by Etienne Aujames [ 23/Dec/21 ]

Could it be related to LU-15115 ("ptlrpc resend on EINPROGRESS timeouts can be not correct")?

Comment by Mahmoud Hanafi [ 23/Dec/21 ]

We can close this case. The problem turned out to be stuck threads on the server.

Generated at Sat Feb 10 03:17:51 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.