Details
-
Bug
-
Resolution: Fixed
-
Major
-
Lustre 2.11.0
-
3
-
9223372036854775807
Description
There is a race between blocking ast send and ldlm_request_cancel. The first thread is ldlm_server_blocking_ast, the second is ldlm_request_cancel.
1. ldlm_server_blocking_ast
2. at_measure with current time
1. at_get before at_current became at_max
1. rq_delay_limit = 1.5*current_time
1. ptlrpc_send_limit_expired
1. cfs_time_seconds wrong type conversion (After 64bit time fix this is not happen)
1. ETIMEDOUT
1. Client eviction
So, at master 2.11 AT history is affected, and wrong at_get with current time. And evictions are impossible, instead of older 2.10.54 etc.
00010000:00010000:6.0:1518731697.761061:0:104827:0:(ldlm_lockd.c:1317:ldlm_handle_enqueue0()) ### server-side enqueue handler, new lock created ns: filter-snx11210-OST0004_UUID lock: ffff88060b224500/0x29e2650fb9a5aa0a lrc: 2/0,0 mode: --/PW res: [0x20567e43:0x0:0x0].0 rrc: 3 type: EXT [0->0] (req 0->0) flags: 0x40000000000000 nid: local remote: 0xec7dc906bbe31375 cl: expref: -99 pid: 104827 timeout: 0 lvb_type: 0 00010000:00010000:6.0:1518731697.761071:0:104827:0:(ldlm_lock.c:762:ldlm_add_bl_work_item()) ### lock incompatible; sending blocking AST. ns: filter-snx11210-OST0004_UUID lock: ffff8806d0052100/0x29e2650fb9a5a9d9 lrc: 3/0,0 mode: PR/PR res: [0x20567e43:0x0:0x0].0 rrc: 4 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40000000000000 nid: 3056@gni remote: 0xec7dc906bbe31359 cl: 8e1b6d75-1b38-60f5-93bb-46c98cadf380 expref: 7 pid: 104827 timeout: 0 lvb_type: 1 00010000:00010000:6.0:1518731697.761075:0:104827:0:(ldlm_extent.c:317:ldlm_check_contention()) contended locks = 0 00010000:00010000:6.0:1518731697.761076:0:104827:0:(ldlm_extent.c:317:ldlm_check_contention()) contended locks = 0 00010000:00010000:6.0:1518731697.761077:0:104827:0:(ldlm_resource.c:1318:ldlm_resource_add_lock()) ### About to add this lock: ns: filter-snx11210-OST0004_UUID lock: ffff88060b224500/0x29e2650fb9a5aa0a lrc: 4/0,0 mode: --/PW res: [0x20567e43:0x0:0x0].0 rrc: 4 type: EXT [0->3543039] (req 0->3543039) flags: 0x50000000000000 nid: 3056@gni remote: 0xec7dc906bbe31375 cl: 8e1b6d75-1b38-60f5-93bb-46c98cadf380 expref: 7 pid: 104827 timeout: 0 lvb_type: 0 00010000:00010000:6.0:1518731697.761088:0:104827:0:(ldlm_lockd.c:897:ldlm_server_blocking_ast()) ### server preparing blocking AST ns: filter-snx11210-OST0004_UUID lock: ffff8806d0052100/0x29e2650fb9a5a9d9 lrc: 4/0,0 mode: PR/PR res: [0x20567e43:0x0:0x0].0 rrc: 5 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x50000000000020 nid: 3056@gni remote: 0xec7dc906bbe31359 cl: 8e1b6d75-1b38-60f5-93bb-46c98cadf380 expref: 7 pid: 104827 timeout: 0 lvb_type: 1 00010000:00010000:17.0:1518731697.761089:0:30249:0:(ldlm_lockd.c:1704:ldlm_request_cancel()) ### server cancels blocked lock after 1518731697s ns: filter-snx11210-OST0004_UUID lock: ffff8806d0052100/0x29e2650fb9a5a9d9 lrc: 4/0,0 mode: PR/PR res: [0x20567e43:0x0:0x0].0 rrc: 5 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x50000000000020 nid: 3056@gni remote: 0xec7dc906bbe31359 cl: 8e1b6d75-1b38-60f5-93bb-46c98cadf380 expref: 7 pid: 104827 timeout: 0 lvb_type: 1 00010000:00010000:6.0:1518731697.761093:0:104827:0:(ldlm_lockd.c:481:ldlm_add_waiting_lock()) ### adding to wait list(timeout: 600, AT: on) ns: filter-snx11210-OST0004_UUID lock: ffff8806d0052100/0x29e2650fb9a5a9d9 lrc: 5/0,0 mode: PR/PR res: [0x20567e43:0x0:0x0].0 rrc: 5 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x70000000000020 nid: 3056@gni remote: 0xec7dc906bbe31359 cl: 8e1b6d75-1b38-60f5-93bb-46c98cadf380 expref: 7 pid: 104827 timeout: 15911500223 lvb_type: 1 00000100:00080000:6.0:1518731697.761098:0:104827:0:(client.c:1113:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff880364d7d0c0 x1580326886726864/t0(0) o104->snx11210-OST0004@3056@gni:15/16 lens 296/224 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1 00010000:00010000:17.0:1518731697.761099:0:30249:0:(ldlm_lockd.c:546:ldlm_del_waiting_lock()) ### removed ns: filter-snx11210-OST0004_UUID lock: ffff8806d0052100/0x29e2650fb9a5a9d9 lrc: 5/0,0 mode: PR/PR res: [0x20567e43:0x0:0x0].0 rrc: 4 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x50000000000020 nid: 3056@gni remote: 0xec7dc906bbe31359 cl: 8e1b6d75-1b38-60f5-93bb-46c98cadf380 expref: 7 pid: 104827 timeout: 15911500223 lvb_type: 1 00010000:00020000:6.0:1518731697.761104:0:104827:0:(ldlm_lockd.c:679:ldlm_handle_ast_error()) ### client (nid 3056@gni) failed to reply to blocking AST (req status 0 rc -110), evict it ns: filter-snx11210-OST0004_UUID lock: ffff8806d0052100/0x29e2650fb9a5a9d9 lrc: 4/0,0 mode: PR/PR res: [0x20567e43:0x0:0x0].0 rrc: 5 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x50000000000020 nid: 3056@gni remote: 0xec7dc906bbe31359 cl: 8e1b6d75-1b38-60f5-93bb-46c98cadf380 expref: 7 pid: 104827 timeout: 15911500223 lvb_type: 1 00010000:00010000:17.0:1518731697.761112:0:30249:0:(ldlm_extent.c:317:ldlm_check_contention()) contended locks = 0 00010000:00010000:17.0:1518731697.761113:0:30249:0:(ldlm_extent.c:317:ldlm_check_contention()) contended locks = 0 00010000:00010000:17.0:1518731697.761117:0:30249:0:(ldlm_lock.c:783:ldlm_add_cp_work_item()) ### lock granted; sending completion AST. ns: filter-snx11210-OST0004_UUID lock: ffff88060b224500/0x29e2650fb9a5aa0a lrc: 3/0,0 mode: PW/PW res: [0x20567e43:0x0:0x0].0 rrc: 5 type: EXT [0->18446744073709551615] (req 0->3543039) flags: 0x50080000000000 nid: 3056@gni remote: 0xec7dc906bbe31375 cl: 8e1b6d75-1b38-60f5-93bb-46c98cadf380 expref: 7 pid: 104827 timeout: 0 lvb_type: 0 00010000:00010000:17.0:1518731697.761123:0:30249:0:(ldlm_resource.c:1318:ldlm_resource_add_lock()) ### About to add this lock: ns: filter-snx11210-OST0004_UUID lock: ffff88060b224500/0x29e2650fb9a5aa0a lrc: 4/0,0 mode: PW/PW res: [0x20567e43:0x0:0x0].0 rrc: 5 type: EXT [0->18446744073709551615] (req 0->3543039) flags: 0x50080000000000 nid: 3056@gni remote: 0xec7dc906bbe31375 cl: 8e1b6d75-1b38-60f5-93bb-46c98cadf380 expref: 7 pid: 104827 timeout: 0 lvb_type: 0 00010000:00010000:17.0:1518731697.761143:0:30249:0:(ldlm_lockd.c:1013:ldlm_server_completion_ast()) ### server preparing completion AST ns: filter-snx11210-OST0004_UUID lock: ffff88060b224500/0x29e2650fb9a5aa0a lrc: 4/0,0 mode: PW/PW res: [0x20567e43:0x0:0x0].0 rrc: 5 type: EXT [0->18446744073709551615] (req 0->3543039) flags: 0x40000000000000 nid: 3056@gni remote: 0xec7dc906bbe31375 cl: 8e1b6d75-1b38-60f5-93bb-46c98cadf380 expref: 7 pid: 104827 timeout: 0 lvb_type: 0