Details
-
Bug
-
Resolution: Cannot Reproduce
-
Minor
-
None
-
Lustre 2.12.0
-
None
-
2
-
9223372036854775807
Description
I think the patch https://review.whamcloud.com/#/c/31690/ for LU-10826 causes further problems.
After applying patch https://review.whamcloud.com/#/c/31690/ and setting test_req_buffer_pressure=1, OOM is prevented, but some clients are skipped during recovery.
[root@voss05 ~]# lctl get_param obdfilter.*.recovery_status
obdfilter.scratch-OST0024.recovery_status=
status: COMPLETE
recovery_start: 1525317355
recovery_duration: 54
completed_clients: 7249/7249
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST0025.recovery_status=
status: COMPLETE
recovery_start: 1525317353
recovery_duration: 56
completed_clients: 7031/7031
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST0026.recovery_status=
status: COMPLETE
recovery_start: 1525317352
recovery_duration: 57
completed_clients: 8168/8168
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST0027.recovery_status=
status: COMPLETE
recovery_start: 1525317350
recovery_duration: 59
completed_clients: 8195/8195
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST0028.recovery_status=
status: COMPLETE
recovery_start: 1525317355
recovery_duration: 54
completed_clients: 7984/7984
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST0029.recovery_status=
status: COMPLETE
recovery_start: 1525317352
recovery_duration: 57
completed_clients: 7985/7985
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST002a.recovery_status=
status: COMPLETE
recovery_start: 1525317354
recovery_duration: 55
completed_clients: 8329/8329
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST002b.recovery_status=
status: COMPLETE
recovery_start: 1525317351
recovery_duration: 58
completed_clients: 8291/8291
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST002c.recovery_status=
status: COMPLETE
recovery_start: 1525317350
recovery_duration: 59
completed_clients: 8286/8286
replayed_requests: 0
last_transno: 94489280512
VBR: DISABLED
IR: ENABLED
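With this many OSTs, checking the recovery_status output by eye is error-prone. The following is a minimal sketch, not part of Lustre, of how the output could be checked from a script. The helper names parse_recovery_status and incomplete_targets are hypothetical, and the sketch assumes the field layout shown in the output above (key: value pairs whose values contain no whitespace).

```python
import re

# Matches the "obdfilter.<target>.recovery_status=" header that starts each record.
HEADER = re.compile(r"obdfilter\.(\S+?)\.recovery_status=")
# Matches "key: value" pairs; assumes values contain no whitespace.
FIELD = re.compile(r"(\w+):\s*(\S+)")

# One record copied from the output above, used as sample input.
SAMPLE = (
    "obdfilter.scratch-OST0024.recovery_status=\n"
    "status: COMPLETE\n"
    "recovery_start: 1525317355\n"
    "recovery_duration: 54\n"
    "completed_clients: 7249/7249\n"
    "replayed_requests: 0\n"
    "last_transno: 98784247808\n"
    "VBR: DISABLED\n"
    "IR: ENABLED\n"
)


def parse_recovery_status(text):
    """Return {target: {field: value}} parsed from lctl get_param output.

    Hypothetical helper: works on both one-field-per-line and
    single-line (whitespace-separated) forms of the output.
    """
    result = {}
    headers = list(HEADER.finditer(text))
    for i, match in enumerate(headers):
        # A record runs from the end of its header to the next header.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        body = text[match.end():end]
        result[match.group(1)] = dict(FIELD.findall(body))
    return result


def incomplete_targets(parsed):
    """List targets whose status is not COMPLETE or whose client counts differ."""
    flagged = []
    for target, fields in parsed.items():
        done, _, total = fields.get("completed_clients", "").partition("/")
        if fields.get("status") != "COMPLETE" or done != total:
            flagged.append(target)
    return flagged
```

In the output above every target reports matching completed_clients counts, so a check like this would not flag any of them; it only catches targets where the reported counts or status differ.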
Also, recovery sometimes still is never triggered, e.g. in a failover situation.
I see the following messages after restarting the OSTs:
[ 9169.158440] Lustre: 14598:0:(events.c:368:request_in_callback()) All ost request buffers busy
[ 9169.158447] Lustre: 14598:0:(events.c:368:request_in_callback()) Skipped 3508 previous similar messages
Issue Links
- is related to: LU-10826 Regression in LU-9372 on OPA enviroment and no recovery triggered (Resolved)