Details
- Bug
- Resolution: Cannot Reproduce
- Minor
- Lustre 2.12.0
Description
I think the patch https://review.whamcloud.com/#/c/31690/ for LU-10826 is more problematic.
After applying patch https://review.whamcloud.com/#/c/31690/ and setting test_req_buffer_pressure=1, the OOM is prevented, but some clients are skipped during recovery.
[root@voss05 ~]# lctl get_param obdfilter.*.recovery_status
obdfilter.scratch-OST0024.recovery_status=
status: COMPLETE
recovery_start: 1525317355
recovery_duration: 54
completed_clients: 7249/7249
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST0025.recovery_status=
status: COMPLETE
recovery_start: 1525317353
recovery_duration: 56
completed_clients: 7031/7031
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST0026.recovery_status=
status: COMPLETE
recovery_start: 1525317352
recovery_duration: 57
completed_clients: 8168/8168
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST0027.recovery_status=
status: COMPLETE
recovery_start: 1525317350
recovery_duration: 59
completed_clients: 8195/8195
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST0028.recovery_status=
status: COMPLETE
recovery_start: 1525317355
recovery_duration: 54
completed_clients: 7984/7984
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST0029.recovery_status=
status: COMPLETE
recovery_start: 1525317352
recovery_duration: 57
completed_clients: 7985/7985
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST002a.recovery_status=
status: COMPLETE
recovery_start: 1525317354
recovery_duration: 55
completed_clients: 8329/8329
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST002b.recovery_status=
status: COMPLETE
recovery_start: 1525317351
recovery_duration: 58
completed_clients: 8291/8291
replayed_requests: 0
last_transno: 98784247808
VBR: DISABLED
IR: ENABLED
obdfilter.scratch-OST002c.recovery_status=
status: COMPLETE
recovery_start: 1525317350
recovery_duration: 59
completed_clients: 8286/8286
replayed_requests: 0
last_transno: 94489280512
VBR: DISABLED
IR: ENABLED
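To make the per-OST client counts in output like this easier to compare, here is a minimal Python sketch that parses recovery_status text and lists targets whose client total is below the largest total seen. The function names (parse_recovery_status, client_counts, below_max) are hypothetical helpers, not part of Lustre, and the sample contains only two of the targets from the output above. Comparing against the largest total is an assumption made for illustration: a lower total on one OST does not by itself prove that clients were skipped.

```python
import re

# Two targets copied from the recovery_status output in this ticket.
SAMPLE = """\
obdfilter.scratch-OST0024.recovery_status=
status: COMPLETE
recovery_start: 1525317355
recovery_duration: 54
completed_clients: 7249/7249
obdfilter.scratch-OST0026.recovery_status=
status: COMPLETE
recovery_start: 1525317352
recovery_duration: 57
completed_clients: 8168/8168
"""

def parse_recovery_status(text):
    """Return {target: {field: value}} from recovery_status text."""
    # Splitting on a capturing group yields [prefix, name1, body1, name2, body2, ...].
    parts = re.split(r"(\S+)\.recovery_status=", text)
    targets = {}
    for name, body in zip(parts[1::2], parts[2::2]):
        targets[name] = dict(re.findall(r"(\w+):\s*(\S+)", body))
    return targets

def client_counts(targets):
    """Return {target: (completed, total)} from the completed_clients field."""
    counts = {}
    for name, fields in targets.items():
        done, total = fields["completed_clients"].split("/")
        counts[name] = (int(done), int(total))
    return counts

def below_max(counts):
    """List targets whose client total is below the largest total (assumed heuristic)."""
    largest = max(total for _, total in counts.values())
    return sorted(name for name, (_, total) in counts.items() if total < largest)

if __name__ == "__main__":
    counts = client_counts(parse_recovery_status(SAMPLE))
    for name in below_max(counts):
        done, total = counts[name]
        print(f"{name}: {done}/{total} clients")
```

The parser splits on the "recovery_status=" header rather than on line breaks, so it should also cope with output that has been flattened onto a single line.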
Also, recovery is sometimes still never triggered, e.g. in a failover situation.
I see the following messages after restarting the OSTs:
[ 9169.158440] Lustre: 14598:0:(events.c:368:request_in_callback()) All ost request buffers busy
[ 9169.158447] Lustre: 14598:0:(events.c:368:request_in_callback()) Skipped 3508 previous similar messages
Issue Links
- is related to: LU-10826 Regression in LU-9372 on OPA enviroment and no recovery triggered (Resolved)
Hello Shuichi,
Just a small update to let you know that all attempts to reproduce this problem have been unsuccessful so far.
BTW, did you find some time to reproduce it again on your side, so that you can provide the information I requested earlier?