[7566373.794723] LNetError: 130593:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7566373.806293] LNetError: 130593:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7566447.980688] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 5 seconds
[7566447.990943] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 257 previous similar messages
[7566495.301196] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7566495.313366] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7566507.268172] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7566507.276633] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7566507.285331] Lustre: Skipped 6 previous similar messages
[7566674.990260] LNetError: 87044:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7566675.002427] LNetError: 87044:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 151 previous similar messages
[7566776.993255] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583552799/real 1583552799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583552806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7566777.021029] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 38268296 previous similar messages
[7566786.993366] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7566787.004062] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 38278205 previous similar messages
[7566976.206861] LNetError: 87044:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7566976.218334] LNetError: 87044:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7567061.987420] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7567061.997683] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 286 previous similar messages
[7567096.721833] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7567096.734052] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7567108.259766] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7567108.268241] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7567277.470935] LNetError: 87044:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7567277.483106] LNetError: 87044:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 143 previous similar messages
[7567376.999903] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583553399/real 1583553399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583553406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7567377.027670] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 38433115 previous similar messages
[7567387.002485] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7567387.013186] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 38431105 previous similar messages
[7567435.181605] Lustre: fir-OST001f: haven't heard from client 3f9feeb7-0792-4 (at 10.49.26.4@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9c1372b68400, cur 1583553458 expire 1583553308 last 1583553231
[7567435.201694] Lustre: Skipped 5 previous similar messages
[7567577.782418] LNetError: 104380:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7567577.793979] LNetError: 104380:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7567663.994041] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7567664.004300] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 241 previous similar messages
[7567699.270477] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7567699.282664] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7567709.290308] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7567709.298767] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7567878.985549] LNetError: 104380:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7567878.997807] LNetError: 104380:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 135 previous similar messages
[7567977.006489] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583553999/real 1583553999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583554006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7567977.034330] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 37749082 previous similar messages
[7567987.008600] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7567987.019293] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 37712489 previous similar messages
[7568180.212290] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7568180.223853] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7568271.000762] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7568271.011106] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 264 previous similar messages
[7568300.678094] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7568300.690264] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7568310.280912] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7568310.289379] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7568310.301798] Lustre: Skipped 6 previous similar messages
[7568481.486105] LNetError: 117619:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7568481.498359] LNetError: 117619:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 132 previous similar messages
[7568577.013124] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583554599/real 1583554599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583554606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7568577.040918] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 38055247 previous similar messages
[7568587.015238] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7568587.025972] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 38081882 previous similar messages
[7568629.119406] LustreError: 8682:0:(tgt_grant.c:758:tgt_grant_check()) fir-OST001f: cli d4be6328-2552-4 claims 4870144 GRANT, real grant 73728
[7568781.673336] LNetError: 117619:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7568781.684944] LNetError: 117619:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7568872.007389] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.217@o2ib7: 0 seconds
[7568872.017651] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 192 previous similar messages
[7568903.156761] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7568903.168933] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7568911.376559] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7568911.385058] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7569082.864003] LNetError: 117619:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7569082.876296] LNetError: 117619:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 151 previous similar messages
[7569112.198047] Lustre: fir-OST0019: haven't heard from client 1ecf6944-6593-4 (at 10.49.26.4@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9c3d952b8000, cur 1583555135 expire 1583554985 last 1583554908
[7569112.223189] Lustre: Skipped 5 previous similar messages
[7569177.019745] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583555199/real 1583555199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583555206 ref 2 fl Rpc:eX/2/ffffffff rc 0/-1
[7569177.047518] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 38072374 previous similar messages
[7569187.021862] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7569187.032559] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 38063717 previous similar messages
[7569384.026257] LNetError: 117619:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7569384.037824] LNetError: 117619:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7569476.014061] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7569476.024406] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 274 previous similar messages
[7569505.494398] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7569505.506567] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7569512.471221] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7569512.479713] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7569685.252512] LNetError: 117619:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7569685.264771] LNetError: 117619:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 140 previous similar messages
[7569777.026283] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583555799/real 1583555799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583555806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7569777.054079] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 37644228 previous similar messages
[7569787.028385] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7569787.039083] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 37665797 previous similar messages
[7569902.208084] Lustre: fir-OST001f: haven't heard from client fb88818e-b66b-4 (at 10.49.26.4@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9c4932268c00, cur 1583555925 expire 1583555775 last 1583555698
[7569902.228191] Lustre: Skipped 5 previous similar messages
[7569986.465683] LNetError: 117619:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7569986.477295] LNetError: 117619:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7570096.020812] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 1 seconds
[7570096.031155] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 194 previous similar messages
[7570106.936945] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7570106.949118] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7570113.565630] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7570113.574102] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7570113.582809] Lustre: Skipped 6 previous similar messages
[7570286.726116] LNetError: 117619:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7570286.738375] LNetError: 117619:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 138 previous similar messages
[7570377.032907] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583556399/real 1583556399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583556406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7570377.064738] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 38341165 previous similar messages
[7570387.035014] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7570387.045708] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 38332694 previous similar messages
[7570587.959439] LNetError: 117619:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7570587.970999] LNetError: 117619:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7570697.027449] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7570697.037798] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 297 previous similar messages
[7570709.416607] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7570709.428819] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7570714.910592] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7570714.919059] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7570714.927757] Lustre: Skipped 6 previous similar messages
[7570888.029574] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7570888.041761] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 155 previous similar messages
[7570977.039539] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583556999/real 1583556999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583557006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7570977.067364] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 38390891 previous similar messages
[7570987.041662] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7570987.052359] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 38402243 previous similar messages
[7571190.389973] LNetError: 14558:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7571190.401445] LNetError: 14558:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7571307.034315] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7571307.044571] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 259 previous similar messages
[7571311.906394] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7571311.918570] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7571315.882374] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7571315.890842] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7571490.610770] LNetError: 14558:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7571490.622970] LNetError: 14558:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 163 previous similar messages
[7571577.046580] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583557599/real 1583557599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583557606 ref 2 fl Rpc:eX/2/ffffffff rc 0/-1
[7571577.074368] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 38132939 previous similar messages
[7571587.048701] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7571587.059393] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 38132930 previous similar messages
[7571597.440153] LustreError: 3107:0:(tgt_grant.c:758:tgt_grant_check()) fir-OST001b: cli 9a1654e8-6409-4 claims 466944 GRANT, real grant 0
[7571791.787970] LNetError: 14558:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7571791.799470] LNetError: 14558:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7571913.274457] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7571913.286629] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7571915.041459] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.227@o2ib7: 0 seconds
[7571915.051721] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 310 previous similar messages
[7571916.849240] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7571916.857707] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7572092.974122] LNetError: 14558:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7572092.986294] LNetError: 14558:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 129 previous similar messages
[7572177.053446] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583558199/real 1583558199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583558206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7572177.081217] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 38111847 previous similar messages
[7572187.055557] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7572187.066276] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 38107823 previous similar messages
[7572394.114037] LNetError: 14558:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7572394.125516] LNetError: 14558:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7572515.575396] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7572515.587568] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 9 previous similar messages
[7572516.048366] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.237@o2ib7: 1 seconds
[7572516.058617] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 315 previous similar messages
[7572517.816192] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7572517.824676] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7572559.020831] LustreError: 101264:0:(tgt_grant.c:758:tgt_grant_check()) fir-OST0021: cli 00c7f158-cc8b-4 claims 286720 GRANT, real grant 28672
[7572693.050519] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7572693.062692] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 149 previous similar messages
[7572777.060482] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583558799/real 1583558799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583558806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7572777.088259] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 38493417 previous similar messages
[7572787.062591] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7572787.073322] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 38481770 previous similar messages
[7572996.339208] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7572996.350680] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7573116.833478] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7573116.845669] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 9 previous similar messages
[7573118.783303] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7573118.791779] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7573122.055528] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7573122.065870] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 496 previous similar messages
[7573297.561740] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7573297.573906] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 155 previous similar messages
[7573377.067511] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583559399/real 1583559399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583559406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7573377.095283] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 39317731 previous similar messages
[7573387.069623] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7573387.080348] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 39342783 previous similar messages
[7573597.664214] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7573597.675811] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7573718.115554] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7573718.127734] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7573719.750274] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7573719.758763] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7573723.062588] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7573723.072931] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 415 previous similar messages
[7573898.818754] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7573898.830924] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 165 previous similar messages
[7573977.074619] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583559999/real 1583559999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583560006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7573977.102436] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 38438157 previous similar messages
[7573987.076738] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7573987.087431] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 38408981 previous similar messages
[7574199.911507] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7574199.923010] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7574320.717274] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7574320.725735] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7574321.356719] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7574321.368892] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7574328.069766] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 1 seconds
[7574328.080111] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 346 previous similar messages
[7574501.030898] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7574501.043070] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 169 previous similar messages
[7574577.081683] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583560599/real 1583560599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583560606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7574577.109477] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 38173175 previous similar messages
[7574587.083809] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7574587.094545] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 38203379 previous similar messages
[7574802.133276] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7574802.144781] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7574921.685780] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7574921.694295] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7574922.590776] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7574922.602978] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 10 previous similar messages
[7574940.076960] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7574940.087221] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 386 previous similar messages
[7575102.078874] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7575102.091046] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 166 previous similar messages
[7575153.402304] LustreError: 90686:0:(tgt_grant.c:758:tgt_grant_check()) fir-OST001f: cli d4be6328-2552-4 claims 7901184 GRANT, real grant 4870144
[7575177.088749] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583561199/real 1583561199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583561206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7575177.116523] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 38953852 previous similar messages
[7575187.090866] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7575187.101559] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 38946355 previous similar messages
[7575404.404555] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7575404.416033] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7575522.779460] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7575522.787928] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7575523.830865] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7575523.843041] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7575541.084026] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7575541.094281] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 421 previous similar messages
[7575634.597996] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7575634.610490] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Skipped 108 previous similar messages
[7575705.088685] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7575705.100861] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 197 previous similar messages
[7575777.095708] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583561799/real 1583561799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583561806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7575777.123557] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 38076882 previous similar messages
[7575787.097822] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7575787.108521] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 38024797 previous similar messages
[7576005.675253] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7576005.686760] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7576123.874105] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7576123.882571] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7576128.134617] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7576128.146839] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7576142.090743] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.237@o2ib7: 0 seconds
[7576142.101007] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 366 previous similar messages
[7576306.092585] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7576306.104759] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 179 previous similar messages
[7576377.102351] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583562399/real 1583562399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583562406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7576377.130124] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 37192275 previous similar messages
[7576387.104457] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7576387.115157] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 37039324 previous similar messages
[7576608.302345] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7576608.313894] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7576724.968757] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7576724.977308] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7576733.013250] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.238@o2ib7: -125
[7576733.025430] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7576744.097333] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.237@o2ib7: 0 seconds
[7576744.107589] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 337 previous similar messages
[7576808.912072] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7576907.100073] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7576907.112245] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 187 previous similar messages
[7576977.108738] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583562999/real 1583562999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583563006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7576977.136522] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20390964 previous similar messages
[7576987.110830] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7576987.121529] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20201505 previous similar messages
[7577209.762239] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7577209.773717] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7577325.935778] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7577325.944343] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7577334.527416] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7577334.539596] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7577345.103493] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.237@o2ib7: 0 seconds
[7577345.113755] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 370 previous similar messages
[7577509.105189] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7577509.117360] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 189 previous similar messages
[7577577.114920] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583563599/real 1583563599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583563606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7577577.142698] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 17601201 previous similar messages
[7577587.116985] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7577587.127686] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 17619176 previous similar messages
[7577812.277331] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7577812.288812] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7577892.963313] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7577927.028996] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7577927.037483] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7577952.109690] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 4 seconds
[7577952.119947] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 421 previous similar messages
[7578025.440799] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7578113.301318] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7578113.313487] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 225 previous similar messages
[7578177.120982] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583564199/real 1583564199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583564206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7578177.148758] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16271501 previous similar messages
[7578187.123064] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7578187.133757] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16269642 previous similar messages
[7578232.977633] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7578232.989808] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7578322.322531] Lustre: fir-OST0021: haven't heard from client b3370a0f-7bce-4 (at 10.49.26.4@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9c3feda97c00, cur 1583564345 expire 1583564195 last 1583564118
[7578322.342598] Lustre: Skipped 5 previous similar messages
[7578413.845528] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7578413.857008] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7578527.971862] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7578527.980337] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7578557.115802] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7578557.126060] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 417 previous similar messages
[7578714.117358] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7578714.129531] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 208 previous similar messages
[7578777.127038] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583564799/real 1583564799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583564806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7578777.154808] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 13518004 previous similar messages
[7578787.129080] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7578787.139773] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 13500825 previous similar messages
[7578836.200361] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7578836.212537] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7579016.221403] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7579016.232882] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7579128.960813] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7579128.969272] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7579164.121833] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7579164.132093] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 400 previous similar messages
[7579316.839315] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7579316.851512] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 227 previous similar messages
[7579326.846851] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7579377.132905] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583565399/real 1583565399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583565406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7579377.160673] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16381739 previous similar messages
[7579387.134997] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7579387.145694] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16426943 previous similar messages
[7579437.556539] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7579437.568708] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7579618.527317] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7579618.538909] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7579729.926948] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7579729.935425] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7579768.127932] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 4 seconds
[7579768.138274] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 498 previous similar messages
[7579917.117561] LNetError: 57632:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7579917.129741] LNetError: 57632:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 234 previous similar messages
[7579977.139165] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583565999/real 1583565999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583566006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7579977.166946] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14980414 previous similar messages
[7579987.141253] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7579987.151954] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14964677 previous similar messages
[7580039.864910] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7580039.877110] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7580219.786860] LNetError: 87045:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7580219.798368] LNetError: 87045:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7580330.892261] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7580330.900733] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7580371.134361] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.235@o2ib7: 0 seconds
[7580371.144710] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 467 previous similar messages
[7580520.762066] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7580520.774238] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 239 previous similar messages
[7580577.145528] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583566599/real 1583566599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583566606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7580577.173307] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15138157 previous similar messages
[7580587.147639] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7580587.158334] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15116719 previous similar messages
[7580643.320249] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7580643.332493] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7580822.406161] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7580822.417675] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7580931.859399] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7580931.867882] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7580973.140784] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 1 seconds
[7580973.151043] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 451 previous similar messages
[7581010.220862] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7581122.112311] LNetError: 1155:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7581122.124398] LNetError: 1155:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 242 previous similar messages
[7581177.151870] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583567199/real 1583567199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583567206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7581177.179635] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14820897 previous similar messages
[7581187.153975] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7581187.164671] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14860906 previous similar messages
[7581243.751628] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7581243.763824] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7581423.753480] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7581423.764964] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7581532.825005] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7581532.833878] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7581575.147094] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.239@o2ib7: 0 seconds
[7581575.157353] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 441 previous similar messages
[7581723.148681] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7581723.160854] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 233 previous similar messages
[7581777.158250] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583567799/real 1583567799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583567806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7581777.186024] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14698939 previous similar messages
[7581787.160360] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7581787.171055] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14624715 previous similar messages
[7581849.847076] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.238@o2ib7: -125
[7581849.859275] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 17 previous similar messages
[7582025.832071] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7582025.843547] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7582110.500065] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7582133.793214] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7582133.801725] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7582177.153579] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7582177.163926] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 422 previous similar messages
[7582327.182375] LNetError: 87045:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7582327.194555] LNetError: 87045:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 220 previous similar messages
[7582377.164729] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583568399/real 1583568399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583568406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7582377.192496] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16095968 previous similar messages
[7582387.166863] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7582387.177558] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16211136 previous similar messages
[7582628.011577] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7582628.023059] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7582734.886983] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7582734.895444] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7582747.814759] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7582747.826967] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 17 previous similar messages
[7582782.160111] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7582782.170370] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 432 previous similar messages
[7582928.161686] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7582928.173856] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 221 previous similar messages
[7582977.171200] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583568999/real 1583568999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583569006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7582977.198974] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14626396 previous similar messages
[7582987.173304] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7582987.184006] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14506960 previous similar messages
[7583230.059264] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7583230.070832] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7583335.854543] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7583335.863006] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7583352.684253] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7583352.696431] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7583383.166567] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7583383.176828] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 459 previous similar messages
[7583531.580347] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7583531.592514] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 229 previous similar messages
[7583577.177732] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583569599/real 1583569599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583569606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7583577.205509] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16684551 previous similar messages
[7583587.179759] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7583587.190466] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16703001 previous similar messages
[7583832.352717] LNetError: 87045:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7583832.364214] LNetError: 87045:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7583936.947101] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7583936.955576] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7583952.941759] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7583952.953981] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7583988.173103] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 6 seconds
[7583988.183366] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 401 previous similar messages
[7584133.045863] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7584133.058074] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 208 previous similar messages
[7584177.184141] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583570199/real 1583570199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583570206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7584177.211918] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16565230 previous similar messages
[7584187.186245] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7584187.196937] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16558445 previous similar messages
[7584291.605524] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7584434.499246] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7584434.510767] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7584537.915508] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7584537.924162] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7584550.459183] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7584555.201281] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7584555.213463] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7584597.179670] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 5 seconds
[7584597.190020] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 434 previous similar messages
[7584735.237946] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7584735.250151] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 226 previous similar messages
[7584777.190661] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583570799/real 1583570799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583570806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7584777.218438] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16006273 previous similar messages
[7584787.192970] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7584787.203665] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15965583 previous similar messages
[7584967.468687] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7585036.414438] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7585036.425930] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7585138.985126] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7585138.993619] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7585157.026748] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7585157.038913] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7585203.186245] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7585203.196592] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 439 previous similar messages
[7585337.144994] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7585337.157174] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 229 previous similar messages
[7585377.197130] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583571399/real 1583571399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583571406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7585377.224908] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 13511898 previous similar messages
[7585387.199260] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7585387.209956] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 13553084 previous similar messages
[7585638.022239] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7585638.033732] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7585739.975616] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7585739.984077] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7585757.639442] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7585757.651625] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7585805.192783] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.227@o2ib7: 0 seconds
[7585805.203044] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 458 previous similar messages
[7585853.169455] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7585902.983677] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7585937.194233] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7585937.206411] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 219 previous similar messages
[7585977.203904] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583571999/real 1583571999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583572006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7585977.231741] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15340036 previous similar messages
[7585987.205954] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7585987.216898] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15276199 previous similar messages
[7586240.497626] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7586240.509097] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7586340.943248] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7586340.951838] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7586361.375848] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7586361.388023] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7586408.199321] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 1 seconds
[7586408.209585] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 371 previous similar messages
[7586539.200744] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7586539.212910] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 220 previous similar messages
[7586577.210173] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583572599/real 1583572599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583572606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7586577.237948] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15791025 previous similar messages
[7586587.212250] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7586587.222975] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15951678 previous similar messages
[7586841.994191] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7586842.005703] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7586942.035822] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7586942.044294] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7586963.678391] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7586963.690606] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7587013.205895] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7587013.216154] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 486 previous similar messages
[7587140.207277] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7587140.219443] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 232 previous similar messages
[7587177.216663] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583573199/real 1583573199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583573206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7587177.244435] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16826306 previous similar messages
[7587187.218774] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7587187.229466] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16777712 previous similar messages
[7587444.280810] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7587444.292322] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7587542.979263] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7587542.987723] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7587563.888872] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7587563.901036] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7587619.212445] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 2 seconds
[7587619.222795] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 460 previous similar messages
[7587743.213776] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7587743.225949] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 228 previous similar messages
[7587777.223137] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583573799/real 1583573799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583573806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7587777.250917] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 17123156 previous similar messages
[7587787.225246] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7587787.235945] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 17127782 previous similar messages
[7587979.850331] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7588046.489302] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7588046.500783] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7588143.968819] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7588143.977291] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7588167.108405] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7588167.120571] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7588224.218996] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7588224.229254] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 485 previous similar messages
[7588347.069824] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7588347.081994] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 229 previous similar messages
[7588377.229711] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583574399/real 1583574399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583574406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7588377.257485] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16850404 previous similar messages
[7588387.231751] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7588387.242455] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16846744 previous similar messages
[7588648.582678] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7588648.594162] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7588739.819462] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7588744.937274] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7588744.945746] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7588769.293932] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7588769.306115] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7588825.225506] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7588825.235768] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 420 previous similar messages
[7588947.226847] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7588947.239031] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 223 previous similar messages
[7588977.236152] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583574999/real 1583574999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583575006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7588977.263929] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15295938 previous similar messages
[7588987.238280] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7588987.248975] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15287783 previous similar messages
[7589250.739267] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7589250.750743] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7589346.030809] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7589346.039271] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7589374.287419] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7589374.299594] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 16 previous similar messages
[7589428.231975] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7589428.242233] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 400 previous similar messages
[7589548.233260] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7589548.245438] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 251 previous similar messages
[7589555.146262] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7589577.242581] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583575599/real 1583575599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583575606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7589577.270355] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16063839 previous similar messages
[7589587.244675] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7589587.255371] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16065943 previous similar messages
[7589852.738519] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7589852.750104] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7589947.125278] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7589947.133795] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7589974.432901] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7589974.445074] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7590033.238527] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7590033.248872] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 464 previous similar messages
[7590153.272661] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7590153.284837] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 245 previous similar messages
[7590177.249061] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583576199/real 1583576199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583576206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7590177.276826] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15942160 previous similar messages
[7590187.251179] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7590187.261871] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15952569 previous similar messages
[7590453.891294] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7590453.902772] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7590548.219678] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7590548.228140] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7590575.430446] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7590575.442672] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7590638.245044] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.237@o2ib7: 1 seconds
[7590638.255303] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 502 previous similar messages
[7590755.394384] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7590755.406556] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 239 previous similar messages
[7590777.255551] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583576799/real 1583576799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583576806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7590777.283427] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15941135 previous similar messages
[7590787.257673] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7590787.268374] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15875105 previous similar messages
[7591055.874698] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7591055.886197] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7591149.186185] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7591149.194649] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7591178.689976] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7591178.702145] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7591243.251577] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 4 seconds
[7591243.261917] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 535 previous similar messages
[7591306.573690] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7591357.013949] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7591357.026117] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 257 previous similar messages
[7591377.262046] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583577399/real 1583577399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583577406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7591377.289821] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15594866 previous similar messages
[7591387.264111] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7591387.274805] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15632574 previous similar messages
[7591387.469708] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7591658.231154] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7591658.242631] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7591750.152585] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7591750.161062] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7591779.839356] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7591779.851610] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 9 previous similar messages
[7591848.258082] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7591848.268346] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 553 previous similar messages
[7591957.259262] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7591957.271440] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 248 previous similar messages
[7591977.268486] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583577999/real 1583577999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583578006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7591977.296264] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15528520 previous similar messages
[7591987.276831] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7591987.287537] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15504046 previous similar messages
[7592260.488602] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7592260.500079] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7592351.119160] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7592351.127655] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7592384.091886] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7592384.104108] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7592454.264638] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 1 seconds
[7592454.274900] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 495 previous similar messages
[7592558.265778] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7592558.277962] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 221 previous similar messages
[7592577.274976] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583578599/real 1583578599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583578606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7592577.302748] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15317657 previous similar messages
[7592587.283053] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7592587.293750] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15339474 previous similar messages
[7592862.691064] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7592862.702573] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7592952.086721] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7592952.095213] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7592984.400366] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7592984.412535] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7593057.271139] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.239@o2ib7: 0 seconds
[7593057.281402] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 405 previous similar messages
[7593163.418180] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7593163.430355] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 209 previous similar messages
[7593177.281485] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583579199/real 1583579199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583579206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7593177.309285] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14467175 previous similar messages
[7593187.289542] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7593187.300237] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14479290 previous similar messages
[7593464.367874] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7593464.379447] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7593590.116933] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.238@o2ib7: -125
[7593590.129154] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7593663.277694] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 1 seconds
[7593663.288035] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 329 previous similar messages
[7593675.545283] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7593675.553790] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7593700.922346] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7593764.278766] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7593764.290952] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 186 previous similar messages
[7593777.287910] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583579799/real 1583579799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583579806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7593777.315687] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14438236 previous similar messages
[7593787.297216] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7593787.307915] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14402092 previous similar messages
[7594066.547968] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7594066.559452] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7594268.284189] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7594268.294445] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 354 previous similar messages
[7594277.580668] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7594277.589137] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7594367.345191] LNetError: 111738:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7594367.357452] LNetError: 111738:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 184 previous similar messages
[7594377.294422] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583580399/real 1583580399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583580406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7594377.322195] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15572716 previous similar messages
[7594387.304475] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7594387.315173] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15676715 previous similar messages
[7594488.002610] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7594488.014785] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 18 previous similar messages
[7594667.907448] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7594667.918933] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7594876.290758] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7594876.301016] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 416 previous similar messages
[7594878.634459] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7594878.642929] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7594968.291756] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7594968.303935] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 208 previous similar messages
[7594977.300882] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583580999/real 1583580999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583581006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7594977.328657] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15061100 previous similar messages
[7594987.311135] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7594987.321839] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14965083 previous similar messages
[7595090.377275] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7595090.389449] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7595270.435478] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7595270.446961] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7595480.426936] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7595480.435419] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7595482.297291] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.225@o2ib7: 0 seconds
[7595482.307548] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 400 previous similar messages
[7595570.936532] LNetError: 111738:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7595570.948796] LNetError: 111738:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 221 previous similar messages
[7595577.307323] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583581599/real 1583581599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583581606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7595577.335107] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14424761 previous similar messages
[7595587.317490] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7595587.328191] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14401026 previous similar messages
[7595618.724319] LustreError: 83173:0:(tgt_grant.c:758:tgt_grant_check()) fir-OST001d: cli 1ccff414-1582-4 claims 8421376 GRANT, real grant 0
[7595690.414593] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7595690.426784] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7595872.419781] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7595872.431349] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7596081.463342] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7596081.471809] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7596088.303823] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7596088.314079] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 445 previous similar messages
[7596172.304737] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7596172.316910] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 237 previous similar messages
[7596177.313771] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583582199/real 1583582199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583582206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7596177.341548] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 13835543 previous similar messages
[7596187.323880] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7596187.334576] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 13864122 previous similar messages
[7596293.942110] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7596293.954326] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7596474.249008] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7596474.260584] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7596683.439997] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7596683.448576] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7596693.310358] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.239@o2ib7: 0 seconds
[7596693.320623] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 533 previous similar messages
[7596775.722668] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7596775.734839] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 232 previous similar messages
[7596777.320251] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583582799/real 1583582799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583582806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7596777.348022] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15016085 previous similar messages
[7596782.974132] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7596787.330371] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7596787.341063] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15043806 previous similar messages
[7596894.330597] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7596894.342781] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7596994.649527] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7597076.400453] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7597076.411934] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7597284.444554] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7597284.453118] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7597294.316840] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 2 seconds
[7597294.327188] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 455 previous similar messages
[7597376.317733] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7597376.329920] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 216 previous similar messages
[7597377.326725] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583583399/real 1583583399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583583406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7597377.354504] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16344168 previous similar messages
[7597387.336849] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7597387.347550] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16267281 previous similar messages
[7597498.643103] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7597498.655281] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7597678.845158] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7597678.856647] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7597884.884046] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7597884.892597] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7597899.323395] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7597899.333650] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 514 previous similar messages
[7597977.333267] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583583999/real 1583583999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583584006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7597977.361125] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14198055 previous similar messages
[7597979.564332] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7597979.576518] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 231 previous similar messages
[7597987.343404] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7597987.354102] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14200528 previous similar messages
[7598100.136733] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7598100.148910] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7598280.219006] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7598280.230582] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7598486.456740] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7598486.465279] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7598505.330267] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7598505.340525] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 385 previous similar messages
[7598577.340084] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583584599/real 1583584599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583584606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7598577.367865] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15288158 previous similar messages
[7598581.331121] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7598581.343298] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 218 previous similar messages
[7598587.350161] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7598587.360852] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15395525 previous similar messages
[7598702.338497] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7598702.350678] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7598882.228432] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7598882.239917] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7599087.448325] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7599087.456799] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7599113.337106] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.237@o2ib7: 5 seconds
[7599113.347369] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 412 previous similar messages
[7599177.346903] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583585199/real 1583585199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583585206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7599177.374680] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15141667 previous similar messages
[7599183.270150] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7599183.282319] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 216 previous similar messages
[7599187.356927] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7599187.367624] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15071198 previous similar messages
[7599304.192275] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7599304.204448] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7599484.329527] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7599484.341094] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7599688.413966] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7599688.422563] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7599719.343838] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 1 seconds
[7599719.354098] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 459 previous similar messages
[7599777.353479] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583585799/real 1583585799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583585806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7599777.381335] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15719024 previous similar messages
[7599783.344550] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7599783.356727] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 223 previous similar messages
[7599787.363741] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7599787.374442] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15756097 previous similar messages
[7599906.485932] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7599906.498104] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7600086.386116] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7600086.397688] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7600289.381386] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7600289.389859] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7600328.350454] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 5 seconds
[7600328.360800] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 462 previous similar messages
[7600377.359955] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583586399/real 1583586399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583586406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7600377.387733] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15404172 previous similar messages
[7600386.887171] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7600386.899347] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 226 previous similar messages
[7600387.370067] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7600387.380769] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15365889 previous similar messages
[7600509.518384] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7600509.530606] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7600688.397451] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7600688.408957] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7600890.348145] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7600890.356617] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7600933.356979] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7600933.367239] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 436 previous similar messages
[7600977.366518] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583586999/real 1583586999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583587006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7600977.394293] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15846591 previous similar messages
[7600987.357557] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7600987.369738] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 227 previous similar messages
[7600987.380120] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7600987.390821] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15874587 previous similar messages
[7601110.964963] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7601110.977141] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7601289.925743] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7601289.937238] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7601491.314647] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7601491.323293] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7601538.363539] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 5 seconds
[7601538.373801] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 463 previous similar messages
[7601577.372997] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583587599/real 1583587599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583587606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7601577.400776] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 13493640 previous similar messages
[7601587.365105] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7601587.377276] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 239 previous similar messages
[7601587.387880] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7601587.398595] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 13437250 previous similar messages
[7601591.510806] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7601715.628519] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7601715.640693] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 10 previous similar messages
[7601737.538369] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7601892.465685] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7601892.477293] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7602092.258340] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7602092.266814] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7602149.370199] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7602149.380455] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 503 previous similar messages
[7602177.379494] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583588199/real 1583588199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583588206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7602177.407305] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15854759 previous similar messages
[7602187.393614] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7602187.404313] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15938467 previous similar messages
[7602188.370629] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7602188.382804] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 239 previous similar messages
[7602261.606051] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7602319.589073] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.238@o2ib7: -125
[7602319.601256] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7602494.599129] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7602494.610758] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7602649.223574] LustreError: 120648:0:(tgt_grant.c:758:tgt_grant_check()) fir-OST001d: cli 1ccff414-1582-4 claims 12607488 GRANT, real grant 8421376
[7602693.247667] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7602693.256141] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7602753.376801] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.239@o2ib7: 0 seconds
[7602753.387057] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 480 previous similar messages
[7602777.386077] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583588799/real 1583588799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583588806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7602777.414109] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 17094726 previous similar messages
[7602787.400206] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7602787.410902] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 17109465 previous similar messages
[7602793.377253] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7602793.389434] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 234 previous similar messages
[7602919.783656] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7602919.795829] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7603096.497809] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7603096.509288] LNetError: 42286:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7603294.214392] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7603294.222863] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7603358.383428] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 3 seconds
[7603358.393773] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 499 previous similar messages
[7603377.392626] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583589399/real 1583589399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583589406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7603377.420397] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16343773 previous similar messages
[7603387.406763] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7603387.417465] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16339809 previous similar messages
[7603397.860072] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7603397.872245] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 245 previous similar messages
[7603521.523241] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7603521.535425] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7603698.348739] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7603698.360231] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7603895.158514] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7603895.167000] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7603959.390037] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7603959.400299] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 503 previous similar messages
[7603977.399221] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583589999/real 1583589999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583590006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7603977.427002] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16734746 previous similar messages
[7603987.413487] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7603987.424183] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16688965 previous similar messages
[7603999.784637] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7603999.796816] LNetError: 42286:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 240 previous similar messages
[7604125.426877] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.238@o2ib7: -125
[7604125.439057] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 16 previous similar messages
[7604300.298039] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7604300.309566] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7604401.990812] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7604496.147126] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7604496.155622] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7604561.396679] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7604561.407023] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 509 previous similar messages
[7604577.405985] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583590599/real 1583590599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583590606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7604577.433765] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16667583 previous similar messages
[7604587.419975] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7604587.430674] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16656577 previous similar messages
[7604600.397098] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7604600.409282] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 242 previous similar messages
[7604902.503780] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7604902.515281] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7605023.161813] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7605023.173987] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 17 previous similar messages
[7605097.113020] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7605097.121496] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7605163.403234] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 1 seconds
[7605163.413580] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 508 previous similar messages
[7605177.412454] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583591199/real 1583591199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583591206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7605177.440232] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16795628 previous similar messages
[7605187.426479] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7605187.437174] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16803115 previous similar messages
[7605203.129893] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7605203.142082] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 238 previous similar messages
[7605504.680801] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7605504.692280] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7605625.307125] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7605625.319299] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 9 previous similar messages
[7605698.080472] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7605698.089017] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7605768.409678] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 5 seconds
[7605768.420025] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 395 previous similar messages
[7605777.418779] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583591799/real 1583591799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583591806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7605777.446561] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 17707990 previous similar messages
[7605787.432886] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7605787.443580] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 17738848 previous similar messages
[7605803.318078] LNetError: 35537:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7605803.330258] LNetError: 35537:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 212 previous similar messages
[7606106.701341] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7606106.712825] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7606230.358611] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7606230.370793] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 18 previous similar messages
[7606299.046981] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7606299.055454] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7606374.416135] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 1 seconds
[7606374.426480] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 438 previous similar messages
[7606377.425169] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583592399/real 1583592399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583592406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7606377.452943] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16821120 previous similar messages
[7606387.439260] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7606387.449960] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16803035 previous similar messages
[7606404.416472] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7606404.428647] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 236 previous similar messages
[7606708.314063] LNetError: 86516:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7606708.325554] LNetError: 86516:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7606830.928156] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.234@o2ib7: -125
[7606830.940351] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7606900.012474] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7606900.020965] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7606977.431651] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583592999/real 1583592999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583593006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7606977.459417] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16800247 previous similar messages
[7606987.422760] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7606987.433018] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 493 previous similar messages
[7606987.445900] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7606987.456597] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16867313 previous similar messages
[7607005.422961] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7607005.435143] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 236 previous similar messages
[7607310.417331] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7607310.428905] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7607434.604683] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7607434.616867] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 16 previous similar messages
[7607500.980083] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7607500.988559] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7607577.438214] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583593599/real 1583593599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583593606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7607577.465990] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16124492 previous similar messages
[7607587.452311] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7607587.463007] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16062410 previous similar messages
[7607588.429343] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 1 seconds
[7607588.439602] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 520 previous similar messages
[7607609.429590] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7607609.441757] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 247 previous similar messages
[7607912.892924] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7607912.904424] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7608101.946545] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7608101.955010] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7608177.444753] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583594199/real 1583594199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583594206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7608177.472528] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16745370 previous similar messages
[7608187.458880] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7608187.469579] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16720124 previous similar messages
[7608193.435925] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 1 seconds
[7608193.446267] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 503 previous similar messages
[7608213.373315] LNetError: 23905:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7608213.385498] LNetError: 23905:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 235 previous similar messages
[7608333.941500] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7608333.953669] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7608514.859378] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7608514.870863] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7608702.914058] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7608702.922527] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7608777.451282] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583594799/real 1583594799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583594806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7608777.479060] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 17102793 previous similar messages
[7608787.465380] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7608787.476076] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 17116003 previous similar messages
[7608798.442512] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 2 seconds
[7608798.452854] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 485 previous similar messages
[7608815.124917] LNetError: 23905:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7608815.137138] LNetError: 23905:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 237 previous similar messages
[7608934.638104] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7608934.650274] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7609116.558160] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7609116.569638] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7609304.007642] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7609304.016120] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7609377.457814] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583595399/real 1583595399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583595406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7609377.485587] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 17673852 previous similar messages
[7609387.471928] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7609387.482622] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 17678344 previous similar messages
[7609403.449109] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 1 seconds
[7609403.459366] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 462 previous similar messages
[7609415.449250] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7609415.461433] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 227 previous similar messages
[7609538.385670] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7609538.397852] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7609718.339496] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7609718.350976] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7609904.951295] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7609904.959762] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7609977.464471] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583595999/real 1583595999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583596006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7609977.492604] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16789032 previous similar messages
[7609987.478490] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7609987.489186] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16800705 previous similar messages
[7610008.455718] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7610008.465981] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 521 previous similar messages
[7610019.056979] LNetError: 23905:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7610019.069154] LNetError: 23905:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 239 previous similar messages
[7610138.664222] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7610138.676389] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 10 previous similar messages
[7610320.560247] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7610320.571820] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7610505.940791] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7610505.949270] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7610577.470966] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583596599/real 1583596599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583596606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7610577.498730] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 19656852 previous similar messages
[7610587.485076] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7610587.495766] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 19726100 previous similar messages
[7610613.462378] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.227@o2ib7: 0 seconds
[7610613.472639] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 513 previous similar messages
[7610620.994552] LNetError: 23905:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7610621.006746] LNetError: 23905:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 235 previous similar messages
[7610689.281976] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7610739.496813] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7610739.508981] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7610922.405968] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7610922.417449] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7611106.883383] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7611106.891851] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7611177.477534] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583597199/real 1583597199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583597206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7611177.505310] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 18838151 previous similar messages
[7611187.491646] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7611187.502347] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 18793849 previous similar messages
[7611218.468991] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7611218.479249] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 509 previous similar messages
[7611222.469051] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7611222.481222] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 240 previous similar messages
[7611344.267369] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7611344.279547] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7611524.984463] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7611524.995944] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7611579.867733] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7611707.873811] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7611707.882271] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7611777.484110] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583597799/real 1583597799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583597806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7611777.511887] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16937146 previous similar messages
[7611787.498191] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7611787.508884] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16942813 previous similar messages
[7611820.475563] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.227@o2ib7: 0 seconds
[7611820.485818] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 548 previous similar messages
[7611825.318916] LNetError: 23905:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7611825.331086] LNetError: 23905:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 239 previous similar messages
[7611946.877049] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7611946.889218] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7612126.789125] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7612126.800771] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7612308.840403] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7612308.848893] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7612377.490646] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583598399/real 1583598399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583598406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7612377.518417] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 18811493 previous similar messages
[7612387.504762] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7612387.515459] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 18838138 previous similar messages
[7612423.482156] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7612423.492415] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 512 previous similar messages
[7612427.147244] LNetError: 23905:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7612427.159413] LNetError: 23905:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 232 previous similar messages
[7612548.767587] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7612548.779761] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7612577.405614] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7612728.711673] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7612728.723214] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7612909.783453] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7612909.791929] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7612975.235434] LustreError: 6891:0:(tgt_grant.c:758:tgt_grant_check()) fir-OST001f: cli d4be6328-2552-4 claims 8626176 GRANT, real grant 7901184
[7612977.497195] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583598999/real 1583598999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583599006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7612977.524977] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 18489993 previous similar messages
[7612987.511311] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7612987.522003] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 18389107 previous similar messages
[7613028.488742] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7613028.499005] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 445 previous similar messages
[7613028.508680] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7613028.520846] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 235 previous similar messages
[7613150.755119] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7613150.767291] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 9 previous similar messages
[7613330.701136] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7613330.712612] LNetError: 23905:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7613510.773316] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7613510.781792] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7613577.503364] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583599599/real 1583599599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583599606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7613577.531132] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 18575054 previous similar messages
[7613587.517420] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7613587.528119] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 18670834 previous similar messages
[7613631.222278] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7613631.234451] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 228 previous similar messages
[7613633.494928] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7613633.505269] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 435 previous similar messages
[7613754.787273] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7613754.799442] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7613932.704285] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7613932.715784] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7614111.867609] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7614111.876069] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7614177.509689] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583600199/real 1583600199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583600206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7614177.537467] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16399341 previous similar messages
[7614187.523764] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7614187.534464] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16387813 previous similar messages
[7614233.009290] LNetError: 118266:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7614233.021555] LNetError: 118266:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 220 previous similar messages
[7614239.501326] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 6 seconds
[7614239.511584] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 405 previous similar messages
[7614359.678787] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.238@o2ib7: -125
[7614359.690974] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 16 previous similar messages
[7614534.678743] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7614534.690316] LNetError: 111738:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7614712.834522] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7614712.843000] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7614777.516007] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583600799/real 1583600799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583600806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7614777.543786] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 19129428 previous similar messages
[7614787.530096] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7614787.540795] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 19154962 previous similar messages
[7614835.249866] LNetError: 5179:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7614835.261955] LNetError: 5179:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 203 previous similar messages
[7614843.507694] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7614843.517948] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 413 previous similar messages
[7614959.837924] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7614959.850159] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 10 previous similar messages
[7615136.707921] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7615136.719402] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7615313.928134] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7615313.936776] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7615377.522235] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583601399/real 1583601399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583601406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7615377.550004] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20084936 previous similar messages
[7615387.536329] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7615387.547025] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20064283 previous similar messages
[7615437.173039] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7615437.185213] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 235 previous similar messages
[7615448.513979] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7615448.524237] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 467 previous similar messages
[7615560.690185] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7615560.702402] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 9 previous similar messages
[7615738.486325] LNetError: 86516:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7615738.497848] LNetError: 86516:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7615914.893348] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7615914.901825] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7615977.528487] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583601999/real 1583601999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583602006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7615977.556256] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 19572714 previous similar messages
[7615987.542592] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7615987.553284] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 19566005 previous similar messages
[7616037.520130] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7616037.532298] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 242 previous similar messages
[7616053.520290] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7616053.530550] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 573 previous similar messages
[7616165.417428] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.238@o2ib7: -125
[7616165.429649] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7616309.674436] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7616340.123465] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7616340.134958] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7616515.860104] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7616515.868577] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7616577.534603] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583602599/real 1583602599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583602606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7616577.562377] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 19570848 previous similar messages
[7616587.548699] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7616587.559393] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 19541216 previous similar messages
[7616641.425800] LNetError: 20115:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7616641.437969] LNetError: 20115:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 247 previous similar messages
[7616654.526328] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7616654.536591] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 522 previous similar messages
[7616768.006504] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.238@o2ib7: -125
[7616768.018677] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7616942.826975] LNetError: 5179:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7616942.838367] LNetError: 5179:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7616978.673038] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7617116.826412] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7617116.834876] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7617177.540552] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583603199/real 1583603199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583603206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7617177.568317] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 19806647 previous similar messages
[7617187.554652] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7617187.565343] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 19865431 previous similar messages
[7617222.181013] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7617241.532189] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7617241.544364] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 228 previous similar messages
[7617258.532358] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 1 seconds
[7617258.542707] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 435 previous similar messages
[7617288.062022] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7617368.924606] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.238@o2ib7: -125
[7617368.936798] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7617544.851545] LNetError: 5179:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7617544.862976] LNetError: 5179:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7617717.791286] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7617717.799816] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7617777.546475] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583603799/real 1583603799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583603806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7617777.574243] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 18850394 previous similar messages
[7617787.560572] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7617787.571269] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 18855922 previous similar messages
[7617845.224400] LNetError: 5179:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7617845.236485] LNetError: 5179:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 244 previous similar messages
[7617864.538360] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7617864.548621] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 509 previous similar messages
[7617969.884436] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7617969.896605] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7618146.777356] LNetError: 5179:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7618146.788757] LNetError: 5179:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7618275.914903] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7618319.759337] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7618319.767804] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7618377.552752] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583604399/real 1583604399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583604406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7618377.580530] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 19557247 previous similar messages
[7618387.566774] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7618387.577469] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 19445107 previous similar messages
[7618447.221623] LNetError: 5179:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7618447.233705] LNetError: 5179:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 221 previous similar messages
[7618468.544572] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 1 seconds
[7618468.554828] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 500 previous similar messages
[7618570.748672] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7618570.760885] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 10 previous similar messages
[7618748.089837] LNetError: 5179:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7618748.101378] LNetError: 5179:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7618921.748660] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7618921.757216] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7618942.400956] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7618977.558938] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583604999/real 1583604999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583605006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7618977.586708] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 19009181 previous similar messages
[7618987.573050] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7618987.583743] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 19118339 previous similar messages
[7619047.550690] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7619047.562863] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 230 previous similar messages
[7619074.550961] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.225@o2ib7: 0 seconds
[7619074.561222] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 506 previous similar messages
[7619146.258855] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7619171.133115] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7619171.145294] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7619350.978072] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7619350.989548] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7619523.739303] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7619523.747788] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7619577.565370] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583605599/real 1583605599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583605606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7619577.593136] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 17298055 previous similar messages
[7619587.579471] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7619587.590166] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 17274620 previous similar messages
[7619648.557138] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7619648.569305] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 243 previous similar messages
[7619679.557467] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.227@o2ib7: 0 seconds
[7619679.567723] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 496 previous similar messages
[7619774.166577] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7619774.178743] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 16 previous similar messages
[7619952.134860] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7619952.146345] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7620125.729891] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7620125.738348] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7620177.572010] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583606199/real 1583606199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583606206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7620177.599784] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 17233079 previous similar messages
[7620187.586121] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7620187.596815] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 17200714 previous similar messages
[7620249.563808] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7620249.575978] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 199 previous similar messages
[7620280.564156] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 1 seconds
[7620280.574420] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 436 previous similar messages
[7620374.589209] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7620374.601545] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7620422.632430] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7620554.638446] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7620554.649938] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7620691.277930] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7620727.720538] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7620727.729055] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7620777.578721] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583606799/real 1583606799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583606806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7620777.606513] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16742295 previous similar messages
[7620787.592752] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7620787.603449] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16662798 previous similar messages
[7620853.570433] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7620853.582600] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 203 previous similar messages
[7620884.570784] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 1 seconds
[7620884.581044] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 365 previous similar messages
[7620976.186800] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7620976.199021] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 9 previous similar messages
[7621156.195714] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7621156.207186] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7621326.942238] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7621328.712283] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7621328.720750] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7621377.585184] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583607399/real 1583607399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583607406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7621377.612959] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16611023 previous similar messages
[7621387.599338] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7621387.610049] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16671651 previous similar messages
[7621457.946194] LNetError: 79739:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7621457.958388] LNetError: 79739:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 174 previous similar messages
[7621493.577446] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7621493.587701] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 349 previous similar messages
[7621580.589429] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7621580.601627] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7621648.479333] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7621758.723454] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7621758.734936] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7621929.653660] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7621929.662143] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7621977.591745] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583607999/real 1583607999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583608006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7621977.619513] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 18319720 previous similar messages
[7621987.605852] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7621987.616548] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 18334607 previous similar messages
[7622058.583645] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7622058.595821] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 171 previous similar messages
[7622104.584142] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.227@o2ib7: 0 seconds
[7622104.594402] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 340 previous similar messages
[7622181.002990] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7622181.015168] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7622361.065892] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7622361.077374] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7622530.620581] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7622530.629112] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7622577.598656] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583608599/real 1583608599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583608606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7622577.626429] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 13938125 previous similar messages
[7622587.612404] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7622587.623105] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 13967184 previous similar messages
[7622661.843181] LNetError: 79739:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7622661.855362] LNetError: 79739:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 218 previous similar messages
[7622716.590842] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.225@o2ib7: 0 seconds
[7622716.601097] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 448 previous similar messages
[7622718.613973] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7622786.655576] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7622786.667750] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 18 previous similar messages
[7622962.620791] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7622962.632276] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7623131.611169] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7623131.619629] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7623177.605053] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583609199/real 1583609199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583609206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7623177.632831] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 17218000 previous similar messages
[7623187.618949] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7623187.629644] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 17145147 previous similar messages
[7623188.791819] Lustre: fir-OST001b: haven't heard from client c25a35c7-c47b-4 (at 10.50.14.3@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9c140b97a800, cur 1583609211 expire 1583609061 last 1583608984
[7623188.811890] Lustre: Skipped 5 previous similar messages
[7623208.785315] Lustre: fir-OST001f: haven't heard from client c25a35c7-c47b-4 (at 10.50.14.3@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9c40752a1c00, cur 1583609231 expire 1583609081 last 1583609004
[7623263.475873] LNetError: 79739:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7623263.488887] LNetError: 79739:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 171 previous similar messages
[7623324.597446] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7623324.607705] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 316 previous similar messages
[7623388.227166] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7623388.239416] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7623564.209158] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7623564.220698] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7623732.577729] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7623732.586206] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7623732.595085] Lustre: Skipped 6 previous similar messages
[7623769.545697] LustreError: 67869:0:(tgt_grant.c:758:tgt_grant_check()) fir-OST001d: cli 1ccff414-1582-4 claims 16752640 GRANT, real grant 12607488
[7623777.611524] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583609799/real 1583609799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583609806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7623777.639300] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14552367 previous similar messages
[7623787.625575] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7623787.636302] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14525129 previous similar messages
[7623863.603421] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7623863.615593] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 188 previous similar messages
[7623931.604168] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7623931.614431] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 395 previous similar messages
[7623988.878853] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7623988.891060] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 10 previous similar messages
[7624166.983855] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7624166.995335] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7624333.544370] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7624333.552834] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7624377.618320] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583610399/real 1583610399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583610406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7624377.646097] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 13783070 previous similar messages
[7624387.636262] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7624387.646957] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 13768126 previous similar messages
[7624463.935810] LNetError: 125876:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7624463.948068] LNetError: 125876:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 202 previous similar messages
[7624533.610843] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7624533.621106] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 394 previous similar messages
[7624590.700492] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7624590.712705] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7624768.570472] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7624768.581949] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7624809.873759] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7624934.512865] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7624934.521324] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7624977.624836] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583610999/real 1583610999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583611006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7624977.652664] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15421621 previous similar messages
[7624987.643025] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7624987.653728] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15480376 previous similar messages
[7625064.616628] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7625064.628802] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 191 previous similar messages
[7625135.617401] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7625135.627661] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 422 previous similar messages
[7625195.492091] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.238@o2ib7: -125
[7625195.504294] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7625370.555610] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7625370.567091] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7625535.604520] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7625535.612978] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7625577.631188] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583611599/real 1583611599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583611606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7625577.658986] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15259986 previous similar messages
[7625587.649307] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7625587.660007] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15281482 previous similar messages
[7625671.208440] LNetError: 79739:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7625671.220611] LNetError: 79739:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 184 previous similar messages
[7625743.624019] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7625743.634363] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 332 previous similar messages
[7625795.809764] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7625795.821941] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7625973.043653] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7625973.055126] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7626107.892961] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7626136.572137] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7626136.580594] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7626177.637716] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583612199/real 1583612199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583612206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7626177.665480] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16331379 previous similar messages
[7626187.655827] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7626187.666524] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16347176 previous similar messages
[7626273.638952] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7626273.651144] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 188 previous similar messages
[7626353.630656] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7626353.641002] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 423 previous similar messages
[7626574.588270] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7626574.599751] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7626695.358458] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7626695.370625] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 16 previous similar messages
[7626737.539582] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7626737.548112] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7626777.644331] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583612799/real 1583612799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583612806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7626777.672101] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15614456 previous similar messages
[7626787.662646] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7626787.673376] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15593797 previous similar messages
[7626875.166672] LNetError: 79739:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7626875.178867] LNetError: 79739:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 171 previous similar messages
[7626886.738177] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7626955.637366] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 6 seconds
[7626955.647626] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 356 previous similar messages
[7627177.072006] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7627177.083492] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7627297.654151] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7627297.666347] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7627338.633422] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7627338.641920] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7627377.651047] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583613399/real 1583613399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583613406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7627377.678827] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14736387 previous similar messages
[7627387.669122] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7627387.679819] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14624823 previous similar messages
[7627477.705216] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7627477.717388] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 199 previous similar messages
[7627556.643981] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.237@o2ib7: 0 seconds
[7627556.654242] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 436 previous similar messages
[7627650.994241] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7627778.633706] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7627778.645188] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7627893.636174] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7627899.296786] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7627899.308971] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7627939.728040] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7627939.736515] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7627977.657581] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583613999/real 1583613999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583614006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7627977.685360] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16114464 previous similar messages
[7627987.675689] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7627987.686388] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16244011 previous similar messages
[7628079.302074] LNetError: 79739:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7628079.314259] LNetError: 79739:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 181 previous similar messages
[7628160.650611] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.227@o2ib7: 0 seconds
[7628160.660870] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 380 previous similar messages
[7628380.868994] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7628380.880483] LNetError: 79739:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7628501.486401] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7628501.498579] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7628540.822610] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7628540.831069] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7628577.664287] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583614599/real 1583614599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583614606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7628577.692070] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14974429 previous similar messages
[7628587.682303] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7628587.692997] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14875165 previous similar messages
[7628679.656294] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7628679.668460] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 212 previous similar messages
[7628762.657190] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7628762.667542] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 362 previous similar messages
[7628982.571634] LNetError: 55894:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7628982.583112] LNetError: 55894:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7629104.194896] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7629104.207072] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7629142.421999] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7629142.430480] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7629177.670677] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583615199/real 1583615199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583615206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7629177.698458] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15357985 previous similar messages
[7629187.688785] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7629187.699486] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15458480 previous similar messages
[7629284.057775] LNetError: 55894:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7629284.069953] LNetError: 55894:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 204 previous similar messages
[7629365.663730] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 1 seconds
[7629365.673993] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 390 previous similar messages
[7629584.562200] LNetError: 55894:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7629584.573721] LNetError: 55894:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7629705.289598] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7629705.301824] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7629743.395883] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7629743.404366] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7629777.677318] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583615799/real 1583615799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583615806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7629777.705086] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15668366 previous similar messages
[7629787.695425] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7629787.706118] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15629224 previous similar messages
[7629885.235783] LNetError: 55894:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7629885.247955] LNetError: 55894:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 224 previous similar messages
[7629983.670620] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7629983.680881] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 405 previous similar messages
[7630186.783998] LNetError: 55894:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7630186.795485] LNetError: 55894:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7630305.398204] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7630305.410373] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7630344.490463] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7630344.498938] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7630377.683998] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583616399/real 1583616399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583616406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7630377.711777] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16150324 previous similar messages
[7630387.702265] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7630387.712956] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16169339 previous similar messages
[7630485.676147] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7630485.688324] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 211 previous similar messages
[7630593.677319] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7630593.687575] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 473 previous similar messages
[7630789.039755] LNetError: 78212:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7630789.051290] LNetError: 78212:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7630909.667909] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7630909.680105] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7630911.979980] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7630945.457111] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7630945.465577] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7630977.690678] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583616999/real 1583616999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583617006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7630977.718457] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16500086 previous similar messages
[7630987.708673] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7630987.719365] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16502602 previous similar messages
[7631089.668857] LNetError: 87046:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7631089.681068] LNetError: 87046:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 213 previous similar messages
[7631198.684007] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7631198.694262] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 424 previous similar messages
[7631268.471099] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7631390.337321] LNetError: 78212:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7631390.348838] LNetError: 78212:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7631512.002649] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7631512.014825] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7631546.398730] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7631546.407722] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7631577.697271] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583617599/real 1583617599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583617606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7631577.725051] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16875931 previous similar messages
[7631580.227286] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7631587.715384] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7631587.726078] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16791009 previous similar messages
[7631689.689534] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7631689.701704] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 187 previous similar messages
[7631799.690732] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7631799.700994] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 395 previous similar messages
[7631992.557110] LNetError: 78212:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7631992.568600] LNetError: 78212:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7632114.192120] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7632114.204299] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7632147.390250] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7632147.398729] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7632177.703805] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583618199/real 1583618199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583618206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7632177.731584] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15241427 previous similar messages
[7632187.721895] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7632187.732622] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15301622 previous similar messages
[7632293.399260] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7632293.411430] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 213 previous similar messages
[7632410.697340] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.236@o2ib7: 0 seconds
[7632410.707688] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 412 previous similar messages
[7632594.593616] LNetError: 87046:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7632594.605114] LNetError: 87046:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7632715.289754] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7632715.301935] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7632748.940679] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7632748.949158] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7632777.710291] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583618799/real 1583618799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583618806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7632777.738062] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15713934 previous similar messages
[7632787.728401] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7632787.739097] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15732909 previous similar messages
[7632895.327770] LNetError: 78212:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7632895.339944] LNetError: 78212:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 180 previous similar messages
[7633016.703903] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.235@o2ib7: 0 seconds
[7633016.714247] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 404 previous similar messages
[7633196.862993] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7633196.874492] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7633317.482199] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7633317.494384] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 10 previous similar messages
[7633349.964305] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7633349.972763] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7633377.716829] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583619399/real 1583619399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583619406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7633377.750904] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15066913 previous similar messages
[7633387.734939] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7633387.745637] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15038524 previous similar messages
[7633497.540176] LNetError: 78212:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7633497.552352] LNetError: 78212:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 229 previous similar messages
[7633617.710467] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.224@o2ib7: 1 seconds
[7633617.720730] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 430 previous similar messages
[7633799.083397] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7633799.094876] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7633921.680818] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7633921.692998] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7633951.066456] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7633951.075032] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7633977.724249] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583619999/real 1583619999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583620006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7633977.752040] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15141686 previous similar messages
[7633987.741515] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7633987.752211] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15099164 previous similar messages
[7634097.715722] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7634097.727899] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 224 previous similar messages
[7634221.717082] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.235@o2ib7: 1 seconds
[7634221.727432] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 468 previous similar messages
[7634400.720163] LNetError: 78212:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7634400.731646] LNetError: 78212:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7634523.372418] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7634523.384664] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7634552.024226] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7634552.032700] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7634577.729998] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583620599/real 1583620599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583620606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7634577.757776] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15112841 previous similar messages
[7634587.748125] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7634587.758826] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15074766 previous similar messages
[7634698.722319] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7634698.734496] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 206 previous similar messages
[7634822.723700] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.227@o2ib7: 0 seconds
[7634822.733960] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 411 previous similar messages
[7635003.123603] LNetError: 78212:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7635003.135080] LNetError: 78212:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7635125.819994] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7635125.832215] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7635154.053488] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7635154.061946] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7635177.736547] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583621199/real 1583621199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583621206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7635177.764325] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14982950 previous similar messages
[7635187.754625] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7635187.765318] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15148127 previous similar messages
[7635217.630837] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7635249.618307] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7635303.728920] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7635303.741087] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 209 previous similar messages
[7635424.730209] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.238@o2ib7: 0 seconds
[7635424.740557] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 428 previous similar messages
[7635604.386107] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7635604.397701] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7635726.267435] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7635726.279608] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7635755.110193] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7635755.118905] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7635777.742976] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583621799/real 1583621799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583621806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7635777.770741] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16286701 previous similar messages
[7635787.761043] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7635787.771744] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16246832 previous similar messages
[7635810.899412] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7635904.355321] LNetError: 85500:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7635904.367501] LNetError: 85500:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 204 previous similar messages
[7636026.736604] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.224@o2ib7: 0 seconds
[7636026.746862] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 488 previous similar messages
[7636206.994714] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7636207.006539] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7636332.631942] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7636332.644132] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 17 previous similar messages
[7636356.146714] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7636356.155178] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7636377.749398] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583622399/real 1583622399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583622406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7636377.777170] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15121847 previous similar messages
[7636387.767480] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7636387.778178] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15120835 previous similar messages
[7636505.741787] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7636505.753954] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 217 previous similar messages
[7636629.743132] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 1 seconds
[7636629.753387] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 403 previous similar messages
[7636809.142190] LNetError: 6316:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7636809.153603] LNetError: 6316:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7636957.170598] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7636957.179059] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7636977.756033] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583622999/real 1583622999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583623006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7636977.783816] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15742987 previous similar messages
[7636987.773966] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7636987.784664] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15738257 previous similar messages
[7637058.104239] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7637109.973424] LNetError: 6316:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7637109.985523] LNetError: 6316:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 198 previous similar messages
[7637230.605641] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7637230.617818] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7637235.749573] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7637235.759832] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 339 previous similar messages
[7637410.589375] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7637410.600876] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7637558.949403] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7637558.957873] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7637577.762394] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583623599/real 1583623599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583623606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7637577.790168] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15876615 previous similar messages
[7637587.780377] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7637587.791069] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 15848873 previous similar messages
[7637600.863246] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7637712.255688] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7637712.267907] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 181 previous similar messages
[7637833.073156] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7637833.085329] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7637858.756279] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7637858.766539] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 304 previous similar messages
[7638012.715064] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7638012.726541] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7638039.872725] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7638098.643147] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7638160.998720] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7638161.007199] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7638177.768774] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583624199/real 1583624199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583624206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7638177.796552] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 13577240 previous similar messages
[7638187.786753] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7638187.797450] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 13520835 previous similar messages
[7638313.705419] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7638313.717649] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 133 previous similar messages
[7638435.273517] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7638435.285681] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7638459.762687] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 8 seconds
[7638459.773036] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 185 previous similar messages
[7638614.563483] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7638614.574964] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7638762.628146] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7638762.636625] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7638777.775071] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583624799/real 1583624799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583624806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7638777.802862] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 14005838 previous similar messages
[7638787.793183] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7638787.803879] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14067485 previous similar messages
[7638913.767535] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7638913.779704] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 137 previous similar messages
[7639037.218001] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7639037.230184] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7639063.769147] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7639063.779404] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 212 previous similar messages
[7639216.787405] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7639216.798970] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7639363.571010] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7639363.579998] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7639377.781493] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583625399/real 1583625399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583625406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7639377.809445] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 15020766 previous similar messages
[7639387.799607] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7639387.810304] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 14991916 previous similar messages
[7639517.307519] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7639517.319695] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 130 previous similar messages
[7639640.197330] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.238@o2ib7: -125
[7639640.209612] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7639669.775626] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7639669.785887] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 133 previous similar messages
[7639818.934490] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7639818.945972] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7639964.564042] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7639964.572530] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7639977.787933] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583625999/real 1583625999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583626006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7639977.815708] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 13831003 previous similar messages
[7639987.806174] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7639987.816888] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 13846445 previous similar messages
[7640119.685713] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7640119.697883] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 128 previous similar messages
[7640240.745833] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7640240.758009] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7640283.782365] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7640283.792619] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 148 previous similar messages
[7640364.161375] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7640421.264888] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7640421.276364] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7640555.292345] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7640565.656404] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7640565.664908] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7640577.794455] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583626599/real 1583626599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583626606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7640577.822334] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 16464396 previous similar messages
[7640587.812559] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7640587.823462] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 16506173 previous similar messages
[7640721.601189] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7640721.613372] LNetError: 87043:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 125 previous similar messages
[7640845.536347] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7640845.548535] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 16 previous similar messages
[7640899.788931] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7640899.799188] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 155 previous similar messages
[7641022.716459] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7641022.727960] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7641166.752671] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7641166.761162] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7641177.800968] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583627199/real 1583627199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583627206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7641177.828745] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20125261 previous similar messages
[7641187.819079] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7641187.829772] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20193310 previous similar messages
[7641324.106539] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7641324.118726] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 142 previous similar messages
[7641418.114918] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7641445.894964] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7641445.907145] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7641505.795585] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7641505.805845] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 171 previous similar messages
[7641624.963131] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7641624.974699] LNetError: 87043:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7641767.845738] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7641767.854202] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7641777.807596] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583627799/real 1583627799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583627806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7641777.835635] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20661535 previous similar messages
[7641787.825703] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7641787.836394] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20656669 previous similar messages
[7641826.558885] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7641925.633502] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7641925.645753] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 135 previous similar messages
[7642051.889676] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7642051.901889] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 16 previous similar messages
[7642107.802274] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7642107.812537] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 146 previous similar messages
[7642226.553780] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7642226.565260] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7642369.058340] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7642369.066834] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7642377.814274] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583628399/real 1583628399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583628406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7642377.842108] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21513991 previous similar messages
[7642387.832354] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7642387.843053] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21479893 previous similar messages
[7642528.247156] LNetError: 20115:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7642528.259364] LNetError: 20115:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 141 previous similar messages
[7642652.490348] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7642652.502544] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7642708.808890] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7642708.819152] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 163 previous similar messages
[7642829.045435] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7642829.056914] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7642970.369994] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7642970.378481] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7642977.820838] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583628999/real 1583628999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583629006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7642977.848608] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21377371 previous similar messages
[7642987.838969] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7642987.849665] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21403851 previous similar messages
[7643129.433687] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7643129.445872] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 148 previous similar messages
[7643254.063935] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7643254.076118] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7643316.815550] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7643316.825897] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 279 previous similar messages
[7643430.991009] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7643431.002505] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7643553.906790] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7643571.385483] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7643571.393997] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7643577.827449] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583629599/real 1583629599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583629606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7643577.855222] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21176143 previous similar messages
[7643587.845559] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7643587.856256] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21163125 previous similar messages
[7643729.808148] LNetError: 99450:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7643729.820326] LNetError: 99450:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 209 previous similar messages
[7643855.828554] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7643855.840723] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7643918.822178] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7643918.832436] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 294 previous similar messages
[7644032.664594] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7644032.676083] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7644172.328892] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7644172.337472] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7644177.834003] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583630199/real 1583630199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583630206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7644177.861769] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20702196 previous similar messages
[7644187.852148] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7644187.862848] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20713450 previous similar messages
[7644271.055374] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7644333.472856] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7644333.485047] LNetError: 31987:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 192 previous similar messages
[7644460.182114] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7644460.194318] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7644521.828762] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7644521.839020] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 319 previous similar messages
[7644634.997146] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7644635.008635] LNetError: 20115:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7644773.320439] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7644773.328930] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7644777.840592] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583630799/real 1583630799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583630806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7644777.868368] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20865522 previous similar messages
[7644787.858667] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7644787.869367] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20878979 previous similar messages
[7644933.833275] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7644933.845451] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 196 previous similar messages
[7645061.892732] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7645061.904936] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7645127.835410] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7645127.845663] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 361 previous similar messages
[7645209.896190] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7645236.752816] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7645236.764345] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7645374.413516] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7645374.422066] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7645377.847204] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583631399/real 1583631399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583631406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7645377.874979] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20875349 previous similar messages
[7645387.865276] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7645387.875973] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20857886 previous similar messages
[7645535.839936] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7645535.852107] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 195 previous similar messages
[7645729.842075] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7645729.852332] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 334 previous similar messages
[7645768.817248] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7645838.459166] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7645838.470645] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7645955.951581] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7645955.963760] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 16 previous similar messages
[7645975.381175] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7645975.389707] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7645977.853818] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583631999/real 1583631999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583632006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7645977.881587] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21045240 previous similar messages
[7645987.871908] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7645987.882604] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21053256 previous similar messages
[7646138.846586] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7646138.858760] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 211 previous similar messages
[7646214.104397] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7646331.848702] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.235@o2ib7: 1 seconds
[7646331.859046] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 414 previous similar messages
[7646353.569316] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7646440.773071] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7646440.784659] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7646560.220324] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7646560.232497] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7646576.475965] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7646576.484439] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7646577.860438] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583632599/real 1583632599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583632606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7646577.888207] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21237457 previous similar messages
[7646587.878563] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7646587.889258] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21239888 previous similar messages
[7646739.853315] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7646739.865484] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 217 previous similar messages
[7646935.855513] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7646935.865769] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 395 previous similar messages
[7646993.205064] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7647042.577296] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7647042.588999] LNetError: 31987:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7647161.068059] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7647161.080229] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7647177.569768] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7647177.578569] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7647177.867218] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583633199/real 1583633199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583633206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7647177.894988] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20559961 previous similar messages
[7647187.885316] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7647187.896011] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20555921 previous similar messages
[7647341.860039] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7647341.872219] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 219 previous similar messages
[7647545.862325] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 1 seconds
[7647545.872583] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 413 previous similar messages
[7647645.043623] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7647645.055117] LNetError: 42281:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7647765.494988] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7647765.507165] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7647777.873980] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583633799/real 1583633799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583633806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7647777.901750] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21160141 previous similar messages
[7647778.537502] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7647778.546009] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7647787.892073] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7647787.902765] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21169862 previous similar messages
[7647946.235980] LNetError: 20115:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7647946.248194] LNetError: 20115:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 212 previous similar messages
[7648146.869125] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.224@o2ib7: 0 seconds
[7648146.879381] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 473 previous similar messages
[7648247.358379] LNetError: 32348:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7648247.369865] LNetError: 32348:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7648365.823716] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7648365.835903] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7648370.406648] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7648377.880763] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583634399/real 1583634399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583634406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7648377.908544] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21194678 previous similar messages
[7648379.631254] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7648379.639739] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7648387.898854] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7648387.909550] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21173490 previous similar messages
[7648546.873657] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7648546.885828] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 222 previous similar messages
[7648749.875935] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 0 seconds
[7648749.886198] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 447 previous similar messages
[7648848.798180] LNetError: 32348:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7648848.809679] LNetError: 32348:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7648970.322475] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7648970.334665] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7648977.887472] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583634999/real 1583634999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583635006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7648977.915247] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21181755 previous similar messages
[7648980.598050] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7648980.607033] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7648987.905575] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7648987.916313] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21208667 previous similar messages
[7649148.880329] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7649148.892505] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 218 previous similar messages
[7649352.882549] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.233@o2ib7: 0 seconds
[7649352.892892] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 450 previous similar messages
[7649451.137162] LNetError: 32348:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7649451.148673] LNetError: 32348:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7649570.590917] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7649570.603126] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7649577.893958] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583635599/real 1583635599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583635606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7649577.921730] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21335156 previous similar messages
[7649581.564541] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7649581.573200] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7649587.912034] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7649587.922749] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21320544 previous similar messages
[7649749.886718] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7649749.898899] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 225 previous similar messages
[7649954.888787] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 5 seconds
[7649954.899047] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 450 previous similar messages
[7650052.513079] LNetError: 32348:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7650052.524560] LNetError: 32348:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7650171.013062] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7650171.025246] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7650177.900072] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583636199/real 1583636199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583636206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7650177.927845] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20816775 previous similar messages
[7650182.658640] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7650182.667143] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7650187.918193] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7650187.928891] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20816867 previous similar messages
[7650281.879185] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7650353.773208] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7650353.785387] LNetError: 42281:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 202 previous similar messages
[7650555.894903] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.235@o2ib7: 0 seconds
[7650555.905246] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 429 previous similar messages
[7650612.600735] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7650639.299589] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7650654.903569] LNetError: 84226:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7650654.915059] LNetError: 84226:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7650721.996295] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7650775.405051] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7650775.417328] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7650777.906054] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583636799/real 1583636799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583636806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7650777.933866] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20709833 previous similar messages
[7650783.624380] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7650783.633053] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7650787.924187] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7650787.934887] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20694058 previous similar messages
[7650795.354862] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7650955.898827] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7650955.910999] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 212 previous similar messages
[7651156.900820] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.235@o2ib7: 0 seconds
[7651156.911168] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 393 previous similar messages
[7651257.341287] LNetError: 61050:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7651257.352877] LNetError: 61050:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7651375.784119] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7651375.796298] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7651377.912065] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583637399/real 1583637399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583637406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7651377.939831] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20622937 previous similar messages
[7651384.590449] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7651384.598945] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7651387.930174] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7651387.940867] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20636026 previous similar messages
[7651555.905985] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7651555.918153] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 243 previous similar messages
[7651759.907239] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 5 seconds
[7651759.917494] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 563 previous similar messages
[7651858.471554] LNetError: 61050:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7651858.483209] LNetError: 61050:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7651975.917639] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7651975.929807] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 8 previous similar messages
[7651977.918621] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583637999/real 1583637999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583638006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7651977.946393] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20708768 previous similar messages
[7651985.555851] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7651985.564386] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7651987.936706] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7651987.947401] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20714626 previous similar messages
[7652159.596811] LNetError: 61050:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7652159.608987] LNetError: 61050:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 255 previous similar messages
[7652360.913766] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7652360.924018] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 548 previous similar messages
[7652422.318197] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7652460.774353] LNetError: 84226:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7652460.785831] LNetError: 84226:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7652577.925150] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583638599/real 1583638599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583638606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7652577.952919] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21024269 previous similar messages
[7652580.202327] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7652580.214514] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 9 previous similar messages
[7652586.524389] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7652586.532853] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7652587.943285] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7652587.953981] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21031992 previous similar messages
[7652759.890247] LNetError: 63355:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7652759.902438] LNetError: 63355:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 243 previous similar messages
[7652895.821532] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7652961.920452] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7652961.930710] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 491 previous similar messages
[7653063.095705] LNetError: 61050:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7653063.107186] LNetError: 61050:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7653177.931868] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583639199/real 1583639199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583639206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7653177.959639] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20863955 previous similar messages
[7653180.555934] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7653180.568159] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7653187.530067] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7653187.538557] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7653187.949974] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7653187.960675] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20849883 previous similar messages
[7653359.924877] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7653359.937058] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 239 previous similar messages
[7653564.927143] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 8 seconds
[7653564.937402] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 518 previous similar messages
[7653664.510226] LNetError: 61050:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7653664.521711] LNetError: 61050:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7653777.938498] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583639799/real 1583639799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583639806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7653777.966268] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20805262 previous similar messages
[7653786.026709] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7653786.038910] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7653787.956604] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7653787.967301] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20799068 previous similar messages
[7653788.583790] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7653788.592267] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7653960.931545] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7653960.943733] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 228 previous similar messages
[7654167.933850] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7654167.944197] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 444 previous similar messages
[7654266.970350] LNetError: 61050:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7654266.982010] LNetError: 61050:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7654377.945173] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583640399/real 1583640399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583640406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7654377.972945] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20378189 previous similar messages
[7654387.963312] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7654387.974002] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20397010 previous similar messages
[7654389.528395] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7654389.536882] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7654391.510411] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7654391.522591] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7654564.938280] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7654564.950453] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 213 previous similar messages
[7654768.940529] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.239@o2ib7: 1 seconds
[7654768.950784] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 470 previous similar messages
[7654869.246556] LNetError: 107965:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7654869.258177] LNetError: 107965:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7654977.951823] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583640999/real 1583640999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583641006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7654977.979598] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20163084 previous similar messages
[7654987.969915] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7654987.980610] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20161700 previous similar messages
[7654990.518735] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7654990.527207] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7654992.722032] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7654992.734204] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7655121.051515] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7655167.944912] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7655167.957089] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 231 previous similar messages
[7655369.946726] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.233@o2ib7: 0 seconds
[7655369.957069] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 479 previous similar messages
[7655470.517483] LNetError: 58838:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7655470.528983] LNetError: 58838:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7655577.957727] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583641599/real 1583641599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583641606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7655577.985509] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20264260 previous similar messages
[7655587.975807] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7655587.986498] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20249849 previous similar messages
[7655591.611898] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7655591.620362] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7655595.016893] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7655595.029075] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7655601.164309] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7655768.950686] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7655768.962863] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 223 previous similar messages
[7655970.952762] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.235@o2ib7: 1 seconds
[7655970.963111] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 482 previous similar messages
[7656072.781458] LNetError: 123486:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7656072.793051] LNetError: 123486:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7656177.963893] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583642199/real 1583642199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583642206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7656177.991665] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20257637 previous similar messages
[7656187.981952] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7656187.992646] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20266456 previous similar messages
[7656192.579149] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7656192.587747] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7656197.235069] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7656197.247248] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7656369.848837] LNetError: 125264:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7656369.861105] LNetError: 125264:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 223 previous similar messages
[7656579.959073] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7656579.969330] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 463 previous similar messages
[7656674.871103] LNetError: 58838:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7656674.882600] LNetError: 58838:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7656688.778803] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7656777.970285] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583642799/real 1583642799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583642806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7656777.998062] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20884394 previous similar messages
[7656787.988427] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7656787.999120] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20878728 previous similar messages
[7656793.673472] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7656793.682125] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7656800.327571] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.238@o2ib7: -125
[7656800.339750] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7656969.963445] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7656969.975621] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 209 previous similar messages
[7657189.966880] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7657189.977143] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 460 previous similar messages
[7657277.060944] LNetError: 123486:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7657277.072513] LNetError: 123486:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7657377.976895] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583643399/real 1583643399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583643406 ref 2 fl Rpc:eX/2/ffffffff rc 0/-1
[7657378.004668] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20899345 previous similar messages
[7657387.995031] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7657388.005730] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20907373 previous similar messages
[7657394.767245] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7657394.775725] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7657575.970133] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7657575.982306] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 240 previous similar messages
[7657695.554465] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7657695.566643] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 7 previous similar messages
[7657790.972497] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 2 seconds
[7657790.982756] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 576 previous similar messages
[7657879.177551] LNetError: 123486:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7657879.189180] LNetError: 123486:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7657977.983562] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583643999/real 1583643999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583644006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7657978.011334] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21407073 previous similar messages
[7657988.001671] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7657988.012367] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21475788 previous similar messages
[7657995.862109] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7657995.870584] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7658179.976833] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7658179.989005] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 245 previous similar messages
[7658300.620343] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7658300.632520] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7658391.980183] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.219@o2ib7: 0 seconds
[7658391.990443] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 573 previous similar messages
[7658481.258113] LNetError: 123486:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7658481.269689] LNetError: 123486:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7658577.990251] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583644599/real 1583644599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583644606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7658578.018021] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21522859 previous similar messages
[7658588.008360] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7658588.019062] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21460741 previous similar messages
[7658596.956843] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7658596.965318] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7658782.295955] LNetError: 123486:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7658782.308219] LNetError: 123486:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 238 previous similar messages
[7658902.671974] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7658902.684183] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 14 previous similar messages
[7658993.985883] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.225@o2ib7: 0 seconds
[7658993.996144] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 601 previous similar messages
[7659083.322003] LNetError: 59168:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7659083.333489] LNetError: 59168:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7659177.996927] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583645199/real 1583645199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583645206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7659178.024696] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21376713 previous similar messages
[7659188.015067] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7659188.025765] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21366392 previous similar messages
[7659198.052244] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7659198.060780] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7659382.990225] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7659383.002404] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 237 previous similar messages
[7659506.879803] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7659506.891997] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7659594.992577] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.234@o2ib7: 1 seconds
[7659595.002834] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 541 previous similar messages
[7659685.537920] LNetError: 58838:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7659685.549441] LNetError: 58838:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7659778.003620] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583645799/real 1583645799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583645806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7659778.031389] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21051255 previous similar messages
[7659788.021733] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7659788.032472] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21026314 previous similar messages
[7659799.145193] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7659799.153728] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7659984.608988] LNetError: 20306:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7659984.621154] LNetError: 20306:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 241 previous similar messages
[7660112.137464] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.236@o2ib7: -125
[7660112.149667] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7660198.999274] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.10.234@o2ib7: 0 seconds
[7660199.009617] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 561 previous similar messages
[7660287.791686] LNetError: 84226:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7660287.803233] LNetError: 84226:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7660378.010102] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583646399/real 1583646399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583646406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7660378.037870] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20533792 previous similar messages
[7660388.028203] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7660388.038905] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20559519 previous similar messages
[7660400.113626] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7660400.122143] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7660584.823207] LNetError: 32309:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7660584.835433] LNetError: 32309:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 228 previous similar messages
[7660805.005447] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.239@o2ib7: 0 seconds
[7660805.015705] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 524 previous similar messages
[7660839.132558] LNetError: 80403:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7660889.976845] LNetError: 42002:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7660889.988338] LNetError: 42002:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7660978.016263] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583646999/real 1583646999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583647006 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7660978.044037] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20796027 previous similar messages
[7660988.034365] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7660988.045062] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20791806 previous similar messages
[7661001.205875] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7661001.214458] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7661011.376739] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7661011.388920] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7661190.009444] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7661190.021624] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 233 previous similar messages
[7661409.011710] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.224@o2ib7: 0 seconds
[7661409.021969] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 630 previous similar messages
[7661491.990011] LNetError: 86516:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7661492.001489] LNetError: 86516:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7661578.022472] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583647599/real 1583647599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583647606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7661578.050246] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21218565 previous similar messages
[7661588.040539] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7661588.051240] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21226705 previous similar messages
[7661603.173017] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7661603.181495] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7661612.379873] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7661612.392061] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 13 previous similar messages
[7661791.015581] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7661791.027749] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 250 previous similar messages
[7661911.208791] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7662015.018855] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7662015.029111] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 554 previous similar messages
[7662093.972674] LNetError: 42002:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7662093.984159] LNetError: 42002:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7662178.028532] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583648199/real 1583648199]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583648206 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7662178.056298] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20893819 previous similar messages
[7662188.046624] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7662188.057344] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20895297 previous similar messages
[7662205.226957] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7662205.235473] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7662214.359987] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7662214.372196] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 12 previous similar messages
[7662270.481510] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7662391.022751] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7662391.034934] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 239 previous similar messages
[7662617.024245] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.233@o2ib7: 0 seconds
[7662617.034504] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 580 previous similar messages
[7662695.997049] LNetError: 42002:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7662696.008530] LNetError: 42002:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7662778.035053] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583648799/real 1583648799]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583648806 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7662778.062822] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21056487 previous similar messages
[7662788.053175] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7662788.063875] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21053738 previous similar messages
[7662806.280648] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7662806.289120] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7662815.377582] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.232@o2ib7: -125
[7662815.389762] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7662916.759459] LNetError: 80404:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7662991.029458] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7662991.041632] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 253 previous similar messages
[7663218.030955] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7663218.041216] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 617 previous similar messages
[7663297.960257] LNetError: 84226:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7663297.971740] LNetError: 84226:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7663378.041749] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583649399/real 1583649399]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583649406 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7663378.069518] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21394994 previous similar messages
[7663388.059845] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7663388.070545] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21404069 previous similar messages
[7663408.248738] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7663408.257214] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7663421.351358] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7663421.363537] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 8 previous similar messages
[7663546.472331] LNetError: 80402:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7663594.035131] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7663594.047300] LNetError: 80392:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 245 previous similar messages
[7663819.037618] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.227@o2ib7: 0 seconds
[7663819.047878] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 618 previous similar messages
[7663869.765196] LNetError: 80409:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.224@o2ib7 added to recovery queue. Health = 900
[7663869.778339] LNetError: 80409:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) Skipped 2 previous similar messages
[7663869.789366] LustreError: 2946:0:(ldlm_lib.c:3271:target_bulk_io()) @@@ truncated bulk READ 0(4194304)  req@ffff9c11a9588850 x1659415032616320/t0(0) o3->1ccff414-1582-4@10.50.5.29@o2ib2:426/0 lens 488/440 e 0 to 0 dl 1583649921 ref 1 fl Interpret:/0/0 rc 0/0
[7663869.812316] LustreError: 2946:0:(ldlm_lib.c:3271:target_bulk_io()) Skipped 3 previous similar messages
[7663869.822100] Lustre: fir-OST001d: Bulk IO read error with 1ccff414-1582-4 (at 10.50.5.29@o2ib2), client will retry: rc -110
[7663869.833329] Lustre: Skipped 3 previous similar messages
[7663899.883683] LNetError: 42002:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7663899.895169] LNetError: 42002:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7663978.048351] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583649999/real 1583649999]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583650006 ref 2 fl Rpc:eX/2/ffffffff rc 0/-1
[7663978.076127] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 21840089 previous similar messages
[7663988.066479] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7663988.077177] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 21832206 previous similar messages
[7664010.239079] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7664010.247541] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7664023.339908] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.233@o2ib7: -125
[7664023.352107] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 11 previous similar messages
[7664076.236280] Lustre: fir-OST0023: haven't heard from client 1ccff414-1582-4 (at 10.50.5.29@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9c3fedacb400, cur 1583650098 expire 1583649948 last 1583649871
[7664076.256343] Lustre: Skipped 4 previous similar messages
[7664078.234762] Lustre: fir-OST0019: haven't heard from client 1ccff414-1582-4 (at 10.50.5.29@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9c4575786800, cur 1583650100 expire 1583649950 last 1583649873
[7664078.255135] Lustre: Skipped 1 previous similar message
[7664080.238751] Lustre: fir-OST001f: haven't heard from client 1ccff414-1582-4 (at 10.50.5.29@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9c3fcf7f7c00, cur 1583650102 expire 1583649952 last 1583649875
[7664083.229587] Lustre: fir-OST0021: haven't heard from client 1ccff414-1582-4 (at 10.50.5.29@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff9c3fedac9800, cur 1583650105 expire 1583649955 last 1583649878
[7664195.021798] LNetError: 94959:0:(lib-msg.c:485:lnet_handle_local_failure()) ni 10.0.10.106@o2ib7 added to recovery queue. Health = 900
[7664195.033974] LNetError: 94959:0:(lib-msg.c:485:lnet_handle_local_failure()) Skipped 235 previous similar messages
[7664280.905605] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7664342.420486] LNetError: 80401:0:(lib-msg.c:822:lnet_is_health_check()) Msg is in inconsistent state, don't perform health checking (0, 5)
[7664425.044298] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib7: 0 seconds
[7664425.054559] LNet: 80392:0:(o2iblnd_cb.c:3396:kiblnd_check_conns()) Skipped 603 previous similar messages
[7664502.327548] LNetError: 84226:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) 10.0.10.209@o2ib7 rejected: consumer defined fatal error
[7664502.339070] LNetError: 84226:0:(o2iblnd_cb.c:2923:kiblnd_rejected()) Skipped 55 previous similar messages
[7664578.055010] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1583650599/real 1583650599]  req@ffff9c2e05194380 x1652574281110560/t0(0) o106->fir-OST0019@10.9.0.63@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1583650606 ref 1 fl Rpc:eX/2/ffffffff rc 0/-1
[7664578.082806] Lustre: 124006:0:(client.c:2133:ptlrpc_expire_one_request()) Skipped 20170191 previous similar messages
[7664588.073130] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) no route to 10.9.0.63@o2ib4 from <?>
[7664588.083829] LNetError: 80427:0:(lib-move.c:2007:lnet_handle_find_routed_path()) Skipped 20095893 previous similar messages
[7664612.229893] Lustre: fir-OST0019: Client f97c9058-7bce-4 (at 10.49.0.63@o2ib1) reconnecting
[7664612.238464] Lustre: fir-OST0019: Connection restored to 7ef34a8a-27c8-4 (at 10.49.0.63@o2ib1)
[7664612.247241] Lustre: Skipped 6 previous similar messages
[7664627.821583] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Error sending GET to 12345-10.0.10.235@o2ib7: -125
[7664627.833810] LNetError: 80409:0:(lib-move.c:2963:lnet_resend_pending_msgs_locked()) Skipped 15 previous similar messages
[7664653.864670] ll_ost_io03_079 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664653.873113] ll_ost_io03_079 cpuset=/ mems_allowed=3
[7664653.878176] CPU: 15 PID: 90700 Comm: ll_ost_io03_079 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664653.891553] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664653.899386] Call Trace:
[7664653.902017]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664653.907335]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664653.912820]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664653.918655]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664653.924410]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664653.930421]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664653.936783]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664653.942882]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664653.948636]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664653.955170]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664653.961705]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664653.967894]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664653.973908]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664653.980020]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664653.986908]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664653.993066]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664654.000159]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664654.006729]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664654.013384]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664654.020741]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664654.027489]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664654.034836]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664654.042367]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664654.049294]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664654.056387]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664654.064149]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664654.071411]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664654.079284]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664654.086258]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664654.091703]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664654.098183]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664654.105756]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664654.110818]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664654.117094]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664654.123713]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664654.129985] Mem-Info:
[7664654.132457] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:33364 inactive_file:35739 isolated_file:1504
 unevictable:9044 dirty:0 writeback:8 unstable:0
 slab_reclaimable:824017 slab_unreclaimable:62296359
 mapped:1720 shmem:0 pagetables:2953 bounce:0
 free:590242 free_pcp:2 free_cma:0
[7664654.166731] Node 3 Normal free:525304kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:37912kB inactive_file:40156kB unevictable:840kB isolated(anon):0kB isolated(file):5632kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:28kB mapped:844kB shmem:0kB slab_reclaimable:854176kB slab_unreclaimable:62369264kB kernel_stack:4224kB pagetables:3252kB unstable:0kB bounce:0kB free_pcp:8kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:225984 all_unreclaimable? yes
[7664654.213683] lowmem_reserve[]: 0 0 0 0
[7664654.217653] Node 3 Normal: 131743*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 526972kB
[7664654.230062] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664654.238938] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664654.247552] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664654.256426] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664654.265043] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664654.273916] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664654.282529] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664654.291401] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664654.300009] 72622 total pagecache pages
[7664654.304021] 0 pages in swap cache
[7664654.307514] Swap cache stats: add 21120185, delete 21136157, find 4513346/7609731
[7664654.315163] Free swap  = 2001536kB
[7664654.318745] Total swap = 4194300kB
[7664654.322325] 66993253 pages RAM
[7664654.325556] 0 pages HighMem/MovableOnly
[7664654.329569] 1101945 pages reserved
[7664654.333158] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664654.341205] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664654.350165] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664654.358960] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664654.367551] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664654.375733] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664654.383999] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664654.392614] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664654.400875] [53099]     0 53099     6670      239      18      649             0 smartd
[7664654.409053] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664654.417141] [53104]     0 53104    74785      315      85      275             0 sssd
[7664654.425142] [53106]     0 53106     5514      191      15      219             0 irqbalance
[7664654.433669] [53108]     0 53108    38960      175      19       84             0 dsm_sa_eventmgr
[7664654.442629] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664654.450978] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664654.459242] [53159]     0 53159   110203      310     153    22622             0 sssd_be
[7664654.467502] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664654.475853] [53179]     0 53179    71689      281      85      227             0 sssd_pam
[7664654.484204] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664654.493076] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664654.501078] [53861]     0 53861   174315      320     170     4518             0 rsyslogd
[7664654.509434] [53863]     0 53863   176656      249      39     1246             0 collectd
[7664654.517788] [53969]     0 53969    31572      205      20      168             0 crond
[7664654.525874] [54035]     0 54035    27526      164      10       33             0 agetty
[7664654.534048] [54036]     0 54036    27526      158      11       33             0 agetty
[7664654.542229] [54186]     0 54186    22934      210      46      272             0 master
[7664654.550410] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664654.558540] [36317]     0 36317    28294      187      14       61             0 bash
[7664654.566548] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664654.575078] [36329]     0 36329    28177      160      14       55             0 grep
[7664654.583176] [117987]     0 117987   283356      297     509   230727             0 python
[7664654.591542] [76204]    89 76204    25501      252      46      282             0 pickup
[7664654.599720] [97037]     0 97037    50542      270      55     2086             0 lustre.py
[7664654.608156] [97087]     0 97087    34453      276      25     1402             0 mdraid.py
[7664654.616590] [97088]     0 97088    51294      272      55     2323             0 lustre-oss-expo
[7664654.625552] [97173]     0 97173    48653      264      49      261             0 crond
[7664654.633646] [97192]     0 97192    34468      258      25     1344             0 python3
[7664654.641915] [97789]     0 97789    44960      255      44     1248             0 lustre.py
[7664654.650355] [97872]     0 97872    48653      263      49      263             0 crond
[7664654.658442] [97890]     0 97890    31176      229      18      734             0 python3
[7664654.666712] [98004]     0 98004    31176      237      18      711             0 mdraid.py
[7664654.675152] [98087]     0 98087    45129      286      46     1400             0 lustre-oss-expo
[7664654.684111] [98530]     0 98530    31341      228      18      642             0 lustre.py
[7664654.692543] [98579]     0 98579    48653      266      49      235             0 crond
[7664654.700632] [98713]     0 98713    30977      243      16      529             0 python3
[7664654.708900] [98967]     0 98967    30977      239      19      528             0 mdraid.py
[7664654.717340] [99292]     0 99292    48653      257      49      261             0 crond
[7664654.725428] [99349]     0 99349     4779      217      14      463             0 lustre-oss-expo
[7664654.734387] [99450]     0 99450    30913      236      18      446             0 python3
[7664654.742648] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664654.750916] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664654.759878] [100032]     0 100032    48653      266      49      240             0 crond
[7664654.768147] [100105]    89 100105    25553      264      47      274             0 smtp
[7664654.776326] [100203]     0 100203    30816      222      17      351             0 python3
[7664654.784758] [100288]     0 100288     4568      176      14      235             0 lustre.py
[7664654.793364] Out of memory: Kill process 117987 (python) score 3 or sacrifice child
[7664654.801112] Killed process 97088 (lustre-oss-expo) total-vm:205176kB, anon-rss:0kB, file-rss:1088kB, shmem-rss:0kB
[7664654.830370] lustre-oss-expo: page allocation failure: order:0, mode:0x200da
[7664654.837519] CPU: 37 PID: 97088 Comm: lustre-oss-expo Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664654.850909] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664654.858748] Call Trace:
[7664654.861396]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664654.866716]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664654.872815]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664654.878401]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664654.884415]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664654.890948]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664654.897486]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664654.903326]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664654.909943]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664654.916217]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664654.922148]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664654.928158]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664654.934082]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664654.940008]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664654.945590]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664654.950909] Mem-Info:
[7664654.953388] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34512 inactive_file:35547 isolated_file:992
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:823987 slab_unreclaimable:62296552
 mapped:1726 shmem:0 pagetables:2953 bounce:0
 free:590386 free_pcp:0 free_cma:0
[7664654.987574] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664655.029327] lowmem_reserve[]: 0 1418 63868 63868
[7664655.034248] Node 0 DMA32 free:261320kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:900kB inactive_file:3388kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:180kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686436kB kernel_stack:384kB pagetables:16kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:31147 all_unreclaimable? yes
[7664655.079208] lowmem_reserve[]: 0 0 62450 62450
[7664655.083876] Node 0 Normal free:508328kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:43872kB inactive_file:44332kB unevictable:168kB isolated(anon):0kB isolated(file):512kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610908kB slab_unreclaimable:60243084kB kernel_stack:6112kB pagetables:3188kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:332616 all_unreclaimable? yes
[7664655.130651] lowmem_reserve[]: 0 0 0 0
[7664655.134628] Node 1 Normal free:525316kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15556kB inactive_file:15728kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411320kB kernel_stack:20816kB pagetables:3800kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:64724 all_unreclaimable? yes
[7664655.181404] lowmem_reserve[]: 0 0 0 0
[7664655.185380] Node 2 Normal free:525440kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:34280kB inactive_file:37052kB unevictable:8680kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5708kB shmem:0kB slab_reclaimable:715124kB slab_unreclaimable:62476100kB kernel_stack:7936kB pagetables:1556kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:283481 all_unreclaimable? yes
[7664655.232244] lowmem_reserve[]: 0 0 0 0
[7664655.236220] Node 3 Normal free:523388kB min:525460kB low:656824kB high:788188kB active_anon:24kB inactive_anon:0kB active_file:43056kB inactive_file:41952kB unevictable:840kB isolated(anon):0kB isolated(file):2560kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854176kB slab_unreclaimable:62369240kB kernel_stack:4224kB pagetables:3252kB unstable:0kB bounce:0kB free_pcp:2248kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:784740 all_unreclaimable? yes
[7664655.283429] lowmem_reserve[]: 0 0 0 0
[7664655.287403] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664655.302241] Node 0 DMA32: 366*4kB (EM) 393*8kB (UEM) 1217*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261456kB
[7664655.318647] Node 0 Normal: 6433*4kB (UEM) 5775*8kB (UEM) 3898*16kB (UEM) 4479*32kB (EM) 2046*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508668kB
[7664655.335401] Node 1 Normal: 87993*4kB (EM) 21668*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525316kB
[7664655.348469] Node 2 Normal: 27410*4kB (UEM) 40141*8kB (UEM) 837*16kB (UEM) 1683*32kB (UEM) 428*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525408kB
[7664655.363957] Node 3 Normal: 131115*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524460kB
[7664655.376308] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664655.385175] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664655.393780] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664655.402646] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664655.411253] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664655.420119] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664655.428725] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664655.437590] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664655.446199] 72919 total pagecache pages
[7664655.450219] 0 pages in swap cache
[7664655.453713] Swap cache stats: add 21120253, delete 21136225, find 4513355/7609757
[7664655.461367] Free swap  = 2001492kB
[7664655.464953] Total swap = 4194300kB
[7664655.468534] 66993253 pages RAM
[7664655.471774] 0 pages HighMem/MovableOnly
[7664655.475787] 1101945 pages reserved
[7664655.633560] ll_ost_io02_095 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664655.642001] ll_ost_io02_095 cpuset=/ mems_allowed=2
[7664655.647061] CPU: 2 PID: 8682 Comm: ll_ost_io02_095 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664655.660270] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664655.668102] Call Trace:
[7664655.670735]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664655.676051]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664655.681548]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664655.687391]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664655.693142]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664655.699156]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664655.705506]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664655.711599]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664655.717349]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664655.723883]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664655.730418]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664655.736604]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664655.742609]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664655.748720]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664655.755603]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664655.762831]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664655.769003]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664655.775624]  [<ffffffffa021bd89>] ? ___slab_alloc+0x209/0x4f0
[7664655.781554]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664655.787321]  [<ffffffffa006213e>] ? physflat_send_IPI_mask+0xe/0x10
[7664655.793765]  [<ffffffffa0056f42>] ? native_smp_send_reschedule+0x52/0x70
[7664655.800646]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664655.806187]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664655.813284]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664655.821039]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664655.828301]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664655.836180]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664655.843194]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664655.851156]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664655.858459]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664655.864969]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664655.872575]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664655.877630]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664655.883899]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664655.890518]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664655.896783] Mem-Info:
[7664655.899255] active_anon:0 inactive_anon:13 isolated_anon:0
 active_file:34972 inactive_file:34812 isolated_file:1728
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:823987 slab_unreclaimable:62296531
 mapped:1727 shmem:0 pagetables:2953 bounce:0
 free:589976 free_pcp:0 free_cma:0
[7664655.933624] Node 2 Normal free:525408kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:34284kB inactive_file:36920kB unevictable:8680kB isolated(anon):0kB isolated(file):128kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5708kB shmem:0kB slab_reclaimable:715124kB slab_unreclaimable:62476132kB kernel_stack:7936kB pagetables:1556kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:365277 all_unreclaimable? yes
[7664655.980685] lowmem_reserve[]: 0 0 0 0
[7664655.984654] Node 2 Normal: 27413*4kB (UEM) 40141*8kB (UEM) 837*16kB (UEM) 1683*32kB (UEM) 428*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525420kB
[7664656.000142] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664656.009012] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664656.017624] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664656.026490] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664656.035097] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664656.043963] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664656.052571] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664656.061443] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664656.070049] 72934 total pagecache pages
[7664656.074063] 0 pages in swap cache
[7664656.077554] Swap cache stats: add 21120266, delete 21136238, find 4513356/7609760
[7664656.085206] Free swap  = 2010452kB
[7664656.088787] Total swap = 4194300kB
[7664656.092366] 66993253 pages RAM
[7664656.095598] 0 pages HighMem/MovableOnly
[7664656.099611] 1101945 pages reserved
[7664656.103191] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664656.111239] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664656.120198] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664656.128993] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664656.137583] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664656.145765] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664656.154032] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664656.162638] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664656.170898] [53099]     0 53099     6670      239      18      649             0 smartd
[7664656.179081] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664656.187174] [53104]     0 53104    74785      315      85      275             0 sssd
[7664656.195181] [53106]     0 53106     5514      191      15      219             0 irqbalance
[7664656.203702] [53108]     0 53108    38960      175      19       84             0 dsm_sa_eventmgr
[7664656.212663] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664656.221017] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664656.229277] [53159]     0 53159   110203      310     153    22622             0 sssd_be
[7664656.237537] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664656.245889] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664656.254247] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664656.263133] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664656.271155] [53861]     0 53861   174315      320     170     4518             0 rsyslogd
[7664656.279515] [53863]     0 53863   176656      249      39     1246             0 collectd
[7664656.287864] [53969]     0 53969    31572      205      20      168             0 crond
[7664656.295950] [54035]     0 54035    27526      164      10       33             0 agetty
[7664656.304131] [54036]     0 54036    27526      158      11       33             0 agetty
[7664656.312304] [54186]     0 54186    22934      210      46      272             0 master
[7664656.320477] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664656.328593] [36317]     0 36317    28294      187      14       61             0 bash
[7664656.336597] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664656.345117] [36329]     0 36329    28177      160      14       55             0 grep
[7664656.353199] [117987]     0 117987   283356      297     509   230727             0 python
[7664656.361565] [76204]    89 76204    25501      252      46      282             0 pickup
[7664656.369742] [97037]     0 97037    50542      270      55     2086             0 lustre.py
[7664656.378182] [97087]     0 97087    34453      276      25     1402             0 mdraid.py
[7664656.386628] [97173]     0 97173    48653      264      49      261             0 crond
[7664656.394729] [97192]     0 97192    34468      258      25     1344             0 python3
[7664656.402994] [97789]     0 97789    44960      255      44     1248             0 lustre.py
[7664656.411426] [97872]     0 97872    48653      263      49      263             0 crond
[7664656.419512] [97890]     0 97890    31176      229      18      734             0 python3
[7664656.427773] [98004]     0 98004    31176      237      18      711             0 mdraid.py
[7664656.436215] [98087]     0 98087    45129      286      46     1400             0 lustre-oss-expo
[7664656.445175] [98530]     0 98530    31341      228      18      642             0 lustre.py
[7664656.453607] [98579]     0 98579    48653      266      49      235             0 crond
[7664656.461692] [98713]     0 98713    30977      243      16      529             0 python3
[7664656.469953] [98967]     0 98967    30977      239      19      528             0 mdraid.py
[7664656.478385] [99292]     0 99292    48653      257      49      261             0 crond
[7664656.486472] [99349]     0 99349     4779      217      14      469             0 lustre-oss-expo
[7664656.495433] [99450]     0 99450    30913      236      18      446             0 python3
[7664656.503693] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664656.511956] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664656.520913] [100032]     0 100032    48653      266      49      240             0 crond
[7664656.529173] [100105]    89 100105    25553      264      47      274             0 smtp
[7664656.537347] [100203]     0 100203    30816      222      17      351             0 python3
[7664656.545789] [100288]     0 100288     4568      176      14      235             0 lustre.py
[7664656.554400] Out of memory: Kill process 117987 (python) score 3 or sacrifice child
[7664656.562146] Killed process 97037 (lustre.py) total-vm:202168kB, anon-rss:0kB, file-rss:1080kB, shmem-rss:0kB
[7664656.574178] lustre.py: page allocation failure: order:0, mode:0x200da
[7664656.580795] CPU: 34 PID: 97037 Comm: lustre.py Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664656.593653] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664656.601478] Call Trace:
[7664656.604113]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664656.609431]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664656.615527]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664656.621110]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664656.627117]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664656.633653]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664656.640191]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664656.646029]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664656.652648]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664656.658916]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664656.664846]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664656.670860]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664656.676793]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664656.682714]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664656.688289]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664656.693609] Mem-Info:
[7664656.696089] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34476 inactive_file:35213 isolated_file:2208
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:823987 slab_unreclaimable:62296527
 mapped:1725 shmem:0 pagetables:2898 bounce:0
 free:590281 free_pcp:0 free_cma:0
[7664656.730358] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664656.772117] lowmem_reserve[]: 0 1418 63868 63868
[7664656.777049] Node 0 DMA32 free:261312kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:936kB inactive_file:3376kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:176kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686388kB kernel_stack:384kB pagetables:16kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:8546 all_unreclaimable? no
[7664656.821833] lowmem_reserve[]: 0 0 62450 62450
[7664656.826495] Node 0 Normal free:508576kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:47604kB inactive_file:43356kB unevictable:168kB isolated(anon):0kB isolated(file):7936kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610908kB slab_unreclaimable:60243024kB kernel_stack:5952kB pagetables:3068kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:270135 all_unreclaimable? yes
[7664656.873349] lowmem_reserve[]: 0 0 0 0
[7664656.877325] Node 1 Normal free:525352kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15636kB inactive_file:15620kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411320kB kernel_stack:20816kB pagetables:3792kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:98366 all_unreclaimable? yes
[7664656.924104] lowmem_reserve[]: 0 0 0 0
[7664656.928078] Node 2 Normal free:525420kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:29520kB inactive_file:30544kB unevictable:8680kB isolated(anon):0kB isolated(file):4608kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5708kB shmem:0kB slab_reclaimable:715124kB slab_unreclaimable:62476132kB kernel_stack:7936kB pagetables:1544kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:11722 all_unreclaimable? no
[7664656.975025] lowmem_reserve[]: 0 0 0 0
[7664656.978996] Node 3 Normal free:524560kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:43400kB inactive_file:42512kB unevictable:840kB isolated(anon):0kB isolated(file):384kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854176kB slab_unreclaimable:62369244kB kernel_stack:4224kB pagetables:3172kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:237933 all_unreclaimable? yes
[7664657.025769] lowmem_reserve[]: 0 0 0 0
[7664657.029735] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664657.044573] Node 0 DMA32: 392*4kB (UEM) 393*8kB (UEM) 1216*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261544kB
[7664657.061075] Node 0 Normal: 6387*4kB (UEM) 5782*8kB (UEM) 3897*16kB (EM) 4480*32kB (UEM) 2046*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508556kB
[7664657.077830] Node 1 Normal: 88002*4kB (EM) 21668*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525352kB
[7664657.090899] Node 2 Normal: 27641*4kB (UEM) 40336*8kB (UEM) 873*16kB (UEM) 1690*32kB (UEM) 428*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 528692kB
[7664657.106385] Node 3 Normal: 131140*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524560kB
[7664657.118824] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664657.127690] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664657.136296] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664657.145165] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664657.153779] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664657.162654] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664657.171268] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664657.180132] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664657.188738] 72840 total pagecache pages
[7664657.192754] 0 pages in swap cache
[7664657.196243] Swap cache stats: add 21120266, delete 21136238, find 4513356/7609760
[7664657.203896] Free swap  = 2010452kB
[7664657.207476] Total swap = 4194300kB
[7664657.211057] 66993253 pages RAM
[7664657.214289] 0 pages HighMem/MovableOnly
[7664657.218303] 1101945 pages reserved
[7664657.421407] ll_ost_io01_077 invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
[7664657.429593] ll_ost_io01_077 cpuset=/ mems_allowed=1
[7664657.434655] CPU: 41 PID: 90482 Comm: ll_ost_io01_077 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664657.448033] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664657.455857] Call Trace:
[7664657.458492]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664657.463807]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664657.469296]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664657.475134]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664657.480882]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664657.486893]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664657.493248]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664657.499342]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664657.505095]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664657.511626]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664657.518157]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664657.524407]  [<ffffffffc124293f>] tgt_checksum_niobuf_rw+0xbf/0xe00 [ptlrpc]
[7664657.531664]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664657.538979]  [<ffffffffc172e0ac>] ? ofd_preprw+0x5dc/0x11b0 [ofd]
[7664657.545274]  [<ffffffffc0cb71e0>] ? obd_dif_crc_fn+0x20/0x20 [obdclass]
[7664657.552104]  [<ffffffffc1247325>] tgt_brw_read+0xc35/0x1e50 [ptlrpc]
[7664657.558649]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664657.565999]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664657.573345]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664657.580858]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664657.587782]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664657.594868]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664657.602624]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664657.609889]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664657.617754]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664657.624717]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664657.630157]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664657.636629]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664657.644198]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664657.649258]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664657.655526]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664657.662147]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664657.668419] Mem-Info:
[7664657.670879] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:32542 inactive_file:34474 isolated_file:4384
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:823987 slab_unreclaimable:62296487
 mapped:1642 shmem:0 pagetables:2843 bounce:0
 free:590338 free_pcp:0 free_cma:0
[7664657.705154] Node 1 Normal free:525404kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:14996kB inactive_file:15620kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411320kB kernel_stack:20816kB pagetables:3776kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:204093 all_unreclaimable? yes
[7664657.752011] lowmem_reserve[]: 0 0 0 0
[7664657.755980] Node 1 Normal: 88026*4kB (EM) 21671*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525472kB
[7664657.769052] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664657.777928] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664657.786542] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664657.795414] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664657.804019] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664657.812885] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664657.821492] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664657.830358] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664657.838962] 72858 total pagecache pages
[7664657.842977] 0 pages in swap cache
[7664657.846468] Swap cache stats: add 21120271, delete 21136243, find 4513357/7609762
[7664657.854120] Free swap  = 2018900kB
[7664657.857701] Total swap = 4194300kB
[7664657.861282] 66993253 pages RAM
[7664657.864513] 0 pages HighMem/MovableOnly
[7664657.868524] 1101945 pages reserved
[7664657.872103] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664657.880156] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664657.889113] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664657.897907] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664657.906483] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664657.914663] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664657.922929] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664657.931538] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664657.939803] [53099]     0 53099     6670      239      18      649             0 smartd
[7664657.947975] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664657.956062] [53104]     0 53104    74785      315      85      275             0 sssd
[7664657.964062] [53106]     0 53106     5514      189      15      219             0 irqbalance
[7664657.972582] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664657.981543] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664657.989888] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664657.998149] [53159]     0 53159   110203      310     153    22622             0 sssd_be
[7664658.006416] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664658.014771] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664658.023127] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664658.032003] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664658.040011] [53861]     0 53861   174315      320     170     4518             0 rsyslogd
[7664658.048364] [53863]     0 53863   176656      246      39     1246             0 collectd
[7664658.056718] [53969]     0 53969    31572      205      20      168             0 crond
[7664658.064804] [54035]     0 54035    27526      164      10       33             0 agetty
[7664658.072976] [54036]     0 54036    27526      158      11       33             0 agetty
[7664658.081150] [54186]     0 54186    22934      210      46      272             0 master
[7664658.089332] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664658.097427] [36317]     0 36317    28294      187      14       61             0 bash
[7664658.105435] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664658.113962] [36329]     0 36329    28177      160      14       55             0 grep
[7664658.122053] [117987]     0 117987   283356      282     509   230727             0 python
[7664658.130423] [76204]    89 76204    25501      252      46      282             0 pickup
[7664658.138608] [97087]     0 97087    34453      266      25     1402             0 mdraid.py
[7664658.147046] [97173]     0 97173    48653      264      49      261             0 crond
[7664658.155141] [97192]     0 97192    34468      247      25     1344             0 python3
[7664658.163408] [97789]     0 97789    44960      250      44     1248             0 lustre.py
[7664658.171853] [97872]     0 97872    48653      263      49      263             0 crond
[7664658.179943] [97890]     0 97890    31176      214      18      734             0 python3
[7664658.188213] [98004]     0 98004    31176      224      18      711             0 mdraid.py
[7664658.196653] [98087]     0 98087    45129      249      46     1400             0 lustre-oss-expo
[7664658.205615] [98530]     0 98530    31341      224      18      642             0 lustre.py
[7664658.214056] [98579]     0 98579    48653      266      49      235             0 crond
[7664658.222151] [98713]     0 98713    30977      230      16      529             0 python3
[7664658.230418] [98967]     0 98967    30977      227      19      528             0 mdraid.py
[7664658.238852] [99292]     0 99292    48653      257      49      261             0 crond
[7664658.246947] [99349]     0 99349     4779      194      14      469             0 lustre-oss-expo
[7664658.255909] [99450]     0 99450    30913      226      18      446             0 python3
[7664658.264175] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664658.272443] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664658.281408] [100032]     0 100032    48653      266      49      240             0 crond
[7664658.289674] [100105]    89 100105    25553      264      47      274             0 smtp
[7664658.297857] [100203]     0 100203    30816      209      17      333             0 python3
[7664658.306300] [100288]     0 100288     4568      160      14      235             0 lustre.py
[7664658.314910] Out of memory: Kill process 117987 (python) score 3 or sacrifice child
[7664658.322659] Killed process 98087 (lustre-oss-expo) total-vm:180516kB, anon-rss:0kB, file-rss:996kB, shmem-rss:0kB
[7664658.400559] lustre-oss-expo: page allocation failure: order:0, mode:0x200da
[7664658.407702] CPU: 43 PID: 98087 Comm: lustre-oss-expo Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664658.421080] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664658.428910] Call Trace:
[7664658.431543]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664658.436862]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664658.442961]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664658.448536]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664658.454551]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664658.461083]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664658.467610]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664658.473444]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664658.480062]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664658.486330]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664658.492249]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664658.498255]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664658.504175]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664658.510094]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664658.515667]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664658.520979] Mem-Info:
[7664658.523453] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:33535 inactive_file:35253 isolated_file:3840
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:823990 slab_unreclaimable:62296479
 mapped:1642 shmem:0 pagetables:2843 bounce:0
 free:590288 free_pcp:0 free_cma:0
[7664658.557720] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664658.599474] lowmem_reserve[]: 0 1418 63868 63868
[7664658.604398] Node 0 DMA32 free:261344kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:964kB inactive_file:3568kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:176kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686312kB kernel_stack:384kB pagetables:12kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:9892 all_unreclaimable? yes
[7664658.649263] lowmem_reserve[]: 0 0 62450 62450
[7664658.653930] Node 0 Normal free:508440kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:40556kB inactive_file:41756kB unevictable:168kB isolated(anon):0kB isolated(file):7936kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610908kB slab_unreclaimable:60242984kB kernel_stack:5984kB pagetables:3012kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:524518 all_unreclaimable? yes
[7664658.700795] lowmem_reserve[]: 0 0 0 0
[7664658.704768] Node 1 Normal free:525472kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15660kB inactive_file:15492kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411320kB kernel_stack:20816kB pagetables:3776kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:204093 all_unreclaimable? yes
[7664658.751625] lowmem_reserve[]: 0 0 0 0
[7664658.755597] Node 2 Normal free:524828kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32208kB inactive_file:38884kB unevictable:8680kB isolated(anon):0kB isolated(file):512kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5376kB shmem:0kB slab_reclaimable:715124kB slab_unreclaimable:62476112kB kernel_stack:7936kB pagetables:1480kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:282508 all_unreclaimable? yes
[7664658.802624] lowmem_reserve[]: 0 0 0 0
[7664658.806593] Node 3 Normal free:525164kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:41172kB inactive_file:42748kB unevictable:840kB isolated(anon):0kB isolated(file):3968kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854188kB slab_unreclaimable:62369188kB kernel_stack:4224kB pagetables:3092kB unstable:0kB bounce:0kB free_pcp:4kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:4640 all_unreclaimable? no
[7664658.853194] lowmem_reserve[]: 0 0 0 0
[7664658.857161] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664658.871999] Node 0 DMA32: 384*4kB (EM) 396*8kB (UEM) 1211*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261456kB
[7664658.888403] Node 0 Normal: 6378*4kB (EM) 5766*8kB (UEM) 3901*16kB (UEM) 4483*32kB (UEM) 2046*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508552kB
[7664658.905157] Node 1 Normal: 88026*4kB (EM) 21671*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525472kB
[7664658.918228] Node 2 Normal: 27386*4kB (UEM) 40221*8kB (UEM) 837*16kB (UEM) 1667*32kB (UEM) 421*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524992kB
[7664658.933716] Node 3 Normal: 131301*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525204kB
[7664658.946152] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664658.955019] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664658.963625] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664658.972491] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664658.981099] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664658.989963] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664658.998570] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664659.007436] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664659.016041] 73118 total pagecache pages
[7664659.020054] 0 pages in swap cache
[7664659.023547] Swap cache stats: add 21120302, delete 21136274, find 4513363/7609773
[7664659.031199] Free swap  = 2018876kB
[7664659.034780] Total swap = 4194300kB
[7664659.038360] 66993253 pages RAM
[7664659.041590] 0 pages HighMem/MovableOnly
[7664659.045602] 1101945 pages reserved
[7664659.063445] ll_ost_io03_047 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664659.071885] ll_ost_io03_047 cpuset=/ mems_allowed=3
[7664659.076951] CPU: 31 PID: 6896 Comm: ll_ost_io03_047 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664659.090241] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664659.098072] Call Trace:
[7664659.100706]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664659.106022]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664659.111517]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664659.117357]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664659.123105]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664659.129119]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664659.135472]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664659.141574]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664659.147320]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664659.153854]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664659.160381]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664659.166568]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664659.172577]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664659.178691]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664659.185575]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664659.192808]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664659.198975]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664659.205630]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664659.212711]  [<ffffffffc11dcbe7>] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
[7664659.219373]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664659.226464]  [<ffffffffc0c833c9>] ? class_handle2object+0xb9/0x1c0 [obdclass]
[7664659.233772]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664659.239523]  [<ffffffffa00ddd9e>] ? account_entity_dequeue+0xae/0xd0
[7664659.246049]  [<ffffffffa00e192c>] ? dequeue_entity+0x11c/0x5e0
[7664659.252060]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664659.257616]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664659.264721]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664659.272480]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664659.279752]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664659.287621]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664659.294619]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664659.302575]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664659.309831]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664659.316319]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664659.323889]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664659.328951]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664659.335224]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664659.341834]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664659.348100] Mem-Info:
[7664659.350563] active_anon:0 inactive_anon:3 isolated_anon:0
 active_file:33780 inactive_file:35393 isolated_file:3328
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:823991 slab_unreclaimable:62296484
 mapped:1641 shmem:0 pagetables:2797 bounce:0
 free:590287 free_pcp:0 free_cma:0
[7664659.384840] Node 3 Normal free:525168kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:12kB active_file:39756kB inactive_file:41840kB unevictable:840kB isolated(anon):0kB isolated(file):8192kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854188kB slab_unreclaimable:62369192kB kernel_stack:4224kB pagetables:3076kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:814844 all_unreclaimable? yes
[7664659.431796] lowmem_reserve[]: 0 0 0 0
[7664659.435765] Node 3 Normal: 131302*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525208kB
[7664659.448202] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664659.457068] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664659.465676] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664659.474540] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664659.483146] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664659.492012] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664659.500621] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664659.509494] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664659.518098] 73117 total pagecache pages
[7664659.522113] 0 pages in swap cache
[7664659.525604] Swap cache stats: add 21120316, delete 21136288, find 4513363/7609775
[7664659.533257] Free swap  = 2024508kB
[7664659.536836] Total swap = 4194300kB
[7664659.540417] 66993253 pages RAM
[7664659.543649] 0 pages HighMem/MovableOnly
[7664659.547662] 1101945 pages reserved
[7664659.551240] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664659.559290] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664659.568248] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664659.577044] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664659.585648] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664659.593824] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664659.602092] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664659.610706] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664659.618967] [53099]     0 53099     6670      239      18      649             0 smartd
[7664659.627147] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664659.635233] [53104]     0 53104    74785      315      85      275             0 sssd
[7664659.643240] [53106]     0 53106     5514      189      15      219             0 irqbalance
[7664659.651761] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664659.660722] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664659.669078] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664659.677344] [53159]     0 53159   110203      310     153    22622             0 sssd_be
[7664659.685603] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664659.693952] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664659.702303] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664659.711175] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664659.719179] [53861]     0 53861   174315      320     170     4518             0 rsyslogd
[7664659.727527] [53863]     0 53863   176656      246      39     1246             0 collectd
[7664659.735881] [53969]     0 53969    31572      205      20      168             0 crond
[7664659.743974] [54035]     0 54035    27526      164      10       33             0 agetty
[7664659.752147] [54036]     0 54036    27526      158      11       33             0 agetty
[7664659.760320] [54186]     0 54186    22934      210      46      272             0 master
[7664659.768493] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664659.776612] [36317]     0 36317    28294      187      14       61             0 bash
[7664659.784613] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664659.793133] [36329]     0 36329    28177      160      14       55             0 grep
[7664659.801220] [117987]     0 117987   283356      282     509   230727             0 python
[7664659.809589] [76204]    89 76204    25501      252      46      282             0 pickup
[7664659.817768] [97087]     0 97087    34453      266      25     1402             0 mdraid.py
[7664659.826207] [97173]     0 97173    48653      264      49      261             0 crond
[7664659.834301] [97192]     0 97192    34468      247      25     1344             0 python3
[7664659.842569] [97789]     0 97789    44960      250      44     1248             0 lustre.py
[7664659.851009] [97872]     0 97872    48653      263      49      263             0 crond
[7664659.859095] [97890]     0 97890    31176      214      18      734             0 python3
[7664659.867355] [98004]     0 98004    31176      224      18      711             0 mdraid.py
[7664659.875790] [98530]     0 98530    31341      224      18      642             0 lustre.py
[7664659.884230] [98579]     0 98579    48653      266      49      235             0 crond
[7664659.892315] [98713]     0 98713    30977      230      16      529             0 python3
[7664659.900576] [98967]     0 98967    30977      227      19      528             0 mdraid.py
[7664659.909017] [99292]     0 99292    48653      257      49      261             0 crond
[7664659.917111] [99349]     0 99349     4779      194      14      469             0 lustre-oss-expo
[7664659.926065] [99450]     0 99450    30913      226      18      446             0 python3
[7664659.934332] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664659.942590] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664659.951544] [100032]     0 100032    48653      266      49      240             0 crond
[7664659.959813] [100105]    89 100105    25553      264      47      274             0 smtp
[7664659.967995] [100203]     0 100203    30816      203      17      333             0 python3
[7664659.976437] [100288]     0 100288     4568      160      14      235             0 lustre.py
[7664659.985052] Out of memory: Kill process 117987 (python) score 3 or sacrifice child
[7664659.992807] Killed process 97087 (mdraid.py) total-vm:137812kB, anon-rss:0kB, file-rss:1064kB, shmem-rss:0kB
[7664660.348817] mdraid.py: page allocation failure: order:0, mode:0x200da
[7664660.355439] CPU: 8 PID: 97087 Comm: mdraid.py Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664660.368209] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664660.376042] Call Trace:
[7664660.378676]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664660.383996]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664660.390092]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664660.395666]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664660.401683]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664660.408216]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664660.414750]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664660.420593]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664660.427212]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664660.433486]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664660.439407]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664660.445420]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664660.451339]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664660.457256]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664660.462830]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664660.468143] Mem-Info:
[7664660.470620] active_anon:0 inactive_anon:3 isolated_anon:0
 active_file:33385 inactive_file:34779 isolated_file:3621
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:823991 slab_unreclaimable:62296475
 mapped:1635 shmem:0 pagetables:2797 bounce:0
 free:590472 free_pcp:0 free_cma:0
[7664660.504889] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664660.546638] lowmem_reserve[]: 0 1418 63868 63868
[7664660.551560] Node 0 DMA32 free:261328kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:984kB inactive_file:3288kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:172kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686312kB kernel_stack:384kB pagetables:12kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:113582 all_unreclaimable? yes
[7664660.596601] lowmem_reserve[]: 0 0 62450 62450
[7664660.601266] Node 0 Normal free:508652kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:46788kB inactive_file:46692kB unevictable:168kB isolated(anon):0kB isolated(file):2048kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610908kB slab_unreclaimable:60242932kB kernel_stack:5952kB pagetables:2896kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:656327 all_unreclaimable? yes
[7664660.648125] lowmem_reserve[]: 0 0 0 0
[7664660.652095] Node 1 Normal free:525472kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15580kB inactive_file:15572kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411320kB kernel_stack:20816kB pagetables:3776kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:204093 all_unreclaimable? yes
[7664660.698960] lowmem_reserve[]: 0 0 0 0
[7664660.702936] Node 2 Normal free:525336kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31276kB inactive_file:33872kB unevictable:8680kB isolated(anon):0kB isolated(file):4864kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5352kB shmem:0kB slab_reclaimable:715128kB slab_unreclaimable:62476144kB kernel_stack:7936kB pagetables:1428kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:764000 all_unreclaimable? yes
[7664660.750056] lowmem_reserve[]: 0 0 0 0
[7664660.754028] Node 3 Normal free:525196kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:12kB active_file:38940kB inactive_file:43736kB unevictable:840kB isolated(anon):0kB isolated(file):7444kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854188kB slab_unreclaimable:62369192kB kernel_stack:4224kB pagetables:3076kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2078411 all_unreclaimable? yes
[7664660.801078] lowmem_reserve[]: 0 0 0 0
[7664660.805046] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664660.819880] Node 0 DMA32: 381*4kB (EM) 395*8kB (UEM) 1211*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261436kB
[7664660.836288] Node 0 Normal: 6399*4kB (UEM) 5773*8kB (UEM) 3897*16kB (EM) 4481*32kB (UEM) 2046*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508564kB
[7664660.853042] Node 1 Normal: 88026*4kB (EM) 21671*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525472kB
[7664660.866112] Node 2 Normal: 27446*4kB (UEM) 40274*8kB (UEM) 838*16kB (UEM) 1667*32kB (UEM) 421*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525672kB
[7664660.881599] Node 3 Normal: 131308*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525232kB
[7664660.894037] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664660.902902] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664660.911508] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664660.920374] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664660.928982] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664660.937849] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664660.946454] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664660.955320] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664660.963926] 72976 total pagecache pages
[7664660.967941] 0 pages in swap cache
[7664660.971431] Swap cache stats: add 21120316, delete 21136288, find 4513363/7609775
[7664660.979083] Free swap  = 2024508kB
[7664660.982663] Total swap = 4194300kB
[7664660.986245] 66993253 pages RAM
[7664660.989481] 0 pages HighMem/MovableOnly
[7664660.993494] 1101945 pages reserved
[7664660.999826] crond invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
[7664661.007403] crond cpuset=/ mems_allowed=0-3
[7664661.011770] CPU: 13 PID: 53969 Comm: crond Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664661.024286] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664661.032117] Call Trace:
[7664661.034751]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664661.040065]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664661.045553]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664661.051394]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664661.057151]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664661.063163]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664661.069523]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664661.075616]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664661.081363]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664661.087899]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664661.094432]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664661.100264]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664661.106876]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664661.113144]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664661.119065]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664661.125078]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664661.131010]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664661.136934]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664661.142505]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664661.147816] Mem-Info:
[7664661.150293] active_anon:0 inactive_anon:3 isolated_anon:0
 active_file:32182 inactive_file:34249 isolated_file:3781
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:823991 slab_unreclaimable:62296475
 mapped:1635 shmem:0 pagetables:2797 bounce:0
 free:590489 free_pcp:62 free_cma:0
[7664661.184645] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664661.226400] lowmem_reserve[]: 0 1418 63868 63868
[7664661.231322] Node 0 DMA32 free:261380kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:984kB inactive_file:3512kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:172kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686312kB kernel_stack:384kB pagetables:12kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:332256 all_unreclaimable? yes
[7664661.276367] lowmem_reserve[]: 0 0 62450 62450
[7664661.281037] Node 0 Normal free:508556kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:45488kB inactive_file:45736kB unevictable:168kB isolated(anon):0kB isolated(file):2048kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610908kB slab_unreclaimable:60242932kB kernel_stack:6048kB pagetables:2800kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:289578 all_unreclaimable? yes
[7664661.327902] lowmem_reserve[]: 0 0 0 0
[7664661.331872] Node 1 Normal free:525472kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15580kB inactive_file:15572kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411320kB kernel_stack:20816kB pagetables:3776kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:204093 all_unreclaimable? yes
[7664661.378729] lowmem_reserve[]: 0 0 0 0
[7664661.382704] Node 2 Normal free:525440kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31228kB inactive_file:37628kB unevictable:8680kB isolated(anon):0kB isolated(file):1536kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5352kB shmem:0kB slab_reclaimable:715128kB slab_unreclaimable:62476112kB kernel_stack:7936kB pagetables:1424kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1005958 all_unreclaimable? yes
[7664661.429914] lowmem_reserve[]: 0 0 0 0
[7664661.433882] Node 3 Normal free:525160kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:12kB active_file:37152kB inactive_file:39116kB unevictable:840kB isolated(anon):0kB isolated(file):11412kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854188kB slab_unreclaimable:62369188kB kernel_stack:4224kB pagetables:3076kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:690742 all_unreclaimable? yes
[7664661.480916] lowmem_reserve[]: 0 0 0 0
[7664661.484884] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664661.499722] Node 0 DMA32: 379*4kB (EM) 395*8kB (UEM) 1211*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261428kB
[7664661.516127] Node 0 Normal: 6399*4kB (UEM) 5775*8kB (UEM) 3897*16kB (EM) 4481*32kB (UEM) 2046*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508580kB
[7664661.532881] Node 1 Normal: 88026*4kB (EM) 21671*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525472kB
[7664661.545952] Node 2 Normal: 27446*4kB (UEM) 40274*8kB (UEM) 838*16kB (UEM) 1667*32kB (UEM) 421*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525672kB
[7664661.561447] Node 3 Normal: 131298*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525192kB
[7664661.573885] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664661.582754] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664661.591367] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664661.600240] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664661.608848] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664661.617722] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664661.626329] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664661.635193] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664661.643799] 72959 total pagecache pages
[7664661.647815] 0 pages in swap cache
[7664661.651312] Swap cache stats: add 21120330, delete 21136302, find 4513364/7609777
[7664661.658966] Free swap  = 2030140kB
[7664661.662545] Total swap = 4194300kB
[7664661.666127] 66993253 pages RAM
[7664661.669356] 0 pages HighMem/MovableOnly
[7664661.673371] 1101945 pages reserved
[7664661.676949] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664661.684998] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664661.693959] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664661.702751] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664661.711328] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664661.719506] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664661.727775] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664661.736388] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664661.744649] [53099]     0 53099     6670      239      18      649             0 smartd
[7664661.752830] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664661.760924] [53104]     0 53104    74785      315      85      275             0 sssd
[7664661.768929] [53106]     0 53106     5514      189      15      219             0 irqbalance
[7664661.777451] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664661.786405] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664661.794758] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664661.803019] [53159]     0 53159   110203      310     153    22622             0 sssd_be
[7664661.811286] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664661.819633] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664661.827983] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664661.836854] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664661.844852] [53861]     0 53861   174315      320     170     4518             0 rsyslogd
[7664661.853199] [53863]     0 53863   176656      246      39     1246             0 collectd
[7664661.861543] [53969]     0 53969    31572      205      20      168             0 crond
[7664661.869631] [54035]     0 54035    27526      164      10       33             0 agetty
[7664661.877805] [54036]     0 54036    27526      158      11       33             0 agetty
[7664661.885984] [54186]     0 54186    22934      210      46      272             0 master
[7664661.894159] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664661.902259] [36317]     0 36317    28294      187      14       61             0 bash
[7664661.910263] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664661.918791] [36329]     0 36329    28177      160      14       55             0 grep
[7664661.926889] [117987]     0 117987   283356      282     509   230727             0 python
[7664661.935256] [76204]    89 76204    25501      252      46      282             0 pickup
[7664661.943434] [97173]     0 97173    48653      264      49      261             0 crond
[7664661.951524] [97192]     0 97192    34468      247      25     1344             0 python3
[7664661.959793] [97789]     0 97789    44960      250      44     1248             0 lustre.py
[7664661.968233] [97872]     0 97872    48653      263      49      263             0 crond
[7664661.976318] [97890]     0 97890    31176      214      18      734             0 python3
[7664661.984579] [98004]     0 98004    31176      224      18      711             0 mdraid.py
[7664661.993013] [98530]     0 98530    31341      224      18      642             0 lustre.py
[7664662.001452] [98579]     0 98579    48653      266      49      235             0 crond
[7664662.009539] [98713]     0 98713    30977      230      16      529             0 python3
[7664662.017799] [98967]     0 98967    30977      227      19      528             0 mdraid.py
[7664662.026244] [99292]     0 99292    48653      257      49      261             0 crond
[7664662.034334] [99349]     0 99349     4779      194      14      469             0 lustre-oss-expo
[7664662.043287] [99450]     0 99450    30913      226      18      446             0 python3
[7664662.051547] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664662.059817] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664662.068777] [100032]     0 100032    48653      266      49      240             0 crond
[7664662.077047] [100105]    89 100105    25553      264      47      274             0 smtp
[7664662.085226] [100203]     0 100203    30816      203      17      333             0 python3
[7664662.093662] [100288]     0 100288     4568      160      14      235             0 lustre.py
[7664662.102273] Out of memory: Kill process 117987 (python) score 3 or sacrifice child
[7664662.110018] Killed process 97789 (lustre.py) total-vm:179840kB, anon-rss:0kB, file-rss:1000kB, shmem-rss:0kB
[7664662.144458] lustre.py: page allocation failure: order:0, mode:0x200da
[7664662.151079] CPU: 35 PID: 97789 Comm: lustre.py Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664662.163938] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664662.171775] Call Trace:
[7664662.174414]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664662.179730]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664662.185831]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664662.191413]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664662.197427]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664662.203961]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664662.210487]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664662.216319]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664662.222931]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664662.229199]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664662.235117]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664662.241125]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664662.247057]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664662.252982]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664662.258563]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664662.263881] Mem-Info:
[7664662.266362] active_anon:0 inactive_anon:2 isolated_anon:0
 active_file:32816 inactive_file:33553 isolated_file:5472
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:823994 slab_unreclaimable:62296464
 mapped:1630 shmem:0 pagetables:2772 bounce:0
 free:590449 free_pcp:62 free_cma:0
[7664662.300722] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664662.342475] lowmem_reserve[]: 0 1418 63868 63868
[7664662.347403] Node 0 DMA32 free:261252kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1000kB inactive_file:3448kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:172kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686304kB kernel_stack:384kB pagetables:12kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:9308 all_unreclaimable? yes
[7664662.392534] lowmem_reserve[]: 0 0 62450 62450
[7664662.397201] Node 0 Normal free:508824kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:44184kB inactive_file:41768kB unevictable:168kB isolated(anon):0kB isolated(file):5248kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610908kB slab_unreclaimable:60242900kB kernel_stack:5952kB pagetables:2800kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:369749 all_unreclaimable? yes
[7664662.444067] lowmem_reserve[]: 0 0 0 0
[7664662.448047] Node 1 Normal free:525472kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15580kB inactive_file:15572kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411320kB kernel_stack:20816kB pagetables:3776kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:204093 all_unreclaimable? yes
[7664662.494912] lowmem_reserve[]: 0 0 0 0
[7664662.498882] Node 2 Normal free:525232kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31760kB inactive_file:33804kB unevictable:8680kB isolated(anon):0kB isolated(file):10496kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715128kB slab_unreclaimable:62476144kB kernel_stack:7936kB pagetables:1424kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:533557 all_unreclaimable? yes
[7664662.546093] lowmem_reserve[]: 0 0 0 0
[7664662.550071] Node 3 Normal free:525180kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:39944kB inactive_file:43828kB unevictable:840kB isolated(anon):0kB isolated(file):2688kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854204kB slab_unreclaimable:62369188kB kernel_stack:4224kB pagetables:3076kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2020180 all_unreclaimable? yes
[7664662.597025] lowmem_reserve[]: 0 0 0 0
[7664662.600992] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664662.615830] Node 0 DMA32: 375*4kB (EM) 396*8kB (UEM) 1211*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261420kB
[7664662.632236] Node 0 Normal: 6432*4kB (UEM) 5776*8kB (UEM) 3902*16kB (UEM) 4482*32kB (UEM) 2046*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508832kB
[7664662.649077] Node 1 Normal: 88026*4kB (EM) 21671*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525472kB
[7664662.662144] Node 2 Normal: 27412*4kB (UEM) 40262*8kB (EM) 829*16kB (EM) 1665*32kB (EM) 421*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525232kB
[7664662.677286] Node 3 Normal: 131295*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525180kB
[7664662.689724] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664662.698591] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664662.707203] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664662.716071] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664662.724677] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664662.733545] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664662.742149] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664662.751015] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664662.759619] 73026 total pagecache pages
[7664662.763634] 0 pages in swap cache
[7664662.767126] Swap cache stats: add 21120330, delete 21136302, find 4513364/7609777
[7664662.774779] Free swap  = 2030140kB
[7664662.778358] Total swap = 4194300kB
[7664662.781939] 66993253 pages RAM
[7664662.785168] 0 pages HighMem/MovableOnly
[7664662.789182] 1101945 pages reserved
[7664663.129112] ll_ost_io03_110 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664663.137555] ll_ost_io03_110 cpuset=/ mems_allowed=3
[7664663.142617] CPU: 43 PID: 8770 Comm: ll_ost_io03_110 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664663.155915] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664663.163743] Call Trace:
[7664663.166374]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664663.171693]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664663.177182]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664663.183021]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664663.188768]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664663.194780]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664663.201135]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664663.207240]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664663.212991]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664663.219527]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664663.226060]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664663.232238]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664663.238242]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664663.244353]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664663.251241]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664663.257403]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664663.264500]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664663.271066]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664663.277714]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664663.285067]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664663.291809]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664663.299154]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664663.306677]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664663.313591]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664663.320681]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664663.328431]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664663.335687]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664663.343554]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664663.350514]  [<ffffffffa00d7c40>] ? wake_up_state+0x20/0x20
[7664663.356297]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664663.362787]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664663.370362]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664663.375413]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664663.381686]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664663.388302]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664663.394575] Mem-Info:
[7664663.397035] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:32352 inactive_file:33829 isolated_file:4540
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824007 slab_unreclaimable:62296453
 mapped:1628 shmem:0 pagetables:2728 bounce:0
 free:590474 free_pcp:0 free_cma:0
[7664663.431312] Node 3 Normal free:525148kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:40988kB inactive_file:40932kB unevictable:840kB isolated(anon):0kB isolated(file):896kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854244kB slab_unreclaimable:62369188kB kernel_stack:4224kB pagetables:3072kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:200668 all_unreclaimable? yes
[7664663.478091] lowmem_reserve[]: 0 0 0 0
[7664663.482066] Node 3 Normal: 131498*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525992kB
[7664663.494505] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664663.503382] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664663.511995] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664663.520867] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664663.529474] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664663.538339] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664663.546947] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664663.555812] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664663.564419] 72421 total pagecache pages
[7664663.568432] 0 pages in swap cache
[7664663.571923] Swap cache stats: add 21120342, delete 21136314, find 4513365/7609779
[7664663.579575] Free swap  = 2035004kB
[7664663.583154] Total swap = 4194300kB
[7664663.586735] 66993253 pages RAM
[7664663.589965] 0 pages HighMem/MovableOnly
[7664663.593979] 1101945 pages reserved
[7664663.597560] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664663.605605] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664663.614565] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664663.623349] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664663.631948] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664663.640123] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664663.648386] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664663.657002] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664663.665266] [53099]     0 53099     6670      239      18      649             0 smartd
[7664663.673439] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664663.681523] [53104]     0 53104    74785      315      85      275             0 sssd
[7664663.689525] [53106]     0 53106     5514      189      15      219             0 irqbalance
[7664663.698051] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664663.707005] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664663.715351] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664663.723610] [53159]     0 53159   110203      310     153    22622             0 sssd_be
[7664663.731869] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664663.740217] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664663.748571] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664663.757440] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664663.765447] [53861]     0 53861   174315      320     170     4518             0 rsyslogd
[7664663.773799] [53863]     0 53863   176656      246      39     1246             0 collectd
[7664663.782149] [53969]     0 53969    31572      205      20      168             0 crond
[7664663.790241] [54035]     0 54035    27526      164      10       33             0 agetty
[7664663.798421] [54036]     0 54036    27526      158      11       33             0 agetty
[7664663.806596] [54186]     0 54186    22934      210      46      272             0 master
[7664663.814775] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664663.822896] [36317]     0 36317    28294      187      14       61             0 bash
[7664663.830897] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664663.839425] [36329]     0 36329    28177      160      14       55             0 grep
[7664663.847525] [117987]     0 117987   283356      282     509   230727             0 python
[7664663.855891] [76204]    89 76204    25501      252      46      282             0 pickup
[7664663.864070] [97173]     0 97173    48653      264      49      261             0 crond
[7664663.872161] [97192]     0 97192    34468      247      25     1344             0 python3
[7664663.880427] [97872]     0 97872    48653      263      49      263             0 crond
[7664663.888514] [97890]     0 97890    31176      214      18      734             0 python3
[7664663.896781] [98004]     0 98004    31176      224      18      711             0 mdraid.py
[7664663.905214] [98530]     0 98530    31341      224      18      642             0 lustre.py
[7664663.913649] [98579]     0 98579    48653      266      49      235             0 crond
[7664663.921741] [98713]     0 98713    30977      230      16      529             0 python3
[7664663.930003] [98967]     0 98967    30977      227      19      528             0 mdraid.py
[7664663.938442] [99292]     0 99292    48653      257      49      261             0 crond
[7664663.946529] [99349]     0 99349     4779      194      14      469             0 lustre-oss-expo
[7664663.955481] [99450]     0 99450    30913      226      18      446             0 python3
[7664663.963745] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664663.972008] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664663.980963] [100032]     0 100032    48653      266      49      240             0 crond
[7664663.989231] [100105]    89 100105    25553      264      47      274             0 smtp
[7664663.997411] [100203]     0 100203    30816      203      17      333             0 python3
[7664664.005844] [100288]     0 100288     4568      160      14      235             0 lustre.py
[7664664.014450] Out of memory: Kill process 117987 (python) score 3 or sacrifice child
[7664664.022194] Killed process 98004 (mdraid.py) total-vm:124704kB, anon-rss:0kB, file-rss:896kB, shmem-rss:0kB
[7664664.055065] mdraid.py: page allocation failure: order:0, mode:0x200da
[7664664.061689] CPU: 3 PID: 98004 Comm: mdraid.py Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664664.074459] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664664.082297] Call Trace:
[7664664.084933]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664664.090251]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664664.096354]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664664.101944]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664664.107965]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664664.114502]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664664.121033]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664664.126867]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664664.133479]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664664.139744]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664664.145664]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664664.151672]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664664.157591]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664664.163509]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664664.169081]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664664.174397] Mem-Info:
[7664664.176880] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:32448 inactive_file:34176 isolated_file:3580
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824007 slab_unreclaimable:62296448
 mapped:1628 shmem:0 pagetables:2728 bounce:0
 free:590711 free_pcp:272 free_cma:0
[7664664.211333] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664664.253083] lowmem_reserve[]: 0 1418 63868 63868
[7664664.258014] Node 0 DMA32 free:261340kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1024kB inactive_file:3392kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:164kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686300kB kernel_stack:384kB pagetables:12kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:9083 all_unreclaimable? no
[7664664.302892] lowmem_reserve[]: 0 0 62450 62450
[7664664.307562] Node 0 Normal free:508432kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:42476kB inactive_file:42020kB unevictable:168kB isolated(anon):0kB isolated(file):8544kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610916kB slab_unreclaimable:60242708kB kernel_stack:6128kB pagetables:2688kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1408 all_unreclaimable? no
[7664664.354170] lowmem_reserve[]: 0 0 0 0
[7664664.358140] Node 1 Normal free:525504kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15540kB inactive_file:15584kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411320kB kernel_stack:20816kB pagetables:3772kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:52787 all_unreclaimable? yes
[7664664.404914] lowmem_reserve[]: 0 0 0 0
[7664664.408882] Node 2 Normal free:525128kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31912kB inactive_file:35548kB unevictable:8680kB isolated(anon):0kB isolated(file):3200kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715168kB slab_unreclaimable:62476128kB kernel_stack:7936kB pagetables:1368kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:388709 all_unreclaimable? no
[7664664.455921] lowmem_reserve[]: 0 0 0 0
[7664664.459895] Node 3 Normal free:525308kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:41124kB inactive_file:42136kB unevictable:840kB isolated(anon):0kB isolated(file):384kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854256kB slab_unreclaimable:62369164kB kernel_stack:4224kB pagetables:3072kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:322266 all_unreclaimable? no
[7664664.506583] lowmem_reserve[]: 0 0 0 0
[7664664.510550] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664664.525388] Node 0 DMA32: 370*4kB (UEM) 395*8kB (UEM) 1211*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261392kB
[7664664.541891] Node 0 Normal: 6227*4kB (UEM) 5742*8kB (UEM) 3989*16kB (UEM) 4498*32kB (UEM) 2053*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 510092kB
[7664664.558729] Node 1 Normal: 88041*4kB (EM) 21671*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525532kB
[7664664.571799] Node 2 Normal: 27460*4kB (UEM) 40263*8kB (UEM) 874*16kB (UEM) 1692*32kB (UEM) 418*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 526824kB
[7664664.587199] Node 3 Normal: 131471*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525884kB
[7664664.599638] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664664.608506] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664664.617110] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664664.625976] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664664.634582] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664664.643449] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664664.652056] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664664.660923] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664664.669535] 72861 total pagecache pages
[7664664.673548] 0 pages in swap cache
[7664664.677040] Swap cache stats: add 21120342, delete 21136314, find 4513365/7609779
[7664664.684692] Free swap  = 2035004kB
[7664664.688274] Total swap = 4194300kB
[7664664.691853] 66993253 pages RAM
[7664664.695083] 0 pages HighMem/MovableOnly
[7664664.699095] 1101945 pages reserved
[7664665.544714] ll_ost_io00_029 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664665.553167] ll_ost_io00_029 cpuset=/ mems_allowed=0
[7664665.558234] CPU: 28 PID: 123071 Comm: ll_ost_io00_029 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664665.571696] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664665.579520] Call Trace:
[7664665.582156]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664665.587471]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664665.592965]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664665.598808]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664665.604562]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664665.610568]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664665.616927]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664665.623019]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664665.628766]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664665.635294]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664665.641827]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664665.648007]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664665.654012]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664665.660121]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664665.667005]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664665.674232]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664665.680389]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664665.687048]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664665.694134]  [<ffffffffc11dcbe7>] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
[7664665.700795]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664665.707845]  [<ffffffffa00dca58>] ? __enqueue_entity+0x78/0x80
[7664665.713865]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664665.719396]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664665.726495]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664665.734246]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664665.741506]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664665.749373]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664665.756368]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664665.764316]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664665.771570]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664665.778050]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664665.785619]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664665.790682]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664665.796963]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664665.803579]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664665.809850] Mem-Info:
[7664665.812317] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:32796 inactive_file:34009 isolated_file:4828
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824028 slab_unreclaimable:62296402
 mapped:1628 shmem:0 pagetables:2728 bounce:0
 free:590442 free_pcp:0 free_cma:0
[7664665.846584] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664665.888336] lowmem_reserve[]: 0 1418 63868 63868
[7664665.893258] Node 0 DMA32 free:261336kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1032kB inactive_file:3584kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:164kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686300kB kernel_stack:384kB pagetables:12kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:271370 all_unreclaimable? yes
[7664665.938391] lowmem_reserve[]: 0 0 62450 62450
[7664665.943061] Node 0 Normal free:508740kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:36936kB inactive_file:38016kB unevictable:168kB isolated(anon):0kB isolated(file):14960kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610944kB slab_unreclaimable:60242704kB kernel_stack:6240kB pagetables:2688kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1099072 all_unreclaimable? no
[7664665.990018] lowmem_reserve[]: 0 0 0 0
[7664665.993993] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664666.008829] Node 0 DMA32: 370*4kB (UEM) 395*8kB (UEM) 1211*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261392kB
[7664666.025320] Node 0 Normal: 6293*4kB (UEM) 5742*8kB (UEM) 3970*16kB (UEM) 4495*32kB (UEM) 2053*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 509956kB
[7664666.042171] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664666.051035] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664666.059644] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664666.068510] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664666.077115] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664666.085982] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664666.094586] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664666.103453] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664666.112059] 72816 total pagecache pages
[7664666.116073] 0 pages in swap cache
[7664666.119566] Swap cache stats: add 21120351, delete 21136323, find 4513366/7609781
[7664666.127216] Free swap  = 2037812kB
[7664666.130795] Total swap = 4194300kB
[7664666.134376] 66993253 pages RAM
[7664666.137608] 0 pages HighMem/MovableOnly
[7664666.141620] 1101945 pages reserved
[7664666.145200] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664666.153246] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664666.162201] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664666.170994] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664666.179588] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664666.187767] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664666.196034] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664666.204640] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664666.212909] [53099]     0 53099     6670      239      18      649             0 smartd
[7664666.221091] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664666.229183] [53104]     0 53104    74785      315      85      275             0 sssd
[7664666.237183] [53106]     0 53106     5514      189      15      219             0 irqbalance
[7664666.245702] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664666.254654] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664666.263002] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664666.271261] [53159]     0 53159   110203      310     153    22622             0 sssd_be
[7664666.279520] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664666.287866] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664666.296214] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664666.305090] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664666.313102] [53861]     0 53861   174315      320     170     4518             0 rsyslogd
[7664666.321455] [53863]     0 53863   176656      246      39     1246             0 collectd
[7664666.329807] [53969]     0 53969    31572      205      20      168             0 crond
[7664666.337905] [54035]     0 54035    27526      164      10       33             0 agetty
[7664666.346083] [54036]     0 54036    27526      158      11       33             0 agetty
[7664666.354269] [54186]     0 54186    22934      210      46      272             0 master
[7664666.362449] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664666.370573] [36317]     0 36317    28294      187      14       61             0 bash
[7664666.378573] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664666.387092] [36329]     0 36329    28177      160      14       55             0 grep
[7664666.395181] [117987]     0 117987   283356      282     509   230727             0 python
[7664666.403551] [76204]    89 76204    25501      252      46      282             0 pickup
[7664666.411728] [97173]     0 97173    48653      264      49      261             0 crond
[7664666.419818] [97192]     0 97192    34468      247      25     1344             0 python3
[7664666.428077] [97872]     0 97872    48653      263      49      263             0 crond
[7664666.436166] [97890]     0 97890    31176      214      18      734             0 python3
[7664666.444435] [98530]     0 98530    31341      224      18      642             0 lustre.py
[7664666.452872] [98579]     0 98579    48653      266      49      235             0 crond
[7664666.460959] [98713]     0 98713    30977      230      16      529             0 python3
[7664666.469221] [98967]     0 98967    30977      227      19      528             0 mdraid.py
[7664666.477660] [99292]     0 99292    48653      257      49      261             0 crond
[7664666.485746] [99349]     0 99349     4779      194      14      469             0 lustre-oss-expo
[7664666.494699] [99450]     0 99450    30913      226      18      446             0 python3
[7664666.502959] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664666.511219] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664666.520181] [100032]     0 100032    48653      266      49      240             0 crond
[7664666.528448] [100105]    89 100105    25553      264      47      274             0 smtp
[7664666.536619] [100203]     0 100203    30816      203      17      333             0 python3
[7664666.545052] [100288]     0 100288     4568      160      14      235             0 lustre.py
[7664666.553661] Out of memory: Kill process 117987 (python) score 3 or sacrifice child
[7664666.561413] Killed process 98530 (lustre.py) total-vm:125364kB, anon-rss:0kB, file-rss:896kB, shmem-rss:0kB
[7664666.661186] lustre.py: page allocation failure: order:0, mode:0x200da
[7664666.667804] CPU: 12 PID: 98530 Comm: lustre.py Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664666.680671] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664666.688513] Call Trace:
[7664666.691147]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664666.696466]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664666.702565]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664666.708147]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664666.714159]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664666.720686]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664666.727212]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664666.733054]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664666.739675]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664666.745938]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664666.751861]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664666.757875]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664666.763794]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664666.769714]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664666.775286]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664666.780597] Mem-Info:
[7664666.783073] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:32430 inactive_file:34395 isolated_file:4928
 unevictable:9044 dirty:9 writeback:0 unstable:0
 slab_reclaimable:824028 slab_unreclaimable:62296425
 mapped:1628 shmem:0 pagetables:2710 bounce:0
 free:590587 free_pcp:0 free_cma:0
[7664666.817339] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664666.859098] lowmem_reserve[]: 0 1418 63868 63868
[7664666.864034] Node 0 DMA32 free:261336kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1032kB inactive_file:3588kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:164kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686300kB kernel_stack:384kB pagetables:12kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:276266 all_unreclaimable? yes
[7664666.909159] lowmem_reserve[]: 0 0 62450 62450
[7664666.913823] Node 0 Normal free:508708kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:39628kB inactive_file:38440kB unevictable:168kB isolated(anon):0kB isolated(file):16128kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610944kB slab_unreclaimable:60242800kB kernel_stack:6560kB pagetables:2684kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:872275 all_unreclaimable? yes
[7664666.960778] lowmem_reserve[]: 0 0 0 0
[7664666.964750] Node 1 Normal free:525532kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15584kB inactive_file:15512kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:12kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411320kB kernel_stack:20816kB pagetables:3772kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:52787 all_unreclaimable? yes
[7664667.011616] lowmem_reserve[]: 0 0 0 0
[7664667.015591] Node 2 Normal free:525508kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:30544kB inactive_file:35552kB unevictable:8680kB isolated(anon):0kB isolated(file):4224kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:36kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715172kB slab_unreclaimable:62476116kB kernel_stack:7936kB pagetables:1300kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:165945 all_unreclaimable? no
[7664667.062719] lowmem_reserve[]: 0 0 0 0
[7664667.066690] Node 3 Normal free:525420kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:41792kB inactive_file:43096kB unevictable:840kB isolated(anon):0kB isolated(file):512kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854256kB slab_unreclaimable:62369164kB kernel_stack:4224kB pagetables:3072kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:267246 all_unreclaimable? yes
[7664667.113466] lowmem_reserve[]: 0 0 0 0
[7664667.117437] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664667.132277] Node 0 DMA32: 371*4kB (UEM) 397*8kB (UEM) 1212*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261428kB
[7664667.148769] Node 0 Normal: 6079*4kB (UEM) 5734*8kB (UEM) 3927*16kB (UEM) 4492*32kB (UEM) 2053*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508252kB
[7664667.165608] Node 1 Normal: 88041*4kB (EM) 21671*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525532kB
[7664667.178678] Node 2 Normal: 27128*4kB (EM) 40110*8kB (UEM) 874*16kB (UEM) 1688*32kB (UEM) 418*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524144kB
[7664667.193993] Node 3 Normal: 131420*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525680kB
[7664667.206430] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664667.215299] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664667.223910] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664667.232776] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664667.241385] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664667.250251] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664667.258859] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664667.267733] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664667.276349] 73167 total pagecache pages
[7664667.280368] 0 pages in swap cache
[7664667.283859] Swap cache stats: add 21120351, delete 21136323, find 4513366/7609781
[7664667.291512] Free swap  = 2037812kB
[7664667.295090] Total swap = 4194300kB
[7664667.298676] 66993253 pages RAM
[7664667.301910] 0 pages HighMem/MovableOnly
[7664667.305924] 1101945 pages reserved
[7664667.629418] ll_ost_io01_096 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664667.637862] ll_ost_io01_096 cpuset=/ mems_allowed=1
[7664667.642920] CPU: 33 PID: 27189 Comm: ll_ost_io01_096 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664667.656296] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664667.664123] Call Trace:
[7664667.666758]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664667.672072]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664667.677560]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664667.683400]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664667.689147]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664667.695160]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664667.701512]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664667.707604]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664667.713350]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664667.719877]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664667.726405]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664667.732592]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664667.738599]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664667.744705]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664667.751589]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664667.757747]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664667.764846]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664667.771404]  [<ffffffffc11966b2>] ? ldlm_res_hop_get_locked+0x12/0x20 [ptlrpc]
[7664667.778814]  [<ffffffffc0a13297>] ? cfs_hash_bd_lookup_intent+0xf7/0x170 [libcfs]
[7664667.786505]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664667.793853]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664667.800596]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664667.807943]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664667.815465]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664667.822386]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664667.829473]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664667.837224]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664667.844482]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664667.852352]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664667.859351]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664667.867299]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664667.874553]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664667.881027]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664667.888596]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664667.893655]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664667.899925]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664667.906544]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664667.912818] Mem-Info:
[7664667.915279] active_anon:0 inactive_anon:1 isolated_anon:0
 active_file:33575 inactive_file:34350 isolated_file:3200
 unevictable:9044 dirty:9 writeback:0 unstable:0
 slab_reclaimable:824055 slab_unreclaimable:62296433
 mapped:1627 shmem:0 pagetables:2692 bounce:0
 free:590125 free_pcp:0 free_cma:0
[7664667.949550] Node 1 Normal free:525532kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15584kB inactive_file:15512kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:12kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411320kB kernel_stack:20816kB pagetables:3772kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:52787 all_unreclaimable? yes
[7664667.996417] lowmem_reserve[]: 0 0 0 0
[7664668.000381] Node 1 Normal: 88041*4kB (EM) 21671*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525532kB
[7664668.013457] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664668.022329] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664668.030938] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664668.039813] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664668.048425] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664668.057294] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664668.065907] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664668.074772] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664668.083378] 73029 total pagecache pages
[7664668.087389] 7 pages in swap cache
[7664668.090885] Swap cache stats: add 21120389, delete 21136354, find 4513370/7609789
[7664668.098536] Free swap  = 2040356kB
[7664668.102115] Total swap = 4194300kB
[7664668.105694] 66993253 pages RAM
[7664668.108929] 0 pages HighMem/MovableOnly
[7664668.112940] 1101945 pages reserved
[7664668.116519] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664668.124571] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664668.133526] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664668.142317] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664668.150903] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664668.159086] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664668.167352] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664668.175960] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664668.184227] [53099]     0 53099     6670      239      18      649             0 smartd
[7664668.192408] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664668.200493] [53104]     0 53104    74785      315      85      275             0 sssd
[7664668.208494] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664668.217014] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664668.225976] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664668.234331] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664668.242597] [53159]     0 53159   110203      310     153    22622             0 sssd_be
[7664668.250856] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664668.259202] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664668.267550] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664668.276425] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664668.284434] [53861]     0 53861   174315      320     170     4518             0 rsyslogd
[7664668.292792] [53863]     0 53863   176656      246      39     1246             0 collectd
[7664668.301142] [53969]     0 53969    31572      205      20      168             0 crond
[7664668.309240] [54035]     0 54035    27526      164      10       33             0 agetty
[7664668.317418] [54036]     0 54036    27526      158      11       33             0 agetty
[7664668.325602] [54186]     0 54186    22934      210      46      272             0 master
[7664668.333781] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664668.341883] [36317]     0 36317    28294      187      14       61             0 bash
[7664668.349884] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664668.358414] [36329]     0 36329    28177      160      14       55             0 grep
[7664668.366509] [117987]     0 117987   283356      282     509   230727             0 python
[7664668.374879] [76204]    89 76204    25501      252      46      282             0 pickup
[7664668.383066] [97173]     0 97173    48653      264      49      261             0 crond
[7664668.391153] [97192]     0 97192    34468      247      25     1344             0 python3
[7664668.399414] [97872]     0 97872    48653      263      49      263             0 crond
[7664668.407503] [97890]     0 97890    31176      215      18      701             0 python3
[7664668.415774] [98579]     0 98579    48653      266      49      235             0 crond
[7664668.423863] [98713]     0 98713    30977      230      16      529             0 python3
[7664668.432133] [98967]     0 98967    30977      227      19      528             0 mdraid.py
[7664668.440572] [99292]     0 99292    48653      257      49      261             0 crond
[7664668.448661] [99349]     0 99349     4779      194      14      469             0 lustre-oss-expo
[7664668.457624] [99450]     0 99450    30913      226      18      446             0 python3
[7664668.465888] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664668.474149] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664668.483108] [100032]     0 100032    48653      266      49      240             0 crond
[7664668.491369] [100105]    89 100105    25553      264      47      274             0 smtp
[7664668.499549] [100203]     0 100203    30816      203      17      333             0 python3
[7664668.507986] [100288]     0 100288     4568      160      14      235             0 lustre.py
[7664668.516598] Out of memory: Kill process 117987 (python) score 3 or sacrifice child
[7664668.524350] Killed process 98967 (mdraid.py) total-vm:123908kB, anon-rss:0kB, file-rss:908kB, shmem-rss:0kB
[7664668.563588] mdraid.py: page allocation failure: order:0, mode:0x200da
[7664668.570211] CPU: 16 PID: 98967 Comm: mdraid.py Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664668.583072] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664668.590910] Call Trace:
[7664668.593554]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664668.598876]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664668.604974]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664668.610558]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664668.616575]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664668.623105]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664668.629640]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664668.635481]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664668.642099]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664668.648366]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664668.654287]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664668.660293]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664668.666213]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664668.672141]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664668.677722]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664668.683034] Mem-Info:
[7664668.685507] active_anon:0 inactive_anon:4 isolated_anon:0
 active_file:33325 inactive_file:35903 isolated_file:1920
 unevictable:9044 dirty:9 writeback:0 unstable:0
 slab_reclaimable:824056 slab_unreclaimable:62296402
 mapped:1628 shmem:0 pagetables:2692 bounce:0
 free:590365 free_pcp:0 free_cma:0
[7664668.719778] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664668.761539] lowmem_reserve[]: 0 1418 63868 63868
[7664668.766477] Node 0 DMA32 free:261344kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1036kB inactive_file:3540kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:160kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686248kB kernel_stack:384kB pagetables:12kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:10496 all_unreclaimable? yes
[7664668.811519] lowmem_reserve[]: 0 0 62450 62450
[7664668.816181] Node 0 Normal free:508532kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:44380kB inactive_file:45208kB unevictable:168kB isolated(anon):0kB isolated(file):4224kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610944kB slab_unreclaimable:60242840kB kernel_stack:6048kB pagetables:2632kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1947553 all_unreclaimable? yes
[7664668.863138] lowmem_reserve[]: 0 0 0 0
[7664668.867118] Node 1 Normal free:525532kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15584kB inactive_file:15512kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:12kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411320kB kernel_stack:20816kB pagetables:3772kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:52787 all_unreclaimable? yes
[7664668.913977] lowmem_reserve[]: 0 0 0 0
[7664668.917945] Node 2 Normal free:525068kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31116kB inactive_file:38308kB unevictable:8680kB isolated(anon):0kB isolated(file):1536kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:36kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715256kB slab_unreclaimable:62476028kB kernel_stack:7936kB pagetables:1292kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:263825 all_unreclaimable? no
[7664668.965076] lowmem_reserve[]: 0 0 0 0
[7664668.969046] Node 3 Normal free:525080kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:16kB active_file:37244kB inactive_file:37872kB unevictable:840kB isolated(anon):0kB isolated(file):7040kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:852kB shmem:0kB slab_reclaimable:854284kB slab_unreclaimable:62369172kB kernel_stack:4224kB pagetables:3060kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2148692 all_unreclaimable? no
[7664669.015994] lowmem_reserve[]: 0 0 0 0
[7664669.019959] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664669.034798] Node 0 DMA32: 370*4kB (UEM) 397*8kB (UEM) 1214*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261456kB
[7664669.051292] Node 0 Normal: 6219*4kB (UEM) 5737*8kB (UEM) 3923*16kB (UEM) 4491*32kB (UEM) 2053*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508740kB
[7664669.068132] Node 1 Normal: 88041*4kB (EM) 21671*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525532kB
[7664669.081201] Node 2 Normal: 27389*4kB (UEM) 40197*8kB (UEM) 866*16kB (UEM) 1679*32kB (UEM) 416*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525340kB
[7664669.096602] Node 3 Normal: 131249*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524996kB
[7664669.109039] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664669.117908] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664669.126522] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664669.135395] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664669.144002] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664669.152868] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664669.161477] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664669.170348] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664669.178955] 73025 total pagecache pages
[7664669.182969] 0 pages in swap cache
[7664669.186470] Swap cache stats: add 21120390, delete 21136362, find 4513370/7609789
[7664669.194120] Free swap  = 2040356kB
[7664669.197700] Total swap = 4194300kB
[7664669.201282] 66993253 pages RAM
[7664669.204511] 0 pages HighMem/MovableOnly
[7664669.208524] 1101945 pages reserved
[7664669.386713] ll_ost_io03_071 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664669.395183] ll_ost_io03_071 cpuset=/ mems_allowed=3
[7664669.400254] CPU: 47 PID: 90679 Comm: ll_ost_io03_071 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664669.413642] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664669.421469] Call Trace:
[7664669.424102]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664669.429419]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664669.434914]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664669.440754]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664669.446511]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664669.452527]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664669.458890]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664669.464992]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664669.470744]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664669.477275]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664669.483803]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664669.489987]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664669.495998]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664669.502111]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664669.509000]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664669.515170]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664669.522284]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664669.528818]  [<ffffffffa021ab4e>] ? kmalloc_order_trace+0x2e/0xa0
[7664669.535101]  [<ffffffffa021e721>] ? __kmalloc+0x211/0x230
[7664669.540719]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664669.548078]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664669.554834]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664669.562194]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664669.569714]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664669.576655]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664669.583758]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664669.591515]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664669.598802]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664669.606669]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664669.613644]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664669.619090]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664669.625573]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664669.633152]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664669.638209]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664669.644486]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664669.651109]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664669.657377] Mem-Info:
[7664669.659838] active_anon:0 inactive_anon:8 isolated_anon:0
 active_file:33079 inactive_file:34284 isolated_file:5344
 unevictable:9044 dirty:9 writeback:0 unstable:0
 slab_reclaimable:824056 slab_unreclaimable:62296402
 mapped:1628 shmem:0 pagetables:2673 bounce:0
 free:590386 free_pcp:0 free_cma:0
[7664669.694145] Node 3 Normal free:525080kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:16kB active_file:39404kB inactive_file:43096kB unevictable:840kB isolated(anon):0kB isolated(file):9472kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:852kB shmem:0kB slab_reclaimable:854284kB slab_unreclaimable:62369172kB kernel_stack:4224kB pagetables:3044kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:515512 all_unreclaimable? yes
[7664669.741119] lowmem_reserve[]: 0 0 0 0
[7664669.745097] Node 3 Normal: 131253*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525012kB
[7664669.757549] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664669.766425] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664669.775040] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664669.783914] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664669.792529] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664669.801403] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664669.810007] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664669.818878] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664669.827490] 73029 total pagecache pages
[7664669.831510] 0 pages in swap cache
[7664669.835013] Swap cache stats: add 21120396, delete 21136368, find 4513371/7609791
[7664669.842664] Free swap  = 2042404kB
[7664669.846244] Total swap = 4194300kB
[7664669.849825] 66993253 pages RAM
[7664669.853054] 0 pages HighMem/MovableOnly
[7664669.857070] 1101945 pages reserved
[7664669.860649] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664669.868700] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664669.877662] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664669.886457] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664669.895057] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664669.903241] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664669.911508] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664669.920123] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664669.928390] [53099]     0 53099     6670      239      18      649             0 smartd
[7664669.936570] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664669.944660] [53104]     0 53104    74785      315      85      275             0 sssd
[7664669.952670] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664669.961201] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664669.970166] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664669.978521] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664669.986789] [53159]     0 53159   110203      310     153    22622             0 sssd_be
[7664669.995058] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664670.003410] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664670.011767] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664670.020643] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664670.028653] [53861]     0 53861   174315      320     170     4518             0 rsyslogd
[7664670.037002] [53863]     0 53863   176656      246      39     1246             0 collectd
[7664670.045360] [53969]     0 53969    31572      205      20      168             0 crond
[7664670.053453] [54035]     0 54035    27526      164      10       33             0 agetty
[7664670.061633] [54036]     0 54036    27526      158      11       33             0 agetty
[7664670.069821] [54186]     0 54186    22934      210      46      272             0 master
[7664670.077996] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664670.086147] [36317]     0 36317    28294      187      14       61             0 bash
[7664670.094151] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664670.102678] [36329]     0 36329    28177      160      14       55             0 grep
[7664670.110777] [117987]     0 117987   283356      282     509   230727             0 python
[7664670.119148] [76204]    89 76204    25501      252      46      282             0 pickup
[7664670.127334] [97173]     0 97173    48653      264      49      261             0 crond
[7664670.135423] [97192]     0 97192    34468      247      25     1344             0 python3
[7664670.143694] [97872]     0 97872    48653      263      49      263             0 crond
[7664670.151788] [97890]     0 97890    31176      215      18      701             0 python3
[7664670.160070] [98579]     0 98579    48653      266      49      235             0 crond
[7664670.168169] [98713]     0 98713    30977      230      16      529             0 python3
[7664670.176448] [99292]     0 99292    48653      257      49      261             0 crond
[7664670.184539] [99349]     0 99349     4779      194      14      469             0 lustre-oss-expo
[7664670.193499] [99450]     0 99450    30913      226      18      446             0 python3
[7664670.201775] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664670.210055] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664670.219011] [100032]     0 100032    48653      266      49      240             0 crond
[7664670.227274] [100105]    89 100105    25553      264      47      274             0 smtp
[7664670.235459] [100203]     0 100203    30816      203      17      333             0 python3
[7664670.243899] [100288]     0 100288     4568      160      14      235             0 lustre.py
[7664670.252512] Out of memory: Kill process 117987 (python) score 3 or sacrifice child
[7664670.260269] Killed process 99349 (lustre-oss-expo) total-vm:19116kB, anon-rss:0kB, file-rss:776kB, shmem-rss:0kB
[7664670.464170] ll_ost_io02_000 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664670.472632] ll_ost_io02_000 cpuset=/ mems_allowed=2
[7664670.477700] CPU: 14 PID: 101256 Comm: ll_ost_io02_000 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664670.491198] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664670.499025] Call Trace:
[7664670.501662]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664670.506976]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664670.512472]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664670.518313]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664670.524065]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664670.530071]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664670.536424]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664670.542518]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664670.548271]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664670.554797]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664670.561325]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664670.567511]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664670.573516]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664670.579625]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664670.586508]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664670.593734]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664670.599901]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664670.606523]  [<ffffffffa021bd89>] ? ___slab_alloc+0x209/0x4f0
[7664670.612483]  [<ffffffffc11dcbe7>] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
[7664670.619134]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664670.626187]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664670.631941]  [<ffffffffa00ddd9e>] ? account_entity_dequeue+0xae/0xd0
[7664670.638466]  [<ffffffffa00e192c>] ? dequeue_entity+0x11c/0x5e0
[7664670.644473]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664670.650003]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664670.657099]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664670.664848]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664670.672105]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664670.679973]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664670.686941]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664670.692383]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664670.698864]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664670.706433]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664670.711492]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664670.717761]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664670.724380]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664670.730646] Mem-Info:
[7664670.733104] active_anon:0 inactive_anon:12 isolated_anon:0
 active_file:32083 inactive_file:35695 isolated_file:4187
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824056 slab_unreclaimable:62296402
 mapped:1627 shmem:0 pagetables:2659 bounce:0
 free:590537 free_pcp:0 free_cma:0
[7664670.767460] Node 2 Normal free:525388kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:32kB active_file:29828kB inactive_file:36012kB unevictable:8680kB isolated(anon):0kB isolated(file):7808kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715256kB slab_unreclaimable:62476028kB kernel_stack:7936kB pagetables:1240kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:784850 all_unreclaimable? yes
[7664670.814669] lowmem_reserve[]: 0 0 0 0
[7664670.818637] Node 2 Normal: 27503*4kB (UEM) 40283*8kB (UEM) 874*16kB (UEM) 1665*32kB (UEM) 415*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 526100kB
[7664670.834127] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664670.842995] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664670.851610] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664670.860483] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664670.869089] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664670.877953] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664670.886562] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664670.895426] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664670.904039] 72834 total pagecache pages
[7664670.908063] 0 pages in swap cache
[7664670.911561] Swap cache stats: add 21120405, delete 21136377, find 4513373/7609793
[7664670.919215] Free swap  = 2044188kB
[7664670.922795] Total swap = 4194300kB
[7664670.926375] 66993253 pages RAM
[7664670.929606] 0 pages HighMem/MovableOnly
[7664670.933620] 1101945 pages reserved
[7664670.937198] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664670.945245] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664670.954206] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664670.962996] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664670.971595] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664670.979775] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664670.988041] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664670.996647] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664671.004909] [53099]     0 53099     6670      239      18      649             0 smartd
[7664671.013089] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664671.021183] [53104]     0 53104    74785      315      85      275             0 sssd
[7664671.029191] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664671.037711] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664671.046673] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664671.055027] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664671.063287] [53159]     0 53159   110203      310     153    22622             0 sssd_be
[7664671.071555] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664671.079910] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664671.088263] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664671.097132] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664671.105138] [53861]     0 53861   174315      320     170     4518             0 rsyslogd
[7664671.113487] [53863]     0 53863   176656      246      39     1246             0 collectd
[7664671.121912] [53969]     0 53969    31572      205      20      168             0 crond
[7664671.130056] [54035]     0 54035    27526      164      10       33             0 agetty
[7664671.138336] [54036]     0 54036    27526      158      11       33             0 agetty
[7664671.146535] [54186]     0 54186    22934      210      46      272             0 master
[7664671.154755] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664671.162938] [36317]     0 36317    28294      187      14       61             0 bash
[7664671.170961] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664671.179480] [36329]     0 36329    28177      160      14       55             0 grep
[7664671.187589] [117987]     0 117987   283356      282     509   230727             0 python
[7664671.195957] [76204]    89 76204    25501      252      46      282             0 pickup
[7664671.204137] [97173]     0 97173    48653      264      49      261             0 crond
[7664671.212226] [97192]     0 97192    34468      247      25     1344             0 python3
[7664671.220496] [97872]     0 97872    48653      263      49      263             0 crond
[7664671.228590] [97890]     0 97890    31176      215      18      701             0 python3
[7664671.236864] [98579]     0 98579    48653      266      49      235             0 crond
[7664671.244953] [98713]     0 98713    30977      230      16      529             0 python3
[7664671.253220] [99292]     0 99292    48653      257      49      261             0 crond
[7664671.261314] [99450]     0 99450    30913      226      18      446             0 python3
[7664671.269581] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664671.277843] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664671.286803] [100032]     0 100032    48653      266      49      240             0 crond
[7664671.295079] [100105]    89 100105    25553      264      47      274             0 smtp
[7664671.303264] [100203]     0 100203    30816      203      17      333             0 python3
[7664671.311734] [100288]     0 100288     4568      160      14      235             0 lustre.py
[7664671.320341] Out of memory: Kill process 117987 (python) score 3 or sacrifice child
[7664671.328089] Killed process 100288 (lustre.py) total-vm:18272kB, anon-rss:0kB, file-rss:640kB, shmem-rss:0kB
[7664671.400244] lustre.py: page allocation failure: order:0, mode:0x200da
[7664671.406866] CPU: 35 PID: 100288 Comm: lustre.py Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664671.419810] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664671.427636] Call Trace:
[7664671.430267]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664671.435585]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664671.441678]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664671.447260]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664671.453274]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664671.459808]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664671.466336]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664671.472175]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664671.478788]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664671.485053]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664671.490974]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664671.496979]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664671.502901]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664671.508827]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664671.514403]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664671.519723] Mem-Info:
[7664671.522205] active_anon:0 inactive_anon:4 isolated_anon:0
 active_file:32568 inactive_file:35967 isolated_file:4224
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824056 slab_unreclaimable:62296396
 mapped:1627 shmem:0 pagetables:2659 bounce:0
 free:590301 free_pcp:0 free_cma:0
[7664671.556473] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664671.598214] lowmem_reserve[]: 0 1418 63868 63868
[7664671.603139] Node 0 DMA32 free:261296kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1040kB inactive_file:3360kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:160kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686248kB kernel_stack:384kB pagetables:12kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:11575 all_unreclaimable? yes
[7664671.648180] lowmem_reserve[]: 0 0 62450 62450
[7664671.652844] Node 0 Normal free:508180kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:16kB active_file:44696kB inactive_file:45544kB unevictable:168kB isolated(anon):0kB isolated(file):4736kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610944kB slab_unreclaimable:60242824kB kernel_stack:6320kB pagetables:2632kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:843430 all_unreclaimable? yes
[7664671.699790] lowmem_reserve[]: 0 0 0 0
[7664671.703760] Node 1 Normal free:525572kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15712kB inactive_file:15388kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411320kB kernel_stack:20816kB pagetables:3764kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:240998 all_unreclaimable? yes
[7664671.750621] lowmem_reserve[]: 0 0 0 0
[7664671.754591] Node 2 Normal free:525068kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:29992kB inactive_file:34336kB unevictable:8680kB isolated(anon):0kB isolated(file):7296kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715256kB slab_unreclaimable:62476020kB kernel_stack:7936kB pagetables:1240kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:397513 all_unreclaimable? no
[7664671.801616] lowmem_reserve[]: 0 0 0 0
[7664671.805586] Node 3 Normal free:525204kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:42104kB inactive_file:43092kB unevictable:840kB isolated(anon):0kB isolated(file):2176kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854284kB slab_unreclaimable:62369172kB kernel_stack:4224kB pagetables:2988kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:569827 all_unreclaimable? yes
[7664671.852448] lowmem_reserve[]: 0 0 0 0
[7664671.856413] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664671.871252] Node 0 DMA32: 368*4kB (UEM) 397*8kB (UEM) 1215*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261464kB
[7664671.887747] Node 0 Normal: 6197*4kB (UEM) 5708*8kB (UEM) 3928*16kB (UEM) 4493*32kB (UEM) 2053*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508564kB
[7664671.904585] Node 1 Normal: 88060*4kB (EM) 21671*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525608kB
[7664671.917655] Node 2 Normal: 27473*4kB (UEM) 40276*8kB (UEM) 869*16kB (UEM) 1662*32kB (UEM) 415*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525748kB
[7664671.933142] Node 3 Normal: 131315*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525260kB
[7664671.945579] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664671.954447] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664671.963061] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664671.971927] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664671.980535] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664671.989409] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664671.998024] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664672.006898] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664672.015504] 72941 total pagecache pages
[7664672.019518] 0 pages in swap cache
[7664672.023007] Swap cache stats: add 21120405, delete 21136377, find 4513373/7609793
[7664672.030661] Free swap  = 2044188kB
[7664672.034239] Total swap = 4194300kB
[7664672.037820] 66993253 pages RAM
[7664672.041052] 0 pages HighMem/MovableOnly
[7664672.045063] 1101945 pages reserved
[7664672.303880] ll_ost_io03_031 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664672.312319] ll_ost_io03_031 cpuset=/ mems_allowed=3
[7664672.317379] CPU: 47 PID: 119038 Comm: ll_ost_io03_031 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664672.330843] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664672.338672] Call Trace:
[7664672.341302]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664672.346623]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664672.352115]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664672.357956]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664672.363703]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664672.369719]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664672.376077]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664672.382172]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664672.387926]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664672.394460]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664672.400995]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664672.407184]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664672.413198]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664672.419315]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664672.426200]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664672.433432]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664672.439595]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664672.446211]  [<ffffffffa002a59e>] ? __switch_to+0xce/0x580
[7664672.451879]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664672.457632]  [<ffffffffa00ddd9e>] ? account_entity_dequeue+0xae/0xd0
[7664672.464165]  [<ffffffffa00e192c>] ? dequeue_entity+0x11c/0x5e0
[7664672.470171]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664672.475699]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664672.482785]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664672.490538]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664672.497797]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664672.505663]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664672.512622]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664672.518056]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664672.524529]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664672.532099]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664672.537156]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664672.543425]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664672.550045]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664672.556310] Mem-Info:
[7664672.558771] active_anon:0 inactive_anon:2 isolated_anon:0
 active_file:34376 inactive_file:35546 isolated_file:1312
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824046 slab_unreclaimable:62296402
 mapped:1620 shmem:0 pagetables:2645 bounce:0
 free:590455 free_pcp:0 free_cma:0
[7664672.593049] Node 3 Normal free:525328kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:8kB active_file:41904kB inactive_file:42400kB unevictable:840kB isolated(anon):0kB isolated(file):512kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854284kB slab_unreclaimable:62369164kB kernel_stack:4224kB pagetables:2936kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:146304 all_unreclaimable? no
[7664672.639738] lowmem_reserve[]: 0 0 0 0
[7664672.643703] Node 3 Normal: 131384*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525536kB
[7664672.656141] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664672.665011] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664672.673614] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664672.682481] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664672.691088] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664672.699953] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664672.708561] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664672.717426] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664672.726037] 72957 total pagecache pages
[7664672.730053] 0 pages in swap cache
[7664672.733547] Swap cache stats: add 21120410, delete 21136382, find 4513374/7609796
[7664672.741200] Free swap  = 2045208kB
[7664672.744776] Total swap = 4194300kB
[7664672.748358] 66993253 pages RAM
[7664672.751589] 0 pages HighMem/MovableOnly
[7664672.755602] 1101945 pages reserved
[7664672.759180] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664672.767231] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664672.776188] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664672.784979] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664672.793581] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664672.801755] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664672.810024] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664672.818640] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664672.826907] [53099]     0 53099     6670      239      18      649             0 smartd
[7664672.835088] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664672.843183] [53104]     0 53104    74785      315      85      275             0 sssd
[7664672.851191] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664672.859722] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664672.868680] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664672.877034] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664672.885296] [53159]     0 53159   110203      310     153    22622             0 sssd_be
[7664672.893561] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664672.901909] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664672.910261] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664672.919132] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664672.927137] [53861]     0 53861   174315      320     170     4518             0 rsyslogd
[7664672.935484] [53863]     0 53863   176656      246      39     1246             0 collectd
[7664672.943840] [53969]     0 53969    31572      205      20      168             0 crond
[7664672.951934] [54035]     0 54035    27526      164      10       33             0 agetty
[7664672.960112] [54036]     0 54036    27526      158      11       33             0 agetty
[7664672.968285] [54186]     0 54186    22934      210      46      272             0 master
[7664672.976458] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664672.984581] [36317]     0 36317    28294      187      14       61             0 bash
[7664672.992588] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664673.001108] [36329]     0 36329    28177      160      14       55             0 grep
[7664673.009196] [117987]     0 117987   283356      282     509   230727             0 python
[7664673.017565] [76204]    89 76204    25501      252      46      282             0 pickup
[7664673.025745] [97173]     0 97173    48653      264      49      261             0 crond
[7664673.033833] [97192]     0 97192    34468      247      25     1344             0 python3
[7664673.042096] [97872]     0 97872    48653      263      49      263             0 crond
[7664673.050188] [97890]     0 97890    31176      215      18      701             0 python3
[7664673.058449] [98579]     0 98579    48653      266      49      235             0 crond
[7664673.066541] [98713]     0 98713    30977      230      16      529             0 python3
[7664673.074802] [99292]     0 99292    48653      257      49      261             0 crond
[7664673.082896] [99450]     0 99450    30913      226      18      446             0 python3
[7664673.091154] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664673.099416] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664673.108376] [100032]     0 100032    48653      266      49      240             0 crond
[7664673.116636] [100105]    89 100105    25553      264      47      274             0 smtp
[7664673.124810] [100203]     0 100203    30816      203      17      333             0 python3
[7664673.133251] Out of memory: Kill process 117987 (python) score 3 or sacrifice child
[7664673.140988] Killed process 117987 (python) total-vm:1133424kB, anon-rss:0kB, file-rss:1128kB, shmem-rss:0kB
[7664673.160364] python: page allocation failure: order:0, mode:0x200da
[7664673.166724] CPU: 30 PID: 117987 Comm: python Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664673.179410] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664673.187246] Call Trace:
[7664673.189890]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664673.195211]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664673.201310]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664673.206886]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664673.212900]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664673.219433]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664673.225959]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664673.231801]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664673.238421]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664673.244690]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664673.250623]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664673.256634]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664673.262559]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664673.268488]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664673.274071]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664673.279390] Mem-Info:
[7664673.281863] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:32644 inactive_file:34792 isolated_file:4128
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824041 slab_unreclaimable:62296405
 mapped:1621 shmem:0 pagetables:2645 bounce:0
 free:590149 free_pcp:182 free_cma:0
[7664673.316306] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664673.358059] lowmem_reserve[]: 0 1418 63868 63868
[7664673.362987] Node 0 DMA32 free:261332kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1060kB inactive_file:3572kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:136kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686248kB kernel_stack:384kB pagetables:12kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:7737 all_unreclaimable? yes
[7664673.407946] lowmem_reserve[]: 0 0 62450 62450
[7664673.412615] Node 0 Normal free:508568kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:46136kB inactive_file:46500kB unevictable:168kB isolated(anon):0kB isolated(file):256kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610944kB slab_unreclaimable:60242792kB kernel_stack:6224kB pagetables:2628kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:160638 all_unreclaimable? yes
[7664673.459388] lowmem_reserve[]: 0 0 0 0
[7664673.463359] Node 1 Normal free:525352kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15616kB inactive_file:15656kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411332kB kernel_stack:20816kB pagetables:3764kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:404311 all_unreclaimable? yes
[7664673.510223] lowmem_reserve[]: 0 0 0 0
[7664673.514198] Node 2 Normal free:524648kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:28168kB inactive_file:32448kB unevictable:8680kB isolated(anon):0kB isolated(file):11648kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715204kB slab_unreclaimable:62476084kB kernel_stack:7936kB pagetables:1240kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:511039 all_unreclaimable? yes
[7664673.561406] lowmem_reserve[]: 0 0 0 0
[7664673.565375] Node 3 Normal free:524912kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:41692kB inactive_file:42612kB unevictable:840kB isolated(anon):0kB isolated(file):512kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854276kB slab_unreclaimable:62369164kB kernel_stack:4224kB pagetables:2936kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:272344 all_unreclaimable? yes
[7664673.612150] lowmem_reserve[]: 0 0 0 0
[7664673.616118] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664673.630973] Node 0 DMA32: 363*4kB (UEM) 397*8kB (UEM) 1215*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261444kB
[7664673.647633] Node 0 Normal: 6237*4kB (UEM) 5710*8kB (UEM) 3929*16kB (UEM) 4491*32kB (UEM) 2053*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508692kB
[7664673.664504] Node 1 Normal: 88021*4kB (UEM) 21658*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525348kB
[7664673.677668] Node 2 Normal: 27439*4kB (UEM) 40237*8kB (EM) 870*16kB (UEM) 1650*32kB (UEM) 412*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524740kB
[7664673.693086] Node 3 Normal: 131254*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525016kB
[7664673.705492] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664673.714375] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664673.722992] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664673.731882] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664673.740494] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664673.749361] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664673.757966] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664673.766836] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664673.775446] 73277 total pagecache pages
[7664673.779462] 0 pages in swap cache
[7664673.782953] Swap cache stats: add 21120410, delete 21136382, find 4513374/7609796
[7664673.790607] Free swap  = 2045208kB
[7664673.794184] Total swap = 4194300kB
[7664673.797765] 66993253 pages RAM
[7664673.800995] 0 pages HighMem/MovableOnly
[7664673.805011] 1101945 pages reserved
[7664675.124509] ll_ost_io02_078 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664675.132960] ll_ost_io02_078 cpuset=/ mems_allowed=2
[7664675.138026] CPU: 14 PID: 83189 Comm: ll_ost_io02_078 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664675.151440] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664675.159280] Call Trace:
[7664675.161927]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664675.167247]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664675.172746]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664675.178584]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664675.184340]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664675.190360]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664675.196724]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664675.202832]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664675.208617]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664675.215153]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664675.221695]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664675.227882]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664675.233893]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664675.240004]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664675.246886]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664675.253055]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664675.260168]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664675.266743]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664675.273399]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664675.280763]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664675.287503]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664675.294853]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664675.302375]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664675.309296]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664675.316380]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664675.324132]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664675.331399]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664675.339280]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664675.346285]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664675.354257]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664675.361512]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664675.368003]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664675.375572]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664675.380637]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664675.386914]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664675.393534]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664675.399807] Mem-Info:
[7664675.402269] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:32702 inactive_file:34505 isolated_file:5345
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824042 slab_unreclaimable:62296393
 mapped:1621 shmem:0 pagetables:2269 bounce:0
 free:590010 free_pcp:0 free_cma:0
[7664675.436573] Node 2 Normal free:524868kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:27956kB inactive_file:32352kB unevictable:8680kB isolated(anon):0kB isolated(file):13696kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715204kB slab_unreclaimable:62476116kB kernel_stack:7936kB pagetables:1240kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:762123 all_unreclaimable? yes
[7664675.483800] lowmem_reserve[]: 0 0 0 0
[7664675.487775] Node 2 Normal: 27509*4kB (UEM) 40245*8kB (UEM) 873*16kB (UEM) 1652*32kB (UEM) 412*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525196kB
[7664675.503227] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664675.512121] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664675.520729] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664675.529595] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664675.538208] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664675.547087] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664675.555706] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664675.564604] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664675.573221] 73613 total pagecache pages
[7664675.577244] 0 pages in swap cache
[7664675.580744] Swap cache stats: add 21120420, delete 21136392, find 4513375/7609800
[7664675.588422] Free swap  = 2967828kB
[7664675.592003] Total swap = 4194300kB
[7664675.595584] 66993253 pages RAM
[7664675.598815] 0 pages HighMem/MovableOnly
[7664675.602828] 1101945 pages reserved
[7664675.606407] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664675.614467] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664675.623429] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664675.632248] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664675.640850] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664675.649033] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664675.657295] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664675.665937] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664675.674208] [53099]     0 53099     6670      239      18      649             0 smartd
[7664675.682391] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664675.690478] [53104]     0 53104    74785      315      85      275             0 sssd
[7664675.698516] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664675.707047] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664675.716006] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664675.724357] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664675.732625] [53159]     0 53159   110203      310     153    22622             0 sssd_be
[7664675.740900] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664675.749255] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664675.757608] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664675.766483] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664675.774514] [53861]     0 53861   174315      320     170     4518             0 rsyslogd
[7664675.782870] [53863]     0 53863   176656      246      39     1246             0 collectd
[7664675.791228] [53969]     0 53969    31572      205      20      168             0 crond
[7664675.799320] [54035]     0 54035    27526      164      10       33             0 agetty
[7664675.807525] [54036]     0 54036    27526      158      11       33             0 agetty
[7664675.815709] [54186]     0 54186    22934      210      46      272             0 master
[7664675.823891] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664675.832013] [36317]     0 36317    28294      187      14       61             0 bash
[7664675.840044] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664675.848572] [36329]     0 36329    28177      160      14       55             0 grep
[7664675.856683] [76204]    89 76204    25501      252      46      282             0 pickup
[7664675.864873] [97173]     0 97173    48653      264      49      261             0 crond
[7664675.872994] [97192]     0 97192    34468      247      25     1344             0 python3
[7664675.881267] [97872]     0 97872    48653      263      49      263             0 crond
[7664675.889362] [97890]     0 97890    31176      215      18      701             0 python3
[7664675.897629] [98579]     0 98579    48653      266      49      235             0 crond
[7664675.905750] [98713]     0 98713    30977      230      16      529             0 python3
[7664675.914020] [99292]     0 99292    48653      257      49      261             0 crond
[7664675.922126] [99450]     0 99450    30913      226      18      446             0 python3
[7664675.930408] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664675.938675] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664675.947659] [100032]     0 100032    48653      266      49      240             0 crond
[7664675.955927] [100105]    89 100105    25553      264      47      274             0 smtp
[7664675.964109] [100203]     0 100203    30816      203      17      333             0 python3
[7664675.972552] Out of memory: Kill process 53159 (sssd_be) score 0 or sacrifice child
[7664675.980325] Killed process 53159 (sssd_be) total-vm:440812kB, anon-rss:0kB, file-rss:1240kB, shmem-rss:0kB
[7664676.026201] sssd_be: page allocation failure: order:0, mode:0x200da
[7664676.032671] CPU: 42 PID: 53159 Comm: sssd_be Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664676.045357] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664676.053188] Call Trace:
[7664676.055824]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664676.061142]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664676.067239]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664676.072814]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664676.078828]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664676.085363]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664676.091897]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664676.097741]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664676.104360]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664676.110627]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664676.116554]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664676.122570]  [<ffffffffa076a10e>] ? schedule_hrtimeout_range_clock+0xbe/0x150
[7664676.129885]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664676.135812]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664676.141730]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664676.147305]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664676.152626] Mem-Info:
[7664676.155106] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:32636 inactive_file:33850 isolated_file:6272
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824042 slab_unreclaimable:62296393
 mapped:1621 shmem:0 pagetables:2269 bounce:0
 free:590006 free_pcp:0 free_cma:0
[7664676.189374] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664676.231130] lowmem_reserve[]: 0 1418 63868 63868
[7664676.236061] Node 0 DMA32 free:261308kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:816kB inactive_file:3384kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:136kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686248kB kernel_stack:384kB pagetables:12kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:33811 all_unreclaimable? yes
[7664676.281024] lowmem_reserve[]: 0 0 62450 62450
[7664676.285693] Node 0 Normal free:507576kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:38544kB inactive_file:37920kB unevictable:168kB isolated(anon):0kB isolated(file):18048kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610948kB slab_unreclaimable:60242704kB kernel_stack:5952kB pagetables:2628kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:802422 all_unreclaimable? yes
[7664676.332640] lowmem_reserve[]: 0 0 0 0
[7664676.336612] Node 1 Normal free:525360kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15992kB inactive_file:16036kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411336kB kernel_stack:20816kB pagetables:2260kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:117404 all_unreclaimable? yes
[7664676.383472] lowmem_reserve[]: 0 0 0 0
[7664676.387442] Node 2 Normal free:524868kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31908kB inactive_file:38220kB unevictable:8680kB isolated(anon):0kB isolated(file):1152kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715204kB slab_unreclaimable:62476116kB kernel_stack:7936kB pagetables:1240kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:232011 all_unreclaimable? yes
[7664676.434565] lowmem_reserve[]: 0 0 0 0
[7664676.438541] Node 3 Normal free:525016kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:42420kB inactive_file:42860kB unevictable:840kB isolated(anon):0kB isolated(file):256kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854276kB slab_unreclaimable:62369168kB kernel_stack:4224kB pagetables:2936kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:410764 all_unreclaimable? yes
[7664676.485317] lowmem_reserve[]: 0 0 0 0
[7664676.489281] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664676.504119] Node 0 DMA32: 410*4kB (EM) 396*8kB (UEM) 1210*16kB (UEM) 3689*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261576kB
[7664676.520525] Node 0 Normal: 6249*4kB (UEM) 5693*8kB (UEM) 3923*16kB (UEM) 4484*32kB (UEM) 2046*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 507836kB
[7664676.537366] Node 1 Normal: 88175*4kB (UEM) 21634*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525788kB
[7664676.550895] Node 2 Normal: 27509*4kB (UEM) 40245*8kB (UEM) 873*16kB (UEM) 1652*32kB (UEM) 412*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525196kB
[7664676.566294] Node 3 Normal: 131356*4kB (UM) 2*8kB (M) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525440kB
[7664676.579020] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664676.587888] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664676.596502] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664676.605367] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664676.613973] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664676.622841] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664676.631445] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664676.640310] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664676.648918] 73563 total pagecache pages
[7664676.652930] 0 pages in swap cache
[7664676.656423] Swap cache stats: add 21120420, delete 21136392, find 4513375/7609800
[7664676.664075] Free swap  = 2967828kB
[7664676.667654] Total swap = 4194300kB
[7664676.671235] 66993253 pages RAM
[7664676.674466] 0 pages HighMem/MovableOnly
[7664676.678480] 1101945 pages reserved
[7664676.688951] ll_ost_io01_079 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664676.697394] ll_ost_io01_079 cpuset=/ mems_allowed=1
[7664676.702460] CPU: 25 PID: 90484 Comm: ll_ost_io01_079 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664676.715838] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664676.723672] Call Trace:
[7664676.726314]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664676.731637]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664676.737132]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664676.742973]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664676.748726]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664676.754739]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664676.761093]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664676.767185]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664676.772933]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664676.779467]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664676.785995]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664676.792181]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664676.798185]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664676.804293]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664676.811175]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664676.817336]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664676.824435]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664676.831003]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664676.837652]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664676.844998]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664676.851744]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664676.859090]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664676.866611]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664676.873523]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664676.880611]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664676.888363]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664676.895621]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664676.903488]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664676.910448]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664676.915889]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664676.922363]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664676.929932]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664676.934990]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664676.941259]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664676.947880]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664676.954144] Mem-Info:
[7664676.956604] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:33209 inactive_file:35834 isolated_file:3872
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824042 slab_unreclaimable:62296393
 mapped:1621 shmem:0 pagetables:2269 bounce:0
 free:590009 free_pcp:0 free_cma:0
[7664676.990878] Node 1 Normal free:525360kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15992kB inactive_file:16036kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711252kB slab_unreclaimable:63411336kB kernel_stack:20816kB pagetables:2260kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:117404 all_unreclaimable? yes
[7664677.037744] lowmem_reserve[]: 0 0 0 0
[7664677.041711] Node 1 Normal: 88175*4kB (UEM) 21634*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525788kB
[7664677.055242] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664677.064109] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664677.072713] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664677.081582] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664677.090186] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664677.099053] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664677.107659] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664677.116523] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664677.125131] 73563 total pagecache pages
[7664677.129144] 0 pages in swap cache
[7664677.132635] Swap cache stats: add 21120431, delete 21136403, find 4513376/7609803
[7664677.140290] Free swap  = 3058196kB
[7664677.143868] Total swap = 4194300kB
[7664677.147449] 66993253 pages RAM
[7664677.150678] 0 pages HighMem/MovableOnly
[7664677.154693] 1101945 pages reserved
[7664677.158272] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664677.166319] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664677.175278] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664677.184067] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664677.192649] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664677.200829] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664677.209098] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664677.217710] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664677.225970] [53099]     0 53099     6670      239      18      649             0 smartd
[7664677.234142] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664677.242228] [53104]     0 53104    74785      315      85      275             0 sssd
[7664677.250227] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664677.258749] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664677.267709] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664677.276055] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664677.284316] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664677.292670] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664677.301015] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664677.309885] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664677.317891] [53861]     0 53861   174315      320     170     4518             0 rsyslogd
[7664677.326245] [53863]     0 53863   176656      246      39     1246             0 collectd
[7664677.334591] [53969]     0 53969    31572      205      20      168             0 crond
[7664677.342681] [54035]     0 54035    27526      164      10       33             0 agetty
[7664677.350858] [54036]     0 54036    27526      158      11       33             0 agetty
[7664677.359031] [54186]     0 54186    22934      210      46      272             0 master
[7664677.367203] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664677.375293] [36317]     0 36317    28294      187      14       61             0 bash
[7664677.383300] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664677.391827] [36329]     0 36329    28177      160      14       55             0 grep
[7664677.399934] [76204]    89 76204    25501      252      46      282             0 pickup
[7664677.408118] [97173]     0 97173    48653      264      49      261             0 crond
[7664677.416208] [97192]     0 97192    34468      247      25     1344             0 python3
[7664677.424478] [97872]     0 97872    48653      263      49      263             0 crond
[7664677.432569] [97890]     0 97890    31176      215      18      701             0 python3
[7664677.440838] [98579]     0 98579    48653      266      49      235             0 crond
[7664677.448931] [98713]     0 98713    30977      230      16      529             0 python3
[7664677.457194] [99292]     0 99292    48653      257      49      261             0 crond
[7664677.465287] [99450]     0 99450    30913      226      18      446             0 python3
[7664677.473553] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664677.481816] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664677.490775] [100032]     0 100032    48653      266      49      240             0 crond
[7664677.499035] [100105]    89 100105    25553      264      47      274             0 smtp
[7664677.507215] [100203]     0 100203    30816      203      17      333             0 python3
[7664677.515649] Out of memory: Kill process 53861 (rsyslogd) score 0 or sacrifice child
[7664677.523473] Killed process 53861 (rsyslogd) total-vm:697260kB, anon-rss:0kB, file-rss:1280kB, shmem-rss:0kB
[7664682.048160] LNetError: 80392:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds
[7664682.058503] LNetError: 80392:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.219@o2ib7 (6): c: 5, oc: 0, rc: 8
[7664682.072542] LNetError: 80401:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.219@o2ib7 added to recovery queue. Health = 900
[7664682.085670] LNetError: 80401:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) Skipped 3 previous similar messages
[7664684.047403] LNetError: 80392:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds
[7664684.057747] LNetError: 80392:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.217@o2ib7 (6): c: 4, oc: 0, rc: 8
[7664684.128892] LNetError: 80399:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.217@o2ib7 added to recovery queue. Health = 900
[7664684.142706] LustreError: 80404:0:(events.c:305:request_in_callback()) event type 2, status -5, service ost_io
[7664684.153028] LustreError: 123039:0:(pack_generic.c:605:__lustre_unpack_msg()) message length 0 too small for magic/version check
[7664684.164683] LustreError: 123039:0:(sec.c:2191:sptlrpc_svc_unwrap_request()) error unpacking request from 12345-10.50.8.41@o2ib2 x1659083427320768
[7664685.994467] LNetError: 80409:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) lpni 10.0.10.226@o2ib7 added to recovery queue. Health = 900
[7664686.007593] LNetError: 80409:0:(peer.c:3451:lnet_peer_ni_add_to_recoveryq_locked()) Skipped 2 previous similar messages
[7664686.018600] LustreError: 3140:0:(ldlm_lib.c:3271:target_bulk_io()) @@@ truncated bulk READ 0(183274)  req@ffff9c114c918050 x1659185977603776/t0(0) o3->f56fe6b7-932f-4@10.50.16.1@o2ib2:473/0 lens 488/440 e 0 to 0 dl 1583650723 ref 1 fl Interpret:/0/0 rc 0/0
[7664686.018662] Lustre: fir-OST0021: Bulk IO read error with 430e4894-d38d-4 (at 10.50.14.11@o2ib2), client will retry: rc -110
[7664686.027329] LustreError: 123084:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(74605)  req@ffff9c2d8c244050 x1659209236853568/t0(0) o4->541f81d4-bd4f-4@10.50.7.3@o2ib2:488/0 lens 488/448 e 0 to 0 dl 1583650738 ref 1 fl Interpret:/0/0 rc 0/0
[7664686.027361] Lustre: fir-OST001b: Bulk IO write error with 541f81d4-bd4f-4 (at 10.50.7.3@o2ib2), client will retry: rc = -110
[7664686.087185] LustreError: 3140:0:(ldlm_lib.c:3271:target_bulk_io()) Skipped 4 previous similar messages
[7664687.047354] LNetError: 80392:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds
[7664687.057696] LNetError: 80392:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Skipped 2 previous similar messages
[7664687.068037] LNetError: 80392:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.216@o2ib7 (8): c: 4, oc: 0, rc: 8
[7664687.080196] LNetError: 80392:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Skipped 2 previous similar messages
[7664689.340976] ll_ost_io02_010 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664689.349423] ll_ost_io02_010 cpuset=/ mems_allowed=2
[7664689.354484] CPU: 6 PID: 119554 Comm: ll_ost_io02_010 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664689.367861] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664689.375689] Call Trace:
[7664689.378333]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664689.383654]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664689.389142]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664689.394988]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664689.400740]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664689.406750]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664689.413101]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664689.419195]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664689.424942]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664689.431470]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664689.438006]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664689.444191]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664689.450198]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664689.456304]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664689.463191]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664689.470422]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664689.476588]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664689.483247]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664689.490301]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664689.496054]  [<ffffffffa006213e>] ? physflat_send_IPI_mask+0xe/0x10
[7664689.502502]  [<ffffffffa0056f42>] ? native_smp_send_reschedule+0x52/0x70
[7664689.509383]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664689.514911]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664689.521998]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664689.529749]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664689.537006]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664689.544876]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664689.551877]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664689.559831]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664689.567089]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664689.573568]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664689.581139]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664689.586196]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664689.592463]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664689.599084]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664689.605349] Mem-Info:
[7664689.607809] active_anon:0 inactive_anon:2 isolated_anon:0
 active_file:32457 inactive_file:33840 isolated_file:5280
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824034 slab_unreclaimable:62296356
 mapped:1608 shmem:0 pagetables:1813 bounce:0
 free:590325 free_pcp:0 free_cma:0
[7664689.642084] Node 2 Normal free:525224kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31568kB inactive_file:37476kB unevictable:8680kB isolated(anon):0kB isolated(file):2944kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715188kB slab_unreclaimable:62476080kB kernel_stack:7920kB pagetables:620kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:299467 all_unreclaimable? yes
[7664689.689128] lowmem_reserve[]: 0 0 0 0
[7664689.693098] Node 2 Normal: 27486*4kB (UEM) 40274*8kB (UEM) 880*16kB (UEM) 1653*32kB (UEM) 412*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525480kB
[7664689.708501] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664689.717367] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664689.725974] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664689.734839] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664689.743443] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664689.752312] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664689.760928] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664689.769801] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664689.778407] 73887 total pagecache pages
[7664689.782422] 0 pages in swap cache
[7664689.785921] Swap cache stats: add 21120458, delete 21136430, find 4513385/7609819
[7664689.793573] Free swap  = 3075808kB
[7664689.797150] Total swap = 4194300kB
[7664689.800734] 66993253 pages RAM
[7664689.803965] 0 pages HighMem/MovableOnly
[7664689.807976] 1101945 pages reserved
[7664689.811556] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664689.819605] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664689.828563] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664689.837349] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664689.845940] [53050]     0 53050    13880      123      28      146         -1000 auditd
[7664689.854122] [53078]   999 53078   156119      278      64     2197             0 polkitd
[7664689.862391] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664689.871005] [53084]    32 53084    17316      115      37      138             0 rpcbind
[7664689.879273] [53099]     0 53099     6670      239      18      649             0 smartd
[7664689.887453] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664689.895543] [53104]     0 53104    74785      324      85      252             0 sssd
[7664689.903548] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664689.912078] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664689.921037] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664689.929392] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664689.937652] [53178]     0 53178    76774      292      95      239             0 sssd_nss
[7664689.946006] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664689.954351] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664689.963221] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664689.971228] [53863]     0 53863   176656      246      39     1246             0 collectd
[7664689.979580] [53969]     0 53969    31572      205      20      168             0 crond
[7664689.987666] [54035]     0 54035    27526      164      10       33             0 agetty
[7664689.995840] [54036]     0 54036    27526      158      11       33             0 agetty
[7664690.004013] [54186]     0 54186    22934      210      46      272             0 master
[7664690.012185] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664690.020307] [36317]     0 36317    28294      187      14       61             0 bash
[7664690.028315] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664690.036841] [36329]     0 36329    28177      160      14       55             0 grep
[7664690.044943] [76204]    89 76204    25501      252      46      282             0 pickup
[7664690.053125] [97173]     0 97173    48653      264      49      261             0 crond
[7664690.061213] [97192]     0 97192    34468      245      25     1344             0 python3
[7664690.069474] [97872]     0 97872    48653      263      49      263             0 crond
[7664690.077567] [97890]     0 97890    31176      215      18      701             0 python3
[7664690.085831] [98579]     0 98579    48653      266      49      235             0 crond
[7664690.093922] [98713]     0 98713    30977      227      16      529             0 python3
[7664690.102192] [99292]     0 99292    48653      257      49      261             0 crond
[7664690.110285] [99450]     0 99450    30913      224      18      446             0 python3
[7664690.118541] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664690.126807] [99739]    89 99739    25502      246      47      260             0 trivial-rewrite
[7664690.135765] [100032]     0 100032    48653      266      49      240             0 crond
[7664690.144025] [100105]    89 100105    25553      264      47      274             0 smtp
[7664690.152207] [100203]     0 100203    30816      202      17      333             0 python3
[7664690.160649] Out of memory: Kill process 53078 (polkitd) score 0 or sacrifice child
[7664690.168394] Killed process 53078 (polkitd) total-vm:624476kB, anon-rss:0kB, file-rss:1112kB, shmem-rss:0kB
[7664691.020435] LustreError: 90710:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(4194304)  req@ffff9c142c7ff050 x1659549650568960/t0(0) o4->b62a8bca-4275-4@10.50.1.4@o2ib2:467/0 lens 488/448 e 0 to 0 dl 1583650717 ref 1 fl Interpret:/0/0 rc 0/0
[7664691.020521] Lustre: fir-OST0023: Bulk IO write error with 12bd00c4-481c-4 (at 10.50.17.8@o2ib2), client will retry: rc = -110
[7664691.020535] LustreError: 123080:0:(ldlm_lib.c:3271:target_bulk_io()) @@@ truncated bulk READ 0(212363)  req@ffff9c3ef347e050 x1659397014590848/t0(0) o3->7c9c28a0-1550-4@10.50.15.11@o2ib2:473/0 lens 488/440 e 0 to 0 dl 1583650723 ref 1 fl Interpret:/0/0 rc 0/0
[7664691.020580] Lustre: fir-OST001f: Bulk IO read error with 7c9c28a0-1550-4 (at 10.50.15.11@o2ib2), client will retry: rc -110
[7664691.020582] Lustre: Skipped 4 previous similar messages
[7664691.094947] LustreError: 90710:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 1 previous similar message
[7664695.735889] trivial-rewrite invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[7664695.744341] trivial-rewrite cpuset=/ mems_allowed=0-3
[7664695.749581] CPU: 11 PID: 99739 Comm: trivial-rewrite Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664695.762953] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664695.770780] Call Trace:
[7664695.773410]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664695.778730]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664695.784226]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664695.790071]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664695.796079]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664695.802431]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664695.808526]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664695.814271]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664695.820796]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664695.827322]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664695.833502]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664695.839508]  [<ffffffffa01ba3c8>] filemap_fault+0x298/0x490
[7664695.845283]  [<ffffffffc05871c6>] ext4_filemap_fault+0x36/0x50 [ext4]
[7664695.851905]  [<ffffffffa01e593a>] __do_fault.isra.59+0x8a/0x100
[7664695.858004]  [<ffffffffa01e5eec>] do_read_fault.isra.61+0x4c/0x1b0
[7664695.864369]  [<ffffffffa01ea874>] handle_pte_fault+0x2f4/0xd10
[7664695.870379]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664695.876298]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664695.882219]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664695.887790]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664695.893102] Mem-Info:
[7664695.895580] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34444 inactive_file:34623 isolated_file:4384
 unevictable:9044 dirty:90 writeback:0 unstable:0
 slab_reclaimable:824034 slab_unreclaimable:62296394
 mapped:1608 shmem:0 pagetables:1813 bounce:0
 free:590109 free_pcp:0 free_cma:0
[7664695.929934] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664695.971685] lowmem_reserve[]: 0 1418 63868 63868
[7664695.976606] Node 0 DMA32 free:261324kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:864kB inactive_file:2816kB unevictable:0kB isolated(anon):0kB isolated(file):512kB present:1633052kB managed:1452284kB mlocked:0kB dirty:20kB writeback:0kB mapped:84kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686224kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:125842 all_unreclaimable? no
[7664696.024342] LustreError: 3107:0:(ldlm_lib.c:3271:target_bulk_io()) @@@ truncated bulk READ 0(155474)  req@ffff9c1fb0425050 x1659467991415552/t0(0) o3->fb2c1382-8f5a-4@10.50.15.10@o2ib2:473/0 lens 488/440 e 0 to 0 dl 1583650723 ref 1 fl Interpret:/0/0 rc 0/0
[7664696.024378] Lustre: fir-OST0023: Bulk IO read error with fb2c1382-8f5a-4 (at 10.50.15.10@o2ib2), client will retry: rc -110
[7664696.021652] lowmem_reserve[]: 0 0 62450 62450
[7664696.060537] Node 0 Normal free:508608kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:4kB active_file:44756kB inactive_file:42316kB unevictable:168kB isolated(anon):0kB isolated(file):7296kB present:64998912kB managed:63949072kB mlocked:168kB dirty:116kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610940kB slab_unreclaimable:60242672kB kernel_stack:6192kB pagetables:2548kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:39925 all_unreclaimable? no
[7664696.107398] lowmem_reserve[]: 0 0 0 0
[7664696.111369] Node 1 Normal free:525508kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16500kB inactive_file:16660kB unevictable:26488kB isolated(anon):0kB isolated(file):896kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711248kB slab_unreclaimable:63411336kB kernel_stack:20816kB pagetables:2016kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:831626 all_unreclaimable? yes
[7664696.158404] lowmem_reserve[]: 0 0 0 0
[7664696.162381] Node 2 Normal free:524980kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32220kB inactive_file:35688kB unevictable:8680kB isolated(anon):0kB isolated(file):2432kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:160kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715188kB slab_unreclaimable:62476080kB kernel_stack:7920kB pagetables:620kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:470065 all_unreclaimable? no
[7664696.209493] lowmem_reserve[]: 0 0 0 0
[7664696.213464] Node 3 Normal free:523724kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:40kB active_file:43652kB inactive_file:43632kB unevictable:840kB isolated(anon):0kB isolated(file):1920kB present:67108352kB managed:66038732kB mlocked:840kB dirty:64kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854272kB slab_unreclaimable:62369264kB kernel_stack:4224kB pagetables:1860kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1304808 all_unreclaimable? yes
[7664696.260595] lowmem_reserve[]: 0 0 0 0
[7664696.264566] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664696.279407] Node 0 DMA32: 389*4kB (EM) 400*8kB (UEM) 1213*16kB (UEM) 3689*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261572kB
[7664696.295811] Node 0 Normal: 6502*4kB (UEM) 5720*8kB (UEM) 3924*16kB (UEM) 4484*32kB (UEM) 2046*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 509080kB
[7664696.312652] Node 1 Normal: 88093*4kB (UEM) 21640*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525508kB
[7664696.326182] Node 2 Normal: 27392*4kB (UEM) 40218*8kB (UEM) 896*16kB (UEM) 1675*32kB (UEM) 413*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525680kB
[7664696.341582] Node 3 Normal: 130945*4kB (UM) 7*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 523836kB
[7664696.354313] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664696.363179] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664696.371787] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664696.380653] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664696.389259] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664696.398127] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664696.406740] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664696.415605] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664696.424209] 74066 total pagecache pages
[7664696.428226] 0 pages in swap cache
[7664696.431716] Swap cache stats: add 21120629, delete 21136601, find 4513416/7609890
[7664696.439367] Free swap  = 3084652kB
[7664696.442947] Total swap = 4194300kB
[7664696.446529] 66993253 pages RAM
[7664696.449759] 0 pages HighMem/MovableOnly
[7664696.453773] 1101945 pages reserved
[7664696.457353] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664696.465401] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664696.474365] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664696.483152] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664696.491742] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664696.499920] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664696.508533] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664696.516799] [53099]     0 53099     6670      239      18      649             0 smartd
[7664696.524972] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664696.533059] [53104]     0 53104    74785      324      85      252             0 sssd
[7664696.541068] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664696.549594] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664696.558548] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664696.566894] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664696.575156] [53178]     0 53178    76774      291      95      241             0 sssd_nss
[7664696.583508] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664696.591853] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664696.600721] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664696.608730] [53863]     0 53863   176656      246      39     1247             0 collectd
[7664696.617085] [53969]     0 53969    31572      205      20      168             0 crond
[7664696.625179] [54035]     0 54035    27526      164      10       33             0 agetty
[7664696.633359] [54036]     0 54036    27526      158      11       33             0 agetty
[7664696.641532] [54186]     0 54186    22934      210      46      272             0 master
[7664696.649706] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664696.657830] [36317]     0 36317    28294      187      14       61             0 bash
[7664696.665834] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664696.674354] [36329]     0 36329    28177      160      14       55             0 grep
[7664696.682461] [76204]    89 76204    25501      252      46      282             0 pickup
[7664696.690642] [97173]     0 97173    48653      264      49      261             0 crond
[7664696.698732] [97192]     0 97192    34468      245      25     1344             0 python3
[7664696.706995] [97872]     0 97872    48653      263      49      263             0 crond
[7664696.715087] [97890]     0 97890    31176      215      18      701             0 python3
[7664696.723349] [98579]     0 98579    48653      266      49      235             0 crond
[7664696.731441] [98713]     0 98713    30977      227      16      529             0 python3
[7664696.739701] [99292]     0 99292    48653      257      49      261             0 crond
[7664696.747787] [99450]     0 99450    30913      224      18      446             0 python3
[7664696.756046] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664696.764306] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664696.773259] [100032]     0 100032    48653      266      49      240             0 crond
[7664696.781527] [100105]    89 100105    25553      264      47      274             0 smtp
[7664696.789699] [100203]     0 100203    30816      202      17      333             0 python3
[7664696.798133] Out of memory: Kill process 97192 (python3) score 0 or sacrifice child
[7664696.805879] Killed process 97192 (python3) total-vm:137872kB, anon-rss:0kB, file-rss:980kB, shmem-rss:0kB
[7664696.845455] python3: page allocation failure: order:0, mode:0x200da
[7664696.851913] CPU: 16 PID: 97192 Comm: python3 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664696.864601] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664696.872433] Call Trace:
[7664696.875078]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664696.880401]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664696.886499]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664696.892072]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664696.898086]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664696.904614]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664696.911148]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664696.916989]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664696.923602]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664696.929878]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664696.935809]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664696.941822]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664696.947750]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664696.953674]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664696.959250]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664696.964568] Mem-Info:
[7664696.967045] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:33423 inactive_file:35173 isolated_file:3072
 unevictable:9044 dirty:90 writeback:0 unstable:0
 slab_reclaimable:824034 slab_unreclaimable:62296402
 mapped:1608 shmem:0 pagetables:1749 bounce:0
 free:590025 free_pcp:0 free_cma:0
[7664697.001411] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664697.043160] lowmem_reserve[]: 0 1418 63868 63868
[7664697.048090] Node 0 DMA32 free:261348kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:888kB inactive_file:3012kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:20kB writeback:0kB mapped:84kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686224kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:170188 all_unreclaimable? yes
[7664697.093048] lowmem_reserve[]: 0 0 62450 62450
[7664697.097716] Node 0 Normal free:508560kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:4kB active_file:42148kB inactive_file:41352kB unevictable:168kB isolated(anon):0kB isolated(file):11136kB present:64998912kB managed:63949072kB mlocked:168kB dirty:116kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610940kB slab_unreclaimable:60242672kB kernel_stack:6352kB pagetables:2516kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:711185 all_unreclaimable? yes
[7664697.144827] lowmem_reserve[]: 0 0 0 0
[7664697.148798] Node 1 Normal free:525604kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16760kB inactive_file:14932kB unevictable:26488kB isolated(anon):0kB isolated(file):1024kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711248kB slab_unreclaimable:63411336kB kernel_stack:20816kB pagetables:2000kB unstable:0kB bounce:0kB free_pcp:116kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:3488 all_unreclaimable? no
[7664697.195837] lowmem_reserve[]: 0 0 0 0
[7664697.199812] Node 2 Normal free:525048kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32344kB inactive_file:37564kB unevictable:8680kB isolated(anon):0kB isolated(file):512kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:160kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715188kB slab_unreclaimable:62476112kB kernel_stack:7920kB pagetables:612kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:832 all_unreclaimable? no
[7664697.246589] lowmem_reserve[]: 0 0 0 0
[7664697.250563] Node 3 Normal free:523976kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:40kB active_file:44388kB inactive_file:43124kB unevictable:840kB isolated(anon):0kB isolated(file):512kB present:67108352kB managed:66038732kB mlocked:840kB dirty:64kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854272kB slab_unreclaimable:62369264kB kernel_stack:4224kB pagetables:1860kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:406370 all_unreclaimable? yes
[7664697.297513] lowmem_reserve[]: 0 0 0 0
[7664697.301487] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664697.316324] Node 0 DMA32: 388*4kB (EM) 400*8kB (UEM) 1213*16kB (UEM) 3689*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261568kB
[7664697.332730] Node 0 Normal: 6522*4kB (UEM) 5721*8kB (UEM) 3911*16kB (UEM) 4485*32kB (UEM) 2046*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508992kB
[7664697.349570] Node 1 Normal: 87977*4kB (UEM) 21640*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525044kB
[7664697.363099] Node 2 Normal: 27395*4kB (UEM) 40218*8kB (UEM) 895*16kB (UEM) 1675*32kB (UEM) 413*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525676kB
[7664697.378499] Node 3 Normal: 131020*4kB (UM) 7*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524136kB
[7664697.391222] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664697.400090] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664697.408696] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664697.417564] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664697.426167] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664697.435035] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664697.443641] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664697.452508] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664697.461112] 74095 total pagecache pages
[7664697.465126] 0 pages in swap cache
[7664697.468617] Swap cache stats: add 21120629, delete 21136601, find 4513416/7609890
[7664697.476270] Free swap  = 3084652kB
[7664697.479848] Total swap = 4194300kB
[7664697.483429] 66993253 pages RAM
[7664697.486660] 0 pages HighMem/MovableOnly
[7664697.490677] 1101945 pages reserved
[7664698.679210] ll_ost_io02_052 invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
[7664698.687398] ll_ost_io02_052 cpuset=/ mems_allowed=2
[7664698.692462] CPU: 18 PID: 6885 Comm: ll_ost_io02_052 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664698.705751] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664698.713585] Call Trace:
[7664698.716218]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664698.721533]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664698.727030]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664698.732869]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664698.738626]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664698.744637]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664698.750991]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664698.757084]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664698.762839]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664698.769376]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664698.775911]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664698.782165]  [<ffffffffc124293f>] tgt_checksum_niobuf_rw+0xbf/0xe00 [ptlrpc]
[7664698.789428]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664698.796767]  [<ffffffffc0cb71e0>] ? obd_dif_crc_fn+0x20/0x20 [obdclass]
[7664698.803601]  [<ffffffffc1247325>] tgt_brw_read+0xc35/0x1e50 [ptlrpc]
[7664698.810160]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664698.817515]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664698.824865]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664698.832380]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664698.839299]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664698.846391]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664698.854141]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664698.861399]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664698.869268]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664698.876237]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664698.881679]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664698.888163]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664698.895738]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664698.900796]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664698.907063]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664698.913683]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664698.919947] Mem-Info:
[7664698.922409] active_anon:0 inactive_anon:1 isolated_anon:0
 active_file:33575 inactive_file:36254 isolated_file:2688
 unevictable:9044 dirty:90 writeback:0 unstable:0
 slab_reclaimable:824027 slab_unreclaimable:62296403
 mapped:1608 shmem:0 pagetables:1724 bounce:0
 free:590090 free_pcp:0 free_cma:0
[7664698.956768] Node 2 Normal free:525360kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31736kB inactive_file:38236kB unevictable:8680kB isolated(anon):0kB isolated(file):896kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:160kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715184kB slab_unreclaimable:62476112kB kernel_stack:7920kB pagetables:612kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:220965 all_unreclaimable? yes
[7664699.003902] lowmem_reserve[]: 0 0 0 0
[7664699.007879] Node 2 Normal: 27395*4kB (UEM) 40219*8kB (UEM) 895*16kB (EM) 1675*32kB (UEM) 413*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525684kB
[7664699.023239] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664699.032113] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664699.040725] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664699.049601] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664699.058208] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664699.067076] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664699.075682] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664699.084548] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664699.093154] 74044 total pagecache pages
[7664699.097168] 0 pages in swap cache
[7664699.100668] Swap cache stats: add 21120632, delete 21136604, find 4513416/7609892
[7664699.108321] Free swap  = 3090028kB
[7664699.111899] Total swap = 4194300kB
[7664699.115479] 66993253 pages RAM
[7664699.118712] 0 pages HighMem/MovableOnly
[7664699.122723] 1101945 pages reserved
[7664699.126303] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664699.134353] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664699.143310] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664699.152107] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664699.160691] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664699.168870] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664699.177482] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664699.185744] [53099]     0 53099     6670      239      18      649             0 smartd
[7664699.193925] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664699.202018] [53104]     0 53104    74785      324      85      252             0 sssd
[7664699.210017] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664699.218537] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664699.227492] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664699.235846] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664699.244113] [53178]     0 53178    76774      291      95      241             0 sssd_nss
[7664699.252467] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664699.260812] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664699.269685] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664699.277687] [53863]     0 53863   176656      246      39     1247             0 collectd
[7664699.286035] [53969]     0 53969    31572      205      20      168             0 crond
[7664699.294129] [54035]     0 54035    27526      164      10       33             0 agetty
[7664699.302302] [54036]     0 54036    27526      158      11       33             0 agetty
[7664699.310474] [54186]     0 54186    22934      210      46      272             0 master
[7664699.318647] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664699.326764] [36317]     0 36317    28294      187      14       61             0 bash
[7664699.334767] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664699.343287] [36329]     0 36329    28177      160      14       55             0 grep
[7664699.351387] [76204]    89 76204    25501      252      46      282             0 pickup
[7664699.359571] [97173]     0 97173    48653      264      49      261             0 crond
[7664699.367660] [97872]     0 97872    48653      263      49      263             0 crond
[7664699.375753] [97890]     0 97890    31176      215      18      701             0 python3
[7664699.384016] [98579]     0 98579    48653      266      49      235             0 crond
[7664699.392108] [98713]     0 98713    30977      227      16      529             0 python3
[7664699.400377] [99292]     0 99292    48653      257      49      261             0 crond
[7664699.408469] [99450]     0 99450    30913      224      18      446             0 python3
[7664699.416729] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664699.424990] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664699.433950] [100032]     0 100032    48653      266      49      240             0 crond
[7664699.442209] [100105]    89 100105    25553      264      47      274             0 smtp
[7664699.450381] [100203]     0 100203    30816      202      17      333             0 python3
[7664699.458814] Out of memory: Kill process 53863 (collectd) score 0 or sacrifice child
[7664699.466641] Killed process 53863 (collectd) total-vm:706624kB, anon-rss:0kB, file-rss:984kB, shmem-rss:0kB
[7664699.576429] collectd: page allocation failure: order:0, mode:0x200da
[7664699.582970] CPU: 0 PID: 53863 Comm: collectd Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664699.595666] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664699.603500] Call Trace:
[7664699.606139]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664699.611455]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664699.617556]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664699.623128]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664699.629137]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664699.635669]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664699.642196]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664699.648036]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664699.654648]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664699.660914]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664699.666836]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664699.672849]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664699.678768]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664699.684686]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664699.690262]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664699.695572] Mem-Info:
[7664699.698048] active_anon:0 inactive_anon:1 isolated_anon:0
 active_file:33398 inactive_file:34474 isolated_file:4096
 unevictable:9044 dirty:90 writeback:0 unstable:0
 slab_reclaimable:824027 slab_unreclaimable:62296403
 mapped:1608 shmem:0 pagetables:1724 bounce:0
 free:590256 free_pcp:0 free_cma:0
[7664699.732404] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664699.774157] lowmem_reserve[]: 0 1418 63868 63868
[7664699.779086] Node 0 DMA32 free:261260kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:948kB inactive_file:3600kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:20kB writeback:0kB mapped:84kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686224kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:220193 all_unreclaimable? yes
[7664699.824039] lowmem_reserve[]: 0 0 62450 62450
[7664699.828720] Node 0 Normal free:508524kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:44404kB inactive_file:42716kB unevictable:168kB isolated(anon):0kB isolated(file):12416kB present:64998912kB managed:63949072kB mlocked:168kB dirty:116kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610920kB slab_unreclaimable:60242668kB kernel_stack:6512kB pagetables:2420kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:555294 all_unreclaimable? yes
[7664699.875844] lowmem_reserve[]: 0 0 0 0
[7664699.879822] Node 1 Normal free:525380kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16236kB inactive_file:16784kB unevictable:26488kB isolated(anon):0kB isolated(file):896kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711248kB slab_unreclaimable:63411344kB kernel_stack:20816kB pagetables:2000kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:167047 all_unreclaimable? yes
[7664699.926855] lowmem_reserve[]: 0 0 0 0
[7664699.930826] Node 2 Normal free:525364kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31736kB inactive_file:36824kB unevictable:8680kB isolated(anon):0kB isolated(file):2816kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:160kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715184kB slab_unreclaimable:62476112kB kernel_stack:7920kB pagetables:612kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:154883 all_unreclaimable? yes
[7664699.978024] lowmem_reserve[]: 0 0 0 0
[7664699.981992] Node 3 Normal free:524596kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:40kB active_file:41984kB inactive_file:43068kB unevictable:840kB isolated(anon):0kB isolated(file):256kB present:67108352kB managed:66038732kB mlocked:840kB dirty:64kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369264kB kernel_stack:4208kB pagetables:1856kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:659623 all_unreclaimable? yes
[7664700.028945] lowmem_reserve[]: 0 0 0 0
[7664700.032916] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664700.047755] Node 0 DMA32: 365*4kB (EM) 400*8kB (UEM) 1213*16kB (UEM) 3689*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261476kB
[7664700.064158] Node 0 Normal: 6369*4kB (UEM) 5722*8kB (UEM) 3935*16kB (UEM) 4485*32kB (UEM) 2046*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508772kB
[7664700.080999] Node 1 Normal: 88022*4kB (UEM) 21640*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525224kB
[7664700.094529] Node 2 Normal: 27394*4kB (UEM) 40219*8kB (UEM) 896*16kB (UEM) 1675*32kB (UEM) 413*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525696kB
[7664700.109928] Node 3 Normal: 131195*4kB (UEM) 6*8kB (UE) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524828kB
[7664700.122827] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664700.131693] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664700.140299] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664700.149163] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664700.157776] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664700.166645] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664700.175252] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664700.184122] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664700.192736] 74042 total pagecache pages
[7664700.196759] 0 pages in swap cache
[7664700.200257] Swap cache stats: add 21120650, delete 21136622, find 4513420/7609903
[7664700.207911] Free swap  = 3090028kB
[7664700.211499] Total swap = 4194300kB
[7664700.215087] 66993253 pages RAM
[7664700.218327] 0 pages HighMem/MovableOnly
[7664700.222338] 1101945 pages reserved
[7664701.049747] LustreError: 89774:0:(ldlm_lib.c:3271:target_bulk_io()) @@@ truncated bulk READ 0(95248)  req@ffff9c2f94fb9050 x1659475663823296/t0(0) o3->b4f8cb5a-edfb-4@10.50.13.3@o2ib2:493/0 lens 488/440 e 1 to 0 dl 1583650743 ref 1 fl Interpret:/0/0 rc 0/0
[7664701.072611] Lustre: fir-OST001f: Bulk IO read error with b4f8cb5a-edfb-4 (at 10.50.13.3@o2ib2), client will retry: rc -110
[7664706.048771] LNetError: 80392:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Timed out tx: active_txs, 0 seconds
[7664706.059117] LNetError: 80392:0:(o2iblnd_cb.c:3350:kiblnd_check_txs_locked()) Skipped 1 previous similar message
[7664706.069373] LNetError: 80392:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Timed out RDMA with 10.0.10.227@o2ib7 (6): c: 0, oc: 0, rc: 8
[7664706.081538] LNetError: 80392:0:(o2iblnd_cb.c:3425:kiblnd_check_conns()) Skipped 1 previous similar message
[7664706.091934] LustreError: 90708:0:(ldlm_lib.c:3271:target_bulk_io()) @@@ truncated bulk READ 0(74651)  req@ffff9c1f527ac850 x1659179002321472/t0(0) o3->ccea6ca8-94f6-4@10.50.15.3@o2ib2:496/0 lens 488/440 e 1 to 0 dl 1583650746 ref 1 fl Interpret:/0/0 rc 0/0
[7664706.092090] Lustre: fir-OST001d: Bulk IO read error with 430e4894-d38d-4 (at 10.50.14.11@o2ib2), client will retry: rc -110
[7664706.126079] LustreError: 90708:0:(ldlm_lib.c:3271:target_bulk_io()) Skipped 4 previous similar messages
[7664706.665150] ll_ost_io00_088 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664706.673594] ll_ost_io00_088 cpuset=/ mems_allowed=0
[7664706.678657] CPU: 20 PID: 90706 Comm: ll_ost_io00_088 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664706.692034] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664706.699864] Call Trace:
[7664706.702506]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664706.707820]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664706.713315]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664706.719156]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664706.724910]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664706.730924]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664706.737278]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664706.743382]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664706.749134]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664706.755671]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664706.762204]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664706.768389]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664706.774396]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664706.780513]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664706.787396]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664706.794620]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664706.800777]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664706.807391]  [<ffffffffa021bd89>] ? ___slab_alloc+0x209/0x4f0
[7664706.813313]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664706.819067]  [<ffffffffa006213e>] ? physflat_send_IPI_mask+0xe/0x10
[7664706.825513]  [<ffffffffa0056f42>] ? native_smp_send_reschedule+0x52/0x70
[7664706.832388]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664706.837938]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664706.845026]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664706.852778]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664706.860039]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664706.867904]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664706.874907]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664706.882869]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664706.890138]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664706.896642]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664706.904232]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664706.909288]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664706.915562]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664706.922181]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664706.928447] Mem-Info:
[7664706.930913] active_anon:0 inactive_anon:1 isolated_anon:0
 active_file:34050 inactive_file:34936 isolated_file:3744
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824025 slab_unreclaimable:62296545
 mapped:1607 shmem:0 pagetables:1685 bounce:0
 free:590531 free_pcp:2 free_cma:0
[7664706.965183] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664707.006933] lowmem_reserve[]: 0 1418 63868 63868
[7664707.011857] Node 0 DMA32 free:261316kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1052kB inactive_file:3364kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:80kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686220kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:20776 all_unreclaimable? yes
[7664707.056724] lowmem_reserve[]: 0 0 62450 62450
[7664707.061387] Node 0 Normal free:508484kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:44788kB inactive_file:44696kB unevictable:168kB isolated(anon):0kB isolated(file):4352kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243256kB kernel_stack:6080kB pagetables:2324kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:380279 all_unreclaimable? yes
[7664707.108245] lowmem_reserve[]: 0 0 0 0
[7664707.112215] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664707.127055] Node 0 DMA32: 351*4kB (UEM) 399*8kB (UEM) 1214*16kB (UEM) 3689*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261428kB
[7664707.143546] Node 0 Normal: 6455*4kB (UEM) 5705*8kB (UEM) 3934*16kB (UEM) 4479*32kB (EM) 2046*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508772kB
[7664707.160301] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664707.169167] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664707.177774] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664707.186640] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664707.195245] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664707.204110] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664707.212717] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664707.221588] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664707.230197] 73699 total pagecache pages
[7664707.234212] 0 pages in swap cache
[7664707.237703] Swap cache stats: add 21120677, delete 21136649, find 4513424/7609910
[7664707.245355] Free swap  = 3094380kB
[7664707.248935] Total swap = 4194300kB
[7664707.252514] 66993253 pages RAM
[7664707.255745] 0 pages HighMem/MovableOnly
[7664707.259760] 1101945 pages reserved
[7664707.263339] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664707.271388] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664707.280346] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664707.289136] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664707.297736] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664707.305915] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664707.314528] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664707.322795] [53099]     0 53099     6670      239      18      649             0 smartd
[7664707.330969] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664707.339064] [53104]     0 53104    74785      324      85      252             0 sssd
[7664707.347072] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664707.355591] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664707.364542] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664707.372891] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664707.381157] [53178]     0 53178    76774      291      95      241             0 sssd_nss
[7664707.389505] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664707.397860] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664707.406737] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664707.414743] [53969]     0 53969    31572      205      20      168             0 crond
[7664707.422836] [54035]     0 54035    27526      164      10       33             0 agetty
[7664707.431008] [54036]     0 54036    27526      158      11       33             0 agetty
[7664707.439182] [54186]     0 54186    22934      210      46      272             0 master
[7664707.447356] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664707.455478] [36317]     0 36317    28294      187      14       61             0 bash
[7664707.463483] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664707.472002] [36329]     0 36329    28177      160      14       55             0 grep
[7664707.480108] [76204]    89 76204    25501      252      46      282             0 pickup
[7664707.488295] [97173]     0 97173    48653      264      49      261             0 crond
[7664707.496385] [97872]     0 97872    48653      263      49      263             0 crond
[7664707.504478] [97890]     0 97890    31176      215      18      701             0 python3
[7664707.512740] [98579]     0 98579    48653      266      49      235             0 crond
[7664707.520830] [98713]     0 98713    30977      227      16      529             0 python3
[7664707.529093] [99292]     0 99292    48653      257      49      261             0 crond
[7664707.537188] [99450]     0 99450    30913      224      18      446             0 python3
[7664707.545455] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664707.553721] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664707.562676] [100032]     0 100032    48653      266      49      240             0 crond
[7664707.570947] [100105]    89 100105    25553      264      47      274             0 smtp
[7664707.579125] [100203]     0 100203    30816      202      17      333             0 python3
[7664707.587566] Out of memory: Kill process 97890 (python3) score 0 or sacrifice child
[7664707.595313] Killed process 97890 (python3) total-vm:124704kB, anon-rss:0kB, file-rss:860kB, shmem-rss:0kB
[7664707.689556] python3: page allocation failure: order:0, mode:0x201da
[7664707.696007] CPU: 26 PID: 97890 Comm: python3 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664707.708696] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664707.716526] Call Trace:
[7664707.719162]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664707.724491]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664707.730596]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664707.736186]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664707.742200]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664707.748732]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664707.755260]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664707.761448]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664707.767463]  [<ffffffffa01ba3c8>] filemap_fault+0x298/0x490
[7664707.773250]  [<ffffffffc05871c6>] ext4_filemap_fault+0x36/0x50 [ext4]
[7664707.779882]  [<ffffffffa01e593a>] __do_fault.isra.59+0x8a/0x100
[7664707.785998]  [<ffffffffa0233289>] ? __mem_cgroup_uncharge_common+0x49/0x2f0
[7664707.793140]  [<ffffffffa01e5eec>] do_read_fault.isra.61+0x4c/0x1b0
[7664707.799495]  [<ffffffffa01ea874>] handle_pte_fault+0x2f4/0xd10
[7664707.805500]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664707.811421]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664707.817347]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664707.822920]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664707.828237] Mem-Info:
[7664707.830718] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:32589 inactive_file:35846 isolated_file:2656
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824025 slab_unreclaimable:62296554
 mapped:1607 shmem:0 pagetables:1685 bounce:0
 free:590366 free_pcp:0 free_cma:0
[7664707.864992] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664707.906747] lowmem_reserve[]: 0 1418 63868 63868
[7664707.911676] Node 0 DMA32 free:261300kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1072kB inactive_file:3432kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:80kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686220kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:21789 all_unreclaimable? yes
[7664707.956574] lowmem_reserve[]: 0 0 62450 62450
[7664707.961373] Node 0 Normal free:508592kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:47580kB inactive_file:44596kB unevictable:168kB isolated(anon):0kB isolated(file):3456kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243256kB kernel_stack:6080kB pagetables:2324kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:778808 all_unreclaimable? yes
[7664708.008263] lowmem_reserve[]: 0 0 0 0
[7664708.012235] Node 1 Normal free:525504kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16312kB inactive_file:16572kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711248kB slab_unreclaimable:63411344kB kernel_stack:20816kB pagetables:1988kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:301918 all_unreclaimable? yes
[7664708.059135] lowmem_reserve[]: 0 0 0 0
[7664708.063108] Node 2 Normal free:525300kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32796kB inactive_file:35048kB unevictable:8680kB isolated(anon):0kB isolated(file):3200kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715184kB slab_unreclaimable:62476092kB kernel_stack:7760kB pagetables:568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1134574 all_unreclaimable? yes
[7664708.110233] lowmem_reserve[]: 0 0 0 0
[7664708.114200] Node 3 Normal free:524888kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:43724kB inactive_file:44284kB unevictable:840kB isolated(anon):0kB isolated(file):384kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369304kB kernel_stack:4208kB pagetables:1852kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1367054 all_unreclaimable? yes
[7664708.161055] lowmem_reserve[]: 0 0 0 0
[7664708.165019] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664708.179860] Node 0 DMA32: 352*4kB (UEM) 402*8kB (UEM) 1214*16kB (UEM) 3689*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261456kB
[7664708.196353] Node 0 Normal: 6469*4kB (UEM) 5705*8kB (UEM) 3934*16kB (UEM) 4479*32kB (EM) 2046*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508828kB
[7664708.213113] Node 1 Normal: 88054*4kB (UEM) 21659*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525504kB
[7664708.226645] Node 2 Normal: 27478*4kB (UEM) 40145*8kB (UEM) 894*16kB (UEM) 1669*32kB (UEM) 414*64kB (UEM) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525408kB
[7664708.242511] Node 3 Normal: 131209*4kB (UEM) 6*8kB (UE) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524884kB
[7664708.255410] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664708.264287] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664708.272902] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664708.281775] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664708.290390] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664708.299264] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664708.307871] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664708.316747] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664708.325358] 73769 total pagecache pages
[7664708.329373] 0 pages in swap cache
[7664708.332873] Swap cache stats: add 21120677, delete 21136649, find 4513424/7609910
[7664708.340526] Free swap  = 3094380kB
[7664708.344107] Total swap = 4194300kB
[7664708.347694] 66993253 pages RAM
[7664708.350927] 0 pages HighMem/MovableOnly
[7664708.354946] 1101945 pages reserved
[7664708.500838] ll_ost_io03_035 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664708.509276] ll_ost_io03_035 cpuset=/ mems_allowed=3
[7664708.514332] CPU: 47 PID: 3183 Comm: ll_ost_io03_035 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664708.527623] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664708.535448] Call Trace:
[7664708.538081]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664708.543399]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664708.548893]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664708.554727]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664708.560483]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664708.566496]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664708.572855]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664708.578951]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664708.584703]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664708.591232]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664708.597765]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664708.603946]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664708.609957]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664708.616066]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664708.622952]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664708.630184]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664708.636351]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664708.643006]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664708.650095]  [<ffffffffc11dcbe7>] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
[7664708.656708]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664708.662453]  [<ffffffffa00dca58>] ? __enqueue_entity+0x78/0x80
[7664708.668459]  [<ffffffffa00e367f>] ? enqueue_entity+0x2ef/0xbe0
[7664708.674467]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664708.679995]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664708.687083]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664708.694832]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664708.702087]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664708.709953]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664708.716954]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664708.724907]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664708.732169]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664708.738644]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664708.746211]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664708.751271]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664708.757538]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664708.764158]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664708.770422] Mem-Info:
[7664708.772883] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:33918 inactive_file:33676 isolated_file:2528
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824025 slab_unreclaimable:62296538
 mapped:1588 shmem:0 pagetables:1685 bounce:0
 free:590404 free_pcp:0 free_cma:0
[7664708.807153] Node 3 Normal free:524884kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:43272kB inactive_file:42952kB unevictable:840kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369308kB kernel_stack:4208kB pagetables:1852kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1367054 all_unreclaimable? yes
[7664708.853840] lowmem_reserve[]: 0 0 0 0
[7664708.857808] Node 3 Normal: 131209*4kB (UEM) 6*8kB (UE) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524884kB
[7664708.870708] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664708.879573] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664708.888178] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664708.897044] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664708.905651] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664708.914516] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664708.923123] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664708.931990] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664708.940593] 73669 total pagecache pages
[7664708.944607] 0 pages in swap cache
[7664708.948099] Swap cache stats: add 21120678, delete 21136650, find 4513424/7609912
[7664708.955754] Free swap  = 3097196kB
[7664708.959330] Total swap = 4194300kB
[7664708.962914] 66993253 pages RAM
[7664708.966144] 0 pages HighMem/MovableOnly
[7664708.970156] 1101945 pages reserved
[7664708.973735] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664708.981783] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664708.990742] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664708.999529] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664709.008119] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664709.016297] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664709.024907] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664709.033167] [53099]     0 53099     6670      239      18      649             0 smartd
[7664709.041348] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664709.049442] [53104]     0 53104    74785      324      85      252             0 sssd
[7664709.057444] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664709.065969] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664709.074923] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664709.083269] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664709.091541] [53178]     0 53178    76774      291      95      241             0 sssd_nss
[7664709.099895] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664709.108248] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664709.117124] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664709.125134] [53969]     0 53969    31572      205      20      168             0 crond
[7664709.133228] [54035]     0 54035    27526      164      10       33             0 agetty
[7664709.141414] [54036]     0 54036    27526      158      11       33             0 agetty
[7664709.149587] [54186]     0 54186    22934      210      46      272             0 master
[7664709.157760] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664709.165873] [36317]     0 36317    28294      187      14       61             0 bash
[7664709.173872] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664709.176216] LustreError: 8706:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c26709a4a00
[7664709.193251] [36329]     0 36329    28177      160      14       55             0 grep
[7664709.194980] LustreError: 8763:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c3a125b9000
[7664709.212219] [76204]    89 76204    25501      252      46      282             0 pickup
[7664709.220402] [97173]     0 97173    48653      264      49      261             0 crond
[7664709.227002] LustreError: 90680:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c1dbbd31800
[7664709.233911] LustreError: 8712:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c4a49683200
[7664709.240251] LustreError: 90700:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c45bee18a00
[7664709.240262] LustreError: 90700:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c45bee18a00
[7664709.240273] LustreError: 90700:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c45bee18a00
[7664709.240284] LustreError: 90700:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c45bee18a00
[7664709.294083] [97872]     0 97872    48653      263      49      263             0 crond
[7664709.302178] [98579]     0 98579    48653      266      49      235             0 crond
[7664709.310268] [98713]     0 98713    30977      211      16      529             0 python3
[7664709.318532] [99292]     0 99292    48653      257      49      261             0 crond
[7664709.326626] [99450]     0 99450    30913      208      18      446             0 python3
[7664709.334890] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664709.343152] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664709.352104] [100032]     0 100032    48653      266      49      240             0 crond
[7664709.360364] [100105]    89 100105    25553      264      47      274             0 smtp
[7664709.368543] [100203]     0 100203    30816      185      17      333             0 python3
[7664709.376978] Out of memory: Kill process 53099 (smartd) score 0 or sacrifice child
[7664709.384637] Killed process 53099 (smartd) total-vm:26680kB, anon-rss:0kB, file-rss:956kB, shmem-rss:0kB
[7664709.504023] ll_ost_io02_077 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664709.512467] ll_ost_io02_077 cpuset=/ mems_allowed=2
[7664709.517530] CPU: 34 PID: 83188 Comm: ll_ost_io02_077 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664709.530907] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664709.538733] Call Trace:
[7664709.541368]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664709.546683]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664709.552181]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664709.558018]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664709.563766]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664709.569778]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664709.576141]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664709.582240]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664709.587988]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664709.594523]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664709.601059]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664709.607244]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664709.613251]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664709.619360]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664709.626242]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664709.633467]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664709.639637]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664709.646257]  [<ffffffffa002a59e>] ? __switch_to+0xce/0x580
[7664709.651924]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664709.657678]  [<ffffffffa006213e>] ? physflat_send_IPI_mask+0xe/0x10
[7664709.664128]  [<ffffffffa0056f42>] ? native_smp_send_reschedule+0x52/0x70
[7664709.671006]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664709.676535]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664709.683629]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664709.691388]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664709.698653]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664709.706549]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664709.713560]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664709.721513]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664709.728776]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664709.735285]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664709.742860]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664709.747922]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664709.754192]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664709.760919]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664709.767201] Mem-Info:
[7664709.769666] active_anon:0 inactive_anon:5 isolated_anon:0
 active_file:34619 inactive_file:35308 isolated_file:2528
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824034 slab_unreclaimable:62296630
 mapped:1588 shmem:0 pagetables:1649 bounce:0
 free:590114 free_pcp:139 free_cma:0
[7664709.804144] Node 2 Normal free:525364kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:20kB active_file:31216kB inactive_file:36544kB unevictable:8680kB isolated(anon):0kB isolated(file):1920kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715188kB slab_unreclaimable:62476080kB kernel_stack:7760kB pagetables:568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:43099 all_unreclaimable? no
[7664709.851134] lowmem_reserve[]: 0 0 0 0
[7664709.855112] Node 2 Normal: 27391*4kB (UEM) 40175*8kB (UEM) 892*16kB (UEM) 1676*32kB (UEM) 414*64kB (UEM) 2*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525620kB
[7664709.871051] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664709.879919] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664709.888522] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664709.897398] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664709.906017] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664709.914925] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664709.923543] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664709.932421] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664709.941036] 73740 total pagecache pages
[7664709.945083] 0 pages in swap cache
[7664709.948584] Swap cache stats: add 21120686, delete 21136658, find 4513428/7609918
[7664709.956250] Free swap  = 3099756kB
[7664709.959833] Total swap = 4194300kB
[7664709.963415] 66993253 pages RAM
[7664709.966645] 0 pages HighMem/MovableOnly
[7664709.970656] 1101945 pages reserved
[7664709.974236] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664709.982296] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664709.991271] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664710.000066] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664710.008676] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664710.016868] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664710.025487] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664710.033783] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664710.041891] [53104]     0 53104    74785      324      85      253             0 sssd
[7664710.049903] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664710.058441] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664710.067442] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664710.075800] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664710.084076] [53178]     0 53178    76774      291      95      241             0 sssd_nss
[7664710.092441] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664710.100834] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664710.109716] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664710.117730] [53969]     0 53969    31572      205      20      168             0 crond
[7664710.125834] [54035]     0 54035    27526      164      10       33             0 agetty
[7664710.134044] [54036]     0 54036    27526      158      11       33             0 agetty
[7664710.142219] [54186]     0 54186    22934      210      46      272             0 master
[7664710.150392] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664710.158519] [36317]     0 36317    28294      187      14       61             0 bash
[7664710.166523] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664710.175050] [36329]     0 36329    28177      160      14       55             0 grep
[7664710.183158] [76204]    89 76204    25501      252      46      282             0 pickup
[7664710.191347] [97173]     0 97173    48653      264      49      261             0 crond
[7664710.199458] [97872]     0 97872    48653      263      49      263             0 crond
[7664710.207560] [98579]     0 98579    48653      266      49      235             0 crond
[7664710.215659] [98713]     0 98713    30977      211      16      529             0 python3
[7664710.223921] [99292]     0 99292    48653      257      49      261             0 crond
[7664710.232018] [99450]     0 99450    30913      208      18      446             0 python3
[7664710.240291] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664710.248569] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664710.257543] [100032]     0 100032    48653      266      49      240             0 crond
[7664710.265841] [100105]    89 100105    25553      264      47      274             0 smtp
[7664710.274024] [100203]     0 100203    30816      185      17      333             0 python3
[7664710.282472] Out of memory: Kill process 98713 (python3) score 0 or sacrifice child
[7664710.290218] Killed process 98713 (python3) total-vm:123908kB, anon-rss:0kB, file-rss:844kB, shmem-rss:0kB
[7664710.498769] python3: page allocation failure: order:0, mode:0x200da
[7664710.505220] CPU: 15 PID: 98713 Comm: python3 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664710.517907] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664710.525739] Call Trace:
[7664710.528371]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664710.533693]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664710.539790]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664710.545364]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664710.551381]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664710.557915]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664710.564447]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664710.570280]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664710.576892]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664710.583157]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664710.589079]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664710.595092]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664710.601016]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664710.606939]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664710.612515]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664710.617834] Mem-Info:
[7664710.620312] active_anon:0 inactive_anon:5 isolated_anon:0
 active_file:34378 inactive_file:34621 isolated_file:2709
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824036 slab_unreclaimable:62296631
 mapped:1588 shmem:0 pagetables:1649 bounce:0
 free:590188 free_pcp:37 free_cma:0
[7664710.654671] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664710.696426] lowmem_reserve[]: 0 1418 63868 63868
[7664710.701354] Node 0 DMA32 free:261312kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:836kB inactive_file:2856kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686256kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:56975 all_unreclaimable? yes
[7664710.746050] lowmem_reserve[]: 0 0 62450 62450
[7664710.750712] Node 0 Normal free:508120kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:44724kB inactive_file:42264kB unevictable:168kB isolated(anon):0kB isolated(file):5248kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243532kB kernel_stack:5840kB pagetables:2252kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:283308 all_unreclaimable? yes
[7664710.797574] lowmem_reserve[]: 0 0 0 0
[7664710.801544] Node 1 Normal free:525216kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:17264kB inactive_file:14160kB unevictable:26488kB isolated(anon):0kB isolated(file):944kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711288kB slab_unreclaimable:63411344kB kernel_stack:20816kB pagetables:1916kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1244662 all_unreclaimable? yes
[7664710.848666] lowmem_reserve[]: 0 0 0 0
[7664710.852642] Node 2 Normal free:525500kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32880kB inactive_file:35600kB unevictable:8680kB isolated(anon):0kB isolated(file):2688kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715200kB slab_unreclaimable:62476084kB kernel_stack:7760kB pagetables:568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:661237 all_unreclaimable? yes
[7664710.899676] lowmem_reserve[]: 0 0 0 0
[7664710.903646] Node 3 Normal free:524884kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:43280kB inactive_file:42944kB unevictable:840kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369308kB kernel_stack:4208kB pagetables:1852kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1367054 all_unreclaimable? yes
[7664710.950337] lowmem_reserve[]: 0 0 0 0
[7664710.954308] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664710.969147] Node 0 DMA32: 440*4kB (UEM) 407*8kB (UEM) 1213*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261800kB
[7664710.985642] Node 0 Normal: 6370*4kB (UEM) 5719*8kB (UEM) 3934*16kB (UEM) 4480*32kB (UEM) 2041*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508256kB
[7664711.002482] Node 1 Normal: 88044*4kB (UEM) 21630*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525216kB
[7664711.015637] Node 2 Normal: 27361*4kB (UEM) 40163*8kB (UEM) 894*16kB (UEM) 1676*32kB (UEM) 414*64kB (UEM) 2*128kB (UM) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525436kB
[7664711.031585] Node 3 Normal: 131209*4kB (UEM) 6*8kB (UE) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524884kB
[7664711.044482] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664711.053346] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664711.061956] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664711.070835] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664711.079451] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664711.088319] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664711.095509] LustreError: 80409:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c29177a8400
[7664711.095570] LustreError: 90678:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(78730)  req@ffff9c207d7cf050 x1659209236854592/t0(0) o4->541f81d4-bd4f-4@10.50.7.3@o2ib2:494/0 lens 488/448 e 0 to 0 dl 1583650744 ref 1 fl Interpret:/0/0 rc 0/0
[7664711.095598] Lustre: fir-OST001d: Bulk IO write error with 541f81d4-bd4f-4 (at 10.50.7.3@o2ib2), client will retry: rc = -110
[7664711.095600] Lustre: Skipped 1 previous similar message
[7664711.147575] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664711.156449] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664711.165065] 73854 total pagecache pages
[7664711.169087] 0 pages in swap cache
[7664711.172587] Swap cache stats: add 21120687, delete 21136659, find 4513429/7609920
[7664711.180247] Free swap  = 3099756kB
[7664711.183827] Total swap = 4194300kB
[7664711.187406] 66993253 pages RAM
[7664711.190639] 0 pages HighMem/MovableOnly
[7664711.194650] 1101945 pages reserved
[7664711.256602] LustreError: 3109:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c1bafc1ae00
[7664711.611662] ll_ost_io02_088 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664711.620104] ll_ost_io02_088 cpuset=/ mems_allowed=2
[7664711.625172] CPU: 10 PID: 8667 Comm: ll_ost_io02_088 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664711.638457] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664711.646285] Call Trace:
[7664711.648920]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664711.654237]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664711.659730]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664711.665571]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664711.671584]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664711.677936]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664711.684030]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664711.689775]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664711.696304]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664711.702838]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664711.709025]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664711.715031]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664711.721147]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664711.728031]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664711.734184]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664711.741277]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664711.747845]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664711.754497]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664711.761844]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664711.768589]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664711.775937]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664711.783454]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664711.790376]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664711.797462]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664711.805217]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664711.812472]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664711.820341]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664711.827315]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664711.832752]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664711.839234]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664711.846802]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664711.851862]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664711.858138]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664711.864758]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664711.871025] Mem-Info:
[7664711.873483] active_anon:0 inactive_anon:2 isolated_anon:0
 active_file:32145 inactive_file:35045 isolated_file:4704
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824040 slab_unreclaimable:62296639
 mapped:1588 shmem:0 pagetables:1633 bounce:0
 free:590317 free_pcp:0 free_cma:0
[7664711.907757] Node 2 Normal free:525116kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32344kB inactive_file:39104kB unevictable:8680kB isolated(anon):0kB isolated(file):1408kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715204kB slab_unreclaimable:62476084kB kernel_stack:7760kB pagetables:568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:139544 all_unreclaimable? yes
[7664711.954796] lowmem_reserve[]: 0 0 0 0
[7664711.958762] Node 2 Normal: 27350*4kB (EM) 40156*8kB (UEM) 887*16kB (EM) 1667*32kB (EM) 414*64kB (EM) 2*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524936kB
[7664711.974278] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664711.983145] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664711.991749] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664712.000618] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664712.009223] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664712.018090] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664712.026696] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664712.035560] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664712.044170] 73728 total pagecache pages
[7664712.048190] 0 pages in swap cache
[7664712.051682] Swap cache stats: add 21120697, delete 21136669, find 4513431/7609924
[7664712.059333] Free swap  = 3101548kB
[7664712.062912] Total swap = 4194300kB
[7664712.066495] 66993253 pages RAM
[7664712.069725] 0 pages HighMem/MovableOnly
[7664712.073735] 1101945 pages reserved
[7664712.077317] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664712.085364] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664712.094324] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664712.095569] LustreError: 80409:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c363824fe00
[7664712.114067] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664712.122663] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664712.130840] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664712.139452] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664712.147724] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664712.155813] [53104]     0 53104    74785      324      85      253             0 sssd
[7664712.163815] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664712.172342] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664712.181296] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664712.189650] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664712.197919] [53178]     0 53178    76774      291      95      241             0 sssd_nss
[7664712.206271] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664712.214617] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664712.223489] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664712.231494] [53969]     0 53969    31572      205      20      168             0 crond
[7664712.239586] [54035]     0 54035    27526      164      10       33             0 agetty
[7664712.247763] [54036]     0 54036    27526      158      11       33             0 agetty
[7664712.255944] [54186]     0 54186    22934      210      46      273             0 master
[7664712.264127] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664712.272245] [36317]     0 36317    28294      187      14       61             0 bash
[7664712.280253] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664712.288780] [36329]     0 36329    28177      160      14       55             0 grep
[7664712.296883] [76204]    89 76204    25501      252      46      282             0 pickup
[7664712.305061] [97173]     0 97173    48653      264      49      262             0 crond
[7664712.313153] [97872]     0 97872    48653      263      49      263             0 crond
[7664712.321249] [98579]     0 98579    48653      266      49      235             0 crond
[7664712.329343] [99292]     0 99292    48653      257      49      261             0 crond
[7664712.337437] [99450]     0 99450    30913      208      18      446             0 python3
[7664712.345702] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664712.353962] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664712.362917] [100032]     0 100032    48653      266      49      240             0 crond
[7664712.371183] [100105]    89 100105    25553      264      47      274             0 smtp
[7664712.379358] [100203]     0 100203    30816      185      17      333             0 python3
[7664712.387798] Out of memory: Kill process 99450 (python3) score 0 or sacrifice child
[7664712.395540] Killed process 99450 (python3) total-vm:123652kB, anon-rss:0kB, file-rss:832kB, shmem-rss:0kB
[7664712.608236] python3: page allocation failure: order:0, mode:0x200da
[7664712.614686] CPU: 11 PID: 99450 Comm: python3 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664712.627375] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664712.635209] Call Trace:
[7664712.637839]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664712.643156]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664712.649253]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664712.654829]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664712.660840]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664712.667375]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664712.673907]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664712.679747]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664712.686360]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664712.692626]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664712.698546]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664712.704551]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664712.710470]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664712.716394]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664712.721970]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664712.727282] Mem-Info:
[7664712.729759] active_anon:1 inactive_anon:1 isolated_anon:0
 active_file:33092 inactive_file:37027 isolated_file:3680
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824040 slab_unreclaimable:62296639
 mapped:1588 shmem:0 pagetables:1633 bounce:0
 free:590181 free_pcp:0 free_cma:0
[7664712.764028] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664712.805780] lowmem_reserve[]: 0 1418 63868 63868
[7664712.810710] Node 0 DMA32 free:261344kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:964kB inactive_file:2888kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686256kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:258864 all_unreclaimable? yes
[7664712.855494] lowmem_reserve[]: 0 0 62450 62450
[7664712.860161] Node 0 Normal free:507988kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:44696kB inactive_file:45632kB unevictable:168kB isolated(anon):0kB isolated(file):4352kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243564kB kernel_stack:6304kB pagetables:2188kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1102129 all_unreclaimable? yes
[7664712.907115] lowmem_reserve[]: 0 0 0 0
[7664712.911089] Node 1 Normal free:525308kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16124kB inactive_file:17192kB unevictable:26488kB isolated(anon):0kB isolated(file):128kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711288kB slab_unreclaimable:63411344kB kernel_stack:20816kB pagetables:1916kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:31567 all_unreclaimable? no
[7664712.957950] lowmem_reserve[]: 0 0 0 0
[7664712.961919] Node 2 Normal free:525048kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32568kB inactive_file:39848kB unevictable:8680kB isolated(anon):0kB isolated(file):512kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715204kB slab_unreclaimable:62476084kB kernel_stack:7760kB pagetables:568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:453640 all_unreclaimable? yes
[7664713.008868] lowmem_reserve[]: 0 0 0 0
[7664713.012845] Node 3 Normal free:525184kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:8kB active_file:42668kB inactive_file:41472kB unevictable:840kB isolated(anon):0kB isolated(file):6528kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369308kB kernel_stack:4208kB pagetables:1852kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:688829 all_unreclaimable? yes
[7664713.059699] lowmem_reserve[]: 0 0 0 0
[7664713.063665] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664713.078504] Node 0 DMA32: 409*4kB (UEM) 407*8kB (UEM) 1213*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261676kB
[7664713.095003] Node 0 Normal: 6448*4kB (UEM) 5720*8kB (UEM) 3934*16kB (UEM) 4479*32kB (EM) 2041*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508544kB
[7664713.111758] Node 1 Normal: 88065*4kB (UEM) 21631*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525308kB
[7664713.124912] Node 2 Normal: 27350*4kB (EM) 40156*8kB (UEM) 887*16kB (EM) 1667*32kB (UEM) 414*64kB (UEM) 2*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524936kB
[7664713.140601] Node 3 Normal: 131413*4kB (UM) 1*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525660kB
[7664713.153324] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664713.162191] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664713.170797] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664713.179664] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664713.188268] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664713.197136] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664713.205740] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664713.214605] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664713.223213] 73713 total pagecache pages
[7664713.227227] 0 pages in swap cache
[7664713.230719] Swap cache stats: add 21120697, delete 21136669, find 4513431/7609924
[7664713.238371] Free swap  = 3101548kB
[7664713.241949] Total swap = 4194300kB
[7664713.245532] 66993253 pages RAM
[7664713.248762] 0 pages HighMem/MovableOnly
[7664713.252773] 1101945 pages reserved
[7664713.393655] crond invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
[7664713.401232] crond cpuset=/ mems_allowed=0-3
[7664713.405600] CPU: 28 PID: 53969 Comm: crond Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664713.418113] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664713.425941] Call Trace:
[7664713.428583]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664713.433904]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664713.439396]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664713.445232]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664713.451246]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664713.457600]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664713.463701]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664713.469456]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664713.475990]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664713.482515]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664713.488350]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664713.494971]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664713.501245]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664713.507165]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664713.513177]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664713.519098]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664713.525018]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664713.530592]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664713.535911] Mem-Info:
[7664713.538388] active_anon:0 inactive_anon:8 isolated_anon:0
 active_file:32941 inactive_file:35907 isolated_file:3424
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824040 slab_unreclaimable:62296640
 mapped:1588 shmem:0 pagetables:1633 bounce:0
 free:590269 free_pcp:0 free_cma:0
[7664713.572668] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664713.614432] lowmem_reserve[]: 0 1418 63868 63868
[7664713.619365] Node 0 DMA32 free:261308kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:4kB active_file:988kB inactive_file:2380kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686256kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:67919 all_unreclaimable? yes
[7664713.664061] lowmem_reserve[]: 0 0 62450 62450
[7664713.668729] Node 0 Normal free:508240kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:45232kB inactive_file:44788kB unevictable:168kB isolated(anon):0kB isolated(file):6656kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243564kB kernel_stack:6144kB pagetables:2188kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1236710 all_unreclaimable? yes
[7664713.715681] lowmem_reserve[]: 0 0 0 0
[7664713.719656] Node 1 Normal free:525304kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16272kB inactive_file:17412kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711288kB slab_unreclaimable:63411348kB kernel_stack:20816kB pagetables:1916kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:511474 all_unreclaimable? yes
[7664713.766515] lowmem_reserve[]: 0 0 0 0
[7664713.770487] Node 2 Normal free:524928kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32396kB inactive_file:39948kB unevictable:8680kB isolated(anon):0kB isolated(file):256kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715204kB slab_unreclaimable:62476084kB kernel_stack:7760kB pagetables:568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:373764 all_unreclaimable? yes
[7664713.817436] lowmem_reserve[]: 0 0 0 0
[7664713.821413] Node 3 Normal free:525420kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:20kB active_file:39372kB inactive_file:41620kB unevictable:840kB isolated(anon):0kB isolated(file):2944kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369308kB kernel_stack:4208kB pagetables:1852kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:235096 all_unreclaimable? no
[7664713.868278] lowmem_reserve[]: 0 0 0 0
[7664713.872247] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664713.887087] Node 0 DMA32: 399*4kB (UEM) 407*8kB (UEM) 1213*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261636kB
[7664713.903579] Node 0 Normal: 6449*4kB (UEM) 5721*8kB (UEM) 3921*16kB (UEM) 4479*32kB (EM) 2041*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508348kB
[7664713.920333] Node 1 Normal: 88065*4kB (UEM) 21631*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525308kB
[7664713.933488] Node 2 Normal: 27350*4kB (EM) 40156*8kB (UEM) 887*16kB (EM) 1667*32kB (EM) 414*64kB (EM) 2*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524936kB
[7664713.949002] Node 3 Normal: 131394*4kB (UM) 7*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525632kB
[7664713.961725] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664713.970592] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664713.979200] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664713.988067] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664713.996682] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664714.005557] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664714.014170] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664714.023038] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664714.031650] 73805 total pagecache pages
[7664714.035663] 0 pages in swap cache
[7664714.039155] Swap cache stats: add 21120703, delete 21136675, find 4513432/7609927
[7664714.046810] Free swap  = 3103340kB
[7664714.050395] Total swap = 4194300kB
[7664714.053978] 66993253 pages RAM
[7664714.057215] 0 pages HighMem/MovableOnly
[7664714.061234] 1101945 pages reserved
[7664714.064817] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664714.072866] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664714.081827] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664714.090622] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664714.099223] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664714.107404] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664714.116014] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664714.124276] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664714.132369] [53104]     0 53104    74785      324      85      253             0 sssd
[7664714.140378] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664714.148907] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664714.157869] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664714.166221] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664714.174489] [53178]     0 53178    76774      291      95      241             0 sssd_nss
[7664714.182845] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664714.191199] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664714.200077] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664714.208084] [53969]     0 53969    31572      205      20      168             0 crond
[7664714.216175] [54035]     0 54035    27526      164      10       33             0 agetty
[7664714.224349] [54036]     0 54036    27526      158      11       33             0 agetty
[7664714.232533] [54186]     0 54186    22934      210      46      273             0 master
[7664714.240714] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664714.248845] [36317]     0 36317    28294      187      14       61             0 bash
[7664714.256851] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664714.265381] [36329]     0 36329    28177      160      14       55             0 grep
[7664714.273507] [76204]    89 76204    25501      252      46      282             0 pickup
[7664714.275609] LustreError: 36965:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c10a3347e00
[7664714.292632] [97173]     0 97173    48653      264      49      262             0 crond
[7664714.300731] [97872]     0 97872    48653      263      49      263             0 crond
[7664714.308827] [98579]     0 98579    48653      266      49      235             0 crond
[7664714.316920] [99292]     0 99292    48653      257      49      261             0 crond
[7664714.325013] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664714.333275] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664714.342240] [100032]     0 100032    48653      266      49      240             0 crond
[7664714.350506] [100105]    89 100105    25553      264      47      274             0 smtp
[7664714.358689] [100203]     0 100203    30816      185      17      333             0 python3
[7664714.367131] Out of memory: Kill process 53104 (sssd) score 0 or sacrifice child
[7664714.374617] Killed process 53178 (sssd_nss) total-vm:307096kB, anon-rss:0kB, file-rss:1164kB, shmem-rss:0kB
[7664714.442074] LustreError: 8680:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c2ca5ff6000
[7664714.804647] sssd_nss: page allocation failure: order:0, mode:0x200da
[7664714.811188] CPU: 20 PID: 53178 Comm: sssd_nss Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664714.823959] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664714.831786] Call Trace:
[7664714.834420]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664714.839736]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664714.845835]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664714.851416]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664714.857424]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664714.863957]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664714.870495]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664714.876338]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664714.882954]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664714.889221]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664714.895144]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664714.901152]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664714.907077]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664714.913001]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664714.918577]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664714.923896] Mem-Info:
[7664714.926372] active_anon:0 inactive_anon:1 isolated_anon:0
 active_file:34423 inactive_file:35183 isolated_file:3350
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824040 slab_unreclaimable:62296661
 mapped:1588 shmem:0 pagetables:1615 bounce:0
 free:590131 free_pcp:0 free_cma:0
[7664714.960649] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664715.002406] lowmem_reserve[]: 0 1418 63868 63868
[7664715.007329] Node 0 DMA32 free:261188kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:764kB inactive_file:2552kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686248kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:12559 all_unreclaimable? yes
[7664715.052025] lowmem_reserve[]: 0 0 62450 62450
[7664715.056696] Node 0 Normal free:508116kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:40948kB inactive_file:42636kB unevictable:168kB isolated(anon):0kB isolated(file):5336kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243568kB kernel_stack:6352kB pagetables:2188kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:274981 all_unreclaimable? no
[7664715.103471] lowmem_reserve[]: 0 0 0 0
[7664715.107440] Node 1 Normal free:525304kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16272kB inactive_file:17412kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711288kB slab_unreclaimable:63411348kB kernel_stack:20816kB pagetables:1916kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:511474 all_unreclaimable? yes
[7664715.154302] lowmem_reserve[]: 0 0 0 0
[7664715.158271] Node 2 Normal free:524872kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32340kB inactive_file:35312kB unevictable:8680kB isolated(anon):0kB isolated(file):4608kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715204kB slab_unreclaimable:62476148kB kernel_stack:7760kB pagetables:568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:873448 all_unreclaimable? yes
[7664715.205300] lowmem_reserve[]: 0 0 0 0
[7664715.209275] Node 3 Normal free:525220kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:4kB active_file:40768kB inactive_file:42720kB unevictable:840kB isolated(anon):0kB isolated(file):2560kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369300kB kernel_stack:4208kB pagetables:1780kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:937047 all_unreclaimable? yes
[7664715.256132] lowmem_reserve[]: 0 0 0 0
[7664715.260102] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664715.274951] Node 0 DMA32: 412*4kB (UEM) 408*8kB (UEM) 1213*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261696kB
[7664715.291443] Node 0 Normal: 6482*4kB (UEM) 5721*8kB (UEM) 3935*16kB (UEM) 4480*32kB (UEM) 2041*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508736kB
[7664715.308284] Node 1 Normal: 88065*4kB (UEM) 21631*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525308kB
[7664715.321438] Node 2 Normal: 27354*4kB (UEM) 40162*8kB (UEM) 887*16kB (EM) 1667*32kB (EM) 414*64kB (UEM) 2*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525000kB
[7664715.337126] Node 3 Normal: 131302*4kB (UEM) 7*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525264kB
[7664715.349936] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664715.358804] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664715.367420] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664715.376294] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664715.384911] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664715.393782] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664715.402391] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664715.411264] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664715.419876] 73858 total pagecache pages
[7664715.423891] 0 pages in swap cache
[7664715.427391] Swap cache stats: add 21120703, delete 21136675, find 4513432/7609927
[7664715.435043] Free swap  = 3103340kB
[7664715.438622] Total swap = 4194300kB
[7664715.442204] 66993253 pages RAM
[7664715.445434] 0 pages HighMem/MovableOnly
[7664715.449448] 1101945 pages reserved
[7664715.663305] ll_ost_io03_074 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664715.671745] ll_ost_io03_074 cpuset=/ mems_allowed=3
[7664715.676806] CPU: 47 PID: 90689 Comm: ll_ost_io03_074 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664715.690185] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664715.698012] Call Trace:
[7664715.700644]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664715.705961]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664715.711459]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664715.717295]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664715.723043]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664715.729060]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664715.735421]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664715.741520]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664715.747275]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664715.753799]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664715.760328]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664715.766514]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664715.772523]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664715.778637]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664715.785522]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664715.792754]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664715.798919]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664715.805575]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664715.812626]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664715.818377]  [<ffffffffa006213e>] ? physflat_send_IPI_mask+0xe/0x10
[7664715.824824]  [<ffffffffa0056f42>] ? native_smp_send_reschedule+0x52/0x70
[7664715.831697]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664715.837229]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664715.844318]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664715.852074]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664715.859340]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664715.867206]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664715.874208]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664715.882154]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664715.889408]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664715.895883]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664715.903450]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664715.908502]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664715.914767]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664715.921382]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664715.927644] Mem-Info:
[7664715.930104] active_anon:0 inactive_anon:5 isolated_anon:0
 active_file:32726 inactive_file:35555 isolated_file:4000
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824040 slab_unreclaimable:62296632
 mapped:1588 shmem:0 pagetables:1520 bounce:0
 free:590258 free_pcp:0 free_cma:0
[7664715.964372] Node 3 Normal free:525216kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:4kB active_file:39828kB inactive_file:44528kB unevictable:840kB isolated(anon):0kB isolated(file):2048kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369296kB kernel_stack:4208kB pagetables:1780kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:973906 all_unreclaimable? yes
[7664716.011238] lowmem_reserve[]: 0 0 0 0
[7664716.015203] Node 3 Normal: 131309*4kB (UEM) 7*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525292kB
[7664716.028016] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664716.036883] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664716.045488] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664716.054353] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664716.062961] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664716.071827] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664716.080434] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664716.089298] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664716.096816] LustreError: 8671:0:(ldlm_lib.c:3262:target_bulk_io()) @@@ network error on bulk READ  req@ffff9c406e9db050 x1659467991482368/t0(0) o3->fb2c1382-8f5a-4@10.50.15.10@o2ib2:501/0 lens 488/440 e 1 to 0 dl 1583650751 ref 1 fl Interpret:/0/0 rc 0/0
[7664716.096819] LustreError: 8671:0:(ldlm_lib.c:3262:target_bulk_io()) Skipped 16 previous similar messages
[7664716.096847] Lustre: fir-OST001b: Bulk IO read error with fb2c1382-8f5a-4 (at 10.50.15.10@o2ib2), client will retry: rc -110
[7664716.096848] Lustre: Skipped 5 previous similar messages
[7664716.096879] Lustre: fir-OST0019: Bulk IO write error with c2ca4c5a-e67e-4 (at 10.50.5.43@o2ib2), client will retry: rc = -110
[7664716.158269] 73863 total pagecache pages
[7664716.162281] 0 pages in swap cache
[7664716.165774] Swap cache stats: add 21120710, delete 21136682, find 4513432/7609928
[7664716.173428] Free swap  = 3104364kB
[7664716.177006] Total swap = 4194300kB
[7664716.180586] 66993253 pages RAM
[7664716.183818] 0 pages HighMem/MovableOnly
[7664716.187830] 1101945 pages reserved
[7664716.191410] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664716.199463] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664716.208417] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664716.217211] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664716.225805] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664716.233985] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664716.242598] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664716.250862] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664716.258953] [53104]     0 53104    74785      324      85      253             0 sssd
[7664716.266960] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664716.275479] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664716.284433] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664716.292787] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664716.301054] [53179]     0 53179    71689      280      85      232             0 sssd_pam
[7664716.309403] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664716.318279] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664716.326285] [53969]     0 53969    31572      205      20      168             0 crond
[7664716.334380] [54035]     0 54035    27526      164      10       33             0 agetty
[7664716.342559] [54036]     0 54036    27526      158      11       33             0 agetty
[7664716.350731] [54186]     0 54186    22934      210      46      273             0 master
[7664716.358905] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664716.367025] [36317]     0 36317    28294      187      14       61             0 bash
[7664716.375027] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664716.383555] [36329]     0 36329    28177      160      14       55             0 grep
[7664716.391670] [76204]    89 76204    25501      252      46      282             0 pickup
[7664716.399855] [97173]     0 97173    48653      264      49      262             0 crond
[7664716.407945] [97872]     0 97872    48653      263      49      263             0 crond
[7664716.416041] [98579]     0 98579    48653      266      49      235             0 crond
[7664716.424135] [99292]     0 99292    48653      257      49      261             0 crond
[7664716.432226] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664716.440488] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664716.449447] [100032]     0 100032    48653      266      49      240             0 crond
[7664716.457708] [100105]    89 100105    25553      264      47      274             0 smtp
[7664716.462376] LustreError: 119554:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c39d93b2e00
[7664716.476923] [100203]     0 100203    30816      185      17      334             0 python3
[7664716.485364] Out of memory: Kill process 53104 (sssd) score 0 or sacrifice child
[7664716.492853] Killed process 53179 (sssd_pam) total-vm:286756kB, anon-rss:0kB, file-rss:1120kB, shmem-rss:0kB
[7664716.665807] sssd_pam: page allocation failure: order:0, mode:0x201da
[7664716.672344] CPU: 24 PID: 53179 Comm: sssd_pam Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664716.685115] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664716.692948] Call Trace:
[7664716.695582]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664716.700900]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664716.706997]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664716.712572]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664716.718582]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664716.725112]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664716.731637]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664716.737817]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664716.743825]  [<ffffffffa01ba3c8>] filemap_fault+0x298/0x490
[7664716.749604]  [<ffffffffc05871c6>] ext4_filemap_fault+0x36/0x50 [ext4]
[7664716.756223]  [<ffffffffa01e593a>] __do_fault.isra.59+0x8a/0x100
[7664716.762318]  [<ffffffffa01e5eec>] do_read_fault.isra.61+0x4c/0x1b0
[7664716.768669]  [<ffffffffa01ea874>] handle_pte_fault+0x2f4/0xd10
[7664716.774677]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664716.780600]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664716.786522]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664716.792095]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664716.797408] Mem-Info:
[7664716.799883] active_anon:0 inactive_anon:2 isolated_anon:0
 active_file:32885 inactive_file:34869 isolated_file:3936
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824040 slab_unreclaimable:62296641
 mapped:1588 shmem:0 pagetables:1520 bounce:0
 free:590476 free_pcp:0 free_cma:0
[7664716.834151] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664716.875897] lowmem_reserve[]: 0 1418 63868 63868
[7664716.880817] Node 0 DMA32 free:261344kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:772kB inactive_file:2096kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686280kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:225382 all_unreclaimable? yes
[7664716.925593] lowmem_reserve[]: 0 0 62450 62450
[7664716.930263] Node 0 Normal free:508628kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:43552kB inactive_file:43108kB unevictable:168kB isolated(anon):0kB isolated(file):9600kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243584kB kernel_stack:5856kB pagetables:2188kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1276697 all_unreclaimable? yes
[7664716.977211] lowmem_reserve[]: 0 0 0 0
[7664716.981181] Node 1 Normal free:525172kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16876kB inactive_file:16324kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711288kB slab_unreclaimable:63411332kB kernel_stack:20816kB pagetables:1536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:120605 all_unreclaimable? yes
[7664717.028055] lowmem_reserve[]: 0 0 0 0
[7664717.032029] Node 2 Normal free:525552kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:4kB active_file:30780kB inactive_file:36352kB unevictable:8680kB isolated(anon):0kB isolated(file):4480kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715204kB slab_unreclaimable:62476076kB kernel_stack:7760kB pagetables:568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:837032 all_unreclaimable? yes
[7664717.079080] lowmem_reserve[]: 0 0 0 0
[7664717.083048] Node 3 Normal free:525304kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:4kB active_file:41800kB inactive_file:43728kB unevictable:840kB isolated(anon):0kB isolated(file):1280kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369292kB kernel_stack:4208kB pagetables:1780kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:973906 all_unreclaimable? yes
[7664717.129917] lowmem_reserve[]: 0 0 0 0
[7664717.133885] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664717.148723] Node 0 DMA32: 410*4kB (UEM) 408*8kB (UEM) 1213*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261688kB
[7664717.165217] Node 0 Normal: 6491*4kB (UEM) 5723*8kB (UEM) 3921*16kB (UEM) 4479*32kB (EM) 2041*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508532kB
[7664717.181979] Node 1 Normal: 88012*4kB (UEM) 21632*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525120kB
[7664717.195508] Node 2 Normal: 27414*4kB (EM) 40178*8kB (UEM) 891*16kB (UEM) 1669*32kB (UEM) 410*64kB (UEM) 1*128kB (U) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525112kB
[7664717.211282] Node 3 Normal: 131309*4kB (UEM) 7*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525292kB
[7664717.224110] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664717.232976] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664717.241579] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664717.250445] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664717.259053] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664717.267919] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664717.276531] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664717.285404] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664717.294015] 74017 total pagecache pages
[7664717.298030] 0 pages in swap cache
[7664717.301530] Swap cache stats: add 21120713, delete 21136685, find 4513434/7609933
[7664717.309180] Free swap  = 3104308kB
[7664717.312759] Total swap = 4194300kB
[7664717.316343] 66993253 pages RAM
[7664717.319580] 0 pages HighMem/MovableOnly
[7664717.323592] 1101945 pages reserved
[7664717.845502] ll_ost_io02_070 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664717.853947] ll_ost_io02_070 cpuset=/ mems_allowed=2
[7664717.859006] CPU: 18 PID: 83172 Comm: ll_ost_io02_070 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664717.872384] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664717.880211] Call Trace:
[7664717.882843]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664717.888158]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664717.893646]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664717.899488]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664717.905245]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664717.911258]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664717.917620]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664717.923731]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664717.929484]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664717.936025]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664717.942552]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664717.948740]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664717.954755]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664717.960870]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664717.967754]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664717.974980]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664717.981142]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664717.987795]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664717.994849]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664718.000604]  [<ffffffffa006213e>] ? physflat_send_IPI_mask+0xe/0x10
[7664718.007050]  [<ffffffffa0056f42>] ? native_smp_send_reschedule+0x52/0x70
[7664718.013933]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664718.019468]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664718.026554]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664718.034306]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664718.041564]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664718.049432]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664718.056399]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664718.061841]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664718.068315]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664718.075885]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664718.080943]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664718.087217]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664718.093830]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664718.100093] Mem-Info:
[7664718.102555] active_anon:0 inactive_anon:3 isolated_anon:0
 active_file:33182 inactive_file:34986 isolated_file:3617
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824040 slab_unreclaimable:62296641
 mapped:1588 shmem:0 pagetables:1520 bounce:0
 free:590268 free_pcp:0 free_cma:0
[7664718.136830] Node 2 Normal free:524996kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:8kB active_file:31084kB inactive_file:38012kB unevictable:8680kB isolated(anon):0kB isolated(file):2688kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715204kB slab_unreclaimable:62476076kB kernel_stack:7760kB pagetables:568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:397828 all_unreclaimable? yes
[7664718.183866] lowmem_reserve[]: 0 0 0 0
[7664718.187834] Node 2 Normal: 27414*4kB (UEM) 40178*8kB (UEM) 891*16kB (UEM) 1670*32kB (UEM) 409*64kB (EM) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525080kB
[7664718.203610] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664718.212475] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664718.221081] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664718.229949] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664718.238553] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664718.247420] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664718.256027] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664718.264896] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664718.273507] 74017 total pagecache pages
[7664718.277521] 0 pages in swap cache
[7664718.281013] Swap cache stats: add 21120714, delete 21136686, find 4513434/7609934
[7664718.288668] Free swap  = 3105332kB
[7664718.292243] Total swap = 4194300kB
[7664718.295825] 66993253 pages RAM
[7664718.299055] 0 pages HighMem/MovableOnly
[7664718.303069] 1101945 pages reserved
[7664718.306647] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664718.314694] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664718.323646] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664718.332436] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664718.341037] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664718.349215] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664718.357827] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664718.366089] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664718.374181] [53104]     0 53104    74785      324      85      253             0 sssd
[7664718.382184] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664718.390709] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664718.399661] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664718.408008] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664718.416269] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664718.425145] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664718.433152] [53969]     0 53969    31572      205      20      168             0 crond
[7664718.441246] [54035]     0 54035    27526      164      10       33             0 agetty
[7664718.449418] [54036]     0 54036    27526      158      11       33             0 agetty
[7664718.457592] [54186]     0 54186    22934      210      46      273             0 master
[7664718.465766] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664718.473890] [36317]     0 36317    28294      187      14       61             0 bash
[7664718.481893] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664718.490411] [36329]     0 36329    28177      160      14       55             0 grep
[7664718.498510] [76204]    89 76204    25501      252      46      282             0 pickup
[7664718.506695] [97173]     0 97173    48653      264      49      262             0 crond
[7664718.514786] [97872]     0 97872    48653      263      49      263             0 crond
[7664718.522883] [98579]     0 98579    48653      266      49      236             0 crond
[7664718.530983] [99292]     0 99292    48653      257      49      261             0 crond
[7664718.539083] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664718.547351] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664718.556318] [100032]     0 100032    48653      266      49      240             0 crond
[7664718.564592] [100105]    89 100105    25553      264      47      274             0 smtp
[7664718.572774] [100203]     0 100203    30816      185      17      334             0 python3
[7664718.581224] Out of memory: Kill process 53104 (sssd) score 0 or sacrifice child
[7664718.588720] Killed process 53104 (sssd) total-vm:299140kB, anon-rss:0kB, file-rss:1296kB, shmem-rss:0kB
[7664718.624970] sssd: page allocation failure: order:0, mode:0x201da
[7664718.631189] CPU: 38 PID: 53104 Comm: sssd Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664718.643618] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664718.651460] Call Trace:
[7664718.654100]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664718.659418]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664718.665521]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664718.671107]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664718.677128]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664718.683662]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664718.690224]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664718.696413]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664718.702425]  [<ffffffffa01ba3c8>] filemap_fault+0x298/0x490
[7664718.708211]  [<ffffffffc05871c6>] ext4_filemap_fault+0x36/0x50 [ext4]
[7664718.714864]  [<ffffffffa01e593a>] __do_fault.isra.59+0x8a/0x100
[7664718.720963]  [<ffffffffa01e5eec>] do_read_fault.isra.61+0x4c/0x1b0
[7664718.727327]  [<ffffffffa01ea874>] handle_pte_fault+0x2f4/0xd10
[7664718.733339]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664718.739292]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664718.745223]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664718.750803]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664718.756124] Mem-Info:
[7664718.758603] active_anon:0 inactive_anon:2 isolated_anon:0
 active_file:34029 inactive_file:35732 isolated_file:2898
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824040 slab_unreclaimable:62296649
 mapped:1588 shmem:0 pagetables:1435 bounce:0
 free:590344 free_pcp:0 free_cma:0
[7664718.792891] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664718.834648] lowmem_reserve[]: 0 1418 63868 63868
[7664718.839579] Node 0 DMA32 free:261336kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:772kB inactive_file:2020kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686280kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:126997 all_unreclaimable? no
[7664718.884314] lowmem_reserve[]: 0 0 62450 62450
[7664718.888989] Node 0 Normal free:508756kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:45552kB inactive_file:42564kB unevictable:168kB isolated(anon):0kB isolated(file):7424kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243584kB kernel_stack:5984kB pagetables:1848kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:808835 all_unreclaimable? no
[7664718.935771] lowmem_reserve[]: 0 0 0 0
[7664718.939751] Node 1 Normal free:525172kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16896kB inactive_file:16456kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711288kB slab_unreclaimable:63411332kB kernel_stack:20816kB pagetables:1536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:120605 all_unreclaimable? yes
[7664718.986628] lowmem_reserve[]: 0 0 0 0
[7664718.990599] Node 2 Normal free:524968kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:4kB active_file:31084kB inactive_file:38980kB unevictable:8680kB isolated(anon):0kB isolated(file):1736kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715204kB slab_unreclaimable:62476108kB kernel_stack:7760kB pagetables:568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:150759 all_unreclaimable? yes
[7664719.037639] lowmem_reserve[]: 0 0 0 0
[7664719.041621] Node 3 Normal free:525240kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:4kB active_file:42120kB inactive_file:43440kB unevictable:840kB isolated(anon):0kB isolated(file):1024kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369292kB kernel_stack:4208kB pagetables:1780kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:973906 all_unreclaimable? yes
[7664719.088530] lowmem_reserve[]: 0 0 0 0
[7664719.092507] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664719.107392] Node 0 DMA32: 408*4kB (UEM) 408*8kB (UEM) 1213*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261680kB
[7664719.124002] Node 0 Normal: 6581*4kB (UEM) 5723*8kB (UEM) 3923*16kB (UEM) 4479*32kB (EM) 2040*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508860kB
[7664719.140818] Node 1 Normal: 88012*4kB (UEM) 21632*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525120kB
[7664719.154404] Node 2 Normal: 27414*4kB (UEM) 40178*8kB (UEM) 891*16kB (UEM) 1670*32kB (UEM) 409*64kB (EM) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525080kB
[7664719.170319] Node 3 Normal: 131309*4kB (UEM) 7*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525292kB
[7664719.183186] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664719.192068] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664719.200693] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664719.209586] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664719.218202] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664719.227076] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664719.235693] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664719.244569] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664719.253199] 74029 total pagecache pages
[7664719.257215] 0 pages in swap cache
[7664719.260705] Swap cache stats: add 21120714, delete 21136686, find 4513434/7609935
[7664719.268364] Free swap  = 3105332kB
[7664719.271949] Total swap = 4194300kB
[7664719.275546] 66993253 pages RAM
[7664719.278784] 0 pages HighMem/MovableOnly
[7664719.282800] 1101945 pages reserved
[7664719.650569] ll_ost_io01_029 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664719.659016] ll_ost_io01_029 cpuset=/ mems_allowed=1
[7664719.664080] CPU: 1 PID: 123076 Comm: ll_ost_io01_029 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664719.677459] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664719.685292] Call Trace:
[7664719.687924]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664719.693239]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664719.698729]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664719.704570]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664719.710581]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664719.716935]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664719.723038]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664719.728793]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664719.735328]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664719.741862]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664719.748050]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664719.754063]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664719.760170]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664719.767056]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664719.773228]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664719.780331]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664719.786896]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664719.793545]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664719.800894]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664719.807636]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664719.814984]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664719.822504]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664719.829416]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664719.836505]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664719.844258]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664719.851515]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664719.859383]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664719.866352]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664719.871790]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664719.878265]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664719.885835]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664719.890894]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664719.897162]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664719.903781]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664719.910044] Mem-Info:
[7664719.912506] active_anon:0 inactive_anon:1 isolated_anon:0
 active_file:34421 inactive_file:36346 isolated_file:2034
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824042 slab_unreclaimable:62296618
 mapped:1588 shmem:0 pagetables:1435 bounce:0
 free:590253 free_pcp:0 free_cma:0
[7664719.946778] Node 1 Normal free:525100kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:17364kB inactive_file:16176kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711288kB slab_unreclaimable:63411332kB kernel_stack:20816kB pagetables:1536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:120605 all_unreclaimable? yes
[7664719.993637] lowmem_reserve[]: 0 0 0 0
[7664719.997606] Node 1 Normal: 88012*4kB (UEM) 21632*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525120kB
[7664720.011136] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664720.020003] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664720.028608] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664720.037474] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664720.038678] LustreError: 8709:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c2ff3c81a00
[7664720.056942] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664720.061624] LustreError: 3045:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c40016e1800
[7664720.076676] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664720.085289] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664720.094155] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664720.102760] 74038 total pagecache pages
[7664720.106774] 0 pages in swap cache
[7664720.110265] Swap cache stats: add 21120716, delete 21136688, find 4513435/7609937
[7664720.117918] Free swap  = 3106356kB
[7664720.121498] Total swap = 4194300kB
[7664720.125077] 66993253 pages RAM
[7664720.128309] 0 pages HighMem/MovableOnly
[7664720.132323] 1101945 pages reserved
[7664720.135901] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664720.143956] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664720.152917] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664720.161706] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664720.170292] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664720.178469] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664720.187083] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664720.195352] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664720.203458] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664720.211986] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664720.220942] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664720.229289] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664720.237552] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664720.246422] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664720.254422] [53969]     0 53969    31572      205      20      168             0 crond
[7664720.262508] [54035]     0 54035    27526      164      10       33             0 agetty
[7664720.270681] [54036]     0 54036    27526      158      11       33             0 agetty
[7664720.278858] [54186]     0 54186    22934      210      46      273             0 master
[7664720.287034] [54206]    89 54206    25545      272      47      271             0 qmgr
[7664720.295124] [36317]     0 36317    28294      187      14       61             0 bash
[7664720.303130] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664720.311657] [36329]     0 36329    28177      160      14       55             0 grep
[7664720.319765] [76204]    89 76204    25501      252      46      282             0 pickup
[7664720.327947] [97173]     0 97173    48653      264      49      262             0 crond
[7664720.336038] [97872]     0 97872    48653      263      49      263             0 crond
[7664720.344137] [98579]     0 98579    48653      266      49      236             0 crond
[7664720.352230] [99292]     0 99292    48653      257      49      261             0 crond
[7664720.360322] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664720.368581] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664720.377534] [100032]     0 100032    48653      266      49      240             0 crond
[7664720.385791] [100105]    89 100105    25553      264      47      274             0 smtp
[7664720.393966] [100203]     0 100203    30816      185      17      334             0 python3
[7664720.402407] Out of memory: Kill process 54206 (qmgr) score 0 or sacrifice child
[7664720.409884] Killed process 54206 (qmgr) total-vm:102180kB, anon-rss:0kB, file-rss:1088kB, shmem-rss:0kB
[7664720.752886] ll_ost_io00_081 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664720.761328] ll_ost_io00_081 cpuset=/ mems_allowed=0
[7664720.766387] CPU: 12 PID: 90690 Comm: ll_ost_io00_081 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664720.779766] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664720.787590] Call Trace:
[7664720.790228]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664720.795540]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664720.801029]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664720.806869]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664720.812624]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664720.818638]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664720.824992]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664720.831091]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664720.836837]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664720.843364]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664720.849890]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664720.856070]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664720.862074]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664720.868189]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664720.875075]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664720.881223]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664720.888320]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664720.894890]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664720.901536]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664720.908882]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664720.915621]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664720.922971]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664720.930497]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664720.937414]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664720.944500]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664720.952253]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664720.959509]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664720.967376]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664720.974370]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664720.982316]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664720.989571]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664720.996054]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664721.003623]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664721.008682]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664721.014950]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664721.021568]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664721.027838] Mem-Info:
[7664721.030310] active_anon:0 inactive_anon:1 isolated_anon:0
 active_file:34598 inactive_file:35915 isolated_file:3581
 unevictable:9044 dirty:0 writeback:10 unstable:0
 slab_reclaimable:824042 slab_unreclaimable:62296621
 mapped:1588 shmem:0 pagetables:1350 bounce:0
 free:590093 free_pcp:0 free_cma:0
[7664721.064666] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664721.097855] LustreError: 90700:0:(ldlm_lib.c:3262:target_bulk_io()) @@@ network error on bulk READ  req@ffff9c467d0ab050 x1659494269050880/t0(0) o3->164e843a-d84a-4@10.50.5.36@o2ib2:518/0 lens 488/440 e 2 to 0 dl 1583650768 ref 1 fl Interpret:/0/0 rc 0/0
[7664721.097858] LustreError: 90700:0:(ldlm_lib.c:3262:target_bulk_io()) Skipped 1 previous similar message
[7664721.097957] Lustre: fir-OST001b: Bulk IO write error with c2ca4c5a-e67e-4 (at 10.50.5.43@o2ib2), client will retry: rc = -110
[7664721.150010] lowmem_reserve[]: 0 1418 63868 63868
[7664721.154932] Node 0 DMA32 free:261268kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:804kB inactive_file:2236kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:4kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686220kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:53673 all_unreclaimable? yes
[7664721.187587] LustreError: 6894:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c4a91670200
[7664721.195610] LustreError: 6894:0:(ldlm_lib.c:3246:target_bulk_io()) @@@ timeout on bulk READ after -5+5s  req@ffff9c506a930850 x1659467991533568/t0(0) o3->fb2c1382-8f5a-4@10.50.15.10@o2ib2:487/0 lens 488/440 e 0 to 0 dl 1583650737 ref 2 fl Interpret:/0/0 rc 0/0
[7664721.195613] LustreError: 6894:0:(ldlm_lib.c:3246:target_bulk_io()) Skipped 1 previous similar message
[7664721.195645] Lustre: 6894:0:(service.c:2165:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (45:5s); client may timeout.  req@ffff9c506a930850 x1659467991533568/t0(0) o3->fb2c1382-8f5a-4@10.50.15.10@o2ib2:487/0 lens 488/440 e 0 to 0 dl 1583650737 ref 2 fl Complete:/0/ffffffff rc -110/-1
[7664721.270463] lowmem_reserve[]: 0 0 62450 62450
[7664721.275133] Node 0 Normal free:509008kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:4kB active_file:40288kB inactive_file:43784kB unevictable:168kB isolated(anon):0kB isolated(file):3968kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:4kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243556kB kernel_stack:5984kB pagetables:1848kB unstable:0kB bounce:0kB free_pcp:176kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:70135 all_unreclaimable? no
[7664721.287912] LustreError: 119547:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c433342ba00
[7664721.287921] LustreError: 119547:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c433342ba00
[7664721.344051] lowmem_reserve[]: 0 0 0 0
[7664721.348019] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664721.362855] Node 0 DMA32: 398*4kB (UEM) 409*8kB (UEM) 1214*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261664kB
[7664721.379350] Node 0 Normal: 6477*4kB (UEM) 5812*8kB (UEM) 3930*16kB (UEM) 4480*32kB (UEM) 2040*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 509300kB
[7664721.396193] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664721.405064] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664721.413671] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664721.422537] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664721.431142] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664721.440010] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664721.448614] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664721.457481] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664721.466088] 74015 total pagecache pages
[7664721.470101] 0 pages in swap cache
[7664721.473592] Swap cache stats: add 21120722, delete 21136694, find 4513438/7609944
[7664721.481244] Free swap  = 3107636kB
[7664721.484825] Total swap = 4194300kB
[7664721.488405] 66993253 pages RAM
[7664721.491636] 0 pages HighMem/MovableOnly
[7664721.495648] 1101945 pages reserved
[7664721.499227] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664721.507276] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664721.516237] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664721.525032] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664721.533628] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664721.541804] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664721.550418] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664721.558687] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664721.566781] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664721.575307] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664721.584260] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664721.592608] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664721.600879] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664721.609752] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664721.617759] [53969]     0 53969    31572      205      20      168             0 crond
[7664721.625851] [54035]     0 54035    27526      164      10       33             0 agetty
[7664721.634026] [54036]     0 54036    27526      158      11       33             0 agetty
[7664721.642199] [54186]     0 54186    22934      210      46      273             0 master
[7664721.650494] [36317]     0 36317    28294      187      14       61             0 bash
[7664721.658500] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664721.667020] [36329]     0 36329    28177      160      14       55             0 grep
[7664721.675127] [76204]    89 76204    25501      251      46      283             0 pickup
[7664721.683310] [97173]     0 97173    48653      264      49      262             0 crond
[7664721.691403] [97872]     0 97872    48653      263      49      264             0 crond
[7664721.699496] [98579]     0 98579    48653      266      49      236             0 crond
[7664721.707591] [99292]     0 99292    48653      257      49      261             0 crond
[7664721.715686] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664721.723952] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664721.732906] [100032]     0 100032    48653      266      49      240             0 crond
[7664721.741172] [100105]    89 100105    25553      264      47      274             0 smtp
[7664721.749345] [100203]     0 100203    30816      185      17      335             0 python3
[7664721.757776] Out of memory: Kill process 100105 (smtp) score 0 or sacrifice child
[7664721.765342] Killed process 100105 (smtp) total-vm:102212kB, anon-rss:0kB, file-rss:1056kB, shmem-rss:0kB
[7664721.806534] ll_ost_io00_081 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664721.814972] ll_ost_io00_081 cpuset=/ mems_allowed=0
[7664721.820036] CPU: 36 PID: 90690 Comm: ll_ost_io00_081 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664721.833412] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664721.841238] Call Trace:
[7664721.843872]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664721.849185]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664721.854678]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664721.860516]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664721.866270]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664721.872284]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664721.878635]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664721.884729]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664721.890477]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664721.897008]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664721.903538]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664721.909725]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664721.915730]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664721.921837]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664721.928723]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664721.934871]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664721.941964]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664721.948523]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664721.955162]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664721.962505]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664721.969239]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664721.976579]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664721.984099]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664721.991016]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664721.998104]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664722.005855]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664722.013112]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664722.020980]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664722.027972]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664722.035919]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664722.043175]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664722.049650]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664722.057226]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664722.062284]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664722.068551]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664722.075164]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664722.081427] Mem-Info:
[7664722.083894] active_anon:0 inactive_anon:2 isolated_anon:0
 active_file:33366 inactive_file:35866 isolated_file:3136
 unevictable:9044 dirty:0 writeback:9 unstable:0
 slab_reclaimable:824042 slab_unreclaimable:62296614
 mapped:1588 shmem:0 pagetables:1350 bounce:0
 free:590438 free_pcp:0 free_cma:0
[7664722.118163] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664722.159919] lowmem_reserve[]: 0 1418 63868 63868
[7664722.164847] Node 0 DMA32 free:261332kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:816kB inactive_file:2316kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:4kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686224kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:51063 all_unreclaimable? yes
[7664722.209546] lowmem_reserve[]: 0 0 62450 62450
[7664722.214212] Node 0 Normal free:508688kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:43396kB inactive_file:46968kB unevictable:168kB isolated(anon):0kB isolated(file):4864kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:4kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243556kB kernel_stack:5984kB pagetables:1848kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1515561 all_unreclaimable? yes
[7664722.245705] LustreError: 123083:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c16417cd800
[7664722.272200] lowmem_reserve[]: 0 0 0 0
[7664722.276170] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664722.291007] Node 0 DMA32: 393*4kB (UEM) 409*8kB (UEM) 1214*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261644kB
[7664722.307499] Node 0 Normal: 6383*4kB (UEM) 5790*8kB (UEM) 3923*16kB (UEM) 4480*32kB (UEM) 2040*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508636kB
[7664722.324345] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664722.333216] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664722.341824] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664722.350688] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664722.359295] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664722.368161] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664722.376770] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664722.385640] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664722.394249] 73903 total pagecache pages
[7664722.398260] 0 pages in swap cache
[7664722.401754] Swap cache stats: add 21120724, delete 21136696, find 4513439/7609946
[7664722.409405] Free swap  = 3108660kB
[7664722.412983] Total swap = 4194300kB
[7664722.416565] 66993253 pages RAM
[7664722.419796] 0 pages HighMem/MovableOnly
[7664722.423808] 1101945 pages reserved
[7664722.427388] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664722.435434] [ 5686]     0  5686    16012      237      39      105             0 systemd-journal
[7664722.444386] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664722.453177] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664722.461773] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664722.469957] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664722.478568] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664722.486835] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664722.494923] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664722.503450] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664722.512404] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664722.520759] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664722.529028] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664722.537902] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664722.545913] [53969]     0 53969    31572      205      20      168             0 crond
[7664722.554003] [54035]     0 54035    27526      164      10       33             0 agetty
[7664722.562176] [54036]     0 54036    27526      158      11       33             0 agetty
[7664722.570350] [54186]     0 54186    22934      210      46      273             0 master
[7664722.578641] [36317]     0 36317    28294      187      14       61             0 bash
[7664722.586642] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664722.595163] [36329]     0 36329    28177      160      14       55             0 grep
[7664722.603270] [76204]    89 76204    25501      251      46      283             0 pickup
[7664722.611454] [97173]     0 97173    48653      264      49      262             0 crond
[7664722.619543] [97872]     0 97872    48653      263      49      264             0 crond
[7664722.627639] [98579]     0 98579    48653      266      49      236             0 crond
[7664722.635735] [99292]     0 99292    48653      257      49      261             0 crond
[7664722.643826] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664722.652088] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664722.661047] [100032]     0 100032    48653      266      49      240             0 crond
[7664722.669306] [100203]     0 100203    30816      185      17      335             0 python3
[7664722.677738] Out of memory: Kill process 76204 (pickup) score 0 or sacrifice child
[7664722.685389] Killed process 76204 (pickup) total-vm:102004kB, anon-rss:0kB, file-rss:1004kB, shmem-rss:0kB
[7664723.140797] pickup: page allocation failure: order:0, mode:0x200da
[7664723.147160] CPU: 16 PID: 76204 Comm: pickup Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664723.159761] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664723.167594] Call Trace:
[7664723.170226]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664723.175544]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664723.181643]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664723.187217]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664723.193231]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664723.199756]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664723.206285]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664723.212124]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664723.218736]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664723.225002]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664723.230926]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664723.236934]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664723.242855]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664723.248775]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664723.254348]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664723.259659] Mem-Info:
[7664723.262136] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:33358 inactive_file:34610 isolated_file:3744
 unevictable:9044 dirty:0 writeback:9 unstable:0
 slab_reclaimable:824043 slab_unreclaimable:62296614
 mapped:1589 shmem:0 pagetables:1350 bounce:0
 free:590279 free_pcp:0 free_cma:0
[7664723.296402] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664723.338154] lowmem_reserve[]: 0 1418 63868 63868
[7664723.343077] Node 0 DMA32 free:261320kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:828kB inactive_file:2460kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:4kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686224kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:49546 all_unreclaimable? no
[7664723.387693] lowmem_reserve[]: 0 0 62450 62450
[7664723.392358] Node 0 Normal free:508408kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:45144kB inactive_file:47292kB unevictable:168kB isolated(anon):0kB isolated(file):2560kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:4kB mapped:172kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243556kB kernel_stack:5856kB pagetables:1848kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:469191 all_unreclaimable? yes
[7664723.439222] lowmem_reserve[]: 0 0 0 0
[7664723.443197] Node 1 Normal free:525244kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:17036kB inactive_file:16396kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711288kB slab_unreclaimable:63411332kB kernel_stack:20816kB pagetables:1536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:72968 all_unreclaimable? yes
[7664723.489976] lowmem_reserve[]: 0 0 0 0
[7664723.493951] Node 2 Normal free:524876kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32160kB inactive_file:36052kB unevictable:8680kB isolated(anon):0kB isolated(file):3072kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:20kB mapped:5332kB shmem:0kB slab_reclaimable:715216kB slab_unreclaimable:62476080kB kernel_stack:7760kB pagetables:568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:853313 all_unreclaimable? yes
[7664723.541077] lowmem_reserve[]: 0 0 0 0
[7664723.545058] Node 3 Normal free:525368kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:4kB active_file:35180kB inactive_file:34176kB unevictable:840kB isolated(anon):0kB isolated(file):13568kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:8kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369264kB kernel_stack:4208kB pagetables:1440kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1115845 all_unreclaimable? yes
[7664723.592090] lowmem_reserve[]: 0 0 0 0
[7664723.596059] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664723.610895] Node 0 DMA32: 391*4kB (UEM) 409*8kB (UEM) 1214*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261636kB
[7664723.627391] Node 0 Normal: 6386*4kB (UEM) 5791*8kB (UEM) 3938*16kB (UEM) 4480*32kB (UEM) 2040*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508896kB
[7664723.644228] Node 1 Normal: 88055*4kB (UEM) 21632*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525292kB
[7664723.657758] Node 2 Normal: 27350*4kB (EM) 40188*8kB (EM) 906*16kB (UEM) 1672*32kB (UEM) 409*64kB (EM) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525208kB
[7664723.673356] Node 3 Normal: 131478*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525912kB
[7664723.685796] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664723.694664] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664723.703269] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664723.712138] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664723.720748] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664723.729615] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664723.738223] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664723.747087] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664723.755692] 74066 total pagecache pages
[7664723.759707] 0 pages in swap cache
[7664723.763198] Swap cache stats: add 21120724, delete 21136696, find 4513439/7609946
[7664723.770851] Free swap  = 3108660kB
[7664723.774431] Total swap = 4194300kB
[7664723.778012] 66993253 pages RAM
[7664723.781242] 0 pages HighMem/MovableOnly
[7664723.785256] 1101945 pages reserved
[7664724.197228] ll_ost_io00_068 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664724.205666] ll_ost_io00_068 cpuset=/ mems_allowed=0
[7664724.210729] CPU: 32 PID: 96096 Comm: ll_ost_io00_068 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664724.224106] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664724.231932] Call Trace:
[7664724.234569]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664724.239882]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664724.245372]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664724.251214]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664724.256965]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664724.262971]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664724.269322]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664724.275414]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664724.281161]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664724.287693]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664724.294223]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664724.300402]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664724.306409]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664724.312518]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664724.319401]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664724.325552]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664724.332646]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664724.339177]  [<ffffffffa021bd89>] ? ___slab_alloc+0x209/0x4f0
[7664724.345130]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664724.351770]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664724.359112]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664724.365855]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664724.373202]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664724.380721]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664724.387638]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664724.394724]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664724.402476]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664724.409734]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664724.417594]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664724.424564]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664724.430005]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664724.436480]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664724.444049]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664724.449106]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664724.455376]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664724.461993]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664724.468257] Mem-Info:
[7664724.470724] active_anon:0 inactive_anon:2 isolated_anon:0
 active_file:35173 inactive_file:34551 isolated_file:4201
 unevictable:9044 dirty:0 writeback:9 unstable:0
 slab_reclaimable:824043 slab_unreclaimable:62296614
 mapped:1589 shmem:0 pagetables:1350 bounce:0
 free:590173 free_pcp:0 free_cma:0
[7664724.505003] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664724.546754] lowmem_reserve[]: 0 1418 63868 63868
[7664724.551677] Node 0 DMA32 free:261320kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:836kB inactive_file:3032kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:4kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686224kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:83449 all_unreclaimable? yes
[7664724.596372] lowmem_reserve[]: 0 0 62450 62450
[7664724.601034] Node 0 Normal free:508052kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:8kB active_file:42912kB inactive_file:43176kB unevictable:168kB isolated(anon):0kB isolated(file):9856kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:4kB mapped:172kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243556kB kernel_stack:5856kB pagetables:1848kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1788581 all_unreclaimable? yes
[7664724.642808] LustreError: 8713:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c184e9f9400
[7664724.658832] lowmem_reserve[]: 0 0 0 0
[7664724.662800] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664724.677638] Node 0 DMA32: 390*4kB (UEM) 409*8kB (UEM) 1214*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261632kB
[7664724.694130] Node 0 Normal: 6481*4kB (UEM) 5792*8kB (UEM) 3938*16kB (UEM) 4480*32kB (UEM) 2040*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 509284kB
[7664724.710973] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664724.719838] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664724.728444] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664724.737311] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664724.745917] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664724.754784] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664724.763387] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664724.772265] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664724.780869] 74237 total pagecache pages
[7664724.784883] 1 pages in swap cache
[7664724.788374] Swap cache stats: add 21120731, delete 21136702, find 4513442/7609953
[7664724.796026] Free swap  = 3109676kB
[7664724.799605] Total swap = 4194300kB
[7664724.803187] 66993253 pages RAM
[7664724.806418] 0 pages HighMem/MovableOnly
[7664724.810430] 1101945 pages reserved
[7664724.814013] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664724.822058] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664724.831021] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664724.839815] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664724.848411] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664724.856587] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664724.865202] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664724.873467] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664724.881562] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664724.890081] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664724.899034] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664724.907381] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664724.915650] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664724.924526] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664724.932535] [53969]     0 53969    31572      205      20      168             0 crond
[7664724.940627] [54035]     0 54035    27526      164      10       33             0 agetty
[7664724.948809] [54036]     0 54036    27526      158      11       33             0 agetty
[7664724.956989] [54186]     0 54186    22934      209      46      274             0 master
[7664724.965278] [36317]     0 36317    28294      187      14       61             0 bash
[7664724.973281] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664724.981801] [36329]     0 36329    28177      160      14       55             0 grep
[7664724.989916] [97173]     0 97173    48653      264      49      262             0 crond
[7664724.998010] [97872]     0 97872    48653      263      49      264             0 crond
[7664725.006106] [98579]     0 98579    48653      266      49      236             0 crond
[7664725.014199] [99292]     0 99292    48653      257      49      261             0 crond
[7664725.022294] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664725.030561] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664725.039523] [100032]     0 100032    48653      266      49      240             0 crond
[7664725.047794] [100203]     0 100203    30816      185      17      335             0 python3
[7664725.056232] Out of memory: Kill process 97872 (crond) score 0 or sacrifice child
[7664725.063798] Killed process 97872 (crond) total-vm:194612kB, anon-rss:0kB, file-rss:1052kB, shmem-rss:0kB
[7664725.151815] crond: page allocation failure: order:0, mode:0x200da
[7664725.158088] CPU: 16 PID: 97872 Comm: crond Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664725.170600] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664725.178431] Call Trace:
[7664725.181065]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664725.186384]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664725.192481]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664725.198058]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664725.204070]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664725.210597]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664725.217131]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664725.222965]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664725.229587]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664725.235861]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664725.241786]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664725.247793]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664725.253721]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664725.259640]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664725.265213]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664725.270524] Mem-Info:
[7664725.273003] active_anon:0 inactive_anon:6 isolated_anon:0
 active_file:33644 inactive_file:35741 isolated_file:4222
 unevictable:9044 dirty:0 writeback:9 unstable:0
 slab_reclaimable:824043 slab_unreclaimable:62296614
 mapped:1590 shmem:0 pagetables:1303 bounce:0
 free:590130 free_pcp:0 free_cma:0
[7664725.307278] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664725.349029] lowmem_reserve[]: 0 1418 63868 63868
[7664725.353952] Node 0 DMA32 free:261312kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:844kB inactive_file:2572kB unevictable:0kB isolated(anon):0kB isolated(file):512kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:4kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686224kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:186335 all_unreclaimable? no
[7664725.398824] lowmem_reserve[]: 0 0 62450 62450
[7664725.403490] Node 0 Normal free:508460kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:16kB active_file:44368kB inactive_file:47692kB unevictable:168kB isolated(anon):0kB isolated(file):4352kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:4kB mapped:172kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243556kB kernel_stack:5856kB pagetables:1836kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:391143 all_unreclaimable? yes
[7664725.450439] lowmem_reserve[]: 0 0 0 0
[7664725.454410] Node 1 Normal free:525244kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:17052kB inactive_file:16380kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711288kB slab_unreclaimable:63411332kB kernel_stack:20816kB pagetables:1536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:72968 all_unreclaimable? yes
[7664725.501193] lowmem_reserve[]: 0 0 0 0
[7664725.505163] Node 2 Normal free:524868kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32500kB inactive_file:38684kB unevictable:8680kB isolated(anon):0kB isolated(file):1152kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:20kB mapped:5332kB shmem:0kB slab_reclaimable:715216kB slab_unreclaimable:62476080kB kernel_stack:7760kB pagetables:568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:411619 all_unreclaimable? yes
[7664725.552285] lowmem_reserve[]: 0 0 0 0
[7664725.556260] Node 3 Normal free:524708kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:8kB active_file:40344kB inactive_file:38176kB unevictable:840kB isolated(anon):0kB isolated(file):9208kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:8kB mapped:852kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369264kB kernel_stack:4208kB pagetables:1264kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:607905 all_unreclaimable? yes
[7664725.603121] lowmem_reserve[]: 0 0 0 0
[7664725.607088] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664725.621926] Node 0 DMA32: 389*4kB (UEM) 409*8kB (UEM) 1214*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261628kB
[7664725.638420] Node 0 Normal: 6481*4kB (UEM) 5792*8kB (UEM) 3938*16kB (UEM) 4480*32kB (UEM) 2040*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 509284kB
[7664725.655260] Node 1 Normal: 88055*4kB (UEM) 21632*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525292kB
[7664725.668787] Node 2 Normal: 27350*4kB (EM) 40188*8kB (EM) 906*16kB (UEM) 1672*32kB (UEM) 409*64kB (EM) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525208kB
[7664725.684388] Node 3 Normal: 131237*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524948kB
[7664725.696739] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664725.705606] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664725.714221] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664725.723087] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664725.731692] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664725.740559] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664725.749166] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664725.749749] Lustre: 123039:0:(service.c:1322:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-10s), not sending early reply. Consider increasing at_early_margin (5)?  req@ffff9c506a930850 x1659467991533568/t0(0) o3->fb2c1382-8f5a-4@10.50.15.10@o2ib2:487/0 lens 488/440 e 0 to 0 dl 1583650737 ref 1 fl Complete:/0/ffffffff rc -110/-1
[7664725.788229] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664725.796842] 74261 total pagecache pages
[7664725.800855] 0 pages in swap cache
[7664725.804346] Swap cache stats: add 21120731, delete 21136703, find 4513442/7609953
[7664725.811998] Free swap  = 3109676kB
[7664725.815578] Total swap = 4194300kB
[7664725.819159] 66993253 pages RAM
[7664725.822391] 0 pages HighMem/MovableOnly
[7664725.826402] 1101945 pages reserved
[7664726.008327] ll_ost_io01_005 invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
[7664726.016511] ll_ost_io01_005 cpuset=/ mems_allowed=1
[7664726.021574] CPU: 21 PID: 119516 Comm: ll_ost_io01_005 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664726.035042] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664726.042875] Call Trace:
[7664726.045516]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664726.050834]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664726.056329]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664726.062167]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664726.067915]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664726.073926]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664726.080280]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664726.086376]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664726.092127]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664726.097982] LustreError: 3109:0:(ldlm_lib.c:3262:target_bulk_io()) @@@ network error on bulk READ  req@ffff9c3f875a2050 x1659489523086912/t0(0) o3->f7fe261e-a413-4@10.49.28.2@o2ib1:520/0 lens 488/440 e 2 to 0 dl 1583650770 ref 1 fl Interpret:/0/0 rc 0/0
[7664726.097986] LustreError: 3109:0:(ldlm_lib.c:3262:target_bulk_io()) Skipped 4 previous similar messages
[7664726.130688]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664726.137224]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664726.143480]  [<ffffffffc124293f>] tgt_checksum_niobuf_rw+0xbf/0xe00 [ptlrpc]
[7664726.150741]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664726.158072]  [<ffffffffc0cb71e0>] ? obd_dif_crc_fn+0x20/0x20 [obdclass]
[7664726.164900]  [<ffffffffc1247325>] tgt_brw_read+0xc35/0x1e50 [ptlrpc]
[7664726.171466]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664726.178103]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664726.185454]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664726.192802]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664726.200324]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664726.207237]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664726.214325]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664726.222077]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664726.229332]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664726.237193]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664726.244162]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664726.249606]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664726.256087]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664726.263662]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664726.268722]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664726.274989]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664726.281608]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664726.287875] Mem-Info:
[7664726.290334] active_anon:0 inactive_anon:2 isolated_anon:0
 active_file:32627 inactive_file:36148 isolated_file:4093
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824044 slab_unreclaimable:62296614
 mapped:1589 shmem:0 pagetables:1161 bounce:0
 free:590278 free_pcp:0 free_cma:0
[7664726.324600] Node 1 Normal free:525316kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16768kB inactive_file:16488kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711288kB slab_unreclaimable:63411332kB kernel_stack:20816kB pagetables:1536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:497018 all_unreclaimable? yes
[7664726.371467] lowmem_reserve[]: 0 0 0 0
[7664726.375432] Node 1 Normal: 88058*4kB (UEM) 21634*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525320kB
[7664726.388965] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664726.397832] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664726.406446] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664726.415310] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664726.423920] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664726.432791] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664726.441398] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664726.450263] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664726.458870] 74410 total pagecache pages
[7664726.462884] 0 pages in swap cache
[7664726.466375] Swap cache stats: add 21120733, delete 21136705, find 4513443/7609955
[7664726.474028] Free swap  = 3110188kB
[7664726.477605] Total swap = 4194300kB
[7664726.481188] 66993253 pages RAM
[7664726.484419] 0 pages HighMem/MovableOnly
[7664726.488431] 1101945 pages reserved
[7664726.492010] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664726.500061] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664726.509018] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664726.517803] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664726.526383] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664726.534560] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664726.543175] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664726.551444] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664726.559537] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664726.568065] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664726.577029] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664726.585380] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664726.593642] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664726.602516] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664726.610525] [53969]     0 53969    31572      205      20      168             0 crond
[7664726.618624] [54035]     0 54035    27526      164      10       33             0 agetty
[7664726.626798] [54036]     0 54036    27526      158      11       33             0 agetty
[7664726.634972] [54186]     0 54186    22934      209      46      274             0 master
[7664726.643239] [36317]     0 36317    28294      187      14       61             0 bash
[7664726.651240] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664726.659767] [36329]     0 36329    28177      160      14       55             0 grep
[7664726.667879] [97173]     0 97173    48653      264      49      262             0 crond
[7664726.675972] [98579]     0 98579    48653      266      49      236             0 crond
[7664726.684061] [99292]     0 99292    48653      257      49      261             0 crond
[7664726.692158] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664726.700425] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664726.709391] [100032]     0 100032    48653      266      49      240             0 crond
[7664726.717656] [100203]     0 100203    30816      185      17      335             0 python3
[7664726.726094] Out of memory: Kill process 97173 (crond) score 0 or sacrifice child
[7664726.733661] Killed process 97173 (crond) total-vm:194612kB, anon-rss:0kB, file-rss:1056kB, shmem-rss:0kB
[7664726.796453] crond: page allocation failure: order:0, mode:0x200da
[7664726.802730] CPU: 32 PID: 97173 Comm: crond Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664726.815241] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664726.823064] Call Trace:
[7664726.825702]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664726.831024]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664726.837124]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664726.842700]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664726.848711]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664726.855236]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664726.861764]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664726.867598]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664726.874218]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664726.880484]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664726.886407]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664726.892419]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664726.898343]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664726.904269]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664726.909847]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664726.915158] Mem-Info:
[7664726.917638] active_anon:0 inactive_anon:1 isolated_anon:0
 active_file:33428 inactive_file:36293 isolated_file:3392
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824045 slab_unreclaimable:62296623
 mapped:1588 shmem:0 pagetables:1161 bounce:0
 free:590387 free_pcp:0 free_cma:0
[7664726.951909] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664726.993661] lowmem_reserve[]: 0 1418 63868 63868
[7664726.998586] Node 0 DMA32 free:261348kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:888kB inactive_file:2784kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686204kB kernel_stack:384kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:58954 all_unreclaimable? yes
[7664727.043279] lowmem_reserve[]: 0 0 62450 62450
[7664727.047943] Node 0 Normal free:508404kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:4kB active_file:45620kB inactive_file:49060kB unevictable:168kB isolated(anon):0kB isolated(file):1792kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243608kB kernel_stack:5856kB pagetables:1336kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:548345 all_unreclaimable? yes
[7664727.094805] lowmem_reserve[]: 0 0 0 0
[7664727.098785] Node 1 Normal free:525320kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16240kB inactive_file:17196kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711288kB slab_unreclaimable:63411332kB kernel_stack:20816kB pagetables:1536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:497018 all_unreclaimable? yes
[7664727.145668] lowmem_reserve[]: 0 0 0 0
[7664727.149640] Node 2 Normal free:525408kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32088kB inactive_file:39344kB unevictable:8680kB isolated(anon):0kB isolated(file):256kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715224kB slab_unreclaimable:62476080kB kernel_stack:7760kB pagetables:552kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:163371 all_unreclaimable? yes
[7664727.196616] lowmem_reserve[]: 0 0 0 0
[7664727.200591] Node 3 Normal free:525172kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:37268kB inactive_file:37800kB unevictable:840kB isolated(anon):0kB isolated(file):12544kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369268kB kernel_stack:4208kB pagetables:1212kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1256385 all_unreclaimable? yes
[7664727.247617] lowmem_reserve[]: 0 0 0 0
[7664727.251582] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664727.266422] Node 0 DMA32: 352*4kB (UEM) 412*8kB (UEM) 1214*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261504kB
[7664727.282914] Node 0 Normal: 6356*4kB (UEM) 5763*8kB (UEM) 3922*16kB (UEM) 4479*32kB (EM) 2040*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508264kB
[7664727.299668] Node 1 Normal: 88058*4kB (UEM) 21634*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525320kB
[7664727.313195] Node 2 Normal: 27371*4kB (UEM) 40191*8kB (UEM) 915*16kB (UEM) 1672*32kB (UEM) 409*64kB (EM) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525460kB
[7664727.328969] Node 3 Normal: 131362*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525448kB
[7664727.341408] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664727.350275] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664727.358880] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664727.367746] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664727.376353] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664727.385219] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664727.393824] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664727.402690] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664727.411296] 74308 total pagecache pages
[7664727.415309] 0 pages in swap cache
[7664727.418802] Swap cache stats: add 21120733, delete 21136705, find 4513443/7609955
[7664727.426453] Free swap  = 3110188kB
[7664727.430033] Total swap = 4194300kB
[7664727.433614] 66993253 pages RAM
[7664727.436845] 0 pages HighMem/MovableOnly
[7664727.440857] 1101945 pages reserved
[7664728.267989] ll_ost_io02_074 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664728.276432] ll_ost_io02_074 cpuset=/ mems_allowed=2
[7664728.281489] CPU: 14 PID: 83183 Comm: ll_ost_io02_074 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664728.294866] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664728.302690] Call Trace:
[7664728.305326]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664728.310648]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664728.316138]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664728.321978]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664728.327991]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664728.334345]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664728.340440]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664728.346191]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664728.352719]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664728.359255]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664728.365441]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664728.371446]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664728.377556]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664728.384438]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664728.391661]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664728.397829]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664728.404455]  [<ffffffffc0a844f5>] ? lnet_try_match_md+0x1e5/0x330 [lnet]
[7664728.411334]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664728.417085]  [<ffffffffa00dca58>] ? __enqueue_entity+0x78/0x80
[7664728.423091]  [<ffffffffa00e367f>] ? enqueue_entity+0x2ef/0xbe0
[7664728.429098]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664728.434631]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664728.441725]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664728.449473]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664728.456731]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664728.464598]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664728.471601]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664728.479562]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664728.486819]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664728.493291]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664728.500861]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664728.505919]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664728.512188]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664728.518810]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664728.525081] Mem-Info:
[7664728.527541] active_anon:0 inactive_anon:1 isolated_anon:0
 active_file:34268 inactive_file:35841 isolated_file:3424
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824045 slab_unreclaimable:62296619
 mapped:1588 shmem:0 pagetables:1112 bounce:0
 free:590389 free_pcp:0 free_cma:0
[7664728.561810] Node 2 Normal free:525456kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32088kB inactive_file:39600kB unevictable:8680kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715224kB slab_unreclaimable:62476064kB kernel_stack:7760kB pagetables:536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:186618 all_unreclaimable? yes
[7664728.608587] lowmem_reserve[]: 0 0 0 0
[7664728.612553] Node 2 Normal: 27375*4kB (UEM) 40191*8kB (UEM) 915*16kB (UEM) 1672*32kB (UEM) 409*64kB (EM) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525476kB
[7664728.628329] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664728.637194] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664728.645802] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664728.654667] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664728.663275] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664728.672139] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664728.680748] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664728.689621] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664728.698226] 74311 total pagecache pages
[7664728.702240] 0 pages in swap cache
[7664728.705731] Swap cache stats: add 21120740, delete 21136712, find 4513444/7609957
[7664728.713385] Free swap  = 3110700kB
[7664728.716963] Total swap = 4194300kB
[7664728.720544] 66993253 pages RAM
[7664728.723776] 0 pages HighMem/MovableOnly
[7664728.727787] 1101945 pages reserved
[7664728.731368] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664728.739414] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664728.748374] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664728.757162] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664728.765759] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664728.773943] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664728.782555] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664728.790816] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664728.798910] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664728.807440] [53108]     0 53108    38960      167      19       84             0 dsm_sa_eventmgr
[7664728.816400] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664728.824754] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664728.833014] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664728.841892] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664728.849897] [53969]     0 53969    31572      205      20      168             0 crond
[7664728.857991] [54035]     0 54035    27526      164      10       33             0 agetty
[7664728.866164] [54036]     0 54036    27526      158      11       33             0 agetty
[7664728.874339] [54186]     0 54186    22934      209      46      274             0 master
[7664728.882633] [36317]     0 36317    28294      187      14       61             0 bash
[7664728.890651] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664728.899176] [36329]     0 36329    28177      160      14       55             0 grep
[7664728.907293] [98579]     0 98579    48653      266      49      236             0 crond
[7664728.915383] [99292]     0 99292    48653      257      49      261             0 crond
[7664728.923470] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664728.931738] [99739]    89 99739    25502      246      47      261             0 trivial-rewrite
[7664728.940699] [100032]     0 100032    48653      266      49      240             0 crond
[7664728.948969] [100203]     0 100203    30816      185      17      335             0 python3
[7664728.957408] Out of memory: Kill process 99739 (trivial-rewrite) score 0 or sacrifice child
[7664728.965849] Killed process 99739 (trivial-rewrite) total-vm:102008kB, anon-rss:0kB, file-rss:984kB, shmem-rss:0kB
[7664729.057875] trivial-rewrite: page allocation failure: order:0, mode:0x201da
[7664729.065016] CPU: 15 PID: 99739 Comm: trivial-rewrite Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664729.078398] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664729.086230] Call Trace:
[7664729.088860]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664729.094178]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664729.100274]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664729.105854]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664729.111866]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664729.118395]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664729.124929]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664729.131117]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664729.137131]  [<ffffffffa01ba3c8>] filemap_fault+0x298/0x490
[7664729.142913]  [<ffffffffc05871c6>] ext4_filemap_fault+0x36/0x50 [ext4]
[7664729.149533]  [<ffffffffa01e593a>] __do_fault.isra.59+0x8a/0x100
[7664729.155625]  [<ffffffffa01e5eec>] do_read_fault.isra.61+0x4c/0x1b0
[7664729.161986]  [<ffffffffa01ea874>] handle_pte_fault+0x2f4/0xd10
[7664729.167990]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664729.173912]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664729.179831]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664729.185404]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664729.190716] Mem-Info:
[7664729.193192] active_anon:0 inactive_anon:1 isolated_anon:0
 active_file:32713 inactive_file:34310 isolated_file:4480
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824045 slab_unreclaimable:62296619
 mapped:1588 shmem:0 pagetables:1112 bounce:0
 free:590487 free_pcp:9 free_cma:0
[7664729.227467] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664729.269224] lowmem_reserve[]: 0 1418 63868 63868
[7664729.274152] Node 0 DMA32 free:261352kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1008kB inactive_file:2976kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686204kB kernel_stack:352kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:9326 all_unreclaimable? yes
[7664729.318845] lowmem_reserve[]: 0 0 62450 62450
[7664729.323509] Node 0 Normal free:508352kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:4kB active_file:44920kB inactive_file:47280kB unevictable:168kB isolated(anon):0kB isolated(file):3584kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243608kB kernel_stack:6288kB pagetables:1268kB unstable:0kB bounce:0kB free_pcp:4kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:68642 all_unreclaimable? no
[7664729.370199] lowmem_reserve[]: 0 0 0 0
[7664729.374174] Node 1 Normal free:525320kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16548kB inactive_file:16640kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711288kB slab_unreclaimable:63411332kB kernel_stack:20816kB pagetables:1536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:497018 all_unreclaimable? yes
[7664729.421038] lowmem_reserve[]: 0 0 0 0
[7664729.425014] Node 2 Normal free:525564kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:29596kB inactive_file:31524kB unevictable:8680kB isolated(anon):0kB isolated(file):8832kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715224kB slab_unreclaimable:62476064kB kernel_stack:7760kB pagetables:536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:430364 all_unreclaimable? yes
[7664729.472052] lowmem_reserve[]: 0 0 0 0
[7664729.476027] Node 3 Normal free:525408kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:39776kB inactive_file:43592kB unevictable:840kB isolated(anon):0kB isolated(file):4096kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369268kB kernel_stack:4208kB pagetables:1100kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:575432 all_unreclaimable? yes
[7664729.522891] lowmem_reserve[]: 0 0 0 0
[7664729.526862] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664729.541701] Node 0 DMA32: 359*4kB (UEM) 413*8kB (UEM) 1216*16kB (UEM) 3688*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261572kB
[7664729.558194] Node 0 Normal: 6426*4kB (UEM) 5764*8kB (UEM) 3942*16kB (UEM) 4480*32kB (UEM) 2040*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508904kB
[7664729.575037] Node 1 Normal: 88058*4kB (UEM) 21634*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525320kB
[7664729.588564] Node 2 Normal: 27436*4kB (UEM) 40241*8kB (UEM) 892*16kB (UEM) 1671*32kB (UEM) 408*64kB (UEM) 2*128kB (UM) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525784kB
[7664729.604511] Node 3 Normal: 131215*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524860kB
[7664729.616861] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664729.625729] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664729.634345] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664729.643216] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664729.651823] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664729.660688] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664729.669296] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664729.678161] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664729.686766] 74201 total pagecache pages
[7664729.690781] 1 pages in swap cache
[7664729.694273] Swap cache stats: add 21120743, delete 21136714, find 4513445/7609959
[7664729.701924] Free swap  = 3110692kB
[7664729.705502] Total swap = 4194300kB
[7664729.709084] 66993253 pages RAM
[7664729.712316] 0 pages HighMem/MovableOnly
[7664729.716329] 1101945 pages reserved
[7664729.902284] ll_ost_io00_031 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664729.910728] ll_ost_io00_031 cpuset=/ mems_allowed=0
[7664729.915796] CPU: 4 PID: 123073 Comm: ll_ost_io00_031 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664729.929171] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664729.937006] Call Trace:
[7664729.939646]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664729.944963]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664729.950457]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664729.956300]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664729.962051]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664729.968055]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664729.974408]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664729.980503]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664729.986258]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664729.992815]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664729.999349]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664730.005559]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664730.011564]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664730.017672]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664730.024558]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664730.030725]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664730.037818]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664730.044387]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664730.051031]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664730.058378]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664730.065118]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664730.072468]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664730.080022]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664730.086936]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664730.094025]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664730.101795]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664730.109051]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664730.116964]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664730.123966]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664730.131921]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664730.139177]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664730.145657]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664730.153233]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664730.158292]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664730.164560]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664730.171171]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664730.177437] Mem-Info:
[7664730.179903] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:32922 inactive_file:34799 isolated_file:5088
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824045 slab_unreclaimable:62296624
 mapped:1588 shmem:0 pagetables:1112 bounce:0
 free:590371 free_pcp:0 free_cma:0
[7664730.214183] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664730.255937] lowmem_reserve[]: 0 1418 63868 63868
[7664730.260865] Node 0 DMA32 free:261136kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1084kB inactive_file:2240kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686204kB kernel_stack:352kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:19213 all_unreclaimable? no
[7664730.305562] lowmem_reserve[]: 0 0 62450 62450
[7664730.310228] Node 0 Normal free:508120kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:47312kB inactive_file:46832kB unevictable:168kB isolated(anon):0kB isolated(file):2304kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243620kB kernel_stack:6672kB pagetables:1268kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:317270 all_unreclaimable? yes
[7664730.357101] lowmem_reserve[]: 0 0 0 0
[7664730.361110] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664730.375950] Node 0 DMA32: 340*4kB (UEM) 413*8kB (UEM) 1216*16kB (UEM) 3687*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261464kB
[7664730.392448] Node 0 Normal: 6464*4kB (UEM) 5764*8kB (UEM) 3918*16kB (UEM) 4480*32kB (UEM) 2040*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508672kB
[7664730.409292] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664730.418155] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664730.426764] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664730.435629] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664730.444234] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664730.453099] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664730.461708] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664730.470572] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664730.479179] 74443 total pagecache pages
[7664730.483193] 1 pages in swap cache
[7664730.486686] Swap cache stats: add 21120748, delete 21136719, find 4513447/7609964
[7664730.494337] Free swap  = 3111632kB
[7664730.497916] Total swap = 4194300kB
[7664730.501498] 66993253 pages RAM
[7664730.504728] 0 pages HighMem/MovableOnly
[7664730.508741] 1101945 pages reserved
[7664730.512320] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664730.520369] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664730.529327] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664730.538112] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664730.546707] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664730.554889] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664730.563501] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664730.571769] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664730.579863] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664730.588382] [53108]     0 53108    38960      167      19       86             0 dsm_sa_eventmgr
[7664730.597335] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664730.605683] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664730.613952] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664730.622829] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664730.630834] [53969]     0 53969    31572      204      20      169             0 crond
[7664730.638926] [54035]     0 54035    27526      164      10       33             0 agetty
[7664730.647099] [54036]     0 54036    27526      158      11       33             0 agetty
[7664730.655272] [54186]     0 54186    22934      209      46      274             0 master
[7664730.663565] [36317]     0 36317    28294      187      14       61             0 bash
[7664730.671600] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664730.680121] [36329]     0 36329    28177      160      14       55             0 grep
[7664730.688247] [98579]     0 98579    48653      266      49      236             0 crond
[7664730.696337] [99292]     0 99292    48653      257      49      261             0 crond
[7664730.704429] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664730.712689] [100032]     0 100032    48653      266      49      240             0 crond
[7664730.720950] [100203]     0 100203    30816      185      17      335             0 python3
[7664730.729390] Out of memory: Kill process 99292 (crond) score 0 or sacrifice child
[7664730.736955] Killed process 99292 (crond) total-vm:194612kB, anon-rss:0kB, file-rss:1028kB, shmem-rss:0kB
[7664730.792964] crond: page allocation failure: order:0, mode:0x200da
[7664730.799287] CPU: 28 PID: 99292 Comm: crond Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664730.811819] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664730.819674] Call Trace:
[7664730.822315]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664730.827631]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664730.833730]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664730.839313]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664730.845327]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664730.851859]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664730.858387]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664730.864218]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664730.870833]  [<ffffffffa076aaba>] ? __schedule+0x42a/0x860
[7664730.876502]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664730.882779]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664730.888704]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664730.894720]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664730.900648]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664730.906573]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664730.912147]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664730.917460] Mem-Info:
[7664730.919935] active_anon:0 inactive_anon:4 isolated_anon:0
 active_file:34635 inactive_file:33367 isolated_file:5184
 unevictable:9044 dirty:0 writeback:1 unstable:0
 slab_reclaimable:824045 slab_unreclaimable:62296625
 mapped:1595 shmem:0 pagetables:1112 bounce:0
 free:590251 free_pcp:0 free_cma:0
[7664730.954219] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664730.995974] lowmem_reserve[]: 0 1418 63868 63868
[7664731.000903] Node 0 DMA32 free:261188kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1084kB inactive_file:2288kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686204kB kernel_stack:352kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:58529 all_unreclaimable? yes
[7664731.045688] lowmem_reserve[]: 0 0 62450 62450
[7664731.050356] Node 0 Normal free:508608kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:44976kB inactive_file:47188kB unevictable:168kB isolated(anon):0kB isolated(file):2304kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243620kB kernel_stack:6672kB pagetables:1268kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:572632 all_unreclaimable? yes
[7664731.098924] Lustre: fir-OST001d: Bulk IO write error with 72866633-325f-4 (at 10.50.15.9@o2ib2), client will retry: rc = -110
[7664731.097227] lowmem_reserve[]: 0 0 0 0
[7664731.112674] Node 1 Normal free:525336kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:18044kB inactive_file:14408kB unevictable:26488kB isolated(anon):0kB isolated(file):128kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711288kB slab_unreclaimable:63411336kB kernel_stack:20816kB pagetables:1536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:301214 all_unreclaimable? yes
[7664731.159722] lowmem_reserve[]: 0 0 0 0
[7664731.163693] Node 2 Normal free:524692kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:12kB active_file:28564kB inactive_file:34364kB unevictable:8680kB isolated(anon):0kB isolated(file):7936kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:4kB mapped:5356kB shmem:0kB slab_reclaimable:715224kB slab_unreclaimable:62476064kB kernel_stack:7760kB pagetables:536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:119043 all_unreclaimable? no
[7664731.210737] lowmem_reserve[]: 0 0 0 0
[7664731.214713] Node 3 Normal free:525096kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:4kB active_file:44136kB inactive_file:40748kB unevictable:840kB isolated(anon):0kB isolated(file):5760kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:852kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369276kB kernel_stack:4208kB pagetables:1100kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:198515 all_unreclaimable? yes
[7664731.261563] lowmem_reserve[]: 0 0 0 0
[7664731.265529] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664731.280368] Node 0 DMA32: 340*4kB (UEM) 413*8kB (UEM) 1216*16kB (UEM) 3687*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261464kB
[7664731.296861] Node 0 Normal: 6502*4kB (UEM) 5764*8kB (UEM) 3931*16kB (UEM) 4480*32kB (UEM) 2040*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 509032kB
[7664731.313700] Node 1 Normal: 88060*4kB (UEM) 21634*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525328kB
[7664731.327231] Node 2 Normal: 27486*4kB (UEM) 40197*8kB (UEM) 888*16kB (EM) 1669*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525120kB
[7664731.342544] Node 3 Normal: 131457*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525828kB
[7664731.354895] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664731.363761] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664731.372367] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664731.381234] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664731.389841] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664731.398705] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664731.407312] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664731.416177] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664731.424785] 74427 total pagecache pages
[7664731.428800] 0 pages in swap cache
[7664731.432300] Swap cache stats: add 21120748, delete 21136720, find 4513447/7609964
[7664731.439951] Free swap  = 3111632kB
[7664731.443530] Total swap = 4194300kB
[7664731.447109] 66993253 pages RAM
[7664731.450342] 0 pages HighMem/MovableOnly
[7664731.454353] 1101945 pages reserved
[7664732.173644] ll_ost_io02_054 invoked oom-killer: gfp_mask=0x82d2, order=0, oom_score_adj=0
[7664732.181995] ll_ost_io02_054 cpuset=/ mems_allowed=2
[7664732.187063] CPU: 2 PID: 6889 Comm: ll_ost_io02_054 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664732.200270] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664732.208101] Call Trace:
[7664732.210741]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664732.216055]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664732.221553]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664732.227394]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664732.233149]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664732.239161]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664732.245524]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664732.251624]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664732.257378]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664732.263905]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664732.270440]  [<ffffffffa01fd95f>] __vmalloc_node_range+0x12f/0x280
[7664732.276871]  [<ffffffffc11e6a03>] ? ptlrpc_alloc_rqbd+0x213/0x5d0 [ptlrpc]
[7664732.283924]  [<ffffffffa01fdd5e>] vzalloc_node+0x4e/0x50
[7664732.289450]  [<ffffffffc11e6a03>] ? ptlrpc_alloc_rqbd+0x213/0x5d0 [ptlrpc]
[7664732.296535]  [<ffffffffc11e6a03>] ptlrpc_alloc_rqbd+0x213/0x5d0 [ptlrpc]
[7664732.303450]  [<ffffffffc11e6ea1>] ptlrpc_grow_req_bufs+0xe1/0x2a0 [ptlrpc]
[7664732.310543]  [<ffffffffc11efc85>] ptlrpc_main+0xc05/0x1460 [ptlrpc]
[7664732.317035]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664732.324607]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664732.329668]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664732.335935]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664732.342554]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664732.348820] Mem-Info:
[7664732.351282] active_anon:0 inactive_anon:2 isolated_anon:0
 active_file:34450 inactive_file:35193 isolated_file:2400
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824047 slab_unreclaimable:62296627
 mapped:1591 shmem:0 pagetables:1016 bounce:0
 free:590377 free_pcp:0 free_cma:0
[7664732.385555] Node 2 Normal free:524864kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:8kB active_file:30784kB inactive_file:37400kB unevictable:8680kB isolated(anon):0kB isolated(file):3968kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5340kB shmem:0kB slab_reclaimable:715224kB slab_unreclaimable:62476072kB kernel_stack:7760kB pagetables:368kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1223279 all_unreclaimable? yes
[7664732.432681] lowmem_reserve[]: 0 0 0 0
[7664732.436654] Node 2 Normal: 27488*4kB (UEM) 40197*8kB (UEM) 889*16kB (UEM) 1669*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525144kB
[7664732.452060] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664732.460934] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664732.469541] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664732.478416] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664732.487029] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664732.495897] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664732.504510] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664732.513375] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664732.521981] 74412 total pagecache pages
[7664732.525997] 0 pages in swap cache
[7664732.529495] Swap cache stats: add 21120750, delete 21136722, find 4513449/7609966
[7664732.537149] Free swap  = 3112144kB
[7664732.540726] Total swap = 4194300kB
[7664732.544309] 66993253 pages RAM
[7664732.547539] 0 pages HighMem/MovableOnly
[7664732.551549] 1101945 pages reserved
[7664732.555130] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664732.563180] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664732.572138] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664732.580937] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664732.589539] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664732.597717] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664732.606329] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664732.614599] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664732.622693] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664732.631220] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664732.640182] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664732.648537] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664732.656806] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664732.665682] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664732.673688] [53969]     0 53969    31572      204      20      169             0 crond
[7664732.681781] [54035]     0 54035    27526      164      10       33             0 agetty
[7664732.689955] [54036]     0 54036    27526      158      11       33             0 agetty
[7664732.698136] [54186]     0 54186    22934      209      46      274             0 master
[7664732.706434] [36317]     0 36317    28294      187      14       61             0 bash
[7664732.714442] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664732.722967] [36329]     0 36329    28177      160      14       55             0 grep
[7664732.731082] [98579]     0 98579    48653      266      49      236             0 crond
[7664732.739175] [99592]    89 99592    25538      229      47      273             0 cleanup
[7664732.747443] [100032]     0 100032    48653      266      49      240             0 crond
[7664732.755712] [100203]     0 100203    30816      185      17      335             0 python3
[7664732.764151] Out of memory: Kill process 99592 (cleanup) score 0 or sacrifice child
[7664732.771897] Killed process 99592 (cleanup) total-vm:102152kB, anon-rss:0kB, file-rss:916kB, shmem-rss:0kB
[7664732.812301] cleanup: page allocation failure: order:0, mode:0x200da
[7664732.818752] CPU: 0 PID: 99592 Comm: cleanup Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664732.831358] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664732.839189] Call Trace:
[7664732.841833]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664732.847158]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664732.853254]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664732.858841]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664732.864849]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664732.871379]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664732.877913]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664732.883753]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664732.890374]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664732.896647]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664732.902572]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664732.908596]  [<ffffffffa076a10e>] ? schedule_hrtimeout_range_clock+0xbe/0x150
[7664732.915906]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664732.921835]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664732.927763]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664732.933346]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664732.938668] Mem-Info:
[7664732.941150] active_anon:0 inactive_anon:2 isolated_anon:0
 active_file:32804 inactive_file:36254 isolated_file:3488
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824047 slab_unreclaimable:62296628
 mapped:1591 shmem:0 pagetables:1016 bounce:0
 free:590333 free_pcp:134 free_cma:0
[7664732.975602] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664733.010251] LustreError: 120630:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c2126afac00
[7664733.028395] lowmem_reserve[]: 0 1418 63868 63868
[7664733.033322] Node 0 DMA32 free:261344kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1100kB inactive_file:2360kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686208kB kernel_stack:352kB pagetables:4kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:282489 all_unreclaimable? yes
[7664733.078195] lowmem_reserve[]: 0 0 62450 62450
[7664733.082864] Node 0 Normal free:508256kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:44732kB inactive_file:43404kB unevictable:168kB isolated(anon):0kB isolated(file):7936kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243616kB kernel_stack:6640kB pagetables:1108kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:39428 all_unreclaimable? no
[7664733.129565] lowmem_reserve[]: 0 0 0 0
[7664733.133535] Node 1 Normal free:525328kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16248kB inactive_file:17544kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711296kB slab_unreclaimable:63411336kB kernel_stack:20816kB pagetables:1536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:513849 all_unreclaimable? yes
[7664733.173817] LustreError: 8683:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c2d1437e400
[7664733.191265] lowmem_reserve[]: 0 0 0 0
[7664733.195234] Node 2 Normal free:524888kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:4kB active_file:30776kB inactive_file:38860kB unevictable:8680kB isolated(anon):0kB isolated(file):2176kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5340kB shmem:0kB slab_reclaimable:715224kB slab_unreclaimable:62476076kB kernel_stack:7760kB pagetables:368kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:73248 all_unreclaimable? no
[7664733.199739] LustreError: 107086:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c29177a9200
[7664733.253119] lowmem_reserve[]: 0 0 0 0
[7664733.257089] Node 3 Normal free:524936kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:4kB active_file:40824kB inactive_file:40456kB unevictable:840kB isolated(anon):0kB isolated(file):6144kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:852kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369280kB kernel_stack:4208kB pagetables:1048kB unstable:0kB bounce:0kB free_pcp:4kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:214118 all_unreclaimable? no
[7664733.303870] lowmem_reserve[]: 0 0 0 0
[7664733.307840] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664733.322676] Node 0 DMA32: 338*4kB (UEM) 413*8kB (UEM) 1216*16kB (UEM) 3687*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261456kB
[7664733.339170] Node 0 Normal: 6323*4kB (UEM) 5744*8kB (UEM) 3931*16kB (UEM) 4480*32kB (UEM) 2040*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508156kB
[7664733.356010] Node 1 Normal: 88058*4kB (UEM) 21634*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525320kB
[7664733.369538] Node 2 Normal: 27485*4kB (UEM) 40197*8kB (UEM) 889*16kB (UEM) 1669*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525132kB
[7664733.384940] Node 3 Normal: 131320*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525280kB
[7664733.397376] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664733.406243] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664733.414867] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664733.423734] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664733.432339] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664733.441216] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664733.449829] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664733.458695] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664733.467301] 74577 total pagecache pages
[7664733.471319] 0 pages in swap cache
[7664733.474817] Swap cache stats: add 21120750, delete 21136722, find 4513449/7609966
[7664733.482468] Free swap  = 3112144kB
[7664733.486045] Total swap = 4194300kB
[7664733.489628] 66993253 pages RAM
[7664733.492858] 0 pages HighMem/MovableOnly
[7664733.496871] 1101945 pages reserved
[7664733.811936] ll_ost_io03_058 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664733.820370] ll_ost_io03_058 cpuset=/ mems_allowed=3
[7664733.825426] CPU: 15 PID: 7282 Comm: ll_ost_io03_058 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664733.838724] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664733.846550] Call Trace:
[7664733.849182]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664733.854501]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664733.859995]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664733.865827]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664733.871834]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664733.878187]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664733.884288]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664733.890033]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664733.896563]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664733.903096]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664733.909281]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664733.915287]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664733.921396]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664733.928280]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664733.935505]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664733.941668]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664733.948319]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664733.955374]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664733.961128]  [<ffffffffa006213e>] ? physflat_send_IPI_mask+0xe/0x10
[7664733.967569]  [<ffffffffa0056f42>] ? native_smp_send_reschedule+0x52/0x70
[7664733.974449]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664733.979985]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664733.987076]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664733.994833]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664734.002104]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664734.009967]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664734.016931]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664734.022366]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664734.028841]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664734.036416]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664734.041469]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664734.047744]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664734.054363]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664734.060629] Mem-Info:
[7664734.063089] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34898 inactive_file:36671 isolated_file:1685
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824052 slab_unreclaimable:62296631
 mapped:1589 shmem:0 pagetables:1016 bounce:0
 free:590132 free_pcp:0 free_cma:0
[7664734.097367] Node 3 Normal free:525248kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:4kB active_file:43272kB inactive_file:42992kB unevictable:840kB isolated(anon):0kB isolated(file):852kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:852kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369280kB kernel_stack:4208kB pagetables:1048kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1041966 all_unreclaimable? yes
[7664734.144228] lowmem_reserve[]: 0 0 0 0
[7664734.148195] Node 3 Normal: 131365*4kB (UEM) 2*8kB (M) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525476kB
[7664734.161008] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664734.169873] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664734.178482] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664734.187355] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664734.195962] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664734.204828] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664734.213432] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664734.222297] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664734.230908] 74669 total pagecache pages
[7664734.234931] 0 pages in swap cache
[7664734.238429] Swap cache stats: add 21120759, delete 21136731, find 4513450/7609968
[7664734.246079] Free swap  = 3113168kB
[7664734.249659] Total swap = 4194300kB
[7664734.253239] 66993253 pages RAM
[7664734.256470] 0 pages HighMem/MovableOnly
[7664734.260485] 1101945 pages reserved
[7664734.264063] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664734.272112] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664734.281070] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664734.289853] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664734.298448] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664734.306631] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664734.315245] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664734.323512] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664734.331608] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664734.340133] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664734.349087] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664734.357433] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664734.365695] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664734.374569] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664734.382576] [53969]     0 53969    31572      204      20      169             0 crond
[7664734.390669] [54035]     0 54035    27526      164      10       33             0 agetty
[7664734.398842] [54036]     0 54036    27526      158      11       33             0 agetty
[7664734.407014] [54186]     0 54186    22934      209      46      274             0 master
[7664734.415301] [36317]     0 36317    28294      187      14       61             0 bash
[7664734.423299] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664734.431821] [36329]     0 36329    28177      160      14       55             0 grep
[7664734.439940] [98579]     0 98579    48653      266      49      236             0 crond
[7664734.448032] [100032]     0 100032    48653      266      49      240             0 crond
[7664734.456293] [100203]     0 100203    30816      185      17      335             0 python3
[7664734.464727] Out of memory: Kill process 100032 (crond) score 0 or sacrifice child
[7664734.472379] Killed process 100203 (python3) total-vm:123264kB, anon-rss:0kB, file-rss:740kB, shmem-rss:0kB
[7664734.525533] Lustre: 90710:0:(service.c:1322:ptlrpc_at_send_early_reply()) @@@ Already past deadline (-14s), not sending early reply. Consider increasing at_early_margin (5)?  req@ffff9c114c91b850 x1660577746876416/t0(0) o4->31cd270d-535d-4@10.50.5.29@o2ib2:492/0 lens 488/448 e 1 to 0 dl 1583650742 ref 2 fl Interpret:/0/0 rc 0/0
[7664734.526031] LustreError: 2948:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c1ae02c0e00
[7664734.676783] python3: page allocation failure: order:0, mode:0x200da
[7664734.683236] CPU: 20 PID: 100203 Comm: python3 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664734.696009] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664734.703833] Call Trace:
[7664734.706467]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664734.711786]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664734.717886]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664734.723467]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664734.729481]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664734.736017]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664734.742550]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664734.748385]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664734.755004]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664734.761269]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664734.767189]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664734.773206]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664734.779132]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664734.785052]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664734.790624]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664734.795937] Mem-Info:
[7664734.798411] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34305 inactive_file:36083 isolated_file:2272
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824052 slab_unreclaimable:62296627
 mapped:1588 shmem:0 pagetables:969 bounce:0
 free:590208 free_pcp:0 free_cma:0
[7664734.832591] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664734.874346] lowmem_reserve[]: 0 1418 63868 63868
[7664734.879276] Node 0 DMA32 free:261332kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1128kB inactive_file:2640kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686208kB kernel_stack:352kB pagetables:4kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:280662 all_unreclaimable? yes
[7664734.924146] lowmem_reserve[]: 0 0 62450 62450
[7664734.928817] Node 0 Normal free:508272kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:44512kB inactive_file:45304kB unevictable:168kB isolated(anon):0kB isolated(file):6016kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243624kB kernel_stack:6016kB pagetables:1088kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:487124 all_unreclaimable? yes
[7664734.975685] lowmem_reserve[]: 0 0 0 0
[7664734.979655] Node 1 Normal free:525320kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:18692kB inactive_file:15264kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711304kB slab_unreclaimable:63411336kB kernel_stack:20816kB pagetables:1536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:572114 all_unreclaimable? yes
[7664735.026507] lowmem_reserve[]: 0 0 0 0
[7664735.030477] Node 2 Normal free:525148kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31560kB inactive_file:40584kB unevictable:8680kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715236kB slab_unreclaimable:62476060kB kernel_stack:7760kB pagetables:368kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:122880 all_unreclaimable? no
[7664735.077168] lowmem_reserve[]: 0 0 0 0
[7664735.081142] Node 3 Normal free:524856kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:39888kB inactive_file:44164kB unevictable:840kB isolated(anon):0kB isolated(file):2560kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369280kB kernel_stack:4208kB pagetables:880kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1207843 all_unreclaimable? yes
[7664735.128005] lowmem_reserve[]: 0 0 0 0
[7664735.131970] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664735.146807] Node 0 DMA32: 336*4kB (UEM) 413*8kB (UEM) 1216*16kB (UEM) 3687*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261448kB
[7664735.163300] Node 0 Normal: 6355*4kB (UEM) 5745*8kB (UEM) 3923*16kB (UEM) 4480*32kB (UEM) 2040*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508164kB
[7664735.180141] Node 1 Normal: 88058*4kB (UEM) 21634*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525320kB
[7664735.193671] Node 2 Normal: 27525*4kB (UEM) 40223*8kB (UEM) 896*16kB (UEM) 1669*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525612kB
[7664735.209071] Node 3 Normal: 131035*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524140kB
[7664735.221421] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664735.230287] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664735.238897] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664735.247769] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664735.256376] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664735.265242] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664735.273848] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664735.282713] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664735.291319] 74860 total pagecache pages
[7664735.295333] 0 pages in swap cache
[7664735.298826] Swap cache stats: add 21120759, delete 21136731, find 4513450/7609968
[7664735.306476] Free swap  = 3113168kB
[7664735.310056] Total swap = 4194300kB
[7664735.313635] 66993253 pages RAM
[7664735.316869] 0 pages HighMem/MovableOnly
[7664735.320881] 1101945 pages reserved
[7664735.734921] ll_ost_io01_005 invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
[7664735.743103] ll_ost_io01_005 cpuset=/ mems_allowed=1
[7664735.748167] CPU: 21 PID: 119516 Comm: ll_ost_io01_005 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664735.761633] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664735.769466] Call Trace:
[7664735.772099]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664735.777414]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664735.782903]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664735.788745]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664735.794499]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664735.800511]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664735.806864]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664735.812957]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664735.818703]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664735.825228]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664735.831756]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664735.838015]  [<ffffffffc124293f>] tgt_checksum_niobuf_rw+0xbf/0xe00 [ptlrpc]
[7664735.845273]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664735.852607]  [<ffffffffc0cb71e0>] ? obd_dif_crc_fn+0x20/0x20 [obdclass]
[7664735.859438]  [<ffffffffc1247325>] tgt_brw_read+0xc35/0x1e50 [ptlrpc]
[7664735.866008]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664735.872647]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664735.880000]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664735.887345]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664735.894866]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664735.901781]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664735.908866]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664735.916621]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664735.923885]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664735.931751]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664735.938716]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664735.944155]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664735.950632]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664735.958207]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664735.963265]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664735.969532]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664735.976143]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664735.982407] Mem-Info:
[7664735.984868] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34394 inactive_file:35163 isolated_file:4023
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824053 slab_unreclaimable:62296627
 mapped:1588 shmem:0 pagetables:969 bounce:0
 free:590151 free_pcp:0 free_cma:0
[7664736.019049] Node 1 Normal free:525320kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16272kB inactive_file:17236kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711308kB slab_unreclaimable:63411336kB kernel_stack:20816kB pagetables:1536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:60286 all_unreclaimable? yes
[7664736.065832] lowmem_reserve[]: 0 0 0 0
[7664736.069803] Node 1 Normal: 88057*4kB (UEM) 21634*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525316kB
[7664736.083332] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664736.092198] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664736.100806] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664736.101757] LustreError: 8713:0:(ldlm_lib.c:3262:target_bulk_io()) @@@ network error on bulk WRITE  req@ffff9c1f527ad050 x1659307336926208/t0(0) o4->0716ac8f-8ab5-4@10.50.4.38@o2ib2:522/0 lens 488/448 e 2 to 0 dl 1583650772 ref 1 fl Interpret:/0/0 rc 0/0
[7664736.101760] LustreError: 8713:0:(ldlm_lib.c:3262:target_bulk_io()) Skipped 6 previous similar messages
[7664736.141790] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664736.150395] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664736.159264] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664736.167867] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664736.176734] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664736.185340] 74801 total pagecache pages
[7664736.189353] 0 pages in swap cache
[7664736.192846] Swap cache stats: add 21120761, delete 21136733, find 4513450/7609970
[7664736.200499] Free swap  = 3114448kB
[7664736.204078] Total swap = 4194300kB
[7664736.207660] 66993253 pages RAM
[7664736.210890] 0 pages HighMem/MovableOnly
[7664736.214903] 1101945 pages reserved
[7664736.218483] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664736.226525] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664736.235479] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664736.244268] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664736.252840] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664736.261014] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664736.269625] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664736.277887] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664736.285974] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664736.294500] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664736.303453] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664736.311800] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664736.320068] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664736.328935] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664736.336946] [53969]     0 53969    31572      204      20      169             0 crond
[7664736.345036] [54035]     0 54035    27526      164      10       33             0 agetty
[7664736.353209] [54036]     0 54036    27526      158      11       33             0 agetty
[7664736.361382] [54186]     0 54186    22934      209      46      274             0 master
[7664736.369645] [36317]     0 36317    28294      187      14       61             0 bash
[7664736.377648] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664736.386170] [36329]     0 36329    28177      160      14       55             0 grep
[7664736.394292] [98579]     0 98579    48653      266      49      236             0 crond
[7664736.402390] [100032]     0 100032    48653      266      49      240             0 crond
[7664736.410655] Out of memory: Kill process 100032 (crond) score 0 or sacrifice child
[7664736.418312] Killed process 100032 (crond) total-vm:194612kB, anon-rss:0kB, file-rss:1064kB, shmem-rss:0kB
[7664736.464388] crond: page allocation failure: order:0, mode:0x200da
[7664736.470664] CPU: 20 PID: 100032 Comm: crond Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664736.483268] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664736.491096] Call Trace:
[7664736.493730]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664736.499048]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664736.505145]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664736.510719]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664736.516726]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664736.523261]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664736.529787]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664736.535629]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664736.542249]  [<ffffffffa076aaba>] ? __schedule+0x42a/0x860
[7664736.547907]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664736.554173]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664736.560094]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664736.566099]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664736.572020]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664736.577948]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664736.583528]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664736.588840] Mem-Info:
[7664736.591319] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34031 inactive_file:35210 isolated_file:4311
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824053 slab_unreclaimable:62296627
 mapped:1587 shmem:0 pagetables:952 bounce:0
 free:590218 free_pcp:0 free_cma:0
[7664736.625506] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664736.667257] lowmem_reserve[]: 0 1418 63868 63868
[7664736.672180] Node 0 DMA32 free:261308kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1148kB inactive_file:2672kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686208kB kernel_stack:352kB pagetables:4kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:24147 all_unreclaimable? yes
[7664736.716961] lowmem_reserve[]: 0 0 62450 62450
[7664736.721624] Node 0 Normal free:508572kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:43692kB inactive_file:40916kB unevictable:168kB isolated(anon):0kB isolated(file):11868kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243624kB kernel_stack:6064kB pagetables:1040kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:796187 all_unreclaimable? no
[7664736.768488] lowmem_reserve[]: 0 0 0 0
[7664736.772464] Node 1 Normal free:525320kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16208kB inactive_file:17300kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711308kB slab_unreclaimable:63411336kB kernel_stack:20816kB pagetables:1536kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:60286 all_unreclaimable? yes
[7664736.819237] lowmem_reserve[]: 0 0 0 0
[7664736.823206] Node 2 Normal free:525568kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31812kB inactive_file:36140kB unevictable:8680kB isolated(anon):0kB isolated(file):3584kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715236kB slab_unreclaimable:62476060kB kernel_stack:7760kB pagetables:348kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:900914 all_unreclaimable? yes
[7664736.870244] lowmem_reserve[]: 0 0 0 0
[7664736.874219] Node 3 Normal free:524200kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:44168kB inactive_file:44048kB unevictable:840kB isolated(anon):0kB isolated(file):256kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369280kB kernel_stack:4208kB pagetables:880kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1207843 all_unreclaimable? yes
[7664736.920995] lowmem_reserve[]: 0 0 0 0
[7664736.924960] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664736.939799] Node 0 DMA32: 331*4kB (UEM) 413*8kB (UEM) 1216*16kB (UEM) 3687*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261428kB
[7664736.956293] Node 0 Normal: 6397*4kB (UEM) 5775*8kB (UEM) 3953*16kB (UEM) 4489*32kB (UEM) 2041*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 509404kB
[7664736.973131] Node 1 Normal: 88057*4kB (UEM) 21634*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525316kB
[7664736.986662] Node 2 Normal: 27543*4kB (UEM) 40231*8kB (UEM) 896*16kB (UEM) 1669*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525748kB
[7664737.002061] Node 3 Normal: 131035*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524140kB
[7664737.014413] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664737.023279] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664737.031886] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664737.040752] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664737.049358] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664737.058224] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664737.066831] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664737.075696] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664737.084302] 74607 total pagecache pages
[7664737.088315] 0 pages in swap cache
[7664737.091807] Swap cache stats: add 21120761, delete 21136733, find 4513450/7609970
[7664737.099460] Free swap  = 3114448kB
[7664737.103038] Total swap = 4194300kB
[7664737.106621] 66993253 pages RAM
[7664737.109851] 0 pages HighMem/MovableOnly
[7664737.113862] 1101945 pages reserved
[7664737.132040] ll_ost_io03_087 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664737.140482] ll_ost_io03_087 cpuset=/ mems_allowed=3
[7664737.145547] CPU: 43 PID: 8685 Comm: ll_ost_io03_087 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664737.158842] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664737.166671] Call Trace:
[7664737.169304]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664737.174623]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664737.180119]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664737.185957]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664737.191711]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664737.197718]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664737.204070]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664737.210161]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664737.215907]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664737.222434]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664737.228961]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664737.235138]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664737.241145]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664737.247253]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664737.254139]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664737.261370]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664737.267535]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664737.274184]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664737.281232]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664737.286987]  [<ffffffffa006213e>] ? physflat_send_IPI_mask+0xe/0x10
[7664737.293434]  [<ffffffffa0056f42>] ? native_smp_send_reschedule+0x52/0x70
[7664737.300313]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664737.305843]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664737.312927]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664737.320684]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664737.327952]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664737.335821]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664737.342818]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664737.350771]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664737.358027]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664737.364508]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664737.372077]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664737.377135]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664737.383404]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664737.390024]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664737.396288] Mem-Info:
[7664737.398750] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34302 inactive_file:36159 isolated_file:3415
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824053 slab_unreclaimable:62296627
 mapped:1587 shmem:0 pagetables:903 bounce:0
 free:590213 free_pcp:0 free_cma:0
[7664737.432938] Node 3 Normal free:524200kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:44168kB inactive_file:44048kB unevictable:840kB isolated(anon):0kB isolated(file):256kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854268kB slab_unreclaimable:62369280kB kernel_stack:4208kB pagetables:768kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1207843 all_unreclaimable? yes
[7664737.479720] lowmem_reserve[]: 0 0 0 0
[7664737.483692] Node 3 Normal: 131035*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524140kB
[7664737.496045] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664737.504909] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664737.513515] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664737.522383] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664737.530987] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664737.539856] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664737.548468] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664737.557337] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664737.565950] 74698 total pagecache pages
[7664737.569966] 0 pages in swap cache
[7664737.573464] Swap cache stats: add 21120764, delete 21136736, find 4513450/7609971
[7664737.581115] Free swap  = 3114960kB
[7664737.584694] Total swap = 4194300kB
[7664737.588275] 66993253 pages RAM
[7664737.591508] 0 pages HighMem/MovableOnly
[7664737.595519] 1101945 pages reserved
[7664737.599098] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664737.607148] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664737.616106] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664737.624889] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664737.633489] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664737.641666] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664737.650278] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664737.658540] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664737.666636] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664737.675167] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664737.684124] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664737.692477] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664737.700737] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664737.709611] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664737.717612] [53969]     0 53969    31572      204      20      169             0 crond
[7664737.725706] [54035]     0 54035    27526      164      10       33             0 agetty
[7664737.733877] [54036]     0 54036    27526      158      11       33             0 agetty
[7664737.742051] [54186]     0 54186    22934      209      46      274             0 master
[7664737.750342] [36317]     0 36317    28294      187      14       61             0 bash
[7664737.758346] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664737.766873] [36329]     0 36329    28177      160      14       55             0 grep
[7664737.774990] [98579]     0 98579    48653      266      49      236             0 crond
[7664737.783086] Out of memory: Kill process 98579 (crond) score 0 or sacrifice child
[7664737.790655] Killed process 98579 (crond) total-vm:194612kB, anon-rss:0kB, file-rss:1064kB, shmem-rss:0kB
[7664737.929375] crond: page allocation failure: order:0, mode:0x200da
[7664737.935651] CPU: 27 PID: 98579 Comm: crond Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664737.948164] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664737.955988] Call Trace:
[7664737.958625]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664737.963938]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664737.970032]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664737.975612]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664737.981617]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664737.988145]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664737.994682]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664738.000522]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664738.007142]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664738.013410]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664738.019338]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664738.025350]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664738.031269]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664738.037189]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664738.042760]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664738.048074] Mem-Info:
[7664738.050550] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34946 inactive_file:37099 isolated_file:2304
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824053 slab_unreclaimable:62296627
 mapped:1587 shmem:0 pagetables:903 bounce:0
 free:590052 free_pcp:0 free_cma:0
[7664738.084729] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664738.126483] lowmem_reserve[]: 0 1418 63868 63868
[7664738.131405] Node 0 DMA32 free:261020kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1216kB inactive_file:2452kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686208kB kernel_stack:352kB pagetables:4kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:69388 all_unreclaimable? yes
[7664738.176188] lowmem_reserve[]: 0 0 62450 62450
[7664738.180858] Node 0 Normal free:508672kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:46040kB inactive_file:45564kB unevictable:168kB isolated(anon):0kB isolated(file):3584kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243624kB kernel_stack:6016kB pagetables:984kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1088878 all_unreclaimable? yes
[7664738.227712] lowmem_reserve[]: 0 0 0 0
[7664738.231681] Node 1 Normal free:525328kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16624kB inactive_file:17200kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711308kB slab_unreclaimable:63411340kB kernel_stack:20816kB pagetables:1524kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:382715 all_unreclaimable? yes
[7664738.278548] lowmem_reserve[]: 0 0 0 0
[7664738.282521] Node 2 Normal free:525168kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31068kB inactive_file:40408kB unevictable:8680kB isolated(anon):0kB isolated(file):128kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715244kB slab_unreclaimable:62476060kB kernel_stack:7760kB pagetables:332kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:423385 all_unreclaimable? yes
[7664738.329459] lowmem_reserve[]: 0 0 0 0
[7664738.333428] Node 3 Normal free:524652kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:44164kB inactive_file:44148kB unevictable:840kB isolated(anon):0kB isolated(file):512kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854272kB slab_unreclaimable:62369280kB kernel_stack:4208kB pagetables:768kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:456293 all_unreclaimable? yes
[7664738.380119] lowmem_reserve[]: 0 0 0 0
[7664738.384091] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664738.398929] Node 0 DMA32: 305*4kB (UEM) 413*8kB (UEM) 1216*16kB (UEM) 3687*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261324kB
[7664738.415423] Node 0 Normal: 6274*4kB (UEM) 5755*8kB (UEM) 3938*16kB (UEM) 4489*32kB (UEM) 2041*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508512kB
[7664738.432260] Node 1 Normal: 88056*4kB (UEM) 21636*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525328kB
[7664738.445792] Node 2 Normal: 27435*4kB (UEM) 40216*8kB (UEM) 896*16kB (UEM) 1669*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525196kB
[7664738.461190] Node 3 Normal: 131148*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524592kB
[7664738.473629] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664738.482497] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664738.491104] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664738.499979] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664738.508596] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664738.517468] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664738.526074] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664738.534939] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664738.543544] 74846 total pagecache pages
[7664738.547559] 0 pages in swap cache
[7664738.551052] Swap cache stats: add 21120764, delete 21136736, find 4513450/7609971
[7664738.558703] Free swap  = 3114960kB
[7664738.562280] Total swap = 4194300kB
[7664738.565863] 66993253 pages RAM
[7664738.569095] 0 pages HighMem/MovableOnly
[7664738.573107] 1101945 pages reserved
[7664738.584290] ll_ost_io00_058 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664738.592733] ll_ost_io00_058 cpuset=/ mems_allowed=0
[7664738.597793] CPU: 0 PID: 3176 Comm: ll_ost_io00_058 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664738.610998] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664738.618824] Call Trace:
[7664738.621458]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664738.626773]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664738.632259]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664738.638094]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664738.643849]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664738.649863]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664738.656223]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664738.662324]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664738.668079]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664738.674607]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664738.681142]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664738.687328]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664738.693333]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664738.699441]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664738.706326]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664738.712473]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664738.719568]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664738.726129]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664738.732766]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664738.740116]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664738.746853]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664738.754201]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664738.761716]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664738.768637]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664738.775723]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664738.783474]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664738.790733]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664738.798602]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664738.805606]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664738.813561]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664738.820822]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664738.827297]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664738.834872]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664738.839933]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664738.846210]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664738.852827]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664738.859093] Mem-Info:
[7664738.861559] active_anon:0 inactive_anon:4 isolated_anon:0
 active_file:33767 inactive_file:35484 isolated_file:3424
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824056 slab_unreclaimable:62296628
 mapped:1587 shmem:0 pagetables:903 bounce:0
 free:590183 free_pcp:0 free_cma:0
[7664738.895746] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664738.937506] lowmem_reserve[]: 0 1418 63868 63868
[7664738.942433] Node 0 DMA32 free:261324kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1256kB inactive_file:3552kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686208kB kernel_stack:352kB pagetables:4kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:70892 all_unreclaimable? yes
[7664738.987216] lowmem_reserve[]: 0 0 62450 62450
[7664738.991880] Node 0 Normal free:508512kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:48564kB inactive_file:41540kB unevictable:168kB isolated(anon):0kB isolated(file):6016kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243624kB kernel_stack:6096kB pagetables:984kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:372493 all_unreclaimable? yes
[7664739.038662] lowmem_reserve[]: 0 0 0 0
[7664739.042628] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664739.057468] Node 0 DMA32: 305*4kB (UEM) 413*8kB (UEM) 1216*16kB (UEM) 3687*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261324kB
[7664739.073958] Node 0 Normal: 6274*4kB (UEM) 5756*8kB (UEM) 3922*16kB (UEM) 4489*32kB (UEM) 2041*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508264kB
[7664739.090801] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664739.099666] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664739.108272] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664739.117138] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664739.125745] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664739.134612] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664739.143224] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664739.152091] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664739.160698] 74841 total pagecache pages
[7664739.164710] 0 pages in swap cache
[7664739.168203] Swap cache stats: add 21120768, delete 21136740, find 4513451/7609973
[7664739.175854] Free swap  = 3115472kB
[7664739.179434] Total swap = 4194300kB
[7664739.183015] 66993253 pages RAM
[7664739.186246] 0 pages HighMem/MovableOnly
[7664739.190258] 1101945 pages reserved
[7664739.193838] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664739.201886] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664739.210846] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664739.219640] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664739.228234] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664739.236414] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664739.245027] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664739.253296] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664739.261391] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664739.269919] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664739.278885] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664739.287234] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664739.295496] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664739.304370] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664739.312379] [53969]     0 53969    31572      204      20      169             0 crond
[7664739.320470] [54035]     0 54035    27526      164      10       33             0 agetty
[7664739.328647] [54036]     0 54036    27526      158      11       33             0 agetty
[7664739.336825] [54186]     0 54186    22934      209      46      274             0 master
[7664739.345112] [36317]     0 36317    28294      187      14       61             0 bash
[7664739.353117] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664739.361639] [36329]     0 36329    28177      160      14       55             0 grep
[7664739.369768] Out of memory: Kill process 54186 (master) score 0 or sacrifice child
[7664739.377427] Killed process 54186 (master) total-vm:91736kB, anon-rss:0kB, file-rss:836kB, shmem-rss:0kB
[7664739.610832] master: page allocation failure: order:0, mode:0x200da
[7664739.617198] CPU: 44 PID: 54186 Comm: master Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664739.629795] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664739.637622] Call Trace:
[7664739.640255]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664739.645572]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664739.651673]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664739.657254]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664739.663268]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664739.669796]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664739.676329]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664739.682169]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664739.688782]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664739.695049]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664739.700967]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664739.706974]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664739.712893]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664739.718812]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664739.724385]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664739.729700]  [<ffffffffa0037c36>] ? save_xstate_sig+0x166/0x1f0
[7664739.735800]  [<ffffffffa0037c23>] ? save_xstate_sig+0x153/0x1f0
[7664739.741891]  [<ffffffffa01ed3ad>] ? handle_mm_fault+0x39d/0x9b0
[7664739.747985]  [<ffffffffa002b949>] do_signal+0x479/0x6f0
[7664739.753383]  [<ffffffffa0772628>] ? __do_page_fault+0x228/0x4f0
[7664739.759475]  [<ffffffffa002bc32>] do_notify_resume+0x72/0xc0
[7664739.765309]  [<ffffffffa076e56c>] retint_signal+0x48/0x8c
[7664739.770881] Mem-Info:
[7664739.773358] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34293 inactive_file:37546 isolated_file:2368
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824056 slab_unreclaimable:62296628
 mapped:1587 shmem:0 pagetables:854 bounce:0
 free:590151 free_pcp:0 free_cma:0
[7664739.807542] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664739.849280] lowmem_reserve[]: 0 1418 63868 63868
[7664739.854205] Node 0 DMA32 free:261324kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1256kB inactive_file:3552kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686208kB kernel_stack:352kB pagetables:4kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:71468 all_unreclaimable? yes
[7664739.898983] lowmem_reserve[]: 0 0 62450 62450
[7664739.903648] Node 0 Normal free:508264kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:43792kB inactive_file:43952kB unevictable:168kB isolated(anon):0kB isolated(file):6016kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610912kB slab_unreclaimable:60243624kB kernel_stack:6352kB pagetables:788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1218480 all_unreclaimable? no
[7664739.950423] lowmem_reserve[]: 0 0 0 0
[7664739.954392] Node 1 Normal free:525328kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16676kB inactive_file:16860kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711308kB slab_unreclaimable:63411340kB kernel_stack:20816kB pagetables:1524kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:382715 all_unreclaimable? yes
[7664740.001254] lowmem_reserve[]: 0 0 0 0
[7664740.005222] Node 2 Normal free:525196kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31244kB inactive_file:41008kB unevictable:8680kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715244kB slab_unreclaimable:62476060kB kernel_stack:7760kB pagetables:332kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:322787 all_unreclaimable? yes
[7664740.051998] lowmem_reserve[]: 0 0 0 0
[7664740.055970] Node 3 Normal free:524588kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:43948kB inactive_file:38428kB unevictable:840kB isolated(anon):0kB isolated(file):5120kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854272kB slab_unreclaimable:62369280kB kernel_stack:4208kB pagetables:768kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1881692 all_unreclaimable? yes
[7664740.102838] lowmem_reserve[]: 0 0 0 0
[7664740.106805] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664740.121643] Node 0 DMA32: 305*4kB (UEM) 413*8kB (UEM) 1216*16kB (UEM) 3687*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261324kB
[7664740.138134] Node 0 Normal: 6322*4kB (UEM) 5759*8kB (UEM) 3942*16kB (UEM) 4489*32kB (UEM) 2041*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508800kB
[7664740.154975] Node 1 Normal: 88056*4kB (UEM) 21636*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525328kB
[7664740.168503] Node 2 Normal: 27435*4kB (UEM) 40216*8kB (UEM) 896*16kB (UEM) 1669*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525196kB
[7664740.183904] Node 3 Normal: 131148*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524592kB
[7664740.196342] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664740.205211] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664740.213823] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664740.222690] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664740.231300] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664740.240171] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664740.248778] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664740.257650] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664740.266256] 74854 total pagecache pages
[7664740.270272] 0 pages in swap cache
[7664740.273761] Swap cache stats: add 21120768, delete 21136740, find 4513451/7609973
[7664740.281415] Free swap  = 3115472kB
[7664740.284994] Total swap = 4194300kB
[7664740.288575] 66993253 pages RAM
[7664740.291806] 0 pages HighMem/MovableOnly
[7664740.295818] 1101945 pages reserved
[7664740.356458] ll_ost_io01_033 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664740.364897] ll_ost_io01_033 cpuset=/ mems_allowed=1
[7664740.369959] CPU: 25 PID: 123087 Comm: ll_ost_io01_033 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664740.383426] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664740.391257] Call Trace:
[7664740.393894]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664740.399208]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664740.404703]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664740.410547]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664740.416301]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664740.422312]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664740.428664]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664740.434760]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664740.440515]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664740.447048]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664740.453583]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664740.459761]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664740.465766]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664740.471873]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664740.478751]  [<ffffffffa076d7a0>] ? _raw_spin_lock+0x20/0x30
[7664740.484589]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664740.491817]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664740.497984]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664740.504642]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664740.511729]  [<ffffffffc11dcbe7>] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
[7664740.518385]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664740.525440]  [<ffffffffa00dca58>] ? __enqueue_entity+0x78/0x80
[7664740.531453]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664740.536981]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664740.544069]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664740.551820]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664740.559074]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664740.566937]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664740.573941]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664740.581892]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664740.589149]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664740.595634]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664740.603208]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664740.608267]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664740.614534]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664740.621153]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664740.627419] Mem-Info:
[7664740.629877] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:33694 inactive_file:35804 isolated_file:4416
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824056 slab_unreclaimable:62296628
 mapped:1587 shmem:0 pagetables:854 bounce:0
 free:590255 free_pcp:0 free_cma:0
[7664740.664060] Node 1 Normal free:525328kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16676kB inactive_file:16316kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711308kB slab_unreclaimable:63411340kB kernel_stack:20816kB pagetables:1524kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:135299 all_unreclaimable? yes
[7664740.710924] lowmem_reserve[]: 0 0 0 0
[7664740.714891] Node 1 Normal: 88058*4kB (UEM) 21636*8kB (UM) 1*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525336kB
[7664740.728425] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664740.737298] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664740.745910] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664740.754776] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664740.763384] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664740.772248] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664740.780856] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664740.789722] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664740.798329] 74854 total pagecache pages
[7664740.802342] 0 pages in swap cache
[7664740.805832] Swap cache stats: add 21120776, delete 21136748, find 4513454/7609978
[7664740.813486] Free swap  = 3116496kB
[7664740.817063] Total swap = 4194300kB
[7664740.820646] 66993253 pages RAM
[7664740.823876] 0 pages HighMem/MovableOnly
[7664740.827889] 1101945 pages reserved
[7664740.831468] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664740.839519] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664740.848475] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664740.857262] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664740.865840] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664740.874017] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664740.882633] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664740.890899] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664740.898987] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664740.907513] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664740.916467] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664740.924820] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664740.933082] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664740.941954] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664740.949956] [53969]     0 53969    31572      204      20      169             0 crond
[7664740.958048] [54035]     0 54035    27526      164      10       33             0 agetty
[7664740.966222] [54036]     0 54036    27526      158      11       33             0 agetty
[7664740.974486] [36317]     0 36317    28294      187      14       61             0 bash
[7664740.982488] [36328]     0 36328   154746      223     201       98             0 journalctl
[7664740.991007] [36329]     0 36329    28177      160      14       55             0 grep
[7664740.999132] Out of memory: Kill process 36328 (journalctl) score 0 or sacrifice child
[7664741.007137] Killed process 36328 (journalctl) total-vm:618984kB, anon-rss:0kB, file-rss:892kB, shmem-rss:0kB
[7664741.188673] ll_ost_io02_052 invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
[7664741.196863] ll_ost_io02_052 cpuset=/ mems_allowed=2
[7664741.201933] CPU: 34 PID: 6885 Comm: ll_ost_io02_052 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664741.215226] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664741.223058] Call Trace:
[7664741.225701]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664741.231026]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664741.236518]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664741.242358]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664741.248110]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664741.254127]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664741.260485]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664741.266576]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664741.272324]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664741.278852]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664741.285387]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664741.291646]  [<ffffffffc124293f>] tgt_checksum_niobuf_rw+0xbf/0xe00 [ptlrpc]
[7664741.298903]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664741.306236]  [<ffffffffc0cb71e0>] ? obd_dif_crc_fn+0x20/0x20 [obdclass]
[7664741.313084]  [<ffffffffc1247325>] tgt_brw_read+0xc35/0x1e50 [ptlrpc]
[7664741.319637]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664741.326990]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664741.334334]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664741.341851]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664741.348776]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664741.355868]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664741.363622]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664741.370883]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664741.378758]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664741.385725]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664741.391164]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664741.397644]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664741.405212]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664741.410274]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664741.416550]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664741.423168]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664741.429435] Mem-Info:
[7664741.431894] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:33739 inactive_file:35965 isolated_file:4284
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824056 slab_unreclaimable:62296622
 mapped:1587 shmem:0 pagetables:607 bounce:0
 free:590432 free_pcp:0 free_cma:0
[7664741.466081] Node 2 Normal free:525284kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31244kB inactive_file:39600kB unevictable:8680kB isolated(anon):0kB isolated(file):1280kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715244kB slab_unreclaimable:62476044kB kernel_stack:7760kB pagetables:260kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:208183 all_unreclaimable? yes
[7664741.513123] lowmem_reserve[]: 0 0 0 0
[7664741.517097] Node 2 Normal: 27445*4kB (UEM) 40220*8kB (UEM) 897*16kB (UEM) 1669*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525284kB
[7664741.532499] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664741.541366] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664741.549972] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664741.558838] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664741.567447] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664741.576319] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664741.584924] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664741.593789] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664741.602395] 74986 total pagecache pages
[7664741.606412] 0 pages in swap cache
[7664741.609912] Swap cache stats: add 21120779, delete 21136751, find 4513455/7609983
[7664741.617562] Free swap  = 3116752kB
[7664741.621142] Total swap = 4194300kB
[7664741.624724] 66993253 pages RAM
[7664741.627952] 0 pages HighMem/MovableOnly
[7664741.631966] 1101945 pages reserved
[7664741.635546] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664741.643594] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664741.652553] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664741.661348] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664741.669941] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664741.678123] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664741.686735] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664741.695005] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664741.703101] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664741.711627] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664741.720586] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664741.728933] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664741.737202] [53180]     0 53180     6704      219      18      222             0 systemd-logind
[7664741.746078] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664741.754088] [53969]     0 53969    31572      204      20      169             0 crond
[7664741.762179] [54035]     0 54035    27526      164      10       33             0 agetty
[7664741.770360] [54036]     0 54036    27526      158      11       33             0 agetty
[7664741.778659] [36317]     0 36317    28294      187      14       61             0 bash
[7664741.786662] [36329]     0 36329    28177      160      14       55             0 grep
[7664741.794783] Out of memory: Kill process 53180 (systemd-logind) score 0 or sacrifice child
[7664741.803137] Killed process 53180 (systemd-logind) total-vm:26816kB, anon-rss:0kB, file-rss:876kB, shmem-rss:0kB
[7664741.817391] systemd-logind: page allocation failure: order:0, mode:0x200da
[7664741.824446] CPU: 27 PID: 53180 Comm: systemd-logind Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664741.837738] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664741.845570] Call Trace:
[7664741.848199]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664741.853524]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664741.859619]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664741.865195]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664741.871206]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664741.877732]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664741.884261]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664741.890100]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664741.896712]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664741.902977]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664741.908898]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664741.914904]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664741.920824]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664741.926741]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664741.929616] LustreError: 90696:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c14ee558c00
[7664741.943272]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664741.948594]  [<ffffffffa028f7a1>] ? ep_send_events_proc+0x101/0x1b0
[7664741.955040]  [<ffffffffa076a17d>] ? schedule_hrtimeout_range_clock+0x12d/0x150
[7664741.962436]  [<ffffffffa028f6a0>] ? ep_ptable_queue_proc+0xb0/0xb0
[7664741.968794]  [<ffffffffa028fe1a>] ep_scan_ready_list.isra.7+0x9a/0x1f0
[7664741.975496]  [<ffffffffa02900b3>] ep_poll+0x123/0x360
[7664741.980729]  [<ffffffffa00d7c40>] ? wake_up_state+0x20/0x20
[7664741.986476]  [<ffffffffa029169d>] SyS_epoll_wait+0xed/0x120
[7664741.992229]  [<ffffffffa0777ddb>] system_call_fastpath+0x22/0x27
[7664741.998407] Mem-Info:
[7664742.000885] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:33879 inactive_file:38131 isolated_file:2848
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824058 slab_unreclaimable:62296622
 mapped:1587 shmem:0 pagetables:607 bounce:0
 free:590294 free_pcp:0 free_cma:0
[7664742.035073] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664742.076829] lowmem_reserve[]: 0 1418 63868 63868
[7664742.081756] Node 0 DMA32 free:261328kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1256kB inactive_file:3552kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686208kB kernel_stack:352kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:21814 all_unreclaimable? yes
[7664742.126538] lowmem_reserve[]: 0 0 62450 62450
[7664742.131201] Node 0 Normal free:508776kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:43564kB inactive_file:47148kB unevictable:168kB isolated(anon):0kB isolated(file):5504kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610920kB slab_unreclaimable:60243624kB kernel_stack:5856kB pagetables:492kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:543077 all_unreclaimable? yes
[7664742.177976] lowmem_reserve[]: 0 0 0 0
[7664742.181946] Node 1 Normal free:524952kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:17052kB inactive_file:16828kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711308kB slab_unreclaimable:63411332kB kernel_stack:20816kB pagetables:1212kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:181939 all_unreclaimable? yes
[7664742.228810] lowmem_reserve[]: 0 0 0 0
[7664742.232796] Node 2 Normal free:525280kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31592kB inactive_file:34256kB unevictable:8680kB isolated(anon):0kB isolated(file):6784kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715248kB slab_unreclaimable:62476044kB kernel_stack:7760kB pagetables:260kB unstable:0kB bounce:0kB free_pcp:4kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:416034 all_unreclaimable? yes
[7664742.279828] lowmem_reserve[]: 0 0 0 0
[7664742.283799] Node 3 Normal free:524964kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:40208kB inactive_file:41396kB unevictable:840kB isolated(anon):0kB isolated(file):4864kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854272kB slab_unreclaimable:62369280kB kernel_stack:4208kB pagetables:464kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:112249 all_unreclaimable? no
[7664742.330493] lowmem_reserve[]: 0 0 0 0
[7664742.334459] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664742.349296] Node 0 DMA32: 306*4kB (UEM) 413*8kB (UEM) 1216*16kB (UEM) 3687*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261328kB
[7664742.365792] Node 0 Normal: 6278*4kB (EM) 5759*8kB (UEM) 3950*16kB (UEM) 4490*32kB (UEM) 2042*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508848kB
[7664742.382544] Node 1 Normal: 87986*4kB (UEM) 21562*8kB (UM) 4*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524504kB
[7664742.396074] Node 2 Normal: 27443*4kB (UEM) 40221*8kB (UEM) 897*16kB (UEM) 1669*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525284kB
[7664742.411473] Node 3 Normal: 131235*4kB (UEM) 6*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524988kB
[7664742.424284] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664742.433148] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664742.441757] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664742.450625] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664742.459238] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664742.468104] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664742.476711] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664742.485577] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664742.494181] 75195 total pagecache pages
[7664742.498193] 0 pages in swap cache
[7664742.501686] Swap cache stats: add 21120779, delete 21136751, find 4513455/7609983
[7664742.509339] Free swap  = 3116752kB
[7664742.512917] Total swap = 4194300kB
[7664742.516499] 66993253 pages RAM
[7664742.519728] 0 pages HighMem/MovableOnly
[7664742.523742] 1101945 pages reserved
[7664743.092356] ll_ost_io00_067 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664743.100797] ll_ost_io00_067 cpuset=/ mems_allowed=0
[7664743.105859] CPU: 24 PID: 96095 Comm: ll_ost_io00_067 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664743.119237] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664743.127065] Call Trace:
[7664743.129697]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664743.135015]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664743.140509]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664743.146350]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664743.152105]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664743.158118]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664743.164473]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664743.170573]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664743.176320]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664743.182855]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664743.189395]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664743.195579]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664743.201594]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664743.207706]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664743.214591]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664743.221817]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664743.227962]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664743.234577]  [<ffffffffa021ccc1>] ? __slab_free+0x81/0x2f0
[7664743.240235]  [<ffffffffa021ccc1>] ? __slab_free+0x81/0x2f0
[7664743.245897]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664743.251650]  [<ffffffffa00ddd9e>] ? account_entity_dequeue+0xae/0xd0
[7664743.258176]  [<ffffffffa00e192c>] ? dequeue_entity+0x11c/0x5e0
[7664743.264183]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664743.269704]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664743.276783]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664743.284529]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664743.291784]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664743.299646]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664743.306607]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664743.312053]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664743.318529]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664743.326099]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664743.331151]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664743.337418]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664743.344039]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664743.350312] Mem-Info:
[7664743.352779] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:35180 inactive_file:37446 isolated_file:1152
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824059 slab_unreclaimable:62296617
 mapped:1587 shmem:0 pagetables:589 bounce:0
 free:590070 free_pcp:0 free_cma:0
[7664743.386960] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664743.428711] lowmem_reserve[]: 0 1418 63868 63868
[7664743.433634] Node 0 DMA32 free:261328kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1256kB inactive_file:3552kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686208kB kernel_stack:352kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:30550 all_unreclaimable? yes
[7664743.478419] lowmem_reserve[]: 0 0 62450 62450
[7664743.483087] Node 0 Normal free:508020kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:48472kB inactive_file:47780kB unevictable:168kB isolated(anon):0kB isolated(file):256kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610920kB slab_unreclaimable:60243616kB kernel_stack:6160kB pagetables:424kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1505522 all_unreclaimable? yes
[7664743.529862] lowmem_reserve[]: 0 0 0 0
[7664743.533829] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664743.548667] Node 0 DMA32: 306*4kB (UEM) 414*8kB (UEM) 1216*16kB (UEM) 3687*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261336kB
[7664743.565160] Node 0 Normal: 6218*4kB (UEM) 5747*8kB (UEM) 3920*16kB (UEM) 4490*32kB (UEM) 2042*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508032kB
[7664743.582001] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664743.590869] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664743.599474] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664743.608340] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664743.616947] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664743.625822] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664743.634434] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664743.643302] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664743.651907] 75346 total pagecache pages
[7664743.655920] 0 pages in swap cache
[7664743.659412] Swap cache stats: add 21120783, delete 21136755, find 4513455/7609984
[7664743.667064] Free swap  = 3117520kB
[7664743.670643] Total swap = 4194300kB
[7664743.674223] 66993253 pages RAM
[7664743.677456] 0 pages HighMem/MovableOnly
[7664743.681469] 1101945 pages reserved
[7664743.685049] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664743.693092] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664743.702045] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664743.710833] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664743.719420] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664743.727597] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664743.736212] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664743.744480] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664743.752575] [53106]     0 53106     5514      188      15      221             0 irqbalance
[7664743.761101] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664743.770052] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664743.778399] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664743.786664] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664743.794667] [53969]     0 53969    31572      204      20      169             0 crond
[7664743.802755] [54035]     0 54035    27526      164      10       33             0 agetty
[7664743.810934] [54036]     0 54036    27526      158      11       33             0 agetty
[7664743.819220] [36317]     0 36317    28294      187      14       61             0 bash
[7664743.827222] [36329]     0 36329    28177      160      14       55             0 grep
[7664743.835345] Out of memory: Kill process 53106 (irqbalance) score 0 or sacrifice child
[7664743.843349] Killed process 53106 (irqbalance) total-vm:22056kB, anon-rss:0kB, file-rss:752kB, shmem-rss:0kB
[7664743.938893] irqbalance: page allocation failure: order:0, mode:0x200da
[7664743.945604] CPU: 20 PID: 53106 Comm: irqbalance Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664743.958549] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664743.966383] Call Trace:
[7664743.969016]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664743.974334]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664743.980432]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664743.986014]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664743.992028]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664743.998567]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664744.005098]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664744.010942]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664744.017559]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664744.023829]  [<ffffffffa0200cd2>] swapin_readahead+0xe2/0x110
[7664744.029753]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664744.035761]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664744.041687]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664744.047608]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664744.053188]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664744.058499] Mem-Info:
[7664744.060974] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34697 inactive_file:36940 isolated_file:1934
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824059 slab_unreclaimable:62296617
 mapped:1587 shmem:0 pagetables:589 bounce:0
 free:590053 free_pcp:0 free_cma:0
[7664744.095157] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664744.136908] lowmem_reserve[]: 0 1418 63868 63868
[7664744.141832] Node 0 DMA32 free:261316kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1148kB inactive_file:1284kB unevictable:0kB isolated(anon):0kB isolated(file):1332kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686208kB kernel_stack:352kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:766438 all_unreclaimable? yes
[7664744.186962] lowmem_reserve[]: 0 0 62450 62450
[7664744.191631] Node 0 Normal free:508032kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:47368kB inactive_file:46768kB unevictable:168kB isolated(anon):0kB isolated(file):896kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610920kB slab_unreclaimable:60243616kB kernel_stack:6096kB pagetables:424kB unstable:0kB bounce:0kB free_pcp:8kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:320428 all_unreclaimable? yes
[7664744.238321] lowmem_reserve[]: 0 0 0 0
[7664744.242297] Node 1 Normal free:524680kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:15772kB inactive_file:17576kB unevictable:26488kB isolated(anon):0kB isolated(file):1408kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711308kB slab_unreclaimable:63411336kB kernel_stack:20816kB pagetables:1212kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:752627 all_unreclaimable? yes
[7664744.289420] lowmem_reserve[]: 0 0 0 0
[7664744.293396] Node 2 Normal free:525300kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31240kB inactive_file:39344kB unevictable:8680kB isolated(anon):0kB isolated(file):1536kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715248kB slab_unreclaimable:62476028kB kernel_stack:7760kB pagetables:260kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:385766 all_unreclaimable? yes
[7664744.340434] lowmem_reserve[]: 0 0 0 0
[7664744.344409] Node 3 Normal free:525128kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:44156kB inactive_file:43388kB unevictable:840kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854272kB slab_unreclaimable:62369280kB kernel_stack:4208kB pagetables:460kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:596549 all_unreclaimable? yes
[7664744.390925] lowmem_reserve[]: 0 0 0 0
[7664744.394900] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664744.409738] Node 0 DMA32: 465*4kB (UEM) 401*8kB (EM) 1212*16kB (UEM) 3687*32kB (UEM) 1489*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261804kB
[7664744.426143] Node 0 Normal: 6224*4kB (UEM) 5748*8kB (UEM) 3911*16kB (UEM) 4490*32kB (UEM) 2042*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 507920kB
[7664744.442983] Node 1 Normal: 88031*4kB (UEM) 21563*8kB (UM) 4*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524692kB
[7664744.456513] Node 2 Normal: 27443*4kB (UEM) 40221*8kB (UEM) 898*16kB (UEM) 1669*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525300kB
[7664744.471914] Node 3 Normal: 131268*4kB (UEM) 7*8kB (U) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525128kB
[7664744.484724] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664744.493591] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664744.502206] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664744.511070] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664744.519679] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664744.528552] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664744.537159] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664744.546023] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664744.554630] 75187 total pagecache pages
[7664744.558644] 0 pages in swap cache
[7664744.562137] Swap cache stats: add 21120784, delete 21136756, find 4513455/7609984
[7664744.569787] Free swap  = 3117520kB
[7664744.573368] Total swap = 4194300kB
[7664744.576949] 66993253 pages RAM
[7664744.580186] 0 pages HighMem/MovableOnly
[7664744.584201] 1101945 pages reserved
[7664744.670646] ll_ost_io03_053 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664744.679092] ll_ost_io03_053 cpuset=/ mems_allowed=3
[7664744.684148] CPU: 31 PID: 7277 Comm: ll_ost_io03_053 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664744.697444] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664744.705274] Call Trace:
[7664744.707911]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664744.713224]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664744.718712]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664744.724551]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664744.730297]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664744.736303]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664744.742655]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664744.748749]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664744.754497]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664744.761031]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664744.767568]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664744.773753]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664744.779759]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664744.785866]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664744.792752]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664744.799982]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664744.806146]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664744.812797]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664744.819845]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664744.825598]  [<ffffffffa006213e>] ? physflat_send_IPI_mask+0xe/0x10
[7664744.832036]  [<ffffffffa0056f42>] ? native_smp_send_reschedule+0x52/0x70
[7664744.838909]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664744.844440]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664744.851525]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664744.859280]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664744.866543]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664744.874412]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664744.881411]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664744.889360]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664744.896621]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664744.903096]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664744.910663]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664744.915725]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664744.921998]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664744.924369] LustreError: 80392:0:(events.c:305:request_in_callback()) event type 2, status -103, service ost_io
[7664744.924379] LustreError: 80392:0:(events.c:305:request_in_callback()) event type 2, status -103, service ost_io
[7664744.924388] LustreError: 3033:0:(pack_generic.c:605:__lustre_unpack_msg()) message length 0 too small for magic/version check
[7664744.924391] LustreError: 80392:0:(events.c:305:request_in_callback()) event type 2, status -103, service ost_io
[7664744.924396] LustreError: 3033:0:(sec.c:2191:sptlrpc_svc_unwrap_request()) error unpacking request from 12345-10.49.20.19@o2ib1 x1659023349352448
[7664744.924399] LustreError: 80392:0:(events.c:305:request_in_callback()) event type 2, status -103, service ost_io
[7664744.994212]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664745.000484] Mem-Info:
[7664745.002942] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:33705 inactive_file:36695 isolated_file:2944
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824059 slab_unreclaimable:62296617
 mapped:1587 shmem:0 pagetables:574 bounce:0
 free:590210 free_pcp:0 free_cma:0
[7664745.037135] Node 3 Normal free:525128kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:41852kB inactive_file:42144kB unevictable:840kB isolated(anon):0kB isolated(file):512kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854272kB slab_unreclaimable:62369280kB kernel_stack:4208kB pagetables:460kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:302072 all_unreclaimable? yes
[7664745.083823] lowmem_reserve[]: 0 0 0 0
[7664745.087790] Node 3 Normal: 131274*4kB (UEM) 7*8kB (U) 5*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525232kB
[7664745.100976] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664745.109850] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664745.118456] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664745.127322] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664745.135926] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664745.144794] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664745.153399] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664745.162264] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664745.170871] 75179 total pagecache pages
[7664745.174885] 0 pages in swap cache
[7664745.178377] Swap cache stats: add 21120785, delete 21136757, find 4513455/7609985
[7664745.186030] Free swap  = 3118288kB
[7664745.189609] Total swap = 4194300kB
[7664745.193191] 66993253 pages RAM
[7664745.196421] 0 pages HighMem/MovableOnly
[7664745.200432] 1101945 pages reserved
[7664745.204012] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664745.212060] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664745.221020] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664745.229817] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664745.238420] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664745.246596] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664745.255201] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664745.256579] LustreError: 107084:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c39d93b1e00
[7664745.268215] LustreError: 6896:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c41ea2f7200
[7664745.277951] LustreError: 27189:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c2126afb200
[7664745.296302] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664745.304394] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664745.313348] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664745.321702] [53139]   997 53139    29446      250      28      128             0 chronyd
[7664745.329976] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664745.337979] [53969]     0 53969    31572      204      20      169             0 crond
[7664745.346072] [54035]     0 54035    27526      164      10       33             0 agetty
[7664745.354244] [54036]     0 54036    27526      158      11       33             0 agetty
[7664745.362540] [36317]     0 36317    28294      187      14       61             0 bash
[7664745.370547] [36329]     0 36329    28177      160      14       55             0 grep
[7664745.378674] Out of memory: Kill process 53139 (chronyd) score 0 or sacrifice child
[7664745.386415] Killed process 53139 (chronyd) total-vm:117784kB, anon-rss:0kB, file-rss:1000kB, shmem-rss:0kB
[7664745.533835] ll_ost_io03_093 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664745.542270] ll_ost_io03_093 cpuset=/ mems_allowed=3
[7664745.547324] CPU: 19 PID: 8717 Comm: ll_ost_io03_093 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664745.560619] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664745.568445] Call Trace:
[7664745.571083]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664745.576402]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664745.581899]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664745.587736]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664745.593486]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664745.599500]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664745.605252]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664745.611779]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664745.618312]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664745.624491]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664745.630498]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664745.636604]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664745.643489]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664745.649645]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664745.656744]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664745.663307]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664745.669953]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664745.677299]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664745.684038]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664745.691386]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664745.698908]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664745.705829]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664745.712925]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664745.720676]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664745.724600] LustreError: 8712:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c5075370e00
[7664745.738793]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664745.746661]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664745.753662]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664745.761610]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664745.768869]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664745.775345]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664745.782914]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664745.787974]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664745.794240]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664745.800853]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664745.807116] Mem-Info:
[7664745.809578] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:33054 inactive_file:36598 isolated_file:2939
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824050 slab_unreclaimable:62296614
 mapped:1587 shmem:0 pagetables:574 bounce:0
 free:590189 free_pcp:0 free_cma:0
[7664745.843767] Node 3 Normal free:525268kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:41248kB inactive_file:46336kB unevictable:840kB isolated(anon):0kB isolated(file):896kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854244kB slab_unreclaimable:62369264kB kernel_stack:4208kB pagetables:460kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:235339 all_unreclaimable? yes
[7664745.890456] lowmem_reserve[]: 0 0 0 0
[7664745.894424] Node 3 Normal: 131354*4kB (UEM) 7*8kB (U) 5*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525552kB
[7664745.907610] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664745.916474] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664745.925080] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664745.933946] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664745.942553] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664745.951419] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664745.960024] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664745.968892] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664745.977499] 75199 total pagecache pages
[7664745.981509] 0 pages in swap cache
[7664745.985001] Swap cache stats: add 21120788, delete 21136760, find 4513458/7609990
[7664745.992655] Free swap  = 3118800kB
[7664745.996233] Total swap = 4194300kB
[7664745.999816] 66993253 pages RAM
[7664746.003044] 0 pages HighMem/MovableOnly
[7664746.007057] 1101945 pages reserved
[7664746.010636] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664746.018685] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664746.027646] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664746.036437] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664746.045032] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664746.053213] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664746.061827] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664746.070094] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664746.078185] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664746.087142] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664746.095493] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664746.103499] [53969]     0 53969    31572      204      20      169             0 crond
[7664746.108102] Lustre: fir-OST001b: Bulk IO read error with fb2c1382-8f5a-4 (at 10.50.15.10@o2ib2), client will retry: rc -110
[7664746.108104] Lustre: Skipped 10 previous similar messages
[7664746.128371] [54035]     0 54035    27526      164      10       33             0 agetty
[7664746.136550] [54036]     0 54036    27526      158      11       33             0 agetty
[7664746.144846] [36317]     0 36317    28294      187      14       61             0 bash
[7664746.152845] [36329]     0 36329    28177      160      14       55             0 grep
[7664746.160974] Out of memory: Kill process 53969 (crond) score 0 or sacrifice child
[7664746.168540] Killed process 53969 (crond) total-vm:126288kB, anon-rss:0kB, file-rss:816kB, shmem-rss:0kB
[7664747.313604] ll_ost_io02_075 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664747.322052] ll_ost_io02_075 cpuset=/ mems_allowed=2
[7664747.327113] CPU: 6 PID: 83185 Comm: ll_ost_io02_075 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664747.340402] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664747.348238] Call Trace:
[7664747.350881]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664747.356203]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664747.361697]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664747.367542]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664747.373295]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664747.379310]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664747.385673]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664747.391772]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664747.397530]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664747.404065]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664747.410599]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664747.416785]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664747.422800]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664747.428919]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664747.435804]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664747.443035]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664747.449207]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664747.455859]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664747.462958]  [<ffffffffc11dcbe7>] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
[7664747.469586]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664747.475338]  [<ffffffffa00dca58>] ? __enqueue_entity+0x78/0x80
[7664747.481350]  [<ffffffffa00e367f>] ? enqueue_entity+0x2ef/0xbe0
[7664747.487366]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664747.492901]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664747.499992]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664747.507747]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664747.515008]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664747.522875]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664747.529842]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664747.535291]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664747.541769]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664747.549346]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664747.554405]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664747.560680]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664747.567300]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664747.573576] Mem-Info:
[7664747.576043] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:33450 inactive_file:37198 isolated_file:3040
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824058 slab_unreclaimable:62296629
 mapped:1587 shmem:0 pagetables:554 bounce:0
 free:590294 free_pcp:0 free_cma:0
[7664747.610237] Node 2 Normal free:525308kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31240kB inactive_file:39464kB unevictable:8680kB isolated(anon):0kB isolated(file):1408kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715244kB slab_unreclaimable:62476028kB kernel_stack:7760kB pagetables:260kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:552020 all_unreclaimable? yes
[7664747.657278] lowmem_reserve[]: 0 0 0 0
[7664747.661250] Node 2 Normal: 27465*4kB (UEM) 40220*8kB (UEM) 899*16kB (UEM) 1669*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525396kB
[7664747.676672] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664747.685541] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664747.694155] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664747.703030] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664747.711643] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664747.720510] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664747.729128] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664747.737997] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664747.746608] 75248 total pagecache pages
[7664747.750627] 0 pages in swap cache
[7664747.754130] Swap cache stats: add 21120793, delete 21136765, find 4513458/7609991
[7664747.761788] Free swap  = 3119312kB
[7664747.765369] Total swap = 4194300kB
[7664747.768952] 66993253 pages RAM
[7664747.772190] 0 pages HighMem/MovableOnly
[7664747.776210] 1101945 pages reserved
[7664747.779790] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664747.787844] [ 5686]     0  5686    16012      235      39      106             0 systemd-journal
[7664747.796806] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664747.805603] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664747.814200] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664747.822384] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664747.830993] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664747.839257] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664747.847349] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664747.856313] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664747.864673] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664747.872685] [54035]     0 54035    27526      164      10       33             0 agetty
[7664747.880866] [54036]     0 54036    27526      158      11       33             0 agetty
[7664747.889157] [36317]     0 36317    28294      187      14       61             0 bash
[7664747.897165] [36329]     0 36329    28177      160      14       55             0 grep
[7664747.905283] Out of memory: Kill process 5686 (systemd-journal) score 0 or sacrifice child
[7664747.913630] Killed process 5686 (systemd-journal) total-vm:64048kB, anon-rss:0kB, file-rss:940kB, shmem-rss:0kB
[7664748.370656] systemd-journal: page allocation failure: order:0, mode:0x200da
[7664748.377799] CPU: 25 PID: 5686 Comm: systemd-journal Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664748.391093] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664748.398928] Call Trace:
[7664748.401567]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664748.406886]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664748.412984]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664748.418557]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664748.424564]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664748.431100]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664748.437634]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664748.443476]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664748.450096]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664748.456360]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664748.462281]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664748.468290]  [<ffffffffa028f800>] ? ep_send_events_proc+0x160/0x1b0
[7664748.474734]  [<ffffffffa076a17d>] ? schedule_hrtimeout_range_clock+0x12d/0x150
[7664748.482127]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664748.488048]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664748.493967]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664748.499538]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664748.504852] Mem-Info:
[7664748.507324] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34985 inactive_file:37342 isolated_file:3008
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824059 slab_unreclaimable:62296658
 mapped:1587 shmem:0 pagetables:526 bounce:0
 free:590176 free_pcp:0 free_cma:0
[7664748.541508] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664748.583261] lowmem_reserve[]: 0 1418 63868 63868
[7664748.588182] Node 0 DMA32 free:261344kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:772kB inactive_file:3472kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686220kB kernel_stack:352kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:14958 all_unreclaimable? yes
[7664748.633050] lowmem_reserve[]: 0 0 62450 62450
[7664748.637712] Node 0 Normal free:508380kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:48844kB inactive_file:43540kB unevictable:168kB isolated(anon):0kB isolated(file):5376kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610952kB slab_unreclaimable:60243736kB kernel_stack:6128kB pagetables:424kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:65243 all_unreclaimable? no
[7664748.684316] lowmem_reserve[]: 0 0 0 0
[7664748.688292] Node 1 Normal free:524924kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:16916kB inactive_file:16808kB unevictable:26488kB isolated(anon):0kB isolated(file):256kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711308kB slab_unreclaimable:63411336kB kernel_stack:20816kB pagetables:1072kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:166087 all_unreclaimable? yes
[7664748.735323] lowmem_reserve[]: 0 0 0 0
[7664748.739295] Node 2 Normal free:525368kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31240kB inactive_file:39456kB unevictable:8680kB isolated(anon):0kB isolated(file):1408kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715244kB slab_unreclaimable:62476060kB kernel_stack:7760kB pagetables:196kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:555146 all_unreclaimable? yes
[7664748.786332] lowmem_reserve[]: 0 0 0 0
[7664748.790299] Node 3 Normal free:524832kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:42564kB inactive_file:43740kB unevictable:840kB isolated(anon):0kB isolated(file):2944kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854244kB slab_unreclaimable:62369280kB kernel_stack:4208kB pagetables:412kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:736429 all_unreclaimable? yes
[7664748.837075] lowmem_reserve[]: 0 0 0 0
[7664748.841042] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664748.855878] Node 0 DMA32: 399*4kB (UEM) 401*8kB (EM) 1212*16kB (UEM) 3689*32kB (UEM) 1491*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261732kB
[7664748.872286] Node 0 Normal: 6283*4kB (UEM) 5750*8kB (UEM) 3945*16kB (UEM) 4487*32kB (UEM) 2042*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508620kB
[7664748.889126] Node 1 Normal: 88086*4kB (UEM) 21564*8kB (UM) 4*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524920kB
[7664748.902654] Node 2 Normal: 27465*4kB (UEM) 40221*8kB (UEM) 899*16kB (UEM) 1667*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525340kB
[7664748.918056] Node 3 Normal: 131206*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524824kB
[7664748.930493] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664748.939360] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664748.947966] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664748.956832] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664748.965446] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664748.974312] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664748.982918] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664748.991789] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664749.000400] 75404 total pagecache pages
[7664749.004413] 0 pages in swap cache
[7664749.007906] Swap cache stats: add 21120793, delete 21136765, find 4513458/7609991
[7664749.015559] Free swap  = 3119312kB
[7664749.019144] Total swap = 4194300kB
[7664749.022726] 66993253 pages RAM
[7664749.025956] 0 pages HighMem/MovableOnly
[7664749.029972] 1101945 pages reserved
[7664749.920145] ll_ost_io02_074 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664749.928587] ll_ost_io02_074 cpuset=/ mems_allowed=2
[7664749.933649] CPU: 22 PID: 83183 Comm: ll_ost_io02_074 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664749.947028] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664749.954855] Call Trace:
[7664749.957487]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664749.962832]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664749.968344]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664749.974218]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664749.980260]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664749.986678]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664749.992790]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664749.998551]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664750.005080]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664750.011639]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664750.017819]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664750.023835]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664750.029967]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664750.036872]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664750.044131]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664750.050320]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664750.056999]  [<ffffffffc0a844f5>] ? lnet_try_match_md+0x1e5/0x330 [lnet]
[7664750.063895]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664750.069645]  [<ffffffffa00dca58>] ? __enqueue_entity+0x78/0x80
[7664750.075652]  [<ffffffffa00e367f>] ? enqueue_entity+0x2ef/0xbe0
[7664750.081659]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664750.087195]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664750.094284]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664750.102035]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664750.109316]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664750.117203]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664750.124217]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664750.132180]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664750.139461]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664750.145977]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664750.153569]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664750.158670]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664750.164949]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664750.171576]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664750.177849] Mem-Info:
[7664750.180321] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:35389 inactive_file:38087 isolated_file:1376
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824059 slab_unreclaimable:62296676
 mapped:1587 shmem:0 pagetables:487 bounce:0
 free:590376 free_pcp:0 free_cma:0
[7664750.214523] Node 2 Normal free:525364kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31240kB inactive_file:40108kB unevictable:8680kB isolated(anon):0kB isolated(file):768kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715244kB slab_unreclaimable:62476092kB kernel_stack:7760kB pagetables:172kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:141218 all_unreclaimable? no
[7664750.261398] lowmem_reserve[]: 0 0 0 0
[7664750.265365] Node 2 Normal: 27467*4kB (UEM) 40223*8kB (UEM) 899*16kB (UEM) 1667*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525364kB
[7664750.280795] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664750.289663] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664750.298275] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664750.307144] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664750.315759] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664750.324631] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664750.333236] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664750.342101] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664750.350710] 75384 total pagecache pages
[7664750.354725] 0 pages in swap cache
[7664750.358224] Swap cache stats: add 21120794, delete 21136766, find 4513460/7609994
[7664750.365875] Free swap  = 3119568kB
[7664750.369453] Total swap = 4194300kB
[7664750.373036] 66993253 pages RAM
[7664750.376266] 0 pages HighMem/MovableOnly
[7664750.380280] 1101945 pages reserved
[7664750.383859] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664750.391909] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664750.400700] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664750.409290] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664750.417474] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664750.426087] [53084]    32 53084    17316      110      37      146             0 rpcbind
[7664750.434355] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664750.442450] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664750.451411] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664750.459770] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664750.467777] [54035]     0 54035    27526      164      10       33             0 agetty
[7664750.475955] [54036]     0 54036    27526      158      11       33             0 agetty
[7664750.484252] [36317]     0 36317    28294      187      14       61             0 bash
[7664750.492258] [36329]     0 36329    28177      160      14       55             0 grep
[7664750.500382] Out of memory: Kill process 53084 (rpcbind) score 0 or sacrifice child
[7664750.508127] Killed process 53084 (rpcbind) total-vm:69264kB, anon-rss:0kB, file-rss:440kB, shmem-rss:0kB
[7664750.719150] rpcbind: page allocation failure: order:0, mode:0x200da
[7664750.725594] CPU: 31 PID: 53084 Comm: rpcbind Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664750.738279] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664750.746105] Call Trace:
[7664750.748736]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664750.754056]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664750.760154]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664750.765732]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664750.771743]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664750.778270]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664750.784805]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664750.790644]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664750.797257]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664750.803523]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664750.809444]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664750.815457]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664750.821377]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664750.827298]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664750.832878]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664750.838189] Mem-Info:
[7664750.840665] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34819 inactive_file:37178 isolated_file:1602
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824059 slab_unreclaimable:62296677
 mapped:1587 shmem:0 pagetables:487 bounce:0
 free:590306 free_pcp:0 free_cma:0
[7664750.874845] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664750.916600] lowmem_reserve[]: 0 1418 63868 63868
[7664750.921529] Node 0 DMA32 free:261348kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:928kB inactive_file:3032kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686228kB kernel_stack:352kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:491959 all_unreclaimable? yes
[7664750.966312] lowmem_reserve[]: 0 0 62450 62450
[7664750.970974] Node 0 Normal free:508604kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:44180kB inactive_file:43236kB unevictable:168kB isolated(anon):0kB isolated(file):9472kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610952kB slab_unreclaimable:60243768kB kernel_stack:6160kB pagetables:392kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:476544 all_unreclaimable? yes
[7664751.017750] lowmem_reserve[]: 0 0 0 0
[7664751.021727] Node 1 Normal free:525092kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:17312kB inactive_file:17260kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711308kB slab_unreclaimable:63411340kB kernel_stack:20816kB pagetables:980kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:53206 all_unreclaimable? yes
[7664751.068416] lowmem_reserve[]: 0 0 0 0
[7664751.072391] Node 2 Normal free:525364kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31240kB inactive_file:40236kB unevictable:8680kB isolated(anon):0kB isolated(file):640kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715244kB slab_unreclaimable:62476092kB kernel_stack:7760kB pagetables:172kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:839306 all_unreclaimable? yes
[7664751.119330] lowmem_reserve[]: 0 0 0 0
[7664751.123302] Node 3 Normal free:524912kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:43584kB inactive_file:44356kB unevictable:840kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854244kB slab_unreclaimable:62369280kB kernel_stack:4208kB pagetables:404kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:481666 all_unreclaimable? yes
[7664751.169816] lowmem_reserve[]: 0 0 0 0
[7664751.173783] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664751.197481] LustreError: 83171:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c363824f600
[7664751.210996] LustreError: 3017:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c105d52a200
[7664751.188620] Node 0 DMA32: 367*4kB (EM) 402*8kB (UEM) 1203*16kB (UEM) 3687*32kB (UEM) 1491*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261404kB
[7664751.226841] Node 0 Normal: 5966*4kB (EM) 5703*8kB (UEM) 3932*16kB (UEM) 4486*32kB (UEM) 2042*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 506736kB
[7664751.243593] Node 1 Normal: 88118*4kB (UEM) 21565*8kB (UM) 7*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525104kB
[7664751.257123] Node 2 Normal: 27467*4kB (UEM) 40223*8kB (UEM) 899*16kB (UEM) 1667*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525364kB
[7664751.272524] Node 3 Normal: 131213*4kB (UEM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524852kB
[7664751.284961] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664751.293827] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664751.302435] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664751.311301] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664751.319905] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664751.328772] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664751.337377] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664751.346244] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664751.354849] 75568 total pagecache pages
[7664751.358862] 0 pages in swap cache
[7664751.362355] Swap cache stats: add 21120794, delete 21136766, find 4513460/7609994
[7664751.370008] Free swap  = 3119568kB
[7664751.373587] Total swap = 4194300kB
[7664751.377167] 66993253 pages RAM
[7664751.380398] 0 pages HighMem/MovableOnly
[7664751.384412] 1101945 pages reserved
[7664751.530657] ll_ost_io02_023 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664751.539118] ll_ost_io02_023 cpuset=/ mems_allowed=2
[7664751.544182] CPU: 6 PID: 123044 Comm: ll_ost_io02_023 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664751.557561] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664751.565399] Call Trace:
[7664751.568039]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664751.573360]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664751.578855]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664751.584700]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664751.590452]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664751.596468]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664751.602828]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664751.608931]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664751.614685]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664751.621219]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664751.627753]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664751.633943]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664751.639957]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664751.646077]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664751.652969]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664751.659132]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664751.666234]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664751.672804]  [<ffffffffc11966b2>] ? ldlm_res_hop_get_locked+0x12/0x20 [ptlrpc]
[7664751.680217]  [<ffffffffc0a13297>] ? cfs_hash_bd_lookup_intent+0xf7/0x170 [libcfs]
[7664751.687913]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664751.695266]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664751.702015]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664751.709364]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664751.716886]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664751.723802]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664751.730895]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664751.738653]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664751.745914]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664751.753787]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664751.760787]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664751.768740]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664751.776000]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664751.782477]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664751.790047]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664751.795111]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664751.801390]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664751.808003]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664751.814280] Mem-Info:
[7664751.816744] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:35933 inactive_file:37401 isolated_file:1024
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824059 slab_unreclaimable:62296678
 mapped:1587 shmem:0 pagetables:450 bounce:0
 free:590142 free_pcp:0 free_cma:0
[7664751.850937] Node 2 Normal free:525524kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31240kB inactive_file:40748kB unevictable:8680kB isolated(anon):0kB isolated(file):128kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715244kB slab_unreclaimable:62476092kB kernel_stack:7760kB pagetables:36kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:122309 all_unreclaimable? yes
[7664751.897803] lowmem_reserve[]: 0 0 0 0
[7664751.901781] Node 2 Normal: 27507*4kB (UEM) 40223*8kB (UEM) 899*16kB (UEM) 1667*32kB (UEM) 406*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525524kB
[7664751.917188] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664751.926065] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664751.934680] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664751.943550] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664751.952158] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664751.961031] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664751.969636] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664751.978506] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664751.987124] 75585 total pagecache pages
[7664751.991144] 0 pages in swap cache
[7664751.994641] Swap cache stats: add 21120796, delete 21136768, find 4513461/7609996
[7664752.002294] Free swap  = 3120080kB
[7664752.005876] Total swap = 4194300kB
[7664752.009463] 66993253 pages RAM
[7664752.012702] 0 pages HighMem/MovableOnly
[7664752.016715] 1101945 pages reserved
[7664752.020295] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664752.028341] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664752.037133] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664752.045723] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664752.053906] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664752.062531] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664752.070627] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664752.079586] [53113]     0 53113    48774      114      37      130             0 gssproxy
[7664752.087946] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664752.095956] [54035]     0 54035    27526      164      10       33             0 agetty
[7664752.104137] [54036]     0 54036    27526      158      11       33             0 agetty
[7664752.112429] [36317]     0 36317    28294      187      14       61             0 bash
[7664752.120440] [36329]     0 36329    28177      160      14       55             0 grep
[7664752.128576] Out of memory: Kill process 53113 (gssproxy) score 0 or sacrifice child
[7664752.136407] Killed process 53113 (gssproxy) total-vm:195096kB, anon-rss:0kB, file-rss:456kB, shmem-rss:0kB
[7664752.792255] LustreError: 80392:0:(events.c:305:request_in_callback()) event type 2, status -103, service ost_io
[7664752.802589] LustreError: 124244:0:(pack_generic.c:605:__lustre_unpack_msg()) message length 0 too small for magic/version check
[7664752.802592] LustreError: 80392:0:(events.c:305:request_in_callback()) event type 2, status -103, service ost_io
[7664752.802605] LustreError: 80392:0:(events.c:305:request_in_callback()) event type 2, status -103, service ost_io
[7664752.802613] LustreError: 80392:0:(events.c:305:request_in_callback()) event type 2, status -103, service ost_io
[7664752.802622] LustreError: 80392:0:(events.c:305:request_in_callback()) event type 2, status -103, service ost_io
[7664752.802630] LustreError: 80392:0:(events.c:305:request_in_callback()) event type 2, status -103, service ost_io
[7664752.802634] LustreError: 119556:0:(sec.c:2191:sptlrpc_svc_unwrap_request()) error unpacking request from 12345-10.49.25.17@o2ib1 x1659540551090944
[7664752.802637] LustreError: 119556:0:(sec.c:2191:sptlrpc_svc_unwrap_request()) Skipped 3 previous similar messages
[7664752.889056] LustreError: 124244:0:(pack_generic.c:605:__lustre_unpack_msg()) Skipped 8 previous similar messages
[7664753.157589] LustreError: 90707:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c4f9aed0c00
[7664753.685370] LustreError: 82913:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff9c211883fe00
[7664756.109065] LustreError: 90696:0:(ldlm_lib.c:3262:target_bulk_io()) @@@ network error on bulk WRITE  req@ffff9c207b493850 x1659596557272128/t0(0) o4->37a8be97-d1cb-4@10.50.3.70@o2ib2:549/0 lens 504/448 e 3 to 0 dl 1583650799 ref 1 fl Interpret:/0/0 rc 0/0
[7664756.131829] LustreError: 90696:0:(ldlm_lib.c:3262:target_bulk_io()) Skipped 9 previous similar messages
[7664756.141474] Lustre: fir-OST001b: Bulk IO write error with 37a8be97-d1cb-4 (at 10.50.3.70@o2ib2), client will retry: rc = -110
[7664756.152943] Lustre: Skipped 3 previous similar messages
[7664764.070016] ll_ost_io00_045 invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
[7664764.078203] ll_ost_io00_045 cpuset=/ mems_allowed=0
[7664764.083262] CPU: 36 PID: 3011 Comm: ll_ost_io00_045 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664764.096555] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664764.104389] Call Trace:
[7664764.107023]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664764.112339]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664764.117835]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664764.123673]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664764.129419]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664764.135425]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664764.141776]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664764.147869]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664764.153617]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664764.160145]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664764.166678]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664764.172929]  [<ffffffffc124293f>] tgt_checksum_niobuf_rw+0xbf/0xe00 [ptlrpc]
[7664764.180182]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664764.187518]  [<ffffffffc0cb71e0>] ? obd_dif_crc_fn+0x20/0x20 [obdclass]
[7664764.194354]  [<ffffffffc1247325>] tgt_brw_read+0xc35/0x1e50 [ptlrpc]
[7664764.200888]  [<ffffffffa021bdce>] ? ___slab_alloc+0x24e/0x4f0
[7664764.206852]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664764.213487]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664764.220839]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664764.228184]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664764.235701]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664764.242623]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664764.249708]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664764.257459]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664764.264719]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664764.272587]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664764.279557]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664764.284996]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664764.291470]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664764.299037]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664764.304098]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664764.310366]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664764.316983]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664764.323250] Mem-Info:
[7664764.325716] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34531 inactive_file:37066 isolated_file:2912
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824039 slab_unreclaimable:62296797
 mapped:1587 shmem:0 pagetables:413 bounce:0
 free:590118 free_pcp:0 free_cma:0
[7664764.359898] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664764.401654] lowmem_reserve[]: 0 1418 63868 63868
[7664764.406582] Node 0 DMA32 free:261320kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1076kB inactive_file:3376kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686216kB kernel_stack:352kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:196905 all_unreclaimable? yes
[7664764.451453] lowmem_reserve[]: 0 0 62450 62450
[7664764.456123] Node 0 Normal free:507756kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:47016kB inactive_file:48152kB unevictable:168kB isolated(anon):0kB isolated(file):2816kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610948kB slab_unreclaimable:60243912kB kernel_stack:5792kB pagetables:232kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2655704 all_unreclaimable? yes
[7664764.502991] lowmem_reserve[]: 0 0 0 0
[7664764.506966] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664764.521805] Node 0 DMA32: 332*4kB (EM) 401*8kB (EM) 1195*16kB (UEM) 3691*32kB (UEM) 1492*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261320kB
[7664764.538126] Node 0 Normal: 6185*4kB (UEM) 5719*8kB (UEM) 3942*16kB (UEM) 4481*32kB (UEM) 2037*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 507420kB
[7664764.554966] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664764.563843] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664764.572455] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664764.581322] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664764.589924] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664764.598795] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664764.607409] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664764.616276] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664764.624888] 76065 total pagecache pages
[7664764.628901] 0 pages in swap cache
[7664764.632393] Swap cache stats: add 21120844, delete 21136816, find 4513466/7610007
[7664764.640046] Free swap  = 3120592kB
[7664764.643626] Total swap = 4194300kB
[7664764.647206] 66993253 pages RAM
[7664764.650437] 0 pages HighMem/MovableOnly
[7664764.654450] 1101945 pages reserved
[7664764.658028] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664764.666076] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664764.674869] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664764.683460] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664764.691633] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664764.700239] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664764.708324] [53108]     0 53108    38960      161      19       86             0 dsm_sa_eventmgr
[7664764.717282] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664764.725286] [54035]     0 54035    27526      164      10       33             0 agetty
[7664764.733460] [54036]     0 54036    27526      158      11       33             0 agetty
[7664764.741755] [36317]     0 36317    28294      187      14       61             0 bash
[7664764.749761] [36329]     0 36329    28177      160      14       55             0 grep
[7664764.757875] Out of memory: Kill process 53108 (dsm_sa_eventmgr) score 0 or sacrifice child
[7664764.766312] Killed process 53108 (dsm_sa_eventmgr) total-vm:155840kB, anon-rss:0kB, file-rss:644kB, shmem-rss:0kB
[7664764.779313] ll_ost_io02_028 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664764.787760] ll_ost_io02_028 cpuset=/ mems_allowed=2
[7664764.792829] CPU: 10 PID: 123082 Comm: ll_ost_io02_028 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664764.806303] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664764.814148] Call Trace:
[7664764.816799]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664764.822132]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664764.827655]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664764.833495]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664764.839256]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664764.845276]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664764.851634]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664764.857738]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664764.863493]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664764.870024]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664764.876556]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664764.882744]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664764.888758]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664764.894866]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664764.901756]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664764.908988]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664764.915167]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664764.921830]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664764.928884]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664764.934634]  [<ffffffffa006213e>] ? physflat_send_IPI_mask+0xe/0x10
[7664764.941083]  [<ffffffffa0056f42>] ? native_smp_send_reschedule+0x52/0x70
[7664764.947965]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664764.953507]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664764.960602]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664764.968354]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664764.975615]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664764.983490]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664764.990501]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664764.998461]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664765.005725]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664765.012206]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664765.019779]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664765.024838]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664765.031113]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664765.037734]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664765.044006] Mem-Info:
[7664765.046466] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:35472 inactive_file:37843 isolated_file:1600
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824039 slab_unreclaimable:62296797
 mapped:1587 shmem:0 pagetables:413 bounce:0
 free:590050 free_pcp:0 free_cma:0
[7664765.080656] Node 2 Normal free:524752kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32192kB inactive_file:40352kB unevictable:8680kB isolated(anon):0kB isolated(file):128kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715224kB slab_unreclaimable:62476440kB kernel_stack:7760kB pagetables:36kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:406321 all_unreclaimable? yes
[7664765.127519] lowmem_reserve[]: 0 0 0 0
[7664765.131486] Node 2 Normal: 27419*4kB (UEM) 40157*8kB (UEM) 917*16kB (UEM) 1663*32kB (UEM) 404*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524676kB
[7664765.146976] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664765.155843] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664765.164457] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664765.173322] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664765.181928] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664765.190796] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664765.199411] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664765.208277] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664765.216890] 76063 total pagecache pages
[7664765.220903] 0 pages in swap cache
[7664765.224397] Swap cache stats: add 21120847, delete 21136819, find 4513467/7610009
[7664765.232050] Free swap  = 3120592kB
[7664765.235635] Total swap = 4194300kB
[7664765.239217] 66993253 pages RAM
[7664765.242446] 0 pages HighMem/MovableOnly
[7664765.246461] 1101945 pages reserved
[7664765.250047] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664765.258095] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664765.266893] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664765.275489] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664765.283677] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664765.292288] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664765.300382] [53133]     0 53108    38960      162      19       86             0 dsm_sa_eventmgr
[7664765.309346] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664765.317352] [54035]     0 54035    27526      164      10       33             0 agetty
[7664765.325535] [54036]     0 54036    27526      158      11       33             0 agetty
[7664765.333828] [36317]     0 36317    28294      187      14       61             0 bash
[7664765.341838] [36329]     0 36329    28177      160      14       55             0 grep
[7664765.349963] Out of memory: Kill process 53133 (dsm_sa_eventmgr) score 0 or sacrifice child
[7664765.358402] Killed process 53133 (dsm_sa_eventmgr) total-vm:155840kB, anon-rss:0kB, file-rss:648kB, shmem-rss:0kB
[7664765.762291] dsm_sa_eventmgr: page allocation failure: order:0, mode:0x200da
[7664765.769434] CPU: 19 PID: 53133 Comm: dsm_sa_eventmgr Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664765.782814] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664765.790650] Call Trace:
[7664765.793290]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664765.798609]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664765.804706]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664765.810281]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664765.816294]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664765.822821]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664765.829348]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664765.835188]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664765.841799]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664765.848066]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664765.853988]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664765.860002]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664765.865932]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664765.871859]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664765.877439]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664765.882759] Mem-Info:
[7664765.885242] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:35195 inactive_file:37889 isolated_file:2560
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824039 slab_unreclaimable:62296798
 mapped:1587 shmem:0 pagetables:413 bounce:0
 free:590065 free_pcp:0 free_cma:0
[7664765.919426] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664765.961180] lowmem_reserve[]: 0 1418 63868 63868
[7664765.966109] Node 0 DMA32 free:261328kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1076kB inactive_file:4024kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686216kB kernel_stack:352kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:402182 all_unreclaimable? yes
[7664766.010986] lowmem_reserve[]: 0 0 62450 62450
[7664766.015657] Node 0 Normal free:507532kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:45276kB inactive_file:46256kB unevictable:168kB isolated(anon):0kB isolated(file):8320kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610948kB slab_unreclaimable:60243916kB kernel_stack:6064kB pagetables:232kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:900271 all_unreclaimable? yes
[7664766.062440] lowmem_reserve[]: 0 0 0 0
[7664766.066417] Node 1 Normal free:525484kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:17160kB inactive_file:16928kB unevictable:26488kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711284kB slab_unreclaimable:63411348kB kernel_stack:20816kB pagetables:980kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:65347 all_unreclaimable? yes
[7664766.113109] lowmem_reserve[]: 0 0 0 0
[7664766.117087] Node 2 Normal free:524752kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32192kB inactive_file:40608kB unevictable:8680kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715224kB slab_unreclaimable:62476440kB kernel_stack:7760kB pagetables:36kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:585961 all_unreclaimable? yes
[7664766.163784] lowmem_reserve[]: 0 0 0 0
[7664766.167756] Node 3 Normal free:525268kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:43188kB inactive_file:43112kB unevictable:840kB isolated(anon):0kB isolated(file):1152kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854212kB slab_unreclaimable:62369272kB kernel_stack:4208kB pagetables:404kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1306682 all_unreclaimable? yes
[7664766.214618] lowmem_reserve[]: 0 0 0 0
[7664766.218584] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664766.233423] Node 0 DMA32: 332*4kB (EM) 402*8kB (UEM) 1195*16kB (UEM) 3691*32kB (UEM) 1492*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261328kB
[7664766.249831] Node 0 Normal: 6315*4kB (UEM) 5724*8kB (UEM) 3960*16kB (UEM) 4481*32kB (UEM) 2037*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508268kB
[7664766.266670] Node 1 Normal: 88171*4kB (UEM) 21586*8kB (UM) 7*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525484kB
[7664766.280199] Node 2 Normal: 27419*4kB (UEM) 40157*8kB (UEM) 917*16kB (UEM) 1663*32kB (UEM) 404*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524676kB
[7664766.295688] Node 3 Normal: 131335*4kB (UEM) 1*8kB (M) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525348kB
[7664766.308497] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664766.317362] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664766.325968] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664766.334834] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664766.343440] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664766.352306] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664766.360912] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664766.369780] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664766.378387] 75973 total pagecache pages
[7664766.382400] 0 pages in swap cache
[7664766.385899] Swap cache stats: add 21120847, delete 21136819, find 4513467/7610009
[7664766.393551] Free swap  = 3120592kB
[7664766.397132] Total swap = 4194300kB
[7664766.400713] 66993253 pages RAM
[7664766.403952] 0 pages HighMem/MovableOnly
[7664766.407965] 1101945 pages reserved
[7664767.827054] ll_ost_io02_101 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664767.835499] ll_ost_io02_101 cpuset=/ mems_allowed=2
[7664767.840564] CPU: 10 PID: 8716 Comm: ll_ost_io02_101 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664767.853856] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664767.861688] Call Trace:
[7664767.864327]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664767.869647]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664767.875141]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664767.880987]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664767.886737]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664767.892749]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664767.899101]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664767.905194]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664767.910943]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664767.917477]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664767.924013]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664767.930198]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664767.936210]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664767.942323]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664767.949210]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664767.955374]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664767.962474]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664767.969046]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664767.975696]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664767.983042]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664767.989784]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664767.997131]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664768.004649]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664768.011569]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664768.018658]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664768.026409]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664768.033666]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664768.041536]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664768.048501]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664768.053949]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664768.060426]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664768.068001]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664768.073060]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664768.079333]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664768.085947]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664768.092213] Mem-Info:
[7664768.094673] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:35294 inactive_file:36371 isolated_file:2432
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824039 slab_unreclaimable:62296798
 mapped:1587 shmem:0 pagetables:394 bounce:0
 free:590259 free_pcp:0 free_cma:0
[7664768.128854] Node 2 Normal free:524756kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32192kB inactive_file:40480kB unevictable:8680kB isolated(anon):0kB isolated(file):0kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715224kB slab_unreclaimable:62476440kB kernel_stack:7760kB pagetables:32kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:243189 all_unreclaimable? yes
[7664768.175547] lowmem_reserve[]: 0 0 0 0
[7664768.179521] Node 2 Normal: 27420*4kB (UEM) 40157*8kB (UEM) 917*16kB (UEM) 1663*32kB (UEM) 404*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524680kB
[7664768.195009] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664768.203878] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664768.212483] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664768.221349] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664768.229955] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664768.238821] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664768.247427] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664768.256296] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664768.264907] 75967 total pagecache pages
[7664768.268923] 0 pages in swap cache
[7664768.272413] Swap cache stats: add 21120849, delete 21136821, find 4513468/7610011
[7664768.280066] Free swap  = 3120848kB
[7664768.283644] Total swap = 4194300kB
[7664768.287226] 66993253 pages RAM
[7664768.290456] 0 pages HighMem/MovableOnly
[7664768.294469] 1101945 pages reserved
[7664768.298050] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664768.306096] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664768.314891] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664768.323480] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664768.331657] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664768.340272] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664768.348370] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664768.356373] [54035]     0 54035    27526      164      10       33             0 agetty
[7664768.364551] [54036]     0 54036    27526      158      11       33             0 agetty
[7664768.372838] [36317]     0 36317    28294      187      14       61             0 bash
[7664768.380845] [36329]     0 36329    28177      160      14       55             0 grep
[7664768.388965] Out of memory: Kill process 36317 (bash) score 0 or sacrifice child
[7664768.396444] Killed process 36329 (grep) total-vm:112708kB, anon-rss:0kB, file-rss:640kB, shmem-rss:0kB
[7664768.525361] grep: page allocation failure: order:0, mode:0x200da
[7664768.531547] CPU: 26 PID: 36329 Comm: grep Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664768.543969] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664768.551810] Call Trace:
[7664768.554446]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664768.559763]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664768.565876]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664768.571467]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664768.577489]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664768.584033]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664768.590577]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664768.596444]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664768.603080]  [<ffffffffa076aaba>] ? __schedule+0x42a/0x860
[7664768.608746]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664768.615037]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664768.620982]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664768.626992]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664768.632962]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664768.638882]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664768.644483]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664768.649799] Mem-Info:
[7664768.652272] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:36070 inactive_file:35370 isolated_file:4094
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824039 slab_unreclaimable:62296798
 mapped:1587 shmem:0 pagetables:394 bounce:0
 free:590297 free_pcp:0 free_cma:0
[7664768.686476] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664768.728274] lowmem_reserve[]: 0 1418 63868 63868
[7664768.733221] Node 0 DMA32 free:261328kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1076kB inactive_file:4060kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686216kB kernel_stack:352kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:403302 all_unreclaimable? yes
[7664768.778120] lowmem_reserve[]: 0 0 62450 62450
[7664768.782810] Node 0 Normal free:508380kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:50616kB inactive_file:48292kB unevictable:168kB isolated(anon):0kB isolated(file):4088kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610948kB slab_unreclaimable:60243916kB kernel_stack:5760kB pagetables:200kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:688965 all_unreclaimable? yes
[7664768.829652] lowmem_reserve[]: 0 0 0 0
[7664768.833659] Node 1 Normal free:525496kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:18924kB inactive_file:13408kB unevictable:26488kB isolated(anon):0kB isolated(file):3328kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711284kB slab_unreclaimable:63411348kB kernel_stack:20816kB pagetables:976kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1762246 all_unreclaimable? yes
[7664768.880870] lowmem_reserve[]: 0 0 0 0
[7664768.884852] Node 2 Normal free:524932kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:32016kB inactive_file:36748kB unevictable:8680kB isolated(anon):0kB isolated(file):4096kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715224kB slab_unreclaimable:62476440kB kernel_stack:7760kB pagetables:32kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:795135 all_unreclaimable? yes
[7664768.931833] lowmem_reserve[]: 0 0 0 0
[7664768.935840] Node 3 Normal free:525324kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:44212kB inactive_file:42856kB unevictable:840kB isolated(anon):0kB isolated(file):768kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854212kB slab_unreclaimable:62369272kB kernel_stack:4208kB pagetables:368kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2461940 all_unreclaimable? yes
[7664768.982707] lowmem_reserve[]: 0 0 0 0
[7664768.986754] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664769.001732] Node 0 DMA32: 332*4kB (EM) 402*8kB (UEM) 1195*16kB (UEM) 3691*32kB (UEM) 1492*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261328kB
[7664769.018209] Node 0 Normal: 6323*4kB (UEM) 5724*8kB (UEM) 3950*16kB (UEM) 4481*32kB (UEM) 2037*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508140kB
[7664769.035126] Node 1 Normal: 88172*4kB (UEM) 21587*8kB (UM) 7*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525496kB
[7664769.048862] Node 2 Normal: 27473*4kB (UEM) 40182*8kB (UEM) 919*16kB (UEM) 1663*32kB (UEM) 404*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525124kB
[7664769.064605] Node 3 Normal: 131349*4kB (UEM) 1*8kB (M) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525404kB
[7664769.077644] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664769.086555] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664769.095198] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664769.104131] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664769.112767] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664769.121658] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664769.130299] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664769.139252] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664769.147884] 75889 total pagecache pages
[7664769.151927] 0 pages in swap cache
[7664769.155452] Swap cache stats: add 21120849, delete 21136821, find 4513468/7610011
[7664769.163138] Free swap  = 3120848kB
[7664769.166748] Total swap = 4194300kB
[7664769.170341] 66993253 pages RAM
[7664769.173623] 0 pages HighMem/MovableOnly
[7664769.177655] 1101945 pages reserved
[7664769.509309] ll_ost_io03_086 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664769.517745] ll_ost_io03_086 cpuset=/ mems_allowed=3
[7664769.522809] CPU: 7 PID: 8677 Comm: ll_ost_io03_086 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664769.536014] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664769.543846] Call Trace:
[7664769.546478]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664769.551796]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664769.557283]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664769.563125]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664769.568877]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664769.573189] LustreError: 101203:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 99s: evicting client at 10.50.10.29@o2ib2  ns: filter-fir-OST001f_UUID lock: ffff9c203123e0c0/0xb0d9932fd7de3c6a lrc: 3/0,0 mode: PR/PR res: [0x480000401:0x6e827a2:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->134217727) flags: 0x60000400000020 nid: 10.50.10.29@o2ib2 remote: 0xafe739f8d57c3f80 expref: 63 pid: 124126 timeout: 7664678 lvb_type: 1
[7664769.616343]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664769.622699]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664769.628792]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664769.634540]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664769.641072]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664769.647601]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664769.653786]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664769.659791]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664769.665900]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664769.672783]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664769.680009]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664769.686173]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664769.692827]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664769.699911]  [<ffffffffc11dcbe7>] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
[7664769.706534]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664769.712289]  [<ffffffffa00dca58>] ? __enqueue_entity+0x78/0x80
[7664769.718301]  [<ffffffffa00e367f>] ? enqueue_entity+0x2ef/0xbe0
[7664769.724309]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664769.729846]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664769.736933]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664769.744684]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664769.751942]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664769.759808]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664769.766813]  [<ffffffffc11e499e>] ? ptlrpc_server_post_idle_rqbds+0x7e/0xf0 [ptlrpc]
[7664769.774765]  [<ffffffffc11e6e10>] ? ptlrpc_grow_req_bufs+0x50/0x2a0 [ptlrpc]
[7664769.782024]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664769.788504]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664769.796077]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664769.801131]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664769.807406]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664769.814025]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664769.820292] Mem-Info:
[7664769.822750] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:36231 inactive_file:36744 isolated_file:1824
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824039 slab_unreclaimable:62296796
 mapped:1587 shmem:0 pagetables:380 bounce:0
 free:590402 free_pcp:19 free_cma:0
[7664769.857018] Node 3 Normal free:525404kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:44788kB inactive_file:42760kB unevictable:840kB isolated(anon):0kB isolated(file):256kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854212kB slab_unreclaimable:62369272kB kernel_stack:4208kB pagetables:312kB unstable:0kB bounce:0kB free_pcp:76kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2461940 all_unreclaimable? yes
[7664769.903883] lowmem_reserve[]: 0 0 0 0
[7664769.907848] Node 3 Normal: 131349*4kB (UEM) 1*8kB (M) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525404kB
[7664769.920660] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664769.929529] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664769.938142] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664769.947009] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664769.955615] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664769.964481] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664769.973087] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664769.981954] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664769.990566] 75878 total pagecache pages
[7664769.994582] 0 pages in swap cache
[7664769.998074] Swap cache stats: add 21120851, delete 21136823, find 4513469/7610013
[7664770.005729] Free swap  = 3121104kB
[7664770.009312] Total swap = 4194300kB
[7664770.012893] 66993253 pages RAM
[7664770.016123] 0 pages HighMem/MovableOnly
[7664770.020137] 1101945 pages reserved
[7664770.023719] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664770.031768] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664770.040557] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664770.049145] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664770.057323] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664770.065937] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664770.074037] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664770.082041] [54035]     0 54035    27526      164      10       33             0 agetty
[7664770.090219] [54036]     0 54036    27526      158      11       33             0 agetty
[7664770.098517] [36317]     0 36317    28294      187      14       61             0 bash
[7664770.106648] Out of memory: Kill process 36317 (bash) score 0 or sacrifice child
[7664770.114134] Killed process 36317 (bash) total-vm:113176kB, anon-rss:0kB, file-rss:748kB, shmem-rss:0kB
[7664770.212170] bash: page allocation failure: order:0, mode:0x200da
[7664770.218360] CPU: 38 PID: 36317 Comm: bash Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664770.230785] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664770.238610] Call Trace:
[7664770.241245]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664770.246562]  [<ffffffffa01bdec0>] warn_alloc_failed+0x110/0x180
[7664770.252663]  [<ffffffffa01c0be0>] ? drain_pages+0xb0/0xb0
[7664770.258246]  [<ffffffffa00c3f50>] ? wake_up_atomic_t+0x30/0x30
[7664770.264259]  [<ffffffffa076074e>] __alloc_pages_slowpath+0x6b6/0x724
[7664770.270794]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664770.277328]  [<ffffffffa02128c5>] alloc_pages_vma+0xb5/0x200
[7664770.283171]  [<ffffffffa0200b15>] __read_swap_cache_async+0x115/0x190
[7664770.289791]  [<ffffffffa0200bb6>] read_swap_cache_async+0x26/0x60
[7664770.296065]  [<ffffffffa0200c9c>] swapin_readahead+0xac/0x110
[7664770.301994]  [<ffffffffa01ead92>] handle_pte_fault+0x812/0xd10
[7664770.308007]  [<ffffffffa01ed3ad>] handle_mm_fault+0x39d/0x9b0
[7664770.313936]  [<ffffffffa0772603>] __do_page_fault+0x203/0x4f0
[7664770.319863]  [<ffffffffa0772925>] do_page_fault+0x35/0x90
[7664770.325445]  [<ffffffffa076e768>] page_fault+0x28/0x30
[7664770.330766]  [<ffffffffa0388990>] ? __put_user_4+0x20/0x30
[7664770.336433]  [<ffffffffa009e2e1>] ? wait_consider_task+0x8a1/0xb30
[7664770.342784]  [<ffffffffa009e670>] do_wait+0x100/0x260
[7664770.348013]  [<ffffffffa009f960>] SyS_wait4+0x80/0x110
[7664770.353333]  [<ffffffffa009d3c0>] ? task_stopped_code+0x60/0x60
[7664770.359428]  [<ffffffffa0777ddb>] system_call_fastpath+0x22/0x27
[7664770.365612] Mem-Info:
[7664770.368086] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34634 inactive_file:38360 isolated_file:1824
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824039 slab_unreclaimable:62296796
 mapped:1587 shmem:0 pagetables:380 bounce:0
 free:590205 free_pcp:0 free_cma:0
[7664770.402275] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664770.444025] lowmem_reserve[]: 0 1418 63868 63868
[7664770.448954] Node 0 DMA32 free:261328kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1084kB inactive_file:4068kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686216kB kernel_stack:352kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:407174 all_unreclaimable? yes
[7664770.493823] lowmem_reserve[]: 0 0 62450 62450
[7664770.498491] Node 0 Normal free:508128kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:45384kB inactive_file:45216kB unevictable:168kB isolated(anon):0kB isolated(file):4480kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610948kB slab_unreclaimable:60243916kB kernel_stack:5952kB pagetables:200kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:731572 all_unreclaimable? yes
[7664770.545271] lowmem_reserve[]: 0 0 0 0
[7664770.549247] Node 1 Normal free:525504kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:17036kB inactive_file:16244kB unevictable:26488kB isolated(anon):0kB isolated(file):128kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711284kB slab_unreclaimable:63411340kB kernel_stack:20816kB pagetables:976kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:597742 all_unreclaimable? yes
[7664770.596185] lowmem_reserve[]: 0 0 0 0
[7664770.600154] Node 2 Normal free:525096kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31380kB inactive_file:40576kB unevictable:8680kB isolated(anon):0kB isolated(file):384kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715224kB slab_unreclaimable:62476440kB kernel_stack:7760kB pagetables:32kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:462422 all_unreclaimable? yes
[7664770.647018] lowmem_reserve[]: 0 0 0 0
[7664770.650994] Node 3 Normal free:524860kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:41244kB inactive_file:40164kB unevictable:840kB isolated(anon):0kB isolated(file):5120kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854212kB slab_unreclaimable:62369272kB kernel_stack:4208kB pagetables:312kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1132211 all_unreclaimable? yes
[7664770.697858] lowmem_reserve[]: 0 0 0 0
[7664770.701830] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664770.716670] Node 0 DMA32: 332*4kB (EM) 402*8kB (UEM) 1195*16kB (UEM) 3691*32kB (UEM) 1492*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261328kB
[7664770.733078] Node 0 Normal: 6324*4kB (UEM) 5724*8kB (UEM) 3950*16kB (UEM) 4481*32kB (UEM) 2037*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508144kB
[7664770.749914] Node 1 Normal: 88172*4kB (UEM) 21588*8kB (UM) 7*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525504kB
[7664770.763444] Node 2 Normal: 27473*4kB (UEM) 40182*8kB (UEM) 919*16kB (UEM) 1663*32kB (UEM) 404*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525124kB
[7664770.778931] Node 3 Normal: 131229*4kB (UEM) 2*8kB (U) 1*16kB (U) 1*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 524980kB
[7664770.792488] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664770.801364] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664770.809977] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664770.818844] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664770.827450] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664770.836324] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664770.844931] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664770.853805] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664770.862411] 76003 total pagecache pages
[7664770.866427] 0 pages in swap cache
[7664770.869927] Swap cache stats: add 21120851, delete 21136823, find 4513469/7610013
[7664770.877586] Free swap  = 3121104kB
[7664770.881167] Total swap = 4194300kB
[7664770.884757] 66993253 pages RAM
[7664770.887996] 0 pages HighMem/MovableOnly
[7664770.892008] 1101945 pages reserved
[7664770.896405] ll_ost_io01_087 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664770.904846] ll_ost_io01_087 cpuset=/ mems_allowed=1
[7664770.909909] CPU: 45 PID: 83036 Comm: ll_ost_io01_087 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664770.923288] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664770.931121] Call Trace:
[7664770.933754]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664770.939069]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664770.944556]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664770.950400]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664770.956413]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664770.962772]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664770.968864]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664770.974610]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664770.981138]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664770.987665]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664770.993851]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664770.999856]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664771.005963]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664771.012851]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664771.019016]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664771.026104]  [<ffffffffc11e3d25>] ? request_in_callback+0x485/0x920 [ptlrpc]
[7664771.033373]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664771.039916]  [<ffffffffc0b11143>] ? kiblnd_post_rx+0x163/0x520 [ko2iblnd]
[7664771.046909]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664771.054258]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664771.060995]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664771.068342]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664771.075855]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664771.082769]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664771.089859]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664771.097609]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664771.104864]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664771.112693]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664771.118136]  [<ffffffffc11ebb7f>] ? ptlrpc_server_handle_req_in+0x8df/0xd60 [ptlrpc]
[7664771.126090]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664771.132576]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664771.140149]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664771.145211]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664771.151485]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664771.158104]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664771.164369] Mem-Info:
[7664771.166828] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:35112 inactive_file:38314 isolated_file:1760
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824039 slab_unreclaimable:62296796
 mapped:1587 shmem:0 pagetables:366 bounce:0
 free:590218 free_pcp:0 free_cma:0
[7664771.201008] Node 1 Normal free:525504kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:17452kB inactive_file:16016kB unevictable:26488kB isolated(anon):0kB isolated(file):128kB present:67108352kB managed:66054620kB mlocked:26488kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:711284kB slab_unreclaimable:63411340kB kernel_stack:20816kB pagetables:976kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:597742 all_unreclaimable? yes
[7664771.247961] lowmem_reserve[]: 0 0 0 0
[7664771.251928] Node 1 Normal: 88172*4kB (UEM) 21588*8kB (UM) 7*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525504kB
[7664771.265459] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664771.274325] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664771.282931] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664771.291795] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664771.300402] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664771.309268] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664771.317874] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664771.326741] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664771.335347] 76002 total pagecache pages
[7664771.339360] 0 pages in swap cache
[7664771.342851] Swap cache stats: add 21120852, delete 21136824, find 4513469/7610013
[7664771.350504] Free swap  = 3121360kB
[7664771.354083] Total swap = 4194300kB
[7664771.357664] 66993253 pages RAM
[7664771.360896] 0 pages HighMem/MovableOnly
[7664771.364908] 1101945 pages reserved
[7664771.368487] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664771.376532] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664771.385316] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664771.393889] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664771.402065] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664771.410672] [53101]     0 53101     1910       64       9      172             0 mdadm
[7664771.418769] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664771.426775] [54035]     0 54035    27526      164      10       33             0 agetty
[7664771.434948] [54036]     0 54036    27526      158      11       33             0 agetty
[7664771.443326] Out of memory: Kill process 53101 (mdadm) score 0 or sacrifice child
[7664771.450893] Killed process 53101 (mdadm) total-vm:7640kB, anon-rss:0kB, file-rss:256kB, shmem-rss:0kB
[7664771.706326] ll_ost_io02_088 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664771.714766] ll_ost_io02_088 cpuset=/ mems_allowed=2
[7664771.719827] CPU: 10 PID: 8667 Comm: ll_ost_io02_088 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664771.733118] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664771.740944] Call Trace:
[7664771.743577]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664771.748892]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664771.754380]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664771.760221]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664771.765967]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664771.771981]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664771.777726]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664771.784252]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664771.790782]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664771.796969]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664771.802980]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664771.809089]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664771.815971]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664771.822132]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664771.829228]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664771.835795]  [<ffffffffc11d5b56>] ? ptl_send_buf+0x146/0x530 [ptlrpc]
[7664771.842456]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664771.849802]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664771.856555]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664771.863900]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664771.871414]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664771.878326]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664771.885422]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664771.893179]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664771.900440]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664771.908300]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664771.915261]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664771.920693]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664771.927165]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664771.934735]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664771.939794]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664771.946062]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664771.952672]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664771.958938] Mem-Info:
[7664771.961398] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:35827 inactive_file:36827 isolated_file:2208
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824039 slab_unreclaimable:62296796
 mapped:1587 shmem:0 pagetables:366 bounce:0
 free:590178 free_pcp:0 free_cma:0
[7664771.995578] Node 2 Normal free:524852kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:31428kB inactive_file:36036kB unevictable:8680kB isolated(anon):0kB isolated(file):4352kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715224kB slab_unreclaimable:62476440kB kernel_stack:7760kB pagetables:32kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:532083 all_unreclaimable? no
[7664772.042463] lowmem_reserve[]: 0 0 0 0
[7664772.046455] Node 2 Normal: 27540*4kB (UEM) 40211*8kB (UEM) 919*16kB (UEM) 1663*32kB (UEM) 404*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525624kB
[7664772.061943] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664772.070810] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664772.079417] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664772.088282] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664772.096888] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664772.105754] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664772.114360] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664772.123226] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664772.131832] 76002 total pagecache pages
[7664772.135848] 0 pages in swap cache
[7664772.139337] Swap cache stats: add 21120860, delete 21136832, find 4513471/7610016
[7664772.146990] Free swap  = 3122128kB
[7664772.150568] Total swap = 4194300kB
[7664772.154152] 66993253 pages RAM
[7664772.157382] 0 pages HighMem/MovableOnly
[7664772.161393] 1101945 pages reserved
[7664772.164973] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664772.173022] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664772.181814] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664772.190403] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664772.198581] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664772.207202] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664772.215208] [54035]     0 54035    27526      164      10       33             0 agetty
[7664772.223382] [54036]     0 54036    27526      158      11       33             0 agetty
[7664772.231794] Out of memory: Kill process 54035 (agetty) score 0 or sacrifice child
[7664772.239448] Killed process 54035 (agetty) total-vm:110104kB, anon-rss:0kB, file-rss:656kB, shmem-rss:0kB
[7664772.405609] ll_ost_io03_091 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664772.414050] ll_ost_io03_091 cpuset=/ mems_allowed=3
[7664772.419102] CPU: 47 PID: 8714 Comm: ll_ost_io03_091 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664772.432390] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664772.440222] Call Trace:
[7664772.442856]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664772.448172]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664772.453666]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664772.459498]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664772.465505]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664772.471860]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664772.477960]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664772.483712]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664772.490240]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664772.496766]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664772.502946]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664772.508962]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664772.515079]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664772.521961]  [<ffffffffc172e1ca>] ofd_preprw+0x6fa/0x11b0 [ofd]
[7664772.528124]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664772.535228]  [<ffffffffc12470cb>] tgt_brw_read+0x9db/0x1e50 [ptlrpc]
[7664772.541790]  [<ffffffffc0c82a79>] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[7664772.549135]  [<ffffffffc1217476>] ? null_alloc_rs+0x186/0x340 [ptlrpc]
[7664772.555875]  [<ffffffffc11df335>] ? lustre_pack_reply_v2+0x135/0x290 [ptlrpc]
[7664772.563223]  [<ffffffffc11df4ff>] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[7664772.570735]  [<ffffffffc11df681>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[7664772.577650]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664772.584743]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664772.592498]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664772.599755]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664772.607623]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664772.614584]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664772.620025]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664772.626498]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664772.634065]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664772.639118]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664772.645388]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664772.652004]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664772.658269] Mem-Info:
[7664772.660729] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:34439 inactive_file:36985 isolated_file:3104
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824039 slab_unreclaimable:62296796
 mapped:1587 shmem:0 pagetables:357 bounce:0
 free:590336 free_pcp:0 free_cma:0
[7664772.694910] Node 3 Normal free:525024kB min:525460kB low:656824kB high:788188kB active_anon:0kB inactive_anon:0kB active_file:41860kB inactive_file:43652kB unevictable:840kB isolated(anon):0kB isolated(file):2816kB present:67108352kB managed:66038732kB mlocked:840kB dirty:0kB writeback:0kB mapped:848kB shmem:0kB slab_reclaimable:854212kB slab_unreclaimable:62369272kB kernel_stack:4208kB pagetables:288kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1117683 all_unreclaimable? yes
[7664772.741776] lowmem_reserve[]: 0 0 0 0
[7664772.745742] Node 3 Normal: 131256*4kB (UEM) 2*8kB (U) 1*16kB (U) 1*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525088kB
[7664772.759298] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664772.768167] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664772.776784] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664772.785654] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664772.794260] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664772.803126] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664772.811732] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664772.820599] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664772.829206] 75916 total pagecache pages
[7664772.833216] 0 pages in swap cache
[7664772.836710] Swap cache stats: add 21120860, delete 21136832, find 4513471/7610016
[7664772.844361] Free swap  = 3122384kB
[7664772.847940] Total swap = 4194300kB
[7664772.851521] 66993253 pages RAM
[7664772.854752] 0 pages HighMem/MovableOnly
[7664772.858767] 1101945 pages reserved
[7664772.862347] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664772.870388] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664772.879175] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664772.887768] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664772.895942] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664772.904562] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664772.912565] [54036]     0 54036    27526      158      11       33             0 agetty
[7664772.920971] Out of memory: Kill process 54036 (agetty) score 0 or sacrifice child
[7664772.928623] Killed process 54036 (agetty) total-vm:110104kB, anon-rss:0kB, file-rss:632kB, shmem-rss:0kB
[7664772.940071] ll_ost_io00_025 invoked oom-killer: gfp_mask=0x200d2, order=0, oom_score_adj=0
[7664772.948515] ll_ost_io00_025 cpuset=/ mems_allowed=0
[7664772.953577] CPU: 12 PID: 123043 Comm: ll_ost_io00_025 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664772.967042] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664772.974870] Call Trace:
[7664772.977512]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664772.982825]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664772.988322]  [<ffffffffa0102372>] ? ktime_get_ts64+0x52/0xf0
[7664772.994161]  [<ffffffffa01595af>] ? delayacct_end+0x8f/0xb0
[7664772.999907]  [<ffffffffa01bb904>] oom_kill_process+0x254/0x3d0
[7664773.005911]  [<ffffffffa01bb3ad>] ? oom_unkillable_task+0xcd/0x120
[7664773.012265]  [<ffffffffa01bb456>] ? find_lock_task_mm+0x56/0xc0
[7664773.018356]  [<ffffffffa01bc146>] out_of_memory+0x4b6/0x4f0
[7664773.024104]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664773.030630]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664773.037157]  [<ffffffffa020f438>] alloc_pages_current+0x98/0x110
[7664773.043335]  [<ffffffffa01b7767>] __page_cache_alloc+0x97/0xb0
[7664773.049340]  [<ffffffffa01b88e5>] find_or_create_page+0x45/0xa0
[7664773.055448]  [<ffffffffc15ac5c3>] osd_bufs_get+0x413/0x870 [osd_ldiskfs]
[7664773.062332]  [<ffffffffc172d0a6>] ofd_preprw_write.isra.31+0x476/0xea0 [ofd]
[7664773.069556]  [<ffffffffc172def2>] ofd_preprw+0x422/0x11b0 [ofd]
[7664773.075703]  [<ffffffffc12491bc>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[7664773.082349]  [<ffffffffc11dcbd0>] ? lustre_msg_buf_v2+0x1e0/0x1e0 [ptlrpc]
[7664773.089428]  [<ffffffffc11dcbe7>] ? lustre_msg_buf+0x17/0x60 [ptlrpc]
[7664773.096078]  [<ffffffffc1204163>] ? __req_capsule_get+0x163/0x740 [ptlrpc]
[7664773.103127]  [<ffffffffa021ccc1>] ? __slab_free+0x81/0x2f0
[7664773.108789]  [<ffffffffa00e143c>] ? update_curr+0x14c/0x1e0
[7664773.114542]  [<ffffffffa00ddd9e>] ? account_entity_dequeue+0xae/0xd0
[7664773.121067]  [<ffffffffa00e192c>] ? dequeue_entity+0x11c/0x5e0
[7664773.127074]  [<ffffffffa0769192>] ? mutex_lock+0x12/0x2f
[7664773.132593]  [<ffffffffc124536a>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[7664773.139672]  [<ffffffffc1220da1>] ? ptlrpc_nrs_req_get_nolock0+0xd1/0x170 [ptlrpc]
[7664773.147422]  [<ffffffffc0a07bde>] ? ktime_get_real_seconds+0xe/0x10 [libcfs]
[7664773.154673]  [<ffffffffc11ec24b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[7664773.162534]  [<ffffffffc11e7805>] ? ptlrpc_wait_event+0xa5/0x360 [ptlrpc]
[7664773.169499]  [<ffffffffa00cfeb4>] ? __wake_up+0x44/0x50
[7664773.174935]  [<ffffffffc11efbac>] ptlrpc_main+0xb2c/0x1460 [ptlrpc]
[7664773.181408]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664773.188982]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664773.194043]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664773.200309]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664773.206920]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664773.213184] Mem-Info:
[7664773.215656] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:35235 inactive_file:38111 isolated_file:1248
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824039 slab_unreclaimable:62296796
 mapped:1587 shmem:0 pagetables:357 bounce:0
 free:590400 free_pcp:0 free_cma:0
[7664773.249843] Node 0 DMA free:15904kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[7664773.291597] lowmem_reserve[]: 0 1418 63868 63868
[7664773.296518] Node 0 DMA32 free:261328kB min:11552kB low:14440kB high:17328kB active_anon:0kB inactive_anon:0kB active_file:1084kB inactive_file:4068kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1633052kB managed:1452284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:404488kB slab_unreclaimable:686216kB kernel_stack:352kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:411910 all_unreclaimable? yes
[7664773.341378] lowmem_reserve[]: 0 0 62450 62450
[7664773.346048] Node 0 Normal free:508256kB min:508832kB low:636040kB high:763248kB active_anon:0kB inactive_anon:0kB active_file:49216kB inactive_file:48688kB unevictable:168kB isolated(anon):0kB isolated(file):384kB present:64998912kB managed:63949072kB mlocked:168kB dirty:0kB writeback:0kB mapped:168kB shmem:0kB slab_reclaimable:610948kB slab_unreclaimable:60243916kB kernel_stack:5904kB pagetables:144kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1450599 all_unreclaimable? yes
[7664773.392815] lowmem_reserve[]: 0 0 0 0
[7664773.396781] Node 0 DMA: 2*4kB (U) 1*8kB (U) 1*16kB (U) 2*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
[7664773.411619] Node 0 DMA32: 332*4kB (EM) 402*8kB (UEM) 1195*16kB (UEM) 3691*32kB (UEM) 1492*64kB (UEM) 140*128kB (UEM) 24*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 261328kB
[7664773.428034] Node 0 Normal: 6330*4kB (UEM) 5727*8kB (UEM) 3952*16kB (UEM) 4482*32kB (UEM) 2037*64kB (UEM) 570*128kB (UEM) 106*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 508256kB
[7664773.444876] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664773.453742] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664773.462348] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664773.471216] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664773.479829] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664773.488694] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664773.497298] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664773.506165] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664773.514772] 75916 total pagecache pages
[7664773.518787] 0 pages in swap cache
[7664773.522279] Swap cache stats: add 21120860, delete 21136832, find 4513471/7610016
[7664773.529931] Free swap  = 3122640kB
[7664773.533510] Total swap = 4194300kB
[7664773.537089] 66993253 pages RAM
[7664773.540319] 0 pages HighMem/MovableOnly
[7664773.544335] 1101945 pages reserved
[7664773.547914] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664773.555957] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664773.564744] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664773.573334] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664773.581510] [53079]    81 53079    17590      260      36      171          -900 dbus-daemon
[7664773.590132] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664773.598374] Out of memory: Kill process 53079 (dbus-daemon) score 0 or sacrifice child
[7664773.606471] Killed process 53079 (dbus-daemon) total-vm:70360kB, anon-rss:0kB, file-rss:1040kB, shmem-rss:0kB
[7664773.887702] ll_ost_io02_054 invoked oom-killer: gfp_mask=0x82d2, order=0, oom_score_adj=0
[7664773.896074] ll_ost_io02_054 cpuset=/ mems_allowed=2
[7664773.901139] CPU: 38 PID: 6889 Comm: ll_ost_io02_054 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664773.914468] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664773.922303] Call Trace:
[7664773.924938]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664773.930252]  [<ffffffffa075fb6a>] dump_header+0x90/0x229
[7664773.935752]  [<ffffffffa01bc16c>] out_of_memory+0x4dc/0x4f0
[7664773.941506]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664773.948045]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664773.954591]  [<ffffffffa01fd95f>] __vmalloc_node_range+0x12f/0x280
[7664773.961020]  [<ffffffffc11e6a03>] ? ptlrpc_alloc_rqbd+0x213/0x5d0 [ptlrpc]
[7664773.968079]  [<ffffffffa01fdd5e>] vzalloc_node+0x4e/0x50
[7664773.973613]  [<ffffffffc11e6a03>] ? ptlrpc_alloc_rqbd+0x213/0x5d0 [ptlrpc]
[7664773.980708]  [<ffffffffc11e6a03>] ptlrpc_alloc_rqbd+0x213/0x5d0 [ptlrpc]
[7664773.987625]  [<ffffffffc11e6ea1>] ptlrpc_grow_req_bufs+0xe1/0x2a0 [ptlrpc]
[7664773.994716]  [<ffffffffc11efc85>] ptlrpc_main+0xc05/0x1460 [ptlrpc]
[7664774.001206]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664774.008783]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664774.013845]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664774.020122]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664774.026740]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664774.033013] Mem-Info:
[7664774.035474] active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:35054 inactive_file:37945 isolated_file:2336
 unevictable:9044 dirty:0 writeback:0 unstable:0
 slab_reclaimable:824039 slab_unreclaimable:62296796
 mapped:1587 shmem:0 pagetables:346 bounce:0
 free:590337 free_pcp:0 free_cma:0
[7664774.069661] Node 2 Normal free:525456kB min:525584kB low:656980kB high:788376kB active_anon:0kB inactive_anon:0kB active_file:30956kB inactive_file:40012kB unevictable:8680kB isolated(anon):0kB isolated(file):1024kB present:67108352kB managed:66054620kB mlocked:8680kB dirty:0kB writeback:0kB mapped:5332kB shmem:0kB slab_reclaimable:715224kB slab_unreclaimable:62476440kB kernel_stack:7760kB pagetables:20kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:209338 all_unreclaimable? yes
[7664774.116614] lowmem_reserve[]: 0 0 0 0
[7664774.120587] Node 2 Normal: 27543*4kB (UEM) 40211*8kB (UEM) 919*16kB (UEM) 1663*32kB (UEM) 404*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 525636kB
[7664774.136134] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664774.145020] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664774.153645] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664774.162546] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664774.171180] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664774.180086] Node 2 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664774.188699] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[7664774.197573] Node 3 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[7664774.206229] 75943 total pagecache pages
[7664774.210254] 0 pages in swap cache
[7664774.213754] Swap cache stats: add 21120866, delete 21136838, find 4513473/7610018
[7664774.221446] Free swap  = 3123152kB
[7664774.225026] Total swap = 4194300kB
[7664774.228619] 66993253 pages RAM
[7664774.231862] 0 pages HighMem/MovableOnly
[7664774.235908] 1101945 pages reserved
[7664774.239515] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[7664774.247585] [ 5717]     0  5717    11193      233      24      190         -1000 systemd-udevd
[7664774.256396] [ 6726]     0  6726  2066254     5088     166        0         -1000 multipathd
[7664774.265024] [53050]     0 53050    13880      124      28      138         -1000 auditd
[7664774.273213] [53860]     0 53860    28216      276      57      257         -1000 sshd
[7664774.281439] Kernel panic - not syncing: Out of memory and no killable processes...

[7664774.290831] CPU: 38 PID: 6889 Comm: ll_ost_io02_054 Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.27.2.el7_lustre.pl2.x86_64 #1
[7664774.304114] Hardware name: Dell Inc. PowerEdge R6415/065PKD, BIOS 1.10.6 08/15/2019
[7664774.311939] Call Trace:
[7664774.314576]  [<ffffffffa0765147>] dump_stack+0x19/0x1b
[7664774.319891]  [<ffffffffa075e850>] panic+0xe8/0x21f
[7664774.324858]  [<ffffffffa01bc17a>] out_of_memory+0x4ea/0x4f0
[7664774.330611]  [<ffffffffa076066e>] __alloc_pages_slowpath+0x5d6/0x724
[7664774.337137]  [<ffffffffa01c2524>] __alloc_pages_nodemask+0x404/0x420
[7664774.343663]  [<ffffffffa01fd95f>] __vmalloc_node_range+0x12f/0x280
[7664774.350077]  [<ffffffffc11e6a03>] ? ptlrpc_alloc_rqbd+0x213/0x5d0 [ptlrpc]
[7664774.357120]  [<ffffffffa01fdd5e>] vzalloc_node+0x4e/0x50
[7664774.362644]  [<ffffffffc11e6a03>] ? ptlrpc_alloc_rqbd+0x213/0x5d0 [ptlrpc]
[7664774.369722]  [<ffffffffc11e6a03>] ptlrpc_alloc_rqbd+0x213/0x5d0 [ptlrpc]
[7664774.376630]  [<ffffffffc11e6ea1>] ptlrpc_grow_req_bufs+0xe1/0x2a0 [ptlrpc]
[7664774.383710]  [<ffffffffc11efc85>] ptlrpc_main+0xc05/0x1460 [ptlrpc]
[7664774.390188]  [<ffffffffc11ef080>] ? ptlrpc_register_service+0xf80/0xf80 [ptlrpc]
[7664774.397752]  [<ffffffffa00c2e81>] kthread+0xd1/0xe0
[7664774.402805]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40
[7664774.409072]  [<ffffffffa0777c24>] ret_from_fork_nospec_begin+0xe/0x21
[7664774.415682]  [<ffffffffa00c2db0>] ? insert_kthread_work+0x40/0x40