[ 188.786851] sd 12:0:474:0: Attached scsi generic sg964 type 0
[ 188.794865] sd 12:0:475:0: Attached scsi generic sg965 type 0
[ 188.802836] sd 12:0:476:0: Attached scsi generic sg966 type 0
[ 188.810866] sd 12:0:477:0: Attached scsi generic sg967 type 0
[ 188.818815] sd 12:0:478:0: Attached scsi generic sg968 type 0
[ 188.826815] sd 12:0:479:0: Attached scsi generic sg969 type 0
[ 188.834254] sd 12:0:480:0: Attached scsi generic sg970 type 0
[ 188.840791] sd 12:0:481:0: Attached scsi generic sg971 type 0
[ 188.847332] sd 12:0:482:0: Attached scsi generic sg972 type 0
[ 188.853852] sd 12:0:483:0: Attached scsi generic sg973 type 0
[ 188.860400] sd 12:0:484:0: Attached scsi generic sg974 type 0
[ 188.866923] sd 12:0:485:0: Attached scsi generic sg975 type 0
[ 188.873463] sd 12:0:486:0: Attached scsi generic sg976 type 0
[ 188.879986] sd 12:0:487:0: Attached scsi generic sg977 type 0
[ 188.886509] scsi 12:0:488:0: Attached scsi generic sg978 type 13
[ 189.661864] device-mapper: multipath service-time: version 0.3.0 loaded
[ 200.392955] mei_me 0000:00:16.0: Device doesn't have valid ME Interface
[ 205.663569] ses 1:0:0:0: Attached Enclosure device
[ 205.669041] ses 1:0:61:0: Attached Enclosure device
[ 205.674578] ses 1:0:122:0: Attached Enclosure device
[ 205.680186] ses 1:0:183:0: Attached Enclosure device
[ 205.685808] ses 1:0:244:0: Attached Enclosure device
[ 205.691500] ses 1:0:305:0: Attached Enclosure device
[ 205.697150] ses 1:0:366:0: Attached Enclosure device
[ 205.702828] ses 1:0:427:0: Attached Enclosure device
[ 205.708466] ses 1:0:488:0: Attached Enclosure device
[ 205.714065] ses 12:0:0:0: Attached Enclosure device
[ 205.719594] ses 12:0:61:0: Attached Enclosure device
[ 205.725220] ses 12:0:122:0: Attached Enclosure device
[ 205.730953] ses 12:0:183:0: Attached Enclosure device
[ 205.736700] ses 12:0:244:0: Attached Enclosure device
[ 205.742444] ses 12:0:305:0: Attached Enclosure device
[ 205.748169] ses 12:0:366:0: Attached Enclosure device
[ 205.753937] ses 12:0:427:0: Attached Enclosure device
[ 205.759698] ses 12:0:488:0: Attached Enclosure device
[ 221.322859] input: PC Speaker as /devices/platform/pcspkr/input/input1
[ 221.513523] cryptd: max_cpu_qlen set to 1000
[ 221.533038] AVX2 version of gcm_enc/dec engaged.
[ 221.538201] AES CTR mode by8 optimization enabled
[ 221.546931] alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni)
[ 221.556554] alg: No test for __generic-gcm-aes-aesni (__driver-generic-gcm-aes-aesni)
[ 221.663697] intel_rapl: Found RAPL domain package
[ 221.668957] intel_rapl: Found RAPL domain dram
[ 221.673920] intel_rapl: DRAM domain energy unit 15300pj
[ 221.679756] intel_rapl: RAPL package 0 domain package locked by BIOS
[ 221.686859] intel_rapl: Found RAPL domain package
[ 221.692121] intel_rapl: Found RAPL domain dram
[ 221.697082] intel_rapl: DRAM domain energy unit 15300pj
[ 221.702910] intel_rapl: RAPL package 1 domain package locked by BIOS
[ 221.754660] EDAC sbridge: Seeking for: PCI ID 8086:6fa0
[ 221.754685] EDAC sbridge: Seeking for: PCI ID 8086:6fa0
[ 221.754706] EDAC sbridge: Seeking for: PCI ID 8086:6fa0
[ 221.754715] EDAC sbridge: Seeking for: PCI ID 8086:6f60
[ 221.754725] EDAC sbridge: Seeking for: PCI ID 8086:6f60
[ 221.754735] EDAC sbridge: Seeking for: PCI ID 8086:6f60
[ 221.754740] EDAC sbridge: Seeking for: PCI ID 8086:6fa8
[ 221.754748] EDAC sbridge: Seeking for: PCI ID 8086:6fa8
[ 221.754757] EDAC sbridge: Seeking for: PCI ID 8086:6fa8
[ 221.754763] EDAC sbridge: Seeking for: PCI ID 8086:6f71
[ 221.754771] EDAC sbridge: Seeking for: PCI ID 8086:6f71
[ 221.754780] EDAC sbridge: Seeking for: PCI ID 8086:6f71
[ 221.754783] EDAC sbridge: Seeking for: PCI ID 8086:6faa
[ 221.754793] EDAC sbridge: Seeking for: PCI ID 8086:6faa
[ 221.754802] EDAC sbridge: Seeking for: PCI ID 8086:6faa
[ 221.754806] EDAC sbridge: Seeking for: PCI ID 8086:6fab
[ 221.754815] EDAC sbridge: Seeking for: PCI ID 8086:6fab
[ 221.754824] EDAC sbridge: Seeking for: PCI ID 8086:6fab
[ 221.754828] EDAC sbridge: Seeking for: PCI ID 8086:6fac
[ 221.754844] EDAC sbridge: Seeking for: PCI ID 8086:6fad
[ 221.754859] EDAC sbridge: Seeking for: PCI ID 8086:6f68
[ 221.754869] EDAC sbridge: Seeking for: PCI ID 8086:6f68
[ 221.754878] EDAC sbridge: Seeking for: PCI ID 8086:6f68
[ 221.754881] EDAC sbridge: Seeking for: PCI ID 8086:6f79
[ 221.754891] EDAC sbridge: Seeking for: PCI ID 8086:6f79
[ 221.754902] EDAC sbridge: Seeking for: PCI ID 8086:6f79
[ 221.754905] EDAC sbridge: Seeking for: PCI ID 8086:6f6a
[ 221.754914] EDAC sbridge: Seeking for: PCI ID 8086:6f6a
[ 221.754923] EDAC sbridge: Seeking for: PCI ID 8086:6f6a
[ 221.754926] EDAC sbridge: Seeking for: PCI ID 8086:6f6b
[ 221.754936] EDAC sbridge: Seeking for: PCI ID 8086:6f6b
[ 221.754946] EDAC sbridge: Seeking for: PCI ID 8086:6f6b
[ 221.754948] EDAC sbridge: Seeking for: PCI ID 8086:6f6c
[ 221.754965] EDAC sbridge: Seeking for: PCI ID 8086:6f6d
[ 221.754981] EDAC sbridge: Seeking for: PCI ID 8086:6ffc
[ 221.754989] EDAC sbridge: Seeking for: PCI ID 8086:6ffc
[ 221.754998] EDAC sbridge: Seeking for: PCI ID 8086:6ffc
[ 221.755005] EDAC sbridge: Seeking for: PCI ID 8086:6ffd
[ 221.755013] EDAC sbridge: Seeking for: PCI ID 8086:6ffd
[ 221.755023] EDAC sbridge: Seeking for: PCI ID 8086:6ffd
[ 221.755029] EDAC sbridge: Seeking for: PCI ID 8086:6faf
[ 221.755037] EDAC sbridge: Seeking for: PCI ID 8086:6faf
[ 221.755046] EDAC sbridge: Seeking for: PCI ID 8086:6faf
[ 221.755787] EDAC MC0: Giving out device to 'sb_edac.c' 'Broadwell SrcID#0_Ha#0': DEV 0000:7f:12.0
[ 221.767543] EDAC MC1: Giving out device to 'sb_edac.c' 'Broadwell SrcID#1_Ha#0': DEV 0000:ff:12.0
[ 221.778683] EDAC MC2: Giving out device to 'sb_edac.c' 'Broadwell SrcID#0_Ha#1': DEV 0000:7f:12.4
[ 221.799887] EDAC MC3: Giving out device to 'sb_edac.c' 'Broadwell SrcID#1_Ha#1': DEV 0000:ff:12.4
[ 221.809795] EDAC sbridge: Ver: 1.1.2
[ 222.912241] dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.3)
[ 234.494616] Adding 4194300k swap on /dev/sda2. Priority:-2 extents:1 across:4194300k FS
[ 236.557029] mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v4.9-2.2.4
[ 236.567232] mlx4_ib_add: counter index 0 for port 1 allocated 0
[ 236.921263] mlx4_en: Mellanox ConnectX HCA Ethernet driver v4.9-2.2.4
[ 237.655672] card: mlx4_0, QP: 0x220, inline size: 120
[ 238.308725] tg3 0000:01:00.0: irq 241 for MSI/MSI-X
[ 238.308775] tg3 0000:01:00.0: irq 242 for MSI/MSI-X
[ 238.308795] tg3 0000:01:00.0: irq 243 for MSI/MSI-X
[ 238.308815] tg3 0000:01:00.0: irq 244 for MSI/MSI-X
[ 238.308837] tg3 0000:01:00.0: irq 245 for MSI/MSI-X
[ 238.437875] IPv6: ADDRCONF(NETDEV_UP): em1: link is not ready
[ 241.993690] tg3 0000:01:00.0 em1: Link is up at 1000 Mbps, full duplex
[ 242.000982] tg3 0000:01:00.0 em1: Flow control is off for TX and off for RX
[ 242.008751] tg3 0000:01:00.0 em1: EEE is enabled
[ 242.013932] IPv6: ADDRCONF(NETDEV_CHANGE): em1: link becomes ready
[ 242.785342] IPv6: ADDRCONF(NETDEV_UP): ib0: link is not ready
[ 242.808072] IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
[ 247.037060] FS-Cache: Loaded
[ 247.050173] LNet: HW NUMA nodes: 2, HW CPU cores: 48, npartitions: 2
[ 247.060660] alg: No test for adler32 (adler32-zlib)
[ 247.084549] FS-Cache: Netfs 'nfs' registered for caching
[ 247.107557] Key type dns_resolver registered
[ 247.143520] NFS: Registering the id_resolver key type
[ 247.149175] Key type id_resolver registered
[ 247.153845] Key type id_legacy registered
[ 247.889893] LNet: 50575:0:(config.c:1641:lnet_inet_enumerate()) lnet: Ignoring interface em2: it's down
[ 247.900413] LNet: Using FMR for registration
[ 247.914110] LNet: Added LNI 10.0.2.105@o2ib5 [8/256/0/180]
[ 247.954007] iTCO_vendor_support: vendor-support=0
[ 247.964757] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
[ 247.971098] iTCO_wdt: Found a Wellsburg TCO device (Version=2, TCOBASE=0x0460)
[ 247.980302] iTCO_wdt: initialized.
heartbeat=30 sec (nowayout=0) [ 289.888764] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4efce4c0) [ 289.896254] ses 1:0:183:0: [sg184] tag#1 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 289.904720] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 289.916090] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 289.924535] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 289.936698] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4efce4c0) [ 289.995134] ses 1:0:183:0: [sg184] tag#102 timing out command, waited 60s [ 350.943372] ses 1:0:183:0: attempting task abort! scmd(ffff8bca64fcb100) [ 350.950857] ses 1:0:183:0: [sg184] tag#17 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 350.959395] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 350.970751] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 350.979196] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 350.991615] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca64fcb100) [ 350.998860] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4efcc000) [ 351.006344] ses 1:0:183:0: [sg184] tag#0 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 351.010891] ses 1:0:183:0: [sg184] tag#92 timing out command, waited 60s [ 351.012883] ses 1:0:183:0: [sg184] tag#97 timing out command, waited 60s [ 351.029756] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.041114] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.049558] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.061845] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4efcc000) [ 351.069048] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab5a3c0fc0) [ 351.076535] ses 1:0:183:0: [sg184] tag#37 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 351.085082] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.096436] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.104883] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.117365] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab5a3c0fc0) [ 351.124565] ses 1:0:183:0: attempting task abort! scmd(ffff8baaa7382840) [ 351.132069] ses 1:0:183:0: [sg184] tag#23 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 351.140643] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.151999] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.160445] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.172943] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8baaa7382840) [ 351.180147] ses 1:0:183:0: attempting task abort! scmd(ffff8bca4ffe5500) [ 351.187690] ses 1:0:183:0: [sg184] tag#21 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 351.196241] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.207612] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.216057] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.228528] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca4ffe5500) [ 351.235744] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab4c7f7640) [ 351.243229] ses 1:0:183:0: [sg184] tag#9 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 351.251711] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.263083] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.271529] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.283944] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4c7f7640) [ 351.291162] ses 1:0:183:0: attempting task abort! scmd(ffff8be97149ebc0) [ 351.298666] ses 1:0:183:0: [sg184] tag#61 CDB: Inquiry 12 00 00 00 24 00 [ 351.306167] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.317536] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.325982] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.338481] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8be97149ebc0) [ 351.345684] ses 1:0:183:0: attempting task abort! scmd(ffff8baafbb6d880) [ 351.353187] ses 1:0:183:0: [sg184] tag#60 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 351.361733] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.373088] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.381535] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.393921] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8baafbb6d880) [ 351.401127] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bca64fd0700) [ 351.408669] ses 1:0:183:0: [sg184] tag#54 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 351.417222] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.428598] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.437046] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.449412] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca64fd0700) [ 351.456659] ses 1:0:183:0: attempting task abort! scmd(ffff8bab22fae680) [ 351.464145] ses 1:0:183:0: [sg184] tag#53 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 351.472692] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.484046] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.492492] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.504808] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab22fae680) [ 351.512028] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4eb8abc0) [ 351.519512] ses 1:0:183:0: [sg184] tag#49 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 351.528061] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.539416] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.547863] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.560206] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4eb8abc0) [ 351.567413] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bca64f9aa00) [ 351.574919] ses 1:0:183:0: [sg184] tag#3 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 351.583369] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.594723] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.603169] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.615459] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca64f9aa00) [ 351.622675] ses 1:0:183:0: attempting task abort! scmd(ffff8baaa7382300) [ 351.630171] ses 1:0:183:0: [sg184] tag#42 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 351.638735] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.650088] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.658533] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.670835] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8baaa7382300) [ 351.678036] ses 1:0:183:0: attempting task abort! scmd(ffff8bab39fae4c0) [ 351.685534] ses 1:0:183:0: [sg184] tag#34 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 351.694071] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.705424] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.713870] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.726196] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab39fae4c0) [ 351.733398] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab4efb1a40) [ 351.740879] ses 1:0:183:0: [sg184] tag#33 CDB: Inquiry 12 00 00 00 24 00 [ 351.748377] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.759732] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.768178] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.780493] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4efb1a40) [ 351.787692] ses 1:0:183:0: attempting task abort! scmd(ffff8bca673c64c0) [ 351.795191] ses 1:0:183:0: [sg184] tag#28 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 351.803756] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.815113] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.823559] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.835934] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca673c64c0) [ 351.843139] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4efb3d40) [ 351.850623] ses 1:0:183:0: [sg184] tag#25 CDB: Inquiry 12 00 00 00 24 00 [ 351.858104] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.869451] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.877895] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.890029] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4efb3d40) [ 351.897240] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab21fbe680) [ 351.904729] ses 1:0:183:0: [sg184] tag#22 CDB: Inquiry 12 00 00 00 24 00 [ 351.912212] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.923569] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.932016] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.944386] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab21fbe680) [ 351.951590] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4efcc1c0) [ 351.959074] ses 1:0:183:0: [sg184] tag#13 CDB: Inquiry 12 00 00 00 24 00 [ 351.966558] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 351.977913] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 351.986361] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 351.998655] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4efcc1c0) [ 352.005859] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4efb39c0) [ 352.013343] ses 1:0:183:0: [sg184] tag#12 CDB: Inquiry 12 00 00 00 24 00 [ 352.020831] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.032185] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.040632] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.052914] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4efb39c0) [ 352.060130] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab4cf86140) [ 352.067620] ses 1:0:183:0: [sg184] tag#104 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 352.076258] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.087616] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.096062] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.108328] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4cf86140) [ 352.115531] ses 1:0:183:0: attempting task abort! scmd(ffff8bca6a3ae4c0) [ 352.123017] ses 1:0:183:0: [sg184] tag#100 CDB: Inquiry 12 00 00 00 24 00 [ 352.130594] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.141950] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.150395] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.162712] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca6a3ae4c0) [ 352.169916] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4cf864c0) [ 352.177400] ses 1:0:183:0: [sg184] tag#99 CDB: Inquiry 12 00 00 00 24 00 [ 352.184880] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.196234] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.204680] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.216889] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4cf864c0) [ 352.224092] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bca63f82840) [ 352.231576] ses 1:0:183:0: [sg184] tag#96 CDB: Inquiry 12 00 00 00 24 00 [ 352.239060] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.250416] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.258862] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.271270] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca63f82840) [ 352.278478] ses 1:0:183:0: attempting task abort! scmd(ffff8bab21fbca80) [ 352.285962] ses 1:0:183:0: [sg184] tag#95 CDB: Inquiry 12 00 00 00 24 00 [ 352.293444] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.304800] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.313247] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.325545] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab21fbca80) [ 352.332749] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4c7908c0) [ 352.340235] ses 1:0:183:0: [sg184] tag#93 CDB: Inquiry 12 00 00 00 24 00 [ 352.347718] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.359077] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.367523] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.379872] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4c7908c0) [ 352.387084] ses 1:0:183:0: attempting task abort! scmd(ffff8bca64fd0380) [ 352.394573] ses 1:0:183:0: [sg184] tag#89 CDB: Inquiry 12 00 00 00 24 00 [ 352.402058] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.413417] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.421866] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.434168] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca64fd0380) [ 352.441374] ses 1:0:183:0: attempting task abort! 
scmd(ffff8be9742eabc0) [ 352.448860] ses 1:0:183:0: [sg184] tag#88 CDB: Inquiry 12 00 00 00 24 00 [ 352.456342] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.467697] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.476146] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.488499] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8be9742eabc0) [ 352.495704] ses 1:0:183:0: attempting task abort! scmd(ffff8bca7afcc700) [ 352.503188] ses 1:0:183:0: [sg184] tag#85 CDB: Inquiry 12 00 00 00 24 00 [ 352.510672] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.522027] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.530474] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.542780] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca7afcc700) [ 352.549987] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4c7f5dc0) [ 352.557476] ses 1:0:183:0: [sg184] tag#83 CDB: Inquiry 12 00 00 00 24 00 [ 352.564960] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.576315] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.584760] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.596990] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4c7f5dc0) [ 352.604192] ses 1:0:183:0: attempting task abort! scmd(ffff8bca673ac1c0) [ 352.611676] ses 1:0:183:0: [sg184] tag#81 CDB: Inquiry 12 00 00 00 24 00 [ 352.619159] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.630514] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.638960] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.651241] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca673ac1c0) [ 352.658445] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab2c79ce00) [ 352.665929] ses 1:0:183:0: [sg184] tag#80 CDB: Inquiry 12 00 00 00 24 00 [ 352.673408] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.684764] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.693209] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.705491] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab2c79ce00) [ 352.712693] ses 1:0:183:0: attempting task abort! scmd(ffff8bab793d56c0) [ 352.720181] ses 1:0:183:0: [sg184] tag#77 CDB: Inquiry 12 00 00 00 24 00 [ 352.727667] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.739024] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.747470] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.759802] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab793d56c0) [ 352.767015] ses 1:0:183:0: attempting task abort! scmd(ffff8bca673af2c0) [ 352.774500] ses 1:0:183:0: [sg184] tag#72 CDB: Inquiry 12 00 00 00 24 00 [ 352.781984] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.793340] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.801787] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.814047] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca673af2c0) [ 352.821256] ses 1:0:183:0: attempting task abort! scmd(ffff8bca64fd2bc0) [ 352.828748] ses 1:0:183:0: [sg184] tag#69 CDB: Inquiry 12 00 00 00 24 00 [ 352.836231] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.847587] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.856033] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.868272] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca64fd2bc0) [ 352.875479] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab4efced80) [ 352.882963] ses 1:0:183:0: [sg184] tag#8 CDB: Inquiry 12 00 00 00 24 00 [ 352.890347] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.901704] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.910152] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.922419] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4efced80) [ 352.929632] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4cf85500) [ 352.937120] ses 1:0:183:0: [sg184] tag#68 CDB: Inquiry 12 00 00 00 24 00 [ 352.944606] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 352.955964] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 352.964411] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 352.976683] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4cf85500) [ 352.983891] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4f7c8a80) [ 352.991386] ses 1:0:183:0: [sg184] tag#67 CDB: Inquiry 12 00 00 00 24 00 [ 352.998908] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.010270] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.018720] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.031067] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4f7c8a80) [ 353.038281] ses 1:0:183:0: attempting task abort! scmd(ffff8bab22faea00) [ 353.045771] ses 1:0:183:0: [sg184] tag#64 CDB: Inquiry 12 00 00 00 24 00 [ 353.053256] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.064616] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.073056] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.085379] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab22faea00) [ 353.092635] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab2c79c1c0) [ 353.100122] ses 1:0:183:0: [sg184] tag#63 CDB: Inquiry 12 00 00 00 24 00 [ 353.107606] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.118963] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.127412] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.139687] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab2c79c1c0) [ 353.146900] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4efb0540) [ 353.154390] ses 1:0:183:0: [sg184] tag#62 CDB: Inquiry 12 00 00 00 24 00 [ 353.161868] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.173227] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.181668] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.193968] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4efb0540) [ 353.201178] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4efb01c0) [ 353.208668] ses 1:0:183:0: [sg184] tag#59 CDB: Inquiry 12 00 00 00 24 00 [ 353.216157] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.227514] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.235964] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.248257] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4efb01c0) [ 353.255464] ses 1:0:183:0: attempting task abort! scmd(ffff8bab39facfc0) [ 353.262963] ses 1:0:183:0: [sg184] tag#58 CDB: Inquiry 12 00 00 00 24 00 [ 353.270453] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.281802] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.290250] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.302523] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab39facfc0) [ 353.309725] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab39faed80) [ 353.317220] ses 1:0:183:0: [sg184] tag#56 CDB: Inquiry 12 00 00 00 24 00 [ 353.324720] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.336079] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.344529] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.356837] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab39faed80) [ 353.364071] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4cf848c0) [ 353.371567] ses 1:0:183:0: [sg184] tag#52 CDB: Inquiry 12 00 00 00 24 00 [ 353.379056] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.390414] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.398864] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.411185] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4cf848c0) [ 353.418398] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4cf86300) [ 353.425892] ses 1:0:183:0: [sg184] tag#51 CDB: Inquiry 12 00 00 00 24 00 [ 353.433388] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.444748] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.453190] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.465549] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4cf86300) [ 353.472757] ses 1:0:183:0: attempting task abort! scmd(ffff8bab39fafd40) [ 353.480252] ses 1:0:183:0: [sg184] tag#48 CDB: Inquiry 12 00 00 00 24 00 [ 353.487741] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.499100] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.507548] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.519851] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab39fafd40) [ 353.527062] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab4cf84700) [ 353.534551] ses 1:0:183:0: [sg184] tag#47 CDB: Inquiry 12 00 00 00 24 00 [ 353.542065] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.553422] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.561871] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.574175] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4cf84700) [ 353.581380] ses 1:0:183:0: attempting task abort! scmd(ffff8bab39fadf80) [ 353.588868] ses 1:0:183:0: [sg184] tag#44 CDB: Inquiry 12 00 00 00 24 00 [ 353.596349] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.607706] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.616153] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.628187] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab39fadf80) [ 353.635392] ses 1:0:183:0: attempting task abort! scmd(ffff8bca673aebc0) [ 353.642880] ses 1:0:183:0: [sg184] tag#43 CDB: Inquiry 12 00 00 00 24 00 [ 353.650367] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.661725] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.670171] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.682464] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca673aebc0) [ 353.689668] ses 1:0:183:0: attempting task abort! scmd(ffff8bab22fada40) [ 353.697151] ses 1:0:183:0: [sg184] tag#40 CDB: Inquiry 12 00 00 00 24 00 [ 353.704630] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.715987] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.724432] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.736684] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab22fada40) [ 353.743890] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab26382300) [ 353.751385] ses 1:0:183:0: [sg184] tag#39 CDB: Inquiry 12 00 00 00 24 00 [ 353.758868] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.770227] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.778674] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.790979] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab26382300) [ 353.798187] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4d7ece00) [ 353.805672] ses 1:0:183:0: [sg184] tag#36 CDB: Inquiry 12 00 00 00 24 00 [ 353.813156] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.824515] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.832961] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.845263] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4d7ece00) [ 353.852467] ses 1:0:183:0: attempting task abort! scmd(ffff8bab2c79d6c0) [ 353.859955] ses 1:0:183:0: [sg184] tag#35 CDB: Inquiry 12 00 00 00 24 00 [ 353.867441] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.878798] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.887247] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.899554] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab2c79d6c0) [ 353.906754] ses 1:0:183:0: attempting task abort! scmd(ffff8bab21fbcfc0) [ 353.914243] ses 1:0:183:0: [sg184] tag#31 CDB: Inquiry 12 00 00 00 24 00 [ 353.921732] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.933089] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.941538] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 353.953791] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab21fbcfc0) [ 353.960998] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab2c79e300) [ 353.968490] ses 1:0:183:0: [sg184] tag#30 CDB: Inquiry 12 00 00 00 24 00 [ 353.975976] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 353.987334] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 353.995782] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.008020] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab2c79e300) [ 354.015226] ses 1:0:183:0: attempting task abort! scmd(ffff8bab2c79e840) [ 354.022720] ses 1:0:183:0: [sg184] tag#27 CDB: Inquiry 12 00 00 00 24 00 [ 354.030212] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.041571] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.050018] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.062014] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab2c79e840) [ 354.069223] ses 1:0:183:0: attempting task abort! scmd(ffff8bab22fae140) [ 354.076709] ses 1:0:183:0: [sg184] tag#26 CDB: Inquiry 12 00 00 00 24 00 [ 354.084198] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.095556] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.104004] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.116258] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab22fae140) [ 354.123458] ses 1:0:183:0: attempting task abort! scmd(ffff8bca64face00) [ 354.130940] ses 1:0:183:0: [sg184] tag#24 CDB: Inquiry 12 00 00 00 24 00 [ 354.138428] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.149787] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.158237] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.170465] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca64face00) [ 354.177669] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab22fadc00) [ 354.185156] ses 1:0:183:0: [sg184] tag#20 CDB: Inquiry 12 00 00 00 24 00 [ 354.192636] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.203994] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.212441] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.224533] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab22fadc00) [ 354.231736] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4efcd500) [ 354.239227] ses 1:0:183:0: [sg184] tag#19 CDB: Inquiry 12 00 00 00 24 00 [ 354.246718] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.258076] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.266525] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.278775] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4efcd500) [ 354.285980] ses 1:0:183:0: attempting task abort! scmd(ffff8bca62fc8700) [ 354.293466] ses 1:0:183:0: [sg184] tag#16 CDB: Inquiry 12 00 00 00 24 00 [ 354.300949] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.312306] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.320753] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.333008] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca62fc8700) [ 354.340212] ses 1:0:183:0: attempting task abort! scmd(ffff8bab2eb848c0) [ 354.347704] ses 1:0:183:0: [sg184] tag#15 CDB: Inquiry 12 00 00 00 24 00 [ 354.355211] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.366560] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.375008] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.386934] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab2eb848c0) [ 354.394149] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab2eb84e00) [ 354.401647] ses 1:0:183:0: [sg184] tag#14 CDB: Inquiry 12 00 00 00 24 00 [ 354.409192] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.420550] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.428998] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.441266] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab2eb84e00) [ 354.448487] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4c7f4c40) [ 354.455977] ses 1:0:183:0: [sg184] tag#11 CDB: Inquiry 12 00 00 00 24 00 [ 354.463471] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.474833] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.483283] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.495539] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4c7f4c40) [ 354.502824] ses 1:0:183:0: attempting task abort! scmd(ffff8bab39fae300) [ 354.510317] ses 1:0:183:0: [sg184] tag#10 CDB: Inquiry 12 00 00 00 24 00 [ 354.517808] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.529169] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.537619] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.549922] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab39fae300) [ 354.557142] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4f7c81c0) [ 354.564643] ses 1:0:183:0: [sg184] tag#7 CDB: Inquiry 12 00 00 00 24 00 [ 354.572036] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.583394] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.591842] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.604324] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4f7c81c0) [ 354.611533] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bca7afcd6c0) [ 354.619032] ses 1:0:183:0: [sg184] tag#6 CDB: Inquiry 12 00 00 00 24 00 [ 354.626424] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.637782] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.646231] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.658550] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca7afcd6c0) [ 354.665761] ses 1:0:183:0: attempting task abort! scmd(ffff8bca673c5880) [ 354.673255] ses 1:0:183:0: [sg184] tag#4 CDB: Inquiry 12 00 00 00 24 00 [ 354.680649] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.692009] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.700457] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.712656] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca673c5880) [ 354.719859] ses 1:0:183:0: attempting task abort! scmd(ffff8bca673c7d40) [ 354.727344] ses 1:0:183:0: [sg184] tag#2 CDB: Inquiry 12 00 00 00 24 00 [ 354.734730] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 354.746088] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 354.754537] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 354.766777] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca673c7d40) [ 411.758951] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4cff2300) [ 411.766447] ses 1:0:183:0: [sg184] tag#65 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 411.773949] ses 12:0:183:0: attempting task abort! 
scmd(ffff8bca673adc00) [ 411.773956] ses 12:0:183:0: [sg673] tag#0 CDB: Inquiry 12 00 00 00 24 00 [ 411.773960] scsi target12:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab903d), phy(76) [ 411.773962] scsi target12:0:183: enclosurelogical id(0x5001636001ab903d), slot(60) [ 411.773963] scsi target12:0:183: enclosure level(0x0001), connector name( ) [ 411.778504] ses 12:0:183:0: task abort: SUCCESS scmd(ffff8bca673adc00) [ 411.778520] ses 12:0:183:0: attempting task abort! scmd(ffff8bca673c6bc0) [ 411.778524] ses 12:0:183:0: [sg673] tag#7 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 411.778527] scsi target12:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab903d), phy(76) [ 411.778528] scsi target12:0:183: enclosurelogical id(0x5001636001ab903d), slot(60) [ 411.778530] scsi target12:0:183: enclosure level(0x0001), connector name( ) [ 411.782941] ses 12:0:183:0: task abort: SUCCESS scmd(ffff8bca673c6bc0) [ 411.782950] ses 12:0:183:0: attempting task abort! scmd(ffff8bca74743100) [ 411.782953] ses 12:0:183:0: [sg673] tag#5 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 411.782955] scsi target12:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab903d), phy(76) [ 411.782956] scsi target12:0:183: enclosurelogical id(0x5001636001ab903d), slot(60) [ 411.782957] scsi target12:0:183: enclosure level(0x0001), connector name( ) [ 411.787606] ses 12:0:183:0: task abort: SUCCESS scmd(ffff8bca74743100) [ 411.787617] ses 12:0:183:0: attempting task abort! 
scmd(ffff8bcaeafe4540) [ 411.787619] ses 12:0:183:0: [sg673] tag#4 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 411.787621] scsi target12:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab903d), phy(76) [ 411.787622] scsi target12:0:183: enclosurelogical id(0x5001636001ab903d), slot(60) [ 411.787623] scsi target12:0:183: enclosure level(0x0001), connector name( ) [ 411.792025] ses 12:0:183:0: task abort: SUCCESS scmd(ffff8bcaeafe4540) [ 411.980177] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 411.991536] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 411.999974] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.011549] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4cff2300) [ 412.018752] ses 1:0:183:0: attempting task abort! scmd(ffff8bca673c5dc0) [ 412.026242] ses 1:0:183:0: [sg184] tag#57 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 412.034783] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.046140] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.054588] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.066165] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca673c5dc0) [ 412.073388] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4cff0540) [ 412.080876] ses 1:0:183:0: [sg184] tag#1 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 412.089327] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.100675] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.109121] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.120682] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4cff0540) [ 412.127888] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab4c7f7100) [ 412.135376] ses 1:0:183:0: [sg184] tag#32 CDB: Inquiry 12 00 00 00 24 00 [ 412.142860] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.154218] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.162668] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.175100] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4c7f7100) [ 412.182310] ses 1:0:183:0: attempting task abort! scmd(ffff8bca673b8700) [ 412.189796] ses 1:0:183:0: [sg184] tag#105 CDB: Inquiry 12 00 00 00 24 00 [ 412.197379] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.208736] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.217184] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.229261] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca673b8700) [ 412.236563] ses 1:0:183:0: attempting task abort! scmd(ffff8bca673ae680) [ 412.244052] ses 1:0:183:0: [sg184] tag#103 CDB: Inquiry 12 00 00 00 24 00 [ 412.251636] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.262994] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.271441] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.283176] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca673ae680) [ 412.292935] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4efcd6c0) [ 412.300420] ses 1:0:183:0: [sg184] tag#102 CDB: Inquiry 12 00 00 00 24 00 [ 412.308016] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.319375] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.327823] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.339371] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4efcd6c0) [ 412.346584] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bca673c6680) [ 412.354071] ses 1:0:183:0: [sg184] tag#94 CDB: Inquiry 12 00 00 00 24 00 [ 412.361558] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.372914] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.381361] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.392886] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca673c6680) [ 412.400089] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4c7f7d40) [ 412.407574] ses 1:0:183:0: [sg184] tag#91 CDB: Inquiry 12 00 00 00 24 00 [ 412.415057] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.426414] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.434861] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.446342] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4c7f7d40) [ 412.453546] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4f7c9f80) [ 412.461030] ses 1:0:183:0: [sg184] tag#84 CDB: Inquiry 12 00 00 00 24 00 [ 412.468514] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.479871] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.488318] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.499799] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4f7c9f80) [ 412.507002] ses 1:0:183:0: attempting task abort! scmd(ffff8bca6a3aed80) [ 412.514488] ses 1:0:183:0: [sg184] tag#82 CDB: Inquiry 12 00 00 00 24 00 [ 412.521979] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.533337] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.541784] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.553293] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca6a3aed80) [ 412.560500] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bca673c6f40) [ 412.567988] ses 1:0:183:0: [sg184] tag#75 CDB: Inquiry 12 00 00 00 24 00 [ 412.575473] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.586831] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.595279] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.606797] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca673c6f40) [ 412.614001] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4c7f7b80) [ 412.621486] ses 1:0:183:0: [sg184] tag#74 CDB: Inquiry 12 00 00 00 24 00 [ 412.628968] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.640326] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.648776] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.660308] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4c7f7b80) [ 412.667513] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4efcda40) [ 412.674998] ses 1:0:183:0: [sg184] tag#66 CDB: Inquiry 12 00 00 00 24 00 [ 412.682477] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.693834] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.702281] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.713829] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4efcda40) [ 412.721035] ses 1:0:183:0: attempting task abort! scmd(ffff8bca673c5180) [ 412.728526] ses 1:0:183:0: [sg184] tag#55 CDB: Inquiry 12 00 00 00 24 00 [ 412.736010] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.747368] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.755816] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.767321] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca673c5180) [ 412.774527] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bca64fd32c0) [ 412.782010] ses 1:0:183:0: [sg184] tag#50 CDB: Inquiry 12 00 00 00 24 00 [ 412.789493] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.800850] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.809299] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.820877] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bca64fd32c0) [ 412.828078] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4efce840) [ 412.835563] ses 1:0:183:0: [sg184] tag#45 CDB: Inquiry 12 00 00 00 24 00 [ 412.843045] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.854401] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.862848] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.875202] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4efce840) [ 412.882408] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4c7f72c0) [ 412.889899] ses 1:0:183:0: [sg184] tag#41 CDB: Inquiry 12 00 00 00 24 00 [ 412.897383] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.908741] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.917188] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.928682] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4c7f72c0) [ 412.935885] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4c7f6f40) [ 412.943372] ses 1:0:183:0: [sg184] tag#38 CDB: Inquiry 12 00 00 00 24 00 [ 412.950856] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 412.962214] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 412.970662] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 412.982189] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4c7f6f40) [ 412.989393] ses 1:0:183:0: attempting task abort! 
scmd(ffff8bab4c7f6d80) [ 412.996894] ses 1:0:183:0: [sg184] tag#29 CDB: Inquiry 12 00 00 00 24 00 [ 413.004377] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 413.015734] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 413.024183] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 413.035684] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4c7f6d80) [ 413.042890] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4c7f6bc0) [ 413.050376] ses 1:0:183:0: [sg184] tag#18 CDB: Inquiry 12 00 00 00 24 00 [ 413.057856] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 413.069215] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 413.077664] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 413.089129] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4c7f6bc0) [ 413.096328] ses 1:0:183:0: attempting task abort! scmd(ffff8bab4c7f6a00) [ 413.103819] ses 1:0:183:0: [sg184] tag#17 CDB: Inquiry 12 00 00 00 24 00 [ 413.111304] scsi target1:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab917d), phy(76) [ 413.122661] scsi target1:0:183: enclosurelogical id(0x5001636001ab917d), slot(60) [ 413.131109] scsi target1:0:183: enclosure level(0x0001), connector name( ) [ 413.142595] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8bab4c7f6a00) [ 1972.458397] Lustre: Lustre: Build Version: 2.12.6_1_g68bbfcf [ 1984.337273] md: md46 stopped. 
[ 1984.371514] async_tx: api initialized (async) [ 1984.378180] xor: automatically using best checksumming function: [ 1984.394729] avx : 25084.000 MB/sec [ 1984.427733] raid6: sse2x1 gen() 8179 MB/s [ 1984.448733] raid6: sse2x2 gen() 9988 MB/s [ 1984.469728] raid6: sse2x4 gen() 11726 MB/s [ 1984.490749] raid6: avx2x1 gen() 15847 MB/s [ 1984.511728] raid6: avx2x2 gen() 18402 MB/s [ 1984.532751] raid6: avx2x4 gen() 21234 MB/s [ 1984.537514] raid6: using algorithm avx2x4 gen() (21234 MB/s) [ 1984.543828] raid6: using avx2x2 recovery algorithm [ 1984.567632] md/raid:md46: device dm-478 operational as raid disk 0 [ 1984.574534] md/raid:md46: device dm-474 operational as raid disk 9 [ 1984.581438] md/raid:md46: device dm-473 operational as raid disk 8 [ 1984.588343] md/raid:md46: device dm-462 operational as raid disk 7 [ 1984.595238] md/raid:md46: device dm-460 operational as raid disk 6 [ 1984.602135] md/raid:md46: device dm-454 operational as raid disk 5 [ 1984.609032] md/raid:md46: device dm-450 operational as raid disk 4 [ 1984.615930] md/raid:md46: device dm-441 operational as raid disk 3 [ 1984.622829] md/raid:md46: device dm-427 operational as raid disk 2 [ 1984.629722] md/raid:md46: device dm-479 operational as raid disk 1 [ 1984.637911] md/raid:md46: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1984.689637] md46: detected capacity change from 0 to 64011431837696 [ 1984.703544] md: md42 stopped. 
[ 1984.731587] md/raid:md42: device dm-431 operational as raid disk 0 [ 1984.738494] md/raid:md42: device dm-459 operational as raid disk 9 [ 1984.745399] md/raid:md42: device dm-456 operational as raid disk 8 [ 1984.752298] md/raid:md42: device dm-448 operational as raid disk 7 [ 1984.759194] md/raid:md42: device dm-444 operational as raid disk 6 [ 1984.766094] md/raid:md42: device dm-433 operational as raid disk 5 [ 1984.772993] md/raid:md42: device dm-436 operational as raid disk 4 [ 1984.779890] md/raid:md42: device dm-432 operational as raid disk 3 [ 1984.786789] md/raid:md42: device dm-418 operational as raid disk 2 [ 1984.793686] md/raid:md42: device dm-430 operational as raid disk 1 [ 1984.801580] md/raid:md42: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1984.829770] md42: detected capacity change from 0 to 64011431837696 [ 1984.884255] md: md26 stopped. [ 1984.904883] md/raid:md26: not clean -- starting background reconstruction [ 1984.912627] md/raid:md26: device dm-271 operational as raid disk 0 [ 1984.919555] md/raid:md26: device dm-285 operational as raid disk 9 [ 1984.926464] md/raid:md26: device dm-292 operational as raid disk 8 [ 1984.933369] md/raid:md26: device dm-273 operational as raid disk 7 [ 1984.940269] md/raid:md26: device dm-276 operational as raid disk 6 [ 1984.947167] md/raid:md26: device dm-264 operational as raid disk 5 [ 1984.954069] md/raid:md26: device dm-270 operational as raid disk 4 [ 1984.960969] md/raid:md26: device dm-246 operational as raid disk 3 [ 1984.967867] md/raid:md26: device dm-242 operational as raid disk 2 [ 1984.974775] md/raid:md26: device dm-300 operational as raid disk 1 [ 1984.982538] md/raid:md26: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1985.022510] md26: detected capacity change from 0 to 64011431837696 [ 1985.029585] md: resync of RAID array md26 [ 1985.047451] md: md28 stopped. 
[ 1985.068173] md/raid:md28: device dm-294 operational as raid disk 0 [ 1985.075082] md/raid:md28: device dm-303 operational as raid disk 9 [ 1985.081989] md/raid:md28: device dm-283 operational as raid disk 8 [ 1985.088903] md/raid:md28: device dm-287 operational as raid disk 7 [ 1985.095805] md/raid:md28: device dm-281 operational as raid disk 6 [ 1985.102710] md/raid:md28: device dm-254 operational as raid disk 5 [ 1985.109630] md/raid:md28: device dm-253 operational as raid disk 4 [ 1985.116535] md/raid:md28: device dm-255 operational as raid disk 3 [ 1985.123438] md/raid:md28: device dm-267 operational as raid disk 2 [ 1985.130365] md/raid:md28: device dm-298 operational as raid disk 1 [ 1985.138149] md/raid:md28: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1985.170151] md28: detected capacity change from 0 to 64011431837696 [ 1985.195639] md: md12 stopped. [ 1985.217864] md/raid:md12: device dm-82 operational as raid disk 0 [ 1985.224676] md/raid:md12: device dm-130 operational as raid disk 9 [ 1985.231641] md/raid:md12: device dm-128 operational as raid disk 8 [ 1985.238559] md/raid:md12: device dm-86 operational as raid disk 7 [ 1985.245409] md/raid:md12: device dm-80 operational as raid disk 6 [ 1985.252248] md/raid:md12: device dm-100 operational as raid disk 5 [ 1985.259207] md/raid:md12: device dm-77 operational as raid disk 4 [ 1985.266029] md/raid:md12: device dm-84 operational as raid disk 3 [ 1985.272839] md/raid:md12: device dm-50 operational as raid disk 2 [ 1985.279650] md/raid:md12: device dm-87 operational as raid disk 1 [ 1985.287833] md/raid:md12: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1985.334697] md12: detected capacity change from 0 to 64011431837696 [ 1985.358214] md: md2 stopped. 
[ 1985.396646] md/raid:md2: device dm-47 operational as raid disk 0 [ 1985.403365] md/raid:md2: device dm-81 operational as raid disk 9 [ 1985.410094] md/raid:md2: device dm-56 operational as raid disk 8 [ 1985.416805] md/raid:md2: device dm-43 operational as raid disk 7 [ 1985.423525] md/raid:md2: device dm-55 operational as raid disk 6 [ 1985.430239] md/raid:md2: device dm-2 operational as raid disk 5 [ 1985.436856] md/raid:md2: device dm-3 operational as raid disk 4 [ 1985.439127] LDISKFS-fs warning (device md46): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1985.458136] md/raid:md2: device dm-180 operational as raid disk 3 [ 1985.464948] md/raid:md2: device dm-193 operational as raid disk 2 [ 1985.471752] md/raid:md2: device dm-83 operational as raid disk 1 [ 1985.479313] md/raid:md2: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1985.520014] md2: detected capacity change from 0 to 64011431837696 [ 1985.747689] LDISKFS-fs warning (device md42): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1986.023154] LDISKFS-fs warning (device md26): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1986.116540] LDISKFS-fs warning (device md28): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1986.273982] LDISKFS-fs warning (device md12): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1986.389769] md: md4 stopped. [ 1986.412880] LDISKFS-fs warning (device md2): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. 
[ 1986.428153] md/raid:md4: device dm-127 operational as raid disk 0 [ 1986.434961] md/raid:md4: device dm-107 operational as raid disk 9 [ 1986.441764] md/raid:md4: device dm-79 operational as raid disk 8 [ 1986.448468] md/raid:md4: device dm-66 operational as raid disk 7 [ 1986.455172] md/raid:md4: device dm-48 operational as raid disk 6 [ 1986.461874] md/raid:md4: device dm-42 operational as raid disk 5 [ 1986.468577] md/raid:md4: device dm-14 operational as raid disk 4 [ 1986.475281] md/raid:md4: device dm-0 operational as raid disk 3 [ 1986.481888] md/raid:md4: device dm-4 operational as raid disk 2 [ 1986.488495] md/raid:md4: device dm-108 operational as raid disk 1 [ 1986.496259] md/raid:md4: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1986.535652] md4: detected capacity change from 0 to 64011431837696 [ 1986.553398] md: md24 stopped. [ 1986.600549] md/raid:md24: device dm-244 operational as raid disk 0 [ 1986.607468] md/raid:md24: device dm-278 operational as raid disk 9 [ 1986.614380] md/raid:md24: device dm-280 operational as raid disk 8 [ 1986.621293] md/raid:md24: device dm-274 operational as raid disk 7 [ 1986.628203] md/raid:md24: device dm-279 operational as raid disk 6 [ 1986.635115] md/raid:md24: device dm-256 operational as raid disk 5 [ 1986.642025] md/raid:md24: device dm-259 operational as raid disk 4 [ 1986.648936] md/raid:md24: device dm-247 operational as raid disk 3 [ 1986.655848] md/raid:md24: device dm-241 operational as raid disk 2 [ 1986.662757] md/raid:md24: device dm-237 operational as raid disk 1 [ 1986.671046] md/raid:md24: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1986.713230] md24: detected capacity change from 0 to 64011431837696 [ 1986.727198] md: md0 stopped. 
[ 1986.760324] md/raid:md0: device dm-160 operational as raid disk 0 [ 1986.767140] md/raid:md0: device dm-67 operational as raid disk 9 [ 1986.773865] md/raid:md0: device dm-52 operational as raid disk 8 [ 1986.780572] md/raid:md0: device dm-37 operational as raid disk 7 [ 1986.787274] md/raid:md0: device dm-19 operational as raid disk 6 [ 1986.793978] md/raid:md0: device dm-15 operational as raid disk 5 [ 1986.800684] md/raid:md0: device dm-11 operational as raid disk 4 [ 1986.807379] md/raid:md0: device dm-195 operational as raid disk 3 [ 1986.814179] md/raid:md0: device dm-189 operational as raid disk 2 [ 1986.820979] md/raid:md0: device dm-172 operational as raid disk 1 [ 1986.828648] md/raid:md0: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1986.871390] md0: detected capacity change from 0 to 64011431837696 [ 1986.945291] md: md32 stopped. [ 1986.978948] md/raid:md32: device dm-338 operational as raid disk 0 [ 1986.985850] md/raid:md32: device dm-360 operational as raid disk 9 [ 1986.992743] md/raid:md32: device dm-348 operational as raid disk 8 [ 1986.999644] md/raid:md32: device dm-342 operational as raid disk 7 [ 1987.006551] md/raid:md32: device dm-324 operational as raid disk 6 [ 1987.013451] md/raid:md32: device dm-321 operational as raid disk 5 [ 1987.020351] md/raid:md32: device dm-329 operational as raid disk 4 [ 1987.027243] md/raid:md32: device dm-310 operational as raid disk 3 [ 1987.034145] md/raid:md32: device dm-318 operational as raid disk 2 [ 1987.041046] md/raid:md32: device dm-344 operational as raid disk 1 [ 1987.048831] md/raid:md32: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1987.080620] md32: detected capacity change from 0 to 64011431837696 [ 1987.144477] md: md10 stopped. 
[ 1987.188503] md/raid:md10: device dm-63 operational as raid disk 0 [ 1987.195375] md/raid:md10: device dm-46 operational as raid disk 9 [ 1987.202217] md/raid:md10: device dm-45 operational as raid disk 8 [ 1987.209032] md/raid:md10: device dm-17 operational as raid disk 7 [ 1987.215852] md/raid:md10: device dm-9 operational as raid disk 6 [ 1987.222578] md/raid:md10: device dm-238 operational as raid disk 5 [ 1987.229501] md/raid:md10: device dm-263 operational as raid disk 4 [ 1987.236416] md/raid:md10: device dm-173 operational as raid disk 3 [ 1987.243324] md/raid:md10: device dm-154 operational as raid disk 2 [ 1987.249460] LDISKFS-fs warning (device md4): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1987.264791] md/raid:md10: device dm-58 operational as raid disk 1 [ 1987.273207] md/raid:md10: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1987.344941] md10: detected capacity change from 0 to 64011431837696 [ 1987.356111] md: md20 stopped. [ 1987.407159] md/raid:md20: device dm-213 operational as raid disk 0 [ 1987.414066] md/raid:md20: device dm-231 operational as raid disk 9 [ 1987.420970] md/raid:md20: device dm-233 operational as raid disk 8 [ 1987.427871] md/raid:md20: device dm-205 operational as raid disk 7 [ 1987.434804] md/raid:md20: device dm-206 operational as raid disk 6 [ 1987.441709] md/raid:md20: device dm-210 operational as raid disk 5 [ 1987.448635] md/raid:md20: device dm-201 operational as raid disk 4 [ 1987.455542] md/raid:md20: device dm-163 operational as raid disk 3 [ 1987.462445] md/raid:md20: device dm-158 operational as raid disk 2 [ 1987.469350] md/raid:md20: device dm-230 operational as raid disk 1 [ 1987.477376] md/raid:md20: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1987.518732] md20: detected capacity change from 0 to 64011431837696 [ 1987.540167] md: md6 stopped. 
[ 1987.578152] md/raid:md6: device dm-137 operational as raid disk 0 [ 1987.584963] md/raid:md6: device dm-27 operational as raid disk 9 [ 1987.591676] md/raid:md6: device dm-38 operational as raid disk 8 [ 1987.598389] md/raid:md6: device dm-24 operational as raid disk 7 [ 1987.605103] md/raid:md6: device dm-169 operational as raid disk 5 [ 1987.611914] md/raid:md6: device dm-166 operational as raid disk 4 [ 1987.618752] md/raid:md6: device dm-145 operational as raid disk 3 [ 1987.625585] md/raid:md6: device dm-123 operational as raid disk 2 [ 1987.632394] md/raid:md6: device dm-126 operational as raid disk 1 [ 1987.640305] md/raid:md6: raid level 6 active with 9 out of 10 devices, algorithm 2 [ 1987.679504] md6: detected capacity change from 0 to 64011431837696 [ 1987.686506] md: recovery of RAID array md6 [ 1987.694070] LDISKFS-fs warning (device md24): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1987.801497] md: md30 stopped. [ 1987.858515] md/raid:md30: device dm-299 operational as raid disk 0 [ 1987.865878] md/raid:md30: device dm-339 operational as raid disk 9 [ 1987.872878] md/raid:md30: device dm-345 operational as raid disk 8 [ 1987.879928] md/raid:md30: device dm-322 operational as raid disk 7 [ 1987.880807] LDISKFS-fs warning (device md0): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. 
[ 1987.901447] md/raid:md30: device dm-335 operational as raid disk 6 [ 1987.908351] md/raid:md30: device dm-325 operational as raid disk 5 [ 1987.915253] md/raid:md30: device dm-323 operational as raid disk 4 [ 1987.922156] md/raid:md30: device dm-307 operational as raid disk 3 [ 1987.929060] md/raid:md30: device dm-297 operational as raid disk 2 [ 1987.935960] md/raid:md30: device dm-306 operational as raid disk 1 [ 1987.948306] md/raid:md30: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1987.994015] md30: detected capacity change from 0 to 64011431837696 [ 1988.014145] md: md8 stopped. [ 1988.036907] md/raid:md8: device dm-25 operational as raid disk 0 [ 1988.043617] md/raid:md8: device dm-62 operational as raid disk 9 [ 1988.050328] md/raid:md8: device dm-33 operational as raid disk 8 [ 1988.057039] md/raid:md8: device dm-20 operational as raid disk 7 [ 1988.063755] md/raid:md8: device dm-18 operational as raid disk 6 [ 1988.070554] md/raid:md8: device dm-178 operational as raid disk 5 [ 1988.077356] md/raid:md8: device dm-182 operational as raid disk 4 [ 1988.084159] md/raid:md8: device dm-157 operational as raid disk 3 [ 1988.091027] md/raid:md8: device dm-164 operational as raid disk 2 [ 1988.097989] md/raid:md8: device dm-41 operational as raid disk 1 [ 1988.105849] md/raid:md8: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1988.124932] LDISKFS-fs warning (device md32): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1988.157890] md8: detected capacity change from 0 to 64011431837696 [ 1988.183595] md: md18 stopped. 
[ 1988.217190] md/raid:md18: device dm-136 operational as raid disk 0 [ 1988.224101] md/raid:md18: device dm-227 operational as raid disk 9 [ 1988.231006] md/raid:md18: device dm-219 operational as raid disk 8 [ 1988.237907] md/raid:md18: device dm-217 operational as raid disk 7 [ 1988.244813] md/raid:md18: device dm-202 operational as raid disk 6 [ 1988.251711] md/raid:md18: device dm-203 operational as raid disk 5 [ 1988.258613] md/raid:md18: device dm-197 operational as raid disk 4 [ 1988.265516] md/raid:md18: device dm-150 operational as raid disk 3 [ 1988.272423] md/raid:md18: device dm-151 operational as raid disk 2 [ 1988.279324] md/raid:md18: device dm-134 operational as raid disk 1 [ 1988.287167] md/raid:md18: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1988.328689] md18: detected capacity change from 0 to 64011431837696 [ 1988.363833] md: md44 stopped. [ 1988.403644] md/raid:md44: device dm-451 operational as raid disk 0 [ 1988.410558] md/raid:md44: device dm-466 operational as raid disk 9 [ 1988.414021] LDISKFS-fs warning (device md10): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1988.432138] md/raid:md44: device dm-476 operational as raid disk 8 [ 1988.439039] md/raid:md44: device dm-455 operational as raid disk 7 [ 1988.445941] md/raid:md44: device dm-457 operational as raid disk 6 [ 1988.452842] md/raid:md44: device dm-437 operational as raid disk 5 [ 1988.459741] md/raid:md44: device dm-435 operational as raid disk 4 [ 1988.464334] LDISKFS-fs warning (device md20): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. 
[ 1988.481294] md/raid:md44: device dm-425 operational as raid disk 3 [ 1988.488193] md/raid:md44: device dm-429 operational as raid disk 2 [ 1988.495085] md/raid:md44: device dm-461 operational as raid disk 1 [ 1988.502851] md/raid:md44: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1988.546069] md44: detected capacity change from 0 to 64011431837696 [ 1988.585727] md: md36 stopped. [ 1988.629419] md/raid:md36: device dm-369 operational as raid disk 0 [ 1988.636326] md/raid:md36: device dm-406 operational as raid disk 9 [ 1988.643232] md/raid:md36: device dm-394 operational as raid disk 8 [ 1988.650148] md/raid:md36: device dm-375 operational as raid disk 7 [ 1988.657052] md/raid:md36: device dm-383 operational as raid disk 6 [ 1988.663952] md/raid:md36: device dm-384 operational as raid disk 5 [ 1988.670844] md/raid:md36: device dm-366 operational as raid disk 4 [ 1988.677745] md/raid:md36: device dm-355 operational as raid disk 3 [ 1988.684646] md/raid:md36: device dm-376 operational as raid disk 2 [ 1988.691578] md/raid:md36: device dm-363 operational as raid disk 1 [ 1988.699436] md/raid:md36: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1988.751563] md36: detected capacity change from 0 to 64011431837696 [ 1988.769778] md: md14 stopped. [ 1988.808150] md/raid:md14: not clean -- starting background reconstruction [ 1988.808878] LDISKFS-fs warning (device md6): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. 
[ 1988.830445] md/raid:md14: device dm-96 operational as raid disk 0 [ 1988.837257] md/raid:md14: device dm-122 operational as raid disk 9 [ 1988.844156] md/raid:md14: device dm-118 operational as raid disk 8 [ 1988.851096] md/raid:md14: device dm-91 operational as raid disk 7 [ 1988.857911] md/raid:md14: device dm-90 operational as raid disk 6 [ 1988.864775] md/raid:md14: device dm-97 operational as raid disk 5 [ 1988.871655] md/raid:md14: device dm-76 operational as raid disk 4 [ 1988.878465] md/raid:md14: device dm-78 operational as raid disk 3 [ 1988.885276] md/raid:md14: device dm-68 operational as raid disk 2 [ 1988.892159] md/raid:md14: device dm-129 operational as raid disk 1 [ 1988.899973] md/raid:md14: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1988.942420] md14: detected capacity change from 0 to 64011431837696 [ 1988.949585] md: resync of RAID array md14 [ 1988.971167] md: md40 stopped. [ 1989.019250] md/raid:md40: not clean -- starting background reconstruction [ 1989.027002] md/raid:md40: device dm-424 operational as raid disk 0 [ 1989.033955] md/raid:md40: device dm-413 operational as raid disk 9 [ 1989.040865] md/raid:md40: device dm-408 operational as raid disk 8 [ 1989.047821] md/raid:md40: device dm-395 operational as raid disk 7 [ 1989.054732] md/raid:md40: device dm-401 operational as raid disk 6 [ 1989.061643] md/raid:md40: device dm-378 operational as raid disk 5 [ 1989.068570] md/raid:md40: device dm-374 operational as raid disk 4 [ 1989.075495] md/raid:md40: device dm-362 operational as raid disk 3 [ 1989.082421] md/raid:md40: device dm-391 operational as raid disk 2 [ 1989.089353] md/raid:md40: device dm-419 operational as raid disk 1 [ 1989.097213] md/raid:md40: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1989.141892] md40: detected capacity change from 0 to 64011431837696 [ 1989.143711] LDISKFS-fs warning (device md30): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. 
[ 1989.163700] md: resync of RAID array md40 [ 1989.187867] md: md22 stopped. [ 1989.218634] md/raid:md22: not clean -- starting background reconstruction [ 1989.226357] md/raid:md22: device dm-236 operational as raid disk 0 [ 1989.233300] md/raid:md22: device dm-229 operational as raid disk 9 [ 1989.240211] md/raid:md22: device dm-228 operational as raid disk 8 [ 1989.247116] md/raid:md22: device dm-212 operational as raid disk 7 [ 1989.247627] LDISKFS-fs warning (device md8): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1989.268598] md/raid:md22: device dm-218 operational as raid disk 6 [ 1989.275500] md/raid:md22: device dm-204 operational as raid disk 5 [ 1989.282400] md/raid:md22: device dm-198 operational as raid disk 4 [ 1989.289300] md/raid:md22: device dm-176 operational as raid disk 3 [ 1989.296201] md/raid:md22: device dm-175 operational as raid disk 2 [ 1989.303101] md/raid:md22: device dm-240 operational as raid disk 1 [ 1989.311496] md/raid:md22: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1989.353150] md22: detected capacity change from 0 to 64011431837696 [ 1989.360265] md: resync of RAID array md22 [ 1989.420150] md: md38 stopped. [ 1989.438536] LDISKFS-fs warning (device md18): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. 
[ 1989.476343] md/raid:md38: device dm-399 operational as raid disk 0 [ 1989.483284] md/raid:md38: device dm-416 operational as raid disk 9 [ 1989.490201] md/raid:md38: device dm-407 operational as raid disk 8 [ 1989.497199] md/raid:md38: device dm-398 operational as raid disk 7 [ 1989.504157] md/raid:md38: device dm-404 operational as raid disk 6 [ 1989.511126] md/raid:md38: device dm-390 operational as raid disk 5 [ 1989.518146] md/raid:md38: device dm-381 operational as raid disk 4 [ 1989.525070] md/raid:md38: device dm-361 operational as raid disk 3 [ 1989.531984] md/raid:md38: device dm-377 operational as raid disk 2 [ 1989.538925] md/raid:md38: device dm-397 operational as raid disk 1 [ 1989.547730] md/raid:md38: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1989.594802] md38: detected capacity change from 0 to 64011431837696 [ 1989.612648] md: md34 stopped. [ 1989.664655] md/raid:md34: not clean -- starting background reconstruction [ 1989.672380] md/raid:md34: device dm-359 operational as raid disk 0 [ 1989.679285] md/raid:md34: device dm-350 operational as raid disk 9 [ 1989.686187] md/raid:md34: device dm-352 operational as raid disk 8 [ 1989.693092] md/raid:md34: device dm-332 operational as raid disk 7 [ 1989.699993] md/raid:md34: device dm-340 operational as raid disk 6 [ 1989.706897] md/raid:md34: device dm-333 operational as raid disk 5 [ 1989.713799] md/raid:md34: device dm-336 operational as raid disk 4 [ 1989.720701] md/raid:md34: device dm-326 operational as raid disk 3 [ 1989.727602] md/raid:md34: device dm-301 operational as raid disk 2 [ 1989.734502] md/raid:md34: device dm-353 operational as raid disk 1 [ 1989.742348] md/raid:md34: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1989.786954] md34: detected capacity change from 0 to 64011431837696 [ 1989.794859] md: resync of RAID array md34 [ 1989.820083] md: md16 stopped. 
[ 1989.830450] LDISKFS-fs warning (device md44): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1989.879516] md/raid:md16: not clean -- starting background reconstruction [ 1989.887247] md/raid:md16: device dm-120 operational as raid disk 0 [ 1989.894152] md/raid:md16: device dm-143 operational as raid disk 9 [ 1989.901077] md/raid:md16: device dm-148 operational as raid disk 8 [ 1989.907990] md/raid:md16: device dm-133 operational as raid disk 7 [ 1989.914917] md/raid:md16: device dm-102 operational as raid disk 6 [ 1989.921823] md/raid:md16: device dm-109 operational as raid disk 5 [ 1989.928727] md/raid:md16: device dm-125 operational as raid disk 4 [ 1989.935647] md/raid:md16: device dm-101 operational as raid disk 3 [ 1989.942571] md/raid:md16: device dm-114 operational as raid disk 2 [ 1989.949478] md/raid:md16: device dm-124 operational as raid disk 1 [ 1989.957392] md/raid:md16: raid level 6 active with 10 out of 10 devices, algorithm 2 [ 1990.002873] md16: detected capacity change from 0 to 64011431837696 [ 1990.010030] md: resync of RAID array md16 [ 1990.183490] LDISKFS-fs warning (device md36): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1990.418628] LDISKFS-fs warning (device md14): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1990.821442] LDISKFS-fs warning (device md40): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1991.042829] LDISKFS-fs warning (device md22): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1991.379841] LDISKFS-fs warning (device md38): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 1991.731142] LDISKFS-fs warning (device md34): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. 
[ 1991.975277] LDISKFS-fs warning (device md16): ldiskfs_multi_mount_protect:321: MMP interval 42 higher than expected, please wait. [ 2007.211411] md: md34: resync done. [ 2015.847033] md: md22: resync done. [ 2016.994683] md: md16: resync done. [ 2018.152864] md: md26: resync done. [ 2021.076463] md: md40: resync done. [ 2027.098269] md: md14: resync done. [ 2027.553600] LDISKFS-fs (md46): file extents enabled, maximum tree depth=5 [ 2027.849396] LDISKFS-fs (md42): file extents enabled, maximum tree depth=5 [ 2028.142235] LDISKFS-fs (md26): file extents enabled, maximum tree depth=5 [ 2028.206025] LDISKFS-fs (md28): file extents enabled, maximum tree depth=5 [ 2028.360957] LDISKFS-fs (md12): file extents enabled, maximum tree depth=5 [ 2028.544986] LDISKFS-fs (md2): file extents enabled, maximum tree depth=5 [ 2029.150928] LDISKFS-fs (md46): recovery complete [ 2029.207787] LDISKFS-fs (md46): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2029.322788] LDISKFS-fs (md4): file extents enabled, maximum tree depth=5 [ 2029.762344] LustreError: 137-5: oak-OST0032_UUID: not available for connect from 10.50.10.49@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2029.781756] LustreError: Skipped 15 previous similar messages [ 2029.789475] LDISKFS-fs (md24): file extents enabled, maximum tree depth=5 [ 2029.968987] LDISKFS-fs (md0): file extents enabled, maximum tree depth=5 [ 2030.041543] Lustre: oak-OST005e: Not available for connect from 10.210.12.78@tcp1 (not set up) [ 2030.092743] LDISKFS-fs (md42): recovery complete [ 2030.116779] LDISKFS-fs (md26): recovery complete [ 2030.138114] LDISKFS-fs (md26): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2030.151374] LDISKFS-fs (md42): mounted filesystem with ordered data mode. 
Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2030.211915] LDISKFS-fs (md2): recovery complete [ 2030.222056] LDISKFS-fs (md32): file extents enabled, maximum tree depth=5 [ 2030.236746] LDISKFS-fs (md2): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2030.278379] LustreError: 137-5: oak-OST0030_UUID: not available for connect from 10.50.3.63@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2030.278381] LustreError: 137-5: oak-OST0034_UUID: not available for connect from 10.50.3.63@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2030.278382] LustreError: 137-5: oak-OST003a_UUID: not available for connect from 10.50.3.63@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2030.278384] LustreError: Skipped 54 previous similar messages [ 2030.278385] LustreError: Skipped 54 previous similar messages [ 2030.349147] LustreError: Skipped 14 previous similar messages [ 2030.396785] LDISKFS-fs (md12): recovery complete [ 2030.448567] LDISKFS-fs (md12): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2030.520696] LDISKFS-fs (md4): recovery complete [ 2030.543713] LDISKFS-fs (md4): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2030.549966] LDISKFS-fs (md10): file extents enabled, maximum tree depth=5 [ 2030.567874] LDISKFS-fs (md28): recovery complete [ 2030.573109] LDISKFS-fs (md20): file extents enabled, maximum tree depth=5 [ 2030.594946] Lustre: oak-OST005e: Not available for connect from 10.0.3.4@o2ib5 (not set up) [ 2030.604349] Lustre: Skipped 3 previous similar messages [ 2030.621717] LDISKFS-fs (md28): mounted filesystem with ordered data mode. 
Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2030.890100] LDISKFS-fs (md6): file extents enabled, maximum tree depth=5 [ 2031.214477] LDISKFS-fs (md30): file extents enabled, maximum tree depth=5 [ 2031.340928] LDISKFS-fs (md8): file extents enabled, maximum tree depth=5 [ 2031.555913] LDISKFS-fs (md18): file extents enabled, maximum tree depth=5 [ 2031.913702] LDISKFS-fs (md44): file extents enabled, maximum tree depth=5 [ 2031.923747] Lustre: oak-OST005e: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 [ 2031.935909] Lustre: oak-OST005e: in recovery but waiting for the first client to connect [ 2032.044517] LDISKFS-fs (md32): recovery complete [ 2032.065770] LDISKFS-fs (md32): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2032.211437] LDISKFS-fs (md24): recovery complete [ 2032.238462] LDISKFS-fs (md24): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2032.290186] LDISKFS-fs (md36): file extents enabled, maximum tree depth=5 [ 2032.376297] LDISKFS-fs (md20): recovery complete [ 2032.399894] LDISKFS-fs (md20): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2032.484682] Lustre: oak-OST005a: Not available for connect from 10.51.4.40@o2ib3 (not set up) [ 2032.484774] Lustre: oak-OST005e: Will be in recovery for at least 2:30, or until 625 clients reconnect [ 2032.485782] Lustre: oak-OST005e: Connection restored to 855f8733-97ad-fe20-a42c-c9a97f6818f7 (at 10.51.4.40@o2ib3) [ 2032.489261] LDISKFS-fs (md10): recovery complete [ 2032.502805] LDISKFS-fs (md10): mounted filesystem with ordered data mode. 
Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2032.509573] LDISKFS-fs (md14): file extents enabled, maximum tree depth=5 [ 2032.540900] Lustre: Skipped 4 previous similar messages [ 2032.724021] LDISKFS-fs (md18): recovery complete [ 2032.750027] LDISKFS-fs (md18): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2032.926948] LDISKFS-fs (md40): file extents enabled, maximum tree depth=5 [ 2033.011995] LDISKFS-fs (md30): recovery complete [ 2033.026539] LDISKFS-fs (md0): recovery complete [ 2033.034284] LDISKFS-fs (md30): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2033.049838] LDISKFS-fs (md0): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2033.133874] LDISKFS-fs (md22): file extents enabled, maximum tree depth=5 [ 2033.175544] Lustre: oak-OST005a: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 [ 2033.187126] LDISKFS-fs (md6): recovery complete [ 2033.193140] Lustre: oak-OST005a: in recovery but waiting for the first client to connect [ 2033.207150] LDISKFS-fs (md6): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2033.342785] LDISKFS-fs (md8): recovery complete [ 2033.366194] LDISKFS-fs (md8): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2033.462972] LDISKFS-fs (md38): file extents enabled, maximum tree depth=5 [ 2033.516871] Lustre: oak-OST005a: Will be in recovery for at least 2:30, or until 519 clients reconnect [ 2033.517107] Lustre: oak-OST005e: Connection restored to (at 10.0.3.13@o2ib5) [ 2033.761763] LDISKFS-fs (md44): recovery complete [ 2033.791575] LDISKFS-fs (md44): mounted filesystem with ordered data mode. 
Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2033.822949] LDISKFS-fs (md34): file extents enabled, maximum tree depth=5 [ 2034.104958] LDISKFS-fs (md16): file extents enabled, maximum tree depth=5 [ 2034.274246] LDISKFS-fs (md36): recovery complete [ 2034.296913] LDISKFS-fs (md36): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2034.319133] LustreError: 137-5: oak-OST0030_UUID: not available for connect from 10.50.12.7@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2034.338609] LustreError: Skipped 399 previous similar messages [ 2034.424967] LDISKFS-fs (md14): recovery complete [ 2034.475367] LDISKFS-fs (md14): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2034.515598] LDISKFS-fs (md40): recovery complete [ 2034.524752] LDISKFS-fs (md22): recovery complete [ 2034.538512] LDISKFS-fs (md40): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2034.546091] LDISKFS-fs (md22): mounted filesystem with ordered data mode. 
Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2034.594653] Lustre: oak-OST0032: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 [ 2034.606117] Lustre: Skipped 1 previous similar message [ 2034.612649] Lustre: oak-OST0032: in recovery but waiting for the first client to connect [ 2034.621685] Lustre: Skipped 1 previous similar message [ 2034.629207] Lustre: oak-OST0032: Will be in recovery for at least 2:30, or until 632 clients reconnect [ 2034.639700] Lustre: Skipped 1 previous similar message [ 2034.645545] Lustre: oak-OST0032: Connection restored to (at 10.0.2.90@o2ib5) [ 2034.653518] Lustre: Skipped 21 previous similar messages [ 2035.167266] Lustre: oak-OST003c: Not available for connect from 10.210.13.22@tcp1 (not set up) [ 2035.176891] Lustre: Skipped 7 previous similar messages [ 2035.639741] LDISKFS-fs (md38): recovery complete [ 2035.664842] LDISKFS-fs (md38): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2035.778968] LDISKFS-fs (md34): recovery complete [ 2035.804673] LDISKFS-fs (md34): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2035.988917] LDISKFS-fs (md16): recovery complete [ 2036.012262] LDISKFS-fs (md16): mounted filesystem with ordered data mode. 
Opts: errors=remount-ro,no_mbcache,nodelalloc [ 2036.664041] Lustre: oak-OST005a: Connection restored to d6b8aee2-5679-2263-c6fd-5715d6cfb543 (at 10.210.11.5@tcp1) [ 2036.675611] Lustre: Skipped 70 previous similar messages [ 2037.082150] Lustre: oak-OST0032: Denying connection for new client 200155f1-8786-4 (at 10.50.2.31@o2ib2), waiting for 632 known clients (18 recovered, 0 in progress, and 0 evicted) to recover in 3:34 [ 2037.109958] Lustre: oak-OST004c: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 [ 2037.121422] Lustre: Skipped 2 previous similar messages [ 2037.128209] Lustre: oak-OST004c: in recovery but waiting for the first client to connect [ 2037.137248] Lustre: Skipped 2 previous similar messages [ 2037.584307] Lustre: oak-OST004c: Will be in recovery for at least 2:30, or until 513 clients reconnect [ 2037.594709] Lustre: Skipped 2 previous similar messages [ 2038.009233] Lustre: oak-OST004a: Denying connection for new client 0b774077-ba6c-4 (at 10.51.2.22@o2ib3), waiting for 518 known clients (23 recovered, 2 in progress, and 0 evicted) to recover in 3:32 [ 2039.436127] Lustre: oak-OST003a: Not available for connect from 10.0.2.72@o2ib5 (not set up) [ 2039.445565] Lustre: Skipped 12 previous similar messages [ 2039.486612] Lustre: oak-OST005a: Denying connection for new client 134f9f96-e6d5-4 (at 10.49.25.10@o2ib1), waiting for 519 known clients (38 recovered, 1 in progress, and 0 evicted) to recover in 3:30 [ 2040.731192] Lustre: oak-OST003c: Connection restored to db1a99e7-7659-618d-9695-772d4bcb856e (at 10.0.2.84@o2ib5) [ 2040.742655] Lustre: Skipped 162 previous similar messages [ 2041.766804] Lustre: oak-OST0042: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 [ 2041.778265] Lustre: Skipped 4 previous similar messages [ 2041.785056] Lustre: oak-OST0042: in recovery but waiting for the first client to connect [ 2041.794100] Lustre: Skipped 4 previous similar messages [ 
2041.832878] Lustre: oak-OST0042: Will be in recovery for at least 2:30, or until 521 clients reconnect [ 2041.843270] Lustre: Skipped 4 previous similar messages [ 2042.524452] LustreError: 137-5: oak-OST0036_UUID: not available for connect from 10.210.12.115@tcp1 (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2042.543957] LustreError: Skipped 744 previous similar messages [ 2044.691963] Lustre: oak-OST005e: Denying connection for new client 425ea175-9f9c-4 (at 10.49.26.1@o2ib1), waiting for 625 known clients (60 recovered, 1 in progress, and 0 evicted) to recover in 3:24 [ 2044.711763] Lustre: Skipped 1 previous similar message [ 2047.697830] Lustre: oak-OST0054: Not available for connect from 10.210.15.137@tcp1 (not set up) [ 2047.707546] Lustre: Skipped 17 previous similar messages [ 2048.756831] Lustre: oak-OST0034: Connection restored to (at 10.210.12.54@tcp1) [ 2048.764995] Lustre: Skipped 556 previous similar messages [ 2049.869763] Lustre: oak-OST003e: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 [ 2049.881226] Lustre: Skipped 7 previous similar messages [ 2049.887914] Lustre: oak-OST003e: in recovery but waiting for the first client to connect [ 2049.896960] Lustre: Skipped 7 previous similar messages [ 2049.908106] Lustre: oak-OST005c: Denying connection for new client 34129e0b-d0c7-4 (at 10.51.13.13@o2ib3), waiting for 521 known clients (16 recovered, 0 in progress, and 0 evicted) to recover in 3:34 [ 2049.928007] Lustre: Skipped 1 previous similar message [ 2049.953043] Lustre: oak-OST003e: Will be in recovery for at least 2:30, or until 507 clients reconnect [ 2049.963439] Lustre: Skipped 7 previous similar messages [ 2059.069751] Lustre: oak-OST004a: Denying connection for new client dbbeec82-59c5-4 (at 10.51.14.11@o2ib3), waiting for 518 known clients (110 recovered, 3 in progress, and 0 evicted) to recover in 3:11 [ 2059.089745] Lustre: Skipped 4 previous 
similar messages [ 2062.743963] ses 12:0:183:0: attempting task abort! scmd(ffff8bc834bc4fc0) [ 2062.751568] ses 12:0:183:0: [sg673] tag#13 CDB: Receive Diagnostic 1c 01 0f ff ff 00 [ 2062.760216] scsi target12:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab903d), phy(76) [ 2062.771673] scsi target12:0:183: enclosurelogical id(0x5001636001ab903d), slot(60) [ 2062.780209] scsi target12:0:183: enclosure level(0x0001), connector name( ) [ 2062.792401] ses 12:0:183:0: task abort: SUCCESS scmd(ffff8bc834bc4fc0) [ 2062.799700] ses 12:0:183:0: attempting task abort! scmd(ffff8bc85a9cad80) [ 2062.807335] ses 12:0:183:0: [sg673] tag#12 CDB: Inquiry 12 00 00 00 24 00 [ 2062.814916] scsi target12:0:183: _scsih_tm_display_info: handle(0x00ca), sas_address(0x5001636001ab903d), phy(76) [ 2062.826368] scsi target12:0:183: enclosurelogical id(0x5001636001ab903d), slot(60) [ 2062.834910] scsi target12:0:183: enclosure level(0x0001), connector name( ) [ 2062.847225] ses 12:0:183:0: task abort: SUCCESS scmd(ffff8bc85a9cad80) [ 2064.763682] Lustre: oak-OST0056: Connection restored to be1b674b-78d0-7579-a9a7-dd17b2855420 (at 10.50.7.33@o2ib2) [ 2064.763683] Lustre: oak-OST004e: Connection restored to be1b674b-78d0-7579-a9a7-dd17b2855420 (at 10.50.7.33@o2ib2) [ 2064.763686] Lustre: Skipped 2610 previous similar messages [ 2064.792914] Lustre: Skipped 15 previous similar messages [ 2076.644660] Lustre: oak-OST0040: Denying connection for new client ac614dff-9a0e-4 (at 10.50.5.24@o2ib2), waiting for 524 known clients (152 recovered, 5 in progress, and 0 evicted) to recover in 5:49 [ 2076.664545] Lustre: Skipped 67 previous similar messages [ 2108.871947] Lustre: oak-OST0034: Denying connection for new client 7e0a9a61-2b20-4 (at 10.51.2.19@o2ib3), waiting for 527 known clients (504 recovered, 17 in progress, and 0 evicted) to recover in 2:24 [ 2108.891950] Lustre: Skipped 92 previous similar messages [ 2109.664783] Lustre: oak-OST003c: Recovery over after 1:14, of 
518 clients 518 recovered and 0 were evicted. [ 2109.759054] Lustre: oak-OST003c: deleting orphan objects from 0x900000bd0:4221191 to 0x900000bd0:4221217 [ 2109.844120] Lustre: oak-OST003c: deleting orphan objects from 0x0:33027217 to 0x0:33027233 [ 2109.863784] Lustre: oak-OST003c: deleting orphan objects from 0x900000400:3245613 to 0x900000400:3245633 [ 2109.932363] Lustre: oak-OST003c: deleting orphan objects from 0x9000013a0:487391 to 0x9000013a0:487425 [ 2110.034854] Lustre: oak-OST003c: deleting orphan objects from 0x900001b70:336 to 0x900001b70:449 [ 2110.081725] Lustre: oak-OST003c: deleting orphan objects from 0x900001b71:463 to 0x900001b71:513 [ 2111.175635] Lustre: oak-OST0040: deleting orphan objects from 0x0:34141685 to 0x0:34141697 [ 2111.229338] Lustre: oak-OST0040: deleting orphan objects from 0x9800013a0:517385 to 0x9800013a0:517409 [ 2111.242399] Lustre: oak-OST0040: deleting orphan objects from 0x980001b70:348 to 0x980001b70:449 [ 2111.477068] Lustre: oak-OST0040: deleting orphan objects from 0x980000400:3304850 to 0x980000400:3304865 [ 2111.587597] Lustre: oak-OST0040: deleting orphan objects from 0x980000bd0:4289263 to 0x980000bd0:4289281 [ 2111.917560] Lustre: oak-OST0040: deleting orphan objects from 0x980001b71:507 to 0x980001b71:545 [ 2112.005747] Lustre: oak-OST0054: Recovery over after 1:04, of 510 clients 510 recovered and 0 were evicted. 
[ 2112.016641] Lustre: Skipped 1 previous similar message [ 2112.376712] Lustre: oak-OST0054: deleting orphan objects from 0x14c0000bd0:3881624 to 0x14c0000bd0:3881665 [ 2112.395096] Lustre: oak-OST0030: Client d79232c5-d3f6-4 (at 10.51.1.6@o2ib3) reconnected, waiting for 514 clients in recovery for 4:12 [ 2112.408604] Lustre: Skipped 2 previous similar messages [ 2112.678613] Lustre: oak-OST0054: deleting orphan objects from 0x14c00013a0:457034 to 0x14c00013a0:457057 [ 2112.778323] Lustre: oak-OST0054: deleting orphan objects from 0x14c0001b70:353 to 0x14c0001b70:449 [ 2112.968411] Lustre: oak-OST0054: deleting orphan objects from 0x14c0000400:3560654 to 0x14c0000400:3560673 [ 2112.977558] Lustre: oak-OST0054: deleting orphan objects from 0x0:35389683 to 0x0:35389697 [ 2113.081140] Lustre: oak-OST0054: deleting orphan objects from 0x14c0001b71:474 to 0x14c0001b71:513 [ 2114.032391] Lustre: oak-OST0052: deleting orphan objects from 0x0:36707151 to 0x0:36707169 [ 2114.043954] Lustre: oak-OST0052: deleting orphan objects from 0x1400000bd0:4488264 to 0x1400000bd0:4488289 [ 2114.105825] Lustre: oak-OST003a: deleting orphan objects from 0x8c0001b70:357 to 0x8c0001b70:481 [ 2114.110999] Lustre: oak-OST005e: deleting orphan objects from 0x0:34579691 to 0x0:34579713 [ 2114.126747] Lustre: oak-OST0030: Recovery over after 1:12, of 514 clients 514 recovered and 0 were evicted. 
[ 2114.128142] Lustre: oak-OST005a: deleting orphan objects from 0x1700000bd0:4854948 to 0x1700000bd0:4855041 [ 2114.130106] Lustre: oak-OST005e: deleting orphan objects from 0x1780000bd0:4483461 to 0x1780000bd0:4483489 [ 2114.130670] Lustre: oak-OST004a: deleting orphan objects from 0xfc0000400:3200361 to 0xfc0000400:3200385 [ 2114.131478] Lustre: oak-OST004a: deleting orphan objects from 0x0:33622755 to 0x0:33622785 [ 2114.158655] Lustre: oak-OST005e: deleting orphan objects from 0x1780000400:4293082 to 0x1780000400:4293121 [ 2114.161401] Lustre: oak-OST005e: deleting orphan objects from 0x1780001b70:357 to 0x1780001b70:449 [ 2114.167087] Lustre: oak-OST004a: deleting orphan objects from 0xfc0000bd0:4097958 to 0xfc0000bd0:4097985 [ 2114.169947] Lustre: oak-OST004a: deleting orphan objects from 0xfc0001b70:345 to 0xfc0001b70:449 [ 2114.220728] Lustre: oak-OST005a: deleting orphan objects from 0x1700001b70:357 to 0x1700001b70:449 [ 2114.231159] Lustre: Skipped 6 previous similar messages [ 2114.265776] Lustre: oak-OST004a: deleting orphan objects from 0xfc0001b71:547 to 0xfc0001b71:577 [ 2114.324584] Lustre: oak-OST005e: deleting orphan objects from 0x1780001b71:492 to 0x1780001b71:513 [ 2114.330765] Lustre: oak-OST005a: deleting orphan objects from 0x0:35453386 to 0x0:35453409 [ 2114.371819] Lustre: oak-OST005a: deleting orphan objects from 0x17000013a0:544367 to 0x17000013a0:544385 [ 2114.375601] Lustre: oak-OST0052: deleting orphan objects from 0x1400001b70:356 to 0x1400001b70:449 [ 2114.427128] Lustre: oak-OST0052: deleting orphan objects from 0x1400001b71:547 to 0x1400001b71:577 [ 2114.431100] Lustre: oak-OST005a: deleting orphan objects from 0x1700000400:4371167 to 0x1700000400:4371201 [ 2114.483298] Lustre: oak-OST004a: deleting orphan objects from 0xfc00013a0:494697 to 0xfc00013a0:494721 [ 2114.558077] Lustre: oak-OST005e: deleting orphan objects from 0x17800013a0:487206 to 0x17800013a0:487233 [ 2114.566482] Lustre: oak-OST005a: deleting orphan objects from 
0x1700001b71:478 to 0x1700001b71:545 [ 2114.681730] Lustre: oak-OST0052: deleting orphan objects from 0x1400000400:3684558 to 0x1400000400:3684577 [ 2114.696253] Lustre: oak-OST0052: deleting orphan objects from 0x14000013a0:529192 to 0x14000013a0:529217 [ 2114.841827] Lustre: oak-OST003a: deleting orphan objects from 0x8c0000400:3240363 to 0x8c0000400:3240385 [ 2114.869865] Lustre: oak-OST003a: deleting orphan objects from 0x8c0000bd0:4115539 to 0x8c0000bd0:4115553 [ 2114.887076] Lustre: oak-OST003a: deleting orphan objects from 0x0:34826767 to 0x0:34826785 [ 2114.900625] Lustre: oak-OST003a: deleting orphan objects from 0x8c00013a0:489238 to 0x8c00013a0:489281 [ 2114.920689] Lustre: oak-OST003a: deleting orphan objects from 0x8c0001b71:518 to 0x8c0001b71:577 [ 2116.790945] Lustre: oak-OST0030: deleting orphan objects from 0x0:34267246 to 0x0:34267265 [ 2116.917748] Lustre: oak-OST0030: deleting orphan objects from 0x1b00000bd0:4014596 to 0x1b00000bd0:4014625 [ 2116.921929] Lustre: oak-OST0030: deleting orphan objects from 0x1b00001b70:344 to 0x1b00001b70:481 [ 2117.033466] Lustre: oak-OST0030: deleting orphan objects from 0x1b00000400:2962571 to 0x1b00000400:2962593 [ 2117.081909] Lustre: oak-OST0030: deleting orphan objects from 0x1b00001b71:483 to 0x1b00001b71:513 [ 2117.107666] Lustre: oak-OST0030: deleting orphan objects from 0x1b000013a0:597223 to 0x1b000013a0:597249 [ 2117.368858] Lustre: oak-OST0058: deleting orphan objects from 0x1680000400:3787435 to 0x1680000400:3787457 [ 2117.681611] Lustre: oak-OST0058: deleting orphan objects from 0x1680001b70:357 to 0x1680001b70:449 [ 2117.698074] Lustre: oak-OST0058: deleting orphan objects from 0x0:37418068 to 0x0:37418081 [ 2118.066033] Lustre: oak-OST0058: deleting orphan objects from 0x1680000bd0:4171009 to 0x1680000bd0:4171041 [ 2118.320698] Lustre: oak-OST0058: deleting orphan objects from 0x1680001b71:547 to 0x1680001b71:577 [ 2118.389652] Lustre: oak-OST0058: deleting orphan objects from 0x16800013a0:502595 
to 0x16800013a0:502625 [ 2119.210377] Lustre: oak-OST0042: deleting orphan objects from 0x0:34148320 to 0x0:34148353 [ 2119.290743] Lustre: oak-OST0042: deleting orphan objects from 0x9c0000bd0:3653576 to 0x9c0000bd0:3653601 [ 2119.528224] Lustre: oak-OST0042: deleting orphan objects from 0x9c0000400:3142088 to 0x9c0000400:3142113 [ 2119.550685] Lustre: oak-OST0042: deleting orphan objects from 0x9c0001b70:346 to 0x9c0001b70:449 [ 2119.725706] Lustre: oak-OST0042: deleting orphan objects from 0x9c0001b71:545 to 0x9c0001b71:577 [ 2119.770472] Lustre: oak-OST0042: deleting orphan objects from 0x9c00013a0:446583 to 0x9c00013a0:446625 [ 2128.843364] Lustre: oak-OST0054: Connection restored to (at 10.51.1.43@o2ib3) [ 2128.851462] Lustre: Skipped 8192 previous similar messages [ 2173.296362] Lustre: oak-OST003e: Denying connection for new client a08e2be5-d8c7-4 (at 10.49.26.17@o2ib1), waiting for 507 known clients (493 recovered, 13 in progress, and 0 evicted) to recover in 4:07 [ 2173.316453] Lustre: Skipped 152 previous similar messages [ 2253.098543] Lustre: oak-OST0034: recovery is timed out, evict stale exports [ 2253.106441] Lustre: oak-OST0034: disconnecting 2 stale clients [ 2253.608034] Lustre: oak-OST0034: deleting orphan objects from 0x0:32241306 to 0x0:32241313 [ 2253.608994] Lustre: oak-OST0034: Recovery over after 3:37, of 527 clients 525 recovered and 2 were evicted. 
[ 2253.608995] Lustre: Skipped 1 previous similar message [ 2253.913234] Lustre: oak-OST0034: deleting orphan objects from 0x800000400:3176759 to 0x800000400:3176801 [ 2253.979208] Lustre: oak-OST0034: deleting orphan objects from 0x800000bd0:4245305 to 0x800000bd0:4245345 [ 2254.098522] Lustre: oak-OST0050: recovery is timed out, evict stale exports [ 2254.098574] Lustre: oak-OST004c: disconnecting 1 stale clients [ 2254.112801] Lustre: Skipped 1 previous similar message [ 2254.199811] Lustre: oak-OST0034: deleting orphan objects from 0x800001b70:353 to 0x800001b70:481 [ 2254.255030] Lustre: oak-OST0034: deleting orphan objects from 0x800001b71:550 to 0x800001b71:577 [ 2254.281734] Lustre: oak-OST0034: deleting orphan objects from 0x8000013a0:519662 to 0x8000013a0:519681 [ 2255.098526] Lustre: oak-OST0048: recovery is timed out, evict stale exports [ 2255.106375] Lustre: oak-OST0048: disconnecting 1 stale clients [ 2255.112891] Lustre: Skipped 1 previous similar message [ 2255.611527] Lustre: oak-OST0048: Client d79232c5-d3f6-4 (at 10.51.1.6@o2ib3) reconnected, waiting for 521 clients in recovery for 4:12 [ 2255.653874] Lustre: oak-OST0050: deleting orphan objects from 0x0:35757685 to 0x0:35757729 [ 2256.118291] Lustre: oak-OST0050: deleting orphan objects from 0x1300001b71:547 to 0x1300001b71:577 [ 2256.165036] Lustre: oak-OST0050: deleting orphan objects from 0x1300001b70:350 to 0x1300001b70:481 [ 2256.226913] Lustre: oak-OST0050: deleting orphan objects from 0x1300000bd0:4399787 to 0x1300000bd0:4399809 [ 2256.298538] Lustre: oak-OST0050: deleting orphan objects from 0x13000013a0:517999 to 0x13000013a0:518017 [ 2256.380136] Lustre: oak-OST0050: deleting orphan objects from 0x1300000400:3521720 to 0x1300000400:3521761 [ 2256.906892] Lustre: oak-OST004c: deleting orphan objects from 0x1000001b70:347 to 0x1000001b70:449 [ 2256.932029] Lustre: oak-OST004c: deleting orphan objects from 0x10000013a0:535067 to 0x10000013a0:535105 [ 2257.094479] Lustre: oak-OST0048: 
deleting orphan objects from 0xa80001b70:360 to 0xa80001b70:449 [ 2257.590931] Lustre: oak-OST004c: deleting orphan objects from 0x1000000bd0:4435482 to 0x1000000bd0:4435521 [ 2257.657773] Lustre: oak-OST004c: deleting orphan objects from 0x1000001b71:499 to 0x1000001b71:577 [ 2257.658150] Lustre: oak-OST004c: deleting orphan objects from 0x0:32782214 to 0x0:32782241 [ 2257.682387] Lustre: oak-OST0048: deleting orphan objects from 0xa80000bd0:3685843 to 0xa80000bd0:3685889 [ 2257.703330] Lustre: oak-OST0048: deleting orphan objects from 0x0:34884161 to 0x0:34884193 [ 2257.736453] Lustre: oak-OST0048: deleting orphan objects from 0xa80000400:3243829 to 0xa80000400:3243873 [ 2257.756015] Lustre: oak-OST0048: deleting orphan objects from 0xa800013a0:416553 to 0xa800013a0:416577 [ 2257.763024] Lustre: oak-OST0048: deleting orphan objects from 0xa80001b71:545 to 0xa80001b71:609 [ 2258.237979] Lustre: oak-OST004c: deleting orphan objects from 0x1000000400:3292296 to 0x1000000400:3292321 [ 2259.611661] Lustre: oak-OST004c: Connection restored to 965a6346-6505-4 (at 10.50.4.27@o2ib2) [ 2259.621196] Lustre: Skipped 27 previous similar messages [ 2263.098321] Lustre: oak-OST0038: recovery is timed out, evict stale exports [ 2263.106159] Lustre: oak-OST0038: disconnecting 1 stale clients [ 2263.231001] Lustre: oak-OST0038: Recovery over after 3:37, of 521 clients 520 recovered and 1 was evicted. 
[ 2263.241788] Lustre: Skipped 3 previous similar messages [ 2263.313176] Lustre: oak-OST0038: deleting orphan objects from 0x1a40000400:3079126 to 0x1a40000400:3079169 [ 2263.352916] Lustre: oak-OST0038: deleting orphan objects from 0x1a40000bd0:4218135 to 0x1a40000bd0:4218177 [ 2263.359455] Lustre: oak-OST0038: deleting orphan objects from 0x1a40001b70:349 to 0x1a40001b70:449 [ 2263.364111] Lustre: oak-OST0038: deleting orphan objects from 0x0:32707099 to 0x0:32707137 [ 2263.373680] Lustre: oak-OST0038: deleting orphan objects from 0x1a400013a0:520231 to 0x1a400013a0:520257 [ 2263.467523] Lustre: oak-OST0038: deleting orphan objects from 0x1a40001b71:547 to 0x1a40001b71:577 [ 2265.578309] Lustre: oak-OST005c: deleting orphan objects from 0x1740000bd0:4262120 to 0x1740000bd0:4262145 [ 2265.578624] Lustre: oak-OST005c: deleting orphan objects from 0x17400013a0:484567 to 0x17400013a0:484609 [ 2265.616349] Lustre: oak-OST005c: deleting orphan objects from 0x1740000400:4188680 to 0x1740000400:4188705 [ 2265.781390] Lustre: oak-OST005c: deleting orphan objects from 0x1740001b70:361 to 0x1740001b70:449 [ 2265.884369] Lustre: oak-OST005c: deleting orphan objects from 0x0:34938668 to 0x0:34938689 [ 2266.074184] Lustre: oak-OST005c: deleting orphan objects from 0x1740001b71:515 to 0x1740001b71:545 [ 2268.098243] Lustre: oak-OST0046: recovery is timed out, evict stale exports [ 2268.106029] Lustre: Skipped 1 previous similar message [ 2268.111820] Lustre: oak-OST0046: disconnecting 2 stale clients [ 2268.118340] Lustre: Skipped 1 previous similar message [ 2268.343866] Lustre: oak-OST0046: deleting orphan objects from 0xa40000400:3425915 to 0xa40000400:3425953 [ 2268.446239] Lustre: oak-OST0046: deleting orphan objects from 0xa40000bd0:4304649 to 0xa40000bd0:4304673 [ 2268.644425] Lustre: oak-OST0046: deleting orphan objects from 0xa40001b71:506 to 0xa40001b71:545 [ 2268.725350] Lustre: oak-OST0046: deleting orphan objects from 0xa40001b70:346 to 0xa40001b70:449 [ 
2268.843954] Lustre: oak-OST0046: deleting orphan objects from 0x0:35690191 to 0x0:35690209 [ 2268.945107] Lustre: oak-OST0046: deleting orphan objects from 0xa400013a0:512393 to 0xa400013a0:512417 [ 2302.378955] Lustre: oak-OST003e: Denying connection for new client 489ee6f9-e0d9-4 (at 10.50.14.15@o2ib2), waiting for 507 known clients (493 recovered, 13 in progress, and 0 evicted) to recover in 1:58 [ 2302.399064] Lustre: Skipped 251 previous similar messages [ 2406.095067] Lustre: oak-OST0032: recovery is timed out, evict stale exports [ 2406.102916] Lustre: oak-OST0032: disconnecting 1 stale clients [ 2406.514670] Lustre: oak-OST0032: deleting orphan objects from 0x0:33925240 to 0x0:33925249 [ 2406.519527] Lustre: oak-OST0032: Recovery over after 6:12, of 632 clients 631 recovered and 1 was evicted. [ 2406.519528] Lustre: Skipped 2 previous similar messages [ 2406.681002] Lustre: oak-OST0032: deleting orphan objects from 0x840000400:3236989 to 0x840000400:3237025 [ 2406.964561] Lustre: oak-OST0032: deleting orphan objects from 0x840001b70:352 to 0x840001b70:481 [ 2407.275835] Lustre: oak-OST0032: deleting orphan objects from 0x8400013a0:421519 to 0x8400013a0:421537 [ 2407.299717] Lustre: oak-OST0032: deleting orphan objects from 0x840000bd0:3814026 to 0x840000bd0:3814049 [ 2407.355461] Lustre: oak-OST0032: deleting orphan objects from 0x840001b71:479 to 0x840001b71:513 [ 2412.340617] Lustre: oak-OST0044: Client d79232c5-d3f6-4 (at 10.51.1.6@o2ib3) reconnected, waiting for 522 clients in recovery for 4:12 [ 2412.404637] Lustre: oak-OST0044: deleting orphan objects from 0x0:35536413 to 0x0:35536449 [ 2412.639101] Lustre: oak-OST0044: deleting orphan objects from 0xa00000bd0:4160559 to 0xa00000bd0:4160577 [ 2412.997509] Lustre: oak-OST0044: deleting orphan objects from 0xa00001b70:354 to 0xa00001b70:449 [ 2413.013399] Lustre: oak-OST0044: deleting orphan objects from 0xa00000400:3415513 to 0xa00000400:3415553 [ 2413.131557] Lustre: oak-OST0044: deleting orphan 
objects from 0xa00001b71:520 to 0xa00001b71:545 [ 2413.235360] Lustre: oak-OST0044: deleting orphan objects from 0xa000013a0:498336 to 0xa000013a0:498369 [ 2418.026310] Lustre: oak-OST004e: deleting orphan objects from 0x1200000bd0:4365834 to 0x1200000bd0:4365857 [ 2418.031058] Lustre: oak-OST004e: deleting orphan objects from 0x0:35395941 to 0x0:35395969 [ 2418.203659] Lustre: oak-OST004e: deleting orphan objects from 0x12000013a0:513779 to 0x12000013a0:513825 [ 2418.231423] Lustre: oak-OST004e: deleting orphan objects from 0x1200001b71:545 to 0x1200001b71:577 [ 2418.367822] Lustre: oak-OST004e: deleting orphan objects from 0x1200001b70:355 to 0x1200001b70:449 [ 2418.449464] Lustre: oak-OST004e: deleting orphan objects from 0x1200000400:3420835 to 0x1200000400:3420865 [ 2420.519538] Lustre: oak-OST0036: Client 1a448fa9-8f9c-4 (at 10.51.0.65@o2ib3) reconnected, waiting for 503 clients in recovery for 4:12 [ 2420.648759] Lustre: oak-OST0036: deleting orphan objects from 0x0:31819460 to 0x0:31819489 [ 2420.875994] Lustre: oak-OST0036: deleting orphan objects from 0x880000bd0:3335324 to 0x880000bd0:3335361 [ 2420.891395] Lustre: oak-OST0036: deleting orphan objects from 0x880001b70:348 to 0x880001b70:481 [ 2421.251737] Lustre: oak-OST0036: deleting orphan objects from 0x880000400:3059179 to 0x880000400:3059201 [ 2421.389449] Lustre: oak-OST0036: deleting orphan objects from 0x8800013a0:388112 to 0x8800013a0:388129 [ 2421.565398] Lustre: oak-OST0036: deleting orphan objects from 0x880001b71:499 to 0x880001b71:577 [ 2422.157878] Lustre: oak-OST003e: deleting orphan objects from 0x940000400:3143224 to 0x940000400:3143265 [ 2422.158166] Lustre: oak-OST003e: deleting orphan objects from 0x940000bd0:4530935 to 0x940000bd0:4530977 [ 2422.168691] Lustre: oak-OST003e: deleting orphan objects from 0x0:33044127 to 0x0:33044161 [ 2422.175071] Lustre: oak-OST003e: deleting orphan objects from 0x9400013a0:567604 to 0x9400013a0:567649 [ 2422.467049] Lustre: oak-OST003e: deleting 
orphan objects from 0x940001b70:360 to 0x940001b70:481 [ 2422.667348] Lustre: oak-OST003e: deleting orphan objects from 0x940001b71:549 to 0x940001b71:577 [ 2424.094602] Lustre: oak-OST0056: recovery is timed out, evict stale exports [ 2424.102450] Lustre: Skipped 4 previous similar messages [ 2424.108393] Lustre: oak-OST0056: disconnecting 1 stale clients [ 2424.114959] Lustre: Skipped 4 previous similar messages [ 2424.633676] Lustre: oak-OST0056: Client 1a448fa9-8f9c-4 (at 10.51.0.65@o2ib3) reconnected, waiting for 532 clients in recovery for 4:12 [ 2424.647404] Lustre: Skipped 1 previous similar message [ 2424.704491] Lustre: oak-OST0056: deleting orphan objects from 0x0:38102598 to 0x0:38102625 [ 2424.705400] Lustre: oak-OST0056: Recovery over after 6:12, of 532 clients 531 recovered and 1 was evicted. [ 2424.705402] Lustre: Skipped 4 previous similar messages [ 2425.000863] Lustre: oak-OST0056: deleting orphan objects from 0x15c0000400:3794796 to 0x15c0000400:3794817 [ 2425.102747] Lustre: oak-OST0056: deleting orphan objects from 0x15c0001b71:579 to 0x15c0001b71:609 [ 2425.623137] Lustre: oak-OST0056: deleting orphan objects from 0x15c0000bd0:4209741 to 0x15c0000bd0:4209761 [ 2426.016435] Lustre: oak-OST0056: deleting orphan objects from 0x15c0001b70:356 to 0x15c0001b70:449 [ 2426.074392] Lustre: oak-OST0056: deleting orphan objects from 0x15c00013a0:508863 to 0x15c00013a0:508897 [ 2519.970547] Lustre: oak-OST0058: Connection restored to 0028e5c0-f60e-4 (at 10.51.4.34@o2ib3) [ 2519.980087] Lustre: Skipped 379 previous similar messages [ 2558.283699] Lustre: oak-OST0044: Client 0990d505-f804-a11f-b445-5dbc7dcd98cd (at 10.50.10.56@o2ib2) reconnecting [ 2558.295067] Lustre: Skipped 1 previous similar message [ 3028.403036] LustreError: 193393:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0044: cli d073f313-60b4-4 claims 4218880 GRANT, real grant 4169728 [ 3042.182770] Lustre: oak-OST003c: Connection restored to (at 10.50.13.7@o2ib2) [ 3042.190852] Lustre: 
Skipped 443 previous similar messages [ 3233.448565] LustreError: 193140:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST004c: cli 330d404b-804c-4 claims 4218880 GRANT, real grant 4194304 [ 3233.462954] LustreError: 193140:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 7 previous similar messages [ 3659.924243] Lustre: oak-OST0040: Connection restored to (at 10.50.6.63@o2ib2) [ 3659.932314] Lustre: Skipped 187 previous similar messages [ 3836.127648] LNet: 3985:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.230@o2ib5: error 0(sending)(waiting) [ 3836.141010] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600900) failed: 5 [ 3836.141247] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.230@o2ib5 exceeded retry count 0 [ 3836.141248] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 1 previous similar message [ 3836.141250] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71b0d1000 [ 3836.141253] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bab103b3000 [ 3836.141257] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71b0d1000 [ 3836.141267] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71b0d1000 [ 3836.141282] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71b0d1000 [ 3836.141292] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be7054a5000 [ 3836.141715] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7054a5000 [ 3836.141727] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7054a5000 [ 3836.142805] LNet: 3985:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.230@o2ib5: reconnect (conn race), 12, 12, 
msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [ 3836.142852] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.230@o2ib5 failed: 5 [ 3836.143505] LNet: 50599:0:(o2iblnd_cb.c:1515:kiblnd_reconnect_peer()) Abort reconnection of 10.0.2.230@o2ib5: accepting [ 3836.144728] Lustre: oak-OST005a: Client 4b700cb7-818f-459e-5b7d-8a92115f47c9 (at 10.0.2.230@o2ib5) reconnecting [ 3836.145746] LustreError: 193401:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be75702b050 x1695856590394304/t0(0) o3->4b700cb7-818f-459e-5b7d-8a92115f47c9@10.0.2.230@o2ib5:39/0 lens 488/440 e 0 to 0 dl 1618724569 ref 1 fl Interpret:/0/0 rc 0/0 [ 3836.145855] Lustre: oak-OST005a: Bulk IO read error with 4b700cb7-818f-459e-5b7d-8a92115f47c9 (at 10.0.2.230@o2ib5), client will retry: rc -110 [ 3836.360790] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 66 previous similar messages [ 4260.424907] Lustre: oak-OST0058: Connection restored to 0819d613-98f5-4 (at 10.50.14.14@o2ib2) [ 4260.434541] Lustre: Skipped 228 previous similar messages [ 4434.685108] LustreError: 187417:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0044: cli d073f313-60b4-4 claims 4218880 GRANT, real grant 290816 [ 4434.699406] LustreError: 187417:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 3 previous similar messages [ 4865.047269] Lustre: oak-OST003a: Connection restored to a7406eef-a378-4 (at 10.50.4.26@o2ib2) [ 4865.056793] Lustre: Skipped 189 previous similar messages [ 5291.531807] Lustre: oak-OST005a: haven't heard from client 183ecd55-e514-d06c-cd2f-d7bce2817476 (at 10.210.12.61@tcp1) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8be7a2e46c00, cur 1618725933 expire 1618725783 last 1618725706 [ 5470.385353] Lustre: oak-OST0050: Connection restored to a7406eef-a378-4 (at 10.50.4.26@o2ib2) [ 5470.394894] Lustre: Skipped 154 previous similar messages [ 6075.234844] Lustre: oak-OST0048: Connection restored to (at 10.50.5.24@o2ib2) [ 6075.242925] Lustre: Skipped 120 previous similar messages [ 6683.437238] Lustre: oak-OST0048: Connection restored to a467a7e1-9686-4 (at 10.51.4.13@o2ib3) [ 6683.446758] Lustre: Skipped 120 previous similar messages [ 7285.679743] Lustre: oak-OST0040: Connection restored to a7406eef-a378-4 (at 10.50.4.26@o2ib2) [ 7285.689268] Lustre: Skipped 95 previous similar messages [ 7886.645274] Lustre: oak-OST0056: Connection restored to fd16aff2-0371-4 (at 10.51.4.33@o2ib3) [ 7886.654847] Lustre: Skipped 110 previous similar messages [ 7897.112378] Lustre: oak-OST0050: Client 416c99fb-653c-af17-c233-d5516de96d20 (at 10.51.15.7@o2ib3) reconnecting [ 7897.112379] Lustre: oak-OST0054: Client 416c99fb-653c-af17-c233-d5516de96d20 (at 10.51.15.7@o2ib3) reconnecting [ 7901.414114] Lustre: oak-OST0032: Client 56a5a766-0782-0626-7e81-90dde2e2789a (at 10.51.2.28@o2ib3) reconnecting [ 7903.782521] Lustre: oak-OST0030: Client e9bcecd7-a198-50aa-33a9-a04f0aea63df (at 10.51.6.28@o2ib3) reconnecting [ 8069.784046] Lustre: oak-OST005a: Client 87556525-e81b-4 (at 10.51.1.4@o2ib3) reconnecting [ 8488.130149] Lustre: oak-OST005c: Connection restored to (at 10.50.5.24@o2ib2) [ 8488.138221] Lustre: Skipped 265 previous similar messages [ 9098.688294] Lustre: oak-OST0052: Connection restored to (at 10.49.19.6@o2ib1) [ 9098.696361] Lustre: Skipped 174 previous similar messages [ 9742.804442] Lustre: oak-OST0030: Connection restored to (at 10.50.5.24@o2ib2) [ 9742.812511] Lustre: Skipped 129 previous similar messages [10347.835385] Lustre: oak-OST003e: Connection restored to (at 10.49.19.5@o2ib1) [10347.843457] Lustre: Skipped 146 previous similar messages [10954.211207] Lustre: 
oak-OST0046: Connection restored to fdda7891-cb9d-4 (at 10.50.1.72@o2ib2) [10954.220746] Lustre: Skipped 134 previous similar messages [11559.279652] Lustre: oak-OST005a: Connection restored to 26eed2fd-a720-4 (at 10.49.19.4@o2ib1) [11559.289214] Lustre: Skipped 51 previous similar messages [11597.065269] Lustre: oak-OST003a: Client 11c92c9a-5a17-4 (at 10.51.2.27@o2ib3) reconnecting [11597.065270] Lustre: oak-OST003c: Client 11c92c9a-5a17-4 (at 10.51.2.27@o2ib3) reconnecting [11597.065273] Lustre: Skipped 2 previous similar messages [11597.089553] Lustre: Skipped 1 previous similar message [12160.396490] md: data-check of RAID array md34 [12163.150649] Lustre: oak-OST0042: Connection restored to (at 10.50.5.24@o2ib2) [12163.158743] Lustre: Skipped 68 previous similar messages [12166.593004] md: data-check of RAID array md38 [12172.663037] md: data-check of RAID array md22 [12178.870811] md: data-check of RAID array md40 [12184.923682] md: data-check of RAID array md14 [12191.056580] md: data-check of RAID array md36 [12197.249490] md: data-check of RAID array md18 [12203.321883] md: data-check of RAID array md30 [12209.514937] md: data-check of RAID array md20 [12215.663186] md: data-check of RAID array md10 [12221.732809] md: data-check of RAID array md32 [12227.876953] md: data-check of RAID array md0 [12234.032132] md: data-check of RAID array md24 [12662.792645] perf: interrupt took too long (2514 > 2500), lowering kernel.perf_event_max_sample_rate to 79000 [12763.754467] Lustre: oak-OST0058: Connection restored to (at 10.50.5.24@o2ib2) [12763.762567] Lustre: Skipped 106 previous similar messages [13368.715969] Lustre: oak-OST0048: Connection restored to 247b5cff-df4a-4 (at 10.50.5.11@o2ib2) [13368.725484] Lustre: Skipped 125 previous similar messages [13980.858256] Lustre: oak-OST005c: Connection restored to 4685c272-f3d8-4 (at 10.50.9.9@o2ib2) [13980.867679] Lustre: Skipped 76 previous similar messages [14584.678615] Lustre: oak-OST0044: Connection restored 
to 247b5cff-df4a-4 (at 10.50.5.11@o2ib2) [14584.688140] Lustre: Skipped 84 previous similar messages [15191.707674] Lustre: oak-OST005c: Connection restored to 247b5cff-df4a-4 (at 10.50.5.11@o2ib2) [15191.717207] Lustre: Skipped 101 previous similar messages [15792.287585] Lustre: oak-OST004e: Connection restored to (at 10.49.19.6@o2ib1) [15792.295653] Lustre: Skipped 91 previous similar messages [16395.478247] Lustre: oak-OST005a: Connection restored to 613858a7-34ed-4 (at 10.51.1.33@o2ib3) [16395.487812] Lustre: Skipped 50 previous similar messages [16581.248732] perf: interrupt took too long (3147 > 3142), lowering kernel.perf_event_max_sample_rate to 63000 [16997.406405] Lustre: oak-OST0042: Connection restored to (at 10.50.13.7@o2ib2) [16997.414484] Lustre: Skipped 61 previous similar messages [17602.199013] Lustre: oak-OST005c: Connection restored to (at 10.51.5.32@o2ib3) [17602.207084] Lustre: Skipped 127 previous similar messages [18203.488168] Lustre: oak-OST0040: Connection restored to 247b5cff-df4a-4 (at 10.50.5.11@o2ib2) [18203.497690] Lustre: Skipped 73 previous similar messages [18825.699256] Lustre: oak-OST005e: Connection restored to 247b5cff-df4a-4 (at 10.50.5.11@o2ib2) [18825.708845] Lustre: Skipped 100 previous similar messages [19428.732589] Lustre: oak-OST0034: Connection restored to 9a280014-aa1b-4 (at 10.50.13.2@o2ib2) [19428.742111] Lustre: Skipped 123 previous similar messages [20042.999630] Lustre: oak-OST0032: Connection restored to (at 10.50.14.13@o2ib2) [20043.007813] Lustre: Skipped 104 previous similar messages [20647.359712] Lustre: oak-OST0036: Connection restored to (at 10.49.19.5@o2ib1) [20647.367782] Lustre: Skipped 116 previous similar messages [21235.725641] perf: interrupt took too long (3946 > 3933), lowering kernel.perf_event_max_sample_rate to 50000 [21258.213088] Lustre: oak-OST005c: Connection restored to 247b5cff-df4a-4 (at 10.50.5.11@o2ib2) [21258.222612] Lustre: Skipped 97 previous similar messages [21874.937201] 
Lustre: oak-OST0050: Connection restored to (at 10.50.13.7@o2ib2) [21874.945289] Lustre: Skipped 30 previous similar messages [22513.519267] Lustre: oak-OST0050: Connection restored to fdda7891-cb9d-4 (at 10.50.1.72@o2ib2) [22513.528824] Lustre: Skipped 144 previous similar messages [23113.810757] Lustre: oak-OST0032: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [23113.820298] Lustre: Skipped 53 previous similar messages [23714.573296] Lustre: oak-OST0054: Connection restored to 247b5cff-df4a-4 (at 10.50.5.11@o2ib2) [23714.582848] Lustre: Skipped 80 previous similar messages [24315.549598] Lustre: oak-OST003c: Connection restored to 247b5cff-df4a-4 (at 10.50.5.11@o2ib2) [24315.559167] Lustre: Skipped 65 previous similar messages [24923.930158] Lustre: oak-OST0040: Connection restored to 247b5cff-df4a-4 (at 10.50.5.11@o2ib2) [24923.939683] Lustre: Skipped 70 previous similar messages [25526.954553] Lustre: oak-OST003a: Connection restored to ba0a009d-55d9-4 (at 10.49.22.34@o2ib1) [25526.964173] Lustre: Skipped 134 previous similar messages [26147.292453] Lustre: oak-OST0044: Connection restored to 26eed2fd-a720-4 (at 10.49.19.4@o2ib1) [26147.301973] Lustre: Skipped 114 previous similar messages [26775.040202] Lustre: oak-OST0042: Connection restored to 168165af-3e38-4 (at 10.49.25.19@o2ib1) [26775.049828] Lustre: Skipped 45 previous similar messages [27377.189251] Lustre: oak-OST0056: Connection restored to (at 10.49.19.1@o2ib1) [27377.197373] Lustre: Skipped 148 previous similar messages [27980.044941] Lustre: oak-OST004c: Connection restored to 247b5cff-df4a-4 (at 10.50.5.11@o2ib2) [27980.054463] Lustre: Skipped 82 previous similar messages [28602.443327] Lustre: oak-OST005a: Connection restored to a467a7e1-9686-4 (at 10.51.4.13@o2ib3) [28602.452919] Lustre: Skipped 90 previous similar messages [29214.716753] Lustre: oak-OST0038: Connection restored to 247b5cff-df4a-4 (at 10.50.5.11@o2ib2) [29214.726275] Lustre: Skipped 99 previous similar messages 
[29840.303543] Lustre: oak-OST004e: Connection restored to (at 10.50.13.7@o2ib2) [29840.311613] Lustre: Skipped 145 previous similar messages [30477.644630] Lustre: oak-OST004a: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [30477.654150] Lustre: Skipped 115 previous similar messages [31081.171393] Lustre: oak-OST005a: Connection restored to 247b5cff-df4a-4 (at 10.50.5.11@o2ib2) [31081.180936] Lustre: Skipped 125 previous similar messages [31397.453078] perf: interrupt took too long (4938 > 4932), lowering kernel.perf_event_max_sample_rate to 40000 [31707.636729] Lustre: oak-OST0058: Connection restored to (at 10.49.19.3@o2ib1) [31707.644800] Lustre: Skipped 100 previous similar messages [31903.732928] Lustre: oak-OST005c: Client 5ebd6997-e65a-7d53-0818-093cecfcf87e (at 10.50.5.41@o2ib2) reconnecting [31905.073580] Lustre: oak-OST005c: Client 37ba335a-2b49-af5e-4c04-11b87c9917c5 (at 10.50.9.44@o2ib2) reconnecting [32313.090328] Lustre: oak-OST0032: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [32313.099851] Lustre: Skipped 84 previous similar messages [32919.394291] Lustre: oak-OST0048: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [32919.403836] Lustre: Skipped 168 previous similar messages [33520.833989] Lustre: oak-OST003c: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [33520.843537] Lustre: Skipped 179 previous similar messages [34123.816407] Lustre: oak-OST0036: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [34123.825939] Lustre: Skipped 148 previous similar messages [34729.041116] Lustre: oak-OST0044: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [34729.050647] Lustre: Skipped 182 previous similar messages [35334.563075] Lustre: oak-OST0038: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [35334.572666] Lustre: Skipped 167 previous similar messages [35935.510609] Lustre: oak-OST0054: Connection restored to 5a1327cd-5da5-4 (at 10.50.13.3@o2ib2) 
[35935.520140] Lustre: Skipped 88 previous similar messages [36539.858473] Lustre: oak-OST0042: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [36539.867986] Lustre: Skipped 119 previous similar messages [37154.547245] Lustre: oak-OST0056: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [37154.556779] Lustre: Skipped 108 previous similar messages [37757.972151] Lustre: oak-OST0036: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [37757.981721] Lustre: Skipped 80 previous similar messages [38361.565839] Lustre: oak-OST0042: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [38361.575404] Lustre: Skipped 134 previous similar messages [38963.460894] Lustre: oak-OST005e: Connection restored to 9e244831-b7e7-4 (at 10.50.4.5@o2ib2) [38963.470328] Lustre: Skipped 96 previous similar messages [39579.960900] Lustre: oak-OST0046: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [39579.970442] Lustre: Skipped 115 previous similar messages [40189.004678] Lustre: oak-OST003c: Connection restored to 7575a33a-d426-4 (at 10.50.4.3@o2ib2) [40189.014592] Lustre: Skipped 129 previous similar messages [40789.998073] Lustre: oak-OST0050: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [40790.007611] Lustre: Skipped 114 previous similar messages [41392.327474] Lustre: oak-OST0040: Connection restored to (at 10.51.4.24@o2ib3) [41392.335561] Lustre: Skipped 102 previous similar messages [41993.259960] Lustre: oak-OST004e: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [41993.269470] Lustre: Skipped 139 previous similar messages [42594.174824] Lustre: oak-OST0038: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [42594.184348] Lustre: Skipped 88 previous similar messages [43202.491667] Lustre: oak-OST0038: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [43202.501201] Lustre: Skipped 92 previous similar messages [43814.026665] Lustre: oak-OST0054: Connection restored to 
d073f313-60b4-4 (at 10.51.15.5@o2ib3) [43814.036200] Lustre: Skipped 92 previous similar messages [44416.391272] Lustre: oak-OST0044: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [44416.400818] Lustre: Skipped 73 previous similar messages [45030.671225] Lustre: oak-OST005e: Connection restored to 0819d613-98f5-4 (at 10.50.14.14@o2ib2) [45030.680847] Lustre: Skipped 56 previous similar messages [45665.309016] Lustre: oak-OST0040: Connection restored to (at 10.51.1.43@o2ib3) [45665.317125] Lustre: Skipped 36 previous similar messages [46281.182872] Lustre: oak-OST005e: Connection restored to cfbdda79-1410-4 (at 10.49.28.7@o2ib1) [46281.192421] Lustre: Skipped 21 previous similar messages [46890.788591] Lustre: oak-OST003e: Connection restored to 0028e5c0-f60e-4 (at 10.51.4.34@o2ib3) [46890.798112] Lustre: Skipped 322 previous similar messages [47527.019449] Lustre: oak-OST003e: Connection restored to 254f88b5-a6bc-4 (at 10.51.6.12@o2ib3) [47527.028980] Lustre: Skipped 238 previous similar messages [48155.356347] Lustre: oak-OST005a: Connection restored to (at 10.50.13.7@o2ib2) [48155.364417] Lustre: Skipped 157 previous similar messages [48798.029493] Lustre: oak-OST005e: Connection restored to 0819d613-98f5-4 (at 10.50.14.14@o2ib2) [48798.039134] Lustre: Skipped 56 previous similar messages [49398.934786] Lustre: oak-OST0058: Connection restored to 1e9cf67b-f489-4 (at 10.50.5.55@o2ib2) [49398.944312] Lustre: Skipped 73 previous similar messages [50006.430515] Lustre: oak-OST0056: Connection restored to (at 10.49.19.3@o2ib1) [50006.438616] Lustre: Skipped 95 previous similar messages [50670.667009] Lustre: oak-OST004c: Connection restored to (at 10.51.1.43@o2ib3) [50670.675074] Lustre: Skipped 41 previous similar messages [51272.005060] Lustre: oak-OST0036: Connection restored to 8a494beb-4a60-4 (at 10.50.4.34@o2ib2) [51272.014587] Lustre: Skipped 28 previous similar messages [51909.208989] Lustre: oak-OST0030: Connection restored to 254f88b5-a6bc-4 
(at 10.51.6.12@o2ib3) [51909.218510] Lustre: Skipped 358 previous similar messages [52510.408097] Lustre: oak-OST0046: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [52510.417658] Lustre: Skipped 62 previous similar messages [53111.625065] Lustre: oak-OST0040: Connection restored to ed109291-03b5-4 (at 10.51.4.16@o2ib3) [53111.634592] Lustre: Skipped 42 previous similar messages [53716.156141] Lustre: oak-OST0054: Connection restored to (at 10.50.14.13@o2ib2) [53716.164311] Lustre: Skipped 50 previous similar messages [54320.562704] Lustre: oak-OST004c: Connection restored to af5286ea-aa80-4 (at 10.49.19.2@o2ib1) [54320.572226] Lustre: Skipped 99 previous similar messages [54929.753884] Lustre: oak-OST0052: Connection restored to af5286ea-aa80-4 (at 10.49.19.2@o2ib1) [54929.763407] Lustre: Skipped 156 previous similar messages [55536.432387] Lustre: oak-OST004a: Connection restored to (at 10.51.1.43@o2ib3) [55536.440488] Lustre: Skipped 91 previous similar messages [55924.277923] perf: interrupt took too long (6205 > 6172), lowering kernel.perf_event_max_sample_rate to 32000 [56145.617229] Lustre: oak-OST003a: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [56145.626792] Lustre: Skipped 184 previous similar messages [56753.254448] Lustre: oak-OST005c: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [56753.263995] Lustre: Skipped 223 previous similar messages [56899.389277] Lustre: oak-OST0048: Client 62935bb7-73e1-04f7-3b52-0957a312c1ed (at 10.50.17.32@o2ib2) reconnecting [56900.390460] Lustre: oak-OST005c: Client 62935bb7-73e1-04f7-3b52-0957a312c1ed (at 10.50.17.32@o2ib2) reconnecting [56953.432173] LustreError: 242900:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0030: cli 182778a0-b920-4 claims 32768 GRANT, real grant 0 [56953.445769] LustreError: 242900:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 1 previous similar message [56954.618465] LustreError: 193190:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0058: cli 
182778a0-b920-4 claims 32768 GRANT, real grant 0 [57407.560560] Lustre: oak-OST0058: Connection restored to af5286ea-aa80-4 (at 10.49.19.2@o2ib1) [57407.570084] Lustre: Skipped 195 previous similar messages [58010.832774] Lustre: oak-OST005c: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [58010.842312] Lustre: Skipped 197 previous similar messages [58563.301320] Lustre: oak-OST0030: Client 90752487-0b3e-1696-21a1-6c81abc18872 (at 10.51.1.2@o2ib3) reconnecting [58563.301321] Lustre: oak-OST0050: Client 90752487-0b3e-1696-21a1-6c81abc18872 (at 10.51.1.2@o2ib3) reconnecting [58563.301324] Lustre: Skipped 1 previous similar message [58563.891567] Lustre: oak-OST004e: Client 4cee94e6-025c-589a-13ab-6c9ed337de31 (at 10.51.2.36@o2ib3) reconnecting [58565.126717] Lustre: oak-OST005e: Client 76eb6295-9d00-d7ab-8458-c8aac654030a (at 10.51.6.5@o2ib3) reconnecting [58565.137894] Lustre: Skipped 3 previous similar messages [58625.986359] Lustre: oak-OST0050: Connection restored to 93f0bf1d-9d29-4 (at 10.50.12.8@o2ib2) [58625.995883] Lustre: Skipped 92 previous similar messages [59210.567777] Lustre: oak-OST003c: Client 426db552-aa3e-e716-09b9-521b00b20a6c (at 10.50.9.6@o2ib2) reconnecting [59210.578961] Lustre: Skipped 1 previous similar message [59243.881647] Lustre: oak-OST0048: Connection restored to (at 10.49.19.5@o2ib1) [59243.889715] Lustre: Skipped 236 previous similar messages [59853.144825] Lustre: oak-OST0038: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [59853.154357] Lustre: Skipped 241 previous similar messages [60454.765859] Lustre: oak-OST0036: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [60454.775382] Lustre: Skipped 134 previous similar messages [61057.911683] Lustre: oak-OST005a: Connection restored to 157af72b-9c30-4 (at 10.50.5.51@o2ib2) [61057.921210] Lustre: Skipped 187 previous similar messages [61195.122370] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.50.13.4@o2ib2 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [61195.141685] LustreError: Skipped 320 previous similar messages [61659.610240] Lustre: oak-OST0044: Connection restored to 1012a999-78be-4 (at 10.50.3.47@o2ib2) [61659.619764] Lustre: Skipped 178 previous similar messages [62265.128209] Lustre: oak-OST0038: Connection restored to b77576b5-26ca-4 (at 10.50.14.5@o2ib2) [62265.137733] Lustre: Skipped 98 previous similar messages [62866.039008] Lustre: oak-OST0038: Connection restored to 7d4930b0-1dcd-4 (at 10.51.13.8@o2ib3) [62866.048553] Lustre: Skipped 119 previous similar messages [63352.665881] Lustre: oak-OST004a: Client 712cedcb-4de0-4 (at 10.50.7.39@o2ib2) reconnecting [63352.675156] Lustre: Skipped 2 previous similar messages [63353.479054] Lustre: oak-OST0030: Client f36a27e2-f202-12fb-4235-55f4a8db5980 (at 10.50.7.2@o2ib2) reconnecting [63353.490229] Lustre: Skipped 2 previous similar messages [63401.937778] LustreError: 193415:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(3832625) req@ffff8be721d6f050 x1697002421364416/t0(0) o4->e334e4f1-04fe-8a2a-4f8d-26baac5ef83e@10.50.12.10@o2ib2:668/0 lens 488/448 e 0 to 0 dl 1618784088 ref 1 fl Interpret:/0/0 rc 0/0 [63401.966067] Lustre: oak-OST0048: Bulk IO write error with e334e4f1-04fe-8a2a-4f8d-26baac5ef83e (at 10.50.12.10@o2ib2), client will retry: rc = -110 [63453.022503] LustreError: 137-5: oak-OST0037_UUID: not available for connect from 10.50.5.62@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[63478.109967] Lustre: oak-OST0056: Client f9b5a9df-ddaf-0ab2-8c1a-f9202576d223 (at 10.50.5.62@o2ib2) reconnecting [63478.110125] Lustre: oak-OST005e: Connection restored to f9b5a9df-ddaf-0ab2-8c1a-f9202576d223 (at 10.50.5.62@o2ib2) [63478.110127] Lustre: Skipped 55 previous similar messages [63478.138723] Lustre: Skipped 1 previous similar message [63520.201002] Lustre: oak-OST0046: Client f9b5a9df-ddaf-0ab2-8c1a-f9202576d223 (at 10.50.5.62@o2ib2) reconnecting [63520.201003] Lustre: oak-OST0036: Client f9b5a9df-ddaf-0ab2-8c1a-f9202576d223 (at 10.50.5.62@o2ib2) reconnecting [63520.223555] Lustre: Skipped 2 previous similar messages [64101.059148] Lustre: oak-OST0038: Connection restored to 5108fcd1-08f1-4 (at 10.50.5.39@o2ib2) [64101.068676] Lustre: Skipped 42 previous similar messages [64715.314291] Lustre: oak-OST0056: Connection restored to 5e3fa4ab-9670-4 (at 10.51.4.21@o2ib3) [64715.323815] Lustre: Skipped 67 previous similar messages [65331.195725] Lustre: oak-OST0034: Connection restored to 1a571093-7cfe-4 (at 10.50.10.9@o2ib2) [65331.205265] Lustre: Skipped 39 previous similar messages [65942.719182] Lustre: oak-OST0050: Connection restored to d96a6637-b485-4 (at 10.50.12.11@o2ib2) [65942.728802] Lustre: Skipped 84 previous similar messages [66552.249832] Lustre: oak-OST0030: Connection restored to 5108fcd1-08f1-4 (at 10.50.5.39@o2ib2) [66552.259356] Lustre: Skipped 56 previous similar messages [66871.961725] Lustre: oak-OST005e: Client d48bbf19-d33f-1f44-7d24-fbb5f0736220 (at 10.50.0.12@o2ib2) reconnecting [66871.973004] Lustre: Skipped 2 previous similar messages [66972.337678] LustreError: 137-5: oak-OST004f_UUID: not available for connect from 10.50.0.12@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [66972.357099] LustreError: Skipped 2 previous similar messages [67101.127147] Lustre: oak-OST0048: haven't heard from client a5645e6e-f4b2-4 (at 10.50.7.9@o2ib2) in 227 seconds. 
I think it's dead, and I am evicting it. exp ffff8be723d1a800, cur 1618787744 expire 1618787594 last 1618787517 [67101.149291] Lustre: Skipped 23 previous similar messages [67152.554383] Lustre: oak-OST0046: Connection restored to c6abed43-2af3-4 (at 10.51.1.48@o2ib3) [67152.563909] Lustre: Skipped 125 previous similar messages [67758.167733] Lustre: oak-OST0046: Connection restored to (at 10.49.19.5@o2ib1) [67758.175809] Lustre: Skipped 132 previous similar messages [68362.716478] Lustre: oak-OST0030: Connection restored to ed109291-03b5-4 (at 10.51.4.16@o2ib3) [68362.726000] Lustre: Skipped 59 previous similar messages [68963.146315] Lustre: oak-OST0030: Connection restored to d0c323ea-8af6-4 (at 10.50.10.11@o2ib2) [68963.155940] Lustre: Skipped 114 previous similar messages [69571.608556] Lustre: oak-OST005e: Connection restored to (at 10.49.19.3@o2ib1) [69571.616636] Lustre: Skipped 102 previous similar messages [70173.354317] Lustre: oak-OST005e: Connection restored to a7406eef-a378-4 (at 10.50.4.26@o2ib2) [70173.363900] Lustre: Skipped 62 previous similar messages [70796.559355] Lustre: oak-OST0032: Connection restored to 9a280014-aa1b-4 (at 10.50.13.2@o2ib2) [70796.568888] Lustre: Skipped 227 previous similar messages [71070.640422] Lustre: oak-OST0034: Client 70e2b68b-00fb-4 (at 10.50.8.53@o2ib2) reconnecting [71070.649667] Lustre: Skipped 1 previous similar message [71400.033769] Lustre: oak-OST0034: Connection restored to 5108fcd1-08f1-4 (at 10.50.5.39@o2ib2) [71400.043292] Lustre: Skipped 77 previous similar messages [71975.087425] Lustre: oak-OST0048: Client 5f61858b-c239-a69f-19dd-103d92afb286 (at 10.50.1.60@o2ib2) reconnecting [72006.988503] Lustre: oak-OST0054: Connection restored to (at 10.49.19.3@o2ib1) [72006.996583] Lustre: Skipped 197 previous similar messages [72619.559691] Lustre: oak-OST0036: Connection restored to fd16aff2-0371-4 (at 10.51.4.33@o2ib3) [72619.569229] Lustre: Skipped 46 previous similar messages [73232.494492] Lustre: 
oak-OST0040: Connection restored to 287304df-9062-4 (at 10.50.4.25@o2ib2) [73232.504032] Lustre: Skipped 86 previous similar messages [73836.854473] Lustre: oak-OST0032: Connection restored to (at 10.50.13.7@o2ib2) [73836.862556] Lustre: Skipped 186 previous similar messages [74440.879097] Lustre: oak-OST0056: Connection restored to (at 10.49.19.5@o2ib1) [74440.887194] Lustre: Skipped 200 previous similar messages [75042.982917] Lustre: oak-OST0054: Connection restored to (at 10.50.13.7@o2ib2) [75042.991029] Lustre: Skipped 139 previous similar messages [75643.376455] Lustre: oak-OST0040: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [75643.385978] Lustre: Skipped 102 previous similar messages [76254.386308] Lustre: oak-OST005e: Connection restored to 712cedcb-4de0-4 (at 10.50.7.39@o2ib2) [76254.395821] Lustre: Skipped 88 previous similar messages [76875.938185] Lustre: oak-OST0052: Connection restored to (at 10.50.13.7@o2ib2) [76875.946259] Lustre: Skipped 53 previous similar messages [77516.413840] Lustre: oak-OST0040: Connection restored to (at 10.51.2.20@o2ib3) [77516.421925] Lustre: Skipped 39 previous similar messages [78134.651273] Lustre: oak-OST005c: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [78134.660802] Lustre: Skipped 121 previous similar messages [78752.370981] Lustre: oak-OST004a: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [78752.380505] Lustre: Skipped 61 previous similar messages [79362.368548] Lustre: oak-OST003a: Connection restored to (at 10.51.1.43@o2ib3) [79362.376613] Lustre: Skipped 209 previous similar messages [79964.883606] Lustre: oak-OST0052: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [79964.893134] Lustre: Skipped 158 previous similar messages [80595.248278] Lustre: oak-OST003c: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [80595.257802] Lustre: Skipped 105 previous similar messages [81165.104095] Lustre: oak-OST005e: Client 
426db552-aa3e-e716-09b9-521b00b20a6c (at 10.50.9.6@o2ib2) reconnecting [81273.767259] Lustre: oak-OST0058: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [81273.776788] Lustre: Skipped 79 previous similar messages [81874.264356] Lustre: oak-OST0040: Connection restored to d7c49207-5239-4 (at 10.50.10.47@o2ib2) [81874.274011] Lustre: Skipped 57 previous similar messages [82486.112075] Lustre: oak-OST004c: Connection restored to b77576b5-26ca-4 (at 10.50.14.5@o2ib2) [82486.121616] Lustre: Skipped 94 previous similar messages [82891.116832] Lustre: oak-OST0056: Client 7f5a03b0-a887-10eb-7386-d75d82cdd92b (at 10.51.13.20@o2ib3) reconnecting [82891.116833] Lustre: oak-OST005c: Client 7f5a03b0-a887-10eb-7386-d75d82cdd92b (at 10.51.13.20@o2ib3) reconnecting [82891.222688] LNet: 175095:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.214@o2ib5: error 0(waiting) [82891.235311] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.214@o2ib5 exceeded retry count 0 [82891.247531] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 4 previous similar messages [82891.457093] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7220ea800 [82951.391976] LustreError: 193411:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(402) req@ffff8be6c34b6050 x1684976521334784/t0(0) o4->4da274c3-cbba-4@10.50.2.62@o2ib2:582/0 lens 488/448 e 0 to 0 dl 1618803632 ref 1 fl Interpret:/0/0 rc 0/0 [82951.391990] LustreError: 193457:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(8192) req@ffff8be71bcc0050 x1694090442612928/t0(0) o3->0819d613-98f5-4@10.50.14.14@o2ib2:582/0 lens 488/440 e 0 to 0 dl 1618803632 ref 1 fl Interpret:/0/0 rc 0/0 [82951.391998] Lustre: oak-OST0032: Bulk IO write error with 1e62d860-e8bc-4 (at 10.50.5.2@o2ib2), client will retry: rc = -110 [82951.392000] LustreError: 193453:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on 
bulk READ req@ffff8be660b07050 x1694068475376704/t0(0) o3->7f8e8eb2-f021-4@10.50.14.10@o2ib2:582/0 lens 488/440 e 0 to 0 dl 1618803632 ref 1 fl Interpret:/0/0 rc 0/0 [82951.392007] Lustre: oak-OST0056: Bulk IO read error with 0819d613-98f5-4 (at 10.50.14.14@o2ib2), client will retry: rc -110 [82951.392008] Lustre: Skipped 8 previous similar messages [82951.499005] LustreError: 193411:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 4 previous similar messages [82985.094665] Lustre: oak-OST003c: Client 24be0a47-1b33-0b5c-28de-7a37c40c9b32 (at 10.210.12.64@tcp1) reconnecting [82985.106064] Lustre: Skipped 1 previous similar message [82986.097862] Lustre: oak-OST0032: Client 24be0a47-1b33-0b5c-28de-7a37c40c9b32 (at 10.210.12.64@tcp1) reconnecting [82986.109226] Lustre: Skipped 38 previous similar messages [82988.282474] Lustre: oak-OST0038: Client a0f89c7b-b6d6-fcb9-5bf1-39a2ff594123 (at 10.210.12.58@tcp1) reconnecting [82988.293895] Lustre: Skipped 87 previous similar messages [82992.325980] Lustre: oak-OST0030: Client 55f24183-a271-d072-6af1-1634bdd8dca0 (at 10.210.12.74@tcp1) reconnecting [82992.337347] Lustre: Skipped 159 previous similar messages [83057.833735] Lustre: oak-OST0038: Client 7f5a03b0-a887-10eb-7386-d75d82cdd92b (at 10.51.13.20@o2ib3) reconnecting [83057.845147] Lustre: Skipped 29 previous similar messages [83075.324430] Lustre: oak-OST0056: Client 7660ea29-02f3-4 (at 10.50.13.10@o2ib2) reconnecting [83075.333753] Lustre: Skipped 672 previous similar messages [83086.286120] Lustre: oak-OST003a: Connection restored to 7660ea29-02f3-4 (at 10.50.13.10@o2ib2) [83086.295759] Lustre: Skipped 1130 previous similar messages [83113.924721] Lustre: oak-OST0034: Client 7f8e8eb2-f021-4 (at 10.50.14.10@o2ib2) reconnecting [83113.934052] Lustre: Skipped 24 previous similar messages [83183.806511] Lustre: oak-OST003a: Client 5b69c35d-4f29-4 (at 10.50.12.13@o2ib2) reconnecting [83183.815828] Lustre: Skipped 3 previous similar messages [83369.574816] Lustre: 
oak-OST0050: Client 489ee6f9-e0d9-4 (at 10.50.14.15@o2ib2) reconnecting [83369.584164] Lustre: Skipped 8 previous similar messages [83761.368151] Lustre: oak-OST0030: Connection restored to 71f0967f-07e1-4 (at 10.50.1.48@o2ib2) [83761.377676] Lustre: Skipped 87 previous similar messages [84235.364264] Lustre: oak-OST0058: Client d96a6637-b485-4 (at 10.50.12.11@o2ib2) reconnecting [84235.373605] Lustre: Skipped 7 previous similar messages [84403.806244] Lustre: oak-OST004a: Connection restored to a3eb25b2-055d-4 (at 10.50.5.31@o2ib2) [84403.815764] Lustre: Skipped 89 previous similar messages [84823.978431] LustreError: 193194:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, real grant 0 [84823.992032] LustreError: 193194:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 1 previous similar message [84953.719314] LustreError: 2427:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, real grant 0 [85098.517593] Lustre: oak-OST0056: Connection restored to (at 10.51.1.43@o2ib3) [85098.525660] Lustre: Skipped 43 previous similar messages [85170.643148] LustreError: 227676:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, real grant 0 [85517.228230] LustreError: 227280:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, real grant 0 [85700.385200] Lustre: oak-OST004a: Connection restored to 97723b04-0775-4 (at 10.51.1.24@o2ib3) [85700.394726] Lustre: Skipped 75 previous similar messages [86030.047720] LustreError: 237965:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, real grant 0 [86050.110270] LustreError: 3788:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, real grant 0 [86230.073897] LustreError: 193407:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, real grant 0 [86334.452432] Lustre: 
oak-OST0034: Connection restored to f3c13fd8-933e-4 (at 10.50.4.12@o2ib2) [86334.461978] Lustre: Skipped 60 previous similar messages [86665.424131] LustreError: 240013:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 49152 GRANT, real grant 0 [86878.232394] Lustre: oak-OST0034: Client ae2e39c9-d6c4-e399-c4e2-9281e18e4ae6 (at 10.50.12.15@o2ib2) reconnecting [86878.232395] Lustre: oak-OST0048: Client ae2e39c9-d6c4-e399-c4e2-9281e18e4ae6 (at 10.50.12.15@o2ib2) reconnecting [86878.232398] Lustre: Skipped 4 previous similar messages [86878.260971] Lustre: Skipped 3 previous similar messages [86944.089766] Lustre: oak-OST005c: Connection restored to 2f2dff73-004a-4 (at 10.51.2.14@o2ib3) [86944.099566] Lustre: Skipped 62 previous similar messages [87174.085435] LustreError: 193189:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, real grant 0 [87565.559871] Lustre: oak-OST0038: Connection restored to 671443b1-1450-4 (at 10.50.4.10@o2ib2) [87565.569399] Lustre: Skipped 80 previous similar messages [87619.228610] LustreError: 193395:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, real grant 0 [87619.242244] LustreError: 193395:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 2 previous similar messages [88178.158059] LustreError: 238493:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, real grant 0 [88178.171664] LustreError: 238493:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 1 previous similar message [88181.945574] Lustre: oak-OST0040: Connection restored to (at 10.50.8.17@o2ib2) [88181.953644] Lustre: Skipped 55 previous similar messages [88785.812508] Lustre: oak-OST0046: Connection restored to 190029f0-b1fc-4 (at 10.50.2.13@o2ib2) [88785.822031] Lustre: Skipped 108 previous similar messages [89022.985911] LustreError: 193397:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, 
real grant 0 [89022.999506] LustreError: 193397:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 1 previous similar message [89402.325502] Lustre: oak-OST005c: Connection restored to f3c13fd8-933e-4 (at 10.50.4.12@o2ib2) [89402.335044] Lustre: Skipped 87 previous similar messages [90017.835190] Lustre: oak-OST0036: Connection restored to (at 10.50.13.7@o2ib2) [90017.843259] Lustre: Skipped 50 previous similar messages [90147.376595] LustreError: 193102:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, real grant 0 [90147.390192] LustreError: 193102:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 1 previous similar message [90619.508104] Lustre: oak-OST0036: Connection restored to 9e244831-b7e7-4 (at 10.50.4.5@o2ib2) [90619.517540] Lustre: Skipped 175 previous similar messages [90842.782688] LustreError: 238501:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, real grant 0 [90842.796282] LustreError: 238501:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 4 previous similar messages [91234.459451] Lustre: oak-OST0054: Connection restored to 1b1ee8dd-02b3-4 (at 10.50.4.15@o2ib2) [91234.468966] Lustre: Skipped 216 previous similar messages [91844.401344] Lustre: oak-OST0048: Connection restored to 97723b04-0775-4 (at 10.51.1.24@o2ib3) [91844.410871] Lustre: Skipped 136 previous similar messages [91911.046432] LustreError: 3682:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli a8e1d696-3374-4 claims 57344 GRANT, real grant 0 [92454.590677] Lustre: oak-OST0058: Connection restored to (at 10.50.13.7@o2ib2) [92454.598746] Lustre: Skipped 67 previous similar messages [92739.363581] Lustre: oak-OST0048: Client 76eb6295-9d00-d7ab-8458-c8aac654030a (at 10.51.6.5@o2ib3) reconnecting [92739.363582] Lustre: oak-OST004e: Client 76eb6295-9d00-d7ab-8458-c8aac654030a (at 10.51.6.5@o2ib3) reconnecting [92739.363585] Lustre: Skipped 6 previous similar messages [92824.539956] Lustre: oak-OST0048: Client 
b7b29b0e-7b9e-7f93-7f9b-e31ab6f299f7 (at 10.50.2.42@o2ib2) reconnecting [92824.551232] Lustre: Skipped 1 previous similar message [92872.646749] Lustre: oak-OST0052: Client 4f343bd2-a649-c662-9ecd-b48ac04a0cde (at 10.50.7.43@o2ib2) reconnecting [92872.658042] Lustre: Skipped 1 previous similar message [93076.911832] Lustre: oak-OST0032: Connection restored to 97d5652f-f6c9-4 (at 10.50.5.72@o2ib2) [93076.921367] Lustre: Skipped 61 previous similar messages [93694.027709] Lustre: oak-OST0048: Connection restored to 079c853c-4610-4 (at 10.50.9.12@o2ib2) [93694.037250] Lustre: Skipped 43 previous similar messages [94296.609321] Lustre: oak-OST005c: Connection restored to (at 10.50.13.7@o2ib2) [94296.617391] Lustre: Skipped 45 previous similar messages [94917.496370] Lustre: oak-OST0042: haven't heard from client 9b42eeec-6b03-4 (at 10.50.7.9@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be70d61f400, cur 1618815561 expire 1618815411 last 1618815334 [94917.518500] Lustre: Skipped 23 previous similar messages [94965.666806] Lustre: oak-OST005c: Connection restored to (at 10.51.1.43@o2ib3) [94965.674880] Lustre: Skipped 89 previous similar messages [95604.043120] Lustre: oak-OST0034: Connection restored to 0b908ede-4c82-4 (at 10.50.8.51@o2ib2) [95604.052645] Lustre: Skipped 109 previous similar messages [96251.517373] Lustre: oak-OST0056: Connection restored to 35e5650e-6e37-4 (at 10.50.4.18@o2ib2) [96251.526920] Lustre: Skipped 224 previous similar messages [96853.963735] Lustre: oak-OST0030: Connection restored to 94f977bb-92ee-4 (at 10.50.4.14@o2ib2) [96853.973263] Lustre: Skipped 176 previous similar messages [97471.501909] Lustre: oak-OST005c: Connection restored to 71f0967f-07e1-4 (at 10.50.1.48@o2ib2) [97471.511439] Lustre: Skipped 50 previous similar messages [98087.555436] Lustre: oak-OST005c: Connection restored to (at 10.50.13.7@o2ib2) [98087.563527] Lustre: Skipped 74 previous similar messages [98709.468860] Lustre: oak-OST0054: 
Connection restored to 167e8185-6ca9-4 (at 10.50.14.6@o2ib2) [98709.478396] Lustre: Skipped 49 previous similar messages [99310.068804] Lustre: oak-OST0038: Connection restored to (at 10.50.13.7@o2ib2) [99310.076893] Lustre: Skipped 32 previous similar messages [99912.241312] Lustre: oak-OST0038: Connection restored to 671443b1-1450-4 (at 10.50.4.10@o2ib2) [99912.250849] Lustre: Skipped 46 previous similar messages [100512.792987] Lustre: oak-OST003e: Connection restored to d2d1291c-869c-4 (at 10.50.12.3@o2ib2) [100512.802612] Lustre: Skipped 42 previous similar messages [100836.360862] Lustre: oak-OST0030: haven't heard from client d0a006b6-5a56-d4b8-f359-c632776eea41 (at 10.51.1.14@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be7237a0c00, cur 1618821480 expire 1618821330 last 1618821253 [100836.385220] Lustre: Skipped 23 previous similar messages [101121.800746] Lustre: oak-OST004e: Connection restored to (at 10.51.4.15@o2ib3) [101121.808917] Lustre: Skipped 56 previous similar messages [101722.406105] Lustre: oak-OST0048: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [101722.415727] Lustre: Skipped 40 previous similar messages [102330.283367] Lustre: oak-OST0054: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [102330.293000] Lustre: Skipped 127 previous similar messages [102941.320151] Lustre: oak-OST0032: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [102941.329785] Lustre: Skipped 141 previous similar messages [103546.615095] Lustre: oak-OST003c: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [103546.624718] Lustre: Skipped 103 previous similar messages [104153.651791] Lustre: oak-OST0048: Connection restored to ac9a8546-2b3d-4 (at 10.50.4.4@o2ib2) [104153.661313] Lustre: Skipped 95 previous similar messages [104754.554549] Lustre: oak-OST005c: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [104754.564174] Lustre: Skipped 59 previous similar messages 
[105363.892255] Lustre: oak-OST0034: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [105363.901876] Lustre: Skipped 71 previous similar messages [105980.856237] Lustre: oak-OST0048: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [105980.865884] Lustre: Skipped 47 previous similar messages [106592.326563] Lustre: oak-OST0054: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [106592.336201] Lustre: Skipped 78 previous similar messages [107192.651696] Lustre: oak-OST0052: Connection restored to (at 10.50.13.7@o2ib2) [107192.659874] Lustre: Skipped 71 previous similar messages [107803.042843] Lustre: oak-OST005a: Connection restored to d4f26081-ad56-4 (at 10.50.12.17@o2ib2) [107803.052563] Lustre: Skipped 91 previous similar messages [108342.383690] LNet: 182051:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.224@o2ib5: error 0(waiting) [108342.396301] LNet: 182051:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 2 previous similar messages [108342.407576] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c6003c0) failed: 5 [108342.417999] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 19 previous similar messages [108342.418080] Lustre: oak-OST004e: Client f504a7a3-efcf-5fbe-e631-3075fd4d7e5d (at 10.0.2.224@o2ib5) reconnecting [108342.418099] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.224@o2ib5 exceeded retry count 0 [108342.418102] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be70c14ac00 [108342.418436] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be70c14ac00 [108342.418442] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bab2abe0400 [108342.418445] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bab2abe0400 [108342.418447] LustreError: 
50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bab2abe0400 [108342.418450] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bab2abe0400 [108342.538128] LustreError: 193446:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be902cb2050 x1695156809497536/t0(0) o3->f504a7a3-efcf-5fbe-e631-3075fd4d7e5d@10.0.2.224@o2ib5:358/0 lens 488/440 e 0 to 0 dl 1618829078 ref 1 fl Interpret:/0/0 rc 0/0 [108342.556325] Lustre: oak-OST004e: Bulk IO read error with f504a7a3-efcf-5fbe-e631-3075fd4d7e5d (at 10.0.2.224@o2ib5), client will retry: rc -110 [108342.556326] Lustre: Skipped 4 previous similar messages [108342.585558] LustreError: 193446:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 9 previous similar messages [108411.447525] Lustre: oak-OST0058: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [108411.457146] Lustre: Skipped 77 previous similar messages [109040.252866] Lustre: oak-OST005c: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [109040.262489] Lustre: Skipped 58 previous similar messages [109665.704766] Lustre: oak-OST004a: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [109665.714429] Lustre: Skipped 40 previous similar messages [109818.619551] Lustre: oak-OST0052: Client e2dfb392-7af6-8dc1-3231-4888ec96e5e4 (at 10.210.12.45@tcp1) reconnecting [109818.631028] Lustre: Skipped 3 previous similar messages [109819.253214] Lustre: oak-OST0044: Client 46d58de5-f067-bd1c-6271-b328b4f0c51e (at 10.210.12.52@tcp1) reconnecting [109819.264674] Lustre: Skipped 4 previous similar messages [109820.611417] Lustre: oak-OST0044: Client 1b856b6f-6401-599d-9fbc-c9db3b83c966 (at 10.210.12.38@tcp1) reconnecting [109820.622898] Lustre: Skipped 17 previous similar messages [109892.774049] Lustre: oak-OST005e: Client b5a6956e-8daf-3681-391a-61a78fc15208 (at 10.210.12.133@tcp1) reconnecting [109892.785617] Lustre: Skipped 30 previous similar messages 
[110288.792250] Lustre: oak-OST0056: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [110288.801874] Lustre: Skipped 149 previous similar messages [110892.241103] Lustre: oak-OST0054: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [110892.250798] Lustre: Skipped 73 previous similar messages [111495.189091] Lustre: oak-OST0038: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [111495.198802] Lustre: Skipped 32 previous similar messages [112123.233184] Lustre: oak-OST0038: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [112123.242808] Lustre: Skipped 95 previous similar messages [112729.902493] Lustre: oak-OST0052: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [112729.912131] Lustre: Skipped 71 previous similar messages [113352.638270] Lustre: oak-OST0034: Connection restored to ada8c4a5-0a51-4 (at 10.50.2.32@o2ib2) [113352.647891] Lustre: Skipped 41 previous similar messages [113954.135101] Lustre: oak-OST0038: Connection restored to 3e7c55d0-08c5-4 (at 10.51.4.52@o2ib3) [113954.144726] Lustre: Skipped 94 previous similar messages [114558.185741] Lustre: oak-OST0038: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [114558.195388] Lustre: Skipped 188 previous similar messages [115163.741281] Lustre: oak-OST005a: Connection restored to (at 10.51.1.43@o2ib3) [115163.749455] Lustre: Skipped 114 previous similar messages [115768.077737] Lustre: oak-OST0052: Connection restored to df60e1f5-ba28-4 (at 10.50.10.61@o2ib2) [115768.087459] Lustre: Skipped 79 previous similar messages [116378.868720] Lustre: oak-OST003c: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [116378.878342] Lustre: Skipped 28 previous similar messages [116978.985742] Lustre: oak-OST0034: Connection restored to 311b26ee-30a2-4 (at 10.50.15.6@o2ib2) [116978.995362] Lustre: Skipped 90 previous similar messages [117202.982549] Lustre: oak-OST004e: haven't heard from client 1e15528c-f96c-29fa-1ff3-901af3c89935 (at 
10.210.12.8@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be722772000, cur 1618837847 expire 1618837697 last 1618837620 [117203.006927] Lustre: Skipped 23 previous similar messages [117579.978165] Lustre: oak-OST003e: Connection restored to 1d6f2d21-9fc7-4 (at 10.51.1.44@o2ib3) [117579.987795] Lustre: Skipped 105 previous similar messages [118194.653650] Lustre: oak-OST0030: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [118194.663269] Lustre: Skipped 109 previous similar messages [118270.956646] Lustre: oak-OST003e: haven't heard from client ed1391b7-feeb-ce0c-442f-2aaf3b415878 (at 10.210.12.101@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc4c4520800, cur 1618838915 expire 1618838765 last 1618838688 [118270.981189] Lustre: Skipped 23 previous similar messages [118271.483306] Lustre: oak-OST005e: haven't heard from client 4bdc721a-dcad-a9dd-9485-c022534d7537 (at 10.210.12.107@tcp1) in 221 seconds. I think it's dead, and I am evicting it. 
exp ffff8be9797a9000, cur 1618838915 expire 1618838765 last 1618838694 [118271.507896] Lustre: Skipped 83 previous similar messages [118817.648480] Lustre: oak-OST004e: Connection restored to (at 10.50.13.7@o2ib2) [118817.656647] Lustre: Skipped 273 previous similar messages [119439.255531] Lustre: oak-OST0054: Connection restored to 167e8185-6ca9-4 (at 10.50.14.6@o2ib2) [119439.265159] Lustre: Skipped 72 previous similar messages [120104.800467] Lustre: oak-OST003c: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [120104.810085] Lustre: Skipped 29 previous similar messages [120707.570787] Lustre: oak-OST0056: Connection restored to 221021ef-24c0-4 (at 10.51.1.22@o2ib3) [120707.580432] Lustre: Skipped 115 previous similar messages [121326.102624] Lustre: oak-OST0032: Connection restored to (at 10.51.4.15@o2ib3) [121326.110792] Lustre: Skipped 59 previous similar messages [121932.190631] Lustre: oak-OST0050: Connection restored to bc175ba0-453f-4 (at 10.51.1.25@o2ib3) [121932.200251] Lustre: Skipped 20 previous similar messages [122540.427925] Lustre: oak-OST005c: Connection restored to c6abed43-2af3-4 (at 10.51.1.48@o2ib3) [122540.437544] Lustre: Skipped 22 previous similar messages [123140.486129] Lustre: oak-OST0038: Connection restored to d96a6637-b485-4 (at 10.50.12.11@o2ib2) [123140.495844] Lustre: Skipped 83 previous similar messages [123513.849183] Lustre: oak-OST003a: haven't heard from client 36b39f4b-d174-0e3d-1c9a-a8a925c9ef4f (at 10.210.12.7@tcp1) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bc4cc940800, cur 1618844158 expire 1618844008 last 1618843931 [123513.873561] Lustre: Skipped 155 previous similar messages [123745.634887] Lustre: oak-OST0046: Connection restored to 1d6f2d21-9fc7-4 (at 10.51.1.44@o2ib3) [123745.644562] Lustre: Skipped 66 previous similar messages [124346.094102] Lustre: oak-OST0030: Connection restored to 757d523d-626c-4 (at 10.50.1.5@o2ib2) [124346.103632] Lustre: Skipped 56 previous similar messages [124881.809226] Lustre: oak-OST0048: haven't heard from client ffc4eedc-95fb-f2a5-f042-d20ba5af295f (at 10.210.12.11@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be75c237800, cur 1618845526 expire 1618845376 last 1618845299 [124881.833701] Lustre: Skipped 23 previous similar messages [124882.317145] Lustre: oak-OST0038: haven't heard from client 8d33c1cf-17d5-8b19-c8eb-1afb1ae0801a (at 10.210.12.25@tcp1) in 173 seconds. I think it's dead, and I am evicting it. exp ffff8bc4c8525000, cur 1618845526 expire 1618845376 last 1618845353 [124882.341755] Lustre: Skipped 140 previous similar messages [125033.063231] Lustre: oak-OST0032: Connection restored to (at 10.51.1.43@o2ib3) [125033.071405] Lustre: Skipped 86 previous similar messages [125633.758500] Lustre: oak-OST005e: Connection restored to 221021ef-24c0-4 (at 10.51.1.22@o2ib3) [125633.768145] Lustre: Skipped 138 previous similar messages [126239.271380] Lustre: oak-OST003c: Connection restored to (at 10.50.1.58@o2ib2) [126239.279548] Lustre: Skipped 77 previous similar messages [126412.779796] Lustre: oak-OST004c: haven't heard from client e21e0c89-cf62-4b21-8259-179ee98a1647 (at 10.50.5.65@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bc4ce212c00, cur 1618847057 expire 1618846907 last 1618846830 [126412.804187] Lustre: Skipped 50 previous similar messages [126845.237897] Lustre: oak-OST004e: Connection restored to (at 10.51.1.43@o2ib3) [126845.246071] Lustre: Skipped 23 previous similar messages [127462.643113] Lustre: oak-OST0032: Connection restored to f35b4cd6-c6c1-4 (at 10.50.13.15@o2ib2) [127462.652836] Lustre: Skipped 80 previous similar messages [128067.741495] Lustre: oak-OST0052: Connection restored to 9925e6e6-5de6-4 (at 10.50.14.11@o2ib2) [128067.751216] Lustre: Skipped 79 previous similar messages [128598.726385] Lustre: oak-OST0044: haven't heard from client 18349d35-d7be-4 (at 10.51.14.24@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc4cbf44000, cur 1618849243 expire 1618849093 last 1618849016 [128598.748802] Lustre: Skipped 23 previous similar messages [128693.515577] Lustre: oak-OST005c: Connection restored to 3e7c55d0-08c5-4 (at 10.51.4.52@o2ib3) [128693.525197] Lustre: Skipped 54 previous similar messages [129125.266578] Lustre: oak-OST005a: Client 5a75a566-9084-8a55-841f-1bf333596a74 (at 10.51.0.17@o2ib3) reconnecting [129125.277944] Lustre: Skipped 3 previous similar messages [129155.303889] Lustre: oak-OST005e: Client f7a7c5e1-f0e4-4 (at 10.50.10.21@o2ib2) reconnecting [129155.313314] Lustre: Skipped 1 previous similar message [129295.668142] Lustre: oak-OST0046: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [129295.677852] Lustre: Skipped 172 previous similar messages [129528.858064] Lustre: oak-OST004c: Client 8b66e4db-fcc9-6215-2cf9-86ed141f42d8 (at 10.51.2.26@o2ib3) reconnecting [129528.869438] Lustre: Skipped 3 previous similar messages [129540.169183] LustreError: 209388:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0036: cli 182778a0-b920-4 claims 4169728 GRANT, real grant 32768 [129540.183500] LustreError: 209388:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 2 previous similar messages [129875.524103] LustreError: 
193195:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0030: cli 182778a0-b920-4 claims 4169728 GRANT, real grant 32768 [129906.633560] Lustre: oak-OST004e: Connection restored to (at 10.51.1.43@o2ib3) [129906.641774] Lustre: Skipped 146 previous similar messages [130024.975519] Lustre: oak-OST004a: Client 7535e7e6-b397-e9ec-ed05-2fc68240cd4c (at 10.51.4.38@o2ib3) reconnecting [130024.986893] Lustre: Skipped 1 previous similar message [130533.718833] Lustre: oak-OST0054: Connection restored to 712cedcb-4de0-4 (at 10.50.7.39@o2ib2) [130533.728454] Lustre: Skipped 514 previous similar messages [130575.674733] Lustre: oak-OST003c: haven't heard from client 2a543e62-414b-2d12-e7e0-ddd67f034061 (at 10.50.0.61@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be73144d000, cur 1618851220 expire 1618851070 last 1618850993 [130575.699106] Lustre: Skipped 191 previous similar messages [131133.981828] Lustre: oak-OST004e: Connection restored to (at 10.50.13.7@o2ib2) [131133.990006] Lustre: Skipped 135 previous similar messages [131524.653146] Lustre: oak-OST005c: haven't heard from client 8b961e9a-d5cb-4 (at 10.49.27.23@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bc4c7b3b800, cur 1618852169 expire 1618852019 last 1618851942 [131524.675569] Lustre: Skipped 23 previous similar messages [131743.355033] Lustre: oak-OST004e: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [131743.364672] Lustre: Skipped 230 previous similar messages [131843.870191] Lustre: oak-OST0048: Client 66bf6793-ef6e-d6d2-96ad-3432d26adbce (at 10.51.13.5@o2ib3) reconnecting [132346.589846] Lustre: oak-OST003a: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [132346.599503] Lustre: Skipped 178 previous similar messages [132652.773442] Lustre: oak-OST0048: Client 58f639a2-486d-a021-b489-a27356ac1157 (at 10.51.13.10@o2ib3) reconnecting [132652.773443] Lustre: oak-OST0038: Client 58f639a2-486d-a021-b489-a27356ac1157 (at 10.51.13.10@o2ib3) reconnecting [132947.367995] Lustre: oak-OST004e: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [132947.377689] Lustre: Skipped 230 previous similar messages [133318.830330] Lustre: oak-OST0042: Client f4f31fbb-c316-9d0e-dea6-2a23d0a9a983 (at 10.51.1.12@o2ib3) reconnecting [133349.609812] Lustre: oak-OST0032: haven't heard from client 6e322dfd-5998-3011-6a36-9aa7c79609e9 (at 10.50.0.61@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be713b21400, cur 1618853994 expire 1618853844 last 1618853767 [133349.634228] Lustre: Skipped 23 previous similar messages [133429.178276] LustreError: 193254:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0058: cli 182778a0-b920-4 claims 987136 GRANT, real grant 32768 [133548.514502] Lustre: oak-OST0044: Connection restored to 7b6b5d3b-ca43-4 (at 10.50.15.1@o2ib2) [133548.524120] Lustre: Skipped 185 previous similar messages [133617.608779] Lustre: oak-OST0044: haven't heard from client 01731016-3eaf-b47d-6b11-10b8893be38d (at 10.210.12.21@tcp1) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bc4cbf45c00, cur 1618854262 expire 1618854112 last 1618854035 [133617.633248] Lustre: Skipped 23 previous similar messages [133806.600630] Lustre: oak-OST004e: haven't heard from client d85dc65e-c3dc-3610-40a7-d2e62ee65f1c (at 10.210.12.29@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be7314f9c00, cur 1618854451 expire 1618854301 last 1618854224 [133806.625087] Lustre: Skipped 95 previous similar messages [134154.047641] Lustre: oak-OST0030: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [134154.057274] Lustre: Skipped 155 previous similar messages [134754.975490] Lustre: oak-OST0036: Connection restored to 473ec31f-2480-4 (at 10.50.14.9@o2ib2) [134754.985133] Lustre: Skipped 193 previous similar messages [135355.020536] Lustre: oak-OST0056: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [135355.030179] Lustre: Skipped 193 previous similar messages [135981.166125] Lustre: oak-OST005c: Connection restored to (at 10.50.13.7@o2ib2) [135981.174307] Lustre: Skipped 117 previous similar messages [136584.973180] Lustre: oak-OST0058: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [136584.982815] Lustre: Skipped 133 previous similar messages [137078.657090] Lustre: oak-OST0040: Client 5514f76a-ebfd-4c1b-c28e-c2a228cc78fe (at 10.51.1.18@o2ib3) reconnecting [137078.668459] Lustre: Skipped 1 previous similar message [137186.037245] Lustre: oak-OST004c: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [137186.046916] Lustre: Skipped 216 previous similar messages [137251.319948] LNet: 182051:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [137251.332988] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be722756000 [137251.345142] LustreError: 193113:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be715e0c850 x1691356352143424/t0(0) 
o4->d55d324b-c685-4@10.51.6.4@o2ib3:578/0 lens 488/448 e 0 to 0 dl 1618857988 ref 1 fl Interpret:/0/0 rc 0/0 [137251.345149] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be71640d800 [137251.345163] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be977ca6c00 [137251.345177] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be71640e000 [137251.345193] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be977ca2800 [137251.345205] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be6a9cfa800 [137251.345222] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc4c7b3b800 [137251.345373] Lustre: oak-OST0056: Bulk IO write error with d55d324b-c685-4 (at 10.51.6.4@o2ib3), client will retry: rc = -110 [137251.345375] Lustre: Skipped 4 previous similar messages [137251.462016] LustreError: 193113:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [137259.525230] Lustre: oak-OST003c: haven't heard from client f5b2c82a-d658-25db-617b-91df2a8d0304 (at 10.50.0.61@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8be760d94800, cur 1618857904 expire 1618857754 last 1618857677 [137259.549616] Lustre: Skipped 47 previous similar messages [137300.068251] LustreError: 193413:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be973f98850 x1695831165397632/t0(0) o3->eab9e2cf-af4c-4@10.51.15.1@o2ib3:578/0 lens 488/440 e 0 to 0 dl 1618857988 ref 1 fl Interpret:/0/0 rc 0/0 [137300.068258] LustreError: 193443:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be6c118f850 x1695831165402112/t0(0) o3->eab9e2cf-af4c-4@10.51.15.1@o2ib3:578/0 lens 488/440 e 0 to 0 dl 1618857988 ref 1 fl Interpret:/0/0 rc 0/0 [137300.068260] LustreError: 193187:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 3145728(4194304) req@ffff8be715e08850 x1691356352143488/t0(0) o4->d55d324b-c685-4@10.51.6.4@o2ib3:578/0 lens 488/448 e 0 to 0 dl 1618857988 ref 1 fl Interpret:/0/0 rc 0/0 [137300.068341] Lustre: oak-OST004e: Bulk IO read error with 330d404b-804c-4 (at 10.51.15.3@o2ib3), client will retry: rc -110 [137300.068343] Lustre: Skipped 1 previous similar message [137300.068553] Lustre: oak-OST0056: Bulk IO write error with d55d324b-c685-4 (at 10.51.6.4@o2ib3), client will retry: rc = -110 [137300.068554] Lustre: Skipped 4 previous similar messages [137300.183287] LustreError: 193413:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 3 previous similar messages [137325.067572] LustreError: 193455:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be723186850 x1695831165404672/t0(0) o3->eab9e2cf-af4c-4@10.51.15.1@o2ib3:583/0 lens 488/440 e 0 to 0 dl 1618857993 ref 1 fl Interpret:/0/0 rc 0/0 [137325.092995] LustreError: 193455:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [137325.103743] Lustre: oak-OST0042: Bulk IO read error with eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3), client will retry: rc -110 [137325.116191] Lustre: Skipped 2 previous similar messages [137366.483544] Lustre: 
oak-OST0036: Client 8ab7929a-8a09-4 (at 10.51.0.72@o2ib3) reconnecting [137418.804619] Lustre: oak-OST0056: Client d55d324b-c685-4 (at 10.51.6.4@o2ib3) reconnecting [137422.678037] Lustre: oak-OST0048: Client eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) reconnecting [137429.879274] Lustre: oak-OST004e: Client 330d404b-804c-4 (at 10.51.15.3@o2ib3) reconnecting [137443.281571] Lustre: oak-OST0042: Client eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) reconnecting [137658.902524] Lustre: oak-OST005c: Client e71e5ef0-3cf7-788c-e457-05fd6b805527 (at 10.51.13.14@o2ib3) reconnecting [137786.455780] Lustre: oak-OST005c: Connection restored to 28c98064-89b8-4 (at 10.51.5.18@o2ib3) [137786.465399] Lustre: Skipped 160 previous similar messages [137826.679525] Lustre: oak-OST004e: Client 2dce41e5-6cf1-0747-6975-01d951e9d8ac (at 10.51.1.46@o2ib3) reconnecting [137826.690894] Lustre: Skipped 1 previous similar message [138296.534874] Lustre: oak-OST0046: Client 276a104e-efee-5359-cb9c-b91182f956c5 (at 10.51.0.15@o2ib3) reconnecting [138389.562119] Lustre: oak-OST004a: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [138389.571740] Lustre: Skipped 224 previous similar messages [138638.237611] Lustre: oak-OST0042: Client f46b1f75-6ac7-4 (at 10.51.15.6@o2ib3) reconnecting [138638.246950] Lustre: Skipped 4 previous similar messages [138785.492487] Lustre: oak-OST005a: haven't heard from client 1fde5141-ceea-4 (at 10.49.27.23@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bab0aba7800, cur 1618859430 expire 1618859280 last 1618859203 [138785.514912] Lustre: Skipped 23 previous similar messages [139001.781025] Lustre: oak-OST003e: Connection restored to (at 10.50.14.15@o2ib2) [139001.789287] Lustre: Skipped 185 previous similar messages [139481.715972] Lustre: oak-OST005c: Client b1aa0e56-7229-b79d-2adf-6fd2266928f7 (at 10.51.12.6@o2ib3) reconnecting [139481.727336] Lustre: Skipped 7 previous similar messages [139598.710431] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [139598.724912] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7054a3800 [139598.737110] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be721e0d400 [139598.749258] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be719d87800 [139598.761410] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be976cf7000 [139598.761412] LustreError: 193253:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be725fb5850 x1691356588881664/t0(0) o4->d55d324b-c685-4@10.51.6.4@o2ib3:655/0 lens 488/448 e 0 to 0 dl 1618860330 ref 1 fl Interpret:/0/0 rc 0/0 [139598.761626] Lustre: oak-OST0056: Bulk IO write error with d55d324b-c685-4 (at 10.51.6.4@o2ib3), client will retry: rc = -110 [139598.761628] Lustre: Skipped 1 previous similar message [139598.817330] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be976cf7000 [139605.582101] Lustre: oak-OST0050: Connection restored to aa28aed1-98fc-4 (at 10.51.1.38@o2ib3) [139605.591719] Lustre: Skipped 272 previous similar messages [139650.112607] LustreError: 187418:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 131072(1179648) req@ffff8be973e05850 x1695831333765824/t0(0) 
o3->eab9e2cf-af4c-4@10.51.15.1@o2ib3:655/0 lens 488/440 e 0 to 0 dl 1618860330 ref 1 fl Interpret:/0/0 rc 0/0 [139650.112657] LustreError: 242901:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be721df7050 x1691356588901696/t0(0) o4->d55d324b-c685-4@10.51.6.4@o2ib3:659/0 lens 488/448 e 0 to 0 dl 1618860334 ref 1 fl Interpret:/0/0 rc 0/0 [139650.112659] LustreError: 242901:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [139650.112663] LustreError: 193427:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4059136) req@ffff8be718e56050 x1689658753300800/t0(0) o4->bc22dc9e-c3bf-4@10.51.4.44@o2ib3:659/0 lens 488/448 e 0 to 0 dl 1618860334 ref 1 fl Interpret:/0/0 rc 0/0 [139650.112665] LustreError: 193427:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 2 previous similar messages [139650.112714] Lustre: oak-OST0050: Bulk IO read error with afb0548c-9287-4 (at 10.51.15.2@o2ib3), client will retry: rc -110 [139650.112895] Lustre: oak-OST0038: Bulk IO write error with bc22dc9e-c3bf-4 (at 10.51.4.44@o2ib3), client will retry: rc = -110 [139650.112896] Lustre: Skipped 1 previous similar message [139650.243302] LustreError: 187418:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 11 previous similar messages [139741.958053] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.4.44@o2ib3 ns: filter-oak-OST005c_UUID lock: ffff8ba9a1b4ad00/0xf81cb91faac3300 lrc: 4/0,0 mode: PR/PR res: [0x21527d8:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.4.44@o2ib3 remote: 0x3969149f36a6b578 expref: 9 pid: 193017 timeout: 139744 lvb_type: 1 [139742.004269] LustreError: 187525:0:(ldlm_lockd.c:1351:ldlm_handle_enqueue0()) ### lock on destroyed export ffff8be718a63400 ns: filter-oak-OST005c_UUID lock: ffff8be4ef3b5e80/0xf81cb91faac351b lrc: 3/0,0 mode: --/PW res: 
[0x21527d8:0x0:0x0].0x0 rrc: 3 type: EXT [0->4194303] (req 0->4194303) flags: 0x50000000020000 nid: 10.51.4.44@o2ib3 remote: 0x3969149f36a6b5a2 expref: 9 pid: 187525 timeout: 0 lvb_type: 0 [140212.583525] Lustre: oak-OST0050: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [140212.593152] Lustre: Skipped 237 previous similar messages [140327.749034] Lustre: oak-OST0030: Client 02703246-1d69-353e-49ea-eafa755b5258 (at 10.50.5.8@o2ib2) reconnecting [140327.749035] Lustre: oak-OST0044: Client 02703246-1d69-353e-49ea-eafa755b5258 (at 10.50.5.8@o2ib2) reconnecting [140327.749036] Lustre: oak-OST004e: Client 02703246-1d69-353e-49ea-eafa755b5258 (at 10.50.5.8@o2ib2) reconnecting [140327.749044] Lustre: Skipped 114 previous similar messages [140327.749045] Lustre: Skipped 114 previous similar messages [140827.342137] Lustre: oak-OST0030: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [140827.351756] Lustre: Skipped 153 previous similar messages [141428.325825] Lustre: oak-OST0040: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [141428.335485] Lustre: Skipped 128 previous similar messages [141981.743033] Lustre: oak-OST0052: Client 6cbff61b-9898-4 (at 10.51.13.9@o2ib3) reconnecting [141981.752390] Lustre: Skipped 14 previous similar messages [142038.191044] Lustre: oak-OST0034: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [142038.200665] Lustre: Skipped 231 previous similar messages [142063.208122] Lustre: oak-OST0054: Client ea3c0abc-03bf-9064-4b06-a2deb2e4ec48 (at 10.51.4.35@o2ib3) reconnecting [142063.208124] Lustre: oak-OST0044: Client ea3c0abc-03bf-9064-4b06-a2deb2e4ec48 (at 10.51.4.35@o2ib3) reconnecting [142063.208127] Lustre: Skipped 8 previous similar messages [142541.124929] Lustre: oak-OST0036: Client 11c92c9a-5a17-4 (at 10.51.2.27@o2ib3) reconnecting [142648.869406] Lustre: oak-OST0052: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [142648.879030] Lustre: Skipped 193 previous similar 
messages [143269.137182] Lustre: oak-OST003a: Connection restored to 3b84538d-3753-4 (at 10.49.25.15@o2ib1) [143269.146899] Lustre: Skipped 139 previous similar messages [143618.386127] Lustre: oak-OST005c: haven't heard from client cdbf21e9-b1cc-922f-5264-a12b614a63b5 (at 10.210.12.81@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc4c7abec00, cur 1618864263 expire 1618864113 last 1618864036 [143618.410583] Lustre: Skipped 23 previous similar messages [143618.894563] Lustre: oak-OST0052: haven't heard from client 8960c7b8-f4a0-5a29-fcf4-36baf0d013f5 (at 10.210.12.78@tcp1) in 216 seconds. I think it's dead, and I am evicting it. exp ffff8be71ff91400, cur 1618864263 expire 1618864113 last 1618864047 [143618.919020] Lustre: Skipped 153 previous similar messages [143876.797030] Lustre: oak-OST0032: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [143876.806648] Lustre: Skipped 109 previous similar messages [144259.250065] Lustre: oak-OST0048: Client 4eca94b5-20cc-8fd4-b8fd-ebd10e301645 (at 10.51.1.21@o2ib3) reconnecting [144259.261430] Lustre: Skipped 10 previous similar messages [144483.129200] Lustre: oak-OST0044: Connection restored to c5674af2-0019-4 (at 10.51.5.2@o2ib3) [144483.138722] Lustre: Skipped 429 previous similar messages [144929.341581] Lustre: oak-OST004c: haven't heard from client 49040081-f174-4 (at 10.49.0.71@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bc4fe1f9400, cur 1618865574 expire 1618865424 last 1618865347 [144929.363980] Lustre: Skipped 37 previous similar messages [145085.538554] Lustre: oak-OST0030: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [145085.548184] Lustre: Skipped 218 previous similar messages [145685.932957] Lustre: oak-OST0034: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [145685.942578] Lustre: Skipped 331 previous similar messages [146289.892365] Lustre: oak-OST004a: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [146289.902027] Lustre: Skipped 161 previous similar messages [146931.589231] Lustre: oak-OST0034: Connection restored to 13ebd6ae-c776-4 (at 10.51.6.8@o2ib3) [146931.598789] Lustre: Skipped 123 previous similar messages [147366.506013] Lustre: oak-OST0038: Client f4f31fbb-c316-9d0e-dea6-2a23d0a9a983 (at 10.51.1.12@o2ib3) reconnecting [147366.506014] Lustre: oak-OST0036: Client f4f31fbb-c316-9d0e-dea6-2a23d0a9a983 (at 10.51.1.12@o2ib3) reconnecting [147366.506016] Lustre: Skipped 1 previous similar message [147532.341393] Lustre: oak-OST0032: Connection restored to d96a6637-b485-4 (at 10.50.12.11@o2ib2) [147532.351146] Lustre: Skipped 111 previous similar messages [147719.221333] Lustre: oak-OST004a: Client 56a5a766-0782-0626-7e81-90dde2e2789a (at 10.51.2.28@o2ib3) reconnecting [147719.232700] Lustre: Skipped 3 previous similar messages [148139.479551] Lustre: oak-OST0054: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [148139.489325] Lustre: Skipped 178 previous similar messages [148378.956466] Lustre: oak-OST0042: Client 60fdba1c-732a-b99c-63c8-be091395f5f0 (at 10.51.12.5@o2ib3) reconnecting [148378.967829] Lustre: Skipped 2 previous similar messages [148547.645717] Lustre: oak-OST005c: Client 7535e7e6-b397-e9ec-ed05-2fc68240cd4c (at 10.51.4.38@o2ib3) reconnecting [148547.657085] Lustre: Skipped 3 previous similar messages [148747.636294] Lustre: oak-OST003a: Connection restored to fd16aff2-0371-4 (at 
10.51.4.33@o2ib3) [148747.645919] Lustre: Skipped 222 previous similar messages [149349.837293] Lustre: oak-OST0032: Connection restored to f04c53ee-80b0-4 (at 10.51.4.23@o2ib3) [149349.846922] Lustre: Skipped 385 previous similar messages [149555.002516] Lustre: oak-OST0050: Client 10bbb35d-b69c-3a09-088b-4bac90b1107f (at 10.51.13.1@o2ib3) reconnecting [149555.013880] Lustre: Skipped 4 previous similar messages [149556.125517] Lustre: oak-OST005c: Client 14270342-34f5-f84f-318c-4252a9121c55 (at 10.51.12.22@o2ib3) reconnecting [149557.782048] Lustre: oak-OST004c: Client 33016c70-be50-803e-0431-7a51cafca4a6 (at 10.51.14.19@o2ib3) reconnecting [149557.793511] Lustre: Skipped 1 previous similar message [149578.204982] Lustre: oak-OST0050: Client 54153ede-ddc5-4 (at 10.51.2.1@o2ib3) reconnecting [149578.214215] Lustre: Skipped 1 previous similar message [149811.902465] Lustre: oak-OST0032: Client ddf32ebb-67b7-cd30-ef01-7bc176ae5daa (at 10.51.4.9@o2ib3) reconnecting [149811.902466] Lustre: oak-OST005e: Client ddf32ebb-67b7-cd30-ef01-7bc176ae5daa (at 10.51.4.9@o2ib3) reconnecting [149811.925007] Lustre: Skipped 2 previous similar messages [149911.994867] Lustre: oak-OST0050: Client c1ada0a4-de2a-c24a-4d28-c593030bd6bb (at 10.51.6.21@o2ib3) reconnecting [149911.994868] Lustre: oak-OST0040: Client c1ada0a4-de2a-c24a-4d28-c593030bd6bb (at 10.51.6.21@o2ib3) reconnecting [149911.994871] Lustre: Skipped 5 previous similar messages [149915.452943] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [149915.466914] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7226d1800 [149915.479075] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7226d6c00 [149915.491264] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7226d6c00 [149915.503444] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be70d7f0000 [149915.515678] LustreError: 193425:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be7575cb850 x1689669152960960/t0(0) o4->e518c6df-47a3-4@10.51.5.25@o2ib3:406/0 lens 488/448 e 0 to 0 dl 1618870651 ref 1 fl Interpret:/0/0 rc 0/0 [149915.541117] LustreError: 193425:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [149915.551784] Lustre: oak-OST003e: Bulk IO write error with e518c6df-47a3-4 (at 10.51.5.25@o2ib3), client will retry: rc = -110 [149915.564502] Lustre: Skipped 5 previous similar messages [149950.310254] Lustre: oak-OST0042: Connection restored to 7d4930b0-1dcd-4 (at 10.51.13.8@o2ib3) [149950.319899] Lustre: Skipped 602 previous similar messages [149974.372966] LustreError: 193404:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be71644e050 x1689650601045952/t0(0) o3->2b92a59a-4428-4@10.51.5.64@o2ib3:408/0 lens 488/440 e 0 to 0 dl 1618870653 ref 1 fl Interpret:/0/0 rc 0/0 [149974.373138] Lustre: oak-OST005c: Bulk IO read error with 2b92a59a-4428-4 (at 10.51.5.64@o2ib3), client will retry: rc -110 [149974.373139] Lustre: Skipped 13 previous similar messages [149974.416742] LustreError: 193404:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [150013.591394] Lustre: oak-OST005e: Client 9d061e04-8567-4 (at 10.51.0.71@o2ib3) reconnecting [150013.600725] Lustre: Skipped 5 previous similar messages [150099.137062] LustreError: 137-5: oak-OST004b_UUID: not available for connect from 10.51.5.3@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [150120.037625] LustreError: 137-5: oak-OST005d_UUID: not available for connect from 10.51.13.24@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[150185.691584] Lustre: oak-OST005c: Client eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) reconnecting [150185.700928] Lustre: Skipped 13 previous similar messages [150186.668100] LustreError: 193435:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be6a56e0850 x1695832025287808/t0(0) o3->eab9e2cf-af4c-4@10.51.15.1@o2ib3:683/0 lens 488/440 e 0 to 0 dl 1618870928 ref 1 fl Interpret:/0/0 rc 0/0 [150186.693084] Lustre: oak-OST005c: Bulk IO read error with eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3), client will retry: rc -110 [150186.705523] Lustre: Skipped 1 previous similar message [150232.020457] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [150232.035095] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bc4deb5e400 [150232.047259] LustreError: 209396:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be721cd0850 x1689650470557120/t0(0) o4->e633bac4-d67b-4@10.51.4.63@o2ib3:723/0 lens 488/448 e 0 to 0 dl 1618870968 ref 1 fl Interpret:/0/0 rc 0/0 [150232.072826] Lustre: oak-OST0030: Bulk IO write error with e633bac4-d67b-4 (at 10.51.4.63@o2ib3), client will retry: rc = -110 [150299.395574] LustreError: 193449:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1231121) req@ffff8be71fdd4850 x1685024750533056/t0(0) o4->a28c7102-5eb9-4@10.51.2.57@o2ib3:723/0 lens 488/448 e 0 to 0 dl 1618870968 ref 1 fl Interpret:/0/0 rc 0/0 [150299.395783] Lustre: oak-OST0038: Bulk IO write error with 61fe4605-6b08-4 (at 10.51.5.54@o2ib3), client will retry: rc = -110 [150299.434869] LustreError: 193449:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 7 previous similar messages [150379.713723] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.13.6@o2ib3 ns: filter-oak-OST004c_UUID lock: ffff8bbac5474380/0xf81cb91fafb51f7 lrc: 
4/0,0 mode: PW/PW res: [0x1f4412d:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->200703) flags: 0x60000400010020 nid: 10.51.13.6@o2ib3 remote: 0xd45dcbf2613cc5da expref: 6 pid: 188280 timeout: 150382 lvb_type: 0 [150398.790316] Lustre: 193121:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618870870/real 1618870870] req@ffff8bc32468ba80 x1697353844682048/t0(0) o106->oak-OST0050@10.51.4.59@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1618871043 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [150399.333294] Lustre: 193037:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618870871/real 1618870871] req@ffff8bd41fd8d100 x1697353844682240/t0(0) o104->oak-OST0058@10.51.13.6@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1618871044 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [150399.363953] Lustre: 193037:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [150403.119179] Lustre: 193077:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618870874/real 1618870874] req@ffff8bcba8aec800 x1697353844682368/t0(0) o104->oak-OST0046@10.51.13.6@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1618871047 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [150550.918448] Lustre: oak-OST0058: Connection restored to 0e941ceb-3e75-4 (at 10.51.4.53@o2ib3) [150550.928067] Lustre: Skipped 931 previous similar messages [150552.553464] Lustre: oak-OST0046: Client ece9f754-b231-f7f0-99a6-c8016ca66797 (at 10.51.1.32@o2ib3) reconnecting [150552.564829] Lustre: Skipped 298 previous similar messages [150593.245620] LustreError: 137-5: oak-OST0043_UUID: not available for connect from 10.51.14.2@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[150780.028268] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [150780.041583] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bc4c7591000 [150780.053731] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bbe1949dc00 [150780.053737] LustreError: 193406:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bca55b51050 x1688873315364288/t0(0) o4->fd16aff2-0371-4@10.51.4.33@o2ib3:514/0 lens 488/448 e 0 to 0 dl 1618871514 ref 1 fl Interpret:/0/0 rc 0/0 [150780.053945] Lustre: oak-OST0048: Bulk IO write error with fd16aff2-0371-4 (at 10.51.4.33@o2ib3), client will retry: rc = -110 [150780.053946] Lustre: Skipped 4 previous similar messages [150780.110041] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bca6cbf6c00 [150780.122229] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bca6cbf6c00 [150780.134389] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bb1f29c7800 [150824.426188] LustreError: 193439:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8be731a5c850 x1688873315366528/t0(0) o4->fd16aff2-0371-4@10.51.4.33@o2ib3:514/0 lens 488/448 e 0 to 0 dl 1618871514 ref 1 fl Interpret:/0/0 rc 0/0 [150824.426241] LustreError: 193198:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be6fcee8850 x1695832034463872/t0(0) o3->eab9e2cf-af4c-4@10.51.15.1@o2ib3:517/0 lens 488/440 e 0 to 0 dl 1618871517 ref 1 fl Interpret:/0/0 rc 0/0 [150824.426242] LustreError: 193198:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [150824.426355] Lustre: oak-OST0056: Bulk IO read error with eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3), client will retry: rc -110 [150824.501674] Lustre: 
oak-OST0042: Bulk IO write error with fd16aff2-0371-4 (at 10.51.4.33@o2ib3), client will retry: rc = -110 [150824.514429] Lustre: Skipped 1 previous similar message [151141.177326] LustreError: 209398:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be91efaf850 x1696871516448448/t0(0) o4->8cd0fce8-38e8-d9db-a09e-3d56f1d88309@10.51.2.5@o2ib3:108/0 lens 488/448 e 0 to 0 dl 1618871863 ref 1 fl Interpret:/0/0 rc 0/0 [151141.178484] Lustre: oak-OST0040: Bulk IO write error with 8cd0fce8-38e8-d9db-a09e-3d56f1d88309 (at 10.51.2.5@o2ib3), client will retry: rc = -110 [151141.218956] LustreError: 209398:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [151148.977081] LNet: 3985:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [151148.991007] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bb6bc6dc800 [151151.096075] Lustre: oak-OST0058: Connection restored to a634e013-aa8e-4 (at 10.50.16.3@o2ib2) [151151.105700] Lustre: Skipped 1680 previous similar messages [151154.670048] Lustre: oak-OST005e: Client eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) reconnecting [151154.679377] Lustre: Skipped 37 previous similar messages [151199.445155] LustreError: 193441:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be91edc7850 x1689649251048384/t0(0) o4->0593dea7-ca9c-4@10.51.5.49@o2ib3:128/0 lens 488/448 e 0 to 0 dl 1618871883 ref 1 fl Interpret:/0/0 rc 0/0 [151199.445175] LustreError: 193190:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 640523(1689099) req@ffff8be72268c850 x1685052281741248/t0(0) o4->69e54a80-d9dd-4@10.51.2.37@o2ib3:132/0 lens 488/448 e 0 to 0 dl 1618871887 ref 1 fl Interpret:/0/0 rc 0/0 [151199.445225] Lustre: oak-OST005a: Bulk IO write error with bc22dc9e-c3bf-4 (at 10.51.4.44@o2ib3), client will retry: rc = -110 [151199.445226] Lustre: Skipped 2 previous similar messages 
[151199.515724] LustreError: 193441:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [151298.692752] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.2.49@o2ib3 ns: filter-oak-OST0056_UUID lock: ffff8bceccfe57c0/0xf81cb91fb0e9959 lrc: 3/0,0 mode: PW/PW res: [0x2457234:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->1232895) flags: 0x60000400010020 nid: 10.51.2.49@o2ib3 remote: 0xf3bb15bf05d549e6 expref: 6 pid: 193029 timeout: 151301 lvb_type: 0 [151298.737696] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 2 previous similar messages [151330.239794] Lustre: oak-OST0046: haven't heard from client a62f3c38-26b7-4 (at 10.51.5.9@o2ib3) in 214 seconds. I think it's dead, and I am evicting it. exp ffff8bad37caec00, cur 1618871975 expire 1618871825 last 1618871761 [151330.262024] Lustre: Skipped 23 previous similar messages [151341.199070] Lustre: oak-OST0034: haven't heard from client a62f3c38-26b7-4 (at 10.51.5.9@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc17cc08c00, cur 1618871986 expire 1618871836 last 1618871759 [151349.254307] Lustre: oak-OST004e: haven't heard from client a62f3c38-26b7-4 (at 10.51.5.9@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc49fd92c00, cur 1618871994 expire 1618871844 last 1618871767 [151406.228061] Lustre: oak-OST0046: haven't heard from client 3ef63a30-e195-618c-1a97-63bf4208c795 (at 10.49.0.71@o2ib1) in 162 seconds. I think it's dead, and I am evicting it. exp ffff8bc4ce2ba400, cur 1618872051 expire 1618871901 last 1618871889 [151471.206349] Lustre: oak-OST0034: haven't heard from client 3ef63a30-e195-618c-1a97-63bf4208c795 (at 10.49.0.71@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bab05b89800, cur 1618872116 expire 1618871966 last 1618871889 [151692.005916] LustreError: 193413:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0038: cli 11c92c9a-5a17-4 claims 1925120 GRANT, real grant 0 [151745.612051] LNet: 50607:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.2.5@o2ib3 portal 16 match 1697353846405056 offset 192 length 192: 4 [151754.806555] Lustre: oak-OST004c: Connection restored to 702d9553-ca91-4 (at 10.51.5.62@o2ib3) [151754.816177] Lustre: Skipped 927 previous similar messages [151989.164409] Lustre: oak-OST005e: Client 90752487-0b3e-1696-21a1-6c81abc18872 (at 10.51.1.2@o2ib3) reconnecting [151989.164410] Lustre: oak-OST004e: Client 90752487-0b3e-1696-21a1-6c81abc18872 (at 10.51.1.2@o2ib3) reconnecting [151989.164413] Lustre: Skipped 278 previous similar messages [152111.955526] LNet: 182051:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [152111.969386] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbc6ce0e400 [152111.981599] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bac403abc00 [152111.993806] LustreError: 187416:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be7aa9f4050 x1690570959198720/t0(0) o4->413272a1-a696-4@10.51.2.13@o2ib3:338/0 lens 488/448 e 0 to 0 dl 1618872848 ref 1 fl Interpret:/0/0 rc 0/0 [152112.019404] Lustre: oak-OST0034: Bulk IO write error with 413272a1-a696-4 (at 10.51.2.13@o2ib3), client will retry: rc = -110 [152112.032125] Lustre: Skipped 4 previous similar messages [152174.499537] LustreError: 193455:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2072576(4169728) req@ffff8be760688050 x1690570959199232/t0(0) o4->413272a1-a696-4@10.51.2.13@o2ib3:338/0 lens 504/448 e 0 to 0 dl 1618872848 ref 1 fl Interpret:/0/0 rc 0/0 [152174.499589] LustreError: 
193450:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be6f6db1850 x1696864898419520/t0(0) o3->de0b71dd-918a-dbdd-3442-448a7d2edf2a@10.51.6.3@o2ib3:342/0 lens 488/440 e 0 to 0 dl 1618872852 ref 1 fl Interpret:/0/0 rc 0/0 [152174.499772] Lustre: oak-OST0030: Bulk IO read error with de0b71dd-918a-dbdd-3442-448a7d2edf2a (at 10.51.6.3@o2ib3), client will retry: rc -110 [152174.499773] Lustre: Skipped 1 previous similar message [152174.499786] Lustre: oak-OST0034: Bulk IO write error with 413272a1-a696-4 (at 10.51.2.13@o2ib3), client will retry: rc = -110 [152174.586357] LustreError: 193455:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 4 previous similar messages [152355.364653] Lustre: oak-OST003a: Connection restored to (at 10.51.2.21@o2ib3) [152355.372818] Lustre: Skipped 1245 previous similar messages [152733.983647] LNet: 182051:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [152733.997832] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bb2526f2000 [152734.010026] LustreError: 187417:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bca79b4c850 x1689699508334016/t0(0) o4->2bbfb89a-a909-4@10.51.5.29@o2ib3:205/0 lens 488/448 e 0 to 0 dl 1618873470 ref 1 fl Interpret:/0/0 rc 0/0 [152734.035676] Lustre: oak-OST0042: Bulk IO write error with 2bbfb89a-a909-4 (at 10.51.5.29@o2ib3), client will retry: rc = -110 [152734.048400] Lustre: Skipped 1 previous similar message [152792.046794] Lustre: oak-OST0042: Client 4c32b5fb-e821-4 (at 10.51.2.65@o2ib3) reconnecting [152792.046795] Lustre: oak-OST0032: Client 4c32b5fb-e821-4 (at 10.51.2.65@o2ib3) reconnecting [152792.046797] Lustre: Skipped 12 previous similar messages [152799.522199] LustreError: 193438:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8be71f9d9850 x1689699508334144/t0(0) 
o4->2bbfb89a-a909-4@10.51.5.29@o2ib3:206/0 lens 488/448 e 0 to 0 dl 1618873471 ref 1 fl Interpret:/0/0 rc 0/0 [152799.522253] Lustre: oak-OST0042: Bulk IO write error with bc22dc9e-c3bf-4 (at 10.51.4.44@o2ib3), client will retry: rc = -110 [152799.561525] LustreError: 193438:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 5 previous similar messages [152967.429761] Lustre: oak-OST004e: Connection restored to 965a6346-6505-4 (at 10.50.4.27@o2ib2) [152967.439381] Lustre: Skipped 862 previous similar messages [153570.077782] Lustre: oak-OST0050: Connection restored to 5617d6c7-45fc-4 (at 10.50.3.29@o2ib2) [153570.087402] Lustre: Skipped 259 previous similar messages [154173.719975] Lustre: oak-OST004e: Connection restored to 735de3af-f84e-4 (at 10.50.10.58@o2ib2) [154173.729691] Lustre: Skipped 324 previous similar messages [154774.124538] Lustre: oak-OST0040: Connection restored to d75d08df-322b-4 (at 10.50.1.16@o2ib2) [154774.134175] Lustre: Skipped 292 previous similar messages [154803.025090] Lustre: oak-OST0058: Client eb8cea22-3545-c4b8-6cb6-b3e875ecfb11 (at 10.51.1.23@o2ib3) reconnecting [154803.036474] Lustre: Skipped 636 previous similar messages [154928.846393] Lustre: oak-OST005c: Client 07e1f920-cffb-b4f8-01fb-b3be1cdfffbf (at 10.51.15.9@o2ib3) reconnecting [154928.846394] Lustre: oak-OST0054: Client 07e1f920-cffb-b4f8-01fb-b3be1cdfffbf (at 10.51.15.9@o2ib3) reconnecting [154928.846397] Lustre: Skipped 19 previous similar messages [155442.541610] Lustre: oak-OST0030: Connection restored to deb5e22c-cd1d-4 (at 10.50.1.18@o2ib2) [155442.551231] Lustre: Skipped 170 previous similar messages [155882.093074] Lustre: oak-OST0044: haven't heard from client b6e09526-781a-c521-2fe4-8702e9631c57 (at 10.49.0.71@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bbbf943b400, cur 1618876527 expire 1618876377 last 1618876300 [155882.117475] Lustre: Skipped 22 previous similar messages [156047.677232] Lustre: oak-OST0058: Connection restored to 1f6b5cda-3dda-4 (at 10.51.1.11@o2ib3) [156047.686900] Lustre: Skipped 132 previous similar messages [156676.635703] Lustre: oak-OST0030: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [156676.645324] Lustre: Skipped 36 previous similar messages [157279.132808] Lustre: oak-OST004a: Connection restored to 7624d599-2451-4 (at 10.50.16.8@o2ib2) [157279.142428] Lustre: Skipped 98 previous similar messages [157531.061182] Lustre: oak-OST005c: haven't heard from client 958535dc-7351-84b7-4ed4-3707f26a5315 (at 10.210.12.9@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc4c7abe800, cur 1618878176 expire 1618878026 last 1618877949 [157531.085564] Lustre: Skipped 23 previous similar messages [157885.808225] Lustre: oak-OST0044: Connection restored to 57ffd6e4-3c54-4 (at 10.50.8.13@o2ib2) [157885.817849] Lustre: Skipped 559 previous similar messages [158277.064164] Lustre: oak-OST0030: haven't heard from client 29942c6d-c40b-6f43-d806-7f42a0053b59 (at 10.49.0.71@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bbe649ac800, cur 1618878922 expire 1618878772 last 1618878695 [158277.088579] Lustre: Skipped 23 previous similar messages [158502.350558] Lustre: oak-OST0040: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [158502.360185] Lustre: Skipped 88 previous similar messages [159107.630293] Lustre: oak-OST0044: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [159107.639913] Lustre: Skipped 117 previous similar messages [159711.991116] Lustre: oak-OST0054: Connection restored to 0b23c279-4486-4 (at 10.50.14.8@o2ib2) [159712.000745] Lustre: Skipped 87 previous similar messages [160315.470055] Lustre: oak-OST005a: Connection restored to (at 10.50.14.13@o2ib2) [160315.478329] Lustre: Skipped 154 previous similar messages [160920.100960] Lustre: oak-OST003c: Connection restored to afb0548c-9287-4 (at 10.51.15.2@o2ib3) [160920.110584] Lustre: Skipped 253 previous similar messages [161079.978123] Lustre: oak-OST0048: haven't heard from client 3f96f799-5f53-1046-e142-d616857570ac (at 10.49.0.71@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bb23ff7a000, cur 1618881725 expire 1618881575 last 1618881498 [161080.002482] Lustre: Skipped 23 previous similar messages [161523.642528] Lustre: oak-OST0050: Connection restored to 0d2b66dc-1b19-4 (at 10.50.8.48@o2ib2) [161523.652152] Lustre: Skipped 201 previous similar messages [162144.735698] Lustre: oak-OST0030: Connection restored to 266ca40f-6c8b-4 (at 10.51.4.54@o2ib3) [162144.745331] Lustre: Skipped 113 previous similar messages [162747.601510] Lustre: oak-OST0030: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [162747.611134] Lustre: Skipped 57 previous similar messages [163349.180526] Lustre: oak-OST0030: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [163349.190174] Lustre: Skipped 133 previous similar messages [163377.877522] Lustre: oak-OST0050: Client 44941934-ac39-2b19-c695-ab36186f52dc (at 10.50.10.57@o2ib2) reconnecting [163377.877523] Lustre: oak-OST0034: Client 44941934-ac39-2b19-c695-ab36186f52dc (at 10.50.10.57@o2ib2) reconnecting [163377.877526] Lustre: Skipped 2 previous similar messages [163744.731576] Lustre: oak-OST0036: Client 76eb6295-9d00-d7ab-8458-c8aac654030a (at 10.51.6.5@o2ib3) reconnecting [163744.742847] Lustre: Skipped 1 previous similar message [163957.121396] Lustre: oak-OST0038: Connection restored to 4d85721c-6a00-4 (at 10.50.9.11@o2ib2) [163957.131017] Lustre: Skipped 102 previous similar messages [164563.106148] Lustre: oak-OST0032: Connection restored to a467a7e1-9686-4 (at 10.51.4.13@o2ib3) [164563.115840] Lustre: Skipped 76 previous similar messages [165167.481646] Lustre: oak-OST0032: Connection restored to (at 10.51.1.43@o2ib3) [165167.489813] Lustre: Skipped 91 previous similar messages [165718.039369] Lustre: oak-OST004c: Client 9a831181-5ba7-76a1-7a41-6a1d0b36f622 (at 10.50.1.59@o2ib2) reconnecting [165718.050744] Lustre: Skipped 2 previous similar messages [165769.708321] Lustre: oak-OST003a: Connection restored to c7aec059-c709-4 (at 10.49.22.24@o2ib1) [165769.718039] Lustre: 
Skipped 139 previous similar messages [166401.866518] Lustre: oak-OST0042: haven't heard from client 4d58344f-5724-3167-4851-639fac6c72c8 (at 10.210.12.107@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc4b71ed800, cur 1618887047 expire 1618886897 last 1618886820 [166401.891090] Lustre: Skipped 23 previous similar messages [166402.012766] Lustre: oak-OST004e: Connection restored to ed109291-03b5-4 (at 10.51.4.16@o2ib3) [166402.022386] Lustre: Skipped 99 previous similar messages [166477.861142] Lustre: oak-OST0046: haven't heard from client 6c5f2b86-0252-ef71-8393-6c48da3ac566 (at 10.210.12.137@tcp1) in 213 seconds. I think it's dead, and I am evicting it. exp ffff8be716534400, cur 1618887123 expire 1618886973 last 1618886910 [166477.885726] Lustre: Skipped 71 previous similar messages [166491.848495] Lustre: oak-OST0030: haven't heard from client 6c5f2b86-0252-ef71-8393-6c48da3ac566 (at 10.210.12.137@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be716534c00, cur 1618887137 expire 1618886987 last 1618886910 [166491.873048] Lustre: Skipped 1 previous similar message [166553.886204] Lustre: oak-OST0046: haven't heard from client f9c10983-c320-5641-7539-a90bdb563782 (at 10.210.12.88@tcp1) in 219 seconds. I think it's dead, and I am evicting it. exp ffff8bc4c9bd2c00, cur 1618887199 expire 1618887049 last 1618886980 [166553.910666] Lustre: Skipped 68 previous similar messages [166589.845941] Lustre: oak-OST0030: haven't heard from client b26532b3-d690-023d-a1ea-e1cb66021c1a (at 10.210.12.91@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc4c3567400, cur 1618887235 expire 1618887085 last 1618887008 [166589.870403] Lustre: Skipped 2 previous similar messages [166629.846551] Lustre: oak-OST0046: haven't heard from client 0fe22a7f-0d28-268d-e201-79778b510ad1 (at 10.210.12.78@tcp1) in 224 seconds. I think it's dead, and I am evicting it. 
exp ffff8bc4b8199c00, cur 1618887275 expire 1618887125 last 1618887051 [166629.871008] Lustre: Skipped 68 previous similar messages [166665.847016] Lustre: oak-OST0040: haven't heard from client d780c98e-1da8-1d46-f31f-d15355cfa3e3 (at 10.210.12.77@tcp1) in 215 seconds. I think it's dead, and I am evicting it. exp ffff8be71ed5dc00, cur 1618887311 expire 1618887161 last 1618887096 [166665.871479] Lustre: Skipped 2 previous similar messages [167006.404939] Lustre: oak-OST0056: Connection restored to 70adfa6c-d72a-4 (at 10.51.15.19@o2ib3) [167006.414659] Lustre: Skipped 192 previous similar messages [167088.830537] Lustre: oak-OST003c: Client 09749127-876e-3e7f-b6a0-7c54e61383b6 (at 10.51.3.11@o2ib3) reconnecting [167137.727719] Lustre: oak-OST0058: Client 4eca94b5-20cc-8fd4-b8fd-ebd10e301645 (at 10.51.1.21@o2ib3) reconnecting [167615.876070] Lustre: oak-OST0044: Connection restored to (at 10.51.1.43@o2ib3) [167615.884227] Lustre: Skipped 265 previous similar messages [168216.634540] Lustre: oak-OST003c: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [168216.644173] Lustre: Skipped 82 previous similar messages [168549.670592] Lustre: oak-OST0032: Client 56a5a766-0782-0626-7e81-90dde2e2789a (at 10.51.2.28@o2ib3) reconnecting [168550.176661] Lustre: oak-OST003c: Client 0cc1f06c-eda0-ebe4-f37b-9f106eca81f1 (at 10.51.4.62@o2ib3) reconnecting [168631.783701] Lustre: oak-OST004e: Client 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3) reconnecting [168631.795107] Lustre: Skipped 2 previous similar messages [168823.330724] Lustre: oak-OST0058: Connection restored to (at 10.50.13.1@o2ib2) [168823.338915] Lustre: Skipped 154 previous similar messages [169424.028742] Lustre: oak-OST0054: Connection restored to bc22dc9e-c3bf-4 (at 10.51.4.44@o2ib3) [169424.038362] Lustre: Skipped 243 previous similar messages [170027.238617] Lustre: oak-OST0050: Connection restored to (at 10.50.13.7@o2ib2) [170027.246786] Lustre: Skipped 83 previous similar messages 
[170634.895031] Lustre: oak-OST0048: Connection restored to (at 10.50.13.7@o2ib2) [170634.903193] Lustre: Skipped 314 previous similar messages [171238.128905] Lustre: oak-OST0052: Connection restored to afb0548c-9287-4 (at 10.51.15.2@o2ib3) [171238.138578] Lustre: Skipped 181 previous similar messages [171331.767155] Lustre: oak-OST003e: Client 6c0c8fc9-86c4-9a4d-ddf5-85fe73093cd9 (at 10.51.12.2@o2ib3) reconnecting [171613.201798] Lustre: oak-OST0034: Client eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) reconnecting [171614.639091] Lustre: oak-OST0044: Client eb8cea22-3545-c4b8-6cb6-b3e875ecfb11 (at 10.51.1.23@o2ib3) reconnecting [171843.768920] Lustre: oak-OST004a: Connection restored to afb0548c-9287-4 (at 10.51.15.2@o2ib3) [171843.778531] Lustre: Skipped 47 previous similar messages [172078.766541] Lustre: oak-OST004a: Client f9a32d01-111a-5b35-d23c-bce4c41ca94e (at 10.50.2.45@o2ib2) reconnecting [172078.777919] Lustre: Skipped 4 previous similar messages [172167.277091] Lustre: oak-OST0038: Client eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) reconnecting [172167.286443] Lustre: Skipped 1 previous similar message [172457.033900] Lustre: oak-OST0032: Connection restored to aa966a5c-a1a1-4 (at 10.49.21.34@o2ib1) [172457.043632] Lustre: Skipped 95 previous similar messages [172476.038105] Lustre: oak-OST0052: Client e9bcecd7-a198-50aa-33a9-a04f0aea63df (at 10.51.6.28@o2ib3) reconnecting [173092.553340] Lustre: oak-OST0048: Connection restored to (at 10.50.13.7@o2ib2) [173092.561507] Lustre: Skipped 81 previous similar messages [173708.810451] Lustre: oak-OST0056: Connection restored to 94477130-eaa9-4 (at 10.51.15.17@o2ib3) [173708.820172] Lustre: Skipped 201 previous similar messages [173714.993526] LustreError: 193190:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0042: cli 61781ed1-b14e-4 claims 4169728 GRANT, real grant 0 [173750.589541] Lustre: oak-OST005e: Client eb8cea22-3545-c4b8-6cb6-b3e875ecfb11 (at 10.51.1.23@o2ib3) reconnecting [173750.589542] Lustre: oak-OST004e: 
Client eb8cea22-3545-c4b8-6cb6-b3e875ecfb11 (at 10.51.1.23@o2ib3) reconnecting [173750.612319] Lustre: Skipped 1 previous similar message [174310.040481] Lustre: oak-OST0040: Connection restored to 61789f7e-98b3-4 (at 10.50.2.65@o2ib2) [174310.050102] Lustre: Skipped 184 previous similar messages [174912.461541] Lustre: oak-OST0042: Connection restored to ea702749-deff-4 (at 10.51.4.14@o2ib3) [174912.471178] Lustre: Skipped 100 previous similar messages [175520.488231] Lustre: oak-OST0038: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [175520.497877] Lustre: Skipped 144 previous similar messages [176135.561994] Lustre: oak-OST0058: Connection restored to bc22dc9e-c3bf-4 (at 10.51.4.44@o2ib3) [176135.571615] Lustre: Skipped 117 previous similar messages [176744.883824] Lustre: oak-OST004e: Connection restored to (at 10.51.2.19@o2ib3) [176744.891992] Lustre: Skipped 285 previous similar messages [176906.833657] LNet: 182051:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [176906.846991] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bb7cb7c3000 [176906.859174] LustreError: 209400:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71eed5050 x1695196474067392/t0(0) o4->63444aa2-b3f5-4@10.51.12.12@o2ib3:220/0 lens 488/448 e 0 to 0 dl 1618897645 ref 1 fl Interpret:/0/0 rc 0/0 [176906.859177] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be7aa8fec00 [176906.897134] Lustre: oak-OST0030: Bulk IO write error with 63444aa2-b3f5-4 (at 10.51.12.12@o2ib3), client will retry: rc = -110 [176906.909971] Lustre: Skipped 5 previous similar messages [176974.298460] LustreError: 187417:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1002640) req@ffff8be70ef0a850 x1688762938773952/t0(0) o4->a06ff198-d843-4@10.51.13.17@o2ib3:220/0 lens 488/448 e 0 to 0 dl 1618897645 ref 1 fl 
Interpret:/0/0 rc 0/0 [176974.298462] LustreError: 209396:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71ecf1850 x1695196474067584/t0(0) o4->63444aa2-b3f5-4@10.51.12.12@o2ib3:220/0 lens 488/448 e 0 to 0 dl 1618897645 ref 1 fl Interpret:/0/0 rc 0/0 [176974.298483] Lustre: oak-OST004a: Bulk IO write error with 1f534d97-4915-4 (at 10.51.4.15@o2ib3), client will retry: rc = -110 [176974.298596] LustreError: 209395:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8be712b81050 x1695383703876416/t0(0) o3->c7c97132-e759-4@10.51.15.4@o2ib3:224/0 lens 488/440 e 0 to 0 dl 1618897649 ref 1 fl Interpret:/0/0 rc 0/0 [176974.298610] Lustre: oak-OST0056: Bulk IO read error with c7c97132-e759-4 (at 10.51.15.4@o2ib3), client will retry: rc -110 [176974.400623] LustreError: 187417:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 14 previous similar messages [177014.102190] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 112s: evicting client at 10.51.4.33@o2ib3 ns: filter-oak-OST0058_UUID lock: ffff8bbd92830fc0/0xf81cb91fca9474e lrc: 4/0,0 mode: PR/PR res: [0x23b0623:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400010020 nid: 10.51.4.33@o2ib3 remote: 0xe1814cf08d0cc79a expref: 32 pid: 193052 timeout: 177017 lvb_type: 1 [177014.887924] LNet: 50607:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.4.33@o2ib3 portal 16 match 1697353877746240 offset 224 length 224: 4 [177014.904013] LustreError: 216181:0:(ldlm_lockd.c:2366:ldlm_cancel_handler()) ldlm_cancel from 10.51.4.34@o2ib3 arrived at 1618897660 with bad export cookie 1117398010424115209 [177015.102157] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 112s: evicting client at 10.51.4.33@o2ib3 ns: filter-oak-OST005a_UUID lock: ffff8bae93b5cec0/0xf81cb91fc9c26fa lrc: 4/0,0 mode: PR/PR res: 
[0x21d0b88:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400010020 nid: 10.51.4.33@o2ib3 remote: 0xe1814cf08d02f868 expref: 35 pid: 193054 timeout: 177018 lvb_type: 1 [177015.148502] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 5 previous similar messages [177016.102148] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 111s: evicting client at 10.51.4.34@o2ib3 ns: filter-oak-OST0034_UUID lock: ffff8bb1622833c0/0xf81cb91fc80f7c6 lrc: 3/0,0 mode: PR/PR res: [0x1ec071a:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400010020 nid: 10.51.4.34@o2ib3 remote: 0x15ea91af01644f3d expref: 34 pid: 193030 timeout: 177019 lvb_type: 1 [177016.148471] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 1 previous similar message [177018.659556] LNet: 50609:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.4.33@o2ib3 portal 16 match 1697353877748736 offset 224 length 224: 4 [177018.675603] LNet: 50609:0:(lib-move.c:3829:lnet_parse_put()) Skipped 2 previous similar messages [177018.675637] LustreError: 184086:0:(ldlm_lockd.c:2366:ldlm_cancel_handler()) ldlm_cancel from 10.51.4.33@o2ib3 arrived at 1618897664 with bad export cookie 1117398010405773214 [177018.675639] LustreError: 184086:0:(ldlm_lockd.c:2366:ldlm_cancel_handler()) Skipped 3 previous similar messages [177074.606034] Lustre: oak-OST0030: Client 5f42d279-fa63-9a7e-a46d-4dfc7a2a7ba3 (at 10.51.14.2@o2ib3) reconnecting [177074.617395] Lustre: Skipped 6 previous similar messages [177075.108053] Lustre: oak-OST003a: Client 66bf6793-ef6e-d6d2-96ad-3432d26adbce (at 10.51.13.5@o2ib3) reconnecting [177075.119472] Lustre: Skipped 15 previous similar messages [177076.113305] Lustre: oak-OST0042: Client 66bf6793-ef6e-d6d2-96ad-3432d26adbce (at 10.51.13.5@o2ib3) reconnecting [177076.124677] Lustre: Skipped 55 previous 
similar messages [177076.455757] Lustre: 193014:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618897548/real 1618897548] req@ffff8bdab2a45580 x1697353877557568/t0(0) o104->oak-OST0032@10.51.4.34@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1618897721 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [177077.384708] Lustre: 193089:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618897549/real 1618897549] req@ffff8baf83f63180 x1697353877558784/t0(0) o106->oak-OST004c@10.51.13.8@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1618897722 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [177078.187304] Lustre: oak-OST0042: Client 4eca94b5-20cc-8fd4-b8fd-ebd10e301645 (at 10.51.1.21@o2ib3) reconnecting [177078.198672] Lustre: Skipped 94 previous similar messages [177082.446197] Lustre: oak-OST0048: Client d7e7ae29-6844-4 (at 10.51.2.18@o2ib3) reconnecting [177082.455542] Lustre: Skipped 48 previous similar messages [177094.072243] Lustre: oak-OST0050: Client 8ff6000c-d966-1cda-f3a5-455db4eb8783 (at 10.51.2.23@o2ib3) reconnecting [177094.083612] Lustre: Skipped 24 previous similar messages [177114.458944] Lustre: oak-OST005e: Client c7c97132-e759-4 (at 10.51.15.4@o2ib3) reconnecting [177114.468271] Lustre: Skipped 6 previous similar messages [177270.289604] Lustre: oak-OST0036: Client 987b9366-48a1-9307-ff57-52c8cdd1c49f (at 10.51.15.13@o2ib3) reconnecting [177270.301065] Lustre: Skipped 3 previous similar messages [177345.228724] Lustre: oak-OST0052: Connection restored to 346e890a-2480-4 (at 10.49.28.9@o2ib1) [177345.238458] Lustre: Skipped 1384 previous similar messages [177452.100388] Lustre: oak-OST005c: Client 4cee94e6-025c-589a-13ab-6c9ed337de31 (at 10.51.2.36@o2ib3) reconnecting [177452.111777] Lustre: Skipped 1 previous similar message [177545.697547] LustreError: 137-5: oak-OST0049_UUID: not available for connect from 10.51.15.5@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [177684.457699] Lustre: oak-OST0044: Client 156315a7-a82d-b4fe-847a-396165636f38 (at 10.51.14.3@o2ib3) reconnecting [177893.706723] LustreError: 147685:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST005c: cli 8b66e4db-fcc9-6215-2cf9-86ed141f42d8 claims 1024000 GRANT, real grant 0 [177946.906472] Lustre: oak-OST0030: Connection restored to 58ad829b-9450-4 (at 10.49.17.24@o2ib1) [177946.916294] Lustre: Skipped 652 previous similar messages [177958.046156] Lustre: oak-OST005c: Client a46e0fe5-52aa-b9b6-1fb4-9a67ea6ced2c (at 10.50.2.56@o2ib2) reconnecting [177958.057585] Lustre: Skipped 1 previous similar message [178548.103922] Lustre: oak-OST0044: Connection restored to dc19f4f4-e5ce-4 (at 10.50.9.56@o2ib2) [178548.113546] Lustre: Skipped 801 previous similar messages [178638.572302] Lustre: oak-OST004c: Client e9bcecd7-a198-50aa-33a9-a04f0aea63df (at 10.51.6.28@o2ib3) reconnecting [178638.583683] Lustre: Skipped 35 previous similar messages [178759.365249] LNet: 3985:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending_nocred)(waiting) [178759.379335] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600740) failed: 5 [178759.380963] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbfb169d000 [178759.389899] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [178759.389901] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 5 previous similar messages [178759.389903] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be723dc6c00 [178759.389906] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be7220ef800 [178759.390195] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc 
ffff8be723dc6c00 [178759.390576] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc297e9e800 [178759.390870] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be723dc6c00 [178759.391149] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc297e9e800 [178759.391424] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbfb169d000 [178759.508529] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1403 previous similar messages [178823.616173] LustreError: 211957:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be72252c850 x1685034148571136/t0(0) o3->a020122d-d9f6-4@10.51.2.21@o2ib3:557/0 lens 488/440 e 0 to 0 dl 1618899492 ref 1 fl Interpret:/0/0 rc 0/0 [178823.616416] Lustre: oak-OST0036: Bulk IO read error with a020122d-d9f6-4 (at 10.51.2.21@o2ib3), client will retry: rc -110 [178823.616417] Lustre: Skipped 2 previous similar messages [178823.616564] LustreError: 193102:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be719722050 x1684949465508352/t0(0) o3->1f534d97-4915-4@10.51.4.15@o2ib3:562/0 lens 488/440 e 0 to 0 dl 1618899497 ref 1 fl Interpret:/0/0 rc 0/0 [178823.616580] LustreError: 209393:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8be71ae20050 x1695196741304192/t0(0) o4->63444aa2-b3f5-4@10.51.12.12@o2ib3:562/0 lens 488/448 e 0 to 0 dl 1618899497 ref 1 fl Interpret:/0/0 rc 0/0 [178823.616905] Lustre: oak-OST0030: Bulk IO write error with 63444aa2-b3f5-4 (at 10.51.12.12@o2ib3), client will retry: rc = -110 [178823.616906] Lustre: Skipped 15 previous similar messages [178823.731786] LustreError: 211957:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 40 previous similar messages [179057.007593] LustreError: 193416:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0052: cli 
34d5f413-62e7-39d1-943f-2da96f672f7d claims 1544192 GRANT, real grant 0 [179149.094805] Lustre: oak-OST0052: Connection restored to afb0548c-9287-4 (at 10.51.15.2@o2ib3) [179149.104423] Lustre: Skipped 692 previous similar messages [179300.114934] Lustre: oak-OST0054: Client 4bd1164e-2b59-29c9-84a6-6ae88a2d6ec7 (at 10.51.0.11@o2ib3) reconnecting [179300.126434] Lustre: Skipped 458 previous similar messages [179409.395183] LustreError: 137-5: oak-OST0049_UUID: not available for connect from 10.50.1.32@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [179409.414620] LustreError: Skipped 1 previous similar message [179696.768440] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [179696.782488] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc3492b9800 [179696.794638] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc3492bb400 [179696.806784] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc3492bb400 [179696.818929] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc4cc940800 [179696.831097] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bad4236d000 [179696.843309] LustreError: 241051:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71ff83850 x1695196865945792/t0(0) o4->63444aa2-b3f5-4@10.51.12.12@o2ib3:744/0 lens 488/448 e 0 to 0 dl 1618900434 ref 1 fl Interpret:/0/0 rc 0/0 [179696.868826] LustreError: 241051:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [179696.879761] Lustre: oak-OST0030: Bulk IO write error with 63444aa2-b3f5-4 (at 10.51.12.12@o2ib3), client will retry: rc = -110 [179696.892578] Lustre: Skipped 4 previous similar messages 
[179748.777793] LustreError: 193432:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be6c118a050 x1694738285851456/t0(0) o3->64ebd172-d79e-4@10.51.13.3@o2ib3:744/0 lens 488/440 e 0 to 0 dl 1618900434 ref 1 fl Interpret:/0/0 rc 0/0 [179748.777815] Lustre: oak-OST005e: Bulk IO read error with 254f88b5-a6bc-4 (at 10.51.6.12@o2ib3), client will retry: rc -110 [179748.777817] Lustre: Skipped 41 previous similar messages [179748.777960] LustreError: 5994:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 3145728(4194304) req@ffff8be71ff80850 x1695196866090304/t0(0) o4->63444aa2-b3f5-4@10.51.12.12@o2ib3:745/0 lens 488/448 e 0 to 0 dl 1618900435 ref 1 fl Interpret:/0/0 rc 0/0 [179748.777962] LustreError: 5994:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 5 previous similar messages [179748.777975] LustreError: 209398:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be6ebcef850 x1696619597375680/t0(0) o3->84772643-3e0c-a25c-a9b0-965c7b792170@10.51.13.15@o2ib3:745/0 lens 488/440 e 0 to 0 dl 1618900435 ref 1 fl Interpret:/0/0 rc 0/0 [179748.778255] Lustre: oak-OST0030: Bulk IO write error with 63444aa2-b3f5-4 (at 10.51.12.12@o2ib3), client will retry: rc = -110 [179748.900543] LustreError: 193432:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 3 previous similar messages [179749.115061] Lustre: oak-OST004c: Connection restored to 566f9fe9-38b0-4 (at 10.51.5.10@o2ib3) [179749.124729] Lustre: Skipped 175 previous similar messages [179860.140090] LustreError: 137-5: oak-OST0057_UUID: not available for connect from 10.51.15.5@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[179860.159502] LustreError: Skipped 2 previous similar messages [179870.259748] LustreError: 153979:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be757051850 x1696616785556928/t0(0) o3->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:167/0 lens 488/440 e 0 to 0 dl 1618900612 ref 1 fl Interpret:/0/0 rc 0/0 [179870.286838] LustreError: 153979:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [179870.297665] Lustre: oak-OST0038: Bulk IO read error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc -110 [179870.312231] Lustre: Skipped 6 previous similar messages [179900.584034] Lustre: oak-OST0040: Client 2dce41e5-6cf1-0747-6975-01d951e9d8ac (at 10.51.1.46@o2ib3) reconnecting [179900.595406] Lustre: Skipped 47 previous similar messages [180356.040323] Lustre: oak-OST0036: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [180356.049960] Lustre: Skipped 134 previous similar messages [180415.551094] LustreError: 227168:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bc5e9a64850 x1696888406387328/t0(0) o4->7fd0339d-d05d-9ff6-3f86-704a0a70f4dc@10.50.15.3@o2ib2:712/0 lens 488/448 e 0 to 0 dl 1618901157 ref 1 fl Interpret:/0/0 rc 0/0 [180415.578258] LustreError: 227168:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [180415.588952] Lustre: oak-OST005a: Bulk IO write error with 7fd0339d-d05d-9ff6-3f86-704a0a70f4dc (at 10.50.15.3@o2ib2), client will retry: rc = -110 [180415.603762] Lustre: Skipped 2 previous similar messages [180568.496961] Lustre: oak-OST0030: Client 90752487-0b3e-1696-21a1-6c81abc18872 (at 10.51.1.2@o2ib3) reconnecting [180568.508229] Lustre: Skipped 32 previous similar messages [180802.116916] md: md6: recovery done. 
[180848.199510] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.215@o2ib5: error 0(waiting) [180848.212136] LNet: 50605:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d859ce9be0) failed: 5 [180848.212143] LNet: 50604:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.215@o2ib5 exceeded retry count 0 [180848.212145] LNet: 50604:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [180848.212147] LustreError: 50602:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc06ff39800 [180848.212712] LustreError: 50602:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bb9916f0000 [180848.213313] LustreError: 50604:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bb9916f0000 [180848.213314] LustreError: 50602:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbd50bec000 [180848.213318] LustreError: 50603:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbd50bec000 [180848.213324] LustreError: 50602:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bb2f79bf000 [180848.317114] LNet: 50605:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 513 previous similar messages [180898.927856] LustreError: 3681:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bc8d2290850 x1685123946924608/t0(0) o3->8b2463ca-09ad-4@10.50.17.45@o2ib2:386/0 lens 488/440 e 0 to 0 dl 1618901586 ref 1 fl Interpret:/0/0 rc 0/0 [180898.928046] Lustre: oak-OST005a: Bulk IO read error with 8b2463ca-09ad-4 (at 10.50.17.45@o2ib2), client will retry: rc -110 [180898.928048] Lustre: Skipped 2 previous similar messages [180898.971525] LustreError: 3681:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 5 previous similar messages [180981.114639] Lustre: oak-OST0036: Connection restored to 25c96f56-d6ac-4 (at 10.50.1.20@o2ib2) [180981.124259] Lustre: Skipped 334 
previous similar messages [181349.118629] Lustre: oak-OST0058: Client 26c3c3e9-afde-d9dd-595e-b5cc51f8ae6e (at 10.50.4.41@o2ib2) reconnecting [181349.129999] Lustre: Skipped 12 previous similar messages [181584.866622] Lustre: oak-OST004e: Connection restored to 6bbc0d63-02d6-4 (at 10.50.10.71@o2ib2) [181584.876333] Lustre: Skipped 782 previous similar messages [182192.728034] Lustre: oak-OST0056: Connection restored to 8f996bbd-e5a0-4 (at 10.51.5.28@o2ib3) [182192.737647] Lustre: Skipped 867 previous similar messages [182794.178568] Lustre: oak-OST004e: Connection restored to f91f1762-c441-4 (at 10.50.10.5@o2ib2) [182794.188189] Lustre: Skipped 452 previous similar messages [183258.413181] Lustre: oak-OST0032: Client db35851b-4880-fd68-1dba-fd7cff216c58 (at 10.51.6.27@o2ib3) reconnecting [183258.424549] Lustre: Skipped 14 previous similar messages [183395.001693] Lustre: oak-OST003a: Connection restored to c3f2a6c8-0188-4 (at 10.51.5.57@o2ib3) [183395.011365] Lustre: Skipped 718 previous similar messages [183604.240379] Lustre: oak-OST004a: Client c1ada0a4-de2a-c24a-4d28-c593030bd6bb (at 10.51.6.21@o2ib3) reconnecting [183604.251770] Lustre: Skipped 14 previous similar messages [183869.134387] Lustre: oak-OST0058: Client c4666443-6408-4 (at 10.50.10.49@o2ib2) reconnecting [183869.143813] Lustre: Skipped 4 previous similar messages [183996.582254] Lustre: oak-OST0034: Connection restored to 78e3ac20-3b71-4 (at 10.50.16.17@o2ib2) [183996.591964] Lustre: Skipped 435 previous similar messages [184229.750100] Lustre: oak-OST005a: Client 6cbff61b-9898-4 (at 10.51.13.9@o2ib3) reconnecting [184229.759434] Lustre: Skipped 10 previous similar messages [184597.333569] Lustre: oak-OST005c: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [184597.343209] Lustre: Skipped 512 previous similar messages [184991.463643] Lustre: oak-OST003c: Client a46e0fe5-52aa-b9b6-1fb4-9a67ea6ced2c (at 10.50.2.56@o2ib2) reconnecting [184991.475037] Lustre: Skipped 12 previous similar 
messages [185201.659941] Lustre: oak-OST0034: Connection restored to f35b4cd6-c6c1-4 (at 10.50.13.15@o2ib2) [185201.669717] Lustre: Skipped 521 previous similar messages [185239.215677] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [185239.229878] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bb66c4e2800 [185239.242027] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be7220e8800 [185239.242036] LustreError: 209392:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be713960050 x1696616935213440/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:246/0 lens 488/448 e 0 to 0 dl 1618905976 ref 1 fl Interpret:/0/0 rc 0/0 [185239.242302] Lustre: oak-OST0034: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 [185241.783020] LustreError: 193453:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be718fb9050 x1696616935234560/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:246/0 lens 488/448 e 0 to 0 dl 1618905976 ref 1 fl Interpret:/0/0 rc 0/0 [185241.785245] Lustre: oak-OST0034: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 [185241.785247] Lustre: Skipped 1 previous similar message [185241.830954] LustreError: 193453:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [185268.945248] LustreError: 137-5: oak-OST003f_UUID: not available for connect from 10.51.6.18@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [185268.964660] LustreError: Skipped 6 previous similar messages [185452.781097] LustreError: 137-5: oak-OST0043_UUID: not available for connect from 10.51.1.71@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [185499.979919] LustreError: 137-5: oak-OST004d_UUID: not available for connect from 10.51.6.18@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [185652.778836] Lustre: oak-OST0038: Client 0cc1f06c-eda0-ebe4-f37b-9f106eca81f1 (at 10.51.4.62@o2ib3) reconnecting [185652.790220] Lustre: Skipped 65 previous similar messages [185805.740562] Lustre: oak-OST0056: Connection restored to d7c49207-5239-4 (at 10.50.10.47@o2ib2) [185805.750341] Lustre: Skipped 349 previous similar messages [186405.807354] Lustre: oak-OST0056: Connection restored to 0a280b99-b1e4-4 (at 10.50.5.7@o2ib2) [186405.816902] Lustre: Skipped 789 previous similar messages [186426.181791] Lustre: oak-OST004e: Client 5fbc6dd0-2ef9-b4a9-8121-d2dcfb28fdfb (at 10.50.10.8@o2ib2) reconnecting [186426.193165] Lustre: Skipped 18 previous similar messages [187028.118781] Lustre: oak-OST004e: Connection restored to e98fe865-8a7e-4 (at 10.50.2.55@o2ib2) [187028.128425] Lustre: Skipped 455 previous similar messages [187085.159205] Lustre: oak-OST005e: Client 2415e1f1-7c65-d7c3-3763-61211eaa7bc0 (at 10.50.7.1@o2ib2) reconnecting [187085.170476] Lustre: Skipped 6 previous similar messages [187631.313343] Lustre: oak-OST003c: Connection restored to a06ff198-d843-4 (at 10.51.13.17@o2ib3) [187631.323060] Lustre: Skipped 579 previous similar messages [187685.923741] Lustre: oak-OST0058: Client fd8d749d-89ee-cfcb-c690-4236b58aaf4b (at 10.50.8.19@o2ib2) reconnecting [187685.935105] Lustre: Skipped 8 previous similar messages [188219.690726] LustreError: 137-5: oak-OST0039_UUID: not available for connect from 10.50.4.8@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[188219.710060] LustreError: Skipped 1 previous similar message [188231.655611] Lustre: oak-OST005a: Connection restored to a4991fff-ead5-4 (at 10.50.3.44@o2ib2) [188231.665230] Lustre: Skipped 607 previous similar messages [188569.881845] Lustre: oak-OST0050: Client 764dc76b-6419-78a0-7c40-f676bc09a2fb (at 10.50.6.69@o2ib2) reconnecting [188569.893228] Lustre: Skipped 25 previous similar messages [188832.025752] Lustre: oak-OST005c: Connection restored to aec2e59f-49ed-4 (at 10.50.2.4@o2ib2) [188832.035290] Lustre: Skipped 607 previous similar messages [189176.676454] Lustre: oak-OST005c: Client bdfa38aa-0b4b-9ae8-7177-71f369bf7d36 (at 10.50.16.10@o2ib2) reconnecting [189176.687926] Lustre: Skipped 29 previous similar messages [189433.510661] Lustre: oak-OST0030: Connection restored to b257ec18-9565-4 (at 10.50.2.40@o2ib2) [189433.520295] Lustre: Skipped 778 previous similar messages [189773.219726] LustreError: 193448:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be71ff85850 x1696616962887616/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:251/0 lens 488/448 e 0 to 0 dl 1618910511 ref 1 fl Interpret:/0/0 rc 0/0 [189773.247137] Lustre: oak-OST005c: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 [189773.261997] Lustre: Skipped 1 previous similar message [189779.843957] Lustre: oak-OST004e: Client 70e2b68b-00fb-4 (at 10.50.8.53@o2ib2) reconnecting [189779.853325] Lustre: Skipped 51 previous similar messages [190036.219077] Lustre: oak-OST005a: Connection restored to ac4a35a1-ef91-4 (at 10.50.4.2@o2ib2) [190036.228599] Lustre: Skipped 547 previous similar messages [190401.751019] Lustre: oak-OST0048: Client 3d575049-f2ff-030b-a4ea-af4cdfc8c038 (at 10.50.8.69@o2ib2) reconnecting [190401.762388] Lustre: Skipped 38 previous similar messages [190637.007170] Lustre: oak-OST003e: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [190637.016848] 
Lustre: Skipped 352 previous similar messages [191238.606653] Lustre: oak-OST0038: Connection restored to 26c3d7e5-cee7-4 (at 10.50.9.54@o2ib2) [191238.616294] Lustre: Skipped 1154 previous similar messages [191644.987840] Lustre: oak-OST0046: Client 5427ab0e-d81a-5916-036b-239cf0d31d9d (at 10.50.2.49@o2ib2) reconnecting [191644.999207] Lustre: Skipped 9 previous similar messages [191645.714387] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.214@o2ib5: error 0(sending)(waiting) [191645.728334] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc36e256000 [191645.740487] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bae27410c00 [191645.752633] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbe649ae400 [191645.764810] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be6160c6c00 [191645.777008] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be6160c6c00 [191645.789194] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bba8ac82800 [191645.801350] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bba8ac82800 [191645.813505] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bafa95bbc00 [191699.045993] LustreError: 233534:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bc467ca2850 x1689833092926144/t0(0) o3->e48d7b4f-5c09-4@10.50.10.68@o2ib2:612/0 lens 488/440 e 0 to 0 dl 1618912382 ref 1 fl Interpret:/0/0 rc 0/0 [191699.045995] LustreError: 239053:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bc4bf4cf050 x1689833092926208/t0(0) o3->e48d7b4f-5c09-4@10.50.10.68@o2ib2:612/0 lens 488/440 e 
0 to 0 dl 1618912382 ref 1 fl Interpret:/0/0 rc 0/0 [191699.045999] LustreError: 239053:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 2 previous similar messages [191699.046009] LustreError: 227676:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(4194304) req@ffff8bc107971850 x1696895330597568/t0(0) o4->ae2e39c9-d6c4-e399-c4e2-9281e18e4ae6@10.50.12.15@o2ib2:612/0 lens 488/448 e 0 to 0 dl 1618912382 ref 1 fl Interpret:/0/0 rc 0/0 [191699.046074] Lustre: oak-OST005c: Bulk IO write error with 3f555497-9bbc-4 (at 10.50.5.30@o2ib2), client will retry: rc = -110 [191699.046081] LustreError: 240014:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bafb8520050 x1688748421226432/t0(0) o3->10d15a4e-dcc6-4@10.50.5.66@o2ib2:613/0 lens 488/440 e 0 to 0 dl 1618912383 ref 1 fl Interpret:/0/0 rc 0/0 [191699.046083] LustreError: 240014:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [191699.046121] Lustre: oak-OST0058: Bulk IO read error with c7970015-4f34-4 (at 10.50.7.41@o2ib2), client will retry: rc -110 [191699.046123] Lustre: Skipped 3 previous similar messages [191699.205442] LustreError: 233534:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 21 previous similar messages [191745.171191] Lustre: oak-OST0052: Client bf83d96d-cb1a-4 (at 10.50.0.71@o2ib2) reconnecting [191791.758825] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.50.5.6@o2ib2 ns: filter-oak-OST0046_UUID lock: ffff8bc014af3f00/0xf81cb91fd40b046 lrc: 3/0,0 mode: PW/PW res: [0xa40000400:0x344c97:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->8191) flags: 0x60000400030020 nid: 10.50.5.6@o2ib2 remote: 0xbd7a67613bd94cb9 expref: 11 pid: 193130 timeout: 191757 lvb_type: 0 [191791.804054] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 5 previous similar messages [191812.682321] Lustre: 187492:0:(client.c:2146:ptlrpc_expire_one_request()) 
@@@ Request sent has timed out for slow reply: [sent 1618912285/real 1618912285] req@ffff8bafdb1c5100 x1697353892338688/t0(0) o104->oak-OST005e@10.50.14.13@o2ib2:15/16 lens 296/224 e 0 to 1 dl 1618912458 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 [191812.713235] LustreError: 187492:0:(ldlm_lockd.c:681:ldlm_handle_ast_error()) ### client (nid 10.50.14.13@o2ib2) failed to reply to blocking AST (req@ffff8bafdb1c5100 x1697353892338688 status 0 rc -110), evict it ns: filter-oak-OST005e_UUID lock: ffff8bb397b53180/0xf81cb91fd62a72d lrc: 4/0,0 mode: PW/PW res: [0x20fae19:0x0:0x0].0x0 rrc: 6 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400020020 nid: 10.50.14.13@o2ib2 remote: 0x3fd77790c0fed9ed expref: 8 pid: 193147 timeout: 191988 lvb_type: 0 [191812.765162] LustreError: 138-a: oak-OST005e: A client on nid 10.50.14.13@o2ib2 was evicted due to a lock blocking callback time out: rc -110 [191815.304298] Lustre: 193032:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618912288/real 1618912288] req@ffff8bc18dacf500 x1697353892339648/t0(0) o104->oak-OST0036@10.50.5.5@o2ib2:15/16 lens 296/224 e 0 to 1 dl 1618912461 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [191836.322412] Lustre: oak-OST0036: haven't heard from client 692ca972-036b-4 (at 10.50.6.1@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bc4fe21e400, cur 1618912482 expire 1618912332 last 1618912255 [191836.344638] Lustre: Skipped 45 previous similar messages [191839.665223] Lustre: oak-OST005c: Connection restored to 76ecdc90-9d7b-4 (at 10.50.10.72@o2ib2) [191839.674944] Lustre: Skipped 1758 previous similar messages [191917.850937] Lustre: oak-OST0054: Client ae2e39c9-d6c4-e399-c4e2-9281e18e4ae6 (at 10.50.12.15@o2ib2) reconnecting [191917.862401] Lustre: Skipped 1317 previous similar messages [192167.629125] LustreError: 193439:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be71fe53850 x1695519130697920/t0(0) o3->190029f0-b1fc-4@10.50.2.13@o2ib2:378/0 lens 488/440 e 0 to 0 dl 1618912903 ref 1 fl Interpret:/0/0 rc 0/0 [192167.654256] Lustre: oak-OST003a: Bulk IO read error with 190029f0-b1fc-4 (at 10.50.2.13@o2ib2), client will retry: rc -110 [192167.666685] Lustre: Skipped 30 previous similar messages [192224.113101] LustreError: 193456:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8be75c206050 x1696524158915776/t0(0) o3->a526f487-1207-4@10.50.3.40@o2ib2:380/0 lens 488/440 e 0 to 0 dl 1618912905 ref 1 fl Interpret:/0/0 rc 0/0 [192224.113305] Lustre: oak-OST004e: Bulk IO read error with a526f487-1207-4 (at 10.50.3.40@o2ib2), client will retry: rc -110 [192224.113306] Lustre: Skipped 2 previous similar messages [192224.157183] LustreError: 193456:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 3 previous similar messages [192334.889034] Lustre: oak-OST0058: Client dd4aadec-fe7c-0c49-bd2e-8929ca375da6 (at 10.50.2.6@o2ib2) reconnecting [192334.900373] Lustre: Skipped 16 previous similar messages [192440.009087] Lustre: oak-OST0034: Connection restored to 1e9aa385-b96d-4 (at 10.50.2.19@o2ib2) [192440.018721] Lustre: Skipped 389 previous similar messages [193042.193468] Lustre: oak-OST005e: Connection restored to 5a1327cd-5da5-4 (at 10.50.13.3@o2ib2) [193042.203108] Lustre: Skipped 279 previous similar messages [193047.647039] Lustre: 
oak-OST0048: Client dd4aadec-fe7c-0c49-bd2e-8929ca375da6 (at 10.50.2.6@o2ib2) reconnecting [193047.658338] Lustre: Skipped 14 previous similar messages [193083.663405] LustreError: 137-5: oak-OST003d_UUID: not available for connect from 10.50.17.8@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [193083.682818] LustreError: Skipped 1 previous similar message [193084.647958] LustreError: 137-5: oak-OST003b_UUID: not available for connect from 10.50.10.20@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [193084.667504] LustreError: Skipped 3 previous similar messages [193150.119040] LustreError: 137-5: oak-OST0051_UUID: not available for connect from 10.50.10.21@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [193643.551923] Lustre: oak-OST004a: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [193643.561622] Lustre: Skipped 194 previous similar messages [193891.748252] Lustre: oak-OST0056: Client 986b1e04-43ac-2fe2-23f4-b8cc2eaef60e (at 10.50.1.32@o2ib2) reconnecting [193891.759646] Lustre: Skipped 26 previous similar messages [194132.656130] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.214@o2ib5: error 0(sending_nocred)(waiting) [194132.670359] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff5c0) failed: 5 [194132.680755] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 6 previous similar messages [194132.680918] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.214@o2ib5 exceeded retry count 0 [194132.680921] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8baf65844800 [194132.680922] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [194132.680924] LustreError: 
50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8baf65844800 [194132.680925] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bdb4c847800 [194132.680936] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be723de0c00 [194132.680940] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be723de0c00 [194132.680941] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bb26e8efc00 [194132.680948] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bb26e8efc00 [194132.680949] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bde2918c400 [194198.366006] LustreError: 237967:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bb13939b850 x1696970161428288/t0(0) o3->59a9cb76-2f7f-9e34-2fb6-ca009310817e@10.50.15.10@o2ib2:80/0 lens 488/440 e 0 to 0 dl 1618914870 ref 1 fl Interpret:/0/0 rc 0/0 [194198.366012] LustreError: 193397:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bbf02be7050 x1696970161428608/t0(0) o3->59a9cb76-2f7f-9e34-2fb6-ca009310817e@10.50.15.10@o2ib2:80/0 lens 488/440 e 0 to 0 dl 1618914870 ref 1 fl Interpret:/0/0 rc 0/0 [194198.366013] LustreError: 193397:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 1 previous similar message [194198.366036] Lustre: oak-OST0032: Bulk IO read error with da058466-abbe-4 (at 10.50.5.64@o2ib2), client will retry: rc -110 [194198.366038] Lustre: Skipped 3 previous similar messages [194198.366061] LustreError: 227172:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(3485) req@ffff8bb46dd8d050 x1685001639839808/t0(0) o4->e1dd4add-0477-4@10.50.9.43@o2ib2:81/0 lens 488/448 e 0 to 0 dl 1618914871 ref 1 fl Interpret:/0/0 rc 0/0 [194198.366064] LustreError: 
2422:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 470439(1519015) req@ffff8bb46dd8a850 x1684955721437056/t0(0) o4->f08c42a3-d18f-4@10.50.2.18@o2ib2:81/0 lens 488/448 e 0 to 0 dl 1618914871 ref 1 fl Interpret:/0/0 rc 0/0 [194198.366065] LustreError: 227172:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 15 previous similar messages [194198.366066] LustreError: 2422:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 15 previous similar messages [194198.366084] Lustre: oak-OST003e: Bulk IO write error with e1dd4add-0477-4 (at 10.50.9.43@o2ib2), client will retry: rc = -110 [194198.366085] Lustre: Skipped 15 previous similar messages [194198.543338] LustreError: 237967:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 9 previous similar messages [194243.895231] Lustre: oak-OST0046: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [194243.904854] Lustre: Skipped 201 previous similar messages [194301.692543] Lustre: 193172:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618914774/real 1618914774] req@ffff8bb30d998000 x1697353894863232/t0(0) o106->oak-OST005c@10.50.16.7@o2ib2:15/16 lens 296/280 e 0 to 1 dl 1618914947 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [194313.223087] Lustre: oak-OST0052: haven't heard from client e98fe865-8a7e-4 (at 10.50.2.55@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bb2f764b800, cur 1618914959 expire 1618914809 last 1618914732 [194494.916491] Lustre: oak-OST0038: Client 5fbc6dd0-2ef9-b4a9-8121-d2dcfb28fdfb (at 10.50.10.8@o2ib2) reconnecting [194494.927896] Lustre: Skipped 998 previous similar messages [194845.429056] Lustre: oak-OST0052: Connection restored to 831437d2-6335-4 (at 10.51.2.61@o2ib3) [194845.438674] Lustre: Skipped 1843 previous similar messages [195247.125001] Lustre: oak-OST0034: Client 56a5a766-0782-0626-7e81-90dde2e2789a (at 10.51.2.28@o2ib3) reconnecting [195247.136371] Lustre: Skipped 26 previous similar messages [195445.746972] Lustre: oak-OST0050: Connection restored to 57ffd6e4-3c54-4 (at 10.50.8.13@o2ib2) [195445.756614] Lustre: Skipped 601 previous similar messages [195968.784141] Lustre: oak-OST005c: Client 28c71697-daa8-2657-8122-7e55035a312f (at 10.50.6.68@o2ib2) reconnecting [195968.795516] Lustre: Skipped 11 previous similar messages [195993.494501] LustreError: 5986:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0058: cli 6cbff61b-9898-4 claims 1028096 GRANT, real grant 0 [196046.595211] Lustre: oak-OST0036: Connection restored to 7dbc13bf-3fa6-4 (at 10.49.28.11@o2ib1) [196046.604953] Lustre: Skipped 961 previous similar messages [196068.685110] LustreError: 137-5: oak-OST0039_UUID: not available for connect from 10.50.7.44@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [196068.704531] LustreError: Skipped 2 previous similar messages [196648.109129] Lustre: oak-OST0044: Connection restored to 2d3efc18-edb0-4 (at 10.51.5.13@o2ib3) [196648.118752] Lustre: Skipped 385 previous similar messages [196680.915416] Lustre: oak-OST004a: Client c1ada0a4-de2a-c24a-4d28-c593030bd6bb (at 10.51.6.21@o2ib3) reconnecting [196680.926780] Lustre: Skipped 21 previous similar messages [197011.622560] LustreError: 137-5: oak-OST004d_UUID: not available for connect from 10.51.1.18@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [197060.373190] LustreError: 137-5: oak-OST003d_UUID: not available for connect from 10.51.3.11@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [197248.831351] Lustre: oak-OST005e: Connection restored to 4c6eb2f1-25ca-4 (at 10.50.5.29@o2ib2) [197248.841014] Lustre: Skipped 413 previous similar messages [197321.967543] Lustre: oak-OST0058: Client 5a1a1819-8cf8-0292-bbdd-e93ce3cb58e2 (at 10.50.10.20@o2ib2) reconnecting [197321.979009] Lustre: Skipped 39 previous similar messages [197480.706917] LustreError: 241054:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0054: cli 6cbff61b-9898-4 claims 1507328 GRANT, real grant 0 [197856.724095] Lustre: oak-OST0042: Connection restored to ac9a8546-2b3d-4 (at 10.50.4.4@o2ib2) [197856.733616] Lustre: Skipped 319 previous similar messages [197979.400099] Lustre: oak-OST004e: Client 7575a33a-d426-4 (at 10.50.4.3@o2ib2) reconnecting [197979.409381] Lustre: Skipped 23 previous similar messages [198265.774102] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.6.26@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[198456.846198] Lustre: oak-OST005e: Connection restored to (at 10.51.2.22@o2ib3) [198456.854370] Lustre: Skipped 382 previous similar messages [198589.514304] Lustre: oak-OST0030: Client f4f31fbb-c316-9d0e-dea6-2a23d0a9a983 (at 10.51.1.12@o2ib3) reconnecting [198589.525666] Lustre: Skipped 17 previous similar messages [198670.944671] LustreError: 241057:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be45a854050 x1696613882345600/t0(0) o3->7f1b7392-400d-f93e-0c1e-8292ad9bca46@10.51.13.6@o2ib3:92/0 lens 488/440 e 0 to 0 dl 1618919412 ref 1 fl Interpret:/0/0 rc 0/0 [198670.971550] LustreError: 241057:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [198672.627532] LNet: 50606:0:(lib-move.c:976:lnet_post_send_locked()) Aborting message for 12345-10.0.2.216@o2ib5: LNetM[DE]Unlink() already called on the MD/ME. [198672.643458] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bba10f5f000 [198672.655801] Lustre: oak-OST005c: Bulk IO read error with 7f1b7392-400d-f93e-0c1e-8292ad9bca46 (at 10.51.13.6@o2ib3), client will retry: rc -110 [198672.670268] Lustre: Skipped 31 previous similar messages [199057.441235] Lustre: oak-OST004c: Connection restored to daa305a4-d6cc-4 (at 10.50.3.36@o2ib2) [199057.450895] Lustre: Skipped 494 previous similar messages [199238.449887] Lustre: oak-OST0050: Client 72c04f1c-ced7-ea27-d7eb-377dea981005 (at 10.50.2.35@o2ib2) reconnecting [199238.449888] Lustre: oak-OST0048: Client 72c04f1c-ced7-ea27-d7eb-377dea981005 (at 10.50.2.35@o2ib2) reconnecting [199238.449891] Lustre: Skipped 24 previous similar messages [199658.006222] Lustre: oak-OST0054: Connection restored to 7fd8cb16-c6a5-4 (at 10.49.17.23@o2ib1) [199658.015953] Lustre: Skipped 669 previous similar messages [199826.161291] LustreError: 5999:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be689f3f850 x1696616995111552/t0(0) 
o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:489/0 lens 488/448 e 0 to 0 dl 1618920564 ref 1 fl Interpret:/0/0 rc 0/0 [199826.164698] Lustre: oak-OST0032: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 [199826.164707] Lustre: Skipped 2 previous similar messages [199826.209093] LustreError: 5999:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [199848.344374] LustreError: 137-5: oak-OST005b_UUID: not available for connect from 10.50.7.31@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [199852.021986] LustreError: 137-5: oak-OST003d_UUID: not available for connect from 10.50.2.55@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [199857.651021] LustreError: 137-5: oak-OST005f_UUID: not available for connect from 10.50.5.20@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [199861.194700] LustreError: 137-5: oak-OST005d_UUID: not available for connect from 10.50.2.5@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [199865.759193] LustreError: 137-5: oak-OST005d_UUID: not available for connect from 10.50.2.13@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[199865.778609] LustreError: Skipped 3 previous similar messages [199890.728428] Lustre: oak-OST0042: Client 4c32b5fb-e821-4 (at 10.51.2.65@o2ib3) reconnecting [199890.737774] Lustre: Skipped 52 previous similar messages [200261.149337] Lustre: oak-OST0052: Connection restored to df5881f1-066b-4 (at 10.50.1.71@o2ib2) [200261.158960] Lustre: Skipped 532 previous similar messages [200508.155713] Lustre: oak-OST0058: Client 2c5a690e-7625-5039-0841-f15ae7260bd6 (at 10.50.8.15@o2ib2) reconnecting [200508.167105] Lustre: Skipped 46 previous similar messages [200669.757068] LustreError: 193455:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be71fd9d850 x1696864959400128/t0(0) o3->de0b71dd-918a-dbdd-3442-448a7d2edf2a@10.51.6.3@o2ib3:582/0 lens 488/440 e 0 to 0 dl 1618921412 ref 1 fl Interpret:/0/0 rc 0/0 [200669.784240] Lustre: oak-OST0054: Bulk IO read error with de0b71dd-918a-dbdd-3442-448a7d2edf2a (at 10.51.6.3@o2ib3), client will retry: rc -110 [200864.002121] Lustre: oak-OST005e: Connection restored to ac4a35a1-ef91-4 (at 10.50.4.2@o2ib2) [200864.011675] Lustre: Skipped 376 previous similar messages [201151.234192] Lustre: oak-OST005e: Client 853f1535-ef30-151f-429c-d573236cff68 (at 10.51.15.11@o2ib3) reconnecting [201151.245664] Lustre: Skipped 43 previous similar messages [201466.895935] Lustre: oak-OST0040: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [201466.905593] Lustre: Skipped 310 previous similar messages [201790.071955] Lustre: oak-OST0054: Client 71bf2aab-e8fc-f628-8570-5443ce46d22f (at 10.50.9.13@o2ib2) reconnecting [201790.083342] Lustre: Skipped 29 previous similar messages [202072.093636] Lustre: oak-OST0040: Connection restored to 72c60fb7-316a-4 (at 10.50.16.2@o2ib2) [202072.103356] Lustre: Skipped 312 previous similar messages [202468.444427] Lustre: oak-OST0030: Client 7fd0339d-d05d-9ff6-3f86-704a0a70f4dc (at 10.50.15.3@o2ib2) reconnecting [202468.455793] Lustre: Skipped 49 previous similar messages 
[202672.736682] Lustre: oak-OST004a: Connection restored to b77576b5-26ca-4 (at 10.50.14.5@o2ib2) [202672.746316] Lustre: Skipped 302 previous similar messages [203218.444485] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.214@o2ib5: error 0(sending_nocred)(waiting) [203218.458667] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff860) failed: 5 [203218.458945] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bb79c190c00 [203218.458982] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.214@o2ib5 exceeded retry count 0 [203218.458983] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 5 previous similar messages [203218.458985] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd0a3a4e800 [203218.459533] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8ba9dcbed000 [203218.460105] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcd56b7e400 [203218.460108] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8ba9dcbed000 [203218.460111] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bac9bd20400 [203218.460118] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbef8e8d800 [203218.460713] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bb79c190c00 [203218.461874] LNet: 11475:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.214@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [203218.461882] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.214@o2ib5 failed: 5 [203218.461883] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous 
similar messages [203218.626419] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 895 previous similar messages [203219.266452] Lustre: oak-OST0052: Client eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) reconnecting [203219.275792] Lustre: Skipped 7 previous similar messages [203273.213273] Lustre: oak-OST0050: Connection restored to 6f53e647-4eee-4 (at 10.51.6.6@o2ib3) [203273.222796] Lustre: Skipped 203 previous similar messages [203273.545550] LustreError: 193423:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bc170fd3050 x1689944511198400/t0(0) o3->d41b6989-a369-4@10.50.5.34@o2ib2:106/0 lens 488/440 e 0 to 0 dl 1618923956 ref 1 fl Interpret:/0/0 rc 0/0 [203273.545558] LustreError: 227172:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bb30027c850 x1689944511198720/t0(0) o3->d41b6989-a369-4@10.50.5.34@o2ib2:106/0 lens 488/440 e 0 to 0 dl 1618923956 ref 1 fl Interpret:/0/0 rc 0/0 [203273.545562] LustreError: 227172:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 25 previous similar messages [203273.545593] Lustre: oak-OST0056: Bulk IO read error with d0c323ea-8af6-4 (at 10.50.10.11@o2ib2), client will retry: rc -110 [203273.545618] LustreError: 227171:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(3015) req@ffff8bbf05fcc850 x1685304102191424/t0(0) o4->fa45e88e-4706-4@10.50.7.37@o2ib2:107/0 lens 488/448 e 0 to 0 dl 1618923957 ref 1 fl Interpret:/0/0 rc 0/0 [203273.545623] LustreError: 227171:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 2 previous similar messages [203273.545643] Lustre: oak-OST0036: Bulk IO write error with df60e1f5-ba28-4 (at 10.50.10.61@o2ib2), client will retry: rc = -110 [203273.545646] Lustre: Skipped 2 previous similar messages [203273.676099] LustreError: 193423:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 5 previous similar messages [203873.638645] Lustre: oak-OST0040: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [203873.648276] 
Lustre: Skipped 1390 previous similar messages [204006.513352] Lustre: oak-OST0034: Client 122382b0-71fc-1b2e-90d3-c68975dc4fbc (at 10.50.3.37@o2ib2) reconnecting [204006.524751] Lustre: Skipped 1206 previous similar messages [204481.258591] Lustre: oak-OST0056: Connection restored to 3cf99e45-efe2-4 (at 10.49.28.1@o2ib1) [204481.268233] Lustre: Skipped 233 previous similar messages [205083.385626] Lustre: oak-OST0030: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [205083.395267] Lustre: Skipped 149 previous similar messages [205499.335981] Lustre: oak-OST0044: Client 27ff7f5e-c79d-4 (at 10.50.13.4@o2ib2) reconnecting [205499.345308] Lustre: Skipped 25 previous similar messages [205592.338086] Lustre: oak-OST0034: Client b80792d5-f73d-4 (at 10.50.15.5@o2ib2) reconnecting [205592.338087] Lustre: oak-OST0048: Client b80792d5-f73d-4 (at 10.50.15.5@o2ib2) reconnecting [205592.338090] Lustre: Skipped 4 previous similar messages [205685.565830] Lustre: oak-OST0034: Connection restored to cc9d2aad-2b35-1ca5-5745-93eb6c987ed7 (at 10.50.15.7@o2ib2) [205685.577491] Lustre: Skipped 236 previous similar messages [205759.671663] Lustre: oak-OST0044: Client be21abca-4d92-bbd5-3ccb-1ba191006e06 (at 10.50.6.17@o2ib2) reconnecting [205759.683059] Lustre: Skipped 13 previous similar messages [206226.491546] Lustre: oak-OST0034: Client 9bf448c0-5fbe-e1f0-bc01-b0f2eed04641 (at 10.50.12.9@o2ib2) reconnecting [206226.502943] Lustre: Skipped 8 previous similar messages [206288.596575] Lustre: oak-OST0032: Connection restored to (at 10.49.27.22@o2ib1) [206288.604848] Lustre: Skipped 320 previous similar messages [206889.834220] Lustre: oak-OST0054: Connection restored to bf28133d-22a3-4 (at 10.50.5.38@o2ib2) [206889.843840] Lustre: Skipped 139 previous similar messages [207141.234710] Lustre: oak-OST003c: Client 10bbb35d-b69c-3a09-088b-4bac90b1107f (at 10.51.13.1@o2ib3) reconnecting [207141.246077] Lustre: Skipped 5 previous similar messages [207223.876561] LustreError: 
241056:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(459776) req@ffff8be71ecf3850 x1695836416456704/t0(0) o3->eab9e2cf-af4c-4@10.51.15.1@o2ib3:272/0 lens 488/440 e 0 to 0 dl 1618927897 ref 1 fl Interpret:/0/0 rc 0/0 [207223.902273] LustreError: 241056:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 16 previous similar messages [207223.913082] Lustre: oak-OST005c: Bulk IO read error with eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3), client will retry: rc -110 [207223.925509] Lustre: Skipped 24 previous similar messages [207499.707810] Lustre: oak-OST0038: Connection restored to a634e013-aa8e-4 (at 10.50.16.3@o2ib2) [207499.717465] Lustre: Skipped 122 previous similar messages [208105.038025] Lustre: oak-OST0052: Connection restored to eb8a3648-da45-4 (at 10.50.5.69@o2ib2) [208105.047647] Lustre: Skipped 454 previous similar messages [208705.352341] Lustre: oak-OST0030: Connection restored to 17a482bb-dca0-4 (at 10.50.5.13@o2ib2) [208705.361962] Lustre: Skipped 576 previous similar messages [208751.487593] Lustre: oak-OST0034: Client 28c71697-daa8-2657-8122-7e55035a312f (at 10.50.6.68@o2ib2) reconnecting [208751.498987] Lustre: Skipped 12 previous similar messages [208903.547746] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.215@o2ib5: error 0(waiting) [208903.560877] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bb130c68c00 [208903.573029] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bb130c68c00 [208903.585174] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd538bab800 [208903.585200] LustreError: 238499:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bc4bafba050 x1694369938327616/t0(0) o4->529b8899-78f0-4@10.50.14.7@o2ib2:507/0 lens 488/448 e 0 to 0 dl 1618929642 ref 1 fl Interpret:/0/0 rc 0/0 [208903.585499] Lustre: oak-OST0048: Bulk IO write 
error with 529b8899-78f0-4 (at 10.50.14.7@o2ib2), client will retry: rc = -110 [208903.585501] Lustre: Skipped 3 previous similar messages [208903.641377] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd47b377800 [208903.653539] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be72256f800 [208948.034864] LustreError: 193436:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8baebf0ac050 x1685127295719232/t0(0) o3->777c6d6b-76ad-4@10.50.5.32@o2ib2:507/0 lens 488/440 e 0 to 0 dl 1618929642 ref 1 fl Interpret:/0/0 rc 0/0 [208948.034866] LustreError: 238485:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bbeaf952850 x1685127295719104/t0(0) o3->777c6d6b-76ad-4@10.50.5.32@o2ib2:507/0 lens 488/440 e 0 to 0 dl 1618929642 ref 1 fl Interpret:/0/0 rc 0/0 [208948.035000] Lustre: oak-OST0054: Bulk IO read error with 777c6d6b-76ad-4 (at 10.50.5.32@o2ib2), client will retry: rc -110 [208948.035872] LustreError: 235573:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8baebf0ac850 x1694369938327296/t0(0) o4->529b8899-78f0-4@10.50.14.7@o2ib2:507/0 lens 488/448 e 0 to 0 dl 1618929642 ref 1 fl Interpret:/0/0 rc 0/0 [208948.035874] LustreError: 235573:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 3 previous similar messages [208948.036160] Lustre: oak-OST0048: Bulk IO write error with 529b8899-78f0-4 (at 10.50.14.7@o2ib2), client will retry: rc = -110 [208948.036429] LustreError: 193453:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bd1dd4b7850 x1685127295719296/t0(0) o3->777c6d6b-76ad-4@10.50.5.32@o2ib2:507/0 lens 488/440 e 0 to 0 dl 1618929642 ref 1 fl Interpret:/0/0 rc 0/0 [208948.175629] LustreError: 193436:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 2 previous similar messages [208973.038148] LustreError: 
193449:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(8192) req@ffff8be7053b1050 x1687936315146176/t0(0) o3->6875c00b-5f8c-4@10.50.5.33@o2ib2:507/0 lens 488/440 e 0 to 0 dl 1618929642 ref 1 fl Interpret:/0/0 rc 0/0 [208973.038176] Lustre: oak-OST0044: Bulk IO read error with a7406eef-a378-4 (at 10.50.4.26@o2ib2), client will retry: rc -110 [208973.038177] Lustre: Skipped 4 previous similar messages [208973.038180] LustreError: 242901:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be76059d850 x1694222655143424/t0(0) o3->473ec31f-2480-4@10.50.14.9@o2ib2:507/0 lens 488/440 e 0 to 0 dl 1618929642 ref 1 fl Interpret:/0/0 rc 0/0 [208973.038232] LustreError: 5989:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1014890) req@ffff8be7237d6050 x1691470893976384/t0(0) o4->d4f26081-ad56-4@10.50.12.17@o2ib2:507/0 lens 488/448 e 0 to 0 dl 1618929642 ref 1 fl Interpret:/0/0 rc 0/0 [208973.038234] LustreError: 5989:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 4 previous similar messages [208973.038296] Lustre: oak-OST004e: Bulk IO write error with ac4a35a1-ef91-4 (at 10.50.4.2@o2ib2), client will retry: rc = -110 [208973.038298] Lustre: Skipped 4 previous similar messages [208973.162571] LustreError: 193449:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 17 previous similar messages [209051.357485] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.50.14.14@o2ib2 ns: filter-oak-OST0050_UUID lock: ffff8be2d0428fc0/0xf81cb91fdfe7d99 lrc: 4/0,0 mode: PR/PR res: [0x1300000400:0x35c8f0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.50.14.14@o2ib2 remote: 0xc6b7a6faddad94e1 expref: 29 pid: 193082 timeout: 209055 lvb_type: 1 [209051.404776] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 1 previous similar message [209051.415793] LustreError: 
193082:0:(ldlm_lockd.c:1351:ldlm_handle_enqueue0()) ### lock on destroyed export ffff8be71ee2e400 ns: filter-oak-OST0050_UUID lock: ffff8be2d042ad00/0xf81cb91fdfe7da0 lrc: 3/0,0 mode: --/PW res: [0x1300000400:0x35c8f0:0x0].0x0 rrc: 3 type: EXT [0->8191] (req 0->8191) flags: 0x50000000020000 nid: 10.50.14.14@o2ib2 remote: 0xc6b7a6faddad94e8 expref: 29 pid: 193082 timeout: 0 lvb_type: 0 [209070.228277] Lustre: oak-OST0032: Client be21abca-4d92-bbd5-3ccb-1ba191006e06 (at 10.50.6.17@o2ib2) reconnecting [209070.239648] Lustre: Skipped 13 previous similar messages [209081.022115] LustreError: 209400:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be71b030850 x1688315787486656/t0(0) o3->df4223b6-b7e1-4@10.50.9.30@o2ib2:689/0 lens 488/440 e 0 to 0 dl 1618929824 ref 1 fl Interpret:/0/0 rc 0/0 [209081.047109] Lustre: oak-OST0038: Bulk IO read error with df4223b6-b7e1-4 (at 10.50.9.30@o2ib2), client will retry: rc -110 [209081.059542] Lustre: Skipped 19 previous similar messages [209305.944092] Lustre: oak-OST004c: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [209305.953713] Lustre: Skipped 1744 previous similar messages [209363.246538] Lustre: oak-OST005a: Client e71e5ef0-3cf7-788c-e457-05fd6b805527 (at 10.51.13.14@o2ib3) reconnecting [209363.258018] Lustre: Skipped 1050 previous similar messages [209364.078562] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [209364.092051] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff940) failed: 5 [209364.092650] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd33c8a7000 [209364.092654] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd33c8a7000 [209364.092661] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [209364.092662] LNet: 
50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [209364.092664] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [209364.092665] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [209364.092666] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd03beda800 [209364.092668] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be2b58cec00 [209364.092671] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be719d86400 [209364.092673] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd33c8a5800 [209364.092675] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd33c8a5800 [209364.092680] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be7577c6400 [209364.244207] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 166 previous similar messages [209423.105389] LustreError: 193448:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be2a13a0050 x1696864978122496/t0(0) o3->de0b71dd-918a-dbdd-3442-448a7d2edf2a@10.51.6.3@o2ib3:211/0 lens 488/440 e 0 to 0 dl 1618930101 ref 1 fl Interpret:/0/0 rc 0/0 [209423.105665] Lustre: oak-OST0058: Bulk IO read error with c6866ec8-b698-4 (at 10.51.14.7@o2ib3), client will retry: rc -110 [209423.145092] LustreError: 193448:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 6 previous similar messages [209744.090906] Lustre: oak-OST0038: Client 1bd64923-00bd-7ffb-11d2-574e0422e23d (at 10.51.6.26@o2ib3) reconnecting [209744.090907] Lustre: oak-OST0048: Client 1bd64923-00bd-7ffb-11d2-574e0422e23d (at 10.51.6.26@o2ib3) reconnecting [209744.090910] Lustre: Skipped 17 previous similar messages [209906.614585] Lustre: 
oak-OST003c: Connection restored to 16c13723-4f07-4 (at 10.50.3.38@o2ib2) [209906.624207] Lustre: Skipped 845 previous similar messages [210113.847758] Lustre: oak-OST0030: haven't heard from client 3a6f9d0c-435c-c626-bcf9-1ca6ac8f4c9e (at 10.50.4.28@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be7237a6c00, cur 1618930760 expire 1618930610 last 1618930533 [210115.842267] Lustre: oak-OST004e: haven't heard from client 3a6f9d0c-435c-c626-bcf9-1ca6ac8f4c9e (at 10.50.4.28@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be7226a3c00, cur 1618930762 expire 1618930612 last 1618930535 [210115.866628] Lustre: Skipped 21 previous similar messages [210384.055243] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [210384.068448] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bba10f5c400 [210384.080599] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be5a5bdbc00 [210384.080639] LustreError: 241059:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd6883dc850 x1695169802158784/t0(0) o4->3d6a6ec7-32fe-4@10.51.14.9@o2ib3:478/0 lens 488/448 e 0 to 0 dl 1618931123 ref 1 fl Interpret:/0/0 rc 0/0 [210384.080726] Lustre: oak-OST004c: Bulk IO write error with 3d6a6ec7-32fe-4 (at 10.51.14.9@o2ib3), client will retry: rc = -110 [210384.080727] Lustre: Skipped 2 previous similar messages [210384.136811] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdac9b18800 [210418.841354] Lustre: oak-OST0046: Client 07e1f920-cffb-b4f8-01fb-b3be1cdfffbf (at 10.51.15.9@o2ib3) reconnecting [210418.852732] Lustre: Skipped 39 previous similar messages [210448.244770] LustreError: 241055:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be71b034050 x1689660543162432/t0(0) 
o3->e84ad1b6-d416-4@10.51.5.56@o2ib3:478/0 lens 488/440 e 0 to 0 dl 1618931123 ref 1 fl Interpret:/0/0 rc 0/0 [210448.245283] Lustre: oak-OST0032: Bulk IO read error with 330d404b-804c-4 (at 10.51.15.3@o2ib3), client will retry: rc -110 [210448.245285] Lustre: Skipped 5 previous similar messages [210448.288568] LustreError: 241055:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [210507.081087] Lustre: oak-OST005c: Connection restored to 405116b8-2d92-4 (at 10.50.9.55@o2ib2) [210507.090714] Lustre: Skipped 1016 previous similar messages [210773.942823] LustreError: 3680:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bb218220850 x1697043981391168/t0(0) o3->ed954721-f94c-14be-e64a-05a7f0440f42@10.50.5.14@o2ib2:118/0 lens 488/440 e 0 to 0 dl 1618931518 ref 1 fl Interpret:/0/0 rc 0/0 [210773.943021] Lustre: oak-OST0040: Bulk IO read error with ed954721-f94c-14be-e64a-05a7f0440f42 (at 10.50.5.14@o2ib2), client will retry: rc -110 [210773.943023] Lustre: Skipped 1 previous similar message [210773.989925] LustreError: 3680:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [210774.904720] LustreError: 193400:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bc2dabe4050 x1697043981390912/t0(0) o3->ed954721-f94c-14be-e64a-05a7f0440f42@10.50.5.14@o2ib2:118/0 lens 488/440 e 0 to 0 dl 1618931518 ref 1 fl Interpret:/0/0 rc 0/0 [211030.412604] Lustre: oak-OST003a: Client c3a6a019-f49f-b1b1-f6c5-6e4ca1071d0b (at 10.50.10.33@o2ib2) reconnecting [211030.412605] Lustre: oak-OST0044: Client c3a6a019-f49f-b1b1-f6c5-6e4ca1071d0b (at 10.50.10.33@o2ib2) reconnecting [211030.412606] Lustre: oak-OST0038: Client c3a6a019-f49f-b1b1-f6c5-6e4ca1071d0b (at 10.50.10.33@o2ib2) reconnecting [211030.412607] Lustre: Skipped 38 previous similar messages [211030.412610] Lustre: Skipped 38 previous similar messages [211107.196900] Lustre: oak-OST0038: Connection restored to 928b0c03-39ce-4 (at 
10.50.10.16@o2ib2) [211107.206617] Lustre: Skipped 1525 previous similar messages [211268.155799] LustreError: 137-5: oak-OST0047_UUID: not available for connect from 10.51.5.3@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [211269.675533] LustreError: 137-5: oak-OST0053_UUID: not available for connect from 10.51.2.19@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [211269.694939] LustreError: Skipped 1 previous similar message [211676.381527] Lustre: oak-OST004a: Client 5fbc6dd0-2ef9-b4a9-8121-d2dcfb28fdfb (at 10.50.10.8@o2ib2) reconnecting [211676.381528] Lustre: oak-OST003e: Client 5fbc6dd0-2ef9-b4a9-8121-d2dcfb28fdfb (at 10.50.10.8@o2ib2) reconnecting [211676.381531] Lustre: Skipped 47 previous similar messages [211676.410295] Lustre: Skipped 1 previous similar message [211707.578697] Lustre: oak-OST0056: Connection restored to be5087a8-f6b2-4 (at 10.50.3.48@o2ib2) [211707.588318] Lustre: Skipped 968 previous similar messages [211858.037071] LustreError: 137-5: oak-OST0037_UUID: not available for connect from 10.51.1.23@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [211858.056485] LustreError: Skipped 1 previous similar message [212314.348826] Lustre: oak-OST0056: Connection restored to b73f0864-ca7d-4 (at 10.50.3.33@o2ib2) [212314.358457] Lustre: Skipped 722 previous similar messages [212410.014977] Lustre: oak-OST0056: Client faa4cc0b-e04f-f706-dbeb-38fb26e6e239 (at 10.50.17.8@o2ib2) reconnecting [212410.026344] Lustre: Skipped 31 previous similar messages [212472.825514] LustreError: 137-5: oak-OST0055_UUID: not available for connect from 10.51.4.48@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [212477.866627] LustreError: 137-5: oak-OST003d_UUID: not available for connect from 10.51.5.31@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [212479.796919] LustreError: 137-5: oak-OST005b_UUID: not available for connect from 10.51.5.51@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [212917.447039] Lustre: oak-OST0034: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [212917.456733] Lustre: Skipped 394 previous similar messages [213100.535038] Lustre: oak-OST003e: Client c1ada0a4-de2a-c24a-4d28-c593030bd6bb (at 10.51.6.21@o2ib3) reconnecting [213100.546406] Lustre: Skipped 15 previous similar messages [213408.975343] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [213408.989333] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbe691ce800 [213409.001474] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbfb169c800 [213409.013646] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd07f083400 [213409.025822] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce23cb8400 [213409.025852] LustreError: 9692:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd64cff2850 x1696626187475776/t0(0) o4->156315a7-a82d-b4fe-847a-396165636f38@10.51.14.3@o2ib3:482/0 lens 488/448 e 0 to 0 dl 1618934147 ref 1 fl Interpret:/0/0 rc 0/0 [213409.026143] Lustre: oak-OST0042: Bulk IO write error with 156315a7-a82d-b4fe-847a-396165636f38 (at 10.51.14.3@o2ib3), client will retry: rc = -110 [213409.080069] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce23cb8400 [213409.092230] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc17793b400 [213409.104380] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce23cbfc00 [213409.116564] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbfb169c800 [213473.706301] LustreError: 193404:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be4bbdf2850 x1689648180375232/t0(0) o3->71f8f8a7-b976-4@10.51.5.59@o2ib3:482/0 lens 488/440 e 0 to 0 dl 1618934147 ref 1 fl Interpret:/0/0 rc 0/0 [213473.706594] Lustre: oak-OST003c: Bulk IO read error with 71f8f8a7-b976-4 (at 10.51.5.59@o2ib3), client will retry: rc -110 [213473.706595] Lustre: Skipped 2 previous similar messages [213473.750001] LustreError: 193404:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 4 previous similar messages [213517.661983] Lustre: oak-OST0042: Connection restored to ceb7b5c9-d9e0-4 (at 10.50.13.9@o2ib2) [213517.671604] Lustre: Skipped 588 previous similar messages [213732.064895] Lustre: oak-OST0032: Client de690330-3e87-67eb-b304-5d85f39fb5e5 (at 10.50.2.67@o2ib2) reconnecting [213732.064897] Lustre: oak-OST005a: Client de690330-3e87-67eb-b304-5d85f39fb5e5 (at 10.50.2.67@o2ib2) reconnecting [213732.064902] Lustre: Skipped 109 previous similar messages [214118.137796] Lustre: oak-OST0054: Connection restored to 8b40ead1-d5e7-4 (at 10.51.2.3@o2ib3) [214118.147323] Lustre: Skipped 576 previous similar messages [214333.223317] Lustre: oak-OST003c: Client eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) reconnecting [214333.232646] Lustre: Skipped 37 previous similar messages [214479.403607] LustreError: 193190:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bd52bd9d050 x1696856796146432/t0(0) o3->0d6cc963-a657-abef-5505-0c92f514af41@10.50.2.34@o2ib2:47/0 lens 488/440 e 0 to 0 dl 1618935222 ref 1 fl Interpret:/0/0 rc 0/0 [214479.405808] Lustre: oak-OST003a: Bulk IO read error with 0d6cc963-a657-abef-5505-0c92f514af41 (at 10.50.2.34@o2ib2), client will retry: rc -110 [214479.405809] Lustre: Skipped 4 
previous similar messages [214479.450882] LustreError: 193190:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [214617.227487] LustreError: 3688:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8badbbf2c850 x1696889170183232/t0(0) o4->7fd0339d-d05d-9ff6-3f86-704a0a70f4dc@10.50.15.3@o2ib2:180/0 lens 488/448 e 0 to 0 dl 1618935355 ref 1 fl Interpret:/0/0 rc 0/0 [214617.246839] Lustre: oak-OST003c: Bulk IO write error with 7fd0339d-d05d-9ff6-3f86-704a0a70f4dc (at 10.50.15.3@o2ib2), client will retry: rc = -110 [214617.269128] LustreError: 3688:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 4 previous similar messages [214618.515659] Lustre: oak-OST003c: Bulk IO write error with 7fd0339d-d05d-9ff6-3f86-704a0a70f4dc (at 10.50.15.3@o2ib2), client will retry: rc = -110 [214618.515784] LNet: 50602:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.50.15.3@o2ib2 for invalid MD 0x1676da12a0d3b7fa.0x8c1160f5 [214618.547306] Lustre: Skipped 5 previous similar messages [214718.344735] Lustre: oak-OST0058: Connection restored to (at 10.50.2.31@o2ib2) [214718.352926] Lustre: Skipped 500 previous similar messages [214942.510539] Lustre: oak-OST004a: Client 1639bdd0-384b-4 (at 10.51.6.19@o2ib3) reconnecting [214942.510540] Lustre: oak-OST005a: Client 1639bdd0-384b-4 (at 10.51.6.19@o2ib3) reconnecting [214942.510543] Lustre: Skipped 32 previous similar messages [214942.535280] Lustre: Skipped 1 previous similar message [215319.955422] Lustre: oak-OST0038: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [215319.965043] Lustre: Skipped 363 previous similar messages [215894.541496] Lustre: oak-OST0048: Client d48bbf19-d33f-1f44-7d24-fbb5f0736220 (at 10.50.0.12@o2ib2) reconnecting [215894.552892] Lustre: Skipped 15 previous similar messages [215921.524948] Lustre: oak-OST003a: Connection restored to d4f26081-ad56-4 (at 10.50.12.17@o2ib2) [215921.534669] Lustre: Skipped 482 previous similar 
messages [216523.110406] Lustre: oak-OST0046: Connection restored to 3ac62d68-a7e6-4 (at 10.50.7.42@o2ib2) [216523.120036] Lustre: Skipped 675 previous similar messages [216541.193539] Lustre: oak-OST0038: Client 4cee94e6-025c-589a-13ab-6c9ed337de31 (at 10.51.2.36@o2ib3) reconnecting [216541.204915] Lustre: Skipped 36 previous similar messages [217015.902496] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [217015.916605] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be32b114c00 [217015.928769] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd459603400 [217015.940936] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd459603400 [217015.953111] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd8b583b800 [217015.965260] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be66d413c00 [217015.965268] LustreError: 5983:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd02e9eb050 x1688874658392704/t0(0) o4->fd16aff2-0371-4@10.51.4.33@o2ib3:313/0 lens 488/448 e 0 to 0 dl 1618937753 ref 1 fl Interpret:/0/0 rc 0/0 [217015.965444] Lustre: oak-OST0050: Bulk IO write error with fd16aff2-0371-4 (at 10.51.4.33@o2ib3), client will retry: rc = -110 [217016.015357] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be66d415c00 [217016.027508] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd459601400 [217016.039660] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be338788800 [217016.051810] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be5aef4c000 
[217016.063962] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be5aef4c000 [217073.228064] LustreError: 193406:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be7577b1850 x1685031536517056/t0(0) o3->0507d702-6525-4@10.51.2.51@o2ib3:313/0 lens 488/440 e 0 to 0 dl 1618937753 ref 1 fl Interpret:/0/0 rc 0/0 [217073.228065] LustreError: 241056:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bdc8af6f850 x1685031536517824/t0(0) o3->0507d702-6525-4@10.51.2.51@o2ib3:313/0 lens 488/440 e 0 to 0 dl 1618937753 ref 1 fl Interpret:/0/0 rc 0/0 [217073.228129] Lustre: oak-OST004e: Bulk IO write error with 50e64de8-960e-4 (at 10.51.2.72@o2ib3), client will retry: rc = -110 [217073.228130] Lustre: Skipped 1 previous similar message [217073.228502] Lustre: oak-OST004e: Bulk IO read error with 0507d702-6525-4 (at 10.51.2.51@o2ib3), client will retry: rc -110 [217073.228503] Lustre: Skipped 2 previous similar messages [217073.316624] LustreError: 193406:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 6 previous similar messages [217126.396968] Lustre: oak-OST004a: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [217126.406602] Lustre: Skipped 420 previous similar messages [217181.733668] Lustre: oak-OST004e: Client 0507d702-6525-4 (at 10.51.2.51@o2ib3) reconnecting [217181.743006] Lustre: Skipped 24 previous similar messages [217313.867580] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.224@o2ib5: error 0(waiting) [217313.880311] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ffbe0) failed: 5 [217313.880346] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.224@o2ib5 exceeded retry count 0 [217313.880348] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 4 previous similar messages [217313.880351] LustreError: 50606:0:(events.c:450:server_bulk_callback()) 
event type 5, status -5, desc ffff8bbaa1be9000 [217313.891071] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbaa1be9000 [217313.891398] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbaa1be9000 [217313.891740] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbaa1be9000 [217313.891746] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bda0b0e7400 [217313.891748] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bda0b0e7400 [217313.891750] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bda0b0e7400 [217313.891755] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bda0b0e7400 [217314.009037] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 780 previous similar messages [217314.341423] LustreError: 193428:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bd95afda850 x1695157266742656/t0(0) o3->f504a7a3-efcf-5fbe-e631-3075fd4d7e5d@10.0.2.224@o2ib5:611/0 lens 488/440 e 0 to 0 dl 1618938051 ref 1 fl Interpret:/0/0 rc 0/0 [217314.345597] Lustre: oak-OST0056: Bulk IO read error with f504a7a3-efcf-5fbe-e631-3075fd4d7e5d (at 10.0.2.224@o2ib5), client will retry: rc -110 [217314.345598] Lustre: Skipped 5 previous similar messages [217314.388983] LustreError: 193428:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [217491.891936] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [217491.905281] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bad47a68000 [217491.917442] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bad47a68000 [217526.960360] 
LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.51.0.71@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [217526.960362] LustreError: 137-5: oak-OST0037_UUID: not available for connect from 10.51.0.71@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [217526.960365] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.51.0.71@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [217526.960367] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.0.71@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [217527.038048] LustreError: Skipped 20 previous similar messages [217548.295212] LustreError: 209397:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(8192) req@ffff8be45a850850 x1688874685219904/t0(0) o3->fd16aff2-0371-4@10.51.4.33@o2ib3:35/0 lens 488/440 e 0 to 0 dl 1618938230 ref 1 fl Interpret:/0/0 rc 0/0 [217548.295215] LustreError: 193102:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8bd09fc48850 x1689659037701248/t0(0) o4->bc22dc9e-c3bf-4@10.51.4.44@o2ib3:35/0 lens 488/448 e 0 to 0 dl 1618938230 ref 1 fl Interpret:/0/0 rc 0/0 [217548.295218] LustreError: 193102:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 3 previous similar messages [217548.295223] LustreError: 241055:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be8fbcc6850 x1689659037700992/t0(0) o4->bc22dc9e-c3bf-4@10.51.4.44@o2ib3:35/0 lens 488/448 e 0 to 0 dl 1618938230 ref 1 fl Interpret:/0/0 rc 0/0 [217548.295259] Lustre: oak-OST0044: Bulk IO read error with a06ff198-d843-4 (at 10.51.13.17@o2ib3), client will retry: rc -110 [217548.295261] Lustre: oak-OST005e: Bulk IO write error with bc22dc9e-c3bf-4 (at 10.51.4.44@o2ib3), client will retry: rc 
= -110 [217548.295261] Lustre: Skipped 1 previous similar message [217548.414497] LustreError: 209397:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 4 previous similar messages [217663.559327] LustreError: 193424:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bd7e93c1850 x1684942835629504/t0(0) o3->9004a44b-5f07-4@10.51.3.20@o2ib3:209/0 lens 488/440 e 0 to 0 dl 1618938404 ref 1 fl Interpret:/0/0 rc 0/0 [217663.584295] Lustre: oak-OST0042: Bulk IO read error with 9004a44b-5f07-4 (at 10.51.3.20@o2ib3), client will retry: rc -110 [217663.596737] Lustre: Skipped 4 previous similar messages [217667.924303] LustreError: 193438:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be45a4b4050 x1688874694243008/t0(0) o4->fd16aff2-0371-4@10.51.4.33@o2ib3:216/0 lens 488/448 e 0 to 0 dl 1618938411 ref 1 fl Interpret:/0/0 rc 0/0 [217667.949436] Lustre: oak-OST005a: Bulk IO write error with fd16aff2-0371-4 (at 10.51.4.33@o2ib3), client will retry: rc = -110 [217667.962197] Lustre: Skipped 9 previous similar messages [217670.747602] LustreError: 209397:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be6e54c3050 x1688874694237632/t0(0) o4->fd16aff2-0371-4@10.51.4.33@o2ib3:216/0 lens 488/448 e 0 to 0 dl 1618938411 ref 1 fl Interpret:/0/0 rc 0/0 [217670.754679] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.33@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x8f7519f5 [217670.754681] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 3 previous similar messages [217670.799644] LustreError: 209397:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [217727.581578] Lustre: oak-OST0032: Connection restored to 26c3d7e5-cee7-4 (at 10.50.9.54@o2ib2) [217727.591214] Lustre: Skipped 1129 previous similar messages [217822.642915] Lustre: oak-OST004e: Client 90752487-0b3e-1696-21a1-6c81abc18872 (at 10.51.1.2@o2ib3) reconnecting [217822.654201] 
Lustre: Skipped 709 previous similar messages [218327.781900] Lustre: oak-OST004e: Connection restored to (at 10.50.2.51@o2ib2) [218327.790161] Lustre: Skipped 323 previous similar messages [218393.649637] Lustre: oak-OST005a: haven't heard from client 514498ac-013a-b01f-7468-7d509df2db40 (at 10.50.0.11@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be7aa8d4800, cur 1618939040 expire 1618938890 last 1618938813 [218393.674007] Lustre: Skipped 1 previous similar message [218394.664602] Lustre: oak-OST0030: haven't heard from client 514498ac-013a-b01f-7468-7d509df2db40 (at 10.50.0.11@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be7236dd400, cur 1618939041 expire 1618938891 last 1618938814 [218394.689014] Lustre: Skipped 230 previous similar messages [218446.040824] Lustre: oak-OST0036: Client 9e5aa46d-038e-4 (at 10.50.7.9@o2ib2) reconnecting [218446.040824] Lustre: oak-OST003a: Client 9e5aa46d-038e-4 (at 10.50.7.9@o2ib2) reconnecting [218446.040828] Lustre: Skipped 26 previous similar messages [218446.065366] Lustre: Skipped 2 previous similar messages [218930.877745] Lustre: oak-OST0046: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [218930.887383] Lustre: Skipped 474 previous similar messages [219404.400052] Lustre: oak-OST0032: Client f46b1f75-6ac7-4 (at 10.51.15.6@o2ib3) reconnecting [219404.409386] Lustre: Skipped 38 previous similar messages [219531.515784] Lustre: oak-OST0044: Connection restored to a62eb9c4-f0ff-4 (at 10.51.3.63@o2ib3) [219531.525457] Lustre: Skipped 76 previous similar messages [220134.339603] Lustre: oak-OST0032: Connection restored to 93c8e0cc-3df7-4 (at 10.51.4.4@o2ib3) [220134.349134] Lustre: Skipped 264 previous similar messages [220737.349894] Lustre: oak-OST0030: Connection restored to be5087a8-f6b2-4 (at 10.50.3.48@o2ib2) [220737.359513] Lustre: Skipped 139 previous similar messages [221345.047271] Lustre: oak-OST004a: Connection restored to eab9e2cf-af4c-4 (at 
10.51.15.1@o2ib3) [221345.056897] Lustre: Skipped 115 previous similar messages [221952.404161] Lustre: oak-OST0040: Connection restored to 1a571093-7cfe-4 (at 10.50.10.9@o2ib2) [221952.413791] Lustre: Skipped 176 previous similar messages [222072.560597] Lustre: oak-OST005c: haven't heard from client c8c840df-9c8d-a80d-65c8-9f97ec5e9f96 (at 10.51.1.9@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be723c9d400, cur 1618942719 expire 1618942569 last 1618942492 [222072.584863] Lustre: Skipped 32 previous similar messages [222552.509406] Lustre: oak-OST0044: Connection restored to 006fe87a-1b9a-4 (at 10.51.15.15@o2ib3) [222552.519124] Lustre: Skipped 123 previous similar messages [222618.561621] Lustre: oak-OST005a: haven't heard from client b2611135-9735-00c6-3140-35bce6673f80 (at 10.210.12.10@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be721e0a000, cur 1618943265 expire 1618943115 last 1618943038 [222618.586074] Lustre: Skipped 167 previous similar messages [223153.346228] Lustre: oak-OST005c: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [223153.355847] Lustre: Skipped 225 previous similar messages [223762.385875] Lustre: oak-OST0030: Client fab123b0-1ddb-1087-562d-2215869e7c07 (at 10.210.15.140@tcp1) reconnecting [223762.386058] Lustre: oak-OST0038: Connection restored to (at 10.210.15.140@tcp1) [223762.386060] Lustre: Skipped 406 previous similar messages [223762.411914] Lustre: Skipped 14 previous similar messages [224385.635251] Lustre: oak-OST005e: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [224385.644884] Lustre: Skipped 353 previous similar messages [225023.223720] Lustre: oak-OST004a: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [225023.233346] Lustre: Skipped 96 previous similar messages [225300.592295] Lustre: oak-OST0044: Client ce8388c3-ee1c-18c4-8f03-235d1bdd9de0 (at 10.50.3.39@o2ib2) reconnecting [225300.592308] Lustre: oak-OST005c: Client 
ce8388c3-ee1c-18c4-8f03-235d1bdd9de0 (at 10.50.3.39@o2ib2) reconnecting [225300.592310] Lustre: Skipped 18 previous similar messages [225516.491064] Lustre: oak-OST0040: haven't heard from client 9b58b185-f47f-5dc8-cab2-e05c1c8e20a6 (at 10.210.12.59@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be71eeb2800, cur 1618946163 expire 1618946013 last 1618945936 [225516.515536] Lustre: Skipped 23 previous similar messages [225592.479673] Lustre: oak-OST0050: haven't heard from client 24be0a47-1b33-0b5c-28de-7a37c40c9b32 (at 10.210.12.64@tcp1) in 225 seconds. I think it's dead, and I am evicting it. exp ffff8be96d3c0c00, cur 1618946239 expire 1618946089 last 1618946014 [225592.504151] Lustre: Skipped 95 previous similar messages [225594.481637] Lustre: oak-OST005a: haven't heard from client 24be0a47-1b33-0b5c-28de-7a37c40c9b32 (at 10.210.12.64@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be7a2e47000, cur 1618946241 expire 1618946091 last 1618946014 [225594.506148] Lustre: Skipped 79 previous similar messages [225624.326102] Lustre: oak-OST005c: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [225624.335722] Lustre: Skipped 135 previous similar messages [225782.473886] Lustre: oak-OST0056: haven't heard from client 3ddbdb1e-df09-7259-e6d1-5e154461dc64 (at 10.210.12.67@tcp1) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bc4c25fd800, cur 1618946429 expire 1618946279 last 1618946202 [225782.498465] Lustre: Skipped 15 previous similar messages [226246.332237] Lustre: oak-OST0044: Connection restored to 934d532f-1b5d-4 (at 10.51.4.50@o2ib3) [226246.341857] Lustre: Skipped 496 previous similar messages [226444.976021] Lustre: oak-OST0044: Client a3a6b28c-7c07-7396-3ae8-aa974c72f517 (at 10.50.8.1@o2ib2) reconnecting [226444.987688] Lustre: Skipped 2 previous similar messages [226848.085327] Lustre: oak-OST0040: Connection restored to 7d4930b0-1dcd-4 (at 10.51.13.8@o2ib3) [226848.094947] Lustre: Skipped 356 previous similar messages [227451.623938] Lustre: oak-OST004a: Connection restored to 311b26ee-30a2-4 (at 10.50.15.6@o2ib2) [227451.633559] Lustre: Skipped 181 previous similar messages [228053.915609] Lustre: oak-OST0040: Connection restored to (at 10.50.13.7@o2ib2) [228053.923773] Lustre: Skipped 180 previous similar messages [228675.459806] Lustre: oak-OST005e: Connection restored to 3e703ead-ff34-4 (at 10.51.14.1@o2ib3) [228675.469463] Lustre: Skipped 158 previous similar messages [229284.560416] Lustre: oak-OST003a: Connection restored to d7c49207-5239-4 (at 10.50.10.47@o2ib2) [229284.570136] Lustre: Skipped 106 previous similar messages [229864.732848] Lustre: oak-OST0042: Client 496bcf56-49e1-a86a-30f9-e288dacdba35 (at 10.210.12.7@tcp1) reconnecting [229864.744217] Lustre: Skipped 1 previous similar message [229865.582879] Lustre: oak-OST0056: Client 496bcf56-49e1-a86a-30f9-e288dacdba35 (at 10.210.12.7@tcp1) reconnecting [229865.594243] Lustre: Skipped 1 previous similar message [229868.109581] Lustre: oak-OST004a: Client 496bcf56-49e1-a86a-30f9-e288dacdba35 (at 10.210.12.7@tcp1) reconnecting [229868.120947] Lustre: Skipped 2 previous similar messages [229870.447519] Lustre: oak-OST005a: Client 496bcf56-49e1-a86a-30f9-e288dacdba35 (at 10.210.12.7@tcp1) reconnecting [229870.458911] Lustre: Skipped 1 previous similar message [229874.797371] Lustre: oak-OST003e: Client 
496bcf56-49e1-a86a-30f9-e288dacdba35 (at 10.210.12.7@tcp1) reconnecting [229874.808755] Lustre: Skipped 3 previous similar messages [229886.178878] Lustre: oak-OST0034: Connection restored to 3b262606-25cc-4 (at 10.50.7.40@o2ib2) [229886.188552] Lustre: Skipped 111 previous similar messages [229903.087819] Lustre: oak-OST0046: Client 496bcf56-49e1-a86a-30f9-e288dacdba35 (at 10.210.12.7@tcp1) reconnecting [229903.099183] Lustre: Skipped 2 previous similar messages [229928.558133] Lustre: oak-OST0044: Client 496bcf56-49e1-a86a-30f9-e288dacdba35 (at 10.210.12.7@tcp1) reconnecting [230486.685707] Lustre: oak-OST004c: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [230486.695341] Lustre: Skipped 127 previous similar messages [231046.513624] Lustre: oak-OST0052: Client 8b66e4db-fcc9-6215-2cf9-86ed141f42d8 (at 10.51.2.26@o2ib3) reconnecting [231046.524993] Lustre: Skipped 2 previous similar messages [231089.125660] Lustre: oak-OST0054: Connection restored to 45f8c97c-22a3-4 (at 10.51.3.26@o2ib3) [231089.135307] Lustre: Skipped 681 previous similar messages [231150.142990] Lustre: oak-OST0034: Client c7c97132-e759-4 (at 10.51.15.4@o2ib3) reconnecting [231150.142991] Lustre: oak-OST0050: Client c7c97132-e759-4 (at 10.51.15.4@o2ib3) reconnecting [231163.853912] Lustre: oak-OST0048: Client 9458049c-ca8d-335b-3531-2606964e11c0 (at 10.51.2.31@o2ib3) reconnecting [231163.865277] Lustre: Skipped 1 previous similar message [231186.762508] Lustre: oak-OST005e: Client d073f313-60b4-4 (at 10.51.15.5@o2ib3) reconnecting [231186.771836] Lustre: Skipped 2 previous similar messages [231189.128410] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [231189.142534] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd671c17000 [231189.154689] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd221d16c00 
[231189.166857] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd221d16c00 [231189.179012] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcbbb50c800 [231189.191159] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcbbb50c800 [231189.203326] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcbbb50c800 [231189.215505] LustreError: 193439:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be08ac0a850 x1689658250376768/t0(0) o4->3e7c55d0-08c5-4@10.51.4.52@o2ib3:138/0 lens 536/456 e 0 to 0 dl 1618951923 ref 1 fl Interpret:/0/0 rc 0/0 [231189.241140] Lustre: oak-OST0042: Bulk IO write error with 3e7c55d0-08c5-4 (at 10.51.4.52@o2ib3), client will retry: rc = -110 [231189.253888] Lustre: Skipped 2 previous similar messages [231211.736117] LustreError: 137-5: oak-OST0057_UUID: not available for connect from 10.51.4.38@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [231211.736118] LustreError: 137-5: oak-OST0047_UUID: not available for connect from 10.51.4.38@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[231237.517055] Lustre: oak-OST003c: Client 22f75865-19b9-71a9-aa3b-548e62a806e3 (at 10.51.0.16@o2ib3) reconnecting [231237.528438] Lustre: Skipped 3 previous similar messages [231244.933492] LustreError: 193413:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd95afdd850 x1697586491247040/t0(0) o4->4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d@10.51.2.32@o2ib3:197/0 lens 488/448 e 0 to 0 dl 1618951982 ref 1 fl Interpret:/0/0 rc 0/0 [231244.960798] Lustre: oak-OST0056: Bulk IO write error with 4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d (at 10.51.2.32@o2ib3), client will retry: rc = -110 [231247.821843] LustreError: 193401:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bde23c2b050 x1685012149943616/t0(0) o3->182778a0-b920-4@10.51.0.61@o2ib3:145/0 lens 488/440 e 0 to 0 dl 1618951930 ref 1 fl Interpret:/0/0 rc 0/0 [231247.821845] LustreError: 187417:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bcf3b8b5850 x1685012149943680/t0(0) o3->182778a0-b920-4@10.51.0.61@o2ib3:145/0 lens 488/440 e 0 to 0 dl 1618951930 ref 1 fl Interpret:/0/0 rc 0/0 [231247.821973] Lustre: oak-OST0050: Bulk IO read error with 182778a0-b920-4 (at 10.51.0.61@o2ib3), client will retry: rc -110 [231259.631082] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.32@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x97f2f65d [231269.249952] LustreError: 193407:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be701dad050 x1696876630955776/t0(0) o4->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:196/0 lens 488/448 e 0 to 0 dl 1618951981 ref 1 fl Interpret:/0/0 rc 0/0 [231269.277210] Lustre: oak-OST004a: Bulk IO write error with 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3), client will retry: rc = -110 [231304.762681] Lustre: oak-OST0054: Client 416c99fb-653c-af17-c233-d5516de96d20 (at 10.51.15.7@o2ib3) reconnecting [231304.774094] Lustre: Skipped 
20 previous similar messages [231316.579824] LustreError: 209392:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd505748050 x1696617130223616/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:258/0 lens 488/448 e 0 to 0 dl 1618952043 ref 1 fl Interpret:/0/0 rc 0/0 [231316.607016] Lustre: oak-OST0040: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 [231325.318130] LustreError: 137-5: oak-OST005d_UUID: not available for connect from 10.51.0.12@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [231328.171557] LustreError: 193441:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be6bfb99050 x1697586491458304/t0(0) o4->4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d@10.51.2.32@o2ib3:280/0 lens 488/448 e 0 to 0 dl 1618952065 ref 1 fl Interpret:/0/0 rc 0/0 [231328.185697] Lustre: oak-OST0052: Bulk IO write error with 4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d (at 10.51.2.32@o2ib3), client will retry: rc = -110 [231328.213390] LustreError: 193441:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [231328.939300] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.32@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x97fc1605 [231328.971030] LustreError: 137-5: oak-OST0055_UUID: not available for connect from 10.51.0.12@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [231332.841820] LustreError: 137-5: oak-OST004d_UUID: not available for connect from 10.51.12.6@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[231332.861249] LustreError: Skipped 1 previous similar message [231432.823456] Lustre: oak-OST004c: Client 5382169d-059c-4 (at 10.51.2.52@o2ib3) reconnecting [231432.832788] Lustre: Skipped 37 previous similar messages [231506.567113] LNet: 182063:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [231506.581326] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcf6515ac00 [231506.593543] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be05f7db000 [231506.605697] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be05f7db000 [231506.617889] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be9797ae000 [231506.630089] LustreError: 193188:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bcf3b8b1850 x1696860057108672/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:457/0 lens 488/448 e 0 to 0 dl 1618952242 ref 1 fl Interpret:/0/0 rc 0/0 [231506.655474] Lustre: oak-OST004a: Bulk IO write error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc = -110 [231506.668100] Lustre: Skipped 2 previous similar messages [231544.211504] LustreError: 193439:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be53bda1050 x1696860057189632/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:497/0 lens 488/448 e 0 to 0 dl 1618952282 ref 1 fl Interpret:/0/0 rc 0/0 [231544.236671] Lustre: oak-OST003a: Bulk IO write error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc = -110 [231547.830795] LustreError: 5992:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8bcf4f400850 x1688534600642496/t0(0) o4->0028e5c0-f60e-4@10.51.4.34@o2ib3:457/0 lens 488/448 e 0 to 0 dl 1618952242 ref 1 fl Interpret:/0/0 rc 0/0 [231547.857202] LustreError: 
5992:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 8 previous similar messages [231572.831683] LustreError: 241052:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bd176695050 x1685012156249728/t0(0) o3->182778a0-b920-4@10.51.0.61@o2ib3:459/0 lens 488/440 e 0 to 0 dl 1618952244 ref 1 fl Interpret:/0/0 rc 0/0 [231572.831807] Lustre: oak-OST0050: Bulk IO read error with 182778a0-b920-4 (at 10.51.0.61@o2ib3), client will retry: rc -110 [231572.831809] Lustre: Skipped 1 previous similar message [231572.875288] LustreError: 241052:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [231661.511192] LustreError: 137-5: oak-OST005d_UUID: not available for connect from 10.51.1.49@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [231661.530615] LustreError: Skipped 1 previous similar message [231691.066015] Lustre: oak-OST005a: Connection restored to 777c6d6b-76ad-4 (at 10.50.5.32@o2ib2) [231691.075639] Lustre: Skipped 476 previous similar messages [231704.073883] Lustre: oak-OST0058: Client d073f313-60b4-4 (at 10.51.15.5@o2ib3) reconnecting [231704.083235] Lustre: Skipped 52 previous similar messages [231720.137572] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [231720.152155] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb93d18000 [231720.164313] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb93d18000 [231720.176488] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcc952cb400 [231720.188641] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcc952cb400 [231720.200789] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8ba9dcbe9c00 [231720.212950] 
LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd0b07c4400 [231720.225095] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd0b07c4400 [231721.405675] LustreError: 193407:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be6a00d9050 x1696860057514304/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:674/0 lens 488/448 e 0 to 0 dl 1618952459 ref 1 fl Interpret:/0/0 rc 0/0 [231721.431169] Lustre: oak-OST005c: Bulk IO write error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc = -110 [231721.443810] Lustre: Skipped 1 previous similar message [231772.832514] LustreError: 193453:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bde96bd9050 x1696860057515328/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:674/0 lens 488/448 e 0 to 0 dl 1618952459 ref 1 fl Interpret:/0/0 rc 0/0 [231772.832545] LustreError: 209391:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1231277) req@ffff8bd714d39050 x1684937009031808/t0(0) o4->1ac0ea05-a948-4@10.51.3.51@o2ib3:674/0 lens 488/448 e 0 to 0 dl 1618952459 ref 1 fl Interpret:/0/0 rc 0/0 [231772.832599] LustreError: 193409:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 1048576(4194304) req@ffff8bdec316d850 x1685012159543104/t0(0) o3->182778a0-b920-4@10.51.0.61@o2ib3:674/0 lens 488/440 e 0 to 0 dl 1618952459 ref 1 fl Interpret:/0/0 rc 0/0 [231772.832935] Lustre: oak-OST0050: Bulk IO read error with 182778a0-b920-4 (at 10.51.0.61@o2ib3), client will retry: rc -110 [231772.832936] Lustre: Skipped 1 previous similar message [231772.928389] LustreError: 193453:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 2 previous similar messages [231776.139142] LustreError: 9695:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be70ef0f050 x1696860057606656/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:726/0 lens 488/448 e 0 to 0 dl 1618952511 ref 1 fl 
Interpret:/0/0 rc 0/0 [231892.705425] LustreError: 5989:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bdb645ad850 x1684955485826432/t0(0) o4->da817238-9680-4@10.51.3.1@o2ib3:96/0 lens 488/448 e 0 to 0 dl 1618952636 ref 1 fl Interpret:/0/0 rc 0/0 [231892.730224] Lustre: oak-OST0042: Bulk IO write error with da817238-9680-4 (at 10.51.3.1@o2ib3), client will retry: rc = -110 [231892.738916] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.3.1@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x9841542d [231892.738918] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [231892.769676] Lustre: Skipped 8 previous similar messages [231898.836610] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.3.1@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x98415425 [231901.377493] Lustre: oak-OST0038: haven't heard from client a020122d-d9f6-4 (at 10.51.2.21@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be43fa22800, cur 1618952548 expire 1618952398 last 1618952321 [231901.399814] Lustre: Skipped 167 previous similar messages [231939.414348] LustreError: 193196:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be8fbcc7050 x1696617131439232/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:129/0 lens 488/448 e 0 to 0 dl 1618952669 ref 1 fl Interpret:/0/0 rc 0/0 [231969.266118] LustreError: 137-5: oak-OST0053_UUID: not available for connect from 10.51.0.12@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[232113.105315] LustreError: 193424:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be9298c3850 x1696876633346560/t0(0) o4->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:312/0 lens 488/448 e 0 to 0 dl 1618952852 ref 1 fl Interpret:/0/0 rc 0/0 [232113.132729] Lustre: oak-OST0030: Bulk IO write error with 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3), client will retry: rc = -110 [232113.147492] Lustre: Skipped 1 previous similar message [232222.346120] Lustre: oak-OST0030: Client 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3) reconnecting [232222.357529] Lustre: Skipped 394 previous similar messages [232291.738348] Lustre: oak-OST004c: Connection restored to 9b12e584-d591-4 (at 10.51.12.20@o2ib3) [232291.748091] Lustre: Skipped 497 previous similar messages [232368.989244] LustreError: 137-5: oak-OST003f_UUID: not available for connect from 10.51.14.5@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [232866.689188] LustreError: 137-5: oak-OST005f_UUID: not available for connect from 10.51.15.9@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[232866.708607] LustreError: Skipped 8 previous similar messages [232910.445054] Lustre: oak-OST003a: Connection restored to 12069b00-d6dd-4 (at 10.50.4.66@o2ib2) [232910.454680] Lustre: Skipped 348 previous similar messages [232945.221065] Lustre: oak-OST0054: Client 4eca94b5-20cc-8fd4-b8fd-ebd10e301645 (at 10.51.1.21@o2ib3) reconnecting [232945.232453] Lustre: Skipped 73 previous similar messages [232987.084795] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [232987.099066] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd03beddc00 [232987.111212] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd03bedc000 [232987.123371] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd03bedc000 [232987.135532] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be71ff69400 [233047.868138] LustreError: 193406:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be9068eb850 x1697586521294656/t0(0) o4->9458049c-ca8d-335b-3531-2606964e11c0@10.51.2.31@o2ib3:430/0 lens 488/448 e 0 to 0 dl 1618953725 ref 1 fl Interpret:/0/0 rc 0/0 [233047.868140] LustreError: 241048:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(4194304) req@ffff8be731ba2850 x1688534610641728/t0(0) o4->0028e5c0-f60e-4@10.51.4.34@o2ib3:430/0 lens 488/448 e 0 to 0 dl 1618953725 ref 1 fl Interpret:/0/0 rc 0/0 [233047.868143] LustreError: 241048:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 4 previous similar messages [233047.868299] Lustre: oak-OST0056: Bulk IO write error with 0028e5c0-f60e-4 (at 10.51.4.34@o2ib3), client will retry: rc = -110 [233047.868300] Lustre: Skipped 1 previous similar message [233047.868325] Lustre: oak-OST0050: Bulk IO read error with 182778a0-b920-4 (at 10.51.0.61@o2ib3), client will 
retry: rc -110 [233047.868326] Lustre: Skipped 3 previous similar messages [233047.969817] LustreError: 193406:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 2 previous similar messages [233214.311012] Lustre: oak-OST0040: haven't heard from client eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be78c630000, cur 1618953861 expire 1618953711 last 1618953634 [233349.580557] LustreError: 204436:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be7aa9f4050 x1697586495676928/t0(0) o4->4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d@10.51.2.32@o2ib3:41/0 lens 488/448 e 0 to 0 dl 1618954091 ref 1 fl Interpret:/0/0 rc 0/0 [233349.607627] LustreError: 204436:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [233349.618474] Lustre: oak-OST0052: Bulk IO write error with 4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d (at 10.51.2.32@o2ib3), client will retry: rc = -110 [233349.633252] Lustre: Skipped 1 previous similar message [233511.444527] Lustre: oak-OST0038: Connection restored to fecbd0eb-e0ec-4 (at 10.50.8.20@o2ib2) [233511.454172] Lustre: Skipped 202 previous similar messages [233576.014006] Lustre: oak-OST004c: Client a5681ec8-1d55-518e-30a3-8af758dafdd3 (at 10.51.13.23@o2ib3) reconnecting [233576.025471] Lustre: Skipped 90 previous similar messages [233652.069255] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [233652.081977] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff5c0) failed: 5 [233652.092379] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 80 previous similar messages [233652.092513] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [233652.092515] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [233652.092518] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event 
type 5, status -5, desc ffff8bd8f6c1b800 [233652.092859] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd8f6c1f000 [233652.092861] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be6ab3ee800 [233652.092864] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd62b521000 [233652.092868] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be757080800 [233697.890164] LustreError: 193140:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(98304) req@ffff8be6a2ad6850 x1696192780761792/t0(0) o3->afb0548c-9287-4@10.51.15.2@o2ib3:339/0 lens 488/440 e 0 to 0 dl 1618954389 ref 1 fl Interpret:/0/0 rc 0/0 [233697.890174] LustreError: 209399:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8be478dd2850 x1684934594412544/t0(0) o4->45f8c97c-22a3-4@10.51.3.26@o2ib3:341/0 lens 488/448 e 0 to 0 dl 1618954391 ref 1 fl Interpret:/0/0 rc 0/0 [233697.890190] LustreError: 5988:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bca51f4a850 x1689694583244032/t0(0) o3->822a70f4-76f2-4@10.51.5.50@o2ib3:341/0 lens 488/440 e 0 to 0 dl 1618954391 ref 1 fl Interpret:/0/0 rc 0/0 [233697.890197] Lustre: oak-OST0040: Bulk IO read error with adcdf203-d39e-4 (at 10.51.5.37@o2ib3), client will retry: rc -110 [233697.890198] Lustre: Skipped 1 previous similar message [233697.890277] Lustre: oak-OST0056: Bulk IO write error with 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3), client will retry: rc = -110 [233698.000402] LustreError: 193140:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 4 previous similar messages [233722.889938] LustreError: 193412:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(446464) req@ffff8be973f9c850 x1688875036169536/t0(0) o4->fd16aff2-0371-4@10.51.4.33@o2ib3:343/0 lens 488/448 e 0 to 0 dl 1618954393 
ref 1 fl Interpret:/0/0 rc 0/0 [233722.889968] LustreError: 193190:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(3921) req@ffff8be71d7ee050 x1684937012954880/t0(0) o3->1ac0ea05-a948-4@10.51.3.51@o2ib3:343/0 lens 488/440 e 0 to 0 dl 1618954393 ref 1 fl Interpret:/0/0 rc 0/0 [233722.889985] LustreError: 193143:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bcc39c79050 x1684935990537984/t0(0) o3->9f0935fb-eb44-4@10.51.3.27@o2ib3:345/0 lens 488/440 e 0 to 0 dl 1618954395 ref 1 fl Interpret:/0/0 rc 0/0 [233722.889986] Lustre: oak-OST004e: Bulk IO read error with 1ac0ea05-a948-4 (at 10.51.3.51@o2ib3), client will retry: rc -110 [233722.889988] LustreError: 193143:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [233722.889988] Lustre: Skipped 4 previous similar messages [233722.995540] LustreError: 193412:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 10 previous similar messages [233819.154647] Lustre: 193158:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618954292/real 1618954292] req@ffff8bc327598d80 x1697353939439424/t0(0) o106->oak-OST0056@10.51.13.8@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1618954465 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [233823.757533] LustreError: 5990:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bd870eb0050 x1684941779339200/t0(0) o3->93c8e0cc-3df7-4@10.51.4.4@o2ib3:517/0 lens 488/440 e 0 to 0 dl 1618954567 ref 1 fl Interpret:/0/0 rc 0/0 [233823.782210] Lustre: oak-OST0058: Bulk IO read error with 93c8e0cc-3df7-4 (at 10.51.4.4@o2ib3), client will retry: rc -110 [233823.794600] Lustre: Skipped 8 previous similar messages [233830.456554] Lustre: oak-OST005a: Bulk IO write error with fd16aff2-0371-4 (at 10.51.4.33@o2ib3), client will retry: rc = -110 [233830.469277] Lustre: Skipped 13 previous similar messages [233831.189952] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping 
REPLY from 12345-10.51.4.33@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x99450475 [233831.206857] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [233835.634427] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.33@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x9945698d [233835.651354] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 2 previous similar messages [233986.431748] LustreError: 5997:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bcc65500050 x1696876638565248/t0(0) o4->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:675/0 lens 488/448 e 0 to 0 dl 1618954725 ref 1 fl Interpret:/0/0 rc 0/0 [233986.458629] LustreError: 5997:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [234143.444012] Lustre: oak-OST0038: Connection restored to fd16aff2-0371-4 (at 10.51.4.33@o2ib3) [234143.453631] Lustre: Skipped 913 previous similar messages [234210.992379] Lustre: oak-OST0040: Client 60fdba1c-732a-b99c-63c8-be091395f5f0 (at 10.51.12.5@o2ib3) reconnecting [234211.003759] Lustre: Skipped 740 previous similar messages [234312.876151] LustreError: 5988:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bcf8da44050 x1696876639557504/t0(0) o4->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:234/0 lens 488/448 e 0 to 0 dl 1618955039 ref 1 fl Interpret:/0/0 rc 0/0 [234312.903176] Lustre: oak-OST0048: Bulk IO write error with 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3), client will retry: rc = -110 [234312.917954] Lustre: Skipped 2 previous similar messages [234341.233296] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.12.2@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[234341.252708] LustreError: Skipped 1 previous similar message [234478.561317] LustreError: 5986:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd105c06850 x1696860063378560/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:410/0 lens 488/448 e 0 to 0 dl 1618955215 ref 1 fl Interpret:/0/0 rc 0/0 [234478.586074] LustreError: 5986:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [234487.786392] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.1@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x999bf9f5 [234531.473663] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [234531.487579] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be90ac21400 [234531.499739] LustreError: 193194:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be6a4dc3050 x1688875040715456/t0(0) o4->fd16aff2-0371-4@10.51.4.33@o2ib3:464/0 lens 488/448 e 0 to 0 dl 1618955269 ref 1 fl Interpret:/0/0 rc 0/0 [234531.525170] LustreError: 193194:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [234597.925772] LustreError: 5999:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1241088) req@ffff8bde689a4050 x1684954388196608/t0(0) o4->0871860c-cfdf-4@10.51.3.30@o2ib3:465/0 lens 488/448 e 0 to 0 dl 1618955270 ref 1 fl Interpret:/0/0 rc 0/0 [234597.925838] LustreError: 193195:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(758044) req@ffff8be64e4c2050 x1696619769818112/t0(0) o3->84772643-3e0c-a25c-a9b0-965c7b792170@10.51.13.15@o2ib3:465/0 lens 488/440 e 0 to 0 dl 1618955270 ref 1 fl Interpret:/0/0 rc 0/0 [234597.925841] LustreError: 193195:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 5 previous similar messages [234597.925897] Lustre: oak-OST0052: Bulk IO read error with 84772643-3e0c-a25c-a9b0-965c7b792170 (at 
10.51.13.15@o2ib3), client will retry: rc -110 [234598.004815] LustreError: 5999:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 26 previous similar messages [234702.433086] Lustre: 193165:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618955176/real 1618955176] req@ffff8bafd1038480 x1697353941356544/t0(0) o106->oak-OST003e@10.51.3.26@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1618955349 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [234754.358737] Lustre: oak-OST004a: Connection restored to c0d07aca-1d41-4 (at 10.51.1.49@o2ib3) [234754.358738] Lustre: oak-OST003a: Connection restored to c0d07aca-1d41-4 (at 10.51.1.49@o2ib3) [234754.358739] Lustre: oak-OST0030: Connection restored to c0d07aca-1d41-4 (at 10.51.1.49@o2ib3) [234754.358740] Lustre: Skipped 836 previous similar messages [234754.358743] Lustre: Skipped 836 previous similar messages [234820.217149] Lustre: oak-OST005a: Client 6c0c8fc9-86c4-9a4d-ddf5-85fe73093cd9 (at 10.51.12.2@o2ib3) reconnecting [234820.228542] Lustre: Skipped 755 previous similar messages [234860.346020] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.0.16@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[234860.365435] LustreError: Skipped 2 previous similar messages [235047.290803] LustreError: 11105:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd289d15850 x1696860064924608/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:224/0 lens 488/448 e 0 to 0 dl 1618955784 ref 1 fl Interpret:/0/0 rc 0/0 [235047.315691] LustreError: 11105:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 7 previous similar messages [235047.326463] Lustre: oak-OST0050: Bulk IO write error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc = -110 [235047.339104] Lustre: Skipped 38 previous similar messages [235051.648138] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.1@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x99e3a385 [235051.664940] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 5 previous similar messages [235358.254067] Lustre: oak-OST0050: Connection restored to (at 10.50.12.16@o2ib2) [235358.262328] Lustre: Skipped 173 previous similar messages [235424.500670] Lustre: oak-OST0036: Client ede94a1a-b345-30c6-e4bb-52a3a03e8e5f (at 10.51.6.18@o2ib3) reconnecting [235424.512044] Lustre: Skipped 65 previous similar messages [235803.238253] LustreError: 193439:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be677fe1850 x1697586534580032/t0(0) o4->9458049c-ca8d-335b-3531-2606964e11c0@10.51.2.31@o2ib3:218/0 lens 488/448 e 0 to 0 dl 1618956533 ref 1 fl Interpret:/0/0 rc 0/0 [235803.265344] LustreError: 193439:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 4 previous similar messages [235803.276203] Lustre: oak-OST0032: Bulk IO write error with 9458049c-ca8d-335b-3531-2606964e11c0 (at 10.51.2.31@o2ib3), client will retry: rc = -110 [235803.290955] Lustre: Skipped 4 previous similar messages [235803.957483] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.51.4.33@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [235869.387509] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.31@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x9a45f5fd [235869.404404] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 2 previous similar messages [235960.380917] Lustre: oak-OST0058: Connection restored to ed109291-03b5-4 (at 10.51.4.16@o2ib3) [235960.390857] Lustre: Skipped 137 previous similar messages [235976.439990] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [235976.453935] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdf7c7c5800 [236016.916875] LustreError: 137-5: oak-OST0051_UUID: not available for connect from 10.51.15.5@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [236022.963460] LustreError: 204434:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd2b7a62850 x1697586502352704/t0(0) o4->4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d@10.51.2.32@o2ib3:388/0 lens 488/448 e 0 to 0 dl 1618956703 ref 1 fl Interpret:/0/0 rc 0/0 [236022.963655] LustreError: 193444:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1231996) req@ffff8bdfaad59050 x1688534631221184/t0(0) o4->0028e5c0-f60e-4@10.51.4.34@o2ib3:394/0 lens 488/448 e 0 to 0 dl 1618956709 ref 1 fl Interpret:/0/0 rc 0/0 [236033.753907] Lustre: oak-OST004a: Client 983f5bba-d533-f3a5-57c5-feb173326007 (at 10.51.1.36@o2ib3) reconnecting [236033.753908] Lustre: oak-OST005a: Client 983f5bba-d533-f3a5-57c5-feb173326007 (at 10.51.1.36@o2ib3) reconnecting [236033.753911] Lustre: Skipped 69 previous similar messages [236047.963534] LustreError: 193195:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1231468) req@ffff8be757045850 x1684954391302912/t0(0) 
o4->0871860c-cfdf-4@10.51.3.30@o2ib3:402/0 lens 488/448 e 0 to 0 dl 1618956717 ref 1 fl Interpret:/0/0 rc 0/0 [236047.990122] LustreError: 193195:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 11 previous similar messages [236117.749070] LNet: 50609:0:(o2iblnd_cb.c:1114:kiblnd_tx_complete()) Tx -> 10.0.2.216@o2ib5 cookie 0xc38ac18 sending 1 waiting 0: failed 12 [236117.749093] LNet: 50607:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error -5(sending)(waiting) [236117.749142] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [236117.749143] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 5 previous similar messages [236117.749188] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be457844c00 [236117.749421] LNet: 215579:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) 10.0.2.216@o2ib5: ROUTE ERROR -22 [236117.749426] LNet: 215579:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.216@o2ib5: connection failed [236117.749433] LustreError: 215579:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff8bde29189c00 [236117.749438] LustreError: 215579:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff8bd9da640800 [236117.749443] LustreError: 215579:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff8bd9da640800 [236117.776490] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be4b4f84000 [236117.776496] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be457844c00 [236117.896096] LNet: 50609:0:(o2iblnd_cb.c:1114:kiblnd_tx_complete()) Skipped 1 previous similar message [236147.965552] LustreError: 193413:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 3145728(3186038) req@ffff8be72331f850 x1684936038972160/t0(0) 
o4->9f0935fb-eb44-4@10.51.3.27@o2ib3:523/0 lens 504/448 e 0 to 0 dl 1618956838 ref 1 fl Interpret:/0/0 rc 0/0 [236147.965715] Lustre: oak-OST003c: Bulk IO read error with 93c8e0cc-3df7-4 (at 10.51.4.4@o2ib3), client will retry: rc -110 [236172.965924] LustreError: 204448:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1957810) req@ffff8be723183050 x1689651185708672/t0(0) o4->2c63b434-3a22-4@10.51.5.53@o2ib3:532/0 lens 504/448 e 0 to 0 dl 1618956847 ref 1 fl Interpret:/0/0 rc 0/0 [236172.966017] LustreError: 193194:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be8cb534850 x1696863887442432/t0(0) o3->8ff6000c-d966-1cda-f3a5-455db4eb8783@10.51.2.23@o2ib3:533/0 lens 488/440 e 0 to 0 dl 1618956848 ref 1 fl Interpret:/0/0 rc 0/0 [236172.966271] Lustre: oak-OST0056: Bulk IO read error with 8ff6000c-d966-1cda-f3a5-455db4eb8783 (at 10.51.2.23@o2ib3), client will retry: rc -110 [236172.966273] Lustre: oak-OST0056: Bulk IO read error with 8ff6000c-d966-1cda-f3a5-455db4eb8783 (at 10.51.2.23@o2ib3), client will retry: rc -110 [236172.966274] Lustre: Skipped 1 previous similar message [236172.966274] Lustre: Skipped 1 previous similar message [236173.060492] LustreError: 204448:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 17 previous similar messages [236197.228180] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.51.13.1@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [236197.247625] LustreError: Skipped 1 previous similar message [236313.773328] LustreError: 137-5: oak-OST003f_UUID: not available for connect from 10.51.0.17@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[236313.792765] LustreError: Skipped 4 previous similar messages [236336.006618] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [236336.020413] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be7577c0c00 [236336.032590] LustreError: 5986:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71219c050 x1696876646446016/t0(0) o4->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:749/0 lens 488/448 e 0 to 0 dl 1618957064 ref 1 fl Interpret:/0/0 rc 0/0 [236336.059866] LustreError: 5986:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [236350.366800] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.51.2.29@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [236350.386219] LustreError: Skipped 72 previous similar messages [236372.968976] LustreError: 241053:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 24576(3170304) req@ffff8be723ce0050 x1696876646445952/t0(0) o4->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:749/0 lens 488/448 e 0 to 0 dl 1618957064 ref 1 fl Interpret:/0/0 rc 0/0 [236398.173141] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.15.3@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [236398.192549] LustreError: Skipped 54 previous similar messages [236425.884456] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib5: 0 seconds [236480.448712] LustreError: 137-5: oak-OST003d_UUID: not available for connect from 10.51.15.2@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[236480.468138] LustreError: Skipped 67 previous similar messages [236547.120617] LustreError: 193196:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd75790c050 x1696860068123520/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:162/0 lens 488/448 e 0 to 0 dl 1618957232 ref 1 fl Interpret:/0/0 rc 0/0 [236547.145560] LustreError: 193196:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 18 previous similar messages [236547.156400] Lustre: oak-OST0050: Bulk IO write error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc = -110 [236547.169021] Lustre: Skipped 51 previous similar messages [236547.966779] LustreError: 209393:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 3145728(4194304) req@ffff8bd1e29ec050 x1688598633633856/t0(0) o4->42c69b06-39e7-4@10.51.4.25@o2ib3:158/0 lens 488/448 e 0 to 0 dl 1618957228 ref 1 fl Interpret:/0/0 rc 0/0 [236564.086470] Lustre: oak-OST0056: Connection restored to (at 10.51.0.71@o2ib3) [236564.094646] Lustre: Skipped 1468 previous similar messages [236610.567040] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.51.15.5@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [236610.567041] LustreError: 137-5: oak-OST0037_UUID: not available for connect from 10.51.15.5@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [236610.567044] LustreError: Skipped 202 previous similar messages [236610.612469] LustreError: Skipped 19 previous similar messages [236640.646286] Lustre: oak-OST0030: Client d073f313-60b4-4 (at 10.51.15.5@o2ib3) reconnecting [236640.646288] Lustre: oak-OST0032: Client d073f313-60b4-4 (at 10.51.15.5@o2ib3) reconnecting [236640.646292] Lustre: Skipped 1385 previous similar messages [236640.671149] Lustre: Skipped 20 previous similar messages [236663.296003] Lustre: oak-OST004a: haven't heard from client d073f313-60b4-4 (at 10.51.15.5@o2ib3) in 227 seconds. 
I think it's dead, and I am evicting it. exp ffff8bdd514b0800, cur 1618957310 expire 1618957160 last 1618957083 [236670.756722] LNet: 215579:0:(o2iblnd_cb.c:3248:kiblnd_cm_callback()) 10.0.2.216@o2ib5: UNREACHABLE -110 [236670.767414] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [236670.777716] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 23 previous similar messages [236728.877330] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib5: 2 seconds [236971.869608] LNet: 22243:0:(o2iblnd_cb.c:3248:kiblnd_cm_callback()) 10.0.2.216@o2ib5: UNREACHABLE -110 [236971.880225] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [236971.890549] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [237031.870247] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib5: 4 seconds [237122.556456] LustreError: 137-5: oak-OST0049_UUID: not available for connect from 10.51.13.9@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[237122.575870] LustreError: Skipped 799 previous similar messages [237147.969773] LustreError: 5989:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(3002368) req@ffff8bcc6c3fe050 x1696860069136576/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:2/0 lens 488/448 e 0 to 0 dl 1618957827 ref 1 fl Interpret:/0/0 rc 0/0 [237147.970013] Lustre: oak-OST003a: Bulk IO write error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc = -110 [237147.970015] Lustre: Skipped 4 previous similar messages [237148.014412] LustreError: 5989:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 1 previous similar message [237170.971845] Lustre: oak-OST0050: Connection restored to 25c96f56-d6ac-4 (at 10.50.1.20@o2ib2) [237170.981467] Lustre: Skipped 1001 previous similar messages [237240.986642] Lustre: oak-OST0046: Client 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3) reconnecting [237240.998023] Lustre: Skipped 890 previous similar messages [237272.854618] LNet: 215579:0:(o2iblnd_cb.c:3248:kiblnd_cm_callback()) 10.0.2.216@o2ib5: UNREACHABLE -110 [237272.865366] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [237334.863159] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib5: 6 seconds [237340.026198] LNet: 3985:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [237340.039957] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdd44647000 [237340.039966] Lustre: 193174:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1618957983/real 1618957986] req@ffff8bacd092ad00 x1697353943487552/t0(0) o106->oak-OST003a@10.51.4.2@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1618958156 ref 1 fl Rpc:eX/0/ffffffff rc 0/-1 [237347.969503] LustreError: 5997:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(3264512) 
req@ffff8be8fb9b4850 x1696860069623552/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:210/0 lens 488/448 e 0 to 0 dl 1618958035 ref 1 fl Interpret:/0/0 rc 0/0 [237347.995382] LustreError: 5997:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 3 previous similar messages [237397.968812] LustreError: 147686:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bdec316f850 x1689658295601024/t0(0) o4->3e7c55d0-08c5-4@10.51.4.52@o2ib3:253/0 lens 488/448 e 0 to 0 dl 1618958078 ref 1 fl Interpret:/0/0 rc 0/0 [237518.969805] LustreError: 5992:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd3f7414050 x1684954394605504/t0(0) o4->0871860c-cfdf-4@10.51.3.30@o2ib3:438/0 lens 488/448 e 0 to 0 dl 1618958263 ref 1 fl Interpret:/0/0 rc 0/0 [237518.994659] LustreError: 5992:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [237573.775595] LNet: 40150:0:(o2iblnd_cb.c:3248:kiblnd_cm_callback()) 10.0.2.216@o2ib5: UNREACHABLE -110 [237573.786238] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [237637.856077] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib5: 8 seconds [237807.662277] Lustre: oak-OST004a: Connection restored to 5e3fa4ab-9670-4 (at 10.51.4.21@o2ib3) [237807.671900] Lustre: Skipped 1086 previous similar messages [237874.888449] LNet: 40150:0:(o2iblnd_cb.c:3248:kiblnd_cm_callback()) 10.0.2.216@o2ib5: UNREACHABLE -110 [237874.899081] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [237874.909439] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [237879.966573] LNet: 40150:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) 10.0.2.216@o2ib5: ROUTE ERROR -22 [237879.976889] LNet: 40150:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.216@o2ib5: connection failed [237879.990288] LNet: 
40150:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [237880.002521] LNet: 40150:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [237885.809570] LustreError: 137-5: oak-OST004d_UUID: not available for connect from 10.51.2.23@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [237885.829002] LustreError: Skipped 504 previous similar messages [237932.006752] Lustre: oak-OST0036: Client 9d061e04-8567-4 (at 10.51.0.71@o2ib3) reconnecting [237932.006753] Lustre: oak-OST0034: Client 9d061e04-8567-4 (at 10.51.0.71@o2ib3) reconnecting [237932.006755] Lustre: Skipped 952 previous similar messages [237932.031525] Lustre: Skipped 18 previous similar messages [237947.965819] LustreError: 241060:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1232468) req@ffff8bdccec5a850 x1696860070508352/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:57/0 lens 488/448 e 0 to 0 dl 1618958637 ref 1 fl Interpret:/0/0 rc 0/0 [237947.991682] LustreError: 241060:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 4 previous similar messages [237948.002868] Lustre: oak-OST0032: Bulk IO write error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc = -110 [237948.015507] Lustre: Skipped 11 previous similar messages [238180.969519] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [238180.969554] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [238180.969556] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.216@o2ib5 rejected: no listener at 987 [238181.009317] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [238181.844530] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: reconnect (invalid service id), 
12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [238181.862807] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 10 previous similar messages [238181.873881] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.216@o2ib5 rejected: no listener at 987 [238181.885146] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 10 previous similar messages [238182.844413] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [238183.844333] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.216@o2ib5 rejected: no listener at 987 [238183.855594] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 1 previous similar message [238184.844493] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [238184.862747] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 1 previous similar message [238185.844353] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.216@o2ib5 rejected: no listener at 987 [238185.855632] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 1 previous similar message [238188.844334] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [238188.862604] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 3 previous similar messages [238189.844217] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.216@o2ib5 rejected: no listener at 987 [238189.855505] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 3 previous similar messages [238196.844084] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 
8/-1, max_frags: 256/-1 [238196.862338] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 7 previous similar messages [238197.844018] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.216@o2ib5 rejected: no listener at 987 [238197.844023] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [238197.844024] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 466 previous similar messages [238197.876186] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 7 previous similar messages [238212.843662] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [238212.861910] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 15 previous similar messages [238213.843700] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.216@o2ib5 rejected: no listener at 987 [238213.854971] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 15 previous similar messages [238229.843410] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [238229.853738] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 584 previous similar messages [238243.841780] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.216@o2ib5: 12 seconds [238407.879612] Lustre: oak-OST004c: Connection restored to 11c92c9a-5a17-4 (at 10.51.2.27@o2ib3) [238407.889238] Lustre: Skipped 581 previous similar messages [238481.973433] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [238481.973449] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [238481.973451] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 250 previous similar messages [238482.012598] 
LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 31 previous similar messages [238482.023670] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.216@o2ib5 rejected: no listener at 987 [238482.034931] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 30 previous similar messages [238511.835876] LNet: 89231:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) 10.0.2.216@o2ib5: ROUTE ERROR -22 [238511.846196] LNet: 89231:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.216@o2ib5: connection failed [238511.859597] LNet: 89231:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [238529.509857] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.0.14@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [238529.529396] LustreError: Skipped 731 previous similar messages [238537.370640] Lustre: oak-OST0054: Client 9768808c-691d-4f78-accb-0d29f922d720 (at 10.51.13.21@o2ib3) reconnecting [238537.382126] Lustre: Skipped 731 previous similar messages [238782.973707] LNet: 105045:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) 10.0.2.216@o2ib5: ROUTE ERROR -22 [238782.984113] LNet: 105045:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.216@o2ib5: connection failed [238782.997639] LNet: 105045:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [238898.989983] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [238899.003404] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be4b4f85000 [238899.015645] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcb7f5ca800 [238947.972646] LustreError: 209388:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1231311) 
req@ffff8be6f712c850 x1684933529119808/t0(0) o4->3ca4550e-a937-4@10.51.3.29@o2ib3:292/0 lens 488/448 e 0 to 0 dl 1618959627 ref 1 fl Interpret:/0/0 rc 0/0 [238947.972662] LustreError: 193439:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd325826850 x1689658309449088/t0(0) o4->3e7c55d0-08c5-4@10.51.4.52@o2ib3:296/0 lens 488/448 e 0 to 0 dl 1618959631 ref 1 fl Interpret:/0/0 rc 0/0 [238947.972764] Lustre: oak-OST0042: Bulk IO write error with 3e7c55d0-08c5-4 (at 10.51.4.52@o2ib3), client will retry: rc = -110 [238947.972765] Lustre: Skipped 1 previous similar message [238948.042646] LustreError: 209388:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 1 previous similar message [238972.973032] LustreError: 204443:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1146845) req@ffff8bd325824850 x1688598637261504/t0(0) o4->42c69b06-39e7-4@10.51.4.25@o2ib3:308/0 lens 488/448 e 0 to 0 dl 1618959643 ref 1 fl Interpret:/0/0 rc 0/0 [239017.350913] Lustre: oak-OST003c: Connection restored to 853f1535-ef30-151f-429c-d573236cff68 (at 10.51.15.11@o2ib3) [239017.362669] Lustre: Skipped 441 previous similar messages [239075.903413] LustreError: 187417:0:(ldlm_lib.c:3287:target_bulk_io()) @@@ bulk WRITE failed: rc -107 req@ffff8bcb39347850 x1684940754930048/t0(0) o4->b2a2a37d-f5df-4@10.51.3.50@o2ib3:485/0 lens 488/448 e 0 to 0 dl 1618959820 ref 1 fl Interpret:/0/0 rc 0/0 [239078.898429] LustreError: 193424:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be6c34b2850 x1696620884313088/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:481/0 lens 488/448 e 0 to 0 dl 1618959816 ref 1 fl Interpret:/0/0 rc 0/0 [239078.925516] LustreError: 193424:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [239080.350419] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.26@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x9bae4a25 [239083.976466] 
LNet: 105045:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) 10.0.2.216@o2ib5: ROUTE ERROR -22 [239083.986874] LNet: 105045:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.216@o2ib5: connection failed [239084.000403] LNet: 105045:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [239089.248203] Lustre: oak-OST004e: Bulk IO read error with 0028e5c0-f60e-4 (at 10.51.4.34@o2ib3), client will retry: rc -110 [239101.237718] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.34@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x9baf3eed [239101.254628] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 2 previous similar messages [239140.425358] LustreError: 137-5: oak-OST0039_UUID: not available for connect from 10.51.13.12@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [239140.444868] LustreError: Skipped 175 previous similar messages [239165.514195] Lustre: oak-OST0034: Client 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3) reconnecting [239165.525658] Lustre: Skipped 458 previous similar messages [239167.491380] LustreError: 193409:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bcc39c7c850 x1696860072179264/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:518/0 lens 488/448 e 0 to 0 dl 1618959853 ref 1 fl Interpret:/0/0 rc 0/0 [239167.491381] LustreError: 209388:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bcc39c79050 x1696860072179648/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:518/0 lens 488/448 e 0 to 0 dl 1618959853 ref 1 fl Interpret:/0/0 rc 0/0 [239167.491386] LustreError: 209388:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [239172.977550] LustreError: 193424:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 181663(1230239) req@ffff8be72302c850 x1697586507221248/t0(0) 
o4->4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d@10.51.2.32@o2ib3:522/0 lens 488/448 e 0 to 0 dl 1618959857 ref 1 fl Interpret:/0/0 rc 0/0 [239540.999741] LustreError: 193407:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd28a5cc050 x1697586550792960/t0(0) o4->9458049c-ca8d-335b-3531-2606964e11c0@10.51.2.31@o2ib3:189/0 lens 488/448 e 0 to 0 dl 1618960279 ref 1 fl Interpret:/0/0 rc 0/0 [239618.556015] Lustre: oak-OST0054: Connection restored to (at 10.50.8.17@o2ib2) [239618.564177] Lustre: Skipped 655 previous similar messages [239702.971147] Lustre: oak-OST0044: Bulk IO write error with 8b66e4db-fcc9-6215-2cf9-86ed141f42d8 (at 10.51.2.26@o2ib3), client will retry: rc = -110 [239702.985932] Lustre: Skipped 14 previous similar messages [239721.154953] LNet: 50606:0:(o2iblnd_cb.c:1114:kiblnd_tx_complete()) Tx -> 10.0.2.217@o2ib5 cookie 0xc576122 sending 1 waiting 0: failed 12 [239721.154958] LNet: 50609:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error -5(sending)(waiting) [239721.154965] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [239721.154966] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 1 previous similar message [239721.155243] LNet: 105045:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) 10.0.2.217@o2ib5: ROUTE ERROR -22 [239721.155248] LNet: 105045:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.217@o2ib5: connection failed [239721.155265] LustreError: 105045:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff8bd0b95ef800 [239721.155270] LustreError: 105045:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff8bd0b95ef800 [239721.253416] LNet: 50606:0:(o2iblnd_cb.c:1114:kiblnd_tx_complete()) Skipped 2 previous similar messages [239747.004224] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.13.12@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [239747.023758] LustreError: Skipped 25 previous similar messages [239747.986404] LustreError: 204434:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1957934) req@ffff8be72258f850 x1684933807820544/t0(0) o4->71e1dc67-e312-4@10.51.4.5@o2ib3:351/0 lens 504/448 e 0 to 0 dl 1618960441 ref 1 fl Interpret:/0/0 rc 0/0 [239748.012890] LustreError: 204434:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 1 previous similar message [239766.761130] Lustre: oak-OST0036: Client 1a448fa9-8f9c-4 (at 10.51.0.65@o2ib3) reconnecting [239766.761131] Lustre: oak-OST003a: Client 1a448fa9-8f9c-4 (at 10.51.0.65@o2ib3) reconnecting [239766.761134] Lustre: Skipped 848 previous similar messages [239772.987611] LustreError: 193254:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be720d33050 x1695838359726080/t0(0) o3->eab9e2cf-af4c-4@10.51.15.1@o2ib3:357/0 lens 488/440 e 0 to 0 dl 1618960447 ref 1 fl Interpret:/0/0 rc 0/0 [239773.012958] LustreError: 193254:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [239773.023710] Lustre: oak-OST0034: Bulk IO read error with eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3), client will retry: rc -110 [239884.094970] LNet: 238533:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [239884.108410] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd16cd52400 [239884.120562] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd16cd52400 [239884.132763] LustreError: 187418:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be45c7e4850 x1684937261653120/t0(0) o4->22914da2-6d51-4@10.51.3.49@o2ib3:529/0 lens 488/448 e 0 to 0 dl 1618960619 ref 1 fl Interpret:/0/0 rc 0/0 [239970.989066] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 
10.0.2.217@o2ib5 failed: 5 [239970.999381] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 726 previous similar messages [239971.010006] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [239971.028250] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 39 previous similar messages [239971.039335] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.217@o2ib5 rejected: no listener at 987 [239971.050610] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 39 previous similar messages [239979.802845] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [239979.821094] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 18 previous similar messages [239979.832164] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.217@o2ib5 rejected: no listener at 987 [239979.843442] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 18 previous similar messages [239987.802379] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [239987.812690] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 484 previous similar messages [239995.802270] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [239995.820521] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 15 previous similar messages [239996.802290] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.217@o2ib5 rejected: no listener at 987 [239996.813582] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 16 previous similar messages [240019.801622] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 
10.0.2.217@o2ib5 failed: 5 [240019.811955] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 569 previous similar messages [240023.800488] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.217@o2ib5: 2 seconds [240219.792044] Lustre: oak-OST0030: Connection restored to d21c6b72-eef5-e657-bc90-b561f085e18a (at 10.51.2.31@o2ib3) [240219.803713] Lustre: Skipped 1135 previous similar messages [240219.928899] LustreError: 193450:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be6569da850 x1697586553575936/t0(0) o4->9458049c-ca8d-335b-3531-2606964e11c0@10.51.2.31@o2ib3:54/0 lens 488/448 e 0 to 0 dl 1618960899 ref 1 fl Interpret:/0/0 rc 0/0 [240219.955873] LustreError: 193450:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 23 previous similar messages [240271.985937] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [240272.004208] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 28 previous similar messages [240272.004210] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [240272.004212] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 70 previous similar messages [240272.036087] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.217@o2ib5 rejected: no listener at 987 [240272.047343] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 27 previous similar messages [240326.793593] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.217@o2ib5: 4 seconds [240347.986460] LustreError: 241048:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1230654) req@ffff8be714668850 x1696876656925184/t0(0) o4->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:190/0 lens 488/448 e 0 to 0 dl 1618961035 ref 1 fl Interpret:/0/0 rc 0/0 [240348.014501] LustreError: 
241048:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 10 previous similar messages [240348.025775] Lustre: oak-OST0056: Bulk IO write error with 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3), client will retry: rc = -110 [240348.040531] Lustre: Skipped 34 previous similar messages [240349.193712] LustreError: 137-5: oak-OST0049_UUID: not available for connect from 10.51.15.1@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [240349.213150] LustreError: Skipped 226 previous similar messages [240370.222886] Lustre: oak-OST0056: Client 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3) reconnecting [240370.234257] Lustre: Skipped 345 previous similar messages [240572.989001] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [240572.999299] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 1199 previous similar messages [240573.009992] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [240573.028237] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 65 previous similar messages [240573.039321] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.217@o2ib5 rejected: no listener at 987 [240573.050583] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 65 previous similar messages [240619.787031] LNet: 167105:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) 10.0.2.217@o2ib5: ROUTE ERROR -22 [240619.797431] LNet: 167105:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.217@o2ib5: connection failed [240619.810929] LNet: 167105:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [240619.823260] LNet: 167105:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 10 previous similar messages [240821.216033] Lustre: oak-OST0048: 
Connection restored to 7b6b5d3b-ca43-4 (at 10.50.15.1@o2ib2) [240821.225658] Lustre: Skipped 320 previous similar messages [240873.988240] LNet: 192493:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) 10.0.2.217@o2ib5: ROUTE ERROR -22 [240873.998640] LNet: 192493:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.217@o2ib5: connection failed [240874.012138] LNet: 192493:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [240958.721683] LustreError: 137-5: oak-OST003d_UUID: not available for connect from 10.51.0.17@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [240958.741097] LustreError: Skipped 207 previous similar messages [240958.864986] LustreError: 187418:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be6a00de850 x1696860075901376/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:47/0 lens 488/448 e 0 to 0 dl 1618961647 ref 1 fl Interpret:/0/0 rc 0/0 [240958.867253] Lustre: oak-OST003a: Bulk IO write error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc = -110 [240958.902444] LustreError: 187418:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [240986.462121] Lustre: oak-OST0032: Client 4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d (at 10.51.2.32@o2ib3) reconnecting [240986.462122] Lustre: oak-OST0034: Client 4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d (at 10.51.2.32@o2ib3) reconnecting [240986.462124] Lustre: Skipped 241 previous similar messages [240986.490954] Lustre: Skipped 6 previous similar messages [241174.992081] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [241175.002385] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 1017 previous similar messages [241175.013087] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 
256/-1 [241175.031339] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 56 previous similar messages [241175.042422] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.217@o2ib5 rejected: no listener at 987 [241175.053688] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 56 previous similar messages [241217.773254] LNet: 213948:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) 10.0.2.217@o2ib5: ROUTE ERROR -22 [241217.783651] LNet: 213948:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.217@o2ib5: connection failed [241217.797150] LNet: 213948:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [241322.992669] LustreError: 5997:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1777324) req@ffff8bdb35261850 x1696620888748608/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:395/0 lens 504/448 e 0 to 0 dl 1618961995 ref 1 fl Interpret:/0/0 rc 0/0 [241422.238372] Lustre: oak-OST003a: Connection restored to 8b66e4db-fcc9-6215-2cf9-86ed141f42d8 (at 10.51.2.26@o2ib3) [241422.250034] Lustre: Skipped 402 previous similar messages [241423.384004] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.26@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x9ccc7c75 [241423.400903] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 6 previous similar messages [241597.421050] Lustre: oak-OST0058: Client 07e1f920-cffb-b4f8-01fb-b3be1cdfffbf (at 10.51.15.9@o2ib3) reconnecting [241597.432412] Lustre: Skipped 197 previous similar messages [241801.357443] LustreError: 193411:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be722759050 x1696860077352064/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:185/0 lens 488/448 e 0 to 0 dl 1618962540 ref 1 fl Interpret:/0/0 rc 0/0 [241801.382390] LustreError: 193411:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar 
messages [241801.393177] Lustre: oak-OST0050: Bulk IO write error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc = -110 [241801.405822] Lustre: Skipped 5 previous similar messages [242027.293861] Lustre: oak-OST0030: Connection restored to 833a1653-9814-4 (at 10.50.8.39@o2ib2) [242027.303499] Lustre: Skipped 159 previous similar messages [242042.267833] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [242042.281850] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3b1fce800 [242042.294015] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3b1fce800 [242042.306187] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcf65159400 [242042.318356] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcf65159000 [242042.330555] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcf65159000 [242042.342717] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7b92ad000 [242042.354875] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be66d410c00 [242098.012348] LustreError: 193253:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1451233) req@ffff8bd325820050 x1688534670939520/t0(0) o4->0028e5c0-f60e-4@10.51.4.34@o2ib3:422/0 lens 504/448 e 0 to 0 dl 1618962777 ref 1 fl Interpret:/0/0 rc 0/0 [242098.012461] LustreError: 193411:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bcc65504850 x1684933534806592/t0(0) o4->3ca4550e-a937-4@10.51.3.29@o2ib3:423/0 lens 488/448 e 0 to 0 dl 1618962778 ref 1 fl Interpret:/0/0 rc 0/0 [242098.012623] LustreError: 209397:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 
2097152(4194304) req@ffff8be150a2f850 x1695844026012352/t0(0) o3->330d404b-804c-4@10.51.15.3@o2ib3:427/0 lens 488/440 e 0 to 0 dl 1618962782 ref 1 fl Interpret:/0/0 rc 0/0 [242098.012768] Lustre: oak-OST0030: Bulk IO read error with 0871860c-cfdf-4 (at 10.51.3.30@o2ib3), client will retry: rc -110 [242098.102748] LustreError: 193253:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 31 previous similar messages [242199.193702] Lustre: oak-OST0030: Client 8b66e4db-fcc9-6215-2cf9-86ed141f42d8 (at 10.51.2.26@o2ib3) reconnecting [242199.205079] Lustre: Skipped 29 previous similar messages [242222.298117] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.33@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x9d3fd1bd [242222.315043] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [242227.822627] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.33@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x9d3fd1c5 [242236.501342] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.33@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x9d3feb9d [242574.256619] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [242574.271374] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bda0b0e5c00 [242574.283636] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd605b8ac00 [242623.024787] LustreError: 187416:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd1dd4b0850 x1684955508254592/t0(0) o4->da817238-9680-4@10.51.3.1@o2ib3:199/0 lens 488/448 e 0 to 0 dl 1618963309 ref 1 fl Interpret:/0/0 rc 0/0 [242623.024815] LustreError: 193181:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1232777) req@ffff8bd1dd4b4850 
x1684955508256704/t0(0) o4->da817238-9680-4@10.51.3.1@o2ib3:199/0 lens 488/448 e 0 to 0 dl 1618963309 ref 1 fl Interpret:/0/0 rc 0/0 [242623.024891] Lustre: oak-OST0042: Bulk IO write error with da817238-9680-4 (at 10.51.3.1@o2ib3), client will retry: rc = -110 [242623.024892] Lustre: Skipped 39 previous similar messages [242623.095270] LustreError: 187416:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 4 previous similar messages [242633.321482] Lustre: oak-OST0032: Connection restored to fdda7891-cb9d-4 (at 10.50.1.72@o2ib2) [242633.331116] Lustre: Skipped 893 previous similar messages [242648.024782] LustreError: 193427:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(2459882) req@ffff8bd232874850 x1684937268728768/t0(0) o4->22914da2-6d51-4@10.51.3.49@o2ib3:208/0 lens 488/448 e 0 to 0 dl 1618963318 ref 1 fl Interpret:/0/0 rc 0/0 [242648.024792] LustreError: 209395:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be71b5c2850 x1689694628321024/t0(0) o3->822a70f4-76f2-4@10.51.5.50@o2ib3:208/0 lens 488/440 e 0 to 0 dl 1618963318 ref 1 fl Interpret:/0/0 rc 0/0 [242648.024793] LustreError: 209395:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 10 previous similar messages [242648.024814] LustreError: 193407:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be71b5c2050 x1689694628320832/t0(0) o3->822a70f4-76f2-4@10.51.5.50@o2ib3:208/0 lens 488/440 e 0 to 0 dl 1618963318 ref 1 fl Interpret:/0/0 rc 0/0 [242648.024954] Lustre: oak-OST0056: Bulk IO read error with 822a70f4-76f2-4 (at 10.51.5.50@o2ib3), client will retry: rc -110 [242648.024955] Lustre: Skipped 14 previous similar messages [242648.131793] LustreError: 193427:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 18 previous similar messages [242744.829255] LustreError: 147686:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bcf777cb850 x1684948371582976/t0(0) o3->97608f95-8caf-4@10.51.4.8@o2ib3:376/0 lens 488/440 e 
0 to 0 dl 1618963486 ref 1 fl Interpret:/0/0 rc 0/0 [242744.854270] LustreError: 147686:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 7 previous similar messages [242744.864973] Lustre: oak-OST0056: Bulk IO read error with 97608f95-8caf-4 (at 10.51.4.8@o2ib3), client will retry: rc -110 [242744.877323] Lustre: Skipped 1 previous similar message [242750.449106] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.34@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x9d89f7f5 [242750.466024] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [242754.694293] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.34@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x9d89fca5 [242754.711242] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 5 previous similar messages [242770.086757] Lustre: oak-OST0042: haven't heard from client ea702749-deff-4 (at 10.51.4.14@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bd9e4134c00, cur 1618963417 expire 1618963267 last 1618963190 [242775.087311] Lustre: oak-OST0030: haven't heard from client ea702749-deff-4 (at 10.51.4.14@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bd605b8d400, cur 1618963422 expire 1618963272 last 1618963195 [242915.023541] LustreError: 193197:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST004c: cli 0028e5c0-f60e-4 claims 4218880 GRANT, real grant 258048 [242920.584122] LustreError: 193254:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST004c: cli 0028e5c0-f60e-4 claims 4218880 GRANT, real grant 0 [242968.143722] Lustre: oak-OST005e: Client 4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d (at 10.51.2.32@o2ib3) reconnecting [242968.155146] Lustre: Skipped 967 previous similar messages [243233.379664] Lustre: oak-OST004c: Connection restored to (at 10.50.1.58@o2ib2) [243233.387828] Lustre: Skipped 606 previous similar messages [243597.228189] Lustre: oak-OST005e: Client 5382169d-059c-4 (at 10.51.2.52@o2ib3) reconnecting [243597.237532] Lustre: Skipped 12 previous similar messages [243832.191094] LustreError: 137-5: oak-OST004b_UUID: not available for connect from 10.50.2.49@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[243832.210506] LustreError: Skipped 147 previous similar messages [243846.306755] Lustre: oak-OST0056: Connection restored to eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) [243846.316386] Lustre: Skipped 170 previous similar messages [243856.763614] LustreError: 204435:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be901ce1850 x1696860081507136/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:730/0 lens 488/448 e 0 to 0 dl 1618964595 ref 1 fl Interpret:/0/0 rc 0/0 [243856.788553] LustreError: 204435:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 14 previous similar messages [243856.799566] Lustre: oak-OST003a: Bulk IO write error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc = -110 [243856.812187] Lustre: Skipped 37 previous similar messages [244047.066558] LustreError: 193196:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 984064(2032640) req@ffff8bcd079bb050 x1695838383105216/t0(0) o3->eab9e2cf-af4c-4@10.51.15.1@o2ib3:107/0 lens 488/440 e 0 to 0 dl 1618964727 ref 1 fl Interpret:/0/0 rc 0/0 [244047.066764] Lustre: oak-OST0032: Bulk IO read error with eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3), client will retry: rc -110 [244047.105276] LustreError: 193196:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 1 previous similar message [244266.724003] Lustre: oak-OST0030: Client deb59044-8b57-f719-7f9c-2b746ce6a979 (at 10.50.16.1@o2ib2) reconnecting [244266.724003] Lustre: oak-OST0054: Client deb59044-8b57-f719-7f9c-2b746ce6a979 (at 10.50.16.1@o2ib2) reconnecting [244266.724007] Lustre: Skipped 50 previous similar messages [244307.931309] Lustre: oak-OST0050: Bulk IO read error with 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3), client will retry: rc -110 [244307.945828] Lustre: Skipped 3 previous similar messages [244448.226012] Lustre: oak-OST003a: Connection restored to 6c59bf0f-612b-4 (at 10.50.6.45@o2ib2) [244448.235635] Lustre: Skipped 240 previous similar messages [244777.204974] LNet: 
182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [244777.219166] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be6ccd35c00 [244777.231337] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdd9a86c400 [244777.243480] LustreError: 193450:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be13a335850 x1688534690913024/t0(0) o4->0028e5c0-f60e-4@10.51.4.34@o2ib3:141/0 lens 488/448 e 0 to 0 dl 1618965516 ref 1 fl Interpret:/0/0 rc 0/0 [244777.269387] Lustre: oak-OST004c: Bulk IO write error with 0028e5c0-f60e-4 (at 10.51.4.34@o2ib3), client will retry: rc = -110 [244847.083038] LustreError: 193433:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be731ba2850 x1695838388107584/t0(0) o3->eab9e2cf-af4c-4@10.51.15.1@o2ib3:144/0 lens 488/440 e 0 to 0 dl 1618965519 ref 1 fl Interpret:/0/0 rc 0/0 [244847.108499] Lustre: oak-OST0032: Bulk IO read error with eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3), client will retry: rc -110 [244901.199407] Lustre: oak-OST005e: Client 66bf6793-ef6e-d6d2-96ad-3432d26adbce (at 10.51.13.5@o2ib3) reconnecting [244901.210777] Lustre: Skipped 18 previous similar messages [244943.746390] LustreError: 193407:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be6fc531850 x1696617157473280/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:311/0 lens 488/448 e 0 to 0 dl 1618965686 ref 1 fl Interpret:/0/0 rc 0/0 [244943.773593] LustreError: 193407:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [244943.784331] Lustre: oak-OST0034: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 [244945.653789] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.13.12@o2ib3 for invalid MD 
0x1676da12a0d3b7fa.0x9ec1281d [244945.670824] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 8 previous similar messages [244992.194514] Lustre: oak-OST0058: Bulk IO write error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc = -110 [245055.290605] Lustre: oak-OST004e: Connection restored to c9839c63-0b35-4 (at 10.50.1.40@o2ib2) [245055.300225] Lustre: Skipped 178 previous similar messages [245591.336204] Lustre: oak-OST0042: Client 7f5a03b0-a887-10eb-7386-d75d82cdd92b (at 10.51.13.20@o2ib3) reconnecting [245591.347666] Lustre: Skipped 76 previous similar messages [245661.400460] Lustre: oak-OST0054: Connection restored to 3b262606-25cc-4 (at 10.50.7.40@o2ib2) [245661.410106] Lustre: Skipped 227 previous similar messages [246204.307967] Lustre: oak-OST003a: Client 61781ed1-b14e-4 (at 10.51.13.4@o2ib3) reconnecting [246204.317295] Lustre: Skipped 61 previous similar messages [246206.963055] LustreError: 241046:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be29b319850 x1697586522242880/t0(0) o4->4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d@10.51.2.32@o2ib3:60/0 lens 488/448 e 0 to 0 dl 1618966945 ref 1 fl Interpret:/0/0 rc 0/0 [246206.990021] LustreError: 241046:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [246207.000801] Lustre: oak-OST0052: Bulk IO write error with 4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d (at 10.51.2.32@o2ib3), client will retry: rc = -110 [246255.147827] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [246255.161864] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0f3a02400 [246255.174015] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd1c1c52000 [246255.186158] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb7f5cd400 [246255.198297] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdb8b596000 [246255.210457] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7b92adc00 [246255.222617] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be0f3a01800 [246255.234769] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7b92adc00 [246255.234770] LustreError: 241046:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bdfb6f8d050 x1684934618079744/t0(0) o4->45f8c97c-22a3-4@10.51.3.26@o2ib3:110/0 lens 488/448 e 0 to 0 dl 1618966995 ref 1 fl Interpret:/0/0 rc 0/0 [246255.234898] Lustre: oak-OST0056: Bulk IO write error with 45f8c97c-22a3-4 (at 10.51.3.26@o2ib3), client will retry: rc = -110 [246266.845391] Lustre: oak-OST0030: Connection restored to 8b956477-4ff9-4 (at 10.51.2.48@o2ib3) [246266.855048] Lustre: Skipped 600 previous similar messages [246297.135543] LustreError: 193439:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1231578) req@ffff8be4594d4850 x1684936220900736/t0(0) o4->9f0935fb-eb44-4@10.51.3.27@o2ib3:106/0 lens 488/448 e 0 to 0 dl 1618966991 ref 1 fl Interpret:/0/0 rc 0/0 [246297.135663] Lustre: oak-OST003a: Bulk IO write error with 9f0935fb-eb44-4 (at 10.51.3.27@o2ib3), client will retry: rc = -110 [246297.174384] LustreError: 193439:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 2 previous similar messages [246322.135295] LustreError: 241051:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(2006651) req@ffff8be7242da050 x1684934618079808/t0(0) o4->45f8c97c-22a3-4@10.51.3.26@o2ib3:110/0 lens 488/448 e 0 to 0 dl 1618966995 ref 1 fl Interpret:/0/0 rc 0/0 [246322.135311] LustreError: 241059:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bdea1e4e850 x1684941808282944/t0(0) o3->93c8e0cc-3df7-4@10.51.4.4@o2ib3:110/0 lens 488/440 e 0 
to 0 dl 1618966995 ref 1 fl Interpret:/0/0 rc 0/0 [246322.135346] LustreError: 5990:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8be71b030050 x1689694646850304/t0(0) o3->822a70f4-76f2-4@10.51.5.50@o2ib3:110/0 lens 488/440 e 0 to 0 dl 1618966995 ref 1 fl Interpret:/0/0 rc 0/0 [246322.135393] Lustre: oak-OST0034: Bulk IO write error with f62d458b-f1a4-4 (at 10.51.4.2@o2ib3), client will retry: rc = -110 [246322.135395] Lustre: Skipped 2 previous similar messages [246322.135397] Lustre: oak-OST0056: Bulk IO read error with 862218e0-304a-4 (at 10.51.0.62@o2ib3), client will retry: rc -110 [246322.243858] LustreError: 241051:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 15 previous similar messages [246355.438577] LustreError: 9694:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be71b2a8050 x1696876680127104/t0(0) o4->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:214/0 lens 488/448 e 0 to 0 dl 1618967099 ref 1 fl Interpret:/0/0 rc 0/0 [246429.971626] Lustre: oak-OST0036: Bulk IO write error with 61781ed1-b14e-4 (at 10.51.13.4@o2ib3), client will retry: rc = -110 [246429.984395] Lustre: Skipped 17 previous similar messages [246447.008165] Lustre: oak-OST003e: haven't heard from client 8d217df6-ca17-4 (at 10.51.5.4@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be97479f800, cur 1618967094 expire 1618966944 last 1618966867 [246450.999107] Lustre: oak-OST0030: haven't heard from client 8d217df6-ca17-4 (at 10.51.5.4@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be20f4d7c00, cur 1618967098 expire 1618966948 last 1618966871 [246516.441323] LustreError: 137-5: oak-OST005f_UUID: not available for connect from 10.51.15.1@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[246553.043967] LustreError: 5991:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd7ee4e3050 x1696671678500864/t0(0) o4->ea3c0abc-03bf-9064-4b06-a2deb2e4ec48@10.51.4.35@o2ib3:408/0 lens 488/448 e 0 to 0 dl 1618967293 ref 1 fl Interpret:/0/0 rc 0/0 [246553.070872] LustreError: 5991:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [246553.081439] Lustre: oak-OST0032: Bulk IO write error with ea3c0abc-03bf-9064-4b06-a2deb2e4ec48 (at 10.51.4.35@o2ib3), client will retry: rc = -110 [246555.728187] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.35@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0x9fba7a05 [246717.139234] Lustre: oak-OST0052: Bulk IO write error with 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3), client will retry: rc = -110 [246717.154002] Lustre: Skipped 1 previous similar message [246846.117309] Lustre: oak-OST0032: Client 5382169d-059c-4 (at 10.51.2.52@o2ib3) reconnecting [246846.126650] Lustre: Skipped 224 previous similar messages [246854.255965] LustreError: 242901:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be459146850 x1696620902524800/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:693/0 lens 488/448 e 0 to 0 dl 1618967578 ref 1 fl Interpret:/0/0 rc 0/0 [246854.283117] LustreError: 242901:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 4 previous similar messages [246883.026427] LustreError: 137-5: oak-OST004f_UUID: not available for connect from 10.51.13.1@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[246884.081116] Lustre: oak-OST0048: Connection restored to 25ca5655-1c3b-4 (at 10.51.15.8@o2ib3) [246884.090783] Lustre: Skipped 559 previous similar messages [247006.130230] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [247006.144205] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be6c2ab3400 [247006.156360] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be6c2ab3400 [247006.168581] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd9da642c00 [247006.180775] LustreError: 241061:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be6dbc7f050 x1689651219524288/t0(0) o4->2c63b434-3a22-4@10.51.5.53@o2ib3:104/0 lens 488/448 e 0 to 0 dl 1618967744 ref 1 fl Interpret:/0/0 rc 0/0 [247006.181315] LustreError: 137-5: oak-OST0047_UUID: not available for connect from 10.51.1.2@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [247006.225560] LustreError: 241061:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 4 previous similar messages [247058.798938] LustreError: 137-5: oak-OST0057_UUID: not available for connect from 10.51.13.13@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[247058.818452] LustreError: Skipped 1 previous similar message [247072.168405] LustreError: 193442:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be7aaa35850 x1689694650728448/t0(0) o3->822a70f4-76f2-4@10.51.5.50@o2ib3:106/0 lens 488/440 e 0 to 0 dl 1618967746 ref 1 fl Interpret:/0/0 rc 0/0 [247072.193951] Lustre: oak-OST0056: Bulk IO read error with 822a70f4-76f2-4 (at 10.51.5.50@o2ib3), client will retry: rc -110 [247072.206380] Lustre: Skipped 10 previous similar messages [247211.125700] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [247211.139726] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd6d4899400 [247211.151897] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd6d4899400 [247211.164057] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd6d4899400 [247211.176233] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd9e4133c00 [247247.212904] LustreError: 137-5: oak-OST0053_UUID: not available for connect from 10.51.14.1@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[247247.232372] LustreError: Skipped 1 previous similar message [247272.182415] LustreError: 5998:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1654784) req@ffff8be723e8a050 x1697586524078016/t0(0) o4->4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d@10.51.2.32@o2ib3:312/0 lens 488/448 e 0 to 0 dl 1618967952 ref 1 fl Interpret:/0/0 rc 0/0 [247272.182437] LustreError: 193450:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bca52b5f050 x1684931474283904/t0(0) o4->d4f105e0-bc0e-4@10.51.4.1@o2ib3:310/0 lens 488/448 e 0 to 0 dl 1618967950 ref 1 fl Interpret:/0/0 rc 0/0 [247272.182442] LustreError: 193446:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(158465) req@ffff8be71df34050 x1696863471516672/t0(0) o3->eb8cea22-3545-c4b8-6cb6-b3e875ecfb11@10.51.1.23@o2ib3:312/0 lens 488/440 e 0 to 0 dl 1618967952 ref 1 fl Interpret:/0/0 rc 0/0 [247272.182444] LustreError: 193446:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 5 previous similar messages [247272.182466] Lustre: oak-OST003c: Bulk IO read error with eb8cea22-3545-c4b8-6cb6-b3e875ecfb11 (at 10.51.1.23@o2ib3), client will retry: rc -110 [247272.182575] Lustre: oak-OST004c: Bulk IO write error with 9458049c-ca8d-335b-3531-2606964e11c0 (at 10.51.2.31@o2ib3), client will retry: rc = -110 [247272.182577] Lustre: Skipped 6 previous similar messages [247272.309691] LustreError: 5998:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 16 previous similar messages [247375.468894] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.4.33@o2ib3 ns: filter-oak-OST005c_UUID lock: ffff8be3744d69c0/0xf81cb91fecb56ac lrc: 3/0,0 mode: PW/PW res: [0x215345c:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 372637696->373870591) flags: 0x60000400000020 nid: 10.51.4.33@o2ib3 remote: 0xe1814cf08e0b9c8a expref: 15 pid: 192981 timeout: 247380 lvb_type: 0 [247380.219738] Lustre: 
216152:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618967854/real 1618967854] req@ffff8be4b8703f00 x1697353955794432/t0(0) o105->oak-OST0050@10.51.1.53@o2ib3:15/16 lens 360/224 e 0 to 1 dl 1618968027 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [247380.250582] Lustre: 216152:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [247380.678838] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.34@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa028e23d [247386.121418] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [247386.135832] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd775049400 [247386.148000] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bde309a6000 [247386.160148] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bde309a6000 [247386.172308] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd3ef083c00 [247447.192175] LustreError: 193420:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd307102050 x1689699809708800/t0(0) o4->2bbfb89a-a909-4@10.51.5.29@o2ib3:485/0 lens 488/448 e 0 to 0 dl 1618968125 ref 1 fl Interpret:/0/0 rc 0/0 [247447.217666] LustreError: 193420:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [247452.219770] Lustre: oak-OST0044: Client 855f8733-97ad-fe20-a42c-c9a97f6818f7 (at 10.51.4.40@o2ib3) reconnecting [247452.231180] Lustre: Skipped 360 previous similar messages [247485.250069] Lustre: oak-OST0054: Connection restored to b25d30ac-f1fd-4 (at 10.50.16.9@o2ib2) [247485.259691] Lustre: Skipped 645 previous similar messages [247557.235631] LustreError: 5990:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect 
on bulk WRITE req@ffff8be6f712a050 x1695473702669888/t0(0) o4->61781ed1-b14e-4@10.51.13.4@o2ib3:653/0 lens 488/448 e 0 to 0 dl 1618968293 ref 1 fl Interpret:/0/0 rc 0/0 [247557.260471] LustreError: 5990:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 5 previous similar messages [247562.694792] LustreError: 137-5: oak-OST005f_UUID: not available for connect from 10.51.0.15@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [247562.714204] LustreError: Skipped 2 previous similar messages [247724.395171] LNet: 50609:0:(o2iblnd_cb.c:1114:kiblnd_tx_complete()) Tx -> 10.0.2.214@o2ib5 cookie 0xcad6c36 sending 1 waiting 0: failed 12 [247724.395202] LNet: 50607:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.214@o2ib5: error -5(sending)(waiting) [247724.395211] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.214@o2ib5 exceeded retry count 0 [247724.409140] LNet: 46184:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) 10.0.2.214@o2ib5: ROUTE ERROR -22 [247724.409147] LNet: 46184:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.214@o2ib5: connection failed [247724.409190] LustreError: 46184:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff8bc283b36c00 [247724.409198] LustreError: 46184:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff8bdd8b8e5000 [247724.409204] LustreError: 46184:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff8bdd8b8e5000 [247724.409211] LustreError: 46184:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff8bbddf4e5800 [247724.409218] LustreError: 46184:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff8bb235f25c00 [247724.434869] LustreError: 66049:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff8bb235f20c00 [247724.434875] LustreError: 
66049:0:(events.c:450:server_bulk_callback()) event type 5, status -113, desc ffff8bb235f20c00 [247724.543516] LNet: 50609:0:(o2iblnd_cb.c:1114:kiblnd_tx_complete()) Skipped 1 previous similar message [247759.057056] Lustre: oak-OST0048: Bulk IO read error with 0819d613-98f5-4 (at 10.50.14.14@o2ib2), client will retry: rc -110 [247759.069581] Lustre: Skipped 8 previous similar messages [247762.620829] LNetError: 50599:0:(o2iblnd_cb.c:3359:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 12 seconds [247762.632181] LNetError: 50599:0:(o2iblnd_cb.c:3434:kiblnd_check_conns()) Timed out RDMA with 10.0.2.212@o2ib5 (63): c: 0, oc: 0, rc: 8 [247762.645679] LNet: 50599:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.212@o2ib5: error -110(sending) [247768.790858] Lustre: oak-OST0042: Bulk IO read error with a8e1d696-3374-4 (at 10.50.13.11@o2ib2), client will retry: rc -110 [247772.200745] LustreError: 227897:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bb923596850 x1694154575475776/t0(0) o3->7660ea29-02f3-4@10.50.13.10@o2ib2:54/0 lens 488/440 e 0 to 0 dl 1618968449 ref 1 fl Interpret:/0/0 rc 0/0 [247772.226083] LustreError: 227897:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [247774.620578] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.212@o2ib5: 12 seconds [247774.632075] LNet: 50599:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.212@o2ib5 exceeded retry count 0 [247774.644311] LNet: 50599:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 26 previous similar messages [247812.769687] LNet: 46184:0:(o2iblnd_cb.c:3182:kiblnd_cm_callback()) 10.0.2.212@o2ib5: ADDR ERROR -110 [247945.885603] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [247945.899814] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce23cbc400 
[247945.911982] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3ef081c00 [247945.924125] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3ef081c00 [247945.936291] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bccbd4bb400 [247945.948450] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd273afc800 [247945.948504] LustreError: 193452:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be459ede850 x1689654141856192/t0(0) o4->48f963f4-657b-4@10.51.4.57@o2ib3:284/0 lens 488/448 e 0 to 0 dl 1618968679 ref 1 fl Interpret:/0/0 rc 0/0 [247945.948653] Lustre: oak-OST004e: Bulk IO write error with 48f963f4-657b-4 (at 10.51.4.57@o2ib3), client will retry: rc = -110 [247945.948654] Lustre: Skipped 29 previous similar messages [247946.004764] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd273afc800 [247997.207553] LustreError: 209394:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be459ed8050 x1684941811740736/t0(0) o3->93c8e0cc-3df7-4@10.51.4.4@o2ib3:291/0 lens 488/440 e 0 to 0 dl 1618968686 ref 1 fl Interpret:/0/0 rc 0/0 [247997.207745] Lustre: oak-OST003c: Bulk IO read error with 93c8e0cc-3df7-4 (at 10.51.4.4@o2ib3), client will retry: rc -110 [247997.207746] Lustre: Skipped 3 previous similar messages [247997.251038] LustreError: 209394:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 2 previous similar messages [248023.210056] LNet: 182051:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.214@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [248023.210070] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.214@o2ib5 failed: 5 [248023.210072] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 931 
previous similar messages [248023.249215] LNet: 182051:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 52 previous similar messages [248023.260300] LNet: 182051:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.214@o2ib5 rejected: no listener at 987 [248023.271560] LNet: 182051:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 52 previous similar messages [248055.615020] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.214@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [248055.633270] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 41 previous similar messages [248055.644343] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.214@o2ib5 rejected: no listener at 987 [248055.655604] LNet: 182038:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 41 previous similar messages [248073.243651] LNet: 66049:0:(o2iblnd_cb.c:3182:kiblnd_cm_callback()) 10.0.2.212@o2ib5: ADDR ERROR -110 [248073.253950] LNet: 66049:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.212@o2ib5: connection failed [248073.267346] LNet: 66049:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Skipped 2 previous similar messages [248077.613564] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.214@o2ib5: 4 seconds [248077.624939] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Skipped 2 previous similar messages [248080.813818] LustreError: 137-5: oak-OST0043_UUID: not available for connect from 10.50.10.49@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[248080.833356] LustreError: Skipped 14 previous similar messages [248086.214031] Lustre: oak-OST0054: Client f4f31fbb-c316-9d0e-dea6-2a23d0a9a983 (at 10.51.1.12@o2ib3) reconnecting [248086.214190] Lustre: oak-OST003e: Connection restored to f4f31fbb-c316-9d0e-dea6-2a23d0a9a983 (at 10.51.1.12@o2ib3) [248086.214192] Lustre: Skipped 1771 previous similar messages [248086.243270] Lustre: Skipped 1557 previous similar messages [248112.881416] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [248112.894880] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd03bedf800 [248116.132404] LustreError: 209400:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71bab9050 x1696876684954880/t0(0) o4->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:446/0 lens 488/448 e 0 to 0 dl 1618968841 ref 1 fl Interpret:/0/0 rc 0/0 [248122.105301] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd671c16000 [248122.117452] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd671c16000 [248122.129612] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be4ad55e000 [248122.141763] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be4ad55e000 [248122.153928] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdb8b593400 [248122.166087] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcd56b78800 [248122.166448] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [248122.166451] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 1158 previous similar messages [248122.178255] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 
10.0.2.217@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [248122.178257] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 22 previous similar messages [248122.228082] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcd56b78800 [248122.240235] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcd56b78800 [248144.881385] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd6d489a800 [248147.253272] LustreError: 193420:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 589170(1637746) req@ffff8bd26ffe1050 x1696876684954944/t0(0) o4->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:446/0 lens 488/448 e 0 to 0 dl 1618968841 ref 1 fl Interpret:/0/0 rc 0/0 [248149.410406] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.13.12@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa082498d [248172.255490] LustreError: 193446:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1229906) req@ffff8be7238d8050 x1696876684958400/t0(0) o4->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:449/0 lens 488/448 e 0 to 0 dl 1618968844 ref 1 fl Interpret:/0/0 rc 0/0 [248172.257225] LustreError: 241046:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8bd334d68050 x1694751272297792/t0(0) o3->64ebd172-d79e-4@10.51.13.3@o2ib3:454/0 lens 488/440 e 0 to 0 dl 1618968849 ref 1 fl Interpret:/0/0 rc 0/0 [248172.257227] LustreError: 241046:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 8 previous similar messages [248172.257239] Lustre: oak-OST004e: Bulk IO read error with 64ebd172-d79e-4 (at 10.51.13.3@o2ib3), client will retry: rc -110 [248172.257240] Lustre: Skipped 1 previous similar message [248172.337966] LustreError: 193446:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 55 previous 
similar messages [248197.259926] LustreError: 193404:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1232988) req@ffff8be7238df050 x1689649718652160/t0(0) o4->61fe4605-6b08-4@10.51.5.54@o2ib3:472/0 lens 488/448 e 0 to 0 dl 1618968867 ref 1 fl Interpret:/0/0 rc 0/0 [248197.259930] LustreError: 193439:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8be902cb0850 x1684936255072000/t0(0) o3->9f0935fb-eb44-4@10.51.3.27@o2ib3:472/0 lens 488/440 e 0 to 0 dl 1618968867 ref 1 fl Interpret:/0/0 rc 0/0 [248197.259932] LustreError: 193439:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 2 previous similar messages [248197.260139] Lustre: oak-OST004a: Bulk IO read error with 9f0935fb-eb44-4 (at 10.51.3.27@o2ib3), client will retry: rc -110 [248197.260140] Lustre: Skipped 2 previous similar messages [248206.102681] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [248206.116074] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 2 previous similar messages [248206.127758] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd6d489f400 [248206.139902] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd6d489f400 [248206.152068] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bccbd4bd800 [248206.152076] LustreError: 193446:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be720d33850 x1696620918284544/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:537/0 lens 488/448 e 0 to 0 dl 1618968932 ref 1 fl Interpret:/0/0 rc 0/0 [248206.152079] LustreError: 193446:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 5 previous similar messages [248221.837161] LustreError: 204434:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be723ca2050 
x1696617165205952/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:569/0 lens 488/448 e 0 to 0 dl 1618968964 ref 1 fl Interpret:/0/0 rc 0/0 [248221.864338] LustreError: 204434:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 15 previous similar messages [248247.262259] LustreError: 5993:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 181472(1230048) req@ffff8bdb3ce89050 x1696860090136192/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:545/0 lens 488/448 e 0 to 0 dl 1618968940 ref 1 fl Interpret:/0/0 rc 0/0 [248247.262262] LustreError: 9691:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1230955) req@ffff8bdb3ce8e050 x1696860090137024/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:545/0 lens 488/448 e 0 to 0 dl 1618968940 ref 1 fl Interpret:/0/0 rc 0/0 [248247.314815] LustreError: 5993:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 2 previous similar messages [248267.448138] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.13.4@o2ib3 ns: filter-oak-OST005a_UUID lock: ffff8bdea3566780/0xf81cb91fece23f4 lrc: 4/0,0 mode: PR/PR res: [0x21d12d5:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.13.4@o2ib3 remote: 0xef5be6b4fbfcca0f expref: 20 pid: 187406 timeout: 248272 lvb_type: 1 [248267.494456] LustreError: 193054:0:(ldlm_lockd.c:1351:ldlm_handle_enqueue0()) ### lock on destroyed export ffff8bd33c8a1c00 ns: filter-oak-OST005a_UUID lock: ffff8bde4fa6bcc0/0xf81cb91fece2560 lrc: 3/0,0 mode: --/PW res: [0x21d12d5:0x0:0x0].0x0 rrc: 3 type: EXT [0->4095] (req 0->4095) flags: 0x50000000020000 nid: 10.51.13.4@o2ib3 remote: 0xef5be6b4fbfccb2e expref: 20 pid: 193054 timeout: 0 lvb_type: 0 [248272.263878] LustreError: 5991:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(726781) req@ffff8be4590bb850 x1684948086323392/t0(0) o4->732e3689-d982-4@10.51.3.5@o2ib3:550/0 lens 
504/448 e 0 to 0 dl 1618968945 ref 1 fl Interpret:/0/0 rc 0/0 [248272.289499] LustreError: 5991:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 10 previous similar messages [248286.085175] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.32@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa08e1f05 [248286.102068] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 14 previous similar messages [248300.041463] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.26@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa08efb5d [248300.058367] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [248324.264964] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.214@o2ib5 failed: 5 [248324.275269] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 52 previous similar messages [248324.275297] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.214@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [248324.275299] LNet: 238533:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 1 previous similar message [248324.275301] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.214@o2ib5 rejected: no listener at 987 [248324.275302] LNet: 238533:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 22 previous similar messages [248380.606547] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.212@o2ib5: 6 seconds [248442.097091] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [248442.111239] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be0c4cfc400 [248442.123508] LustreError: 241058:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be720b4f050 x1689658390252672/t0(0) 
o4->3e7c55d0-08c5-4@10.51.4.52@o2ib3:22/0 lens 488/448 e 0 to 0 dl 1618969172 ref 1 fl Interpret:/0/0 rc 0/0 [248442.148845] LustreError: 241058:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [248491.872500] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600ac0) failed: 5 [248491.882898] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 44 previous similar messages [248491.883209] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [248491.883211] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 1 previous similar message [248491.883214] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be723c48c00 [248491.883506] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be723c48c00 [248491.883771] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd311b1f000 [248491.883776] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd311b1e400 [248497.268925] LustreError: 193427:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 184549(1233125) req@ffff8bd6179bb850 x1696620920234944/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:20/0 lens 488/448 e 0 to 0 dl 1618969170 ref 1 fl Interpret:/0/0 rc 0/0 [248497.297394] LustreError: 193427:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 2 previous similar messages [248547.268898] LustreError: 204443:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 184418(1232994) req@ffff8be6c1188050 x1688534711925376/t0(0) o4->0028e5c0-f60e-4@10.51.4.34@o2ib3:76/0 lens 488/448 e 0 to 0 dl 1618969226 ref 1 fl Interpret:/0/0 rc 0/0 [248547.268943] LustreError: 9691:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 1048576(4194304) req@ffff8bdb4c700850 x1684941813588800/t0(0) 
o3->93c8e0cc-3df7-4@10.51.4.4@o2ib3:80/0 lens 488/440 e 0 to 0 dl 1618969230 ref 1 fl Interpret:/0/0 rc 0/0 [248547.268945] LustreError: 9691:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 1 previous similar message [248547.268982] Lustre: oak-OST0052: Bulk IO write error with 93c8e0cc-3df7-4 (at 10.51.4.4@o2ib3), client will retry: rc = -110 [248547.268983] Lustre: oak-OST0044: Bulk IO read error with 1a448fa9-8f9c-4 (at 10.51.0.65@o2ib3), client will retry: rc -110 [248547.268984] Lustre: Skipped 103 previous similar messages [248547.268985] Lustre: Skipped 1 previous similar message [248547.368578] LustreError: 204443:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 6 previous similar messages [248597.091368] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.26@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa0aa519d [248597.108288] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 2 previous similar messages [248625.272993] LNet: 66049:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) 10.0.2.214@o2ib5: ROUTE ERROR -22 [248625.283306] LNet: 66049:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) Skipped 2 previous similar messages [248625.293803] LNet: 66049:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.214@o2ib5: connection failed [248625.307228] LNet: 66049:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.214@o2ib5 exceeded retry count 0 [248625.319476] LNet: 66049:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 3 previous similar messages [248625.331077] LNet: 11475:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.212@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [248625.331093] LNet: 50604:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.212@o2ib5 failed: 5 [248625.331095] LNet: 50604:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 2181 previous similar messages [248625.370205] LNet: 
11475:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 123 previous similar messages [248625.381278] LNet: 11475:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.212@o2ib5 rejected: no listener at 987 [248625.392446] LNet: 11475:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 123 previous similar messages [248664.336072] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.26@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa0af24b5 [248664.352966] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [248683.599458] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.212@o2ib5: 8 seconds [248683.610830] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Skipped 1 previous similar message [248688.154335] Lustre: oak-OST0042: Client aee44999-7cd9-ca9a-a4b2-3dd90c5ac02b (at 10.51.4.6@o2ib3) reconnecting [248688.165627] Lustre: Skipped 1383 previous similar messages [248688.166823] Lustre: oak-OST004e: Connection restored to 855f8733-97ad-fe20-a42c-c9a97f6818f7 (at 10.51.4.40@o2ib3) [248688.166824] Lustre: Skipped 1533 previous similar messages [248740.270133] LustreError: 137-5: oak-OST005d_UUID: not available for connect from 10.50.9.44@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[248740.289588] LustreError: Skipped 96 previous similar messages [248775.089148] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [248775.101673] LNet: 11475:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 1 previous similar message [248775.113397] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcb5e8f6400 [248775.125557] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0613a8c00 [248775.125614] LustreError: 241059:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71cca7850 x1689658392704576/t0(0) o4->3e7c55d0-08c5-4@10.51.4.52@o2ib3:360/0 lens 488/448 e 0 to 0 dl 1618969510 ref 1 fl Interpret:/0/0 rc 0/0 [248775.125615] LustreError: 241059:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [248775.173812] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdaa4557400 [248775.185967] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdaa4557400 [248775.198119] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdd9a868800 [248795.562170] LNet: 50606:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.13.3@o2ib3 portal 16 match 1697353956777024 offset 224 length 280: 4 [248797.805366] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.1@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa0be0d35 [248797.822164] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 3 previous similar messages [248822.277471] LustreError: 204436:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(2461259) req@ffff8be71abd3850 x1696860091520384/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:346/0 lens 488/448 e 0 to 0 dl 1618969496 ref 1 fl Interpret:/0/0 
rc 0/0 [248822.277897] LustreError: 187417:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 1048576(4194304) req@ffff8be839435850 x1684936267777856/t0(0) o3->9f0935fb-eb44-4@10.51.3.27@o2ib3:364/0 lens 488/440 e 0 to 0 dl 1618969514 ref 1 fl Interpret:/0/0 rc 0/0 [248822.277899] LustreError: 187417:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 5 previous similar messages [248822.277936] Lustre: oak-OST0034: Bulk IO read error with 64ebd172-d79e-4 (at 10.51.13.3@o2ib3), client will retry: rc -110 [248822.277938] Lustre: Skipped 8 previous similar messages [248822.359291] LustreError: 204436:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 18 previous similar messages [248926.276092] LNet: 84491:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) 10.0.2.212@o2ib5: ROUTE ERROR -22 [248926.287175] LNet: 84491:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.212@o2ib5: connection failed [248926.300606] LNet: 84491:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.212@o2ib5 exceeded retry count 0 [248926.315200] LNet: 11475:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.214@o2ib5 rejected: no listener at 987 [248926.326373] LNet: 11475:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 68 previous similar messages [248974.379661] LustreError: 193401:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be71e18c050 x1696860091836608/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:555/0 lens 488/448 e 0 to 0 dl 1618969705 ref 1 fl Interpret:/0/0 rc 0/0 [248974.404633] LustreError: 193401:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 20 previous similar messages [248986.592397] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.214@o2ib5: 10 seconds [249172.280096] LustreError: 237967:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8bc4ca2a3050 x1685182456047424/t0(0) o3->965a6346-6505-4@10.50.4.27@o2ib2:708/0 lens 488/440 e 0 to 0 dl 1618969858 ref 1 fl 
Interpret:/0/0 rc 0/0 [249172.280098] LustreError: 3781:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(4194304) req@ffff8bc4ca2a2050 x1685057942711488/t0(0) o4->a7406eef-a378-4@10.50.4.26@o2ib2:708/0 lens 488/448 e 0 to 0 dl 1618969858 ref 1 fl Interpret:/0/0 rc 0/0 [249172.280101] LustreError: 3781:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 13 previous similar messages [249172.280286] Lustre: oak-OST0052: Bulk IO read error with 7f8e8eb2-f021-4 (at 10.50.14.10@o2ib2), client will retry: rc -110 [249172.280288] Lustre: Skipped 4 previous similar messages [249172.280292] Lustre: oak-OST0046: Bulk IO write error with 252efa96-a88e-4 (at 10.50.15.13@o2ib2), client will retry: rc = -110 [249172.280293] Lustre: Skipped 52 previous similar messages [249172.379565] LustreError: 237967:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 3 previous similar messages [249227.280719] LNet: 11475:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.212@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [249227.280763] LNet: 50604:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.212@o2ib5 failed: 5 [249227.280765] LNet: 50604:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 2497 previous similar messages [249227.319861] LNet: 11475:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 139 previous similar messages [249277.585580] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.212@o2ib5: 0 seconds [249288.465516] Lustre: oak-OST0042: Connection restored to 920354e8-9bff-4 (at 10.50.6.44@o2ib2) [249288.475203] Lustre: Skipped 1612 previous similar messages [249288.697857] Lustre: oak-OST0052: Client 7f8e8eb2-f021-4 (at 10.50.14.10@o2ib2) reconnecting [249288.707285] Lustre: Skipped 1479 previous similar messages [249411.177588] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.51.13.21@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [249411.197119] LustreError: Skipped 540 previous similar messages [249544.847737] LNet: 31423:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [249544.862153] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc383f62c00 [249544.874355] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc383f62c00 [249544.886626] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bbe9c944000 [249544.898814] LustreError: 209396:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be7170f9850 x1684948087920832/t0(0) o4->732e3689-d982-4@10.51.3.5@o2ib3:378/0 lens 488/448 e 0 to 0 dl 1618970283 ref 1 fl Interpret:/0/0 rc 0/0 [249544.924192] LustreError: 209396:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [249597.295497] LustreError: 5995:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8be7170fc050 x1694660000217216/t0(0) o3->391ef4fc-9976-4@10.51.13.11@o2ib3:378/0 lens 488/440 e 0 to 0 dl 1618970283 ref 1 fl Interpret:/0/0 rc 0/0 [249597.295506] LustreError: 147685:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8be6138bf850 x1696876688351808/t0(0) o4->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:378/0 lens 488/448 e 0 to 0 dl 1618970283 ref 1 fl Interpret:/0/0 rc 0/0 [249597.295507] LustreError: 147685:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 1 previous similar message [249597.295745] Lustre: oak-OST0050: Bulk IO read error with 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3), client will retry: rc -110 [249597.295746] Lustre: Skipped 2 previous similar messages [249629.717058] LustreError: 193407:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk 
WRITE req@ffff8be71ce7e050 x1697586590427072/t0(0) o4->9458049c-ca8d-335b-3531-2606964e11c0@10.51.2.31@o2ib3:465/0 lens 488/448 e 0 to 0 dl 1618970370 ref 1 fl Interpret:/0/0 rc 0/0 [249629.744168] LustreError: 193407:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 7 previous similar messages [249690.844778] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcec8127c00 [249690.856930] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd4dcf7a800 [249690.869080] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be5a4d18000 [249690.881241] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be5a4d18000 [249690.893389] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be5a4d1ec00 [249810.691543] LNet: 50604:0:(o2iblnd_cb.c:1114:kiblnd_tx_complete()) Tx -> 10.0.2.215@o2ib5 cookie 0xf89c815 sending 1 waiting 1: failed 12 [249810.691569] LNet: 50603:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.213@o2ib5 exceeded retry count 0 [249810.691587] LustreError: 50602:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd610b93c00 [249810.692443] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bb77e3c5000 [249810.692451] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bb77e3c5000 [249810.753968] LNet: 50604:0:(o2iblnd_cb.c:1114:kiblnd_tx_complete()) Skipped 2 previous similar messages [249810.764463] LustreError: 50604:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd8ba7dc400 [249845.571907] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.213@o2ib5: 10 seconds [249845.583440] LNet: 50599:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.213@o2ib5 
exceeded retry count 0 [249845.595673] LNet: 50599:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 9 previous similar messages [249847.103005] Lustre: oak-OST0052: Bulk IO write error with 777c6d6b-76ad-4 (at 10.50.5.32@o2ib2), client will retry: rc = -110 [249847.115755] Lustre: Skipped 15 previous similar messages [249860.849599] LNet: 161601:0:(o2iblnd_cb.c:3182:kiblnd_cm_callback()) 10.0.2.213@o2ib5: ADDR ERROR -110 [249868.064315] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdac9b18800 [249868.076482] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd16cd51000 [249868.088643] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce23cbcc00 [249891.639960] Lustre: oak-OST0050: Connection restored to 391ef4fc-9976-4 (at 10.51.13.11@o2ib3) [249891.649687] Lustre: Skipped 1367 previous similar messages [249892.075133] Lustre: oak-OST0042: Client 287304df-9062-4 (at 10.50.4.25@o2ib2) reconnecting [249892.084452] Lustre: Skipped 1199 previous similar messages [249922.302517] LustreError: 193456:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 1672192(2720768) req@ffff8be71c12f850 x1696192839599360/t0(0) o3->afb0548c-9287-4@10.51.15.2@o2ib3:702/0 lens 488/440 e 0 to 0 dl 1618970607 ref 1 fl Interpret:/0/0 rc 0/0 [250097.308250] LNet: 182180:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.215@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [250097.308257] LNet: 50605:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.215@o2ib5 failed: 5 [250097.308260] LNet: 50605:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 1087 previous similar messages [250097.347478] LNet: 182180:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 60 previous similar messages [250097.358552] LNet: 182180:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.215@o2ib5 rejected: 
no listener at 987 [250097.369831] LNet: 182180:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 131 previous similar messages [250147.306875] LNet: 161601:0:(o2iblnd_cb.c:3182:kiblnd_cm_callback()) 10.0.2.213@o2ib5: ADDR ERROR -110 [250147.317310] LNet: 161601:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.213@o2ib5: connection failed [250148.564809] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.215@o2ib5: 1 seconds [250341.858086] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.50.10.20@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [250341.877596] LustreError: Skipped 1 previous similar message [250363.918744] LustreError: 9692:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be723183050 x1696620933037952/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:437/0 lens 488/448 e 0 to 0 dl 1618971097 ref 1 fl Interpret:/0/0 rc 0/0 [250363.945621] LustreError: 9692:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [250398.312161] LNet: 124667:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.213@o2ib5 rejected: no listener at 987 [250398.323490] LNet: 124667:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 67 previous similar messages [250451.557693] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.213@o2ib5: 3 seconds [250492.276731] Lustre: oak-OST0036: Connection restored to 6e143049-a3ce-4 (at 10.49.30.22@o2ib1) [250492.286447] Lustre: Skipped 281 previous similar messages [250544.206465] Lustre: oak-OST0054: Client f71d1188-d01c-9fa0-935c-a63ad652da8a (at 10.51.12.21@o2ib3) reconnecting [250544.217952] Lustre: Skipped 114 previous similar messages [250569.898711] Lustre: oak-OST0056: haven't heard from client efb6f24b-0336-4 (at 10.49.0.64@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bab393c3800, cur 1618971217 expire 1618971067 last 1618970990 [250577.897666] Lustre: oak-OST005a: haven't heard from client efb6f24b-0336-4 (at 10.49.0.64@o2ib1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be7a5f61400, cur 1618971225 expire 1618971075 last 1618970998 [250577.920018] Lustre: Skipped 4 previous similar messages [250699.322133] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.213@o2ib5: reconnect (invalid service id), 12, 12, msg_size: 4096, queue_depth: 8/-1, max_frags: 256/-1 [250699.322164] LNet: 50602:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.213@o2ib5 failed: 5 [250699.322166] LNet: 50602:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 3378 previous similar messages [250699.340692] LNet: 179836:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.215@o2ib5 rejected: no listener at 987 [250699.340693] LNet: 179836:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 116 previous similar messages [250699.383169] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 191 previous similar messages [250754.385857] Lustre: oak-OST0050: Bulk IO write error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc = -110 [250754.398475] Lustre: Skipped 4 previous similar messages [250754.550522] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.213@o2ib5: 5 seconds [250754.561882] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Skipped 1 previous similar message [251000.327971] LNet: 161601:0:(o2iblnd_cb.c:3224:kiblnd_cm_callback()) 10.0.2.213@o2ib5: ROUTE ERROR -22 [251000.338375] LNet: 161601:0:(o2iblnd_cb.c:2293:kiblnd_peer_connect_failed()) Deleting messages for 10.0.2.213@o2ib5: connection failed [251000.351866] LNet: 161601:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.213@o2ib5 exceeded retry count 0 [251000.366895] LNet: 14182:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) 10.0.2.215@o2ib5 rejected: no listener at 987 
[251000.378064] LNet: 14182:0:(o2iblnd_cb.c:2883:kiblnd_rejected()) Skipped 131 previous similar messages [251025.592139] LustreError: 193198:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bcf33e7c050 x1697586534380160/t0(0) o4->4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d@10.51.2.32@o2ib3:343/0 lens 488/448 e 0 to 0 dl 1618971758 ref 1 fl Interpret:/0/0 rc 0/0 [251025.619279] LustreError: 193198:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [251057.543424] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Timed out tx for 10.0.2.215@o2ib5: 7 seconds [251057.554785] LNet: 50599:0:(o2iblnd_cb.c:3405:kiblnd_check_conns()) Skipped 1 previous similar message [251075.570677] LustreError: 137-5: oak-OST0051_UUID: not available for connect from 10.51.2.32@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [251075.570678] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.2.32@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[251075.570681] LustreError: Skipped 22 previous similar messages [251092.906677] Lustre: oak-OST0040: Connection restored to f619a0ca-1ec1-4 (at 10.49.30.10@o2ib1) [251092.916397] Lustre: Skipped 195 previous similar messages [251155.377500] Lustre: oak-OST0048: Client 2dce41e5-6cf1-0747-6975-01d951e9d8ac (at 10.51.1.46@o2ib3) reconnecting [251155.388888] Lustre: Skipped 78 previous similar messages [251693.180250] Lustre: oak-OST004c: Connection restored to af6bbe15-1a07-4 (at 10.49.22.29@o2ib1) [251693.189988] Lustre: Skipped 185 previous similar messages [251762.859431] Lustre: oak-OST005c: Client 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3) reconnecting [251762.870797] Lustre: Skipped 75 previous similar messages [251763.702845] LustreError: 193187:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be7236ed050 x1696876696245376/t0(0) o4->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:331/0 lens 488/448 e 0 to 0 dl 1618972501 ref 1 fl Interpret:/0/0 rc 0/0 [251763.729918] LustreError: 193187:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [251763.740887] Lustre: oak-OST005c: Bulk IO write error with 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3), client will retry: rc = -110 [251763.755639] Lustre: Skipped 5 previous similar messages [252130.992159] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.51.1.32@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[252131.011585] LustreError: Skipped 8 previous similar messages [252302.760640] Lustre: oak-OST003e: Connection restored to da817238-9680-4 (at 10.51.3.1@o2ib3) [252302.770163] Lustre: Skipped 156 previous similar messages [252357.781685] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [252357.795082] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 4 previous similar messages [252357.806631] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd36ae2a400 [252357.818787] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd36ae29c00 [252357.830936] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd36ae29c00 [252357.843126] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be23838cc00 [252357.855289] LustreError: 241059:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71feff850 x1689647235253568/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:168/0 lens 488/448 e 0 to 0 dl 1618973093 ref 1 fl Interpret:/0/0 rc 0/0 [252357.880768] LustreError: 241059:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 11 previous similar messages [252366.225445] Lustre: oak-OST0030: Client e71e5ef0-3cf7-788c-e457-05fd6b805527 (at 10.51.13.14@o2ib3) reconnecting [252366.236907] Lustre: Skipped 73 previous similar messages [252397.368290] LustreError: 193441:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(3693170) req@ffff8be71cbbc850 x1689699824196352/t0(0) o4->2bbfb89a-a909-4@10.51.5.29@o2ib3:159/0 lens 488/448 e 0 to 0 dl 1618973084 ref 1 fl Interpret:/0/0 rc 0/0 [252397.394907] LustreError: 193441:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 1 previous similar message [252397.406078] Lustre: oak-OST004c: Bulk IO write error with 2bbfb89a-a909-4 (at 
10.51.5.29@o2ib3), client will retry: rc = -110 [252397.418811] Lustre: Skipped 5 previous similar messages [252422.368923] LustreError: 193424:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bd1dd4b5050 x1696863479175296/t0(0) o3->eb8cea22-3545-c4b8-6cb6-b3e875ecfb11@10.51.1.23@o2ib3:178/0 lens 488/440 e 0 to 0 dl 1618973103 ref 1 fl Interpret:/0/0 rc 0/0 [252422.369075] Lustre: oak-OST0036: Bulk IO read error with eb8cea22-3545-c4b8-6cb6-b3e875ecfb11 (at 10.51.1.23@o2ib3), client will retry: rc -110 [252422.369076] Lustre: Skipped 10 previous similar messages [252422.417810] LustreError: 193424:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 1 previous similar message [252503.142531] LustreError: 9693:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be7130ce850 x1689647237179904/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:320/0 lens 488/448 e 0 to 0 dl 1618973245 ref 1 fl Interpret:/0/0 rc 0/0 [252503.167370] LustreError: 9693:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 4 previous similar messages [252503.349447] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.26@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa2c9bf0d [252503.366340] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 6 previous similar messages [252533.706323] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.3.1@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa2cd906d [252911.943959] Lustre: oak-OST005a: Connection restored to 966c9511-11a7-4 (at 10.51.2.54@o2ib3) [252911.953576] Lustre: Skipped 746 previous similar messages [253003.030316] Lustre: oak-OST0048: Client 25ca5655-1c3b-4 (at 10.51.15.8@o2ib3) reconnecting [253003.030318] Lustre: oak-OST0034: Client 25ca5655-1c3b-4 (at 10.51.15.8@o2ib3) reconnecting [253003.030322] Lustre: Skipped 599 previous similar messages [253513.813288] Lustre: oak-OST0036: Connection restored to 
9768808c-691d-4f78-accb-0d29f922d720 (at 10.51.13.21@o2ib3) [253513.825040] Lustre: Skipped 185 previous similar messages [253538.826256] LustreError: 193401:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be13a331050 x1696620946332544/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:592/0 lens 488/448 e 0 to 0 dl 1618974272 ref 1 fl Interpret:/0/0 rc 0/0 [253538.853336] LustreError: 193401:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [253538.864284] Lustre: oak-OST003a: Bulk IO write error with 8b66e4db-fcc9-6215-2cf9-86ed141f42d8 (at 10.51.2.26@o2ib3), client will retry: rc = -110 [253538.879052] Lustre: Skipped 15 previous similar messages [253631.380183] Lustre: oak-OST0036: Client 70e2b68b-00fb-4 (at 10.50.8.53@o2ib2) reconnecting [253631.389528] Lustre: Skipped 46 previous similar messages [253664.517108] LustreError: 137-5: oak-OST004d_UUID: not available for connect from 10.51.2.15@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [253780.971765] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [253780.985950] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd4dcf7e000 [253780.998159] LustreError: 209392:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be760e8f050 x1689647249223808/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:85/0 lens 488/448 e 0 to 0 dl 1618974520 ref 1 fl Interpret:/0/0 rc 0/0 [253781.023485] LustreError: 209392:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 2 previous similar messages [253845.211577] LustreError: 137-5: oak-OST004f_UUID: not available for connect from 10.51.13.14@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[253847.399310] LustreError: 193425:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2093322(4190474) req@ffff8be71b034050 x1689692682201728/t0(0) o4->e9ad2042-8f70-4@10.51.5.32@o2ib3:86/0 lens 488/448 e 0 to 0 dl 1618974521 ref 1 fl Interpret:/0/0 rc 0/0 [253847.399331] LustreError: 241061:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(325) req@ffff8be760e8a850 x1688762220790336/t0(0) o3->5bf08143-64b1-4@10.51.3.2@o2ib3:87/0 lens 488/440 e 0 to 0 dl 1618974522 ref 1 fl Interpret:/0/0 rc 0/0 [253847.399346] Lustre: oak-OST0038: Bulk IO read error with 5bf08143-64b1-4 (at 10.51.3.2@o2ib3), client will retry: rc -110 [253847.399348] Lustre: Skipped 3 previous similar messages [253847.469309] LustreError: 193425:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 13 previous similar messages [253950.500596] Lustre: 193158:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618974424/real 1618974424] req@ffff8bc037dd0000 x1697353961098816/t0(0) o106->oak-OST0058@10.51.13.12@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1618974597 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [254118.424368] Lustre: oak-OST0056: Connection restored to 6b26a19d-ffb3-4 (at 10.50.9.34@o2ib2) [254118.434006] Lustre: Skipped 668 previous similar messages [254241.673810] Lustre: oak-OST0058: Client 976970b5-5fad-aab3-00a6-23fd47844552 (at 10.51.13.16@o2ib3) reconnecting [254241.685279] Lustre: Skipped 543 previous similar messages [254523.386083] LustreError: 193425:0:(ldlm_lib.c:3287:target_bulk_io()) @@@ bulk READ failed: rc -107 req@ffff8be7010d6050 x1697491671356928/t0(0) o3->ed25cc1b-f8c2-8fce-f51c-bb486337b589@10.51.14.5@o2ib3:78/0 lens 488/440 e 0 to 0 dl 1618975268 ref 1 fl Interpret:/0/0 rc 0/0 [254523.413647] Lustre: oak-OST004e: Bulk IO read error with ed25cc1b-f8c2-8fce-f51c-bb486337b589 (at 10.51.14.5@o2ib3), client will retry: rc -107 [254523.428115] Lustre: Skipped 3 previous similar messages [254524.347163] LustreError: 
187417:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be150a2c850 x1697491671356736/t0(0) o3->ed25cc1b-f8c2-8fce-f51c-bb486337b589@10.51.14.5@o2ib3:78/0 lens 488/440 e 0 to 0 dl 1618975268 ref 1 fl Interpret:/0/0 rc 0/0 [254524.374063] LustreError: 187417:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [254718.892707] Lustre: oak-OST0046: Connection restored to 3aa00c0e-4c49-4 (at 10.50.9.38@o2ib2) [254718.902326] Lustre: Skipped 87 previous similar messages [254916.990531] Lustre: oak-OST0052: Client 22f75865-19b9-71a9-aa3b-548e62a806e3 (at 10.51.0.16@o2ib3) reconnecting [254917.001910] Lustre: Skipped 19 previous similar messages [255211.211347] LustreError: 137-5: oak-OST003d_UUID: not available for connect from 10.51.15.8@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [255211.230806] LustreError: Skipped 3 previous similar messages [255239.568791] LustreError: 137-5: oak-OST0051_UUID: not available for connect from 10.51.15.1@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[255321.127493] Lustre: oak-OST0034: Connection restored to a467a7e1-9686-4 (at 10.51.4.13@o2ib3) [255321.137158] Lustre: Skipped 161 previous similar messages [255565.540737] Lustre: oak-OST0034: Client 62377ef1-e7e9-f1fa-66fa-0724308527ab (at 10.51.1.27@o2ib3) reconnecting [255565.540738] Lustre: oak-OST0032: Client 62377ef1-e7e9-f1fa-66fa-0724308527ab (at 10.51.1.27@o2ib3) reconnecting [255565.540741] Lustre: Skipped 37 previous similar messages [255581.929315] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [255581.943398] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcf6515f800 [255581.955559] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd0802afc00 [255581.967711] LustreError: 193407:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71b032850 x1689647271318528/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:376/0 lens 488/448 e 0 to 0 dl 1618976321 ref 1 fl Interpret:/0/0 rc 0/0 [255581.993332] Lustre: oak-OST0058: Bulk IO write error with 9b1546d3-bf78-4 (at 10.51.5.26@o2ib3), client will retry: rc = -110 [255582.006055] Lustre: Skipped 4 previous similar messages [255647.434222] LustreError: 209392:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1411250(2459826) req@ffff8be723796050 x1684931485882560/t0(0) o4->d4f105e0-bc0e-4@10.51.4.1@o2ib3:377/0 lens 488/448 e 0 to 0 dl 1618976322 ref 1 fl Interpret:/0/0 rc 0/0 [255647.434295] LustreError: 193456:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be723303050 x1695844090592896/t0(0) o3->330d404b-804c-4@10.51.15.3@o2ib3:381/0 lens 488/440 e 0 to 0 dl 1618976326 ref 1 fl Interpret:/0/0 rc 0/0 [255647.434300] LustreError: 5983:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(1822) req@ffff8be6bfb9b850 x1688762223672896/t0(0) 
o3->5bf08143-64b1-4@10.51.3.2@o2ib3:380/0 lens 488/440 e 0 to 0 dl 1618976325 ref 1 fl Interpret:/0/0 rc 0/0 [255647.434319] Lustre: oak-OST0038: Bulk IO read error with 5bf08143-64b1-4 (at 10.51.3.2@o2ib3), client will retry: rc -110 [255647.434320] Lustre: Skipped 3 previous similar messages [255647.529864] LustreError: 209392:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 11 previous similar messages [255923.220589] Lustre: oak-OST0046: Connection restored to be97c540-7839-4 (at 10.50.4.35@o2ib2) [255923.230211] Lustre: Skipped 490 previous similar messages [256011.918996] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [256011.932956] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd221d11000 [256011.945139] LustreError: 9692:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be713f97050 x1696617254554752/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:48/0 lens 488/448 e 0 to 0 dl 1618976748 ref 1 fl Interpret:/0/0 rc 0/0 [256011.972602] Lustre: oak-OST0050: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 [256011.987455] Lustre: Skipped 14 previous similar messages [256012.744399] LustreError: 193439:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd714727050 x1696617254558272/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:48/0 lens 488/448 e 0 to 0 dl 1618976748 ref 1 fl Interpret:/0/0 rc 0/0 [256013.855422] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.13.12@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa4ab69ed [256047.440805] LustreError: 193454:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 183466(1232042) req@ffff8be6cb8ff850 x1689689228160640/t0(0) o4->8833c96a-f6c7-4@10.51.5.31@o2ib3:41/0 lens 488/448 e 0 to 0 
dl 1618976741 ref 1 fl Interpret:/0/0 rc 0/0 [256118.916652] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [256118.929973] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bddaaf18400 [256118.942124] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd538ba9400 [256118.942132] LustreError: 209388:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be91ea13050 x1689383948793920/t0(0) o4->530d8c05-8fea-4@10.51.3.6@o2ib3:146/0 lens 488/448 e 0 to 0 dl 1618976846 ref 1 fl Interpret:/0/0 rc 0/0 [256118.979600] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd555717c00 [256118.991754] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd555717c00 [256119.003900] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd555712400 [256119.016051] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd556867000 [256135.693060] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [256135.707075] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcbbb50c800 [256135.719234] LustreError: 193439:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd7ee4e4050 x1689653444060736/t0(0) o4->7bc8c8b5-b218-4@10.51.5.5@o2ib3:169/0 lens 488/448 e 0 to 0 dl 1618976869 ref 1 fl Interpret:/0/0 rc 0/0 [256167.251823] Lustre: oak-OST0056: Client 8833c96a-f6c7-4 (at 10.51.5.31@o2ib3) reconnecting [256167.261181] Lustre: Skipped 319 previous similar messages [256170.924571] LustreError: 137-5: oak-OST0037_UUID: not available for connect from 10.51.0.71@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [256170.924573] LustreError: 137-5: oak-OST0041_UUID: not available for connect from 10.51.0.71@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [256172.441686] LustreError: 241059:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bdc8af6a850 x1689383948794176/t0(0) o4->530d8c05-8fea-4@10.51.3.6@o2ib3:146/0 lens 488/448 e 0 to 0 dl 1618976846 ref 1 fl Interpret:/0/0 rc 0/0 [256172.442013] Lustre: oak-OST0032: Bulk IO read error with eb8cea22-3545-c4b8-6cb6-b3e875ecfb11 (at 10.51.1.23@o2ib3), client will retry: rc -110 [256172.442014] Lustre: Skipped 1 previous similar message [256172.487391] LustreError: 241059:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [256172.498189] Lustre: oak-OST003a: Bulk IO write error with 530d8c05-8fea-4 (at 10.51.3.6@o2ib3), client will retry: rc = -110 [256172.510814] Lustre: Skipped 10 previous similar messages [256197.442147] LustreError: 5995:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1663717(2712293) req@ffff8be71c0a2850 x1684931486776448/t0(0) o4->d4f105e0-bc0e-4@10.51.4.1@o2ib3:171/0 lens 488/448 e 0 to 0 dl 1618976871 ref 1 fl Interpret:/0/0 rc 0/0 [256312.730609] LustreError: 147686:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be6fceea850 x1684931486992192/t0(0) o4->d4f105e0-bc0e-4@10.51.4.1@o2ib3:345/0 lens 488/448 e 0 to 0 dl 1618977045 ref 1 fl Interpret:/0/0 rc 0/0 [256312.755568] LustreError: 147686:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 7 previous similar messages [256548.791338] Lustre: oak-OST0032: Connection restored to 5514f76a-ebfd-4c1b-c28e-c2a228cc78fe (at 10.51.1.18@o2ib3) [256548.803064] Lustre: Skipped 262 previous similar messages [256649.605737] LustreError: 137-5: oak-OST0041_UUID: not available for connect from 10.51.6.27@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [256824.796457] Lustre: oak-OST004e: Client 8cd0fce8-38e8-d9db-a09e-3d56f1d88309 (at 10.51.2.5@o2ib3) reconnecting [256824.807725] Lustre: Skipped 82 previous similar messages [256994.672983] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [256994.686952] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be1e80c8000 [256994.699103] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7577ec000 [256994.711258] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb5e8f5000 [256994.723426] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bddc89d2000 [256994.735580] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb5e8f5000 [257047.484468] LustreError: 242900:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be76059a050 x1689647284933824/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:275/0 lens 488/448 e 0 to 0 dl 1618977730 ref 1 fl Interpret:/0/0 rc 0/0 [257047.484476] LustreError: 5995:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(2463145) req@ffff8be71487a850 x1689647284941888/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:275/0 lens 488/448 e 0 to 0 dl 1618977730 ref 1 fl Interpret:/0/0 rc 0/0 [257047.484591] Lustre: oak-OST0052: Bulk IO write error with 9b1546d3-bf78-4 (at 10.51.5.26@o2ib3), client will retry: rc = -110 [257047.484592] Lustre: Skipped 3 previous similar messages [257047.484677] LustreError: 11105:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8be7230b5050 x1695169333197760/t0(0) o3->c6866ec8-b698-4@10.51.14.7@o2ib3:279/0 lens 488/440 e 0 to 0 dl 1618977734 ref 1 fl Interpret:/0/0 
rc 0/0 [257047.484815] Lustre: oak-OST0052: Bulk IO read error with 7bc8c8b5-b218-4 (at 10.51.5.5@o2ib3), client will retry: rc -110 [257047.484816] Lustre: Skipped 2 previous similar messages [257047.599606] LustreError: 242900:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [257160.794182] Lustre: oak-OST0058: Connection restored to 18349d35-d7be-4 (at 10.51.14.24@o2ib3) [257160.803924] Lustre: Skipped 206 previous similar messages [257161.540486] Lustre: 193153:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618977635/real 1618977635] req@ffff8bac113f4c80 x1697353963668992/t0(0) o106->oak-OST0050@10.51.13.4@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1618977808 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [257161.571194] Lustre: 193153:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [257167.669272] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [257167.683108] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bccbd4b9800 [257167.695280] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be313d7fc00 [257167.707434] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be313d7fc00 [257167.719596] LustreError: 193407:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be6ba677050 x1696617268329536/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:451/0 lens 488/448 e 0 to 0 dl 1618977906 ref 1 fl Interpret:/0/0 rc 0/0 [257167.719607] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be1cef9a800 [257167.719631] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be1cef9a800 [257180.790753] Lustre: oak-OST0042: haven't heard from client 
7e0a9a61-2b20-4 (at 10.51.2.19@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bce23cb8000, cur 1618977828 expire 1618977678 last 1618977601 [257180.813205] Lustre: Skipped 114 previous similar messages [257195.766068] Lustre: oak-OST003e: haven't heard from client 1d6f2d21-9fc7-4 (at 10.51.1.44@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bdb8b591800, cur 1618977843 expire 1618977693 last 1618977616 [257220.750118] LustreError: 241055:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bca7579f050 x1696620955971264/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:488/0 lens 488/448 e 0 to 0 dl 1618977943 ref 1 fl Interpret:/0/0 rc 0/0 [257222.493800] LustreError: 209394:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bdb645a8050 x1696619851669632/t0(0) o3->84772643-3e0c-a25c-a9b0-965c7b792170@10.51.13.15@o2ib3:452/0 lens 488/440 e 0 to 0 dl 1618977907 ref 1 fl Interpret:/0/0 rc 0/0 [257222.493832] LustreError: 241052:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bdb645ad050 x1696619851669888/t0(0) o3->84772643-3e0c-a25c-a9b0-965c7b792170@10.51.13.15@o2ib3:452/0 lens 488/440 e 0 to 0 dl 1618977907 ref 1 fl Interpret:/0/0 rc 0/0 [257222.493834] LustreError: 241052:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 4 previous similar messages [257222.494008] Lustre: oak-OST003a: Bulk IO read error with 84772643-3e0c-a25c-a9b0-965c7b792170 (at 10.51.13.15@o2ib3), client will retry: rc -110 [257222.494009] Lustre: Skipped 7 previous similar messages [257222.581099] LustreError: 209394:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [257454.364869] Lustre: oak-OST005e: Client ece9f754-b231-f7f0-99a6-c8016ca66797 (at 10.51.1.32@o2ib3) reconnecting [257454.376239] Lustre: Skipped 614 previous similar messages [257688.897045] LustreError: 137-5: oak-OST0039_UUID: not available for connect from 
10.51.4.58@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [257764.076611] Lustre: oak-OST0042: Connection restored to d2fa787b-1420-4 (at 10.50.1.14@o2ib2) [257764.086239] Lustre: Skipped 1003 previous similar messages [258060.441866] Lustre: oak-OST003e: Client 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3) reconnecting [258060.453330] Lustre: Skipped 67 previous similar messages [258217.867345] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [258217.881774] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdb8b597c00 [258225.686200] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [258225.700350] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bde907a3000 [258272.538111] LustreError: 11105:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd870eb4850 x1689647299464512/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:743/0 lens 488/448 e 0 to 0 dl 1618978953 ref 1 fl Interpret:/0/0 rc 0/0 [258272.538155] LustreError: 5998:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 182015(1230591) req@ffff8be45c316050 x1684955533164160/t0(0) o4->da817238-9680-4@10.51.3.1@o2ib3:744/0 lens 488/448 e 0 to 0 dl 1618978954 ref 1 fl Interpret:/0/0 rc 0/0 [258272.538157] LustreError: 5998:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 7 previous similar messages [258272.538235] Lustre: oak-OST0046: Bulk IO write error with da817238-9680-4 (at 10.51.3.1@o2ib3), client will retry: rc = -110 [258272.538235] Lustre: Skipped 10 previous similar messages [258272.619152] LustreError: 11105:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [258297.541327] LustreError: 193450:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ 
truncated bulk GET 2097152(4194304) req@ffff8be459144850 x1689647299543616/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:6/0 lens 488/448 e 0 to 0 dl 1618978971 ref 1 fl Interpret:/0/0 rc 0/0 [258297.567730] LustreError: 193450:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 5 previous similar messages [258364.250573] Lustre: oak-OST0032: Connection restored to (at 10.51.2.19@o2ib3) [258364.258737] Lustre: Skipped 407 previous similar messages [258365.211322] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.4.44@o2ib3 ns: filter-oak-OST0038_UUID lock: ffff8bcc740eba80/0xf81cb91fefcaaac lrc: 3/0,0 mode: PW/PW res: [0xbcdd37:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400020020 nid: 10.51.4.44@o2ib3 remote: 0x3969149f36dbf813 expref: 6 pid: 192987 timeout: 258370 lvb_type: 0 [258492.637773] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [258492.652078] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be20f4d7400 [258492.664242] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bde6f889c00 [258492.676406] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bde6f88f400 [258492.688550] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bde6f88f400 [258492.700694] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be70d61d000 [258492.712869] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbf3c92ac00 [258492.725024] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd5b9722800 [258492.737188] LustreError: 50599:0:(events.c:450:server_bulk_callback()) 
event type 5, status -103, desc ffff8bd5b9722800 [258492.749349] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be20f4d7400 [258547.570205] LustreError: 193411:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be724675050 x1689654149202688/t0(0) o4->3dc6d22e-58b7-4@10.51.5.3@o2ib3:257/0 lens 488/448 e 0 to 0 dl 1618979222 ref 1 fl Interpret:/0/0 rc 0/0 [258547.570604] LustreError: 193404:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1414476(2463052) req@ffff8bd1e29e8850 x1689689231969408/t0(0) o4->8833c96a-f6c7-4@10.51.5.31@o2ib3:262/0 lens 488/448 e 0 to 0 dl 1618979227 ref 1 fl Interpret:/0/0 rc 0/0 [258547.570735] Lustre: oak-OST0034: Bulk IO write error with 8833c96a-f6c7-4 (at 10.51.5.31@o2ib3), client will retry: rc = -110 [258547.570736] Lustre: Skipped 8 previous similar messages [258547.570927] LustreError: 193427:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8bdfaad5d050 x1694752044676800/t0(0) o3->64ebd172-d79e-4@10.51.13.3@o2ib3:267/0 lens 488/440 e 0 to 0 dl 1618979232 ref 1 fl Interpret:/0/0 rc 0/0 [258547.570939] Lustre: oak-OST0034: Bulk IO read error with 64ebd172-d79e-4 (at 10.51.13.3@o2ib3), client will retry: rc -110 [258547.570940] Lustre: Skipped 2 previous similar messages [258547.684713] LustreError: 193411:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 5 previous similar messages [258660.546244] Lustre: oak-OST003a: Client eb8cea22-3545-c4b8-6cb6-b3e875ecfb11 (at 10.51.1.23@o2ib3) reconnecting [258660.557675] Lustre: Skipped 822 previous similar messages [258822.599919] LustreError: 193188:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1594210(3691362) req@ffff8be713f96850 x1689647303933312/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:548/0 lens 488/448 e 0 to 0 dl 1618979513 ref 1 fl Interpret:/0/0 rc 0/0 [258822.600025] Lustre: oak-OST0050: Bulk IO write error with 9b1546d3-bf78-4 (at 
10.51.5.26@o2ib3), client will retry: rc = -110 [258822.600026] Lustre: Skipped 3 previous similar messages [258822.645230] LustreError: 193188:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 5 previous similar messages [258822.903139] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.26@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa63799dd [258831.114447] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.26@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa6379d75 [258831.131820] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 3 previous similar messages [258893.996924] LustreError: 242901:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd2b7a62850 x1697235615407808/t0(0) o4->60fdba1c-732a-b99c-63c8-be091395f5f0@10.51.12.5@o2ib3:656/0 lens 488/448 e 0 to 0 dl 1618979621 ref 1 fl Interpret:/0/0 rc 0/0 [258949.716642] LustreError: 5988:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd714d38850 x1689647305622464/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:725/0 lens 504/448 e 0 to 0 dl 1618979690 ref 1 fl Interpret:/0/0 rc 0/0 [258950.953568] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.26@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa659b78d [258950.970550] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [258965.652614] Lustre: oak-OST0040: Connection restored to dbd26cab-2b6d-4 (at 10.50.2.41@o2ib2) [258965.662260] Lustre: Skipped 1730 previous similar messages [258972.088061] LustreError: 137-5: oak-OST004f_UUID: not available for connect from 10.51.13.6@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [258991.792203] LustreError: 137-5: oak-OST0037_UUID: not available for connect from 10.51.2.5@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [259024.493695] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.12.5@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [259024.513108] LustreError: Skipped 3 previous similar messages [259102.846783] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [259102.860574] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7549d7000 [259102.872833] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd7549d2000 [259102.885032] LustreError: 241055:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be7170f8050 x1694627388116160/t0(0) o4->48001085-6d7b-4@10.51.12.13@o2ib3:118/0 lens 488/448 e 0 to 0 dl 1618979838 ref 1 fl Interpret:/0/0 rc 0/0 [259172.637643] LustreError: 193187:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bd2cc20a050 x1695838479859840/t0(0) o3->eab9e2cf-af4c-4@10.51.15.1@o2ib3:125/0 lens 488/440 e 0 to 0 dl 1618979845 ref 1 fl Interpret:/0/0 rc 0/0 [259172.663142] Lustre: oak-OST004a: Bulk IO read error with eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3), client will retry: rc -110 [259172.675582] Lustre: Skipped 5 previous similar messages [259265.888066] Lustre: oak-OST005e: Client f3fb32eb-162a-4 (at 10.51.4.58@o2ib3) reconnecting [259265.897395] Lustre: Skipped 88 previous similar messages [259407.943955] LustreError: 147686:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be731ba2850 x1696617273829696/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:429/0 lens 488/448 e 0 to 0 dl 1618980149 ref 1 fl Interpret:/0/0 rc 0/0 [259407.971128] LustreError: 147686:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [259407.981762] 
Lustre: oak-OST0042: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 [259407.996613] Lustre: Skipped 7 previous similar messages [259439.366788] LustreError: 137-5: oak-OST0049_UUID: not available for connect from 10.51.14.24@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [259522.677688] LustreError: 241060:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(98304) req@ffff8be6a9eaa850 x1696192884061760/t0(0) o3->afb0548c-9287-4@10.51.15.2@o2ib3:486/0 lens 488/440 e 0 to 0 dl 1618980206 ref 1 fl Interpret:/0/0 rc 0/0 [259522.703236] Lustre: oak-OST0032: Bulk IO read error with afb0548c-9287-4 (at 10.51.15.2@o2ib3), client will retry: rc -110 [259565.657132] Lustre: oak-OST005c: Connection restored to 93738c60-9175-4 (at 10.50.1.70@o2ib2) [259565.666767] Lustre: Skipped 384 previous similar messages [259665.743989] LustreError: 137-5: oak-OST005f_UUID: not available for connect from 10.50.6.68@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[259841.648908] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [259841.662808] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcee8d1c400 [259841.674994] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bde309a5000 [259841.687142] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bde309a5000 [259841.699293] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcd56b78000 [259841.699319] LustreError: 187416:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71abd3050 x1689653462768704/t0(0) o4->7bc8c8b5-b218-4@10.51.5.5@o2ib3:105/0 lens 488/448 e 0 to 0 dl 1618980580 ref 1 fl Interpret:/0/0 rc 0/0 [259866.097924] Lustre: oak-OST004a: Client b1aa0e56-7229-b79d-2adf-6fd2266928f7 (at 10.51.12.6@o2ib3) reconnecting [259866.097924] Lustre: oak-OST005a: Client b1aa0e56-7229-b79d-2adf-6fd2266928f7 (at 10.51.12.6@o2ib3) reconnecting [259866.097928] Lustre: Skipped 83 previous similar messages [259896.711883] LustreError: 193395:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd289d13850 x1689647316087168/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:102/0 lens 488/448 e 0 to 0 dl 1618980577 ref 1 fl Interpret:/0/0 rc 0/0 [259896.712123] LustreError: 209394:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1896610(3993762) req@ffff8be63713b050 x1689647316310528/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:105/0 lens 488/448 e 0 to 0 dl 1618980580 ref 1 fl Interpret:/0/0 rc 0/0 [259896.712245] LustreError: 204447:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8be6138bb050 x1696617274478528/t0(0) o3->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:110/0 lens 488/440 e 0 to 0 dl 1618980585 ref 1 fl 
Interpret:/0/0 rc 0/0 [259896.712278] Lustre: oak-OST005e: Bulk IO read error with 95e7771d-4de6-4 (at 10.51.2.24@o2ib3), client will retry: rc -110 [259896.804875] LustreError: 193395:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [260010.910916] LustreError: 241059:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bd1c5ae5050 x1685033016013312/t0(0) o3->3f803d26-c19e-4@10.51.2.20@o2ib3:280/0 lens 488/440 e 0 to 0 dl 1618980755 ref 1 fl Interpret:/0/0 rc 0/0 [260010.935886] Lustre: oak-OST0058: Bulk IO read error with 3f803d26-c19e-4 (at 10.51.2.20@o2ib3), client will retry: rc -110 [260010.948317] Lustre: Skipped 4 previous similar messages [260012.679388] Lustre: oak-OST003a: Bulk IO write error with da817238-9680-4 (at 10.51.3.1@o2ib3), client will retry: rc = -110 [260012.692013] Lustre: Skipped 15 previous similar messages [260015.429546] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.3.6@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa71c9fa5 [260015.446358] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 4 previous similar messages [260016.513041] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.3.6@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa71c9f95 [260019.491799] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.3.6@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa71ca76d [260019.508652] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 3 previous similar messages [260031.721146] LustreError: 193409:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be71aa11850 x1684931494164416/t0(0) o4->d4f105e0-bc0e-4@10.51.4.1@o2ib3:286/0 lens 488/448 e 0 to 0 dl 1618980761 ref 1 fl Interpret:/0/0 rc 0/0 [260031.746134] LustreError: 193409:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [260168.258099] Lustre: 
oak-OST0052: Connection restored to df5881f1-066b-4 (at 10.50.1.71@o2ib2) [260168.267731] Lustre: Skipped 1118 previous similar messages [260276.596012] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [260276.608634] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff6a0) failed: 5 [260276.609236] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [260276.609241] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd16cd53c00 [260276.610016] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd16cd53c00 [260276.610597] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd1ff04a800 [260276.611301] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd1ff04a800 [260276.611304] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 3, status -5, desc ffff8bde29188c00 [260276.611309] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd556860800 [260276.611312] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd556860800 [260276.611344] LustreError: 193196:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be660b06850 x1689383956582144/t0(0) o4->530d8c05-8fea-4@10.51.3.6@o2ib3:541/0 lens 488/448 e 0 to 0 dl 1618981016 ref 1 fl Interpret:/0/0 rc 0/0 [260276.740190] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 917 previous similar messages [260321.749784] LustreError: 193452:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be6cb8ff850 x1694752236982912/t0(0) o3->64ebd172-d79e-4@10.51.13.3@o2ib3:541/0 lens 488/440 e 0 to 0 dl 1618981016 ref 1 fl Interpret:/0/0 rc 0/0 [260321.749789] 
LustreError: 241055:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be716449850 x1694752236983040/t0(0) o3->64ebd172-d79e-4@10.51.13.3@o2ib3:541/0 lens 488/440 e 0 to 0 dl 1618981016 ref 1 fl Interpret:/0/0 rc 0/0 [260321.749791] LustreError: 241055:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 3 previous similar messages [260321.749823] LustreError: 193439:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(3689999) req@ffff8be75702f850 x1689383956582336/t0(0) o4->530d8c05-8fea-4@10.51.3.6@o2ib3:541/0 lens 488/448 e 0 to 0 dl 1618981016 ref 1 fl Interpret:/0/0 rc 0/0 [260321.749824] LustreError: 193439:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 12 previous similar messages [260321.749972] Lustre: oak-OST005e: Bulk IO read error with 64ebd172-d79e-4 (at 10.51.13.3@o2ib3), client will retry: rc -110 [260321.862162] LustreError: 193452:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [260346.752926] LustreError: 241060:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be723e3e050 x1694752237098048/t0(0) o3->64ebd172-d79e-4@10.51.13.3@o2ib3:544/0 lens 488/440 e 0 to 0 dl 1618981019 ref 1 fl Interpret:/0/0 rc 0/0 [260346.778390] Lustre: oak-OST005e: Bulk IO read error with 64ebd172-d79e-4 (at 10.51.13.3@o2ib3), client will retry: rc -110 [260346.790843] Lustre: Skipped 2 previous similar messages [260441.815378] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [260441.829485] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd6f0968000 [260441.841640] LustreError: 9691:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be65439b850 x1689699850081152/t0(0) o4->2bbfb89a-a909-4@10.51.5.29@o2ib3:702/0 lens 488/448 e 0 to 0 dl 1618981177 ref 1 fl Interpret:/0/0 rc 0/0 [260448.074709] LustreError: 
193188:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be901ce0850 x1696617275362112/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:705/0 lens 488/448 e 0 to 0 dl 1618981180 ref 1 fl Interpret:/0/0 rc 0/0 [260451.564110] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.3.6@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa766889d [260451.580909] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [260470.417710] Lustre: oak-OST0036: Client 56a5a766-0782-0626-7e81-90dde2e2789a (at 10.51.2.28@o2ib3) reconnecting [260470.429122] Lustre: Skipped 688 previous similar messages [260496.766338] LustreError: 193443:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 3145728(3689416) req@ffff8be723c62050 x1689654044608576/t0(0) o4->8d217df6-ca17-4@10.51.5.4@o2ib3:704/0 lens 488/448 e 0 to 0 dl 1618981179 ref 1 fl Interpret:/0/0 rc 0/0 [260558.601857] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [260558.614871] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be0f3a02400 [260558.627023] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7aa9ff800 [260621.776764] LustreError: 193455:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71d35b850 x1689647324545152/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:64/0 lens 488/448 e 0 to 0 dl 1618981294 ref 1 fl Interpret:/0/0 rc 0/0 [260621.776834] LustreError: 193433:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 3145728(4194304) req@ffff8be722698850 x1689647324547904/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:64/0 lens 488/448 e 0 to 0 dl 1618981294 ref 1 fl Interpret:/0/0 rc 0/0 [260621.776867] LustreError: 11105:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) 
req@ffff8be71ef06050 x1696617275584640/t0(0) o3->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:65/0 lens 488/440 e 0 to 0 dl 1618981295 ref 1 fl Interpret:/0/0 rc 0/0 [260621.776954] Lustre: oak-OST0042: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 [260621.776956] Lustre: Skipped 9 previous similar messages [260621.777033] Lustre: oak-OST003a: Bulk IO read error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc -110 [260621.892274] LustreError: 193455:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [260732.784063] LustreError: 193452:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be6552e0850 x1689711349889088/t0(0) o4->19d08a37-93d2-4@10.51.5.30@o2ib3:244/0 lens 488/448 e 0 to 0 dl 1618981474 ref 1 fl Interpret:/0/0 rc 0/0 [260732.809103] LustreError: 193452:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [260737.117133] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.30@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa797dedd [260768.574481] Lustre: oak-OST0034: Connection restored to fe3cfa19-279f-4 (at 10.50.1.67@o2ib2) [260768.584102] Lustre: Skipped 1344 previous similar messages [260793.606918] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [260793.621098] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdeb46b7c00 [260793.633291] LustreError: 193424:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be6d7f36850 x1689653468466368/t0(0) o4->7bc8c8b5-b218-4@10.51.5.5@o2ib3:303/0 lens 488/448 e 0 to 0 dl 1618981533 ref 1 fl Interpret:/0/0 rc 0/0 [260846.802318] LustreError: 9694:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1885336(2933912) req@ffff8be721d6f050 
x1696620964549376/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:295/0 lens 488/448 e 0 to 0 dl 1618981525 ref 1 fl Interpret:/0/0 rc 0/0 [260846.802761] LustreError: 187416:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8be6d7f32850 x1685053624958144/t0(0) o3->059e2cc2-60f6-4@10.51.2.17@o2ib3:303/0 lens 488/440 e 0 to 0 dl 1618981533 ref 1 fl Interpret:/0/0 rc 0/0 [260846.802763] LustreError: 187416:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 1 previous similar message [260846.802776] Lustre: oak-OST0030: Bulk IO read error with 059e2cc2-60f6-4 (at 10.51.2.17@o2ib3), client will retry: rc -110 [260846.802777] Lustre: Skipped 2 previous similar messages [260846.885143] LustreError: 9694:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 12 previous similar messages [260861.659884] Lustre: oak-OST0040: haven't heard from client 006fe87a-1b9a-4 (at 10.51.15.15@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc4bfb1a800, cur 1618981509 expire 1618981359 last 1618981282 [260865.686340] LustreError: 137-5: oak-OST0051_UUID: not available for connect from 10.51.2.1@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [260865.705702] LustreError: Skipped 3 previous similar messages [260866.694660] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.51.6.17@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [260866.714119] LustreError: Skipped 1 previous similar message [260871.661698] Lustre: oak-OST003a: haven't heard from client 006fe87a-1b9a-4 (at 10.51.15.15@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be5a5bdf800, cur 1618981519 expire 1618981369 last 1618981292 [260882.403840] LustreError: 137-5: oak-OST0047_UUID: not available for connect from 10.51.6.13@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [260882.423256] LustreError: Skipped 1 previous similar message [260892.594236] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [260892.608478] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be553a7bc00 [260946.808073] LustreError: 193411:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(4194304) req@ffff8be150a2a850 x1689647328364096/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:398/0 lens 488/448 e 0 to 0 dl 1618981628 ref 1 fl Interpret:/0/0 rc 0/0 [260946.808433] LustreError: 193452:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(8192) req@ffff8bcc70629850 x1695385207134144/t0(0) o3->c7c97132-e759-4@10.51.15.4@o2ib3:406/0 lens 488/440 e 0 to 0 dl 1618981636 ref 1 fl Interpret:/0/0 rc 0/0 [260946.808446] Lustre: oak-OST0042: Bulk IO read error with c7c97132-e759-4 (at 10.51.15.4@o2ib3), client will retry: rc -110 [260946.872521] LustreError: 193411:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 4 previous similar messages [260981.634764] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [260981.649532] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bccda9a7400 [260981.661681] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be3e1abd800 [260997.159885] LustreError: 209394:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd36d0e7050 x1696863955587648/t0(0) o4->8ff6000c-d966-1cda-f3a5-455db4eb8783@10.51.2.23@o2ib3:505/0 lens 488/448 e 0 to 0 dl 1618981735 ref 1 fl Interpret:/0/0 rc 0/0 [261055.128307] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.26@o2ib3 for invalid MD 
0x1676da12a0d3b7fa.0xa7cfb2d5 [261055.145249] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [261069.206316] LustreError: 193411:0:(ldlm_lib.c:3287:target_bulk_io()) @@@ bulk WRITE failed: rc -107 req@ffff8be71c174050 x1689692697707968/t0(0) o4->e9ad2042-8f70-4@10.51.5.32@o2ib3:584/0 lens 488/448 e 0 to 0 dl 1618981814 ref 1 fl Interpret:/0/0 rc 0/0 [261069.231670] LustreError: 193411:0:(ldlm_lib.c:3287:target_bulk_io()) Skipped 3 previous similar messages [261070.668137] Lustre: oak-OST0034: Client bc175ba0-453f-4 (at 10.51.1.25@o2ib3) reconnecting [261070.677489] Lustre: Skipped 1395 previous similar messages [261072.717464] Lustre: oak-OST0038: haven't heard from client 5bf08143-64b1-4 (at 10.51.3.2@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be457845400, cur 1618981720 expire 1618981570 last 1618981493 [261072.739689] Lustre: Skipped 1 previous similar message [261078.659223] Lustre: oak-OST0044: haven't heard from client 5bf08143-64b1-4 (at 10.51.3.2@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be457843400, cur 1618981726 expire 1618981576 last 1618981499 [261078.972081] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.26@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa7d3a565 [261078.988973] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [261086.472027] LustreError: 137-5: oak-OST0045_UUID: not available for connect from 10.51.2.32@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[261138.799121] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [261138.812895] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdb0ef01000 [261138.825042] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcbbb50cc00 [261138.837192] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcbbb50cc00 [261138.849346] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb93d1c800 [261196.840327] LustreError: 193444:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(2124422) req@ffff8be6a2236050 x1689653470067904/t0(0) o4->7bc8c8b5-b218-4@10.51.5.5@o2ib3:640/0 lens 488/448 e 0 to 0 dl 1618981870 ref 1 fl Interpret:/0/0 rc 0/0 [261196.840911] LustreError: 193415:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be6a2235050 x1696863498732864/t0(0) o3->eb8cea22-3545-c4b8-6cb6-b3e875ecfb11@10.51.1.23@o2ib3:648/0 lens 488/440 e 0 to 0 dl 1618981878 ref 1 fl Interpret:/0/0 rc 0/0 [261196.840912] LustreError: 193415:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [261196.840920] LustreError: 209396:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8bcf3b8b1850 x1694752293601024/t0(0) o3->64ebd172-d79e-4@10.51.13.3@o2ib3:648/0 lens 488/440 e 0 to 0 dl 1618981878 ref 1 fl Interpret:/0/0 rc 0/0 [261196.840934] Lustre: oak-OST005e: Bulk IO read error with 64ebd172-d79e-4 (at 10.51.13.3@o2ib3), client will retry: rc -110 [261196.942686] LustreError: 193444:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 4 previous similar messages [261207.574499] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [261207.589247] LustreError: 50599:0:(events.c:450:server_bulk_callback()) 
event type 5, status -103, desc ffff8be71fdbe800 [261207.601411] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71fdbe800 [261207.613565] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bca6d78f800 [261207.625717] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71fdbd400 [261207.637874] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71fdbd400 [261207.650065] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdeb46b1c00 [261227.798108] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be723de0c00 [261227.810270] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bda0b0e5c00 [261227.822443] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be7230c9000 [261227.822719] Lustre: oak-OST0036: Bulk IO write error with 9b1546d3-bf78-4 (at 10.51.5.26@o2ib3), client will retry: rc = -110 [261227.822721] Lustre: Skipped 35 previous similar messages [261246.847921] LustreError: 5991:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1471140) req@ffff8be7460ed850 x1688763562920960/t0(0) o4->a06ff198-d843-4@10.51.13.17@o2ib3:709/0 lens 488/448 e 0 to 0 dl 1618981939 ref 1 fl Interpret:/0/0 rc 0/0 [261246.874413] LustreError: 5991:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 1 previous similar message [261271.851000] LustreError: 204447:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(30091) req@ffff8bd95afdc850 x1685012502538752/t0(0) o3->182778a0-b920-4@10.51.0.61@o2ib3:721/0 lens 488/440 e 0 to 0 dl 1618981951 ref 1 fl Interpret:/0/0 rc 0/0 [261271.876592] LustreError: 204447:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 1 previous similar message [261312.850517] 
LustreError: 209394:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be760ed7050 x1694707945827328/t0(0) o4->7d4930b0-1dcd-4@10.51.13.8@o2ib3:70/0 lens 488/448 e 0 to 0 dl 1618982055 ref 1 fl Interpret:/0/0 rc 0/0 [261312.875481] LustreError: 209394:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [261317.472884] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.13.8@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xa7ff000d [261317.489776] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [261330.475049] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ffbe0) failed: 5 [261330.475908] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bafb85c9400 [261330.477218] LNet: 179836:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.214@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [261330.477220] LNet: 179836:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 196 previous similar messages [261330.485880] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.214@o2ib5 exceeded retry count 0 [261330.485882] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcfac75e800 [261330.485884] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [261330.485886] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcfac759000 [261330.485887] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbda45ad000 [261330.485895] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcfac759000 [261330.485898] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be4b4f84000 
[261330.485902] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be4b4f84000 [261330.485904] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc0175f2c00 [261330.632919] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 251 previous similar messages [261368.593841] Lustre: oak-OST003c: Connection restored to 6e847374-86a8-4 (at 10.50.6.49@o2ib2) [261368.603462] Lustre: Skipped 1893 previous similar messages [261396.862606] LustreError: 238499:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(8192) req@ffff8bc4cac76850 x1691637963934336/t0(0) o3->a8e1d696-3374-4@10.50.13.11@o2ib2:84/0 lens 488/440 e 0 to 0 dl 1618982069 ref 1 fl Interpret:/0/0 rc 0/0 [261396.862611] LustreError: 237964:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1512140) req@ffff8bc44e0e4050 x1696878420029376/t0(0) o4->23493cad-75f9-843c-a532-11e13d73ee82@10.50.2.51@o2ib2:84/0 lens 488/448 e 0 to 0 dl 1618982069 ref 1 fl Interpret:/0/0 rc 0/0 [261396.862613] LustreError: 237964:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 6 previous similar messages [261396.927660] LustreError: 238499:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 27 previous similar messages [261407.650747] Lustre: oak-OST0036: haven't heard from client a489df6d-ddfc-4 (at 10.51.1.57@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be20f4d1800, cur 1618982055 expire 1618981905 last 1618981828 [261499.360142] Lustre: 188016:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618981973/real 1618981973] req@ffff8bbd8a4c5a00 x1697353970133056/t0(0) o106->oak-OST0036@10.50.5.63@o2ib2:15/16 lens 296/280 e 0 to 1 dl 1618982146 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [261514.782594] LustreError: 137-5: oak-OST005b_UUID: not available for connect from 10.50.1.70@o2ib2 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [261526.639846] Lustre: oak-OST0030: haven't heard from client d99c049c-e9d8-4 (at 10.50.9.19@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bbb596cb000, cur 1618982174 expire 1618982024 last 1618981947 [261643.787288] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [261643.800780] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 2 previous similar messages [261643.813765] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be6c89fec00 [261667.787329] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be70c148000 [261667.799513] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be70c148000 [261675.260404] Lustre: oak-OST005a: Client 438da15f-c993-4 (at 10.51.0.68@o2ib3) reconnecting [261675.269732] Lustre: Skipped 1898 previous similar messages [261696.895454] LustreError: 193433:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(2459643) req@ffff8bdf8c7ff050 x1689649857154432/t0(0) o4->25cf3830-668c-4@10.51.5.22@o2ib3:398/0 lens 488/448 e 0 to 0 dl 1618982383 ref 1 fl Interpret:/0/0 rc 0/0 [261696.922042] LustreError: 193433:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 1 previous similar message [261781.060582] Lustre: oak-OST0052: Bulk IO read error with 826cc441-b037-301f-308e-d8ef428b1b96 (at 10.50.2.66@o2ib2), client will retry: rc -110 [261781.075146] Lustre: Skipped 41 previous similar messages [261976.520551] Lustre: oak-OST0040: Connection restored to 7af31256-f01a-4 (at 10.50.1.69@o2ib2) [261976.530176] Lustre: Skipped 2081 previous similar messages [262025.218610] LustreError: 137-5: oak-OST003b_UUID: not available for connect from 10.51.1.36@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [262025.238025] LustreError: Skipped 2 previous similar messages [262092.819244] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [262092.832638] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 1 previous similar message [262092.844358] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd459603800 [262092.856502] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd459603800 [262092.868657] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd459603800 [262092.880799] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bccbd4ba800 [262092.892950] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd3ef081800 [262097.045390] LustreError: 241050:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be64e4c1050 x1688763566919744/t0(0) o4->a06ff198-d843-4@10.51.13.17@o2ib3:81/0 lens 488/448 e 0 to 0 dl 1618982821 ref 1 fl Interpret:/0/0 rc 0/0 [262097.070815] LustreError: 241050:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 16 previous similar messages [262097.081672] Lustre: oak-OST0046: Bulk IO write error with a06ff198-d843-4 (at 10.51.13.17@o2ib3), client will retry: rc = -110 [262097.094488] Lustre: Skipped 24 previous similar messages [262112.057848] LustreError: 5989:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bcf8da43050 x1696620966788864/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:77/0 lens 488/448 e 0 to 0 dl 1618982817 ref 1 fl Interpret:/0/0 rc 0/0 [262112.084627] LustreError: 5989:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [262121.925392] LustreError: 
241052:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1956125) req@ffff8be979660850 x1689652670654400/t0(0) o4->0e941ceb-3e75-4@10.51.4.53@o2ib3:63/0 lens 504/448 e 0 to 0 dl 1618982803 ref 1 fl Interpret:/0/0 rc 0/0 [262121.951306] LustreError: 241052:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 6 previous similar messages [262134.211311] LustreError: 137-5: oak-OST004d_UUID: not available for connect from 10.51.15.3@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [262134.230728] LustreError: Skipped 3 previous similar messages [262275.531617] Lustre: oak-OST005c: Client fc844464-2173-c48d-a2a1-0cf8294e7bdf (at 10.50.10.48@o2ib2) reconnecting [262275.543083] Lustre: Skipped 886 previous similar messages [262579.282843] Lustre: oak-OST0056: Connection restored to 1639bdd0-384b-4 (at 10.51.6.19@o2ib3) [262579.292464] Lustre: Skipped 692 previous similar messages [262636.764108] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [262636.778743] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be32b110800 [262734.125335] LustreError: 187418:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd2ab438850 x1695781264099072/t0(0) o4->571c5cbe-9605-4@10.51.6.22@o2ib3:736/0 lens 488/448 e 0 to 0 dl 1618983476 ref 1 fl Interpret:/0/0 rc 0/0 [262734.150374] LustreError: 187418:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [262734.161302] Lustre: oak-OST0046: Bulk IO write error with 571c5cbe-9605-4 (at 10.51.6.22@o2ib3), client will retry: rc = -110 [262734.174029] Lustre: Skipped 11 previous similar messages [262879.539967] Lustre: oak-OST0052: Client 6c0c8fc9-86c4-9a4d-ddf5-85fe73093cd9 (at 10.51.12.2@o2ib3) reconnecting [262879.551367] Lustre: Skipped 117 previous similar messages [262992.539968] LustreError: 137-5: 
oak-OST003d_UUID: not available for connect from 10.51.13.15@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [262992.559526] LustreError: Skipped 3 previous similar messages [263042.374522] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.51.15.14@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [263042.394100] LustreError: Skipped 4 previous similar messages [263097.014295] LustreError: 5988:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1233655(3330807) req@ffff8be660b05850 x1696881758139328/t0(0) o4->d507c462-9f0f-5857-aa2a-29a15c595cfc@10.51.6.7@o2ib3:273/0 lens 488/448 e 0 to 0 dl 1618983768 ref 1 fl Interpret:/0/0 rc 0/0 [263097.042824] LustreError: 5988:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 4 previous similar messages [263182.202788] Lustre: oak-OST0032: Connection restored to 25cf3830-668c-4 (at 10.51.5.22@o2ib3) [263182.202790] Lustre: oak-OST0050: Connection restored to (at 10.51.4.15@o2ib3) [263182.202794] Lustre: Skipped 505 previous similar messages [263260.514464] LustreError: 137-5: oak-OST0057_UUID: not available for connect from 10.50.2.56@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [263260.533972] LustreError: Skipped 5 previous similar messages [263377.807360] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.3.11@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[263496.806244] Lustre: oak-OST003a: Client 853f1535-ef30-151f-429c-d573236cff68 (at 10.51.15.11@o2ib3) reconnecting [263496.817732] Lustre: Skipped 197 previous similar messages [263783.979503] Lustre: oak-OST0048: Connection restored to 976970b5-5fad-aab3-00a6-23fd47844552 (at 10.51.13.16@o2ib3) [263783.991254] Lustre: Skipped 372 previous similar messages [263793.837249] LustreError: 137-5: oak-OST0047_UUID: not available for connect from 10.51.1.51@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [263793.856689] LustreError: Skipped 2 previous similar messages [263853.824156] Lustre: 192979:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618984328/real 1618984328] req@ffff8bd4496bda00 x1697353973853952/t0(0) o104->oak-OST0032@10.51.12.21@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1618984501 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [264030.188032] LustreError: 193442:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be637e0d850 x1696620970894656/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:514/0 lens 488/448 e 0 to 0 dl 1618984764 ref 1 fl Interpret:/0/0 rc 0/0 [264030.215104] LustreError: 193442:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 4 previous similar messages [264030.225884] Lustre: oak-OST0044: Bulk IO write error with 8b66e4db-fcc9-6215-2cf9-86ed141f42d8 (at 10.51.2.26@o2ib3), client will retry: rc = -110 [264030.240637] Lustre: Skipped 5 previous similar messages [264102.506761] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [264102.521582] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd8b583c000 [264102.533742] LustreError: 193439:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bcc89aef850 x1689662499746048/t0(0) o4->48551941-ecf7-4@10.51.4.65@o2ib3:588/0 lens 
488/448 e 0 to 0 dl 1618984838 ref 1 fl Interpret:/0/0 rc 0/0 [264102.559170] LustreError: 193439:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [264122.091766] LustreError: 193452:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1232839) req@ffff8be6569d9850 x1696620971141568/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:565/0 lens 488/448 e 0 to 0 dl 1618984815 ref 1 fl Interpret:/0/0 rc 0/0 [264122.120615] Lustre: oak-OST005e: Bulk IO write error with 8b66e4db-fcc9-6215-2cf9-86ed141f42d8 (at 10.51.2.26@o2ib3), client will retry: rc = -110 [264122.135370] Lustre: Skipped 1 previous similar message [264124.455644] Lustre: oak-OST0034: Client ede94a1a-b345-30c6-e4bb-52a3a03e8e5f (at 10.51.6.18@o2ib3) reconnecting [264124.467140] Lustre: Skipped 128 previous similar messages [264124.506859] LustreError: 193411:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd176694050 x1696866282933952/t0(0) o4->ede94a1a-b345-30c6-e4bb-52a3a03e8e5f@10.51.6.18@o2ib3:594/0 lens 488/448 e 0 to 0 dl 1618984844 ref 1 fl Interpret:/0/0 rc 0/0 [264173.771301] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bde29188800 [264173.783489] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7a2e46400 [264173.795653] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7a2e46400 [264173.807890] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd538ba9c00 [264173.820040] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd538ba9c00 [264216.503975] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [264216.516593] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 1 previous similar message 
[264216.528425] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd0802a8400 [264216.540568] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbb596cb400 [264216.540621] LustreError: 193415:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bce33f25850 x1696614174694272/t0(0) o4->7535e7e6-b397-e9ec-ed05-2fc68240cd4c@10.51.4.38@o2ib3:706/0 lens 488/448 e 0 to 0 dl 1618984956 ref 1 fl Interpret:/0/0 rc 0/0 [264216.540623] LustreError: 193415:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [264216.590800] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be3e1abfc00 [264222.100639] Lustre: oak-OST005a: Bulk IO read error with 64ebd172-d79e-4 (at 10.51.13.3@o2ib3), client will retry: rc -110 [264222.113093] Lustre: Skipped 1 previous similar message [264224.101759] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.0.67@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[264224.121195] LustreError: Skipped 4 previous similar messages [264272.106613] LustreError: 9691:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(814) req@ffff8bd02d325850 x1689657111848128/t0(0) o3->73ed3aec-5286-4@10.51.5.14@o2ib3:708/0 lens 488/440 e 0 to 0 dl 1618984958 ref 1 fl Interpret:/0/0 rc 0/0 [264272.131755] LustreError: 9691:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 1 previous similar message [264381.929878] Lustre: 193011:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618984856/real 1618984856] req@ffff8bd91c926300 x1697353974500032/t0(0) o105->oak-OST005e@10.51.13.17@o2ib3:15/16 lens 360/224 e 0 to 1 dl 1618985029 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [264383.092790] LustreError: 193425:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be71f9dd050 x1696617280754368/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:105/0 lens 488/448 e 0 to 0 dl 1618985110 ref 1 fl Interpret:/0/0 rc 0/0 [264383.119966] LustreError: 193425:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [264383.130578] Lustre: oak-OST0058: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 [264383.145442] Lustre: Skipped 31 previous similar messages [264384.057317] Lustre: oak-OST0040: Connection restored to 7535e7e6-b397-e9ec-ed05-2fc68240cd4c (at 10.51.4.38@o2ib3) [264384.068979] Lustre: Skipped 615 previous similar messages [264387.433694] Lustre: 193129:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618984861/real 1618984861] req@ffff8bba084cf980 x1697353974505152/t0(0) o106->oak-OST0042@10.51.2.23@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1618985034 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [264387.866816] LNet: 50607:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.13.17@o2ib3 portal 16 match 1697353974500032 offset 224 length 224: 
4 [264402.706586] Lustre: oak-OST0040: haven't heard from client 7bc8c8b5-b218-4 (at 10.51.5.5@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be33878bc00, cur 1618985050 expire 1618984900 last 1618984823 [264402.728833] Lustre: Skipped 9 previous similar messages [264407.571444] Lustre: oak-OST0030: haven't heard from client 7bc8c8b5-b218-4 (at 10.51.5.5@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc283b36000, cur 1618985055 expire 1618984905 last 1618984828 [264407.593683] Lustre: Skipped 3 previous similar messages [264693.586456] LustreError: 193181:0:(ldlm_lib.c:3287:target_bulk_io()) @@@ bulk WRITE failed: rc -107 req@ffff8be9088de850 x1695780872687168/t0(0) o4->1639bdd0-384b-4@10.51.6.19@o2ib3:434/0 lens 488/448 e 0 to 0 dl 1618985439 ref 1 fl Interpret:/0/0 rc 0/0 [264693.611965] Lustre: oak-OST004c: Bulk IO write error with 1639bdd0-384b-4 (at 10.51.6.19@o2ib3), client will retry: rc = -107 [264738.347861] Lustre: oak-OST003e: Client 7f1b7392-400d-f93e-0c1e-8292ad9bca46 (at 10.51.13.6@o2ib3) reconnecting [264738.347862] Lustre: oak-OST004e: Client 7f1b7392-400d-f93e-0c1e-8292ad9bca46 (at 10.51.13.6@o2ib3) reconnecting [264738.347865] Lustre: Skipped 534 previous similar messages [264763.412874] LustreError: 137-5: oak-OST0049_UUID: not available for connect from 10.51.6.19@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[264763.432292] LustreError: Skipped 6 previous similar messages [264784.815430] LustreError: 241055:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be7575f6050 x1696948037696000/t0(0) o4->0cc1f06c-eda0-ebe4-f37b-9f106eca81f1@10.51.4.62@o2ib3:519/0 lens 488/448 e 0 to 0 dl 1618985524 ref 1 fl Interpret:/0/0 rc 0/0 [264984.099510] Lustre: oak-OST0030: Connection restored to e84ad1b6-d416-4 (at 10.51.5.56@o2ib3) [264984.109133] Lustre: Skipped 2210 previous similar messages [265037.484830] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [265037.498474] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600580) failed: 5 [265037.498825] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd706ce0400 [265037.498838] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [265037.498840] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 4 previous similar messages [265037.498842] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcf6515e400 [265037.499401] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcf6515e400 [265037.500064] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcf409d3400 [265037.500686] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcf6515d400 [265037.500689] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bdaa4554c00 [265037.500692] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bdaa4554c00 [265037.500694] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be723c48c00 [265037.627591] LNet: 
50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 941 previous similar messages [265097.193491] LustreError: 5991:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be6fc95d050 x1685072640345600/t0(0) o3->0b13f2ae-d6e1-4@10.51.2.70@o2ib3:17/0 lens 488/440 e 0 to 0 dl 1618985777 ref 1 fl Interpret:/0/0 rc 0/0 [265097.193533] LustreError: 193143:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bdb870d6050 x1689648394775744/t0(0) o3->71f8f8a7-b976-4@10.51.5.59@o2ib3:17/0 lens 488/440 e 0 to 0 dl 1618985777 ref 1 fl Interpret:/0/0 rc 0/0 [265097.193668] Lustre: oak-OST0048: Bulk IO read error with 61781ed1-b14e-4 (at 10.51.13.4@o2ib3), client will retry: rc -110 [265097.193669] Lustre: Skipped 4 previous similar messages [265097.263112] LustreError: 5991:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 9 previous similar messages [265214.703862] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [265214.717495] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600740) failed: 5 [265214.718941] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd0802a8800 [265214.728223] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [265214.728225] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [265214.728227] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be7225ba400 [265214.728566] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd221d16000 [265214.728838] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be74f2ec400 [265214.729132] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be74f2ec800 
[265214.729424] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be74f2ec800 [265214.729712] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcf6515bc00 [265214.730017] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be74f2ec800 [265214.846580] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1771 previous similar messages [265272.230820] LustreError: 193446:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be71b5c4850 x1688454615346048/t0(0) o3->cc997993-6c0d-4@10.51.2.7@o2ib3:194/0 lens 488/440 e 0 to 0 dl 1618985954 ref 1 fl Interpret:/0/0 rc 0/0 [265272.231023] Lustre: oak-OST003c: Bulk IO read error with 64ebd172-d79e-4 (at 10.51.13.3@o2ib3), client will retry: rc -110 [265272.231024] Lustre: Skipped 9 previous similar messages [265272.274423] LustreError: 193446:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 5 previous similar messages [265322.240664] LustreError: 193424:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(24912) req@ffff8bcfdaa22050 x1696617281915136/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:250/0 lens 488/448 e 0 to 0 dl 1618986010 ref 1 fl Interpret:/0/0 rc 0/0 [265322.240706] Lustre: oak-OST0040: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 [265322.240707] Lustre: Skipped 3 previous similar messages [265322.289384] LustreError: 193424:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 27 previous similar messages [265323.738146] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.13.12@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xab0458e5 [265323.755133] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 3 previous similar messages [265326.520483] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) 
Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [265326.534711] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc17cc0bc00 [265326.546873] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bad455a6400 [265326.559033] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bad455a6400 [265326.571223] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc5e9b51c00 [265326.583397] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bba1d19fc00 [265326.595592] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc4c7658800 [265326.607747] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bba1d19fc00 [265326.619907] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bad455a5000 [265326.619934] LustreError: 226915:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bc4c593f850 x1688793186831232/t0(0) o4->ff735979-573a-4@10.51.1.53@o2ib3:302/0 lens 488/448 e 0 to 0 dl 1618986062 ref 1 fl Interpret:/0/0 rc 0/0 [265326.657509] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bad455a5000 [265326.669671] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc5e9b51c00 [265331.018475] LustreError: 137-5: oak-OST005b_UUID: not available for connect from 10.51.13.12@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[265331.037989] LustreError: Skipped 2 previous similar messages [265348.367962] Lustre: oak-OST0048: Client 330d404b-804c-4 (at 10.51.15.3@o2ib3) reconnecting [265348.377293] Lustre: Skipped 172 previous similar messages [265348.478251] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be9797a9800 [265348.490421] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be9797a9800 [265348.502570] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bad455a4800 [265348.514728] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bad455a4800 [265348.526900] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bca7efbc000 [265348.539050] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be1e80ce400 [265348.551202] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be1e80ce400 [265348.563362] LNet: 1245:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [265348.563364] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bca7efb9000 [265348.563370] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [265348.563372] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 3540 previous similar messages [265348.563376] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bca7efb9000 [265348.563399] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be4b4f84800 [265348.563410] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bad455a0c00 [265348.563425] 
LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bad455a0c00 [265372.246766] LustreError: 239465:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bc4b4256850 x1683981651595200/t0(0) o3->f555a2b4-9a49-4@10.51.4.17@o2ib3:302/0 lens 488/440 e 0 to 0 dl 1618986062 ref 1 fl Interpret:/0/0 rc 0/0 [265372.247269] Lustre: oak-OST0040: Bulk IO read error with 6f53e647-4eee-4 (at 10.51.6.6@o2ib3), client will retry: rc -110 [265372.247270] Lustre: Skipped 5 previous similar messages [265372.291336] LustreError: 239465:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 7 previous similar messages [265392.700583] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdb8b590000 [265392.712749] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be66d411800 [265397.249135] LustreError: 234136:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bae31d6e050 x1685035241877120/t0(0) o3->04a04758-274c-4@10.51.2.46@o2ib3:324/0 lens 488/440 e 0 to 0 dl 1618986084 ref 1 fl Interpret:/0/0 rc 0/0 [265397.275446] LustreError: 234136:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 204 previous similar messages [265446.475172] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [265446.488663] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 2 previous similar messages [265446.500478] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be70cdcb400 [265446.512639] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdac9b1d800 [265446.524824] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be713b21400 [265446.537000] LustreError: 50599:0:(events.c:450:server_bulk_callback()) 
event type 3, status -103, desc ffff8bd03bedf800 [265446.549164] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdac9b1d400 [265446.561341] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7577c1800 [265446.573517] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7577c1800 [265446.585685] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be976cd6800 [265446.597854] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be713b21400 [265447.262146] LustreError: 217342:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be722705850 x1688679990381056/t0(0) o3->7e0a9a61-2b20-4@10.51.2.19@o2ib3:372/0 lens 488/440 e 0 to 0 dl 1618986132 ref 1 fl Interpret:/0/0 rc 0/0 [265447.287472] LustreError: 217342:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 13 previous similar messages [265447.298357] Lustre: oak-OST004a: Bulk IO read error with 7e0a9a61-2b20-4 (at 10.51.2.19@o2ib3), client will retry: rc -110 [265447.310781] Lustre: Skipped 274 previous similar messages [265497.279432] LustreError: 217332:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8be71d07d850 x1689649880822144/t0(0) o3->25cf3830-668c-4@10.51.5.22@o2ib3:425/0 lens 488/440 e 0 to 0 dl 1618986185 ref 1 fl Interpret:/0/0 rc 0/0 [265497.305156] LustreError: 217332:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 180 previous similar messages [265498.044771] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.5.55@o2ib3 ns: filter-oak-OST0032_UUID lock: ffff8bdcff0a86c0/0xf81cb91ff0fd8fb lrc: 4/0,0 mode: PW/PW res: [0x205be18:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->1232895) flags: 0x60000400010020 nid: 10.51.5.55@o2ib3 remote: 
0x328ce4163d1453ef expref: 6 pid: 193033 timeout: 265503 lvb_type: 0 [265498.697584] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ffbe0) failed: 5 [265498.699080] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0f3a04800 [265498.699091] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0f3a04800 [265498.699101] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7222fe000 [265498.699110] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd610b91400 [265498.699117] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd1c1c55800 [265498.708011] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [265498.708012] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [265498.708014] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd221d16000 [265498.708364] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be719d83c00 [265498.708611] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be719d83c00 [265498.708887] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd3b1fcdc00 [265498.709159] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be7222fe000 [265498.851208] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1044 previous similar messages [265526.929066] LustreError: 193426:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be71c174850 x1696620973987136/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:508/0 lens 488/448 e 0 to 0 dl 1618986268 ref 1 fl 
Interpret:/0/0 rc 0/0 [265526.956144] LustreError: 193426:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [265530.504522] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.26@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xab473e45 [265530.521415] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [265543.568122] Lustre: oak-OST0044: haven't heard from client 9b1546d3-bf78-4 (at 10.51.5.26@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bdc033c8c00, cur 1618986191 expire 1618986041 last 1618985964 [265543.590488] Lustre: Skipped 2 previous similar messages [265547.290463] LustreError: 193444:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8be6138bb050 x1696620973950976/t0(0) o3->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:478/0 lens 488/440 e 0 to 0 dl 1618986238 ref 1 fl Interpret:/0/0 rc 0/0 [265547.290939] Lustre: oak-OST0040: Bulk IO read error with 8b66e4db-fcc9-6215-2cf9-86ed141f42d8 (at 10.51.2.26@o2ib3), client will retry: rc -110 [265547.290940] Lustre: Skipped 186 previous similar messages [265547.338803] LustreError: 193444:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 1 previous similar message [265548.043573] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 112s: evicting client at 10.51.13.17@o2ib3 ns: filter-oak-OST0056_UUID lock: ffff8bd85d29b600/0xf81cb91ff1bc43e lrc: 4/0,0 mode: PR/PR res: [0x15c0000400:0x39fdef:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.13.17@o2ib3 remote: 0x7c70df512fe4c13f expref: 33 pid: 193008 timeout: 265553 lvb_type: 1 [265548.090871] LustreError: 193026:0:(ldlm_lockd.c:1351:ldlm_handle_enqueue0()) ### lock on destroyed export ffff8be457843000 ns: filter-oak-OST0056_UUID lock: ffff8bc0b071d7c0/0xf81cb91ff1bcc79 lrc: 3/0,0 mode: 
--/PW res: [0x15c0000400:0x39fdef:0x0].0x0 rrc: 3 type: EXT [0->8191] (req 0->8191) flags: 0x50000000020000 nid: 10.51.13.17@o2ib3 remote: 0x7c70df512fe4c559 expref: 33 pid: 193026 timeout: 0 lvb_type: 0 [265553.575439] Lustre: oak-OST003a: haven't heard from client f7572664-10e5-4 (at 10.51.1.69@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be75774bc00, cur 1618986201 expire 1618986051 last 1618985974 [265566.696166] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcdee9d8000 [265566.708342] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcdee9d8000 [265566.720507] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7aaa22000 [265566.732697] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7aaa22000 [265566.744865] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be1cde7a000 [265566.757047] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd62b520400 [265566.769229] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdd219c8400 [265566.781388] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdd219c8400 [265566.781397] LNet: 1245:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [265566.811096] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [265566.811109] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7aaa22000 [265566.833541] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 18 previous similar messages [265586.505027] Lustre: oak-OST004e: Connection restored 
to 9c75be9d-aa36-4 (at 10.50.5.10@o2ib2) [265586.514677] Lustre: Skipped 1384 previous similar messages [265622.301859] LustreError: 239465:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bc4b62f1850 x1696864309997824/t0(0) o3->4cee94e6-025c-589a-13ab-6c9ed337de31@10.51.2.36@o2ib3:543/0 lens 488/440 e 0 to 0 dl 1618986303 ref 1 fl Interpret:/0/0 rc 0/0 [265622.305022] LustreError: 193181:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be7053b1050 x1688763594166592/t0(0) o3->a06ff198-d843-4@10.51.13.17@o2ib3:546/0 lens 488/440 e 0 to 0 dl 1618986306 ref 1 fl Interpret:/0/0 rc 0/0 [265622.305024] LustreError: 193181:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 15 previous similar messages [265622.366429] LustreError: 239465:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 202 previous similar messages [265713.039759] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.4.33@o2ib3 ns: filter-oak-OST003a_UUID lock: ffff8bceebe506c0/0xf81cb91ff1acb63 lrc: 4/0,0 mode: PW/PW res: [0x2138463:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->212991) flags: 0x60000400010020 nid: 10.51.4.33@o2ib3 remote: 0xe1814cf08e1d2ff4 expref: 7 pid: 206661 timeout: 265718 lvb_type: 0 [265762.560363] Lustre: oak-OST004c: haven't heard from client fd16aff2-0371-4 (at 10.51.4.33@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be1cde7c800, cur 1618986410 expire 1618986260 last 1618986183 [265769.587775] Lustre: oak-OST004e: haven't heard from client fd16aff2-0371-4 (at 10.51.4.33@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8be72277b000, cur 1618986417 expire 1618986267 last 1618986190 [265797.350922] LustreError: 209400:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bcc7062e850 x1695783475201728/t0(0) o3->fd9ffaf8-396f-4@10.51.6.24@o2ib3:707/0 lens 488/440 e 0 to 0 dl 1618986467 ref 1 fl Interpret:/0/0 rc 0/0 [265797.377399] Lustre: oak-OST0040: Bulk IO read error with fd9ffaf8-396f-4 (at 10.51.6.24@o2ib3), client will retry: rc -110 [265797.389833] Lustre: Skipped 216 previous similar messages [265942.506041] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [265942.519533] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 2 previous similar messages [265942.531657] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7222fa000 [265942.543822] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd9d4bac800 [265942.555992] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be723092c00 [265942.568144] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be723092c00 [265942.580306] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bde6f88b800 [265942.592485] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be7236dcc00 [265942.604646] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be78c636800 [265942.616780] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be78c636800 [265942.628923] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd9d4baa000 [265942.642437] LustreError: 227992:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk 
WRITE req@ffff8be7aa8f2850 x1689692728752064/t0(0) o4->e9ad2042-8f70-4@10.51.5.32@o2ib3:167/0 lens 488/448 e 0 to 0 dl 1618986682 ref 1 fl Interpret:/0/0 rc 0/0 [265942.667913] LustreError: 227992:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [265942.679169] Lustre: oak-OST0040: Bulk IO write error with e9ad2042-8f70-4 (at 10.51.5.32@o2ib3), client will retry: rc = -110 [265942.691983] Lustre: Skipped 26 previous similar messages [265947.388845] LustreError: 221471:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8be6c118f850 x1696620974642368/t0(0) o3->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:111/0 lens 488/440 e 0 to 0 dl 1618986626 ref 1 fl Interpret:/0/0 rc 0/0 [265950.165361] Lustre: oak-OST0036: Client 11c92c9a-5a17-4 (at 10.51.2.27@o2ib3) reconnecting [265950.174694] Lustre: Skipped 1475 previous similar messages [265979.497549] LustreError: 137-5: oak-OST0041_UUID: not available for connect from 10.51.15.1@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[265979.516993] LustreError: Skipped 7 previous similar messages [265997.400519] LustreError: 193113:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1576508(2625084) req@ffff8be901ce7050 x1689699865548928/t0(0) o4->2bbfb89a-a909-4@10.51.5.29@o2ib3:173/0 lens 488/448 e 0 to 0 dl 1618986688 ref 1 fl Interpret:/0/0 rc 0/0 [265997.427099] LustreError: 193113:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 14 previous similar messages [266000.987442] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.29@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xabd5bb75 [266001.004341] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [266072.414016] Lustre: oak-OST0040: Bulk IO read error with 7678e55d-59d2-7a86-9043-5a3c52484e0c (at 10.51.14.24@o2ib3), client will retry: rc -110 [266072.428606] Lustre: Skipped 13 previous similar messages [266119.482245] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb93d1dc00 [266119.494458] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb93d1dc00 [266119.506694] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb93d1dc00 [266119.518891] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb93d1dc00 [266119.531085] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be4ad55c400 [266119.543239] LNet: 182178:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [266119.543252] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [266119.543254] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be96d3c5000 [266119.543276] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be96d3c5000 [266119.543294] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be96d3c5000 [266119.543321] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be96d3c5000 [266187.616302] Lustre: oak-OST005c: Connection restored to (at 10.50.5.24@o2ib2) [266187.624468] Lustre: Skipped 1775 previous similar messages [266260.026984] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.5.39@o2ib3 ns: filter-oak-OST0038_UUID lock: ffff8bd8b3c67500/0xf81cb91ff1e7428 lrc: 4/0,0 mode: PW/PW res: [0x1f32805:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->102399) flags: 0x60000400030020 nid: 10.51.5.39@o2ib3 remote: 0x260063e95cd8ff94 expref: 6 pid: 193059 timeout: 266265 lvb_type: 0 [266292.219220] LustreError: 227921:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be75778c050 x1696620975288256/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:517/0 lens 488/448 e 0 to 0 dl 1618987032 ref 1 fl Interpret:/0/0 rc 0/0 [266292.246291] LustreError: 227921:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 6 previous similar messages [266390.694623] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.26@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xac4e1f65 [266447.508613] LustreError: 220403:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bd182d5c050 x1695783476124800/t0(0) o3->fd9ffaf8-396f-4@10.51.6.24@o2ib3:612/0 lens 488/440 e 0 to 0 dl 1618987127 ref 1 fl Interpret:/0/0 rc 0/0 [266447.534919] LustreError: 220403:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 231 previous similar messages [266555.257989] Lustre: oak-OST0040: Client fd9ffaf8-396f-4 (at 10.51.6.24@o2ib3) reconnecting [266555.267338] 
Lustre: Skipped 1000 previous similar messages [266584.303266] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.51.15.4@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [266584.322727] LustreError: Skipped 30 previous similar messages [266596.547223] Lustre: oak-OST0044: Bulk IO read error with c7c97132-e759-4 (at 10.51.15.4@o2ib3), client will retry: rc -110 [266596.559654] Lustre: Skipped 229 previous similar messages [266788.144506] Lustre: oak-OST005c: Connection restored to 2c63b434-3a22-4 (at 10.51.5.53@o2ib3) [266788.154132] Lustre: Skipped 1831 previous similar messages [267036.438278] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [267036.450897] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 1 previous similar message [267036.462074] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600040) failed: 5 [267036.462583] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [267036.462584] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 4 previous similar messages [267036.462589] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd8ba7dc800 [267036.463154] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd4dcf79400 [267036.463824] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd4dcf7ec00 [267036.464416] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be757017000 [267036.465152] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be757017000 [267036.555009] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1250 previous similar messages [267071.620413] LustreError: 
209399:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(2463204) req@ffff8bd289d16050 x1689647393816128/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:478/0 lens 488/448 e 0 to 0 dl 1618987748 ref 1 fl Interpret:/0/0 rc 0/0 [267071.620511] Lustre: oak-OST0058: Bulk IO write error with 9b1546d3-bf78-4 (at 10.51.5.26@o2ib3), client will retry: rc = -110 [267071.620512] Lustre: Skipped 12 previous similar messages [267071.665767] LustreError: 209399:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 8 previous similar messages [267076.477578] LustreError: 220400:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be7575e7850 x1697130123865856/t0(0) o4->56a5a766-0782-0626-7e81-90dde2e2789a@10.51.2.28@o2ib3:543/0 lens 488/448 e 0 to 0 dl 1618987813 ref 1 fl Interpret:/0/0 rc 0/0 [267076.504660] LustreError: 220400:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [267096.623919] LustreError: 217487:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be8fad96050 x1695780881990336/t0(0) o3->1639bdd0-384b-4@10.51.6.19@o2ib3:505/0 lens 488/440 e 0 to 0 dl 1618987775 ref 1 fl Interpret:/0/0 rc 0/0 [267096.623992] LustreError: 227990:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8be7136d8050 x1689691991898752/t0(0) o3->bf9174df-fd25-4@10.51.14.20@o2ib3:507/0 lens 488/440 e 0 to 0 dl 1618987777 ref 1 fl Interpret:/0/0 rc 0/0 [267096.623994] LustreError: 227990:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 5 previous similar messages [267096.686325] LustreError: 217487:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 12 previous similar messages [267157.829007] Lustre: oak-OST0036: Client 071a6a76-f9bb-62b8-4b90-59eef50d0ad6 (at 10.51.0.13@o2ib3) reconnecting [267157.840374] Lustre: Skipped 165 previous similar messages [267181.005518] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 
10.51.5.1@o2ib3 ns: filter-oak-OST0038_UUID lock: ffff8be6dbb42d00/0xf81cb91ff2264e5 lrc: 3/0,0 mode: PW/PW res: [0x1f3281a:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->49151) flags: 0x60000400010020 nid: 10.51.5.1@o2ib3 remote: 0xc13b5fdfd2934ee0 expref: 6 pid: 193039 timeout: 267186 lvb_type: 0 [267181.050057] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 4 previous similar messages [267218.874537] Lustre: 193067:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618987693/real 1618987693] req@ffff8be4d7067500 x1697353978694912/t0(0) o104->oak-OST0046@10.51.6.21@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1618987866 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [267224.543261] Lustre: oak-OST004e: haven't heard from client 9b72cdcc-a3a2-4 (at 10.51.5.19@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bd7b92ad400, cur 1618987872 expire 1618987722 last 1618987645 [267233.527832] Lustre: oak-OST003e: haven't heard from client bea21439-f3ea-4 (at 10.51.5.8@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bd6f096a400, cur 1618987881 expire 1618987731 last 1618987654 [267233.550133] Lustre: Skipped 3 previous similar messages [267305.474470] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [267305.488891] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be05f7dc800 [267305.501040] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be05f7dc800 [267305.513197] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be6160c2c00 [267324.654710] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c6002e0) failed: 5 [267324.665112] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 50 previous similar messages [267324.665144] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [267324.665146] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 4 previous similar messages [267324.665152] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be5a5bd8c00 [267324.665555] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be7314bc400 [267324.665827] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd3b1fcfc00 [267371.661769] Lustre: oak-OST0044: Bulk IO read error with c7c97132-e759-4 (at 10.51.15.4@o2ib3), client will retry: rc -110 [267371.674200] Lustre: Skipped 33 previous similar messages [267388.342113] Lustre: oak-OST005e: Connection restored to e7e73b53-3e30-4 (at 10.50.10.39@o2ib2) [267388.351864] Lustre: Skipped 2231 previous similar messages [267761.662986] Lustre: oak-OST0044: Client 853f1535-ef30-151f-429c-d573236cff68 (at 10.51.15.11@o2ib3) reconnecting [267761.674451] Lustre: Skipped 617 previous similar messages [267787.593314] 
LustreError: 137-5: oak-OST0037_UUID: not available for connect from 10.50.12.10@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [267787.612823] LustreError: Skipped 26 previous similar messages [267847.415002] LustreError: 217338:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be655c5c050 x1696866295403584/t0(0) o4->ede94a1a-b345-30c6-e4bb-52a3a03e8e5f@10.51.6.18@o2ib3:564/0 lens 488/448 e 0 to 0 dl 1618988589 ref 1 fl Interpret:/0/0 rc 0/0 [267847.442388] Lustre: oak-OST004e: Bulk IO write error with ede94a1a-b345-30c6-e4bb-52a3a03e8e5f (at 10.51.6.18@o2ib3), client will retry: rc = -110 [267847.457164] Lustre: Skipped 25 previous similar messages [267848.438585] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.6.18@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xad9011ed [267848.455501] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 3 previous similar messages [267989.735276] Lustre: oak-OST003e: Connection restored to dc7a6583-bcd1-4 (at 10.50.1.63@o2ib2) [267989.744906] Lustre: Skipped 1609 previous similar messages [268084.678820] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [268084.691455] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 1 previous similar message [268084.703226] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd7a79ad800 [268084.715402] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7058fbc00 [268084.727578] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71edaa000 [268084.739797] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71edaa000 [268084.752017] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 
5, status -103, desc ffff8bd538baec00 [268121.761038] LustreError: 193407:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(14960) req@ffff8be4d3ec6050 x1689647405403584/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:35/0 lens 488/448 e 0 to 0 dl 1618988815 ref 1 fl Interpret:/0/0 rc 0/0 [268121.786756] LustreError: 193407:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 20 previous similar messages [268146.765155] LustreError: 221473:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be45c310050 x1694752713740992/t0(0) o3->64ebd172-d79e-4@10.51.13.3@o2ib3:44/0 lens 488/440 e 0 to 0 dl 1618988824 ref 1 fl Interpret:/0/0 rc 0/0 [268146.765212] LustreError: 241055:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(8192) req@ffff8be6bfb9d850 x1689658087810304/t0(0) o3->35776d57-c710-4@10.51.5.33@o2ib3:45/0 lens 488/440 e 0 to 0 dl 1618988825 ref 1 fl Interpret:/0/0 rc 0/0 [268146.765213] LustreError: 241055:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 30 previous similar messages [268146.765225] Lustre: oak-OST0058: Bulk IO read error with 35776d57-c710-4 (at 10.51.5.33@o2ib3), client will retry: rc -110 [268146.765226] Lustre: Skipped 3 previous similar messages [268146.844852] LustreError: 221473:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 8 previous similar messages [268258.187196] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.59@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xade5bb8d [268258.204120] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 7 previous similar messages [268283.632336] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [268283.645105] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600900) failed: 5 [268283.645144] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [268283.645146] LNet: 
50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 2 previous similar messages [268283.645149] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd316b20c00 [268283.655889] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd316b27800 [268283.656224] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd316b20c00 [268283.656488] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd8ba7ddc00 [268283.726076] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 783 previous similar messages [268361.971086] Lustre: oak-OST0030: Client d073f313-60b4-4 (at 10.51.15.5@o2ib3) reconnecting [268361.980432] Lustre: Skipped 410 previous similar messages [268576.334682] LustreError: 137-5: oak-OST0051_UUID: not available for connect from 10.50.10.21@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[268576.354192] LustreError: Skipped 2 previous similar messages [268590.678838] Lustre: oak-OST0050: Connection restored to 85db8683-aef8-4 (at 10.50.9.64@o2ib2) [268590.688458] Lustre: Skipped 1271 previous similar messages [268602.689317] LustreError: 221489:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be6fd063050 x1696614534366016/t0(0) o4->855f8733-97ad-fe20-a42c-c9a97f6818f7@10.51.4.40@o2ib3:563/0 lens 488/448 e 0 to 0 dl 1618989343 ref 1 fl Interpret:/0/0 rc 0/0 [268602.716399] LustreError: 221489:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 13 previous similar messages [268602.727215] Lustre: oak-OST0054: Bulk IO write error with 855f8733-97ad-fe20-a42c-c9a97f6818f7 (at 10.51.4.40@o2ib3), client will retry: rc = -110 [268602.742012] Lustre: Skipped 26 previous similar messages [268694.010717] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.13.12@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xae4e7bfd [268968.437888] Lustre: oak-OST0054: Client 615f6034-750a-f27b-66dc-7c591b98c535 (at 10.51.4.20@o2ib3) reconnecting [268968.449264] Lustre: Skipped 134 previous similar messages [269046.920102] LustreError: 193407:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8be63713c850 x1695474835570048/t0(0) o3->61781ed1-b14e-4@10.51.13.4@o2ib3:196/0 lens 488/440 e 0 to 0 dl 1618989731 ref 1 fl Interpret:/0/0 rc 0/0 [269046.945527] LustreError: 193407:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 8 previous similar messages [269046.956227] Lustre: oak-OST0040: Bulk IO read error with 61781ed1-b14e-4 (at 10.51.13.4@o2ib3), client will retry: rc -110 [269046.968650] Lustre: Skipped 14 previous similar messages [269052.391479] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [269052.404731] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be39f8f7800 
[269052.416892] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcf68ace400 [269052.416904] LustreError: 193188:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be713964050 x1689692000230784/t0(0) o4->bf9174df-fd25-4@10.51.14.20@o2ib3:233/0 lens 488/448 e 0 to 0 dl 1618989768 ref 1 fl Interpret:/0/0 rc 0/0 [269052.416906] LustreError: 193188:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [269052.465281] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3ef082000 [269052.477435] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3ef082000 [269191.092087] Lustre: oak-OST003a: Connection restored to 413272a1-a696-4 (at 10.51.2.13@o2ib3) [269191.101722] Lustre: Skipped 1340 previous similar messages [269231.751182] LustreError: 137-5: oak-OST0057_UUID: not available for connect from 10.51.2.26@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[269231.770603] LustreError: Skipped 27 previous similar messages [269463.931365] LustreError: 193442:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be71ff9d850 x1696811700906496/t0(0) o3->f91b26c5-3694-4@10.50.5.36@o2ib2:673/0 lens 488/440 e 0 to 0 dl 1618990208 ref 1 fl Interpret:/0/0 rc 0/0 [269463.956305] LustreError: 193442:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 4 previous similar messages [269591.581060] Lustre: oak-OST0040: Client ef10d7d4-e9b4-37f7-a4de-3390f0661fc9 (at 10.50.8.27@o2ib2) reconnecting [269591.592426] Lustre: Skipped 245 previous similar messages [269791.968338] Lustre: oak-OST0052: Connection restored to 19568721-8425-4 (at 10.50.2.64@o2ib2) [269791.977964] Lustre: Skipped 863 previous similar messages [269805.912417] Lustre: oak-OST0040: Bulk IO read error with eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3), client will retry: rc -110 [269805.924849] Lustre: Skipped 3 previous similar messages [269817.596568] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [269817.609268] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c6009e0) failed: 5 [269817.609563] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [269817.609565] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 3 previous similar messages [269817.609568] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcee8d19800 [269817.610230] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcf6515ec00 [269817.610826] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd1c1c57800 [269817.611534] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd1c1c57800 [269817.611537] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, 
status -5, desc ffff8be20f4d6c00 [269817.611542] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be20f4d6c00 [269817.714439] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 902 previous similar messages [269872.064262] LustreError: 224957:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 1310720(2359296) req@ffff8be723e3e050 x1689709609600320/t0(0) o3->46f54804-a64f-4@10.51.14.21@o2ib3:267/0 lens 488/440 e 0 to 0 dl 1618990557 ref 1 fl Interpret:/0/0 rc 0/0 [269872.064279] LustreError: 220404:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be71bcc0850 x1689699878641408/t0(0) o3->2bbfb89a-a909-4@10.51.5.29@o2ib3:267/0 lens 488/440 e 0 to 0 dl 1618990557 ref 1 fl Interpret:/0/0 rc 0/0 [269872.064283] LustreError: 220404:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 2 previous similar messages [269872.064314] LustreError: 193445:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 182658(1231234) req@ffff8bdcafb6d850 x1689654383402176/t0(0) o4->fdf3ee33-4328-4@10.51.5.65@o2ib3:268/0 lens 488/448 e 0 to 0 dl 1618990558 ref 1 fl Interpret:/0/0 rc 0/0 [269872.064316] LustreError: 193445:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 11 previous similar messages [269872.064329] Lustre: oak-OST003c: Bulk IO write error with ea702749-deff-4 (at 10.51.4.14@o2ib3), client will retry: rc = -110 [269872.064330] Lustre: Skipped 5 previous similar messages [269872.182997] LustreError: 224957:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 12 previous similar messages [269994.005208] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.22@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xafbd042d [269998.592444] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [269998.605825] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc 
ffff8bc4c757d800 [269998.617973] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be723e36c00 [269998.618097] Lustre: oak-OST004e: Bulk IO write error with 25cf3830-668c-4 (at 10.51.5.22@o2ib3), client will retry: rc = -110 [269998.618099] Lustre: Skipped 6 previous similar messages [269998.648771] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be20f4d2c00 [269998.660918] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be20f4d2c00 [269998.673068] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be20f4d6c00 [270008.483821] Lustre: oak-OST0048: haven't heard from client 6f53e647-4eee-4 (at 10.51.6.6@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bdd2eda5400, cur 1618990656 expire 1618990506 last 1618990429 [270015.503224] Lustre: oak-OST0038: haven't heard from client 2081f97b-bd94-4 (at 10.51.5.38@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bda65d93000, cur 1618990663 expire 1618990513 last 1618990436 [270047.097445] LustreError: 220405:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1527808(2576384) req@ffff8bd124ccd850 x1689649908195904/t0(0) o4->25cf3830-668c-4@10.51.5.22@o2ib3:448/0 lens 488/448 e 0 to 0 dl 1618990738 ref 1 fl Interpret:/0/0 rc 0/0 [270047.124041] LustreError: 220405:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 10 previous similar messages [270085.982850] LustreError: 137-5: oak-OST0057_UUID: not available for connect from 10.51.13.14@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[270086.002363] LustreError: Skipped 3 previous similar messages [270142.936572] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.5.3@o2ib3 ns: filter-oak-OST0032_UUID lock: ffff8bc2ca1b98c0/0xf81cb91ff33f9c9 lrc: 4/0,0 mode: PR/PR res: [0x1d160dc:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400010020 nid: 10.51.5.3@o2ib3 remote: 0xa40adbfe15380201 expref: 6 pid: 192985 timeout: 270148 lvb_type: 1 [270146.936490] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.5.2@o2ib3 ns: filter-oak-OST004e_UUID lock: ffff8bcda50ea400/0xf81cb91ff33fc7e lrc: 4/0,0 mode: PR/PR res: [0x1df8809:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400010020 nid: 10.51.5.2@o2ib3 remote: 0xd7681f65a481be47 expref: 7 pid: 193034 timeout: 270152 lvb_type: 1 [270204.624800] Lustre: oak-OST0054: Client 48551941-ecf7-4 (at 10.51.4.65@o2ib3) reconnecting [270204.634135] Lustre: Skipped 684 previous similar messages [270393.534144] Lustre: oak-OST0034: Connection restored to 5b3cca43-cba9-4 (at 10.51.5.34@o2ib3) [270393.543768] Lustre: Skipped 1391 previous similar messages [270739.720710] LustreError: 193407:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be722759850 x1696860124989248/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:424/0 lens 488/448 e 0 to 0 dl 1618991469 ref 1 fl Interpret:/0/0 rc 0/0 [270739.745653] LustreError: 193407:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [270739.756450] Lustre: oak-OST003a: Bulk IO write error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc = -110 [270739.769093] Lustre: Skipped 8 previous similar messages [270748.872518] Lustre: oak-OST0040: Bulk IO read error with eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3), client will retry: 
rc -110 [270748.884948] Lustre: Skipped 34 previous similar messages [270808.120083] Lustre: oak-OST003c: Client 11c92c9a-5a17-4 (at 10.51.2.27@o2ib3) reconnecting [270808.129410] Lustre: Skipped 142 previous similar messages [270833.592504] LustreError: 137-5: oak-OST003d_UUID: not available for connect from 10.51.2.26@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [270833.592505] LustreError: 137-5: oak-OST004f_UUID: not available for connect from 10.51.2.26@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [270833.592508] LustreError: Skipped 12 previous similar messages [270872.270070] LustreError: 193445:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 182692(1231268) req@ffff8bd701263850 x1696860125064192/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:508/0 lens 488/448 e 0 to 0 dl 1618991553 ref 1 fl Interpret:/0/0 rc 0/0 [270872.296464] LustreError: 193445:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 1 previous similar message [270965.613595] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [270965.627714] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be91aca7800 [270965.639875] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdc033c9000 [270965.652029] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdc033c9000 [270965.664190] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71ff3f400 [270965.676342] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdc033c9c00 [270965.688493] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd7549d2000 [270965.700654] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd7549d2000 [270965.712805] LustreError: 193420:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd182d5c850 x1688706942179264/t0(0) o4->a7f2d980-27e4-4@10.51.2.58@o2ib3:660/0 lens 488/448 e 0 to 0 dl 1618991705 ref 1 fl Interpret:/0/0 rc 0/0 [270965.712815] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7576cd400 [270965.712844] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdc033c9c00 [270965.762525] LustreError: 193420:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 6 previous similar messages [270992.411557] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [270992.426301] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be32b112800 [270992.438451] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be70c634800 [270992.450600] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd9d4bad400 [270992.462748] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd9d4babc00 [270992.474903] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd9d4ba8800 [270993.585603] Lustre: oak-OST0040: Connection restored to (at 10.50.1.58@o2ib2) [270993.593846] Lustre: Skipped 847 previous similar messages [271022.299384] LustreError: 204443:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be73149f850 x1694752863683712/t0(0) o3->64ebd172-d79e-4@10.51.13.3@o2ib3:657/0 lens 488/440 e 0 to 0 dl 1618991702 ref 1 fl Interpret:/0/0 rc 0/0 [271022.299693] LustreError: 147685:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(3690767) 
req@ffff8bd3f7416850 x1689652691403136/t0(0) o4->0e941ceb-3e75-4@10.51.4.53@o2ib3:659/0 lens 488/448 e 0 to 0 dl 1618991704 ref 1 fl Interpret:/0/0 rc 0/0 [271022.299695] LustreError: 221489:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 907230(1955806) req@ffff8bd183522850 x1689652691392000/t0(0) o4->0e941ceb-3e75-4@10.51.4.53@o2ib3:659/0 lens 504/448 e 0 to 0 dl 1618991704 ref 1 fl Interpret:/0/0 rc 0/0 [271022.378755] LustreError: 204443:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 32 previous similar messages [271036.248566] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.214@o2ib5: error 0(sending)(waiting) [271036.262050] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ffcc0) failed: 5 [271036.262960] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bde309a7c00 [271036.272704] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.214@o2ib5 exceeded retry count 0 [271036.272706] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 5 previous similar messages [271036.272709] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc4c7bcf400 [271036.272958] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc4c6e7f400 [271036.272961] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bb3f8110400 [271036.273220] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc4bace2400 [271036.273224] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bad579a8800 [271036.273228] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bde309a7c00 [271036.273230] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc4c6e7f400 
[271036.284880] LNet: 88616:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.214@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [271036.408887] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 669 previous similar messages [271047.305209] Lustre: oak-OST003c: Bulk IO write error with 96319bde-17a5-4 (at 10.51.2.6@o2ib3), client will retry: rc = -110 [271047.317838] Lustre: Skipped 25 previous similar messages [271097.310463] LustreError: 148673:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 466189(1514765) req@ffff8bb02d8d8850 x1684955305891072/t0(0) o4->603f6de5-3622-4@10.50.4.30@o2ib2:731/0 lens 488/448 e 0 to 0 dl 1618991776 ref 1 fl Interpret:/0/0 rc 0/0 [271097.336967] LustreError: 148673:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 29 previous similar messages [271111.914036] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.4.31@o2ib3 ns: filter-oak-OST0046_UUID lock: ffff8bdd1860f980/0xf81cb91ff387dd3 lrc: 3/0,0 mode: PW/PW res: [0x220ae9c:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4194303) flags: 0x60000400030020 nid: 10.51.4.31@o2ib3 remote: 0x949e38f809f9713f expref: 6 pid: 193044 timeout: 271117 lvb_type: 0 [271130.369078] LustreError: 209395:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be71db40850 x1696881770443328/t0(0) o3->d507c462-9f0f-5857-aa2a-29a15c595cfc@10.51.6.7@o2ib3:72/0 lens 488/440 e 0 to 0 dl 1618991872 ref 1 fl Interpret:/0/0 rc 0/0 [271130.395892] LustreError: 209395:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [271142.913346] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 112s: evicting client at 10.50.13.4@o2ib2 ns: filter-oak-OST0058_UUID lock: ffff8baeffc857c0/0xf81cb91ff39ac7e lrc: 3/0,0 mode: PW/PW res: [0x23b04c5:0x0:0x0].0x0 rrc: 3 type: EXT 
[0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400020020 nid: 10.50.13.4@o2ib2 remote: 0x5adc4f10aea02053 expref: 22 pid: 193127 timeout: 271148 lvb_type: 0 [271183.565776] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [271183.579934] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc2a5e83000 [271183.592099] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be723d1a800 [271183.604253] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be39dc62400 [271183.616403] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be39dc62400 [271183.628555] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd9d4baa800 [271183.640711] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd9d4baa800 [271183.652877] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be717ad1800 [271183.665060] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7098b7800 [271183.677225] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcf2af46800 [271183.689381] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcf2af46800 [271193.420010] Lustre: oak-OST005a: haven't heard from client 7ce5ea9e-55f5-4 (at 10.51.4.31@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8be08c195800, cur 1618991841 expire 1618991691 last 1618991614 [271247.351618] LustreError: 5999:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(3679) req@ffff8be723c3f050 x1689053110640448/t0(0) o4->ed109291-03b5-4@10.51.4.16@o2ib3:123/0 lens 488/448 e 0 to 0 dl 1618991923 ref 1 fl Interpret:/0/0 rc 0/0 [271267.563363] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [271267.577414] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be72307d800 [271267.589578] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be39f8f3000 [271267.601754] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be152e89c00 [271267.613930] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bca6cf8bc00 [271267.626115] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71fd92400 [271267.638288] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be6160c6800 [271267.650466] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be39dc62c00 [271267.662638] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be39dc62c00 [271267.674803] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcc952cc400 [271267.686961] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be6160c6800 [271359.726601] LustreError: 221479:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be71c46d850 x1689651233109120/t0(0) o4->b4a05b29-fe90-4@10.51.5.60@o2ib3:300/0 lens 488/448 e 0 to 0 dl 1618992100 ref 1 fl Interpret:/0/0 rc 0/0 [271359.751718] LustreError: 
221479:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [271373.907990] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 111s: evicting client at 10.51.0.67@o2ib3 ns: filter-oak-OST004c_UUID lock: ffff8bae570d0d80/0xf81cb91ff3b735b lrc: 4/0,0 mode: PR/PR res: [0x1000000400:0x31d6c8:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400010020 nid: 10.51.0.67@o2ib3 remote: 0x7e132cdbaef152c0 expref: 9 pid: 187429 timeout: 271379 lvb_type: 1 [271389.200292] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.12.21@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xb16283c5 [271389.217283] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [271390.560690] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [271390.574717] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be2b58ccc00 [271390.586904] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd459600400 [271390.599051] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be70cdca400 [271390.611201] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bca727ebc00 [271390.623359] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdeb46b2c00 [271390.635510] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd0b07c2800 [271390.647677] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd0b07c2800 [271390.659850] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be70c632800 [271390.672029] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdeb46b4c00 [271390.684189] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdeb46b4c00 [271394.337646] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be20f4d0800 [271394.349831] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be603359800 [271394.361996] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be20f4d0800 [271394.374173] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be603359800 [271394.386357] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be20f4d0800 [271394.398558] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bccda9a5400 [271394.410751] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bccda9a5400 [271394.422942] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bccda9a5400 [271394.435125] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be717ad3000 [271394.447291] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be717ad3000 [271394.459441] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be717ad3000 [271396.423311] Lustre: oak-OST0044: Bulk IO read error with 9458049c-ca8d-335b-3531-2606964e11c0 (at 10.51.2.31@o2ib3), client will retry: rc -110 [271396.437781] Lustre: Skipped 192 previous similar messages [271408.336425] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [271408.348948] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) 
Skipped 1 previous similar message [271408.360665] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcf68acfc00 [271408.372828] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdd9a86d000 [271408.384975] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be08c192000 [271408.397125] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be08c192000 [271408.409269] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be719d82000 [271413.832891] Lustre: oak-OST0032: Client e9bcecd7-a198-50aa-33a9-a04f0aea63df (at 10.51.6.28@o2ib3) reconnecting [271413.832892] Lustre: oak-OST004a: Client e9bcecd7-a198-50aa-33a9-a04f0aea63df (at 10.51.6.28@o2ib3) reconnecting [271413.832894] Lustre: Skipped 2179 previous similar messages [271413.861865] Lustre: Skipped 1 previous similar message [271450.650734] LustreError: 137-5: oak-OST0037_UUID: not available for connect from 10.51.13.5@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [271450.650735] LustreError: 137-5: oak-OST003b_UUID: not available for connect from 10.51.13.5@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[271450.650738] LustreError: Skipped 13 previous similar messages [271500.334353] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [271500.348459] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd07a3bb400 [271500.360897] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdd7722ac00 [271500.373076] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be39f8f3000 [271500.385600] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be20f4d5000 [271500.397774] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be20f4d5000 [271500.409937] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdd9a86ac00 [271500.422102] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdd9a86ac00 [271500.434282] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd03bedf800 [271500.446450] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd03bedf800 [271558.698667] Lustre: 193069:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618992033/real 1618992033] req@ffff8bd91c927980 x1697353984182656/t0(0) o104->oak-OST0058@10.51.6.6@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1618992206 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [271558.729234] Lustre: 193069:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [271563.872505] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.51@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xb19eff35 [271575.438790] Lustre: oak-OST004a: haven't heard from client 0b13f2ae-d6e1-4 (at 
10.51.2.70@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be720dab400, cur 1618992223 expire 1618992073 last 1618991996 [271595.641114] Lustre: oak-OST004a: Connection restored to 5b343834-158f-4 (at 10.51.5.36@o2ib3) [271595.641115] Lustre: oak-OST0054: Connection restored to (at 10.51.5.1@o2ib3) [271595.641118] Lustre: Skipped 3828 previous similar messages [271685.014784] LustreError: 220406:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bdccec5a850 x1689647575514944/t0(0) o3->17bce2f0-6b2d-4@10.51.5.17@o2ib3:626/0 lens 488/440 e 0 to 0 dl 1618992426 ref 1 fl Interpret:/0/0 rc 0/0 [271685.039746] LustreError: 220406:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 16 previous similar messages [271685.276949] LNet: 50606:0:(lib-move.c:976:lnet_post_send_locked()) Aborting message for 12345-10.0.2.216@o2ib5: LNetM[DE]Unlink() already called on the MD/ME. [271685.292873] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bcf6515a000 [271685.305017] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bcf6515a000 [271685.317158] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8be238388000 [271685.329305] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8be238388000 [271685.341463] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bdb091f1c00 [271685.353616] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bdb091f1c00 [271685.365756] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bca7f74a400 [271685.377904] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bca7f74a400 [271685.390047] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, 
status -125, desc ffff8be757038c00 [271685.402198] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8be757038c00 [271685.414344] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8be298c45800 [271685.426491] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8be298c45800 [271685.438630] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bccda9a5400 [271685.450770] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bccda9a5400 [271685.561114] Lustre: oak-OST0036: Bulk IO write error with 4cee94e6-025c-589a-13ab-6c9ed337de31 (at 10.51.2.36@o2ib3), client will retry: rc = -110 [271685.575887] Lustre: Skipped 70 previous similar messages [271697.329757] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [271697.343246] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff940) failed: 5 [271697.345854] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbf9a9e4400 [271697.353969] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [271697.353971] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [271697.353974] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be9755bd000 [271697.354251] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 3, status -5, desc ffff8bda0b0e3c00 [271697.354256] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be974874000 [271697.354517] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be974874000 [271697.354780] LustreError: 
50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be1cef99c00 [271697.354782] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be313d78400 [271697.354785] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be313d78400 [271697.354787] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be7314f8000 [271697.365807] LustreError: 193425:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be6f1381850 x1696864320815936/t0(0) o4->4cee94e6-025c-589a-13ab-6c9ed337de31@10.51.2.36@o2ib3:633/0 lens 488/448 e 0 to 0 dl 1618992433 ref 1 fl Interpret:/0/0 rc 0/0 [271697.365808] LustreError: 193425:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 50 previous similar messages [271697.522355] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 959 previous similar messages [271746.503678] LustreError: 217486:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bdb870d4050 x1689647575533632/t0(0) o3->17bce2f0-6b2d-4@10.51.5.17@o2ib3:636/0 lens 488/440 e 0 to 0 dl 1618992436 ref 1 fl Interpret:/0/0 rc 0/0 [271746.530005] LustreError: 217486:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 200 previous similar messages [272045.827183] Lustre: oak-OST004c: Client 4db85dd3-53e2-443c-9c81-8e4b8ae700de (at 10.50.12.5@o2ib2) reconnecting [272045.838801] Lustre: Skipped 2049 previous similar messages [272094.335665] LustreError: 137-5: oak-OST0043_UUID: not available for connect from 10.51.12.2@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[272094.355079] LustreError: Skipped 2 previous similar messages [272142.542560] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [272142.555927] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdf2fef5000 [272142.568081] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce23cbb400 [272142.580217] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce23cbb400 [272142.592351] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be96cca5800 [272142.604500] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd8ba7de400 [272196.599845] LustreError: 5988:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1280) req@ffff8bde689a1050 x1684968248962240/t0(0) o4->ea702749-deff-4@10.51.4.14@o2ib3:328/0 lens 488/448 e 0 to 0 dl 1618992883 ref 1 fl Interpret:/0/0 rc 0/0 [272196.600046] Lustre: oak-OST0040: Bulk IO read error with 934d532f-1b5d-4 (at 10.51.4.50@o2ib3), client will retry: rc -110 [272196.600047] Lustre: Skipped 88 previous similar messages [272196.643814] LustreError: 5988:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 41 previous similar messages [272197.263937] Lustre: oak-OST003e: Connection restored to 976970b5-5fad-aab3-00a6-23fd47844552 (at 10.51.13.16@o2ib3) [272197.275695] Lustre: Skipped 2097 previous similar messages [272310.519195] LustreError: 193411:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd176695050 x1697130131867200/t0(0) o4->56a5a766-0782-0626-7e81-90dde2e2789a@10.51.2.28@o2ib3:487/0 lens 488/448 e 0 to 0 dl 1618993042 ref 1 fl Interpret:/0/0 rc 0/0 [272310.546337] LustreError: 193411:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [272310.557185] Lustre: oak-OST0032: Bulk IO write error 
with 56a5a766-0782-0626-7e81-90dde2e2789a (at 10.51.2.28@o2ib3), client will retry: rc = -110 [272310.571951] Lustre: Skipped 8 previous similar messages [272327.219499] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.10@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xb295018d [272327.236390] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [272329.405303] Lustre: oak-OST004a: haven't heard from client c6f24fdc-c180-4 (at 10.51.3.48@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be6c89fc400, cur 1618992977 expire 1618992827 last 1618992750 [272331.445967] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.10@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xb295076d [272375.132866] LustreError: 193102:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0054: cli 20486326-e3f6-4 claims 4218880 GRANT, real grant 2744320 [272500.534758] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [272500.549381] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bccbd4bd800 [272500.561552] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be298c44800 [272500.573710] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd07f085400 [272500.585873] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd07f085400 [272500.598049] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bddc89d7c00 [272500.610223] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdad4f86800 [272500.622382] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd7b22f3c00 [272500.634556] 
LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd7b22f3c00 [272500.646730] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bc4c2641c00 [272500.658905] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be1cef99c00 [272500.671063] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdad4f86800 [272500.700736] LustreError: 193425:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be7eb809050 x1696860127321792/t0(0) o4->54153ede-ddc5-4@10.51.2.1@o2ib3:685/0 lens 488/448 e 0 to 0 dl 1618993240 ref 1 fl Interpret:/0/0 rc 0/0 [272500.726129] LustreError: 193425:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 10 previous similar messages [272510.069736] LNet: 50603:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d859ce9b00) failed: 5 [272510.070252] LNet: 50602:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.215@o2ib5 exceeded retry count 0 [272510.070255] LustreError: 50605:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbf64fed000 [272510.070256] LNet: 50602:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [272510.070258] LustreError: 50604:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbf64fed000 [272510.070259] LustreError: 50602:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcd1bd35400 [272510.070276] LustreError: 50604:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbbb2275000 [272510.070279] LustreError: 50602:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbf64fee000 [272510.070284] LustreError: 50604:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbf64fee000 [272510.070412] LustreError: 50599:0:(events.c:450:server_bulk_callback()) 
event type 5, status -103, desc ffff8bccef99ac00 [272510.072318] LNet: 14182:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.215@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [272510.072339] LNet: 50604:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.215@o2ib5 failed: 5 [272510.072341] LNet: 50604:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [272510.225418] LNet: 50603:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 234 previous similar messages [272546.672799] LustreError: 193456:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8be719723050 x1695475083797248/t0(0) o3->61781ed1-b14e-4@10.51.13.4@o2ib3:685/0 lens 488/440 e 0 to 0 dl 1618993240 ref 1 fl Interpret:/0/0 rc 0/0 [272546.698289] LustreError: 193456:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 13 previous similar messages [272647.211197] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff860) failed: 5 [272647.212101] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbaabe17000 [272647.221948] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.214@o2ib5 exceeded retry count 0 [272647.221949] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 4 previous similar messages [272647.221952] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbd1d8d1400 [272647.222275] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbd1d8d1400 [272647.222606] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be140413000 [272647.222949] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be140413000 [272647.223292] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc089cd9400 
[272647.223628] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcb07fe0400 [272647.223954] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcb07fe0400 [272647.233979] LNet: 89091:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.214@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [272647.357832] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1768 previous similar messages [272667.474104] Lustre: oak-OST0044: Client d723b707-c0de-4 (at 10.51.5.24@o2ib3) reconnecting [272667.483435] Lustre: Skipped 637 previous similar messages [272713.611656] LustreError: 137-5: oak-OST0055_UUID: not available for connect from 10.51.4.14@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [272713.631131] LustreError: Skipped 1 previous similar message [272788.305194] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be4ad55bc00 [272788.317357] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be4ad55bc00 [272788.329572] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd9ace97c00 [272788.341779] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcf99586c00 [272788.353957] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdf2fef1800 [272788.366169] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdf2fef1800 [272788.378338] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdf2fef1800 [272788.390556] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd9ace97c00 [272798.273497] Lustre: oak-OST0046: Connection 
restored to 3ebfe839-f598-4 (at 10.50.2.20@o2ib2) [272798.283131] Lustre: Skipped 1278 previous similar messages [272846.738748] Lustre: oak-OST0044: Bulk IO read error with 97608f95-8caf-4 (at 10.51.4.8@o2ib3), client will retry: rc -110 [272846.751082] Lustre: Skipped 36 previous similar messages [272908.367286] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcde8807800 [272908.379440] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcde8807800 [272908.391592] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bca73398400 [272908.403764] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0a6459400 [272908.415914] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0a6459400 [272908.428066] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be1e80ce800 [272908.440227] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bce794c4800 [272908.452376] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdb8b593000 [272950.245365] LustreError: 221477:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd25f7c7050 x1696620899884224/t0(0) o4->62377ef1-e7e9-f1fa-66fa-0724308527ab@10.51.1.27@o2ib3:379/0 lens 488/448 e 0 to 0 dl 1618993689 ref 1 fl Interpret:/0/0 rc 0/0 [272950.272438] LustreError: 221477:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 8 previous similar messages [272950.283313] Lustre: oak-OST0038: Bulk IO write error with 62377ef1-e7e9-f1fa-66fa-0724308527ab (at 10.51.1.27@o2ib3), client will retry: rc = -110 [272950.298064] Lustre: Skipped 11 previous similar messages [272951.565252] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 
12345-10.51.1.27@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xb36e9b75 [272971.763494] LustreError: 217347:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1959) req@ffff8be102b8f850 x1695838610330560/t0(0) o4->eab9e2cf-af4c-4@10.51.15.1@o2ib3:338/0 lens 488/448 e 0 to 0 dl 1618993648 ref 1 fl Interpret:/0/0 rc 0/0 [272971.789217] LustreError: 217347:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 4 previous similar messages [273033.522330] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [273033.535724] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 4 previous similar messages [273033.547500] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd9ace97800 [273033.559670] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3b1fcc400 [273033.571825] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd1bd37800 [273033.583995] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd1bd37800 [273033.596165] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd1bd37800 [273033.608324] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd1bd37800 [273033.620484] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be298c42800 [273033.632651] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd459604c00 [273033.644853] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd459604c00 [273033.657015] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd459604c00 [273033.669175] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd741463400 [273033.681351] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdc414ba000 [273081.633330] Lustre: 184632:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618993556/real 1618993556] req@ffff8bd128c9e780 x1697353985350656/t0(0) o105->oak-OST0044@10.51.5.2@o2ib3:15/16 lens 360/224 e 0 to 1 dl 1618993729 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [273081.663950] Lustre: 184632:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [273084.585839] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.60@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xb38fa955 [273084.602728] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 5 previous similar messages [273204.295635] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be71b0d0c00 [273204.307796] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd26aee5000 [273204.307850] LustreError: 220397:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bcc89aec050 x1689692031288768/t0(0) o4->bf9174df-fd25-4@10.51.14.20@o2ib3:634/0 lens 488/448 e 0 to 0 dl 1618993944 ref 1 fl Interpret:/0/0 rc 0/0 [273204.307852] LustreError: 220397:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 33 previous similar messages [273204.356247] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bda5fa95c00 [273204.368393] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bda5fa95c00 [273204.380544] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be978891c00 [273204.392697] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, 
status -103, desc ffff8bd36ae28c00 [273250.294367] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd9a8ac0400 [273250.306541] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8baea5099800 [273250.318731] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8baea5099800 [273250.330889] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be16e1ecc00 [273250.343059] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd741461000 [273270.303302] Lustre: oak-OST0034: Client 438da15f-c993-4 (at 10.51.0.68@o2ib3) reconnecting [273270.312663] Lustre: Skipped 571 previous similar messages [273271.811205] LustreError: 193426:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1231074) req@ffff8bcc455c1050 x1692936501243328/t0(0) o4->d0879b94-e3f0-4@10.51.2.59@o2ib3:633/0 lens 488/448 e 0 to 0 dl 1618993943 ref 1 fl Interpret:/0/0 rc 0/0 [273271.811650] LustreError: 193181:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bca76ba1050 x1694729485533632/t0(0) o3->57fff42c-ee97-4@10.51.13.7@o2ib3:634/0 lens 488/440 e 0 to 0 dl 1618993944 ref 1 fl Interpret:/0/0 rc 0/0 [273271.811652] LustreError: 193181:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 23 previous similar messages [273271.875026] LustreError: 193426:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 2 previous similar messages [273296.814161] LustreError: 209397:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1230391) req@ffff8bd3b8d1d050 x1689652702286464/t0(0) o4->0e941ceb-3e75-4@10.51.4.53@o2ib3:668/0 lens 488/448 e 0 to 0 dl 1618993978 ref 1 fl Interpret:/0/0 rc 0/0 [273296.840751] LustreError: 209397:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 3 previous similar messages [273324.847663] LustreError: 137-5: 
oak-OST0033_UUID: not available for connect from 10.51.15.5@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [273324.867171] LustreError: Skipped 47 previous similar messages [273401.326407] Lustre: oak-OST0032: Connection restored to 0e941ceb-3e75-4 (at 10.51.4.53@o2ib3) [273401.336064] Lustre: Skipped 959 previous similar messages [273430.290058] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be140416000 [273430.302234] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb2f283400 [273430.314377] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd1bd31c00 [273430.326521] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0a645ec00 [273430.338663] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0a645ec00 [273430.350812] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdef1bf3c00 [273430.362972] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdef1bf3c00 [273430.375123] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd1bd30800 [273432.380681] Lustre: oak-OST0058: haven't heard from client 059e2cc2-60f6-4 (at 10.51.2.17@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bd671c14c00, cur 1618994080 expire 1618993930 last 1618993853 [273471.849602] LustreError: 227921:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1838566) req@ffff8bd25f7c7850 x1689649915022336/t0(0) o4->6ab3f4aa-3003-4@10.51.5.16@o2ib3:101/0 lens 488/448 e 0 to 0 dl 1618994166 ref 1 fl Interpret:/0/0 rc 0/0 [273496.850934] Lustre: oak-OST0040: Bulk IO read error with eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3), client will retry: rc -110 [273496.863365] Lustre: Skipped 76 previous similar messages [273548.830434] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.36@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xb421e47d [273548.847327] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [273620.644812] LustreError: 193439:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be459eda850 x1695475125678400/t0(0) o3->61781ed1-b14e-4@10.51.13.4@o2ib3:300/0 lens 488/440 e 0 to 0 dl 1618994365 ref 1 fl Interpret:/0/0 rc 0/0 [273620.669797] LustreError: 193439:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 19 previous similar messages [273627.374146] Lustre: oak-OST0052: haven't heard from client c6bd94e5-710a-4 (at 10.51.5.6@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bd3435bcc00, cur 1618994275 expire 1618994125 last 1618994048 [273843.279829] LNet: 238533:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [273843.292451] LNet: 238533:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 3 previous similar messages [273843.303706] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5feb40) failed: 5 [273843.314111] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 3 previous similar messages [273843.314394] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [273843.314396] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [273843.314398] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd741460c00 [273843.314720] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bca7bbe0c00 [273872.427545] Lustre: oak-OST0040: Client 4fa2634c-f66a-aa1b-358f-94f1e7640e7e (at 10.50.5.16@o2ib2) reconnecting [273872.438927] Lustre: Skipped 1663 previous similar messages [273896.925746] LustreError: 209389:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bd25f7c7850 x1691972714923328/t0(0) o3->1fbd93da-ab0e-4@10.51.2.40@o2ib3:518/0 lens 488/440 e 0 to 0 dl 1618994583 ref 1 fl Interpret:/0/0 rc 0/0 [273896.925809] LustreError: 217349:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bd2bb262850 x1685053789431936/t0(0) o3->059e2cc2-60f6-4@10.51.2.17@o2ib3:518/0 lens 488/440 e 0 to 0 dl 1618994583 ref 1 fl Interpret:/0/0 rc 0/0 [273896.925811] LustreError: 217349:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 14 previous similar messages [273896.925913] LustreError: 219012:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1231948) req@ffff8be5a1bdb050 x1684948111892992/t0(0) 
o4->732e3689-d982-4@10.51.3.5@o2ib3:521/0 lens 488/448 e 0 to 0 dl 1618994586 ref 1 fl Interpret:/0/0 rc 0/0 [273896.925915] LustreError: 219012:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 10 previous similar messages [273896.925930] Lustre: oak-OST003c: Bulk IO write error with ea702749-deff-4 (at 10.51.4.14@o2ib3), client will retry: rc = -110 [273896.925931] Lustre: Skipped 37 previous similar messages [273897.044539] LustreError: 209389:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 49 previous similar messages [274001.442679] Lustre: oak-OST0040: Connection restored to aa49d6cf-64e9-4 (at 10.51.5.71@o2ib3) [274001.452329] Lustre: Skipped 1995 previous similar messages [274004.030735] LustreError: 137-5: oak-OST0039_UUID: not available for connect from 10.51.4.19@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [274004.050174] LustreError: Skipped 9 previous similar messages [274040.056663] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.1.27@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xb4c76e2d [274264.530830] LustreError: 5986:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd9fbd0f050 x1696620991697344/t0(0) o4->8b66e4db-fcc9-6215-2cf9-86ed141f42d8@10.51.2.26@o2ib3:179/0 lens 488/448 e 0 to 0 dl 1618994999 ref 1 fl Interpret:/0/0 rc 0/0 [274264.557718] LustreError: 5986:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 7 previous similar messages [274525.322741] Lustre: oak-OST0040: Client 9768808c-691d-4f78-accb-0d29f922d720 (at 10.51.13.21@o2ib3) reconnecting [274525.334240] Lustre: Skipped 447 previous similar messages [274565.186968] Lustre: oak-OST0040: Bulk IO read error with 56a5a766-0782-0626-7e81-90dde2e2789a (at 10.51.2.28@o2ib3), client will retry: rc -110 [274565.201515] Lustre: Skipped 24 previous similar messages [274601.715400] Lustre: oak-OST003e: Connection restored to a7f2d980-27e4-4 (at 10.51.2.58@o2ib3) 
[274601.725022] Lustre: Skipped 758 previous similar messages [274786.939637] LustreError: 137-5: oak-OST0045_UUID: not available for connect from 10.51.5.54@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [274786.959049] LustreError: Skipped 7 previous similar messages [274850.480144] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [274850.494545] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be006220800 [274850.506720] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be723df6000 [274850.518881] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd8b583e400 [274850.531046] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be723d1ec00 [274850.543209] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be16e1e8000 [274850.555375] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be16e1e8000 [274850.567564] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd8b583e400 [274850.567569] LustreError: 227992:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bcee505c050 x1690572367821248/t0(0) o4->2f2dff73-004a-4@10.51.2.14@o2ib3:12/0 lens 488/448 e 0 to 0 dl 1618995587 ref 1 fl Interpret:/0/0 rc 0/0 [274850.567571] LustreError: 227992:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [274850.567772] Lustre: oak-OST005a: Bulk IO write error with 2f2dff73-004a-4 (at 10.51.2.14@o2ib3), client will retry: rc = -110 [274850.567774] Lustre: Skipped 15 previous similar messages [274872.121514] LustreError: 193196:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 
1048576(2659516) req@ffff8be2f2c97050 x1696620928953216/t0(0) o4->62377ef1-e7e9-f1fa-66fa-0724308527ab@10.51.1.27@o2ib3:730/0 lens 488/448 e 0 to 0 dl 1618995550 ref 1 fl Interpret:/0/0 rc 0/0 [274872.150141] LustreError: 193196:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 5 previous similar messages [274897.126729] LustreError: 193197:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8be210ed0850 x1690572367822208/t0(0) o4->2f2dff73-004a-4@10.51.2.14@o2ib3:12/0 lens 488/448 e 0 to 0 dl 1618995587 ref 1 fl Interpret:/0/0 rc 0/0 [274897.126832] LustreError: 221473:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8bdac9478050 x1688875295727872/t0(0) o3->fd16aff2-0371-4@10.51.4.33@o2ib3:13/0 lens 488/440 e 0 to 0 dl 1618995588 ref 1 fl Interpret:/0/0 rc 0/0 [274897.178591] LustreError: 193197:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 13 previous similar messages [274922.127474] LustreError: 242900:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1232960) req@ffff8bd986b4c850 x1688693574589632/t0(0) o4->a7834140-6ca8-4@10.51.2.9@o2ib3:18/0 lens 488/448 e 0 to 0 dl 1618995593 ref 1 fl Interpret:/0/0 rc 0/0 [274922.153889] LustreError: 242900:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 1 previous similar message [274971.195416] Lustre: 193044:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618995445/real 1618995445] req@ffff8bdfbcd68900 x1697353987527232/t0(0) o104->oak-OST005e@10.51.6.7@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1618995618 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [275000.670897] LNet: 50609:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.2.26@o2ib3 portal 16 match 1697353987683328 offset 192 length 192: 4 [275040.338142] Lustre: oak-OST0050: haven't heard from client 61918a97-82ad-4 (at 10.51.3.3@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8be20f4d3c00, cur 1618995688 expire 1618995538 last 1618995461 [275082.251506] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd522826400 [275082.263668] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd522826400 [275082.275846] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd556863c00 [275082.288010] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd556863c00 [275082.300173] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0d5167400 [275082.312334] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd610b93400 [275082.324496] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd610b93400 [275082.336698] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd1086f7800 [275122.163318] LustreError: 221410:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 3145728(3691336) req@ffff8bd1df64d050 x1689648441388928/t0(0) o4->71f8f8a7-b976-4@10.51.5.59@o2ib3:224/0 lens 488/448 e 0 to 0 dl 1618995799 ref 1 fl Interpret:/0/0 rc 0/0 [275128.208771] Lustre: oak-OST0050: Client d507c462-9f0f-5857-aa2a-29a15c595cfc (at 10.51.6.7@o2ib3) reconnecting [275128.220039] Lustre: Skipped 720 previous similar messages [275164.472366] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff240) failed: 5 [275164.472755] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd316b21400 [275164.474582] LNet: 14182:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [275164.482851] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 
0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [275164.482853] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcf976e9000 [275164.482854] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 3 previous similar messages [275164.482856] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcf976e9000 [275164.482859] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd68c865400 [275164.482861] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bce794c0c00 [275164.482866] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd1086f2800 [275164.482868] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd1086f2800 [275164.482881] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be61bbfe400 [275164.619049] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 62 previous similar messages [275204.163154] Lustre: oak-OST003e: Connection restored to 61789f7e-98b3-4 (at 10.50.2.65@o2ib2) [275204.172796] Lustre: Skipped 993 previous similar messages [275221.818606] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.2.63@o2ib3 ns: filter-oak-OST0048_UUID lock: ffff8be17ffaca40/0xf81cb91ff463ba0 lrc: 4/0,0 mode: PW/PW res: [0x21461f7:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->1232895) flags: 0x60000400010020 nid: 10.51.2.63@o2ib3 remote: 0xc4f8ca8d0ef82164 expref: 6 pid: 189717 timeout: 275227 lvb_type: 0 [275222.187489] LustreError: 217333:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1230883) req@ffff8bdfdc860850 x1689650942120448/t0(0) o4->d3d7476c-0db8-4@10.51.5.12@o2ib3:324/0 lens 488/448 e 0 to 0 dl 1618995899 ref 1 fl 
Interpret:/0/0 rc 0/0 [275222.188603] Lustre: oak-OST0048: Bulk IO read error with 62b89a52-cd86-4 (at 10.51.6.17@o2ib3), client will retry: rc -110 [275222.188604] Lustre: Skipped 46 previous similar messages [275222.232527] LustreError: 217333:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 12 previous similar messages [275263.351551] Lustre: oak-OST003a: haven't heard from client 9b12e584-d591-4 (at 10.51.12.20@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be553a7bc00, cur 1618995911 expire 1618995761 last 1618995684 [275270.347080] Lustre: oak-OST0050: haven't heard from client d122ae48-3039-4 (at 10.51.4.26@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be7937e4c00, cur 1618995918 expire 1618995768 last 1618995691 [275286.356078] Lustre: oak-OST0058: haven't heard from client e9ad2042-8f70-4 (at 10.51.5.32@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bd556866800, cur 1618995934 expire 1618995784 last 1618995707 [275286.378402] Lustre: Skipped 1 previous similar message [275308.816539] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.5.9@o2ib3 ns: filter-oak-OST0058_UUID lock: ffff8bba307560c0/0xf81cb91ff58dd8d lrc: 4/0,0 mode: PW/PW res: [0x23b0f39:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4194303) flags: 0x60000400030020 nid: 10.51.5.9@o2ib3 remote: 0xe67b4c1e5018f626 expref: 15 pid: 192990 timeout: 275314 lvb_type: 0 [275308.861348] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 1 previous similar message [275334.531975] Lustre: 223101:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618995809/real 1618995809] req@ffff8bd0d0869b00 x1697353987810816/t0(0) o105->oak-OST004c@10.51.0.67@o2ib3:15/16 lens 360/224 e 0 to 1 dl 1618995982 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [275334.562692] Lustre: 
223101:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [275335.615938] Lustre: 193101:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618995810/real 1618995810] req@ffff8bba73b12400 x1697353987811008/t0(0) o106->oak-OST0036@10.51.4.15@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1618995983 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [275348.343413] Lustre: oak-OST0056: haven't heard from client 2f2dff73-004a-4 (at 10.51.2.14@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be7237ce000, cur 1618995996 expire 1618995846 last 1618995769 [275405.502306] LustreError: 217449:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be6569de050 x1685039401238592/t0(0) o3->93c6a620-33f9-4@10.51.2.66@o2ib3:574/0 lens 488/440 e 0 to 0 dl 1618996149 ref 1 fl Interpret:/0/0 rc 0/0 [275405.527256] LustreError: 217449:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 10 previous similar messages [275421.970059] LNet: 50608:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.6.19@o2ib3 portal 16 match 1697353987982464 offset 224 length 224: 4 [275423.732764] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.2.52@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [275423.732765] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.51.2.52@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[275423.732767] LustreError: Skipped 8 previous similar messages [275423.778005] LustreError: Skipped 11 previous similar messages [275430.922736] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.51@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xb6a3ee45 [275430.939629] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 4 previous similar messages [275469.233015] Lustre: oak-OST004c: Bulk IO write error with 11c92c9a-5a17-4 (at 10.51.2.27@o2ib3), client will retry: rc = -110 [275469.245748] Lustre: Skipped 35 previous similar messages [275497.228997] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.59@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xb6c66a65 [275497.245891] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [275498.197246] LustreError: 227985:0:(ldlm_lib.c:3287:target_bulk_io()) @@@ bulk READ failed: rc -107 req@ffff8be717bab850 x1696863517704960/t0(0) o3->eb8cea22-3545-c4b8-6cb6-b3e875ecfb11@10.51.1.23@o2ib3:668/0 lens 488/440 e 0 to 0 dl 1618996243 ref 1 fl Interpret:/0/0 rc 0/0 [275504.176439] LustreError: 5994:0:(ldlm_lib.c:3287:target_bulk_io()) @@@ bulk WRITE failed: rc -107 req@ffff8be711a48050 x1689652705477504/t0(0) o4->0e941ceb-3e75-4@10.51.4.53@o2ib3:674/0 lens 488/448 e 0 to 0 dl 1618996249 ref 1 fl Interpret:/0/0 rc 0/0 [275516.692543] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.53@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xb6cd576d [275516.709468] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 2 previous similar messages [275733.416248] Lustre: oak-OST005a: Client ce8388c3-ee1c-18c4-8f03-235d1bdd9de0 (at 10.50.3.39@o2ib2) reconnecting [275733.427617] Lustre: Skipped 1764 previous similar messages [275746.311162] LustreError: 241046:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be04a007850 
x1697586685463680/t0(0) o3->4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d@10.51.2.32@o2ib3:93/0 lens 488/440 e 0 to 0 dl 1618996423 ref 1 fl Interpret:/0/0 rc 0/0 [275746.339400] LustreError: 241046:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 66 previous similar messages [275807.248455] Lustre: oak-OST0044: Connection restored to fd16aff2-0371-4 (at 10.51.4.33@o2ib3) [275807.258097] Lustre: Skipped 2189 previous similar messages [275868.603570] Lustre: oak-OST0040: Bulk IO read error with 61781ed1-b14e-4 (at 10.51.13.4@o2ib3), client will retry: rc -110 [275868.616035] Lustre: Skipped 65 previous similar messages [275984.230157] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending_nocred)(waiting) [275984.244231] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 2 previous similar messages [275984.255462] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff240) failed: 5 [275984.255841] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd16cd51c00 [275984.255858] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [275984.255860] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 4 previous similar messages [275984.255864] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd0b95eec00 [275984.256451] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd0b95eec00 [275984.257112] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc4c6a88000 [275984.257801] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd0b95ecc00 [275984.258425] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd5703a8000 [275984.259042] LustreError: 
50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd5703a8000 [275984.259602] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd16cd51c00 [275984.384470] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1700 previous similar messages [276021.374833] LustreError: 221487:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(58851) req@ffff8bdbf8176850 x1689650523256832/t0(0) o4->de57c027-a949-4@10.51.5.55@o2ib3:385/0 lens 488/448 e 0 to 0 dl 1618996715 ref 1 fl Interpret:/0/0 rc 0/0 [276046.376221] LustreError: 221482:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bde83433050 x1684935335213952/t0(0) o3->04118086-b366-4@10.51.3.10@o2ib3:394/0 lens 488/440 e 0 to 0 dl 1618996724 ref 1 fl Interpret:/0/0 rc 0/0 [276046.377502] LustreError: 221470:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1517390) req@ffff8bdfd9489050 x1689656672708416/t0(0) o4->d8462e28-e3fc-4@10.51.5.27@o2ib3:397/0 lens 488/448 e 0 to 0 dl 1618996727 ref 1 fl Interpret:/0/0 rc 0/0 [276046.428138] LustreError: 221482:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 19 previous similar messages [276050.666420] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.60@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xb77fc075 [276065.964364] LustreError: 137-5: oak-OST003b_UUID: not available for connect from 10.51.0.15@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [276065.983780] LustreError: Skipped 17 previous similar messages [276175.437219] Lustre: oak-OST003e: haven't heard from client 20e133d7-ba46-4 (at 10.51.3.16@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bdd219ce800, cur 1618996823 expire 1618996673 last 1618996596 [276312.445887] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5fff60) failed: 5 [276312.445944] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [276312.445945] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [276312.445949] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd5703ac000 [276312.456615] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd5703ac000 [276312.456947] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be08c190400 [276312.457269] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be08c190400 [276312.526897] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 793 previous similar messages [276342.803110] Lustre: oak-OST0040: Client 987b9366-48a1-9307-ff57-52c8cdd1c49f (at 10.51.15.13@o2ib3) reconnecting [276342.803112] Lustre: oak-OST0042: Client 987b9366-48a1-9307-ff57-52c8cdd1c49f (at 10.51.15.13@o2ib3) reconnecting [276342.803115] Lustre: Skipped 845 previous similar messages [276342.832146] Lustre: Skipped 1 previous similar message [276371.448019] LustreError: 224959:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 1269760(3366912) req@ffff8be71bad0850 x1695783502499072/t0(0) o3->fd9ffaf8-396f-4@10.51.6.24@o2ib3:722/0 lens 488/440 e 0 to 0 dl 1618997052 ref 1 fl Interpret:/0/0 rc 0/0 [276371.474318] LustreError: 224959:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 43 previous similar messages [276408.359877] Lustre: oak-OST0042: Connection restored to 8cd0fce8-38e8-d9db-a09e-3d56f1d88309 (at 10.51.2.5@o2ib3) [276408.371487] Lustre: Skipped 1618 previous similar messages [276735.435893] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) 
Closing conn to 10.0.2.217@o2ib5: error 0(sending_nocred)(waiting) [276735.449967] LNet: 14182:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 1 previous similar message [276735.461051] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5fe980) failed: 5 [276735.463636] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd5703ae000 [276735.463643] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd5703ae000 [276735.463692] LustreError: 193140:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be6dbc7a050 x1684931317574336/t0(0) o4->f62d458b-f1a4-4@10.51.4.2@o2ib3:389/0 lens 488/448 e 0 to 0 dl 1618997474 ref 1 fl Interpret:/0/0 rc 0/0 [276735.463694] LustreError: 193140:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 2 previous similar messages [276735.463822] Lustre: oak-OST0056: Bulk IO write error with f62d458b-f1a4-4 (at 10.51.4.2@o2ib3), client will retry: rc = -110 [276735.463822] Lustre: Skipped 12 previous similar messages [276735.471638] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [276735.471641] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 3 previous similar messages [276735.471644] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bccda9a6c00 [276735.471955] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bccda9a6c00 [276735.472231] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bccda9a6c00 [276735.472546] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bccda9a6c00 [276735.472839] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd0802ac000 [276735.473128] LustreError: 
50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd0802ac000 [276735.473417] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd0802ac000 [276735.473703] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd0802ac000 [276735.668760] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1938 previous similar messages [276763.369748] LustreError: 193454:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bcf2a5ca850 x1697062284798720/t0(0) o4->e9bcecd7-a198-50aa-33a9-a04f0aea63df@10.51.6.28@o2ib3:423/0 lens 488/448 e 0 to 0 dl 1618997508 ref 1 fl Interpret:/0/0 rc 0/0 [276763.396862] LustreError: 193454:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 30 previous similar messages [276796.556761] Lustre: oak-OST0040: Bulk IO read error with 0cc1f06c-eda0-ebe4-f37b-9f106eca81f1 (at 10.51.4.62@o2ib3), client will retry: rc -110 [276796.571255] Lustre: Skipped 52 previous similar messages [276902.608575] LustreError: 193404:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bcd079b9050 x1696881778723264/t0(0) o4->d507c462-9f0f-5857-aa2a-29a15c595cfc@10.51.6.7@o2ib3:541/0 lens 488/448 e 0 to 0 dl 1618997626 ref 1 fl Interpret:/0/0 rc 0/0 [276902.635686] Lustre: oak-OST0030: Bulk IO write error with d507c462-9f0f-5857-aa2a-29a15c595cfc (at 10.51.6.7@o2ib3), client will retry: rc = -110 [276902.650382] Lustre: Skipped 1 previous similar message [276909.432252] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be283f1e000 [276909.444393] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be283f1e000 [276909.456544] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be283f1e000 [276909.468713] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, 
desc ffff8be283f1cc00 [276909.480895] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be6a9cfd000 [276922.436156] LustreError: 137-5: oak-OST004d_UUID: not available for connect from 10.51.0.16@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [276922.455583] LustreError: Skipped 31 previous similar messages [276945.581786] Lustre: oak-OST0058: Client 8cd0fce8-38e8-d9db-a09e-3d56f1d88309 (at 10.51.2.5@o2ib3) reconnecting [276945.593080] Lustre: Skipped 225 previous similar messages [276946.594014] LustreError: 221465:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1229755) req@ffff8bd119b7a050 x1689649323005312/t0(0) o4->5b343834-158f-4@10.51.5.36@o2ib3:539/0 lens 488/448 e 0 to 0 dl 1618997624 ref 1 fl Interpret:/0/0 rc 0/0 [276946.620606] LustreError: 221465:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 6 previous similar messages [276971.598562] LustreError: 209389:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 185020(1233596) req@ffff8be31f34c850 x1689695263259840/t0(0) o4->d2b765d6-c390-4@10.51.5.1@o2ib3:559/0 lens 488/448 e 0 to 0 dl 1618997644 ref 1 fl Interpret:/0/0 rc 0/0 [276971.598982] LustreError: 221486:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8bd5145e0050 x1689647505562624/t0(0) o3->9b1546d3-bf78-4@10.51.5.26@o2ib3:563/0 lens 488/440 e 0 to 0 dl 1618997648 ref 1 fl Interpret:/0/0 rc 0/0 [276971.598984] LustreError: 221486:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 1 previous similar message [276971.661307] LustreError: 209389:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 15 previous similar messages [276981.430774] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bc452ea9400 [276981.443652] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [276981.443666] LNet: 
241536:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [276981.471720] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 32 previous similar messages [277008.963508] Lustre: oak-OST0038: Connection restored to 5fb240d2-e880-4 (at 10.50.6.8@o2ib2) [277008.973031] Lustre: Skipped 1408 previous similar messages [277055.775996] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.2.25@o2ib3 ns: filter-oak-OST004e_UUID lock: ffff8bd6637a5c40/0xf81cb91ff6232ee lrc: 3/0,0 mode: PR/PR res: [0x1200000400:0x344bec:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.2.25@o2ib3 remote: 0x52e4fe267635c307 expref: 10 pid: 187430 timeout: 277061 lvb_type: 1 [277055.823312] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 2 previous similar messages [277055.834898] LustreError: 170939:0:(client.c:1187:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff8bb093627980 x1697353989264960/t0(0) o105->oak-OST003e@10.51.6.21@o2ib3:15/16 lens 360/224 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1 [277071.880683] Lustre: 193011:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618997546/real 1618997549] req@ffff8be06b66a880 x1697353989151296/t0(0) o105->oak-OST005c@10.51.4.25@o2ib3:15/16 lens 360/224 e 0 to 1 dl 1618997719 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [277079.585476] Lustre: 216148:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618997554/real 1618997554] req@ffff8bd0d086b600 x1697353989152896/t0(0) o105->oak-OST0054@10.51.5.7@o2ib3:15/16 lens 360/224 e 0 to 1 dl 1618997727 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [277079.616041] Lustre: 216148:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 2 previous similar 
messages [277079.748643] LustreError: 227982:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bd714d3f850 x1684949933866496/t0(0) o3->76ded2d0-11e3-4@10.51.2.50@o2ib3:740/0 lens 488/440 e 0 to 0 dl 1618997825 ref 1 fl Interpret:/0/0 rc 0/0 [277079.773687] LustreError: 227982:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [277079.795456] LustreError: 204443:0:(ldlm_lib.c:3287:target_bulk_io()) @@@ bulk READ failed: rc -107 req@ffff8bddf38a0850 x1684949933866752/t0(0) o3->76ded2d0-11e3-4@10.51.2.50@o2ib3:740/0 lens 488/440 e 0 to 0 dl 1618997825 ref 1 fl Interpret:/0/0 rc 0/0 [277099.300137] Lustre: oak-OST0056: haven't heard from client 4b6b0473-d368-4 (at 10.51.3.12@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bd4dcf79000, cur 1618997747 expire 1618997597 last 1618997520 [277178.426352] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be3a82cd000 [277178.438522] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be3a82cd000 [277178.450677] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71ece1c00 [277178.462821] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7b92ab800 [277178.474978] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdcd169e000 [277178.487125] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be62860b400 [277178.499276] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be553a7b800 [277178.499496] Lustre: oak-OST0030: Bulk IO write error with f555a2b4-9a49-4 (at 10.51.4.17@o2ib3), client will retry: rc = -110 [277178.499498] Lustre: Skipped 24 previous similar messages [277178.530176] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event 
type 5, status -103, desc ffff8bd03bed8800 [277178.542341] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd03bed8800 [277246.665940] LustreError: 217341:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 3145728(4194304) req@ffff8be223f63050 x1683981675286784/t0(0) o4->f555a2b4-9a49-4@10.51.4.17@o2ib3:78/0 lens 488/448 e 0 to 0 dl 1618997918 ref 1 fl Interpret:/0/0 rc 0/0 [277247.425032] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71ff15c00 [277247.437200] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0d2d62400 [277247.449347] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be7222fd400 [277247.461488] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be7222fd400 [277247.473637] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be313d79c00 [277247.485810] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be313d79c00 [277247.497961] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be5a4d1e400 [277247.510112] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7222fc000 [277247.522262] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7222fc000 [277247.534415] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be19e6fa400 [277296.678924] LustreError: 193196:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2064384(4161536) req@ffff8be715ca2850 x1689652710299968/t0(0) o4->0e941ceb-3e75-4@10.51.4.53@o2ib3:147/0 lens 488/448 e 0 to 0 dl 1618997987 ref 1 fl Interpret:/0/0 rc 0/0 [277296.705520] LustreError: 
193196:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 19 previous similar messages [277511.194988] LNet: 241536:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [277511.208481] LNet: 241536:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 4 previous similar messages [277511.220667] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcfd12d4400 [277511.232819] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcfd12d4400 [277511.245002] LustreError: 193404:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be723d34050 x1689653559340544/t0(0) o4->7bc8c8b5-b218-4@10.51.5.5@o2ib3:411/0 lens 488/448 e 0 to 0 dl 1618998251 ref 1 fl Interpret:/0/0 rc 0/0 [277511.270357] LustreError: 193404:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 19 previous similar messages [277511.281302] Lustre: oak-OST0034: Bulk IO write error with 7bc8c8b5-b218-4 (at 10.51.5.5@o2ib3), client will retry: rc = -110 [277511.293923] Lustre: Skipped 30 previous similar messages [277524.693182] LustreError: 227991:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd9d5019050 x1696615000004736/t0(0) o4->62b89a52-cd86-4@10.51.6.17@o2ib3:424/0 lens 488/448 e 0 to 0 dl 1618998264 ref 1 fl Interpret:/0/0 rc 0/0 [277546.736510] Lustre: oak-OST0034: Bulk IO read error with d5ed232a-0a4c-4 (at 10.51.4.10@o2ib3), client will retry: rc -110 [277546.748943] Lustre: Skipped 73 previous similar messages [277548.392520] Lustre: oak-OST0050: Client 2dce41e5-6cf1-0747-6975-01d951e9d8ac (at 10.51.1.46@o2ib3) reconnecting [277548.403921] Lustre: Skipped 820 previous similar messages [277548.459642] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600200) failed: 5 [277548.470065] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 31 previous similar messages [277548.470130] LNet: 
50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [277548.470132] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [277548.470134] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bdcd169b400 [277548.470491] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bdcd169b400 [277548.470772] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcf38a33000 [277548.471044] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcf38a33000 [277548.471310] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be715dcc800 [277548.471569] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be715dcc800 [277548.471828] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bda5fa93800 [277548.472092] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bda5fa93800 [277550.277555] LustreError: 137-5: oak-OST0037_UUID: not available for connect from 10.51.3.11@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[277550.297012] LustreError: Skipped 19 previous similar messages [277571.739146] LustreError: 221482:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8be71c30a050 x1689660471368000/t0(0) o3->f3fb32eb-162a-4@10.51.4.58@o2ib3:403/0 lens 488/440 e 0 to 0 dl 1618998243 ref 1 fl Interpret:/0/0 rc 0/0 [277571.739640] LustreError: 227922:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1413119(2461695) req@ffff8bdd0f246850 x1688875351964992/t0(0) o4->fd16aff2-0371-4@10.51.4.33@o2ib3:404/0 lens 488/448 e 0 to 0 dl 1618998244 ref 1 fl Interpret:/0/0 rc 0/0 [277571.739642] LustreError: 227922:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 6 previous similar messages [277571.802527] LustreError: 221482:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 139 previous similar messages [277609.102882] Lustre: oak-OST0058: Connection restored to dc5e2566-c126-4 (at 10.50.10.51@o2ib2) [277609.112607] Lustre: Skipped 2206 previous similar messages [277646.762348] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.5.42@o2ib3 ns: filter-oak-OST004a_UUID lock: ffff8bdcb4db6c00/0xf81cb91ff664753 lrc: 4/0,0 mode: PR/PR res: [0xfc0000400:0x30ea3b:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.5.42@o2ib3 remote: 0xd25ade7d09536a6d expref: 9 pid: 193078 timeout: 277652 lvb_type: 1 [277646.809285] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 1 previous similar message [277646.820372] LustreError: 75144:0:(ldlm_lockd.c:1351:ldlm_handle_enqueue0()) ### lock on destroyed export ffff8bce794c3400 ns: filter-oak-OST004a_UUID lock: ffff8bb3ad101b00/0xf81cb91ff66492f lrc: 3/0,0 mode: --/PW res: [0xfc0000400:0x30ea3b:0x0].0x0 rrc: 3 type: EXT [0->8191] (req 0->8191) flags: 0x50000000020000 nid: 10.51.5.42@o2ib3 remote: 0xd25ade7d09536a74 expref: 9 pid: 75144 timeout: 0 lvb_type: 0 
[277661.762041] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.4.17@o2ib3 ns: filter-oak-OST0058_UUID lock: ffff8bd5c4e01f80/0xf81cb91ff6518b6 lrc: 4/0,0 mode: PW/PW res: [0x23b0f55:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4194303) flags: 0x60000400030020 nid: 10.51.4.17@o2ib3 remote: 0x8200763c1e1350f7 expref: 8 pid: 192979 timeout: 277667 lvb_type: 0 [277692.761319] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.6.25@o2ib3 ns: filter-oak-OST0058_UUID lock: ffff8bdcd4d63a80/0xf81cb91ff668d99 lrc: 3/0,0 mode: PR/PR res: [0x1680000400:0x39e341:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.6.25@o2ib3 remote: 0xcdc9edcf32b86c07 expref: 12 pid: 193003 timeout: 277698 lvb_type: 1 [277715.831797] Lustre: 193011:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618998190/real 1618998190] req@ffff8be06b66ec00 x1697353990307264/t0(0) o105->oak-OST0052@10.51.2.11@o2ib3:15/16 lens 360/224 e 0 to 1 dl 1618998363 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [277715.862506] Lustre: 193011:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 3 previous similar messages [277720.515692] Lustre: 193018:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618998195/real 1618998195] req@ffff8bd0ba9e0000 x1697353990312704/t0(0) o104->oak-OST0034@10.51.4.26@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1618998368 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [277720.546358] Lustre: 193018:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [277742.272992] Lustre: oak-OST0056: haven't heard from client d55d324b-c685-4 (at 10.51.6.4@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8be75703a000, cur 1618998390 expire 1618998240 last 1618998163 [277869.645857] LNet: 50606:0:(lib-move.c:976:lnet_post_send_locked()) Aborting message for 12345-10.0.2.216@o2ib5: LNetM[DE]Unlink() already called on the MD/ME. [277869.661789] LNet: 50606:0:(lib-move.c:976:lnet_post_send_locked()) Skipped 13 previous similar messages [277869.661845] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bce97e13800 [277869.684554] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bce97e13800 [277873.974307] LNet: 50608:0:(lib-move.c:976:lnet_post_send_locked()) Aborting message for 12345-10.0.2.217@o2ib5: LNetM[DE]Unlink() already called on the MD/ME. [277873.974576] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bdf2fef7c00 [277873.974582] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bdad4f81c00 [277873.974586] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bdad4f81c00 [277873.974591] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8be05f7dbc00 [277873.974594] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8be05f7dbc00 [277874.050940] LNet: 50608:0:(lib-move.c:976:lnet_post_send_locked()) Skipped 5 previous similar messages [277874.061435] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bdf2fef7c00 [277960.184822] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be71ff3ec00 [277960.196996] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be62860c400 [277960.209150] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be62860c400 [277960.221311] 
LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be62860c400 [277960.233466] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be62860c400 [277960.245635] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be586185000 [277960.257788] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be586185000 [277960.269970] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be586185000 [277960.282138] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be586185000 [278040.406443] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be4390ea800 [278040.418679] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd68c867800 [278040.430860] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be9762d6800 [278040.443040] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce5451f000 [278040.455208] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd07f081c00 [278040.467377] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bde309a5c00 [278040.479554] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be140410400 [278040.491726] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce5451f000 [278040.503903] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd07f081c00 [278096.878993] LustreError: 193424:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(2479) req@ffff8bd6bedfb050 x1689650950571968/t0(0) 
o4->d3d7476c-0db8-4@10.51.5.12@o2ib3:188/0 lens 488/448 e 0 to 0 dl 1618998783 ref 1 fl Interpret:/0/0 rc 0/0 [278096.904734] LustreError: 193424:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 40 previous similar messages [278148.835780] Lustre: oak-OST0042: Client 3db8cbaf-4f5d-336b-6b52-238c3eec207a (at 10.51.0.11@o2ib3) reconnecting [278148.847211] Lustre: Skipped 2040 previous similar messages [278187.024511] LustreError: 137-5: oak-OST0059_UUID: not available for connect from 10.51.13.23@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [278187.044028] LustreError: Skipped 15 previous similar messages [278209.177553] Lustre: oak-OST0038: Connection restored to (at 10.51.2.53@o2ib3) [278209.185721] Lustre: Skipped 2257 previous similar messages [278221.263486] Lustre: oak-OST0034: haven't heard from client e6f1dc1d-f456-4 (at 10.51.2.68@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bd7b92aa800, cur 1618998869 expire 1618998719 last 1618998642 [278229.257106] Lustre: oak-OST005e: haven't heard from client 571c5cbe-9605-4 (at 10.51.6.22@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8be20f4d2000, cur 1618998877 expire 1618998727 last 1618998650 [278344.175728] LNet: 241536:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [278344.189222] LNet: 241536:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 3 previous similar messages [278344.201724] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be731bffc00 [278344.213888] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdb8b596000 [278344.226044] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdb8b596000 [278344.238202] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdb8b596000 [278344.250358] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdb8b596000 [278344.262571] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdb8b591400 [278344.274723] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcde8802000 [278344.286872] LustreError: 204429:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd630c7e850 x1689657648436544/t0(0) o4->95f4bb74-aa5f-4@10.51.5.52@o2ib3:489/0 lens 488/448 e 0 to 0 dl 1618999084 ref 1 fl Interpret:/0/0 rc 0/0 [278344.286873] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be628608000 [278344.286881] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd1b6135400 [278344.286901] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be628608000 [278344.287089] Lustre: oak-OST005c: Bulk IO write error with 266ca40f-6c8b-4 (at 10.51.4.54@o2ib3), client will retry: rc = -110 [278344.287090] Lustre: Skipped 48 previous 
similar messages [278344.367666] LustreError: 204429:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 15 previous similar messages [278396.950798] LustreError: 5989:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 363996(2461148) req@ffff8be723e3e050 x1688687434540864/t0(0) o4->42e4ab3a-7964-4@10.51.2.69@o2ib3:481/0 lens 488/448 e 0 to 0 dl 1618999076 ref 1 fl Interpret:/0/0 rc 0/0 [278396.951866] LustreError: 193448:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2912256(3960832) req@ffff8be7225d1050 x1697586550412672/t0(0) o3->7e78116f-4212-7e45-91f6-fb2ef0ac146a@10.51.4.9@o2ib3:489/0 lens 488/440 e 0 to 0 dl 1618999084 ref 1 fl Interpret:/0/0 rc 0/0 [278396.951867] LustreError: 193448:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 61 previous similar messages [278396.951981] Lustre: oak-OST003c: Bulk IO read error with 96319bde-17a5-4 (at 10.51.2.6@o2ib3), client will retry: rc -110 [278396.951982] Lustre: Skipped 172 previous similar messages [278397.034578] LustreError: 5989:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 10 previous similar messages [278490.742886] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.6.2@o2ib3 ns: filter-oak-OST0048_UUID lock: ffff8bd9a7440480/0xf81cb91ff630167 lrc: 4/0,0 mode: PW/PW res: [0x214623a:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->1232895) flags: 0x60000400010020 nid: 10.51.6.2@o2ib3 remote: 0x6bdb8c94b30e04ee expref: 9 pid: 193031 timeout: 278496 lvb_type: 0 [278493.742792] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.5.60@o2ib3 ns: filter-oak-OST004e_UUID lock: ffff8bdf85897bc0/0xf81cb91ff6c499a lrc: 4/0,0 mode: PW/PW res: [0x21c35b0:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4194303) flags: 0x60000400030020 nid: 10.51.5.60@o2ib3 remote: 0xf0fd426aab50c002 expref: 6 pid: 206660 timeout: 
278499 lvb_type: 0 [278493.787762] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 1 previous similar message [278515.988274] Lustre: 187519:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618998990/real 1618998990] req@ffff8bd807a48480 x1697353991200128/t0(0) o104->oak-OST0058@10.51.6.6@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1618999163 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [278516.018903] Lustre: 187519:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [278523.352131] LustreError: 193427:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be722749850 x1696864335160000/t0(0) o4->4cee94e6-025c-589a-13ab-6c9ed337de31@10.51.2.36@o2ib3:665/0 lens 488/448 e 0 to 0 dl 1618999260 ref 1 fl Interpret:/0/0 rc 0/0 [278523.379213] LustreError: 193427:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 20 previous similar messages [278676.390898] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff4e0) failed: 5 [278676.390973] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bccbd4bc000 [278676.391007] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [278676.391008] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [278676.391010] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be70c02e000 [278676.393385] LNet: 182047:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [278676.401610] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be70c02e000 [278676.401613] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bdbe7827c00 [278676.401618] LustreError: 
50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbdecd22000 [278676.401621] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcf38a31800 [278676.513838] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 259 previous similar messages [278683.167626] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff5c0) failed: 5 [278683.178039] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 20 previous similar messages [278683.178340] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [278683.178342] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [278683.178345] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be6160c7400 [278683.178348] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71b0d7000 [278683.178352] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71b0d7000 [278683.178363] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71b0d7000 [278683.188591] LNet: 169984:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [278722.035312] LustreError: 5997:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(924765) req@ffff8bdfaa1d2050 x1688683311861248/t0(0) o4->0b774077-ba6c-4@10.51.2.22@o2ib3:66/0 lens 488/448 e 0 to 0 dl 1618999416 ref 1 fl Interpret:/0/0 rc 0/0 [278753.590406] Lustre: oak-OST004a: Client 0cc1f06c-eda0-ebe4-f37b-9f106eca81f1 (at 10.51.4.62@o2ib3) reconnecting [278753.590407] Lustre: oak-OST005a: Client 0cc1f06c-eda0-ebe4-f37b-9f106eca81f1 (at 10.51.4.62@o2ib3) reconnecting [278753.590410] Lustre: Skipped 1292 previous similar messages 
[278753.619400] Lustre: Skipped 1 previous similar message [278773.389313] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdc033cf000 [278773.401488] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdc033cf000 [278773.413644] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0f3a01c00 [278773.425807] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be5aef48400 [278773.437966] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bc4c74c2800 [278773.450133] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7a79aa800 [278773.462292] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7a79aa800 [278773.474449] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be5aef48400 [278805.024737] LustreError: 137-5: oak-OST0059_UUID: not available for connect from 10.51.12.5@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [278805.044183] LustreError: Skipped 11 previous similar messages [278810.169884] Lustre: oak-OST0030: Connection restored to ea3c0abc-03bf-9064-4b06-a2deb2e4ec48 (at 10.51.4.35@o2ib3) [278810.181542] Lustre: Skipped 1824 previous similar messages [278970.282748] Lustre: oak-OST0034: haven't heard from client 69e54a80-d9dd-4 (at 10.51.2.37@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bdd44641400, cur 1618999618 expire 1618999468 last 1618999391 [279096.034986] Lustre: oak-OST0034: Bulk IO write error with 2f2dff73-004a-4 (at 10.51.2.14@o2ib3), client will retry: rc = -110 [279096.047712] Lustre: Skipped 43 previous similar messages [279096.757824] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.14@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xbbcab955 [279096.774722] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [279167.156256] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [279167.168780] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 3 previous similar messages [279167.179927] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600ac0) failed: 5 [279167.190330] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 36 previous similar messages [279167.190528] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [279167.190529] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 5 previous similar messages [279167.190532] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be298c47000 [279167.190848] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be705859c00 [279167.191100] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be705859c00 [279167.191380] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71edaf400 [279167.191654] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71edaf400 [279167.191657] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd35cf48400 [279167.191927] LustreError: 
50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71ff38800 [279167.192173] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71ff38800 [279169.171185] LustreError: 193415:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd30732f850 x1696864599698880/t0(0) o4->76eb6295-9d00-d7ab-8458-c8aac654030a@10.51.6.5@o2ib3:558/0 lens 488/448 e 0 to 0 dl 1618999908 ref 1 fl Interpret:/0/0 rc 0/0 [279169.198206] LustreError: 193415:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 4 previous similar messages [279222.150378] LustreError: 193404:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be701dad050 x1696863522368896/t0(0) o3->eb8cea22-3545-c4b8-6cb6-b3e875ecfb11@10.51.1.23@o2ib3:553/0 lens 488/440 e 0 to 0 dl 1618999903 ref 1 fl Interpret:/0/0 rc 0/0 [279222.150476] LustreError: 5999:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be929673050 x1689651313187712/t0(0) o3->2c63b434-3a22-4@10.51.5.53@o2ib3:553/0 lens 488/440 e 0 to 0 dl 1618999903 ref 1 fl Interpret:/0/0 rc 0/0 [279222.150479] LustreError: 5999:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 16 previous similar messages [279222.150578] Lustre: oak-OST003e: Bulk IO read error with 1d13e74c-2452-4 (at 10.51.4.61@o2ib3), client will retry: rc -110 [279222.150579] Lustre: Skipped 115 previous similar messages [279222.233141] LustreError: 193404:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 204 previous similar messages [279318.023836] LustreError: 193187:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0050: cli 5382169d-059c-4 claims 4218880 GRANT, real grant 2310144 [279318.038318] LustreError: 193187:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 1 previous similar message [279333.544742] LustreError: 193143:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST0050: cli 5382169d-059c-4 claims 4218880 GRANT, real grant 0 [279333.558704] LustreError: 
193143:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 2 previous similar messages [279339.007236] Lustre: 120935:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1618999813/real 1618999815] req@ffff8bda2aaff980 x1697353991912512/t0(0) o104->oak-OST0040@10.51.5.21@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1618999986 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [279361.780834] Lustre: oak-OST003c: Client eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3) reconnecting [279361.790154] Lustre: Skipped 2319 previous similar messages [279410.709983] Lustre: oak-OST0032: Connection restored to 8695371d-162f-4 (at 10.50.7.24@o2ib2) [279410.719626] Lustre: Skipped 2865 previous similar messages [279466.150686] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb85868400 [279466.162858] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb85868400 [279466.175018] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb85868400 [279466.187170] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce794c2800 [279466.199332] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce62b65400 [279466.211525] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb85869c00 [279466.223686] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be24bc42400 [279466.235848] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd68c865400 [279466.248001] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb85869c00 [279539.825886] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.52@o2ib3 for invalid MD 
0x1676da12a0d3b7fa.0xbc52c5a5 [279545.371381] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be70c14ec00 [279545.383582] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be23838e400 [279545.395748] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be23838e400 [279545.407930] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcec8121400 [279545.420079] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcec8121400 [279545.432245] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bce62b62c00 [279545.432574] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [279545.444409] LNet: 89091:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [279596.225484] LustreError: 220407:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1230971) req@ffff8be7255c2850 x1691357032177344/t0(0) o4->d55d324b-c685-4@10.51.6.4@o2ib3:177/0 lens 488/448 e 0 to 0 dl 1619000282 ref 1 fl Interpret:/0/0 rc 0/0 [279596.252074] LustreError: 220407:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 64 previous similar messages [279672.145143] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be140412000 [279672.157344] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd72c940400 [279672.169507] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd72c940400 [279672.181684] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bddaaf1fc00 [279672.193840] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd26aee4c00 [279672.206430] LNet: 182049:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [279672.206463] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [279672.206465] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [279709.731685] Lustre: 193037:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619000184/real 1619000191] req@ffff8bd8c7e2cc80 x1697353992236544/t0(0) o104->oak-OST0042@10.51.6.28@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619000357 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [279712.848579] Lustre: 143291:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619000187/real 1619000187] req@ffff8bdfccd10900 x1697353992238528/t0(0) o105->oak-OST005c@10.51.3.3@o2ib3:15/16 lens 360/224 e 0 to 1 dl 1619000360 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [279712.879190] Lustre: 143291:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [279717.163500] Lustre: 216158:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619000191/real 1619000191] req@ffff8be06b66e300 x1697353992239296/t0(0) o105->oak-OST0042@10.51.6.28@o2ib3:15/16 lens 360/224 e 0 to 1 dl 1619000364 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [279721.248922] Lustre: oak-OST0054: Bulk IO write error with d0879b94-e3f0-4 (at 10.51.2.59@o2ib3), client will retry: rc = -110 [279721.261688] Lustre: Skipped 66 previous similar messages [279725.367586] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71ef17c00 [279725.379758] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd3ef081400 [279725.391916] 
LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be742a97000 [279725.404070] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bab26b9ec00 [279725.416225] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bab26b9ec00 [279725.428385] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0f3a00000 [279725.440554] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0f3a00000 [279725.452715] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be742a97000 [279725.452717] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [279725.452719] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [279725.452722] LNet: 202714:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [279725.503424] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71ef17c00 [279732.243407] Lustre: oak-OST0036: haven't heard from client 5bacb1f7-c044-4 (at 10.51.4.60@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bd522823000, cur 1619000380 expire 1619000230 last 1619000153 [279738.264308] Lustre: oak-OST0058: haven't heard from client 1d6f2d21-9fc7-4 (at 10.51.1.44@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bd4dcf79800, cur 1619000386 expire 1619000236 last 1619000159 [279743.241276] Lustre: oak-OST0032: haven't heard from client 1d6f2d21-9fc7-4 (at 10.51.1.44@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8be1f78fb800, cur 1619000391 expire 1619000241 last 1619000164 [279743.263621] Lustre: Skipped 1 previous similar message [279817.712186] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.6.18@o2ib3 ns: filter-oak-OST0038_UUID lock: ffff8bd485f10900/0xf81cb91ff7afbb8 lrc: 3/0,0 mode: PR/PR res: [0x1a40000400:0x2f10d9:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.6.18@o2ib3 remote: 0x8a66890d0dfdbd1f expref: 9 pid: 206659 timeout: 279823 lvb_type: 1 [279821.712058] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.2.69@o2ib3 ns: filter-oak-OST0058_UUID lock: ffff8bd3a4aaaf40/0xf81cb91ff7ac188 lrc: 4/0,0 mode: PR/PR res: [0x23b0f74:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400010020 nid: 10.51.2.69@o2ib3 remote: 0xa3486d345f340928 expref: 13 pid: 120935 timeout: 279827 lvb_type: 1 [279821.758382] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 1 previous similar message [279840.782661] Lustre: 192979:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619000315/real 1619000315] req@ffff8be2dc0c7980 x1697353992343104/t0(0) o104->oak-OST005c@10.51.4.45@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619000488 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [279840.813358] Lustre: 192979:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [279841.257927] LustreError: 227956:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be3a7341850 x1689651461414016/t0(0) o4->b4a05b29-fe90-4@10.51.5.60@o2ib3:480/0 lens 488/448 e 0 to 0 dl 1619000585 ref 1 fl Interpret:/0/0 rc 0/0 [279841.282981] LustreError: 227956:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 9 previous similar messages 
[279853.363793] LNet: 238533:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [279853.377285] LNet: 238533:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 4 previous similar messages [279853.388932] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcf976edc00 [279853.401111] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bc4c5fd1400 [279853.413264] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be7226f4c00 [279853.413300] LustreError: 220400:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bcf8d686850 x1689651461497472/t0(0) o4->b4a05b29-fe90-4@10.51.5.60@o2ib3:486/0 lens 488/448 e 0 to 0 dl 1619000591 ref 1 fl Interpret:/0/0 rc 0/0 [279853.413302] LustreError: 220400:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 23 previous similar messages [279871.710992] Lustre: oak-OST0040: Bulk IO read error with b91e7890-9c4a-4 (at 10.51.2.4@o2ib3), client will retry: rc -110 [279871.723330] Lustre: Skipped 296 previous similar messages [279893.022441] Lustre: 193065:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619000367/real 1619000367] req@ffff8bd4d1f9f980 x1697353992387392/t0(0) o104->oak-OST003c@10.51.3.57@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619000540 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [279893.022443] Lustre: 184633:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619000367/real 1619000367] req@ffff8bdf09964380 x1697353992387456/t0(0) o105->oak-OST003c@10.51.3.57@o2ib3:15/16 lens 360/224 e 0 to 1 dl 1619000540 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [279893.022446] Lustre: 184633:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [279896.296369] LustreError: 221474:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ 
truncated bulk READ 2097152(4194304) req@ffff8be907840050 x1685363988664576/t0(0) o3->d1983ed4-4e6e-4@10.51.2.47@o2ib3:480/0 lens 488/440 e 0 to 0 dl 1619000585 ref 1 fl Interpret:/0/0 rc 0/0 [279896.322673] LustreError: 221474:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 194 previous similar messages [279940.038760] LNet: 50606:0:(lib-move.c:976:lnet_post_send_locked()) Aborting message for 12345-10.0.2.216@o2ib5: LNetM[DE]Unlink() already called on the MD/ME. [279940.038778] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bd72c940400 [279940.054737] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8be7237a4400 [279940.054743] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8be7237a4400 [279940.066932] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bd600236800 [279940.066937] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bd600236800 [279940.115362] LNet: 50606:0:(lib-move.c:976:lnet_post_send_locked()) Skipped 5 previous similar messages [279940.125861] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bd72c940400 [279966.721158] Lustre: oak-OST005e: Client 33016c70-be50-803e-0431-7a51cafca4a6 (at 10.51.14.19@o2ib3) reconnecting [279966.721160] Lustre: oak-OST005a: Client 33016c70-be50-803e-0431-7a51cafca4a6 (at 10.51.14.19@o2ib3) reconnecting [279966.721161] Lustre: oak-OST0050: Client 33016c70-be50-803e-0431-7a51cafca4a6 (at 10.51.14.19@o2ib3) reconnecting [279966.721164] Lustre: Skipped 1017 previous similar messages [279966.721164] Lustre: Skipped 1018 previous similar messages [280010.818341] Lustre: oak-OST005c: Connection restored to 20486326-e3f6-4 (at 10.51.5.51@o2ib3) [280010.827981] Lustre: Skipped 1876 previous similar messages [280016.548390] LNet: 
50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.28@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xbce77a65 [280023.060396] Lustre: 206664:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619000497/real 1619000501] req@ffff8bb54d303a80 x1697353992497344/t0(0) o104->oak-OST005c@10.51.2.43@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619000670 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [280023.091058] Lustre: 206664:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [280097.199633] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff6a0) failed: 5 [280097.210065] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 27 previous similar messages [280097.210109] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [280097.210111] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [280097.210114] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bce62b60000 [280097.210372] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bce62b60000 [280097.210375] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bdf7c7c6400 [280097.210378] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8baab3797800 [280097.210386] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcb7ebbf400 [280097.210675] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be7314f9c00 [280097.210949] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc4c9599800 [280097.211198] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be7314f9c00 
[280097.302974] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [280097.302975] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [280097.302976] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 3 previous similar messages [280097.302978] LNet: 202714:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [280097.302980] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 4 previous similar messages [280109.098345] LustreError: 137-5: oak-OST0041_UUID: not available for connect from 10.51.2.36@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [280109.117761] LustreError: Skipped 13 previous similar messages [280241.702328] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.4.24@o2ib3 ns: filter-oak-OST0054_UUID lock: ffff8bd6a7f4a640/0xf81cb91ff7faa94 lrc: 3/0,0 mode: PR/PR res: [0x1e45a6c:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400010020 nid: 10.51.4.24@o2ib3 remote: 0xbc4ad59f8d2af96c expref: 6 pid: 187434 timeout: 280247 lvb_type: 1 [280258.131383] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb7ebbc800 [280258.143577] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd9ace90c00 [280258.155747] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc4c7bcc400 [280258.167911] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcb7ebbd400 [280258.180068] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0a1fcc000 [280258.192237] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71eec6400 [280258.204390] LNet: 85677:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [280258.204397] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [280258.204399] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 12 previous similar messages [280258.204400] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71eec6400 [280258.204409] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc4c7bca000 [280258.204414] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be96d3c0400 [280260.996906] Lustre: 143291:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619000735/real 1619000735] req@ffff8bb54d300480 x1697353992692736/t0(0) o105->oak-OST0052@10.51.4.34@o2ib3:15/16 lens 360/224 e 0 to 1 dl 1619000908 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [280261.027571] Lustre: 143291:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [280280.232558] Lustre: oak-OST0056: haven't heard from client 9aae8976-9e66-4 (at 10.51.2.34@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bd8f6c1f000, cur 1619000928 expire 1619000778 last 1619000701 [280321.410168] LustreError: 217350:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(2460402) req@ffff8bd3a732d850 x1688462532161920/t0(0) o4->96319bde-17a5-4@10.51.2.6@o2ib3:138/0 lens 488/448 e 0 to 0 dl 1619000998 ref 1 fl Interpret:/2/0 rc 0/0 [280321.410579] Lustre: oak-OST0042: Bulk IO write error with 96319bde-17a5-4 (at 10.51.2.6@o2ib3), client will retry: rc = -110 [280321.410581] Lustre: Skipped 55 previous similar messages [280321.455542] LustreError: 217350:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 79 previous similar messages [280329.129742] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be39f8f0c00 [280329.141968] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd73b151800 [280329.154132] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd73b151800 [280329.166319] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd73b151800 [280329.178521] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd73b151800 [280329.190805] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be6ccd32c00 [280329.203034] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bb467259c00 [280365.699497] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 111s: evicting client at 10.51.2.37@o2ib3 ns: filter-oak-OST0046_UUID lock: ffff8be19a3e7740/0xf81cb91ff80836a lrc: 4/0,0 mode: PW/PW res: [0x220af15:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4194303) flags: 0x60000400030020 nid: 10.51.2.37@o2ib3 remote: 0x3ba197f3cf7bd151 expref: 9 pid: 193000 timeout: 280371 lvb_type: 0 [280381.351831] 
LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbf3c92e000 [280381.364001] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd717b9f000 [280381.376173] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd717b9f000 [280381.388347] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be578f3b000 [280381.400533] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd4958fac00 [280381.412713] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd4958fac00 [280381.424871] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd77504bc00 [280381.437037] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [280381.437061] LNet: 182049:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [280381.465105] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 33 previous similar messages [280396.616182] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.52@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xbd6a7d0d [280485.125798] LNet: 238533:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [280485.139292] LNet: 238533:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 4 previous similar messages [280485.151829] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be08c195000 [280485.164014] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be715b57400 [280485.176179] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event 
type 5, status -103, desc ffff8bd35cf4e400 [280485.188377] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be61bbfb800 [280485.200555] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbb596ccc00 [280485.212719] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcb778b6800 [280485.224869] LustreError: 221486:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71cca0850 x1690604909097664/t0(0) o4->42d3440a-b89b-4@10.51.2.2@o2ib3:362/0 lens 488/448 e 0 to 0 dl 1619001222 ref 1 fl Interpret:/0/0 rc 0/0 [280485.224875] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd35cf4c800 [280485.224899] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd35cf4c800 [280485.224914] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbb596cd800 [280485.286653] LustreError: 221486:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 25 previous similar messages [280487.502515] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.1.2@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[280487.521830] LustreError: Skipped 1 previous similar message [280530.695648] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.4.63@o2ib3 ns: filter-oak-OST004c_UUID lock: ffff8be04d3f3f00/0xf81cb91ff8301b3 lrc: 3/0,0 mode: PR/PR res: [0x1000000400:0x325277:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.4.63@o2ib3 remote: 0x93706f32249ffa3 expref: 9 pid: 193068 timeout: 280536 lvb_type: 1 [280546.464721] LustreError: 193195:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bd657b96850 x1696860160333760/t0(0) o3->54153ede-ddc5-4@10.51.2.1@o2ib3:361/0 lens 488/440 e 0 to 0 dl 1619001221 ref 1 fl Interpret:/0/0 rc 0/0 [280546.464844] Lustre: oak-OST0040: Bulk IO read error with 54153ede-ddc5-4 (at 10.51.2.1@o2ib3), client will retry: rc -110 [280546.464846] Lustre: Skipped 647 previous similar messages [280546.509375] LustreError: 193195:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 581 previous similar messages [280554.413073] LustreError: 221402:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be0d0961850 x1688534824098176/t0(0) o3->0028e5c0-f60e-4@10.51.4.34@o2ib3:431/0 lens 488/440 e 0 to 0 dl 1619001291 ref 1 fl Interpret:/0/0 rc 0/0 [280554.438014] LustreError: 221402:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 34 previous similar messages [280567.602689] Lustre: oak-OST004a: Client 66bf6793-ef6e-d6d2-96ad-3432d26adbce (at 10.51.13.5@o2ib3) reconnecting [280567.614061] Lustre: Skipped 3044 previous similar messages [280611.701409] Lustre: oak-OST005c: Connection restored to a8a38ec7-d559-4 (at 10.50.7.5@o2ib2) [280611.710932] Lustre: Skipped 3409 previous similar messages [280628.346416] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd600232000 [280628.358602] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd505ec000 [280628.370774] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7b22f1800 [280628.382935] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7b22f1800 [280628.395077] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7b22f1800 [280628.407367] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be021091c00 [280788.976788] LustreError: 137-5: oak-OST0049_UUID: not available for connect from 10.51.6.21@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [280788.996214] LustreError: Skipped 3 previous similar messages [280858.183811] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb778b0400 [280858.195990] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd717b98c00 [280858.208141] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd2937cb800 [280858.220315] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd2937cb800 [280858.232480] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be021093c00 [280858.244643] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be021093c00 [280858.256806] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be4dd3c7800 [280858.256857] LNet: 167011:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [280858.286721] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, 
status -103, desc ffff8be4dd3c7800 [280858.298888] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcfd12d0400 [280858.311056] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcfd12d0400 [280858.323221] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be249abd000 [280858.335389] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be249abd000 [280858.347548] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb778b0400 [280921.548563] LustreError: 220409:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1232046) req@ffff8be4596f7850 x1695838692616064/t0(0) o4->eab9e2cf-af4c-4@10.51.15.1@o2ib3:733/0 lens 488/448 e 0 to 0 dl 1619001593 ref 1 fl Interpret:/0/0 rc 0/0 [280921.548632] Lustre: oak-OST003c: Bulk IO write error with eab9e2cf-af4c-4 (at 10.51.15.1@o2ib3), client will retry: rc = -110 [280921.548634] Lustre: Skipped 46 previous similar messages [280921.594002] LustreError: 220409:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 25 previous similar messages [281084.921568] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.65@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xbe5305ed [281084.938552] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 7 previous similar messages [281118.111008] LNet: 238533:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [281118.124501] LNet: 238533:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 2 previous similar messages [281118.135762] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcdee9da400 [281118.135794] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded 
retry count 0 [281118.135796] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [281118.170734] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd32630b000 [281118.182683] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd32630b000 [281118.194628] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd449b5b800 [281118.206575] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd32630ec00 [281118.218520] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd449b59000 [281118.230506] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd717b9c000 [281118.242474] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd2937cbc00 [281171.616573] LustreError: 227982:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8bcda365a850 x1697066830051904/t0(0) o3->c8940c7b-4cef-46f6-4f87-531b13f93658@10.51.6.25@o2ib3:233/0 lens 488/440 e 0 to 0 dl 1619001848 ref 1 fl Interpret:/0/0 rc 0/0 [281171.616893] Lustre: oak-OST005e: Bulk IO read error with 64ebd172-d79e-4 (at 10.51.13.3@o2ib3), client will retry: rc -110 [281171.616895] Lustre: Skipped 477 previous similar messages [281171.617253] LustreError: 211957:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be7136dd050 x1688437047642304/t0(0) o3->1edf0fa0-d190-4@10.51.2.10@o2ib3:242/0 lens 488/440 e 0 to 0 dl 1619001857 ref 1 fl Interpret:/0/0 rc 0/0 [281171.617255] LustreError: 211957:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 19 previous similar messages [281171.698855] LustreError: 227982:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 653 previous similar messages [281185.245956] Lustre: oak-OST0058: Client 668a2172-2080-4 (at 
10.51.3.70@o2ib3) reconnecting [281185.255287] Lustre: Skipped 2573 previous similar messages [281211.956752] LustreError: 217349:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd12afd5850 x1695838694278400/t0(0) o4->eab9e2cf-af4c-4@10.51.15.1@o2ib3:328/0 lens 488/448 e 0 to 0 dl 1619001943 ref 1 fl Interpret:/0/0 rc 0/0 [281211.981806] LustreError: 217349:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 49 previous similar messages [281212.645282] Lustre: oak-OST0050: Connection restored to 33016c70-be50-803e-0431-7a51cafca4a6 (at 10.51.14.19@o2ib3) [281212.645282] Lustre: oak-OST0044: Connection restored to 33016c70-be50-803e-0431-7a51cafca4a6 (at 10.51.14.19@o2ib3) [281212.645285] Lustre: Skipped 3200 previous similar messages [281212.675002] Lustre: Skipped 3 previous similar messages [281223.679502] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 112s: evicting client at 10.51.2.26@o2ib3 ns: filter-oak-OST0042_UUID lock: ffff8bda511198c0/0xf81cb91ff8caff9 lrc: 3/0,0 mode: PR/PR res: [0x1d4e21a:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400010020 nid: 10.51.2.26@o2ib3 remote: 0x2f6ab4e15b1bbf6a expref: 6 pid: 187434 timeout: 281229 lvb_type: 1 [281235.351718] LustreError: 137-5: oak-OST005f_UUID: not available for connect from 10.51.1.51@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[281235.371230] LustreError: Skipped 2 previous similar messages [281261.678617] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.2.64@o2ib3 ns: filter-oak-OST004c_UUID lock: ffff8bad7f955100/0xf81cb91ff8be340 lrc: 4/0,0 mode: PW/PW res: [0x1f44fb3:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->2998271) flags: 0x60000400030020 nid: 10.51.2.64@o2ib3 remote: 0x8303588636816674 expref: 8 pid: 193042 timeout: 281267 lvb_type: 0 [281264.678552] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.2.6@o2ib3 ns: filter-oak-OST0042_UUID lock: ffff8be6fcff3180/0xf81cb91ff782777 lrc: 4/0,0 mode: PW/PW res: [0x209297a:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->1232895) flags: 0x60000400010020 nid: 10.51.2.6@o2ib3 remote: 0x40a3ea08b27b49e9 expref: 9 pid: 75143 timeout: 281270 lvb_type: 0 [281287.565059] Lustre: 193060:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619001762/real 1619001763] req@ffff8bd9281c5100 x1697353994178112/t0(0) o104->oak-OST0034@10.51.2.64@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619001935 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [281287.595761] Lustre: 193060:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [281292.107677] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bac403ad400 [281292.119942] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be3e1abe400 [281292.132105] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be3e1abe400 [281292.144263] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcc952c9800 [281292.156426] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, 
status -103, desc ffff8bcc952c9800 [281292.168589] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bb9916f7000 [281292.180754] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bb9916f7000 [281292.180773] LNet: 179836:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [281292.180774] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [281292.220969] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bac403ad400 [281292.233128] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be1f4653800 [281292.245285] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be1f4653800 [281375.105206] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5fe600) failed: 5 [281375.105208] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5fe600) failed: 5 [281375.105211] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 2 previous similar messages [281375.105387] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bb54116ec00 [281375.105428] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [281375.105430] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [281375.105432] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bdd2eda4000 [281375.105980] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bdd2eda4000 [281375.106724] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcbfbb1ec00 [281375.107380] 
LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be5a4d1dc00 [281375.108059] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be5a4d1dc00 [281375.108131] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be4096f7c00 [281375.108289] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [281375.108291] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [281375.108545] LNet: 202714:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [281375.108831] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bda125a4c00 [281375.109548] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bda125a4c00 [281375.305899] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1611 previous similar messages [281460.538984] Lustre: 193173:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619001935/real 1619001935] req@ffff8bb88ea61f80 x1697353994319744/t0(0) o106->oak-OST005c@10.51.2.50@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1619002108 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [281460.569974] Lustre: 193173:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 4 previous similar messages [281462.261616] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.62@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xbec238dd [281462.278511] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 4 previous similar messages [281462.414336] LNet: 50609:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.2.50@o2ib3 portal 16 match 1697353994319680 offset 224 length 224: 4 [281468.698574] LNet: 
50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.60@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xbebe3915 [281468.715471] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 3 previous similar messages [281666.441300] Lustre: oak-OST0054: Bulk IO write error with 76eb6295-9d00-d7ab-8458-c8aac654030a (at 10.51.6.5@o2ib3), client will retry: rc = -110 [281666.455962] Lustre: Skipped 47 previous similar messages [281798.318377] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [281798.319957] Lustre: oak-OST005e: Client 4cee94e6-025c-589a-13ab-6c9ed337de31 (at 10.51.2.36@o2ib3) reconnecting [281798.319958] Lustre: Skipped 1018 previous similar messages [281798.349339] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 2 previous similar messages [281798.361137] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bca73fdac00 [281798.373298] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bca73fdac00 [281798.385482] LustreError: 220408:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be723c7f050 x1689674347765632/t0(0) o4->566f9fe9-38b0-4@10.51.5.10@o2ib3:160/0 lens 488/448 e 0 to 0 dl 1619002530 ref 1 fl Interpret:/0/0 rc 0/0 [281798.410909] LustreError: 220408:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 15 previous similar messages [281814.431276] Lustre: oak-OST0036: Connection restored to 166720a1-a472-4 (at 10.50.5.60@o2ib2) [281814.440911] Lustre: Skipped 1562 previous similar messages [281838.461927] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.51.2.68@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[281838.481343] LustreError: Skipped 2 previous similar messages [281846.741613] LustreError: 209392:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bd7cf438850 x1688497773268032/t0(0) o3->471f1f7a-e6b9-4@10.51.2.12@o2ib3:152/0 lens 488/440 e 0 to 0 dl 1619002522 ref 1 fl Interpret:/0/0 rc 0/0 [281846.741720] Lustre: oak-OST0040: Bulk IO read error with 2ea63f3f-5c00-4 (at 10.51.5.11@o2ib3), client will retry: rc -110 [281846.741721] Lustre: Skipped 503 previous similar messages [281846.741989] LustreError: 9691:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1507515) req@ffff8be4ec8ff050 x1689655311957504/t0(0) o4->a7561178-bc92-4@10.51.5.23@o2ib3:164/0 lens 488/448 e 0 to 0 dl 1619002534 ref 1 fl Interpret:/0/0 rc 0/0 [281846.741990] LustreError: 9691:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 25 previous similar messages [281846.823462] LustreError: 209392:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 479 previous similar messages [281886.317252] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be04fa90800 [281886.329443] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be04fa90800 [281886.341613] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be04fa90800 [281886.353772] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0f3a03c00 [281886.365944] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bddc89d6000 [281886.378203] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be5a5bdc400 [281886.390360] LNet: 202714:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [281886.390387] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be5a5bdc400 [281886.390393] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [281886.390395] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [281938.662869] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.6.20@o2ib3 ns: filter-oak-OST005a_UUID lock: ffff8bd7288a9440/0xf81cb91ff964b3e lrc: 4/0,0 mode: PR/PR res: [0x1700000400:0x42cc5d:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.6.20@o2ib3 remote: 0x46e07e7561b63fc7 expref: 14 pid: 193070 timeout: 281944 lvb_type: 1 [281938.710017] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 4 previous similar messages [281938.721164] LustreError: 193050:0:(ldlm_lockd.c:1351:ldlm_handle_enqueue0()) ### lock on destroyed export ffff8be2b58cdc00 ns: filter-oak-OST005a_UUID lock: ffff8bb536abc380/0xf81cb91ff9655cc lrc: 3/0,0 mode: --/PW res: [0x1700000400:0x42cc5d:0x0].0x0 rrc: 3 type: EXT [0->8191] (req 0->8191) flags: 0x50000000020000 nid: 10.51.6.20@o2ib3 remote: 0x46e07e7561b63fce expref: 14 pid: 193050 timeout: 0 lvb_type: 0 [281943.576935] LustreError: 192996:0:(ldlm_lockd.c:2366:ldlm_cancel_handler()) ldlm_cancel from 10.51.6.20@o2ib3 arrived at 1619002591 with bad export cookie 1117398010482392841 [281943.594479] LustreError: 192996:0:(ldlm_lockd.c:2366:ldlm_cancel_handler()) Skipped 2 previous similar messages [281955.315353] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be249ab9800 [281955.327515] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be715b56000 [281955.339687] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd1ff04bc00 
[281955.351874] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be9750bb400 [281955.364036] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0613ad000 [281955.376189] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0613ad000 [281955.388380] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be021091c00 [281955.400532] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be90ac27400 [281955.412691] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be1f4653400 [281955.424854] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be1f4653400 [281955.437019] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcb4c35e800 [281985.276474] Lustre: oak-OST005a: haven't heard from client 931eeada-f950-4 (at 10.51.2.16@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8be974874000, cur 1619002633 expire 1619002483 last 1619002406 [281985.298799] Lustre: Skipped 2 previous similar messages [282069.357874] LustreError: 217339:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be31a257050 x1690572669281024/t0(0) o3->931eeada-f950-4@10.51.2.16@o2ib3:440/0 lens 488/440 e 0 to 0 dl 1619002810 ref 1 fl Interpret:/0/0 rc 0/0 [282069.382830] LustreError: 217339:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 11 previous similar messages [282098.659124] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.4.56@o2ib3 ns: filter-oak-OST004c_UUID lock: ffff8bc28db00000/0xf81cb91ff98cd46 lrc: 3/0,0 mode: PR/PR res: [0x1000000400:0x3252f9:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.4.56@o2ib3 remote: 0xb45a61620b4d4edb expref: 9 pid: 193012 timeout: 282104 lvb_type: 1 [282115.845318] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.48@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xbf7e8c0d [282115.862217] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [282122.368571] Lustre: 216155:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619002597/real 1619002597] req@ffff8bdfccd12d00 x1697353994846592/t0(0) o105->oak-OST0040@10.51.2.15@o2ib3:15/16 lens 360/224 e 0 to 1 dl 1619002770 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [282122.399274] Lustre: 216155:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [282124.481515] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.4.48@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xbf7e8c35 [282124.498499] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [282136.546761] LNet: 
50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.5.71@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xbf847e6d [282232.738444] LNet: 50607:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.2.11@o2ib3 portal 16 match 1697353995061952 offset 224 length 224: 4 [282249.127674] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bca7efb8800 [282249.139858] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd36ae2ac00 [282249.152030] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd36ae2ec00 [282249.164191] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcde3ab6000 [282249.176349] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd2937ca400 [282249.188516] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd2937ca400 [282249.200668] LNet: 167011:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [282249.200678] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [282249.200680] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [282249.200683] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcf38a31c00 [282249.200701] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcf38a31c00 [282249.200721] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdd44640c00 [282249.200735] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcde3ab6000 [282296.849085] Lustre: oak-OST0036: Bulk IO write 
error with b4a05b29-fe90-4 (at 10.51.5.60@o2ib3), client will retry: rc = -110 [282296.861812] Lustre: Skipped 36 previous similar messages [282344.306376] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdad4f83400 [282344.318533] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be723092c00 [282344.330689] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be723092c00 [282344.342926] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcbb95da800 [282344.355080] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be715b54800 [282344.367248] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcbb95da800 [282344.379398] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be152e8d400 [282344.391550] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcbb95da800 [282392.652278] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.2.14@o2ib3 ns: filter-oak-OST004e_UUID lock: ffff8bcdb6a7a400/0xf81cb91ff9d8472 lrc: 3/0,0 mode: PR/PR res: [0x1200000400:0x344e6a:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.2.14@o2ib3 remote: 0x9c95471194d1e897 expref: 12 pid: 193010 timeout: 282398 lvb_type: 1 [282392.699360] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 1 previous similar message [282398.404192] Lustre: oak-OST0040: Client 5625528b-54b7-4 (at 10.51.2.67@o2ib3) reconnecting [282398.413552] Lustre: Skipped 2399 previous similar messages [282414.488618] Lustre: oak-OST003a: Connection restored to de57c027-a949-4 (at 10.51.5.55@o2ib3) [282414.498246] 
Lustre: Skipped 2935 previous similar messages [282416.651751] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 177s: evicting client at 10.51.2.60@o2ib3 ns: filter-oak-OST0032_UUID lock: ffff8bd7913da400/0xf81cb91ff7c70a2 lrc: 3/0,0 mode: PW/PW res: [0x205bf5e:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000480030020 nid: 10.51.2.60@o2ib3 remote: 0x568a504ff3a8eab0 expref: 6 pid: 189717 timeout: 282422 lvb_type: 0 [282424.302570] Lustre: 75150:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619002899/real 1619002899] req@ffff8bd7f36d9680 x1697353995232064/t0(0) o104->oak-OST003a@10.51.2.15@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619003072 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [282536.182096] Lustre: oak-OST003e: haven't heard from client 1c87de87-2358-4 (at 10.51.4.69@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be0a1fc9c00, cur 1619003184 expire 1619003034 last 1619002957 [282537.232731] Lustre: oak-OST0042: haven't heard from client 9b72cdcc-a3a2-4 (at 10.51.5.19@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bd7b92ae400, cur 1619003185 expire 1619003035 last 1619002958 [282537.255944] Lustre: oak-OST0048: Bulk IO read error with 1dcbe18d-7cb3-4 (at 10.51.6.9@o2ib3), client will retry: rc -110 [282537.268298] Lustre: Skipped 974 previous similar messages [282571.920486] LustreError: 193443:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be731aeb050 x1696859924564224/t0(0) o3->4c32b5fb-e821-4@10.51.2.65@o2ib3:133/0 lens 488/440 e 0 to 0 dl 1619003258 ref 1 fl Interpret:/0/0 rc 0/0 [282571.946903] LustreError: 193443:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 694 previous similar messages [282650.392435] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.43@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xc03abebd [282650.409331] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [282663.298103] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending_nocred)(waiting) [282663.312178] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 4 previous similar messages [282663.323382] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be4b4f80400 [282663.323390] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [282663.323392] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [282663.323393] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [282663.323395] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [282663.381179] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be719d81800 [282663.393125] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc 
ffff8be298c46c00 [282663.405072] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbfb169bc00 [282663.417225] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd36ae2ec00 [282663.429191] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be578f3b800 [282663.441157] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff240) failed: 5 [282663.441216] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd555713800 [282663.442045] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd555716400 [282663.475448] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 255 previous similar messages [282721.952145] LustreError: 193445:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be720755850 x1689654534734848/t0(0) o3->3dc6d22e-58b7-4@10.51.5.3@o2ib3:276/0 lens 488/440 e 0 to 0 dl 1619003401 ref 1 fl Interpret:/0/0 rc 0/0 [282721.960400] LustreError: 227993:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1508765) req@ffff8bd0edced850 x1689649933888384/t0(0) o4->6ab3f4aa-3003-4@10.51.5.16@o2ib3:279/0 lens 488/448 e 0 to 0 dl 1619003404 ref 1 fl Interpret:/0/0 rc 0/0 [282721.960402] LustreError: 227993:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 38 previous similar messages [282722.015195] LustreError: 193445:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 32 previous similar messages [282827.570117] LustreError: 221470:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bdf8325f850 x1696864344343616/t0(0) o4->4cee94e6-025c-589a-13ab-6c9ed337de31@10.51.2.36@o2ib3:436/0 lens 488/448 e 0 to 0 dl 1619003561 ref 1 fl Interpret:/0/0 rc 0/0 [282827.597191] LustreError: 221470:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 49 previous similar messages [282848.156391] 
Lustre: oak-OST004e: haven't heard from client d0879b94-e3f0-4 (at 10.51.2.59@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be3e1abd800, cur 1619003496 expire 1619003346 last 1619003269 [282854.166413] Lustre: oak-OST0034: haven't heard from client 5aa5e599-0371-4 (at 10.51.3.72@o2ib3) in 219 seconds. I think it's dead, and I am evicting it. exp ffff8bde2918f800, cur 1619003502 expire 1619003352 last 1619003283 [282854.188765] Lustre: Skipped 1 previous similar message [282863.149240] Lustre: oak-OST003c: haven't heard from client 51bad213-c990-4 (at 10.51.6.1@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bcbf53afc00, cur 1619003511 expire 1619003361 last 1619003284 [282863.171472] Lustre: Skipped 1 previous similar message [282890.938397] LustreError: 137-5: oak-OST003b_UUID: not available for connect from 10.51.2.27@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [282890.957839] LustreError: Skipped 10 previous similar messages [283008.065577] Lustre: oak-OST0042: Client 54153ede-ddc5-4 (at 10.51.2.1@o2ib3) reconnecting [283008.065579] Lustre: oak-OST003c: Client 54153ede-ddc5-4 (at 10.51.2.1@o2ib3) reconnecting [283008.065581] Lustre: Skipped 3000 previous similar messages [283015.189481] Lustre: oak-OST0056: Connection restored to 2cf3663e-0e4b-4 (at 10.50.5.47@o2ib2) [283015.199112] Lustre: Skipped 3354 previous similar messages [283272.105734] LustreError: 5996:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be38b945050 x1697066515441600/t0(0) o3->5382169d-059c-4@10.51.2.52@o2ib3:63/0 lens 488/440 e 0 to 0 dl 1619003943 ref 1 fl Interpret:/0/0 rc 0/0 [283272.105990] Lustre: oak-OST0052: Bulk IO read error with 5382169d-059c-4 (at 10.51.2.52@o2ib3), client will retry: rc -110 [283272.105992] Lustre: Skipped 291 previous similar messages [283272.150361] LustreError: 5996:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 
260 previous similar messages [283372.127044] LustreError: 193438:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1232645) req@ffff8bcfa6eff850 x1685116488091648/t0(0) o4->9b12e584-d591-4@10.51.12.20@o2ib3:187/0 lens 488/448 e 0 to 0 dl 1619004067 ref 1 fl Interpret:/0/0 rc 0/0 [283372.127166] Lustre: oak-OST0036: Bulk IO write error with 9b12e584-d591-4 (at 10.51.12.20@o2ib3), client will retry: rc = -110 [283372.127169] Lustre: Skipped 24 previous similar messages [283372.172086] LustreError: 193438:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 4 previous similar messages [283377.560418] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.12.20@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xc13bf425 [283381.901712] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.12.20@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xc13bf42d [283381.918765] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [283386.360577] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.12.20@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xc13c2175 [283406.057674] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [283406.071556] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be719d83400 [283406.083736] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bda65d94000 [283406.083766] LustreError: 227991:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be38f43c850 x1696948247607936/t0(0) o4->0cc1f06c-eda0-ebe4-f37b-9f106eca81f1@10.51.4.62@o2ib3:266/0 lens 488/448 e 0 to 0 dl 1619004146 ref 1 fl Interpret:/0/0 rc 0/0 [283406.123380] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc 
ffff8bd0b95eb000 [283406.135547] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd0b95eb000 [283406.147724] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce9991ac00 [283406.159902] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bda125a6800 [283406.172070] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bda125a6800 [283406.184252] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd316b27c00 [283525.958676] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bab05b8d000 [283525.970848] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbbb2273c00 [283525.983032] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbbb2273c00 [283525.995217] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be9797a9000 [283526.007362] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc4cb2c4c00 [283526.019545] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc4cb2c4c00 [283526.031692] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc4cb2c4c00 [283526.031882] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.214@o2ib5 failed: 5 [283526.031884] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [283526.043880] LNet: 85677:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.214@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [283526.043882] LNet: 85677:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) Skipped 1 previous similar 
message [283526.093058] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be9797a9000 [283531.395879] LustreError: 137-5: oak-OST004f_UUID: not available for connect from 10.51.1.71@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [283531.415295] LustreError: Skipped 10 previous similar messages [283548.625383] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.5.13@o2ib3 ns: filter-oak-OST005c_UUID lock: ffff8bd9a5c54380/0xf81cb91ffab2ea4 lrc: 4/0,0 mode: PR/PR res: [0x1740000400:0x40016b:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.5.13@o2ib3 remote: 0x41e390048eac2820 expref: 13 pid: 193050 timeout: 283554 lvb_type: 1 [283548.672511] LustreError: 193047:0:(ldlm_lockd.c:1351:ldlm_handle_enqueue0()) ### lock on destroyed export ffff8be66d413400 ns: filter-oak-OST005c_UUID lock: ffff8bd22ec7cc80/0xf81cb91ffab33b3 lrc: 3/0,0 mode: --/PW res: [0x1740000400:0x40016b:0x0].0x0 rrc: 4 type: EXT [0->8191] (req 0->8191) flags: 0x50000000020000 nid: 10.51.5.13@o2ib3 remote: 0x41e390048eac2827 expref: 13 pid: 193047 timeout: 0 lvb_type: 0 [283573.682768] Lustre: 125061:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619004048/real 1619004048] req@ffff8bdedc2e5a00 x1697353996118272/t0(0) o105->oak-OST003a@10.51.5.30@o2ib3:15/16 lens 360/224 e 0 to 1 dl 1619004221 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [283576.803721] Lustre: 193101:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619004051/real 1619004051] req@ffff8bc28e1d9b00 x1697353996118976/t0(0) o106->oak-OST005c@10.51.5.2@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1619004224 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [283586.325469] LustreError: 224954:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ 
Reconnect on bulk READ req@ffff8bcb7f1d3050 x1688687454088000/t0(0) o3->42e4ab3a-7964-4@10.51.2.69@o2ib3:451/0 lens 488/440 e 0 to 0 dl 1619004331 ref 1 fl Interpret:/0/0 rc 0/0 [283586.350407] LustreError: 224954:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 11 previous similar messages [283609.777616] Lustre: oak-OST004c: Client 7f5a03b0-a887-10eb-7386-d75d82cdd92b (at 10.51.13.20@o2ib3) reconnecting [283609.789106] Lustre: Skipped 637 previous similar messages [283615.454330] Lustre: oak-OST005c: Connection restored to 51e35692-a4e4-4 (at 10.50.8.71@o2ib2) [283615.463956] Lustre: Skipped 1075 previous similar messages [283631.421052] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.12.20@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xc188e83d [283632.053079] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7b22f6800 [283632.065248] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcec8120400 [283632.077400] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3ef083800 [283632.089569] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3ef083800 [283632.101762] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0c4cf8000 [283632.113920] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0c4cf8000 [283632.126067] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0c4cf8000 [283632.138224] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be1cde78800 [283704.154488] Lustre: oak-OST0040: haven't heard from client 6805eafa-ecee-4 (at 10.50.2.38@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bc5239b0400, cur 1619004352 expire 1619004202 last 1619004125 [283704.176835] Lustre: Skipped 2 previous similar messages [283972.764583] Lustre: oak-OST004e: Bulk IO write error with 1023b702-3df4-4 (at 10.51.6.20@o2ib3), client will retry: rc = -110 [283972.777308] Lustre: Skipped 23 previous similar messages [283995.403938] Lustre: oak-OST0040: Bulk IO read error with 61781ed1-b14e-4 (at 10.51.13.4@o2ib3), client will retry: rc -110 [283995.416395] Lustre: Skipped 76 previous similar messages [284002.181113] LNet: 50608:0:(lib-move.c:976:lnet_post_send_locked()) Aborting message for 12345-10.0.2.217@o2ib5: LNetM[DE]Unlink() already called on the MD/ME. [284002.197040] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8be586185400 [284169.062264] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [284169.075658] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 2 previous similar messages [284169.087891] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd85fdc8000 [284169.100047] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd85fdc8000 [284169.112241] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdd219cb800 [284169.124424] LustreError: 217340:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be7aa9e3050 x1684976665173184/t0(0) o4->8a1d869f-b707-4@10.51.3.8@o2ib3:270/0 lens 488/448 e 0 to 0 dl 1619004905 ref 1 fl Interpret:/0/0 rc 0/0 [284169.149759] LustreError: 217340:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 14 previous similar messages [284173.280857] LustreError: 137-5: oak-OST003b_UUID: not available for connect from 10.51.15.1@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[284173.300275] LustreError: Skipped 4 previous similar messages [284196.331855] LustreError: 9692:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(4194304) req@ffff8bcc31b16050 x1685035259570816/t0(0) o4->04a04758-274c-4@10.51.2.46@o2ib3:256/0 lens 488/448 e 0 to 0 dl 1619004891 ref 1 fl Interpret:/0/0 rc 0/0 [284196.357719] LustreError: 9692:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 19 previous similar messages [284216.284997] Lustre: oak-OST0038: Connection restored to 702d9553-ca91-4 (at 10.51.5.62@o2ib3) [284216.294621] Lustre: Skipped 1864 previous similar messages [284221.332573] LustreError: 5988:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8bcc4f979050 x1688695530759680/t0(0) o3->9aae8976-9e66-4@10.51.2.34@o2ib3:258/0 lens 488/440 e 0 to 0 dl 1619004893 ref 1 fl Interpret:/0/0 rc 0/0 [284221.358102] LustreError: 5988:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 82 previous similar messages [284227.502540] Lustre: oak-OST0042: Client ed1ad69c-9415-7f56-f757-f4610de93535 (at 10.51.6.13@o2ib3) reconnecting [284227.513908] Lustre: Skipped 1453 previous similar messages [284245.038343] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600ba0) failed: 5 [284245.039169] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71ff6c800 [284245.049014] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [284245.049016] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 4 previous similar messages [284245.049017] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcec8125c00 [284245.049020] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcdb9094400 [284245.049023] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcec8127000 [284245.049026] 
LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be723e29c00 [284245.049028] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be4096f2400 [284245.049031] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be66d411000 [284245.049033] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be66d411000 [284245.167327] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 212 previous similar messages [284336.143571] Lustre: oak-OST0036: haven't heard from client d5ed232a-0a4c-4 (at 10.51.4.10@o2ib3) in 202 seconds. I think it's dead, and I am evicting it. exp ffff8bcef6f95c00, cur 1619004984 expire 1619004834 last 1619004782 [284339.578715] LustreError: 224962:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be71d7ec050 x1691355877084288/t0(0) o4->1dcbe18d-7cb3-4@10.51.6.9@o2ib3:450/0 lens 488/448 e 0 to 0 dl 1619005085 ref 1 fl Interpret:/0/0 rc 0/0 [284339.603705] LustreError: 224962:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 7 previous similar messages [284389.258521] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcef6f90800 [284389.270703] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcbb95dd800 [284389.282872] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd0802ae800 [284389.295037] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be1f78fdc00 [284389.307208] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd605b8cc00 [284389.319367] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd605b8cc00 [284389.331530] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event 
type 5, status -103, desc ffff8be71b0d0c00 [284389.343721] LNet: 85677:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [284389.343726] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71b0d0c00 [284389.343742] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71b0d0c00 [284389.343754] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [284389.343756] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 16 previous similar messages [284389.343765] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71b0d0c00 [284389.343770] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [284389.343772] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 1 previous similar message [284389.343786] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd0802ae800 [284436.257393] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be670981800 [284436.269592] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be670981800 [284436.281893] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be2e048ac00 [284534.973609] LustreError: 211957:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli 266ca40f-6c8b-4 claims 1257472 GRANT, real grant 274432 [284556.979874] Lustre: 193156:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619005031/real 1619005031] req@ffff8bc28e1d8000 x1697353996811648/t0(0) o106->oak-OST0034@10.51.2.39@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1619005204 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [284559.245398] 
LustreError: 193449:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli 266ca40f-6c8b-4 claims 1282048 GRANT, real grant 0 [284563.552308] LNet: 50608:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.5.20@o2ib3 portal 16 match 1697353996947968 offset 224 length 224: 4 [284580.103681] Lustre: oak-OST0042: haven't heard from client 0524aa72-f9f7-4 (at 10.51.4.47@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be586187400, cur 1619005228 expire 1619005078 last 1619005001 [284596.389945] Lustre: oak-OST0058: Bulk IO read error with 61781ed1-b14e-4 (at 10.51.13.4@o2ib3), client will retry: rc -110 [284596.402376] Lustre: Skipped 60 previous similar messages [284597.620626] LustreError: 221414:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003e: cli 266ca40f-6c8b-4 claims 2490368 GRANT, real grant 0 [284598.265982] Lustre: oak-OST0058: Bulk IO write error with 571c5cbe-9605-4 (at 10.51.6.22@o2ib3), client will retry: rc = -110 [284598.278708] Lustre: Skipped 69 previous similar messages [284601.600835] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 172s: evicting client at 10.51.4.37@o2ib3 ns: filter-oak-OST0052_UUID lock: ffff8be70aa75c40/0xf81cb91ffa8e2ff lrc: 3/0,0 mode: PW/PW res: [0x23035f1:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000480010020 nid: 10.51.4.37@o2ib3 remote: 0x476c9bc63079f535 expref: 9 pid: 193047 timeout: 284607 lvb_type: 0 [284609.006491] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.6.25@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xc2f58fbd [284631.155253] Lustre: oak-OST0044: haven't heard from client f0643f46-4acf-4 (at 10.51.2.41@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bc4c2642400, cur 1619005279 expire 1619005129 last 1619005052 [284743.249803] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600820) failed: 5 [284743.250036] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd717b9d400 [284743.250050] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [284743.250052] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 4 previous similar messages [284743.250054] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcb2f286c00 [284743.260662] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcd1bd32800 [284743.260666] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 3, status -5, desc ffff8bcf99585000 [284743.260671] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcd1bd32800 [284743.260676] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcdb9092c00 [284743.260678] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be313d7ac00 [284743.260687] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71fdbf000 [284743.260690] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcf6d2e3000 [284743.390688] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 368 previous similar messages [284796.474012] LustreError: 217349:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be71b032850 x1690821033800512/t0(0) o3->aa49d6cf-64e9-4@10.51.5.71@o2ib3:93/0 lens 488/440 e 0 to 0 dl 1619005483 ref 1 fl Interpret:/0/0 rc 0/0 [284796.499255] LustreError: 217349:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 22 previous similar messages [284817.950325] 
Lustre: oak-OST004a: Connection restored to b1686e50-92e8-4 (at 10.49.26.33@o2ib1) [284817.960060] Lustre: Skipped 3405 previous similar messages [284828.296664] Lustre: oak-OST003c: Client 9845fb77-4260-4785-8e45-59a03871912a (at 10.51.1.71@o2ib3) reconnecting [284828.308026] Lustre: Skipped 3032 previous similar messages [284836.024471] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [284836.037009] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 4 previous similar messages [284836.048186] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5fe7c0) failed: 5 [284836.048231] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [284836.048233] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [284836.048236] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be0a645e800 [284836.058952] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be0a645e800 [284836.059232] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be9797ac800 [284836.059526] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd9e4131800 [284836.059823] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 3, status -5, desc ffff8be16e1e9000 [284836.059828] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be9797ac800 [284836.059830] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcb2f286c00 [284836.165081] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1040 previous similar messages [284896.496944] LustreError: 221479:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) 
req@ffff8be71b348850 x1685031602656640/t0(0) o3->0507d702-6525-4@10.51.2.51@o2ib3:179/0 lens 488/440 e 0 to 0 dl 1619005569 ref 1 fl Interpret:/0/0 rc 0/0 [284896.497825] LustreError: 221402:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1252954(2301530) req@ffff8bcc4e08b850 x1689659537577856/t0(0) o4->1d13e74c-2452-4@10.51.4.61@o2ib3:183/0 lens 488/448 e 0 to 0 dl 1619005573 ref 1 fl Interpret:/0/0 rc 0/0 [284896.497827] LustreError: 221402:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 58 previous similar messages [284896.560995] LustreError: 221479:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 46 previous similar messages [284940.569904] LustreError: 193140:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd50302d050 x1696866328349248/t0(0) o4->ede94a1a-b345-30c6-e4bb-52a3a03e8e5f@10.51.6.18@o2ib3:290/0 lens 488/448 e 0 to 0 dl 1619005680 ref 1 fl Interpret:/0/0 rc 0/0 [284940.596979] LustreError: 193140:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 5 previous similar messages [284963.022093] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcbf53a9400 [284963.034347] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be97453f800 [284963.046510] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0a645e800 [284963.058673] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcbb95d8800 [284963.070824] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcbb95d8800 [284963.082998] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be1e80cac00 [284963.095162] LNet: 1245:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [284963.095163] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be1e80cac00 [284963.095193] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [284963.095195] LNet: 50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 15 previous similar messages [284963.095220] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd85fdc8800 [284979.591955] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.4.33@o2ib3 ns: filter-oak-OST0046_UUID lock: ffff8bd2b57ce300/0xf81cb91ffaf01f5 lrc: 4/0,0 mode: PW/PW res: [0x220af45:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4194303) flags: 0x60000400030020 nid: 10.51.4.33@o2ib3 remote: 0xe1814cf08e21379b expref: 9 pid: 4534 timeout: 284947 lvb_type: 0 [284980.980261] LustreError: 217338:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST005e: cli f71d1188-d01c-9fa0-935c-a63ad652da8a claims 4218880 GRANT, real grant 1224704 [285003.589374] Lustre: 193048:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619005478/real 1619005478] req@ffff8bdc3922d100 x1697353997134144/t0(0) o104->oak-OST004e@10.51.4.33@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619005651 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [285014.821719] LustreError: 221481:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST005e: cli f71d1188-d01c-9fa0-935c-a63ad652da8a claims 2555904 GRANT, real grant 0 [285014.837652] LustreError: 221481:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 2 previous similar messages [285037.153850] Lustre: oak-OST0034: haven't heard from client 8a1d869f-b707-4 (at 10.51.3.8@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bca76b45800, cur 1619005685 expire 1619005535 last 1619005458 [285096.243177] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be716536400 [285096.255339] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdef1bf1800 [285096.267514] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7237cb400 [285096.279709] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdc3d94e800 [285096.291868] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be75703d000 [285096.304026] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdb091f3800 [285096.316177] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7237cb400 [285105.589018] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.4.9@o2ib3 ns: filter-oak-OST005a_UUID lock: ffff8bd3610edc40/0xf81cb91ffafda3f lrc: 4/0,0 mode: PW/PW res: [0x21d150d:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4194303) flags: 0x60000400030020 nid: 10.51.4.9@o2ib3 remote: 0xbe8447c47144b7f1 expref: 7 pid: 187434 timeout: 285111 lvb_type: 0 [285105.633825] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 1 previous similar message [285131.722388] Lustre: 193059:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619005606/real 1619005606] req@ffff8bd107893a80 x1697353997841088/t0(0) o104->oak-OST0032@10.51.4.68@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619005779 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [285139.212557] Lustre: oak-OST005e: haven't heard from client 1e90ddc0-ff8c-4 (at 10.51.4.39@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bcee8d19000, cur 1619005787 expire 1619005637 last 1619005560 [285204.742071] LNet: 50606:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.2.40@o2ib3 portal 16 match 1697353998029568 offset 224 length 224: 4 [285204.758100] LNet: 50606:0:(lib-move.c:3829:lnet_parse_put()) Skipped 1 previous similar message [285221.574468] Lustre: oak-OST0040: Bulk IO read error with 61781ed1-b14e-4 (at 10.51.13.4@o2ib3), client will retry: rc -110 [285221.587258] Lustre: Skipped 83 previous similar messages [285419.719515] Lustre: oak-OST0040: Connection restored to 69118775-1b09-4 (at 10.50.5.6@o2ib2) [285419.729040] Lustre: Skipped 2450 previous similar messages [285423.234315] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be3de28d800 [285423.246478] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd0802aa000 [285423.258619] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcdb9091400 [285423.270777] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcdb9091400 [285423.282930] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd85fdca000 [285423.295082] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcdb9097400 [285423.307242] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd403a22400 [285423.319402] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be6c89fec00 [285423.331732] LustreError: 221409:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be757079050 x1689674378668736/t0(0) o4->566f9fe9-38b0-4@10.51.5.10@o2ib3:19/0 lens 488/448 e 0 to 0 dl 1619006164 ref 1 fl Interpret:/0/0 rc 0/0 [285423.357060] LustreError: 
221409:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 17 previous similar messages [285423.368088] Lustre: oak-OST0056: Bulk IO write error with 566f9fe9-38b0-4 (at 10.51.5.10@o2ib3), client will retry: rc = -110 [285423.380806] Lustre: Skipped 28 previous similar messages [285474.680647] Lustre: oak-OST0048: Client bc175ba0-453f-4 (at 10.51.1.25@o2ib3) reconnecting [285474.689978] Lustre: Skipped 2105 previous similar messages [285780.225451] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [285780.237973] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 3 previous similar messages [285780.250557] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd308b77400 [285780.262712] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be39dc66000 [285780.274869] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd738a6c000 [285780.287029] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd738a6c000 [285780.299192] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd1b6136c00 [285780.311348] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd26aee6000 [285780.323502] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd403a20400 [285780.335654] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd403a20400 [285846.712264] Lustre: oak-OST0040: Bulk IO read error with 99edcc35-9e6a-4 (at 10.51.2.64@o2ib3), client will retry: rc -110 [285846.712275] LustreError: 193428:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(8192) req@ffff8bd8b5dfc050 x1685212529270912/t0(0) o3->fac849c2-8ac8-4@10.51.2.39@o2ib3:376/0 lens 488/440 e 0 
to 0 dl 1619006521 ref 1 fl Interpret:/0/0 rc 0/0 [285846.712277] LustreError: 193428:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 38 previous similar messages [285846.712352] LustreError: 193395:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(605) req@ffff8be71bcc0850 x1684946109119616/t0(0) o4->831437d2-6335-4@10.51.2.61@o2ib3:377/0 lens 488/448 e 0 to 0 dl 1619006522 ref 1 fl Interpret:/0/0 rc 0/0 [285846.712353] LustreError: 193395:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 13 previous similar messages [285846.797673] Lustre: Skipped 29 previous similar messages [285949.597271] Lustre: 193153:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619006424/real 1619006424] req@ffff8bc3c5e81680 x1697353998494592/t0(0) o106->oak-OST003e@10.51.12.20@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1619006597 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [285952.676208] Lustre: 189036:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619006427/real 1619006427] req@ffff8bdda8ae9f80 x1697353998495680/t0(0) o106->oak-OST005e@10.51.2.52@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1619006600 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [285956.705120] LustreError: 209395:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bdc8af69850 x1696660706481600/t0(0) o4->f71d1188-d01c-9fa0-935c-a63ad652da8a@10.51.12.21@o2ib3:554/0 lens 488/448 e 0 to 0 dl 1619006699 ref 1 fl Interpret:/0/0 rc 0/0 [285956.732297] LustreError: 209395:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 8 previous similar messages [285961.499019] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.12.21@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xc4d1662d [286019.810072] Lustre: oak-OST0042: Connection restored to 03d6cc0c-b2d2-4 (at 10.51.3.55@o2ib3) [286019.819685] Lustre: Skipped 922 previous similar messages [286126.132133] Lustre: oak-OST003e: Client 
7e78116f-4212-7e45-91f6-fb2ef0ac146a (at 10.51.4.9@o2ib3) reconnecting [286126.143475] Lustre: Skipped 648 previous similar messages [286184.485766] Lustre: oak-OST0038: Bulk IO write error with 11c92c9a-5a17-4 (at 10.51.2.27@o2ib3), client will retry: rc = -110 [286184.498544] Lustre: Skipped 10 previous similar messages [286462.986539] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [286463.000546] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdcd1699c00 [286463.012723] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bca6cf8e400 [286463.024911] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bca6cf8e400 [286463.037080] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be722754c00 [286463.049262] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd4958fe400 [286463.061436] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd4958fe400 [286463.073600] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce46dbdc00 [286463.085766] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce46dbdc00 [286521.863873] LustreError: 193102:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bcda365e050 x1689650637275200/t0(0) o3->96aa5aef-bf65-4@10.51.5.48@o2ib3:299/0 lens 488/440 e 0 to 0 dl 1619007199 ref 1 fl Interpret:/0/0 rc 0/0 [286521.863875] LustreError: 193448:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bd750f07050 x1689654546771968/t0(0) o3->3dc6d22e-58b7-4@10.51.5.3@o2ib3:299/0 lens 488/440 e 0 to 0 dl 1619007199 ref 1 fl Interpret:/0/0 rc 0/0 [286521.863879] LustreError: 
193448:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 12 previous similar messages [286521.864260] Lustre: oak-OST0040: Bulk IO read error with 96aa5aef-bf65-4 (at 10.51.5.48@o2ib3), client will retry: rc -110 [286521.864262] Lustre: Skipped 9 previous similar messages [286521.943715] LustreError: 193102:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [286548.769836] LustreError: 137-5: oak-OST0049_UUID: not available for connect from 10.51.15.7@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [286548.789256] LustreError: Skipped 1 previous similar message [286593.206936] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bddaaf1bc00 [286593.219122] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bddaaf1bc00 [286593.231286] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be1cef9a800 [286593.243450] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be1cef9a800 [286593.255631] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be90ac26800 [286593.267790] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd717b9b400 [286593.279948] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd717b9b400 [286593.292116] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be760d48000 [286593.292119] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [286593.292120] LNet: 50606:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [286593.292157] LNet: 179836:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, 
queue_depth: 8/8, max_frags: 256/256 [286593.342805] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0d2d62400 [286593.354956] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcf65158400 [286622.166697] Lustre: oak-OST005c: Connection restored to 341ff9d8-ce51-a34b-3b59-737651e19da4 (at 10.51.2.29@o2ib3) [286622.178352] Lustre: Skipped 487 previous similar messages [286646.893421] LustreError: 227996:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bd02b3a1050 x1689657180826304/t0(0) o3->73ed3aec-5286-4@10.51.5.14@o2ib3:428/0 lens 488/440 e 0 to 0 dl 1619007328 ref 1 fl Interpret:/0/0 rc 0/0 [286646.893427] LustreError: 193428:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(4194304) req@ffff8be719e47050 x1689711623882816/t0(0) o4->19d08a37-93d2-4@10.51.5.30@o2ib3:428/0 lens 488/448 e 0 to 0 dl 1619007328 ref 1 fl Interpret:/0/0 rc 0/0 [286646.893429] LustreError: 193428:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 6 previous similar messages [286646.957532] LustreError: 227996:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 249 previous similar messages [286654.850056] LustreError: 137-5: oak-OST003d_UUID: not available for connect from 10.51.4.69@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[286729.891247] Lustre: oak-OST005a: Client 438da15f-c993-4 (at 10.51.0.68@o2ib3) reconnecting [286729.900577] Lustre: Skipped 179 previous similar messages [286760.625219] LustreError: 227919:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be7256f9850 x1689649561205888/t0(0) o3->aa63d6a0-4905-4@10.51.4.48@o2ib3:591/0 lens 488/440 e 0 to 0 dl 1619007491 ref 1 fl Interpret:/0/0 rc 0/0 [286760.650193] LustreError: 227919:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 13 previous similar messages [286807.008314] LustreError: 137-5: oak-OST003b_UUID: not available for connect from 10.51.1.9@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [286807.027633] LustreError: Skipped 1 previous similar message [286831.978652] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdf7c7c2000 [286831.990874] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdf7c7c2000 [286832.003041] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd1bd32800 [286832.015232] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd1bd32800 [286832.027416] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be3a82c8c00 [286832.039596] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdd38471800 [286832.051756] LNet: 179836:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [286832.051757] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3435be400 [286832.051789] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3435be400 [286832.051808] LNet: 
50608:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [286832.051812] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdd38471800 [286832.051831] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be3a82c8c00 [286832.051847] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be670981400 [286832.051859] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be670981400 [286832.052113] Lustre: oak-OST0054: Bulk IO write error with 4b6b0473-d368-4 (at 10.51.3.12@o2ib3), client will retry: rc = -110 [286832.052114] Lustre: Skipped 17 previous similar messages [286949.975705] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcbbb509400 [286949.976869] LNet: 85677:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [286949.976876] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [286949.976878] LNet: 50609:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [286950.026486] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcf6515ac00 [286972.545364] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.14.17@o2ib3 ns: filter-oak-OST0042_UUID lock: ffff8bd0aa841b00/0xf81cb91ffb7afaa lrc: 3/0,0 mode: PW/PW res: [0x20929cd:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4194303) flags: 0x60000400030020 nid: 10.51.14.17@o2ib3 remote: 0x411816e89365d936 expref: 7 pid: 187434 timeout: 286978 lvb_type: 0 [286972.590510] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 4 previous similar messages 
[286974.545211] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.5.13@o2ib3 ns: filter-oak-OST003c_UUID lock: ffff8be6af869b00/0xf81cb91ffb7f293 lrc: 3/0,0 mode: PR/PR res: [0x900000400:0x3195d8:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.5.13@o2ib3 remote: 0x41e390048eaceb1d expref: 9 pid: 193065 timeout: 286980 lvb_type: 1 [286974.592110] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 1 previous similar message [286974.603256] LustreError: 50065:0:(client.c:1187:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff8bb93d705580 x1697353999570880/t0(0) o105->oak-OST003c@10.51.5.13@o2ib3:15/16 lens 360/224 e 0 to 0 dl 0 ref 1 fl Rpc:/0/ffffffff rc 0/-1 [286998.251706] Lustre: 192983:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619007473/real 1619007473] req@ffff8bcb4c91c800 x1697353999470144/t0(0) o104->oak-OST005a@10.51.5.6@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619007646 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [286998.282304] Lustre: 192983:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [287000.479849] LNet: 50609:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.4.48@o2ib3 portal 16 match 1697353999586816 offset 224 length 224: 4 [287002.471570] Lustre: 187433:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619007477/real 1619007477] req@ffff8bdcee14de80 x1697353999470592/t0(0) o106->oak-OST004e@10.51.2.50@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1619007650 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [287002.502242] Lustre: 187433:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [287005.283729] LNet: 50608:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.2.50@o2ib3 portal 16 match 1697353999588800 offset 224 
length 224: 4 [287202.969502] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [287202.982888] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 3 previous similar messages [287202.995089] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd36ae2e800 [287203.007252] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd36ae2e800 [287203.019435] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd403a21400 [287203.031592] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd403a21400 [287203.043763] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be140412800 [287203.055955] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bda0b0e4400 [287203.068125] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bda0b0e4400 [287203.080329] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be3d8683400 [287203.092484] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be5a4d19c00 [287203.112403] LustreError: 221488:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71aa12050 x1685027091254144/t0(0) o4->99edcc35-9e6a-4@10.51.2.64@o2ib3:281/0 lens 488/448 e 0 to 0 dl 1619007936 ref 1 fl Interpret:/0/0 rc 0/0 [287203.137842] LustreError: 221488:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 15 previous similar messages [287222.034965] Lustre: oak-OST0040: Bulk IO read error with f71d1188-d01c-9fa0-935c-a63ad652da8a (at 10.51.12.21@o2ib3), client will retry: rc -110 [287222.049529] Lustre: Skipped 486 previous similar messages [287222.250908] Lustre: oak-OST0052: 
Connection restored to 7fca2a32-f1a9-4 (at 10.51.3.19@o2ib3) [287222.260528] Lustre: Skipped 2144 previous similar messages [287247.035833] LustreError: 227986:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bd0edced050 x1697586573124160/t0(0) o3->74cfd00f-ce2b-6f66-6c9e-b2a137740d54@10.51.2.15@o2ib3:263/0 lens 488/440 e 0 to 0 dl 1619007918 ref 1 fl Interpret:/0/0 rc 0/0 [287247.035982] LustreError: 224959:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(2462039) req@ffff8bd340fd7850 x1685116511956160/t0(0) o4->9b12e584-d591-4@10.51.12.20@o2ib3:266/0 lens 488/448 e 0 to 0 dl 1619007921 ref 1 fl Interpret:/0/0 rc 0/0 [287247.035983] LustreError: 224959:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 25 previous similar messages [287247.102139] LustreError: 227986:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 255 previous similar messages [287337.890980] Lustre: oak-OST0040: Client f71d1188-d01c-9fa0-935c-a63ad652da8a (at 10.51.12.21@o2ib3) reconnecting [287337.902443] Lustre: Skipped 1911 previous similar messages [287351.536400] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.4.54@o2ib3 ns: filter-oak-OST005c_UUID lock: ffff8be60e539440/0xf81cb91ffb62534 lrc: 4/0,0 mode: PW/PW res: [0x215371b:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->1232895) flags: 0x60000400010020 nid: 10.51.4.54@o2ib3 remote: 0x9f90b78f094b4b8 expref: 10 pid: 4534 timeout: 287320 lvb_type: 0 [287371.357959] Lustre: 192982:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619007846/real 1619007846] req@ffff8bcf19aaec00 x1697353999720832/t0(0) o104->oak-OST0034@10.51.4.22@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619008019 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [287371.388652] Lustre: 192982:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [287521.962823] 
LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bda9b19d800 [287521.975046] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bda9b19d800 [287521.987205] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd26aee2c00 [287521.999364] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd26aee2c00 [287522.011521] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be32b114800 [287522.023690] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd9e4135000 [287522.023977] Lustre: oak-OST0040: Bulk IO write error with 8a1d869f-b707-4 (at 10.51.3.8@o2ib3), client will retry: rc = -110 [287522.023979] Lustre: Skipped 29 previous similar messages [287522.048462] LNet: 85677:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [287522.072174] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be283f18000 [287522.084312] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be283f18000 [287522.085408] LustreError: 221477:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be5a1bd8050 x1695844246415680/t0(0) o3->330d404b-804c-4@10.51.15.3@o2ib3:610/0 lens 488/440 e 0 to 0 dl 1619008265 ref 1 fl Interpret:/0/0 rc 0/0 [287522.085410] LustreError: 221477:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 9 previous similar messages [287522.132065] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd6d739a000 [287522.144217] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd6d739a000 [287522.156368] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be4ad55b000 [287522.168529] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be70d61cc00 [287522.180690] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be4ad55b000 [287522.192839] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd556863c00 [287632.529862] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 112s: evicting client at 10.51.12.22@o2ib3 ns: filter-oak-OST0040_UUID lock: ffff8bb25a8fa1c0/0xf81cb91ffbab89c lrc: 3/0,0 mode: PR/PR res: [0x1cd8afc:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400010020 nid: 10.51.12.22@o2ib3 remote: 0xa022db4517afb097 expref: 8 pid: 193042 timeout: 287638 lvb_type: 1 [287632.576314] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 2 previous similar messages [287692.935416] Lustre: 192983:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619008167/real 1619008167] req@ffff8bac371c5100 x1697353999964416/t0(0) o104->oak-OST003c@10.51.2.65@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619008340 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [287692.966219] Lustre: 192983:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 3 previous similar messages [287706.042209] Lustre: oak-OST004a: haven't heard from client bb6977be-fbc5-4 (at 10.51.4.68@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bd600232000, cur 1619008354 expire 1619008204 last 1619008127 [287719.073319] Lustre: oak-OST003c: haven't heard from client a28c7102-5eb9-4 (at 10.51.2.57@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8be977274800, cur 1619008367 expire 1619008217 last 1619008140 [287724.050097] Lustre: oak-OST0034: haven't heard from client a7ebb784-0f8b-4 (at 10.51.6.23@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bde49513000, cur 1619008372 expire 1619008222 last 1619008145 [287724.072422] Lustre: Skipped 1 previous similar message [287750.571372] LustreError: 137-5: oak-OST005f_UUID: not available for connect from 10.51.1.36@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [287750.590858] LustreError: Skipped 4 previous similar messages [287804.305915] LustreError: 137-5: oak-OST0047_UUID: not available for connect from 10.51.5.60@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [287822.838241] Lustre: oak-OST0058: Connection restored to 330d404b-804c-4 (at 10.51.15.3@o2ib3) [287822.847871] Lustre: Skipped 3338 previous similar messages [287859.632551] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.12.20@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xc783847d [287859.649545] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [287889.218491] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [287889.231036] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 1 previous similar message [287889.242166] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5fed00) failed: 5 [287889.252583] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 16 previous similar messages [287889.252947] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [287889.252948] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 5 previous similar messages [287889.252951] LustreError: 
50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd7549d6c00 [287889.253231] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd7549d6c00 [287889.253492] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd7549d6c00 [287889.253745] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd7549d6c00 [287946.209242] LustreError: 221394:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be757051850 x1690718773479680/t0(0) o3->b91e7890-9c4a-4@10.51.2.4@o2ib3:219/0 lens 488/440 e 0 to 0 dl 1619008629 ref 1 fl Interpret:/0/0 rc 0/0 [287946.234486] LustreError: 221394:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 14 previous similar messages [287946.245501] Lustre: oak-OST0040: Bulk IO read error with b91e7890-9c4a-4 (at 10.51.2.4@o2ib3), client will retry: rc -110 [287946.257833] Lustre: Skipped 416 previous similar messages [287965.806493] Lustre: oak-OST0054: Client 156315a7-a82d-b4fe-847a-396165636f38 (at 10.51.14.3@o2ib3) reconnecting [287965.817903] Lustre: Skipped 3271 previous similar messages [288030.173277] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c6003c0) failed: 5 [288030.173333] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [288030.173335] LNet: 50606:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [288030.173338] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bb26e8eec00 [288030.175607] LNet: 19313:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [288030.184007] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bb26e8eec00 [288030.184023] LustreError: 
50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd1086f5800 [288030.260012] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 264 previous similar messages [288030.520538] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.5.67@o2ib3 ns: filter-oak-OST003e_UUID lock: ffff8bcc9310c800/0xf81cb91ffbc307a lrc: 4/0,0 mode: PR/PR res: [0x940000400:0x301389:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.5.67@o2ib3 remote: 0x8127f6c4a986609d expref: 11 pid: 193067 timeout: 288036 lvb_type: 1 [288030.567560] LustreError: 193029:0:(ldlm_lockd.c:1351:ldlm_handle_enqueue0()) ### lock on destroyed export ffff8bd738a6a800 ns: filter-oak-OST003e_UUID lock: ffff8bd126780480/0xf81cb91ffbc3b9b lrc: 3/0,0 mode: --/PW res: [0x940000400:0x301389:0x0].0x0 rrc: 3 type: EXT [0->8191] (req 0->8191) flags: 0x50000000020000 nid: 10.51.5.67@o2ib3 remote: 0x8127f6c4a98660a4 expref: 11 pid: 193029 timeout: 0 lvb_type: 0 [288057.708488] LNet: 50602:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d859cea120) failed: 5 [288057.708741] LNet: 50605:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.215@o2ib5 exceeded retry count 0 [288057.708743] LNet: 50605:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [288057.708746] LustreError: 50605:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd7549d4c00 [288057.709310] LustreError: 50605:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcfb0c2cc00 [288057.709313] LustreError: 50604:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd7549d4c00 [288057.777627] LNet: 50602:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 368 previous similar messages [288280.007724] Lustre: 187491:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request 
sent has timed out for slow reply: [sent 1619008754/real 1619008754] req@ffff8bcd3c6b4380 x1697354000360768/t0(0) o104->oak-OST0038@10.51.4.35@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619008927 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [288331.226973] LustreError: 137-5: oak-OST005d_UUID: not available for connect from 10.51.1.18@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [288331.246403] LustreError: Skipped 2 previous similar messages [288373.985154] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7a79af400 [288373.997348] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bae5f5e9800 [288374.009512] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bae5f5e9800 [288374.021676] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce58032800 [288374.033848] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be24bc41800 [288374.046002] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bae5f5ee400 [288374.058154] LNet: 182038:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.216@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [288374.058170] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.216@o2ib5 failed: 5 [288374.058172] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd0b07c0800 [288374.058173] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [288374.058199] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd0b07c0800 [288374.058217] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd0b07c0800 
[288396.299706] Lustre: oak-OST0058: Bulk IO write error with 94477130-eaa9-4 (at 10.51.15.17@o2ib3), client will retry: rc = -110 [288396.312568] Lustre: Skipped 27 previous similar messages [288423.247409] Lustre: oak-OST004a: Connection restored to 7af31256-f01a-4 (at 10.50.1.69@o2ib2) [288423.257040] Lustre: Skipped 459 previous similar messages [288569.517095] Lustre: oak-OST005a: Client f4f31fbb-c316-9d0e-dea6-2a23d0a9a983 (at 10.51.1.12@o2ib3) reconnecting [288569.517096] Lustre: oak-OST004a: Client f4f31fbb-c316-9d0e-dea6-2a23d0a9a983 (at 10.51.1.12@o2ib3) reconnecting [288569.517099] Lustre: Skipped 159 previous similar messages [288661.158973] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [288661.172368] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Skipped 3 previous similar messages [288661.183508] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff240) failed: 5 [288661.185164] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bde309a1400 [288661.194141] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [288661.194142] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 2 previous similar messages [288661.194145] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd522826000 [288661.194491] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bda5fa94800 [288661.194756] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd522826000 [288661.195032] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd7e37e1000 [288661.195335] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd7e37e1000 [288661.195619] LustreError: 
50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bce58035800 [288661.195892] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bce58035800 [288661.312440] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1706 previous similar messages [288680.533834] LustreError: 137-5: oak-OST003b_UUID: not available for connect from 10.51.1.36@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [288680.553269] LustreError: Skipped 6 previous similar messages [288696.363137] LustreError: 217346:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8be7170fd050 x1689695324034752/t0(0) o3->d2b765d6-c390-4@10.51.5.1@o2ib3:226/0 lens 488/440 e 0 to 0 dl 1619009391 ref 1 fl Interpret:/0/0 rc 0/0 [288696.363170] LustreError: 241048:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 3145728(4194304) req@ffff8be5178f4050 x1685039193915136/t0(0) o4->f0643f46-4acf-4@10.51.2.41@o2ib3:226/0 lens 488/448 e 0 to 0 dl 1619009391 ref 1 fl Interpret:/0/0 rc 0/0 [288696.363172] LustreError: 241048:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 32 previous similar messages [288696.363543] Lustre: oak-OST0040: Bulk IO read error with d2b765d6-c390-4 (at 10.51.5.1@o2ib3), client will retry: rc -110 [288696.363544] Lustre: Skipped 9 previous similar messages [288696.444767] LustreError: 217346:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 370 previous similar messages [288721.367428] LustreError: 221472:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bda1031c050 x1689647966163072/t0(0) o3->280483d6-e910-4@10.51.5.47@o2ib3:236/0 lens 488/440 e 0 to 0 dl 1619009401 ref 1 fl Interpret:/0/0 rc 0/0 [288721.392768] LustreError: 221472:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 14 previous similar messages [288921.408572] LustreError: 5987:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 
1048576(4194304) req@ffff8bd0edce9050 x1696614590191488/t0(0) o3->855f8733-97ad-fe20-a42c-c9a97f6818f7@10.51.4.40@o2ib3:435/0 lens 488/440 e 0 to 0 dl 1619009600 ref 1 fl Interpret:/0/0 rc 0/0 [288921.408574] LustreError: 193442:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8be02f8b1850 x1696614590191424/t0(0) o3->855f8733-97ad-fe20-a42c-c9a97f6818f7@10.51.4.40@o2ib3:435/0 lens 488/440 e 0 to 0 dl 1619009600 ref 1 fl Interpret:/0/0 rc 0/0 [288921.408578] LustreError: 193442:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 54 previous similar messages [288954.715990] LustreError: 193445:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be1bb568850 x1684935344383680/t0(0) o3->04118086-b366-4@10.51.3.10@o2ib3:534/0 lens 488/440 e 0 to 0 dl 1619009699 ref 1 fl Interpret:/0/0 rc 0/0 [288954.740930] LustreError: 193445:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 16 previous similar messages [289024.143852] Lustre: oak-OST0042: Connection restored to aa49d6cf-64e9-4 (at 10.51.5.71@o2ib3) [289024.153477] Lustre: Skipped 1177 previous similar messages [289247.736868] Lustre: oak-OST0032: Client 7248222f-4666-93b2-8a86-11760c329552 (at 10.51.15.14@o2ib3) reconnecting [289247.748379] Lustre: Skipped 948 previous similar messages [289626.893394] Lustre: oak-OST004e: Connection restored to 9a96bf08-f925-4 (at 10.50.4.21@o2ib2) [289626.903031] Lustre: Skipped 313 previous similar messages [289779.131291] LustreError: 137-5: oak-OST005f_UUID: not available for connect from 10.51.4.35@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[289779.150727] LustreError: Skipped 3 previous similar messages [289923.770816] Lustre: oak-OST0034: Client ea3c0abc-03bf-9064-4b06-a2deb2e4ec48 (at 10.51.4.35@o2ib3) reconnecting [289923.770817] Lustre: oak-OST0036: Client ea3c0abc-03bf-9064-4b06-a2deb2e4ec48 (at 10.51.4.35@o2ib3) reconnecting [289923.770820] Lustre: Skipped 103 previous similar messages [290233.052187] Lustre: oak-OST0036: Connection restored to c6206ab8-59d6-4 (at 10.51.14.17@o2ib3) [290233.061906] Lustre: Skipped 260 previous similar messages [290374.012875] LustreError: 193448:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd071b9f050 x1695475966629888/t0(0) o4->61781ed1-b14e-4@10.51.13.4@o2ib3:438/0 lens 488/448 e 0 to 0 dl 1619011113 ref 1 fl Interpret:/0/0 rc 0/0 [290374.038005] Lustre: oak-OST0042: Bulk IO write error with 61781ed1-b14e-4 (at 10.51.13.4@o2ib3), client will retry: rc = -110 [290374.050726] Lustre: Skipped 21 previous similar messages [290457.768967] LustreError: 193411:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be71a325850 x1696661796649536/t0(0) o4->f71d1188-d01c-9fa0-935c-a63ad652da8a@10.51.12.21@o2ib3:519/0 lens 488/448 e 0 to 0 dl 1619011194 ref 1 fl Interpret:/0/0 rc 0/0 [290457.772107] Lustre: oak-OST005a: Bulk IO write error with f71d1188-d01c-9fa0-935c-a63ad652da8a (at 10.51.12.21@o2ib3), client will retry: rc = -110 [290457.811015] LustreError: 193411:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [290472.371550] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.12.21@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xcb20a7dd [290536.770170] LustreError: 137-5: oak-OST005d_UUID: not available for connect from 10.51.3.24@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[290536.789651] LustreError: Skipped 1 previous similar message [290545.418587] Lustre: oak-OST0030: Client 8ff6000c-d966-1cda-f3a5-455db4eb8783 (at 10.51.2.23@o2ib3) reconnecting [290545.429954] Lustre: Skipped 28 previous similar messages [290633.112706] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [290633.127442] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd6e0f63800 [290633.139603] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd1ff04f000 [290633.151755] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be39dc63400 [290633.163910] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be39dc63400 [290633.176062] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3da512c00 [290633.188227] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bce46dbc000 [290633.200393] LustreError: 217349:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be718fba050 x1689659043042432/t0(0) o4->934d532f-1b5d-4@10.51.4.50@o2ib3:698/0 lens 488/448 e 0 to 0 dl 1619011373 ref 1 fl Interpret:/0/0 rc 0/0 [290633.200398] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71eeb3000 [290633.200438] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71eeb3000 [290633.250322] Lustre: oak-OST0032: Bulk IO write error with 934d532f-1b5d-4 (at 10.51.4.50@o2ib3), client will retry: rc = -110 [290633.263042] Lustre: Skipped 7 previous similar messages [290671.825305] LustreError: 217480:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(2965504) req@ffff8be6a2232850 x1696614593287232/t0(0) 
o4->855f8733-97ad-fe20-a42c-c9a97f6818f7@10.51.4.40@o2ib3:686/0 lens 504/448 e 0 to 0 dl 1619011361 ref 1 fl Interpret:/0/0 rc 0/0 [290671.853982] LustreError: 217480:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 24 previous similar messages [290696.827279] LustreError: 241049:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1229908) req@ffff8be29b319050 x1685038031988544/t0(0) o4->4346eb7a-31d0-4@10.51.2.60@o2ib3:695/0 lens 488/448 e 0 to 0 dl 1619011370 ref 1 fl Interpret:/0/0 rc 0/0 [290696.827400] LustreError: 211957:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8be8d3bba850 x1695205002399808/t0(0) o3->94477130-eaa9-4@10.51.15.17@o2ib3:698/0 lens 488/440 e 0 to 0 dl 1619011373 ref 1 fl Interpret:/0/0 rc 0/0 [290696.827483] Lustre: oak-OST0040: Bulk IO read error with 0808fcad-54e4-4 (at 10.51.4.41@o2ib3), client will retry: rc -110 [290696.827486] Lustre: Skipped 68 previous similar messages [290696.897625] LustreError: 241049:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 19 previous similar messages [290736.457433] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 111s: evicting client at 10.51.5.26@o2ib3 ns: filter-oak-OST0038_UUID lock: ffff8bcbab031200/0xf81cb91ffcb160e lrc: 3/0,0 mode: PW/PW res: [0x1f32961:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->3461119) flags: 0x60000400030020 nid: 10.51.5.26@o2ib3 remote: 0xd40a2db96398abd0 expref: 8 pid: 192982 timeout: 290742 lvb_type: 0 [290739.457381] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 112s: evicting client at 10.51.2.68@o2ib3 ns: filter-oak-OST0048_UUID lock: ffff8be76397b180/0xf81cb91ffcb4d0b lrc: 4/0,0 mode: PR/PR res: [0xa80000400:0x31982d:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400000020 nid: 10.51.2.68@o2ib3 remote: 0x89687f38035512f6 expref: 14 pid: 193010 
timeout: 290745 lvb_type: 1 [290739.504416] LustreError: 193033:0:(ldlm_lockd.c:1351:ldlm_handle_enqueue0()) ### lock on destroyed export ffff8bbc182d7400 ns: filter-oak-OST0048_UUID lock: ffff8be0745c1200/0xf81cb91ffcb4fce lrc: 3/0,0 mode: --/PW res: [0xa80000400:0x31982d:0x0].0x0 rrc: 3 type: EXT [0->8191] (req 0->8191) flags: 0x50000000020000 nid: 10.51.2.68@o2ib3 remote: 0x89687f38035512fd expref: 14 pid: 193033 timeout: 0 lvb_type: 0 [290801.630930] LustreError: 193198:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be6ccd06050 x1684946122568448/t0(0) o4->831437d2-6335-4@10.51.2.61@o2ib3:114/0 lens 488/448 e 0 to 0 dl 1619011544 ref 1 fl Interpret:/0/0 rc 0/0 [290801.656004] LustreError: 193198:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 7 previous similar messages [290801.951986] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.2.61@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xcb909835 [290801.968868] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 11 previous similar messages [290804.017047] Lustre: oak-OST0056: Bulk IO read error with 330d404b-804c-4 (at 10.51.15.3@o2ib3), client will retry: rc -110 [290804.029480] Lustre: Skipped 96 previous similar messages [290805.490028] LNet: 50608:0:(lib-move.c:3829:lnet_parse_put()) Dropping PUT from 12345-10.51.14.17@o2ib3 portal 16 match 1697354002901760 offset 224 length 224: 4 [290805.506206] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.14.17@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xcb9124ad [290805.523195] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [290834.261922] Lustre: oak-OST004a: Connection restored to 97d5652f-f6c9-4 (at 10.50.5.72@o2ib2) [290834.271544] Lustre: Skipped 1144 previous similar messages [291166.541917] Lustre: oak-OST0048: Client 2cb8c9f5-d10d-9cbb-b3e4-1f91fb16b811 (at 10.51.0.14@o2ib3) reconnecting 
[291166.553277] Lustre: Skipped 905 previous similar messages [291435.282903] Lustre: oak-OST0030: Connection restored to 831437d2-6335-4 (at 10.51.2.61@o2ib3) [291435.292519] Lustre: Skipped 224 previous similar messages [291438.910163] LustreError: 193453:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be08ac0d050 x1696662028150336/t0(0) o4->f71d1188-d01c-9fa0-935c-a63ad652da8a@10.51.12.21@o2ib3:732/0 lens 488/448 e 0 to 0 dl 1619012162 ref 1 fl Interpret:/0/0 rc 0/0 [291438.937434] LustreError: 193453:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [291438.948233] Lustre: oak-OST0046: Bulk IO write error with f71d1188-d01c-9fa0-935c-a63ad652da8a (at 10.51.12.21@o2ib3), client will retry: rc = -110 [291438.963087] Lustre: Skipped 28 previous similar messages [291559.662280] LustreError: 137-5: oak-OST0045_UUID: not available for connect from 10.51.4.18@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [291773.690023] Lustre: oak-OST003a: Client 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3) reconnecting [291773.701508] Lustre: Skipped 56 previous similar messages [291793.766579] LustreError: 137-5: oak-OST0045_UUID: not available for connect from 10.51.1.27@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[291871.067425] LustreError: 193253:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bd176693850 x1696614597674048/t0(0) o3->855f8733-97ad-fe20-a42c-c9a97f6818f7@10.51.4.40@o2ib3:359/0 lens 488/440 e 0 to 0 dl 1619012544 ref 1 fl Interpret:/0/0 rc 0/0 [291871.095776] LustreError: 193253:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 91 previous similar messages [291871.106696] Lustre: oak-OST0040: Bulk IO read error with 855f8733-97ad-fe20-a42c-c9a97f6818f7 (at 10.51.4.40@o2ib3), client will retry: rc -110 [291871.121165] Lustre: Skipped 1 previous similar message [291946.082746] LustreError: 193406:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8bd3a732f850 x1697586581038592/t0(0) o3->74cfd00f-ce2b-6f66-6c9e-b2a137740d54@10.51.2.15@o2ib3:450/0 lens 488/440 e 0 to 0 dl 1619012635 ref 1 fl Interpret:/0/0 rc 0/0 [291946.082945] Lustre: oak-OST0040: Bulk IO read error with 74cfd00f-ce2b-6f66-6c9e-b2a137740d54 (at 10.51.2.15@o2ib3), client will retry: rc -110 [291946.124988] LustreError: 193406:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 2 previous similar messages [292039.231459] Lustre: oak-OST003c: Connection restored to a2211f2f-9723-4 (at 10.51.5.21@o2ib3) [292039.241128] Lustre: Skipped 328 previous similar messages [292104.923350] LustreError: 137-5: oak-OST003b_UUID: not available for connect from 10.51.5.21@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [292104.923351] LustreError: 137-5: oak-OST005f_UUID: not available for connect from 10.51.5.21@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [292107.987215] LustreError: 137-5: oak-OST0039_UUID: not available for connect from 10.51.5.21@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[292107.987216] LustreError: 137-5: oak-OST004d_UUID: not available for connect from 10.51.5.21@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [292108.026095] LustreError: Skipped 1 previous similar message [292221.143759] LustreError: 5992:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be719547850 x1697586581515136/t0(0) o3->74cfd00f-ce2b-6f66-6c9e-b2a137740d54@10.51.2.15@o2ib3:725/0 lens 488/440 e 0 to 0 dl 1619012910 ref 1 fl Interpret:/0/0 rc 0/0 [292221.143968] Lustre: oak-OST0040: Bulk IO read error with 74cfd00f-ce2b-6f66-6c9e-b2a137740d54 (at 10.51.2.15@o2ib3), client will retry: rc -110 [292221.143969] Lustre: Skipped 2 previous similar messages [292221.192320] LustreError: 5992:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 2 previous similar messages [292246.145964] LustreError: 219011:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be02f8b5850 x1696671767209472/t0(0) o3->ea3c0abc-03bf-9064-4b06-a2deb2e4ec48@10.51.4.35@o2ib3:733/0 lens 488/440 e 0 to 0 dl 1619012918 ref 1 fl Interpret:/0/0 rc 0/0 [292417.130629] Lustre: oak-OST0052: Client 2cb8c9f5-d10d-9cbb-b3e4-1f91fb16b811 (at 10.51.0.14@o2ib3) reconnecting [292417.142002] Lustre: Skipped 79 previous similar messages [292530.854720] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending_nocred)(waiting) [292530.868889] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5fe520) failed: 5 [292530.869572] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [292530.869574] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [292530.869577] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd522822c00 [292530.869598] LustreError: 
50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be1e80c9000 [292530.870980] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be1e80c9000 [292530.871965] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be1e80c9000 [292530.872745] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be1e80c9000 [292530.872748] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be723de3c00 [292530.872755] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcf65159400 [292530.986162] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 993 previous similar messages [292594.067802] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [292594.081647] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd5c9e3800 [292594.093826] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd5c9e3800 [292594.105976] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd5c9e3800 [292594.118147] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd5c9e3800 [292594.130377] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7a79adc00 [292594.142587] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd24f111400 [292594.154773] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcb5e8f2400 [292594.154803] LustreError: 217487:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd03755f850 x1685069902779328/t0(0) 
o4->8b956477-4ff9-4@10.51.2.48@o2ib3:395/0 lens 488/448 e 0 to 0 dl 1619013335 ref 1 fl Interpret:/0/0 rc 0/0 [292594.154805] LustreError: 217487:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 5 previous similar messages [292594.155050] Lustre: oak-OST0058: Bulk IO write error with 8b956477-4ff9-4 (at 10.51.2.48@o2ib3), client will retry: rc = -110 [292596.220592] LustreError: 219011:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be6e3871850 x1685033214564544/t0(0) o3->c6a2f9d7-72f1-4@10.51.2.56@o2ib3:329/0 lens 488/440 e 0 to 0 dl 1619013269 ref 1 fl Interpret:/0/0 rc 0/0 [292596.220670] Lustre: oak-OST0046: Bulk IO read error with ea3c0abc-03bf-9064-4b06-a2deb2e4ec48 (at 10.51.4.35@o2ib3), client will retry: rc -110 [292596.220672] Lustre: Skipped 3 previous similar messages [292596.220857] LustreError: 211957:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1514015) req@ffff8bcd082ff850 x1689647711904448/t0(0) o4->9b1546d3-bf78-4@10.51.5.26@o2ib3:333/0 lens 488/448 e 0 to 0 dl 1619013273 ref 1 fl Interpret:/0/0 rc 0/0 [292596.293877] LustreError: 219011:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 104 previous similar messages [292639.715628] Lustre: oak-OST0030: Connection restored to 8ab7929a-8a09-4 (at 10.51.0.72@o2ib3) [292639.725247] Lustre: Skipped 264 previous similar messages [292646.232368] LustreError: 193420:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be55f10e850 x1685069902779392/t0(0) o4->8b956477-4ff9-4@10.51.2.48@o2ib3:395/0 lens 488/448 e 0 to 0 dl 1619013335 ref 1 fl Interpret:/0/0 rc 0/0 [292646.232665] LustreError: 193413:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8bdc6c15e050 x1685069902779456/t0(0) o4->8b956477-4ff9-4@10.51.2.48@o2ib3:395/0 lens 488/448 e 0 to 0 dl 1619013335 ref 1 fl Interpret:/0/0 rc 0/0 [292646.232667] LustreError: 193413:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 11 previous 
similar messages [292646.295538] LustreError: 193420:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 6 previous similar messages [292703.876676] Lustre: 187491:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619013178/real 1619013178] req@ffff8bdc3922e300 x1697354003960448/t0(0) o104->oak-OST0046@10.51.5.52@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619013351 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [292703.907333] Lustre: 187491:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [292763.512256] LustreError: 227956:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bdc7007d050 x1697221590604352/t0(0) o4->a5681ec8-1d55-518e-30a3-8af758dafdd3@10.51.13.23@o2ib3:565/0 lens 488/448 e 0 to 0 dl 1619013505 ref 1 fl Interpret:/0/0 rc 0/0 [293031.436421] Lustre: oak-OST0050: Client ed25cc1b-f8c2-8fce-f51c-bb486337b589 (at 10.51.14.5@o2ib3) reconnecting [293031.436423] Lustre: oak-OST005c: Client ed25cc1b-f8c2-8fce-f51c-bb486337b589 (at 10.51.14.5@o2ib3) reconnecting [293031.436425] Lustre: Skipped 735 previous similar messages [293217.439203] LustreError: 221486:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003a: cli a7834140-6ca8-4 claims 4218880 GRANT, real grant 999424 [293229.348732] LustreError: 193415:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003a: cli a7834140-6ca8-4 claims 3493888 GRANT, real grant 0 [293239.422137] LustreError: 5995:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003a: cli a7834140-6ca8-4 claims 2400256 GRANT, real grant 0 [293242.649789] Lustre: oak-OST004c: Connection restored to d55d324b-c685-4 (at 10.51.6.4@o2ib3) [293242.659369] Lustre: Skipped 1106 previous similar messages [293249.428024] LustreError: 193446:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003a: cli a7834140-6ca8-4 claims 3444736 GRANT, real grant 0 [293261.935559] LustreError: 227982:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003a: cli a7834140-6ca8-4 claims 2617344 GRANT, real grant 0 
[293265.070602] LustreError: 193407:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be757053050 x1696615024584064/t0(0) o4->62b89a52-cd86-4@10.51.6.17@o2ib3:310/0 lens 488/448 e 0 to 0 dl 1619014005 ref 1 fl Interpret:/0/0 rc 0/0 [293265.095853] Lustre: oak-OST0036: Bulk IO write error with 62b89a52-cd86-4 (at 10.51.6.17@o2ib3), client will retry: rc = -110 [293265.108627] Lustre: Skipped 16 previous similar messages [293265.827691] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.6.17@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xcecfec8d [293282.243117] LustreError: 217331:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003a: cli a7834140-6ca8-4 claims 2961408 GRANT, real grant 0 [293282.257004] LustreError: 217331:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 1 previous similar message [293291.828105] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [293291.842116] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcfd12d0000 [293291.854364] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdad4f81c00 [293291.866539] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdad4f81c00 [293291.878740] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bce62b60800 [293291.890950] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bce62b60800 [293291.903102] LustreError: 224954:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be0f2214050 x1685039273404608/t0(0) o4->42c6b41d-bbaa-4@10.51.2.35@o2ib3:334/0 lens 488/448 e 0 to 0 dl 1619014029 ref 1 fl Interpret:/0/0 rc 0/0 [293291.903108] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc 
ffff8bd7549d4400 [293291.903123] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7549d4400 [293291.903140] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be723df5000 [293291.903147] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7549d6800 [293291.903174] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bda125a7800 [293291.903184] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bda125a7800 [293292.001427] LustreError: 224954:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [293318.336051] LustreError: 5988:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003a: cli a7834140-6ca8-4 claims 2342912 GRANT, real grant 0 [293318.349862] LustreError: 5988:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 2 previous similar messages [293341.702828] LustreError: 217450:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd325824050 x1696859948333376/t0(0) o4->4c32b5fb-e821-4@10.51.2.65@o2ib3:384/0 lens 488/448 e 0 to 0 dl 1619014079 ref 1 fl Interpret:/0/0 rc 0/0 [293341.727878] LustreError: 217450:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 6 previous similar messages [293346.355071] LustreError: 209393:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bdefff49850 x1690604941267456/t0(0) o3->42d3440a-b89b-4@10.51.2.2@o2ib3:337/0 lens 488/440 e 0 to 0 dl 1619014032 ref 1 fl Interpret:/0/0 rc 0/0 [293346.355073] LustreError: 221452:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bd93c5f3850 x1688840057081344/t0(0) o3->7fca2a32-f1a9-4@10.51.3.19@o2ib3:337/0 lens 488/440 e 0 to 0 dl 1619014032 ref 1 fl Interpret:/0/0 rc 0/0 [293346.355271] Lustre: oak-OST0040: Bulk IO read error with d4f105e0-bc0e-4 (at 10.51.4.1@o2ib3), client will retry: rc 
-110 [293346.355273] Lustre: Skipped 111 previous similar messages [293346.424136] LustreError: 209393:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [293394.932594] LustreError: 221482:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003a: cli a7834140-6ca8-4 claims 4018176 GRANT, real grant 0 [293394.946481] LustreError: 221482:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 6 previous similar messages [293641.042714] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [293641.056700] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd9e4133000 [293641.068885] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcd5c9e1400 [293641.068897] LustreError: 5988:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be719437050 x1689648981628608/t0(0) o4->702d9553-ca91-4@10.51.5.62@o2ib3:687/0 lens 488/448 e 0 to 0 dl 1619014382 ref 1 fl Interpret:/0/0 rc 0/0 [293653.807507] Lustre: oak-OST005e: Client 84772643-3e0c-a25c-a9b0-965c7b792170 (at 10.51.13.15@o2ib3) reconnecting [293653.818990] Lustre: Skipped 106 previous similar messages [293696.417481] LustreError: 241047:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1527015) req@ffff8bcd9dc3d050 x1689648981628544/t0(0) o4->702d9553-ca91-4@10.51.5.62@o2ib3:687/0 lens 488/448 e 0 to 0 dl 1619014382 ref 1 fl Interpret:/0/0 rc 0/0 [293696.417505] LustreError: 219011:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bd528025850 x1695844300723328/t0(0) o3->330d404b-804c-4@10.51.15.3@o2ib3:687/0 lens 488/440 e 0 to 0 dl 1619014382 ref 1 fl Interpret:/0/0 rc 0/0 [293696.417535] Lustre: oak-OST004e: Bulk IO read error with 330d404b-804c-4 (at 10.51.15.3@o2ib3), client will retry: rc -110 [293696.417536] Lustre: Skipped 4 previous similar messages [293847.593171] Lustre: 
oak-OST0034: Connection restored to c6206ab8-59d6-4 (at 10.51.14.17@o2ib3) [293847.602886] Lustre: Skipped 362 previous similar messages [294031.033743] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [294031.046750] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be716534400 [294031.058905] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be716534c00 [294031.058910] LustreError: 193194:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71beed850 x1685116569569792/t0(0) o4->9b12e584-d591-4@10.51.12.20@o2ib3:321/0 lens 488/448 e 0 to 0 dl 1619014771 ref 1 fl Interpret:/0/0 rc 0/0 [294031.058997] Lustre: oak-OST0050: Bulk IO write error with 9b12e584-d591-4 (at 10.51.12.20@o2ib3), client will retry: rc = -110 [294031.058998] Lustre: Skipped 11 previous similar messages [294031.115441] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0d2d61c00 [294031.127589] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0d2d61c00 [294031.139743] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb4c359400 [294031.151891] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be33878ac00 [294031.164056] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be33878ac00 [294031.176217] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc1d31f1000 [294031.188364] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc1d31f1000 [294176.021690] LustreError: 137-5: oak-OST0041_UUID: not available for connect from 10.51.1.49@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [294176.041222] LustreError: Skipped 1 previous similar message [294196.500386] LustreError: 221484:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 1048576(4194304) req@ffff8bd638982850 x1697586585641344/t0(0) o3->74cfd00f-ce2b-6f66-6c9e-b2a137740d54@10.51.2.15@o2ib3:440/0 lens 488/440 e 0 to 0 dl 1619014890 ref 1 fl Interpret:/0/0 rc 0/0 [294196.528834] LustreError: 221484:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 4 previous similar messages [294285.326096] Lustre: oak-OST004a: Client e71e5ef0-3cf7-788c-e457-05fd6b805527 (at 10.51.13.14@o2ib3) reconnecting [294285.337596] Lustre: Skipped 94 previous similar messages [294449.654304] Lustre: oak-OST0042: Connection restored to 7b30f7d2-3cbb-4 (at 10.51.2.63@o2ib3) [294449.663926] Lustre: Skipped 319 previous similar messages [294514.841171] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [294514.855661] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce23cbc400 [294514.867813] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce23cbc400 [294514.879957] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdd7722c400 [294514.892106] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be7230c9000 [294514.904256] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdd44641400 [294514.916412] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdd44643400 [294571.571075] LustreError: 221482:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8be6b3ef8850 x1689647991195968/t0(0) o3->280483d6-e910-4@10.51.5.47@o2ib3:37/0 lens 488/440 e 0 to 0 dl 1619015242 
ref 1 fl Interpret:/0/0 rc 0/0 [294571.571437] Lustre: oak-OST0050: Bulk IO read error with bc175ba0-453f-4 (at 10.51.1.25@o2ib3), client will retry: rc -110 [294571.571438] Lustre: Skipped 8 previous similar messages [294571.571779] LustreError: 217485:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1506765) req@ffff8bce15b79850 x1684936870385152/t0(0) o4->75e945b2-3e94-4@10.51.3.13@o2ib3:42/0 lens 488/448 e 0 to 0 dl 1619015247 ref 1 fl Interpret:/0/0 rc 0/0 [294571.572725] LustreError: 209389:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bca7b74e850 x1689662743925504/t0(0) o4->c6206ab8-59d6-4@10.51.14.17@o2ib3:44/0 lens 488/448 e 0 to 0 dl 1619015249 ref 1 fl Interpret:/0/0 rc 0/0 [294571.572727] LustreError: 209389:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 5 previous similar messages [294571.677693] LustreError: 221482:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 135 previous similar messages [294673.858980] LustreError: 17909:0:(ldlm_lib.c:3287:target_bulk_io()) @@@ bulk READ failed: rc -107 req@ffff8bcdeb653850 x1689656819166336/t0(0) o3->d8462e28-e3fc-4@10.51.5.27@o2ib3:214/0 lens 488/440 e 0 to 0 dl 1619015419 ref 1 fl Interpret:/0/0 rc 0/0 [294680.187629] LustreError: 209400:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be1cf15a850 x1688693627183936/t0(0) o3->a7834140-6ca8-4@10.51.2.9@o2ib3:214/0 lens 488/440 e 0 to 0 dl 1619015419 ref 1 fl Interpret:/0/0 rc 0/0 [294682.868575] Lustre: 193079:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619015158/real 1619015158] req@ffff8bd0e1ab7980 x1697354006027456/t0(0) o104->oak-OST0038@10.51.13.12@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619015331 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [294682.899339] Lustre: 193079:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [294784.015964] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 
10.0.2.217@o2ib5: error 0(sending)(waiting) [294784.029460] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5fefa0) failed: 5 [294784.030377] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [294784.030381] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [294784.030383] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd775049400 [294784.030884] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be4dd3c6000 [294784.030889] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce46db8c00 [294784.030893] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be553a79400 [294784.030993] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcfb0c2d000 [294784.031669] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be731b5b800 [294784.031673] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be731b5b800 [294784.031678] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd522822c00 [294784.158850] LNet: 50609:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 733 previous similar messages [294796.629395] LustreError: 5993:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bd638987850 x1696614246055680/t0(0) o3->7535e7e6-b397-e9ec-ed05-2fc68240cd4c@10.51.4.38@o2ib3:268/0 lens 488/440 e 0 to 0 dl 1619015473 ref 1 fl Interpret:/0/0 rc 0/0 [294846.643500] LustreError: 217450:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8bce66f92850 x1689656819847808/t0(0) o3->d8462e28-e3fc-4@10.51.5.27@o2ib3:315/0 lens 488/440 e 0 to 0 dl 1619015520 ref 1 fl 
Interpret:/0/0 rc 0/0 [294846.644507] LustreError: 221490:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bd8a004c850 x1688751192079232/t0(0) o3->bdc2eff6-717d-4@10.51.2.42@o2ib3:319/0 lens 488/440 e 0 to 0 dl 1619015524 ref 1 fl Interpret:/0/0 rc 0/0 [294846.644509] LustreError: 221490:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 4 previous similar messages [294846.644938] LustreError: 193427:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(3984161) req@ffff8be7170ff850 x1689666994496448/t0(0) o4->a8f4bdbd-09ba-4@10.51.4.46@o2ib3:322/0 lens 488/448 e 0 to 0 dl 1619015527 ref 1 fl Interpret:/0/0 rc 0/0 [294846.644940] LustreError: 193427:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 15 previous similar messages [294846.645376] Lustre: oak-OST0036: Bulk IO write error with a8f4bdbd-09ba-4 (at 10.51.4.46@o2ib3), client will retry: rc = -110 [294846.645378] Lustre: Skipped 17 previous similar messages [294846.761410] LustreError: 217450:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 108 previous similar messages [294900.265975] Lustre: oak-OST0040: Client 7535e7e6-b397-e9ec-ed05-2fc68240cd4c (at 10.51.4.38@o2ib3) reconnecting [294900.277340] Lustre: Skipped 1275 previous similar messages [294911.770705] LustreError: 241046:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003a: cli bdc2eff6-717d-4 claims 4218880 GRANT, real grant 446464 [294911.785090] LustreError: 241046:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 5 previous similar messages [294933.359763] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.3.8@o2ib3 ns: filter-oak-OST0030_UUID lock: ffff8bcb579fcc80/0xf81cb91ffd5e06e lrc: 3/0,0 mode: PW/PW res: [0x20b7faa:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->937983) flags: 0x60000400030020 nid: 10.51.3.8@o2ib3 remote: 0x3d147f0ea56e9ddd expref: 6 pid: 193061 timeout: 294939 lvb_type: 0 [294956.965218] Lustre: 
192985:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619015432/real 1619015432] req@ffff8bd3ddbb3600 x1697354006184640/t0(0) o104->oak-OST004e@10.51.3.8@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619015605 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [295014.862708] Lustre: oak-OST0034: haven't heard from client 966da6e3-1ee0-328a-4279-9e5d5706984d (at 10.210.12.145@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bab2d3f9400, cur 1619015663 expire 1619015513 last 1619015436 [295051.343416] Lustre: oak-OST004e: Connection restored to f471526d-4b8b-4 (at 10.49.28.6@o2ib1) [295051.353049] Lustre: Skipped 1562 previous similar messages [295085.009033] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [295085.023172] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be731426800 [295085.035356] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bb6a409e800 [295085.047503] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd68c864800 [295085.059671] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be5aef4d000 [295085.071843] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce62b61000 [295096.232991] LustreError: 5991:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be71bf73050 x1685046362220928/t0(0) o4->7b30f7d2-3cbb-4@10.51.2.63@o2ib3:624/0 lens 488/448 e 0 to 0 dl 1619015829 ref 1 fl Interpret:/0/0 rc 0/0 [295108.021754] LustreError: 193443:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be7577da050 x1691357065900416/t0(0) o4->d55d324b-c685-4@10.51.6.4@o2ib3:620/0 lens 488/448 e 0 to 0 dl 1619015825 ref 1 fl Interpret:/0/0 rc 0/0 [295108.046712] LustreError: 
193443:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [295146.682486] LustreError: 9693:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 1048576(4194304) req@ffff8bcdeb652050 x1684932199100864/t0(0) o3->ba46adca-a399-4@10.51.3.57@o2ib3:616/0 lens 488/440 e 0 to 0 dl 1619015821 ref 1 fl Interpret:/0/0 rc 0/0 [295146.683505] LustreError: 5983:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1993) req@ffff8bcdeb653850 x1688762511295552/t0(0) o4->5bf08143-64b1-4@10.51.3.2@o2ib3:620/0 lens 488/448 e 0 to 0 dl 1619015825 ref 1 fl Interpret:/0/0 rc 0/0 [295146.683507] LustreError: 5983:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 4 previous similar messages [295146.685035] LustreError: 223906:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be71baca850 x1688692222857280/t0(0) o3->cf404440-7e75-4@10.51.2.49@o2ib3:616/0 lens 488/440 e 0 to 0 dl 1619015821 ref 1 fl Interpret:/0/0 rc 0/0 [295146.685037] LustreError: 223906:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 7 previous similar messages [295146.780891] LustreError: 9693:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 192 previous similar messages [295256.375246] Lustre: 187491:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619015731/real 1619015731] req@ffff8bdfa76c3a80 x1697354006401216/t0(0) o104->oak-OST0044@10.51.5.35@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619015904 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [295281.913885] Lustre: oak-OST0048: haven't heard from client 4e222094-b28a-4 (at 10.51.2.38@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc15b297800, cur 1619015930 expire 1619015780 last 1619015703 [295281.936220] Lustre: Skipped 23 previous similar messages [295373.789604] LustreError: 137-5: oak-OST0057_UUID: not available for connect from 10.51.1.23@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[295373.809083] LustreError: Skipped 2 previous similar messages [295375.938917] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.51.1.23@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [295383.821297] LNet: 238533:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [295383.835514] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be05f7d9400 [295383.847680] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bddc89d1c00 [295383.859837] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb8586b400 [295383.871991] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb8586d000 [295383.884152] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd8f6c1f000 [295383.896308] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcf65159400 [295383.908479] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb8586f000 [295383.920654] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcb8586f000 [295383.932815] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdaa4555000 [295409.268695] LustreError: 220404:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bd1f9138050 x1684936877776320/t0(0) o3->75e945b2-3e94-4@10.51.3.13@o2ib3:165/0 lens 488/440 e 0 to 0 dl 1619016125 ref 1 fl Interpret:/0/0 rc 0/0 [295409.293638] LustreError: 220404:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 7 previous similar messages [295409.304355] Lustre: oak-OST0046: Bulk IO read error with 75e945b2-3e94-4 (at 10.51.3.13@o2ib3), client will 
retry: rc -110 [295409.316782] Lustre: Skipped 462 previous similar messages [295446.746078] LustreError: 220844:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8be6f6db6050 x1684938580676224/t0(0) o3->817bfdee-7b31-4@10.51.3.7@o2ib3:164/0 lens 488/440 e 0 to 0 dl 1619016124 ref 1 fl Interpret:/0/0 rc 0/0 [295446.746083] LustreError: 193411:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8be6b7f7b850 x1689652770597632/t0(0) o4->b290c20f-399c-4@10.51.5.68@o2ib3:164/0 lens 488/448 e 0 to 0 dl 1619016124 ref 1 fl Interpret:/0/0 rc 0/0 [295446.746085] LustreError: 193411:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 8 previous similar messages [295446.746326] Lustre: oak-OST0040: Bulk IO write error with b290c20f-399c-4 (at 10.51.5.68@o2ib3), client will retry: rc = -110 [295446.746327] Lustre: Skipped 17 previous similar messages [295446.828789] LustreError: 220844:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 237 previous similar messages [295491.346799] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 112s: evicting client at 10.51.2.63@o2ib3 ns: filter-oak-OST004a_UUID lock: ffff8be6d188dc40/0xf81cb91ffd817ec lrc: 4/0,0 mode: PR/PR res: [0x1c7967f:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400010020 nid: 10.51.2.63@o2ib3 remote: 0xc4f8ca8d0efa8e57 expref: 7 pid: 193067 timeout: 295497 lvb_type: 1 [295491.392986] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 1 previous similar message [295494.346738] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 111s: evicting client at 10.51.2.58@o2ib3 ns: filter-oak-OST003a_UUID lock: ffff8be49805bcc0/0xf81cb91ffd7f4f3 lrc: 4/0,0 mode: PW/PW res: [0x2138676:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->1933311) flags: 0x60000400030020 nid: 
10.51.2.58@o2ib3 remote: 0xdb68fa3fcd7a64cf expref: 16 pid: 187519 timeout: 295500 lvb_type: 0 [295550.859498] Lustre: oak-OST0040: Client 2c63b434-3a22-4 (at 10.51.5.53@o2ib3) reconnecting [295550.868827] Lustre: Skipped 1482 previous similar messages [295552.339363] Lustre: 187433:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619016027/real 1619016027] req@ffff8bd83917e300 x1697354006583168/t0(0) o104->oak-OST005a@10.51.2.41@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619016200 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [295552.370028] Lustre: 187433:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [295582.853336] Lustre: oak-OST003c: haven't heard from client 7d9d4a89-23ea-4 (at 10.51.4.3@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bde29188800, cur 1619016231 expire 1619016081 last 1619016004 [295587.858118] Lustre: oak-OST0038: haven't heard from client 83e05d96-ba49-4 (at 10.51.5.35@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bdcd1699400, cur 1619016236 expire 1619016086 last 1619016009 [295651.705074] Lustre: oak-OST003a: Connection restored to d7e7ae29-6844-4 (at 10.51.2.18@o2ib3) [295651.714686] Lustre: Skipped 1721 previous similar messages [295771.810401] LustreError: 204434:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8be35226f050 x1696876873942976/t0(0) o3->341ff9d8-ce51-a34b-3b59-737651e19da4@10.51.2.29@o2ib3:493/0 lens 488/440 e 0 to 0 dl 1619016453 ref 1 fl Interpret:/0/0 rc 0/0 [295771.838154] LustreError: 204434:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 1 previous similar message [295903.012408] LNet: 238533:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending_nocred)(waiting) [295903.026662] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff860) failed: 5 [295903.026664] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff860) failed: 5 [295903.026666] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 3 previous similar messages [295903.026969] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd2937ca000 [295903.026973] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be670987000 [295903.026980] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [295903.026981] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 4 previous similar messages [295903.026983] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd1ff04b000 [295903.026985] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd0b07c4c00 [295903.027566] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd0b07c4c00 [295903.028134] LustreError: 
50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be61bbf8c00 [295903.028789] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be61bbf8c00 [295903.028792] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 3, status -5, desc ffff8be670987000 [295903.028796] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be021090000 [295903.028805] LustreError: 227921:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bdfcd3b8050 x1696614615035072/t0(0) o4->855f8733-97ad-fe20-a42c-c9a97f6818f7@10.51.4.40@o2ib3:684/0 lens 488/448 e 0 to 0 dl 1619016644 ref 1 fl Interpret:/0/0 rc 0/0 [295903.028806] LustreError: 227921:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 11 previous similar messages [295903.029362] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd2937ca000 [295903.238505] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1159 previous similar messages [295970.845494] LustreError: 217484:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8be711405050 x1696614615035136/t0(0) o4->855f8733-97ad-fe20-a42c-c9a97f6818f7@10.51.4.40@o2ib3:684/0 lens 488/448 e 0 to 0 dl 1619016644 ref 1 fl Interpret:/0/0 rc 0/0 [295970.874130] LustreError: 217484:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 17 previous similar messages [296057.333614] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.3.56@o2ib3 ns: filter-oak-OST005c_UUID lock: ffff8bd673960000/0xf81cb91ffd87db2 lrc: 3/0,0 mode: PW/PW res: [0x21537c8:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->1232895) flags: 0x60000400010020 nid: 10.51.3.56@o2ib3 remote: 0x4ef6a02edfa54624 expref: 13 pid: 193050 timeout: 296063 lvb_type: 0 [296102.839295] Lustre: oak-OST004c: haven't heard 
from client 03d6cc0c-b2d2-4 (at 10.51.3.55@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bce58034c00, cur 1619016751 expire 1619016601 last 1619016524 [296190.404130] Lustre: oak-OST004c: Client 2be289fc-a075-4 (at 10.51.3.56@o2ib3) reconnecting [296190.413461] Lustre: Skipped 1122 previous similar messages [296196.712850] LustreError: 5991:0:(ldlm_lib.c:3287:target_bulk_io()) @@@ bulk READ failed: rc -107 req@ffff8bda6575e850 x1688681790841920/t0(0) o3->16aa0ad1-8494-4@10.51.2.71@o2ib3:227/0 lens 488/440 e 0 to 0 dl 1619016942 ref 1 fl Interpret:/0/0 rc 0/0 [296196.737917] Lustre: oak-OST0042: Bulk IO read error with 16aa0ad1-8494-4 (at 10.51.2.71@o2ib3), client will retry: rc -107 [296196.750371] Lustre: Skipped 461 previous similar messages [296251.999381] Lustre: oak-OST0044: Connection restored to b914d870-4ce1-4 (at 10.51.4.27@o2ib3) [296252.008996] Lustre: Skipped 1801 previous similar messages [296441.729085] LustreError: 137-5: oak-OST003f_UUID: not available for connect from 10.51.0.68@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[296755.789510] LNet: 238533:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [296755.803702] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be08c192000 [296755.815867] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd6d739dc00 [296755.828034] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd6d739dc00 [296755.840212] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd6d739dc00 [296755.852385] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd1c1c51c00 [296755.864543] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd1c1c51c00 [296755.876700] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be670981c00 [296755.888850] LustreError: 221489:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd7ebd7f850 x1688875687659712/t0(0) o4->fd16aff2-0371-4@10.51.4.33@o2ib3:26/0 lens 488/448 e 0 to 0 dl 1619017496 ref 1 fl Interpret:/0/0 rc 0/0 [296755.888871] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be715b57000 [296755.926304] LustreError: 221489:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 5 previous similar messages [296755.937102] Lustre: oak-OST0036: Bulk IO write error with fd16aff2-0371-4 (at 10.51.4.33@o2ib3), client will retry: rc = -110 [296755.949841] Lustre: Skipped 19 previous similar messages [296821.006456] LustreError: 227982:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(262144) req@ffff8bcc89aeb050 x1689651041202240/t0(0) o3->d723b707-c0de-4@10.51.5.24@o2ib3:26/0 lens 488/440 e 0 to 0 dl 1619017496 ref 1 fl Interpret:/0/0 rc 0/0 [296821.006645] LustreError: 
217338:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 3145728(4194304) req@ffff8bdf83258850 x1688692229193664/t0(0) o4->cf404440-7e75-4@10.51.2.49@o2ib3:26/0 lens 488/448 e 0 to 0 dl 1619017496 ref 1 fl Interpret:/0/0 rc 0/0 [296821.006757] Lustre: oak-OST004c: Bulk IO read error with 92fb6ca6-34ae-4 (at 10.51.4.42@o2ib3), client will retry: rc -110 [296821.006758] Lustre: Skipped 1 previous similar message [296821.076746] LustreError: 227982:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 256 previous similar messages [296846.059146] Lustre: oak-OST0042: Client f4f31fbb-c316-9d0e-dea6-2a23d0a9a983 (at 10.51.1.12@o2ib3) reconnecting [296846.059147] Lustre: oak-OST0032: Client f4f31fbb-c316-9d0e-dea6-2a23d0a9a983 (at 10.51.1.12@o2ib3) reconnecting [296846.059150] Lustre: Skipped 575 previous similar messages [296853.838564] Lustre: oak-OST003e: Connection restored to 95e7771d-4de6-4 (at 10.51.2.24@o2ib3) [296853.848184] Lustre: Skipped 184 previous similar messages [297401.720335] LustreError: 217481:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bcdb1361850 x1696561292828160/t0(0) o4->11c92c9a-5a17-4@10.51.2.27@o2ib3:671/0 lens 488/448 e 0 to 0 dl 1619018141 ref 1 fl Interpret:/0/0 rc 0/0 [297401.745444] Lustre: oak-OST0034: Bulk IO write error with 11c92c9a-5a17-4 (at 10.51.2.27@o2ib3), client will retry: rc = -110 [297401.758167] Lustre: Skipped 23 previous similar messages [297421.112970] LustreError: 217335:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be18ee1e850 x1696663453519296/t0(0) o3->f71d1188-d01c-9fa0-935c-a63ad652da8a@10.51.12.21@o2ib3:621/0 lens 488/440 e 0 to 0 dl 1619018091 ref 1 fl Interpret:/0/0 rc 0/0 [297421.141401] LustreError: 217335:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 1 previous similar message [297421.152183] Lustre: oak-OST0040: Bulk IO read error with f71d1188-d01c-9fa0-935c-a63ad652da8a (at 10.51.12.21@o2ib3), client will retry: rc -110 
[297421.166742] Lustre: Skipped 48 previous similar messages [297455.819747] Lustre: oak-OST0046: Connection restored to fd16aff2-0371-4 (at 10.51.4.33@o2ib3) [297455.829370] Lustre: Skipped 753 previous similar messages [297470.890583] Lustre: oak-OST005e: Client de0b71dd-918a-dbdd-3442-448a7d2edf2a (at 10.51.6.3@o2ib3) reconnecting [297470.901866] Lustre: Skipped 474 previous similar messages [298059.262343] Lustre: oak-OST0050: Connection restored to a7406eef-a378-4 (at 10.50.4.26@o2ib2) [298059.271969] Lustre: Skipped 270 previous similar messages [298182.971412] Lustre: oak-OST0050: Client 87556525-e81b-4 (at 10.51.1.4@o2ib3) reconnecting [298182.980647] Lustre: Skipped 45 previous similar messages [298230.797465] Lustre: oak-OST0044: haven't heard from client b43c81e8-dea4-f4fd-23f2-11130821f637 (at 10.50.0.64@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc4cbee5400, cur 1619018879 expire 1619018729 last 1619018652 [298249.786302] Lustre: oak-OST0032: haven't heard from client b43c81e8-dea4-f4fd-23f2-11130821f637 (at 10.50.0.64@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be7575dac00, cur 1619018898 expire 1619018748 last 1619018671 [298249.810660] Lustre: Skipped 22 previous similar messages [298662.055820] Lustre: oak-OST003e: Connection restored to fd16aff2-0371-4 (at 10.51.4.33@o2ib3) [298662.065463] Lustre: Skipped 288 previous similar messages [298777.083742] LustreError: 193457:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003a: cli bdc2eff6-717d-4 claims 4218880 GRANT, real grant 0 [298777.097645] LustreError: 193457:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 2 previous similar messages [298863.899549] LustreError: 137-5: oak-OST003d_UUID: not available for connect from 10.51.2.65@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[298863.918962] LustreError: Skipped 1 previous similar message [298922.243608] Lustre: oak-OST0030: Client f71d1188-d01c-9fa0-935c-a63ad652da8a (at 10.51.12.21@o2ib3) reconnecting [298922.255133] Lustre: Skipped 34 previous similar messages [298922.387244] LustreError: 193140:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd389d5c050 x1696663840050176/t0(0) o4->f71d1188-d01c-9fa0-935c-a63ad652da8a@10.51.12.21@o2ib3:682/0 lens 488/448 e 0 to 0 dl 1619019662 ref 1 fl Interpret:/0/0 rc 0/0 [298922.399523] Lustre: oak-OST0030: Bulk IO write error with f71d1188-d01c-9fa0-935c-a63ad652da8a (at 10.51.12.21@o2ib3), client will retry: rc = -110 [298922.429280] LustreError: 193140:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 4 previous similar messages [298923.068012] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.12.21@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xd61a0b85 [298923.085010] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 4 previous similar messages [299229.033310] LustreError: 194702:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bc831de7050 x1696881117844288/t0(0) o4->dcf00494-e949-8e41-7675-4379abab2f9b@10.50.13.5@o2ib2:234/0 lens 488/448 e 0 to 0 dl 1619019969 ref 1 fl Interpret:/0/0 rc 0/0 [299229.060486] LustreError: 194702:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [299229.071360] Lustre: oak-OST0048: Bulk IO write error with dcf00494-e949-8e41-7675-4379abab2f9b (at 10.50.13.5@o2ib2), client will retry: rc = -110 [299229.086165] Lustre: Skipped 5 previous similar messages [299263.936453] Lustre: oak-OST0056: Connection restored to be4bc875-67de-4 (at 10.51.4.7@o2ib3) [299263.945977] Lustre: Skipped 389 previous similar messages [299421.553214] LustreError: 5999:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(139264) req@ffff8bcee505b050 x1696207580846464/t0(0) 
o3->d073f313-60b4-4@10.51.15.5@o2ib3:378/0 lens 488/440 e 0 to 0 dl 1619020113 ref 1 fl Interpret:/0/0 rc 0/0 [299421.578818] Lustre: oak-OST0034: Bulk IO read error with d073f313-60b4-4 (at 10.51.15.5@o2ib3), client will retry: rc -110 [299481.442924] LNet: 3985:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.215@o2ib5: error 0(sending)(waiting) [299481.457578] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0a1fcdc00 [299481.469777] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bc2d6c2f800 [299481.481943] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bae5f5ef000 [299481.494087] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bae5f5ef000 [299481.506230] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be0a1fcdc00 [299540.656332] Lustre: oak-OST0034: Client d073f313-60b4-4 (at 10.51.15.5@o2ib3) reconnecting [299540.665665] Lustre: Skipped 56 previous similar messages [299546.576998] LustreError: 3782:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bac52338850 x1691423947459072/t0(0) o4->d0ff9086-ea28-4@10.50.13.7@o2ib2:487/0 lens 488/448 e 0 to 0 dl 1619020222 ref 1 fl Interpret:/0/0 rc 0/0 [299546.577032] LustreError: 194698:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bb7b6b16050 x1697024694269568/t0(0) o3->a360479a-8705-4f99-6507-cbe27f33622a@10.50.1.49@o2ib2:487/0 lens 488/440 e 0 to 0 dl 1619020222 ref 1 fl Interpret:/0/0 rc 0/0 [299546.577158] Lustre: oak-OST0054: Bulk IO read error with a360479a-8705-4f99-6507-cbe27f33622a (at 10.50.1.49@o2ib2), client will retry: rc -110 [299546.578574] LustreError: 217342:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8be721d69050 
x1691423947459456/t0(0) o4->d0ff9086-ea28-4@10.50.13.7@o2ib2:487/0 lens 488/448 e 0 to 0 dl 1619020222 ref 1 fl Interpret:/0/0 rc 0/0 [299546.578576] LustreError: 217342:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 20 previous similar messages [299546.578888] Lustre: oak-OST0040: Bulk IO write error with a7406eef-a378-4 (at 10.50.4.26@o2ib2), client will retry: rc = -110 [299546.695545] LustreError: 3782:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 6 previous similar messages [299654.857513] Lustre: 193138:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619020130/real 1619020130] req@ffff8bb4309bde80 x1697354010553280/t0(0) o106->oak-OST0038@10.50.4.25@o2ib2:15/16 lens 296/280 e 0 to 1 dl 1619020303 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [299864.050828] Lustre: oak-OST003c: Connection restored to 71bdfe57-cd01-4 (at 10.49.28.8@o2ib1) [299864.060448] Lustre: Skipped 1002 previous similar messages [299946.687612] LustreError: 209395:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be35226b850 x1697586854126720/t0(0) o3->4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d@10.51.2.32@o2ib3:147/0 lens 488/440 e 0 to 0 dl 1619020637 ref 1 fl Interpret:/0/0 rc 0/0 [299946.715948] LustreError: 209395:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 22 previous similar messages [299946.726900] Lustre: oak-OST0040: Bulk IO read error with 4bbd0d1e-77b0-1661-2b5b-32f8ed0a525d (at 10.51.2.32@o2ib3), client will retry: rc -110 [299946.741377] Lustre: Skipped 24 previous similar messages [300043.236429] LustreError: 137-5: oak-OST005f_UUID: not available for connect from 10.51.0.67@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [300048.468569] LustreError: 137-5: oak-OST005d_UUID: not available for connect from 10.51.0.18@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[300048.488069] LustreError: Skipped 1 previous similar message [300197.402130] Lustre: oak-OST0050: Client 416c99fb-653c-af17-c233-d5516de96d20 (at 10.51.15.7@o2ib3) reconnecting [300197.413495] Lustre: Skipped 803 previous similar messages [300270.767941] LustreError: 193198:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bdcafb69850 x1695780942157440/t0(0) o3->1639bdd0-384b-4@10.51.6.19@o2ib3:451/0 lens 488/440 e 0 to 0 dl 1619020941 ref 1 fl Interpret:/0/0 rc 0/0 [300270.794505] Lustre: oak-OST0040: Bulk IO read error with 1639bdd0-384b-4 (at 10.51.6.19@o2ib3), client will retry: rc -110 [300277.888394] LNet: 3985:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(sending_nocred)(waiting) [300277.903348] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600900) failed: 5 [300277.903352] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be249abc000 [300277.903382] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [300277.903383] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [300277.903385] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd3da515800 [300277.903906] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd3da517400 [300277.904481] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 3, status -5, desc ffff8bcb69387c00 [300277.904488] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd73b151c00 [300277.904495] LustreError: 193428:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71a7b5850 x1696617328888192/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:527/0 lens 488/448 e 0 to 0 dl 1619021017 ref 1 
fl Interpret:/0/0 rc 0/0 [300277.904509] Lustre: oak-OST0038: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 [300277.904510] Lustre: Skipped 11 previous similar messages [300277.905070] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd73b153800 [300277.905653] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd73b153800 [300277.906235] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcb69381400 [300277.906794] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3b1fce000 [300277.907499] LNet: 3985:0:(o2iblnd_cb.c:2857:kiblnd_check_reconnect()) 10.0.2.217@o2ib5: don't reconnect (no need), 12, 12, msg_size: 4096, queue_depth: 8/8, max_frags: 256/256 [300277.907538] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Rx from 10.0.2.217@o2ib5 failed: 5 [300277.907540] LNet: 50607:0:(o2iblnd_cb.c:507:kiblnd_rx_complete()) Skipped 17 previous similar messages [300278.131214] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1278 previous similar messages [300339.781323] LustreError: 137-5: oak-OST004d_UUID: not available for connect from 10.51.4.35@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[300339.800765] LustreError: Skipped 1 previous similar message [300345.782021] LustreError: 241055:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be673a81850 x1684014899158400/t0(0) o3->8e1214b4-ce05-4@10.51.4.28@o2ib3:527/0 lens 488/440 e 0 to 0 dl 1619021017 ref 1 fl Interpret:/0/0 rc 0/0 [300345.785266] LustreError: 17909:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(548) req@ffff8be71ff58850 x1685033137382720/t0(0) o4->3f803d26-c19e-4@10.51.2.20@o2ib3:528/0 lens 488/448 e 0 to 0 dl 1619021018 ref 1 fl Interpret:/0/0 rc 0/0 [300345.785268] LustreError: 17909:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 10 previous similar messages [300345.785287] Lustre: oak-OST003e: Bulk IO write error with 3f803d26-c19e-4 (at 10.51.2.20@o2ib3), client will retry: rc = -110 [300345.856740] LustreError: 241055:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 6 previous similar messages [300421.232818] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.5.26@o2ib3 ns: filter-oak-OST003a_UUID lock: ffff8bdcb5db4ec0/0xf81cb91ffef0d84 lrc: 4/0,0 mode: PW/PW res: [0x21386df:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->12287) flags: 0x60000400010020 nid: 10.51.5.26@o2ib3 remote: 0xd40a2db963db96d6 expref: 9 pid: 206660 timeout: 300389 lvb_type: 0 [300426.232726] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.14.17@o2ib3 ns: filter-oak-OST0030_UUID lock: ffff8be31a2e0d80/0xf81cb91fff176f0 lrc: 3/0,0 mode: PW/PW res: [0x20b829a:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->12287) flags: 0x60000400010020 nid: 10.51.14.17@o2ib3 remote: 0x411816e8936f6ef5 expref: 9 pid: 187511 timeout: 300394 lvb_type: 0 [300426.277667] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 2 previous similar messages [300445.126265] Lustre: 
187408:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619020920/real 1619020920] req@ffff8bd83917d580 x1697354011187392/t0(0) o104->oak-OST004e@10.51.13.22@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619021093 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [300448.637252] Lustre: 193050:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619020923/real 1619020923] req@ffff8bdcadde8480 x1697354011187968/t0(0) o104->oak-OST0044@10.51.4.45@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619021096 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [300448.667911] Lustre: 193050:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [300464.216060] Lustre: oak-OST0032: Connection restored to 8833c96a-f6c7-4 (at 10.51.5.31@o2ib3) [300464.225704] Lustre: Skipped 1273 previous similar messages [300469.288532] LustreError: 193194:0:(tgt_grant.c:758:tgt_grant_check()) oak-OST003a: cli bdc2eff6-717d-4 claims 327680 GRANT, real grant 0 [300469.302356] LustreError: 193194:0:(tgt_grant.c:758:tgt_grant_check()) Skipped 7 previous similar messages [300473.800443] Lustre: oak-OST0030: haven't heard from client 266ca40f-6c8b-4 (at 10.51.4.54@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be33878d000, cur 1619021122 expire 1619020972 last 1619020895 [300482.738544] Lustre: oak-OST0044: haven't heard from client 8833c96a-f6c7-4 (at 10.51.5.31@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bcf99581000, cur 1619021131 expire 1619020981 last 1619020904 [300822.907880] Lustre: oak-OST0038: Client 8b66e4db-fcc9-6215-2cf9-86ed141f42d8 (at 10.51.2.26@o2ib3) reconnecting [300822.907880] Lustre: oak-OST0044: Client 8b66e4db-fcc9-6215-2cf9-86ed141f42d8 (at 10.51.2.26@o2ib3) reconnecting [300822.907884] Lustre: Skipped 809 previous similar messages [300828.057787] LustreError: 137-5: oak-OST0049_UUID: not available for connect from 10.50.1.59@o2ib2 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [301068.823436] Lustre: oak-OST0042: Connection restored to 7532fd31-2588-4 (at 10.50.9.39@o2ib2) [301068.833059] Lustre: Skipped 620 previous similar messages [301089.690647] LNet: 182180:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [301089.703398] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ffb00) failed: 5 [301089.703505] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.216@o2ib5 exceeded retry count 0 [301089.703509] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 5 previous similar messages [301089.703512] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bdd38472000 [301089.704131] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be24bc42400 [301089.704135] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bce5451cc00 [301089.704139] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be72267ac00 [301089.784399] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 305 previous similar messages [301137.526474] LustreError: 148682:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bc4ce2a3050 x1696881145272256/t0(0) o4->dcf00494-e949-8e41-7675-4379abab2f9b@10.50.13.5@o2ib2:632/0 lens 488/448 e 0 to 0 dl 1619021877 ref 1 fl Interpret:/0/0 rc 0/0 [301137.553610] Lustre: oak-OST0054: Bulk IO write error with dcf00494-e949-8e41-7675-4379abab2f9b (at 10.50.13.5@o2ib2), client will retry: rc = -110 [301137.568370] Lustre: Skipped 8 previous similar messages [301145.966206] LustreError: 9692:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be722617850 x1689650736174720/t0(0) o3->7a3c78c8-8ece-4@10.51.5.39@o2ib3:585/0 lens 488/440 e 
0 to 0 dl 1619021830 ref 1 fl Interpret:/0/0 rc 0/0 [301145.966215] LustreError: 187417:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bd6179bd050 x1689647822960704/t0(0) o3->9b1546d3-bf78-4@10.51.5.26@o2ib3:585/0 lens 488/440 e 0 to 0 dl 1619021830 ref 1 fl Interpret:/0/0 rc 0/0 [301145.966217] LustreError: 187417:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 236 previous similar messages [301145.966345] Lustre: oak-OST0040: Bulk IO read error with 9b1546d3-bf78-4 (at 10.51.5.26@o2ib3), client will retry: rc -110 [301145.966346] Lustre: Skipped 244 previous similar messages [301146.047398] LustreError: 9692:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [301218.106233] LustreError: 137-5: oak-OST0059_UUID: not available for connect from 10.50.5.16@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [301371.905085] LNet: 182180:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [301371.918662] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600f20) failed: 5 [301371.918852] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [301371.918854] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 3 previous similar messages [301371.918859] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd6ac776800 [301371.919431] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be0a1fcf000 [301371.919513] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcf6515c400 [301371.920104] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd8b583e400 [301371.920674] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc 
ffff8bd8b583e400 [301371.921367] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd8b583a400 [301371.921988] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bce58036c00 [301371.921993] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd9da640400 [301372.047627] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1358 previous similar messages [301421.027773] LustreError: 211957:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be71ef55850 x1689668291342656/t0(0) o3->e84ad1b6-d416-4@10.51.5.56@o2ib3:112/0 lens 488/440 e 0 to 0 dl 1619022112 ref 1 fl Interpret:/0/0 rc 0/0 [301421.032967] LustreError: 5984:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(3444) req@ffff8bcc6c3f9850 x1688875847784896/t0(0) o4->fd16aff2-0371-4@10.51.4.33@o2ib3:113/0 lens 488/448 e 0 to 0 dl 1619022113 ref 1 fl Interpret:/0/0 rc 0/0 [301421.032969] LustreError: 5984:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 8 previous similar messages [301421.032983] Lustre: oak-OST0038: Bulk IO write error with fd16aff2-0371-4 (at 10.51.4.33@o2ib3), client will retry: rc = -110 [301421.102220] LustreError: 211957:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 6 previous similar messages [301446.030303] LustreError: 193432:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(116401) req@ffff8bd87a611850 x1696617330463168/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:117/0 lens 488/448 e 0 to 0 dl 1619022117 ref 1 fl Interpret:/0/0 rc 0/0 [301446.058382] LustreError: 193432:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 1 previous similar message [301446.069411] Lustre: oak-OST0038: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 [301446.084273] Lustre: Skipped 1 previous similar message [301469.965566] Lustre: 
oak-OST0034: Client 9d061e04-8567-4 (at 10.51.0.71@o2ib3) reconnecting [301469.975021] Lustre: Skipped 42 previous similar messages [301479.208634] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 111s: evicting client at 10.51.4.3@o2ib3 ns: filter-oak-OST0046_UUID lock: ffff8be2cda39d40/0xf81cb91ffda650b lrc: 3/0,0 mode: PW/PW res: [0xa40000400:0x346182:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->8191) flags: 0x60000400030020 nid: 10.51.4.3@o2ib3 remote: 0xc8426d5655a87076 expref: 9 pid: 188446 timeout: 301485 lvb_type: 0 [301479.253823] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 4 previous similar messages [301525.207590] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.4.60@o2ib3 ns: filter-oak-OST0046_UUID lock: ffff8bd25ea4da00/0xf81cb91fff8bd59 lrc: 3/0,0 mode: PW/PW res: [0x220b068:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->1232895) flags: 0x60000400010020 nid: 10.51.4.60@o2ib3 remote: 0x521c481ad0037528 expref: 7 pid: 193030 timeout: 301531 lvb_type: 0 [301543.037191] Lustre: 193019:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619022018/real 1619022018] req@ffff8bd5c63e5100 x1697354013062336/t0(0) o104->oak-OST005a@10.51.2.9@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619022191 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [301543.067835] Lustre: 193019:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [301567.864132] Lustre: oak-OST003a: haven't heard from client c6206ab8-59d6-4 (at 10.51.14.17@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bcbf53aa000, cur 1619022216 expire 1619022066 last 1619021989 [301572.711396] Lustre: oak-OST004a: haven't heard from client 3e7c55d0-08c5-4 (at 10.51.4.52@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bda125a0400, cur 1619022221 expire 1619022071 last 1619021994 [301576.717939] Lustre: oak-OST004e: haven't heard from client 1444e1bb-fe1c-4 (at 10.51.5.70@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bd47b370400, cur 1619022225 expire 1619022075 last 1619021998 [301669.429105] Lustre: oak-OST004e: Connection restored to e84ad1b6-d416-4 (at 10.51.5.56@o2ib3) [301669.438724] Lustre: Skipped 1647 previous similar messages [301913.531926] LNet: 182178:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.214@o2ib5: error 0(sending)(waiting) [301913.545980] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd4dcf7f800 [301913.558126] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd4dcf7f800 [301913.570276] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd4dcf7f800 [301913.582438] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd4dcf7f800 [301913.593658] LustreError: 2069:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bb513400850 x1696881158137024/t0(0) o4->dcf00494-e949-8e41-7675-4379abab2f9b@10.50.13.5@o2ib2:656/0 lens 488/448 e 0 to 0 dl 1619022656 ref 1 fl Interpret:/0/0 rc 0/0 [301913.593906] Lustre: oak-OST0058: Bulk IO write error with dcf00494-e949-8e41-7675-4379abab2f9b (at 10.50.13.5@o2ib2), client will retry: rc = -110 [301913.636294] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bce6bfcf000 [301913.636329] Lustre: 245408:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1619022555/real 1619022561] req@ffff8bba13be7980 x1697354013713344/t0(0) o105->oak-OST0050@10.50.2.43@o2ib2:15/16 lens 360/224 e 0 to 1 dl 1619022728 ref 1 fl Rpc:eX/0/ffffffff rc 0/-1 [301913.679487] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bce6bfcf000 [301913.691644] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bce6bfcf000 [301913.703784] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8baf462d7800 [301913.715947] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bb5458b6800 [301913.715979] LustreError: 2430:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bc25fc9d050 x1696863065616128/t0(0) o4->0990d505-f804-a11f-b445-5dbc7dcd98cd@10.50.10.56@o2ib2:653/0 lens 488/448 e 0 to 0 dl 1619022653 ref 1 fl Interpret:/0/0 rc 0/0 [301913.755448] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bce6bfcf000 [301914.513611] LustreError: 227987:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bcbd228d850 x1696881158133760/t0(0) o4->dcf00494-e949-8e41-7675-4379abab2f9b@10.50.13.5@o2ib2:653/0 lens 488/448 e 0 to 0 dl 1619022653 ref 1 fl Interpret:/0/0 rc 0/0 [301914.540736] LustreError: 227987:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 6 previous similar messages [301971.181384] LustreError: 114973:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bbb049f6050 x1694100303394240/t0(0) o3->0819d613-98f5-4@10.50.14.14@o2ib2:653/0 lens 488/440 e 0 to 0 dl 1619022653 ref 1 fl Interpret:/0/0 rc 0/0 [301971.181386] LustreError: 2424:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bc4c84f4050 x1685218962716224/t0(0) o3->43fbdde4-1860-4@10.50.7.63@o2ib2:653/0 lens 488/440 e 0 to 0 dl 1619022653 ref 1 fl Interpret:/0/0 rc 0/0 [301971.181388] LustreError: 235572:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8bc4c9476850 x1687464519658304/t0(0) o3->4685c272-f3d8-4@10.50.9.9@o2ib2:653/0 lens 488/440 e 0 to 0 
dl 1619022653 ref 1 fl Interpret:/0/0 rc 0/0 [301971.181390] LustreError: 2424:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 65 previous similar messages [301971.181394] LustreError: 235572:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 65 previous similar messages [301971.181406] Lustre: oak-OST003c: Bulk IO read error with 4685c272-f3d8-4 (at 10.50.9.9@o2ib2), client will retry: rc -110 [301971.181406] Lustre: Skipped 76 previous similar messages [301971.181449] LustreError: 148682:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(2519) req@ffff8bc4c58bf050 x1696616927234624/t0(0) o4->4840a713-cd5c-4977-8e46-6a1db00048d5@10.50.6.9@o2ib2:654/0 lens 488/448 e 0 to 0 dl 1619022654 ref 1 fl Interpret:/0/0 rc 0/0 [301971.181458] Lustre: oak-OST003a: Bulk IO write error with 4840a713-cd5c-4977-8e46-6a1db00048d5 (at 10.50.6.9@o2ib2), client will retry: rc = -110 [301971.181459] Lustre: Skipped 8 previous similar messages [302079.441213] Lustre: oak-OST004c: Client 7f8e8eb2-f021-4 (at 10.50.14.10@o2ib2) reconnecting [302079.450645] Lustre: Skipped 937 previous similar messages [302080.512827] Lustre: 187405:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619022555/real 1619022555] req@ffff8bbfabd30900 x1697354013713280/t0(0) o104->oak-OST003a@10.50.6.53@o2ib2:15/16 lens 296/224 e 0 to 1 dl 1619022728 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [302080.543493] Lustre: 187405:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 3 previous similar messages [302083.590766] Lustre: 218154:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619022558/real 1619022558] req@ffff8bbfabd35a00 x1697354013714688/t0(0) o105->oak-OST0038@10.50.17.29@o2ib2:15/16 lens 360/224 e 0 to 1 dl 1619022731 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [302083.621537] Lustre: 218154:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [302269.901863] Lustre: oak-OST0048: 
Connection restored to f35b4cd6-c6c1-4 (at 10.50.13.15@o2ib2) [302269.911614] Lustre: Skipped 1553 previous similar messages [302313.522268] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.214@o2ib5: error 0(sending)(waiting) [302313.537497] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7a79ab800 [302313.549668] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd449b5c400 [302313.561827] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd449b5c400 [302313.573991] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bbd92e02000 [302313.586159] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be0d5160000 [302313.598311] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be0d5160000 [302313.610473] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcee8d19400 [302313.610488] LustreError: 5996:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be349e35850 x1691424003588992/t0(0) o4->d0ff9086-ea28-4@10.50.13.7@o2ib2:299/0 lens 488/448 e 0 to 0 dl 1619023054 ref 1 fl Interpret:/0/0 rc 0/0 [302313.610714] Lustre: oak-OST004c: Bulk IO write error with d0ff9086-ea28-4 (at 10.50.13.7@o2ib2), client will retry: rc = -110 [302313.610717] Lustre: Skipped 8 previous similar messages [302313.666487] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcee8d19400 [302313.678640] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcee8d19400 [302313.690805] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd449b5c400 [302371.299175] LustreError: 
2424:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bc4c9477850 x1696957635147072/t0(0) o3->3d575049-f2ff-030b-a4ea-af4cdfc8c038@10.50.8.69@o2ib2:299/0 lens 488/440 e 0 to 0 dl 1619023054 ref 1 fl Interpret:/0/0 rc 0/0 [302371.299206] LustreError: 2429:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(218) req@ffff8bbff96b9050 x1685161695259136/t0(0) o4->5c0bc3fb-9416-4@10.50.9.35@o2ib2:299/0 lens 488/448 e 0 to 0 dl 1619023054 ref 1 fl Interpret:/0/0 rc 0/0 [302371.299209] LustreError: 2429:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 8 previous similar messages [302371.362690] LustreError: 2424:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [302441.724394] Lustre: oak-OST003c: haven't heard from client 3c9effa1-09aa-4 (at 10.51.0.67@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bd3b1fce400, cur 1619023090 expire 1619022940 last 1619022863 [302441.746718] Lustre: Skipped 1 previous similar message [302451.707099] Lustre: oak-OST003e: haven't heard from client 3c9effa1-09aa-4 (at 10.51.0.67@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bce46dbd800, cur 1619023100 expire 1619022950 last 1619022873 [302453.703273] Lustre: oak-OST0034: haven't heard from client 3c9effa1-09aa-4 (at 10.51.0.67@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be43fa22000, cur 1619023102 expire 1619022952 last 1619022875 [302453.725626] Lustre: Skipped 7 previous similar messages [302456.703097] Lustre: oak-OST005a: haven't heard from client 3c9effa1-09aa-4 (at 10.51.0.67@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8be7aa94ac00, cur 1619023105 expire 1619022955 last 1619022878 [302456.725450] Lustre: Skipped 10 previous similar messages [302476.373022] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.215@o2ib5: error 0(sending)(waiting) [302476.386643] LNet: 50604:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d859ce9940) failed: 5 [302476.387268] LNet: 50605:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.215@o2ib5 exceeded retry count 0 [302476.387270] LNet: 50605:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [302476.387274] LustreError: 50605:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd671c17000 [302476.387428] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd671c15800 [302476.387972] LustreError: 50605:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd671c17000 [302476.387981] LustreError: 50602:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc0175f3400 [302476.387983] LustreError: 50605:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbc682ec000 [302476.387986] LustreError: 50603:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc0175f3400 [302476.387989] LustreError: 50602:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbc682ec000 [302476.387991] LustreError: 50605:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd671c15800 [302476.515557] LNet: 50604:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 504 previous similar messages [302483.345603] LustreError: 221469:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bcc70628050 x1691640284676160/t0(0) o3->a8e1d696-3374-4@10.50.13.11@o2ib2:462/0 lens 488/440 e 0 to 0 dl 1619023217 ref 1 fl Interpret:/0/0 rc 0/0 [302486.298568] LustreError: 
211957:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bd87a611050 x1691640284698304/t0(0) o3->a8e1d696-3374-4@10.50.13.11@o2ib2:463/0 lens 488/440 e 0 to 0 dl 1619023218 ref 1 fl Interpret:/0/0 rc 0/0 [302521.346818] LustreError: 239464:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bbef6e0d050 x1696867914346176/t0(0) o3->62935bb7-73e1-04f7-3b52-0957a312c1ed@10.50.17.32@o2ib2:461/0 lens 488/440 e 0 to 0 dl 1619023216 ref 1 fl Interpret:/0/0 rc 0/0 [302521.374295] LustreError: 239464:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [302546.348884] LustreError: 221485:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(954) req@ffff8be7aaa0d850 x1685136661382656/t0(0) o4->52880d4a-9270-4@10.50.6.11@o2ib2:463/0 lens 488/448 e 0 to 0 dl 1619023218 ref 1 fl Interpret:/0/0 rc 0/0 [302546.349593] Lustre: oak-OST0030: Bulk IO write error with 5ebd6997-e65a-7d53-0818-093cecfcf87e (at 10.50.5.41@o2ib2), client will retry: rc = -110 [302546.349594] Lustre: Skipped 9 previous similar messages [302546.395188] LustreError: 221485:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 13 previous similar messages [302646.902888] Lustre: 206657:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619023122/real 1619023122] req@ffff8bcc65783180 x1697354014732864/t0(0) o106->oak-OST005e@10.50.4.28@o2ib2:15/16 lens 296/280 e 0 to 1 dl 1619023295 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [302646.933574] Lustre: 206657:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [302679.508527] Lustre: oak-OST005a: Client 7f1b7392-400d-f93e-0c1e-8292ad9bca46 (at 10.51.13.6@o2ib3) reconnecting [302679.519895] Lustre: Skipped 2033 previous similar messages [302868.685342] Lustre: oak-OST0036: haven't heard from client ae2e39c9-d6c4-e399-c4e2-9281e18e4ae6 (at 10.50.12.15@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bc4c9bd6400, cur 1619023517 expire 1619023367 last 1619023290 [302868.709840] Lustre: Skipped 4 previous similar messages [302871.024739] Lustre: oak-OST004a: Connection restored to 4c6eb2f1-25ca-4 (at 10.50.5.29@o2ib2) [302871.034353] Lustre: Skipped 2051 previous similar messages [303345.563519] Lustre: oak-OST003c: Client 09749127-876e-3e7f-b6a0-7c54e61383b6 (at 10.51.3.11@o2ib3) reconnecting [303345.574884] Lustre: Skipped 317 previous similar messages [303356.952237] LustreError: 137-5: oak-OST004d_UUID: not available for connect from 10.51.15.4@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [303475.663404] Lustre: oak-OST003e: Connection restored to 2c63b434-3a22-4 (at 10.51.5.53@o2ib3) [303475.673131] Lustre: Skipped 896 previous similar messages [303596.634960] LustreError: 148673:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bb2a3e68050 x1697044160351232/t0(0) o3->ed954721-f94c-14be-e64a-05a7f0440f42@10.50.5.14@o2ib2:3/0 lens 488/440 e 0 to 0 dl 1619024268 ref 1 fl Interpret:/0/0 rc 0/0 [303596.663100] LustreError: 148673:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 112 previous similar messages [303596.674105] Lustre: oak-OST003c: Bulk IO read error with ed954721-f94c-14be-e64a-05a7f0440f42 (at 10.50.5.14@o2ib2), client will retry: rc -110 [303596.688566] Lustre: Skipped 129 previous similar messages [303946.201773] Lustre: oak-OST005e: Client bb34ca44-6ad7-e3e7-0e9d-ec7e598e1b0e (at 10.50.4.8@o2ib2) reconnecting [303946.213103] Lustre: Skipped 42 previous similar messages [304076.803916] Lustre: oak-OST0056: Connection restored to d073f313-60b4-4 (at 10.51.15.5@o2ib3) [304076.813530] Lustre: Skipped 862 previous similar messages [304679.053594] Lustre: oak-OST0040: Connection restored to 03d6cc0c-b2d2-4 (at 10.51.3.55@o2ib3) [304679.063258] Lustre: Skipped 339 previous similar messages [304747.847290] Lustre: oak-OST0038: Client 
d4be9a17-4e81-b3d1-f53d-21ecf82a80e0 (at 10.51.1.51@o2ib3) reconnecting [304747.858680] Lustre: Skipped 19 previous similar messages [304970.886695] LustreError: 227989:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bd528027850 x1696864638919680/t0(0) o3->76eb6295-9d00-d7ab-8458-c8aac654030a@10.51.6.5@o2ib3:628/0 lens 488/440 e 0 to 0 dl 1619025648 ref 1 fl Interpret:/0/0 rc 0/0 [304970.915091] Lustre: oak-OST0040: Bulk IO read error with 76eb6295-9d00-d7ab-8458-c8aac654030a (at 10.51.6.5@o2ib3), client will retry: rc -110 [305279.108207] Lustre: oak-OST0034: Connection restored to fe3cfa19-279f-4 (at 10.50.1.67@o2ib2) [305279.117828] Lustre: Skipped 664 previous similar messages [305293.772396] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(waiting) [305293.785175] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ffa20) failed: 5 [305293.795601] LNet: 50607:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 31 previous similar messages [305293.795884] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.217@o2ib5 exceeded retry count 0 [305293.795886] LNet: 50608:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [305293.795888] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcb4c35ac00 [305293.795891] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcb4c35ac00 [305293.795893] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be976d57c00 [305293.795896] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be7aa94e000 [305345.964057] LustreError: 209390:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bd464834850 x1688807834898240/t0(0) o3->64e0f24e-f320-4@10.51.1.40@o2ib3:258/0 lens 488/440 e 0 
to 0 dl 1619026033 ref 1 fl Interpret:/0/0 rc 0/0 [305345.964068] LustreError: 227920:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bd464835850 x1688807834898368/t0(0) o3->64e0f24e-f320-4@10.51.1.40@o2ib3:258/0 lens 488/440 e 0 to 0 dl 1619026033 ref 1 fl Interpret:/0/0 rc 0/0 [305345.964102] Lustre: oak-OST0054: Bulk IO read error with c7c97132-e759-4 (at 10.51.15.4@o2ib3), client will retry: rc -110 [305346.048510] LustreError: 209390:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 2 previous similar messages [305367.616771] Lustre: oak-OST0046: Client 8d217df6-ca17-4 (at 10.51.5.4@o2ib3) reconnecting [305367.626003] Lustre: Skipped 33 previous similar messages [305463.304113] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.215@o2ib5: error 0(sending)(waiting) [305463.317702] LNet: 50603:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d859ce9a20) failed: 5 [305463.318641] LNet: 50602:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.215@o2ib5 exceeded retry count 0 [305463.318643] LNet: 50604:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.215@o2ib5 exceeded retry count 0 [305463.318644] LNet: 50602:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 1 previous similar message [305463.318645] LNet: 50604:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 1 previous similar message [305463.318648] LustreError: 50602:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bafd8c18400 [305463.318649] LustreError: 50604:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bdcd169b000 [305463.318652] LustreError: 50605:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd35cf49800 [305463.318656] LustreError: 50602:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd03bedbc00 [305463.318691] LustreError: 50605:0:(events.c:450:server_bulk_callback()) event type 5, 
status -5, desc ffff8bd03bedbc00 [305463.318694] LustreError: 50604:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be725ed7400 [305463.318696] LustreError: 50602:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be578f38400 [305463.319162] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be578f38400 [305463.469256] LNet: 50603:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 251 previous similar messages [305513.566208] LNet: 182039:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [305513.580325] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be16e1efc00 [305513.592483] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be16e1efc00 [305513.604640] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bdf7c7c1800 [305513.616817] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcbf53af800 [305513.616821] LustreError: 9694:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bca63b5f850 x1691542771390784/t0(0) o4->016623a0-673f-4@10.51.4.37@o2ib3:479/0 lens 504/448 e 0 to 0 dl 1619026254 ref 1 fl Interpret:/0/0 rc 0/0 [305513.616955] Lustre: oak-OST004e: Bulk IO write error with 016623a0-673f-4 (at 10.51.4.37@o2ib3), client will retry: rc = -110 [305513.616957] Lustre: Skipped 6 previous similar messages [305513.672867] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcbf53af800 [305513.685025] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be16e1ee000 [305513.697180] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdc3d948000 [305513.709347] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce46dbb000 [305513.721506] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce46dbb000 [305518.622220] Lustre: oak-OST0036: haven't heard from client 828adda8-7d85-4 (at 10.51.0.2@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc4c9639000, cur 1619026167 expire 1619026017 last 1619025940 [305518.644466] Lustre: Skipped 23 previous similar messages [305520.618900] Lustre: oak-OST0058: haven't heard from client 828adda8-7d85-4 (at 10.51.0.2@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bc4c765bc00, cur 1619026169 expire 1619026019 last 1619025942 [305520.641126] Lustre: Skipped 11 previous similar messages [305520.977342] LustreError: 2067:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bc4c4cc3050 x1690068524523072/t0(0) o3->2258ca88-edf8-4@10.50.5.63@o2ib2:428/0 lens 488/440 e 0 to 0 dl 1619026203 ref 1 fl Interpret:/0/0 rc 0/0 [305520.977456] Lustre: oak-OST0040: Bulk IO read error with 2258ca88-edf8-4 (at 10.50.5.63@o2ib2), client will retry: rc -110 [305520.977458] Lustre: Skipped 4 previous similar messages [305521.021806] LustreError: 2067:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 3 previous similar messages [305570.989687] LustreError: 242900:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8bcfe2b3c850 x1691542771390848/t0(0) o4->016623a0-673f-4@10.51.4.37@o2ib3:479/0 lens 488/448 e 0 to 0 dl 1619026254 ref 1 fl Interpret:/0/0 rc 0/0 [305570.989698] LustreError: 193456:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be65439a850 x1689655362440768/t0(0) o3->8f996bbd-e5a0-4@10.51.5.28@o2ib3:479/0 lens 488/440 e 0 to 0 dl 1619026254 ref 1 fl Interpret:/0/0 rc 0/0 [305570.989700] LustreError: 193456:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 6 previous similar messages 
[305570.989743] LustreError: 227996:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bd401fa5050 x1689655362441024/t0(0) o3->8f996bbd-e5a0-4@10.51.5.28@o2ib3:479/0 lens 488/440 e 0 to 0 dl 1619026254 ref 1 fl Interpret:/0/0 rc 0/0 [305570.989794] Lustre: oak-OST0042: Bulk IO read error with 156315a7-a82d-b4fe-847a-396165636f38 (at 10.51.14.3@o2ib3), client will retry: rc -110 [305570.989795] Lustre: Skipped 8 previous similar messages [305571.099064] Lustre: oak-OST004e: Bulk IO write error with 016623a0-673f-4 (at 10.51.4.37@o2ib3), client will retry: rc = -110 [305880.499630] Lustre: oak-OST003e: Connection restored to 71f0967f-07e1-4 (at 10.50.1.48@o2ib2) [305880.509251] Lustre: Skipped 950 previous similar messages [306050.275770] Lustre: oak-OST003c: Client 11c92c9a-5a17-4 (at 10.51.2.27@o2ib3) reconnecting [306050.285112] Lustre: Skipped 44 previous similar messages [306320.026458] LustreError: 137-5: oak-OST004d_UUID: not available for connect from 10.50.6.10@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [306321.350720] LustreError: 137-5: oak-OST0057_UUID: not available for connect from 10.50.2.4@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [306322.374437] LustreError: 137-5: oak-OST0045_UUID: not available for connect from 10.50.4.1@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [306322.393790] LustreError: Skipped 1 previous similar message [306328.524252] LustreError: 137-5: oak-OST0053_UUID: not available for connect from 10.50.3.71@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[306420.546033] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [306420.560154] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be6c89fd000 [306420.572332] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcd1bd35400 [306471.199688] LustreError: 227989:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd3f7412050 x1688875912368576/t0(0) o4->fd16aff2-0371-4@10.51.4.33@o2ib3:630/0 lens 488/448 e 0 to 0 dl 1619027160 ref 1 fl Interpret:/0/0 rc 0/0 [306471.199697] LustreError: 211957:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8bd3a2b9f850 x1689659934109760/t0(0) o3->a2211f2f-9723-4@10.51.5.21@o2ib3:630/0 lens 488/440 e 0 to 0 dl 1619027160 ref 1 fl Interpret:/0/0 rc 0/0 [306471.199712] Lustre: oak-OST003c: Bulk IO read error with a2211f2f-9723-4 (at 10.51.5.21@o2ib3), client will retry: rc -110 [306471.199713] Lustre: Skipped 6 previous similar messages [306471.199869] LustreError: 221470:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1563015) req@ffff8bd334d6d050 x1688683364512128/t0(0) o4->0b774077-ba6c-4@10.51.2.22@o2ib3:631/0 lens 488/448 e 0 to 0 dl 1619027161 ref 1 fl Interpret:/0/0 rc 0/0 [306471.199909] Lustre: oak-OST005a: Bulk IO write error with 0871860c-cfdf-4 (at 10.51.3.30@o2ib3), client will retry: rc = -110 [306471.308267] LustreError: 227989:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 5 previous similar messages [306481.087692] Lustre: oak-OST0036: Connection restored to ee7553d9-ca2e-4 (at 10.50.4.47@o2ib2) [306481.097322] Lustre: Skipped 1283 previous similar messages [306564.091068] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.5.7@o2ib3 ns: filter-oak-OST0040_UUID lock: ffff8be5ca571680/0xf81cb920022cdba lrc: 
3/0,0 mode: PW/PW res: [0x20910d5:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->45055) flags: 0x60000400030020 nid: 10.51.5.7@o2ib3 remote: 0x6cd65876537fa14b expref: 7 pid: 187958 timeout: 306570 lvb_type: 0 [306565.091052] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 150s: evicting client at 10.51.5.7@o2ib3 ns: filter-oak-OST005e_UUID lock: ffff8bc4a8ea8fc0/0xf81cb920022b6ad lrc: 3/0,0 mode: PW/PW res: [0x20fc21b:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->106495) flags: 0x60000400010020 nid: 10.51.5.7@o2ib3 remote: 0x6cd65876537f9b71 expref: 7 pid: 193014 timeout: 306571 lvb_type: 0 [306572.090871] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 149s: evicting client at 10.51.6.1@o2ib3 ns: filter-oak-OST004e_UUID lock: ffff8bdb73c2ec00/0xf81cb920022c188 lrc: 3/0,0 mode: PW/PW res: [0x21c385a:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->114687) flags: 0x60000400010020 nid: 10.51.6.1@o2ib3 remote: 0xf4991b9a971b8de7 expref: 6 pid: 187429 timeout: 306578 lvb_type: 0 [306593.186400] Lustre: 75146:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619027068/real 1619027068] req@ffff8be4fe6e2d00 x1697354019789632/t0(0) o104->oak-OST005e@10.51.4.39@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619027241 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [306594.824356] LustreError: 194707:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bc2640ef050 x1696864561173312/t0(0) o3->f9b5a9df-ddaf-0ab2-8c1a-f9202576d223@10.50.5.62@o2ib2:52/0 lens 488/440 e 0 to 0 dl 1619027337 ref 1 fl Interpret:/0/0 rc 0/0 [306594.851348] LustreError: 194707:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [306595.183844] Lustre: oak-OST0058: Bulk IO read error with f9b5a9df-ddaf-0ab2-8c1a-f9202576d223 (at 10.50.5.62@o2ib2), client will retry: rc -110 
[306595.198505] Lustre: Skipped 19 previous similar messages [306612.605977] Lustre: oak-OST0036: haven't heard from client 48551941-ecf7-4 (at 10.51.4.65@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bdaa4551c00, cur 1619027261 expire 1619027111 last 1619027034 [306612.628331] Lustre: Skipped 11 previous similar messages [306614.609004] Lustre: oak-OST005e: haven't heard from client 7af22e77-c13a-4 (at 10.51.3.9@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be719d87c00, cur 1619027263 expire 1619027113 last 1619027036 [306614.631234] Lustre: Skipped 1 previous similar message [306646.244795] LustreError: 221477:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8be71ffbd050 x1696614216692096/t0(0) o3->a2bd0da5-60a6-e2af-583e-69fe101fe84c@10.50.10.31@o2ib2:48/0 lens 488/440 e 0 to 0 dl 1619027333 ref 1 fl Interpret:/0/0 rc 0/0 [306646.245038] Lustre: oak-OST003a: Bulk IO read error with a2bd0da5-60a6-e2af-583e-69fe101fe84c (at 10.50.10.31@o2ib2), client will retry: rc -110 [306646.287148] LustreError: 221477:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 20 previous similar messages [306670.390783] Lustre: oak-OST005e: Client 34d5f413-62e7-39d1-943f-2da96f672f7d (at 10.50.8.68@o2ib2) reconnecting [306670.402173] Lustre: Skipped 806 previous similar messages [306693.124046] LustreError: 137-5: oak-OST0051_UUID: not available for connect from 10.51.3.57@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[306791.273464] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.215@o2ib5: error 0(sending)(waiting) [306791.287115] LNet: 50604:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d859cea660) failed: 5 [306791.287791] LNet: 50603:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.215@o2ib5 exceeded retry count 0 [306791.287793] LNet: 50603:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 5 previous similar messages [306791.287796] LustreError: 50603:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bce54518c00 [306791.288843] LustreError: 50605:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcee8d1f400 [306791.289499] LustreError: 50603:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be66d412000 [306791.290490] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be66d414000 [306791.290500] LustreError: 50605:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcee8d1f400 [306791.291334] LustreError: 50603:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be722656800 [306791.292361] LustreError: 50605:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcee8d1f400 [306791.293297] LustreError: 50603:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be66d414000 [306791.416063] LNet: 50604:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 1613 previous similar messages [306829.492940] LustreError: 223906:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be713046850 x1696617339628672/t0(0) o4->057d7d47-6e0c-f38f-eddf-48feb04705f1@10.51.13.12@o2ib3:284/0 lens 488/448 e 0 to 0 dl 1619027569 ref 1 fl Interpret:/0/0 rc 0/0 [306829.520134] Lustre: oak-OST0046: Bulk IO write error with 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3), client will retry: rc = -110 
[306829.535002] Lustre: Skipped 6 previous similar messages [306846.282616] LustreError: 2584:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8bc4ac40b850 x1696887664301568/t0(0) o3->02703246-1d69-353e-49ea-eafa755b5258@10.50.5.8@o2ib2:243/0 lens 488/440 e 0 to 0 dl 1619027528 ref 1 fl Interpret:/0/0 rc 0/0 [306846.282623] LustreError: 193191:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bc4ac40c050 x1685195824051904/t0(0) o3->712cedcb-4de0-4@10.50.7.39@o2ib2:243/0 lens 488/440 e 0 to 0 dl 1619027528 ref 1 fl Interpret:/0/0 rc 0/0 [306846.282775] Lustre: oak-OST0048: Bulk IO read error with 23647018-5b3f-4 (at 10.50.9.8@o2ib2), client will retry: rc -110 [306846.282777] Lustre: Skipped 1 previous similar message [306846.353282] LustreError: 2584:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 2 previous similar messages [306908.488130] LustreError: 137-5: oak-OST005d_UUID: not available for connect from 10.50.0.11@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [306908.507541] LustreError: Skipped 1 previous similar message [307046.330025] LustreError: 224955:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8be71d07c050 x1696864066784320/t0(0) o3->8ff6000c-d966-1cda-f3a5-455db4eb8783@10.51.2.23@o2ib3:441/0 lens 488/440 e 0 to 0 dl 1619027726 ref 1 fl Interpret:/0/0 rc 0/0 [307046.357505] Lustre: oak-OST005a: Bulk IO read error with 8ff6000c-d966-1cda-f3a5-455db4eb8783 (at 10.51.2.23@o2ib3), client will retry: rc -110 [307046.371972] Lustre: Skipped 7 previous similar messages [307083.580216] Lustre: oak-OST005e: Connection restored to 98760d71-1fb8-50a2-442b-fd0b22f1e291 (at 10.50.6.54@o2ib2) [307083.591871] Lustre: Skipped 2297 previous similar messages [307132.612971] LustreError: 137-5: oak-OST0049_UUID: not available for connect from 10.50.9.22@o2ib2 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [307367.298782] Lustre: oak-OST003a: Client a7ebb784-0f8b-4 (at 10.51.6.23@o2ib3) reconnecting [307367.308114] Lustre: Skipped 93 previous similar messages [307684.080364] Lustre: oak-OST005e: Connection restored to 51e35692-a4e4-4 (at 10.50.8.71@o2ib2) [307684.090001] Lustre: Skipped 679 previous similar messages [307745.455221] LustreError: 137-5: oak-OST0039_UUID: not available for connect from 10.51.2.2@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [307856.127165] LustreError: 193425:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bd657b94050 x1696614657477632/t0(0) o3->855f8733-97ad-fe20-a42c-c9a97f6818f7@10.51.4.40@o2ib3:557/0 lens 488/440 e 0 to 0 dl 1619028597 ref 1 fl Interpret:/0/0 rc 0/0 [307856.154186] Lustre: oak-OST0032: Bulk IO read error with 855f8733-97ad-fe20-a42c-c9a97f6818f7 (at 10.51.4.40@o2ib3), client will retry: rc -110 [307966.198623] LustreError: 148672:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bc5e9b5b050 x1696857001056000/t0(0) o4->b7b29b0e-7b9e-7f93-7f9b-e31ab6f299f7@10.50.2.42@o2ib2:665/0 lens 488/448 e 0 to 0 dl 1619028705 ref 1 fl Interpret:/0/0 rc 0/0 [307966.225814] Lustre: oak-OST0030: Bulk IO write error with b7b29b0e-7b9e-7f93-7f9b-e31ab6f299f7 (at 10.50.2.42@o2ib2), client will retry: rc = -110 [307969.184451] Lustre: oak-OST004a: Client c65841de-bdd4-2bc5-0fce-cb37d045d1b1 (at 10.51.4.19@o2ib3) reconnecting [307969.195822] Lustre: Skipped 68 previous similar messages [308249.727516] LNet: 182036:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [308249.741675] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd7549d3400 [308249.753829] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc 
ffff8bd7549d3400 [308249.765981] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71edfa400 [308249.778135] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcde3ab5000 [308249.790293] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be71edfa400 [308249.790302] LustreError: 221471:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be91ed9a850 x1696617564355072/t0(0) o4->c65841de-bdd4-2bc5-0fce-cb37d045d1b1@10.51.4.19@o2ib3:191/0 lens 488/448 e 0 to 0 dl 1619028986 ref 1 fl Interpret:/0/0 rc 0/0 [308249.790304] LustreError: 221471:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 4 previous similar messages [308249.790383] Lustre: oak-OST003a: Bulk IO write error with c65841de-bdd4-2bc5-0fce-cb37d045d1b1 (at 10.51.4.19@o2ib3), client will retry: rc = -110 [308284.150837] Lustre: oak-OST003e: Connection restored to 48551941-ecf7-4 (at 10.51.4.65@o2ib3) [308284.160458] Lustre: Skipped 559 previous similar messages [308295.607398] LustreError: 227991:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(1208320) req@ffff8bd632559850 x1689651984991808/t0(0) o3->28b4f5fd-89a3-4@10.51.4.55@o2ib3:190/0 lens 488/440 e 0 to 0 dl 1619028985 ref 1 fl Interpret:/0/0 rc 0/0 [308295.607427] Lustre: oak-OST0032: Bulk IO read error with bb6977be-fbc5-4 (at 10.51.4.68@o2ib3), client will retry: rc -110 [308295.607805] LustreError: 193409:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be92d50d850 x1684934653524544/t0(0) o3->45f8c97c-22a3-4@10.51.3.26@o2ib3:195/0 lens 488/440 e 0 to 0 dl 1619028990 ref 1 fl Interpret:/0/0 rc 0/0 [308295.608341] LustreError: 217486:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1519265) req@ffff8bdb15cf3850 x1694708046327808/t0(0) o4->7d4930b0-1dcd-4@10.51.13.8@o2ib3:194/0 lens 488/448 e 0 to 0 dl 1619028989 ref 1 fl Interpret:/0/0 
rc 0/0 [308295.608343] LustreError: 217486:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 5 previous similar messages [308295.608434] Lustre: oak-OST0044: Bulk IO write error with 7d4930b0-1dcd-4 (at 10.51.13.8@o2ib3), client will retry: rc = -110 [308295.720772] LustreError: 227991:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 29 previous similar messages [308320.612454] LustreError: 147686:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8be45a223050 x1688598752814016/t0(0) o3->42c69b06-39e7-4@10.51.4.25@o2ib3:196/0 lens 488/440 e 0 to 0 dl 1619028991 ref 1 fl Interpret:/0/0 rc 0/0 [308320.612565] LustreError: 193194:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1519265) req@ffff8bd95afdb050 x1689646563669440/t0(0) o4->9b72cdcc-a3a2-4@10.51.5.19@o2ib3:197/0 lens 488/448 e 0 to 0 dl 1619028992 ref 1 fl Interpret:/0/0 rc 0/0 [308320.612566] LustreError: 193194:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 2 previous similar messages [308320.612631] Lustre: oak-OST004e: Bulk IO write error with 9b72cdcc-a3a2-4 (at 10.51.5.19@o2ib3), client will retry: rc = -110 [308320.612631] Lustre: Skipped 2 previous similar messages [308320.694239] LustreError: 147686:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 14 previous similar messages [308360.049518] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 111s: evicting client at 10.51.4.34@o2ib3 ns: filter-oak-OST0056_UUID lock: ffff8be45038e540/0xf81cb92002c6857 lrc: 3/0,0 mode: PW/PW res: [0x2458555:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->36863) flags: 0x60000400030020 nid: 10.51.4.34@o2ib3 remote: 0x15ea91af01f88073 expref: 7 pid: 193001 timeout: 308366 lvb_type: 0 [308416.526223] Lustre: 193066:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619028891/real 1619028891] req@ffff8bdcb4501b00 x1697354021356736/t0(0) 
o104->oak-OST0038@10.51.4.33@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619029064 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [308619.547076] Lustre: oak-OST0032: Client 90752487-0b3e-1696-21a1-6c81abc18872 (at 10.51.1.2@o2ib3) reconnecting [308619.558351] Lustre: Skipped 997 previous similar messages [308768.814838] LustreError: 137-5: oak-OST0045_UUID: not available for connect from 10.51.15.11@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [308834.993682] LNet: 182036:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.3.5@o2ib5: error 0(sending) [308878.467100] LNet: 182036:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [308878.481053] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be9797a9800 [308878.493258] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd555717c00 [308878.505444] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd85fdc8400 [308878.505450] LustreError: 209389:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be4596f1050 x1688687501521088/t0(0) o4->42e4ab3a-7964-4@10.51.2.69@o2ib3:68/0 lens 488/448 e 0 to 0 dl 1619029618 ref 1 fl Interpret:/0/0 rc 0/0 [308878.505452] LustreError: 209389:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [308878.505519] Lustre: oak-OST005a: Bulk IO write error with 42e4ab3a-7964-4 (at 10.51.2.69@o2ib3), client will retry: rc = -110 [308878.566229] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd85fdc8400 [308878.578382] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd600230400 [308878.590573] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd600230400 
[308878.602733] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3b1fc8800 [308878.614896] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3b1fc8800 [308884.862653] Lustre: oak-OST005a: Connection restored to f6269348-5ee6-4 (at 10.51.13.24@o2ib3) [308884.872370] Lustre: Skipped 1481 previous similar messages [308945.750810] LustreError: 217482:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be71f845050 x1689646565854272/t0(0) o3->9b72cdcc-a3a2-4@10.51.5.19@o2ib3:69/0 lens 488/440 e 0 to 0 dl 1619029619 ref 1 fl Interpret:/0/0 rc 0/0 [308945.750971] Lustre: oak-OST0040: Bulk IO read error with c07d0863-218d-4 (at 10.51.4.30@o2ib3), client will retry: rc -110 [308945.750972] Lustre: Skipped 46 previous similar messages [308945.794498] LustreError: 217482:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [309078.365620] LNet: 182036:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.214@o2ib5: error 0(waiting) [309078.378371] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5fe7c0) failed: 5 [309078.388770] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 30 previous similar messages [309078.388950] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.214@o2ib5 exceeded retry count 0 [309078.388952] LNet: 50609:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 6 previous similar messages [309078.388955] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71b0d6800 [309078.389259] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be71b0d6800 [309078.389301] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd3435b9000 [309145.801385] LustreError: 194709:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk 
READ 0(8192) req@ffff8bb34dfce050 x1685034235663296/t0(0) o3->df5881f1-066b-4@10.50.1.71@o2ib2:268/0 lens 488/440 e 0 to 0 dl 1619029818 ref 1 fl Interpret:/0/0 rc 0/0 [309145.801387] LustreError: 194699:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(8192) req@ffff8bb34dfcd050 x1685040690426240/t0(0) o3->ff473f56-6f13-4@10.50.2.5@o2ib2:268/0 lens 488/440 e 0 to 0 dl 1619029818 ref 1 fl Interpret:/0/0 rc 0/0 [309145.801401] Lustre: oak-OST0050: Bulk IO read error with ff473f56-6f13-4 (at 10.50.2.5@o2ib2), client will retry: rc -110 [309145.801402] Lustre: Skipped 3 previous similar messages [309145.801430] LustreError: 239055:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 35689(1084265) req@ffff8bac54314050 x1689251863967936/t0(0) o4->74bbf7e2-4b61-4@10.50.3.45@o2ib2:269/0 lens 488/448 e 0 to 0 dl 1619029819 ref 1 fl Interpret:/0/0 rc 0/0 [309145.801520] Lustre: oak-OST0042: Bulk IO write error with 74bbf7e2-4b61-4 (at 10.50.3.45@o2ib2), client will retry: rc = -110 [309145.805009] LustreError: 217487:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be673a84050 x1697066924614400/t0(0) o3->b640a7ae-ee3b-9428-1e4e-6897dd2dabf0@10.50.7.71@o2ib2:268/0 lens 488/440 e 0 to 0 dl 1619029818 ref 1 fl Interpret:/0/0 rc 0/0 [309145.936850] LustreError: 194709:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 70 previous similar messages [309194.363040] LNet: 182036:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.214@o2ib5: error 0(sending)(waiting) [309194.376637] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c600200) failed: 5 [309194.378479] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd83bbde400 [309194.387232] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.214@o2ib5 exceeded retry count 0 [309194.387235] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar 
messages [309194.387238] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbfb169c400 [309194.387502] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbfb169c400 [309194.387505] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bbfb1698000 [309194.387506] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8be5a5bde800 [309194.387510] LustreError: 50608:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bde45cc9800 [309194.387514] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bde45cc9800 [309194.387515] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bd83bbde400 [309194.505618] LNet: 50606:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 349 previous similar messages [309240.993556] Lustre: oak-OST005e: Client 44941934-ac39-2b19-c695-ab36186f52dc (at 10.50.10.57@o2ib2) reconnecting [309241.005030] Lustre: Skipped 98 previous similar messages [309245.827065] LustreError: 234136:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bc4c7afc850 x1688831091365184/t0(0) o3->17194102-dead-4@10.50.17.18@o2ib2:385/0 lens 488/440 e 0 to 0 dl 1619029935 ref 1 fl Interpret:/0/0 rc 0/0 [309245.827072] LustreError: 2422:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(8192) req@ffff8bc4c5910050 x1691700698510016/t0(0) o3->f67a35f1-3618-4@10.50.14.13@o2ib2:385/0 lens 488/440 e 0 to 0 dl 1619029935 ref 1 fl Interpret:/0/0 rc 0/0 [309245.827093] Lustre: oak-OST0032: Bulk IO read error with f67a35f1-3618-4 (at 10.50.14.13@o2ib2), client will retry: rc -110 [309245.827094] Lustre: Skipped 73 previous similar messages [309245.896375] LustreError: 234136:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 5 previous similar messages [309250.876910] Lustre: 
193063:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619029726/real 1619029726] req@ffff8be3d33b0480 x1697354022962816/t0(0) o104->oak-OST0046@10.50.0.61@o2ib2:15/16 lens 296/224 e 0 to 1 dl 1619029899 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [309366.276274] LustreError: 221414:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be71fe83850 x1696856981272704/t0(0) o3->2287cf7f-fad1-50b7-542b-1b783e1fae7b@10.50.2.3@o2ib2:561/0 lens 488/440 e 0 to 0 dl 1619030111 ref 1 fl Interpret:/0/0 rc 0/0 [309366.303177] Lustre: oak-OST004a: Bulk IO read error with 2287cf7f-fad1-50b7-542b-1b783e1fae7b (at 10.50.2.3@o2ib2), client will retry: rc -110 [309366.317546] Lustre: Skipped 21 previous similar messages [309367.079286] LustreError: 238493:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bc4cbe21850 x1696963074972480/t0(0) o3->856ef2a6-067b-cb5e-638c-4fcbf516dc6c@10.50.8.62@o2ib2:560/0 lens 488/440 e 0 to 0 dl 1619030110 ref 1 fl Interpret:/0/0 rc 0/0 [309367.106329] LustreError: 238493:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 5 previous similar messages [309370.071176] LustreError: 229133:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bc4b7ce7050 x1691677493196416/t0(0) o3->489ee6f9-e0d9-4@10.50.14.15@o2ib2:564/0 lens 488/440 e 0 to 0 dl 1619030114 ref 1 fl Interpret:/0/0 rc 0/0 [309370.096215] LustreError: 229133:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [309371.854651] LNet: 50603:0:(lib-move.c:976:lnet_post_send_locked()) Aborting message for 12345-10.0.2.215@o2ib5: LNetM[DE]Unlink() already called on the MD/ME. 
[309371.870577] LustreError: 50603:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8bbaabe17c00 [309372.280783] LustreError: 234136:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bbd2e778850 x1696881226611392/t0(0) o3->dcf00494-e949-8e41-7675-4379abab2f9b@10.50.13.5@o2ib2:568/0 lens 488/440 e 0 to 0 dl 1619030118 ref 1 fl Interpret:/0/0 rc 0/0 [309376.331040] LustreError: 221449:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be6569df050 x1696894117014784/t0(0) o3->27ff7f5e-c79d-4@10.50.13.4@o2ib2:568/0 lens 488/440 e 0 to 0 dl 1619030118 ref 1 fl Interpret:/0/0 rc 0/0 [309486.438252] Lustre: oak-OST0032: Connection restored to 04a04758-274c-4 (at 10.51.2.46@o2ib3) [309486.447884] Lustre: Skipped 2218 previous similar messages [309533.644583] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.6.24@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[309533.664026] LustreError: Skipped 1 previous similar message [309606.067757] LustreError: 221475:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd4f95ab050 x1696957646829568/t0(0) o4->3d575049-f2ff-030b-a4ea-af4cdfc8c038@10.50.8.69@o2ib2:40/0 lens 488/448 e 0 to 0 dl 1619030345 ref 1 fl Interpret:/0/0 rc 0/0 [309606.094876] Lustre: oak-OST004c: Bulk IO write error with 3d575049-f2ff-030b-a4ea-af4cdfc8c038 (at 10.50.8.69@o2ib2), client will retry: rc = -110 [309606.109646] Lustre: Skipped 4 previous similar messages [309659.971526] LustreError: 238499:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bc4c5862850 x1696864978388160/t0(0) o3->4e7846b4-d3af-3eb0-574a-20943de34eaf@10.50.10.10@o2ib2:100/0 lens 488/440 e 0 to 0 dl 1619030405 ref 1 fl Interpret:/0/0 rc 0/0 [309659.998754] Lustre: oak-OST005a: Bulk IO read error with 4e7846b4-d3af-3eb0-574a-20943de34eaf (at 10.50.10.10@o2ib2), client will retry: rc -110 [309660.013315] Lustre: Skipped 11 previous similar messages [309859.344109] Lustre: oak-OST0044: Client 6a9de4e6-b748-6da3-5c58-3c2a8da7b38d (at 10.51.3.24@o2ib3) reconnecting [309859.355473] Lustre: Skipped 1965 previous similar messages [310088.862990] Lustre: oak-OST004e: Connection restored to 62ad6c4f-ef15-4 (at 10.50.5.23@o2ib2) [310088.872640] Lustre: Skipped 335 previous similar messages [310460.458630] Lustre: oak-OST0048: Client afb0548c-9287-4 (at 10.51.15.2@o2ib3) reconnecting [310460.467961] Lustre: Skipped 83 previous similar messages [310488.626412] LustreError: 241050:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be6f6db7050 x1696614664743104/t0(0) o4->855f8733-97ad-fe20-a42c-c9a97f6818f7@10.51.4.40@o2ib3:169/0 lens 488/448 e 0 to 0 dl 1619031229 ref 1 fl Interpret:/0/0 rc 0/0 [310488.653568] Lustre: oak-OST003e: Bulk IO write error with 855f8733-97ad-fe20-a42c-c9a97f6818f7 (at 10.51.4.40@o2ib3), client will retry: rc = -110 [310679.329085] LNet: 
182036:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.214@o2ib5: error 0(sending)(waiting) [310679.342702] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) RDMA (tx: ffff99d85c5ff5c0) failed: 5 [310679.343617] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bad579ac400 [310679.353386] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) msg 0@<0:0>->10.0.2.214@o2ib5 exceeded retry count 0 [310679.353388] LNet: 50607:0:(lib-msg.c:698:lnet_attempt_msg_resend()) Skipped 7 previous similar messages [310679.353389] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc4cbf70800 [310679.353392] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc4cc848000 [310679.353400] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bcd505ebc00 [310679.353403] LustreError: 50607:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc4cc848000 [310679.353406] LustreError: 50609:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc4c7ab9800 [310679.353408] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -5, desc ffff8bc4c7ab9800 [310679.459692] LNet: 50608:0:(o2iblnd_cb.c:3710:kiblnd_complete()) Skipped 234 previous similar messages [310690.509021] Lustre: oak-OST0034: Connection restored to 170638d9-2fd6-987a-704d-7adc26421eb6 (at 10.50.5.70@o2ib2) [310690.520719] Lustre: Skipped 361 previous similar messages [310705.264816] LustreError: 114964:0:(ldlm_lib.c:3287:target_bulk_io()) @@@ bulk READ failed: rc -107 req@ffff8bafb8522850 x1696976345578496/t0(0) o3->e548c1d8-3cfa-9f83-2c72-1ee70c34c5a9@10.50.8.67@o2ib2:391/0 lens 488/440 e 0 to 0 dl 1619031451 ref 1 fl Interpret:/0/0 rc 0/0 [310705.264940] Lustre: oak-OST005a: Bulk IO read error with e548c1d8-3cfa-9f83-2c72-1ee70c34c5a9 (at 10.50.8.67@o2ib2), 
client will retry: rc -107 [310705.266998] LustreError: 193413:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bd196302850 x1696976345552000/t0(0) o3->e548c1d8-3cfa-9f83-2c72-1ee70c34c5a9@10.50.8.67@o2ib2:385/0 lens 488/440 e 0 to 0 dl 1619031445 ref 1 fl Interpret:/0/0 rc 0/0 [310705.333531] LustreError: 114964:0:(ldlm_lib.c:3287:target_bulk_io()) Skipped 1 previous similar message [310746.149482] LustreError: 3782:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bc4deb36850 x1696865239203584/t0(0) o3->5ebd6997-e65a-7d53-0818-093cecfcf87e@10.50.5.41@o2ib2:360/0 lens 488/440 e 0 to 0 dl 1619031420 ref 1 fl Interpret:/0/0 rc 0/0 [310746.149503] LustreError: 238504:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(8192) req@ffff8bc4cb89b850 x1687775344601984/t0(0) o3->15df203e-c19a-4@10.50.3.41@o2ib2:360/0 lens 488/440 e 0 to 0 dl 1619031420 ref 1 fl Interpret:/0/0 rc 0/0 [310746.149505] LustreError: 238504:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 17 previous similar messages [310746.149516] Lustre: oak-OST005e: Bulk IO read error with ba34b460-703e-cbe1-c5c9-7187792f041a (at 10.50.2.60@o2ib2), client will retry: rc -110 [310746.149517] Lustre: Skipped 7 previous similar messages [310746.149541] LustreError: 193440:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(2722) req@ffff8bac90578850 x1689924747994496/t0(0) o4->d7e00b58-c1b4-4@10.50.7.32@o2ib2:362/0 lens 488/448 e 0 to 0 dl 1619031422 ref 1 fl Interpret:/0/0 rc 0/0 [310746.149543] LustreError: 193440:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 4 previous similar messages [310746.149556] Lustre: oak-OST004e: Bulk IO write error with d7e00b58-c1b4-4 (at 10.50.7.32@o2ib2), client will retry: rc = -110 [310746.282845] LustreError: 3782:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [310771.159410] LustreError: 204447:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1515640) 
req@ffff8bd2b7a60850 x1696913908033472/t0(0) o4->847f3077-ffdf-4570-5321-c5126a45bdaa@10.50.2.26@o2ib2:394/0 lens 488/448 e 0 to 0 dl 1619031454 ref 1 fl Interpret:/0/0 rc 0/0 [310771.188082] LustreError: 204447:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 4 previous similar messages [310771.199227] Lustre: oak-OST0038: Bulk IO write error with 847f3077-ffdf-4570-5321-c5126a45bdaa (at 10.50.2.26@o2ib2), client will retry: rc = -110 [310771.213997] Lustre: Skipped 4 previous similar messages [310849.744076] Lustre: 193158:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619031325/real 1619031325] req@ffff8bc3a9ff7980 x1697354024266496/t0(0) o106->oak-OST005a@10.50.5.44@o2ib2:15/16 lens 296/280 e 0 to 1 dl 1619031498 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [310870.522975] Lustre: oak-OST0052: haven't heard from client 8a494beb-4a60-4 (at 10.50.4.34@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bae2a2b9800, cur 1619031519 expire 1619031369 last 1619031292 [311049.077451] LustreError: 227922:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd39d842050 x1696614300295168/t0(0) o4->7535e7e6-b397-e9ec-ed05-2fc68240cd4c@10.51.4.38@o2ib3:731/0 lens 488/448 e 0 to 0 dl 1619031791 ref 1 fl Interpret:/0/0 rc 0/0 [311049.104532] LustreError: 227922:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 4 previous similar messages [311049.115263] Lustre: oak-OST0036: Bulk IO write error with 7535e7e6-b397-e9ec-ed05-2fc68240cd4c (at 10.51.4.38@o2ib3), client will retry: rc = -110 [311067.500643] Lustre: oak-OST0042: Client 4f343bd2-a649-c662-9ecd-b48ac04a0cde (at 10.50.7.43@o2ib2) reconnecting [311067.500644] Lustre: oak-OST004c: Client 4f343bd2-a649-c662-9ecd-b48ac04a0cde (at 10.50.7.43@o2ib2) reconnecting [311067.500648] Lustre: Skipped 600 previous similar messages [311302.407024] Lustre: oak-OST0042: Connection restored to f08c42a3-d18f-4 (at 10.50.2.18@o2ib2) [311302.416647] Lustre: 
Skipped 775 previous similar messages [311738.486897] Lustre: oak-OST004e: Client a0f89c7b-b6d6-fcb9-5bf1-39a2ff594123 (at 10.210.12.58@tcp1) reconnecting [311738.498372] Lustre: Skipped 19 previous similar messages [311906.363564] Lustre: oak-OST0038: Connection restored to d4f26081-ad56-4 (at 10.50.12.17@o2ib2) [311906.373300] Lustre: Skipped 283 previous similar messages [311946.633082] LustreError: 137-5: oak-OST005b_UUID: not available for connect from 10.51.4.15@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [312019.467058] Lustre: oak-OST0032: haven't heard from client e8eda0c0-91ca-8ebe-f2c8-a0b8a9f19328 (at 10.210.12.39@tcp1) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be96d25f800, cur 1619032668 expire 1619032518 last 1619032441 [312019.970817] Lustre: oak-OST005c: haven't heard from client 22a26ac0-b9ae-ad30-ba6d-37c4966f2a9e (at 10.210.12.46@tcp1) in 184 seconds. I think it's dead, and I am evicting it. exp ffff8bc4c7bb0800, cur 1619032668 expire 1619032518 last 1619032484 [312019.995329] Lustre: Skipped 64 previous similar messages [312020.974982] Lustre: oak-OST0050: haven't heard from client 0f2a03c7-ee66-3f60-92a5-7cc69846e7c0 (at 10.210.12.50@tcp1) in 193 seconds. I think it's dead, and I am evicting it. 
exp ffff8be96d3c5c00, cur 1619032669 expire 1619032519 last 1619032476 [312020.999441] Lustre: Skipped 189 previous similar messages [312146.382855] LustreError: 5986:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8be3adfd8050 x1696881834629696/t0(0) o3->d507c462-9f0f-5857-aa2a-29a15c595cfc@10.51.6.7@o2ib3:255/0 lens 488/440 e 0 to 0 dl 1619032825 ref 1 fl Interpret:/0/0 rc 0/0 [312146.410029] LustreError: 5986:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 31 previous similar messages [312146.420671] Lustre: oak-OST0058: Bulk IO read error with d507c462-9f0f-5857-aa2a-29a15c595cfc (at 10.51.6.7@o2ib3), client will retry: rc -110 [312146.435052] Lustre: Skipped 38 previous similar messages [312237.455272] LNet: 182036:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [312237.469605] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdac9b1d400 [312237.481753] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce97e10c00 [312237.493929] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd6d4899000 [312237.506087] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd6d4899000 [312237.518236] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd6d4899000 [312237.530396] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd32630a800 [312237.542558] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd32630a800 [312237.554724] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd32630a800 [312237.566903] LustreError: 193190:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd0881f4050 
x1696620161985728/t0(0) o4->84772643-3e0c-a25c-a9b0-965c7b792170@10.51.13.15@o2ib3:408/0 lens 488/448 e 0 to 0 dl 1619032978 ref 1 fl Interpret:/0/0 rc 0/0 [312237.566906] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd6d4899000 [312237.606765] Lustre: oak-OST0056: Bulk IO write error with 84772643-3e0c-a25c-a9b0-965c7b792170 (at 10.51.13.15@o2ib3), client will retry: rc = -110 [312296.410641] LustreError: 221488:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be721251050 x1689700054795008/t0(0) o3->2bbfb89a-a909-4@10.51.5.29@o2ib3:407/0 lens 488/440 e 0 to 0 dl 1619032977 ref 1 fl Interpret:/0/0 rc 0/0 [312296.410643] LustreError: 227992:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bdb870d7050 x1694754895546752/t0(0) o3->64ebd172-d79e-4@10.51.13.3@o2ib3:407/0 lens 488/440 e 0 to 0 dl 1619032977 ref 1 fl Interpret:/0/0 rc 0/0 [312296.410727] Lustre: oak-OST004e: Bulk IO read error with 64ebd172-d79e-4 (at 10.51.13.3@o2ib3), client will retry: rc -110 [312296.410771] LustreError: 209393:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bd389d58850 x1688828540880512/t0(0) o3->966c9511-11a7-4@10.51.2.54@o2ib3:408/0 lens 488/440 e 0 to 0 dl 1619032978 ref 1 fl Interpret:/0/0 rc 0/0 [312296.410800] LustreError: 224956:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1517015) req@ffff8bd4f04c1850 x1691542822049600/t0(0) o4->016623a0-673f-4@10.51.4.37@o2ib3:408/0 lens 488/448 e 0 to 0 dl 1619032978 ref 1 fl Interpret:/0/0 rc 0/0 [312296.410891] Lustre: oak-OST0034: Bulk IO write error with 016623a0-673f-4 (at 10.51.4.37@o2ib3), client will retry: rc = -110 [312296.540355] LustreError: 221488:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 59 previous similar messages [312356.078742] Lustre: oak-OST004a: Client 416c99fb-653c-af17-c233-d5516de96d20 (at 10.51.15.7@o2ib3) reconnecting [312356.090188] 
Lustre: Skipped 53 previous similar messages [312408.106965] Lustre: 193039:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619032883/real 1619032883] req@ffff8bd19cf0b600 x1697354025607296/t0(0) o104->oak-OST005c@10.51.4.58@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619033056 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [312408.137667] Lustre: 193039:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [312427.426822] LNet: 182036:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [312427.440986] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be71ffb5c00 [312427.453148] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce794c5000 [312427.453181] LustreError: 193188:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be4ec8fd850 x1696620163053696/t0(0) o4->84772643-3e0c-a25c-a9b0-965c7b792170@10.51.13.15@o2ib3:598/0 lens 488/448 e 0 to 0 dl 1619033168 ref 1 fl Interpret:/0/0 rc 0/0 [312427.453182] LustreError: 193188:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 2 previous similar messages [312427.453387] Lustre: oak-OST0056: Bulk IO write error with 84772643-3e0c-a25c-a9b0-965c7b792170 (at 10.51.13.15@o2ib3), client will retry: rc = -110 [312427.453387] Lustre: Skipped 9 previous similar messages [312427.524365] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bce794c5000 [312496.441823] LustreError: 209392:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1048576(1500265) req@ffff8be71c01f050 x1689655365541824/t0(0) o4->a7561178-bc92-4@10.51.5.23@o2ib3:598/0 lens 488/448 e 0 to 0 dl 1619033168 ref 1 fl Interpret:/0/0 rc 0/0 [312496.441831] LustreError: 204447:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8be63ba3e050 
x1689680743122048/t0(0) o3->5b3cca43-cba9-4@10.51.5.34@o2ib3:598/0 lens 488/440 e 0 to 0 dl 1619033168 ref 1 fl Interpret:/0/0 rc 0/0 [312496.441845] Lustre: oak-OST0056: Bulk IO write error with 1a448fa9-8f9c-4 (at 10.51.0.65@o2ib3), client will retry: rc = -110 [312496.441900] Lustre: oak-OST0058: Bulk IO read error with ed1ad69c-9415-7f56-f757-f4610de93535 (at 10.51.6.13@o2ib3), client will retry: rc -110 [312496.441901] Lustre: Skipped 64 previous similar messages [312496.442541] LustreError: 193401:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be7eb808050 x1689680743121920/t0(0) o3->5b3cca43-cba9-4@10.51.5.34@o2ib3:598/0 lens 488/440 e 0 to 0 dl 1619033168 ref 1 fl Interpret:/0/0 rc 0/0 [312496.553348] LustreError: 209392:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 25 previous similar messages [312512.526680] Lustre: oak-OST0052: Connection restored to a7278d3a-d141-4 (at 10.50.5.20@o2ib2) [312512.536301] Lustre: Skipped 1058 previous similar messages [312594.837761] LustreError: 241048:0:(ldlm_lib.c:3287:target_bulk_io()) @@@ bulk WRITE failed: rc -107 req@ffff8bde3f959850 x1696620166101376/t0(0) o4->84772643-3e0c-a25c-a9b0-965c7b792170@10.51.13.15@o2ib3:16/0 lens 488/448 e 0 to 0 dl 1619033341 ref 1 fl Interpret:/0/0 rc 0/0 [312594.865936] Lustre: oak-OST003c: Bulk IO write error with 84772643-3e0c-a25c-a9b0-965c7b792170 (at 10.51.13.15@o2ib3), client will retry: rc = -107 [312594.880790] Lustre: Skipped 16 previous similar messages [312595.760574] Lustre: 193164:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619033071/real 1619033071] req@ffff8bb08e93ec00 x1697354025807296/t0(0) o106->oak-OST0048@10.51.2.55@o2ib3:15/16 lens 296/280 e 0 to 1 dl 1619033244 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [312596.968556] LustreError: 221409:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be7029f1050 x1689695454138240/t0(0) o3->d2b765d6-c390-4@10.51.5.1@o2ib3:16/0 
lens 488/440 e 0 to 0 dl 1619033341 ref 1 fl Interpret:/0/0 rc 0/0 [312596.993359] Lustre: oak-OST0050: Bulk IO read error with d2b765d6-c390-4 (at 10.51.5.1@o2ib3), client will retry: rc -110 [312597.005699] Lustre: Skipped 75 previous similar messages [312607.861673] LustreError: 221489:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8be6a25ba850 x1695385553636352/t0(0) o3->c7c97132-e759-4@10.51.15.4@o2ib3:27/0 lens 488/440 e 0 to 0 dl 1619033352 ref 1 fl Interpret:/0/0 rc 0/0 [312611.456087] Lustre: oak-OST0036: haven't heard from client 7d4930b0-1dcd-4 (at 10.51.13.8@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bd85fdcc400, cur 1619033260 expire 1619033110 last 1619033033 [312611.478409] Lustre: Skipped 80 previous similar messages [312618.453135] Lustre: oak-OST003c: haven't heard from client 7e0a9a61-2b20-4 (at 10.51.2.19@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be71ef58800, cur 1619033267 expire 1619033117 last 1619033040 [312618.475457] Lustre: Skipped 1 previous similar message [312722.836662] LustreError: 221461:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bce2d07a050 x1689268954748672/t0(0) o3->a2c5442d-4a42-4@10.51.2.45@o2ib3:143/0 lens 488/440 e 0 to 0 dl 1619033468 ref 1 fl Interpret:/0/0 rc 0/0 [312724.212355] LNet: 50606:0:(lib-move.c:976:lnet_post_send_locked()) Aborting message for 12345-10.0.2.216@o2ib5: LNetM[DE]Unlink() already called on the MD/ME. 
[312724.228289] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8be71eec2400 [312724.240451] LustreError: 50606:0:(events.c:450:server_bulk_callback()) event type 5, status -125, desc ffff8be71eec2400 [312724.252884] Lustre: oak-OST003e: Bulk IO read error with a2c5442d-4a42-4 (at 10.51.2.45@o2ib3), client will retry: rc -110 [312724.265316] Lustre: Skipped 1 previous similar message [313115.406432] Lustre: oak-OST0056: Connection restored to 42f9a3f7-1b42-4 (at 10.50.5.21@o2ib2) [313115.416054] Lustre: Skipped 1120 previous similar messages [313152.332005] Lustre: oak-OST004e: Client 14270342-34f5-f84f-318c-4252a9121c55 (at 10.51.12.22@o2ib3) reconnecting [313152.343498] Lustre: Skipped 1290 previous similar messages [313716.098201] Lustre: oak-OST004a: Connection restored to 6b26a19d-ffb3-4 (at 10.50.9.34@o2ib2) [313716.107821] Lustre: Skipped 346 previous similar messages [313819.143427] Lustre: oak-OST005a: Client 5fbc6dd0-2ef9-b4a9-8121-d2dcfb28fdfb (at 10.50.10.8@o2ib2) reconnecting [313819.154794] Lustre: Skipped 44 previous similar messages [314288.027648] LustreError: 137-5: oak-OST0049_UUID: not available for connect from 10.51.15.4@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [314288.047061] LustreError: Skipped 2 previous similar messages [314316.484325] Lustre: oak-OST0052: Connection restored to 67013b06-bef7-4 (at 10.50.2.30@o2ib2) [314316.493944] Lustre: Skipped 408 previous similar messages [314374.414551] Lustre: oak-OST0030: haven't heard from client bf83d96d-cb1a-4 (at 10.50.0.71@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be976067400, cur 1619035023 expire 1619034873 last 1619034796 [314374.436874] Lustre: Skipped 1 previous similar message [314377.412293] Lustre: oak-OST0052: haven't heard from client bf83d96d-cb1a-4 (at 10.50.0.71@o2ib2) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8be71ff92000, cur 1619035026 expire 1619034876 last 1619034799 [314377.434659] Lustre: Skipped 1 previous similar message [314436.307924] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.51.0.17@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [314481.053033] Lustre: oak-OST0036: Client 416c99fb-653c-af17-c233-d5516de96d20 (at 10.51.15.7@o2ib3) reconnecting [314481.053034] Lustre: oak-OST0044: Client 416c99fb-653c-af17-c233-d5516de96d20 (at 10.51.15.7@o2ib3) reconnecting [314481.053037] Lustre: Skipped 14 previous similar messages [314578.489542] LustreError: 137-5: oak-OST0057_UUID: not available for connect from 10.51.2.28@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [314927.062076] Lustre: oak-OST0038: Connection restored to 70adfa6c-d72a-4 (at 10.51.15.19@o2ib3) [314927.071791] Lustre: Skipped 310 previous similar messages [315082.235909] LustreError: 137-5: oak-OST003b_UUID: not available for connect from 10.51.3.24@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[315090.265099] Lustre: oak-OST0058: Client 56a5a766-0782-0626-7e81-90dde2e2789a (at 10.51.2.28@o2ib3) reconnecting [315090.276469] Lustre: Skipped 82 previous similar messages [315527.815881] Lustre: oak-OST0058: Connection restored to 14eaca5b-4bd5-4 (at 10.49.18.30@o2ib1) [315527.825599] Lustre: Skipped 375 previous similar messages [315592.845131] LustreError: 147686:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bdb8d141050 x1697572767003328/t0(0) o4->ae42a492-45b3-41a5-b54a-2ba5a4359c63@10.50.4.28@o2ib2:748/0 lens 488/448 e 0 to 0 dl 1619036338 ref 1 fl Interpret:/0/0 rc 0/0 [315592.846493] Lustre: oak-OST0034: Bulk IO write error with ae42a492-45b3-41a5-b54a-2ba5a4359c63 (at 10.50.4.28@o2ib2), client will retry: rc = -110 [315592.886989] LustreError: 147686:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [315595.830829] LustreError: 193254:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1116) req@ffff8bd17d62a050 x1696957659604224/t0(0) o4->3d575049-f2ff-030b-a4ea-af4cdfc8c038@10.50.8.69@o2ib2:695/0 lens 488/448 e 0 to 0 dl 1619036285 ref 1 fl Interpret:/0/0 rc 0/0 [315612.260501] LustreError: 193432:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be7010d7850 x1697572767732864/t0(0) o4->ae42a492-45b3-41a5-b54a-2ba5a4359c63@10.50.4.28@o2ib2:6/0 lens 488/448 e 0 to 0 dl 1619036351 ref 1 fl Interpret:/0/0 rc 0/0 [315612.287427] LustreError: 193432:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 3 previous similar messages [315612.298365] Lustre: oak-OST0034: Bulk IO write error with ae42a492-45b3-41a5-b54a-2ba5a4359c63 (at 10.50.4.28@o2ib2), client will retry: rc = -110 [315612.313150] Lustre: Skipped 5 previous similar messages [315711.493063] Lustre: oak-OST0042: Client 3d575049-f2ff-030b-a4ea-af4cdfc8c038 (at 10.50.8.69@o2ib2) reconnecting [315711.504449] Lustre: Skipped 283 previous similar messages [315760.621098] LustreError: 137-5: oak-OST004f_UUID: not available 
for connect from 10.50.5.70@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [315785.158153] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.50.6.13@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [315970.863284] LustreError: 224952:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8be6f6db0850 x1696193437054080/t0(0) o3->afb0548c-9287-4@10.51.15.2@o2ib3:316/0 lens 504/440 e 0 to 0 dl 1619036661 ref 1 fl Interpret:/0/0 rc 0/0 [315970.889580] LustreError: 224952:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 75 previous similar messages [315970.900519] Lustre: oak-OST0054: Bulk IO read error with afb0548c-9287-4 (at 10.51.15.2@o2ib3), client will retry: rc -110 [316135.246550] Lustre: oak-OST0056: Connection restored to (at 10.50.5.67@o2ib2) [316135.254742] Lustre: Skipped 403 previous similar messages [316321.321344] Lustre: oak-OST0034: Client 6c0c8fc9-86c4-9a4d-ddf5-85fe73093cd9 (at 10.51.12.2@o2ib3) reconnecting [316321.332742] Lustre: Skipped 60 previous similar messages [316735.483142] Lustre: oak-OST005c: Connection restored to c7c97132-e759-4 (at 10.51.15.4@o2ib3) [316735.492765] Lustre: Skipped 412 previous similar messages [316937.539453] Lustre: oak-OST0036: Client 330d404b-804c-4 (at 10.51.15.3@o2ib3) reconnecting [316937.548783] Lustre: Skipped 68 previous similar messages [317335.740891] Lustre: oak-OST0044: Connection restored to 216bd0f9-69d4-4 (at 10.50.9.37@o2ib2) [317335.750533] Lustre: Skipped 546 previous similar messages [317673.777984] Lustre: oak-OST0058: Client b1aa0e56-7229-b79d-2adf-6fd2266928f7 (at 10.51.12.6@o2ib3) reconnecting [317673.777985] Lustre: oak-OST0046: Client b1aa0e56-7229-b79d-2adf-6fd2266928f7 (at 10.51.12.6@o2ib3) reconnecting [317673.777988] Lustre: Skipped 30 previous similar messages [317703.058998] LustreError: 137-5: oak-OST003f_UUID: not available for 
connect from 10.51.2.27@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [317703.078501] LustreError: Skipped 2 previous similar messages [317942.021394] Lustre: oak-OST0030: Connection restored to 64d080ce-52f0-4 (at 10.50.9.31@o2ib2) [317942.031032] Lustre: Skipped 631 previous similar messages [318223.118888] LustreError: 137-5: oak-OST005d_UUID: not available for connect from 10.51.12.5@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [318287.787991] Lustre: oak-OST004e: Client 6d1e0888-b170-e983-cc6c-061a821fd6ec (at 10.210.13.22@tcp1) reconnecting [318287.799462] Lustre: Skipped 42 previous similar messages [318521.127510] LustreError: 193444:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bd4c17bb850 x1695844985211968/t0(0) o3->330d404b-804c-4@10.51.15.3@o2ib3:593/0 lens 488/440 e 0 to 0 dl 1619039203 ref 1 fl Interpret:/0/0 rc 0/0 [318521.127659] Lustre: oak-OST004e: Bulk IO read error with 330d404b-804c-4 (at 10.51.15.3@o2ib3), client will retry: rc -110 [318521.166271] LustreError: 193444:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 2 previous similar messages [318546.238544] Lustre: oak-OST0042: Connection restored to 4934bbdf-4206-4 (at 10.50.4.42@o2ib2) [318546.248170] Lustre: Skipped 600 previous similar messages [318917.969760] Lustre: oak-OST005a: Client 416c99fb-653c-af17-c233-d5516de96d20 (at 10.51.15.7@o2ib3) reconnecting [318917.981127] Lustre: Skipped 79 previous similar messages [318996.169601] LustreError: 193401:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 3145728(4194304) req@ffff8bd492cdf050 x1696620287479680/t0(0) o3->84772643-3e0c-a25c-a9b0-965c7b792170@10.51.13.15@o2ib3:306/0 lens 488/440 e 0 to 0 dl 1619039671 ref 1 fl Interpret:/0/0 rc 0/0 [318996.169800] Lustre: oak-OST0042: Bulk IO read error with 84772643-3e0c-a25c-a9b0-965c7b792170 (at 10.51.13.15@o2ib3), client will 
retry: rc -110 [318996.169802] Lustre: Skipped 2 previous similar messages [318996.218504] LustreError: 193401:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 1 previous similar message [319147.017758] Lustre: oak-OST003c: Connection restored to c7c97132-e759-4 (at 10.51.15.4@o2ib3) [319147.027391] Lustre: Skipped 272 previous similar messages [319571.996325] Lustre: oak-OST0032: Client de0b71dd-918a-dbdd-3442-448a7d2edf2a (at 10.51.6.3@o2ib3) reconnecting [319572.007628] Lustre: Skipped 14 previous similar messages [319748.353534] Lustre: oak-OST0046: Connection restored to 089c3668-8c08-4 (at 10.50.5.68@o2ib2) [319748.363178] Lustre: Skipped 352 previous similar messages [320348.778862] Lustre: oak-OST0044: Connection restored to 2d9281f3-e357-4 (at 10.50.9.40@o2ib2) [320348.788483] Lustre: Skipped 805 previous similar messages [320367.166841] Lustre: oak-OST004e: Client 8dd6f024-d3b5-4043-9dff-3e60feb1d17a (at 10.50.4.70@o2ib2) reconnecting [320367.178209] Lustre: Skipped 5 previous similar messages [320527.241195] LNet: 19313:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(waiting) [320527.254343] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be62860cc00 [320527.266517] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be021095000 [320527.266531] LustreError: 227994:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71c0a5050 x1671929438748224/t0(0) o4->8ab7929a-8a09-4@10.51.0.72@o2ib3:393/0 lens 488/448 e 0 to 0 dl 1619041268 ref 1 fl Interpret:/0/0 rc 0/0 [320527.266752] Lustre: oak-OST005c: Bulk IO write error with 8ab7929a-8a09-4 (at 10.51.0.72@o2ib3), client will retry: rc = -110 [320527.316829] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be62860a000 [320527.328975] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, 
status -103, desc ffff8be62860a000 [320595.342110] LustreError: 209396:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be1bb56f850 x1694756769423488/t0(0) o3->64ebd172-d79e-4@10.51.13.3@o2ib3:393/0 lens 488/440 e 0 to 0 dl 1619041268 ref 1 fl Interpret:/0/0 rc 0/0 [320595.342132] LustreError: 221394:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8be1bb56c050 x1671929438748480/t0(0) o4->8ab7929a-8a09-4@10.51.0.72@o2ib3:393/0 lens 488/448 e 0 to 0 dl 1619041268 ref 1 fl Interpret:/0/0 rc 0/0 [320595.342135] LustreError: 221394:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 1 previous similar message [320595.342294] LustreError: 221488:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8be93cea1850 x1696620319447616/t0(0) o3->84772643-3e0c-a25c-a9b0-965c7b792170@10.51.13.15@o2ib3:396/0 lens 488/440 e 0 to 0 dl 1619041271 ref 1 fl Interpret:/0/0 rc 0/0 [320595.342320] Lustre: oak-OST0046: Bulk IO read error with 84772643-3e0c-a25c-a9b0-965c7b792170 (at 10.51.13.15@o2ib3), client will retry: rc -110 [320595.342321] Lustre: Skipped 1 previous similar message [320595.342339] Lustre: oak-OST0052: Bulk IO write error with ba46adca-a399-4 (at 10.51.3.57@o2ib3), client will retry: rc = -110 [320595.465646] LustreError: 209396:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [320698.466759] Lustre: 193037:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619041174/real 1619041174] req@ffff8bb2d7470900 x1697354036269312/t0(0) o104->oak-OST005a@10.51.0.68@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619041347 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [320698.497474] Lustre: 193037:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 1 previous similar message [320949.941817] Lustre: oak-OST0034: Connection restored to 51bad213-c990-4 (at 10.51.6.1@o2ib3) [320949.951386] Lustre: Skipped 647 previous similar messages 
[320982.231194] Lustre: oak-OST0044: Client e9d49521-f8c3-cd6b-2d1b-54d6cf8055c4 (at 10.51.1.9@o2ib3) reconnecting [320982.242462] Lustre: Skipped 249 previous similar messages [321551.062091] Lustre: oak-OST0032: Connection restored to 9bf1345b-84c6-4 (at 10.49.17.25@o2ib1) [321551.071812] Lustre: Skipped 427 previous similar messages [322009.674264] Lustre: oak-OST0054: Client 9458049c-ca8d-335b-3531-2606964e11c0 (at 10.51.2.31@o2ib3) reconnecting [322009.685629] Lustre: Skipped 15 previous similar messages [322151.931201] Lustre: oak-OST003c: Connection restored to 30f1fb69-d9ec-4 (at 10.50.9.33@o2ib2) [322151.940905] Lustre: Skipped 198 previous similar messages [322752.332914] Lustre: oak-OST0042: Connection restored to de57c027-a949-4 (at 10.51.5.55@o2ib3) [322752.342541] Lustre: Skipped 183 previous similar messages [323357.969877] Lustre: oak-OST004e: Connection restored to 90657cbe-1d23-4 (at 10.50.4.33@o2ib2) [323357.979498] Lustre: Skipped 194 previous similar messages [323561.786687] Lustre: oak-OST0042: Client 66e4a5a6-bfff-b7ac-e28c-1480ab22a61f (at 10.50.2.50@o2ib2) reconnecting [323959.446904] Lustre: oak-OST0050: Connection restored to 38b67c9a-160a-4 (at 10.50.4.9@o2ib2) [323959.456426] Lustre: Skipped 320 previous similar messages [324562.478508] Lustre: oak-OST0054: Connection restored to 4f2d98a7-e903-4 (at 10.51.5.61@o2ib3) [324562.488199] Lustre: Skipped 315 previous similar messages [325166.244500] Lustre: oak-OST0034: Connection restored to 5e3fa4ab-9670-4 (at 10.51.4.21@o2ib3) [325166.254119] Lustre: Skipped 463 previous similar messages [325425.724522] Lustre: oak-OST0054: Client 6f0ae26c-9f27-af9d-0215-23b0004a3490 (at 10.50.5.4@o2ib2) reconnecting [325425.735836] Lustre: Skipped 1 previous similar message [325519.346045] Lustre: oak-OST0034: Client 239426d9-7bc0-1b77-9b70-2a297e43e6b9 (at 10.50.12.7@o2ib2) reconnecting [325642.325221] Lustre: oak-OST0044: Client 9a831181-5ba7-76a1-7a41-6a1d0b36f622 (at 10.50.1.59@o2ib2) reconnecting 
[325642.325223] Lustre: oak-OST0048: Client 9a831181-5ba7-76a1-7a41-6a1d0b36f622 (at 10.50.1.59@o2ib2) reconnecting [325642.325226] Lustre: Skipped 3 previous similar messages [325743.756841] LustreError: 137-5: oak-OST005f_UUID: not available for connect from 10.50.6.17@o2ib2 (no target). If you are running an HA pair check that the target is mounted on the other server. [325773.754469] Lustre: oak-OST004e: Connection restored to 346e890a-2480-4 (at 10.49.28.9@o2ib1) [325773.764131] Lustre: Skipped 308 previous similar messages [325811.936698] Lustre: oak-OST0046: Client be21abca-4d92-bbd5-3ccb-1ba191006e06 (at 10.50.6.17@o2ib2) reconnecting [325811.948131] Lustre: Skipped 30 previous similar messages [326241.477259] Lustre: oak-OST0042: Client dd4aadec-fe7c-0c49-bd2e-8929ca375da6 (at 10.50.2.6@o2ib2) reconnecting [326241.488543] Lustre: Skipped 2 previous similar messages [326374.114038] Lustre: oak-OST0052: Connection restored to 4934bbdf-4206-4 (at 10.50.4.42@o2ib2) [326374.123658] Lustre: Skipped 250 previous similar messages [326441.249114] Lustre: oak-OST005a: Client ede94a1a-b345-30c6-e4bb-52a3a03e8e5f (at 10.51.6.18@o2ib3) reconnecting [326968.092288] Lustre: oak-OST0046: Client db35851b-4880-fd68-1dba-fd7cff216c58 (at 10.51.6.27@o2ib3) reconnecting [326989.387023] Lustre: oak-OST0030: Connection restored to 2b8b9c87-271a-4 (at 10.51.4.32@o2ib3) [326989.396645] Lustre: Skipped 274 previous similar messages [327226.063556] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [327226.077831] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bde49516000 [327226.089994] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be723c76000 [327226.089996] LustreError: 220402:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bcd9dc3c850 x1684960848044864/t0(0) 
o4->a6bf3f69-be17-4@10.51.3.25@o2ib3:295/0 lens 488/448 e 0 to 0 dl 1619047965 ref 1 fl Interpret:/0/0 rc 0/0 [327226.090140] Lustre: oak-OST003a: Bulk IO write error with a6bf3f69-be17-4 (at 10.51.3.25@o2ib3), client will retry: rc = -110 [327226.090141] Lustre: Skipped 6 previous similar messages [327270.979805] LustreError: 193432:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 3145728(4194304) req@ffff8bcd9dc3e050 x1684960848044992/t0(0) o4->a6bf3f69-be17-4@10.51.3.25@o2ib3:295/0 lens 488/448 e 0 to 0 dl 1619047965 ref 1 fl Interpret:/0/0 rc 0/0 [327270.979938] Lustre: oak-OST003a: Bulk IO write error with a6bf3f69-be17-4 (at 10.51.3.25@o2ib3), client will retry: rc = -110 [327270.979939] Lustre: Skipped 1 previous similar message [327270.980082] LustreError: 209399:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8bdb2ee4f050 x1691355954002752/t0(0) o3->1dcbe18d-7cb3-4@10.51.6.9@o2ib3:297/0 lens 488/440 e 0 to 0 dl 1619047967 ref 1 fl Interpret:/0/0 rc 0/0 [327270.980084] LustreError: 209399:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 7 previous similar messages [327270.980097] Lustre: oak-OST005e: Bulk IO read error with 1dcbe18d-7cb3-4 (at 10.51.6.9@o2ib3), client will retry: rc -110 [327270.980098] Lustre: Skipped 9 previous similar messages [327271.079184] LustreError: 193432:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 7 previous similar messages [327295.979654] LustreError: 217483:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8be71c6cf850 x1689656198906048/t0(0) o3->4f2d98a7-e903-4@10.51.5.61@o2ib3:298/0 lens 488/440 e 0 to 0 dl 1619047968 ref 1 fl Interpret:/0/0 rc 0/0 [327295.979899] LustreError: 193426:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 3145728(4194304) req@ffff8bcd4e183850 x1684960848071168/t0(0) o4->a6bf3f69-be17-4@10.51.3.25@o2ib3:299/0 lens 488/448 e 0 to 0 dl 1619047969 ref 1 fl Interpret:/0/0 rc 0/0 [327295.979959] Lustre: oak-OST0038: 
Bulk IO write error with 1fbd93da-ab0e-4 (at 10.51.2.40@o2ib3), client will retry: rc = -110 [327295.979960] Lustre: Skipped 2 previous similar messages [327295.980037] Lustre: oak-OST0050: Bulk IO read error with 1dcbe18d-7cb3-4 (at 10.51.6.9@o2ib3), client will retry: rc -110 [327296.062638] LustreError: 217483:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 2 previous similar messages [327389.278212] LustreError: 241059:0:(ldlm_lib.c:3287:target_bulk_io()) @@@ bulk WRITE failed: rc -107 req@ffff8be71ed77050 x1684960849440320/t0(0) o4->a6bf3f69-be17-4@10.51.3.25@o2ib3:466/0 lens 488/448 e 0 to 0 dl 1619048136 ref 1 fl Interpret:/0/0 rc 0/0 [327389.303685] Lustre: oak-OST003a: Bulk IO write error with a6bf3f69-be17-4 (at 10.51.3.25@o2ib3), client will retry: rc = -107 [327389.316418] Lustre: Skipped 6 previous similar messages [327579.865283] Lustre: oak-OST003c: Client 7248222f-4666-93b2-8a86-11760c329552 (at 10.51.15.14@o2ib3) reconnecting [327579.876755] Lustre: Skipped 851 previous similar messages [327592.584223] Lustre: oak-OST005a: Connection restored to d32f812c-619f-4 (at 10.50.2.29@o2ib2) [327592.593877] Lustre: Skipped 1052 previous similar messages [327874.247224] LustreError: 227991:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd0311a7850 x1695840396933632/t0(0) o4->70adfa6c-d72a-4@10.51.15.19@o2ib3:179/0 lens 488/448 e 0 to 0 dl 1619048604 ref 1 fl Interpret:/0/0 rc 0/0 [327874.272426] Lustre: oak-OST0054: Bulk IO write error with 70adfa6c-d72a-4 (at 10.51.15.19@o2ib3), client will retry: rc = -110 [327886.409964] LustreError: 193181:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be459144050 x1695840396938560/t0(0) o4->70adfa6c-d72a-4@10.51.15.19@o2ib3:182/0 lens 488/448 e 0 to 0 dl 1619048607 ref 1 fl Interpret:/0/0 rc 0/0 [327898.271062] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [327898.285243] LustreError: 
50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bde309a3800 [327898.297468] LustreError: 209396:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd84dd73050 x1695840396938560/t0(0) o4->70adfa6c-d72a-4@10.51.15.19@o2ib3:210/0 lens 488/448 e 0 to 0 dl 1619048635 ref 1 fl Interpret:/2/0 rc 0/0 [327898.323003] LustreError: 209396:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [327898.333808] Lustre: oak-OST0032: Bulk IO write error with 70adfa6c-d72a-4 (at 10.51.15.19@o2ib3), client will retry: rc = -110 [327898.346621] Lustre: Skipped 1 previous similar message [327908.406415] LustreError: 227982:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bcd180f9050 x1695840396976576/t0(0) o4->70adfa6c-d72a-4@10.51.15.19@o2ib3:195/0 lens 488/448 e 0 to 0 dl 1619048620 ref 1 fl Interpret:/0/0 rc 0/0 [327946.012350] LustreError: 241058:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(3691321) req@ffff8be71ae20050 x1689700122653504/t0(0) o4->2bbfb89a-a909-4@10.51.5.29@o2ib3:207/0 lens 488/448 e 0 to 0 dl 1619048632 ref 1 fl Interpret:/0/0 rc 0/0 [327946.012588] Lustre: oak-OST0052: Bulk IO write error with 0efbab40-1738-4 (at 10.51.12.23@o2ib3), client will retry: rc = -110 [327946.012589] Lustre: oak-OST004c: Bulk IO write error with 0efbab40-1738-4 (at 10.51.12.23@o2ib3), client will retry: rc = -110 [327946.012590] Lustre: Skipped 1 previous similar message [327946.012590] Lustre: Skipped 1 previous similar message [327946.076210] LustreError: 241058:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 10 previous similar messages [327971.014721] LustreError: 223907:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1047411(4193139) req@ffff8bd371042050 x1689652999799232/t0(0) o4->b290c20f-399c-4@10.51.5.68@o2ib3:219/0 lens 488/448 e 0 to 0 dl 1619048644 ref 1 fl Interpret:/0/0 rc 0/0 [327971.014723] LustreError: 
193439:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(2116) req@ffff8be720a34850 x1691542917555840/t0(0) o4->016623a0-673f-4@10.51.4.37@o2ib3:219/0 lens 488/448 e 0 to 0 dl 1619048644 ref 1 fl Interpret:/0/0 rc 0/0 [327971.014726] LustreError: 193439:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 3 previous similar messages [327971.078167] LustreError: 223907:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 2 previous similar messages [328203.405679] Lustre: oak-OST0044: Connection restored to d32f812c-619f-4 (at 10.50.2.29@o2ib2) [328203.415324] Lustre: Skipped 966 previous similar messages [328236.263221] Lustre: oak-OST0048: Client de0b71dd-918a-dbdd-3442-448a7d2edf2a (at 10.51.6.3@o2ib3) reconnecting [328236.274546] Lustre: Skipped 631 previous similar messages [328404.511892] LustreError: 193406:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd3fc047850 x1695845340727872/t0(0) o4->330d404b-804c-4@10.51.15.3@o2ib3:724/0 lens 488/448 e 0 to 0 dl 1619049149 ref 1 fl Interpret:/0/0 rc 0/0 [328404.536970] Lustre: oak-OST0034: Bulk IO write error with 330d404b-804c-4 (at 10.51.15.3@o2ib3), client will retry: rc = -110 [328404.549708] Lustre: Skipped 10 previous similar messages [328471.257405] LNet: 182038:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [328471.271501] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcbfbb1b800 [328471.283678] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd221d16000 [328471.295839] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc4c7bcac00 [328471.307985] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc4c7bcac00 [328471.320136] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc 
ffff8be725ed6000 [328471.332287] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bc4c7bce800 [328471.332327] LustreError: 227982:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd165f00050 x1695347809177856/t0(0) o4->25ca5655-1c3b-4@10.51.15.8@o2ib3:31/0 lens 488/448 e 0 to 0 dl 1619049211 ref 1 fl Interpret:/0/0 rc 0/0 [328521.064010] LustreError: 220406:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be5aa9c7850 x1689651512155968/t0(0) o3->2c63b434-3a22-4@10.51.5.53@o2ib3:31/0 lens 488/440 e 0 to 0 dl 1619049211 ref 1 fl Interpret:/0/0 rc 0/0 [328521.064012] LustreError: 227990:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8bdf950c4050 x1696208091055424/t0(0) o3->d073f313-60b4-4@10.51.15.5@o2ib3:31/0 lens 488/440 e 0 to 0 dl 1619049211 ref 1 fl Interpret:/0/0 rc 0/0 [328521.064028] Lustre: oak-OST0050: Bulk IO read error with d073f313-60b4-4 (at 10.51.15.5@o2ib3), client will retry: rc -110 [328521.064030] Lustre: Skipped 2 previous similar messages [328521.064061] LustreError: 227997:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(2167) req@ffff8bdf950c4850 x1691972835967232/t0(0) o4->1fbd93da-ab0e-4@10.51.2.40@o2ib3:32/0 lens 488/448 e 0 to 0 dl 1619049212 ref 1 fl Interpret:/0/0 rc 0/0 [328521.064063] LustreError: 227997:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 1 previous similar message [328521.169511] LustreError: 220406:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 3 previous similar messages [328546.068180] LustreError: 217343:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 2097152(4194304) req@ffff8bcb6a2b9050 x1671929871994944/t0(0) o4->8ab7929a-8a09-4@10.51.0.72@o2ib3:38/0 lens 488/448 e 0 to 0 dl 1619049218 ref 1 fl Interpret:/0/0 rc 0/0 [328579.579805] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer expired after 111s: evicting client at 
10.51.13.15@o2ib3 ns: filter-oak-OST0038_UUID lock: ffff8bcb6f5df740/0xf81cb9200d69881 lrc: 3/0,0 mode: PR/PR res: [0x1f31a6a:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x60000400010020 nid: 10.51.13.15@o2ib3 remote: 0xe0447cbca998ff36 expref: 93 pid: 193081 timeout: 328586 lvb_type: 1 [328579.626281] LustreError: 187399:0:(ldlm_lockd.c:256:expired_lock_main()) Skipped 1 previous similar message [328639.184655] LustreError: 193407:0:(ldlm_lib.c:3287:target_bulk_io()) @@@ bulk READ failed: rc -107 req@ffff8be0e4968850 x1685066078888192/t0(0) o3->a489df6d-ddfc-4@10.51.1.57@o2ib3:206/0 lens 488/440 e 0 to 0 dl 1619049386 ref 1 fl Interpret:/0/0 rc 0/0 [328639.185637] Lustre: oak-OST0042: Bulk IO read error with a489df6d-ddfc-4 (at 10.51.1.57@o2ib3), client will retry: rc -107 [328639.185639] Lustre: Skipped 29 previous similar messages [328639.228426] LustreError: 193407:0:(ldlm_lib.c:3287:target_bulk_io()) Skipped 6 previous similar messages [328641.415335] Lustre: 75146:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1619049117/real 1619049117] req@ffff8bcfedc39200 x1697354044150400/t0(0) o104->oak-OST004e@10.51.13.15@o2ib3:15/16 lens 296/224 e 0 to 1 dl 1619049290 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 [328659.576238] LustreError: 217489:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff8bceb2cc7850 x1695484195402688/t0(0) o3->61781ed1-b14e-4@10.51.13.4@o2ib3:224/0 lens 488/440 e 0 to 0 dl 1619049404 ref 1 fl Interpret:/0/0 rc 0/0 [328659.601258] Lustre: oak-OST0048: Bulk IO read error with 61781ed1-b14e-4 (at 10.51.13.4@o2ib3), client will retry: rc -110 [328659.613686] Lustre: Skipped 6 previous similar messages [328803.886124] Lustre: oak-OST005a: Connection restored to (at 10.50.1.58@o2ib2) [328803.894289] Lustre: Skipped 1165 previous similar messages [328924.419477] Lustre: oak-OST0052: Client 33016c70-be50-803e-0431-7a51cafca4a6 (at 
10.51.14.19@o2ib3) reconnecting [328924.431002] Lustre: Skipped 962 previous similar messages [329006.642442] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.3.63@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [329102.898570] LustreError: 217489:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be75707c850 x1696667686121856/t0(0) o4->f71d1188-d01c-9fa0-935c-a63ad652da8a@10.51.12.21@o2ib3:658/0 lens 488/448 e 0 to 0 dl 1619049838 ref 1 fl Interpret:/0/0 rc 0/0 [329102.925918] Lustre: oak-OST0040: Bulk IO write error with f71d1188-d01c-9fa0-935c-a63ad652da8a (at 10.51.12.21@o2ib3), client will retry: rc = -110 [329102.940762] Lustre: Skipped 4 previous similar messages [329104.376102] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.12.21@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xf6280525 [329104.393117] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 3 previous similar messages [329177.597196] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.51.0.17@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [329177.597197] LustreError: 137-5: oak-OST004d_UUID: not available for connect from 10.51.0.17@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [329177.636015] LustreError: Skipped 2 previous similar messages [329407.695282] Lustre: oak-OST003e: Connection restored to 0c6c3e10-78fc-c243-408c-39766a782302 (at 10.51.3.4@o2ib3) [329407.706841] Lustre: Skipped 235 previous similar messages [329480.477637] LustreError: 137-5: oak-OST0039_UUID: not available for connect from 10.51.12.4@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[329506.763157] LustreError: 193187:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd638985850 x1695347811006400/t0(0) o4->25ca5655-1c3b-4@10.51.15.8@o2ib3:313/0 lens 488/448 e 0 to 0 dl 1619050248 ref 1 fl Interpret:/0/0 rc 0/0 [329506.788213] LustreError: 193187:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 4 previous similar messages [329529.925183] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.15.8@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xf66a965d [329529.942074] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 5 previous similar messages [329542.383652] Lustre: oak-OST005e: Client 7f1b7392-400d-f93e-0c1e-8292ad9bca46 (at 10.51.13.6@o2ib3) reconnecting [329542.395045] Lustre: Skipped 112 previous similar messages [329564.986654] LNet: 3985:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [329565.000633] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcf66d94800 [329565.012812] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcd5c9e4c00 [329565.012818] LustreError: 217342:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bce44197050 x1689653836272320/t0(0) o4->dcedf54d-d80a-4@10.51.5.7@o2ib3:371/0 lens 488/448 e 0 to 0 dl 1619050306 ref 1 fl Interpret:/0/0 rc 0/0 [329586.676257] LustreError: 137-5: oak-OST004d_UUID: not available for connect from 10.51.15.9@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[329620.160531] LustreError: 220400:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 182322(1230898) req@ffff8be9068eb050 x1696626596809600/t0(0) o4->156315a7-a82d-b4fe-847a-396165636f38@10.51.14.3@o2ib3:358/0 lens 488/448 e 0 to 0 dl 1619050293 ref 1 fl Interpret:/0/0 rc 0/0 [329620.160686] LustreError: 241059:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(8192) req@ffff8be3f9e4e850 x1688693701457280/t0(0) o3->a7834140-6ca8-4@10.51.2.9@o2ib3:369/0 lens 488/440 e 0 to 0 dl 1619050304 ref 1 fl Interpret:/0/0 rc 0/0 [329620.160687] LustreError: 241059:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 25 previous similar messages [329620.160700] Lustre: oak-OST0046: Bulk IO read error with a7834140-6ca8-4 (at 10.51.2.9@o2ib3), client will retry: rc -110 [329620.160860] LustreError: 221481:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be6c1188050 x1689653836272768/t0(0) o4->dcedf54d-d80a-4@10.51.5.7@o2ib3:371/0 lens 488/448 e 0 to 0 dl 1619050306 ref 1 fl Interpret:/0/0 rc 0/0 [329620.262922] LustreError: 220400:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 8 previous similar messages [329777.158207] LustreError: 137-5: oak-OST004b_UUID: not available for connect from 10.51.15.6@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[329777.177616] LustreError: Skipped 2 previous similar messages [329785.529659] LustreError: 193426:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd1e479a050 x1696626597141696/t0(0) o4->156315a7-a82d-b4fe-847a-396165636f38@10.51.14.3@o2ib3:549/0 lens 488/448 e 0 to 0 dl 1619050484 ref 1 fl Interpret:/0/0 rc 0/0 [329785.556741] LustreError: 193426:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 1 previous similar message [329785.567582] Lustre: oak-OST0042: Bulk IO write error with 156315a7-a82d-b4fe-847a-396165636f38 (at 10.51.14.3@o2ib3), client will retry: rc = -110 [329785.582357] Lustre: Skipped 17 previous similar messages [329838.018277] LustreError: 137-5: oak-OST0037_UUID: not available for connect from 10.51.15.2@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [329838.037691] LustreError: Skipped 1 previous similar message [329868.482707] LustreError: 193198:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8bd04ea78050 x1695347811448640/t0(0) o4->25ca5655-1c3b-4@10.51.15.8@o2ib3:671/0 lens 488/448 e 0 to 0 dl 1619050606 ref 1 fl Interpret:/0/0 rc 0/0 [329868.507744] LustreError: 193198:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 4 previous similar messages [329899.489377] LustreError: 137-5: oak-OST005f_UUID: not available for connect from 10.51.6.13@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[329899.508793] LustreError: Skipped 1 previous similar message [329991.976384] LNet: 3985:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [329991.990320] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd3da517400 [329992.002651] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd68c861400 [329992.014800] LustreError: 193254:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bdca5c01050 x1696622008449920/t0(0) o4->976970b5-5fad-aab3-00a6-23fd47844552@10.51.13.16@o2ib3:30/0 lens 488/448 e 0 to 0 dl 1619050720 ref 1 fl Interpret:/0/0 rc 0/0 [330002.825388] LustreError: 137-5: oak-OST0055_UUID: not available for connect from 10.51.0.12@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [330002.825389] LustreError: 137-5: oak-OST0053_UUID: not available for connect from 10.51.0.12@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[330010.106341] Lustre: oak-OST003a: Connection restored to 5514f76a-ebfd-4c1b-c28e-c2a228cc78fe (at 10.51.1.18@o2ib3) [330010.117995] Lustre: Skipped 791 previous similar messages [330020.187114] LustreError: 193198:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(1231818) req@ffff8be593ee5850 x1696626597648896/t0(0) o4->156315a7-a82d-b4fe-847a-396165636f38@10.51.14.3@o2ib3:27/0 lens 488/448 e 0 to 0 dl 1619050717 ref 1 fl Interpret:/0/0 rc 0/0 [330045.188166] LustreError: 217336:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 466944(1515520) req@ffff8bdaa05d7050 x1696622008452288/t0(0) o4->976970b5-5fad-aab3-00a6-23fd47844552@10.51.13.16@o2ib3:34/0 lens 488/448 e 0 to 0 dl 1619050724 ref 1 fl Interpret:/0/0 rc 0/0 [330045.188586] LustreError: 9691:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be731b61050 x1688706924757952/t0(0) o3->2f0be466-619b-4@10.51.1.67@o2ib3:46/0 lens 488/440 e 0 to 0 dl 1619050736 ref 1 fl Interpret:/0/0 rc 0/0 [330045.188598] Lustre: oak-OST0032: Bulk IO read error with 2f0be466-619b-4 (at 10.51.1.67@o2ib3), client will retry: rc -110 [330045.188599] Lustre: Skipped 5 previous similar messages [330045.188624] LustreError: 220406:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8be7aaa09050 x1685072721709824/t0(0) o3->2336974a-ecec-4@10.51.13.22@o2ib3:46/0 lens 488/440 e 0 to 0 dl 1619050736 ref 1 fl Interpret:/0/0 rc 0/0 [330045.188627] LustreError: 220406:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 5 previous similar messages [330045.297078] LustreError: 217336:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 18 previous similar messages [330144.451020] Lustre: oak-OST004e: Client 9768808c-691d-4f78-accb-0d29f922d720 (at 10.51.13.21@o2ib3) reconnecting [330144.462474] Lustre: Skipped 431 previous similar messages [330146.074209] LustreError: 193254:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be719d97050 
x1696626598008640/t0(0) o4->156315a7-a82d-b4fe-847a-396165636f38@10.51.14.3@o2ib3:202/0 lens 488/448 e 0 to 0 dl 1619050892 ref 1 fl Interpret:/0/0 rc 0/0 [330146.101282] LustreError: 193254:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 2 previous similar messages [330153.911055] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.14.3@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xf6d55615 [330153.927964] LNet: 50609:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 5 previous similar messages [330170.195493] LustreError: 193188:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 0(2460693) req@ffff8be7237d6050 x1697491947521728/t0(0) o4->ed25cc1b-f8c2-8fce-f51c-bb486337b589@10.51.14.5@o2ib3:174/0 lens 488/448 e 0 to 0 dl 1619050864 ref 1 fl Interpret:/0/0 rc 0/0 [330170.223536] LustreError: 193188:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 2 previous similar messages [330177.051019] Lustre: oak-OST005c: haven't heard from client 1e90ddc0-ff8c-4 (at 10.51.4.39@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff8bcd505ed000, cur 1619050826 expire 1619050676 last 1619050599 [330177.073500] Lustre: Skipped 45 previous similar messages [330184.226577] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.14.6@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xf6cea755 [330184.243479] LNet: 50607:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [330220.197155] LustreError: 221485:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 1048576(4194304) req@ffff8bdfbd7a2850 x1695840403021440/t0(0) o3->70adfa6c-d72a-4@10.51.15.19@o2ib3:210/0 lens 488/440 e 0 to 0 dl 1619050900 ref 1 fl Interpret:/2/0 rc 0/0 [330220.197346] Lustre: oak-OST004c: Bulk IO read error with 70adfa6c-d72a-4 (at 10.51.15.19@o2ib3), client will retry: rc -110 [330220.197347] Lustre: Skipped 8 previous similar messages [330220.242217] LustreError: 221485:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 8 previous similar messages [330228.100434] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.14.1@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xf6e21ba5 [330228.117330] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 6 previous similar messages [330245.200152] LustreError: 241054:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 1307898(3405050) req@ffff8bd105c05850 x1697054995118144/t0(0) o4->987b9366-48a1-9307-ff57-52c8cdd1c49f@10.51.15.13@o2ib3:250/0 lens 488/448 e 0 to 0 dl 1619050940 ref 1 fl Interpret:/0/0 rc 0/0 [330257.008166] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.15.13@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xf6ddd7ad [330257.025154] LNet: 50606:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 2 previous similar messages [330271.610018] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) 10.0.2.105@o2ib5: Dropping REPLY from 12345-10.51.15.15@o2ib3 for invalid MD 0x1676da12a0d3b7fa.0xf6e822b5 
[330271.627005] LNet: 50608:0:(lib-move.c:3930:lnet_parse_reply()) Skipped 1 previous similar message [330407.232498] LNet: 3985:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [330407.246441] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8be2b58ca000 [330417.305222] LustreError: 137-5: oak-OST0041_UUID: not available for connect from 10.51.6.26@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [330417.324639] LustreError: Skipped 4 previous similar messages [330425.444630] LustreError: 217481:0:(ldlm_lib.c:3338:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8be7aa9ba850 x1696863220540608/t0(0) o4->7248222f-4666-93b2-8a86-11760c329552@10.51.15.14@o2ib3:472/0 lens 488/448 e 0 to 0 dl 1619051162 ref 1 fl Interpret:/0/0 rc 0/0 [330425.471826] LustreError: 217481:0:(ldlm_lib.c:3338:target_bulk_io()) Skipped 12 previous similar messages [330425.482855] Lustre: oak-OST003e: Bulk IO write error with 7248222f-4666-93b2-8a86-11760c329552 (at 10.51.15.14@o2ib3), client will retry: rc = -110 [330425.497707] Lustre: Skipped 45 previous similar messages [330470.208269] LustreError: 241058:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be8151fe050 x1685072722986368/t0(0) o4->2336974a-ecec-4@10.51.13.22@o2ib3:456/0 lens 488/448 e 0 to 0 dl 1619051146 ref 1 fl Interpret:/0/0 rc 0/0 [330500.964567] LNet: 3985:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.216@o2ib5: error 0(sending)(waiting) [330500.978516] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bda0b0e2000 [330500.990694] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcbbb50e800 [330501.002848] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdb8b597000 
[330501.015064] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bdb8b597000 [330501.027269] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bcb93d19400 [330501.039456] LustreError: 217331:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8be71cbbf050 x1689674757031168/t0(0) o4->566f9fe9-38b0-4@10.51.5.10@o2ib3:548/0 lens 488/448 e 0 to 0 dl 1619051238 ref 1 fl Interpret:/0/0 rc 0/0 [330545.214647] LustreError: 209396:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ truncated bulk GET 183344(1231920) req@ffff8be71cbbc050 x1696616874449920/t0(0) o4->5f42d279-fa63-9a7e-a46d-4dfc7a2a7ba3@10.51.14.2@o2ib3:527/0 lens 488/448 e 0 to 0 dl 1619051217 ref 1 fl Interpret:/0/0 rc 0/0 [330545.216459] LustreError: 193181:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8be723746850 x1696801643001280/t0(0) o3->4eca94b5-20cc-8fd4-b8fd-ebd10e301645@10.51.1.21@o2ib3:551/0 lens 488/440 e 0 to 0 dl 1619051241 ref 1 fl Interpret:/0/0 rc 0/0 [330545.216461] LustreError: 193406:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk READ req@ffff8bcbc1a9b850 x1696801643001920/t0(0) o3->4eca94b5-20cc-8fd4-b8fd-ebd10e301645@10.51.1.21@o2ib3:551/0 lens 488/440 e 0 to 0 dl 1619051241 ref 1 fl Interpret:/0/0 rc 0/0 [330545.216464] LustreError: 187418:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 2097152(4194304) req@ffff8bd6a9d2c050 x1696801643002048/t0(0) o3->4eca94b5-20cc-8fd4-b8fd-ebd10e301645@10.51.1.21@o2ib3:551/0 lens 488/440 e 0 to 0 dl 1619051241 ref 1 fl Interpret:/0/0 rc 0/0 [330545.216826] Lustre: oak-OST004a: Bulk IO read error with 4eca94b5-20cc-8fd4-b8fd-ebd10e301645 (at 10.51.1.21@o2ib3), client will retry: rc -110 [330545.216827] Lustre: Skipped 1 previous similar message [330545.346542] LustreError: 209396:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 6 previous similar messages [330570.218806] LustreError: 
209388:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(1090) req@ffff8be005860050 x1690673830154304/t0(0) o3->bc175ba0-453f-4@10.51.1.25@o2ib3:554/0 lens 488/440 e 0 to 0 dl 1619051244 ref 1 fl Interpret:/0/0 rc 0/0 [330570.244247] Lustre: oak-OST0052: Bulk IO read error with bc175ba0-453f-4 (at 10.51.1.25@o2ib3), client will retry: rc -110 [330570.256677] Lustre: Skipped 3 previous similar messages [330610.592320] Lustre: oak-OST004a: Connection restored to 80873df8-db2a-4 (at 10.50.10.62@o2ib2) [330610.602036] Lustre: Skipped 807 previous similar messages [330688.064206] Lustre: oak-OST003c: haven't heard from client a072242b-693d-4 (at 10.51.1.31@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8be4dd3c7800, cur 1619051337 expire 1619051187 last 1619051110 [330691.070060] Lustre: oak-OST0046: haven't heard from client a072242b-693d-4 (at 10.51.1.31@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8bd605b8b400, cur 1619051340 expire 1619051190 last 1619051113 [330752.308515] Lustre: oak-OST005a: Client 057d7d47-6e0c-f38f-eddf-48feb04705f1 (at 10.51.13.12@o2ib3) reconnecting [330752.319984] Lustre: Skipped 1208 previous similar messages [330808.180622] LNet: 3985:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [330808.194671] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd449b59000 [330808.206831] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bce6bfc8c00 [330808.218983] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be021095c00 [330808.219026] LustreError: 224953:0:(ldlm_lib.c:3344:target_bulk_io()) @@@ network error on bulk WRITE req@ffff8bd201669850 x1689648483028672/t0(0) o4->994b1785-908c-4@10.51.4.67@o2ib3:99/0 lens 488/448 e 0 to 0 dl 1619051544 ref 1 fl Interpret:/0/0 rc 0/0 [330808.219029] 
LustreError: 224953:0:(ldlm_lib.c:3344:target_bulk_io()) Skipped 1 previous similar message [330808.267039] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd741465400 [330808.279190] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd741465400 [330820.225673] LustreError: 224959:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4096) req@ffff8bde6a315050 x1697061381486592/t0(0) o3->db35851b-4880-fd68-1dba-fd7cff216c58@10.51.6.27@o2ib3:59/0 lens 488/440 e 0 to 0 dl 1619051504 ref 1 fl Interpret:/0/0 rc 0/0 [330820.253071] Lustre: oak-OST003a: Bulk IO read error with db35851b-4880-fd68-1dba-fd7cff216c58 (at 10.51.6.27@o2ib3), client will retry: rc -110 [330862.179337] LNet: 3985:0:(o2iblnd_cb.c:2085:kiblnd_close_conn_locked()) Closing conn to 10.0.2.217@o2ib5: error 0(sending)(waiting) [330862.193308] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bd717b9a400 [330862.205468] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8be5a5bdcc00 [330862.217614] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcbfbb1f000 [330862.229774] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 5, status -103, desc ffff8bcbfbb1f000 [330862.241940] LustreError: 50599:0:(events.c:450:server_bulk_callback()) event type 3, status -103, desc ffff8bd5e9eed400 [330870.236418] Lustre: oak-OST005c: Bulk IO read error with 330d404b-804c-4 (at 10.51.15.3@o2ib3), client will retry: rc -110 [330870.236742] LustreError: 209396:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(4194304) req@ffff8be1f5cbf050 x1695840405426496/t0(0) o3->70adfa6c-d72a-4@10.51.15.19@o2ib3:112/0 lens 488/440 e 0 to 0 dl 1619051557 ref 1 fl Interpret:/0/0 rc 0/0 [330870.237300] LustreError: 211957:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) @@@ 
truncated bulk GET 184318(1232894) req@ffff8bcf777cb850 x1697491951858048/t0(0) o4->ed25cc1b-f8c2-8fce-f51c-bb486337b589@10.51.14.5@o2ib3:122/0 lens 488/448 e 0 to 0 dl 1619051567 ref 1 fl Interpret:/0/0 rc 0/0
[330870.237302] LustreError: 211957:0:(sec.c:2485:sptlrpc_svc_unwrap_bulk()) Skipped 6 previous similar messages
[330870.314235] Lustre: Skipped 4 previous similar messages
[330920.240232] LustreError: 204448:0:(ldlm_lib.c:3353:target_bulk_io()) @@@ truncated bulk READ 0(1145) req@ffff8bd1df64a050 x1689654758708608/t0(0) o3->3dc6d22e-58b7-4@10.51.5.3@o2ib3:159/0 lens 488/440 e 0 to 0 dl 1619051604 ref 1 fl Interpret:/0/0 rc 0/0
[330920.240501] Lustre: oak-OST003e: Bulk IO read error with 330d404b-804c-4 (at 10.51.15.3@o2ib3), client will retry: rc -110
[330920.277984] LustreError: 204448:0:(ldlm_lib.c:3353:target_bulk_io()) Skipped 2 previous similar messages
[330956.154500] LNetError: 50573:0:(peer.c:282:lnet_destroy_peer_locked()) ASSERTION( list_empty(&lp->lp_peer_nets) ) failed:
[330956.166945] LNetError: 50573:0:(peer.c:282:lnet_destroy_peer_locked()) LBUG
[330956.174812] Pid: 50573, comm: lnet_discovery 3.10.0-1160.6.1.el7_lustre.pl1.x86_64 #1 SMP Mon Dec 14 21:25:04 PST 2020
[330956.186861] Call Trace:
[330956.189703] [] libcfs_call_trace+0x8c/0xc0 [libcfs]
[330956.197114] [] lbug_with_loc+0x4c/0xa0 [libcfs]
[330956.204125] [] lnet_destroy_peer_locked+0x24a/0x350 [lnet]
[330956.212254] [] lnet_peer_discovery_complete+0x2a5/0x350 [lnet]
[330956.220715] [] lnet_peer_discovery+0x6c0/0x1140 [lnet]
[330956.228410] [] kthread+0xd1/0xe0
[330956.233965] [] ret_from_fork_nospec_begin+0x7/0x21
[330956.241271] [] 0xffffffffffffffff
[330956.246936] Kernel panic - not syncing: LBUG
[330956.251797] CPU: 22 PID: 50573 Comm: lnet_discovery Kdump: loaded Tainted: G OE ------------ 3.10.0-1160.6.1.el7_lustre.pl1.x86_64 #1
[330956.266548] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017
[330956.274996] Call Trace:
[330956.277828] [] dump_stack+0x19/0x1b
[330956.283659] [] panic+0xe8/0x21f
[330956.289118] [] lbug_with_loc+0x9b/0xa0 [libcfs]
[330956.296131] [] lnet_destroy_peer_locked+0x24a/0x350 [lnet]
[330956.304198] [] lnet_peer_discovery_complete+0x2a5/0x350 [lnet]
[330956.312653] [] lnet_peer_discovery+0x6c0/0x1140 [lnet]
[330956.320328] [] ? wake_up_atomic_t+0x30/0x30
[330956.326940] [] ? lnet_peer_merge_data+0xe00/0xe00 [lnet]
[330956.334805] [] kthread+0xd1/0xe0
[330956.340344] [] ? insert_kthread_work+0x40/0x40
[330956.347242] [] ret_from_fork_nospec_begin+0x7/0x21
[330956.354526] [] ? insert_kthread_work+0x40/0x40