[ 37.666513] sd 12:0:256:0: [sdwo] Attached SCSI disk
[ 37.666556] sd 12:0:248:0: [sdwg] Attached SCSI disk
[ 37.666808] sd 12:0:222:0: [sdvh] Attached SCSI disk
[ 37.667168] sd 12:0:96:0: [sdqn] Attached SCSI disk
[ 37.667489] sd 12:0:188:0: [sdtz] Attached SCSI disk
[ 37.667556] sd 12:0:285:0: [sdxr] Attached SCSI disk
[ 37.667674] sd 12:0:200:0: [sdul] Attached SCSI disk
[ 37.667847] sd 12:0:190:0: [sdub] Attached SCSI disk
[ 37.667909] sd 12:0:272:0: [sdxe] Attached SCSI disk
[ 37.667992] sd 12:0:295:0: [sdyb] Attached SCSI disk
[ 37.669507] sd 12:0:283:0: [sdxp] Attached SCSI disk
[ 37.669563] sd 12:0:201:0: [sdum] Attached SCSI disk
[ 37.670370] sd 12:0:181:0: [sdtt] Attached SCSI disk
[ 37.670567] sd 12:0:264:0: [sdww] Attached SCSI disk
[ 37.672606] sd 12:0:300:0: [sdyg] Attached SCSI disk
[ 37.673311] sd 12:0:236:0: [sdvv] Attached SCSI disk
[ 37.673371] sd 12:0:229:0: [sdvo] Attached SCSI disk
[ 37.673410] sd 12:0:224:0: [sdvj] Attached SCSI disk
[ 37.674774] sd 12:0:203:0: [sduo] Attached SCSI disk
[ 37.675021] sd 12:0:292:0: [sdxy] Attached SCSI disk
[ 37.675420] sd 12:0:217:0: [sdvc] Attached SCSI disk
[ 37.675499] sd 12:0:280:0: [sdxm] Attached SCSI disk
[ 37.677194] sd 12:0:226:0: [sdvl] Attached SCSI disk
[ 37.677505] sd 12:0:348:0: [sdaab] Attached SCSI disk
[ 37.677849] sd 12:0:205:0: [sduq] Attached SCSI disk
[ 37.678168] sd 12:0:268:0: [sdxa] Attached SCSI disk
[ 37.679130] sd 12:0:228:0: [sdvn] Attached SCSI disk
[ 37.680840] sd 12:0:287:0: [sdxt] Attached SCSI disk
[ 37.681080] sd 12:0:296:0: [sdyc] Attached SCSI disk
[ 37.681140] sd 12:0:275:0: [sdxh] Attached SCSI disk
[ 37.681679] sd 12:0:204:0: [sdup] Attached SCSI disk
[ 37.682172] sd 12:0:255:0: [sdwn] Attached SCSI disk
[ 37.682471] sd 12:0:263:0: [sdwv] Attached SCSI disk
[ 37.682830] sd 12:0:260:0: [sdws] Attached SCSI disk
[ 37.683340] sd 12:0:237:0: [sdvw] Attached SCSI disk
[ 37.684300] sd 12:0:252:0: [sdwk] Attached SCSI disk
[ 37.686797] sd 12:0:214:0: [sduz] Attached SCSI disk
[ 37.689437] sd 12:0:299:0: [sdyf] Attached SCSI disk
[ 37.690832] sd 12:0:297:0: [sdyd] Attached SCSI disk
[ 37.691942] sd 12:0:227:0: [sdvm] Attached SCSI disk
[ 37.693039] sd 12:0:265:0: [sdwx] Attached SCSI disk
[ 37.693064] sd 12:0:267:0: [sdwz] Attached SCSI disk
[ 37.693088] sd 12:0:289:0: [sdxv] Attached SCSI disk
[ 37.693093] sd 12:0:279:0: [sdxl] Attached SCSI disk
[ 37.693373] sd 12:0:254:0: [sdwm] Attached SCSI disk
[ 37.693586] sd 12:0:290:0: [sdxw] Attached SCSI disk
[ 37.693655] sd 12:0:288:0: [sdxu] Attached SCSI disk
[ 37.694048] sd 12:0:276:0: [sdxi] Attached SCSI disk
[ 37.694128] sd 12:0:277:0: [sdxj] Attached SCSI disk
[ 37.695081] sd 12:0:302:0: [sdyi] Attached SCSI disk
[ 37.695894] sd 12:0:266:0: [sdwy] Attached SCSI disk
[ 37.696121] sd 12:0:278:0: [sdxk] Attached SCSI disk
[ 37.696293] sd 12:0:303:0: [sdyj] Attached SCSI disk
[ 37.696357] sd 12:0:240:0: [sdvz] Attached SCSI disk
[ 37.697697] sd 12:0:304:0: [sdyk] Attached SCSI disk
[ 37.698359] sd 12:0:301:0: [sdyh] Attached SCSI disk
[ 37.698923] sd 12:0:253:0: [sdwl] Attached SCSI disk
[ 37.705393] sd 12:0:215:0: [sdva] Attached SCSI disk
[ 37.709633] sd 12:0:291:0: [sdxx] Attached SCSI disk
[ 37.714355] sd 12:0:212:0: [sdux] Attached SCSI disk
[ 37.717110] sd 12:0:225:0: [sdvk] Attached SCSI disk
[ 37.720336] sd 12:0:241:0: [sdwa] Attached SCSI disk
[ 39.116725] sd 1:0:16:0: [sdp] Mode Sense: db 00 10 08
[ 39.130939] sd 1:0:16:0: [sdp] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 39.164739] sd 1:0:16:0: [sdp] Attached SCSI disk
[ 40.519709] EXT4-fs (sdmw1): mounted filesystem with ordered data mode. Opts: (null)
[ 41.033593] systemd-journald[337]: Received SIGTERM from PID 1 (systemd).
[ 41.064367] SELinux: Disabled at runtime.
[ 41.064740] SELinux: Unregistering netfilter hooks
[ 41.126399] type=1404 audit(1518132631.460:2): selinux=0 auid=4294967295 ses=4294967295
[ 41.147486] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 41.147926] systemd[1]: Inserted module 'ip_tables'
[ 41.315021] EXT4-fs (sdmw1): re-mounted. Opts: (null)
[ 41.326667] RPC: Registered named UNIX socket transport module.
[ 41.326668] RPC: Registered udp transport module.
[ 41.326668] RPC: Registered tcp transport module.
[ 41.326669] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 41.330463] systemd-journald[3278]: Received request to flush runtime journal from PID 1
[ 41.359281] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
[ 41.389732] type=1305 audit(1518132631.723:3): audit_pid=3338 old=0 auid=4294967295 ses=4294967295 res=1
[ 41.494586] ACPI Error: No handler for Region [SYSI] (ffff8801535847e0) [IPMI] (20130517/evregion-162)
[ 41.495286] ACPI Error: Region IPMI (ID=7) has no handler (20130517/exfldio-305)
[ 41.495990] ACPI Error: Method parse/execution failed [\_SB_.PMI0._GHL] (Node ffff880153566578), AE_NOT_EXIST (20130517/psparse-536)
[ 41.496848] ACPI Error: Method parse/execution failed [\_SB_.PMI0._PMC] (Node ffff8801535664d8), AE_NOT_EXIST (20130517/psparse-536)
[ 41.497784] ACPI Exception: AE_NOT_EXIST, Evaluating _PMC (20130517/power_meter-753)
[ 41.498938] wmi: Mapper loaded
[ 41.510584] ipmi message handler version 39.2
[ 41.514315] ipmi device interface
[ 41.517812] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[ 41.519255] IPMI System Interface driver.
[ 41.519613] ipmi_si: probing via SMBIOS
[ 41.519913] ipmi_si: SMBIOS: io 0xca8 regsize 1 spacing 4 irq 10
[ 41.520222] ipmi_si: Adding SMBIOS-specified kcs state machine
[ 41.520646] ipmi_si: Trying SMBIOS-specified kcs state machine at i/o address 0xca8, slave address 0x20, irq 10
[ 41.542259] scsi 1:0:0:0: Attached scsi generic sg0 type 13
[ 41.542868] sd 1:0:1:0: Attached scsi generic sg1 type 0
    ... (sd 1:0:2:0 through sd 1:0:49:0: Attached scsi generic sg2-sg49 type 0; repetitive lines trimmed) ...
[ 41.569413] (NULL device *): The BMC does not support setting the recv irq bit, compensating, but the BMC needs to be fixed.
[ 41.569437] sd 1:0:50:0: Attached scsi generic sg50 type 0
    ... (sd 1:0:51:0 through sd 1:0:365:0: Attached scsi generic sg51-sg365 type 0; enclosure targets 1:0:61, 1:0:122, 1:0:183, 1:0:244, 1:0:305 attach as sg61, sg122, sg183, sg244, sg305 type 13; repetitive lines trimmed) ...
[ 41.644689] scsi 1:0:366:0: Attached scsi generic sg366 type 13
[ 41.644775] sd 0:2:0:0: Attached scsi generic sg367 type 0
[ 41.644864] scsi 12:0:0:0: Attached scsi generic sg368 type 13
[ 41.644960] sd 12:0:1:0: Attached scsi generic sg369 type 0
    ... (sd 12:0:2:0 through sd 12:0:151:0: Attached scsi generic sg370-sg519 type 0; enclosure targets 12:0:61 and 12:0:122 attach as sg429 and sg490 type 13; repetitive lines trimmed) ...
[ 41.713987] device-mapper: uevent: version 1.0.3
[ 41.715265] sd 12:0:152:0: Attached scsi generic sg520 type 0
[ 41.715271] device-mapper: ioctl: 4.35.0-ioctl (2016-06-23) initialised: dm-devel@redhat.com
    ... (sd 12:0:153:0 through sd 12:0:195:0: Attached scsi generic sg521-sg563 type 0; enclosure target 12:0:183 attaches as sg551 type 13; repetitive lines trimmed) ...
[ 41.749691] ipmi_si ipmi_si.0: Using irq 10
    ... (sd 12:0:196:0 through sd 12:0:221:0: Attached scsi generic sg564-sg589 type 0; repetitive lines trimmed) ...
[ 41.775403] ipmi_si ipmi_si.0: Found new BMC (man_id: 0x0002a2, prod_id: 0x0100, dev_id: 0x20)
[ 41.775748] ipmi_si ipmi_si.0: IPMI kcs interface initialized
[ 41.776920] sd 12:0:222:0: Attached scsi generic sg590 type 0
    ... (sd 12:0:223:0 through sd 12:0:365:0: Attached scsi generic sg591-sg733 type 0; enclosure targets 12:0:244 and 12:0:305 attach as sg612 and sg673 type 13; repetitive lines trimmed) ...
[ 41.908487] scsi 12:0:366:0: Attached scsi generic sg734 type 13
[ 50.783958] mei_me 0000:00:16.0: Device doesn't have valid ME Interface
[ 51.034182] mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014)
[ 51.034517] mlx4_core: Initializing 0000:81:00.0
[ 54.503592] ses 1:0:0:0: Attached Enclosure device
[ 54.503993] ses 1:0:61:0: Attached Enclosure device
[ 54.504408] ses 1:0:122:0: Attached Enclosure device
[ 54.504831] ses 1:0:183:0: Attached Enclosure device
[ 54.505197] ses 1:0:244:0: Attached Enclosure device
[ 54.505577] ses 1:0:305:0: Attached Enclosure device
[ 54.505996] ses 1:0:366:0: Attached Enclosure device
[ 54.506345] ses 12:0:0:0: Attached Enclosure device
[ 54.506755] ses 12:0:61:0: Attached Enclosure device
[ 54.507112] ses 12:0:122:0: Attached Enclosure device
[ 54.507483] ses 12:0:183:0: Attached Enclosure device
[ 54.507873] ses 12:0:244:0: Attached Enclosure device
[ 54.508269] ses 12:0:305:0: Attached Enclosure device
[ 54.508654] ses 12:0:366:0: Attached Enclosure device
[ 57.440183] mlx4_core 0000:81:00.0: PCIe link speed is 8.0GT/s, device supports 8.0GT/s
[ 57.440738] mlx4_core 0000:81:00.0: PCIe link width is x8, device supports x8
[ 57.441684] mlx4_core 0000:81:00.0: irq 191 for MSI/MSI-X
    ... (mlx4_core 0000:81:00.0: irq 192 through irq 238 for MSI/MSI-X; repetitive lines trimmed) ...
[ 57.443206] mlx4_core 0000:81:00.0: irq 239 for MSI/MSI-X
[ 57.644572] mlx4_en: Mellanox ConnectX HCA Ethernet driver v2.2-1 (Feb 2014)
[ 57.666497] mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v2.2-1 (Feb 2014)
[ 57.668053] mlx4_ib_add: counter index 0 for port 1 allocated 0
[ 64.538145] input: PC Speaker as /devices/platform/pcspkr/input/input4
[ 64.839972] AES CTR mode by8 optimization enabled
[ 64.844112] alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni)
[ 64.859863] alg: No test for crc32 (crc32-pclmul)
[ 64.913355] intel_rapl: Found RAPL domain package
[ 64.913675] intel_rapl: Found RAPL domain dram
[ 64.913980] intel_rapl: DRAM domain energy unit 15300pj
[ 64.914300] intel_rapl: RAPL package 0 domain package locked by BIOS
[ 64.914620] intel_rapl: Found RAPL domain package
[ 64.914937] intel_rapl: Found RAPL domain dram
[ 64.915254] intel_rapl: DRAM domain energy unit 15300pj
[ 64.915567] intel_rapl: RAPL package 1 domain package locked by BIOS
[ 64.944316] EDAC MC: Ver: 3.0.0
[ 64.949032] EDAC sbridge: Seeking for: PCI ID 8086:6fa0
    ... (further "EDAC sbridge: Seeking for" lines, most repeated three times, for PCI IDs 8086:6fa0, 6ffc, 6ffd, 6f60, 6fa8, 6f71, 6faa, 6fab, 6fac, 6fad, 6faf, 6f68, 6f79, 6f6a, 6f6b, 6f6c, 6f6d; repetitive lines trimmed) ...
[ 64.950361] EDAC MC0: Giving out device to 'sbridge_edac.c' 'Broadwell Socket#0': DEV 0000:7f:12.0
[ 64.958491] EDAC MC1: Giving out device to 'sbridge_edac.c' 'Broadwell Socket#1': DEV 0000:ff:12.0
[ 64.959120] EDAC sbridge: Ver: 1.1.1
[ 66.049512] dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.2)
[ 66.399019] iTCO_vendor_support: vendor-support=0
[ 66.401487] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
[ 66.401907] iTCO_wdt: Found a Wellsburg TCO device (Version=2, TCOBASE=0x0460)
[ 66.403170] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
[ 92.422945] device-mapper: multipath service-time: version 0.3.0 loaded
[ 93.106692] Adding 4194300k swap on /dev/sdmw2. Priority:-1 extents:1 across:4194300k FS
[ 189.462617] tg3 0000:01:00.0: irq 240 for MSI/MSI-X
[ 189.462649] tg3 0000:01:00.0: irq 241 for MSI/MSI-X
[ 189.462678] tg3 0000:01:00.0: irq 242 for MSI/MSI-X
[ 189.462707] tg3 0000:01:00.0: irq 243 for MSI/MSI-X
[ 189.462777] tg3 0000:01:00.0: irq 244 for MSI/MSI-X
[ 189.586158] IPv6: ADDRCONF(NETDEV_UP): em1: link is not ready
[ 193.299821] tg3 0000:01:00.0 em1: Link is up at 1000 Mbps, full duplex
[ 193.300134] tg3 0000:01:00.0 em1: Flow control is off for TX and off for RX
[ 193.300441] tg3 0000:01:00.0 em1: EEE is enabled
[ 193.300762] IPv6: ADDRCONF(NETDEV_CHANGE): em1: link becomes ready
[ 194.490235] IPv6: ADDRCONF(NETDEV_UP): ib0: link is not ready
[ 194.492271] IPv6: ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
[ 199.307165] FS-Cache: Loaded
[ 199.328889] FS-Cache: Netfs 'nfs' registered for caching
[ 199.344343] Key type dns_resolver registered
[ 199.377527] NFS: Registering the id_resolver key type
[ 199.377850] Key type id_resolver registered
[ 199.378154] Key type id_legacy registered
[ 206.615174] usb 1-1.6.1: USB disconnect, device number 4
[ 210.698295] Fusion MPT base driver 3.04.20
[ 210.698386] Copyright (c) 1999-2008 LSI Corporation
[ 210.702094] Fusion MPT misc device (ioctl) driver 3.04.20
[ 210.702337] mptctl: Registered with Fusion MPT base driver
[ 210.702425] mptctl: /dev/mptctl @ (major,minor=10,220)
[ 210.838259] mpt2sas version 20.103.00.00 loaded
[ 233.175796] usbcore: registered new interface driver usb-storage
[ 234.716189] usb 1-1.6.2: new high-speed USB device number 5 using ehci-pci
[ 234.816155] usb 1-1.6.2: New USB device found, idVendor=0624, idProduct=0250
[ 234.816280] usb 1-1.6.2: New USB device strings: Mfr=4, Product=5, SerialNumber=6
[ 234.816371] usb 1-1.6.2: Product: Mass Storage Function
[ 234.816471] usb 1-1.6.2: Manufacturer: Avocent
[ 234.816554] usb 1-1.6.2: SerialNumber: 20120731
[ 234.817043] usb-storage 1-1.6.2:1.0: USB Mass Storage device detected
[ 234.817249] scsi host13: usb-storage 1-1.6.2:1.0
[ 234.828049] usbcore: registered new interface driver uas
[ 235.819496] scsi 13:0:0:0: Direct-Access iDRAC SECUPD 0329 PQ: 0 ANSI: 0 CCS
[ 235.820297] sd 13:0:0:0: Attached scsi generic sg735 type 0
[ 235.821752] sd 13:0:0:0: [sdaat] 2112 512-byte logical blocks: (1.08 MB/1.03 MiB)
[ 235.823744] sd 13:0:0:0: [sdaat] Write Protect is off
[ 235.824055] sd 13:0:0:0: [sdaat] Mode Sense: 23 00 00 00
[ 235.825113] sd 13:0:0:0: [sdaat] No Caching mode page found
[ 235.825448] sd 13:0:0:0: [sdaat] Assuming drive cache: write through
[ 235.930103] sdaat:
[ 236.148136] sd 13:0:0:0: [sdaat] Attached SCSI removable disk
[ 251.632255] usb 1-1.6.2: USB disconnect, device number 5
[ 256.934141] usb 1-1.6.2: new high-speed USB device number 6 using ehci-pci
[ 257.034073] usb 1-1.6.2: New USB device found, idVendor=0624, idProduct=0250
[ 257.034403] usb 1-1.6.2: New USB device strings: Mfr=4, Product=5, SerialNumber=6
[ 257.034934] usb 1-1.6.2: Product: Mass Storage Function
[ 257.035243] usb 1-1.6.2: Manufacturer: Avocent
[ 257.035561] usb 1-1.6.2: SerialNumber: 20120731
[ 257.036479] usb-storage 1-1.6.2:1.0: USB Mass Storage device detected
[ 257.036953] scsi host14: usb-storage 1-1.6.2:1.0
[ 258.040045] scsi 14:0:0:0: Direct-Access iDRAC SECUPD 0329 PQ: 0 ANSI: 0 CCS
[ 258.041326] sd 14:0:0:0: Attached scsi generic sg735 type 0
[ 258.042441] sd 14:0:0:0: [sdaau] 2112 512-byte logical blocks: (1.08 MB/1.03 MiB)
[ 258.044063] sd 14:0:0:0: [sdaau] Write Protect is off
[ 258.044389] sd 14:0:0:0: [sdaau] Mode Sense: 23 00 00 00
[ 258.045397] sd 14:0:0:0: [sdaau] No Caching mode page found
[ 258.045732] sd 14:0:0:0: [sdaau] Assuming drive cache: write through
[ 258.267150] sdaau:
[ 258.270776] sd 14:0:0:0: [sdaau] Attached SCSI removable disk
[ 273.686322] usb 1-1.6.2: USB disconnect, device number 6
[ 273.850331] ses 1:0:366:0: attempting task abort! scmd(ffff883fdf277a80)
[ 273.850644] ses 1:0:366:0: [sg366] CDB: Inquiry 12 00 00 00 24 00
[ 273.850953] scsi target1:0:366: handle(0x017f), sas_address(0x5001636001c4bebd), phy(76)
[ 273.851490] scsi target1:0:366: enclosure_logical_id(0x5001636001c4bebd), slot(60)
[ 273.852019] scsi target1:0:366: enclosure level(0x0001),connector name( )
[ 273.856400] ses 1:0:366:0: task abort: SUCCESS scmd(ffff883fdf277a80)
[ 273.856735] ses 1:0:366:0: attempting task abort!
scmd(ffff883fdff7a840) [ 273.857044] ses 1:0:366:0: [sg366] CDB: Inquiry 12 00 00 00 24 00 [ 273.857358] scsi target1:0:366: handle(0x017f), sas_address(0x5001636001c4bebd), phy(76) [ 273.857942] scsi target1:0:366: enclosure_logical_id(0x5001636001c4bebd), slot(60) [ 273.858486] scsi target1:0:366: enclosure level(0x0001),connector name( ) [ 273.862834] ses 1:0:366:0: task abort: SUCCESS scmd(ffff883fdff7a840) [ 327.426307] usb 1-1.6.1: new high-speed USB device number 7 using ehci-pci [ 327.526257] usb 1-1.6.1: New USB device found, idVendor=0624, idProduct=0249 [ 327.526568] usb 1-1.6.1: New USB device strings: Mfr=4, Product=5, SerialNumber=6 [ 327.527093] usb 1-1.6.1: Product: Keyboard/Mouse Function [ 327.527399] usb 1-1.6.1: Manufacturer: Avocent [ 327.527700] usb 1-1.6.1: SerialNumber: 20121018 [ 327.529220] input: Avocent Keyboard/Mouse Function as /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.6/1-1.6.1/1-1.6.1:1.0/input/input5 [ 327.580447] hid-generic 0003:0624:0249.0004: input,hidraw0: USB HID v1.00 Keyboard [Avocent Keyboard/Mouse Function] on usb-0000:00:1a.0-1.6.1/input0 [ 327.581900] input: Avocent Keyboard/Mouse Function as /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.6/1-1.6.1/1-1.6.1:1.1/input/input6 [ 327.582612] hid-generic 0003:0624:0249.0005: input,hidraw1: USB HID v1.00 Mouse [Avocent Keyboard/Mouse Function] on usb-0000:00:1a.0-1.6.1/input1 [ 327.583985] input: Avocent Keyboard/Mouse Function as /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.6/1-1.6.1/1-1.6.1:1.2/input/input7 [ 327.584628] hid-generic 0003:0624:0249.0006: input,hidraw2: USB HID v1.00 Mouse [Avocent Keyboard/Mouse Function] on usb-0000:00:1a.0-1.6.1/input2 [ 330.867128] usb 1-1.6.1: USB disconnect, device number 7 [ 419.179537] tg3 0000:01:00.1: irq 245 for MSI/MSI-X [ 419.179568] tg3 0000:01:00.1: irq 246 for MSI/MSI-X [ 419.179597] tg3 0000:01:00.1: irq 247 for MSI/MSI-X [ 419.179625] tg3 0000:01:00.1: irq 248 for MSI/MSI-X [ 419.179652] tg3 0000:01:00.1: irq 249 for MSI/MSI-X [ 419.302953] IPv6: ADDRCONF(NETDEV_UP): em2: link is not ready [ 419.306822] tg3 0000:02:00.0: irq 250 for MSI/MSI-X [ 419.306851] tg3 0000:02:00.0: irq 251 for MSI/MSI-X [ 419.306879] tg3 0000:02:00.0: irq 252 for MSI/MSI-X [ 419.306906] tg3 0000:02:00.0: irq 253 for MSI/MSI-X [ 419.306933] tg3 0000:02:00.0: irq 254 for MSI/MSI-X [ 419.431844] IPv6: ADDRCONF(NETDEV_UP): em3: link is not ready [ 419.435811] tg3 0000:02:00.1: irq 255 for MSI/MSI-X [ 419.435843] tg3 0000:02:00.1: irq 256 for MSI/MSI-X [ 419.435872] tg3 0000:02:00.1: irq 257 for MSI/MSI-X [ 419.435899] tg3 0000:02:00.1: irq 258 for MSI/MSI-X [ 419.435969] tg3 0000:02:00.1: irq 259 for MSI/MSI-X [ 419.559406] IPv6: ADDRCONF(NETDEV_UP): em4: link is not ready [ 2672.280703] perf: interrupt took too long (2506 > 2500), lowering kernel.perf_event_max_sample_rate to 79000 [ 9981.117538] perf: interrupt took too long (3140 > 3132), lowering kernel.perf_event_max_sample_rate to 63000 [30789.199193] perf: interrupt took too long (3930 > 3925), lowering kernel.perf_event_max_sample_rate to 50000 [81807.435849] usb 1-1.6.2: new high-speed USB device number 8 using ehci-pci [81807.535792] usb 1-1.6.2: New USB device found, idVendor=0624, idProduct=0250 [81807.536036] usb 1-1.6.2: New USB device strings: Mfr=4, Product=5, SerialNumber=6 [81807.536504] usb 1-1.6.2: Product: Mass Storage Function [81807.536743] usb 1-1.6.2: Manufacturer: Avocent [81807.536980] usb 1-1.6.2: SerialNumber: 20120731 [81807.537684] usb-storage 1-1.6.2:1.0: USB Mass Storage device detected 
[81807.538043] scsi host15: usb-storage 1-1.6.2:1.0 [81808.540650] scsi 15:0:0:0: Direct-Access iDRAC SECUPD 0329 PQ: 0 ANSI: 0 CCS [81808.541874] sd 15:0:0:0: Attached scsi generic sg735 type 0 [81808.543127] sd 15:0:0:0: [sdaat] 2112 512-byte logical blocks: (1.08 MB/1.03 MiB) [81808.544747] sd 15:0:0:0: [sdaat] Write Protect is off [81808.545007] sd 15:0:0:0: [sdaat] Mode Sense: 23 00 00 00 [81808.546129] sd 15:0:0:0: [sdaat] No Caching mode page found [81808.546368] sd 15:0:0:0: [sdaat] Assuming drive cache: write through [81808.550998] sdaat: [81808.767905] sd 15:0:0:0: [sdaat] Attached SCSI removable disk [81824.116890] usb 1-1.6.2: USB disconnect, device number 8 [81829.623852] usb 1-1.6.2: new high-speed USB device number 9 using ehci-pci [81829.723718] usb 1-1.6.2: New USB device found, idVendor=0624, idProduct=0250 [81829.723971] usb 1-1.6.2: New USB device strings: Mfr=4, Product=5, SerialNumber=6 [81829.724444] usb 1-1.6.2: Product: Mass Storage Function [81829.724679] usb 1-1.6.2: Manufacturer: Avocent [81829.724915] usb 1-1.6.2: SerialNumber: 20120731 [81829.725603] usb-storage 1-1.6.2:1.0: USB Mass Storage device detected [81829.725979] scsi host16: usb-storage 1-1.6.2:1.0 [81830.729694] scsi 16:0:0:0: Direct-Access iDRAC SECUPD 0329 PQ: 0 ANSI: 0 CCS [81830.730906] sd 16:0:0:0: Attached scsi generic sg735 type 0 [81830.731921] sd 16:0:0:0: [sdaau] 2112 512-byte logical blocks: (1.08 MB/1.03 MiB) [81830.733542] sd 16:0:0:0: [sdaau] Write Protect is off [81830.733799] sd 16:0:0:0: [sdaau] Mode Sense: 23 00 00 00 [81830.735059] sd 16:0:0:0: [sdaau] No Caching mode page found [81830.735315] sd 16:0:0:0: [sdaau] Assuming drive cache: write through [81830.846789] sdaau: [81831.066180] sd 16:0:0:0: [sdaau] Attached SCSI removable disk [81846.376087] usb 1-1.6.2: USB disconnect, device number 9 [81923.865325] tg3 0000:01:00.1: irq 245 for MSI/MSI-X [81923.865357] tg3 0000:01:00.1: irq 246 for MSI/MSI-X [81923.865386] tg3 0000:01:00.1: irq 247 for MSI/MSI-X [81923.865415] tg3 0000:01:00.1: irq 248 for MSI/MSI-X [81923.865443] tg3 0000:01:00.1: irq 249 for MSI/MSI-X [81923.989495] IPv6: ADDRCONF(NETDEV_UP): em2: link is not ready [81923.993781] tg3 0000:02:00.0: irq 250 for MSI/MSI-X [81923.993829] tg3 0000:02:00.0: irq 251 for MSI/MSI-X [81923.993856] tg3 0000:02:00.0: irq 252 for MSI/MSI-X [81923.993886] tg3 0000:02:00.0: irq 253 for MSI/MSI-X [81923.993914] tg3 0000:02:00.0: irq 254 for MSI/MSI-X [81924.119572] IPv6: ADDRCONF(NETDEV_UP): em3: link is not ready [81924.123985] tg3 0000:02:00.1: irq 255 for MSI/MSI-X [81924.124015] tg3 0000:02:00.1: irq 256 for MSI/MSI-X [81924.124042] tg3 0000:02:00.1: irq 257 for MSI/MSI-X [81924.124070] tg3 0000:02:00.1: irq 258 for MSI/MSI-X [81924.124137] tg3 0000:02:00.1: irq 259 for MSI/MSI-X [81924.248264] IPv6: ADDRCONF(NETDEV_UP): em4: link is not ready [82199.215978] usb 1-1.6.2: new high-speed USB device number 10 using ehci-pci [82199.315878] usb 1-1.6.2: New USB device found, idVendor=0624, idProduct=0250 [82199.316120] usb 1-1.6.2: New USB device strings: Mfr=4, Product=5, SerialNumber=6 [82199.316580] usb 1-1.6.2: Product: Mass Storage Function [82199.316816] usb 1-1.6.2: Manufacturer: Avocent [82199.317051] usb 1-1.6.2: SerialNumber: 20120731 [82199.317798] usb-storage 1-1.6.2:1.0: USB Mass Storage device detected [82199.318175] scsi host17: usb-storage 1-1.6.2:1.0 [82200.320737] scsi 17:0:0:0: Direct-Access iDRAC SECUPD 0329 PQ: 0 ANSI: 0 CCS [82200.322054] sd 17:0:0:0: Attached scsi generic sg735 type 0 [82200.323086] sd 17:0:0:0: 
[sdaat] 2112 512-byte logical blocks: (1.08 MB/1.03 MiB) [82200.324589] sd 17:0:0:0: [sdaat] Write Protect is off [82200.324844] sd 17:0:0:0: [sdaat] Mode Sense: 23 00 00 00 [82200.427989] sd 17:0:0:0: [sdaat] No Caching mode page found [82200.428228] sd 17:0:0:0: [sdaat] Assuming drive cache: write through [82200.439078] sdaat: [82200.657621] sd 17:0:0:0: [sdaat] Attached SCSI removable disk [82215.962723] usb 1-1.6.2: USB disconnect, device number 10 [82221.273976] usb 1-1.6.2: new high-speed USB device number 11 using ehci-pci [82221.373896] usb 1-1.6.2: New USB device found, idVendor=0624, idProduct=0250 [82221.374142] usb 1-1.6.2: New USB device strings: Mfr=4, Product=5, SerialNumber=6 [82221.374602] usb 1-1.6.2: Product: Mass Storage Function [82221.374839] usb 1-1.6.2: Manufacturer: Avocent [82221.375076] usb 1-1.6.2: SerialNumber: 20120731 [82221.375790] usb-storage 1-1.6.2:1.0: USB Mass Storage device detected [82221.376198] scsi host18: usb-storage 1-1.6.2:1.0 [82222.379862] scsi 18:0:0:0: Direct-Access iDRAC SECUPD 0329 PQ: 0 ANSI: 0 CCS [82222.381160] sd 18:0:0:0: Attached scsi generic sg735 type 0 [82222.382361] sd 18:0:0:0: [sdaau] 2112 512-byte logical blocks: (1.08 MB/1.03 MiB) [82222.383834] sd 18:0:0:0: [sdaau] Write Protect is off [82222.384093] sd 18:0:0:0: [sdaau] Mode Sense: 23 00 00 00 [82222.385134] sd 18:0:0:0: [sdaau] No Caching mode page found [82222.385445] sd 18:0:0:0: [sdaau] Assuming drive cache: write through [82222.392486] sdaau: [82222.605854] sd 18:0:0:0: [sdaau] Attached SCSI removable disk [82238.021994] usb 1-1.6.2: USB disconnect, device number 11 [82315.174588] tg3 0000:01:00.1: irq 245 for MSI/MSI-X [82315.174615] tg3 0000:01:00.1: irq 246 for MSI/MSI-X [82315.174663] tg3 0000:01:00.1: irq 247 for MSI/MSI-X [82315.174691] tg3 0000:01:00.1: irq 248 for MSI/MSI-X [82315.174718] tg3 0000:01:00.1: irq 249 for MSI/MSI-X [82315.298886] IPv6: ADDRCONF(NETDEV_UP): em2: link is not ready [82315.303100] tg3 0000:02:00.0: irq 250 for MSI/MSI-X [82315.303127] tg3 0000:02:00.0: irq 251 for MSI/MSI-X [82315.303153] tg3 0000:02:00.0: irq 252 for MSI/MSI-X [82315.303181] tg3 0000:02:00.0: irq 253 for MSI/MSI-X [82315.303207] tg3 0000:02:00.0: irq 254 for MSI/MSI-X [82315.427997] IPv6: ADDRCONF(NETDEV_UP): em3: link is not ready [82315.432302] tg3 0000:02:00.1: irq 255 for MSI/MSI-X [82315.432331] tg3 0000:02:00.1: irq 256 for MSI/MSI-X [82315.432356] tg3 0000:02:00.1: irq 257 for MSI/MSI-X [82315.432382] tg3 0000:02:00.1: irq 258 for MSI/MSI-X [82315.432449] tg3 0000:02:00.1: irq 259 for MSI/MSI-X [82315.556657] IPv6: ADDRCONF(NETDEV_UP): em4: link is not ready [82400.647076] md: md30 stopped. 
[82400.670004] async_tx: api initialized (async) [82400.672000] xor: automatically using best checksumming function: [82400.681734] avx : 25220.000 MB/sec [82400.709733] raid6: sse2x1 gen() 8179 MB/s [82400.731734] raid6: sse2x2 gen() 10046 MB/s [82400.748734] raid6: sse2x4 gen() 11738 MB/s [82400.765730] raid6: avx2x1 gen() 15878 MB/s [82400.782728] raid6: avx2x2 gen() 18484 MB/s [82400.799726] raid6: avx2x4 gen() 21238 MB/s [82400.799968] raid6: using algorithm avx2x4 gen() (21238 MB/s) [82400.800208] raid6: using avx2x2 recovery algorithm [82400.815996] md/raid:md30: device dm-255 operational as raid disk 0 [82400.816242] md/raid:md30: device dm-299 operational as raid disk 9 [82400.816482] md/raid:md30: device dm-298 operational as raid disk 8 [82400.816729] md/raid:md30: device dm-285 operational as raid disk 6 [82400.816969] md/raid:md30: device dm-273 operational as raid disk 5 [82400.817212] md/raid:md30: device dm-272 operational as raid disk 4 [82400.817456] md/raid:md30: device dm-260 operational as raid disk 3 [82400.817699] md/raid:md30: device dm-259 operational as raid disk 2 [82400.817952] md/raid:md30: device dm-256 operational as raid disk 1 [82400.819853] md/raid:md30: raid level 6 active with 9 out of 10 devices, algorithm 2 [82400.863586] md30: detected capacity change from 0 to 64011431837696 [82411.341912] md: recovery of RAID array md30 [92698.719749] perf: interrupt took too long (5035 > 4912), lowering kernel.perf_event_max_sample_rate to 39000 [189706.686972] md: md30: recovery done. [207000.805679] md: data-check of RAID array md30 [492886.009519] md: md30: data-check done. [576396.635386] libcfs: loading out-of-tree module taints kernel. [576396.635861] libcfs: module verification failed: signature and/or required key missing - tainting kernel [576396.640892] LNet: HW NUMA nodes: 2, HW CPU cores: 48, npartitions: 2 [576396.642140] alg: No test for adler32 (adler32-zlib) [576396.642418] alg: No test for crc32 (crc32-table) [576397.410311] Lustre: Lustre: Build Version: 2.10.3_RC1 [576397.459465] LNet: Using FMR for registration [576397.483454] LNet: Added LNI 10.0.2.105@o2ib5 [8/256/0/180] [576496.040519] md: md2 stopped. [576496.060876] md/raid:md2: device dm-333 operational as raid disk 0 [576496.061134] md/raid:md2: device dm-316 operational as raid disk 9 [576496.061389] md/raid:md2: device dm-4 operational as raid disk 8 [576496.061649] md/raid:md2: device dm-350 operational as raid disk 7 [576496.061936] md/raid:md2: device dm-342 operational as raid disk 6 [576496.062188] md/raid:md2: device dm-358 operational as raid disk 5 [576496.062438] md/raid:md2: device dm-321 operational as raid disk 4 [576496.062699] md/raid:md2: device dm-338 operational as raid disk 3 [576496.062974] md/raid:md2: device dm-329 operational as raid disk 2 [576496.063232] md/raid:md2: device dm-359 operational as raid disk 1 [576496.064372] md/raid:md2: raid level 6 active with 10 out of 10 devices, algorithm 2 [576496.112333] md2: detected capacity change from 0 to 64011431837696 [576496.119119] md: md0 stopped. 
[576496.143720] md/raid:md0: device dm-341 operational as raid disk 0 [576496.143963] md/raid:md0: device dm-315 operational as raid disk 9 [576496.144200] md/raid:md0: device dm-318 operational as raid disk 8 [576496.144438] md/raid:md0: device dm-326 operational as raid disk 7 [576496.144680] md/raid:md0: device dm-328 operational as raid disk 6 [576496.144918] md/raid:md0: device dm-355 operational as raid disk 5 [576496.145155] md/raid:md0: device dm-337 operational as raid disk 4 [576496.145392] md/raid:md0: device dm-320 operational as raid disk 3 [576496.145633] md/raid:md0: device dm-339 operational as raid disk 2 [576496.145870] md/raid:md0: device dm-323 operational as raid disk 1 [576496.146961] md/raid:md0: raid level 6 active with 10 out of 10 devices, algorithm 2 [576496.187516] md0: detected capacity change from 0 to 64011431837696 [576496.197511] md: md8 stopped. [576496.218351] md/raid:md8: device dm-53 operational as raid disk 0 [576496.218608] md/raid:md8: device dm-67 operational as raid disk 9 [576496.218855] md/raid:md8: device dm-66 operational as raid disk 8 [576496.219099] md/raid:md8: device dm-55 operational as raid disk 7 [576496.219349] md/raid:md8: device dm-54 operational as raid disk 6 [576496.219607] md/raid:md8: device dm-41 operational as raid disk 5 [576496.219862] md/raid:md8: device dm-40 operational as raid disk 4 [576496.220108] md/raid:md8: device dm-28 operational as raid disk 3 [576496.220354] md/raid:md8: device dm-27 operational as raid disk 2 [576496.220606] md/raid:md8: device dm-63 operational as raid disk 1 [576496.222238] md/raid:md8: raid level 6 active with 10 out of 10 devices, algorithm 2 [576496.284997] md8: detected capacity change from 0 to 64011431837696 [576496.289499] md: md18 stopped. [576496.307050] md/raid:md18: device dm-135 operational as raid disk 0 [576496.307297] md/raid:md18: device dm-179 operational as raid disk 9 [576496.307542] md/raid:md18: device dm-178 operational as raid disk 8 [576496.307876] md/raid:md18: device dm-166 operational as raid disk 7 [576496.308115] md/raid:md18: device dm-165 operational as raid disk 6 [576496.308359] md/raid:md18: device dm-153 operational as raid disk 5 [576496.308605] md/raid:md18: device dm-152 operational as raid disk 4 [576496.308845] md/raid:md18: device dm-140 operational as raid disk 3 [576496.309088] md/raid:md18: device dm-139 operational as raid disk 2 [576496.309332] md/raid:md18: device dm-136 operational as raid disk 1 [576496.310895] md/raid:md18: raid level 6 active with 10 out of 10 devices, algorithm 2 [576496.360986] md18: detected capacity change from 0 to 64011431837696 [576496.372219] md: md4 stopped. 
[576496.390045] md/raid:md4: device dm-13 operational as raid disk 0 [576496.390286] md/raid:md4: device dm-8 operational as raid disk 9 [576496.390533] md/raid:md4: device dm-7 operational as raid disk 8 [576496.390774] md/raid:md4: device dm-357 operational as raid disk 7 [576496.391014] md/raid:md4: device dm-1 operational as raid disk 6 [576496.391253] md/raid:md4: device dm-345 operational as raid disk 5 [576496.391491] md/raid:md4: device dm-349 operational as raid disk 4 [576496.391738] md/raid:md4: device dm-331 operational as raid disk 3 [576496.391976] md/raid:md4: device dm-325 operational as raid disk 2 [576496.392214] md/raid:md4: device dm-18 operational as raid disk 1 [576496.393732] md/raid:md4: raid level 6 active with 10 out of 10 devices, algorithm 2 [576496.439378] md4: detected capacity change from 0 to 64011431837696 [576496.466191] md: md10 stopped. [576496.499341] md/raid:md10: device dm-73 operational as raid disk 0 [576496.499595] md/raid:md10: device dm-69 operational as raid disk 9 [576496.499833] md/raid:md10: device dm-16 operational as raid disk 8 [576496.500070] md/raid:md10: device dm-59 operational as raid disk 7 [576496.500309] md/raid:md10: device dm-58 operational as raid disk 6 [576496.500553] md/raid:md10: device dm-46 operational as raid disk 5 [576496.500795] md/raid:md10: device dm-45 operational as raid disk 4 [576496.501034] md/raid:md10: device dm-33 operational as raid disk 3 [576496.501272] md/raid:md10: device dm-32 operational as raid disk 2 [576496.501518] md/raid:md10: device dm-74 operational as raid disk 1 [576496.503088] md/raid:md10: raid level 6 active with 10 out of 10 devices, algorithm 2 [576496.543146] md10: detected capacity change from 0 to 64011431837696 [576496.557168] md: md14 stopped. [576496.587625] md/raid:md14: device dm-109 operational as raid disk 0 [576496.587867] md/raid:md14: device dm-124 operational as raid disk 9 [576496.588104] md/raid:md14: device dm-123 operational as raid disk 8 [576496.588342] md/raid:md14: device dm-111 operational as raid disk 7 [576496.588585] md/raid:md14: device dm-110 operational as raid disk 6 [576496.588826] md/raid:md14: device dm-97 operational as raid disk 5 [576496.589065] md/raid:md14: device dm-96 operational as raid disk 4 [576496.589309] md/raid:md14: device dm-84 operational as raid disk 3 [576496.589552] md/raid:md14: device dm-83 operational as raid disk 2 [576496.589793] md/raid:md14: device dm-120 operational as raid disk 1 [576496.590857] md/raid:md14: raid level 6 active with 10 out of 10 devices, algorithm 2 [576496.644006] md14: detected capacity change from 0 to 64011431837696 [576496.657263] md: md22 stopped. 
[576496.693983] md/raid:md22: device dm-193 operational as raid disk 0 [576496.694230] md/raid:md22: device dm-188 operational as raid disk 9 [576496.694471] md/raid:md22: device dm-187 operational as raid disk 8 [576496.694725] md/raid:md22: device dm-175 operational as raid disk 7 [576496.694964] md/raid:md22: device dm-174 operational as raid disk 6 [576496.695206] md/raid:md22: device dm-162 operational as raid disk 5 [576496.695449] md/raid:md22: device dm-161 operational as raid disk 4 [576496.695697] md/raid:md22: device dm-149 operational as raid disk 3 [576496.695938] md/raid:md22: device dm-148 operational as raid disk 2 [576496.696179] md/raid:md22: device dm-194 operational as raid disk 1 [576496.697265] md/raid:md22: raid level 6 active with 10 out of 10 devices, algorithm 2 [576496.752830] md22: detected capacity change from 0 to 64011431837696 [576496.764442] md: md28 stopped. [576496.784850] md/raid:md28: device dm-253 operational as raid disk 0 [576496.785108] md/raid:md28: device dm-248 operational as raid disk 9 [576496.785362] md/raid:md28: device dm-247 operational as raid disk 8 [576496.785624] md/raid:md28: device dm-235 operational as raid disk 7 [576496.785874] md/raid:md28: device dm-234 operational as raid disk 6 [576496.786125] md/raid:md28: device dm-222 operational as raid disk 5 [576496.786378] md/raid:md28: device dm-221 operational as raid disk 4 [576496.786635] md/raid:md28: device dm-209 operational as raid disk 3 [576496.786888] md/raid:md28: device dm-208 operational as raid disk 2 [576496.787137] md/raid:md28: device dm-254 operational as raid disk 1 [576496.788819] md/raid:md28: raid level 6 active with 10 out of 10 devices, algorithm 2 [576496.823887] md28: detected capacity change from 0 to 64011431837696 [576496.834877] md: md6 stopped. [576496.880270] md/raid:md6: device dm-19 operational as raid disk 0 [576496.880517] md/raid:md6: device dm-62 operational as raid disk 9 [576496.880756] md/raid:md6: device dm-61 operational as raid disk 8 [576496.881000] md/raid:md6: device dm-50 operational as raid disk 7 [576496.881243] md/raid:md6: device dm-49 operational as raid disk 6 [576496.881483] md/raid:md6: device dm-37 operational as raid disk 5 [576496.881733] md/raid:md6: device dm-36 operational as raid disk 4 [576496.881974] md/raid:md6: device dm-24 operational as raid disk 3 [576496.882215] md/raid:md6: device dm-23 operational as raid disk 2 [576496.882456] md/raid:md6: device dm-20 operational as raid disk 1 [576496.884199] md/raid:md6: raid level 6 active with 10 out of 10 devices, algorithm 2 [576496.955860] md6: detected capacity change from 0 to 64011431837696 [576496.983673] md: md12 stopped. 
[576497.050397] md/raid:md12: device dm-75 operational as raid disk 0 [576497.050658] md/raid:md12: device dm-119 operational as raid disk 9 [576497.050908] md/raid:md12: device dm-118 operational as raid disk 8 [576497.051164] md/raid:md12: device dm-106 operational as raid disk 7 [576497.051413] md/raid:md12: device dm-105 operational as raid disk 6 [576497.057386] md/raid:md12: device dm-93 operational as raid disk 5 [576497.057640] md/raid:md12: device dm-92 operational as raid disk 4 [576497.057891] md/raid:md12: device dm-80 operational as raid disk 3 [576497.058144] md/raid:md12: device dm-79 operational as raid disk 2 [576497.058396] md/raid:md12: device dm-76 operational as raid disk 1 [576497.060519] md/raid:md12: raid level 6 active with 10 out of 10 devices, algorithm 2 [576497.094789] md12: detected capacity change from 0 to 64011431837696 [576497.118019] md: md16 stopped. [576497.136474] md/raid:md16: device dm-133 operational as raid disk 0 [576497.136728] md/raid:md16: device dm-128 operational as raid disk 9 [576497.136974] md/raid:md16: device dm-127 operational as raid disk 8 [576497.137214] md/raid:md16: device dm-115 operational as raid disk 7 [576497.137460] md/raid:md16: device dm-114 operational as raid disk 6 [576497.137712] md/raid:md16: device dm-102 operational as raid disk 5 [576497.137963] md/raid:md16: device dm-101 operational as raid disk 4 [576497.138212] md/raid:md16: device dm-89 operational as raid disk 3 [576497.138455] md/raid:md16: device dm-88 operational as raid disk 2 [576497.138698] md/raid:md16: device dm-134 operational as raid disk 1 [576497.139809] md/raid:md16: raid level 6 active with 10 out of 10 devices, algorithm 2 [576497.173874] md16: detected capacity change from 0 to 64011431837696 [576497.183985] md: md24 stopped. [576497.220680] md/raid:md24: device dm-195 operational as raid disk 0 [576497.220939] md/raid:md24: device dm-239 operational as raid disk 9 [576497.221187] md/raid:md24: device dm-238 operational as raid disk 8 [576497.221439] md/raid:md24: device dm-226 operational as raid disk 7 [576497.221694] md/raid:md24: device dm-225 operational as raid disk 6 [576497.221944] md/raid:md24: device dm-213 operational as raid disk 5 [576497.222193] md/raid:md24: device dm-212 operational as raid disk 4 [576497.222441] md/raid:md24: device dm-200 operational as raid disk 3 [576497.222694] md/raid:md24: device dm-199 operational as raid disk 2 [576497.222947] md/raid:md24: device dm-196 operational as raid disk 1 [576497.224075] md/raid:md24: raid level 6 active with 10 out of 10 devices, algorithm 2 [576497.251271] md24: detected capacity change from 0 to 64011431837696 [576497.256022] md: md26 stopped. 
[576497.297745] md/raid:md26: device dm-229 operational as raid disk 0 [576497.297991] md/raid:md26: device dm-244 operational as raid disk 9 [576497.298234] md/raid:md26: device dm-243 operational as raid disk 8 [576497.298481] md/raid:md26: device dm-231 operational as raid disk 7 [576497.298720] md/raid:md26: device dm-230 operational as raid disk 6 [576497.298959] md/raid:md26: device dm-217 operational as raid disk 5 [576497.299196] md/raid:md26: device dm-216 operational as raid disk 4 [576497.299433] md/raid:md26: device dm-204 operational as raid disk 3 [576497.299675] md/raid:md26: device dm-203 operational as raid disk 2 [576497.299912] md/raid:md26: device dm-240 operational as raid disk 1 [576497.301423] md/raid:md26: raid level 6 active with 10 out of 10 devices, algorithm 2 [576497.340087] md26: detected capacity change from 0 to 64011431837696 [576497.353135] md: md20 stopped. [576497.383683] md/raid:md20: device dm-169 operational as raid disk 0 [576497.383925] md/raid:md20: device dm-184 operational as raid disk 9 [576497.384166] md/raid:md20: device dm-183 operational as raid disk 8 [576497.384409] md/raid:md20: device dm-171 operational as raid disk 7 [576497.384661] md/raid:md20: device dm-170 operational as raid disk 6 [576497.384905] md/raid:md20: device dm-157 operational as raid disk 5 [576497.385144] md/raid:md20: device dm-156 operational as raid disk 4 [576497.385384] md/raid:md20: device dm-144 operational as raid disk 3 [576497.385623] md/raid:md20: device dm-143 operational as raid disk 2 [576497.385863] md/raid:md20: device dm-180 operational as raid disk 1 [576497.387287] md/raid:md20: raid level 6 active with 10 out of 10 devices, algorithm 2 [576497.421438] md20: detected capacity change from 0 to 64011431837696 [576497.576624] LDISKFS-fs (md2): file extents enabled, maximum tree depth=5 [576497.909273] LDISKFS-fs (md2): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576497.964603] LDISKFS-fs (md0): file extents enabled, maximum tree depth=5 [576498.113676] LDISKFS-fs (md8): file extents enabled, maximum tree depth=5 [576498.276654] LustreError: 137-5: oak-OST0030_UUID: not available for connect from 10.9.101.38@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [576498.277382] LustreError: Skipped 11 previous similar messages [576498.333812] LDISKFS-fs (md0): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576498.448557] LDISKFS-fs (md8): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576498.459592] LDISKFS-fs (md18): file extents enabled, maximum tree depth=5 [576498.591519] Lustre: oak-OST0032: Not available for connect from 10.9.105.25@o2ib4 (not set up) [576498.611602] LDISKFS-fs (md4): file extents enabled, maximum tree depth=5 [576498.798385] LDISKFS-fs (md18): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576498.805206] LustreError: 137-5: oak-OST0038_UUID: not available for connect from 10.210.47.106@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[576498.805927] LustreError: Skipped 177 previous similar messages [576498.885194] Lustre: oak-OST0032: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 [576498.912260] Lustre: oak-OST0032: Will be in recovery for at least 2:30, or until 1262 clients reconnect [576498.912775] Lustre: oak-OST0032: Connection restored to (at 10.210.47.219@o2ib3) [576498.969190] LDISKFS-fs (md4): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576499.160279] Lustre: oak-OST0030: Not available for connect from 10.210.46.25@o2ib3 (not set up) [576499.160762] Lustre: Skipped 4 previous similar messages [576499.199548] LDISKFS-fs (md10): file extents enabled, maximum tree depth=5 [576499.228514] LDISKFS-fs (md14): file extents enabled, maximum tree depth=5 [576499.457704] Lustre: oak-OST0030: Will be in recovery for at least 2:30, or until 1262 clients reconnect [576499.458242] Lustre: oak-OST0030: Connection restored to (at 10.210.44.82@o2ib3) [576499.458737] Lustre: Skipped 6 previous similar messages [576499.471545] LDISKFS-fs (md22): file extents enabled, maximum tree depth=5 [576499.558707] LDISKFS-fs (md14): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576499.569385] LDISKFS-fs (md10): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576499.761523] LDISKFS-fs (md28): file extents enabled, maximum tree depth=5 [576499.808854] LustreError: 137-5: oak-OST0034_UUID: not available for connect from 10.8.18.31@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. [576499.809568] LustreError: Skipped 182 previous similar messages [576499.811416] LDISKFS-fs (md22): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576499.920508] LDISKFS-fs (md6): file extents enabled, maximum tree depth=5 [576499.933646] Lustre: oak-OST0038: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 [576499.934134] Lustre: Skipped 1 previous similar message [576500.152901] LDISKFS-fs (md28): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576500.282465] LDISKFS-fs (md6): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576500.319277] Lustre: oak-OST0042: Not available for connect from 10.9.102.10@o2ib4 (not set up) [576500.319759] Lustre: Skipped 4 previous similar messages [576500.461298] Lustre: oak-OST0030: Connection restored to 7cc76dae-951d-b1a2-fd15-72bf8c45ce3a (at 10.9.104.66@o2ib4) [576500.461785] Lustre: Skipped 62 previous similar messages [576500.463494] LDISKFS-fs (md16): file extents enabled, maximum tree depth=5 [576500.562459] LDISKFS-fs (md12): file extents enabled, maximum tree depth=5 [576500.673766] Lustre: oak-OST0042: Will be in recovery for at least 2:30, or until 1262 clients reconnect [576500.674254] Lustre: Skipped 1 previous similar message [576500.808627] LDISKFS-fs (md16): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576501.024459] LDISKFS-fs (md24): file extents enabled, maximum tree depth=5 [576501.024762] LDISKFS-fs (md12): mounted filesystem with ordered data mode. 
Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576501.294114] Lustre: oak-OST0034: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 [576501.294589] Lustre: Skipped 1 previous similar message [576501.370868] LDISKFS-fs (md24): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576501.403451] LDISKFS-fs (md26): file extents enabled, maximum tree depth=5 [576501.815235] LDISKFS-fs (md26): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576501.825874] LustreError: 137-5: oak-OST003a_UUID: not available for connect from 10.9.114.8@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [576501.826614] LustreError: Skipped 472 previous similar messages [576501.833429] LDISKFS-fs (md20): file extents enabled, maximum tree depth=5 [576502.165769] LDISKFS-fs (md20): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [576502.319661] Lustre: oak-OST003a: Not available for connect from 10.210.47.158@o2ib3 (not set up) [576502.320142] Lustre: Skipped 19 previous similar messages [576502.493708] Lustre: oak-OST0030: Connection restored to e05c56e7-3809-4a66-d54a-34bdb694796d (at 10.210.44.150@o2ib3) [576502.493709] Lustre: oak-OST0038: Connection restored to e05c56e7-3809-4a66-d54a-34bdb694796d (at 10.210.44.150@o2ib3) [576502.493712] Lustre: Skipped 267 previous similar messages [576503.218098] Lustre: oak-OST0046: Will be in recovery for at least 2:30, or until 1262 clients reconnect [576503.218584] Lustre: Skipped 3 previous similar messages [576503.625729] Lustre: oak-OST004c: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 [576503.626211] Lustre: Skipped 3 previous similar messages [576505.833929] LustreError: 137-5: oak-OST004a_UUID: not available for connect from 10.210.47.59@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[576505.834654] LustreError: Skipped 914 previous similar messages [576506.332748] Lustre: oak-OST004a: Not available for connect from 10.210.45.58@o2ib3 (not set up) [576506.333224] Lustre: Skipped 53 previous similar messages [576508.050046] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1518709120/real 0] req@ffff883dd1e00000 x1592481824116144/t0(0) o38->oak-MDT0000-lwp-OST0046@10.0.2.52@o2ib5:12/10 lens 520/544 e 0 to 1 dl 1518709125 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 [576509.056000] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1518709121/real 0] req@ffff881e8ced8000 x1592481824116576/t0(0) o38->oak-MDT0000-lwp-OST004c@10.0.2.52@o2ib5:12/10 lens 520/544 e 0 to 1 dl 1518709126 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 [576509.056986] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 7 previous similar messages [576510.056949] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1518709122/real 0] req@ffff883ddb6d0900 x1592481824117440/t0(0) o38->oak-MDT0000-lwp-OST0040@10.0.2.52@o2ib5:12/10 lens 520/544 e 0 to 1 dl 1518709127 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1 [576510.503087] Lustre: oak-OST0032: Connection restored to 1a82bd75-72e1-8040-9fd0-f227412ed7d7 (at 10.210.45.38@o2ib3) [576510.503089] Lustre: oak-OST0036: Connection restored to 1a82bd75-72e1-8040-9fd0-f227412ed7d7 (at 10.210.45.38@o2ib3) [576510.503092] Lustre: Skipped 2886 previous similar messages [576542.519472] Lustre: oak-OST0036: Connection restored to ca6e2c5e-e98b-5237-976d-2488001f19ce (at 10.210.47.125@o2ib3) [576542.520002] Lustre: Skipped 5155 previous similar messages [576549.452119] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 0 seconds [576562.451530] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 12 seconds [576562.452033] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 1 previous similar message [576574.450951] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 25 seconds [576574.451446] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 7 previous similar messages [576606.715739] Lustre: oak-OST0036: Connection restored to e10f123d-3b07-7a0c-5fe0-20d2e89e9f18 (at 10.12.4.87@o2ib) [576606.715740] Lustre: oak-OST0032: Connection restored to e10f123d-3b07-7a0c-5fe0-20d2e89e9f18 (at 10.12.4.87@o2ib) [576606.715743] Lustre: Skipped 9102 previous similar messages [576606.716967] Lustre: Skipped 7 previous similar messages [576640.065371] Lustre: oak-OST004c: Recovery over after 2:16, of 1262 clients 1262 recovered and 0 were evicted. 
[576640.108630] Lustre: oak-OST0044: deleting orphan objects from 0x0:4271241 to 0x0:4271265 [576640.110595] Lustre: oak-OST004c: deleting orphan objects from 0x0:3309631 to 0x0:3309665 [576640.111304] Lustre: oak-OST0042: deleting orphan objects from 0x0:4201162 to 0x0:4201249 [576640.128080] Lustre: oak-OST003c: deleting orphan objects from 0x0:4298343 to 0x0:4298433 [576640.191323] Lustre: oak-OST0048: deleting orphan objects from 0x0:3295070 to 0x0:3295105 [576640.208103] Lustre: oak-OST0030: deleting orphan objects from 0x0:4336458 to 0x0:4336481 [576640.528107] Lustre: oak-OST0046: deleting orphan objects from 0x0:4323664 to 0x0:4323681 [576640.534706] Lustre: oak-OST0032: deleting orphan objects from 0x0:4289209 to 0x0:4289249 [576643.078674] Lustre: oak-OST003a: Recovery over after 2:20, of 1262 clients 1262 recovered and 0 were evicted. [576643.079148] Lustre: Skipped 7 previous similar messages [576643.103212] Lustre: oak-OST003a: deleting orphan objects from 0x0:4275024 to 0x0:4275041 [576643.430881] Lustre: oak-OST003e: deleting orphan objects from 0x0:4159312 to 0x0:4159361 [576643.521455] Lustre: oak-OST004a: deleting orphan objects from 0x0:3290952 to 0x0:3291009 [576643.804966] Lustre: oak-OST0034: deleting orphan objects from 0x0:4161401 to 0x0:4161441 [576643.879464] Lustre: oak-OST0036: deleting orphan objects from 0x0:4125650 to 0x0:4125697 [576646.762976] Lustre: oak-OST0038: deleting orphan objects from 0x0:4254236 to 0x0:4254273 [576646.774726] Lustre: oak-OST0038: Recovery over after 2:27, of 1262 clients 1262 recovered and 0 were evicted. [576646.775197] Lustre: Skipped 4 previous similar messages [576653.519330] Lustre: oak-OST0040: Recovery over after 2:29, of 1262 clients 1262 recovered and 0 were evicted. [576653.527574] Lustre: oak-OST0040: deleting orphan objects from 0x0:4307854 to 0x0:4307873 [577132.021206] LustreError: 11-0: oak-MDT0000-lwp-OST0032: operation obd_ping to node 10.0.2.51@o2ib5 failed: rc = -107 [577132.021210] Lustre: oak-MDT0000-lwp-OST0034: Connection to oak-MDT0000 (at 10.0.2.51@o2ib5) was lost; in progress operations using this service will wait for recovery to complete [577132.022415] LustreError: Skipped 14 previous similar messages [577163.019510] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518709774/real 1518709774] req@ffff883e874bf800 x1592481824137408/t0(0) o38->oak-MDT0000-lwp-OST0048@10.0.2.52@o2ib5:12/10 lens 520/544 e 0 to 1 dl 1518709780 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 [577163.020754] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [577218.016977] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518709824/real 1518709824] req@ffff883e874bc800 x1592481824137936/t0(0) o38->oak-MDT0000-lwp-OST004a@10.0.2.52@o2ib5:12/10 lens 520/544 e 0 to 1 dl 1518709835 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 [577218.018249] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 14 previous similar messages [577236.698387] Lustre: oak-OST0030: Connection restored to oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) [577236.698892] Lustre: Skipped 510 previous similar messages [577248.015578] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518709849/real 1518709849] req@ffff883e874b8900 x1592481824138160/t0(0) o38->oak-MDT0000-lwp-OST003c@10.0.2.51@o2ib5:12/10 lens 520/544 e 0 to 1 dl 
1518709865 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 [577248.016775] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 14 previous similar messages [577557.001494] LustreError: 167-0: oak-MDT0000-lwp-OST0030: This client was evicted by oak-MDT0000; in progress operations using this service will fail. [577577.573026] Lustre: oak-OST0034: deleting orphan objects from 0x0:4161451 to 0x0:4161473 [577577.573027] Lustre: oak-OST0032: deleting orphan objects from 0x0:4289260 to 0x0:4289281 [577577.573031] Lustre: oak-OST003a: deleting orphan objects from 0x0:4275050 to 0x0:4275073 [577577.573032] Lustre: oak-OST0036: deleting orphan objects from 0x0:4125702 to 0x0:4125729 [577577.573033] Lustre: oak-OST003e: deleting orphan objects from 0x0:4159366 to 0x0:4159393 [577577.573034] Lustre: oak-OST003c: deleting orphan objects from 0x0:4298440 to 0x0:4298465 [577577.573036] Lustre: oak-OST0040: deleting orphan objects from 0x0:4307889 to 0x0:4307905 [577577.573036] Lustre: oak-OST0048: deleting orphan objects from 0x0:3295117 to 0x0:3295137 [577577.573037] Lustre: oak-OST0046: deleting orphan objects from 0x0:4323690 to 0x0:4323713 [577577.573039] Lustre: oak-OST0042: deleting orphan objects from 0x0:4201256 to 0x0:4201281 [577577.573040] Lustre: oak-OST004c: deleting orphan objects from 0x0:3309677 to 0x0:3309697 [577577.573041] Lustre: oak-OST0044: deleting orphan objects from 0x0:4271274 to 0x0:4271297 [577577.573042] Lustre: oak-OST004a: deleting orphan objects from 0x0:3291017 to 0x0:3291041 [577577.573044] Lustre: oak-OST0038: deleting orphan objects from 0x0:4254279 to 0x0:4254305 [577577.573046] Lustre: oak-OST0030: deleting orphan objects from 0x0:4336492 to 0x0:4336513 [584613.903862] Lustre: oak-OST0030: Connection restored to 37fd2092-874c-6423-1c63-c8618b62c4b5 (at 10.8.15.3@o2ib6) [584613.904355] Lustre: Skipped 39 previous similar messages [586322.801016] Lustre: oak-OST0036: haven't heard from client 8db677fd-c2b7-fbdd-5cdb-bbd8f49841be (at 10.12.4.76@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d57f7b000, cur 1518718941 expire 1518718791 last 1518718714 [587018.727241] Lustre: oak-OST0036: haven't heard from client b3426c79-579f-36c8-108d-d63d7b0c1c6a (at 10.12.4.35@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d2cb26800, cur 1518719637 expire 1518719487 last 1518719410 [587018.728273] Lustre: Skipped 59 previous similar messages [587336.683634] Lustre: oak-OST0036: Connection restored to 07d0e0b2-8f8d-b1ae-13a5-562994837339 (at 10.12.4.28@o2ib) [587336.683635] Lustre: oak-OST0032: Connection restored to 07d0e0b2-8f8d-b1ae-13a5-562994837339 (at 10.12.4.28@o2ib) [587336.683638] Lustre: Skipped 3 previous similar messages [587336.684818] Lustre: Skipped 8 previous similar messages [587636.714941] Lustre: oak-OST0032: haven't heard from client c7cd2e66-750a-5f80-db94-ac9b167051ad (at 10.12.4.30@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883fe77edc00, cur 1518720255 expire 1518720105 last 1518720028 [587636.715926] Lustre: Skipped 14 previous similar messages [587965.121013] md: md32 stopped. 
[587965.139607] md/raid:md32: device dm-289 operational as raid disk 0 [587965.139854] md/raid:md32: device dm-304 operational as raid disk 9 [587965.140105] md/raid:md32: device dm-303 operational as raid disk 8 [587965.140348] md/raid:md32: device dm-291 operational as raid disk 7 [587965.140589] md/raid:md32: device dm-290 operational as raid disk 6 [587965.140829] md/raid:md32: device dm-277 operational as raid disk 5 [587965.141074] md/raid:md32: device dm-276 operational as raid disk 4 [587965.141314] md/raid:md32: device dm-264 operational as raid disk 3 [587965.141556] md/raid:md32: device dm-263 operational as raid disk 2 [587965.141801] md/raid:md32: device dm-300 operational as raid disk 1 [587965.142955] md/raid:md32: raid level 6 active with 10 out of 10 devices, algorithm 2 [587965.163914] md32: detected capacity change from 0 to 64011431837696 [587965.169961] md: md34 stopped. [587965.186173] md/raid:md34: device dm-313 operational as raid disk 0 [587965.186423] md/raid:md34: device dm-308 operational as raid disk 9 [587965.186667] md/raid:md34: device dm-307 operational as raid disk 8 [587965.186912] md/raid:md34: device dm-295 operational as raid disk 7 [587965.187163] md/raid:md34: device dm-294 operational as raid disk 6 [587965.192910] md/raid:md34: device dm-282 operational as raid disk 5 [587965.193158] md/raid:md34: device dm-281 operational as raid disk 4 [587965.193404] md/raid:md34: device dm-269 operational as raid disk 3 [587965.193650] md/raid:md34: device dm-268 operational as raid disk 2 [587965.193896] md/raid:md34: device dm-314 operational as raid disk 1 [587965.195189] md/raid:md34: raid level 6 active with 10 out of 10 devices, algorithm 2 [587965.216593] md34: detected capacity change from 0 to 64011431837696 [587978.009636] LDISKFS-fs (md30): file extents enabled, maximum tree depth=5 [587978.322810] LDISKFS-fs (md30): mounted filesystem with ordered data mode. Opts: errors=remount-ro [587984.447311] LDISKFS-fs (md32): file extents enabled, maximum tree depth=5 [587984.778776] LDISKFS-fs (md32): mounted filesystem with ordered data mode. Opts: errors=remount-ro [587985.203272] LDISKFS-fs (md34): file extents enabled, maximum tree depth=5 [587985.519762] LDISKFS-fs (md34): mounted filesystem with ordered data mode. Opts: errors=remount-ro [588027.595195] Lustre: oak-OST0036: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib) [588027.595197] Lustre: oak-OST0032: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib) [588027.595199] Lustre: oak-OST0030: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib) [588027.595204] Lustre: Skipped 1 previous similar message [588027.596890] Lustre: Skipped 11 previous similar messages [588167.363792] LDISKFS-fs (md30): file extents enabled, maximum tree depth=5 [588167.661714] LDISKFS-fs (md30): mounted filesystem with ordered data mode. Opts: errors=remount-ro [588168.788783] LDISKFS-fs (md30): file extents enabled, maximum tree depth=5 [588169.078014] LDISKFS-fs (md30): mounted filesystem with ordered data mode. 
Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[588479.126235] LustreError: 166-1: MGC10.0.2.51@o2ib5: Connection to MGS (at 10.0.2.51@o2ib5) was lost; in progress operations using this service will fail
[588479.126740] LustreError: 131954:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1518720797, 300s ago), entering recovery for MGS@10.0.2.51@o2ib5 ns: MGC10.0.2.51@o2ib5 lock: ffff882b089d9400/0x806f959338b0feb8 lrc: 4/1,0 mode: --/CR res: [0x6b616f:0x0:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0x7117bf3e96f125a1 expref: -99 pid: 131954 timeout: 0 lvb_type: 0
[588479.126804] LustreError: 15f-b: oak-OST004e: cannot register this server with the MGS: rc = -5. Is the MGS running?
[588479.126883] LustreError: 213908:0:(obd_mount_server.c:1866:server_fill_super()) Unable to start targets: -5
[588479.126985] LustreError: 213908:0:(obd_mount_server.c:1576:server_put_super()) no obd oak-OST004e
[588479.127006] LustreError: 213908:0:(obd_mount_server.c:135:server_deregister_mount()) oak-OST004e not registered
[588479.127122] LustreError: 225296:0:(ldlm_resource.c:1100:ldlm_resource_complain()) MGC10.0.2.51@o2ib5: namespace resource [0x6b616f:0x0:0x0].0x0 (ffff881ec02c4600) refcount nonzero (2) after lock cleanup; forcing cleanup.
[588479.127125] LustreError: 225296:0:(ldlm_resource.c:1682:ldlm_resource_dump()) --- Resource: [0x6b616f:0x0:0x0].0x0 (ffff881ec02c4600) refcount = 3
[588479.127126] LustreError: 225296:0:(ldlm_resource.c:1703:ldlm_resource_dump()) Waiting locks:
[588479.127134] LustreError: 225296:0:(ldlm_resource.c:1705:ldlm_resource_dump()) ### ### ns: MGC10.0.2.51@o2ib5 lock: ffff882b089d9400/0x806f959338b0feb8 lrc: 4/1,0 mode: --/CR res: [0x6b616f:0x0:0x0].0x0 rrc: 4 type: PLN flags: 0x1106400000000 nid: local remote: 0x7117bf3e96f125a1 expref: -99 pid: 131954 timeout: 0 lvb_type: 0
[588479.127140] Lustre: MGC10.0.2.51@o2ib5: Connection restored to 10.0.2.51@o2ib5 (at 10.0.2.51@o2ib5)
[588479.133847] LustreError: 131954:0:(mgc_request.c:603:do_requeue()) failed processing log: -5
[588479.385674] Lustre: server umount oak-OST004e complete
[588479.385922] LustreError: 213908:0:(obd_mount.c:1506:lustre_fill_super()) Unable to mount (-5)
[588554.198992] Lustre: 129592:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1518721172/real 1518721172] req@ffff883d7c755d00 x1592481825872400/t0(0) o400->MGC10.0.2.51@o2ib5@10.0.2.51@o2ib5:26/25 lens 224/224 e 0 to 1 dl 1518721591 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1
[588554.198994] Lustre: 129600:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1518721147/real 1518721172] req@ffff883d7c752400 x1592481825865792/t0(0) o400->MGC10.0.2.51@o2ib5@10.0.2.51@o2ib5:26/25 lens 224/224 e 0 to 1 dl 1518721566 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1
[588554.198998] Lustre: 129600:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 14 previous similar messages
[588554.199005] LustreError: 166-1: MGC10.0.2.51@o2ib5: Connection to MGS (at 10.0.2.51@o2ib5) was lost; in progress operations using this service will fail
[588554.202476] Lustre: 129592:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 1 previous similar message
[588585.197300] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518721197/real 1518721197] req@ffff883d7c754200 x1592481825878512/t0(0) o250->MGC10.0.2.51@o2ib5@10.0.2.52@o2ib5:26/25 lens 520/544 e 0 to 1 dl 1518721203 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[588604.197082] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1518721222/real 1518721222] req@ffff883d7c751b00 x1592481825884608/t0(0) o250->MGC10.0.2.51@o2ib5@10.0.2.51@o2ib5:26/25 lens 520/544 e 0 to 1 dl 1518721233 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1
[588640.194769] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518721247/real 1518721247] req@ffff883d7c750f00 x1592481825892368/t0(0) o250->MGC10.0.2.51@o2ib5@10.0.2.52@o2ib5:26/25 lens 520/544 e 0 to 1 dl 1518721258 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[588654.194754] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1518721272/real 1518721272] req@ffff883d7c757200 x1592481825901632/t0(0) o250->MGC10.0.2.51@o2ib5@10.0.2.51@o2ib5:26/25 lens 520/544 e 0 to 1 dl 1518721288 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1
[588695.192215] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518721297/real 1518721297] req@ffff883d7c755d00 x1592481825909344/t0(0) o250->MGC10.0.2.51@o2ib5@10.0.2.52@o2ib5:26/25 lens 520/544 e 0 to 1 dl 1518721313 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[588750.189645] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518721347/real 1518721347] req@ffff883d7c753600 x1592481825926240/t0(0) o250->MGC10.0.2.51@o2ib5@10.0.2.52@o2ib5:26/25 lens 520/544 e 0 to 1 dl 1518721368 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[588750.190859] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 1 previous similar message
[588779.120323] LustreError: 131954:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1518721097, 300s ago), entering recovery for MGS@10.0.2.51@o2ib5 ns: MGC10.0.2.51@o2ib5 lock: ffff881bf48a5e00/0x806f959338b1a1da lrc: 4/1,0 mode: --/CR res: [0x6b616f:0x0:0x0].0x0 rrc: 2 type: PLN flags: 0x1000000000000 nid: local remote: 0x7117bf3e96f14bf0 expref: -99 pid: 131954 timeout: 0 lvb_type: 0
[588805.187100] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518721397/real 1518721397] req@ffff883d7c754800 x1592481825940112/t0(0) o250->MGC10.0.2.51@o2ib5@10.0.2.52@o2ib5:26/25 lens 520/544 e 0 to 1 dl 1518721423 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[588805.188491] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 1 previous similar message
[588885.183394] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518721472/real 1518721472] req@ffff883d7c753000 x1592481825964576/t0(0) o250->MGC10.0.2.51@o2ib5@10.0.2.52@o2ib5:26/25 lens 520/544 e 0 to 1 dl 1518721503 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[588885.184581] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 1 previous similar message
[589020.177119] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1518721597/real 0] req@ffff883d7c751b00 x1592481826002496/t0(0) o250->MGC10.0.2.51@o2ib5@10.0.2.51@o2ib5:26/25 lens 520/544 e 0 to 1 dl 1518721638 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
[589020.178079] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
[589034.872427] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.51@o2ib5: 5 seconds
[589034.872896] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 4 previous similar messages
[589202.825410] LustreError: 137-5: oak-OST004e_UUID: not available for connect from 10.210.47.164@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server.
[589202.826219] LustreError: Skipped 49 previous similar messages
[589203.888147] LustreError: 137-5: oak-OST004e_UUID: not available for connect from 10.210.44.242@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server.
[589203.888969] LustreError: Skipped 3 previous similar messages
[589204.168781] Lustre: Evicted from MGS (at 10.0.2.51@o2ib5) after server handle changed from 0x7117bf3e95d257ff to 0x6d1f3ec7b386fb45
[589204.169384] LustreError: 251469:0:(ldlm_resource.c:1100:ldlm_resource_complain()) MGC10.0.2.51@o2ib5: namespace resource [0x6b616f:0x0:0x0].0x0 (ffff881cd0bac780) refcount nonzero (1) after lock cleanup; forcing cleanup.
[589204.170238] LustreError: 251469:0:(ldlm_resource.c:1682:ldlm_resource_dump()) --- Resource: [0x6b616f:0x0:0x0].0x0 (ffff881cd0bac780) refcount = 2
[589204.170778] LustreError: 251469:0:(ldlm_resource.c:1703:ldlm_resource_dump()) Waiting locks:
[589204.171323] LustreError: 251469:0:(ldlm_resource.c:1705:ldlm_resource_dump()) ### ### ns: ?? lock: ffff881bf48a5e00/0x806f959338b1a1da lrc: 4/1,0 mode: --/CR res: ?? rrc=?? type: ??? flags: 0x1106400000000 nid: local remote: 0x7117bf3e96f14bf0 expref: -99 pid: 131954 timeout: 0 lvb_type: 0
[589204.172360] Lustre: MGC10.0.2.51@o2ib5: Connection restored to 10.0.2.51@o2ib5 (at 10.0.2.51@o2ib5)
[589204.172394] LustreError: 131954:0:(mgc_request.c:603:do_requeue()) failed processing log: -5
[589205.922125] LustreError: 137-5: oak-OST004e_UUID: not available for connect from 10.210.47.111@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server.
[589205.922910] LustreError: Skipped 25 previous similar messages
[589209.925248] LustreError: 137-5: oak-OST004e_UUID: not available for connect from 10.9.101.20@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[589209.925987] LustreError: Skipped 124 previous similar messages
[589218.061947] LustreError: 137-5: oak-OST004e_UUID: not available for connect from 10.210.46.185@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server.
[589218.068238] LustreError: Skipped 101 previous similar messages
[589234.392339] LustreError: 137-5: oak-OST004e_UUID: not available for connect from 10.12.4.64@o2ib (no target). If you are running an HA pair check that the target is mounted on the other server.
[589234.393088] LustreError: Skipped 35 previous similar messages
[589266.638063] LustreError: 137-5: oak-OST004e_UUID: not available for connect from 10.210.46.183@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server.
[589266.638793] LustreError: Skipped 310 previous similar messages
[589309.632828] Lustre: oak-OST0032: haven't heard from client 7b95e1a5-91bb-5f7d-09e5-ce63330d06f9 (at 10.12.4.68@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881cfccb6000, cur 1518721928 expire 1518721778 last 1518721701
[589309.633865] Lustre: Skipped 14 previous similar messages
[589330.688280] LustreError: 137-5: oak-OST004e_UUID: not available for connect from 10.9.105.54@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server.
[589330.689089] LustreError: Skipped 1277 previous similar messages
[589380.212614] LDISKFS-fs (md30): file extents enabled, maximum tree depth=5
[589380.509320] LDISKFS-fs (md30): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[589380.994475] Lustre: oak-OST004e: new disk, initializing
[589380.994795] Lustre: srv-oak-OST004e: No data found on store. Initialize space
[589381.002983] Lustre: oak-OST004e: Imperative Recovery not enabled, recovery window 300-900
[589381.128780] Lustre: oak-OST004e: Connection restored to dfe4adc3-947b-41de-05ed-c9677b1ffd7b (at 10.9.103.3@o2ib4)
[589382.186019] Lustre: oak-OST004e: Connection restored to b7ac9c1d-bf6f-e740-25a7-94d82e827d5e (at 10.210.47.74@o2ib3)
[589382.186509] Lustre: Skipped 15 previous similar messages
[589384.239582] Lustre: oak-OST004e: Connection restored to 04b33c24-14dd-abb7-cec5-178b44515661 (at 10.210.46.85@o2ib3)
[589384.240058] Lustre: Skipped 40 previous similar messages
[589388.241652] Lustre: oak-OST004e: Connection restored to 12d37605-02a7-aeb2-67f7-530ca37adb8c (at 10.210.44.121@o2ib3)
[589388.242210] Lustre: Skipped 580 previous similar messages
[589500.237025] LDISKFS-fs (md32): file extents enabled, maximum tree depth=5
[589500.558104] LDISKFS-fs (md32): mounted filesystem with ordered data mode. Opts: errors=remount-ro
[589501.539927] LDISKFS-fs (md32): file extents enabled, maximum tree depth=5
[589501.852797] LDISKFS-fs (md32): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[589502.079783] Lustre: oak-OST0050: new disk, initializing
[589502.080087] Lustre: srv-oak-OST0050: No data found on store. Initialize space
[589502.088352] Lustre: oak-OST0050: Imperative Recovery not enabled, recovery window 300-900
[589507.053086] Lustre: oak-OST0050: Connection restored to e4a093cd-c66a-c919-1e4f-24f9a3c7e16a (at 10.210.44.90@o2ib3)
[589507.053602] Lustre: Skipped 618 previous similar messages
[589574.875608] LDISKFS-fs (md34): file extents enabled, maximum tree depth=5
[589575.200061] LDISKFS-fs (md34): mounted filesystem with ordered data mode. Opts: errors=remount-ro
[589576.143544] LDISKFS-fs (md34): file extents enabled, maximum tree depth=5
[589576.462142] LDISKFS-fs (md34): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[589576.780996] Lustre: oak-OST0052: new disk, initializing
[589576.781306] Lustre: srv-oak-OST0052: No data found on store. Initialize space
[589576.790594] Lustre: oak-OST0052: Imperative Recovery not enabled, recovery window 300-900
[589581.767083] Lustre: oak-OST0052: Connection restored to 23c6a6f7-121e-7f52-cded-484d2c105cca (at 10.9.104.41@o2ib4)
[589581.767598] Lustre: Skipped 1257 previous similar messages
[589881.937261] Lustre: oak-OST0034: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[589881.937263] Lustre: oak-OST0038: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[589881.937268] Lustre: Skipped 1257 previous similar messages
[589881.938465] Lustre: Skipped 13 previous similar messages
[590505.721205] Lustre: oak-OST0030: Connection restored to 8db677fd-c2b7-fbdd-5cdb-bbd8f49841be (at 10.12.4.76@o2ib)
[590505.721694] Lustre: Skipped 16 previous similar messages
[590856.405074] Lustre: oak-OST0030: Connection restored to c7cd2e66-750a-5f80-db94-ac9b167051ad (at 10.12.4.30@o2ib)
[590856.405561] Lustre: Skipped 13 previous similar messages
[590933.831257] Lustre: oak-OST0034: haven't heard from client 34aa28cd-2941-4f01-f6cb-0a19c0d6c4aa (at 10.12.4.68@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881eaf33fc00, cur 1518723552 expire 1518723402 last 1518723325
[590933.832347] Lustre: Skipped 14 previous similar messages
[590934.363416] Lustre: oak-OST003e: haven't heard from client 34aa28cd-2941-4f01-f6cb-0a19c0d6c4aa (at 10.12.4.68@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d03a6ac00, cur 1518723552 expire 1518723402 last 1518723325
[590934.364394] Lustre: Skipped 3 previous similar messages
[593873.538741] Lustre: oak-OST0030: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[593873.539235] Lustre: Skipped 32 previous similar messages
[594743.115370] LustreError: 168-f: oak-OST0036: BAD WRITE CHECKSUM: from 12345-10.0.2.230@o2ib5 inode [0x20000c36f:0xf25:0x0] object 0x0:4129770 extent [914358272-918552575]: client csum 86d9cff0, server csum e3745cf7
[594745.438624] LustreError: 168-f: oak-OST0036: BAD WRITE CHECKSUM: from 12345-10.0.2.230@o2ib5 inode [0x20000c36f:0xf25:0x0] object 0x0:4129770 extent [914358272-918552575]: client csum 86d9cff0, server csum e3745cf7
[594745.439395] LustreError: Skipped 1 previous similar message
[594748.436294] LustreError: 168-f: oak-OST0036: BAD WRITE CHECKSUM: from 12345-10.0.2.230@o2ib5 inode [0x20000c36f:0xf25:0x0] object 0x0:4129770 extent [914358272-918552575]: client csum 86d9cff0, server csum e3745cf7
[594752.378933] LustreError: 168-f: oak-OST0036: BAD WRITE CHECKSUM: from 12345-10.0.2.230@o2ib5 inode [0x20000c36f:0xf25:0x0] object 0x0:4129770 extent [914358272-918552575]: client csum 86d9cff0, server csum e3745cf7
[594757.391922] LustreError: 168-f: oak-OST0036: BAD WRITE CHECKSUM: from 12345-10.0.2.230@o2ib5 inode [0x20000c36f:0xf25:0x0] object 0x0:4129770 extent [914358272-918552575]: client csum 86d9cff0, server csum e3745cf7
[594770.263429] LustreError: 168-f: oak-OST0036: BAD WRITE CHECKSUM: from 12345-10.0.2.230@o2ib5 inode [0x20000c36f:0xf25:0x0] object 0x0:4129770 extent [914358272-918552575]: client csum 86d9cff0, server csum e3745cf7
[594770.264210] LustreError: Skipped 1 previous similar message
[594789.168288] LustreError: 168-f: oak-OST0036: BAD WRITE CHECKSUM: from 12345-10.0.2.230@o2ib5 inode [0x20000c36f:0xf25:0x0] object 0x0:4129770 extent [914358272-918552575]: client csum 86d9cff0, server csum e3745cf7
[594789.169063] LustreError: Skipped 2 previous similar messages
[594895.373122] LustreError: 132-0: oak-OST0044: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x20000c36e:0x1b17:0x0] object 0x0:4275642 extent [775946240-780140543], client returned csum c7b5ec88 (type 4), server csum ee3e2d97 (type 4)
[594897.361225] LustreError: 132-0: oak-OST0044: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x20000c36e:0x1b17:0x0] object 0x0:4275642 extent [775946240-780140543], client returned csum c7b5ec88 (type 4), server csum ee3e2d97 (type 4)
[594900.376076] LustreError: 132-0: oak-OST0044: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x20000c36e:0x1b17:0x0] object 0x0:4275642 extent [775946240-780140543], client returned csum c7b5ec88 (type 4), server csum ee3e2d97 (type 4)
[594904.401350] LustreError: 132-0: oak-OST0044: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x20000c36e:0x1b17:0x0] object 0x0:4275642 extent [775946240-780140543], client returned csum c7b5ec88 (type 4), server csum ee3e2d97 (type 4)
[594909.380225] LustreError: 132-0: oak-OST0044: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x20000c36e:0x1b17:0x0] object 0x0:4275642 extent [775946240-780140543], client returned csum c7b5ec88 (type 4), server csum ee3e2d97 (type 4)
[594921.387981] LustreError: 132-0: oak-OST0044: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x20000c36e:0x1b17:0x0] object 0x0:4275642 extent [775946240-780140543], client returned csum c7b5ec88 (type 4), server csum ee3e2d97 (type 4)
[594921.388943] LustreError: Skipped 1 previous similar message
[594940.177020] LustreError: 132-0: oak-OST0044: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x20000c36e:0x1b17:0x0] object 0x0:4275642 extent [775946240-780140543], client returned csum c7b5ec88 (type 4), server csum ee3e2d97 (type 4)
[594940.177999] LustreError: Skipped 2 previous similar messages
[594973.176541] LustreError: 132-0: oak-OST0044: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x20000c36e:0x1b17:0x0] object 0x0:4275642 extent [775946240-780140543], client returned csum c7b5ec88 (type 4), server csum ee3e2d97 (type 4)
[594973.177501] LustreError: Skipped 7 previous similar messages
[595207.341737] Lustre: oak-OST004e: haven't heard from client 0edc08ec-1c7b-767d-afe5-3bc1b9f9ec1a (at 10.12.4.30@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881f0a732800, cur 1518727826 expire 1518727676 last 1518727599
[595207.342758] Lustre: Skipped 13 previous similar messages
[595300.606834] Lustre: oak-OST0030: haven't heard from client cd6fe05c-e91c-6f5b-6754-9cebbf0e1515 (at 10.12.4.68@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d00dc8800, cur 1518727919 expire 1518727769 last 1518727692
[595300.607808] Lustre: Skipped 35 previous similar messages
[595301.611638] Lustre: oak-OST0038: haven't heard from client cd6fe05c-e91c-6f5b-6754-9cebbf0e1515 (at 10.12.4.68@o2ib) in 228 seconds. I think it's dead, and I am evicting it. exp ffff88160519a400, cur 1518727920 expire 1518727770 last 1518727692
[595301.612673] Lustre: Skipped 12 previous similar messages
[595418.378713] Lustre: oak-OST0032: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[595418.378714] Lustre: oak-OST0030: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[595418.379668] Lustre: Skipped 16 previous similar messages
[595462.348729] LustreError: 132-0: oak-OST0044: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x20000c36e:0x1b17:0x0] object 0x0:4275642 extent [775946240-780140543], client returned csum c7b5ec88 (type 4), server csum ee3e2d97 (type 4)
[595462.349692] LustreError: Skipped 2 previous similar messages
[596811.205748] LustreError: 168-f: oak-OST0036: BAD WRITE CHECKSUM: from 12345-10.0.2.230@o2ib5 inode [0x20000c36f:0xf25:0x0] object 0x0:4129770 extent [2348810240-2353004543]: client csum ffe750c0, server csum cc5c4ea3
[596816.396433] LustreError: 168-f: oak-OST0036: BAD WRITE CHECKSUM: from 12345-10.0.2.230@o2ib5 inode [0x20000c36f:0xf25:0x0] object 0x0:4129770 extent [2348810240-2353004543]: client csum ffe750c0, server csum cc5c4ea3
[596816.403354] LustreError: Skipped 2 previous similar messages
[596825.288201] LustreError: 168-f: oak-OST0036: BAD WRITE CHECKSUM: from 12345-10.0.2.230@o2ib5 inode [0x20000c36f:0xf25:0x0] object 0x0:4129770 extent [2348810240-2353004543]: client csum ffe750c0, server csum cc5c4ea3
[596825.289018] LustreError: Skipped 1 previous similar message
[596844.229025] LustreError: 168-f: oak-OST0036: BAD WRITE CHECKSUM: from 12345-10.0.2.230@o2ib5 inode [0x20000c36f:0xf25:0x0] object 0x0:4129770 extent [2348810240-2353004543]: client csum ffe750c0, server csum cc5c4ea3
[596844.229833] LustreError: Skipped 2 previous similar messages
[597867.037616] Lustre: oak-OST0030: Connection restored to c7cd2e66-750a-5f80-db94-ac9b167051ad (at 10.12.4.30@o2ib)
[597867.038086] Lustre: Skipped 14 previous similar messages
[599170.295628] Lustre: oak-OST0040: haven't heard from client 19748cb7-47c6-5d4c-a906-e7124cd26d05 (at 10.12.4.68@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881cfec37400, cur 1518731789 expire 1518731639 last 1518731562
[599170.296636] Lustre: Skipped 4 previous similar messages
[599170.859941] Lustre: oak-OST003a: haven't heard from client 19748cb7-47c6-5d4c-a906-e7124cd26d05 (at 10.12.4.68@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d0319c800, cur 1518731789 expire 1518731639 last 1518731562
[599170.860990] Lustre: Skipped 2 previous similar messages
[599171.873574] Lustre: oak-OST0038: haven't heard from client 19748cb7-47c6-5d4c-a906-e7124cd26d05 (at 10.12.4.68@o2ib) in 228 seconds. I think it's dead, and I am evicting it. exp ffff881d03199400, cur 1518731790 expire 1518731640 last 1518731562
[599171.874568] Lustre: Skipped 7 previous similar messages
[599314.268769] Lustre: oak-OST0034: Connection restored to b3426c79-579f-36c8-108d-d63d7b0c1c6a (at 10.12.4.35@o2ib)
[599314.268770] Lustre: oak-OST0030: Connection restored to b3426c79-579f-36c8-108d-d63d7b0c1c6a (at 10.12.4.35@o2ib)
[599314.268773] Lustre: Skipped 1 previous similar message
[599314.269971] Lustre: Skipped 15 previous similar messages
[599461.345416] Lustre: oak-OST0030: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[599461.345417] Lustre: oak-OST0032: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[599461.346505] Lustre: Skipped 16 previous similar messages
[599860.562935] Lustre: oak-OST0032: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[599860.562936] Lustre: oak-OST0030: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[599860.562937] Lustre: oak-OST0034: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[599860.562938] Lustre: Skipped 1 previous similar message
[599860.562941] Lustre: Skipped 1 previous similar message
[599860.564876] Lustre: Skipped 14 previous similar messages
[601885.908295] Lustre: oak-OST0030: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[601885.908775] Lustre: Skipped 4 previous similar messages
[610357.717807] Lustre: 129584:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518742720/real 1518742720] req@ffff881e7bf2b600 x1592481831763168/t0(0) o101->oak-MDT0000-lwp-OST0050@10.0.2.52@o2ib5:23/10 lens 456/496 e 1 to 1 dl 1518742977 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[610357.719115] Lustre: 129584:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
[610357.719673] Lustre: oak-MDT0000-lwp-OST0050: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete
[610357.720430] Lustre: Skipped 14 previous similar messages
[610357.721928] Lustre: oak-MDT0000-lwp-OST0050: Connection restored to 10.0.2.52@o2ib5 (at 10.0.2.52@o2ib5)
[610357.722453] Lustre: Skipped 12 previous similar messages
[610799.697163] Lustre: 129586:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518742968/real 1518742968] req@ffff881e7bf2e000 x1592481831766624/t0(0) o101->oak-MDT0000-lwp-OST0044@10.0.2.52@o2ib5:23/10 lens 456/496 e 0 to 1 dl 1518743419 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[610799.698338] Lustre: oak-MDT0000-lwp-OST0044: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete
[610799.699913] Lustre: oak-MDT0000-lwp-OST0044: Connection restored to 10.0.2.52@o2ib5 (at 10.0.2.52@o2ib5)
[611033.445229] Lustre: 129564:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518743061/real 1518743061] req@ffff881623175d00 x1592481831767904/t0(0) o101->oak-MDT0000-lwp-OST0050@10.0.2.52@o2ib5:23/10 lens 456/496 e 0 to 1 dl 1518743652 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[611033.446454] Lustre: oak-MDT0000-lwp-OST0050: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete
[611033.448116] Lustre: oak-MDT0000-lwp-OST0050: Connection restored to 10.0.2.52@o2ib5 (at 10.0.2.52@o2ib5)
[611386.669693] Lustre: 129565:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518743250/real 1518743250] req@ffff881623170f00 x1592481831770032/t0(0) o101->oak-MDT0000-lwp-OST0050@10.0.2.52@o2ib5:23/10 lens 456/496 e 0 to 1 dl 1518744006 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[611402.429991] Lustre: oak-MDT0000-lwp-OST0052: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete
[611402.431008] Lustre: oak-MDT0000-lwp-OST0052: Connection restored to 10.0.2.52@o2ib5 (at 10.0.2.52@o2ib5)
[611426.428851] Lustre: oak-MDT0000-lwp-OST004e: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete
[611426.430087] Lustre: oak-MDT0000-lwp-OST004e: Connection restored to 10.0.2.52@o2ib5 (at 10.0.2.52@o2ib5)
[611556.825747] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 5 seconds
[611556.826320] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Timed out RDMA with 10.0.2.52@o2ib5 (55): c: 0, oc: 0, rc: 8
[611556.827309] Lustre: oak-MDT0000-lwp-OST004a: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete
[611556.828099] Lustre: Skipped 7 previous similar messages
[611569.825159] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 18 seconds
[611569.825756] LustreError: 129598:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST004e qtype:grp id:3262 enforced:1 granted:268435456 pending:0 waiting:0 req:1 usage:493364 qunit:0 qtune:0 edquot:0
[611569.826802] LustreError: 129598:0:(qsd_handler.c:340:qsd_req_completion()) Skipped 1 previous similar message
[611581.824578] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 30 seconds
[611581.825084] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 7 previous similar messages
[611581.825652] Lustre: oak-MDT0000-lwp-OST0034: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete
[611581.826405] Lustre: Skipped 8 previous similar messages
[611594.823990] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 43 seconds
[611594.824468] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 7 previous similar messages
[611607.823397] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 56 seconds
[611607.823892] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 7 previous similar messages
[611619.822809] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 68 seconds
[611619.823288] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 7 previous similar messages
[611632.822255] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 81 seconds
[611632.822764] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 7 previous similar messages
[611644.418685] Lustre: 129603:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518743507/real 1518743507] req@ffff883b416a3600 x1592481831773136/t0(0) o101->oak-MDT0000-lwp-OST0044@10.0.2.52@o2ib5:23/10 lens 456/496 e 0 to 1 dl 1518744263 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[611644.419939] Lustre: 129603:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 65 previous similar messages
[611718.586922] Lustre: oak-OST004e: haven't heard from client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883c8b01bc00, cur 1518744338 expire 1518744188 last 1518744111
[611718.587696] Lustre: Skipped 6 previous similar messages
[611720.592397] Lustre: oak-OST0050: haven't heard from client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) in 228 seconds. I think it's dead, and I am evicting it. exp ffff883c8a3e8c00, cur 1518744340 expire 1518744190 last 1518744112
[611721.656935] Lustre: oak-OST0034: haven't heard from client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883c89e16000, cur 1518744341 expire 1518744191 last 1518744114
[611809.813902] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 1 seconds
[611809.814384] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Timed out RDMA with 10.0.2.52@o2ib5 (51): c: 0, oc: 0, rc: 8
[611887.583793] Lustre: oak-OST0030: Connection restored to oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5)
[611895.810182] LustreError: 167-0: oak-MDT0000-lwp-OST0038: This client was evicted by oak-MDT0000; in progress operations using this service will fail.
[611895.810703] LustreError: Skipped 14 previous similar messages
[611895.811351] LustreError: 129609:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST0050 qtype:grp id:4718 enforced:1 granted:67108864 pending:0 waiting:0 req:1 usage:27540 qunit:0 qtune:0 edquot:0
[611895.812086] LustreError: 129609:0:(qsd_handler.c:340:qsd_req_completion()) Skipped 3 previous similar messages
[611928.310191] Lustre: oak-OST003a: deleting orphan objects from 0x0:4283138 to 0x0:4283169
[611928.310200] Lustre: oak-OST003c: deleting orphan objects from 0x0:4306721 to 0x0:4306753
[611928.310220] Lustre: oak-OST0042: deleting orphan objects from 0x0:4209177 to 0x0:4209217
[611928.310221] Lustre: oak-OST0030: deleting orphan objects from 0x0:4344642 to 0x0:4344673
[611928.310227] Lustre: oak-OST0048: deleting orphan objects from 0x0:3303911 to 0x0:3303937
[611928.310228] Lustre: oak-OST0040: deleting orphan objects from 0x0:4316036 to 0x0:4316065
[611928.310232] Lustre: oak-OST004c: deleting orphan objects from 0x0:3318441 to 0x0:3318465
[611928.310235] Lustre: oak-OST0044: deleting orphan objects from 0x0:4279409 to 0x0:4279425
[611928.310242] Lustre: oak-OST004e: deleting orphan objects from 0x0:7290 to 0x0:7329
[611928.310245] Lustre: oak-OST0050: deleting orphan objects from 0x0:7278 to 0x0:7297
[611928.310274] Lustre: oak-OST0032: deleting orphan objects from 0x0:4297230 to 0x0:4297249
[611928.310277] Lustre: oak-OST0036: deleting orphan objects from 0x0:4133082 to 0x0:4133121
[611928.310294] Lustre: oak-OST0052: deleting orphan objects from 0x0:7089 to 0x0:7105
[611928.310302] Lustre: oak-OST0046: deleting orphan objects from 0x0:4331818 to 0x0:4331841
[611928.310308] Lustre: oak-OST004a: deleting orphan objects from 0x0:3299802 to 0x0:3299841
[611928.310313] Lustre: oak-OST003e: deleting orphan objects from 0x0:4167052 to 0x0:4167073
[611928.310588] Lustre: oak-OST0034: deleting orphan objects from 0x0:4169190 to 0x0:4169217
[611928.310605] Lustre: oak-OST0038: deleting orphan objects from 0x0:4262259 to 0x0:4262305
[611945.807831] LustreError: 167-0: oak-MDT0000-lwp-OST0030: This client was evicted by oak-MDT0000; in progress operations using this service will fail.
[611945.808334] LustreError: Skipped 8 previous similar messages
[611945.809407] Lustre: oak-MDT0000-lwp-OST0032: Connection restored to 10.0.2.52@o2ib5 (at 10.0.2.52@o2ib5)
[611945.809893] Lustre: Skipped 32 previous similar messages
[625635.924036] Lustre: oak-OST004c: haven't heard from client d1679d70-c9ec-d53d-f2c0-30f7f69fbf83 (at 10.12.4.49@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cac67e800, cur 1518758256 expire 1518758106 last 1518758029
[625635.925036] Lustre: Skipped 15 previous similar messages
[652922.345294] Lustre: oak-OST0034: Connection restored to d1679d70-c9ec-d53d-f2c0-30f7f69fbf83 (at 10.12.4.49@o2ib)
[652922.345295] Lustre: oak-OST0036: Connection restored to d1679d70-c9ec-d53d-f2c0-30f7f69fbf83 (at 10.12.4.49@o2ib)
[652922.345296] Lustre: oak-OST0030: Connection restored to d1679d70-c9ec-d53d-f2c0-30f7f69fbf83 (at 10.12.4.49@o2ib)
[652922.345297] Lustre: oak-OST0032: Connection restored to d1679d70-c9ec-d53d-f2c0-30f7f69fbf83 (at 10.12.4.49@o2ib)
[652922.345298] Lustre: Skipped 1 previous similar message
[652922.345298] Lustre: Skipped 1 previous similar message
[652922.345301] Lustre: Skipped 1 previous similar message
[652922.347900] Lustre: Skipped 13 previous similar messages
[662800.203524] Lustre: oak-OST0032: haven't heard from client b6f3bc6f-c100-f3ea-3d54-0482b1b40196 (at 10.8.15.2@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ff35ea800, cur 1518795422 expire 1518795272 last 1518795195
[662800.204530] Lustre: Skipped 17 previous similar messages
[674291.921978] Lustre: 129594:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518806313/real 1518806313] req@ffff882b22115700 x1592481880766656/t0(0) o601->oak-MDT0000-lwp-OST0046@10.0.2.52@o2ib5:23/10 lens 336/336 e 7 to 1 dl 1518806914 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[674291.923209] Lustre: 129594:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 147 previous similar messages
[674291.923687] Lustre: oak-MDT0000-lwp-OST0046: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete
[674291.924397] Lustre: Skipped 1 previous similar message
[674291.925477] Lustre: oak-MDT0000-lwp-OST0046: Connection restored to 10.0.2.52@o2ib5 (at 10.0.2.52@o2ib5)
[674312.920984] Lustre: oak-MDT0000-lwp-OST004c: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete
[674312.922618] Lustre: oak-MDT0000-lwp-OST004c: Connection restored to 10.0.2.52@o2ib5 (at 10.0.2.52@o2ib5)
[674342.850622] Lustre: oak-MDT0000-lwp-OST004a: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete
[674347.128296] Lustre: oak-OST0030: Client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) reconnecting
[674347.128838] Lustre: oak-OST0030: Connection restored to oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5)
[674356.849932] Lustre: 129611:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1518806972/real 0] req@ffff8821e77fd100 x1592481880788880/t0(0) o103->oak-MDT0000-lwp-OST0032@10.0.2.52@o2ib5:17/18 lens 328/224 e 0 to 1 dl 1518806979 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
[674356.849934] Lustre: 129607:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1518806972/real 0] req@ffff882f25920f00 x1592481880789200/t0(0) o103->oak-MDT0000-lwp-OST0032@10.0.2.52@o2ib5:17/18 lens 328/224 e 0 to 1 dl 1518806979 ref 2 fl Rpc:X/0/ffffffff rc 0/-1
[674356.849938] Lustre: 129607:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 253 previous similar messages
[674356.852347] Lustre: 129611:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 31 previous similar messages
[674359.849823] Lustre: oak-MDT0000-lwp-OST0034: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete
[674359.850522] Lustre: Skipped 10 previous similar messages
[674374.367528] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.0.2.52@o2ib5 (no target). If you are running an HA pair check that the target is mounted on the other server.
[674374.368254] LustreError: Skipped 1215 previous similar messages
[674388.909446] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 3 seconds
[674388.909932] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Timed out RDMA with 10.0.2.52@o2ib5 (14): c: 0, oc: 2, rc: 8
[674388.910895] LustreError: 129580:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST003c qtype:grp id:3779 enforced:1 granted:61766956 pending:0 waiting:0 req:1 usage:45992760 qunit:0 qtune:0 edquot:0
[674388.911914] LustreError: 129580:0:(qsd_handler.c:340:qsd_req_completion()) Skipped 2 previous similar messages
[674399.366301] Lustre: oak-OST0032: Client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) reconnecting
[674399.366302] Lustre: oak-OST0030: Client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) reconnecting
[674399.366306] Lustre: Skipped 17 previous similar messages
[674399.366324] Lustre: oak-OST0030: Connection restored to oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5)
[674399.366325] Lustre: Skipped 17 previous similar messages
[674399.368294] Lustre: Skipped 16 previous similar messages
[674424.365100] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.0.2.52@o2ib5 (no target). If you are running an HA pair check that the target is mounted on the other server.
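Note: the run above is one MGS outage as seen from this OSS: ping (o400) and connect (o250) RPCs to MGC10.0.2.51@o2ib5 expire, the config lock enqueue times out after 300s, the oak-OST004e mount aborts with -5, and once the MGS returns the client is evicted because the server handle changed, which indicates the MGS instance restarted or failed over. A minimal Python sketch for timing these connection flaps from a saved capture; the file name oss-dmesg.log is hypothetical, and only the "Connection ... was lost" / "Connection restored" wording shown above is assumed:

import re
from collections import defaultdict

# One dmesg entry per line, e.g. "[588479.126235] LustreError: ..."
LINE = re.compile(r"\[\s*(\d+\.\d+)\]\s*(.*)")
LOST = re.compile(r"(\S+): Connection to (\S+) \(at (\S+)\) was lost")
RESTORED = re.compile(r"(\S+): Connection restored to .*\(at (\S+)\)")

def flap_timeline(path):
    events = defaultdict(list)          # import name -> [(uptime, state)]
    with open(path) as fh:
        for raw in fh:
            m = LINE.match(raw)
            if not m:
                continue
            ts, msg = float(m.group(1)), m.group(2)
            hit = LOST.search(msg)
            if hit:
                events[hit.group(1)].append((ts, "lost"))
            else:
                hit = RESTORED.search(msg)
                if hit:
                    events[hit.group(1)].append((ts, "restored"))
    return events

if __name__ == "__main__":
    # Hypothetical file name; point this at the saved console log.
    for name, evs in sorted(flap_timeline("oss-dmesg.log").items()):
        print(f"{name}: {len(evs)} events, first {evs[0]}, last {evs[-1]}")

Because the log rate-limits with "Skipped N previous similar messages", the counts this prints are lower bounds, not exact flap counts.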
[674424.365877] LustreError: Skipped 17 previous similar messages
[674449.363899] Lustre: oak-OST0032: Client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) reconnecting
[674449.363901] Lustre: oak-OST0030: Client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) reconnecting
[674449.363917] Lustre: oak-OST0030: Connection restored to oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5)
[674449.363918] Lustre: oak-OST0034: Connection restored to oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5)
[674449.363919] Lustre: Skipped 16 previous similar messages
[674449.363919] Lustre: Skipped 16 previous similar messages
[674449.366363] Lustre: Skipped 16 previous similar messages
[674451.906545] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 12 seconds
[674451.907028] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Timed out RDMA with 10.0.2.52@o2ib5 (2): c: 0, oc: 3, rc: 8
[674451.908025] LustreError: 129580:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST004a qtype:grp id:3325 enforced:1 granted:1073741824 pending:0 waiting:0 req:1 usage:1034779772 qunit:0 qtune:0 edquot:0
[674451.909115] LustreError: 129580:0:(qsd_handler.c:340:qsd_req_completion()) Skipped 7 previous similar messages
[674461.983100] Lustre: oak-OST0032: Client 8759322d-2934-56c4-478d-94af3a5ef586 (at 10.210.47.185@o2ib3) reconnecting
[674461.983132] Lustre: oak-OST0038: Connection restored to 8759322d-2934-56c4-478d-94af3a5ef586 (at 10.210.47.185@o2ib3)
[674461.983133] Lustre: Skipped 16 previous similar messages
[674461.984352] Lustre: Skipped 3 previous similar messages
[674474.363091] LustreError: 137-5: oak-OST003b_UUID: not available for connect from 10.0.2.52@o2ib5 (no target). If you are running an HA pair check that the target is mounted on the other server.
[674474.363925] LustreError: Skipped 17 previous similar messages
[674499.361567] Lustre: oak-OST0030: Client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) reconnecting
[674499.361607] Lustre: oak-OST0032: Connection restored to oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5)
[674499.361609] Lustre: Skipped 3 previous similar messages
[674499.362891] Lustre: Skipped 16 previous similar messages
[674503.601165] LNet: Service thread pid 251684 was inactive for 200.08s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[674503.601882] Pid: 251684, comm: ll_ost_io01_026
[674503.602144] Call Trace:
[674503.602625] [] schedule+0x29/0x70
[674503.602887] [] schedule_timeout+0x174/0x2c0
[674503.603165] [] ? process_timeout+0x0/0x10
[674503.603479] [] ptlrpc_set_wait+0x4c0/0x910 [ptlrpc]
[674503.603743] [] ? default_wake_function+0x0/0x20
[674503.604009] [] ? qsd_req_completion+0x0/0xb20 [lquota]
[674503.604293] [] ptlrpc_queue_wait+0x7d/0x220 [ptlrpc]
[674503.604632] [] qsd_send_dqacq+0x2e8/0x340 [lquota]
[674503.604896] [] qsd_acquire+0x8a3/0xc70 [lquota]
[674503.605174] [] qsd_op_begin0+0x181/0x940 [lquota]
[674503.605422] [] ? kiblnd_queue_tx+0x36/0x50 [ko2iblnd]
[674503.605706] [] qsd_op_begin+0x259/0x4d0 [lquota]
[674503.605965] [] osd_declare_qid+0x1f0/0x480 [osd_ldiskfs]
[674503.606237] [] osd_declare_inode_qid+0x203/0x3b0 [osd_ldiskfs]
[674503.612512] [] osd_declare_write_commit+0x307/0x500 [osd_ldiskfs]
[674503.613003] [] ofd_commitrw_write+0x74d/0x1c50 [ofd]
[674503.613271] [] ofd_commitrw+0x4b9/0xac0 [ofd]
[674503.613566] [] obd_commitrw+0x2ed/0x330 [ptlrpc]
[674503.613897] [] tgt_brw_write+0xff1/0x17c0 [ptlrpc]
[674503.614155] [] ? update_curr+0x104/0x190
[674503.614399] [] ? __enqueue_entity+0x78/0x80
[674503.614691] [] ? enqueue_entity+0x26c/0xb60
[674503.614969] [] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[674503.615269] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[674503.615549] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[674503.616092] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[674503.616361] [] ? default_wake_function+0x12/0x20
[674503.616641] [] ? __wake_up_common+0x58/0x90
[674503.616964] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[674503.617248] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[674503.617496] [] kthread+0xcf/0xe0
[674503.617788] [] ? kthread+0x0/0xe0
[674503.618045] [] ret_from_fork+0x58/0x90
[674503.618304] [] ? kthread+0x0/0xe0
[674503.618541]
[674503.618874] LustreError: dumping log to /tmp/lustre-log.1518807125.251684
[674514.903631] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 12 seconds
[674514.904112] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Timed out RDMA with 10.0.2.52@o2ib5 (15): c: 0, oc: 0, rc: 8
[674514.905212] Lustre: 129569:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1518806958/real 1518807137] req@ffff881448134800 x1592481880780176/t0(0) o101->oak-MDT0000-lwp-OST004a@10.0.2.52@o2ib5:23/10 lens 456/496 e 0 to 1 dl 1518807714 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1
[674514.905216] Lustre: 129571:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1518806958/real 1518807137] req@ffff881448136000 x1592481880780208/t0(0) o101->oak-MDT0000-lwp-OST004a@10.0.2.52@o2ib5:23/10 lens 456/496 e 0 to 1 dl 1518807714 ref 1 fl Rpc:eXN/0/ffffffff rc 0/-1
[674514.905223] Lustre: 129571:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 157 previous similar messages
[674514.905243] LustreError: 129567:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST004a qtype:grp id:3526 enforced:1 granted:1073741824 pending:0 waiting:0 req:1 usage:772766836 qunit:0 qtune:0 edquot:0
[674514.905246] LustreError: 129568:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST004a qtype:grp id:3199 enforced:1 granted:16777216 pending:0 waiting:0 req:1 usage:445312 qunit:0 qtune:0 edquot:0
[674514.905248] LustreError: 129567:0:(qsd_handler.c:340:qsd_req_completion()) Skipped 1 previous similar message
[674514.905250] LustreError: 129568:0:(qsd_handler.c:340:qsd_req_completion()) Skipped 2 previous similar messages
[674520.163819] Lustre: oak-OST0038: Client b4859a0d-06c7-b585-66b6-8e7eadf483bd (at 10.210.47.242@o2ib3) reconnecting
[674520.163820] Lustre: oak-OST0034: Client b4859a0d-06c7-b585-66b6-8e7eadf483bd (at 10.210.47.242@o2ib3) reconnecting
[674520.164789] Lustre: Skipped 16 previous similar messages
[674538.724049] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.210.47.6@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server.
[674538.724866] LustreError: Skipped 121 previous similar messages
[674545.341487] Lustre: oak-OST0030: Connection restored to 8759322d-2934-56c4-478d-94af3a5ef586 (at 10.210.47.185@o2ib3)
[674545.341488] Lustre: oak-OST0032: Connection restored to 8759322d-2934-56c4-478d-94af3a5ef586 (at 10.210.47.185@o2ib3)
[674545.341491] Lustre: Skipped 145 previous similar messages
[674545.342688] Lustre: Skipped 11 previous similar messages
[674565.901285] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 0 seconds
[674565.901867] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Timed out RDMA with 10.0.2.52@o2ib5 (16): c: 0, oc: 0, rc: 8
[674565.902859] LustreError: 129573:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST0050 qtype:grp id:4721 enforced:1 granted:268435456 pending:0 waiting:0 req:1 usage:686668 qunit:0 qtune:0 edquot:0
[674565.902861] LustreError: 129571:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST0050 qtype:grp id:3367 enforced:1 granted:279867380 pending:0 waiting:0 req:1 usage:12231520 qunit:0 qtune:0 edquot:0
[674565.902864] LustreError: 129571:0:(qsd_handler.c:340:qsd_req_completion()) Skipped 2 previous similar messages
[674565.905329] LustreError: 129573:0:(qsd_handler.c:340:qsd_req_completion()) Skipped 1 previous similar message
[674628.898377] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 12 seconds
[674628.898875] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Timed out RDMA with 10.0.2.52@o2ib5 (29): c: 0, oc: 2, rc: 8
[674628.900021] LustreError: 129573:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST0052 qtype:grp id:3372 enforced:1 granted:67108864 pending:0 waiting:0 req:1 usage:2896736 qunit:0 qtune:0 edquot:0
[674628.901037] LustreError: 129573:0:(qsd_handler.c:340:qsd_req_completion()) Skipped 3 previous similar messages
[674649.354681] Lustre: oak-OST0032: Client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) reconnecting
[674649.354682] Lustre: oak-OST0030: Client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) reconnecting
[674649.354685] Lustre: Skipped 238 previous similar messages
[674649.355923] Lustre: Skipped 15 previous similar messages
[674691.895525] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 12 seconds
[674691.896027] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Timed out RDMA with 10.0.2.52@o2ib5 (42): c: 0, oc: 3, rc: 8
[674691.896976] LustreError: 129591:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST0038 qtype:grp id:3593 enforced:1 granted:132121040 pending:0 waiting:0 req:1 usage:95112220 qunit:0 qtune:0 edquot:0
[674691.898022] LustreError: 129591:0:(qsd_handler.c:340:qsd_req_completion()) Skipped 7 previous similar messages
[674699.352528] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.0.2.52@o2ib5 (no target). If you are running an HA pair check that the target is mounted on the other server.
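Note: the stack dump above shows an OST I/O service thread (ll_ost_io01_026) blocked in the quota slave path (qsd_acquire -> qsd_send_dqacq -> ptlrpc_queue_wait -> ptlrpc_set_wait), i.e. a bulk write stalled waiting for a DQACQ reply from the quota master on the unreachable MDT, which is why the DQACQ failures and the hung-thread warning appear together. A sketch that pairs the "was inactive" / "completed after" thread messages and tallies DQACQ failures per quota slave; the log path is hypothetical, and the regexes assume only the message texts shown above:

import re
from collections import Counter

INACTIVE = re.compile(r"Service thread pid (\d+) was inactive for ([\d.]+)s")
COMPLETED = re.compile(r"Service thread pid (\d+) completed after ([\d.]+)s")
DQACQ = re.compile(r"DQACQ failed with (-?\d+).*?qsd:(\S+)")

def scan(path):
    hung = {}                 # pid -> seconds inactive when first reported
    dqacq = Counter()         # (qsd target, rc) -> occurrences
    for line in open(path):
        m = INACTIVE.search(line)
        if m:
            hung[m.group(1)] = float(m.group(2))
        m = COMPLETED.search(line)
        if m and m.group(1) in hung:
            print(f"pid {m.group(1)}: inactive {hung[m.group(1)]}s, total {m.group(2)}s")
        m = DQACQ.search(line)
        if m:
            dqacq[(m.group(2), m.group(1))] += 1
    for (tgt, rc), n in dqacq.most_common():
        print(f"{tgt}: {n} DQACQ failures, rc={rc}")

scan("oss-dmesg.log")   # hypothetical file name

In this capture pid 251684 is reported inactive at 200.08s and completed after 756.00s, consistent with the 600s estimate overrun ("600:156s") logged further down.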
[674699.353303] LustreError: Skipped 53 previous similar messages
[674702.834019] Lustre: oak-MDT0000-lwp-OST0044: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete
[674702.834721] Lustre: Skipped 4 previous similar messages
[674748.156966] Lustre: oak-OST0030: Connection restored to 373bd608-9a3d-167b-936a-2ef5978f6027 (at 10.210.47.137@o2ib3)
[674748.157447] Lustre: Skipped 150 previous similar messages
[674754.892642] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 12 seconds
[674754.893176] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Timed out RDMA with 10.0.2.52@o2ib5 (5): c: 0, oc: 0, rc: 8
[674754.894253] LustreError: 129606:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST0038 qtype:grp id:4718 enforced:1 granted:67108864 pending:0 waiting:0 req:1 usage:28578560 qunit:0 qtune:0 edquot:0
[674754.894255] LustreError: 129607:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST0038 qtype:grp id:3756 enforced:1 granted:16777216 pending:0 waiting:0 req:1 usage:3214616 qunit:0 qtune:0 edquot:0
[674754.894258] LustreError: 129607:0:(qsd_handler.c:340:qsd_req_completion()) Skipped 1 previous similar message
[674754.896913] LustreError: 129606:0:(qsd_handler.c:340:qsd_req_completion()) Skipped 4 previous similar messages
[674783.831295] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518807400/real 1518807400] req@ffff8809a517d400 x1592481880815984/t0(0) o38->oak-MDT0000-lwp-OST0052@10.0.2.51@o2ib5:12/10 lens 520/544 e 0 to 1 dl 1518807406 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[674783.832486] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 25 previous similar messages
[674817.889735] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 12 seconds
[674817.890286] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Timed out RDMA with 10.0.2.52@o2ib5 (18): c: 0, oc: 3, rc: 8
[674884.825628] Lustre: oak-MDT0000-lwp-OST004e: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete
[674897.735060] Lustre: 131976:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff883ff8335050 x1592236093877920/t0(0) o4->3ee95ef7-4278-ead7-52a3-bdca1c47a323@10.9.112.3@o2ib4:450/0 lens 608/448 e 22 to 0 dl 1518807525 ref 2 fl Interpret:/0/0 rc 0/0
[674905.615761] Lustre: oak-OST0030: Client a5fc285a-dc5f-ef03-bf1b-f67c5efe1956 (at 10.210.47.156@o2ib3) reconnecting
[674905.616329] Lustre: Skipped 1383 previous similar messages
[674931.884486] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 12 seconds
[674931.884974] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Skipped 1 previous similar message
[674931.885466] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Timed out RDMA with 10.0.2.52@o2ib5 (32): c: 0, oc: 2, rc: 8
[674931.885994] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Skipped 1 previous similar message
[674931.887036] LustreError: 129570:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST004c qtype:grp id:3768 enforced:1 granted:38719368 pending:0 waiting:0 req:1 usage:27104796 qunit:0 qtune:0 edquot:0
[674931.887038] LustreError: 129571:0:(qsd_handler.c:340:qsd_req_completion()) $$$ DQACQ failed with -5, flags:0x1 qsd:oak-OST004c qtype:grp id:4718 enforced:1 granted:67108864 pending:0 waiting:0 req:1 usage:32075280 qunit:0 qtune:0 edquot:0
[674931.887041] LustreError: 129571:0:(qsd_handler.c:340:qsd_req_completion()) Skipped 15 previous similar messages
[674989.160349] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.210.47.182@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server.
[674989.161083] LustreError: Skipped 723 previous similar messages
[675014.159515] Lustre: oak-OST0030: Connection restored to 30ba0698-6d7b-360e-6676-f412d032d800 (at 10.210.47.182@o2ib3)
[675014.165523] Lustre: Skipped 1603 previous similar messages
[675059.493723] Lustre: 251684:0:(service.c:2112:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:156s); client may timeout. req@ffff883ff8335050 x1592236093877920/t0(0) o4->3ee95ef7-4278-ead7-52a3-bdca1c47a323@10.9.112.3@o2ib4:450/0 lens 608/416 e 22 to 0 dl 1518807525 ref 1 fl Complete:/0/0 rc -115/-115
[675059.495028] LNet: Service thread pid 251684 completed after 756.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[675092.996376] LustreError: 282888:0:(ldlm_lockd.c:2365:ldlm_cancel_handler()) ldlm_cancel from 10.9.101.15@o2ib4 arrived at 1518807715 with bad export cookie 0
[675120.875766] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Timed out tx: tx_queue, 12 seconds
[675120.876250] LNetError: 127335:0:(o2iblnd_cb.c:3147:kiblnd_check_txs_locked()) Skipped 2 previous similar messages
[675120.876732] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Timed out RDMA with 10.0.2.52@o2ib5 (13): c: 0, oc: 2, rc: 8
[675120.877214] LNetError: 127335:0:(o2iblnd_cb.c:3222:kiblnd_check_conns()) Skipped 2 previous similar messages
[675133.875191] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 25 seconds
[675133.875674] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 15 previous similar messages
[675145.874640] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 37 seconds
[675145.875136] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 7 previous similar messages
[675158.874019] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 50 seconds
[675158.874506] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 7 previous similar messages
[675159.489017] LustreError: 131976:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk WRITE after 100+0s req@ffff8822639d5850 x1592236093877920/t0(0) o4->3ee95ef7-4278-ead7-52a3-bdca1c47a323@10.9.112.3@o2ib4:296/0 lens 608/448 e 0 to 0 dl 1518808126 ref 1 fl Interpret:/2/0 rc 0/0
[675159.490007] Lustre: oak-OST0046: Bulk IO write error with 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4), client will retry: rc = -110
[675171.873437] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 63 seconds
[675171.873907] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 7 previous similar messages
[675196.872259] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Timed out tx for 10.0.2.52@o2ib5: 88 seconds
[675196.872747] LNet: 127335:0:(o2iblnd_cb.c:3192:kiblnd_check_conns()) Skipped 15 previous similar messages
[675246.623304] Lustre: oak-OST0036: haven't heard from client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) in 198 seconds. I think it's dead, and I am evicting it. exp ffff8821806fd400, cur 1518807869 expire 1518807719 last 1518807671
[675246.624108] Lustre: Skipped 17 previous similar messages
[675251.638130] Lustre: oak-OST004e: haven't heard from client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) in 203 seconds. I think it's dead, and I am evicting it. exp ffff883767d54800, cur 1518807874 expire 1518807724 last 1518807671
[675251.638857] Lustre: Skipped 1 previous similar message
[675254.863268] Lustre: oak-OST003e: haven't heard from client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) in 206 seconds. I think it's dead, and I am evicting it. exp ffff883cab94e400, cur 1518807877 expire 1518807727 last 1518807671
[675254.864035] Lustre: Skipped 1 previous similar message
[675257.636613] Lustre: oak-OST0048: haven't heard from client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) in 209 seconds. I think it's dead, and I am evicting it. exp ffff88251da35400, cur 1518807880 expire 1518807730 last 1518807671
[675257.637345] Lustre: Skipped 10 previous similar messages
[675300.806428] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518807907/real 1518807907] req@ffff8809a517d400 x1592481880838480/t0(0) o38->oak-MDT0000-lwp-OST0052@10.0.2.51@o2ib5:12/10 lens 520/544 e 0 to 1 dl 1518807923 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[675300.807662] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 614 previous similar messages
[675542.218100] Lustre: oak-OST0030: Connection restored to oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5)
[675542.218615] Lustre: Skipped 899 previous similar messages
[675559.794607] LustreError: 167-0: oak-MDT0000-lwp-OST0052: This client was evicted by oak-MDT0000; in progress operations using this service will fail.
[675559.795100] LustreError: Skipped 8 previous similar messages
[675632.539360] LustreError: 11-0: oak-MDT0000-lwp-OST0050: operation quota_acquire to node 10.0.2.52@o2ib5 failed: rc = -11
[675634.791173] LustreError: 167-0: oak-MDT0000-lwp-OST004c: This client was evicted by oak-MDT0000; in progress operations using this service will fail.
[675634.791680] LustreError: Skipped 3 previous similar messages
[675644.186356] LustreError: 11-0: oak-MDT0000-lwp-OST0052: operation quota_acquire to node 10.0.2.52@o2ib5 failed: rc = -11
[675659.790110] LustreError: 167-0: oak-MDT0000-lwp-OST0044: This client was evicted by oak-MDT0000; in progress operations using this service will fail.
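Note: the "Timed out tx for 10.0.2.52@o2ib5" ages above grow by roughly 12-13 s per report (25, 37, 50, 63, 88 seconds), which matches a periodic o2iblnd connection check repeatedly finding the same unflushed queue; the bulk WRITE timeout and "client will retry: rc = -110" (ETIMEDOUT) follow from the same stall. A sketch that summarizes, per peer NID, when the timeouts started and how old the stuck tx got; the file name is hypothetical:

import re

TX = re.compile(r"\[\s*(\d+\.\d+)\].*Timed out tx for (\S+): (\d+) seconds")

def tx_timeouts(path):
    peers = {}                # nid -> (first uptime seen, worst tx age)
    for line in open(path):
        m = TX.search(line)
        if m:
            ts, nid, age = float(m.group(1)), m.group(2), int(m.group(3))
            first, worst = peers.get(nid, (ts, 0))
            peers[nid] = (min(first, ts), max(worst, age))
    for nid, (first, worst) in sorted(peers.items()):
        print(f"{nid}: first timeout at uptime {first:.0f}s, worst tx age {worst}s")

tx_timeouts("oss-dmesg.log")   # hypothetical file name

Again, the rate limiter ("Skipped N previous similar messages") hides intermediate reports, so the worst age printed is a lower bound on the stall duration.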
[675659.790614] LustreError: Skipped 2 previous similar messages
[675680.118185] Lustre: oak-OST0030: deleting orphan objects from 0x0:4385023 to 0x0:4385057
[675680.118189] Lustre: oak-OST0034: deleting orphan objects from 0x0:4207471 to 0x0:4207489
[675680.118191] Lustre: oak-OST0032: deleting orphan objects from 0x0:4337701 to 0x0:4337729
[675680.118193] Lustre: oak-OST0036: deleting orphan objects from 0x0:4169813 to 0x0:4169857
[675680.118196] Lustre: oak-OST003e: deleting orphan objects from 0x0:4205575 to 0x0:4205601
[675680.118206] Lustre: oak-OST0038: deleting orphan objects from 0x0:4302392 to 0x0:4302433
[675680.118225] Lustre: oak-OST0046: deleting orphan objects from 0x0:4373226 to 0x0:4373249
[675680.118230] Lustre: oak-OST0042: deleting orphan objects from 0x0:4248286 to 0x0:4248321
[675680.118231] Lustre: oak-OST004e: deleting orphan objects from 0x0:60638 to 0x0:60673
[675680.118240] Lustre: oak-OST003a: deleting orphan objects from 0x0:4323504 to 0x0:4323521
[675680.118256] Lustre: oak-OST0044: deleting orphan objects from 0x0:4319759 to 0x0:4319777
[675680.118258] Lustre: oak-OST004c: deleting orphan objects from 0x0:3361931 to 0x0:3361953
[675680.118308] Lustre: oak-OST0050: deleting orphan objects from 0x0:60392 to 0x0:60417
[675680.118340] Lustre: oak-OST0052: deleting orphan objects from 0x0:60467 to 0x0:60513
[675680.118349] Lustre: oak-OST0048: deleting orphan objects from 0x0:3347576 to 0x0:3347617
[675680.118370] Lustre: oak-OST004a: deleting orphan objects from 0x0:3343652 to 0x0:3343681
[675680.118383] Lustre: oak-OST003c: deleting orphan objects from 0x0:4348409 to 0x0:4348449
[675680.118387] Lustre: oak-OST0040: deleting orphan objects from 0x0:4356295 to 0x0:4356321
[676105.770191] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518808432/real 1518808432] req@ffff8809a517e600 x1592481881729008/t0(0) o38->oak-MDT0000-lwp-OST0038@10.0.2.51@o2ib5:12/10 lens 520/544 e 0 to 1 dl 1518808728 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[676105.771514] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 72 previous similar messages
[676109.771004] LustreError: 167-0: oak-MDT0000-lwp-OST0038: This client was evicted by oak-MDT0000; in progress operations using this service will fail.
[676134.769105] LustreError: 167-0: oak-MDT0000-lwp-OST0042: This client was evicted by oak-MDT0000; in progress operations using this service will fail.
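Note: after each lwp reconnect the OSTs destroy the objects the MDT had precreated but never assigned ("deleting orphan objects from 0x0:A to 0x0:B"); B - A is a rough count of the discarded precreates per target, and the small ids on oak-OST004e/0050/0052 reflect how recently those targets were initialized. A sketch that tabulates those windows per OST (hypothetical file name; the exact inclusivity of the id range is an assumption, so the count is approximate):

import re

ORPHAN = re.compile(
    r"(oak-OST[0-9a-f]+): deleting orphan objects from 0x0:(\d+) to 0x0:(\d+)")

for line in open("oss-dmesg.log"):   # hypothetical file name
    m = ORPHAN.search(line)
    if m:
        ost, lo, hi = m.group(1), int(m.group(2)), int(m.group(3))
        # hi - lo approximates the number of precreated objects discarded
        print(f"{ost}: ~{hi - lo} orphan objects ({lo}..{hi})")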
[676134.769960] LustreError: 129587:0:(client.c:1189:ptlrpc_import_delay_req()) @@@ invalidate in flight req@ffff881967af4e00 x1592481880787728/t0(0) o103->oak-MDT0000-lwp-OST0046@10.0.2.52@o2ib5:17/18 lens 328/224 e 0 to 1 dl 1518806978 ref 1 fl Rpc:eX/0/ffffffff rc 0/-1
[676716.741823] Lustre: 129611:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1518809332/real 1518809332] req@ffff8809a517ce00 x1592481884846656/t0(0) o400->MGC10.0.2.51@o2ib5@10.0.2.51@o2ib5:26/25 lens 224/224 e 0 to 1 dl 1518809339 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[676716.743020] Lustre: 129611:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 9 previous similar messages
[676716.743499] LustreError: 166-1: MGC10.0.2.51@o2ib5: Connection to MGS (at 10.0.2.51@o2ib5) was lost; in progress operations using this service will fail
[676791.740640] Lustre: Evicted from MGS (at 10.0.2.51@o2ib5) after server handle changed from 0x6d1f3ec7b386fb45 to 0x9550dba931534a5d
[676791.741399] Lustre: MGC10.0.2.51@o2ib5: Connection restored to 10.0.2.51@o2ib5 (at 10.0.2.51@o2ib5)
[676791.741866] Lustre: Skipped 35 previous similar messages
[696301.652898] Lustre: oak-OST004a: haven't heard from client bd128bec-2591-601e-854d-48358ee9ae1c (at 10.12.4.40@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d0e5d4800, cur 1518828925 expire 1518828775 last 1518828698
[696301.653866] Lustre: Skipped 2 previous similar messages
[696331.883921] Lustre: oak-OST0030: Connection restored to bd128bec-2591-601e-854d-48358ee9ae1c (at 10.12.4.40@o2ib)
[696331.884448] Lustre: Skipped 16 previous similar messages
[697282.273207] Lustre: oak-OST0032: Connection restored to (at 10.8.15.2@o2ib6)
[697282.273701] Lustre: Skipped 16 previous similar messages
[701474.424798] Lustre: oak-OST0050: haven't heard from client a2b63e24-aa37-2f9b-57ad-fc4098adf5e1 (at 10.12.4.68@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881fde039000, cur 1518834098 expire 1518833948 last 1518833871
[701474.425847] Lustre: Skipped 17 previous similar messages
[714025.324423] Lustre: oak-OST0034: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[714025.324930] Lustre: Skipped 14 previous similar messages
[770747.192943] Lustre: oak-OST0034: haven't heard from client 6de9bf49-4782-c4fe-9c42-fc8869f33746 (at 10.210.47.47@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883df6024c00, cur 1518903374 expire 1518903224 last 1518903147
[770747.193900] Lustre: Skipped 17 previous similar messages
[771860.953272] Lustre: oak-OST003c: Connection restored to 6de9bf49-4782-c4fe-9c42-fc8869f33746 (at 10.210.47.47@o2ib3)
[771860.953273] Lustre: oak-OST003a: Connection restored to 6de9bf49-4782-c4fe-9c42-fc8869f33746 (at 10.210.47.47@o2ib3)
[771860.953275] Lustre: oak-OST0038: Connection restored to 6de9bf49-4782-c4fe-9c42-fc8869f33746 (at 10.210.47.47@o2ib3)
[771860.953276] Lustre: oak-OST0034: Connection restored to 6de9bf49-4782-c4fe-9c42-fc8869f33746 (at 10.210.47.47@o2ib3)
[771860.953277] Lustre: Skipped 2 previous similar messages
[771860.953278] Lustre: Skipped 2 previous similar messages
[771860.953281] Lustre: Skipped 2 previous similar messages
[771860.961630] Lustre: Skipped 12 previous similar messages
[805861.557694] Lustre: oak-OST0030: haven't heard from client 095eb851-7ed4-1bf8-bee1-5607a6b7b3c8 (at 10.210.47.175@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881ff1b08400, cur 1518938490 expire 1518938340 last 1518938263
[805861.558656] Lustre: Skipped 17 previous similar messages
[811772.573534] md: data-check of RAID array md34
[811778.630646] md: data-check of RAID array md32
[811784.735272] md: data-check of RAID array md20
[811790.842649] md: data-check of RAID array md26
[811796.947585] md: data-check of RAID array md24
[811803.046439] md: data-check of RAID array md16
[811809.158620] md: data-check of RAID array md12
[811815.259961] md: data-check of RAID array md6
[811821.364926] md: data-check of RAID array md28
[811827.476748] md: data-check of RAID array md22
[811833.604010] md: data-check of RAID array md14
[811839.712055] md: data-check of RAID array md10
[811845.822672] md: data-check of RAID array md4
[835850.184500] Lustre: oak-OST0048: haven't heard from client 9084afcb-f8c2-953f-97a7-f8aaed867fe8 (at 10.12.4.27@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cac4acc00, cur 1518968480 expire 1518968330 last 1518968253
[835850.186437] Lustre: Skipped 17 previous similar messages
[835850.692315] Lustre: oak-OST0038: haven't heard from client 42e30373-7c28-3634-1c18-e4b21d9a682d (at 10.12.4.74@o2ib) in 209 seconds. I think it's dead, and I am evicting it. exp ffff883db57e2c00, cur 1518968480 expire 1518968330 last 1518968271
[835850.693261] Lustre: Skipped 34 previous similar messages
[835878.781150] Lustre: oak-OST003c: Connection restored to 9084afcb-f8c2-953f-97a7-f8aaed867fe8 (at 10.12.4.27@o2ib)
[835878.781151] Lustre: oak-OST003e: Connection restored to 9084afcb-f8c2-953f-97a7-f8aaed867fe8 (at 10.12.4.27@o2ib)
[835878.781153] Lustre: oak-OST003a: Connection restored to 9084afcb-f8c2-953f-97a7-f8aaed867fe8 (at 10.12.4.27@o2ib)
[835878.781154] Lustre: oak-OST0048: Connection restored to 9084afcb-f8c2-953f-97a7-f8aaed867fe8 (at 10.12.4.27@o2ib)
[835878.781155] Lustre: Skipped 8 previous similar messages
[835878.781156] Lustre: Skipped 8 previous similar messages
[835878.781159] Lustre: Skipped 8 previous similar messages
[835878.783831] Lustre: Skipped 4 previous similar messages
[835902.659726] Lustre: oak-OST0036: Connection restored to 42e30373-7c28-3634-1c18-e4b21d9a682d (at 10.12.4.74@o2ib)
[835902.659727] Lustre: oak-OST0030: Connection restored to 42e30373-7c28-3634-1c18-e4b21d9a682d (at 10.12.4.74@o2ib)
[835902.659729] Lustre: oak-OST0038: Connection restored to 42e30373-7c28-3634-1c18-e4b21d9a682d (at 10.12.4.74@o2ib)
[835902.659732] Lustre: Skipped 1 previous similar message
[835902.661403] Lustre: Skipped 9 previous similar messages
[849234.543831] Lustre: oak-OST004a: haven't heard from client 7ce7d082-7836-9230-5102-e06679037f88 (at 10.210.47.47@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff882d4ab14800, cur 1518981865 expire 1518981715 last 1518981638
[850266.636374] Lustre: oak-OST0034: Connection restored to 6de9bf49-4782-c4fe-9c42-fc8869f33746 (at 10.210.47.47@o2ib3)
[850266.636375] Lustre: oak-OST0032: Connection restored to 6de9bf49-4782-c4fe-9c42-fc8869f33746 (at 10.210.47.47@o2ib3)
[850266.636377] Lustre: oak-OST0030: Connection restored to 6de9bf49-4782-c4fe-9c42-fc8869f33746 (at 10.210.47.47@o2ib3)
[850266.636378] Lustre: Skipped 5 previous similar messages
[850266.636382] Lustre: Skipped 5 previous similar messages
[850266.638566] Lustre: Skipped 15 previous similar messages
[859753.134780] Lustre: oak-OST004c: haven't heard from client ee2e29b8-5196-d987-f7ee-b304a50c951f (at 10.12.4.74@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881ebe6fb400, cur 1518992384 expire 1518992234 last 1518992157
[859753.135730] Lustre: Skipped 17 previous similar messages
[859828.066767] Lustre: oak-OST0032: haven't heard from client ee2e29b8-5196-d987-f7ee-b304a50c951f (at 10.12.4.74@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d0663b000, cur 1518992459 expire 1518992309 last 1518992232
[859828.067791] Lustre: Skipped 10 previous similar messages
[861004.380844] Lustre: oak-OST0032: Connection restored to 42e30373-7c28-3634-1c18-e4b21d9a682d (at 10.12.4.74@o2ib)
[861004.380845] Lustre: oak-OST0038: Connection restored to 42e30373-7c28-3634-1c18-e4b21d9a682d (at 10.12.4.74@o2ib)
[861004.380848] Lustre: Skipped 4 previous similar messages
[861004.382288] Lustre: Skipped 12 previous similar messages
[866537.781286] Lustre: oak-OST0042: haven't heard from client 4e053b86-c200-827a-b73a-acc3fd95a691 (at 10.9.105.52@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881f9caf2400, cur 1518999169 expire 1518999019 last 1518998942
[866537.782280] Lustre: Skipped 6 previous similar messages
[903150.279946] ses 1:0:0:0: attempting task abort! scmd(ffff88142c014980)
[903150.280197] ses 1:0:0:0: [sg0] CDB: Receive Diagnostic 1c 01 02 ff ff 00
[903150.280440] scsi target1:0:0: handle(0x0011), sas_address(0x50012be0000845bd), phy(48)
[903150.280903] scsi target1:0:0: enclosure_logical_id(0x50012be0000845bf), slot(48)
[903150.281362] scsi target1:0:0: enclosure level(0x0000),connector name( )
[903150.286130] ses 1:0:0:0: task abort: SUCCESS scmd(ffff88142c014980)
[918028.565719] ses 12:0:366:0: attempting task abort! scmd(ffff880c58d91c00)
[918028.565979] ses 12:0:366:0: [sg734] CDB: Receive Diagnostic 1c 01 01 ff ff 00
[918028.566458] scsi target12:0:366: handle(0x017f), sas_address(0x5001636001c4867d), phy(76)
[918028.566934] scsi target12:0:366: enclosure_logical_id(0x5001636001c4867d), slot(60)
[918028.567410] scsi target12:0:366: enclosure level(0x0001),connector name( )
[918028.572789] ses 12:0:366:0: task abort: FAILED scmd(ffff880c58d91c00)
[918028.656783] ses 12:0:366:0: attempting device reset! scmd(ffff880c58d91c00)
[918028.657059] ses 12:0:366:0: [sg734] CDB: Receive Diagnostic 1c 01 01 ff ff 00
[918028.657531] scsi target12:0:366: handle(0x017f), sas_address(0x5001636001c4867d), phy(76)
[918028.658031] scsi target12:0:366: enclosure_logical_id(0x5001636001c4867d), slot(60)
[918028.658603] scsi target12:0:366: enclosure level(0x0001),connector name( )
[918028.662548] ses 12:0:366:0: device reset: SUCCESS scmd(ffff880c58d91c00)
[927048.981565] Lustre: oak-OST0050: haven't heard from client c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it.
exp ffff881cfcba5400, cur 1519059683 expire 1519059533 last 1519059456 [927048.982512] Lustre: Skipped 17 previous similar messages [944630.316757] Lustre: oak-OST003a: Client ccf0e612-aa4e-de11-b4e8-6ea3fc40393f (at 10.9.104.7@o2ib4) reconnecting [944630.317245] Lustre: Skipped 1134 previous similar messages [944630.317510] Lustre: oak-OST003a: Connection restored to ccf0e612-aa4e-de11-b4e8-6ea3fc40393f (at 10.9.104.7@o2ib4) [944631.180780] Lustre: oak-OST003a: Connection restored to 613b765b-5cd9-fccb-0749-c6f75ef8aa6a (at 10.9.0.61@o2ib4) [944631.181310] Lustre: Skipped 9 previous similar messages [944632.634768] Lustre: oak-OST0030: Connection restored to 1dd7fd52-ec09-323f-8dfc-27a90225a1fa (at 10.9.102.2@o2ib4) [944632.635247] Lustre: Skipped 11 previous similar messages [944635.181029] Lustre: oak-OST0030: Connection restored to 96deedc1-3ba1-f9a9-2104-768a87ec222b (at 10.9.112.6@o2ib4) [944635.181030] Lustre: oak-OST0034: Connection restored to 96deedc1-3ba1-f9a9-2104-768a87ec222b (at 10.9.112.6@o2ib4) [944635.181033] Lustre: Skipped 29 previous similar messages [944635.182214] Lustre: Skipped 5 previous similar messages [944639.260374] Lustre: oak-OST0038: Connection restored to f4df4959-008f-7a4b-de36-48dff2be7868 (at 10.9.101.51@o2ib4) [944639.260860] Lustre: Skipped 51 previous similar messages [944647.268495] Lustre: oak-OST0042: Connection restored to a3c45a9e-718a-de48-b8ca-187776c4105d (at 10.9.105.2@o2ib4) [944647.268496] Lustre: oak-OST0032: Connection restored to a3c45a9e-718a-de48-b8ca-187776c4105d (at 10.9.105.2@o2ib4) [944647.268498] Lustre: Skipped 62 previous similar messages [944680.354245] Lustre: oak-OST0034: Connection restored to 6591cb28-c82a-cd3d-7630-0ad8978b04bd (at 10.8.2.24@o2ib6) [944680.354246] Lustre: oak-OST0038: Connection restored to 6591cb28-c82a-cd3d-7630-0ad8978b04bd (at 10.8.2.24@o2ib6) [944680.354249] Lustre: Skipped 66 previous similar messages [944680.355435] Lustre: Skipped 7 previous similar messages [944680.392105] LustreError: 139225:0:(ldlm_lib.c:3186:target_bulk_io()) @@@ bulk WRITE failed: rc -107 req@ffff881d2cbcf850 x1591896506931952/t0(0) o4->590cd3e3-fa19-0835-a53e-1ffc2c2d0012@10.8.28.9@o2ib6:751/0 lens 608/448 e 0 to 0 dl 1519077361 ref 1 fl Interpret:/0/0 rc 0/0 [944680.393223] Lustre: oak-OST0046: Bulk IO write error with 590cd3e3-fa19-0835-a53e-1ffc2c2d0012 (at 10.8.28.9@o2ib6), client will retry: rc = -107 [944681.360336] LustreError: 39060:0:(ldlm_lib.c:3236:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff881fde26e050 x1592341184344144/t0(0) o4->d5dc4bc6-1544-c3cc-b202-164df739673f@10.8.2.17@o2ib6:713/0 lens 608/448 e 0 to 0 dl 1519077323 ref 1 fl Interpret:/0/0 rc 0/0 [944681.361536] Lustre: oak-OST003a: Bulk IO write error with d5dc4bc6-1544-c3cc-b202-164df739673f (at 10.8.2.17@o2ib6), client will retry: rc = -110 [944681.362009] Lustre: Skipped 5 previous similar messages [944681.371353] Lustre: oak-OST0042: Bulk IO read error with 29d46a6b-b8cb-d637-1b43-585d96eb3d03 (at 10.8.7.21@o2ib6), client will retry: rc -110 [944683.712887] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.8.2.21@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[944683.713619] LustreError: Skipped 537 previous similar messages [944708.711668] Lustre: oak-OST0030: Client bc830bbd-1004-1519-8c21-4cf177b512ce (at 10.8.2.21@o2ib6) reconnecting [944708.711669] Lustre: oak-OST0034: Client bc830bbd-1004-1519-8c21-4cf177b512ce (at 10.8.2.21@o2ib6) reconnecting [944708.711670] Lustre: oak-OST0036: Client bc830bbd-1004-1519-8c21-4cf177b512ce (at 10.8.2.21@o2ib6) reconnecting [944708.711672] Lustre: oak-OST0032: Client bc830bbd-1004-1519-8c21-4cf177b512ce (at 10.8.2.21@o2ib6) reconnecting [944708.711673] Lustre: Skipped 1504 previous similar messages [944708.711673] Lustre: Skipped 1505 previous similar messages [944708.711676] Lustre: Skipped 1505 previous similar messages [944732.812987] LustreError: 139228:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk WRITE after 100+0s req@ffff881d01082c50 x1591078822471456/t0(0) o4->f4df4959-008f-7a4b-de36-48dff2be7868@10.9.101.51@o2ib4:10/0 lens 608/448 e 4 to 0 dl 1519077375 ref 1 fl Interpret:/0/0 rc 0/0 [944732.814247] Lustre: oak-OST004c: Bulk IO write error with f4df4959-008f-7a4b-de36-48dff2be7868 (at 10.9.101.51@o2ib4), client will retry: rc = -110 [944732.814751] Lustre: Skipped 7 previous similar messages [944734.199911] LustreError: 183386:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk WRITE after 100+0s req@ffff882fa2a1dc50 x1591069810265184/t0(0) o4->1299f072-9ba4-de3a-7504-4b85f5fe96b9@10.9.101.71@o2ib4:25/0 lens 608/448 e 3 to 0 dl 1519077390 ref 1 fl Interpret:/0/0 rc 0/0 [944734.201131] LustreError: 183386:0:(ldlm_lib.c:3226:target_bulk_io()) Skipped 1 previous similar message [944745.377422] LustreError: 62500:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk READ after 100+0s req@ffff881cfcdbbc50 x1591078822477808/t0(0) o3->f4df4959-008f-7a4b-de36-48dff2be7868@10.9.101.51@o2ib4:23/0 lens 608/432 e 2 to 0 dl 1519077388 ref 1 fl Interpret:/0/0 rc 0/0 [944745.378850] Lustre: oak-OST0044: Bulk IO read error with f4df4959-008f-7a4b-de36-48dff2be7868 (at 10.9.101.51@o2ib4), client will retry: rc -110 [944747.727391] LustreError: 137-5: oak-OST003d_UUID: not available for connect from 10.210.45.82@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[944747.728122] LustreError: Skipped 3319 previous similar messages [944748.914260] LustreError: 202278:0:(ldlm_lib.c:3236:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff882fa2a1cc50 x1591049204145808/t0(0) o4->aed664d8-0edf-fcab-6799-b747f869f571@10.9.105.55@o2ib4:41/0 lens 608/448 e 1 to 0 dl 1519077406 ref 1 fl Interpret:/0/0 rc 0/0 [944748.915257] LustreError: 202278:0:(ldlm_lib.c:3236:target_bulk_io()) Skipped 8 previous similar messages [944748.915945] Lustre: oak-OST0038: Bulk IO write error with aed664d8-0edf-fcab-6799-b747f869f571 (at 10.9.105.55@o2ib4), client will retry: rc = -110 [944748.916439] Lustre: Skipped 2 previous similar messages [944757.347923] LustreError: 251689:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk WRITE after 100+0s req@ffff883cad89c450 x1591054182834992/t0(0) o4->8259e760-c316-c8e9-1e5e-cdfe87115902@10.9.104.18@o2ib4:48/0 lens 608/448 e 4 to 0 dl 1519077413 ref 1 fl Interpret:/0/0 rc 0/0 [944757.348932] Lustre: oak-OST0032: Bulk IO write error with 8259e760-c316-c8e9-1e5e-cdfe87115902 (at 10.9.104.18@o2ib4), client will retry: rc = -110 [944757.349412] Lustre: Skipped 1 previous similar message [944775.926966] LustreError: 188984:0:(ldlm_lib.c:3226:target_bulk_io()) @@@ timeout on bulk WRITE after 100+0s req@ffff883e93ee8050 x1591049204145872/t0(0) o4->aed664d8-0edf-fcab-6799-b747f869f571@10.9.105.55@o2ib4:66/0 lens 608/448 e 2 to 0 dl 1519077431 ref 1 fl Interpret:/0/0 rc 0/0 [944775.928312] LustreError: 188984:0:(ldlm_lib.c:3226:target_bulk_io()) Skipped 1 previous similar message [944775.928837] Lustre: oak-OST004a: Bulk IO write error with aed664d8-0edf-fcab-6799-b747f869f571 (at 10.9.105.55@o2ib4), client will retry: rc = -110 [944775.929333] Lustre: Skipped 1 previous similar message [944778.584838] LustreError: 183396:0:(ldlm_lib.c:3236:target_bulk_io()) @@@ Reconnect on bulk READ req@ffff883c89242050 x1591054192620288/t0(0) o3->cf01bf68-fafa-91ca-e7bb-217704026bed@10.9.101.2@o2ib4:93/0 lens 608/432 e 0 to 0 dl 1519077458 ref 1 fl Interpret:/0/0 rc 0/0 [944778.585817] LustreError: 183396:0:(ldlm_lib.c:3236:target_bulk_io()) Skipped 1 previous similar message [944778.586334] Lustre: oak-OST0042: Bulk IO read error with cf01bf68-fafa-91ca-e7bb-217704026bed (at 10.9.101.2@o2ib4), client will retry: rc -110 [944779.138362] LustreError: 62500:0:(ldlm_lib.c:3186:target_bulk_io()) @@@ bulk WRITE failed: rc -107 req@ffff881fdec28450 x1591049094263952/t0(0) o4->7461b1df-58fb-1087-c619-45f28694768c@10.9.104.12@o2ib4:57/0 lens 608/448 e 0 to 0 dl 1519077422 ref 1 fl Interpret:/0/0 rc 0/0 [944779.139315] LustreError: 62500:0:(ldlm_lib.c:3186:target_bulk_io()) Skipped 5 previous similar messages [944784.858602] LustreError: 140976:0:(ldlm_lib.c:3236:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff88354a680c50 x1591075262820384/t0(0) o4->bf802ab1-bb93-bebe-908c-edb1f2806c01@10.9.101.46@o2ib4:168/0 lens 608/448 e 1 to 0 dl 1519077533 ref 1 fl Interpret:/0/0 rc 0/0 [944784.859642] LustreError: 140976:0:(ldlm_lib.c:3236:target_bulk_io()) Skipped 22 previous similar messages [944807.188644] Lustre: oak-OST004e: haven't heard from client c96ccee3-c63b-064f-fa5f-cec674f180da (at 10.9.101.13@o2ib4) in 206 seconds. I think it's dead, and I am evicting it. exp ffff881d01b08800, cur 1519077442 expire 1519077292 last 1519077236 [944807.190086] Lustre: Skipped 17 previous similar messages [944808.143437] Lustre: oak-OST0036: haven't heard from client 2215d3b6-89f8-0fa2-d478-3945a32b6906 (at 10.9.114.3@o2ib4) in 227 seconds. 
I think it's dead, and I am evicting it. exp ffff881e86a30400, cur 1519077443 expire 1519077293 last 1519077216 [944808.144425] Lustre: Skipped 89 previous similar messages [944808.402979] Lustre: oak-OST003a: Connection restored to b45a4bd2-c038-70b9-8a71-2dceb66f6e91 (at 10.210.46.83@o2ib3) [944808.403554] Lustre: Skipped 35023 previous similar messages [944812.162490] Lustre: oak-OST0048: haven't heard from client 4b7fcbbf-d943-3ea5-de88-323043a7420e (at 10.9.104.48@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ca7300800, cur 1519077447 expire 1519077297 last 1519077220 [944812.163441] Lustre: Skipped 60 previous similar messages [944817.157797] Lustre: oak-OST004c: haven't heard from client 36e9920d-5349-0ea9-4171-37991a501917 (at 10.9.102.69@o2ib4) in 214 seconds. I think it's dead, and I am evicting it. exp ffff883f2a622c00, cur 1519077452 expire 1519077302 last 1519077238 [944817.158826] Lustre: Skipped 134 previous similar messages [948255.096291] LustreError: 132-0: oak-OST0046: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x2000099c3:0x15950:0x0] object 0x0:3233680 extent [67108864-71303167], client returned csum d6b7bcb6 (type 4), server csum 2a7cbcb0 (type 4) [948255.097307] LustreError: Skipped 9 previous similar messages [948275.713467] LustreError: 132-0: oak-OST0046: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x2000099c3:0x15950:0x0] object 0x0:3233680 extent [67108864-71303167], client returned csum d6b7bcb6 (type 4), server csum 2a7cbcb0 (type 4) [948275.714447] LustreError: Skipped 4 previous similar messages [948309.719381] LustreError: 132-0: oak-OST0046: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x2000099c3:0x15950:0x0] object 0x0:3233680 extent [67108864-71303167], client returned csum d6b7bcb6 (type 4), server csum 2a7cbcb0 (type 4) [948309.720348] LustreError: Skipped 3 previous similar messages [948375.709444] LustreError: 132-0: oak-OST0046: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x2000099c3:0x15950:0x0] object 0x0:3233680 extent [67108864-71303167], client returned csum d6b7bcb6 (type 4), server csum 2a7cbcb0 (type 4) [948375.710716] LustreError: Skipped 13 previous similar messages [952876.774189] Lustre: oak-OST003e: haven't heard from client 52532599-8f6b-616f-14c5-a9132d0bb547 (at 10.9.113.2@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff883ff355ec00, cur 1519085512 expire 1519085362 last 1519085285 [952876.775197] Lustre: Skipped 24 previous similar messages [1005201.912866] LustreError: 11-0: oak-MDT0000-lwp-OST003c: operation obd_ping to node 10.0.2.52@o2ib5 failed: rc = -107 [1005201.912876] Lustre: oak-MDT0000-lwp-OST004a: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete [1005201.912878] Lustre: Skipped 1 previous similar message [1005201.914313] LustreError: Skipped 16 previous similar messages [1005232.911195] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519137864/real 1519137864] req@ffff882b65977200 x1592481943673648/t0(0) o38->oak-MDT0000-lwp-OST004c@10.0.2.51@o2ib5:12/10 lens 520/544 e 0 to 1 dl 1519137870 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 [1005232.912455] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [1005251.912402] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005251.912686] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005251.913180] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005251.913719] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005251.914006] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1 [1005251.914636] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff [1005251.915125] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae [1005251.915639] Call Trace: [1005251.915896] [] dump_stack+0x19/0x1b [1005251.916179] [] warn_alloc_failed+0x110/0x180 [1005251.916438] [] __alloc_pages_slowpath+0x6b6/0x724 [1005251.916699] [] __alloc_pages_nodemask+0x405/0x420 [1005251.916975] [] dma_generic_alloc_coherent+0x8f/0x140 [1005251.917223] [] x86_swiotlb_alloc_coherent+0x21/0x50 [1005251.917498] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005251.918018] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005251.918284] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005251.918533] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005251.919053] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005251.919324] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005251.919586] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005251.919871] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005251.920366] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005251.920885] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005251.921130] [] process_one_work+0x17a/0x440 [1005251.921387] [] worker_thread+0x126/0x3c0 [1005251.921646] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005251.921919] [] kthread+0xcf/0xe0 [1005251.922159] [] ? insert_kthread_work+0x40/0x40 [1005251.922423] [] ret_from_fork+0x58/0x90 [1005251.922679] [] ? 
insert_kthread_work+0x40/0x40 [1005251.922947] Mem-Info: [1005251.923222] active_anon:952115 inactive_anon:304827 isolated_anon:0 active_file:2754971 inactive_file:40115373 isolated_file:0 unevictable:25295 dirty:132 writeback:0 unstable:0 slab_reclaimable:1226192 slab_unreclaimable:942300 mapped:6302 shmem:98898 pagetables:4551 bounce:0 free:12156610 free_pcp:31 free_cma:0 [1005251.930431] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005251.932003] lowmem_reserve[]: 0 1554 128505 128505 [1005251.932281] Node 0 DMA32 free:521192kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138200kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:410 all_unreclaimable? no [1005251.934071] lowmem_reserve[]: 0 0 126950 126950 [1005251.934350] Node 0 Normal free:22429392kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341640kB inactive_anon:697156kB active_file:6188076kB inactive_file:80463492kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:188kB writeback:0kB mapped:19468kB shmem:269668kB slab_reclaimable:1841768kB slab_unreclaimable:1713720kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:32kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005251.936425] lowmem_reserve[]: 0 0 0 0 [1005251.936717] Node 1 Normal free:25661264kB min:1050512kB low:1313140kB high:1575768kB active_anon:424288kB inactive_anon:471392kB active_file:4830976kB inactive_file:79997392kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:340kB writeback:0kB mapped:5732kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1917208kB kernel_stack:9088kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:872kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005251.938738] lowmem_reserve[]: 0 0 0 0 [1005251.939041] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005251.939705] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 422*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521304kB [1005251.940597] Node 0 Normal: 524998*4kB (UEM) 1385777*8kB (UEM) 575927*16kB (UEM) 1078*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22435536kB [1005251.941418] Node 1 Normal: 506788*4kB (UEM) 1927641*8kB (UEM) 502444*16kB (UEM) 5535*32kB (UEM) 3*64kB (U) 0*128kB 0*256kB 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25665720kB [1005251.942254] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005251.942765] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005251.943353] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005251.943863] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005251.944352] 42973045 total pagecache pages [1005251.944608] 2015 pages in swap cache [1005251.944875] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005251.945117] Free swap = 3189476kB [1005251.945370] Total swap = 4194300kB [1005251.945625] 67052113 pages RAM [1005251.945889] 0 pages HighMem/MovableOnly [1005251.946126] 1126685 pages reserved [1005251.946469] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005251.946730] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005251.947232] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005251.947757] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005251.948032] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1 [1005251.948580] ffff8806f51c7888 ffffffff81188810 ffffffff8118b530 0000000000000000 [1005251.949096] ffffffffffffff00 000080d000000000 ffff8806f51c7858 000000003ea4c0ae [1005251.949598] Call Trace: [1005251.949840] [] dump_stack+0x19/0x1b [1005251.950114] [] warn_alloc_failed+0x110/0x180 [1005251.950359] [] ? drain_pages+0xb0/0xb0 [1005251.950618] [] __alloc_pages_slowpath+0x6b6/0x724 [1005251.950890] [] __alloc_pages_nodemask+0x405/0x420 [1005251.951136] [] alloc_pages_current+0x98/0x110 [1005251.951396] [] __get_free_pages+0xe/0x40 [1005251.951660] [] swiotlb_alloc_coherent+0x5e/0x150 [1005251.951938] [] x86_swiotlb_alloc_coherent+0x41/0x50 [1005251.952187] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005251.952708] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005251.952986] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005251.953282] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005251.953804] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005251.954051] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005251.954317] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005251.954581] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005251.955086] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005251.955613] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005251.955893] [] process_one_work+0x17a/0x440 [1005251.956165] [] worker_thread+0x126/0x3c0 [1005251.956425] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005251.956686] [] kthread+0xcf/0xe0 [1005251.956946] [] ? insert_kthread_work+0x40/0x40 [1005251.957295] [] ret_from_fork+0x58/0x90 [1005251.957532] [] ? 
insert_kthread_work+0x40/0x40 [1005251.957835] Mem-Info: [1005251.958069] active_anon:952241 inactive_anon:304827 isolated_anon:0 active_file:2754971 inactive_file:40115429 isolated_file:0 unevictable:25295 dirty:132 writeback:0 unstable:0 slab_reclaimable:1226192 slab_unreclaimable:942282 mapped:6302 shmem:98898 pagetables:4551 bounce:0 free:12156614 free_pcp:61 free_cma:0 [1005251.959541] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005251.960990] lowmem_reserve[]: 0 1554 128505 128505 [1005251.961250] Node 0 DMA32 free:521192kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138200kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:820 all_unreclaimable? no [1005251.962934] lowmem_reserve[]: 0 0 126950 126950 [1005251.963194] Node 0 Normal free:22429392kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341640kB inactive_anon:697156kB active_file:6188076kB inactive_file:80463492kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:188kB writeback:0kB mapped:19468kB shmem:269668kB slab_reclaimable:1841768kB slab_unreclaimable:1713720kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005251.965092] lowmem_reserve[]: 0 0 0 0 [1005251.965350] Node 1 Normal free:25661160kB min:1050512kB low:1313140kB high:1575768kB active_anon:424288kB inactive_anon:471392kB active_file:4830976kB inactive_file:79997392kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:340kB writeback:0kB mapped:5732kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1917208kB kernel_stack:9088kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:604kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005251.967248] lowmem_reserve[]: 0 0 0 0 [1005251.967505] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005251.968081] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 422*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521304kB [1005251.968900] Node 0 Normal: 525006*4kB (UEM) 1385777*8kB (UEM) 575927*16kB (UEM) 1078*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22435568kB [1005251.969679] Node 1 Normal: 506789*4kB (UEM) 1927641*8kB (UEM) 502439*16kB (UEM) 5535*32kB (UEM) 3*64kB (U) 0*128kB 0*256kB 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25665644kB [1005251.970471] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005251.970942] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005251.971501] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005251.971964] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005251.972429] 42973045 total pagecache pages [1005251.972667] 2015 pages in swap cache [1005251.972908] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005251.973150] Free swap = 3189476kB [1005251.973383] Total swap = 4194300kB [1005251.973619] 67052113 pages RAM [1005251.973852] 0 pages HighMem/MovableOnly [1005251.974089] 1126685 pages reserved [1005251.976703] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005251.976950] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005251.977436] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005251.977912] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005251.978157] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1 [1005251.978659] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff [1005251.984518] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae [1005251.985008] Call Trace: [1005251.985247] [] dump_stack+0x19/0x1b [1005251.985585] [] warn_alloc_failed+0x110/0x180 [1005251.985825] [] __alloc_pages_slowpath+0x6b6/0x724 [1005251.986066] [] __alloc_pages_nodemask+0x405/0x420 [1005251.986313] [] dma_generic_alloc_coherent+0x8f/0x140 [1005251.986559] [] x86_swiotlb_alloc_coherent+0x21/0x50 [1005251.986833] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005251.987323] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005251.987582] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005251.987835] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005251.988312] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005251.988568] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005251.988813] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005251.989094] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005251.989580] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005251.990057] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005251.990309] [] process_one_work+0x17a/0x440 [1005251.990564] [] worker_thread+0x126/0x3c0 [1005251.990826] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005251.991073] [] kthread+0xcf/0xe0 [1005251.991332] [] ? insert_kthread_work+0x40/0x40 [1005251.991576] [] ret_from_fork+0x58/0x90 [1005251.991816] [] ? 
insert_kthread_work+0x40/0x40 [1005251.992107] Mem-Info: [1005251.992391] active_anon:952241 inactive_anon:304827 isolated_anon:0 active_file:2754971 inactive_file:40115429 isolated_file:0 unevictable:25295 dirty:132 writeback:0 unstable:0 slab_reclaimable:1226192 slab_unreclaimable:942282 mapped:6302 shmem:98898 pagetables:4551 bounce:0 free:12156586 free_pcp:206 free_cma:0 [1005251.993937] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005251.995531] lowmem_reserve[]: 0 1554 128505 128505 [1005251.995809] Node 0 DMA32 free:521192kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138200kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1230 all_unreclaimable? no [1005251.997656] lowmem_reserve[]: 0 0 126950 126950 [1005251.997953] Node 0 Normal free:22429392kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341640kB inactive_anon:697156kB active_file:6188076kB inactive_file:80463492kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:188kB writeback:0kB mapped:19468kB shmem:269668kB slab_reclaimable:1841768kB slab_unreclaimable:1713720kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:124kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005252.000007] lowmem_reserve[]: 0 0 0 0 [1005252.000299] Node 1 Normal free:25661072kB min:1050512kB low:1313140kB high:1575768kB active_anon:424288kB inactive_anon:471392kB active_file:4830976kB inactive_file:79997392kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:340kB writeback:0kB mapped:5732kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1917208kB kernel_stack:9088kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:1320kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005252.002415] lowmem_reserve[]: 0 0 0 0 [1005252.002717] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005252.003303] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 422*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521304kB [1005252.004195] Node 0 Normal: 524983*4kB (UEM) 1385773*8kB (UEM) 575927*16kB (UEM) 1078*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22435444kB [1005252.005025] Node 1 Normal: 506632*4kB (UEM) 1927641*8kB (UEM) 502438*16kB (UEM) 5535*32kB (UEM) 3*64kB (U) 0*128kB 0*256kB 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25665000kB [1005252.005902] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.006395] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.006951] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.007441] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.007962] 42973045 total pagecache pages [1005252.008215] 2015 pages in swap cache [1005252.008468] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005252.008712] Free swap = 3189476kB [1005252.008994] Total swap = 4194300kB [1005252.009244] 67052113 pages RAM [1005252.009498] 0 pages HighMem/MovableOnly [1005252.009734] 1126685 pages reserved [1005252.010134] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005252.010383] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005252.010892] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005252.011384] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005252.011629] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1 [1005252.012185] ffff8806f51c7888 ffffffff81188810 ffffffff8118b530 0000000000000000 [1005252.012695] ffffffffffffff00 000080d000000000 ffff8806f51c7858 000000003ea4c0ae [1005252.013283] Call Trace: [1005252.013525] [] dump_stack+0x19/0x1b [1005252.013767] [] warn_alloc_failed+0x110/0x180 [1005252.014139] [] ? drain_pages+0xb0/0xb0 [1005252.014391] [] __alloc_pages_slowpath+0x6b6/0x724 [1005252.014632] [] __alloc_pages_nodemask+0x405/0x420 [1005252.014923] [] alloc_pages_current+0x98/0x110 [1005252.015164] [] __get_free_pages+0xe/0x40 [1005252.015428] [] swiotlb_alloc_coherent+0x5e/0x150 [1005252.015677] [] x86_swiotlb_alloc_coherent+0x41/0x50 [1005252.015933] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005252.016415] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005252.016665] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005252.016915] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005252.017389] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005252.017640] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005252.017902] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005252.018152] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005252.018659] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005252.019144] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005252.019395] [] process_one_work+0x17a/0x440 [1005252.019638] [] worker_thread+0x126/0x3c0 [1005252.019900] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005252.020160] [] kthread+0xcf/0xe0 [1005252.020418] [] ? insert_kthread_work+0x40/0x40 [1005252.020660] [] ret_from_fork+0x58/0x90 [1005252.020934] [] ? 
insert_kthread_work+0x40/0x40 [1005252.021193] Mem-Info: [1005252.021450] active_anon:952241 inactive_anon:304827 isolated_anon:0 active_file:2754971 inactive_file:40115429 isolated_file:0 unevictable:25295 dirty:132 writeback:0 unstable:0 slab_reclaimable:1226192 slab_unreclaimable:942282 mapped:6302 shmem:98898 pagetables:4551 bounce:0 free:12156566 free_pcp:123 free_cma:0 [1005252.022948] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005252.024463] lowmem_reserve[]: 0 1554 128505 128505 [1005252.024726] Node 0 DMA32 free:521192kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138200kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1640 all_unreclaimable? no [1005252.026599] lowmem_reserve[]: 0 0 126950 126950 [1005252.026891] Node 0 Normal free:22429392kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341640kB inactive_anon:697156kB active_file:6188076kB inactive_file:80463492kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:188kB writeback:0kB mapped:19468kB shmem:269668kB slab_reclaimable:1841768kB slab_unreclaimable:1713720kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:124kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005252.028948] lowmem_reserve[]: 0 0 0 0 [1005252.029204] Node 1 Normal free:25661072kB min:1050512kB low:1313140kB high:1575768kB active_anon:424288kB inactive_anon:471392kB active_file:4830976kB inactive_file:79997392kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:340kB writeback:0kB mapped:5732kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1917208kB kernel_stack:9088kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:964kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005252.031208] lowmem_reserve[]: 0 0 0 0 [1005252.037184] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005252.037787] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 422*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521304kB [1005252.038646] Node 0 Normal: 524983*4kB (UEM) 1385773*8kB (UEM) 575927*16kB (UEM) 1078*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22435444kB [1005252.039483] Node 1 Normal: 506682*4kB (UEM) 1927641*8kB (UEM) 502438*16kB (UEM) 5535*32kB (UEM) 3*64kB (U) 0*128kB 0*256kB 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25665200kB [1005252.040328] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.040837] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.041333] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.041833] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.042314] 42973045 total pagecache pages [1005252.042641] 2015 pages in swap cache [1005252.042891] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005252.043156] Free swap = 3189476kB [1005252.043404] Total swap = 4194300kB [1005252.043656] 67052113 pages RAM [1005252.043920] 0 pages HighMem/MovableOnly [1005252.044159] 1126685 pages reserved [1005252.046919] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005252.047177] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005252.047733] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005252.048267] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005252.048533] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1 [1005252.049102] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff [1005252.049592] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae [1005252.050122] Call Trace: [1005252.050378] [] dump_stack+0x19/0x1b [1005252.050640] [] warn_alloc_failed+0x110/0x180 [1005252.050911] [] __alloc_pages_slowpath+0x6b6/0x724 [1005252.051157] [] __alloc_pages_nodemask+0x405/0x420 [1005252.051421] [] dma_generic_alloc_coherent+0x8f/0x140 [1005252.051684] [] x86_swiotlb_alloc_coherent+0x21/0x50 [1005252.051965] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005252.052452] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005252.052716] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005252.052993] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005252.053485] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005252.053739] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005252.054017] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005252.054335] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005252.054851] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005252.055338] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005252.055601] [] process_one_work+0x17a/0x440 [1005252.055870] [] worker_thread+0x126/0x3c0 [1005252.056111] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005252.056372] [] kthread+0xcf/0xe0 [1005252.056625] [] ? insert_kthread_work+0x40/0x40 [1005252.056981] [] ret_from_fork+0x58/0x90 [1005252.057218] [] ? 
insert_kthread_work+0x40/0x40 [1005252.057474] Mem-Info: [1005252.057725] active_anon:952367 inactive_anon:304827 isolated_anon:0 active_file:2754971 inactive_file:40115429 isolated_file:0 unevictable:25295 dirty:132 writeback:0 unstable:0 slab_reclaimable:1226192 slab_unreclaimable:942282 mapped:6302 shmem:98898 pagetables:4551 bounce:0 free:12156566 free_pcp:61 free_cma:0 [1005252.059293] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005252.060794] lowmem_reserve[]: 0 1554 128505 128505 [1005252.061050] Node 0 DMA32 free:521192kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138200kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2050 all_unreclaimable? no [1005252.062803] lowmem_reserve[]: 0 0 126950 126950 [1005252.063092] Node 0 Normal free:22429392kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341640kB inactive_anon:697156kB active_file:6188076kB inactive_file:80463492kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:188kB writeback:0kB mapped:19468kB shmem:269668kB slab_reclaimable:1841768kB slab_unreclaimable:1713720kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005252.065140] lowmem_reserve[]: 0 0 0 0 [1005252.065416] Node 1 Normal free:25661060kB min:1050512kB low:1313140kB high:1575768kB active_anon:424792kB inactive_anon:471392kB active_file:4830976kB inactive_file:79997392kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:340kB writeback:0kB mapped:5732kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1917208kB kernel_stack:9088kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:856kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005252.067457] lowmem_reserve[]: 0 0 0 0 [1005252.067731] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005252.068317] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 422*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521304kB [1005252.069157] Node 0 Normal: 525007*4kB (UEM) 1385777*8kB (UEM) 575927*16kB (UEM) 1078*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22435572kB [1005252.070000] Node 1 Normal: 506769*4kB (UEM) 1927641*8kB (UEM) 502438*16kB (UEM) 5535*32kB (UEM) 3*64kB (U) 0*128kB 0*256kB 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25665548kB [1005252.070875] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.071448] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.071956] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.072441] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.072956] 42973045 total pagecache pages [1005252.073192] 2015 pages in swap cache [1005252.073444] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005252.073687] Free swap = 3189476kB [1005252.073986] Total swap = 4194300kB [1005252.074296] 67052113 pages RAM [1005252.074530] 0 pages HighMem/MovableOnly [1005252.074812] 1126685 pages reserved [1005252.075171] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005252.075416] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005252.075909] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005252.076396] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005252.076644] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1 [1005252.077184] ffff8806f51c7888 ffffffff81188810 ffffffff8118b530 0000000000000000 [1005252.077718] ffffffffffffff00 000080d000000000 ffff8806f51c7858 000000003ea4c0ae [1005252.078229] Call Trace: [1005252.078484] [] dump_stack+0x19/0x1b [1005252.078743] [] warn_alloc_failed+0x110/0x180 [1005252.079062] [] ? drain_pages+0xb0/0xb0 [1005252.079320] [] __alloc_pages_slowpath+0x6b6/0x724 [1005252.079583] [] __alloc_pages_nodemask+0x405/0x420 [1005252.079857] [] alloc_pages_current+0x98/0x110 [1005252.080100] [] __get_free_pages+0xe/0x40 [1005252.080364] [] swiotlb_alloc_coherent+0x5e/0x150 [1005252.080608] [] x86_swiotlb_alloc_coherent+0x41/0x50 [1005252.080857] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005252.081384] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005252.081650] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005252.081941] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005252.082444] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005252.082712] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005252.082985] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005252.083232] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005252.083746] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005252.084293] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005252.084540] [] process_one_work+0x17a/0x440 [1005252.084828] [] worker_thread+0x126/0x3c0 [1005252.085067] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005252.085428] [] kthread+0xcf/0xe0 [1005252.085662] [] ? insert_kthread_work+0x40/0x40 [1005252.085946] [] ret_from_fork+0x58/0x90 [1005252.086183] [] ? 
insert_kthread_work+0x40/0x40 [1005252.086443] Mem-Info: [1005252.086698] active_anon:952367 inactive_anon:304827 isolated_anon:0 active_file:2754971 inactive_file:40115513 isolated_file:0 unevictable:25295 dirty:132 writeback:0 unstable:0 slab_reclaimable:1226192 slab_unreclaimable:942282 mapped:6302 shmem:98898 pagetables:4551 bounce:0 free:12156470 free_pcp:101 free_cma:0 [1005252.093966] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005252.095489] lowmem_reserve[]: 0 1554 128505 128505 [1005252.095768] Node 0 DMA32 free:521192kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138200kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2460 all_unreclaimable? no [1005252.097500] lowmem_reserve[]: 0 0 126950 126950 [1005252.097826] Node 0 Normal free:22429020kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341640kB inactive_anon:697156kB active_file:6188076kB inactive_file:80463828kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:188kB writeback:0kB mapped:19468kB shmem:269668kB slab_reclaimable:1841768kB slab_unreclaimable:1713720kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:84kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005252.099996] lowmem_reserve[]: 0 0 0 0 [1005252.100253] Node 1 Normal free:25661060kB min:1050512kB low:1313140kB high:1575768kB active_anon:424792kB inactive_anon:471392kB active_file:4830976kB inactive_file:79997392kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:340kB writeback:0kB mapped:5732kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1917208kB kernel_stack:9088kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:940kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005252.102332] lowmem_reserve[]: 0 0 0 0 [1005252.102607] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005252.103177] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 422*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521304kB [1005252.104049] Node 0 Normal: 524821*4kB (UEM) 1385777*8kB (UEM) 575927*16kB (UEM) 1078*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22434828kB [1005252.104894] Node 1 Normal: 506488*4kB (UEM) 1927641*8kB (UEM) 502438*16kB (UEM) 5535*32kB (UEM) 3*64kB (U) 0*128kB 0*256kB 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25664424kB [1005252.105724] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.106221] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.106729] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.107277] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.107793] 42973320 total pagecache pages [1005252.108075] 2015 pages in swap cache [1005252.108375] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005252.108638] Free swap = 3189476kB [1005252.108901] Total swap = 4194300kB [1005252.109135] 67052113 pages RAM [1005252.109386] 0 pages HighMem/MovableOnly [1005252.109640] 1126685 pages reserved [1005252.112148] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005252.112402] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005252.112969] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005252.113485] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005252.113840] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1 [1005252.114367] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff [1005252.114898] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae [1005252.115396] Call Trace: [1005252.115631] [] dump_stack+0x19/0x1b [1005252.115920] [] warn_alloc_failed+0x110/0x180 [1005252.116160] [] __alloc_pages_slowpath+0x6b6/0x724 [1005252.116421] [] __alloc_pages_nodemask+0x405/0x420 [1005252.116663] [] dma_generic_alloc_coherent+0x8f/0x140 [1005252.116953] [] x86_swiotlb_alloc_coherent+0x21/0x50 [1005252.117203] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005252.117711] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005252.118006] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005252.118303] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005252.118821] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005252.119087] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005252.119352] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005252.119601] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005252.120164] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005252.120724] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005252.120981] [] process_one_work+0x17a/0x440 [1005252.121224] [] worker_thread+0x126/0x3c0 [1005252.121469] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005252.121712] [] kthread+0xcf/0xe0 [1005252.121999] [] ? insert_kthread_work+0x40/0x40 [1005252.122242] [] ret_from_fork+0x58/0x90 [1005252.122502] [] ? 
insert_kthread_work+0x40/0x40 [1005252.122779] Mem-Info: [1005252.123043] active_anon:952367 inactive_anon:304827 isolated_anon:0 active_file:2754971 inactive_file:40115891 isolated_file:0 unevictable:25295 dirty:132 writeback:0 unstable:0 slab_reclaimable:1226192 slab_unreclaimable:942282 mapped:6302 shmem:98898 pagetables:4551 bounce:0 free:12156029 free_pcp:100 free_cma:0 [1005252.124533] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005252.126096] lowmem_reserve[]: 0 1554 128505 128505 [1005252.126390] Node 0 DMA32 free:521192kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138200kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2870 all_unreclaimable? no [1005252.128293] lowmem_reserve[]: 0 0 126950 126950 [1005252.128548] Node 0 Normal free:22429020kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341640kB inactive_anon:697156kB active_file:6188076kB inactive_file:80463828kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:188kB writeback:0kB mapped:19468kB shmem:269668kB slab_reclaimable:1841768kB slab_unreclaimable:1713720kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:76kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005252.130517] lowmem_reserve[]: 0 0 0 0 [1005252.130776] Node 1 Normal free:25659296kB min:1050512kB low:1313140kB high:1575768kB active_anon:424792kB inactive_anon:471392kB active_file:4830976kB inactive_file:79998904kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:340kB writeback:0kB mapped:5732kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1917208kB kernel_stack:9088kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:728kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005252.132769] lowmem_reserve[]: 0 0 0 0 [1005252.133062] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005252.133680] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 422*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521304kB [1005252.134530] Node 0 Normal: 524749*4kB (UEM) 1385777*8kB (UEM) 575927*16kB (UEM) 1078*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22434540kB [1005252.135405] Node 1 Normal: 506321*4kB (UEM) 1927641*8kB (UEM) 502438*16kB (UEM) 5535*32kB (UEM) 3*64kB (U) 0*128kB 0*256kB 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25663756kB [1005252.136241] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.136745] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.137279] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.137844] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.138380] 42973696 total pagecache pages [1005252.138617] 2015 pages in swap cache [1005252.138854] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005252.139095] Free swap = 3189476kB [1005252.139333] Total swap = 4194300kB [1005252.139568] 67052113 pages RAM [1005252.139803] 0 pages HighMem/MovableOnly [1005252.140065] 1126685 pages reserved [1005252.140435] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005252.140681] CPU: 9 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005252.141222] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005252.141746] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005252.141988] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1 [1005252.142593] ffff8806f51c7888 ffffffff81188810 ffffffff8118b530 0000000000000000 [1005252.148786] ffffffffffffff00 000080d000000000 ffff8806f51c7858 000000003ea4c0ae [1005252.149350] Call Trace: [1005252.149586] [] dump_stack+0x19/0x1b [1005252.149875] [] warn_alloc_failed+0x110/0x180 [1005252.150116] [] ? drain_pages+0xb0/0xb0 [1005252.150376] [] __alloc_pages_slowpath+0x6b6/0x724 [1005252.150624] [] __alloc_pages_nodemask+0x405/0x420 [1005252.150867] [] alloc_pages_current+0x98/0x110 [1005252.151159] [] __get_free_pages+0xe/0x40 [1005252.151423] [] swiotlb_alloc_coherent+0x5e/0x150 [1005252.151686] [] x86_swiotlb_alloc_coherent+0x41/0x50 [1005252.151965] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005252.152490] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005252.152759] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005252.153040] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005252.153532] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005252.153859] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005252.154155] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005252.154410] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005252.154888] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005252.155363] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005252.155660] [] process_one_work+0x17a/0x440 [1005252.155925] [] worker_thread+0x126/0x3c0 [1005252.156163] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005252.156518] [] kthread+0xcf/0xe0 [1005252.156757] [] ? insert_kthread_work+0x40/0x40 [1005252.157026] [] ret_from_fork+0x58/0x90 [1005252.157326] [] ? 
insert_kthread_work+0x40/0x40 [1005252.157568] Mem-Info: [1005252.157804] active_anon:952367 inactive_anon:304827 isolated_anon:0 active_file:2754971 inactive_file:40115891 isolated_file:0 unevictable:25295 dirty:132 writeback:0 unstable:0 slab_reclaimable:1226192 slab_unreclaimable:942282 mapped:6302 shmem:98898 pagetables:4551 bounce:0 free:12156029 free_pcp:61 free_cma:0 [1005252.159327] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005252.160910] lowmem_reserve[]: 0 1554 128505 128505 [1005252.161169] Node 0 DMA32 free:521192kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138200kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:3280 all_unreclaimable? no [1005252.163019] lowmem_reserve[]: 0 0 126950 126950 [1005252.163329] Node 0 Normal free:22429020kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341640kB inactive_anon:697156kB active_file:6188076kB inactive_file:80463828kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:188kB writeback:0kB mapped:19468kB shmem:269668kB slab_reclaimable:1841768kB slab_unreclaimable:1713720kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:60kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005252.165450] lowmem_reserve[]: 0 0 0 0 [1005252.165740] Node 1 Normal free:25659296kB min:1050512kB low:1313140kB high:1575768kB active_anon:424792kB inactive_anon:471392kB active_file:4830976kB inactive_file:79998904kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:340kB writeback:0kB mapped:5732kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1917208kB kernel_stack:9088kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:976kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005252.167737] lowmem_reserve[]: 0 0 0 0 [1005252.168045] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005252.168735] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 422*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521304kB [1005252.169665] Node 0 Normal: 524675*4kB (UEM) 1385777*8kB (UEM) 575927*16kB (UEM) 1078*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22434244kB [1005252.170457] Node 1 Normal: 505985*4kB (UEM) 1927641*8kB (UEM) 502438*16kB (UEM) 5535*32kB (UEM) 3*64kB (U) 0*128kB 0*256kB 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25662412kB [1005252.171334] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.171866] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.172418] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.172935] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.173456] 42973696 total pagecache pages [1005252.173693] 2015 pages in swap cache [1005252.173980] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005252.174225] Free swap = 3189476kB [1005252.174477] Total swap = 4194300kB [1005252.174716] 67052113 pages RAM [1005252.174998] 0 pages HighMem/MovableOnly [1005252.175235] 1126685 pages reserved [1005252.177847] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005252.178110] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005252.178656] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005252.179189] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005252.179452] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1 [1005252.180049] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff [1005252.180586] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae [1005252.181101] Call Trace: [1005252.181358] [] dump_stack+0x19/0x1b [1005252.181622] [] warn_alloc_failed+0x110/0x180 [1005252.181880] [] __alloc_pages_slowpath+0x6b6/0x724 [1005252.182125] [] __alloc_pages_nodemask+0x405/0x420 [1005252.182389] [] dma_generic_alloc_coherent+0x8f/0x140 [1005252.182652] [] x86_swiotlb_alloc_coherent+0x21/0x50 [1005252.182932] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005252.183436] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005252.183702] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005252.183965] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005252.184456] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005252.184719] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005252.185065] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005252.185325] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005252.185822] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005252.186309] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005252.186575] [] process_one_work+0x17a/0x440 [1005252.186847] [] worker_thread+0x126/0x3c0 [1005252.187086] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005252.187348] [] kthread+0xcf/0xe0 [1005252.187609] [] ? insert_kthread_work+0x40/0x40 [1005252.187870] [] ret_from_fork+0x58/0x90 [1005252.188123] [] ? 
insert_kthread_work+0x40/0x40 [1005252.188393] Mem-Info: [1005252.188651] active_anon:952367 inactive_anon:304827 isolated_anon:0 active_file:2754971 inactive_file:40115891 isolated_file:0 unevictable:25295 dirty:132 writeback:0 unstable:0 slab_reclaimable:1226192 slab_unreclaimable:942282 mapped:6302 shmem:98898 pagetables:4551 bounce:0 free:12156155 free_pcp:61 free_cma:0 [1005252.190157] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005252.191728] lowmem_reserve[]: 0 1554 128505 128505 [1005252.191988] Node 0 DMA32 free:521192kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138200kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:3690 all_unreclaimable? no [1005252.193786] lowmem_reserve[]: 0 0 126950 126950 [1005252.194067] Node 0 Normal free:22429020kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341640kB inactive_anon:697156kB active_file:6188076kB inactive_file:80463828kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:188kB writeback:0kB mapped:19468kB shmem:269668kB slab_reclaimable:1841768kB slab_unreclaimable:1713720kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005252.196178] lowmem_reserve[]: 0 0 0 0 [1005252.196454] Node 1 Normal free:25659136kB min:1050512kB low:1313140kB high:1575768kB active_anon:424792kB inactive_anon:471392kB active_file:4830976kB inactive_file:79998904kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:340kB writeback:0kB mapped:5732kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1917208kB kernel_stack:9088kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:980kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005252.204406] lowmem_reserve[]: 0 0 0 0 [1005252.204683] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005252.205281] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 422*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521304kB [1005252.206131] Node 0 Normal: 524689*4kB (UEM) 1385778*8kB (UEM) 575927*16kB (UEM) 1078*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22434308kB [1005252.206977] Node 1 Normal: 505981*4kB (UEM) 1927641*8kB (UEM) 502438*16kB (UEM) 5535*32kB (UEM) 3*64kB (U) 0*128kB 0*256kB 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25662396kB [1005252.207846] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.208353] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.208858] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.209350] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.209855] 42973696 total pagecache pages [1005252.210093] 2015 pages in swap cache [1005252.210348] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005252.210591] Free swap = 3189476kB [1005252.210860] Total swap = 4194300kB [1005252.211095] 67052113 pages RAM [1005252.211349] 0 pages HighMem/MovableOnly [1005252.211585] 1126685 pages reserved [1005252.211990] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005252.212235] CPU: 9 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005252.212767] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005252.213372] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005252.213614] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1 [1005252.214139] ffff8806f51c7888 ffffffff81188810 0000000000000000 00000000ffffffff [1005252.214637] ffffffffffffff00 000080d000000000 ffff8806f51c7858 000000003ea4c0ae [1005252.215127] Call Trace: [1005252.215367] [] dump_stack+0x19/0x1b [1005252.215627] [] warn_alloc_failed+0x110/0x180 [1005252.215898] [] __alloc_pages_slowpath+0x6b6/0x724 [1005252.216141] [] __alloc_pages_nodemask+0x405/0x420 [1005252.216420] [] alloc_pages_current+0x98/0x110 [1005252.216698] [] __get_free_pages+0xe/0x40 [1005252.216944] [] swiotlb_alloc_coherent+0x5e/0x150 [1005252.217205] [] x86_swiotlb_alloc_coherent+0x41/0x50 [1005252.217473] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005252.218008] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005252.218264] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005252.218515] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005252.219043] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005252.219328] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005252.219593] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005252.219868] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005252.220377] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005252.220850] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005252.221168] [] process_one_work+0x17a/0x440 [1005252.221414] [] worker_thread+0x126/0x3c0 [1005252.221675] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005252.221947] [] kthread+0xcf/0xe0 [1005252.222187] [] ? insert_kthread_work+0x40/0x40 [1005252.222478] [] ret_from_fork+0x58/0x90 [1005252.222741] [] ? 
insert_kthread_work+0x40/0x40 [1005252.223012] Mem-Info: [1005252.223283] active_anon:952367 inactive_anon:304827 isolated_anon:0 active_file:2754971 inactive_file:40115891 isolated_file:0 unevictable:25295 dirty:132 writeback:0 unstable:0 slab_reclaimable:1226192 slab_unreclaimable:942282 mapped:6302 shmem:98898 pagetables:4551 bounce:0 free:12155989 free_pcp:121 free_cma:0 [1005252.224820] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005252.226334] lowmem_reserve[]: 0 1554 128505 128505 [1005252.226608] Node 0 DMA32 free:521192kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138200kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:4100 all_unreclaimable? no [1005252.228509] lowmem_reserve[]: 0 0 126950 126950 [1005252.228821] Node 0 Normal free:22429020kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341640kB inactive_anon:697156kB active_file:6188076kB inactive_file:80463828kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:188kB writeback:0kB mapped:19468kB shmem:269668kB slab_reclaimable:1841768kB slab_unreclaimable:1713720kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005252.230926] lowmem_reserve[]: 0 0 0 0 [1005252.231185] Node 1 Normal free:25659048kB min:1050512kB low:1313140kB high:1575768kB active_anon:424792kB inactive_anon:471392kB active_file:4830976kB inactive_file:79998904kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:340kB writeback:0kB mapped:5732kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1917208kB kernel_stack:9088kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:980kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005252.233192] lowmem_reserve[]: 0 0 0 0 [1005252.233489] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005252.234132] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 422*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521304kB [1005252.234983] Node 0 Normal: 524666*4kB (UEM) 1385774*8kB (UEM) 575927*16kB (UEM) 1078*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22434184kB [1005252.235829] Node 1 Normal: 505976*4kB (UEM) 1927641*8kB (UEM) 502438*16kB (UEM) 5535*32kB (UEM) 3*64kB (U) 0*128kB 0*256kB 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25662376kB [1005252.236650] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.237188] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.237693] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005252.238196] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005252.238695] 42973696 total pagecache pages [1005252.238963] 2015 pages in swap cache [1005252.239230] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005252.239491] Free swap = 3189476kB [1005252.239729] Total swap = 4194300kB [1005252.239996] 67052113 pages RAM [1005252.240229] 0 pages HighMem/MovableOnly [1005252.240485] 1126685 pages reserved [1005257.577247] warn_alloc_failed: 22 callbacks suppressed [1005257.577539] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005257.577783] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005257.578300] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005257.578834] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005257.579118] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1 [1005257.579653] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff [1005257.580169] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae [1005257.580788] Call Trace: [1005257.581041] [] dump_stack+0x19/0x1b [1005257.581285] [] warn_alloc_failed+0x110/0x180 [1005257.581569] [] __alloc_pages_slowpath+0x6b6/0x724 [1005257.581813] [] __alloc_pages_nodemask+0x405/0x420 [1005257.582080] [] dma_generic_alloc_coherent+0x8f/0x140 [1005257.582348] [] x86_swiotlb_alloc_coherent+0x21/0x50 [1005257.582654] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005257.583161] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005257.583429] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005257.583716] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005257.584222] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005257.584493] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005257.584766] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005257.585038] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005257.585542] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005257.586055] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005257.586325] [] process_one_work+0x17a/0x440 [1005257.586583] [] worker_thread+0x126/0x3c0 [1005257.586825] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005257.587089] [] kthread+0xcf/0xe0 [1005257.587363] [] ? insert_kthread_work+0x40/0x40 [1005257.587628] [] ret_from_fork+0x58/0x90 [1005257.587869] [] ? 
insert_kthread_work+0x40/0x40 [1005257.588130] Mem-Info: [1005257.588387] active_anon:952132 inactive_anon:304827 isolated_anon:0 active_file:2754982 inactive_file:40120411 isolated_file:0 unevictable:25295 dirty:196 writeback:0 unstable:0 slab_reclaimable:1226196 slab_unreclaimable:941901 mapped:6377 shmem:98898 pagetables:4551 bounce:0 free:12154459 free_pcp:180 free_cma:0 [1005257.595630] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005257.597172] lowmem_reserve[]: 0 1554 128505 128505 [1005257.597469] Node 0 DMA32 free:521368kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138136kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:410 all_unreclaimable? no [1005257.599246] lowmem_reserve[]: 0 0 126950 126950 [1005257.599533] Node 0 Normal free:22431940kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341624kB inactive_anon:697120kB active_file:6188076kB inactive_file:80466564kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:248kB writeback:0kB mapped:19736kB shmem:269668kB slab_reclaimable:1841784kB slab_unreclaimable:1713740kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005257.601623] lowmem_reserve[]: 0 0 0 0 [1005257.601881] Node 1 Normal free:25650592kB min:1050512kB low:1313140kB high:1575768kB active_anon:423868kB inactive_anon:471428kB active_file:4831020kB inactive_file:80014248kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:536kB writeback:0kB mapped:5764kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1915176kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:1344kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005257.603880] lowmem_reserve[]: 0 0 0 0 [1005257.604178] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005257.604768] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 424*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521368kB [1005257.605616] Node 0 Normal: 524186*4kB (UEM) 1385786*8kB (UEM) 575903*16kB (UEM) 1084*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22432168kB [1005257.606444] Node 1 Normal: 502297*4kB (UEM) 1927290*8kB (UE) 502394*16kB (UEM) 5654*32kB (UEM) 26*64kB (UM) 6*128kB (M) 1*256kB (U) 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25650452kB [1005257.607323] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005257.607801] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005257.608309] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005257.608888] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005257.609402] 42978032 total pagecache pages [1005257.609652] 2015 pages in swap cache [1005257.609902] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005257.610185] Free swap = 3189476kB [1005257.610424] Total swap = 4194300kB [1005257.610667] 67052113 pages RAM [1005257.610902] 0 pages HighMem/MovableOnly [1005257.611145] 1126685 pages reserved [1005257.611502] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005257.611750] CPU: 9 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005257.612235] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005257.612739] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005257.612980] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1 [1005257.613516] ffff8806f51c7888 ffffffff81188810 0000000000000000 00000000ffffffff [1005257.614022] ffffffffffffff00 000080d000000000 ffff8806f51c7858 000000003ea4c0ae [1005257.614558] Call Trace: [1005257.614792] [] dump_stack+0x19/0x1b [1005257.615052] [] warn_alloc_failed+0x110/0x180 [1005257.615314] [] __alloc_pages_slowpath+0x6b6/0x724 [1005257.615586] [] __alloc_pages_nodemask+0x405/0x420 [1005257.615828] [] alloc_pages_current+0x98/0x110 [1005257.616090] [] __get_free_pages+0xe/0x40 [1005257.616354] [] swiotlb_alloc_coherent+0x5e/0x150 [1005257.616627] [] x86_swiotlb_alloc_coherent+0x41/0x50 [1005257.616908] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005257.617422] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005257.617723] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005257.617974] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005257.618516] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005257.618786] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005257.619069] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005257.619334] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005257.619833] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005257.620355] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005257.620617] [] process_one_work+0x17a/0x440 [1005257.620873] [] worker_thread+0x126/0x3c0 [1005257.621137] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005257.621406] [] kthread+0xcf/0xe0 [1005257.621665] [] ? insert_kthread_work+0x40/0x40 [1005257.621907] [] ret_from_fork+0x58/0x90 [1005257.622167] [] ? 
insert_kthread_work+0x40/0x40 [1005257.622429] Mem-Info: [1005257.622678] active_anon:952132 inactive_anon:304827 isolated_anon:0 active_file:2754982 inactive_file:40120453 isolated_file:0 unevictable:25295 dirty:196 writeback:0 unstable:0 slab_reclaimable:1226196 slab_unreclaimable:941755 mapped:6377 shmem:98898 pagetables:4551 bounce:0 free:12154681 free_pcp:92 free_cma:0 [1005257.624213] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005257.625754] lowmem_reserve[]: 0 1554 128505 128505 [1005257.626025] Node 0 DMA32 free:521368kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138136kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:820 all_unreclaimable? no [1005257.627852] lowmem_reserve[]: 0 0 126950 126950 [1005257.628133] Node 0 Normal free:22431940kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341624kB inactive_anon:697120kB active_file:6188076kB inactive_file:80466564kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:248kB writeback:0kB mapped:19736kB shmem:269668kB slab_reclaimable:1841784kB slab_unreclaimable:1713740kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:124kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005257.630255] lowmem_reserve[]: 0 0 0 0 [1005257.630529] Node 1 Normal free:25650692kB min:1050512kB low:1313140kB high:1575768kB active_anon:424372kB inactive_anon:471428kB active_file:4831020kB inactive_file:80014416kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:536kB writeback:0kB mapped:5764kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1915144kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:868kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005257.632523] lowmem_reserve[]: 0 0 0 0 [1005257.632798] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005257.633457] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 424*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521368kB [1005257.634312] Node 0 Normal: 524155*4kB (UEM) 1385786*8kB (UEM) 575877*16kB (UEM) 1084*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22431628kB [1005257.635107] Node 1 Normal: 502367*4kB (UEM) 1927303*8kB (UEM) 502415*16kB (UEM) 5654*32kB (UEM) 26*64kB (UM) 6*128kB (M) 1*256kB (U) 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25651172kB [1005257.635958] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005257.636474] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005257.637049] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005257.637631] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005257.638124] 42978032 total pagecache pages [1005257.638375] 2015 pages in swap cache [1005257.638641] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005257.638884] Free swap = 3189476kB [1005257.639141] Total swap = 4194300kB [1005257.639409] 67052113 pages RAM [1005257.639655] 0 pages HighMem/MovableOnly [1005257.639890] 1126685 pages reserved [1005258.577065] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005258.577310] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005258.577814] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005258.578340] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005258.584275] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1 [1005258.584834] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff [1005258.585402] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae [1005258.585901] Call Trace: [1005258.586161] [] dump_stack+0x19/0x1b [1005258.586425] [] warn_alloc_failed+0x110/0x180 [1005258.586686] [] __alloc_pages_slowpath+0x6b6/0x724 [1005258.586928] [] __alloc_pages_nodemask+0x405/0x420 [1005258.587192] [] dma_generic_alloc_coherent+0x8f/0x140 [1005258.587453] [] x86_swiotlb_alloc_coherent+0x21/0x50 [1005258.587719] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005258.588211] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005258.588518] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005258.588785] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005258.589294] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005258.589562] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005258.589838] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005258.590108] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005258.590617] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005258.591196] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005258.591440] [] process_one_work+0x17a/0x440 [1005258.591709] [] worker_thread+0x126/0x3c0 [1005258.591972] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005258.592231] [] kthread+0xcf/0xe0 [1005258.592473] [] ? insert_kthread_work+0x40/0x40 [1005258.592714] [] ret_from_fork+0x58/0x90 [1005258.592994] [] ? 
insert_kthread_work+0x40/0x40 [1005258.593241] Mem-Info: [1005258.593511] active_anon:952133 inactive_anon:304827 isolated_anon:0 active_file:2754983 inactive_file:40120455 isolated_file:0 unevictable:25295 dirty:199 writeback:0 unstable:0 slab_reclaimable:1226196 slab_unreclaimable:941694 mapped:6390 shmem:98898 pagetables:4551 bounce:0 free:12154909 free_pcp:61 free_cma:0 [1005258.594988] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005258.596578] lowmem_reserve[]: 0 1554 128505 128505 [1005258.596840] Node 0 DMA32 free:521400kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138104kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:410 all_unreclaimable? no [1005258.598598] lowmem_reserve[]: 0 0 126950 126950 [1005258.598875] Node 0 Normal free:22432028kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341624kB inactive_anon:697120kB active_file:6188076kB inactive_file:80466564kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:248kB writeback:0kB mapped:19788kB shmem:269668kB slab_reclaimable:1841784kB slab_unreclaimable:1713832kB kernel_stack:41392kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005258.600914] lowmem_reserve[]: 0 0 0 0 [1005258.601217] Node 1 Normal free:25651484kB min:1050512kB low:1313140kB high:1575768kB active_anon:423872kB inactive_anon:471428kB active_file:4831024kB inactive_file:80014424kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:548kB writeback:0kB mapped:5764kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1914840kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:864kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005258.603223] lowmem_reserve[]: 0 0 0 0 [1005258.603501] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005258.604117] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 425*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521400kB [1005258.605025] Node 0 Normal: 524175*4kB (UEM) 1385790*8kB (UEM) 575896*16kB (UEM) 1084*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22432044kB [1005258.605937] Node 1 Normal: 502368*4kB (UEM) 1927304*8kB (UEM) 502414*16kB (UEM) 5664*32kB (UEM) 26*64kB (UM) 6*128kB (M) 1*256kB (U) 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25651488kB [1005258.606805] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005258.607327] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005258.607829] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005258.608348] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005258.608847] 42978037 total pagecache pages [1005258.609103] 2015 pages in swap cache [1005258.609373] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005258.609631] Free swap = 3189476kB [1005258.609866] Total swap = 4194300kB [1005258.610121] 67052113 pages RAM [1005258.610391] 0 pages HighMem/MovableOnly [1005258.610643] 1126685 pages reserved [1005258.610980] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005258.611247] CPU: 9 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005258.611756] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005258.612266] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005258.612542] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1 [1005258.613060] ffff8806f51c7888 ffffffff81188810 0000000000000000 00000000ffffffff [1005258.613590] ffffffffffffff00 000080d000000000 ffff8806f51c7858 000000003ea4c0ae [1005258.614092] Call Trace: [1005258.614343] [] dump_stack+0x19/0x1b [1005258.614599] [] warn_alloc_failed+0x110/0x180 [1005258.614841] [] __alloc_pages_slowpath+0x6b6/0x724 [1005258.615103] [] __alloc_pages_nodemask+0x405/0x420 [1005258.615353] [] alloc_pages_current+0x98/0x110 [1005258.615604] [] __get_free_pages+0xe/0x40 [1005258.615848] [] swiotlb_alloc_coherent+0x5e/0x150 [1005258.616096] [] x86_swiotlb_alloc_coherent+0x41/0x50 [1005258.616347] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005258.616828] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005258.617082] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005258.617334] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005258.617805] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005258.618058] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005258.618304] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005258.618551] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005258.619027] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005258.619617] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005258.619855] [] process_one_work+0x17a/0x440 [1005258.620096] [] worker_thread+0x126/0x3c0 [1005258.620333] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005258.620589] [] kthread+0xcf/0xe0 [1005258.620829] [] ? insert_kthread_work+0x40/0x40 [1005258.621091] [] ret_from_fork+0x58/0x90 [1005258.621335] [] ? 
insert_kthread_work+0x40/0x40 [1005258.621594] Mem-Info: [1005258.621827] active_anon:952133 inactive_anon:304827 isolated_anon:0 active_file:2754983 inactive_file:40120455 isolated_file:0 unevictable:25295 dirty:199 writeback:0 unstable:0 slab_reclaimable:1226196 slab_unreclaimable:941694 mapped:6394 shmem:98898 pagetables:4551 bounce:0 free:12154935 free_pcp:31 free_cma:0 [1005258.623318] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005258.624843] lowmem_reserve[]: 0 1554 128505 128505 [1005258.625124] Node 0 DMA32 free:521400kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138104kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:820 all_unreclaimable? no [1005258.626869] lowmem_reserve[]: 0 0 126950 126950 [1005258.627148] Node 0 Normal free:22432028kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341624kB inactive_anon:697120kB active_file:6188076kB inactive_file:80466564kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:248kB writeback:0kB mapped:19804kB shmem:269668kB slab_reclaimable:1841784kB slab_unreclaimable:1713832kB kernel_stack:41392kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005258.629109] lowmem_reserve[]: 0 0 0 0 [1005258.629383] Node 1 Normal free:25651588kB min:1050512kB low:1313140kB high:1575768kB active_anon:423872kB inactive_anon:471428kB active_file:4831024kB inactive_file:80014424kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:548kB writeback:0kB mapped:5764kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1914840kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:864kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005258.636985] lowmem_reserve[]: 0 0 0 0 [1005258.637243] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005258.637854] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 425*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521400kB [1005258.638693] Node 0 Normal: 524175*4kB (UEM) 1385790*8kB (UEM) 575896*16kB (UEM) 1084*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22432044kB [1005258.639507] Node 1 Normal: 502368*4kB (UEM) 1927303*8kB (UEM) 502414*16kB (UEM) 5664*32kB (UEM) 26*64kB (UM) 6*128kB (M) 1*256kB (U) 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25651480kB [1005258.640326] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005258.640814] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005258.641317] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005258.641802] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005258.642334] 42978037 total pagecache pages [1005258.642586] 2015 pages in swap cache [1005258.642855] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005258.643103] Free swap = 3189476kB [1005258.643339] Total swap = 4194300kB [1005258.643591] 67052113 pages RAM [1005258.643824] 0 pages HighMem/MovableOnly [1005258.644078] 1126685 pages reserved [1005259.577170] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005259.577420] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005259.577909] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005259.578408] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005259.578670] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1 [1005259.579163] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff [1005259.579649] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae [1005259.580143] Call Trace: [1005259.580389] [] dump_stack+0x19/0x1b [1005259.580641] [] warn_alloc_failed+0x110/0x180 [1005259.580885] [] __alloc_pages_slowpath+0x6b6/0x724 [1005259.581137] [] __alloc_pages_nodemask+0x405/0x420 [1005259.581391] [] dma_generic_alloc_coherent+0x8f/0x140 [1005259.581637] [] x86_swiotlb_alloc_coherent+0x21/0x50 [1005259.581949] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005259.582470] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005259.582719] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005259.582991] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005259.583518] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005259.583787] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005259.584044] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005259.584299] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005259.584811] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005259.585325] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005259.585612] [] process_one_work+0x17a/0x440 [1005259.585881] [] worker_thread+0x126/0x3c0 [1005259.586165] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005259.586418] [] kthread+0xcf/0xe0 [1005259.586665] [] ? insert_kthread_work+0x40/0x40 [1005259.586919] [] ret_from_fork+0x58/0x90 [1005259.587164] [] ? 
insert_kthread_work+0x40/0x40 [1005259.587401] Mem-Info: [1005259.587735] active_anon:952135 inactive_anon:304827 isolated_anon:0 active_file:2754983 inactive_file:40120707 isolated_file:0 unevictable:25295 dirty:184 writeback:1 unstable:0 slab_reclaimable:1226196 slab_unreclaimable:942234 mapped:6404 shmem:98898 pagetables:4551 bounce:0 free:12154048 free_pcp:31 free_cma:0 [1005259.589166] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005259.590631] lowmem_reserve[]: 0 1554 128505 128505 [1005259.590897] Node 0 DMA32 free:521400kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138104kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1230 all_unreclaimable? no [1005259.592605] lowmem_reserve[]: 0 0 126950 126950 [1005259.592868] Node 0 Normal free:22428840kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341632kB inactive_anon:697120kB active_file:6188076kB inactive_file:80467564kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:212kB writeback:4kB mapped:19844kB shmem:269668kB slab_reclaimable:1841784kB slab_unreclaimable:1715888kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005259.594826] lowmem_reserve[]: 0 0 0 0 [1005259.595103] Node 1 Normal free:25650732kB min:1050512kB low:1313140kB high:1575768kB active_anon:423872kB inactive_anon:471428kB active_file:4831024kB inactive_file:80014432kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:524kB writeback:0kB mapped:5764kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1914864kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:544kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005259.597133] lowmem_reserve[]: 0 0 0 0 [1005259.597445] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005259.598051] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 425*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521400kB [1005259.598888] Node 0 Normal: 523921*4kB (UEM) 1385776*8kB (UEM) 575785*16kB (UEM) 1086*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22429204kB [1005259.599731] Node 1 Normal: 502362*4kB (UEM) 1927315*8kB (UEM) 502424*16kB (UEM) 5655*32kB (UEM) 26*64kB (UM) 6*128kB (M) 1*256kB (U) 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25651424kB [1005259.600546] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005259.601028] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005259.601500] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005259.602063] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005259.602522] 42978297 total pagecache pages [1005259.602754] 2015 pages in swap cache [1005259.602994] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005259.603242] Free swap = 3189476kB [1005259.603478] Total swap = 4194300kB [1005259.603717] 67052113 pages RAM [1005259.603959] 0 pages HighMem/MovableOnly [1005259.604197] 1126685 pages reserved [1005259.604600] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005259.604863] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005259.605396] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005259.605949] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005259.606194] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1 [1005259.606710] ffff8806f51c7888 ffffffff81188810 ffffffff8118b530 0000000000000000 [1005259.607217] ffffffffffffff00 000080d000000000 ffff8806f51c7858 000000003ea4c0ae [1005259.607760] Call Trace: [1005259.608024] [] dump_stack+0x19/0x1b [1005259.608274] [] warn_alloc_failed+0x110/0x180 [1005259.608570] [] ? drain_pages+0xb0/0xb0 [1005259.608815] [] __alloc_pages_slowpath+0x6b6/0x724 [1005259.609079] [] __alloc_pages_nodemask+0x405/0x420 [1005259.609326] [] alloc_pages_current+0x98/0x110 [1005259.609571] [] __get_free_pages+0xe/0x40 [1005259.609837] [] swiotlb_alloc_coherent+0x5e/0x150 [1005259.610105] [] x86_swiotlb_alloc_coherent+0x41/0x50 [1005259.610365] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005259.610924] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005259.611175] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005259.611458] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005259.611972] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005259.612224] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005259.612502] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005259.612756] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005259.613271] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005259.613812] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005259.614079] [] process_one_work+0x17a/0x440 [1005259.614328] [] worker_thread+0x126/0x3c0 [1005259.614607] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005259.614904] [] kthread+0xcf/0xe0 [1005259.615160] [] ? insert_kthread_work+0x40/0x40 [1005259.615457] [] ret_from_fork+0x58/0x90 [1005259.615731] [] ? 
insert_kthread_work+0x40/0x40 [1005259.622090] Mem-Info: [1005259.622335] active_anon:952135 inactive_anon:304827 isolated_anon:0 active_file:2754983 inactive_file:40120707 isolated_file:0 unevictable:25295 dirty:184 writeback:1 unstable:0 slab_reclaimable:1226196 slab_unreclaimable:942214 mapped:6404 shmem:98898 pagetables:4551 bounce:0 free:12153895 free_pcp:61 free_cma:0 [1005259.623836] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005259.625377] lowmem_reserve[]: 0 1554 128505 128505 [1005259.625700] Node 0 DMA32 free:521400kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138104kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1640 all_unreclaimable? no [1005259.627490] lowmem_reserve[]: 0 0 126950 126950 [1005259.627768] Node 0 Normal free:22428840kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341632kB inactive_anon:697120kB active_file:6188076kB inactive_file:80467564kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:212kB writeback:4kB mapped:19844kB shmem:269668kB slab_reclaimable:1841784kB slab_unreclaimable:1715888kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005259.629854] lowmem_reserve[]: 0 0 0 0 [1005259.630114] Node 1 Normal free:25650732kB min:1050512kB low:1313140kB high:1575768kB active_anon:423872kB inactive_anon:471428kB active_file:4831024kB inactive_file:80014432kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:524kB writeback:0kB mapped:5764kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1914864kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:852kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005259.632155] lowmem_reserve[]: 0 0 0 0 [1005259.632464] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005259.633104] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 425*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521400kB [1005259.634019] Node 0 Normal: 523921*4kB (UEM) 1385776*8kB (UEM) 575829*16kB (UEM) 1086*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22429908kB [1005259.634865] Node 1 Normal: 502368*4kB (UEM) 1927332*8kB (UEM) 502385*16kB (UEM) 5655*32kB (UEM) 26*64kB (UM) 6*128kB (M) 1*256kB (U) 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25650960kB [1005259.635758] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005259.636284] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005259.636853] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005259.637380] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005259.637931] 42978297 total pagecache pages [1005259.638172] 2015 pages in swap cache [1005259.638442] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005259.638701] Free swap = 3189476kB [1005259.638958] Total swap = 4194300kB [1005259.639194] 67052113 pages RAM [1005259.639494] 0 pages HighMem/MovableOnly [1005259.639732] 1126685 pages reserved [1005260.577053] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005260.577322] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005260.577849] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005260.578385] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005260.578655] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1 [1005260.579144] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff [1005260.579630] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae [1005260.580151] Call Trace: [1005260.580413] [] dump_stack+0x19/0x1b [1005260.580672] [] warn_alloc_failed+0x110/0x180 [1005260.580947] [] __alloc_pages_slowpath+0x6b6/0x724 [1005260.581192] [] __alloc_pages_nodemask+0x405/0x420 [1005260.581486] [] dma_generic_alloc_coherent+0x8f/0x140 [1005260.581756] [] x86_swiotlb_alloc_coherent+0x21/0x50 [1005260.582050] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005260.582598] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005260.582881] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005260.583131] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005260.583663] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005260.583954] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005260.584292] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005260.584540] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005260.585076] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005260.585547] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005260.585793] [] process_one_work+0x17a/0x440 [1005260.586040] [] worker_thread+0x126/0x3c0 [1005260.586282] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005260.586548] [] kthread+0xcf/0xe0 [1005260.586833] [] ? insert_kthread_work+0x40/0x40 [1005260.587082] [] ret_from_fork+0x58/0x90 [1005260.587327] [] ? 
insert_kthread_work+0x40/0x40 [1005260.587573] Mem-Info: [1005260.587815] active_anon:952139 inactive_anon:304827 isolated_anon:0 active_file:2754986 inactive_file:40120793 isolated_file:0 unevictable:25295 dirty:175 writeback:12 unstable:0 slab_reclaimable:1226196 slab_unreclaimable:942196 mapped:6415 shmem:98898 pagetables:4551 bounce:0 free:12153899 free_pcp:91 free_cma:0 [1005260.589286] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005260.590748] lowmem_reserve[]: 0 1554 128505 128505 [1005260.591009] Node 0 DMA32 free:521400kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138104kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2050 all_unreclaimable? no [1005260.592677] lowmem_reserve[]: 0 0 126950 126950 [1005260.592939] Node 0 Normal free:22426168kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341648kB inactive_anon:697120kB active_file:6188076kB inactive_file:80467900kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:192kB writeback:24kB mapped:19888kB shmem:269668kB slab_reclaimable:1841784kB slab_unreclaimable:1718064kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005260.594847] lowmem_reserve[]: 0 0 0 0 [1005260.595111] Node 1 Normal free:25647084kB min:1050512kB low:1313140kB high:1575768kB active_anon:423872kB inactive_anon:471428kB active_file:4831036kB inactive_file:80014440kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:508kB writeback:24kB mapped:5764kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1918720kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:864kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005260.597020] lowmem_reserve[]: 0 0 0 0 [1005260.597279] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005260.597875] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 425*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521400kB [1005260.598803] Node 0 Normal: 523807*4kB (UE) 1385794*8kB (UEM) 575618*16kB (UEM) 1087*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22426252kB [1005260.599576] Node 1 Normal: 502366*4kB (UEM) 1927343*8kB (UEM) 502152*16kB (UEM) 5652*32kB (UEM) 26*64kB (UM) 6*128kB (M) 1*256kB (U) 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25647216kB [1005260.600465] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005260.600970] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005260.601442] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005260.601919] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005260.602404] 42978389 total pagecache pages [1005260.602654] 2015 pages in swap cache [1005260.602922] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005260.603166] Free swap = 3189476kB [1005260.603447] Total swap = 4194300kB [1005260.603689] 67052113 pages RAM [1005260.603964] 0 pages HighMem/MovableOnly [1005260.604203] 1126685 pages reserved [1005260.644751] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005260.645004] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005260.645482] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005260.645966] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005260.651563] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1 [1005260.652053] ffff8806f51c7888 ffffffff81188810 0000000000000000 ffff88407ffd8000 [1005260.652540] 0000000000000008 00000000000080d0 ffff8806f51c7888 000000003ea4c0ae [1005260.653028] Call Trace: [1005260.653269] [] dump_stack+0x19/0x1b [1005260.653516] [] warn_alloc_failed+0x110/0x180 [1005260.653759] [] __alloc_pages_slowpath+0x6b6/0x724 [1005260.654005] [] __alloc_pages_nodemask+0x405/0x420 [1005260.654250] [] alloc_pages_current+0x98/0x110 [1005260.654495] [] __get_free_pages+0xe/0x40 [1005260.654741] [] swiotlb_alloc_coherent+0x5e/0x150 [1005260.654988] [] x86_swiotlb_alloc_coherent+0x41/0x50 [1005260.655340] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005260.655806] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005260.656053] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005260.656297] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005260.656769] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005260.657029] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005260.657281] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005260.657532] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005260.658009] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005260.658538] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005260.658783] [] process_one_work+0x17a/0x440 [1005260.659027] [] worker_thread+0x126/0x3c0 [1005260.659270] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005260.659511] [] kthread+0xcf/0xe0 [1005260.659785] [] ? insert_kthread_work+0x40/0x40 [1005260.660036] [] ret_from_fork+0x58/0x90 [1005260.660280] [] ? 
insert_kthread_work+0x40/0x40 [1005260.660523] Mem-Info: [1005260.660787] active_anon:952139 inactive_anon:304827 isolated_anon:0 active_file:2754986 inactive_file:40120793 isolated_file:0 unevictable:25295 dirty:175 writeback:12 unstable:0 slab_reclaimable:1226196 slab_unreclaimable:944490 mapped:6415 shmem:98898 pagetables:4551 bounce:0 free:12151512 free_pcp:74 free_cma:0 [1005260.662275] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005260.663761] lowmem_reserve[]: 0 1554 128505 128505 [1005260.664037] Node 0 DMA32 free:521400kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138104kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2460 all_unreclaimable? no [1005260.665813] lowmem_reserve[]: 0 0 126950 126950 [1005260.666075] Node 0 Normal free:22424116kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341648kB inactive_anon:697120kB active_file:6188076kB inactive_file:80467900kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:192kB writeback:24kB mapped:19888kB shmem:269668kB slab_reclaimable:1841784kB slab_unreclaimable:1720112kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005260.668061] lowmem_reserve[]: 0 0 0 0 [1005260.668321] Node 1 Normal free:25645804kB min:1050512kB low:1313140kB high:1575768kB active_anon:423872kB inactive_anon:471428kB active_file:4831036kB inactive_file:80014440kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:508kB writeback:24kB mapped:5764kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1919744kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:912kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005260.670449] lowmem_reserve[]: 0 0 0 0 [1005260.670708] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005260.671285] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 425*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521400kB [1005260.672105] Node 0 Normal: 523831*4kB (UEM) 1385797*8kB (UEM) 575608*16kB (UEM) 1087*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22426212kB [1005260.672947] Node 1 Normal: 502270*4kB (UEM) 1927344*8kB (UEM) 502145*16kB (UEM) 5651*32kB (UEM) 26*64kB (UM) 5*128kB (M) 1*256kB (U) 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25646568kB [1005260.673752] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005260.674234] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005260.674707] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005260.675186] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005260.675661] 42978389 total pagecache pages [1005260.675903] 2015 pages in swap cache [1005260.676144] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005260.676397] Free swap = 3189476kB [1005260.676636] Total swap = 4194300kB [1005260.676875] 67052113 pages RAM [1005260.677111] 0 pages HighMem/MovableOnly [1005260.677349] 1126685 pages reserved [1005261.576923] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005261.577179] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005261.577657] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005261.578147] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005261.578404] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1 [1005261.578917] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff [1005261.579404] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae [1005261.579938] Call Trace: [1005261.580181] [] dump_stack+0x19/0x1b [1005261.580424] [] warn_alloc_failed+0x110/0x180 [1005261.580792] [] __alloc_pages_slowpath+0x6b6/0x724 [1005261.581036] [] __alloc_pages_nodemask+0x405/0x420 [1005261.581281] [] dma_generic_alloc_coherent+0x8f/0x140 [1005261.581524] [] x86_swiotlb_alloc_coherent+0x21/0x50 [1005261.581840] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005261.582342] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005261.582594] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005261.582881] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005261.583376] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005261.583639] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005261.583932] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005261.584178] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005261.584711] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005261.585245] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005261.585502] [] process_one_work+0x17a/0x440 [1005261.585813] [] worker_thread+0x126/0x3c0 [1005261.586080] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005261.586391] [] kthread+0xcf/0xe0 [1005261.586665] [] ? insert_kthread_work+0x40/0x40 [1005261.586918] [] ret_from_fork+0x58/0x90 [1005261.587163] [] ? 
insert_kthread_work+0x40/0x40 [1005261.587407] Mem-Info: [1005261.587694] active_anon:952139 inactive_anon:304827 isolated_anon:0 active_file:2754986 inactive_file:40121055 isolated_file:0 unevictable:25295 dirty:190 writeback:0 unstable:0 slab_reclaimable:1226196 slab_unreclaimable:941750 mapped:6429 shmem:98898 pagetables:4551 bounce:0 free:12154290 free_pcp:94 free_cma:0 [1005261.589195] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005261.590810] lowmem_reserve[]: 0 1554 128505 128505 [1005261.591084] Node 0 DMA32 free:521400kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138104kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2870 all_unreclaimable? no [1005261.592909] lowmem_reserve[]: 0 0 126950 126950 [1005261.593170] Node 0 Normal free:22429676kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341648kB inactive_anon:697120kB active_file:6188076kB inactive_file:80468668kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:208kB writeback:0kB mapped:19944kB shmem:269668kB slab_reclaimable:1841784kB slab_unreclaimable:1714352kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005261.595198] lowmem_reserve[]: 0 0 0 0 [1005261.595450] Node 1 Normal free:25651360kB min:1050512kB low:1313140kB high:1575768kB active_anon:423872kB inactive_anon:471428kB active_file:4831036kB inactive_file:80014720kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:552kB writeback:0kB mapped:5764kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1914544kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:996kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005261.602752] lowmem_reserve[]: 0 0 0 0 [1005261.603026] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005261.603629] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 425*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521400kB [1005261.604490] Node 0 Normal: 523641*4kB (UEM) 1385771*8kB (UEM) 575893*16kB (UEM) 1087*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22429804kB [1005261.605357] Node 1 Normal: 502239*4kB (UEM) 1927329*8kB (UEM) 502444*16kB (UEM) 5651*32kB (UEM) 26*64kB (UM) 5*128kB (M) 1*256kB (U) 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25651108kB [1005261.606178] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005261.606697] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005261.607203] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005261.607710] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005261.608191] 42978664 total pagecache pages [1005261.608432] 2015 pages in swap cache [1005261.608669] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005261.608912] Free swap = 3189476kB [1005261.609259] Total swap = 4194300kB [1005261.609505] 67052113 pages RAM [1005261.609783] 0 pages HighMem/MovableOnly [1005261.610035] 1126685 pages reserved [1005261.610352] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005261.610596] CPU: 9 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005261.611135] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005261.611654] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005261.611918] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1 [1005261.612470] ffff8806f51c7888 ffffffff81188810 ffffffff8118b530 0000000000000000 [1005261.612989] ffffffffffffff00 000080d000000000 ffff8806f51c7858 000000003ea4c0ae [1005261.613480] Call Trace: [1005261.613719] [] dump_stack+0x19/0x1b [1005261.613966] [] warn_alloc_failed+0x110/0x180 [1005261.614205] [] ? drain_pages+0xb0/0xb0 [1005261.614481] [] __alloc_pages_slowpath+0x6b6/0x724 [1005261.614742] [] __alloc_pages_nodemask+0x405/0x420 [1005261.615006] [] alloc_pages_current+0x98/0x110 [1005261.615248] [] __get_free_pages+0xe/0x40 [1005261.615528] [] swiotlb_alloc_coherent+0x5e/0x150 [1005261.615774] [] x86_swiotlb_alloc_coherent+0x41/0x50 [1005261.616044] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005261.616562] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005261.616812] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005261.617076] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005261.617638] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005261.617892] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005261.618139] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005261.618433] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005261.618923] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005261.619444] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005261.619688] [] process_one_work+0x17a/0x440 [1005261.619949] [] worker_thread+0x126/0x3c0 [1005261.620220] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005261.620507] [] kthread+0xcf/0xe0 [1005261.620747] [] ? insert_kthread_work+0x40/0x40 [1005261.621007] [] ret_from_fork+0x58/0x90 [1005261.621283] [] ? 
insert_kthread_work+0x40/0x40 [1005261.621534] Mem-Info: [1005261.621778] active_anon:952139 inactive_anon:304827 isolated_anon:0 active_file:2754986 inactive_file:40121055 isolated_file:0 unevictable:25295 dirty:190 writeback:0 unstable:0 slab_reclaimable:1226196 slab_unreclaimable:941750 mapped:6429 shmem:98898 pagetables:4551 bounce:0 free:12154310 free_pcp:61 free_cma:0 [1005261.623211] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005261.624768] lowmem_reserve[]: 0 1554 128505 128505 [1005261.625057] Node 0 DMA32 free:521400kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138104kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:3280 all_unreclaimable? no [1005261.626892] lowmem_reserve[]: 0 0 126950 126950 [1005261.627152] Node 0 Normal free:22429676kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341648kB inactive_anon:697120kB active_file:6188076kB inactive_file:80468668kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:208kB writeback:0kB mapped:19944kB shmem:269668kB slab_reclaimable:1841784kB slab_unreclaimable:1714352kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005261.629308] lowmem_reserve[]: 0 0 0 0 [1005261.629583] Node 1 Normal free:25651556kB min:1050512kB low:1313140kB high:1575768kB active_anon:423872kB inactive_anon:471428kB active_file:4831036kB inactive_file:80014720kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:552kB writeback:0kB mapped:5764kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1914544kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:856kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005261.631684] lowmem_reserve[]: 0 0 0 0 [1005261.631988] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005261.632619] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 425*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521400kB [1005261.633471] Node 0 Normal: 523641*4kB (UEM) 1385771*8kB (UEM) 575893*16kB (UEM) 1087*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22429804kB [1005261.634312] Node 1 Normal: 502273*4kB (UEM) 1927327*8kB (UEM) 502452*16kB (UEM) 5651*32kB (UEM) 26*64kB (UM) 5*128kB (M) 1*256kB (U) 0*512kB 1*1024kB (U) 0*2048kB 0*4096kB = 25651356kB [1005261.635177] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005261.635697] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005261.636196] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005261.636669] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005261.637194] 42978664 total pagecache pages [1005261.637439] 2015 pages in swap cache [1005261.637764] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005261.638004] Free swap = 3189476kB [1005261.638234] Total swap = 4194300kB [1005261.638495] 67052113 pages RAM [1005261.638725] 0 pages HighMem/MovableOnly [1005261.638995] 1126685 pages reserved [1005263.577126] warn_alloc_failed: 2 callbacks suppressed [1005263.577378] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005263.577627] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005263.578113] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005263.578594] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005263.578845] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1 [1005263.579326] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff [1005263.579817] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae [1005263.580307] Call Trace: [1005263.580550] [] dump_stack+0x19/0x1b [1005263.580801] [] warn_alloc_failed+0x110/0x180 [1005263.581046] [] __alloc_pages_slowpath+0x6b6/0x724 [1005263.581292] [] __alloc_pages_nodemask+0x405/0x420 [1005263.581540] [] dma_generic_alloc_coherent+0x8f/0x140 [1005263.581794] [] x86_swiotlb_alloc_coherent+0x21/0x50 [1005263.582055] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005263.582534] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005263.582790] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005263.583042] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005263.583512] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005263.583768] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005263.584014] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005263.584271] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005263.584759] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005263.585232] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005263.585487] [] process_one_work+0x17a/0x440 [1005263.585735] [] worker_thread+0x126/0x3c0 [1005263.585983] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005263.586228] [] kthread+0xcf/0xe0 [1005263.586470] [] ? insert_kthread_work+0x40/0x40 [1005263.586713] [] ret_from_fork+0x58/0x90 [1005263.592398] [] ? 
insert_kthread_work+0x40/0x40 [1005263.592642] Mem-Info: [1005263.592893] active_anon:957091 inactive_anon:304833 isolated_anon:0 active_file:2755004 inactive_file:40123570 isolated_file:0 unevictable:25295 dirty:179 writeback:4 unstable:0 slab_reclaimable:1226197 slab_unreclaimable:941835 mapped:6930 shmem:98922 pagetables:4796 bounce:0 free:12144641 free_pcp:290 free_cma:0 [1005263.594332] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005263.595792] lowmem_reserve[]: 0 1554 128505 128505 [1005263.596059] Node 0 DMA32 free:521400kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138104kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:4100 all_unreclaimable? no [1005263.597751] lowmem_reserve[]: 0 0 126950 126950 [1005263.598012] Node 0 Normal free:22417720kB min:1033836kB low:1292292kB high:1550752kB active_anon:3347160kB inactive_anon:697144kB active_file:6188076kB inactive_file:80471936kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:188kB writeback:0kB mapped:20048kB shmem:269764kB slab_reclaimable:1841788kB slab_unreclaimable:1713356kB kernel_stack:40928kB pagetables:11040kB unstable:0kB bounce:0kB free_pcp:460kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005263.599928] lowmem_reserve[]: 0 0 0 0 [1005263.600189] Node 1 Normal free:25623596kB min:1050512kB low:1313140kB high:1575768kB active_anon:439680kB inactive_anon:471428kB active_file:4831108kB inactive_file:80021544kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:528kB writeback:12kB mapped:7664kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1915752kB kernel_stack:9104kB pagetables:8028kB unstable:0kB bounce:0kB free_pcp:1200kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005263.602190] lowmem_reserve[]: 0 0 0 0 [1005263.602449] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005263.603015] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 425*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521400kB [1005263.603824] Node 0 Normal: 522767*4kB (UE) 1385811*8kB (UEM) 575894*16kB (UEM) 884*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22420148kB [1005263.604600] Node 1 Normal: 500269*4kB (UE) 1927114*8kB (UE) 502262*16kB (UE) 5340*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 25625060kB [1005263.605389] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005263.605865] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005263.606338] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005263.606813] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005263.607289] 42981176 total pagecache pages [1005263.607528] 2015 pages in swap cache [1005263.607776] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005263.608021] Free swap = 3189476kB [1005263.608260] Total swap = 4194300kB [1005263.608497] 67052113 pages RAM [1005263.608736] 0 pages HighMem/MovableOnly [1005263.608976] 1126685 pages reserved [1005263.609361] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005263.609607] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005263.610090] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005263.610569] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005263.610822] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1 [1005263.611304] ffff8806f51c7888 ffffffff81188810 ffffffff8118b530 0000000000000000 [1005263.611790] ffffffffffffff00 000080d000000000 ffff8806f51c7858 000000003ea4c0ae [1005263.612273] Call Trace: [1005263.612514] [] dump_stack+0x19/0x1b [1005263.612763] [] warn_alloc_failed+0x110/0x180 [1005263.613006] [] ? drain_pages+0xb0/0xb0 [1005263.613251] [] __alloc_pages_slowpath+0x6b6/0x724 [1005263.613503] [] __alloc_pages_nodemask+0x405/0x420 [1005263.613754] [] alloc_pages_current+0x98/0x110 [1005263.614001] [] __get_free_pages+0xe/0x40 [1005263.614250] [] swiotlb_alloc_coherent+0x5e/0x150 [1005263.614497] [] x86_swiotlb_alloc_coherent+0x41/0x50 [1005263.614759] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005263.615233] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005263.615484] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005263.615735] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005263.616202] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005263.616546] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005263.616791] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005263.617041] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005263.617508] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005263.617985] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005263.618237] [] process_one_work+0x17a/0x440 [1005263.618480] [] worker_thread+0x126/0x3c0 [1005263.618724] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005263.618973] [] kthread+0xcf/0xe0 [1005263.619212] [] ? insert_kthread_work+0x40/0x40 [1005263.619458] [] ret_from_fork+0x58/0x90 [1005263.619701] [] ? 
insert_kthread_work+0x40/0x40 [1005263.619947] Mem-Info: [1005263.620191] active_anon:958477 inactive_anon:304833 isolated_anon:0 active_file:2755004 inactive_file:40123578 isolated_file:0 unevictable:25295 dirty:179 writeback:3 unstable:0 slab_reclaimable:1226197 slab_unreclaimable:941803 mapped:6930 shmem:98922 pagetables:4796 bounce:0 free:12143477 free_pcp:177 free_cma:0 [1005263.621650] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005263.623104] lowmem_reserve[]: 0 1554 128505 128505 [1005263.623367] Node 0 DMA32 free:521400kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138104kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:4510 all_unreclaimable? no [1005263.625032] lowmem_reserve[]: 0 0 126950 126950 [1005263.625291] Node 0 Normal free:22416032kB min:1033836kB low:1292292kB high:1550752kB active_anon:3349176kB inactive_anon:697144kB active_file:6188076kB inactive_file:80471936kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:188kB writeback:0kB mapped:20048kB shmem:269764kB slab_reclaimable:1841788kB slab_unreclaimable:1713356kB kernel_stack:40928kB pagetables:11040kB unstable:0kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005263.627200] lowmem_reserve[]: 0 0 0 0 [1005263.627461] Node 1 Normal free:25620504kB min:1050512kB low:1313140kB high:1575768kB active_anon:442704kB inactive_anon:471428kB active_file:4831108kB inactive_file:80021544kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:528kB writeback:12kB mapped:7664kB shmem:125924kB slab_reclaimable:2401496kB slab_unreclaimable:1915752kB kernel_stack:9104kB pagetables:8028kB unstable:0kB bounce:0kB free_pcp:1296kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005263.629372] lowmem_reserve[]: 0 0 0 0 [1005263.629632] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005263.630210] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 425*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521400kB [1005263.631115] Node 0 Normal: 522772*4kB (UEM) 1385816*8kB (UEM) 575899*16kB (UE) 824*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22418368kB [1005263.631893] Node 1 Normal: 500173*4kB (UEM) 1927120*8kB (UEM) 502261*16kB (UEM) 5238*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 25621444kB [1005263.632676] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005263.633157] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005263.633624] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005263.634101] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005263.634577] 42981176 total pagecache pages [1005263.634821] 2015 pages in swap cache [1005263.635056] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005263.635299] Free swap = 3189476kB [1005263.635536] Total swap = 4194300kB [1005263.635781] 67052113 pages RAM [1005263.636021] 0 pages HighMem/MovableOnly [1005263.636266] 1126685 pages reserved [1005264.577502] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005264.577758] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005264.583760] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005264.584250] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005264.584633] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1 [1005264.585116] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff [1005264.585653] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae [1005264.586144] Call Trace: [1005264.586390] [] dump_stack+0x19/0x1b [1005264.586688] [] warn_alloc_failed+0x110/0x180 [1005264.586981] [] __alloc_pages_slowpath+0x6b6/0x724 [1005264.587239] [] __alloc_pages_nodemask+0x405/0x420 [1005264.587496] [] dma_generic_alloc_coherent+0x8f/0x140 [1005264.587766] [] x86_swiotlb_alloc_coherent+0x21/0x50 [1005264.588085] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005264.588563] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005264.588820] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005264.589116] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005264.589676] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005264.589931] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005264.590180] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005264.590456] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005264.590998] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005264.591505] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005264.591758] [] process_one_work+0x17a/0x440 [1005264.591999] [] worker_thread+0x126/0x3c0 [1005264.592285] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005264.592589] [] kthread+0xcf/0xe0 [1005264.592841] [] ? insert_kthread_work+0x40/0x40 [1005264.593092] [] ret_from_fork+0x58/0x90 [1005264.593337] [] ? 
insert_kthread_work+0x40/0x40 [1005264.593604] Mem-Info: [1005264.593895] active_anon:952161 inactive_anon:304827 isolated_anon:0 active_file:2755005 inactive_file:40124791 isolated_file:0 unevictable:25295 dirty:169 writeback:14 unstable:0 slab_reclaimable:1226207 slab_unreclaimable:943636 mapped:6456 shmem:98898 pagetables:4551 bounce:0 free:12146653 free_pcp:62 free_cma:0 [1005264.595449] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005264.596992] lowmem_reserve[]: 0 1554 128505 128505 [1005264.597289] Node 0 DMA32 free:521432kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138072kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:410 all_unreclaimable? no [1005264.599175] lowmem_reserve[]: 0 0 126950 126950 [1005264.599433] Node 0 Normal free:22413624kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341716kB inactive_anon:697116kB active_file:6188076kB inactive_file:80476128kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:172kB writeback:16kB mapped:20052kB shmem:269668kB slab_reclaimable:1841796kB slab_unreclaimable:1718384kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:124kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005264.601495] lowmem_reserve[]: 0 0 0 0 [1005264.601760] Node 1 Normal free:25637460kB min:1050512kB low:1313140kB high:1575768kB active_anon:423892kB inactive_anon:471432kB active_file:4831112kB inactive_file:80022204kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:504kB writeback:40kB mapped:5764kB shmem:125924kB slab_reclaimable:2401528kB slab_unreclaimable:1917576kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:864kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005264.603791] lowmem_reserve[]: 0 0 0 0 [1005264.604057] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005264.604634] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 426*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521432kB [1005264.605579] Node 0 Normal: 521916*4kB (UEM) 1385878*8kB (UEM) 575794*16kB (UEM) 974*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22418560kB [1005264.606370] Node 1 Normal: 500716*4kB (UEM) 1927413*8kB (UEM) 502427*16kB (UEM) 5521*32kB (UEM) 27*64kB (UM) 4*128kB (M) 1*256kB (M) 0*512kB 1*1024kB (M) 0*2048kB 0*4096kB = 25641192kB [1005264.607264] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005264.607763] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005264.608243] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005264.608799] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005264.609303] 42982343 total pagecache pages [1005264.609551] 2015 pages in swap cache [1005264.609793] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005264.610041] Free swap = 3189476kB [1005264.610289] Total swap = 4194300kB [1005264.610572] 67052113 pages RAM [1005264.610852] 0 pages HighMem/MovableOnly [1005264.611092] 1126685 pages reserved [1005264.611470] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005264.611758] CPU: 9 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005264.612276] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005264.612795] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005264.613136] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1 [1005264.613657] ffff8806f51c7888 ffffffff81188810 ffffffff8118b530 0000000000000000 [1005264.614206] ffffffffffffff00 000080d000000000 ffff8806f51c7858 000000003ea4c0ae [1005264.614730] Call Trace: [1005264.614977] [] dump_stack+0x19/0x1b [1005264.615227] [] warn_alloc_failed+0x110/0x180 [1005264.615526] [] ? drain_pages+0xb0/0xb0 [1005264.615793] [] __alloc_pages_slowpath+0x6b6/0x724 [1005264.616042] [] __alloc_pages_nodemask+0x405/0x420 [1005264.616288] [] alloc_pages_current+0x98/0x110 [1005264.616541] [] __get_free_pages+0xe/0x40 [1005264.616811] [] swiotlb_alloc_coherent+0x5e/0x150 [1005264.617061] [] x86_swiotlb_alloc_coherent+0x41/0x50 [1005264.617346] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005264.617830] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005264.618101] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005264.618387] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005264.618871] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005264.619128] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005264.619422] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005264.619683] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005264.620196] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005264.620689] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005264.620977] [] process_one_work+0x17a/0x440 [1005264.621222] [] worker_thread+0x126/0x3c0 [1005264.621466] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005264.621736] [] kthread+0xcf/0xe0 [1005264.622025] [] ? insert_kthread_work+0x40/0x40 [1005264.622277] [] ret_from_fork+0x58/0x90 [1005264.622553] [] ? 
insert_kthread_work+0x40/0x40 [1005264.622800] Mem-Info: [1005264.623047] active_anon:952161 inactive_anon:304827 isolated_anon:0 active_file:2755005 inactive_file:40124791 isolated_file:0 unevictable:25295 dirty:169 writeback:14 unstable:0 slab_reclaimable:1226207 slab_unreclaimable:943124 mapped:6456 shmem:98898 pagetables:4551 bounce:0 free:12147419 free_pcp:129 free_cma:0 [1005264.624611] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005264.626236] lowmem_reserve[]: 0 1554 128505 128505 [1005264.626545] Node 0 DMA32 free:521432kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138072kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:820 all_unreclaimable? no [1005264.628412] lowmem_reserve[]: 0 0 126950 126950 [1005264.628697] Node 0 Normal free:22416692kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341716kB inactive_anon:697116kB active_file:6188076kB inactive_file:80476128kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:172kB writeback:16kB mapped:20052kB shmem:269668kB slab_reclaimable:1841796kB slab_unreclaimable:1716336kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:248kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005264.630692] lowmem_reserve[]: 0 0 0 0 [1005264.630952] Node 1 Normal free:25637860kB min:1050512kB low:1313140kB high:1575768kB active_anon:423892kB inactive_anon:471432kB active_file:4831112kB inactive_file:80022204kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:504kB writeback:40kB mapped:5764kB shmem:125924kB slab_reclaimable:2401528kB slab_unreclaimable:1917576kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:1132kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005264.638850] lowmem_reserve[]: 0 0 0 0 [1005264.639123] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005264.639821] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 426*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521432kB [1005264.640653] Node 0 Normal: 521923*4kB (UE) 1385877*8kB (UE) 575938*16kB (UEM) 988*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22421332kB [1005264.641583] Node 1 Normal: 500712*4kB (UEM) 1927381*8kB (UEM) 502466*16kB (UEM) 5525*32kB (UEM) 27*64kB (UM) 4*128kB (M) 1*256kB (M) 0*512kB 1*1024kB (M) 0*2048kB 0*4096kB = 25641672kB [1005264.642384] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005264.642864] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005264.643419] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005264.643930] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005264.644402] 42982343 total pagecache pages [1005264.644642] 2015 pages in swap cache [1005264.644927] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005264.645179] Free swap = 3189476kB [1005264.645414] Total swap = 4194300kB [1005264.645649] 67052113 pages RAM [1005264.645934] 0 pages HighMem/MovableOnly [1005264.646206] 1126685 pages reserved [1005265.576794] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005265.577083] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005265.577597] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005265.578115] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005265.578364] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1 [1005265.578857] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff [1005265.579347] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae [1005265.579853] Call Trace: [1005265.580100] [] dump_stack+0x19/0x1b [1005265.580349] [] warn_alloc_failed+0x110/0x180 [1005265.580593] [] __alloc_pages_slowpath+0x6b6/0x724 [1005265.580838] [] __alloc_pages_nodemask+0x405/0x420 [1005265.581179] [] dma_generic_alloc_coherent+0x8f/0x140 [1005265.581423] [] x86_swiotlb_alloc_coherent+0x21/0x50 [1005265.581681] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005265.582155] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005265.582408] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005265.582661] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005265.583146] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005265.583399] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005265.583652] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005265.583905] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005265.584385] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005265.584865] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005265.585116] [] process_one_work+0x17a/0x440 [1005265.585363] [] worker_thread+0x126/0x3c0 [1005265.585609] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005265.585863] [] kthread+0xcf/0xe0 [1005265.586105] [] ? insert_kthread_work+0x40/0x40 [1005265.586357] [] ret_from_fork+0x58/0x90 [1005265.586602] [] ? 
insert_kthread_work+0x40/0x40 [1005265.586848] Mem-Info: [1005265.587092] active_anon:952161 inactive_anon:304827 isolated_anon:0 active_file:2755005 inactive_file:40125151 isolated_file:0 unevictable:25295 dirty:173 writeback:0 unstable:0 slab_reclaimable:1226207 slab_unreclaimable:941699 mapped:6470 shmem:98898 pagetables:4551 bounce:0 free:12149615 free_pcp:61 free_cma:0 [1005265.588538] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1005265.590008] lowmem_reserve[]: 0 1554 128505 128505 [1005265.590275] Node 0 DMA32 free:521464kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138040kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:410 all_unreclaimable? no [1005265.592054] lowmem_reserve[]: 0 0 126950 126950 [1005265.592317] Node 0 Normal free:22419432kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341716kB inactive_anon:697116kB active_file:6188076kB inactive_file:80476128kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:172kB writeback:0kB mapped:20108kB shmem:269668kB slab_reclaimable:1841796kB slab_unreclaimable:1715868kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1005265.594347] lowmem_reserve[]: 0 0 0 0 [1005265.594659] Node 1 Normal free:25643164kB min:1050512kB low:1313140kB high:1575768kB active_anon:423892kB inactive_anon:471432kB active_file:4831112kB inactive_file:80023644kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:520kB writeback:0kB mapped:5764kB shmem:125924kB slab_reclaimable:2401528kB slab_unreclaimable:1912712kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:860kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1005265.596762] lowmem_reserve[]: 0 0 0 0 [1005265.597022] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1005265.597694] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 427*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521464kB [1005265.598572] Node 0 Normal: 521917*4kB (UE) 1385791*8kB (UE) 575912*16kB (UE) 1001*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22420620kB [1005265.599423] Node 1 Normal: 500301*4kB (UEM) 1927387*8kB (UEM) 502509*16kB (UEM) 5621*32kB (UEM) 26*64kB (UM) 5*128kB (M) 1*256kB (M) 0*512kB 1*1024kB (M) 0*2048kB 0*4096kB = 25643900kB [1005265.600281] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005265.600795] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005265.601271] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1005265.601785] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1005265.602294] 42982708 total pagecache pages [1005265.602535] 2015 pages in swap cache [1005265.602836] Swap cache stats: add 254803, delete 252788, find 1706/2041 [1005265.603085] Free swap = 3189476kB [1005265.603318] Total swap = 4194300kB [1005265.603591] 67052113 pages RAM [1005265.603846] 0 pages HighMem/MovableOnly [1005265.604084] 1126685 pages reserved [1005265.604479] kworker/u768:2: page allocation failure: order:8, mode:0x80d0 [1005265.604736] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1005265.605250] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1005265.605763] Workqueue: rdma_cm cma_work_handler [rdma_cm] [1005265.606013] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1 [1005265.606531] ffff8806f51c7888 ffffffff81188810 ffffffff8118b530 0000000000000000 [1005265.607030] ffffffffffffff00 000080d000000000 ffff8806f51c7858 000000003ea4c0ae [1005265.607541] Call Trace: [1005265.607804] [] dump_stack+0x19/0x1b [1005265.608054] [] warn_alloc_failed+0x110/0x180 [1005265.608306] [] ? drain_pages+0xb0/0xb0 [1005265.608576] [] __alloc_pages_slowpath+0x6b6/0x724 [1005265.608827] [] __alloc_pages_nodemask+0x405/0x420 [1005265.609072] [] alloc_pages_current+0x98/0x110 [1005265.609332] [] __get_free_pages+0xe/0x40 [1005265.609687] [] swiotlb_alloc_coherent+0x5e/0x150 [1005265.609936] [] x86_swiotlb_alloc_coherent+0x41/0x50 [1005265.610221] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core] [1005265.610694] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core] [1005265.610948] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core] [1005265.611216] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib] [1005265.611710] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib] [1005265.611963] [] ib_create_qp+0x7a/0x2f0 [ib_core] [1005265.612227] [] rdma_create_qp+0x34/0xb0 [rdma_cm] [1005265.612476] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd] [1005265.612973] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd] [1005265.613465] [] cma_work_handler+0x6c/0xa0 [rdma_cm] [1005265.613989] [] process_one_work+0x17a/0x440 [1005265.614233] [] worker_thread+0x126/0x3c0 [1005265.620300] [] ? manage_workers.isra.24+0x2a0/0x2a0 [1005265.620583] [] kthread+0xcf/0xe0 [1005265.621374] [] ? insert_kthread_work+0x40/0x40 [1005265.621670] [] ret_from_fork+0x58/0x90 [1005265.621920] [] ? 
[1005265.622444] Mem-Info:
[1005265.622708] active_anon:952161 inactive_anon:304827 isolated_anon:0 active_file:2755005 inactive_file:40125151 isolated_file:0 unevictable:25295 dirty:173 writeback:0 unstable:0 slab_reclaimable:1226207 slab_unreclaimable:941655 mapped:6470 shmem:98898 pagetables:4551 bounce:0 free:12149667 free_pcp:77 free_cma:0
[1005265.624319] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1005265.625888] lowmem_reserve[]: 0 1554 128505 128505
[1005265.626187] Node 0 DMA32 free:521464kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138040kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:820 all_unreclaimable? no
[1005265.628504] lowmem_reserve[]: 0 0 126950 126950
[1005265.628771] Node 0 Normal free:22419816kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341716kB inactive_anon:697116kB active_file:6188076kB inactive_file:80476128kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:172kB writeback:0kB mapped:20108kB shmem:269668kB slab_reclaimable:1841796kB slab_unreclaimable:1715724kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1005265.630846] lowmem_reserve[]: 0 0 0 0
[1005265.631126] Node 1 Normal free:25643380kB min:1050512kB low:1313140kB high:1575768kB active_anon:424396kB inactive_anon:471432kB active_file:4831112kB inactive_file:80023644kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:520kB writeback:0kB mapped:5764kB shmem:125924kB slab_reclaimable:2401528kB slab_unreclaimable:1912712kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:496kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1005265.633399] lowmem_reserve[]: 0 0 0 0
[1005265.633936] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB
[1005265.634607] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 427*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521464kB
[1005265.635498] Node 0 Normal: 521917*4kB (UE) 1385791*8kB (UE) 575914*16kB (UE) 1001*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22420652kB
[1005265.636609] Node 1 Normal: 500183*4kB (UEM) 1927381*8kB (UEM) 502502*16kB (UEM) 5621*32kB (UEM) 26*64kB (UM) 5*128kB (M) 1*256kB (M) 0*512kB 1*1024kB (M) 0*2048kB 0*4096kB = 25643268kB
[1005265.637472] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1005265.638040] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1005265.638506] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1005265.638977] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1005265.639463] 42982708 total pagecache pages
[1005265.639946] 2015 pages in swap cache
[1005265.640187] Swap cache stats: add 254803, delete 252788, find 1706/2041
[1005265.640475] Free swap  = 3189476kB
[1005265.640715] Total swap = 4194300kB
[1005265.640950] 67052113 pages RAM
[1005265.641199] 0 pages HighMem/MovableOnly
[1005265.641437] 1126685 pages reserved
[1005266.576826] kworker/u768:2: page allocation failure: order:8, mode:0x80d0
[1005266.577077] CPU: 33 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1
[1005266.577690] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017
[1005266.578165] Workqueue: rdma_cm cma_work_handler [rdma_cm]
[1005266.578408] 00000000000080d0 000000003ea4c0ae ffff8806f51c7848 ffffffff816a3db1
[1005266.578902] ffff8806f51c78d8 ffffffff81188810 0000000000000000 00000000ffffffff
[1005266.579431] ffffffffffffff00 000080d000000000 ffff8806f51c78a8 000000003ea4c0ae
[1005266.579925] Call Trace:
[1005266.580224] [] dump_stack+0x19/0x1b
[1005266.580471] [] warn_alloc_failed+0x110/0x180
[1005266.580751] [] __alloc_pages_slowpath+0x6b6/0x724
[1005266.580997] [] __alloc_pages_nodemask+0x405/0x420
[1005266.581244] [] dma_generic_alloc_coherent+0x8f/0x140
[1005266.581511] [] x86_swiotlb_alloc_coherent+0x21/0x50
[1005266.581784] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core]
[1005266.582282] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core]
[1005266.582581] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core]
[1005266.582834] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib]
[1005266.583354] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib]
[1005266.583643] [] ib_create_qp+0x7a/0x2f0 [ib_core]
[1005266.583896] [] rdma_create_qp+0x34/0xb0 [rdma_cm]
[1005266.584165] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd]
[1005266.584666] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd]
[1005266.585138] [] cma_work_handler+0x6c/0xa0 [rdma_cm]
[1005266.585403] [] process_one_work+0x17a/0x440
[1005266.585706] [] worker_thread+0x126/0x3c0
[1005266.585949] [] ? manage_workers.isra.24+0x2a0/0x2a0
[1005266.586194] [] kthread+0xcf/0xe0
[1005266.586455] [] ? insert_kthread_work+0x40/0x40
[1005266.586719] [] ret_from_fork+0x58/0x90
[1005266.586960] [] ? insert_kthread_work+0x40/0x40
[1005266.587204] Mem-Info:
[1005266.587479] active_anon:952161 inactive_anon:304827 isolated_anon:0 active_file:2755005 inactive_file:40125506 isolated_file:0 unevictable:25295 dirty:178 writeback:0 unstable:0 slab_reclaimable:1226207 slab_unreclaimable:941359 mapped:6483 shmem:98898 pagetables:4551 bounce:0 free:12149955 free_pcp:61 free_cma:0
[1005266.588927] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1005266.590417] lowmem_reserve[]: 0 1554 128505 128505
[1005266.590680] Node 0 DMA32 free:521464kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138040kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1230 all_unreclaimable? no
[1005266.592456] lowmem_reserve[]: 0 0 126950 126950
[1005266.592716] Node 0 Normal free:22420612kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341716kB inactive_anon:697116kB active_file:6188076kB inactive_file:80476576kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:172kB writeback:0kB mapped:20160kB shmem:269668kB slab_reclaimable:1841796kB slab_unreclaimable:1715060kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1005266.594628] lowmem_reserve[]: 0 0 0 0
[1005266.594892] Node 1 Normal free:25643084kB min:1050512kB low:1313140kB high:1575768kB active_anon:423892kB inactive_anon:471432kB active_file:4831112kB inactive_file:80024616kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:540kB writeback:0kB mapped:5764kB shmem:125924kB slab_reclaimable:2401528kB slab_unreclaimable:1912272kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:864kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1005266.596845] lowmem_reserve[]: 0 0 0 0
[1005266.597104] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB
[1005266.597686] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 427*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521464kB
[1005266.598527] Node 0 Normal: 521797*4kB (UE) 1385782*8kB (UE) 575937*16kB (UE) 1002*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22420500kB
[1005266.599308] Node 1 Normal: 500039*4kB (UEM) 1927361*8kB (UEM) 502508*16kB (UEM) 5641*32kB (UEM) 26*64kB (UM) 5*128kB (M) 1*256kB (M) 0*512kB 1*1024kB (M) 0*2048kB 0*4096kB = 25643268kB
[1005266.600125] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1005266.600602] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1005266.601071] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1005266.601543] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1005266.602016] 42983062 total pagecache pages
[1005266.602251] 2015 pages in swap cache
[1005266.602487] Swap cache stats: add 254803, delete 252788, find 1706/2041
[1005266.602734] Free swap  = 3189476kB
[1005266.602971] Total swap = 4194300kB
[1005266.603210] 67052113 pages RAM
[1005266.603449] 0 pages HighMem/MovableOnly
[1005266.603689] 1126685 pages reserved
[1005266.604033] kworker/u768:2: page allocation failure: order:8, mode:0x80d0
[1005266.609706] CPU: 9 PID: 393099 Comm: kworker/u768:2 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1
[1005266.610187] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017
[1005266.610675] Workqueue: rdma_cm cma_work_handler [rdma_cm]
[1005266.610922] 00000000000080d0 000000003ea4c0ae ffff8806f51c77f8 ffffffff816a3db1
[1005266.611404] ffff8806f51c7888 ffffffff81188810 ffffffff8118b530 0000000000000000
[1005266.611900] ffffffffffffff00 000080d000000000 ffff8806f51c7858 000000003ea4c0ae
[1005266.612388] Call Trace:
[1005266.612649] [] dump_stack+0x19/0x1b
[1005266.612898] [] warn_alloc_failed+0x110/0x180
[1005266.613142] [] ? drain_pages+0xb0/0xb0
[1005266.613386] [] __alloc_pages_slowpath+0x6b6/0x724
[1005266.613650] [] __alloc_pages_nodemask+0x405/0x420
[1005266.613908] [] alloc_pages_current+0x98/0x110
[1005266.614160] [] __get_free_pages+0xe/0x40
[1005266.614407] [] swiotlb_alloc_coherent+0x5e/0x150
[1005266.614667] [] x86_swiotlb_alloc_coherent+0x41/0x50
[1005266.614930] [] mlx4_buf_direct_alloc.isra.6+0xd3/0x1a0 [mlx4_core]
[1005266.615412] [] mlx4_buf_alloc+0x1cb/0x240 [mlx4_core]
[1005266.615670] [] ? __mlx4_cmd+0x560/0x920 [mlx4_core]
[1005266.615921] [] create_qp_common.isra.31+0x62e/0x10d0 [mlx4_ib]
[1005266.616400] [] mlx4_ib_create_qp+0x14e/0x480 [mlx4_ib]
[1005266.616662] [] ib_create_qp+0x7a/0x2f0 [ib_core]
[1005266.616910] [] rdma_create_qp+0x34/0xb0 [rdma_cm]
[1005266.617166] [] kiblnd_create_conn+0xbf9/0x1960 [ko2iblnd]
[1005266.617650] [] kiblnd_cm_callback+0x145f/0x2370 [ko2iblnd]
[1005266.618123] [] cma_work_handler+0x6c/0xa0 [rdma_cm]
[1005266.618376] [] process_one_work+0x17a/0x440
[1005266.618630] [] worker_thread+0x126/0x3c0
[1005266.618879] [] ? manage_workers.isra.24+0x2a0/0x2a0
[1005266.619152] [] kthread+0xcf/0xe0
[1005266.619391] [] ? insert_kthread_work+0x40/0x40
[1005266.619636] [] ret_from_fork+0x58/0x90
[1005266.619879] [] ? insert_kthread_work+0x40/0x40
[1005266.620115] Mem-Info:
[1005266.620448] active_anon:952161 inactive_anon:304827 isolated_anon:0 active_file:2755005 inactive_file:40125506 isolated_file:0 unevictable:25295 dirty:178 writeback:0 unstable:0 slab_reclaimable:1226207 slab_unreclaimable:942111 mapped:6483 shmem:98898 pagetables:4551 bounce:0 free:12149228 free_pcp:91 free_cma:0
[1005266.621858] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1005266.623312] lowmem_reserve[]: 0 1554 128505 128505
[1005266.623577] Node 0 DMA32 free:521464kB min:12672kB low:15840kB high:19008kB active_anon:43036kB inactive_anon:50760kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:661504kB slab_unreclaimable:138040kB kernel_stack:288kB pagetables:116kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1640 all_unreclaimable? no
[1005266.625293] lowmem_reserve[]: 0 0 126950 126950
[1005266.625560] Node 0 Normal free:22417540kB min:1033836kB low:1292292kB high:1550752kB active_anon:3341716kB inactive_anon:697116kB active_file:6188076kB inactive_file:80476576kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:172kB writeback:0kB mapped:20160kB shmem:269668kB slab_reclaimable:1841796kB slab_unreclaimable:1718132kB kernel_stack:40832kB pagetables:10788kB unstable:0kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1005266.627548] lowmem_reserve[]: 0 0 0 0
[1005266.627809] Node 1 Normal free:25643184kB min:1050512kB low:1313140kB high:1575768kB active_anon:423892kB inactive_anon:471432kB active_file:4831112kB inactive_file:80024616kB unevictable:4700kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4700kB dirty:540kB writeback:0kB mapped:5764kB shmem:125924kB slab_reclaimable:2401528kB slab_unreclaimable:1912272kB kernel_stack:9072kB pagetables:7300kB unstable:0kB bounce:0kB free_pcp:864kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1005266.629812] lowmem_reserve[]: 0 0 0 0
[1005266.630071] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB
[1005266.630656] Node 0 DMA32: 6130*4kB (UEM) 9092*8kB (UEM) 6415*16kB (UEM) 427*32kB (UEM) 1897*64kB (UEM) 823*128kB (UEM) 113*256kB (UEM) 24*512kB (EM) 19*1024kB (EM) 10*2048kB (EM) 0*4096kB = 521464kB
[1005266.631467] Node 0 Normal: 521798*4kB (UEM) 1385782*8kB (UE) 575770*16kB (UE) 998*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22417704kB
[1005266.632255] Node 1 Normal: 500045*4kB (UEM) 1927353*8kB (UEM) 502505*16kB (UEM) 5641*32kB (UEM) 26*64kB (UM) 5*128kB (M) 1*256kB (M) 0*512kB 1*1024kB (M) 0*2048kB 0*4096kB = 25643180kB
[1005266.633069] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1005266.633541] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1005266.634012] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1005266.634576] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1005266.635051] 42983062 total pagecache pages
[1005266.635286] 2015 pages in swap cache
[1005266.635519] Swap cache stats: add 254803, delete 252788, find 1706/2041
[1005266.635769] Free swap  = 3189476kB
[1005266.636015] Total swap = 4194300kB
[1005266.636255] 67052113 pages RAM
[1005266.636494] 0 pages HighMem/MovableOnly
[1005266.636734] 1126685 pages reserved
[1005284.377347] Lustre: oak-OST0030: Connection restored to oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5)
[1005284.377833] Lustre: Skipped 4309 previous similar messages
[1005301.908236] LustreError: 167-0: oak-MDT0000-lwp-OST0038: This client was evicted by oak-MDT0000; in progress operations using this service will fail.
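Editor's note: the failure repeats within about a second from the same worker (PID 393099), reaching the buddy allocator once through swiotlb_alloc_coherent and once through dma_generic_alloc_coherent; the Mem-Info snapshots barely change between retries, which is consistent with fragmentation rather than memory exhaustion. The "= NNNkB" figure ending each buddy line is simply the sum of count*size over the free lists, so the dumps can be sanity-checked mechanically; a minimal sketch (Python; illustrative only):

    import re

    def buddy_total_kb(buddy_line):
        """Recompute the total the kernel prints after '=' on a buddy line."""
        return sum(int(c) * int(s)
                   for c, s in re.findall(r"(\d+)\*(\d+)kB", buddy_line))

    line = ("Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) "
            "1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) "
            "2*4096kB (M) = 14104kB")
    print(buddy_total_kb(line))  # 14104, matching the kernel's own total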
[1005301.908724] LustreError: Skipped 8 previous similar messages
[1005304.953884] Lustre: oak-OST0030: deleting orphan objects from 0x0:4452672 to 0x0:4452705
[1005304.953885] Lustre: oak-OST0034: deleting orphan objects from 0x0:4271332 to 0x0:4271361
[1005304.953887] Lustre: oak-OST0032: deleting orphan objects from 0x0:4405299 to 0x0:4405377
[1005304.953894] Lustre: oak-OST0036: deleting orphan objects from 0x0:4231598 to 0x0:4231617
[1005304.953897] Lustre: oak-OST0038: deleting orphan objects from 0x0:4369237 to 0x0:4369281
[1005304.953898] Lustre: oak-OST003a: deleting orphan objects from 0x0:4391387 to 0x0:4391425
[1005304.953900] Lustre: oak-OST0042: deleting orphan objects from 0x0:4314088 to 0x0:4314113
[1005304.953901] Lustre: oak-OST003c: deleting orphan objects from 0x0:4417614 to 0x0:4417633
[1005304.953902] Lustre: oak-OST0040: deleting orphan objects from 0x0:4424602 to 0x0:4424641
[1005304.953907] Lustre: oak-OST003e: deleting orphan objects from 0x0:4270075 to 0x0:4270113
[1005304.953920] Lustre: oak-OST0044: deleting orphan objects from 0x0:4386297 to 0x0:4386337
[1005304.953937] Lustre: oak-OST004c: deleting orphan objects from 0x0:3434595 to 0x0:3434625
[1005304.953938] Lustre: oak-OST0046: deleting orphan objects from 0x0:4441589 to 0x0:4441633
[1005304.953943] Lustre: oak-OST0048: deleting orphan objects from 0x0:3420669 to 0x0:3420705
[1005304.953953] Lustre: oak-OST0052: deleting orphan objects from 0x0:149880 to 0x0:149921
[1005304.953959] Lustre: oak-OST004e: deleting orphan objects from 0x0:149808 to 0x0:149825
[1005304.953978] Lustre: oak-OST004a: deleting orphan objects from 0x0:3416790 to 0x0:3416833
[1005304.955735] Lustre: oak-OST0050: deleting orphan objects from 0x0:149131 to 0x0:149153
[1009695.129856] Lustre: oak-OST0046: haven't heard from client 57635f11-dca4-393d-e47e-90cf0165d384 (at 10.9.112.14@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ccbf35c00, cur 1519142333 expire 1519142183 last 1519142106
[1009695.130861] Lustre: Skipped 17 previous similar messages
[1010480.145098] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1010480.145099] Lustre: oak-OST0034: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1010480.145100] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1010480.145101] Lustre: Skipped 35 previous similar messages
[1010480.145104] Lustre: Skipped 35 previous similar messages
[1010480.146983] Lustre: Skipped 15 previous similar messages
[1010500.251978] Lustre: oak-OST0030: Connection restored to 52532599-8f6b-616f-14c5-a9132d0bb547 (at 10.9.113.2@o2ib4)
[1010500.251979] Lustre: oak-OST0032: Connection restored to 52532599-8f6b-616f-14c5-a9132d0bb547 (at 10.9.113.2@o2ib4)
[1010500.252960] Lustre: Skipped 15 previous similar messages
[1010565.946474] Lustre: oak-OST0030: Connection restored to 4e053b86-c200-827a-b73a-acc3fd95a691 (at 10.9.105.52@o2ib4)
[1010565.946961] Lustre: Skipped 5 previous similar messages
[1011644.969715] Lustre: oak-OST003c: Connection restored to 9881b47e-b3e2-e3bf-f7eb-349bb24f1449 (at 10.8.1.29@o2ib6)
[1011644.970192] Lustre: Skipped 25 previous similar messages
[1011646.043116] Lustre: oak-OST0036: haven't heard from client 08f57cac-fc9b-4c5f-e6ed-f579f1b1ba54 (at 10.9.101.30@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d2c641c00, cur 1519144284 expire 1519144134 last 1519144057
[1011646.049404] Lustre: Skipped 17 previous similar messages
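Editor's note: each eviction line carries three epoch timestamps: cur (now), last (when the client was last heard from), and expire (the deadline after which the export is evicted); the "in N seconds" figure is cur - last. A minimal sketch recomputing it from the oak-OST0046 line above (Python; field meanings as read from the message format itself, not from Lustre source):

    import re

    msg = ("Lustre: oak-OST0046: haven't heard from client "
           "57635f11-dca4-393d-e47e-90cf0165d384 (at 10.9.112.14@o2ib4) in 227 "
           "seconds. I think it's dead, and I am evicting it. "
           "exp ffff883ccbf35c00, cur 1519142333 expire 1519142183 last 1519142106")

    cur, expire, last = (int(re.search(name + r" (\d+)", msg).group(1))
                         for name in ("cur", "expire", "last"))
    print(cur - last)    # 227, the "haven't heard ... in N seconds" figure
    print(cur > expire)  # True: the deadline has passed, so the client is evicted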
[1011744.009659] Lustre: oak-OST0030: Connection restored to 08f57cac-fc9b-4c5f-e6ed-f579f1b1ba54 (at 10.9.101.30@o2ib4)
[1012032.022944] Lustre: oak-OST0048: haven't heard from client e4371391-ba6d-e8da-534c-1528869744eb (at 10.9.101.30@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883c8a8b2800, cur 1519144670 expire 1519144520 last 1519144443
[1012032.023932] Lustre: Skipped 17 previous similar messages
[1012475.243114] Lustre: oak-OST0038: Connection restored to 57635f11-dca4-393d-e47e-90cf0165d384 (at 10.9.112.14@o2ib4)
[1012475.243115] Lustre: oak-OST0034: Connection restored to 57635f11-dca4-393d-e47e-90cf0165d384 (at 10.9.112.14@o2ib4)
[1012475.243116] Lustre: oak-OST0036: Connection restored to 57635f11-dca4-393d-e47e-90cf0165d384 (at 10.9.112.14@o2ib4)
[1012475.243117] Lustre: oak-OST0030: Connection restored to 57635f11-dca4-393d-e47e-90cf0165d384 (at 10.9.112.14@o2ib4)
[1012475.243119] Lustre: oak-OST0032: Connection restored to 57635f11-dca4-393d-e47e-90cf0165d384 (at 10.9.112.14@o2ib4)
[1012475.243119] Lustre: Skipped 17 previous similar messages
[1012475.243120] Lustre: Skipped 17 previous similar messages
[1012475.243121] Lustre: Skipped 17 previous similar messages
[1012475.243124] Lustre: Skipped 17 previous similar messages
[1012475.247189] Lustre: Skipped 12 previous similar messages
[1012990.552753] Lustre: oak-OST0030: Connection restored to 08f57cac-fc9b-4c5f-e6ed-f579f1b1ba54 (at 10.9.101.30@o2ib4)
[1012990.553234] Lustre: Skipped 4 previous similar messages
[1023996.468799] Lustre: oak-OST0044: haven't heard from client 6acb7b42-ef3c-8004-44df-ce2e24f2a8b7 (at 10.210.46.179@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ca6546800, cur 1519156635 expire 1519156485 last 1519156408
[1023996.469785] Lustre: Skipped 17 previous similar messages
[1024583.779837] ses 1:0:183:0: attempting task abort! scmd(ffff8833e15fa840)
[1024583.780085] ses 1:0:183:0: [sg183] CDB: Receive Diagnostic 1c 01 02 ff ff 00
[1024583.780551] scsi target1:0:183: handle(0x00c8), sas_address(0x5001636001ab917d), phy(76)
[1024583.781021] scsi target1:0:183: enclosure_logical_id(0x5001636001ab917d), slot(60)
[1024583.781492] scsi target1:0:183: enclosure level(0x0001),connector name( )
[1024583.786505] ses 1:0:183:0: task abort: SUCCESS scmd(ffff8833e15fa840)
[1024618.440980] Lustre: oak-OST0044: haven't heard from client f1c4b7ea-8d6b-ef45-31f7-c692353edfd8 (at 10.210.47.22@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ca8187000, cur 1519157257 expire 1519157107 last 1519157030
[1024618.441917] Lustre: Skipped 107 previous similar messages
[1024795.442110] Lustre: oak-OST0030: haven't heard from client a3c9114f-d6c2-2dc0-97e9-febf180cf573 (at 10.210.47.6@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881da11fec00, cur 1519157434 expire 1519157284 last 1519157207
[1024795.443057] Lustre: Skipped 35 previous similar messages
[1026574.546690] Lustre: oak-OST0030: Connection restored to c751babb-05fb-9dff-3b03-73aab59fcca5 (at 10.210.46.9@o2ib3)
[1026574.547179] Lustre: Skipped 15 previous similar messages
[1026644.159621] Lustre: oak-OST0038: Connection restored to 88b081c7-2b39-cdaa-bffa-54fbd9ac445c (at 10.210.44.130@o2ib3)
[1026644.159622] Lustre: oak-OST0034: Connection restored to 88b081c7-2b39-cdaa-bffa-54fbd9ac445c (at 10.210.44.130@o2ib3)
[1026644.159623] Lustre: oak-OST0032: Connection restored to 88b081c7-2b39-cdaa-bffa-54fbd9ac445c (at 10.210.44.130@o2ib3)
[1026644.159624] Lustre: oak-OST0036: Connection restored to 88b081c7-2b39-cdaa-bffa-54fbd9ac445c (at 10.210.44.130@o2ib3)
[1026644.159625] Lustre: Skipped 14 previous similar messages
[1026644.159626] Lustre: Skipped 14 previous similar messages
[1026644.159629] Lustre: Skipped 14 previous similar messages
[1026644.162314] Lustre: Skipped 11 previous similar messages
[1026679.687979] Lustre: oak-OST0032: Connection restored to aea09ef5-c649-fac0-d976-164232fcb59b (at 10.210.47.108@o2ib3)
[1026679.687981] Lustre: oak-OST0034: Connection restored to aea09ef5-c649-fac0-d976-164232fcb59b (at 10.210.47.108@o2ib3)
[1026679.687984] Lustre: Skipped 1 previous similar message
[1026679.689278] Lustre: Skipped 15 previous similar messages
[1026695.042002] Lustre: oak-OST0034: Connection restored to 58fba9c8-884d-1a67-5271-d45fd541fa94 (at 10.210.46.37@o2ib3)
[1026695.042004] Lustre: oak-OST0038: Connection restored to 58fba9c8-884d-1a67-5271-d45fd541fa94 (at 10.210.46.37@o2ib3)
[1026695.042005] Lustre: oak-OST0032: Connection restored to 58fba9c8-884d-1a67-5271-d45fd541fa94 (at 10.210.46.37@o2ib3)
[1026695.042009] Lustre: Skipped 2 previous similar messages
[1026695.043720] Lustre: Skipped 13 previous similar messages
[1026752.874253] Lustre: oak-OST0030: Connection restored to 6acb7b42-ef3c-8004-44df-ce2e24f2a8b7 (at 10.210.46.179@o2ib3)
[1026752.874254] Lustre: oak-OST0032: Connection restored to 6acb7b42-ef3c-8004-44df-ce2e24f2a8b7 (at 10.210.46.179@o2ib3)
[1026752.874257] Lustre: Skipped 17 previous similar messages
[1026752.875454] Lustre: Skipped 14 previous similar messages
[1027529.302102] Lustre: oak-OST0030: Connection restored to e2f73f67-fba4-ec8f-e44f-f049cc385db3 (at 10.9.102.23@o2ib4)
[1027529.302581] Lustre: Skipped 72 previous similar messages
[1033599.487799] Lustre: oak-OST0032: Connection restored to ecae7a7f-ffb3-c65b-d0ab-55652e96b0f9 (at 10.8.17.11@o2ib6)
[1033599.487800] Lustre: oak-OST0030: Connection restored to ecae7a7f-ffb3-c65b-d0ab-55652e96b0f9 (at 10.8.17.11@o2ib6)
[1033599.487803] Lustre: Skipped 11 previous similar messages
[1033599.489054] Lustre: Skipped 14 previous similar messages
[1037221.857746] Lustre: oak-OST0034: haven't heard from client 948e6ce8-ff73-e884-ff57-7066987d6ead (at 10.8.29.8@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ce097d400, cur 1519169861 expire 1519169711 last 1519169634
[1037221.858696] Lustre: Skipped 17 previous similar messages
[1037830.211871] Lustre: oak-OST0034: Connection restored to 57635f11-dca4-393d-e47e-90cf0165d384 (at 10.9.112.14@o2ib4)
[1037830.212340] Lustre: Skipped 54 previous similar messages
[1042105.644664] Lustre: oak-OST0042: haven't heard from client f1256135-e808-6461-1e11-1248d7f0458d (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d0279a400, cur 1519174745 expire 1519174595 last 1519174518
[1042105.645708] Lustre: Skipped 35 previous similar messages
[1042690.241047] Lustre: oak-OST0034: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1042690.241048] Lustre: oak-OST0038: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1042690.241050] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1042690.241051] Lustre: Skipped 2 previous similar messages
[1042690.241053] Lustre: Skipped 2 previous similar messages
[1042690.242959] Lustre: Skipped 12 previous similar messages
[1045389.480159] Lustre: oak-OST003a: haven't heard from client b0a7599d-e6fa-20c7-5218-7059666ae4c4 (at 10.9.114.7@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ccbe05400, cur 1519178029 expire 1519177879 last 1519177802
[1045389.481136] Lustre: Skipped 35 previous similar messages
[1045770.365275] Lustre: oak-OST0030: Connection restored to (at 10.8.15.2@o2ib6)
[1045770.365276] Lustre: oak-OST0034: Connection restored to (at 10.8.15.2@o2ib6)
[1045770.365278] Lustre: oak-OST0032: Connection restored to (at 10.8.15.2@o2ib6)
[1045770.366921] Lustre: Skipped 14 previous similar messages
[1045775.068310] Lustre: oak-OST0032: Connection restored to 948e6ce8-ff73-e884-ff57-7066987d6ead (at 10.8.29.8@o2ib6)
[1045775.068791] Lustre: Skipped 16 previous similar messages
[1046018.810180] Lustre: oak-OST0030: Connection restored to (at 10.9.112.12@o2ib4)
[1046018.810181] Lustre: oak-OST0032: Connection restored to (at 10.9.112.12@o2ib4)
[1046018.810184] Lustre: Skipped 2 previous similar messages
[1046018.811360] Lustre: Skipped 12 previous similar messages
[1046223.445011] Lustre: oak-OST0030: haven't heard from client 934e71b1-3388-5916-2d0f-6596d748e33c (at 10.8.29.1@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881da115c400, cur 1519178863 expire 1519178713 last 1519178636
[1046223.445953] Lustre: Skipped 35 previous similar messages
[1046301.446353] Lustre: oak-OST003c: haven't heard from client 00d4168e-b282-fe05-71fa-6d4c2929a322 (at 10.8.28.2@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ca7e9f000, cur 1519178941 expire 1519178791 last 1519178714
[1046301.447317] Lustre: Skipped 17 previous similar messages
[1052559.840566] Lustre: oak-OST003a: Connection restored to 00d4168e-b282-fe05-71fa-6d4c2929a322 (at 10.8.28.2@o2ib6)
[1052559.841037] Lustre: Skipped 14 previous similar messages
[1052571.146832] Lustre: oak-OST0034: haven't heard from client f088d87f-b8fa-424d-5a47-f7128f6ebaf3 (at 10.9.114.4@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cdf8d3800, cur 1519185211 expire 1519185061 last 1519184984
[1052571.147861] Lustre: Skipped 17 previous similar messages
[1053177.418138] Lustre: oak-OST0030: Connection restored to baf559cf-d89d-90ee-492d-3c6501f17713 (at 10.9.113.4@o2ib4)
[1053177.418139] Lustre: oak-OST0032: Connection restored to baf559cf-d89d-90ee-492d-3c6501f17713 (at 10.9.113.4@o2ib4)
[1053177.418143] Lustre: Skipped 1 previous similar message
[1053177.419344] Lustre: Skipped 13 previous similar messages
[1053499.109841] Lustre: oak-OST0050: haven't heard from client e88e63b3-6bd5-955b-9683-831116b242ec (at 10.210.45.55@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cd8b47400, cur 1519186139 expire 1519185989 last 1519185912
[1053499.110814] Lustre: Skipped 71 previous similar messages
[1053596.099407] Lustre: oak-OST0050: haven't heard from client a19d9a9f-a54f-9741-6bca-4c816099d7c6 (at 10.210.47.105@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883e93ee9000, cur 1519186236 expire 1519186086 last 1519186009
[1053596.100355] Lustre: Skipped 161 previous similar messages
[1054102.792484] Lustre: oak-OST0032: Connection restored to eabb0107-62c8-22e1-60fc-81736c3e1737 (at 10.210.46.17@o2ib3)
[1054102.792485] Lustre: oak-OST0036: Connection restored to eabb0107-62c8-22e1-60fc-81736c3e1737 (at 10.210.46.17@o2ib3)
[1054102.792487] Lustre: oak-OST0038: Connection restored to eabb0107-62c8-22e1-60fc-81736c3e1737 (at 10.210.46.17@o2ib3)
[1054102.792488] Lustre: oak-OST0030: Connection restored to eabb0107-62c8-22e1-60fc-81736c3e1737 (at 10.210.46.17@o2ib3)
[1054102.792489] Lustre: oak-OST0034: Connection restored to eabb0107-62c8-22e1-60fc-81736c3e1737 (at 10.210.46.17@o2ib3)
[1054102.800351] Lustre: Skipped 13 previous similar messages
[1054242.720014] Lustre: oak-OST0032: Connection restored to c2e608c8-c602-3886-0dc4-6c60ca6627f2 (at 10.210.46.118@o2ib3)
[1054242.720015] Lustre: oak-OST0030: Connection restored to c2e608c8-c602-3886-0dc4-6c60ca6627f2 (at 10.210.46.118@o2ib3)
[1054242.720983] Lustre: Skipped 16 previous similar messages
[1054258.799663] Lustre: oak-OST0030: Connection restored to b0a7599d-e6fa-20c7-5218-7059666ae4c4 (at 10.9.114.7@o2ib4)
[1054258.800141] Lustre: Skipped 11 previous similar messages
[1054261.669791] Lustre: oak-OST0030: Connection restored to e88e63b3-6bd5-955b-9683-831116b242ec (at 10.210.45.55@o2ib3)
[1054261.669792] Lustre: oak-OST0032: Connection restored to e88e63b3-6bd5-955b-9683-831116b242ec (at 10.210.45.55@o2ib3)
[1054261.669797] Lustre: Skipped 36 previous similar messages
[1054261.671013] Lustre: Skipped 16 previous similar messages
[1054289.851325] Lustre: oak-OST0030: Connection restored to 7ac8d0bc-8a50-52de-3afa-54adb8564524 (at 10.210.45.57@o2ib3)
[1054289.851807] Lustre: Skipped 8 previous similar messages
[1054370.335751] Lustre: oak-OST0034: Connection restored to d0ab3532-faba-0b53-77ad-d6df2bc075f2 (at 10.210.44.2@o2ib3)
[1054370.335752] Lustre: oak-OST0032: Connection restored to d0ab3532-faba-0b53-77ad-d6df2bc075f2 (at 10.210.44.2@o2ib3)
[1054370.335754] Lustre: oak-OST0030: Connection restored to d0ab3532-faba-0b53-77ad-d6df2bc075f2 (at 10.210.44.2@o2ib3)
[1054370.335755] Lustre: Skipped 6 previous similar messages
[1054370.335758] Lustre: Skipped 6 previous similar messages
[1054370.337685] Lustre: Skipped 15 previous similar messages
[1055366.017843] Lustre: oak-OST0050: haven't heard from client 6cfc17b8-b5a8-7bc9-3ba4-f9d50bb01c21 (at 10.9.112.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ca788c000, cur 1519188006 expire 1519187856 last 1519187779
[1055366.018799] Lustre: Skipped 17 previous similar messages
[1055366.663539] Lustre: oak-OST0034: haven't heard from client 2215d3b6-89f8-0fa2-d478-3945a32b6906 (at 10.9.114.3@o2ib4) in 220 seconds. I think it's dead, and I am evicting it. exp ffff883cd9d30000, cur 1519188006 expire 1519187856 last 1519187786
[1055366.664595] Lustre: Skipped 246 previous similar messages
[1055950.000704] Lustre: oak-OST0030: Connection restored to 6cfc17b8-b5a8-7bc9-3ba4-f9d50bb01c21 (at 10.9.112.17@o2ib4)
[1055950.001172] Lustre: Skipped 84 previous similar messages
[1055957.006118] Lustre: oak-OST0036: haven't heard from client b20053e8-bef8-e47a-ffc9-09547b305028 (at 10.9.112.7@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881e8037c800, cur 1519188597 expire 1519188447 last 1519188370
[1055957.007097] Lustre: Skipped 22 previous similar messages
[1055961.647964] Lustre: oak-OST0034: Connection restored to ae748e38-f9ba-2528-6b4a-6e1001ec9561 (at 10.9.113.1@o2ib4)
[1055961.647965] Lustre: oak-OST0032: Connection restored to ae748e38-f9ba-2528-6b4a-6e1001ec9561 (at 10.9.113.1@o2ib4)
[1055961.647969] Lustre: Skipped 34 previous similar messages
[1055961.649166] Lustre: Skipped 15 previous similar messages
[1056370.972722] Lustre: oak-OST0036: haven't heard from client dc2dcc03-eaa5-d347-d1ce-75b2e38645c2 (at 10.9.112.16@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d57e26800, cur 1519189011 expire 1519188861 last 1519188784
[1056370.973702] Lustre: Skipped 17 previous similar messages
[1056480.967628] Lustre: oak-OST0032: haven't heard from client 0a64c122-43c4-79b3-a8dd-c635cac57878 (at 10.9.114.5@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881fe2e1e400, cur 1519189121 expire 1519188971 last 1519188894
[1056480.968600] Lustre: Skipped 35 previous similar messages
[1056556.973597] Lustre: oak-OST0050: haven't heard from client 9e97ff40-8191-82d1-3eb0-d70ff33d294c (at 10.8.9.9@o2ib6) in 176 seconds. I think it's dead, and I am evicting it. exp ffff881ff141d800, cur 1519189197 expire 1519189047 last 1519189021
[1056556.974571] Lustre: Skipped 17 previous similar messages
[1056607.960180] Lustre: oak-OST0052: haven't heard from client 9e97ff40-8191-82d1-3eb0-d70ff33d294c (at 10.8.9.9@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d03301c00, cur 1519189248 expire 1519189098 last 1519189021
[1056846.952747] Lustre: oak-OST003a: haven't heard from client 1dab078e-dbfa-2d1a-4a79-111becb6145f (at 10.8.15.7@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ea9ac9400, cur 1519189487 expire 1519189337 last 1519189260
[1056846.953717] Lustre: Skipped 16 previous similar messages
[1056963.958415] Lustre: oak-OST0030: Connection restored to (at 10.9.112.16@o2ib4)
[1056963.958416] Lustre: oak-OST0032: Connection restored to (at 10.9.112.16@o2ib4)
[1056963.958417] Lustre: oak-OST0034: Connection restored to (at 10.9.112.16@o2ib4)
[1056963.958420] Lustre: Skipped 19 previous similar messages
[1056963.958421] Lustre: Skipped 19 previous similar messages
[1056963.960285] Lustre: Skipped 14 previous similar messages
[1057325.395048] Lustre: oak-OST0036: Connection restored to 1dab078e-dbfa-2d1a-4a79-111becb6145f (at 10.8.15.7@o2ib6)
[1057325.395049] Lustre: oak-OST0030: Connection restored to 1dab078e-dbfa-2d1a-4a79-111becb6145f (at 10.8.15.7@o2ib6)
[1057325.395053] Lustre: Skipped 6 previous similar messages
[1057325.396242] Lustre: Skipped 9 previous similar messages
[1057506.963133] Lustre: oak-OST0038: haven't heard from client 1cab9900-abee-bb7f-0783-8b7c92a4aef9 (at 10.8.29.2@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881fdf2c7400, cur 1519190147 expire 1519189997 last 1519189920
[1057506.964105] Lustre: Skipped 17 previous similar messages
[1057755.909017] Lustre: oak-OST0036: haven't heard from client 8dcf1735-449a-b5f0-95a7-46869cbc352a (at 10.8.29.6@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d55e81800, cur 1519190396 expire 1519190246 last 1519190169
[1057755.909979] Lustre: Skipped 53 previous similar messages
[1058292.884306] Lustre: oak-OST0030: haven't heard from client 3e650e62-845d-6d60-26a3-3d4109f50747 (at 10.9.112.13@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881da11d6c00, cur 1519190933 expire 1519190783 last 1519190706
[1058292.885278] Lustre: Skipped 17 previous similar messages
[1058886.414015] Lustre: oak-OST0034: Connection restored to 3e650e62-845d-6d60-26a3-3d4109f50747 (at 10.9.112.13@o2ib4)
[1058886.414016] Lustre: oak-OST0030: Connection restored to 3e650e62-845d-6d60-26a3-3d4109f50747 (at 10.9.112.13@o2ib4)
[1058886.414018] Lustre: oak-OST0036: Connection restored to 3e650e62-845d-6d60-26a3-3d4109f50747 (at 10.9.112.13@o2ib4)
[1058886.414019] Lustre: Skipped 1 previous similar message
[1058886.414022] Lustre: Skipped 1 previous similar message
[1058886.415945] Lustre: Skipped 13 previous similar messages
[1060574.809249] Lustre: oak-OST0032: haven't heard from client c30e55e3-ac07-c6d7-6d4d-e38ec6e313d9 (at 10.8.29.5@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883e603ba000, cur 1519193215 expire 1519193065 last 1519192988
[1060574.810212] Lustre: Skipped 35 previous similar messages
[1062320.702225] Lustre: oak-OST0040: haven't heard from client b6338d77-1fa9-be10-93d4-b9041c881f67 (at 10.9.113.11@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883d13fdc800, cur 1519194961 expire 1519194811 last 1519194734
[1062320.703196] Lustre: Skipped 17 previous similar messages
[1062329.734449] Lustre: oak-OST0048: haven't heard from client b6338d77-1fa9-be10-93d4-b9041c881f67 (at 10.9.113.11@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ca763e800, cur 1519194970 expire 1519194820 last 1519194743
[1062329.735418] Lustre: Skipped 16 previous similar messages
[1062508.688067] Lustre: oak-OST0042: haven't heard from client a1961b5b-a1e1-2763-b54b-578a7debdac9 (at 10.8.29.3@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883d2e372800, cur 1519195149 expire 1519194999 last 1519194922
[1062923.159478] Lustre: oak-OST0032: Connection restored to b6338d77-1fa9-be10-93d4-b9041c881f67 (at 10.9.113.11@o2ib4)
[1062923.159988] Lustre: Skipped 14 previous similar messages
[1064819.586329] Lustre: oak-OST004a: haven't heard from client 6753e7eb-522c-7be3-4e53-e46f453e0ada (at 10.9.113.3@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d24ac3400, cur 1519197460 expire 1519197310 last 1519197233
[1064819.587295] Lustre: Skipped 17 previous similar messages
[1064895.580619] Lustre: oak-OST0046: haven't heard from client ad88835a-f666-0886-4bc3-eb58d0dff519 (at 10.8.15.1@o2ib6) in 153 seconds. I think it's dead, and I am evicting it. exp ffff883cac714800, cur 1519197536 expire 1519197386 last 1519197383
[1064895.581615] Lustre: Skipped 35 previous similar messages
[1065408.239835] Lustre: oak-OST0032: Connection restored to 6753e7eb-522c-7be3-4e53-e46f453e0ada (at 10.9.113.3@o2ib4)
[1065408.239836] Lustre: oak-OST0030: Connection restored to 6753e7eb-522c-7be3-4e53-e46f453e0ada (at 10.9.113.3@o2ib4)
[1065408.239845] Lustre: Skipped 2 previous similar messages
[1065408.241027] Lustre: Skipped 14 previous similar messages
[1065446.925642] Lustre: oak-OST0034: Connection restored to 95853065-bced-4a37-068a-7368116cf9b5 (at 10.9.112.10@o2ib4)
[1065446.925643] Lustre: oak-OST0032: Connection restored to 95853065-bced-4a37-068a-7368116cf9b5 (at 10.9.112.10@o2ib4)
[1065446.925646] Lustre: Skipped 2 previous similar messages
[1065446.926832] Lustre: Skipped 12 previous similar messages
[1065780.536104] Lustre: oak-OST0036: haven't heard from client 240f0c0c-4bf7-0269-c019-9ce39d7dda57 (at 10.8.29.7@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d25ebe400, cur 1519198421 expire 1519198271 last 1519198194
[1065780.537086] Lustre: Skipped 17 previous similar messages
[1066580.499132] Lustre: oak-OST0036: haven't heard from client aea09ef5-c649-fac0-d976-164232fcb59b (at 10.210.47.108@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8836cdce8000, cur 1519199221 expire 1519199071 last 1519198994
[1066580.500920] Lustre: Skipped 17 previous similar messages
[1066581.095876] Lustre: oak-OST0042: haven't heard from client aea09ef5-c649-fac0-d976-164232fcb59b (at 10.210.47.108@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff882d4ab12800, cur 1519199221 expire 1519199071 last 1519198994
[1066581.096846] Lustre: Skipped 7 previous similar messages
[1072766.212453] Lustre: oak-OST0052: haven't heard from client 43456dc5-32cc-af37-a38b-2cb009474183 (at 10.210.47.47@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883c8c907000, cur 1519205407 expire 1519205257 last 1519205180
[1072766.213414] Lustre: Skipped 9 previous similar messages
[1072766.721186] Lustre: oak-OST0030: haven't heard from client 43456dc5-32cc-af37-a38b-2cb009474183 (at 10.210.47.47@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff882d4ab10800, cur 1519205407 expire 1519205257 last 1519205180
[1072766.722158] Lustre: Skipped 15 previous similar messages
[1073325.191434] Lustre: oak-OST004c: haven't heard from client 1ad39a8b-acd2-a023-9b02-2561544e08b8 (at 10.9.112.9@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cac616c00, cur 1519205966 expire 1519205816 last 1519205739
[1073325.192402] Lustre: Skipped 1 previous similar message
[1073329.197021] Lustre: oak-OST003e: haven't heard from client 1ad39a8b-acd2-a023-9b02-2561544e08b8 (at 10.9.112.9@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d03a69000, cur 1519205970 expire 1519205820 last 1519205743
[1073335.200106] Lustre: oak-OST0052: haven't heard from client 1ad39a8b-acd2-a023-9b02-2561544e08b8 (at 10.9.112.9@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d2ca3b800, cur 1519205976 expire 1519205826 last 1519205749
[1073335.201072] Lustre: Skipped 11 previous similar messages
[1073701.338451] Lustre: oak-OST0036: Connection restored to 6de9bf49-4782-c4fe-9c42-fc8869f33746 (at 10.210.47.47@o2ib3)
[1073701.338452] Lustre: oak-OST0030: Connection restored to 6de9bf49-4782-c4fe-9c42-fc8869f33746 (at 10.210.47.47@o2ib3)
[1073701.338454] Lustre: oak-OST003a: Connection restored to 6de9bf49-4782-c4fe-9c42-fc8869f33746 (at 10.210.47.47@o2ib3)
[1073701.338455] Lustre: oak-OST0038: Connection restored to 6de9bf49-4782-c4fe-9c42-fc8869f33746 (at 10.210.47.47@o2ib3)
[1073701.338456] Lustre: oak-OST0032: Connection restored to 6de9bf49-4782-c4fe-9c42-fc8869f33746 (at 10.210.47.47@o2ib3)
[1073701.338457] Lustre: Skipped 4 previous similar messages
[1073701.338458] Lustre: Skipped 4 previous similar messages
[1073701.338459] Lustre: Skipped 4 previous similar messages
[1073701.338462] Lustre: Skipped 5 previous similar messages
[1073701.341860] Lustre: Skipped 11 previous similar messages
[1073914.567364] Lustre: oak-OST0032: Connection restored to 1ad39a8b-acd2-a023-9b02-2561544e08b8 (at 10.9.112.9@o2ib4)
[1073914.567365] Lustre: oak-OST0034: Connection restored to 1ad39a8b-acd2-a023-9b02-2561544e08b8 (at 10.9.112.9@o2ib4)
[1073914.567366] Lustre: oak-OST0030: Connection restored to 1ad39a8b-acd2-a023-9b02-2561544e08b8 (at 10.9.112.9@o2ib4)
[1073914.568811] Lustre: Skipped 14 previous similar messages
[1082543.760141] Lustre: oak-OST0032: haven't heard from client 42b07aa2-0e4f-a664-521f-8af46f8ac255 (at 10.210.47.121@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ff30f7800, cur 1519215185 expire 1519215035 last 1519214958
[1082543.761179] Lustre: Skipped 4 previous similar messages
[1082544.934421] Lustre: oak-OST003a: haven't heard from client 42b07aa2-0e4f-a664-521f-8af46f8ac255 (at 10.210.47.121@o2ib3) in 228 seconds. I think it's dead, and I am evicting it. exp ffff883ccbe07800, cur 1519215186 expire 1519215036 last 1519214958
[1082544.935407] Lustre: Skipped 15 previous similar messages
[1083357.491762] Lustre: oak-OST0030: Connection restored to 42b07aa2-0e4f-a664-521f-8af46f8ac255 (at 10.210.47.121@o2ib3)
[1083357.491763] Lustre: oak-OST0032: Connection restored to 42b07aa2-0e4f-a664-521f-8af46f8ac255 (at 10.210.47.121@o2ib3)
[1083357.492746] Lustre: Skipped 14 previous similar messages
[1089018.466002] Lustre: oak-OST0044: haven't heard from client 4ee345a0-1ac4-c87a-a275-576c70390233 (at 10.8.28.11@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883caae7e800, cur 1519221660 expire 1519221510 last 1519221433
[1089018.466967] Lustre: Skipped 1 previous similar message
[1089020.949960] Lustre: oak-OST0048: haven't heard from client 4ee345a0-1ac4-c87a-a275-576c70390233 (at 10.8.28.11@o2ib6) in 229 seconds. I think it's dead, and I am evicting it. exp ffff883ca7081400, cur 1519221662 expire 1519221512 last 1519221433
[1089020.950948] Lustre: Skipped 12 previous similar messages
[1090580.597929] ses 1:0:122:0: attempting task abort! scmd(ffff8823b58e9a40)
[1090580.598173] ses 1:0:122:0: [sg122] CDB: Receive Diagnostic 1c 01 0a ff ff 00
[1090580.598633] scsi target1:0:122: handle(0x008b), sas_address(0x5001636001bd0f3d), phy(76)
[1090580.599101] scsi target1:0:122: enclosure_logical_id(0x5001636001bd0f3d), slot(60)
[1090580.599562] scsi target1:0:122: enclosure level(0x0001),connector name( )
[1090580.604221] ses 1:0:122:0: task abort: FAILED scmd(ffff8823b58e9a40)
[1090580.970355] ses 1:0:122:0: attempting device reset! scmd(ffff8823b58e9a40)
[1090580.970605] ses 1:0:122:0: [sg122] CDB: Receive Diagnostic 1c 01 0a ff ff 00
[1090580.971084] scsi target1:0:122: handle(0x008b), sas_address(0x5001636001bd0f3d), phy(76)
[1090580.971551] scsi target1:0:122: enclosure_logical_id(0x5001636001bd0f3d), slot(60)
[1090580.972026] scsi target1:0:122: enclosure level(0x0001),connector name( )
[1090580.975822] ses 1:0:122:0: device reset: SUCCESS scmd(ffff8823b58e9a40)
[1094793.579458] Lustre: oak-OST003a: Connection restored to edc9ac7c-8da6-16ad-728b-6d71addd8cf0 (at 10.8.29.4@o2ib6)
[1094793.579459] Lustre: oak-OST0034: Connection restored to edc9ac7c-8da6-16ad-728b-6d71addd8cf0 (at 10.8.29.4@o2ib6)
[1094793.579461] Lustre: oak-OST0038: Connection restored to edc9ac7c-8da6-16ad-728b-6d71addd8cf0 (at 10.8.29.4@o2ib6)
[1094793.579462] Lustre: Skipped 3 previous similar messages
[1094793.579465] Lustre: Skipped 3 previous similar messages
[1094793.581340] Lustre: Skipped 6 previous similar messages
[1094794.124249] Lustre: oak-OST0052: Connection restored to 149f18b4-5c96-a969-340e-c4bf7444f757 (at 10.8.15.4@o2ib6)
[1094794.124715] Lustre: Skipped 194 previous similar messages
[1094818.766086] LustreError: 137-5: oak-OST0045_UUID: not available for connect from 10.8.29.4@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1094818.766878] LustreError: Skipped 15273 previous similar messages
[1094868.763009] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.8.29.4@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1094868.763817] LustreError: Skipped 18 previous similar messages
[1094914.191040] Lustre: oak-OST0034: haven't heard from client 97798af8-eef3-dbd4-8e85-fef451a162f3 (at 10.210.44.48@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cda3e3c00, cur 1519227556 expire 1519227406 last 1519227329
[1094914.192023] Lustre: Skipped 4 previous similar messages
[1094914.934611] Lustre: oak-OST003c: haven't heard from client efa8935a-c9f7-be4d-c6a6-fa9041a61226 (at 10.210.47.131@o2ib3) in 220 seconds. I think it's dead, and I am evicting it. exp ffff883ca7def000, cur 1519227556 expire 1519227406 last 1519227336
[1094914.935642] Lustre: Skipped 50 previous similar messages
[1094918.760622] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.8.29.4@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
[1094918.761319] LustreError: Skipped 18 previous similar messages
[1095018.756250] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.8.29.4@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server.
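Editor's note: the ses entries above record the SCSI command that hung before the task abort / device reset escalation. The six CDB bytes decode as RECEIVE DIAGNOSTIC RESULTS (opcode 0x1c) with the page-code-valid bit set, page 0x0a (Additional Element Status in SES-2; the earlier abort against sg183 used page 0x02, Enclosure Status) and a 65535-byte allocation length. A minimal decoder (Python; page names per SES-2, mapping only the two codes seen in this log):

    SES_PAGES = {0x02: "Enclosure Status", 0x0A: "Additional Element Status"}

    def decode_receive_diagnostic(cdb_hex):
        """Decode a 6-byte RECEIVE DIAGNOSTIC RESULTS CDB, e.g. '1c 01 0a ff ff 00'."""
        b = bytes.fromhex(cdb_hex.replace(" ", ""))
        assert b[0] == 0x1C, "not a RECEIVE DIAGNOSTIC RESULTS CDB"
        return {
            "pcv": bool(b[1] & 0x01),               # page code valid bit
            "page": SES_PAGES.get(b[2], hex(b[2])),
            "alloc_len": (b[3] << 8) | b[4],        # bytes the initiator accepts
        }

    print(decode_receive_diagnostic("1c 01 0a ff ff 00"))
    # {'pcv': True, 'page': 'Additional Element Status', 'alloc_len': 65535}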
[1095018.756973] LustreError: Skipped 36 previous similar messages [1095028.358884] Lustre: oak-OST0032: Connection restored to 75d30f74-7923-738b-70bc-bc5c47963847 (at 10.8.28.1@o2ib6) [1095028.358885] Lustre: oak-OST0030: Connection restored to 75d30f74-7923-738b-70bc-bc5c47963847 (at 10.8.28.1@o2ib6) [1095028.358888] Lustre: Skipped 10 previous similar messages [1095028.360065] Lustre: Skipped 15 previous similar messages [1095153.548258] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.8.28.1@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. [1095153.548984] LustreError: Skipped 59 previous similar messages [1095418.737252] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.8.29.4@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. [1095418.737971] LustreError: Skipped 106 previous similar messages [1095741.907159] Lustre: oak-OST0032: Connection restored to 63cbd65b-9009-91b1-e456-7028e46762c3 (at 10.210.46.124@o2ib3) [1095741.907160] Lustre: oak-OST0034: Connection restored to 63cbd65b-9009-91b1-e456-7028e46762c3 (at 10.210.46.124@o2ib3) [1095741.907161] Lustre: oak-OST0030: Connection restored to 63cbd65b-9009-91b1-e456-7028e46762c3 (at 10.210.46.124@o2ib3) [1095741.908600] Lustre: Skipped 15 previous similar messages [1095744.154584] Lustre: oak-OST0034: Connection restored to 9d0113bb-edf2-b892-68fd-b4a39365a1e9 (at 10.210.46.121@o2ib3) [1095744.154585] Lustre: oak-OST0032: Connection restored to 9d0113bb-edf2-b892-68fd-b4a39365a1e9 (at 10.210.46.121@o2ib3) [1095744.154586] Lustre: oak-OST0030: Connection restored to 9d0113bb-edf2-b892-68fd-b4a39365a1e9 (at 10.210.46.121@o2ib3) [1095744.156032] Lustre: Skipped 13 previous similar messages [1095750.774204] Lustre: oak-OST0030: Connection restored to (at 10.210.46.126@o2ib3) [1095750.774691] Lustre: Skipped 6 previous similar messages [1095763.292869] Lustre: oak-OST0030: Connection restored to aea09ef5-c649-fac0-d976-164232fcb59b (at 10.210.47.108@o2ib3) [1095763.292870] Lustre: oak-OST0032: Connection restored to aea09ef5-c649-fac0-d976-164232fcb59b (at 10.210.47.108@o2ib3) [1095763.292873] Lustre: Skipped 13 previous similar messages [1095763.294088] Lustre: Skipped 12 previous similar messages [1095781.646798] Lustre: oak-OST0030: Connection restored to b8900897-1256-1108-7e87-17aa5caa8404 (at 10.210.46.120@o2ib3) [1095781.647282] Lustre: Skipped 28 previous similar messages [1095861.722142] Lustre: oak-OST0032: Connection restored to ecc7ed8a-da67-6d04-63d2-f1638b2b161c (at 10.210.46.44@o2ib3) [1095861.722624] Lustre: Skipped 11 previous similar messages [1095937.157371] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.210.46.44@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[1095937.158107] LustreError: Skipped 256 previous similar messages [1096054.533482] Lustre: oak-OST0036: Connection restored to f08aa377-43b8-fd42-96bc-e2f398544cd1 (at 10.210.46.117@o2ib3) [1096054.533483] Lustre: oak-OST0038: Connection restored to f08aa377-43b8-fd42-96bc-e2f398544cd1 (at 10.210.46.117@o2ib3) [1096054.533484] Lustre: oak-OST0030: Connection restored to f08aa377-43b8-fd42-96bc-e2f398544cd1 (at 10.210.46.117@o2ib3) [1096054.533485] Lustre: oak-OST0034: Connection restored to f08aa377-43b8-fd42-96bc-e2f398544cd1 (at 10.210.46.117@o2ib3) [1096054.533486] Lustre: oak-OST0032: Connection restored to f08aa377-43b8-fd42-96bc-e2f398544cd1 (at 10.210.46.117@o2ib3) [1096054.533487] Lustre: Skipped 33 previous similar messages [1096054.533488] Lustre: Skipped 33 previous similar messages [1096054.533489] Lustre: Skipped 33 previous similar messages [1096054.533492] Lustre: Skipped 33 previous similar messages [1096054.542416] Lustre: Skipped 13 previous similar messages [1096541.034695] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.210.44.48@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [1096541.035432] LustreError: Skipped 377 previous similar messages [1096565.110028] Lustre: oak-OST004e: haven't heard from client 26daf71a-6b1d-9025-91c4-b5a509a534c6 (at 10.210.47.108@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff882d4ab16400, cur 1519229207 expire 1519229057 last 1519228980 [1096565.111008] Lustre: Skipped 46 previous similar messages [1097141.010652] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.210.44.48@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [1097141.011375] LustreError: Skipped 323 previous similar messages [1097740.987850] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.210.44.48@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [1097740.988573] LustreError: Skipped 267 previous similar messages [1098058.172983] Lustre: oak-OST003e: haven't heard from client 5644f3fd-9d24-3aa5-47fa-f350aad949ce (at 10.9.113.4@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883af83dac00, cur 1519230700 expire 1519230550 last 1519230473 [1098058.173956] Lustre: Skipped 17 previous similar messages [1098343.602259] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.8.29.4@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[1098343.602973] LustreError: Skipped 264 previous similar messages [1098604.372065] Lustre: oak-OST0030: Connection restored to baf559cf-d89d-90ee-492d-3c6501f17713 (at 10.9.113.4@o2ib4) [1098604.372066] Lustre: oak-OST0038: Connection restored to baf559cf-d89d-90ee-492d-3c6501f17713 (at 10.9.113.4@o2ib4) [1098604.372067] Lustre: oak-OST0032: Connection restored to baf559cf-d89d-90ee-492d-3c6501f17713 (at 10.9.113.4@o2ib4) [1098604.372068] Lustre: Skipped 2 previous similar messages [1098604.372071] Lustre: Skipped 2 previous similar messages [1098604.373954] Lustre: Skipped 13 previous similar messages [1098847.530920] Lustre: oak-OST0030: Connection restored to 0a64c122-43c4-79b3-a8dd-c635cac57878 (at 10.9.114.5@o2ib4) [1098847.530921] Lustre: oak-OST0034: Connection restored to 0a64c122-43c4-79b3-a8dd-c635cac57878 (at 10.9.114.5@o2ib4) [1098847.530922] Lustre: oak-OST0032: Connection restored to 0a64c122-43c4-79b3-a8dd-c635cac57878 (at 10.9.114.5@o2ib4) [1098847.532360] Lustre: Skipped 15 previous similar messages [1098882.598612] Lustre: oak-OST0030: Connection restored to 7a15e73a-a0ee-6bca-ab92-aa573877def2 (at 10.9.112.5@o2ib4) [1098882.598613] Lustre: oak-OST0034: Connection restored to 7a15e73a-a0ee-6bca-ab92-aa573877def2 (at 10.9.112.5@o2ib4) [1098882.598615] Lustre: oak-OST0032: Connection restored to 7a15e73a-a0ee-6bca-ab92-aa573877def2 (at 10.9.112.5@o2ib4) [1098882.598616] Lustre: Skipped 33 previous similar messages [1098882.598619] Lustre: Skipped 33 previous similar messages [1098882.600534] Lustre: Skipped 12 previous similar messages [1098944.136535] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.9.112.7@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [1098944.137262] LustreError: Skipped 415 previous similar messages [1099365.314417] Lustre: oak-OST0030: Connection restored to a19d9a9f-a54f-9741-6bca-4c816099d7c6 (at 10.210.47.105@o2ib3) [1099365.314418] Lustre: oak-OST0032: Connection restored to a19d9a9f-a54f-9741-6bca-4c816099d7c6 (at 10.210.47.105@o2ib3) [1099365.314420] Lustre: oak-OST0034: Connection restored to a19d9a9f-a54f-9741-6bca-4c816099d7c6 (at 10.210.47.105@o2ib3) [1099365.314421] Lustre: Skipped 37 previous similar messages [1099365.314424] Lustre: Skipped 37 previous similar messages [1099365.316397] Lustre: Skipped 11 previous similar messages [1099550.680557] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.210.46.126@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [1099550.681288] LustreError: Skipped 672 previous similar messages [1099770.986171] Lustre: oak-OST003a: haven't heard from client 1b95f2b6-37a3-daab-4a57-c536e4b4a9e1 (at 10.9.112.7@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cab94b800, cur 1519232413 expire 1519232263 last 1519232186 [1099770.987137] Lustre: Skipped 17 previous similar messages [1100154.362132] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.210.46.117@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[1100154.362852] LustreError: Skipped 642 previous similar messages [1100379.117619] Lustre: oak-OST0034: Connection restored to baf559cf-d89d-90ee-492d-3c6501f17713 (at 10.9.113.4@o2ib4) [1100379.117621] Lustre: oak-OST0030: Connection restored to baf559cf-d89d-90ee-492d-3c6501f17713 (at 10.9.113.4@o2ib4) [1100379.117622] Lustre: oak-OST0032: Connection restored to baf559cf-d89d-90ee-492d-3c6501f17713 (at 10.9.113.4@o2ib4) [1100379.117623] Lustre: Skipped 40 previous similar messages [1100379.117626] Lustre: Skipped 40 previous similar messages [1100379.119526] Lustre: Skipped 15 previous similar messages [1100757.212102] LustreError: 137-5: oak-OST0035_UUID: not available for connect from 10.9.114.4@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [1100757.212832] LustreError: Skipped 1001 previous similar messages [1101068.923441] Lustre: oak-OST0032: haven't heard from client 51a772ba-1402-58f0-7b77-ccfe83dcdfe6 (at 10.9.101.32@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883e603bf400, cur 1519233711 expire 1519233561 last 1519233484 [1101068.924418] Lustre: Skipped 125 previous similar messages [1101074.386676] Lustre: oak-OST0030: Connection restored to 51a772ba-1402-58f0-7b77-ccfe83dcdfe6 (at 10.9.101.32@o2ib4) [1101074.387150] Lustre: Skipped 107 previous similar messages [1101357.529713] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.210.47.108@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [1101357.530432] LustreError: Skipped 1694 previous similar messages [1101961.885515] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.210.46.44@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [1101961.886338] LustreError: Skipped 1608 previous similar messages [1102056.337168] Lustre: oak-OST0030: Connection restored to 2215d3b6-89f8-0fa2-d478-3945a32b6906 (at 10.9.114.3@o2ib4) [1102056.337169] Lustre: oak-OST0032: Connection restored to 2215d3b6-89f8-0fa2-d478-3945a32b6906 (at 10.9.114.3@o2ib4) [1102056.337172] Lustre: Skipped 8 previous similar messages [1102056.338379] Lustre: Skipped 9 previous similar messages [1102561.859507] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.210.46.44@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [1102561.860238] LustreError: Skipped 2266 previous similar messages [1102683.842848] Lustre: oak-OST003c: haven't heard from client cf27dbf4-35f5-5de8-77cb-bd6a58a12004 (at 10.210.47.108@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff882533270c00, cur 1519235326 expire 1519235176 last 1519235099 [1102683.843835] Lustre: Skipped 17 previous similar messages [1103165.176947] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.210.47.105@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. 
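The 137-5 "not available for connect" stream is heavily rate-limited: each printed instance stands in for the hundreds reported by the "Skipped N previous similar messages" line that follows it. The targets being asked for are odd-numbered OSTs (oak-OST0031, 0033, 0035, 0053, 004b) while this node is logging for the even-numbered ones, consistent with clients probing both halves of an HA pair. To rank which absent targets are drawing the most traffic, fold the suppression counts back in; attributing each skip count to the most recently printed 137-5 line is an approximation of how the console throttling works, so treat the totals as rough:

    # Tally refused connects per missing target, folding in the
    # "Skipped N previous similar messages" suppression counts.
    import re, sys
    from collections import Counter

    NO_TARGET = re.compile(r"LustreError: 137-5: (?P<target>\S+): "
                           r"not available for connect from (?P<nid>\S+)")
    SKIPPED = re.compile(r"LustreError: Skipped (?P<n>\d+) previous similar")

    counts, last = Counter(), None
    for line in sys.stdin:
        m = NO_TARGET.search(line)
        if m:
            last = m["target"]
            counts[last] += 1
            continue
        s = SKIPPED.search(line)
        if s and last:
            counts[last] += int(s["n"])   # suppressed repeats

    for target, n in counts.most_common():
        print(f"{target}: ~{n} refused connects")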
[1103165.177680] LustreError: Skipped 1920 previous similar messages [1103293.567664] Lustre: oak-OST0034: Connection restored to 17da3512-638d-27b2-c3c1-96f0bff1b7c4 (at 10.9.113.10@o2ib4) [1103293.567665] Lustre: oak-OST0032: Connection restored to 17da3512-638d-27b2-c3c1-96f0bff1b7c4 (at 10.9.113.10@o2ib4) [1103293.567669] Lustre: Skipped 55 previous similar messages [1103293.568964] Lustre: Skipped 14 previous similar messages [1103765.883732] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.9.114.2@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [1103765.883734] LustreError: 137-5: oak-OST0037_UUID: not available for connect from 10.9.114.2@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [1103765.883738] LustreError: Skipped 2228 previous similar messages [1103765.885427] LustreError: Skipped 13 previous similar messages [1104372.501364] LustreError: 137-5: oak-OST0053_UUID: not available for connect from 10.9.113.8@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [1104372.502079] LustreError: Skipped 2040 previous similar messages [1104978.006817] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.210.47.120@o2ib3 (no target). If you are running an HA pair check that the target is mounted on the other server. [1104978.007577] LustreError: Skipped 2356 previous similar messages [1105578.066782] LustreError: 137-5: oak-OST004b_UUID: not available for connect from 10.8.28.1@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. [1105578.067517] LustreError: Skipped 2169 previous similar messages [1105677.401300] Lustre: oak-OST0030: Client c37df854-e648-12bb-4876-55cdb4944d56 (at 10.9.112.10@o2ib4) reconnecting [1105677.401820] Lustre: Skipped 37528 previous similar messages [1105677.402077] Lustre: oak-OST0030: Connection restored to 95853065-bced-4a37-068a-7368116cf9b5 (at 10.9.112.10@o2ib4) [1105728.683016] Lustre: oak-OST0030: Client 8089da30-2e6e-f45e-8f52-4de3332ae98a (at 10.9.105.28@o2ib4) reconnecting [1105728.683516] Lustre: Skipped 963 previous similar messages [1105728.683782] Lustre: oak-OST0030: Connection restored to 8089da30-2e6e-f45e-8f52-4de3332ae98a (at 10.9.105.28@o2ib4) [1105728.684293] Lustre: Skipped 966 previous similar messages [1105983.797447] Lustre: oak-OST0034: Connection restored to e0fb0da5-3d69-b70a-ade4-66ccab2d1917 (at 10.9.114.1@o2ib4) [1105983.797448] Lustre: oak-OST0032: Connection restored to e0fb0da5-3d69-b70a-ade4-66ccab2d1917 (at 10.9.114.1@o2ib4) [1105983.797449] Lustre: oak-OST0036: Connection restored to e0fb0da5-3d69-b70a-ade4-66ccab2d1917 (at 10.9.114.1@o2ib4) [1105983.797450] Lustre: oak-OST0030: Connection restored to e0fb0da5-3d69-b70a-ade4-66ccab2d1917 (at 10.9.114.1@o2ib4) [1105983.797452] Lustre: oak-OST0038: Connection restored to e0fb0da5-3d69-b70a-ade4-66ccab2d1917 (at 10.9.114.1@o2ib4) [1105983.797453] Lustre: Skipped 171 previous similar messages [1105983.797453] Lustre: Skipped 171 previous similar messages [1105983.797454] Lustre: Skipped 171 previous similar messages [1105983.797457] Lustre: Skipped 171 previous similar messages [1105983.807100] Lustre: Skipped 13 previous similar messages [1106179.081009] LustreError: 137-5: oak-OST0033_UUID: not available for connect from 10.210.46.117@o2ib3 (no target). 
If you are running an HA pair check that the target is mounted on the other server. [1106179.081750] LustreError: Skipped 2305 previous similar messages [1106711.645039] Lustre: oak-OST0036: haven't heard from client d41b8d5a-f1ae-3f2a-fc91-7fd6f565b4d5 (at 10.9.112.11@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff882fa2a18800, cur 1519239354 expire 1519239204 last 1519239127 [1106711.646007] Lustre: Skipped 35 previous similar messages [1106780.410960] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.9.112.7@o2ib4 (no target). If you are running an HA pair check that the target is mounted on the other server. [1106780.411695] LustreError: Skipped 2284 previous similar messages [1107322.927214] Lustre: oak-OST0038: Connection restored to d41b8d5a-f1ae-3f2a-fc91-7fd6f565b4d5 (at 10.9.112.11@o2ib4) [1107322.927216] Lustre: oak-OST003e: Connection restored to d41b8d5a-f1ae-3f2a-fc91-7fd6f565b4d5 (at 10.9.112.11@o2ib4) [1107322.927217] Lustre: oak-OST003a: Connection restored to d41b8d5a-f1ae-3f2a-fc91-7fd6f565b4d5 (at 10.9.112.11@o2ib4) [1107322.927218] Lustre: oak-OST003c: Connection restored to d41b8d5a-f1ae-3f2a-fc91-7fd6f565b4d5 (at 10.9.112.11@o2ib4) [1107322.927219] Lustre: oak-OST0034: Connection restored to d41b8d5a-f1ae-3f2a-fc91-7fd6f565b4d5 (at 10.9.112.11@o2ib4) [1107322.927220] Lustre: Skipped 19 previous similar messages [1107322.927221] Lustre: Skipped 20 previous similar messages [1107322.927222] Lustre: Skipped 20 previous similar messages [1107322.927225] Lustre: Skipped 20 previous similar messages [1107322.930656] Lustre: Skipped 9 previous similar messages [1107381.527427] LustreError: 137-5: oak-OST0031_UUID: not available for connect from 10.8.9.8@o2ib6 (no target). If you are running an HA pair check that the target is mounted on the other server. 
[1107381.528149] LustreError: Skipped 2625 previous similar messages [1107576.147046] Lustre: oak-OST0050: Client d16097e6-b240-aa27-27eb-24d11bff4c92 (at 10.9.113.7@o2ib4) reconnecting [1107576.147523] Lustre: Skipped 161 previous similar messages [1107576.147788] Lustre: oak-OST0050: Connection restored to d16097e6-b240-aa27-27eb-24d11bff4c92 (at 10.9.113.7@o2ib4) [1107579.178984] Lustre: oak-OST004c: Connection restored to 6883e1ba-44ec-3eb2-be74-ac971d5539f6 (at 10.9.101.58@o2ib4) [1107583.928546] Lustre: oak-OST0046: Connection restored to 69c29484-c49b-fdda-aeae-7d5d0a8addc7 (at 10.9.104.40@o2ib4) [1107583.929023] Lustre: Skipped 20 previous similar messages [1107584.446002] Lustre: oak-OST003a: Client a119e365-e181-a311-9c7b-47c1f5c176c4 (at 10.9.105.31@o2ib4) reconnecting [1107584.446502] Lustre: Skipped 25 previous similar messages [1107593.361584] Lustre: oak-OST004a: Connection restored to 695f5cf4-dcdb-7470-21f3-66056e107f99 (at 10.9.104.58@o2ib4) [1107593.362075] Lustre: Skipped 258 previous similar messages [1107607.198919] Lustre: oak-OST004a: Client 7c5509a8-198c-2afc-7c8a-6392dfda69ec (at 10.9.105.65@o2ib4) reconnecting [1107607.199417] Lustre: Skipped 308 previous similar messages [1107613.304225] Lustre: oak-OST003c: Connection restored to 7a15e73a-a0ee-6bca-ab92-aa573877def2 (at 10.9.112.5@o2ib4) [1107613.304718] Lustre: Skipped 67 previous similar messages [1107645.742741] Lustre: oak-OST0040: Client e9d41537-e245-c818-6796-e1ba730ceed2 (at 10.8.18.26@o2ib6) reconnecting [1107645.742743] Lustre: oak-OST0048: Client e9d41537-e245-c818-6796-e1ba730ceed2 (at 10.8.18.26@o2ib6) reconnecting [1107645.742744] Lustre: oak-OST0042: Client e9d41537-e245-c818-6796-e1ba730ceed2 (at 10.8.18.26@o2ib6) reconnecting [1107645.742745] Lustre: Skipped 1300 previous similar messages [1107645.742749] Lustre: Skipped 1300 previous similar messages [1107650.912603] Lustre: oak-OST0042: Connection restored to 17da3512-638d-27b2-c3c1-96f0bff1b7c4 (at 10.9.113.10@o2ib4) [1107650.913077] Lustre: Skipped 1330 previous similar messages [1107726.078011] Lustre: oak-OST0048: Connection restored to 650cb648-a186-421c-f011-deef55717e8d (at 10.210.45.54@o2ib3) [1107726.078013] Lustre: oak-OST0046: Connection restored to 650cb648-a186-421c-f011-deef55717e8d (at 10.210.45.54@o2ib3) [1107726.078017] Lustre: Skipped 29835 previous similar messages [1107726.079225] Lustre: Skipped 4 previous similar messages [1107751.645956] md: md5 stopped. [1107751.739511] md/raid:md5: not clean -- starting background reconstruction [1107751.739915] md/raid:md5: device dm-332 operational as raid disk 0 [1107751.740158] md/raid:md5: device dm-10 operational as raid disk 9 [1107751.740401] md/raid:md5: device dm-9 operational as raid disk 8 [1107751.740652] md/raid:md5: device dm-2 operational as raid disk 7 [1107751.740892] md/raid:md5: device dm-344 operational as raid disk 6 [1107751.741131] md/raid:md5: device dm-340 operational as raid disk 5 [1107751.741368] md/raid:md5: device dm-319 operational as raid disk 4 [1107751.741609] md/raid:md5: device dm-351 operational as raid disk 3 [1107751.741849] md/raid:md5: device dm-347 operational as raid disk 2 [1107751.742090] md/raid:md5: device dm-335 operational as raid disk 1 [1107751.743372] md/raid:md5: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107751.753114] md5: detected capacity change from 0 to 64011431837696 [1107751.753518] md: resync of RAID array md5 [1107751.774541] md: md33 stopped. 
[1107751.795264] md/raid:md33: not clean -- starting background reconstruction [1107751.795647] md/raid:md33: device dm-311 operational as raid disk 0 [1107751.795885] md/raid:md33: device dm-306 operational as raid disk 9 [1107751.796123] md/raid:md33: device dm-305 operational as raid disk 8 [1107751.796365] md/raid:md33: device dm-293 operational as raid disk 7 [1107751.796615] md/raid:md33: device dm-292 operational as raid disk 6 [1107751.796860] md/raid:md33: device dm-280 operational as raid disk 5 [1107751.797133] md/raid:md33: device dm-279 operational as raid disk 4 [1107751.797376] md/raid:md33: device dm-266 operational as raid disk 3 [1107751.797618] md/raid:md33: device dm-265 operational as raid disk 2 [1107751.797856] md/raid:md33: device dm-312 operational as raid disk 1 [1107751.799098] md/raid:md33: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107751.846905] md33: detected capacity change from 0 to 64011431837696 [1107751.847643] md: resync of RAID array md33 [1107751.859765] md: md1 stopped. [1107751.879648] md/raid:md1: not clean -- starting background reconstruction [1107751.880071] md/raid:md1: device dm-348 operational as raid disk 0 [1107751.880318] md/raid:md1: device dm-317 operational as raid disk 9 [1107751.880576] md/raid:md1: device dm-3 operational as raid disk 8 [1107751.880823] md/raid:md1: device dm-354 operational as raid disk 7 [1107751.881070] md/raid:md1: device dm-336 operational as raid disk 6 [1107751.881318] md/raid:md1: device dm-346 operational as raid disk 5 [1107751.881569] md/raid:md1: device dm-334 operational as raid disk 4 [1107751.881813] md/raid:md1: device dm-343 operational as raid disk 3 [1107751.882060] md/raid:md1: device dm-352 operational as raid disk 2 [1107751.882306] md/raid:md1: device dm-324 operational as raid disk 1 [1107751.884120] md/raid:md1: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107751.928187] md1: detected capacity change from 0 to 64011431837696 [1107751.928564] md: resync of RAID array md1 [1107751.957897] md: md19 stopped. [1107751.989873] md/raid:md19: not clean -- starting background reconstruction [1107751.990267] md/raid:md19: device dm-147 operational as raid disk 0 [1107751.990514] md/raid:md19: device dm-182 operational as raid disk 9 [1107751.990999] md/raid:md19: device dm-181 operational as raid disk 8 [1107751.991237] md/raid:md19: device dm-168 operational as raid disk 7 [1107751.991474] md/raid:md19: device dm-167 operational as raid disk 6 [1107751.991732] md/raid:md19: device dm-155 operational as raid disk 5 [1107751.991970] md/raid:md19: device dm-154 operational as raid disk 4 [1107751.992212] md/raid:md19: device dm-142 operational as raid disk 3 [1107751.992454] md/raid:md19: device dm-141 operational as raid disk 2 [1107751.992839] md/raid:md19: device dm-158 operational as raid disk 1 [1107751.994133] md/raid:md19: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107752.036351] md19: detected capacity change from 0 to 64011431837696 [1107752.036758] md: resync of RAID array md19 [1107752.057769] md: md15 stopped. 
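A quick sanity check on the capacity lines: every array reports 64011431837696 bytes, and each is RAID6 over 10 members, so two devices' worth is parity:

    # Worked check of the md capacity figures above.
    cap = 64011431837696            # bytes, as logged per array
    data_disks = 10 - 2             # RAID6 keeps n-2 data devices
    print(f"array: {cap / 1e12:.2f} TB ({cap / 2**40:.1f} TiB)")
    print(f"per member: {cap / data_disks / 1e12:.2f} TB")

which comes out to 64.01 TB (58.2 TiB) per array and 8.00 TB per member, i.e. these are 10-disk RAID6 sets of 8 TB drives.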
[1107752.081720] md/raid:md15: device dm-131 operational as raid disk 0 [1107752.081979] md/raid:md15: device dm-126 operational as raid disk 9 [1107752.082226] md/raid:md15: device dm-125 operational as raid disk 8 [1107752.082473] md/raid:md15: device dm-113 operational as raid disk 7 [1107752.082738] md/raid:md15: device dm-112 operational as raid disk 6 [1107752.083414] md/raid:md15: device dm-100 operational as raid disk 5 [1107752.083934] md/raid:md15: device dm-99 operational as raid disk 4 [1107752.084790] md/raid:md15: device dm-86 operational as raid disk 3 [1107752.085031] md/raid:md15: device dm-85 operational as raid disk 2 [1107752.085270] md/raid:md15: device dm-132 operational as raid disk 1 [1107752.087719] md/raid:md15: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107752.126352] md15: detected capacity change from 0 to 64011431837696 [1107752.133228] md: md11 stopped. [1107752.157393] md/raid:md11: not clean -- starting background reconstruction [1107752.157804] md/raid:md11: device dm-21 operational as raid disk 0 [1107752.158045] md/raid:md11: device dm-71 operational as raid disk 9 [1107752.158287] md/raid:md11: device dm-70 operational as raid disk 8 [1107752.158534] md/raid:md11: device dm-14 operational as raid disk 7 [1107752.158777] md/raid:md11: device dm-60 operational as raid disk 6 [1107752.159017] md/raid:md11: device dm-48 operational as raid disk 5 [1107752.159260] md/raid:md11: device dm-47 operational as raid disk 4 [1107752.159507] md/raid:md11: device dm-35 operational as raid disk 3 [1107752.159749] md/raid:md11: device dm-34 operational as raid disk 2 [1107752.159991] md/raid:md11: device dm-22 operational as raid disk 1 [1107752.161672] md/raid:md11: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107752.229146] md11: detected capacity change from 0 to 64011431837696 [1107752.229789] md: resync of RAID array md11 [1107752.242800] md: md23 stopped. [1107752.273705] md/raid:md23: not clean -- starting background reconstruction [1107752.274124] md/raid:md23: device dm-137 operational as raid disk 0 [1107752.274369] md/raid:md23: device dm-190 operational as raid disk 9 [1107752.274621] md/raid:md23: device dm-189 operational as raid disk 8 [1107752.274861] md/raid:md23: device dm-177 operational as raid disk 7 [1107752.275102] md/raid:md23: device dm-176 operational as raid disk 6 [1107752.275343] md/raid:md23: device dm-164 operational as raid disk 5 [1107752.275592] md/raid:md23: device dm-163 operational as raid disk 4 [1107752.275835] md/raid:md23: device dm-151 operational as raid disk 3 [1107752.276074] md/raid:md23: device dm-150 operational as raid disk 2 [1107752.276312] md/raid:md23: device dm-138 operational as raid disk 1 [1107752.283035] md/raid:md23: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107752.317421] md23: detected capacity change from 0 to 64011431837696 [1107752.317802] md: resync of RAID array md23 [1107752.347981] md: md17 stopped. 
[1107752.393764] md/raid:md17: device dm-77 operational as raid disk 0 [1107752.394014] md/raid:md17: device dm-130 operational as raid disk 9 [1107752.394258] md/raid:md17: device dm-129 operational as raid disk 8 [1107752.394505] md/raid:md17: device dm-117 operational as raid disk 7 [1107752.394748] md/raid:md17: device dm-116 operational as raid disk 6 [1107752.394995] md/raid:md17: device dm-104 operational as raid disk 5 [1107752.395240] md/raid:md17: device dm-103 operational as raid disk 4 [1107752.395490] md/raid:md17: device dm-91 operational as raid disk 3 [1107752.395732] md/raid:md17: device dm-90 operational as raid disk 2 [1107752.395979] md/raid:md17: device dm-78 operational as raid disk 1 [1107752.397696] md/raid:md17: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107752.433404] md17: detected capacity change from 0 to 64011431837696 [1107752.461007] md: md3 stopped. [1107752.563594] md/raid:md3: device dm-11 operational as raid disk 0 [1107752.563847] md/raid:md3: device dm-6 operational as raid disk 9 [1107752.564096] md/raid:md3: device dm-5 operational as raid disk 8 [1107752.564340] md/raid:md3: device dm-356 operational as raid disk 7 [1107752.564596] md/raid:md3: device dm-0 operational as raid disk 6 [1107752.564839] md/raid:md3: device dm-327 operational as raid disk 5 [1107752.565084] md/raid:md3: device dm-353 operational as raid disk 4 [1107752.565327] md/raid:md3: device dm-322 operational as raid disk 3 [1107752.565573] md/raid:md3: device dm-330 operational as raid disk 2 [1107752.565817] md/raid:md3: device dm-12 operational as raid disk 1 [1107752.567314] md/raid:md3: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107752.583889] md3: detected capacity change from 0 to 64011431837696 [1107752.597020] md: md27 stopped. [1107752.671659] md/raid:md27: not clean -- starting background reconstruction [1107752.672060] md/raid:md27: device dm-251 operational as raid disk 0 [1107752.672306] md/raid:md27: device dm-246 operational as raid disk 9 [1107752.672588] md/raid:md27: device dm-245 operational as raid disk 8 [1107752.672832] md/raid:md27: device dm-233 operational as raid disk 7 [1107752.673075] md/raid:md27: device dm-232 operational as raid disk 6 [1107752.673318] md/raid:md27: device dm-220 operational as raid disk 5 [1107752.673565] md/raid:md27: device dm-219 operational as raid disk 4 [1107752.673807] md/raid:md27: device dm-206 operational as raid disk 3 [1107752.674049] md/raid:md27: device dm-205 operational as raid disk 2 [1107752.674295] md/raid:md27: device dm-252 operational as raid disk 1 [1107752.676007] md/raid:md27: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107752.705441] md27: detected capacity change from 0 to 64011431837696 [1107752.706221] md: resync of RAID array md27 [1107752.748930] md: md25 stopped. 
[1107752.795601] md/raid:md25: device dm-207 operational as raid disk 0 [1107752.795852] md/raid:md25: device dm-242 operational as raid disk 9 [1107752.796097] md/raid:md25: device dm-241 operational as raid disk 8 [1107752.796342] md/raid:md25: device dm-228 operational as raid disk 7 [1107752.796604] md/raid:md25: device dm-227 operational as raid disk 6 [1107752.796846] md/raid:md25: device dm-215 operational as raid disk 5 [1107752.797086] md/raid:md25: device dm-214 operational as raid disk 4 [1107752.797463] md/raid:md25: device dm-202 operational as raid disk 3 [1107752.797706] md/raid:md25: device dm-201 operational as raid disk 2 [1107752.797947] md/raid:md25: device dm-218 operational as raid disk 1 [1107752.799339] md/raid:md25: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107752.869098] md25: detected capacity change from 0 to 64011431837696 [1107752.897000] md: md7 stopped. [1107752.948901] md/raid:md7: not clean -- starting background reconstruction [1107752.949304] md/raid:md7: device dm-31 operational as raid disk 0 [1107752.949553] md/raid:md7: device dm-65 operational as raid disk 9 [1107752.949799] md/raid:md7: device dm-64 operational as raid disk 8 [1107752.950040] md/raid:md7: device dm-52 operational as raid disk 7 [1107752.950282] md/raid:md7: device dm-51 operational as raid disk 6 [1107752.950527] md/raid:md7: device dm-39 operational as raid disk 5 [1107752.950773] md/raid:md7: device dm-38 operational as raid disk 4 [1107752.951021] md/raid:md7: device dm-26 operational as raid disk 3 [1107752.951271] md/raid:md7: device dm-25 operational as raid disk 2 [1107752.951522] md/raid:md7: device dm-42 operational as raid disk 1 [1107752.953122] md/raid:md7: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107752.993967] md7: detected capacity change from 0 to 64011431837696 [1107752.994372] md: resync of RAID array md7 [1107752.996728] md: md1: resync done. [1107753.007905] md: md29 stopped. [1107753.027860] md/raid:md29: device dm-197 operational as raid disk 0 [1107753.028107] md/raid:md29: device dm-250 operational as raid disk 9 [1107753.028350] md/raid:md29: device dm-249 operational as raid disk 8 [1107753.028858] md/raid:md29: device dm-237 operational as raid disk 7 [1107753.029101] md/raid:md29: device dm-236 operational as raid disk 6 [1107753.029344] md/raid:md29: device dm-224 operational as raid disk 5 [1107753.029593] md/raid:md29: device dm-223 operational as raid disk 4 [1107753.029837] md/raid:md29: device dm-211 operational as raid disk 3 [1107753.030078] md/raid:md29: device dm-210 operational as raid disk 2 [1107753.030321] md/raid:md29: device dm-198 operational as raid disk 1 [1107753.032110] md/raid:md29: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107753.108006] md29: detected capacity change from 0 to 64011431837696 [1107753.149273] md: md35 stopped. 
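The "resync of RAID array" / "resync done" pairs land seconds apart (md1 starts at 1107751.928564 and finishes at 1107752.996728, about a second for a 64 TB array), which only makes sense if the resync had almost nothing to rewrite, presumably bitmap-assisted; the log itself doesn't say, so treat that as an inference. A sketch to pull the per-array durations out of the dump:

    # Pair resync start/done messages and print per-array durations,
    # using the bracketed uptime stamps.
    import re, sys

    start, done = {}, {}
    for line in sys.stdin:
        for ts, name in re.findall(
                r"\[\s*(\d+\.\d+)\] md: resync of RAID array (md\d+)", line):
            start[name] = float(ts)
        for ts, name in re.findall(
                r"\[\s*(\d+\.\d+)\] md: (md\d+): resync done", line):
            done[name] = float(ts)

    for name in sorted(set(start) & set(done), key=lambda s: int(s[2:])):
        print(f"{name}: {done[name] - start[name]:.1f}s")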
[1107753.210075] md/raid:md35: device dm-257 operational as raid disk 0 [1107753.210328] md/raid:md35: device dm-310 operational as raid disk 9 [1107753.210575] md/raid:md35: device dm-309 operational as raid disk 8 [1107753.210813] md/raid:md35: device dm-297 operational as raid disk 7 [1107753.211058] md/raid:md35: device dm-296 operational as raid disk 6 [1107753.211309] md/raid:md35: device dm-284 operational as raid disk 5 [1107753.211561] md/raid:md35: device dm-283 operational as raid disk 4 [1107753.211810] md/raid:md35: device dm-271 operational as raid disk 3 [1107753.212061] md/raid:md35: device dm-270 operational as raid disk 2 [1107753.212304] md/raid:md35: device dm-258 operational as raid disk 1 [1107753.213685] md/raid:md35: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107753.256105] md35: detected capacity change from 0 to 64011431837696 [1107753.267832] md: md21 stopped. [1107753.306977] md/raid:md21: not clean -- starting background reconstruction [1107753.307375] md/raid:md21: device dm-191 operational as raid disk 0 [1107753.307623] md/raid:md21: device dm-186 operational as raid disk 9 [1107753.307866] md/raid:md21: device dm-185 operational as raid disk 8 [1107753.308106] md/raid:md21: device dm-173 operational as raid disk 7 [1107753.308346] md/raid:md21: device dm-172 operational as raid disk 6 [1107753.308591] md/raid:md21: device dm-160 operational as raid disk 5 [1107753.308833] md/raid:md21: device dm-159 operational as raid disk 4 [1107753.309075] md/raid:md21: device dm-146 operational as raid disk 3 [1107753.309313] md/raid:md21: device dm-145 operational as raid disk 2 [1107753.309554] md/raid:md21: device dm-192 operational as raid disk 1 [1107753.311352] md/raid:md21: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107753.320544] md: md23: resync done. [1107753.333837] md21: detected capacity change from 0 to 64011431837696 [1107753.334231] md: resync of RAID array md21 [1107753.407251] md: md13 stopped. [1107753.439185] md/raid:md13: device dm-87 operational as raid disk 0 [1107753.439439] md/raid:md13: device dm-122 operational as raid disk 9 [1107753.439681] md/raid:md13: device dm-121 operational as raid disk 8 [1107753.439923] md/raid:md13: device dm-108 operational as raid disk 7 [1107753.440167] md/raid:md13: device dm-107 operational as raid disk 6 [1107753.440411] md/raid:md13: device dm-95 operational as raid disk 5 [1107753.440658] md/raid:md13: device dm-94 operational as raid disk 4 [1107753.440903] md/raid:md13: device dm-82 operational as raid disk 3 [1107753.441151] md/raid:md13: device dm-81 operational as raid disk 2 [1107753.441397] md/raid:md13: device dm-98 operational as raid disk 1 [1107753.443217] md/raid:md13: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107753.502711] md13: detected capacity change from 0 to 64011431837696 [1107753.512743] md: md31 stopped. 
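The interesting failure begins just below: several mount.lustre processes take order:4 page allocation failures inside ldiskfs_fill_super. order:4 means 2^4 = 16 contiguous pages, a 64 KiB physically-contiguous kmalloc, and the Mem-Info dumps that follow show why that can fail with tens of gigabytes nominally free: the free memory is shredded into low-order chunks (in the first dump, Node 1 Normal has roughly 11 GB free but zero blocks of 64 kB or larger). Judging by its name, ldiskfs_kvmalloc presumably falls back to vmalloc on kmalloc failure, which would make these warnings survivable, and the MMP "please wait" lines that follow do show the same mounts continuing. The fragmentation check itself is easy to script:

    # Parse one buddy-list line from a Mem-Info dump and report whether an
    # order-4 (64 KiB) block exists. Figures copied verbatim from the first
    # dump below; the bucket size at order o is 4*2^o kB.
    import re

    node1 = ("Node 1 Normal: 167663*4kB (UEM) 521640*8kB (UE) "
             "382793*16kB (UE) 3599*32kB (U) 0*64kB 0*128kB 0*256kB "
             "0*512kB 0*1024kB 0*2048kB 0*4096kB = 11083628kB")

    counts = [int(n) for n, _ in re.findall(r"(\d+)\*(\d+)kB", node1)]
    free_kb = sum(c * (4 << o) for o, c in enumerate(counts))
    print("free:", free_kb, "kB")                 # 11083628, matches total
    print("order-4 (64 KiB) blocks:", counts[4])  # 0 -> kmalloc can fail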
[1107753.566962] md/raid:md31: not clean -- starting background reconstruction [1107753.567405] md/raid:md31: device dm-267 operational as raid disk 0 [1107753.567664] md/raid:md31: device dm-302 operational as raid disk 9 [1107753.567907] md/raid:md31: device dm-301 operational as raid disk 8 [1107753.568580] md/raid:md31: device dm-288 operational as raid disk 7 [1107753.568819] md/raid:md31: device dm-287 operational as raid disk 6 [1107753.569062] md/raid:md31: device dm-275 operational as raid disk 5 [1107753.569306] md/raid:md31: device dm-274 operational as raid disk 4 [1107753.569554] md/raid:md31: device dm-262 operational as raid disk 3 [1107753.569800] md/raid:md31: device dm-261 operational as raid disk 2 [1107753.570043] md/raid:md31: device dm-278 operational as raid disk 1 [1107753.571731] md/raid:md31: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107753.616683] md31: detected capacity change from 0 to 64011431837696 [1107753.617339] md: resync of RAID array md31 [1107753.632478] md: md9 stopped. [1107753.711882] md/raid:md9: device dm-17 operational as raid disk 0 [1107753.712128] md/raid:md9: device dm-68 operational as raid disk 9 [1107753.712368] md/raid:md9: device dm-15 operational as raid disk 8 [1107753.712614] md/raid:md9: device dm-57 operational as raid disk 7 [1107753.712855] md/raid:md9: device dm-56 operational as raid disk 6 [1107753.713099] md/raid:md9: device dm-44 operational as raid disk 5 [1107753.713340] md/raid:md9: device dm-43 operational as raid disk 4 [1107753.713584] md/raid:md9: device dm-30 operational as raid disk 3 [1107753.713826] md/raid:md9: device dm-29 operational as raid disk 2 [1107753.714066] md/raid:md9: device dm-72 operational as raid disk 1 [1107753.826573] md: md7: resync done. [1107754.072057] md: md11: resync done. [1107754.152377] md: md21: resync done. [1107754.587251] md: md5: resync done. [1107754.650983] md: md33: resync done. [1107754.677966] md/raid:md9: raid level 6 active with 10 out of 10 devices, algorithm 2 [1107754.736473] md9: detected capacity change from 0 to 64011431837696 [1107755.007208] mount.lustre: page allocation failure: order:4, mode:0x1040d0 [1107755.007474] CPU: 35 PID: 207376 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1107755.007966] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1107755.008452] 00000000001040d0 000000000bc20af1 ffff8813fbca7478 ffffffff816a3db1 [1107755.014409] ffff8813fbca7508 ffffffff81188810 ffffffff810f9b11 0000000000000010 [1107755.014910] fffffffffffffff0 001040d000000000 0000000000000018 000000000bc20af1 [1107755.016095] Call Trace: [1107755.017063] [] dump_stack+0x19/0x1b [1107755.018147] [] warn_alloc_failed+0x110/0x180 [1107755.018536] [] ? on_each_cpu_mask+0x51/0x60 [1107755.018785] [] __alloc_pages_slowpath+0x6b6/0x724 [1107755.019032] [] __alloc_pages_nodemask+0x405/0x420 [1107755.019299] [] alloc_pages_current+0x98/0x110 [1107755.019559] [] __get_free_pages+0xe/0x40 [1107755.019821] [] kmalloc_order_trace+0x2e/0xa0 [1107755.020078] [] __kmalloc+0x211/0x230 [1107755.020349] [] ldiskfs_kvmalloc+0x17/0x50 [ldiskfs] [1107755.020621] [] ldiskfs_fill_super+0x12ab/0x2cf0 [ldiskfs] [1107755.021097] [] ? snprintf+0x49/0x70 [1107755.021355] [] mount_bdev+0x1b0/0x1f0 [1107755.021634] [] ? 
ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs] [1107755.022115] [] ldiskfs_mount+0x15/0x20 [ldiskfs] [1107755.022373] [] mount_fs+0x39/0x1b0 [1107755.022622] [] vfs_kern_mount+0x67/0x110 [1107755.022886] [] osd_mount+0x420/0xc10 [osd_ldiskfs] [1107755.023142] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs] [1107755.023670] [] obd_setup+0x114/0x2a0 [obdclass] [1107755.023944] [] class_setup+0x2a8/0x840 [obdclass] [1107755.024211] [] class_process_config+0x1940/0x23f0 [obdclass] [1107755.024713] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs] [1107755.025584] [] ? cfs_expr_list_free+0xf1/0x240 [libcfs] [1107755.025867] [] do_lcfg+0x258/0x500 [obdclass] [1107755.026129] [] lustre_start_simple+0x88/0x210 [obdclass] [1107755.026434] [] server_fill_super+0xf24/0x184c [obdclass] [1107755.026703] [] lustre_fill_super+0x328/0x950 [obdclass] [1107755.026968] [] ? lustre_common_put_super+0x270/0x270 [obdclass] [1107755.027451] [] mount_nodev+0x4d/0xb0 [1107755.027716] [] lustre_mount+0x38/0x60 [obdclass] [1107755.027960] [] mount_fs+0x39/0x1b0 [1107755.028207] [] vfs_kern_mount+0x67/0x110 [1107755.028463] [] do_mount+0x233/0xaf0 [1107755.028993] [] ? __get_free_pages+0xe/0x40 [1107755.029570] [] SyS_mount+0x96/0xf0 [1107755.029818] [] system_call_fastpath+0x16/0x1b [1107755.030061] Mem-Info: [1107755.030311] active_anon:1341161 inactive_anon:326098 isolated_anon:3 active_file:23772315 inactive_file:23770986 isolated_file:32 unevictable:25294 dirty:141 writeback:0 unstable:0 slab_reclaimable:1187400 slab_unreclaimable:981892 mapped:12966 shmem:104818 pagetables:14877 bounce:0 free:6655098 free_pcp:490 free_cma:0 [1107755.032339] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1107755.034519] lowmem_reserve[]: 0 1554 128505 128505 [1107755.034786] Node 0 DMA32 free:517492kB min:12672kB low:15840kB high:19008kB active_anon:47328kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:585544kB slab_unreclaimable:177620kB kernel_stack:320kB pagetables:196kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:410 all_unreclaimable? no [1107755.037405] lowmem_reserve[]: 0 0 126950 126950 [1107755.037671] Node 0 Normal free:15079700kB min:1033836kB low:1292292kB high:1550752kB active_anon:4084836kB inactive_anon:708944kB active_file:46842544kB inactive_file:46839920kB unevictable:96328kB isolated(anon):12kB isolated(file):128kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:312kB writeback:0kB mapped:29472kB shmem:277108kB slab_reclaimable:1295768kB slab_unreclaimable:1751448kB kernel_stack:46352kB pagetables:31832kB unstable:0kB bounce:0kB free_pcp:3144kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:43 all_unreclaimable? 
no [1107755.040344] lowmem_reserve[]: 0 0 0 0 [1107755.040861] Node 1 Normal free:11077464kB min:1050512kB low:1313140kB high:1575768kB active_anon:1235000kB inactive_anon:529920kB active_file:48211028kB inactive_file:48207788kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:252kB writeback:0kB mapped:22384kB shmem:142164kB slab_reclaimable:2868288kB slab_unreclaimable:1999012kB kernel_stack:13344kB pagetables:27480kB unstable:0kB bounce:0kB free_pcp:356kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1107755.043470] lowmem_reserve[]: 0 0 0 0 [1107755.043740] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1107755.044328] Node 0 DMA32: 4143*4kB (UEM) 7807*8kB (UEM) 6030*16kB (UEM) 376*32kB (UEM) 2034*64kB (UEM) 1038*128kB (UEM) 140*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 518164kB [1107755.045883] Node 0 Normal: 645309*4kB (UEM) 906928*8kB (UE) 323310*16kB (UEM) 2826*32kB (UEM) 133*64kB (EM) 24*128kB (M) 4*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 15112660kB [1107755.046793] Node 1 Normal: 167663*4kB (UEM) 521640*8kB (UE) 382793*16kB (UE) 3599*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 11083628kB [1107755.047968] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107755.048451] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107755.048933] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107755.049423] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107755.050230] 47615266 total pagecache pages [1107755.050481] 1847 pages in swap cache [1107755.050721] Swap cache stats: add 264099, delete 262252, find 2320/2658 [1107755.050967] Free swap = 3152396kB [1107755.051213] Total swap = 4194300kB [1107755.051811] 67052113 pages RAM [1107755.052297] 0 pages HighMem/MovableOnly [1107755.052540] 1126685 pages reserved [1107755.392142] LDISKFS-fs warning (device md5): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait. [1107755.762983] LDISKFS-fs warning (device md33): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait. [1107755.846801] mount.lustre: page allocation failure: order:4, mode:0x1040d0 [1107755.847050] CPU: 16 PID: 207441 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1107755.847534] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1107755.848007] 00000000001040d0 0000000093638054 ffff8836687ef478 ffffffff816a3db1 [1107755.848501] ffff8836687ef508 ffffffff81188810 0000000000000000 00000000ffffffff [1107755.848997] fffffffffffffff0 001040d000000000 ffff8836687ef4d8 0000000093638054 [1107755.849482] Call Trace: [1107755.849722] [] dump_stack+0x19/0x1b [1107755.849975] [] warn_alloc_failed+0x110/0x180 [1107755.850217] [] __alloc_pages_slowpath+0x6b6/0x724 [1107755.850472] [] __alloc_pages_nodemask+0x405/0x420 [1107755.850727] [] alloc_pages_current+0x98/0x110 [1107755.850977] [] __get_free_pages+0xe/0x40 [1107755.851223] [] kmalloc_order_trace+0x2e/0xa0 [1107755.851471] [] __kmalloc+0x211/0x230 [1107755.851753] [] ldiskfs_kvmalloc+0x17/0x50 [ldiskfs] [1107755.852014] [] ldiskfs_fill_super+0x12ab/0x2cf0 [ldiskfs] [1107755.852499] [] ? 
snprintf+0x49/0x70 [1107755.852745] [] mount_bdev+0x1b0/0x1f0 [1107755.853002] [] ? ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs] [1107755.853489] [] ldiskfs_mount+0x15/0x20 [ldiskfs] [1107755.853734] [] mount_fs+0x39/0x1b0 [1107755.853980] [] vfs_kern_mount+0x67/0x110 [1107755.854245] [] osd_mount+0x420/0xc10 [osd_ldiskfs] [1107755.854503] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs] [1107755.855022] [] obd_setup+0x114/0x2a0 [obdclass] [1107755.855293] [] class_setup+0x2a8/0x840 [obdclass] [1107755.855563] [] class_process_config+0x1940/0x23f0 [obdclass] [1107755.856049] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs] [1107755.856537] [] ? cfs_expr_list_free+0xf1/0x240 [libcfs] [1107755.856816] [] do_lcfg+0x258/0x500 [obdclass] [1107755.857077] [] lustre_start_simple+0x88/0x210 [obdclass] [1107755.857353] [] server_fill_super+0xf24/0x184c [obdclass] [1107755.857607] [] lustre_fill_super+0x328/0x950 [obdclass] [1107755.857857] [] ? lustre_common_put_super+0x270/0x270 [obdclass] [1107755.858405] [] mount_nodev+0x4d/0xb0 [1107755.858652] [] lustre_mount+0x38/0x60 [obdclass] [1107755.858890] [] mount_fs+0x39/0x1b0 [1107755.859139] [] vfs_kern_mount+0x67/0x110 [1107755.859390] [] do_mount+0x233/0xaf0 [1107755.859632] [] ? __get_free_pages+0xe/0x40 [1107755.859882] [] SyS_mount+0x96/0xf0 [1107755.860125] [] system_call_fastpath+0x16/0x1b [1107755.860374] Mem-Info: [1107755.860613] active_anon:1325960 inactive_anon:326101 isolated_anon:0 active_file:23047346 inactive_file:23047288 isolated_file:64 unevictable:25294 dirty:143 writeback:0 unstable:0 slab_reclaimable:1187056 slab_unreclaimable:993922 mapped:12949 shmem:104820 pagetables:14573 bounce:0 free:7852867 free_pcp:891 free_cma:0 [1107755.867406] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1107755.868870] lowmem_reserve[]: 0 1554 128505 128505 [1107755.869128] Node 0 DMA32 free:520408kB min:12672kB low:15840kB high:19008kB active_anon:46080kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:584904kB slab_unreclaimable:176596kB kernel_stack:320kB pagetables:140kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:3280 all_unreclaimable? no [1107755.870806] lowmem_reserve[]: 0 0 126950 126950 [1107755.871067] Node 0 Normal free:17222880kB min:1033836kB low:1292292kB high:1550752kB active_anon:4084908kB inactive_anon:708932kB active_file:45399108kB inactive_file:45398744kB unevictable:96328kB isolated(anon):0kB isolated(file):128kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:312kB writeback:0kB mapped:29468kB shmem:277112kB slab_reclaimable:1295704kB slab_unreclaimable:1742092kB kernel_stack:46704kB pagetables:31852kB unstable:0kB bounce:0kB free_pcp:2824kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:32 all_unreclaimable? 
no [1107755.873048] lowmem_reserve[]: 0 0 0 0 [1107755.873303] Node 1 Normal free:13722376kB min:1050512kB low:1313140kB high:1575768kB active_anon:1179908kB inactive_anon:529944kB active_file:46751400kB inactive_file:46750428kB unevictable:4696kB isolated(anon):0kB isolated(file):128kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:260kB writeback:0kB mapped:22320kB shmem:142168kB slab_reclaimable:2867616kB slab_unreclaimable:2057000kB kernel_stack:13552kB pagetables:26300kB unstable:0kB bounce:0kB free_pcp:2360kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1107755.875227] lowmem_reserve[]: 0 0 0 0 [1107755.875486] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1107755.876051] Node 0 DMA32: 4307*4kB (UEM) 7875*8kB (UEM) 6038*16kB (UEM) 324*32kB (UEM) 2075*64kB (UEM) 1038*128kB (UEM) 140*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520452kB [1107755.876867] Node 0 Normal: 757687*4kB (UEM) 1061885*8kB (UE) 352908*16kB (UEM) 2447*32kB (UM) 3*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 17250852kB [1107755.877678] Node 1 Normal: 176929*4kB (UE) 660272*8kB (UE) 475956*16kB (UE) 4043*32kB (UE) 1*64kB (U) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 13734628kB [1107755.878475] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107755.878947] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107755.879426] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107755.879893] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107755.880367] 46169270 total pagecache pages [1107755.880604] 1847 pages in swap cache [1107755.880839] Swap cache stats: add 264099, delete 262252, find 2320/2658 [1107755.881082] Free swap = 3152396kB [1107755.881324] Total swap = 4194300kB [1107755.881560] 67052113 pages RAM [1107755.881792] 0 pages HighMem/MovableOnly [1107755.882029] 1126685 pages reserved [1107756.008989] md: md27: resync done. [1107756.064916] md: md31: resync done. [1107756.222180] mount.lustre: page allocation failure: order:4, mode:0x1040d0 [1107756.222435] CPU: 34 PID: 207484 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1107756.224131] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1107756.225432] 00000000001040d0 000000006cdad6f0 ffff883b09647478 ffffffff816a3db1 [1107756.225908] ffff883b09647508 ffffffff81188810 ffffffff810f9b11 0000000000000010 [1107756.226435] fffffffffffffff0 001040d000000000 0000000000000018 000000006cdad6f0 [1107756.227183] Call Trace: [1107756.227946] [] dump_stack+0x19/0x1b [1107756.228574] [] warn_alloc_failed+0x110/0x180 [1107756.229480] [] ? on_each_cpu_mask+0x51/0x60 [1107756.229720] [] __alloc_pages_slowpath+0x6b6/0x724 [1107756.229977] [] __alloc_pages_nodemask+0x405/0x420 [1107756.230220] [] alloc_pages_current+0x98/0x110 [1107756.230464] [] __get_free_pages+0xe/0x40 [1107756.231038] [] kmalloc_order_trace+0x2e/0xa0 [1107756.231903] [] __kmalloc+0x211/0x230 [1107756.232157] [] ldiskfs_kvmalloc+0x17/0x50 [ldiskfs] [1107756.232685] [] ldiskfs_fill_super+0x12ab/0x2cf0 [ldiskfs] [1107756.233691] [] ? snprintf+0x49/0x70 [1107756.233929] [] mount_bdev+0x1b0/0x1f0 [1107756.234186] [] ? 
ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs] [1107756.234678] [] ldiskfs_mount+0x15/0x20 [ldiskfs] [1107756.235191] [] mount_fs+0x39/0x1b0 [1107756.235982] [] vfs_kern_mount+0x67/0x110 [1107756.236527] [] osd_mount+0x420/0xc10 [osd_ldiskfs] [1107756.236791] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs] [1107756.237559] [] obd_setup+0x114/0x2a0 [obdclass] [1107756.238092] [] class_setup+0x2a8/0x840 [obdclass] [1107756.238361] [] class_process_config+0x1940/0x23f0 [obdclass] [1107756.239366] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs] [1107756.240363] [] ? cfs_expr_list_free+0xf1/0x240 [libcfs] [1107756.240619] [] do_lcfg+0x258/0x500 [obdclass] [1107756.240888] [] lustre_start_simple+0x88/0x210 [obdclass] [1107756.241158] [] server_fill_super+0xf24/0x184c [obdclass] [1107756.241669] [] lustre_fill_super+0x328/0x950 [obdclass] [1107756.241941] [] ? lustre_common_put_super+0x270/0x270 [obdclass] [1107756.242783] [] mount_nodev+0x4d/0xb0 [1107756.243322] [] lustre_mount+0x38/0x60 [obdclass] [1107756.243564] [] mount_fs+0x39/0x1b0 [1107756.244618] [] vfs_kern_mount+0x67/0x110 [1107756.244857] [] do_mount+0x233/0xaf0 [1107756.245097] [] ? __get_free_pages+0xe/0x40 [1107756.245344] [] SyS_mount+0x96/0xf0 [1107756.245582] [] system_call_fastpath+0x16/0x1b [1107756.245819] Mem-Info: [1107756.246318] active_anon:1337355 inactive_anon:326101 isolated_anon:32 active_file:22716610 inactive_file:22713355 isolated_file:0 unevictable:25294 dirty:143 writeback:0 unstable:0 slab_reclaimable:1186615 slab_unreclaimable:1034041 mapped:12941 shmem:104820 pagetables:14798 bounce:0 free:8183582 free_pcp:2165 free_cma:0 [1107756.249013] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1107756.250959] lowmem_reserve[]: 0 1554 128505 128505 [1107756.252001] Node 0 DMA32 free:520452kB min:12672kB low:15840kB high:19008kB active_anon:45876kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:584616kB slab_unreclaimable:177044kB kernel_stack:320kB pagetables:140kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:410 all_unreclaimable? no [1107756.254536] lowmem_reserve[]: 0 0 126950 126950 [1107756.255824] Node 0 Normal free:17729640kB min:1033836kB low:1292292kB high:1550752kB active_anon:4120612kB inactive_anon:708932kB active_file:44731332kB inactive_file:44728876kB unevictable:96328kB isolated(anon):128kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:312kB writeback:0kB mapped:29468kB shmem:277112kB slab_reclaimable:1295668kB slab_unreclaimable:1755068kB kernel_stack:46624kB pagetables:33504kB unstable:0kB bounce:0kB free_pcp:3012kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1107756.258597] lowmem_reserve[]: 0 0 0 0 [1107756.260041] Node 1 Normal free:14541892kB min:1050512kB low:1313140kB high:1575768kB active_anon:1177892kB inactive_anon:529944kB active_file:46099300kB inactive_file:46095936kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:260kB writeback:0kB mapped:22288kB shmem:142168kB slab_reclaimable:2866176kB slab_unreclaimable:2204052kB kernel_stack:13488kB pagetables:25548kB unstable:0kB bounce:0kB free_pcp:2164kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:96 all_unreclaimable? no [1107756.262448] lowmem_reserve[]: 0 0 0 0 [1107756.263874] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1107756.264958] Node 0 DMA32: 4354*4kB (UEM) 7890*8kB (UEM) 6039*16kB (UEM) 322*32kB (UEM) 2075*64kB (UEM) 1038*128kB (UEM) 140*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520712kB [1107756.265757] Node 0 Normal: 568943*4kB (UEM) 1193904*8kB (UEM) 369782*16kB (UEM) 722*32kB (UEM) 14*64kB (EM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 17767516kB [1107756.267077] Node 1 Normal: 147966*4kB (UEM) 679934*8kB (UEM) 525440*16kB (UEM) 4212*32kB (UEM) 24*64kB (U) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 14574696kB [1107756.268637] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107756.269367] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107756.269828] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107756.270289] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107756.271508] 45492588 total pagecache pages [1107756.271738] 1847 pages in swap cache [1107756.279667] Swap cache stats: add 264099, delete 262252, find 2320/2658 [1107756.279912] Free swap = 3152396kB [1107756.280145] Total swap = 4194300kB [1107756.282019] 67052113 pages RAM [1107756.283394] 0 pages HighMem/MovableOnly [1107756.284001] 1126685 pages reserved [1107756.341995] mount.lustre: page allocation failure: order:4, mode:0x1040d0 [1107756.342245] CPU: 7 PID: 207573 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1107756.342730] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1107756.343195] 00000000001040d0 00000000e555dde4 ffff88289edef478 ffffffff816a3db1 [1107756.343689] ffff88289edef508 ffffffff81188810 ffffffff810c13e6 ffff88407ffd7000 [1107756.344176] ffff88289edef4a8 ffffffff816a978a ffff88289edef508 00000000e555dde4 [1107756.344671] Call Trace: [1107756.344920] [] dump_stack+0x19/0x1b [1107756.345175] [] warn_alloc_failed+0x110/0x180 [1107756.345437] [] ? __cond_resched+0x26/0x30 [1107756.345689] [] ? _cond_resched+0x3a/0x50 [1107756.345941] [] __alloc_pages_slowpath+0x6b6/0x724 [1107756.346188] [] __alloc_pages_nodemask+0x405/0x420 [1107756.346442] [] alloc_pages_current+0x98/0x110 [1107756.346690] [] __get_free_pages+0xe/0x40 [1107756.346944] [] kmalloc_order_trace+0x2e/0xa0 [1107756.347194] [] __kmalloc+0x211/0x230 [1107756.347480] [] ldiskfs_kvmalloc+0x17/0x50 [ldiskfs] [1107756.347746] [] ldiskfs_fill_super+0x12ab/0x2cf0 [ldiskfs] [1107756.348226] [] ? snprintf+0x49/0x70 [1107756.348474] [] mount_bdev+0x1b0/0x1f0 [1107756.348733] [] ? 
ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs] [1107756.349225] [] ldiskfs_mount+0x15/0x20 [ldiskfs] [1107756.349487] [] mount_fs+0x39/0x1b0 [1107756.349736] [] vfs_kern_mount+0x67/0x110 [1107756.349999] [] osd_mount+0x420/0xc10 [osd_ldiskfs] [1107756.350258] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs] [1107756.350779] [] obd_setup+0x114/0x2a0 [obdclass] [1107756.351047] [] class_setup+0x2a8/0x840 [obdclass] [1107756.351326] [] class_process_config+0x1940/0x23f0 [obdclass] [1107756.351816] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs] [1107756.352294] [] ? cfs_expr_list_free+0xf1/0x240 [libcfs] [1107756.352572] [] do_lcfg+0x258/0x500 [obdclass] [1107756.352839] [] lustre_start_simple+0x88/0x210 [obdclass] [1107756.353122] [] server_fill_super+0xf24/0x184c [obdclass] [1107756.353403] [] lustre_fill_super+0x328/0x950 [obdclass] [1107756.353678] [] ? lustre_common_put_super+0x270/0x270 [obdclass] [1107756.354161] [] mount_nodev+0x4d/0xb0 [1107756.354434] [] lustre_mount+0x38/0x60 [obdclass] [1107756.354683] [] mount_fs+0x39/0x1b0 [1107756.354929] [] vfs_kern_mount+0x67/0x110 [1107756.355179] [] do_mount+0x233/0xaf0 [1107756.355434] [] ? __get_free_pages+0xe/0x40 [1107756.355683] [] SyS_mount+0x96/0xf0 [1107756.355927] [] system_call_fastpath+0x16/0x1b [1107756.356262] Mem-Info: [1107756.356513] active_anon:1340505 inactive_anon:326101 isolated_anon:32 active_file:22626481 inactive_file:22624744 isolated_file:0 unevictable:25294 dirty:143 writeback:0 unstable:0 slab_reclaimable:1186485 slab_unreclaimable:1035180 mapped:12941 shmem:104820 pagetables:15362 bounce:0 free:8340138 free_pcp:629 free_cma:0 [1107756.357967] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1107756.359443] lowmem_reserve[]: 0 1554 128505 128505 [1107756.359711] Node 0 DMA32 free:520452kB min:12672kB low:15840kB high:19008kB active_anon:45876kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:584616kB slab_unreclaimable:177044kB kernel_stack:320kB pagetables:140kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:22960 all_unreclaimable? no [1107756.361413] lowmem_reserve[]: 0 0 126950 126950 [1107756.361684] Node 0 Normal free:18043276kB min:1033836kB low:1292292kB high:1550752kB active_anon:4105996kB inactive_anon:708932kB active_file:44557892kB inactive_file:44555996kB unevictable:96328kB isolated(anon):128kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:312kB writeback:0kB mapped:29468kB shmem:277112kB slab_reclaimable:1295668kB slab_unreclaimable:1755068kB kernel_stack:46624kB pagetables:33504kB unstable:0kB bounce:0kB free_pcp:924kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1107756.363631] lowmem_reserve[]: 0 0 0 0 [1107756.363898] Node 1 Normal free:14819680kB min:1050512kB low:1313140kB high:1575768kB active_anon:1216700kB inactive_anon:529944kB active_file:45925616kB inactive_file:45920444kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:260kB writeback:0kB mapped:22288kB shmem:142168kB slab_reclaimable:2865656kB slab_unreclaimable:2208096kB kernel_stack:13488kB pagetables:27804kB unstable:0kB bounce:0kB free_pcp:1416kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1107756.365820] lowmem_reserve[]: 0 0 0 0 [1107756.366087] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1107756.366673] Node 0 DMA32: 4354*4kB (UEM) 7890*8kB (UEM) 6039*16kB (UEM) 322*32kB (UEM) 2075*64kB (UEM) 1038*128kB (UEM) 140*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520712kB [1107756.367509] Node 0 Normal: 572127*4kB (UEM) 1218899*8kB (UEM) 373665*16kB (UE) 800*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18043940kB [1107756.368310] Node 1 Normal: 182228*4kB (UE) 681068*8kB (UEM) 531905*16kB (UE) 3997*32kB (U) 21*64kB (U) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 14817184kB [1107756.369108] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107756.369592] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107756.370068] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107756.370640] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107756.371111] 45348392 total pagecache pages [1107756.371353] 1847 pages in swap cache [1107756.371594] Swap cache stats: add 264099, delete 262252, find 2320/2658 [1107756.371840] Free swap = 3152396kB [1107756.372082] Total swap = 4194300kB [1107756.372328] 67052113 pages RAM [1107756.372567] 0 pages HighMem/MovableOnly [1107756.372806] 1126685 pages reserved [1107756.401015] mount.lustre: page allocation failure: order:4, mode:0x1040d0 [1107756.401264] CPU: 46 PID: 207664 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1107756.401741] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1107756.402201] 00000000001040d0 00000000ca1d9436 ffff881b60427478 ffffffff816a3db1 [1107756.402689] ffff881b60427508 ffffffff81188810 0000000000000000 ffff88207ffdb000 [1107756.403165] 0000000000000004 00000000001040d0 ffff881b60427508 00000000ca1d9436 [1107756.403656] Call Trace: [1107756.403902] [] dump_stack+0x19/0x1b [1107756.404149] [] warn_alloc_failed+0x110/0x180 [1107756.404396] [] __alloc_pages_slowpath+0x6b6/0x724 [1107756.404640] [] __alloc_pages_nodemask+0x405/0x420 [1107756.404883] [] alloc_pages_current+0x98/0x110 [1107756.405123] [] __get_free_pages+0xe/0x40 [1107756.405368] [] kmalloc_order_trace+0x2e/0xa0 [1107756.405609] [] __kmalloc+0x211/0x230 [1107756.405870] [] ldiskfs_kvmalloc+0x17/0x50 [ldiskfs] [1107756.406121] [] ldiskfs_fill_super+0x12ab/0x2cf0 [ldiskfs] [1107756.406592] [] ? snprintf+0x49/0x70 [1107756.406841] [] mount_bdev+0x1b0/0x1f0 [1107756.407092] [] ? 
[1107756.401015] mount.lustre: page allocation failure: order:4, mode:0x1040d0
[1107756.401264] CPU: 46 PID: 207664 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1
[1107756.401741] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017
[1107756.402201] 00000000001040d0 00000000ca1d9436 ffff881b60427478 ffffffff816a3db1
[1107756.402689] ffff881b60427508 ffffffff81188810 0000000000000000 ffff88207ffdb000
[1107756.403165] 0000000000000004 00000000001040d0 ffff881b60427508 00000000ca1d9436
[1107756.403656] Call Trace:
[1107756.403902] [] dump_stack+0x19/0x1b
[1107756.404149] [] warn_alloc_failed+0x110/0x180
[1107756.404396] [] __alloc_pages_slowpath+0x6b6/0x724
[1107756.404640] [] __alloc_pages_nodemask+0x405/0x420
[1107756.404883] [] alloc_pages_current+0x98/0x110
[1107756.405123] [] __get_free_pages+0xe/0x40
[1107756.405368] [] kmalloc_order_trace+0x2e/0xa0
[1107756.405609] [] __kmalloc+0x211/0x230
[1107756.405870] [] ldiskfs_kvmalloc+0x17/0x50 [ldiskfs]
[1107756.406121] [] ldiskfs_fill_super+0x12ab/0x2cf0 [ldiskfs]
[1107756.406592] [] ? snprintf+0x49/0x70
[1107756.406841] [] mount_bdev+0x1b0/0x1f0
[1107756.407092] [] ? ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs]
[1107756.407583] [] ldiskfs_mount+0x15/0x20 [ldiskfs]
[1107756.407828] [] mount_fs+0x39/0x1b0
[1107756.408071] [] vfs_kern_mount+0x67/0x110
[1107756.408335] [] osd_mount+0x420/0xc10 [osd_ldiskfs]
[1107756.408590] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs]
[1107756.409120] [] obd_setup+0x114/0x2a0 [obdclass]
[1107756.409392] [] class_setup+0x2a8/0x840 [obdclass]
[1107756.409665] [] class_process_config+0x1940/0x23f0 [obdclass]
[1107756.410160] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs]
[1107756.410649] [] ? cfs_expr_list_free+0xf1/0x240 [libcfs]
[1107756.410925] [] do_lcfg+0x258/0x500 [obdclass]
[1107756.411207] [] lustre_start_simple+0x88/0x210 [obdclass]
[1107756.411498] [] server_fill_super+0xf24/0x184c [obdclass]
[1107756.411769] [] lustre_fill_super+0x328/0x950 [obdclass]
[1107756.412034] [] ? lustre_common_put_super+0x270/0x270 [obdclass]
[1107756.412501] [] mount_nodev+0x4d/0xb0
[1107756.412757] [] lustre_mount+0x38/0x60 [obdclass]
[1107756.412996] [] mount_fs+0x39/0x1b0
[1107756.413331] [] vfs_kern_mount+0x67/0x110
[1107756.413570] [] do_mount+0x233/0xaf0
[1107756.419127] [] ? __get_free_pages+0xe/0x40
[1107756.419385] [] SyS_mount+0x96/0xf0
[1107756.419395] LDISKFS-fs warning (device md1): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107756.420345] [] system_call_fastpath+0x16/0x1b
[1107756.420589] Mem-Info:
[1107756.420830] active_anon:1340587 inactive_anon:326103 isolated_anon:32 active_file:22606057 inactive_file:22604125 isolated_file:0 unevictable:25294 dirty:143 writeback:0 unstable:0 slab_reclaimable:1186485 slab_unreclaimable:1037147 mapped:12941 shmem:104820 pagetables:15590 bounce:0 free:8352768 free_pcp:1113 free_cma:0
[1107756.422247] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1107756.423688] lowmem_reserve[]: 0 1554 128505 128505
[1107756.423945] Node 0 DMA32 free:520432kB min:12672kB low:15840kB high:19008kB active_anon:45196kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:584616kB slab_unreclaimable:177812kB kernel_stack:320kB pagetables:140kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:1640 all_unreclaimable? no
[1107756.425630] lowmem_reserve[]: 0 0 126950 126950
[1107756.425899] Node 0 Normal free:18003332kB min:1033836kB low:1292292kB high:1550752kB active_anon:4110028kB inactive_anon:708932kB active_file:44526156kB inactive_file:44523244kB unevictable:96328kB isolated(anon):128kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:312kB writeback:0kB mapped:29468kB shmem:277112kB slab_reclaimable:1295668kB slab_unreclaimable:1753020kB kernel_stack:46624kB pagetables:34256kB unstable:0kB bounce:0kB free_pcp:2332kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107756.427898] lowmem_reserve[]: 0 0 0 0
[1107756.428153] Node 1 Normal free:14889196kB min:1050512kB low:1313140kB high:1575768kB active_anon:1213172kB inactive_anon:529952kB active_file:45887000kB inactive_file:45881508kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:260kB writeback:0kB mapped:22288kB shmem:142168kB slab_reclaimable:2865656kB slab_unreclaimable:2217244kB kernel_stack:13456kB pagetables:27964kB unstable:0kB bounce:0kB free_pcp:1328kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107756.430057] lowmem_reserve[]: 0 0 0 0
[1107756.430323] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB
[1107756.430895] Node 0 DMA32: 4409*4kB (UEM) 7959*8kB (UEM) 6039*16kB (UEM) 300*32kB (UEM) 2074*64kB (UEM) 1038*128kB (UEM) 140*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520716kB
[1107756.431708] Node 0 Normal: 547363*4kB (UEM) 1226217*8kB (UE) 375642*16kB (UEM) 245*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18017300kB
[1107756.432495] Node 1 Normal: 204241*4kB (UE) 681394*8kB (UE) 532446*16kB (UE) 3546*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 14900724kB
[1107756.433279] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107756.433754] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107756.434225] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107756.434701] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107756.435167] 45303272 total pagecache pages
[1107756.435406] 1847 pages in swap cache
[1107756.435640] Swap cache stats: add 264099, delete 262252, find 2320/2658
[1107756.435881] Free swap = 3152396kB
[1107756.436116] Total swap = 4194300kB
[1107756.436356] 67052113 pages RAM
[1107756.436589] 0 pages HighMem/MovableOnly
[1107756.436825] 1126685 pages reserved
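The buddy-allocator lines in these dumps show why the failures happen: each column "N*SkB" means N free blocks of S KiB (S = 4 KiB * 2^order), and an order:4 allocation needs one block from the 64kB column or larger. Both Normal zones above report 0*64kB and beyond despite roughly 18 GB and 14.9 GB free, i.e. the free memory is fragmented into sub-64 KiB pieces; only the tiny DMA zone, effectively off-limits to these allocations via lowmem_reserve, still holds larger blocks. The printed totals can be re-derived from the counts; a small self-contained check using the "Node 0 DMA" line (the code and its names are the editor's, only the counts come from the log):

    #include <stdio.h>

    /* Recompute the "Node 0 DMA" buddy totals printed above:
     * 0*4kB 1*8kB 1*16kB 0*32kB 2*64kB 1*128kB 0*256kB 1*512kB
     * 1*1024kB 2*2048kB 2*4096kB = 14104kB */
    int main(void)
    {
            static const unsigned long count[] =
                    { 0, 1, 1, 0, 2, 1, 0, 1, 1, 2, 2 };
            unsigned long order, kb = 0, big = 0;

            for (order = 0; order < 11; order++) {
                    unsigned long block_kb = 4UL << order;
                    kb += count[order] * block_kb;
                    if (block_kb >= 64)     /* large enough for order:4 */
                            big += count[order];
            }
            printf("total free: %lukB, blocks >= 64kB: %lu\n", kb, big);
            /* prints: total free: 14104kB, blocks >= 64kB: 9 */
            return 0;
    }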
[1107756.463724] md: md19: resync done.
[1107757.222161] LDISKFS-fs warning (device md15): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107757.332224] LDISKFS-fs warning (device md11): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107757.396359] LDISKFS-fs warning (device md23): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107757.412821] LDISKFS-fs warning (device md19): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107757.588109] mount.lustre: page allocation failure: order:4, mode:0x1040d0
[1107757.588362] CPU: 12 PID: 207809 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1
[1107757.588843] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017
[1107757.589310] 00000000001040d0 00000000a3024599 ffff881924cdb478 ffffffff816a3db1
[1107757.589790] ffff881924cdb508 ffffffff81188810 0000000000000000 00000000ffffffff
[1107757.590272] fffffffffffffff0 001040d000000000 ffff881924cdb4d8 00000000a3024599
[1107757.590758] Call Trace:
[1107757.590999] [] dump_stack+0x19/0x1b
[1107757.591249] [] warn_alloc_failed+0x110/0x180
[1107757.591491] [] __alloc_pages_slowpath+0x6b6/0x724
[1107757.591732] [] __alloc_pages_nodemask+0x405/0x420
[1107757.591974] [] alloc_pages_current+0x98/0x110
[1107757.592214] [] __get_free_pages+0xe/0x40
[1107757.592459] [] kmalloc_order_trace+0x2e/0xa0
[1107757.592700] [] __kmalloc+0x211/0x230
[1107757.592959] [] ldiskfs_kvmalloc+0x17/0x50 [ldiskfs]
[1107757.593211] [] ldiskfs_fill_super+0x12ab/0x2cf0 [ldiskfs]
[1107757.593676] [] ? snprintf+0x49/0x70
[1107757.593915] [] mount_bdev+0x1b0/0x1f0
[1107757.594160] [] ? ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs]
[1107757.594629] [] ldiskfs_mount+0x15/0x20 [ldiskfs]
[1107757.594963] [] mount_fs+0x39/0x1b0
[1107757.595204] [] vfs_kern_mount+0x67/0x110
[1107757.595467] [] osd_mount+0x420/0xc10 [osd_ldiskfs]
[1107757.595720] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs]
[1107757.596221] [] obd_setup+0x114/0x2a0 [obdclass]
[1107757.596492] [] class_setup+0x2a8/0x840 [obdclass]
[1107757.596753] [] class_process_config+0x1940/0x23f0 [obdclass]
[1107757.597224] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs]
[1107757.597697] [] ? cfs_expr_list_free+0xf1/0x240 [libcfs]
[1107757.597954] [] do_lcfg+0x258/0x500 [obdclass]
[1107757.598206] [] lustre_start_simple+0x88/0x210 [obdclass]
[1107757.598474] [] server_fill_super+0xf24/0x184c [obdclass]
[1107757.598734] [] lustre_fill_super+0x328/0x950 [obdclass]
[1107757.598989] [] ? lustre_common_put_super+0x270/0x270 [obdclass]
[1107757.599458] [] mount_nodev+0x4d/0xb0
[1107757.599714] [] lustre_mount+0x38/0x60 [obdclass]
[1107757.599958] [] mount_fs+0x39/0x1b0
[1107757.600200] [] vfs_kern_mount+0x67/0x110
[1107757.600447] [] do_mount+0x233/0xaf0
[1107757.600690] [] ? __get_free_pages+0xe/0x40
[1107757.600931] [] SyS_mount+0x96/0xf0
[1107757.601174] [] system_call_fastpath+0x16/0x1b
[1107757.601415] Mem-Info:
[1107757.601656] active_anon:1323828 inactive_anon:326107 isolated_anon:0 active_file:21697119 inactive_file:21695578 isolated_file:32 unevictable:25294 dirty:151 writeback:0 unstable:0 slab_reclaimable:1185977 slab_unreclaimable:1076662 mapped:12978 shmem:104823 pagetables:15338 bounce:0 free:9732560 free_pcp:1421 free_cma:0
[1107757.603064] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1107757.604491] lowmem_reserve[]: 0 1554 128505 128505
[1107757.604748] Node 0 DMA32 free:520472kB min:12672kB low:15840kB high:19008kB active_anon:44132kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:584168kB slab_unreclaimable:179316kB kernel_stack:304kB pagetables:132kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:17630 all_unreclaimable? no
[1107757.606437] lowmem_reserve[]: 0 0 126950 126950
[1107757.606714] Node 0 Normal free:20183564kB min:1033836kB low:1292292kB high:1550752kB active_anon:4126164kB inactive_anon:708876kB active_file:42804092kB inactive_file:42805124kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:336kB writeback:0kB mapped:29512kB shmem:277116kB slab_reclaimable:1295564kB slab_unreclaimable:1770516kB kernel_stack:46192kB pagetables:33860kB unstable:0kB bounce:0kB free_pcp:3068kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107757.608625] lowmem_reserve[]: 0 0 0 0
[1107757.608879] Node 1 Normal free:18163892kB min:1050512kB low:1313140kB high:1575768kB active_anon:1130056kB inactive_anon:530024kB active_file:43983552kB inactive_file:43980516kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:268kB writeback:0kB mapped:22392kB shmem:142176kB slab_reclaimable:2864176kB slab_unreclaimable:2362448kB kernel_stack:13296kB pagetables:27360kB unstable:0kB bounce:0kB free_pcp:3916kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107757.609124] mount.lustre: page allocation failure: order:4, mode:0x1040d0
[1107757.609127] CPU: 46 PID: 207932 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1
[1107757.609128] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017
[1107757.609131] 00000000001040d0 0000000026e7e46c ffff881accc9f478 ffffffff816a3db1
[1107757.609133] ffff881accc9f508 ffffffff81188810 0000000000000000 00000000ffffffff
[1107757.609135] fffffffffffffff0 001040d000000000 ffff881accc9f4d8 0000000026e7e46c
[1107757.609135] Call Trace:
[1107757.609145] [] dump_stack+0x19/0x1b
[1107757.609152] [] warn_alloc_failed+0x110/0x180
[1107757.609154] [] __alloc_pages_slowpath+0x6b6/0x724
[1107757.609157] [] __alloc_pages_nodemask+0x405/0x420
[1107757.609160] [] alloc_pages_current+0x98/0x110
[1107757.609163] [] __get_free_pages+0xe/0x40
[1107757.609167] [] kmalloc_order_trace+0x2e/0xa0
[1107757.609170] [] __kmalloc+0x211/0x230
[1107757.609193] [] ldiskfs_kvmalloc+0x17/0x50 [ldiskfs]
[1107757.609205] [] ldiskfs_fill_super+0x12ab/0x2cf0 [ldiskfs]
[1107757.609209] [] ? snprintf+0x49/0x70
[1107757.609212] [] mount_bdev+0x1b0/0x1f0
[1107757.609222] [] ? ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs]
[1107757.609232] [] ldiskfs_mount+0x15/0x20 [ldiskfs]
[1107757.609234] [] mount_fs+0x39/0x1b0
[1107757.609244] [] vfs_kern_mount+0x67/0x110
[1107757.609265] [] osd_mount+0x420/0xc10 [osd_ldiskfs]
[1107757.609274] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs]
[1107757.609309] [] obd_setup+0x114/0x2a0 [obdclass]
[1107757.609330] [] class_setup+0x2a8/0x840 [obdclass]
[1107757.609349] [] class_process_config+0x1940/0x23f0 [obdclass]
[1107757.609361] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs]
[1107757.609369] [] ? cfs_expr_list_free+0xf1/0x240 [libcfs]
[1107757.609389] [] do_lcfg+0x258/0x500 [obdclass]
[1107757.609409] [] lustre_start_simple+0x88/0x210 [obdclass]
[1107757.609433] [] server_fill_super+0xf24/0x184c [obdclass]
[1107757.609452] [] lustre_fill_super+0x328/0x950 [obdclass]
[1107757.609470] [] ? lustre_common_put_super+0x270/0x270 [obdclass]
[1107757.609473] [] mount_nodev+0x4d/0xb0
[1107757.609491] [] lustre_mount+0x38/0x60 [obdclass]
[1107757.609493] [] mount_fs+0x39/0x1b0
[1107757.609496] [] vfs_kern_mount+0x67/0x110
[1107757.609498] [] do_mount+0x233/0xaf0
[1107757.609501] [] ? __get_free_pages+0xe/0x40
[1107757.609504] [] SyS_mount+0x96/0xf0
[1107757.609506] [] system_call_fastpath+0x16/0x1b
[1107757.609508] Mem-Info:
[1107757.609516] active_anon:1325466 inactive_anon:326107 isolated_anon:0 active_file:21697119 inactive_file:21697122 isolated_file:0 unevictable:25294 dirty:151 writeback:0 unstable:0 slab_reclaimable:1185977 slab_unreclaimable:1078198 mapped:12978 shmem:104823 pagetables:15338 bounce:0 free:9719698 free_pcp:210 free_cma:0
[1107757.609523] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1107757.609525] lowmem_reserve[]: 0 1554 128505 128505
[1107757.609531] Node 0 DMA32 free:520472kB min:12672kB low:15840kB high:19008kB active_anon:44132kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:584168kB slab_unreclaimable:179316kB kernel_stack:304kB pagetables:132kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107757.609533] lowmem_reserve[]: 0 0 126950 126950
[1107757.609539] Node 0 Normal free:20179820kB min:1033836kB low:1292292kB high:1550752kB active_anon:4127676kB inactive_anon:708876kB active_file:42804092kB inactive_file:42807140kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:336kB writeback:0kB mapped:29512kB shmem:277116kB slab_reclaimable:1295564kB slab_unreclaimable:1771028kB kernel_stack:46192kB pagetables:33860kB unstable:0kB bounce:0kB free_pcp:688kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107757.609541] lowmem_reserve[]: 0 0 0 0
[1107757.609547] Node 1 Normal free:18164396kB min:1050512kB low:1313140kB high:1575768kB active_anon:1130056kB inactive_anon:530024kB active_file:43983552kB inactive_file:43980516kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:268kB writeback:0kB mapped:22392kB shmem:142176kB slab_reclaimable:2864176kB slab_unreclaimable:2362448kB kernel_stack:13296kB pagetables:27360kB unstable:0kB bounce:0kB free_pcp:128kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107757.609549] lowmem_reserve[]: 0 0 0 0
[1107757.609558] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB
[1107757.609568] Node 0 DMA32: 4493*4kB (UEM) 8002*8kB (UEM) 6055*16kB (UEM) 325*32kB (UEM) 2043*64kB (UEM) 1038*128kB (UEM) 140*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520468kB
[1107757.609575] Node 0 Normal: 368691*4kB (UE) 1447179*8kB (UE) 445882*16kB (UEM) 41*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 20187620kB
[1107757.609582] Node 1 Normal: 371514*4kB (UEM) 777097*8kB (UEM) 645856*16kB (UEM) 4156*32kB (UEM) 40*64kB (UM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18172080kB
[1107757.609584] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107757.609585] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107757.609586] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107757.609586] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107757.609587] 43501200 total pagecache pages
[1107757.609588] 1847 pages in swap cache
[1107757.609589] Swap cache stats: add 264099, delete 262252, find 2320/2658
[1107757.609590] Free swap = 3152396kB
[1107757.609591] Total swap = 4194300kB
[1107757.609592] 67052113 pages RAM
[1107757.609592] 0 pages HighMem/MovableOnly
[1107757.609593] 1126685 pages reserved
[1107757.644877] lowmem_reserve[]: 0 0 0 0
[1107757.645132] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB
[1107757.645705] Node 0 DMA32: 4493*4kB (UEM) 8002*8kB (UEM) 6055*16kB (UEM) 326*32kB (UEM) 2043*64kB (UEM) 1038*128kB (UEM) 140*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520500kB
[1107757.646513] Node 0 Normal: 385520*4kB (UEM) 1447216*8kB (UE) 445612*16kB (U) 500*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 20265600kB
[1107757.647290] Node 1 Normal: 351482*4kB (UEM) 777206*8kB (UEM) 648874*16kB (UEM) 4169*32kB (UEM) 37*64kB (U) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18141336kB
[1107757.648069] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107757.648544] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107757.649018] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107757.649487] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107757.649951] 43469992 total pagecache pages
[1107757.650183] 1847 pages in swap cache
[1107757.650419] Swap cache stats: add 264099, delete 262252, find 2320/2658
[1107757.650659] Free swap = 3152396kB
[1107757.650890] Total swap = 4194300kB
[1107757.651122] 67052113 pages RAM
[1107757.651353] 0 pages HighMem/MovableOnly
[1107757.651582] 1126685 pages reserved
[1107758.196595] mount.lustre: page allocation failure: order:4, mode:0x1040d0
[1107758.196852] CPU: 10 PID: 208225 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1
[1107758.197335] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017
[1107758.197804] 00000000001040d0 00000000a15cfb73 ffff883087967478 ffffffff816a3db1
[1107758.198301] ffff883087967508 ffffffff81188810 0000000000000000 00000000ffffffff
[1107758.198789] fffffffffffffff0 001040d000000000 ffff8830879674d8 00000000a15cfb73
[1107758.199271] Call Trace:
[1107758.199510] [] dump_stack+0x19/0x1b
[1107758.199753] [] warn_alloc_failed+0x110/0x180
[1107758.200005] [] __alloc_pages_slowpath+0x6b6/0x724
[1107758.200259] [] __alloc_pages_nodemask+0x405/0x420
[1107758.200506] [] alloc_pages_current+0x98/0x110
[1107758.200751] [] __get_free_pages+0xe/0x40
[1107758.200995] [] kmalloc_order_trace+0x2e/0xa0
[1107758.201245] [] __kmalloc+0x211/0x230
[1107758.201505] [] ldiskfs_kvmalloc+0x17/0x50 [ldiskfs]
[1107758.201758] [] ldiskfs_fill_super+0x12ab/0x2cf0 [ldiskfs]
[1107758.202232] [] ? snprintf+0x49/0x70
[1107758.202476] [] mount_bdev+0x1b0/0x1f0
[1107758.202725] [] ? ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs]
[1107758.203202] [] ldiskfs_mount+0x15/0x20 [ldiskfs]
[1107758.203451] [] mount_fs+0x39/0x1b0
[1107758.203695] [] vfs_kern_mount+0x67/0x110
[1107758.203954] [] osd_mount+0x420/0xc10 [osd_ldiskfs]
[1107758.204208] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs]
[1107758.210094] [] obd_setup+0x114/0x2a0 [obdclass]
[1107758.210723] [] class_setup+0x2a8/0x840 [obdclass]
[1107758.210984] [] class_process_config+0x1940/0x23f0 [obdclass]
[1107758.211871] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs]
[1107758.212356] [] ? cfs_expr_list_free+0xf1/0x240 [libcfs]
[1107758.212618] [] do_lcfg+0x258/0x500 [obdclass]
[1107758.212876] [] lustre_start_simple+0x88/0x210 [obdclass]
[1107758.213136] [] server_fill_super+0xf24/0x184c [obdclass]
[1107758.213400] [] lustre_fill_super+0x328/0x950 [obdclass]
[1107758.213664] [] ? lustre_common_put_super+0x270/0x270 [obdclass]
[1107758.214132] [] mount_nodev+0x4d/0xb0
[1107758.214394] [] lustre_mount+0x38/0x60 [obdclass]
[1107758.214635] [] mount_fs+0x39/0x1b0
[1107758.214877] [] vfs_kern_mount+0x67/0x110
[1107758.215118] [] do_mount+0x233/0xaf0
[1107758.215367] [] SyS_mount+0x96/0xf0
[1107758.215610] [] system_call_fastpath+0x16/0x1b
[1107758.215851] Mem-Info:
[1107758.216090] active_anon:1310326 inactive_anon:326119 isolated_anon:0 active_file:21370013 inactive_file:21367608 isolated_file:0 unevictable:25294 dirty:157 writeback:0 unstable:0 slab_reclaimable:1185560 slab_unreclaimable:1096333 mapped:13022 shmem:104825 pagetables:14349 bounce:0 free:10078678 free_pcp:920 free_cma:0
[1107758.217796] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1107758.219250] lowmem_reserve[]: 0 1554 128505 128505
[1107758.219514] Node 0 DMA32 free:520444kB min:12672kB low:15840kB high:19008kB active_anon:44064kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:583752kB slab_unreclaimable:179764kB kernel_stack:304kB pagetables:132kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:60498 all_unreclaimable? no
[1107758.221621] lowmem_reserve[]: 0 0 126950 126950
[1107758.221877] Node 0 Normal free:20855180kB min:1033836kB low:1292292kB high:1550752kB active_anon:4112392kB inactive_anon:708964kB active_file:42167096kB inactive_file:42160796kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:352kB writeback:0kB mapped:29460kB shmem:277128kB slab_reclaimable:1295548kB slab_unreclaimable:1741464kB kernel_stack:45536kB pagetables:32052kB unstable:0kB bounce:0kB free_pcp:2432kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107758.224127] lowmem_reserve[]: 0 0 0 0
[1107758.224390] Node 1 Normal free:18947752kB min:1050512kB low:1313140kB high:1575768kB active_anon:1087368kB inactive_anon:529984kB active_file:43298300kB inactive_file:43295776kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:276kB writeback:0kB mapped:22620kB shmem:142172kB slab_reclaimable:2862940kB slab_unreclaimable:2464104kB kernel_stack:13216kB pagetables:25212kB unstable:0kB bounce:0kB free_pcp:984kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107758.226304] lowmem_reserve[]: 0 0 0 0
[1107758.226562] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB
[1107758.227136] Node 0 DMA32: 4495*4kB (UEM) 8002*8kB (UEM) 6055*16kB (UEM) 329*32kB (UEM) 2043*64kB (UEM) 1038*128kB (UEM) 140*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520604kB
[1107758.227951] Node 0 Normal: 364396*4kB (UEM) 1481097*8kB (UE) 472166*16kB (UM) 6*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 20861208kB
[1107758.228732] Node 1 Normal: 303719*4kB (UE) 863875*8kB (UE) 669391*16kB (UEM) 4029*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18965060kB
[1107758.229767] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107758.230242] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107758.230707] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107758.231176] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107758.231650] 42830734 total pagecache pages
[1107758.231886] 1847 pages in swap cache
[1107758.232122] Swap cache stats: add 264099, delete 262252, find 2320/2658
[1107758.232367] Free swap = 3152396kB
[1107758.232600] Total swap = 4194300kB
[1107758.232835] 67052113 pages RAM
[1107758.233070] 0 pages HighMem/MovableOnly
[1107758.233308] 1126685 pages reserved
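Each report also prints the allocation mode as a raw GFP mask. Decoding it with the flag values used by 3.10-era kernels (an assumption tied to this 3.10.0-693 build; later kernels renumbered the bits): the low 0xd0 is GFP_KERNEL (__GFP_WAIT|__GFP_IO|__GFP_FS), 0x4000 is __GFP_COMP added on the kmalloc_order_trace() path, and the mode:0x10c0d0 seen in a later trace differs only by 0x8000, __GFP_ZERO, matching its ldiskfs_kvzalloc() caller. The remaining high bit is build-specific and left undecoded here; the macros below are local re-definitions for the demo, not kernel headers:

    #include <stdio.h>

    /* GFP bit values as on 3.10-era kernels (assumption). */
    #define GFP_WAIT 0x10u
    #define GFP_IO   0x40u
    #define GFP_FS   0x80u
    #define GFP_COMP 0x4000u
    #define GFP_ZERO 0x8000u

    static void decode(unsigned int mode)
    {
            printf("mode:%#x =%s%s%s%s%s, undecoded %#x\n", mode,
                   mode & GFP_WAIT ? " __GFP_WAIT" : "",
                   mode & GFP_IO   ? " __GFP_IO"   : "",
                   mode & GFP_FS   ? " __GFP_FS"   : "",
                   mode & GFP_COMP ? " __GFP_COMP" : "",
                   mode & GFP_ZERO ? " __GFP_ZERO" : "",
                   mode & ~(GFP_WAIT | GFP_IO | GFP_FS | GFP_COMP | GFP_ZERO));
    }

    int main(void)
    {
            decode(0x1040d0);   /* the ldiskfs_kvmalloc reports */
            decode(0x10c0d0);   /* the ldiskfs_kvzalloc report further on */
            return 0;
    }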
[1107758.413839] mount.lustre: page allocation failure: order:4, mode:0x1040d0
[1107758.414096] CPU: 11 PID: 208273 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1
[1107758.414592] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017
[1107758.415072] 00000000001040d0 000000008b6b6863 ffff88249cecf478 ffffffff816a3db1
[1107758.415570] ffff88249cecf508 ffffffff81188810 0000000000000000 00000000ffffffff
[1107758.416059] fffffffffffffff0 001040d000000000 ffff88249cecf4d8 000000008b6b6863
[1107758.416558] Call Trace:
[1107758.416806] [] dump_stack+0x19/0x1b
[1107758.417066] [] warn_alloc_failed+0x110/0x180
[1107758.417320] [] __alloc_pages_slowpath+0x6b6/0x724
[1107758.417573] [] __alloc_pages_nodemask+0x405/0x420
[1107758.417823] [] alloc_pages_current+0x98/0x110
[1107758.418073] [] __get_free_pages+0xe/0x40
[1107758.418324] [] kmalloc_order_trace+0x2e/0xa0
[1107758.418570] [] __kmalloc+0x211/0x230
[1107758.418832] [] ldiskfs_kvmalloc+0x17/0x50 [ldiskfs]
[1107758.419088] [] ldiskfs_fill_super+0x12ab/0x2cf0 [ldiskfs]
[1107758.419215] LDISKFS-fs warning (device md27): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107758.420277] [] ? snprintf+0x49/0x70
[1107758.420616] [] mount_bdev+0x1b0/0x1f0
[1107758.420866] [] ? ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs]
[1107758.421340] [] ldiskfs_mount+0x15/0x20 [ldiskfs]
[1107758.421583] [] mount_fs+0x39/0x1b0
[1107758.421836] [] vfs_kern_mount+0x67/0x110
[1107758.422092] [] osd_mount+0x420/0xc10 [osd_ldiskfs]
[1107758.422347] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs]
[1107758.422856] [] obd_setup+0x114/0x2a0 [obdclass]
[1107758.423124] [] class_setup+0x2a8/0x840 [obdclass]
[1107758.423399] [] class_process_config+0x1940/0x23f0 [obdclass]
[1107758.423892] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs]
[1107758.424374] [] ? cfs_expr_list_free+0xf1/0x240 [libcfs]
[1107758.424638] [] do_lcfg+0x258/0x500 [obdclass]
[1107758.424898] [] lustre_start_simple+0x88/0x210 [obdclass]
[1107758.425167] [] server_fill_super+0xf24/0x184c [obdclass]
[1107758.425431] [] lustre_fill_super+0x328/0x950 [obdclass]
[1107758.425692] [] ? lustre_common_put_super+0x270/0x270 [obdclass]
[1107758.426170] [] mount_nodev+0x4d/0xb0
[1107758.426431] [] lustre_mount+0x38/0x60 [obdclass]
[1107758.426677] [] mount_fs+0x39/0x1b0
[1107758.426924] [] vfs_kern_mount+0x67/0x110
[1107758.427171] [] do_mount+0x233/0xaf0
[1107758.427419] [] ? __get_free_pages+0xe/0x40
[1107758.427666] [] SyS_mount+0x96/0xf0
[1107758.427909] [] system_call_fastpath+0x16/0x1b
[1107758.428416] Mem-Info:
[1107758.428660] active_anon:1305538 inactive_anon:326119 isolated_anon:0 active_file:21280085 inactive_file:21276925 isolated_file:0 unevictable:25294 dirty:157 writeback:0 unstable:0 slab_reclaimable:1185312 slab_unreclaimable:1110731 mapped:13022 shmem:104825 pagetables:13973 bounce:0 free:10178976 free_pcp:1543 free_cma:0
[1107758.430101] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1107758.431573] lowmem_reserve[]: 0 1554 128505 128505
[1107758.431718] LDISKFS-fs warning (device md17): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107758.432819] Node 0 DMA32 free:520476kB min:12672kB low:15840kB high:19008kB active_anon:44064kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:583272kB slab_unreclaimable:180212kB kernel_stack:304kB pagetables:132kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:72980 all_unreclaimable? no
[1107758.434517] lowmem_reserve[]: 0 0 126950 126950
[1107758.434879] Node 0 Normal free:21155616kB min:1033836kB low:1292292kB high:1550752kB active_anon:4124488kB inactive_anon:708964kB active_file:41999936kB inactive_file:41998108kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:352kB writeback:0kB mapped:29460kB shmem:277128kB slab_reclaimable:1295548kB slab_unreclaimable:1758776kB kernel_stack:45536kB pagetables:32052kB unstable:0kB bounce:0kB free_pcp:4748kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107758.436803] lowmem_reserve[]: 0 0 0 0
[1107758.437061] Node 1 Normal free:19012896kB min:1050512kB low:1313140kB high:1575768kB active_anon:1059648kB inactive_anon:529984kB active_file:43119572kB inactive_file:43115816kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:276kB writeback:0kB mapped:22620kB shmem:142172kB slab_reclaimable:2862428kB slab_unreclaimable:2503936kB kernel_stack:13216kB pagetables:24460kB unstable:0kB bounce:0kB free_pcp:3572kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107758.444440] lowmem_reserve[]: 0 0 0 0
[1107758.444708] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB
[1107758.446063] Node 0 DMA32: 4495*4kB (UEM) 8002*8kB (UEM) 6055*16kB (UEM) 344*32kB (UEM) 2038*64kB (UEM) 1038*128kB (UEM) 140*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520764kB
[1107758.447401] Node 0 Normal: 373114*4kB (UE) 1506524*8kB (UEM) 475711*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 21156024kB
[1107758.447960] Node 1 Normal: 236993*4kB (UE) 879163*8kB (UE) 680964*16kB (UEM) 4293*32kB (UEM) 1*64kB (U) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 19014140kB
[1107758.448747] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107758.449309] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107758.450028] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107758.450766] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107758.451243] 42663038 total pagecache pages
[1107758.451481] 1847 pages in swap cache
[1107758.451717] Swap cache stats: add 264099, delete 262252, find 2320/2658
[1107758.451959] Free swap = 3152396kB
[1107758.452200] Total swap = 4194300kB
[1107758.452437] 67052113 pages RAM
[1107758.452671] 0 pages HighMem/MovableOnly
[1107758.452909] 1126685 pages reserved
[1107758.456625] LDISKFS-fs warning (device md3): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107759.080575] LDISKFS-fs warning (device md25): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
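The recurring "MMP interval 42 higher than expected, please wait." lines are multi-mount protection at work: before this failover server may mount a device, ldiskfs must confirm that no other node still has it mounted, by watching the on-disk MMP sequence number stay quiet for a multiple of the advertised update interval (42 s here, hence the warning and the slow mounts). A runnable userspace toy of that handshake follows; the struct, the kmmpd name, and the "2 * interval + 1" wait are illustrative simplifications, not the exact ldiskfs formula or format:

    #include <stdio.h>
    #include <unistd.h>
    #include <pthread.h>

    /* Toy MMP handshake: one thread plays the node that owns the
     * disk and keeps bumping the sequence; main() plays the node
     * trying to mount, which must see the sequence stay unchanged. */
    struct mmp { unsigned seq; unsigned check_interval; };
    static struct mmp disk = { 1, 2 };   /* 2 s for the demo; 42 s in the log */

    static void *kmmpd(void *arg)
    {
            (void)arg;
            for (int i = 0; i < 4; i++) {
                    sleep(disk.check_interval);
                    __sync_fetch_and_add(&disk.seq, 1);
            }
            return NULL;
    }

    int main(void)
    {
            pthread_t t;
            pthread_create(&t, NULL, kmmpd, NULL);

            unsigned before = disk.seq;
            unsigned wait = 2 * disk.check_interval + 1;
            printf("MMP interval %u: waiting %u s before mounting...\n",
                   disk.check_interval, wait);
            sleep(wait);
            if (disk.seq != before)
                    printf("MMP seq moved (%u -> %u): device busy elsewhere, abort\n",
                           before, disk.seq);
            else
                    printf("MMP quiet: safe to claim the device\n");
            pthread_join(t, NULL);
            return 0;
    }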
[1107759.408331] mount.lustre: page allocation failure: order:4, mode:0x1040d0
[1107759.408583] CPU: 37 PID: 208379 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1
[1107759.409068] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017
[1107759.409559] 00000000001040d0 0000000093765308 ffff883f641c3478 ffffffff816a3db1
[1107759.410399] ffff883f641c3508 ffffffff81188810 0000000000000000 00000000ffffffff
[1107759.410883] fffffffffffffff0 001040d000000000 ffff883f641c34d8 0000000093765308
[1107759.411394] Call Trace:
[1107759.411642] [] dump_stack+0x19/0x1b
[1107759.411889] [] warn_alloc_failed+0x110/0x180
[1107759.412138] [] __alloc_pages_slowpath+0x6b6/0x724
[1107759.412723] [] __alloc_pages_nodemask+0x405/0x420
[1107759.412971] [] alloc_pages_current+0x98/0x110
[1107759.413230] [] __get_free_pages+0xe/0x40
[1107759.413479] [] kmalloc_order_trace+0x2e/0xa0
[1107759.414087] [] __kmalloc+0x211/0x230
[1107759.414364] [] ldiskfs_kvmalloc+0x17/0x50 [ldiskfs]
[1107759.414627] [] ldiskfs_fill_super+0x12ab/0x2cf0 [ldiskfs]
[1107759.415099] [] ? snprintf+0x49/0x70
[1107759.415348] [] mount_bdev+0x1b0/0x1f0
[1107759.415606] [] ? ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs]
[1107759.416093] [] ldiskfs_mount+0x15/0x20 [ldiskfs]
[1107759.416345] [] mount_fs+0x39/0x1b0
[1107759.416900] [] vfs_kern_mount+0x67/0x110
[1107759.417219] [] osd_mount+0x420/0xc10 [osd_ldiskfs]
[1107759.417468] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs]
[1107759.418350] [] obd_setup+0x114/0x2a0 [obdclass]
[1107759.418648] [] class_setup+0x2a8/0x840 [obdclass]
[1107759.418915] [] class_process_config+0x1940/0x23f0 [obdclass]
[1107759.419413] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs]
[1107759.419892] [] ? cfs_expr_list_free+0xf1/0x240 [libcfs]
[1107759.420188] [] do_lcfg+0x258/0x500 [obdclass]
[1107759.420453] [] lustre_start_simple+0x88/0x210 [obdclass]
[1107759.420725] [] server_fill_super+0xf24/0x184c [obdclass]
[1107759.420991] [] lustre_fill_super+0x328/0x950 [obdclass]
[1107759.421592] [] ? lustre_common_put_super+0x270/0x270 [obdclass]
[1107759.422408] [] mount_nodev+0x4d/0xb0
[1107759.422669] [] lustre_mount+0x38/0x60 [obdclass]
[1107759.422914] [] mount_fs+0x39/0x1b0
[1107759.423165] [] vfs_kern_mount+0x67/0x110
[1107759.423413] [] do_mount+0x233/0xaf0
[1107759.423663] [] ? __get_free_pages+0xe/0x40
[1107759.423911] [] SyS_mount+0x96/0xf0
[1107759.424165] [] system_call_fastpath+0x16/0x1b
[1107759.424418] Mem-Info:
[1107759.424667] active_anon:1301662 inactive_anon:326126 isolated_anon:0 active_file:20805874 inactive_file:20806965 isolated_file:0 unevictable:25294 dirty:175 writeback:0 unstable:0 slab_reclaimable:1184995 slab_unreclaimable:1160140 mapped:13063 shmem:104825 pagetables:14597 bounce:0 free:10756269 free_pcp:1233 free_cma:0
[1107759.426120] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1107759.428233] lowmem_reserve[]: 0 1554 128505 128505
[1107759.428496] Node 0 DMA32 free:520476kB min:12672kB low:15840kB high:19008kB active_anon:44064kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:583016kB slab_unreclaimable:180372kB kernel_stack:304kB pagetables:132kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:10660 all_unreclaimable? no
[1107759.430202] lowmem_reserve[]: 0 0 126950 126950
[1107759.430788] Node 0 Normal free:22354592kB min:1033836kB low:1292292kB high:1550752kB active_anon:4070848kB inactive_anon:708976kB active_file:41087832kB inactive_file:41085872kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:404kB writeback:0kB mapped:29480kB shmem:277120kB slab_reclaimable:1295352kB slab_unreclaimable:1794716kB kernel_stack:45600kB pagetables:32692kB unstable:0kB bounce:0kB free_pcp:2312kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107759.433105] lowmem_reserve[]: 0 0 0 0
[1107759.433375] Node 1 Normal free:20130820kB min:1050512kB low:1313140kB high:1575768kB active_anon:1093248kB inactive_anon:530000kB active_file:42134832kB inactive_file:42141660kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:296kB writeback:0kB mapped:22764kB shmem:142180kB slab_reclaimable:2861612kB slab_unreclaimable:2665984kB kernel_stack:13072kB pagetables:25564kB unstable:0kB bounce:0kB free_pcp:2408kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107759.435307] lowmem_reserve[]: 0 0 0 0
[1107759.435826] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB
[1107759.436401] Node 0 DMA32: 4495*4kB (UEM) 8002*8kB (UEM) 6055*16kB (UEM) 352*32kB (UEM) 2031*64kB (UEM) 1038*128kB (UEM) 140*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520572kB
[1107759.437217] Node 0 Normal: 374154*4kB (UEM) 1563830*8kB (UEM) 517377*16kB (UEM) 2268*32kB (UEM) 15*64kB (UM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22358824kB
[1107759.438015] Node 1 Normal: 150464*4kB (UE) 965918*8kB (UE) 729797*16kB (UEM) 3894*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 20130560kB
[1107759.438801] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107759.439286] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107759.440031] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107759.440522] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107759.441004] 41722617 total pagecache pages
[1107759.441255] 1847 pages in swap cache
[1107759.441501] Swap cache stats: add 264099, delete 262252, find 2320/2658
[1107759.441756] Free swap = 3152396kB
[1107759.441994] Total swap = 4194300kB
[1107759.442239] 67052113 pages RAM
[1107759.442478] 0 pages HighMem/MovableOnly
[1107759.442724] 1126685 pages reserved
[1107759.620743] LDISKFS-fs warning (device md7): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107760.408543] LDISKFS-fs warning (device md29): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107760.791663] LDISKFS-fs warning (device md35): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107760.904327] warn_alloc_failed: 2 callbacks suppressed
[1107760.904578] mount.lustre: page allocation failure: order:4, mode:0x1040d0
[1107760.904831] CPU: 37 PID: 208679 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1
[1107760.905317] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017
[1107760.905790] 00000000001040d0 000000006fe27dde ffff8819f0c47478 ffffffff816a3db1
[1107760.906290] ffff8819f0c47508 ffffffff81188810 ffffffff810f9b11 0000000000000010
[1107760.912318] fffffffffffffff0 001040d000000000 0000000000000018 000000006fe27dde
[1107760.912815] Call Trace:
[1107760.913078] [] dump_stack+0x19/0x1b
[1107760.913345] [] warn_alloc_failed+0x110/0x180
[1107760.913605] [] ? on_each_cpu_mask+0x51/0x60
[1107760.913864] [] __alloc_pages_slowpath+0x6b6/0x724
[1107760.914122] [] __alloc_pages_nodemask+0x405/0x420
[1107760.914376] [] alloc_pages_current+0x98/0x110
[1107760.914632] [] __get_free_pages+0xe/0x40
[1107760.914888] [] kmalloc_order_trace+0x2e/0xa0
[1107760.915148] [] __kmalloc+0x211/0x230
[1107760.915420] [] ldiskfs_kvmalloc+0x17/0x50 [ldiskfs]
[1107760.915683] [] ldiskfs_fill_super+0x12ab/0x2cf0 [ldiskfs]
[1107760.916174] [] ? snprintf+0x49/0x70
[1107760.916425] [] mount_bdev+0x1b0/0x1f0
[1107760.916687] [] ? ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs]
[1107760.917178] [] ldiskfs_mount+0x15/0x20 [ldiskfs]
[1107760.917427] [] mount_fs+0x39/0x1b0
[1107760.917676] [] vfs_kern_mount+0x67/0x110
[1107760.917935] [] osd_mount+0x420/0xc10 [osd_ldiskfs]
[1107760.918198] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs]
[1107760.918721] [] obd_setup+0x114/0x2a0 [obdclass]
[1107760.918996] [] class_setup+0x2a8/0x840 [obdclass]
[1107760.919273] [] class_process_config+0x1940/0x23f0 [obdclass]
[1107760.919783] [] ? __cond_resched+0x26/0x30
[1107760.920050] [] ? _cond_resched+0x3a/0x50
[1107760.920329] [] do_lcfg+0x258/0x500 [obdclass]
[1107760.920594] [] lustre_start_simple+0x88/0x210 [obdclass]
[1107760.920870] [] server_fill_super+0xf24/0x184c [obdclass]
[1107760.921146] [] lustre_fill_super+0x328/0x950 [obdclass]
[1107760.921414] [] ? lustre_common_put_super+0x270/0x270 [obdclass]
[1107760.921895] [] mount_nodev+0x4d/0xb0
[1107760.922164] [] lustre_mount+0x38/0x60 [obdclass]
[1107760.922422] [] mount_fs+0x39/0x1b0
[1107760.922670] [] vfs_kern_mount+0x67/0x110
[1107760.922922] [] do_mount+0x233/0xaf0
[1107760.923181] [] ? __get_free_pages+0xe/0x40
[1107760.923440] [] SyS_mount+0x96/0xf0
[1107760.923696] [] system_call_fastpath+0x16/0x1b
[1107760.923944] Mem-Info:
[1107760.924197] active_anon:1272296 inactive_anon:328146 isolated_anon:0 active_file:20243337 inactive_file:20242742 isolated_file:0 unevictable:25294 dirty:153 writeback:0 unstable:0 slab_reclaimable:1183340 slab_unreclaimable:1193429 mapped:13031 shmem:106893 pagetables:13512 bounce:0 free:11397936 free_pcp:1524 free_cma:0
[1107760.925654] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1107760.927189] lowmem_reserve[]: 0 1554 128505 128505
[1107760.927453] Node 0 DMA32 free:520444kB min:12672kB low:15840kB high:19008kB active_anon:44064kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:582408kB slab_unreclaimable:181044kB kernel_stack:304kB pagetables:132kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:8708 all_unreclaimable? no
[1107760.929168] lowmem_reserve[]: 0 0 126950 126950
[1107760.929441] Node 0 Normal free:23939660kB min:1033836kB low:1292292kB high:1550752kB active_anon:3990752kB inactive_anon:717032kB active_file:40008956kB inactive_file:40007572kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:356kB writeback:0kB mapped:29464kB shmem:285388kB slab_reclaimable:1295156kB slab_unreclaimable:1796548kB kernel_stack:45344kB pagetables:29996kB unstable:0kB bounce:0kB free_pcp:2564kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107760.931373] lowmem_reserve[]: 0 0 0 0
[1107760.931654] Node 1 Normal free:21142160kB min:1050512kB low:1313140kB high:1575768kB active_anon:1057036kB inactive_anon:530024kB active_file:40950760kB inactive_file:40949696kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:256kB writeback:0kB mapped:22692kB shmem:142184kB slab_reclaimable:2855796kB slab_unreclaimable:2796124kB kernel_stack:13088kB pagetables:24008kB unstable:0kB bounce:0kB free_pcp:1120kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:32 all_unreclaimable? no
[1107760.933590] lowmem_reserve[]: 0 0 0 0
[1107760.933856] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB
[1107760.934462] Node 0 DMA32: 4495*4kB (UEM) 8002*8kB (UEM) 6055*16kB (UEM) 368*32kB (UEM) 2021*64kB (UEM) 1038*128kB (UEM) 140*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520444kB
[1107760.935305] Node 0 Normal: 307705*4kB (UEM) 1713146*8kB (UE) 557111*16kB (UEM) 3246*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23953636kB
[1107760.936100] Node 1 Normal: 97226*4kB (UE) 981977*8kB (UEM) 798273*16kB (UEM) 4181*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 21150880kB
[1107760.936894] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107760.937375] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107760.937854] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107760.938331] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107760.938804] 40584738 total pagecache pages
[1107760.939042] 1845 pages in swap cache
[1107760.939284] Swap cache stats: add 264099, delete 262254, find 2320/2658
[1107760.939531] Free swap = 3152404kB
[1107760.939772] Total swap = 4194300kB
[1107760.940007] 67052113 pages RAM
[1107760.940334] 0 pages HighMem/MovableOnly
[1107760.940568] 1126685 pages reserved
[1107761.225857] LDISKFS-fs warning (device md21): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107761.622444] mount.lustre: page allocation failure: order:4, mode:0x1040d0
[1107761.622698] CPU: 47 PID: 208788 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1
[1107761.623196] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017
[1107761.623793] 00000000001040d0 0000000056e56efb ffff881325047478 ffffffff816a3db1
[1107761.624305] ffff881325047508 ffffffff81188810 0000000000000000 00000000ffffffff
[1107761.624825] fffffffffffffff0 001040d000000000 ffff8813250474d8 0000000056e56efb
[1107761.625341] Call Trace:
[1107761.625603] [] dump_stack+0x19/0x1b
[1107761.625873] [] warn_alloc_failed+0x110/0x180
[1107761.626146] [] __alloc_pages_slowpath+0x6b6/0x724
[1107761.626423] [] __alloc_pages_nodemask+0x405/0x420
[1107761.626684] [] alloc_pages_current+0x98/0x110
[1107761.626951] [] __get_free_pages+0xe/0x40
[1107761.627223] [] kmalloc_order_trace+0x2e/0xa0
[1107761.627481] [] __kmalloc+0x211/0x230
[1107761.627748] [] ldiskfs_kvmalloc+0x17/0x50 [ldiskfs]
[1107761.628001] [] ldiskfs_fill_super+0x12ab/0x2cf0 [ldiskfs]
[1107761.628480] [] ? snprintf+0x49/0x70
[1107761.628722] [] mount_bdev+0x1b0/0x1f0
[1107761.629002] [] ? ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs]
[1107761.629503] [] ldiskfs_mount+0x15/0x20 [ldiskfs]
[1107761.629771] [] mount_fs+0x39/0x1b0
[1107761.630026] [] vfs_kern_mount+0x67/0x110
[1107761.630292] [] osd_mount+0x420/0xc10 [osd_ldiskfs]
[1107761.630563] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs]
[1107761.631112] [] obd_setup+0x114/0x2a0 [obdclass]
[1107761.631379] [] class_setup+0x2a8/0x840 [obdclass]
[1107761.631652] [] class_process_config+0x1940/0x23f0 [obdclass]
[1107761.632162] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs]
[1107761.632666] [] ? cfs_expr_list_free+0xf1/0x240 [libcfs]
[1107761.632940] [] do_lcfg+0x258/0x500 [obdclass]
[1107761.633212] [] lustre_start_simple+0x88/0x210 [obdclass]
[1107761.633490] [] server_fill_super+0xf24/0x184c [obdclass]
[1107761.633757] [] lustre_fill_super+0x328/0x950 [obdclass]
[1107761.634026] [] ? lustre_common_put_super+0x270/0x270 [obdclass]
[1107761.634521] [] mount_nodev+0x4d/0xb0
[1107761.634811] [] lustre_mount+0x38/0x60 [obdclass]
[1107761.635080] [] mount_fs+0x39/0x1b0
[1107761.635327] [] vfs_kern_mount+0x67/0x110
[1107761.635585] [] do_mount+0x233/0xaf0
[1107761.635829] [] ? __get_free_pages+0xe/0x40
[1107761.636077] [] SyS_mount+0x96/0xf0
[1107761.636349] [] system_call_fastpath+0x16/0x1b
[1107761.636596] Mem-Info:
[1107761.636848] active_anon:1279218 inactive_anon:328147 isolated_anon:0 active_file:19950324 inactive_file:19948730 isolated_file:0 unevictable:25294 dirty:153 writeback:0 unstable:0 slab_reclaimable:1182508 slab_unreclaimable:1205690 mapped:13041 shmem:106893 pagetables:14089 bounce:0 free:11805819 free_pcp:1136 free_cma:0
[1107761.638422] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[1107761.645676] lowmem_reserve[]: 0 1554 128505 128505
[1107761.646025] Node 0 DMA32 free:520444kB min:12672kB low:15840kB high:19008kB active_anon:44064kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:581640kB slab_unreclaimable:181716kB kernel_stack:304kB pagetables:132kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:34850 all_unreclaimable? no
[1107761.647767] lowmem_reserve[]: 0 0 126950 126950
[1107761.648030] Node 0 Normal free:25082716kB min:1033836kB low:1292292kB high:1550752kB active_anon:3951376kB inactive_anon:717036kB active_file:39431972kB inactive_file:39430240kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:356kB writeback:0kB mapped:29464kB shmem:285388kB slab_reclaimable:1295156kB slab_unreclaimable:1849444kB kernel_stack:45312kB pagetables:31816kB unstable:0kB bounce:0kB free_pcp:2664kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107761.649994] lowmem_reserve[]: 0 0 0 0
[1107761.650263] Node 1 Normal free:21604752kB min:1050512kB low:1313140kB high:1575768kB active_anon:1121936kB inactive_anon:530024kB active_file:40368492kB inactive_file:40365360kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:256kB writeback:0kB mapped:22692kB shmem:142184kB slab_reclaimable:2853236kB slab_unreclaimable:2791600kB kernel_stack:13088kB pagetables:24408kB unstable:0kB bounce:0kB free_pcp:3532kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[1107761.652294] lowmem_reserve[]: 0 0 0 0
[1107761.652595] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB
[1107761.653225] Node 0 DMA32: 4495*4kB (UEM) 8002*8kB (UEM) 6055*16kB (UEM) 395*32kB (UEM) 2011*64kB (UEM) 1038*128kB (UEM) 140*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520668kB
[1107761.654084] Node 0 Normal: 368484*4kB (UEM) 1767918*8kB (UE) 583463*16kB (UEM) 4131*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 25084880kB
[1107761.654901] Node 1 Normal: 111864*4kB (UEM) 1019066*8kB (UEM) 806582*16kB (UEM) 3098*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 21604432kB
[1107761.655771] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107761.656273] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107761.656772] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[1107761.657282] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[1107761.657786] 40010054 total pagecache pages
[1107761.658026] 1845 pages in swap cache
[1107761.658289] Swap cache stats: add 264099, delete 262254, find 2320/2658
[1107761.658551] Free swap = 3152404kB
[1107761.658803] Total swap = 4194300kB
[1107761.659065] 67052113 pages RAM
[1107761.659313] 0 pages HighMem/MovableOnly
[1107761.659566] 1126685 pages reserved
[1107761.720565] LDISKFS-fs warning (device md13): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107761.856847] LDISKFS-fs warning (device md31): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107762.453377] LDISKFS-fs warning (device md9): ldiskfs_multi_mount_protect:322: MMP interval 42 higher than expected, please wait.
[1107775.177449] Lustre: oak-OST0042: Client 6179bce8-978c-2622-3687-63fc8004f198 (at 10.9.101.38@o2ib4) reconnecting
[1107775.177965] Lustre: Skipped 32710 previous similar messages
[1107782.714904] Lustre: oak-OST0050: haven't heard from client a6b51dbf-e757-4f34-40af-1ce1e2526fd1 (at 10.9.101.54@o2ib4) in 224 seconds. I think it's dead, and I am evicting it. exp ffff881d013c8c00, cur 1519240425 expire 1519240275 last 1519240201
[1107782.715878] Lustre: Skipped 17 previous similar messages
[1107783.609760] Lustre: oak-OST004e: haven't heard from client a6b51dbf-e757-4f34-40af-1ce1e2526fd1 (at 10.9.101.54@o2ib4) in 200 seconds. I think it's dead, and I am evicting it. exp ffff883c92a14c00, cur 1519240426 expire 1519240276 last 1519240226
[1107784.619808] Lustre: oak-OST0040: haven't heard from client a6b51dbf-e757-4f34-40af-1ce1e2526fd1 (at 10.9.101.54@o2ib4) in 201 seconds. I think it's dead, and I am evicting it. exp ffff883cdf363c00, cur 1519240427 expire 1519240277 last 1519240226
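The eviction lines above are internally consistent worked arithmetic: the "in N seconds" figure is cur - last (1519240425 - 1519240201 = 224, 1519240426 - 1519240226 = 200, 1519240427 - 1519240226 = 201), and in all three messages expire = cur - 150, i.e. on this system an export whose last request predates the 150 s expiry point is declared dead and evicted (the same 150 s that appears as the shrunken recovery-window lower bound below).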
[1107797.445307] LDISKFS-fs (md5): file extents enabled, maximum tree depth=5
[1107797.802864] LDISKFS-fs (md33): file extents enabled, maximum tree depth=5
[1107798.489416] LDISKFS-fs (md1): file extents enabled, maximum tree depth=5
[1107798.999457] LDISKFS-fs (md33): recovery complete
[1107799.012514] LDISKFS-fs (md5): recovery complete
[1107799.023156] LDISKFS-fs (md33): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[1107799.029180] LDISKFS-fs (md5): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[1107799.287252] LDISKFS-fs (md15): file extents enabled, maximum tree depth=5
[1107799.432898] LDISKFS-fs (md11): file extents enabled, maximum tree depth=5
[1107799.460563] LDISKFS-fs (md19): file extents enabled, maximum tree depth=5
[1107799.478680] LDISKFS-fs (md23): file extents enabled, maximum tree depth=5
[1107799.754473] Lustre: oak-OST0035: Not available for connect from 10.210.46.66@o2ib3 (not set up)
[1107799.754956] Lustre: Skipped 4 previous similar messages
[1107799.796829] LDISKFS-fs (md1): recovery complete
[1107799.816352] LDISKFS-fs (md1): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[1107799.916946] LDISKFS-fs (md15): recovery complete
[1107799.930999] Lustre: oak-OST0035: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
[1107799.931507] Lustre: Skipped 6 previous similar messages
[1107799.958409] LDISKFS-fs (md15): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[1107800.000128] Lustre: oak-OST0035: Will be in recovery for at least 2:30, or until 1238 clients reconnect
[1107800.000644] Lustre: Skipped 7 previous similar messages
[1107800.400107] LDISKFS-fs (md19): recovery complete
[1107800.420901] LDISKFS-fs (md19): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[1107800.444444] Lustre: oak-OST0035: Denying connection for new client 83517dc1-1f9b-2103-a972-ea78046d1049(at 10.210.44.48@o2ib3), waiting for 1238 known clients (11 recovered, 0 in progress, and 0 evicted) to recover in 14:36
[1107800.445536] Lustre: Skipped 1 previous similar message
[1107800.491575] LDISKFS-fs (md27): file extents enabled, maximum tree depth=5
[1107800.493524] LDISKFS-fs (md17): file extents enabled, maximum tree depth=5
[1107800.542648] LDISKFS-fs (md3): file extents enabled, maximum tree depth=5
[1107800.813219] LDISKFS-fs (md11): recovery complete
[1107800.826621] LDISKFS-fs (md11): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[1107800.833781] LDISKFS-fs (md23): recovery complete
[1107800.848836] LDISKFS-fs (md23): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[1107800.857412] Lustre: oak-OST0031: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
[1107800.857888] Lustre: Skipped 1 previous similar message
[1107800.865547] Lustre: oak-OST0031: Will be in recovery for at least 2:30, or until 1238 clients reconnect
[1107800.866033] Lustre: Skipped 1 previous similar message
[1107801.142870] LDISKFS-fs (md25): file extents enabled, maximum tree depth=5
[1107801.362681] Lustre: oak-OST0031: Denying connection for new client c792776b-8bdd-ae8e-5beb-d99769f537f1(at 10.210.46.126@o2ib3), waiting for 1238 known clients (9 recovered, 0 in progress, and 0 evicted) to recover in 14:37
[1107801.362682] Lustre: oak-OST0035: Denying connection for new client c792776b-8bdd-ae8e-5beb-d99769f537f1(at 10.210.46.126@o2ib3), waiting for 1238 known clients (21 recovered, 0 in progress, and 0 evicted) to recover in 14:35
[1107801.364150] Lustre: Skipped 1 previous similar message
[1107801.413844] LDISKFS-fs (md27): recovery complete
[1107801.497992] LDISKFS-fs (md27): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[1107801.597692] Lustre: oak-OST003f: Not available for connect from 10.210.46.139@o2ib3 (not set up)
[1107801.598166] Lustre: Skipped 7 previous similar messages
[1107801.642428] LDISKFS-fs (md17): recovery complete
[1107801.660928] LDISKFS-fs (md17): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[1107801.747493] LDISKFS-fs (md7): file extents enabled, maximum tree depth=5
[1107801.887375] Lustre: oak-OST003f: Will be in recovery for at least 2:30, or until 1257 clients reconnect
[1107801.911029] LDISKFS-fs (md3): recovery complete
[1107801.924730] LDISKFS-fs (md3): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[1107802.478013] LDISKFS-fs (md29): file extents enabled, maximum tree depth=5
[1107802.583325] LDISKFS-fs (md25): recovery complete
[1107802.601292] LDISKFS-fs (md25): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[1107802.852561] LDISKFS-fs (md35): file extents enabled, maximum tree depth=5
[1107802.885693] Lustre: oak-OST003f: Denying connection for new client 27b94bca-209a-dd33-731e-66c20863c5a5(at 10.9.113.4@o2ib4), waiting for 1257 known clients (21 recovered, 0 in progress, and 0 evicted) to recover in 14:36
[1107802.886427] Lustre: Skipped 3 previous similar messages
[1107803.235648] Lustre: oak-OST0043: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
[1107803.236114] Lustre: Skipped 1 previous similar message
[1107803.297112] LDISKFS-fs (md21): file extents enabled, maximum tree depth=5
[1107803.470108] LDISKFS-fs (md7): recovery complete
[1107803.618760] Lustre: oak-OST003b: Not available for connect from 10.210.44.43@o2ib3 (not set up)
[1107803.619244] Lustre: Skipped 29 previous similar messages
[1107803.620967] LDISKFS-fs (md7): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
[1107803.778398] LDISKFS-fs (md13): file extents enabled, maximum tree depth=5
[1107803.864997] LDISKFS-fs (md29): recovery complete
[1107803.893995] LDISKFS-fs (md29): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
Opts: ,errors=remount-ro,no_mbcache,nodelalloc [1107803.913411] LDISKFS-fs (md31): file extents enabled, maximum tree depth=5 [1107804.052577] Lustre: oak-OST003b: Will be in recovery for at least 2:30, or until 1238 clients reconnect [1107804.058640] Lustre: Skipped 1 previous similar message [1107804.320694] LDISKFS-fs (md35): recovery complete [1107804.341495] LDISKFS-fs (md35): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [1107804.521166] LDISKFS-fs (md9): file extents enabled, maximum tree depth=5 [1107804.667636] mount.lustre: page allocation failure: order:4, mode:0x10c0d0 [1107804.667887] CPU: 37 PID: 208456 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1107804.668378] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1107804.668854] 000000000010c0d0 00000000f4b353cb ffff88341381f408 ffffffff816a3db1 [1107804.669354] ffff88341381f498 ffffffff81188810 0000000000000000 00000000ffffffff [1107804.669846] fffffffffffffff0 0010c0d000000000 ffff88341381f468 00000000f4b353cb [1107804.670331] Call Trace: [1107804.670572] [] dump_stack+0x19/0x1b [1107804.670821] [] warn_alloc_failed+0x110/0x180 [1107804.671065] [] __alloc_pages_slowpath+0x6b6/0x724 [1107804.671314] [] __alloc_pages_nodemask+0x405/0x420 [1107804.671560] [] alloc_pages_current+0x98/0x110 [1107804.671806] [] __get_free_pages+0xe/0x40 [1107804.672050] [] kmalloc_order_trace+0x2e/0xa0 [1107804.672298] [] __kmalloc+0x211/0x230 [1107804.672557] [] ldiskfs_kvzalloc+0x1b/0x50 [ldiskfs] [1107804.672805] [] ldiskfs_mb_alloc_groupinfo+0x64/0xe0 [ldiskfs] [1107804.673273] [] ldiskfs_mb_init+0x4a4/0x730 [ldiskfs] [1107804.673599] [] ldiskfs_fill_super+0x2087/0x2cf0 [ldiskfs] [1107804.674062] [] mount_bdev+0x1b0/0x1f0 [1107804.674307] [] ? ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs] [1107804.674783] [] ldiskfs_mount+0x15/0x20 [ldiskfs] [1107804.675025] [] mount_fs+0x39/0x1b0 [1107804.675273] [] vfs_kern_mount+0x67/0x110 [1107804.675527] [] osd_mount+0x420/0xc10 [osd_ldiskfs] [1107804.675774] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs] [1107804.676280] [] obd_setup+0x114/0x2a0 [obdclass] [1107804.676539] [] class_setup+0x2a8/0x840 [obdclass] [1107804.676797] [] class_process_config+0x1940/0x23f0 [obdclass] [1107804.677280] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs] [1107804.677750] [] ? cfs_expr_list_free+0xf1/0x240 [libcfs] [1107804.678013] [] do_lcfg+0x258/0x500 [obdclass] [1107804.678271] [] lustre_start_simple+0x88/0x210 [obdclass] [1107804.678539] [] server_fill_super+0xf24/0x184c [obdclass] [1107804.678801] [] lustre_fill_super+0x328/0x950 [obdclass] [1107804.679061] [] ? lustre_common_put_super+0x270/0x270 [obdclass] [1107804.679538] [] mount_nodev+0x4d/0xb0 [1107804.679798] [] lustre_mount+0x38/0x60 [obdclass] [1107804.680043] [] mount_fs+0x39/0x1b0 [1107804.680290] [] vfs_kern_mount+0x67/0x110 [1107804.680534] [] do_mount+0x233/0xaf0 [1107804.680777] [] ? 
__get_free_pages+0xe/0x40 [1107804.681021] [] SyS_mount+0x96/0xf0 [1107804.681269] [] system_call_fastpath+0x16/0x1b [1107804.681514] Mem-Info: [1107804.681756] active_anon:1083482 inactive_anon:328210 isolated_anon:0 active_file:20016834 inactive_file:19916124 isolated_file:0 unevictable:25294 dirty:0 writeback:1577 unstable:0 slab_reclaimable:1423124 slab_unreclaimable:1230251 mapped:12514 shmem:106873 pagetables:5775 bounce:0 free:10776074 free_pcp:801 free_cma:0 [1107804.683188] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1107804.684636] lowmem_reserve[]: 0 1554 128505 128505 [1107804.684899] Node 0 DMA32 free:520468kB min:12672kB low:15840kB high:19008kB active_anon:43976kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:581096kB slab_unreclaimable:180532kB kernel_stack:304kB pagetables:132kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:111110 all_unreclaimable? no [1107804.686584] lowmem_reserve[]: 0 0 126950 126950 [1107804.686846] Node 0 Normal free:22793336kB min:1033836kB low:1292292kB high:1550752kB active_anon:3630888kB inactive_anon:716816kB active_file:39460392kB inactive_file:39357072kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:680kB writeback:3008kB mapped:29684kB shmem:285296kB slab_reclaimable:1846928kB slab_unreclaimable:1904580kB kernel_stack:42592kB pagetables:13764kB unstable:0kB bounce:0kB free_pcp:1460kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1107804.688829] lowmem_reserve[]: 0 0 0 0 [1107804.689092] Node 1 Normal free:19776188kB min:1050512kB low:1313140kB high:1575768kB active_anon:659064kB inactive_anon:530496kB active_file:40606112kB inactive_file:40307096kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:0kB writeback:3300kB mapped:20364kB shmem:142196kB slab_reclaimable:3264472kB slab_unreclaimable:2835380kB kernel_stack:11104kB pagetables:9204kB unstable:0kB bounce:0kB free_pcp:2448kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1107804.691017] lowmem_reserve[]: 0 0 0 0 [1107804.691278] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1107804.691851] Node 0 DMA32: 4515*4kB (UEM) 8003*8kB (UEM) 6055*16kB (UEM) 451*32kB (UEM) 2000*64kB (UEM) 1038*128kB (UEM) 134*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520308kB [1107804.692670] Node 0 Normal: 50586*4kB (UEM) 1641578*8kB (UEM) 583102*16kB (UEM) 4318*32kB (UM) 1*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 22802840kB [1107804.693462] Node 1 Normal: 77808*4kB (UE) 831925*8kB (UEM) 798605*16kB (UM) 1239*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 19783960kB [1107804.694245] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107804.694719] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107804.695189] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107804.695663] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107804.696139] 40044106 total pagecache pages [1107804.696375] 1846 pages in swap cache [1107804.696614] Swap cache stats: add 264107, delete 262261, find 2321/2660 [1107804.696859] Free swap = 3152432kB [1107804.697096] Total swap = 4194300kB [1107804.697333] 67052113 pages RAM [1107804.697567] 0 pages HighMem/MovableOnly [1107804.697802] 1126685 pages reserved [1107804.932626] LDISKFS-fs (md21): recovery complete [1107805.024156] LDISKFS-fs (md21): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [1107805.100660] mount.lustre: page allocation failure: order:4, mode:0x10c0d0 [1107805.100910] CPU: 33 PID: 208679 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1107805.101386] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1107805.101852] 000000000010c0d0 000000006fe27dde ffff8819f0c47408 ffffffff816a3db1 [1107805.102337] ffff8819f0c47498 ffffffff81188810 ffffffff8118b530 0000000000000000 [1107805.102823] fffffffffffffff0 0010c0d000000000 ffff8819f0c47468 000000006fe27dde [1107805.103310] Call Trace: [1107805.103555] [] dump_stack+0x19/0x1b [1107805.103799] [] warn_alloc_failed+0x110/0x180 [1107805.104042] [] ? drain_pages+0xb0/0xb0 [1107805.104291] [] __alloc_pages_slowpath+0x6b6/0x724 [1107805.104538] [] __alloc_pages_nodemask+0x405/0x420 [1107805.104784] [] alloc_pages_current+0x98/0x110 [1107805.105027] [] __get_free_pages+0xe/0x40 [1107805.105276] [] kmalloc_order_trace+0x2e/0xa0 [1107805.105521] [] __kmalloc+0x211/0x230 [1107805.105787] [] ldiskfs_kvzalloc+0x1b/0x50 [ldiskfs] [1107805.106040] [] ldiskfs_mb_alloc_groupinfo+0x64/0xe0 [ldiskfs] [1107805.106525] [] ldiskfs_mb_init+0x4a4/0x730 [ldiskfs] [1107805.106782] [] ldiskfs_fill_super+0x2087/0x2cf0 [ldiskfs] [1107805.107257] [] mount_bdev+0x1b0/0x1f0 [1107805.107512] [] ? ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs] [1107805.108011] [] ldiskfs_mount+0x15/0x20 [ldiskfs] [1107805.108278] [] mount_fs+0x39/0x1b0 [1107805.108527] [] vfs_kern_mount+0x67/0x110 [1107805.108785] [] osd_mount+0x420/0xc10 [osd_ldiskfs] [1107805.109041] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs] [1107805.109559] [] obd_setup+0x114/0x2a0 [obdclass] [1107805.109819] [] class_setup+0x2a8/0x840 [obdclass] [1107805.110083] [] class_process_config+0x1940/0x23f0 [obdclass] [1107805.110573] [] ? 
__cond_resched+0x26/0x30 [1107805.110828] [] ? _cond_resched+0x3a/0x50 [1107805.111089] [] do_lcfg+0x258/0x500 [obdclass] [1107805.111345] [] lustre_start_simple+0x88/0x210 [obdclass] [1107805.111608] [] server_fill_super+0xf24/0x184c [obdclass] [1107805.111901] [] lustre_fill_super+0x328/0x950 [obdclass] [1107805.112167] [] ? lustre_common_put_super+0x270/0x270 [obdclass] [1107805.112640] [] mount_nodev+0x4d/0xb0 [1107805.118344] [] lustre_mount+0x38/0x60 [obdclass] [1107805.118599] [] mount_fs+0x39/0x1b0 [1107805.118849] [] vfs_kern_mount+0x67/0x110 [1107805.119102] [] do_mount+0x233/0xaf0 [1107805.119354] [] ? __get_free_pages+0xe/0x40 [1107805.119600] [] SyS_mount+0x96/0xf0 [1107805.119846] [] system_call_fastpath+0x16/0x1b [1107805.120098] Mem-Info: [1107805.120342] active_anon:1090326 inactive_anon:328207 isolated_anon:0 active_file:19861924 inactive_file:19861833 isolated_file:0 unevictable:25294 dirty:372 writeback:564 unstable:0 slab_reclaimable:1441116 slab_unreclaimable:1218782 mapped:12526 shmem:106873 pagetables:5662 bounce:0 free:10944948 free_pcp:380 free_cma:0 [1107805.121789] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1107805.123250] lowmem_reserve[]: 0 1554 128505 128505 [1107805.123517] Node 0 DMA32 free:520436kB min:12672kB low:15840kB high:19008kB active_anon:43976kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:581096kB slab_unreclaimable:180596kB kernel_stack:304kB pagetables:132kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:25010 all_unreclaimable? no [1107805.125198] lowmem_reserve[]: 0 0 126950 126950 [1107805.125461] Node 0 Normal free:23224568kB min:1033836kB low:1292292kB high:1550752kB active_anon:3634276kB inactive_anon:716812kB active_file:39201260kB inactive_file:39199292kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:1064kB writeback:0kB mapped:29732kB shmem:285296kB slab_reclaimable:1848464kB slab_unreclaimable:1885328kB kernel_stack:42544kB pagetables:13388kB unstable:0kB bounce:0kB free_pcp:720kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1107805.127376] lowmem_reserve[]: 0 0 0 0 [1107805.127634] Node 1 Normal free:20085376kB min:1050512kB low:1313140kB high:1575768kB active_anon:658860kB inactive_anon:530488kB active_file:40219096kB inactive_file:40219344kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:424kB writeback:2256kB mapped:20364kB shmem:142196kB slab_reclaimable:3334904kB slab_unreclaimable:2809204kB kernel_stack:11264kB pagetables:9128kB unstable:0kB bounce:0kB free_pcp:2680kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:64 all_unreclaimable? 
no [1107805.129622] lowmem_reserve[]: 0 0 0 0 [1107805.129882] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1107805.130458] Node 0 DMA32: 4515*4kB (UEM) 8003*8kB (UEM) 6055*16kB (UEM) 458*32kB (UEM) 2000*64kB (UEM) 1038*128kB (UEM) 134*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520532kB [1107805.131273] Node 0 Normal: 64213*4kB (UEM) 1672294*8kB (UEM) 592257*16kB (UEM) 4337*32kB (UEM) 2*64kB (U) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23250228kB [1107805.132061] Node 1 Normal: 78774*4kB (UEM) 868586*8kB (UEM) 800251*16kB (UE) 749*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 20091768kB [1107805.132847] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107805.133321] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107805.133798] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107805.134281] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107805.134756] 39810647 total pagecache pages [1107805.134999] 1846 pages in swap cache [1107805.135243] Swap cache stats: add 264107, delete 262261, find 2321/2660 [1107805.135489] Free swap = 3152432kB [1107805.135723] Total swap = 4194300kB [1107805.135960] 67052113 pages RAM [1107805.136200] 0 pages HighMem/MovableOnly [1107805.136440] 1126685 pages reserved [1107805.303598] mount.lustre: page allocation failure: order:4, mode:0x10c0d0 [1107805.303904] CPU: 38 PID: 208543 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1107805.304426] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1107805.304925] 000000000010c0d0 00000000aefc09eb ffff8828ed5a3408 ffffffff816a3db1 [1107805.305434] ffff8828ed5a3498 ffffffff81188810 0000000000000000 00000000ffffffff [1107805.305909] fffffffffffffff0 0010c0d000000000 ffff8828ed5a3468 00000000aefc09eb [1107805.306388] Call Trace: [1107805.306624] [] dump_stack+0x19/0x1b [1107805.306864] [] warn_alloc_failed+0x110/0x180 [1107805.307107] [] __alloc_pages_slowpath+0x6b6/0x724 [1107805.307349] [] __alloc_pages_nodemask+0x405/0x420 [1107805.307590] [] alloc_pages_current+0x98/0x110 [1107805.307829] [] __get_free_pages+0xe/0x40 [1107805.308074] [] kmalloc_order_trace+0x2e/0xa0 [1107805.308317] [] __kmalloc+0x211/0x230 [1107805.308575] [] ldiskfs_kvzalloc+0x1b/0x50 [ldiskfs] [1107805.308825] [] ldiskfs_mb_alloc_groupinfo+0x64/0xe0 [ldiskfs] [1107805.309294] [] ldiskfs_mb_init+0x4a4/0x730 [ldiskfs] [1107805.309543] [] ldiskfs_fill_super+0x2087/0x2cf0 [ldiskfs] [1107805.310020] [] mount_bdev+0x1b0/0x1f0 [1107805.310281] [] ? ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs] [1107805.310836] [] ldiskfs_mount+0x15/0x20 [ldiskfs] [1107805.311078] [] mount_fs+0x39/0x1b0 [1107805.311315] [] vfs_kern_mount+0x67/0x110 [1107805.311593] [] osd_mount+0x420/0xc10 [osd_ldiskfs] [1107805.311837] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs] [1107805.312349] [] obd_setup+0x114/0x2a0 [obdclass] [1107805.312605] [] class_setup+0x2a8/0x840 [obdclass] [1107805.312860] [] class_process_config+0x1940/0x23f0 [obdclass] [1107805.313338] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs] [1107805.313797] [] ? 
cfs_expr_list_free+0xf1/0x240 [libcfs] [1107805.314145] [] do_lcfg+0x258/0x500 [obdclass] [1107805.314393] [] lustre_start_simple+0x88/0x210 [obdclass] [1107805.314648] [] server_fill_super+0xf24/0x184c [obdclass] [1107805.314899] [] lustre_fill_super+0x328/0x950 [obdclass] [1107805.315153] [] ? lustre_common_put_super+0x270/0x270 [obdclass] [1107805.315614] [] mount_nodev+0x4d/0xb0 [1107805.315863] [] lustre_mount+0x38/0x60 [obdclass] [1107805.316104] [] mount_fs+0x39/0x1b0 [1107805.316340] [] vfs_kern_mount+0x67/0x110 [1107805.316579] [] do_mount+0x233/0xaf0 [1107805.316848] [] ? __get_free_pages+0xe/0x40 [1107805.317100] [] SyS_mount+0x96/0xf0 [1107805.317342] [] system_call_fastpath+0x16/0x1b [1107805.317583] Mem-Info: [1107805.317828] active_anon:1083144 inactive_anon:328207 isolated_anon:0 active_file:19783742 inactive_file:19783590 isolated_file:32 unevictable:25294 dirty:372 writeback:188 unstable:0 slab_reclaimable:1457748 slab_unreclaimable:1216725 mapped:12526 shmem:106873 pagetables:5662 bounce:0 free:11054832 free_pcp:1289 free_cma:0 [1107805.319298] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1107805.320739] lowmem_reserve[]: 0 1554 128505 128505 [1107805.320993] Node 0 DMA32 free:520436kB min:12672kB low:15840kB high:19008kB active_anon:43976kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:581096kB slab_unreclaimable:180596kB kernel_stack:304kB pagetables:132kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:6150 all_unreclaimable? no [1107805.322645] lowmem_reserve[]: 0 0 126950 126950 [1107805.322900] Node 0 Normal free:23456512kB min:1033836kB low:1292292kB high:1550752kB active_anon:3630244kB inactive_anon:716812kB active_file:39050212kB inactive_file:39048564kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:1064kB writeback:0kB mapped:29732kB shmem:285296kB slab_reclaimable:1913984kB slab_unreclaimable:1882164kB kernel_stack:42544kB pagetables:13388kB unstable:0kB bounce:0kB free_pcp:2540kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1107805.324786] lowmem_reserve[]: 0 0 0 0 [1107805.325048] Node 1 Normal free:20226720kB min:1050512kB low:1313140kB high:1575768kB active_anon:658356kB inactive_anon:530488kB active_file:40083924kB inactive_file:40085092kB unevictable:4696kB isolated(anon):0kB isolated(file):0kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:424kB writeback:752kB mapped:20364kB shmem:142196kB slab_reclaimable:3336920kB slab_unreclaimable:2804140kB kernel_stack:11264kB pagetables:9128kB unstable:0kB bounce:0kB free_pcp:3468kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1107805.326931] lowmem_reserve[]: 0 0 0 0 [1107805.327189] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1107805.327747] Node 0 DMA32: 4514*4kB (UEM) 8002*8kB (UEM) 6054*16kB (UEM) 460*32kB (UEM) 2000*64kB (UEM) 1038*128kB (UEM) 134*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520568kB [1107805.328634] Node 0 Normal: 65152*4kB (UEM) 1690802*8kB (UEM) 596100*16kB (UEM) 4327*32kB (U) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23463088kB [1107805.329447] Node 1 Normal: 78104*4kB (UE) 868043*8kB (UE) 803610*16kB (UE) 2342*32kB (UM) 419*64kB (U) 123*128kB (UM) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 20232024kB [1107805.335769] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107805.336234] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107805.336695] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107805.337199] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107805.337723] 39663443 total pagecache pages [1107805.337988] 1846 pages in swap cache [1107805.338243] Swap cache stats: add 264107, delete 262261, find 2321/2660 [1107805.338551] Free swap = 3152432kB [1107805.338831] Total swap = 4194300kB [1107805.339104] 67052113 pages RAM [1107805.339335] 0 pages HighMem/MovableOnly [1107805.339570] 1126685 pages reserved [1107805.378076] LDISKFS-fs (md31): recovery complete [1107805.441518] LDISKFS-fs (md31): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [1107805.560062] LDISKFS-fs (md13): recovery complete [1107805.627582] Lustre: oak-OST0041: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 [1107805.628081] Lustre: Skipped 3 previous similar messages [1107805.654536] LDISKFS-fs (md13): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [1107805.838973] mount.lustre: page allocation failure: order:4, mode:0x10c0d0 [1107805.839271] CPU: 17 PID: 208788 Comm: mount.lustre Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1 [1107805.839754] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017 [1107805.840228] 000000000010c0d0 0000000056e56efb ffff881325047408 ffffffff816a3db1 [1107805.840852] ffff881325047498 ffffffff81188810 0000000000000000 ffff88407ffd8000 [1107805.841334] 0000000000000004 000000000010c0d0 ffff881325047498 0000000056e56efb [1107805.841812] Call Trace: [1107805.842098] [] dump_stack+0x19/0x1b [1107805.842367] [] warn_alloc_failed+0x110/0x180 [1107805.842620] [] __alloc_pages_slowpath+0x6b6/0x724 [1107805.842877] [] __alloc_pages_nodemask+0x405/0x420 [1107805.843125] [] alloc_pages_current+0x98/0x110 [1107805.843373] [] __get_free_pages+0xe/0x40 [1107805.843655] [] kmalloc_order_trace+0x2e/0xa0 [1107805.843919] [] __kmalloc+0x211/0x230 [1107805.844186] [] ldiskfs_kvzalloc+0x1b/0x50 [ldiskfs] [1107805.844441] [] ldiskfs_mb_alloc_groupinfo+0x64/0xe0 [ldiskfs] [1107805.844933] [] ldiskfs_mb_init+0x4a4/0x730 [ldiskfs] [1107805.845223] [] ldiskfs_fill_super+0x2087/0x2cf0 [ldiskfs] [1107805.845701] [] mount_bdev+0x1b0/0x1f0 [1107805.845952] [] ? 
ldiskfs_calculate_overhead+0x430/0x430 [ldiskfs] [1107805.846470] [] ldiskfs_mount+0x15/0x20 [ldiskfs] [1107805.846736] [] mount_fs+0x39/0x1b0 [1107805.846985] [] vfs_kern_mount+0x67/0x110 [1107805.847244] [] osd_mount+0x420/0xc10 [osd_ldiskfs] [1107805.847499] [] osd_device_alloc+0x38a/0x770 [osd_ldiskfs] [1107805.848009] [] obd_setup+0x114/0x2a0 [obdclass] [1107805.848272] [] class_setup+0x2a8/0x840 [obdclass] [1107805.848528] [] class_process_config+0x1940/0x23f0 [obdclass] [1107805.849013] [] ? cfs_range_expr_parse+0x1b4/0x480 [libcfs] [1107805.849491] [] ? cfs_expr_list_free+0xf1/0x240 [libcfs] [1107805.849754] [] do_lcfg+0x258/0x500 [obdclass] [1107805.850010] [] lustre_start_simple+0x88/0x210 [obdclass] [1107805.850279] [] server_fill_super+0xf24/0x184c [obdclass] [1107805.850539] [] lustre_fill_super+0x328/0x950 [obdclass] [1107805.850797] [] ? lustre_common_put_super+0x270/0x270 [obdclass] [1107805.851276] [] mount_nodev+0x4d/0xb0 [1107805.851531] [] lustre_mount+0x38/0x60 [obdclass] [1107805.851776] [] mount_fs+0x39/0x1b0 [1107805.852023] [] vfs_kern_mount+0x67/0x110 [1107805.852304] [] do_mount+0x233/0xaf0 [1107805.852571] [] ? __get_free_pages+0xe/0x40 [1107805.852820] [] SyS_mount+0x96/0xf0 [1107805.853069] [] system_call_fastpath+0x16/0x1b [1107805.853315] Mem-Info: [1107805.853560] active_anon:1082805 inactive_anon:328205 isolated_anon:7 active_file:19724675 inactive_file:19719292 isolated_file:25 unevictable:25294 dirty:730 writeback:0 unstable:0 slab_reclaimable:1474254 slab_unreclaimable:1227334 mapped:12547 shmem:106873 pagetables:5548 bounce:0 free:11012457 free_pcp:376 free_cma:0 [1107805.855124] Node 0 DMA free:14104kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15896kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [1107805.856617] lowmem_reserve[]: 0 1554 128505 128505 [1107805.856886] Node 0 DMA32 free:520472kB min:12672kB low:15840kB high:19008kB active_anon:43976kB inactive_anon:65528kB active_file:832kB inactive_file:832kB unevictable:152kB isolated(anon):0kB isolated(file):0kB present:1854168kB managed:1593824kB mlocked:152kB dirty:0kB writeback:0kB mapped:8kB shmem:0kB slab_reclaimable:581000kB slab_unreclaimable:180816kB kernel_stack:304kB pagetables:132kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:50020 all_unreclaimable? no [1107805.858702] lowmem_reserve[]: 0 0 126950 126950 [1107805.858967] Node 0 Normal free:23337516kB min:1033836kB low:1292292kB high:1550752kB active_anon:3628980kB inactive_anon:716824kB active_file:38943448kB inactive_file:38921596kB unevictable:96328kB isolated(anon):0kB isolated(file):0kB present:132120576kB managed:129997548kB mlocked:96328kB dirty:1436kB writeback:0kB mapped:29816kB shmem:285296kB slab_reclaimable:1982216kB slab_unreclaimable:1899448kB kernel_stack:42576kB pagetables:13036kB unstable:0kB bounce:0kB free_pcp:640kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [1107805.861027] lowmem_reserve[]: 0 0 0 0 [1107805.861334] Node 1 Normal free:20177040kB min:1050512kB low:1313140kB high:1575768kB active_anon:658264kB inactive_anon:530468kB active_file:39954420kB inactive_file:39954740kB unevictable:4696kB isolated(anon):28kB isolated(file):100kB present:134217728kB managed:132094444kB mlocked:4696kB dirty:1484kB writeback:0kB mapped:20364kB shmem:142196kB slab_reclaimable:3333800kB slab_unreclaimable:2828624kB kernel_stack:11520kB pagetables:9024kB unstable:0kB bounce:0kB free_pcp:1192kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [1107805.863394] lowmem_reserve[]: 0 0 0 0 [1107805.863661] Node 0 DMA: 0*4kB 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 0*256kB 1*512kB (U) 1*1024kB (U) 2*2048kB (UM) 2*4096kB (M) = 14104kB [1107805.864265] Node 0 DMA32: 4514*4kB (UEM) 8002*8kB (UEM) 6054*16kB (UEM) 457*32kB (UEM) 2000*64kB (UEM) 1038*128kB (UEM) 134*256kB (UEM) 24*512kB (EM) 17*1024kB (UEM) 1*2048kB (M) 0*4096kB = 520472kB [1107805.865143] Node 0 Normal: 49218*4kB (UEM) 1681290*8kB (UEM) 597349*16kB (UEM) 4135*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23337096kB [1107805.865958] Node 1 Normal: 77773*4kB (UEM) 852979*8kB (UE) 806822*16kB (UEM) 2108*32kB (UE) 513*64kB (U) 249*128kB (UM) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 20176236kB [1107805.866861] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107805.867345] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107805.867907] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB [1107805.868444] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [1107805.868916] 39554340 total pagecache pages [1107805.869252] 1846 pages in swap cache [1107805.869485] Swap cache stats: add 264107, delete 262261, find 2321/2660 [1107805.869726] Free swap = 3152432kB [1107805.869959] Total swap = 4194300kB [1107805.870199] 67052113 pages RAM [1107805.870465] 0 pages HighMem/MovableOnly [1107805.870716] 1126685 pages reserved [1107806.067950] LDISKFS-fs (md9): recovery complete [1107806.139367] LDISKFS-fs (md9): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc [1107806.932155] Lustre: oak-OST0031: Denying connection for new client 2d21ac44-ee87-ac13-a569-7b484f84fdeb(at 10.9.112.4@o2ib4), waiting for 1238 known clients (450 recovered, 1 in progress, and 0 evicted) to recover in 14:32 [1107806.932887] Lustre: Skipped 105 previous similar messages [1107808.175368] Lustre: oak-OST0053: Will be in recovery for at least 2:30, or until 1238 clients reconnect [1107808.175890] Lustre: Skipped 7 previous similar messages [1107808.512705] Lustre: oak-OST0045: Not available for connect from 10.210.46.34@o2ib3 (not set up) [1107808.513189] Lustre: Skipped 20 previous similar messages [1107810.064532] Lustre: oak-OST0039: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900 [1107810.065020] Lustre: Skipped 8 previous similar messages [1107810.598255] Lustre: oak-OST0030: haven't heard from client a6b51dbf-e757-4f34-40af-1ce1e2526fd1 (at 10.9.101.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff881da08ecc00, cur 1519240453 expire 1519240303 last 1519240226 [1107810.599229] Lustre: Skipped 2 previous similar messages [1107815.081903] Lustre: oak-OST0053: Denying connection for new client cbefc8f7-4cba-0265-f773-a310fd0595a6(at 10.8.15.1@o2ib6), waiting for 1238 known clients (801 recovered, 6 in progress, and 0 evicted) to recover in 14:30 [1107815.082675] Lustre: Skipped 315 previous similar messages [1107833.417026] Lustre: oak-OST0037: Denying connection for new client 50d4a8ed-144f-61da-93d5-2c5a54320942(at 10.9.112.5@o2ib4), waiting for 1238 known clients (1207 recovered, 6 in progress, and 0 evicted) to recover in 14:09 [1107833.417027] Lustre: oak-OST0033: Denying connection for new client 50d4a8ed-144f-61da-93d5-2c5a54320942(at 10.9.112.5@o2ib4), waiting for 1238 known clients (1213 recovered, 5 in progress, and 0 evicted) to recover in 14:09 [1107833.417029] Lustre: oak-OST003f: Denying connection for new client 50d4a8ed-144f-61da-93d5-2c5a54320942(at 10.9.112.5@o2ib4), waiting for 1257 known clients (1240 recovered, 5 in progress, and 0 evicted) to recover in 14:05 [1107833.417030] Lustre: Skipped 144 previous similar messages [1107833.417034] Lustre: Skipped 144 previous similar messages [1107833.425307] Lustre: Skipped 12 previous similar messages [1107835.597037] Lustre: oak-OST0032: haven't heard from client a6b51dbf-e757-4f34-40af-1ce1e2526fd1 (at 10.9.101.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883e63528000, cur 1519240478 expire 1519240328 last 1519240251 [1107835.598010] Lustre: Skipped 4 previous similar messages [1107898.593515] Lustre: oak-OST0046: haven't heard from client a6b51dbf-e757-4f34-40af-1ce1e2526fd1 (at 10.9.101.54@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cac5c4c00, cur 1519240541 expire 1519240391 last 1519240314 [1107900.813510] Lustre: oak-OST0031: Denying connection for new client 82be1ce3-748d-5486-dd90-21b82a2f374f(at 10.8.29.4@o2ib6), waiting for 1238 known clients (1220 recovered, 4 in progress, and 0 evicted) to recover in 12:58 [1107900.813512] Lustre: oak-OST0033: Denying connection for new client 82be1ce3-748d-5486-dd90-21b82a2f374f(at 10.8.29.4@o2ib6), waiting for 1238 known clients (1219 recovered, 5 in progress, and 0 evicted) to recover in 13:02 [1107900.813515] Lustre: Skipped 1410 previous similar messages [1107900.815198] Lustre: Skipped 9 previous similar messages [1108156.889225] Lustre: oak-OST0035: Denying connection for new client eb490769-05a3-8e62-a44a-707ef1e0723e(at 10.9.113.10@o2ib4), waiting for 1238 known clients (1219 recovered, 5 in progress, and 0 evicted) to recover in 8:40 [1108156.889944] Lustre: Skipped 5853 previous similar messages [1108564.883943] Lustre: oak-OST0051: Recovery already passed deadline 1:52. If you do not want to wait more, please abort the recovery by force. [1108564.883945] Lustre: oak-OST0035: Recovery already passed deadline 1:52. If you do not want to wait more, please abort the recovery by force. [1108564.883946] Lustre: oak-OST0043: Recovery already passed deadline 1:54. If you do not want to wait more, please abort the recovery by force. [1108564.883948] Lustre: oak-OST003b: Recovery already passed deadline 1:56. If you do not want to wait more, please abort the recovery by force. 
[1108564.883949] Lustre: Skipped 3 previous similar messages [1108564.883949] Lustre: Skipped 3 previous similar messages [1108564.883952] Lustre: Skipped 3 previous similar messages [1108564.883968] Lustre: oak-OST0035: Connection restored to 566c789c-f3a6-ac0c-3466-186fc4c4470b (at 10.12.4.20@o2ib) [1108564.883968] Lustre: oak-OST003b: Connection restored to 566c789c-f3a6-ac0c-3466-186fc4c4470b (at 10.12.4.20@o2ib) [1108564.883969] Lustre: Skipped 24558 previous similar messages [1108564.883970] Lustre: Skipped 24558 previous similar messages [1108565.550233] Lustre: oak-OST003b: Recovery already passed deadline 1:55. If you do not want to wait more, please abort the recovery by force. [1108565.550234] Lustre: oak-OST004b: Recovery already passed deadline 1:57. If you do not want to wait more, please abort the recovery by force. [1108565.550236] Lustre: oak-OST0051: Recovery already passed deadline 1:51. If you do not want to wait more, please abort the recovery by force. [1108565.550238] Lustre: oak-OST0047: Recovery already passed deadline 1:55. If you do not want to wait more, please abort the recovery by force. [1108565.550239] Lustre: Skipped 25 previous similar messages [1108565.550240] Lustre: Skipped 25 previous similar messages [1108565.550243] Lustre: Skipped 25 previous similar messages [1108565.552844] Lustre: Skipped 2 previous similar messages [1108566.565672] Lustre: oak-OST004b: Recovery already passed deadline 1:56. If you do not want to wait more, please abort the recovery by force. [1108566.566170] Lustre: Skipped 75 previous similar messages [1108568.627612] Lustre: oak-OST0053: Recovery already passed deadline 1:56. If you do not want to wait more, please abort the recovery by force. [1108568.627614] Lustre: oak-OST0033: Recovery already passed deadline 1:54. If you do not want to wait more, please abort the recovery by force. [1108568.627615] Lustre: oak-OST004b: Recovery already passed deadline 1:54. If you do not want to wait more, please abort the recovery by force. [1108568.627617] Lustre: oak-OST0045: Recovery already passed deadline 1:56. If you do not want to wait more, please abort the recovery by force. [1108568.627618] Lustre: oak-OST004d: Recovery already passed deadline 1:54. If you do not want to wait more, please abort the recovery by force. [1108568.627619] Lustre: oak-OST0037: Recovery already passed deadline 1:54. If you do not want to wait more, please abort the recovery by force. [1108568.627621] Lustre: Skipped 295 previous similar messages [1108568.627621] Lustre: Skipped 295 previous similar messages [1108568.627622] Lustre: Skipped 295 previous similar messages [1108568.627623] Lustre: Skipped 295 previous similar messages [1108568.627626] Lustre: Skipped 295 previous similar messages [1108568.631645] Lustre: Skipped 21 previous similar messages [1108576.632591] Lustre: oak-OST0039: Recovery already passed deadline 1:50. If you do not want to wait more, please abort the recovery by force. [1108576.632592] Lustre: oak-OST004f: Recovery already passed deadline 1:50. If you do not want to wait more, please abort the recovery by force. [1108576.632594] Lustre: oak-OST003d: Recovery already passed deadline 1:50. If you do not want to wait more, please abort the recovery by force. 
[1108576.632595] Lustre: Skipped 546 previous similar messages [1108576.632598] Lustre: Skipped 546 previous similar messages [1108669.714416] Lustre: oak-OST0031: Denying connection for new client 47585c16-2c65-1a36-2519-62a0145c0305(at 10.8.9.8@o2ib6), waiting for 1238 known clients (1220 recovered, 4 in progress, and 0 evicted) to recover in 0:09 [1108669.715136] Lustre: Skipped 11759 previous similar messages [1108676.902885] Lustre: oak-OST0035: recovery is timed out, evict stale exports [1108676.903046] Lustre: oak-OST0051: disconnecting 14 stale clients [1108676.903273] Lustre: 210671:0:(ldlm_lib.c:1773:extend_recovery_timer()) oak-OST0051: extended recovery timer reaching hard limit: 900, extend: 1 [1108676.903969] Lustre: Skipped 1 previous similar message [1108677.132185] Lustre: oak-OST0051: deleting orphan objects from 0x0:156881 to 0x0:156897 [1108677.132573] Lustre: oak-OST0051: Recovery over after 14:37, of 1238 clients 1224 recovered and 14 were evicted. [1108677.246302] LustreError: 184234:0:(ofd_io.c:616:ofd_preprw_write()) oak-OST0035: BRW to missing obj 0x0:4346882 [1108677.247427] LustreError: 173730:0:(lustre_dlm.h:1372:ldlm_res_lvbo_update()) delayed lvb init failed (rc -2) [1108678.950780] Lustre: oak-OST0043: recovery is timed out, evict stale exports [1108678.950924] Lustre: oak-OST003f: disconnecting 12 stale clients [1108678.950925] Lustre: Skipped 1 previous similar message [1108678.951127] Lustre: 210774:0:(ldlm_lib.c:1773:extend_recovery_timer()) oak-OST003f: extended recovery timer reaching hard limit: 900, extend: 1 [1108678.951129] Lustre: 210774:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 16 previous similar messages [1108678.952556] Lustre: Skipped 2 previous similar messages [1108679.086871] Lustre: oak-OST003f: Recovery over after 14:37, of 1257 clients 1245 recovered and 12 were evicted. [1108679.087353] Lustre: Skipped 1 previous similar message [1108679.163485] Lustre: oak-OST003f: deleting orphan objects from 0x0:4411374 to 0x0:4411396 [1108680.998667] Lustre: oak-OST003b: recovery is timed out, evict stale exports [1108680.998873] Lustre: oak-OST0047: disconnecting 14 stale clients [1108680.998874] Lustre: Skipped 2 previous similar messages [1108680.999093] Lustre: 210910:0:(ldlm_lib.c:1773:extend_recovery_timer()) oak-OST0047: extended recovery timer reaching hard limit: 900, extend: 1 [1108680.999094] Lustre: 210910:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 32 previous similar messages [1108681.000366] Lustre: Skipped 1 previous similar message [1108681.134119] Lustre: oak-OST0047: Recovery over after 14:36, of 1238 clients 1224 recovered and 14 were evicted. 
[1108681.134610] Lustre: Skipped 2 previous similar messages [1108681.166020] LustreError: 175762:0:(ofd_io.c:616:ofd_preprw_write()) oak-OST0047: BRW to missing obj 0x0:4389059 [1108681.167696] LustreError: 173730:0:(lustre_dlm.h:1372:ldlm_res_lvbo_update()) delayed lvb init failed (rc -2) [1108681.327434] format at lustre_dlm.h:1097:ldlm_lvbo_fill doesn't end in newline [1108683.046573] Lustre: oak-OST0033: recovery is timed out, evict stale exports [1108683.046574] Lustre: oak-OST0049: recovery is timed out, evict stale exports [1108683.046576] Lustre: Skipped 3 previous similar messages [1108683.046725] Lustre: oak-OST0037: disconnecting 14 stale clients [1108683.046726] Lustre: Skipped 1 previous similar message [1108683.046981] Lustre: 211040:0:(ldlm_lib.c:1773:extend_recovery_timer()) oak-OST0049: extended recovery timer reaching hard limit: 900, extend: 1 [1108683.046983] Lustre: 211040:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 24 previous similar messages [1108683.189046] Lustre: oak-OST0049: deleting orphan objects from 0x0:3442723 to 0x0:3442849 [1108683.190341] Lustre: oak-OST0041: deleting orphan objects from 0x0:4427080 to 0x0:4427105 [1108683.193310] Lustre: oak-OST0037: Recovery over after 14:36, of 1238 clients 1224 recovered and 14 were evicted. [1108683.193785] Lustre: Skipped 1 previous similar message [1108683.195710] Lustre: oak-OST004d: deleting orphan objects from 0x0:3449716 to 0x0:3449761 [1108683.221583] LustreError: 251672:0:(ofd_io.c:616:ofd_preprw_write()) oak-OST0037: BRW to missing obj 0x0:4426498 [1108683.222086] LustreError: 251672:0:(ofd_io.c:616:ofd_preprw_write()) Skipped 6 previous similar messages [1108683.225179] LustreError: 131944:0:(lustre_dlm.h:1372:ldlm_res_lvbo_update()) delayed lvb init failed (rc -2) [1108683.225667] LustreError: 131944:0:(lustre_dlm.h:1372:ldlm_res_lvbo_update()) Skipped 6 previous similar messages [1108683.410640] Lustre: oak-OST0033: deleting orphan objects from 0x0:4409626 to 0x0:4409665 [1108683.433503] Lustre: oak-OST004b: deleting orphan objects from 0x0:3443976 to 0x0:3444011 [1108685.231869] Lustre: oak-OST0045: deleting orphan objects from 0x0:4406757 to 0x0:4406785 [1108685.341418] LustreError: 63779:0:(ofd_io.c:616:ofd_preprw_write()) oak-OST0053: BRW to missing obj 0x0:156930 [1108685.344071] LustreError: 258392:0:(lustre_dlm.h:1372:ldlm_res_lvbo_update()) delayed lvb init failed (rc -2) [1108687.142538] Lustre: oak-OST004f: disconnecting 14 stale clients [1108687.142851] Lustre: Skipped 8 previous similar messages [1108687.143456] Lustre: 211183:0:(ldlm_lib.c:1773:extend_recovery_timer()) oak-OST004f: extended recovery timer reaching hard limit: 900, extend: 1 [1108687.144058] Lustre: 211183:0:(ldlm_lib.c:1773:extend_recovery_timer()) Skipped 60 previous similar messages [1108687.219016] Lustre: oak-OST004f: Recovery over after 14:38, of 1238 clients 1224 recovered and 14 were evicted. [1108687.219510] Lustre: Skipped 7 previous similar messages [1108687.279247] Lustre: oak-OST0039: deleting orphan objects from 0x0:4333910 to 0x0:4333953 [1109529.251869] md: md24: data-check done. [1109578.395048] md: data-check of RAID array md18 [1109604.799959] md: md12: data-check done. [1109642.380470] md: md10: data-check done. [1109644.537582] md: data-check of RAID array md8 [1109650.656840] md: data-check of RAID array md0 [1109680.487089] md: md6: data-check done. [1109702.173997] md: md28: data-check done. [1109716.807143] md: data-check of RAID array md2 [1109719.065908] md: md26: data-check done. 
[1109722.920708] md: data-check of RAID array md30 [1109730.201665] md: md16: data-check done. [1109776.975880] md: md22: data-check done. [1109779.975171] md: md20: data-check done. [1109832.725336] md: md14: data-check done. [1109862.152173] md: md32: data-check done. [1109864.067797] md: md34: data-check done. [1110015.118738] md: md4: data-check done. [1110834.923774] LustreError: 11-0: oak-MDT0000-lwp-OST0032: operation obd_ping to node 10.0.2.52@o2ib5 failed: rc = -107 [1110834.923775] LustreError: 11-0: oak-MDT0000-lwp-OST0038: operation obd_ping to node 10.0.2.52@o2ib5 failed: rc = -107 [1110834.923777] LustreError: Skipped 2 previous similar messages [1110834.923779] Lustre: oak-MDT0000-lwp-OST0046: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete [1110834.923781] Lustre: Skipped 17 previous similar messages [1110834.925953] LustreError: Skipped 31 previous similar messages [1110859.922515] LustreError: 11-0: oak-MDT0000-lwp-OST003a: operation obd_ping to node 10.0.2.52@o2ib5 failed: rc = -107 [1110859.923006] LustreError: Skipped 1 previous similar message [1110859.923260] Lustre: oak-MDT0000-lwp-OST003a: Connection to oak-MDT0000 (at 10.0.2.52@o2ib5) was lost; in progress operations using this service will wait for recovery to complete [1110859.924009] Lustre: Skipped 32 previous similar messages [1110865.924084] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519243502/real 1519243502] req@ffff8829baab7200 x1592481950950448/t0(0) o38->oak-MDT0000-lwp-OST0049@10.0.2.51@o2ib5:12/10 lens 520/544 e 0 to 1 dl 1519243508 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 [1110865.925295] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 53 previous similar messages [1110890.920924] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519243527/real 1519243527] req@ffff8829baab2700 x1592481950950704/t0(0) o38->oak-MDT0000-lwp-OST003a@10.0.2.51@o2ib5:12/10 lens 520/544 e 0 to 1 dl 1519243533 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 [1110890.922114] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 34 previous similar messages [1110920.919518] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519243552/real 1519243552] req@ffff8829baab4800 x1592481950951680/t0(0) o38->oak-MDT0000-lwp-OST0049@10.0.2.51@o2ib5:12/10 lens 520/544 e 0 to 1 dl 1519243563 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 [1110975.917007] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519243602/real 1519243602] req@ffff8829baab3600 x1592481950953056/t0(0) o38->oak-MDT0000-lwp-OST0039@10.0.2.51@o2ib5:12/10 lens 520/544 e 0 to 1 dl 1519243618 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1 [1110975.918208] Lustre: 129563:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 35 previous similar messages [1110982.461930] Lustre: oak-OST0035: haven't heard from client 7de7feed-7511-95a2-ea1d-cf787ba38328 (at 10.8.15.2@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff881d45bfc000, cur 1519243625 expire 1519243475 last 1519243398 [1110982.462899] Lustre: Skipped 6 previous similar messages [1110984.994004] Lustre: oak-OST0030: Connection restored to oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) [1110984.994503] Lustre: Skipped 1701 previous similar messages [1111009.915734] LustreError: 167-0: oak-MDT0000-lwp-OST0032: This client was evicted by oak-MDT0000; in progress operations using this service will fail. [1111009.916229] LustreError: Skipped 17 previous similar messages [1111863.855183] Lustre: oak-OST0046: deleting orphan objects from 0x0:4447028 to 0x0:4447073 [1111863.855184] Lustre: oak-OST0049: deleting orphan objects from 0x0:3442854 to 0x0:3442881 [1111863.855185] Lustre: oak-OST0039: deleting orphan objects from 0x0:4333957 to 0x0:4333985 [1111863.855186] Lustre: oak-OST0048: deleting orphan objects from 0x0:3426362 to 0x0:3426401 [1111863.855187] Lustre: oak-OST004b: deleting orphan objects from 0x0:3444015 to 0x0:3444043 [1111863.855188] Lustre: oak-OST0047: deleting orphan objects from 0x0:4389160 to 0x0:4389186 [1111863.855227] Lustre: oak-OST0032: deleting orphan objects from 0x0:4410704 to 0x0:4410721 [1111863.855242] Lustre: oak-OST0051: deleting orphan objects from 0x0:156900 to 0x0:156929 [1111863.855243] Lustre: oak-OST003c: deleting orphan objects from 0x0:4422920 to 0x0:4422945 [1111863.855254] Lustre: oak-OST0037: deleting orphan objects from 0x0:4426499 to 0x0:4426530 [1111863.855255] Lustre: oak-OST0034: deleting orphan objects from 0x0:4276398 to 0x0:4276417 [1111863.855259] Lustre: oak-OST004c: deleting orphan objects from 0x0:3440273 to 0x0:3440289 [1111863.855303] Lustre: oak-OST0033: deleting orphan objects from 0x0:4409668 to 0x0:4409697 [1111863.855311] Lustre: oak-OST0036: deleting orphan objects from 0x0:4236441 to 0x0:4236481 [1111863.855318] Lustre: oak-OST0031: deleting orphan objects from 0x0:4387253 to 0x0:4387265 [1111863.855332] Lustre: oak-OST0040: deleting orphan objects from 0x0:4430065 to 0x0:4430081 [1111863.855349] Lustre: oak-OST0042: deleting orphan objects from 0x0:4319182 to 0x0:4319201 [1111863.855373] Lustre: oak-OST003a: deleting orphan objects from 0x0:4396720 to 0x0:4396737 [1111863.855376] Lustre: oak-OST0045: deleting orphan objects from 0x0:4406791 to 0x0:4406817 [1111863.855394] Lustre: oak-OST0030: deleting orphan objects from 0x0:4458008 to 0x0:4458049 [1111863.855400] Lustre: oak-OST004e: deleting orphan objects from 0x0:156784 to 0x0:156801 [1111863.855401] Lustre: oak-OST0041: deleting orphan objects from 0x0:4427111 to 0x0:4427137 [1111863.855406] Lustre: oak-OST0044: deleting orphan objects from 0x0:4391655 to 0x0:4391681 [1111863.855421] Lustre: oak-OST0043: deleting orphan objects from 0x0:4389258 to 0x0:4389345 [1111863.855430] Lustre: oak-OST0050: deleting orphan objects from 0x0:156058 to 0x0:156097 [1111863.855442] Lustre: oak-OST003e: deleting orphan objects from 0x0:4275091 to 0x0:4275137 [1111863.855527] Lustre: oak-OST003f: deleting orphan objects from 0x0:4411400 to 0x0:4411428 [1111863.855534] Lustre: oak-OST004d: deleting orphan objects from 0x0:3449764 to 0x0:3449793 [1111863.855535] Lustre: oak-OST004a: deleting orphan objects from 0x0:3422486 to 0x0:3422529 [1111863.855537] Lustre: oak-OST0052: deleting orphan objects from 0x0:156798 to 0x0:156833 [1111863.855639] Lustre: oak-OST0038: deleting orphan objects from 0x0:4374546 to 0x0:4374561 [1113225.340139] Lustre: oak-OST0045: haven't heard from client 17dab95c-3765-c601-9b27-f4222ca95dab (at 10.9.112.17@o2ib4) 
in 227 seconds. I think it's dead, and I am evicting it. exp ffff882d06906800, cur 1519245868 expire 1519245718 last 1519245641
[1113225.341095] Lustre: Skipped 35 previous similar messages
[1115241.286415] Lustre: oak-OST0045: haven't heard from client 5552bcd3-4f27-e74c-8ee1-179309abb500 (at 10.9.113.6@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8804a811f400, cur 1519247884 expire 1519247734 last 1519247657
[1115241.287437] Lustre: Skipped 35 previous similar messages
[1115244.252102] Lustre: oak-OST004a: haven't heard from client 5552bcd3-4f27-e74c-8ee1-179309abb500 (at 10.9.113.6@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d2788b400, cur 1519247887 expire 1519247737 last 1519247660
[1115250.248534] Lustre: oak-OST0041: haven't heard from client 5552bcd3-4f27-e74c-8ee1-179309abb500 (at 10.9.113.6@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880b37f6dc00, cur 1519247893 expire 1519247743 last 1519247666
[1115253.252640] Lustre: oak-OST0031: haven't heard from client 5552bcd3-4f27-e74c-8ee1-179309abb500 (at 10.9.113.6@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883c42fb3c00, cur 1519247896 expire 1519247746 last 1519247669
[1115253.253616] Lustre: Skipped 29 previous similar messages
[1115814.567903] Lustre: oak-OST0032: Connection restored to 5552bcd3-4f27-e74c-8ee1-179309abb500 (at 10.9.113.6@o2ib4)
[1115814.567904] Lustre: oak-OST0034: Connection restored to 5552bcd3-4f27-e74c-8ee1-179309abb500 (at 10.9.113.6@o2ib4)
[1115814.567905] Lustre: oak-OST0030: Connection restored to 5552bcd3-4f27-e74c-8ee1-179309abb500 (at 10.9.113.6@o2ib4)
[1115814.567907] Lustre: oak-OST0038: Connection restored to 5552bcd3-4f27-e74c-8ee1-179309abb500 (at 10.9.113.6@o2ib4)
[1115814.567908] Lustre: Skipped 67 previous similar messages
[1115814.567908] Lustre: Skipped 67 previous similar messages
[1115814.567909] Lustre: Skipped 67 previous similar messages
[1115814.570553] Lustre: Skipped 10 previous similar messages
[1115839.435831] Lustre: oak-OST0031: Connection restored to 5552bcd3-4f27-e74c-8ee1-179309abb500 (at 10.9.113.6@o2ib4)
[1115839.436334] Lustre: Skipped 19 previous similar messages
[1117655.086835] Lustre: oak-OST0030: Connection restored to (at 10.8.15.2@o2ib6)
[1117655.086836] Lustre: oak-OST0032: Connection restored to (at 10.8.15.2@o2ib6)
[1117655.087792] Lustre: Skipped 16 previous similar messages
[1117680.304204] Lustre: oak-OST0031: Connection restored to (at 10.8.15.2@o2ib6)
[1117680.310061] Lustre: Skipped 16 previous similar messages
[1117783.136261] Lustre: oak-OST0041: haven't heard from client 1eb55616-e815-1eed-5ced-241990d7e40c (at 10.9.112.8@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880b37f68400, cur 1519250426 expire 1519250276 last 1519250199
[1117783.137250] Lustre: Skipped 3 previous similar messages
[1118347.132812] Lustre: oak-OST0034: Connection restored to 1eb55616-e815-1eed-5ced-241990d7e40c (at 10.9.112.8@o2ib4)
[1118347.132813] Lustre: oak-OST0038: Connection restored to 1eb55616-e815-1eed-5ced-241990d7e40c (at 10.9.112.8@o2ib4)
[1118347.132815] Lustre: oak-OST0032: Connection restored to 1eb55616-e815-1eed-5ced-241990d7e40c (at 10.9.112.8@o2ib4)
[1118347.132816] Lustre: Skipped 2 previous similar messages
[1118347.132817] Lustre: Skipped 2 previous similar messages
[1118347.134705] Lustre: Skipped 13 previous similar messages
[1118372.382211] Lustre: oak-OST0031: Connection restored to 1eb55616-e815-1eed-5ced-241990d7e40c (at 10.9.112.8@o2ib4)
[1118372.382685] Lustre: Skipped 15 previous similar messages
[1124017.448354] Lustre: oak-OST0030: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[1124017.448844] Lustre: Skipped 3 previous similar messages
[1124042.772541] Lustre: oak-OST0031: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[1124042.773025] Lustre: Skipped 20 previous similar messages
[1124430.838812] Lustre: oak-OST004d: haven't heard from client d16097e6-b240-aa27-27eb-24d11bff4c92 (at 10.9.113.7@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88156d4cec00, cur 1519257074 expire 1519256924 last 1519256847
[1124430.839793] Lustre: Skipped 35 previous similar messages
[1124445.839042] Lustre: oak-OST004c: haven't heard from client d16097e6-b240-aa27-27eb-24d11bff4c92 (at 10.9.113.7@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ded6c9000, cur 1519257089 expire 1519256939 last 1519256862
[1124449.824968] Lustre: oak-OST003d: haven't heard from client d16097e6-b240-aa27-27eb-24d11bff4c92 (at 10.9.113.7@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88155c517400, cur 1519257093 expire 1519256943 last 1519256866
[1124449.825986] Lustre: Skipped 30 previous similar messages
[1124457.827441] Lustre: oak-OST0034: haven't heard from client d16097e6-b240-aa27-27eb-24d11bff4c92 (at 10.9.113.7@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cda8d5800, cur 1519257101 expire 1519256951 last 1519256874
[1124976.780777] Lustre: oak-OST0036: Connection restored to d16097e6-b240-aa27-27eb-24d11bff4c92 (at 10.9.113.7@o2ib4)
[1124976.780778] Lustre: oak-OST0030: Connection restored to d16097e6-b240-aa27-27eb-24d11bff4c92 (at 10.9.113.7@o2ib4)
[1124976.780780] Lustre: oak-OST0034: Connection restored to d16097e6-b240-aa27-27eb-24d11bff4c92 (at 10.9.113.7@o2ib4)
[1124976.780781] Lustre: oak-OST0032: Connection restored to d16097e6-b240-aa27-27eb-24d11bff4c92 (at 10.9.113.7@o2ib4)
[1124976.780782] Lustre: Skipped 10 previous similar messages
[1124976.780783] Lustre: Skipped 10 previous similar messages
[1124976.780783] Lustre: Skipped 10 previous similar messages
[1124976.783445] Lustre: Skipped 12 previous similar messages
[1125001.743365] Lustre: oak-OST0031: Connection restored to d16097e6-b240-aa27-27eb-24d11bff4c92 (at 10.9.113.7@o2ib4)
[1125001.743854] Lustre: Skipped 17 previous similar messages
[1126507.732242] Lustre: oak-OST003f: haven't heard from client 674e65bd-d9a0-971c-e78d-61c4a8a708bd (at 10.210.47.106@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff882a2bda4400, cur 1519259151 expire 1519259001 last 1519258924
[1126507.733220] Lustre: Skipped 2 previous similar messages
[1127302.165242] Lustre: oak-OST0030: Connection restored to (at 10.210.47.106@o2ib3)
[1127302.165721] Lustre: Skipped 14 previous similar messages
[1127327.727160] Lustre: oak-OST0045: Connection restored to (at 10.210.47.106@o2ib3)
[1127327.727161] Lustre: oak-OST0031: Connection restored to (at 10.210.47.106@o2ib3)
[1127327.727162] Lustre: oak-OST0037: Connection restored to (at 10.210.47.106@o2ib3)
[1127327.727163] Lustre: Skipped 4 previous similar messages
[1127327.727164] Lustre: Skipped 4 previous similar messages
[1127327.729170] Lustre: Skipped 8 previous similar messages
[1127371.710067] Lustre: oak-OST0030: Connection restored to db6d3d40-4b8c-e093-15a4-aa74120c714b (at 10.210.46.49@o2ib3)
[1127371.710566] Lustre: Skipped 3 previous similar messages
[1127383.215704] Lustre: oak-OST0030: Connection restored to 7926e9f4-ad37-af82-48cd-0434817ff711 (at 10.210.44.47@o2ib3)
[1127383.216252] Lustre: Skipped 15 previous similar messages
[1127397.115772] Lustre: oak-OST003d: Connection restored to db6d3d40-4b8c-e093-15a4-aa74120c714b (at 10.210.46.49@o2ib3)
[1127397.115773] Lustre: oak-OST0039: Connection restored to db6d3d40-4b8c-e093-15a4-aa74120c714b (at 10.210.46.49@o2ib3)
[1127397.115775] Lustre: oak-OST0031: Connection restored to db6d3d40-4b8c-e093-15a4-aa74120c714b (at 10.210.46.49@o2ib3)
[1127397.115776] Lustre: oak-OST0037: Connection restored to db6d3d40-4b8c-e093-15a4-aa74120c714b (at 10.210.46.49@o2ib3)
[1127397.115777] Lustre: oak-OST0035: Connection restored to db6d3d40-4b8c-e093-15a4-aa74120c714b (at 10.210.46.49@o2ib3)
[1127397.115779] Lustre: oak-OST0033: Connection restored to db6d3d40-4b8c-e093-15a4-aa74120c714b (at 10.210.46.49@o2ib3)
[1127397.115780] Lustre: Skipped 16 previous similar messages
[1127397.115781] Lustre: Skipped 16 previous similar messages
[1127397.115781] Lustre: Skipped 16 previous similar messages
[1127397.115782] Lustre: Skipped 16 previous similar messages
[1127397.115783] Lustre: Skipped 16 previous similar messages
[1127397.119849] Lustre: Skipped 9 previous similar messages
[1135056.341391] Lustre: oak-OST0049: Client 7461b1df-58fb-1087-c619-45f28694768c (at 10.9.104.12@o2ib4) reconnecting
[1135056.341393] Lustre: oak-OST0031: Client 7461b1df-58fb-1087-c619-45f28694768c (at 10.9.104.12@o2ib4) reconnecting
[1135056.341395] Lustre: oak-OST003f: Client 7461b1df-58fb-1087-c619-45f28694768c (at 10.9.104.12@o2ib4) reconnecting
[1135056.341397] Lustre: oak-OST0045: Client 7461b1df-58fb-1087-c619-45f28694768c (at 10.9.104.12@o2ib4) reconnecting
[1135056.341398] Lustre: Skipped 71 previous similar messages
[1135056.341399] Lustre: Skipped 71 previous similar messages
[1135056.341401] Lustre: Skipped 71 previous similar messages
[1135056.341429] Lustre: oak-OST0045: Connection restored to 7461b1df-58fb-1087-c619-45f28694768c (at 10.9.104.12@o2ib4)
[1135056.341430] Lustre: oak-OST003f: Connection restored to 7461b1df-58fb-1087-c619-45f28694768c (at 10.9.104.12@o2ib4)
[1135056.341432] Lustre: Skipped 15 previous similar messages
[1135056.341433] Lustre: Skipped 16 previous similar messages
[1135056.345729] Lustre: Skipped 8 previous similar messages
[1135107.120729] Lustre: oak-OST0030: Client 1a82bd75-72e1-8040-9fd0-f227412ed7d7 (at 10.210.45.38@o2ib3) reconnecting
[1135107.120749] Lustre: oak-OST0031: Connection restored to 1a82bd75-72e1-8040-9fd0-f227412ed7d7 (at 10.210.45.38@o2ib3)
[1135107.120751] Lustre: Skipped 13005 previous similar messages
[1135107.122018] Lustre: Skipped 13005 previous similar messages
[1135156.219068] Lustre: oak-OST0052: Connection restored to dbad0532-ffa9-7d4e-e8f2-8bbf137ac40a (at 10.9.103.16@o2ib4)
[1135156.219069] Lustre: oak-OST0051: Connection restored to dbad0532-ffa9-7d4e-e8f2-8bbf137ac40a (at 10.9.103.16@o2ib4)
[1135156.219071] Lustre: Skipped 24 previous similar messages
[1135156.220277] Lustre: Skipped 19 previous similar messages
[1135158.784295] LustreError: 62502:0:(ldlm_lib.c:3236:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff8817f95f6850 x1591095117516672/t0(0) o4->89ace3c6-2eb8-89f3-a103-a2495018f228@10.9.101.55@o2ib4:182/0 lens 608/448 e 0 to 0 dl 1519267807 ref 1 fl Interpret:/0/0 rc 0/0
[1135158.785369] Lustre: oak-OST003c: Bulk IO write error with 89ace3c6-2eb8-89f3-a103-a2495018f228 (at 10.9.101.55@o2ib4), client will retry: rc = -110
[1135158.785856] Lustre: Skipped 24 previous similar messages
[1135171.576661] LustreError: 62502:0:(ldlm_lib.c:3236:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff88039624cc50 x1591878701963744/t0(0) o4->70ea5860-ea02-c80c-735a-6a5e8618d73c@10.8.18.14@o2ib6:195/0 lens 608/448 e 0 to 0 dl 1519267820 ref 1 fl Interpret:/0/0 rc 0/0
[1135171.577631] Lustre: oak-OST003b: Bulk IO write error with 70ea5860-ea02-c80c-735a-6a5e8618d73c (at 10.8.18.14@o2ib6), client will retry: rc = -110
[1135177.259601] Lustre: oak-OST0031: Client 18aab27a-dacb-a4b4-76d0-192734850c7d (at 10.8.18.24@o2ib6) reconnecting
[1135177.260085] Lustre: Skipped 8774 previous similar messages
[1135220.499461] Lustre: oak-OST0032: Connection restored to 51ac6e63-c3f7-ca59-16ef-4eb91d23dbd7 (at 10.9.104.51@o2ib4)
[1135220.499462] Lustre: oak-OST0036: Connection restored to 51ac6e63-c3f7-ca59-16ef-4eb91d23dbd7 (at 10.9.104.51@o2ib4)
[1135220.499464] Lustre: Skipped 16852 previous similar messages
[1135220.500644] Lustre: Skipped 15 previous similar messages
[1135254.822565] LustreError: 135003:0:(ldlm_lockd.c:2365:ldlm_cancel_handler()) ldlm_cancel from 10.9.104.28@o2ib4 arrived at 1519267898 with bad export cookie 0
[1135255.792791] LustreError: 301774:0:(ldlm_lib.c:3236:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff881fde03e450 x1591095117516656/t0(0) o4->89ace3c6-2eb8-89f3-a103-a2495018f228@10.9.101.55@o2ib4:282/0 lens 608/448 e 4 to 0 dl 1519267907 ref 1 fl Interpret:/0/0 rc 0/0
[1135255.793777] Lustre: oak-OST0051: Bulk IO write error with 89ace3c6-2eb8-89f3-a103-a2495018f228 (at 10.9.101.55@o2ib4), client will retry: rc = -110
[1135288.396536] Lustre: oak-OST0030: haven't heard from client ea411769-8915-6823-c572-b4d43a6ebe81 (at 10.210.47.41@o2ib3) in 226 seconds. I think it's dead, and I am evicting it. exp ffff881ff1b0f400, cur 1519267932 expire 1519267782 last 1519267706
[1135288.397506] Lustre: Skipped 107 previous similar messages
[1135289.352697] Lustre: oak-OST003c: haven't heard from client 9f775363-f028-046a-4f86-6a31f8156b5f (at 10.9.105.58@o2ib4) in 197 seconds. I think it's dead, and I am evicting it. exp ffff883ca7ede400, cur 1519267933 expire 1519267783 last 1519267736
[1135289.353667] Lustre: Skipped 7 previous similar messages
[1135305.285414] Lustre: oak-OST0030: Client 94bd7708-fcb9-3baa-2082-e172ea10b753 (at 10.210.46.73@o2ib3) reconnecting
[1135305.285915] Lustre: Skipped 63957 previous similar messages
[1139169.163725] Lustre: oak-OST0031: Client 09979287-78e8-d016-8f34-973ac95dc589 (at 10.9.104.19@o2ib4) reconnecting
[1139169.163753] Lustre: oak-OST0038: Connection restored to 09979287-78e8-d016-8f34-973ac95dc589 (at 10.9.104.19@o2ib4)
[1139169.163755] Lustre: Skipped 65314 previous similar messages
[1139169.165041] Lustre: Skipped 9415 previous similar messages
[1139204.986642] Lustre: oak-OST0039: Client fab6a15a-3f78-a8e4-40d4-43e3c90d9b87 (at 10.9.102.65@o2ib4) reconnecting
[1139204.992886] Lustre: Skipped 92 previous similar messages
[1139204.993203] Lustre: oak-OST0039: Connection restored to fab6a15a-3f78-a8e4-40d4-43e3c90d9b87 (at 10.9.102.65@o2ib4)
[1139204.993789] Lustre: Skipped 99 previous similar messages
[1139221.323388] LustreError: 185052:0:(ldlm_lib.c:3236:target_bulk_io()) @@@ Reconnect on bulk WRITE req@ffff883549921050 x1591055802559504/t0(0) o4->7f22b6e5-5e60-0633-6d43-41c391054a8e@10.9.102.26@o2ib4:470/0 lens 608/448 e 0 to 0 dl 1519271870 ref 1 fl Interpret:/0/0 rc 0/0
[1139221.324458] LustreError: 185052:0:(ldlm_lib.c:3236:target_bulk_io()) Skipped 1 previous similar message
[1139221.324980] Lustre: oak-OST003e: Bulk IO write error with 7f22b6e5-5e60-0633-6d43-41c391054a8e (at 10.9.102.26@o2ib4), client will retry: rc = -110
[1139221.325560] Lustre: Skipped 1 previous similar message
[1139269.008718] Lustre: oak-OST0032: Client 1023aa9d-1c81-17bf-ad08-9f615d992694 (at 10.210.47.116@o2ib3) reconnecting
[1139269.008733] Lustre: oak-OST0035: Connection restored to 1023aa9d-1c81-17bf-ad08-9f615d992694 (at 10.210.47.116@o2ib3)
[1139269.008735] Lustre: oak-OST0036: Connection restored to 1023aa9d-1c81-17bf-ad08-9f615d992694 (at 10.210.47.116@o2ib3)
[1139269.008735] Lustre: Skipped 53799 previous similar messages
[1139269.008736] Lustre: Skipped 53799 previous similar messages
[1139269.010636] Lustre: Skipped 53826 previous similar messages
[1148418.712954] Lustre: oak-OST0044: haven't heard from client 51e89c38-fbba-0757-e10c-4872b8f7d281 (at 10.8.28.12@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883caae7f400, cur 1519281063 expire 1519280913 last 1519280836
[1148418.713962] Lustre: Skipped 36 previous similar messages
[1175955.444091] Lustre: oak-OST004c: haven't heard from client 35ff4bab-ea66-ec8e-ca5d-3c6ff7075194 (at 10.9.112.13@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880edb099400, cur 1519308601 expire 1519308451 last 1519308374
[1175955.445056] Lustre: Skipped 35 previous similar messages
[1180635.220420] Lustre: oak-OST004b: haven't heard from client ab3fefa7-8cee-4359-a92b-5ff5ee75eb32 (at 10.210.46.127@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881fde2a3c00, cur 1519313281 expire 1519313131 last 1519313054
[1180635.221460] Lustre: Skipped 35 previous similar messages
[1181356.939757] Lustre: oak-OST0034: Connection restored to ab3fefa7-8cee-4359-a92b-5ff5ee75eb32 (at 10.210.46.127@o2ib3)
[1181356.940257] Lustre: Skipped 1818 previous similar messages
[1181389.367074] Lustre: oak-OST0030: Connection restored to aea09ef5-c649-fac0-d976-164232fcb59b (at 10.210.47.108@o2ib3)
[1181389.367076] Lustre: oak-OST0034: Connection restored to aea09ef5-c649-fac0-d976-164232fcb59b (at 10.210.47.108@o2ib3)
[1181389.367077] Lustre: oak-OST0036: Connection restored to aea09ef5-c649-fac0-d976-164232fcb59b (at 10.210.47.108@o2ib3)
[1181389.367079] Lustre: oak-OST003a: Connection restored to aea09ef5-c649-fac0-d976-164232fcb59b (at 10.210.47.108@o2ib3)
[1181389.367080] Lustre: oak-OST0038: Connection restored to aea09ef5-c649-fac0-d976-164232fcb59b (at 10.210.47.108@o2ib3)
[1181389.367081] Lustre: Skipped 38 previous similar messages
[1181389.367082] Lustre: Skipped 38 previous similar messages
[1181389.367083] Lustre: Skipped 38 previous similar messages
[1181389.367083] Lustre: Skipped 38 previous similar messages
[1181389.370437] Lustre: Skipped 11 previous similar messages
[1183855.078547] Lustre: oak-OST0049: haven't heard from client ef54ddaa-b488-a9e3-469f-c7a53917244f (at 10.210.47.105@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8839c813d000, cur 1519316501 expire 1519316351 last 1519316274
[1183855.079542] Lustre: Skipped 215 previous similar messages
[1185135.481507] Lustre: oak-OST0030: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[1185135.481508] Lustre: oak-OST0032: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[1185135.481509] Lustre: Skipped 167 previous similar messages
[1185135.482709] Lustre: Skipped 16 previous similar messages
[1185562.994074] Lustre: oak-OST0032: haven't heard from client 05d353d5-c0ad-f363-77dd-e7ed5dacee45 (at 10.9.101.60@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8813f8fd1800, cur 1519318209 expire 1519318059 last 1519317982
[1185562.995065] Lustre: Skipped 35 previous similar messages
[1186264.926932] Lustre: oak-OST0032: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[1186264.926933] Lustre: oak-OST0030: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[1186264.926935] Lustre: Skipped 16 previous similar messages
[1186264.928178] Lustre: Skipped 10 previous similar messages
[1186290.520531] Lustre: oak-OST0031: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[1186290.521018] Lustre: Skipped 21 previous similar messages
[1188552.166665] Lustre: oak-OST0030: Connection restored to e8d03940-2841-ddc6-88a1-756abe60edf4 (at 10.9.0.1@o2ib4)
[1188552.167139] Lustre: Skipped 4 previous similar messages
[1188577.169494] Lustre: oak-OST0031: Connection restored to e8d03940-2841-ddc6-88a1-756abe60edf4 (at 10.9.0.1@o2ib4)
[1188577.169991] Lustre: Skipped 27 previous similar messages
[1188867.111005] Lustre: oak-OST0030: Connection restored to (at 10.8.1.27@o2ib6)
[1188867.111006] Lustre: oak-OST0032: Connection restored to (at 10.8.1.27@o2ib6)
[1188867.111008] Lustre: Skipped 1 previous similar message
[1188867.112184] Lustre: Skipped 15 previous similar messages
[1188892.304070] Lustre: oak-OST003b: Connection restored to (at 10.8.1.27@o2ib6)
[1188892.304071] Lustre: oak-OST0037: Connection restored to (at 10.8.1.27@o2ib6)
[1188892.304073] Lustre: Skipped 4 previous similar messages
[1188892.305532] Lustre: Skipped 10 previous similar messages
[1189241.536392] Lustre: oak-OST0034: Connection restored to aa0141b0-d017-626c-c09d-b75d6daae2e4 (at 10.8.1.28@o2ib6)
[1189241.536878] Lustre: Skipped 15 previous similar messages
[1190340.347483] Lustre: oak-OST0030: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[1190340.347977] Lustre: Skipped 23 previous similar messages
[1190365.176502] Lustre: oak-OST0035: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[1190365.176503] Lustre: oak-OST0039: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[1190365.176504] Lustre: oak-OST0031: Connection restored to 2e9ea172-e8f6-6fd6-a6b7-22e2a9ad7a6b (at 10.9.101.60@o2ib4)
[1190365.176505] Lustre: Skipped 6 previous similar messages
[1190365.176506] Lustre: Skipped 6 previous similar messages
[1190365.178462] Lustre: Skipped 15 previous similar messages
[1190510.153327] Lustre: oak-OST0034: Connection restored to a6968ebc-f439-75ac-3be1-ab11e5061596 (at 10.8.26.33@o2ib6)
[1190510.153328] Lustre: oak-OST0030: Connection restored to a6968ebc-f439-75ac-3be1-ab11e5061596 (at 10.8.26.33@o2ib6)
[1190510.153330] Lustre: oak-OST0032: Connection restored to a6968ebc-f439-75ac-3be1-ab11e5061596 (at 10.8.26.33@o2ib6)
[1190510.154777] Lustre: Skipped 14 previous similar messages
[1190775.215473] Lustre: oak-OST0030: Connection restored to 59b13195-34b1-8dc0-1073-e82bd675167c (at 10.9.105.51@o2ib4)
[1190775.215951] Lustre: Skipped 3648 previous similar messages
[1191578.949807] Lustre: oak-OST0030: Connection restored to c2359b6b-1fee-229e-dce9-c36c4371d32f (at 10.9.102.39@o2ib4)
[1191578.950306] Lustre: Skipped 279 previous similar messages
[1192479.244776] Lustre: oak-OST0030: Connection restored to 50b56427-cfcb-b700-037c-334f11d1cb00 (at 10.9.101.48@o2ib4)
[1192479.245259] Lustre: Skipped 3297 previous similar messages
[1193615.260007] Lustre: oak-OST0030: Connection restored to 64597622-66c8-e766-d445-7261952d7072 (at 10.9.101.18@o2ib4)
[1193615.260488] Lustre: Skipped 34 previous similar messages
[1194237.312473] Lustre: oak-OST004f: Client f1d1d5b3-77a3-d711-f284-7184f5444cdb (at 10.210.44.249@o2ib3) reconnecting
[1194237.312475] Lustre: oak-OST0035: Client f1d1d5b3-77a3-d711-f284-7184f5444cdb (at 10.210.44.249@o2ib3) reconnecting
[1194237.312476] Lustre: oak-OST0050: Client f1d1d5b3-77a3-d711-f284-7184f5444cdb (at 10.210.44.249@o2ib3) reconnecting
[1194237.312477] Lustre: Skipped 1777 previous similar messages
[1194237.312477] Lustre: Skipped 1777 previous similar messages
[1194237.312495] Lustre: oak-OST0035: Connection restored to (at 10.210.44.249@o2ib3)
[1194237.312496] Lustre: Skipped 168 previous similar messages
[1194237.315228] Lustre: Skipped 31 previous similar messages
[1196131.190916] Lustre: oak-OST0032: Connection restored to b6338d77-1fa9-be10-93d4-b9041c881f67 (at 10.9.113.11@o2ib4)
[1196131.190917] Lustre: oak-OST0030: Connection restored to b6338d77-1fa9-be10-93d4-b9041c881f67 (at 10.9.113.11@o2ib4)
[1196131.190919] Lustre: Skipped 33 previous similar messages
[1196131.192133] Lustre: Skipped 16 previous similar messages
[1196222.528484] Lustre: oak-OST0033: Connection restored to (at 10.9.101.50@o2ib4)
[1196222.528981] Lustre: Skipped 84 previous similar messages
[1196489.801729] Lustre: oak-OST0032: Connection restored to b0a7599d-e6fa-20c7-5218-7059666ae4c4 (at 10.9.114.7@o2ib4)
[1196489.801730] Lustre: oak-OST0034: Connection restored to b0a7599d-e6fa-20c7-5218-7059666ae4c4 (at 10.9.114.7@o2ib4)
[1196489.801731] Lustre: oak-OST0030: Connection restored to b0a7599d-e6fa-20c7-5218-7059666ae4c4 (at 10.9.114.7@o2ib4)
[1196489.803143] Lustre: Skipped 14 previous similar messages
[1197034.429962] Lustre: oak-OST0038: Connection restored to 3e650e62-845d-6d60-26a3-3d4109f50747 (at 10.9.112.13@o2ib4)
[1197034.429963] Lustre: oak-OST003a: Connection restored to 3e650e62-845d-6d60-26a3-3d4109f50747 (at 10.9.112.13@o2ib4)
[1197034.429965] Lustre: Skipped 813 previous similar messages
[1197034.431162] Lustre: Skipped 11 previous similar messages
[1197641.677532] Lustre: oak-OST0031: Connection restored to 51e89c38-fbba-0757-e10c-4872b8f7d281 (at 10.8.28.12@o2ib6)
[1197641.677534] Lustre: oak-OST0033: Connection restored to 51e89c38-fbba-0757-e10c-4872b8f7d281 (at 10.8.28.12@o2ib6)
[1197641.677536] Lustre: Skipped 36 previous similar messages
[1197641.678733] Lustre: Skipped 16 previous similar messages
[1201227.247434] Lustre: oak-OST0036: Connection restored to 00d4168e-b282-fe05-71fa-6d4c2929a322 (at 10.8.28.2@o2ib6)
[1201227.247435] Lustre: oak-OST0032: Connection restored to 00d4168e-b282-fe05-71fa-6d4c2929a322 (at 10.8.28.2@o2ib6)
[1201227.247436] Lustre: oak-OST0038: Connection restored to 00d4168e-b282-fe05-71fa-6d4c2929a322 (at 10.8.28.2@o2ib6)
[1201227.247437] Lustre: Skipped 5 previous similar messages
[1201227.247438] Lustre: Skipped 5 previous similar messages
[1201227.254670] Lustre: Skipped 10 previous similar messages
[1201308.636210] Lustre: oak-OST0030: Connection restored to 89ace3c6-2eb8-89f3-a103-a2495018f228 (at 10.9.101.55@o2ib4)
[1201308.636702] Lustre: Skipped 356 previous similar messages
[1201650.263680] Lustre: oak-OST0030: Connection restored to 6cfc17b8-b5a8-7bc9-3ba4-f9d50bb01c21 (at 10.9.112.17@o2ib4)
[1201650.263681] Lustre: oak-OST0032: Connection restored to 6cfc17b8-b5a8-7bc9-3ba4-f9d50bb01c21 (at 10.9.112.17@o2ib4)
[1201650.263683] Lustre: Skipped 97 previous similar messages
[1201650.264983] Lustre: Skipped 16 previous similar messages
[1205689.894297] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1205689.894298] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1205689.894299] Lustre: Skipped 18 previous similar messages
[1205689.895485] Lustre: Skipped 16 previous similar messages
[1205785.691381] Lustre: oak-OST0039: Connection restored to baccbd1e-f756-b740-0a2c-485ab7fddb76 (at 10.9.101.53@o2ib4)
[1205785.691877] Lustre: Skipped 172 previous similar messages
[1205941.390177] Lustre: oak-OST0037: Connection restored to c7d5dc36-f107-849d-e976-61d2b500e8cd (at 10.9.101.52@o2ib4)
[1205941.390178] Lustre: oak-OST0033: Connection restored to c7d5dc36-f107-849d-e976-61d2b500e8cd (at 10.9.101.52@o2ib4)
[1205941.390179] Lustre: oak-OST003f: Connection restored to c7d5dc36-f107-849d-e976-61d2b500e8cd (at 10.9.101.52@o2ib4)
[1205941.390180] Lustre: Skipped 96 previous similar messages
[1205941.390181] Lustre: Skipped 96 previous similar messages
[1205941.392287] Lustre: Skipped 10 previous similar messages
[1207241.984946] Lustre: oak-OST0042: haven't heard from client 6a9f55f7-322f-0aa5-b322-370a427863d5 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8811b783fc00, cur 1519339889 expire 1519339739 last 1519339662
[1207241.985900] Lustre: Skipped 35 previous similar messages
[1208374.957873] Lustre: oak-OST0032: Connection restored to 70d66984-cbf7-71f7-cfbf-7f3f48d8a9ef (at 10.9.114.8@o2ib4)
[1208374.958359] Lustre: Skipped 17 previous similar messages
[1210814.827303] Lustre: oak-OST0049: haven't heard from client 0a268942-d0a2-02b0-c5d8-38e968656e83 (at 10.210.47.108@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff882bccee8000, cur 1519343462 expire 1519343312 last 1519343235
[1210814.828269] Lustre: Skipped 35 previous similar messages
[1215172.529880] LustreError: 132-0: oak-OST0040: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dda:0x0] object 0x0:4434610 extent [4194304-8388607], client returned csum 54fde0b7 (type 4), server csum 1f41acdd (type 4)
[1215172.530869] LustreError: Skipped 6 previous similar messages
[1215192.362408] LustreError: 132-0: oak-OST0040: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dda:0x0] object 0x0:4434610 extent [4194304-8388607], client returned csum 54fde0b7 (type 4), server csum 1f41acdd (type 4)
[1215192.363430] LustreError: Skipped 4 previous similar messages
[1215226.323725] LustreError: 132-0: oak-OST0040: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dda:0x0] object 0x0:4434610 extent [4194304-8388607], client returned csum 54fde0b7 (type 4), server csum 1f41acdd (type 4)
[1215226.324822] LustreError: Skipped 3 previous similar messages
[1215291.355464] LustreError: 132-0: oak-OST0040: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dda:0x0] object 0x0:4434610 extent [4194304-7237631], client returned csum 9ac3682f (type 4), server csum 664ab593 (type 4)
[1215291.356592] LustreError: Skipped 12 previous similar messages
[1215423.352885] LustreError: 132-0: oak-OST0040: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dda:0x0] object 0x0:4434610 extent [4194304-8388607], client returned csum 54fde0b7 (type 4), server csum 1f41acdd (type 4)
[1215423.353911] LustreError: Skipped 22 previous similar messages
[1215629.653567] Lustre: oak-OST0034: Connection restored to f2fc2008-bb5f-07e6-6fbb-ac07090ab2fb (at 10.8.18.35@o2ib6)
[1215629.653568] Lustre: oak-OST0032: Connection restored to f2fc2008-bb5f-07e6-6fbb-ac07090ab2fb (at 10.8.18.35@o2ib6)
[1215629.653570] Lustre: Skipped 17 previous similar messages
[1215629.654760] Lustre: Skipped 15 previous similar messages
[1215639.326771] Lustre: oak-OST0036: Connection restored to dae73673-b89d-dcde-ce37-464280cdd141 (at 10.8.2.16@o2ib6)
[1215639.326772] Lustre: oak-OST0034: Connection restored to dae73673-b89d-dcde-ce37-464280cdd141 (at 10.8.2.16@o2ib6)
[1215639.326774] Lustre: oak-OST0032: Connection restored to dae73673-b89d-dcde-ce37-464280cdd141 (at 10.8.2.16@o2ib6)
[1215639.326776] Lustre: oak-OST0030: Connection restored to dae73673-b89d-dcde-ce37-464280cdd141 (at 10.8.2.16@o2ib6)
[1215639.326776] Lustre: Skipped 374 previous similar messages
[1215639.326777] Lustre: Skipped 374 previous similar messages
[1215639.326778] Lustre: Skipped 374 previous similar messages
[1215639.329443] Lustre: Skipped 14 previous similar messages
[1215658.229581] Lustre: oak-OST003b: Connection restored to 8c6c7eba-cfd3-a8a4-7d4e-c115b33d2d47 (at 10.9.105.18@o2ib4)
[1215658.229583] Lustre: oak-OST0033: Connection restored to 8c6c7eba-cfd3-a8a4-7d4e-c115b33d2d47 (at 10.9.105.18@o2ib4)
[1215658.229584] Lustre: oak-OST0037: Connection restored to 8c6c7eba-cfd3-a8a4-7d4e-c115b33d2d47 (at 10.9.105.18@o2ib4)
[1215658.229585] Lustre: Skipped 408 previous similar messages
[1215658.229586] Lustre: Skipped 408 previous similar messages
[1215658.231518] Lustre: Skipped 15 previous similar messages
[1216413.515579] Lustre: oak-OST0030: Connection restored to 31f88457-4737-4098-430b-1de18cc9c3b2 (at 10.9.101.63@o2ib4)
[1216413.516064] Lustre: Skipped 543 previous similar messages
[1216601.562632] Lustre: oak-OST0051: haven't heard from client 1eac496c-e83a-4227-c7cc-532c9a767740 (at 10.9.114.8@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881200880c00, cur 1519349249 expire 1519349099 last 1519349022
[1216601.563711] Lustre: Skipped 35 previous similar messages
[1217118.448278] Lustre: oak-OST0032: Connection restored to 70d66984-cbf7-71f7-cfbf-7f3f48d8a9ef (at 10.9.114.8@o2ib4)
[1217118.448760] Lustre: Skipped 74 previous similar messages
[1237847.582061] Lustre: oak-OST0036: haven't heard from client 2a67fa2c-d9c4-c535-3b45-fb4ccd799362 (at 10.9.112.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cac657400, cur 1519370496 expire 1519370346 last 1519370269
[1237847.583358] Lustre: Skipped 71 previous similar messages
[1243138.315863] Lustre: DEBUG MARKER: Fri Feb 23 00:49:46 2018
[1249632.034983] Lustre: oak-OST003b: haven't heard from client 3329e82e-efa7-879b-9e0f-29c2077791d5 (at 10.9.112.14@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88287201cc00, cur 1519382281 expire 1519382131 last 1519382054
[1249632.036002] Lustre: Skipped 35 previous similar messages
[1274571.945764] Lustre: oak-OST0036: Connection restored to 7a15e73a-a0ee-6bca-ab92-aa573877def2 (at 10.9.112.5@o2ib4)
[1274571.945765] Lustre: oak-OST0032: Connection restored to 7a15e73a-a0ee-6bca-ab92-aa573877def2 (at 10.9.112.5@o2ib4)
[1274571.945766] Lustre: oak-OST0034: Connection restored to 7a15e73a-a0ee-6bca-ab92-aa573877def2 (at 10.9.112.5@o2ib4)
[1274571.945767] Lustre: oak-OST0030: Connection restored to 7a15e73a-a0ee-6bca-ab92-aa573877def2 (at 10.9.112.5@o2ib4)
[1274571.945768] Lustre: Skipped 19 previous similar messages
[1274571.945769] Lustre: Skipped 19 previous similar messages
[1274571.945769] Lustre: Skipped 19 previous similar messages
[1274571.948355] Lustre: Skipped 14 previous similar messages
[1274581.872669] Lustre: oak-OST0030: Connection restored to a3309323-63b0-d61f-19a3-ab31f183f10e (at 10.9.104.54@o2ib4)
[1274581.873159] Lustre: Skipped 324 previous similar messages
[1274601.025042] Lustre: oak-OST0030: Connection restored to 51ac6e63-c3f7-ca59-16ef-4eb91d23dbd7 (at 10.9.104.51@o2ib4)
[1274601.025543] Lustre: Skipped 378 previous similar messages
[1274640.485861] Lustre: oak-OST0037: Connection restored to 590cd3e3-fa19-0835-a53e-1ffc2c2d0012 (at 10.8.28.9@o2ib6)
[1274640.485862] Lustre: oak-OST0031: Connection restored to 590cd3e3-fa19-0835-a53e-1ffc2c2d0012 (at 10.8.28.9@o2ib6)
[1274640.485863] Lustre: Skipped 303 previous similar messages
[1274640.487066] Lustre: Skipped 13 previous similar messages
[1274833.758623] Lustre: oak-OST0030: Connection restored to 6753e7eb-522c-7be3-4e53-e46f453e0ada (at 10.9.113.3@o2ib4)
[1274833.759103] Lustre: Skipped 250 previous similar messages
[1276366.686295] Lustre: oak-OST0032: Connection restored to 6cfc17b8-b5a8-7bc9-3ba4-f9d50bb01c21 (at 10.9.112.17@o2ib4)
[1276366.686296] Lustre: oak-OST0034: Connection restored to 6cfc17b8-b5a8-7bc9-3ba4-f9d50bb01c21 (at 10.9.112.17@o2ib4)
[1276366.686297] Lustre: oak-OST0038: Connection restored to 6cfc17b8-b5a8-7bc9-3ba4-f9d50bb01c21 (at 10.9.112.17@o2ib4)
[1276366.686298] Lustre: Skipped 49 previous similar messages
[1276366.686299] Lustre: Skipped 49 previous similar messages
[1276366.688210] Lustre: Skipped 12 previous similar messages
[1276385.785844] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1276385.785845] Lustre: oak-OST0036: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1276385.785847] Lustre: Skipped 2 previous similar messages
[1276385.787062] Lustre: Skipped 13 previous similar messages
[1276410.908547] Lustre: oak-OST0031: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1276410.909049] Lustre: Skipped 50 previous similar messages
[1276768.776977] Lustre: oak-OST0036: haven't heard from client 5441a0ea-9d87-2364-d972-e431616c1d62 (at 10.9.112.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8814ec3aa800, cur 1519409419 expire 1519409269 last 1519409192
[1276768.777989] Lustre: Skipped 35 previous similar messages
[1277086.662065] Lustre: oak-OST0032: Connection restored to 6cfc17b8-b5a8-7bc9-3ba4-f9d50bb01c21 (at 10.9.112.17@o2ib4)
[1277086.662066] Lustre: oak-OST0034: Connection restored to 6cfc17b8-b5a8-7bc9-3ba4-f9d50bb01c21 (at 10.9.112.17@o2ib4)
[1277086.662068] Lustre: Skipped 21 previous similar messages
[1277086.663257] Lustre: Skipped 13 previous similar messages
[1277106.931433] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1277106.931435] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1277106.931436] Lustre: oak-OST0036: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1277106.931437] Lustre: Skipped 1 previous similar message
[1277106.931437] Lustre: Skipped 1 previous similar message
[1277106.938557] Lustre: Skipped 14 previous similar messages
[1277132.076646] Lustre: oak-OST0039: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1277132.076648] Lustre: oak-OST0031: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1277132.076649] Lustre: oak-OST0035: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1277132.076651] Lustre: Skipped 34 previous similar messages
[1277132.076652] Lustre: Skipped 35 previous similar messages
[1277132.078588] Lustre: Skipped 13 previous similar messages
[1277485.428161] Lustre: oak-OST0034: Connection restored to 57635f11-dca4-393d-e47e-90cf0165d384 (at 10.9.112.14@o2ib4)
[1277485.428162] Lustre: oak-OST003a: Connection restored to 57635f11-dca4-393d-e47e-90cf0165d384 (at 10.9.112.14@o2ib4)
[1277485.428163] Lustre: Skipped 18 previous similar messages
[1277485.429368] Lustre: Skipped 12 previous similar messages
[1277487.767334] Lustre: oak-OST004f: haven't heard from client a7974e92-7914-f271-fc98-69afe5d66cad (at 10.9.112.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cdbdcf800, cur 1519410138 expire 1519409988 last 1519409911
[1277487.768349] Lustre: Skipped 107 previous similar messages
[1277802.424542] Lustre: oak-OST0030: Connection restored to 6cfc17b8-b5a8-7bc9-3ba4-f9d50bb01c21 (at 10.9.112.17@o2ib4)
[1277802.424544] Lustre: oak-OST0034: Connection restored to 6cfc17b8-b5a8-7bc9-3ba4-f9d50bb01c21 (at 10.9.112.17@o2ib4)
[1277802.424546] Lustre: oak-OST0032: Connection restored to 6cfc17b8-b5a8-7bc9-3ba4-f9d50bb01c21 (at 10.9.112.17@o2ib4)
[1277802.424547] Lustre: Skipped 14 previous similar messages
[1277802.424549] Lustre: Skipped 14 previous similar messages
[1277802.426569] Lustre: Skipped 13 previous similar messages
[1277886.709309] Lustre: oak-OST0041: haven't heard from client 133a7287-8266-4fa9-7928-bdda96a5c75f (at 10.9.112.14@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883c89313000, cur 1519410537 expire 1519410387 last 1519410310
[1277886.710261] Lustre: Skipped 107 previous similar messages
[1278178.692456] Lustre: oak-OST0039: haven't heard from client b0216e63-f0c0-71c1-b5ce-a54eae556ef1 (at 10.9.112.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881ff1953400, cur 1519410829 expire 1519410679 last 1519410602
[1278178.693417] Lustre: Skipped 35 previous similar messages
[1278545.001917] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1278545.001918] Lustre: oak-OST0036: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1278545.001919] Lustre: Skipped 119 previous similar messages
[1278545.003117] Lustre: Skipped 13 previous similar messages
[1278602.673788] Lustre: oak-OST0039: haven't heard from client bcd470bd-5253-e0f7-85b7-4ad3308a65f2 (at 10.9.112.14@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ca7843000, cur 1519411253 expire 1519411103 last 1519411026
[1278602.674739] Lustre: Skipped 107 previous similar messages
[1278946.658338] Lustre: oak-OST0038: haven't heard from client 9097c84d-615a-9440-6dc7-ae9e191b17c4 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88130f2e9000, cur 1519411597 expire 1519411447 last 1519411370
[1278946.659443] Lustre: Skipped 35 previous similar messages
[1279264.868510] Lustre: oak-OST003a: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1279264.868511] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1279264.868512] Lustre: oak-OST0036: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1279264.868513] Lustre: Skipped 89 previous similar messages
[1279264.868514] Lustre: Skipped 90 previous similar messages
[1279264.870532] Lustre: Skipped 11 previous similar messages
[1279283.668413] Lustre: oak-OST0033: haven't heard from client 8698f9f5-c30e-2993-4c7b-90071d4953d1 (at 10.9.112.17@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883d1997dc00, cur 1519411934 expire 1519411784 last 1519411707
[1279283.669360] Lustre: Skipped 71 previous similar messages
[1279666.649411] Lustre: oak-OST003b: haven't heard from client f38dd9b5-0d59-2122-09f9-20c634354cef (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881b78a18000, cur 1519412317 expire 1519412167 last 1519412090
[1279666.650396] Lustre: Skipped 35 previous similar messages
[1279981.490307] Lustre: oak-OST0036: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1279981.490309] Lustre: oak-OST0034: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1279981.490310] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1279981.490311] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1279981.490312] Lustre: Skipped 17 previous similar messages
[1279981.490313] Lustre: Skipped 17 previous similar messages
[1279981.490314] Lustre: Skipped 17 previous similar messages
[1279981.492899] Lustre: Skipped 14 previous similar messages
[1280383.617818] Lustre: oak-OST0031: haven't heard from client 78bd8f8c-2820-9c7b-52b2-caf4b2ebee7d (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8805c027a400, cur 1519413034 expire 1519412884 last 1519412807
[1280383.618933] Lustre: Skipped 35 previous similar messages
[1280561.531761] Lustre: DEBUG MARKER: Fri Feb 23 11:13:31 2018
[1280703.674661] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1280703.675135] Lustre: Skipped 62 previous similar messages
[1280718.109162] LustreError: 132-0: oak-OST0045: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200008c53:0xabf5:0x0] object 0x0:3018242 extent [46137344-50331647], client returned csum b2e206b3 (type 4), server csum 8a0c8ff7 (type 4)
[1280718.110180] LustreError: Skipped 14 previous similar messages
[1280746.577436] Lustre: oak-OST003d: haven't heard from client aa81197f-aba0-64b1-d844-c09875fc29be (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cac598400, cur 1519413397 expire 1519413247 last 1519413170
[1280746.578439] Lustre: Skipped 35 previous similar messages
[1280753.314503] LustreError: 132-0: oak-OST0045: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200008c53:0xabf5:0x0] object 0x0:3018242 extent [46137344-50331647], client returned csum b2e206b3 (type 4), server csum 8a0c8ff7 (type 4)
[1280753.315593] LustreError: Skipped 6 previous similar messages
[1280817.851631] LustreError: 132-0: oak-OST0045: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200008c53:0xabf5:0x0] object 0x0:3018242 extent [46137344-50331647], client returned csum b2e206b3 (type 4), server csum 8a0c8ff7 (type 4)
[1280817.852699] LustreError: Skipped 10 previous similar messages
[1281105.569743] Lustre: oak-OST0030: haven't heard from client 3ee6ad51-0711-75b0-7bc9-b392a6b44a05 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881cffa60400, cur 1519413756 expire 1519413606 last 1519413529
[1281105.570703] Lustre: Skipped 71 previous similar messages
[1281419.347769] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1281419.347770] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1281419.347772] Lustre: Skipped 122 previous similar messages
[1281419.349047] Lustre: Skipped 16 previous similar messages
[1281820.545750] Lustre: oak-OST0034: haven't heard from client 42f302db-a226-61a7-d35e-4446d1e1ac21 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cbe5d5800, cur 1519414471 expire 1519414321 last 1519414244
[1281820.546707] Lustre: Skipped 35 previous similar messages
[1282139.825903] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1282139.825904] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1282139.825906] Lustre: Skipped 52 previous similar messages
[1282139.827110] Lustre: Skipped 15 previous similar messages
[1282183.534773] Lustre: oak-OST0030: haven't heard from client b573c12b-5a94-ff00-20c6-26d5fff91b01 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883c91afe800, cur 1519414834 expire 1519414684 last 1519414607
[1282183.535732] Lustre: Skipped 35 previous similar messages
[1282512.536172] Lustre: oak-OST0036: haven't heard from client 39e0cc97-3498-7d4d-9f0f-2cb2418b7344 (at 10.9.101.60@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88349c5d7800, cur 1519415163 expire 1519415013 last 1519414936
[1282512.537125] Lustre: Skipped 35 previous similar messages
[1283243.809505] Lustre: oak-OST0030: Connection restored to 0fb09ad7-848f-0329-d2d6-a5d10baed2d0 (at 10.9.113.8@o2ib4)
[1283243.809997] Lustre: Skipped 195 previous similar messages
[1283470.453129] Lustre: oak-OST004f: haven't heard from client 7caaf3bb-2673-7c3e-80a2-f6a25b84558e (at 10.8.9.6@o2ib6) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880751a6b000, cur 1519416121 expire 1519415971 last 1519415894
[1283470.454174] Lustre: Skipped 71 previous similar messages
[1284616.396784] Lustre: oak-OST003c: haven't heard from client ae8efa8b-fbec-cb2a-23a5-9c09f69719cd (at 10.12.4.68@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881cfe4c9400, cur 1519417267 expire 1519417117 last 1519417040
[1284616.397819] Lustre: Skipped 35 previous similar messages
[1284642.724960] Lustre: oak-OST0032: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[1284642.724961] Lustre: oak-OST0034: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[1284642.724962] Lustre: Skipped 150 previous similar messages
[1284642.726196] Lustre: Skipped 13 previous similar messages
[1284717.724490] Lustre: oak-OST0031: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[1284717.724977] Lustre: Skipped 17 previous similar messages
[1284747.460646] LustreError: 132-0: oak-OST0051: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dd8:0x0] object 0x0:162805 extent [1539309568-1543503871], client returned csum 7d3a3f7d (type 4), server csum 111ca6a0 (type 4)
[1284747.461615] LustreError: Skipped 1 previous similar message
[1284768.142802] LustreError: 132-0: oak-OST0051: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dd8:0x0] object 0x0:162805 extent [1539309568-1543503871], client returned csum 7d3a3f7d (type 4), server csum 111ca6a0 (type 4)
[1284768.149130] LustreError: Skipped 4 previous similar messages
[1284918.203921] LustreError: 132-0: oak-OST0051: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dd8:0x0] object 0x0:162805 extent [1539309568-1543503871], client returned csum 7d3a3f7d (type 4), server csum 111ca6a0 (type 4)
[1284918.204893] LustreError: Skipped 4 previous similar messages
[1284984.647619] LustreError: 132-0: oak-OST0051: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dd8:0x0] object 0x0:162805 extent [1539309568-1543503871], client returned csum 7d3a3f7d (type 4), server csum 111ca6a0 (type 4)
[1284984.648621] LustreError: Skipped 12 previous similar messages
[1285094.391649] Lustre: oak-OST003b: haven't heard from client f39f0e19-5eaa-43e3-c3ea-2e223296d1a4 (at 10.12.4.68@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881a8522b800, cur 1519417745 expire 1519417595 last 1519417518
[1285094.392602] Lustre: Skipped 35 previous similar messages
[1285128.511925] Lustre: oak-OST0030: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[1285128.511926] Lustre: oak-OST0032: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[1285128.512876] Lustre: Skipped 16 previous similar messages
[1285203.510772] Lustre: oak-OST0033: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[1285203.510773] Lustre: oak-OST0031: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[1285203.511740] Lustre: Skipped 15 previous similar messages
[1285503.424408] Lustre: oak-OST0034: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[1285503.424409] Lustre: oak-OST0032: Connection restored to f4b38ec1-850e-7e45-feb8-6ea6e1013f81 (at 10.12.4.68@o2ib)
[1285503.424411] Lustre: Skipped 1 previous similar message
[1285503.425605] Lustre: Skipped 15 previous similar messages
[1285505.374333] Lustre: oak-OST0036: haven't heard from client 3aa01694-09dd-7d04-e8f1-1e3426332284 (at 10.12.4.68@o2ib) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880b3949e400, cur 1519418156 expire 1519418006 last 1519417929
[1285505.375331] Lustre: Skipped 35 previous similar messages
[1287971.339983] LustreError: 132-0: oak-OST0031: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8deb:0x0] object 0x0:4391627 extent [3124756480-3128950783], client returned csum e6cf8ef (type 4), server csum 63d56461 (type 4)
[1287971.340957] LustreError: Skipped 6 previous similar messages
[1287991.355999] LustreError: 132-0: oak-OST0031: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8deb:0x0] object 0x0:4391627 extent [3124756480-3128950783], client returned csum e6cf8ef (type 4), server csum 63d56461 (type 4)
[1287991.357208] LustreError: Skipped 4 previous similar messages
[1288025.993556] LustreError: 132-0: oak-OST0031: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8deb:0x0] object 0x0:4391627 extent [3124756480-3128950783], client returned csum e6cf8ef (type 4), server csum 63d56461 (type 4)
[1288025.994531] LustreError: Skipped 3 previous similar messages
[1288165.506112] LustreError: 132-0: oak-OST0031: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8deb:0x0] object 0x0:4391627 extent [3124756480-3128950783], client returned csum e6cf8ef (type 4), server csum 63d56461 (type 4)
[1288300.982631] LustreError: 132-0: oak-OST0031: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8deb:0x0] object 0x0:4391627 extent [3124756480-3128950783], client returned csum e6cf8ef (type 4), server csum 63d56461 (type 4)
[1288300.983613] LustreError: Skipped 17 previous similar messages
[1289132.188657] Lustre: oak-OST0050: haven't heard from client a3d3e82a-aa6f-03a1-1e54-d2436b9b790a (at 10.210.45.46@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ca6ea2c00, cur 1519421783 expire 1519421633 last 1519421556
[1289132.189620] Lustre: Skipped 35 previous similar messages
[1291745.738489] Lustre: oak-OST0030: Connection restored to 06585daa-72d5-9293-64a7-1d0994d10593 (at 10.8.27.24@o2ib6)
[1291745.738976] Lustre: Skipped 34 previous similar messages
[1292601.988107] LustreError: 132-0: oak-OST0040: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dda:0x0] object 0x0:4434610 extent [4676648960-4680843263], client returned csum 657ce454 (type 4), server csum 87204dc7 (type 4)
[1292601.989086] LustreError: Skipped 1 previous similar message
[1292636.774352] LustreError: 132-0: oak-OST0040: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dda:0x0] object 0x0:4434610 extent [4676648960-4680843263], client returned csum 657ce454 (type 4), server csum 87204dc7 (type 4)
[1292636.775329] LustreError: Skipped 6 previous similar messages
[1292905.336139] LustreError: 132-0: oak-OST0040: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dda:0x0] object 0x0:4434610 extent [4676648960-4680843263], client returned csum 657ce454 (type 4), server csum 87204dc7 (type 4)
[1292905.337111] LustreError: Skipped 2 previous similar messages
[1293038.764435] LustreError: 132-0: oak-OST0040: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dda:0x0] object 0x0:4434610 extent [4676648960-4680843263], client returned csum 657ce454 (type 4), server csum 87204dc7 (type 4)
[1293038.765475] LustreError: Skipped 18 previous similar messages
[1293301.747961] LustreError: 132-0: oak-OST0040: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dda:0x0] object 0x0:4434610 extent [4678746112-4680843263], client returned csum dca056d7 (type 4), server csum 3efcff44 (type 4)
[1293301.748948] LustreError: Skipped 46 previous similar messages
[1295692.883195] Lustre: oak-OST0053: haven't heard from client a6c83b55-0c57-d887-4d6c-5b5686fbfa7e (at 10.210.44.17@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880c4d185000, cur 1519428344 expire 1519428194 last 1519428117
[1295692.884179] Lustre: Skipped 71 previous similar messages
[1295911.888084] Lustre: oak-OST0040: haven't heard from client f98872db-7abe-2a9a-2f3c-36218aaa6cf2 (at 10.210.44.18@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cd7b24800, cur 1519428563 expire 1519428413 last 1519428336
[1295911.889086] Lustre: Skipped 35 previous similar messages
[1296515.873579] Lustre: oak-OST0034: haven't heard from client 551963be-92d7-db3e-b398-33a21bf055e3 (at 10.210.44.37@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cd9dddc00, cur 1519429167 expire 1519429017 last 1519428940
[1296515.874545] Lustre: Skipped 35 previous similar messages
[1296591.843823] Lustre: oak-OST0049: haven't heard from client 99f21505-d405-7b58-c0ca-66eb991e0c05 (at 10.210.44.39@o2ib3) in 198 seconds. I think it's dead, and I am evicting it. exp ffff8818aaeea400, cur 1519429243 expire 1519429093 last 1519429045
[1296591.844769] Lustre: Skipped 71 previous similar messages
[1296667.841580] Lustre: oak-OST003c: haven't heard from client ed6a567d-bffe-57f8-2594-d2ad03a7bd32 (at 10.210.45.12@o2ib3) in 220 seconds. I think it's dead, and I am evicting it. exp ffff883ca7f3e400, cur 1519429319 expire 1519429169 last 1519429099
[1296667.842580] Lustre: Skipped 35 previous similar messages
[1297984.310400] LustreError: 132-0: oak-OST0051: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dd8:0x0] object 0x0:162805 extent [5897191424-5899550719], client returned csum 882d5a82 (type 4), server csum 4ad47bbc (type 4)
[1297984.311486] LustreError: Skipped 53 previous similar messages
[1298053.536744] LustreError: 132-0: oak-OST0051: BAD READ CHECKSUM: should have changed on the client or in transit: from 10.0.2.230@o2ib5 inode [0x200010201:0x8dd8:0x0] object 0x0:162805 extent [5897191424-5901385727], client returned csum dbc0eb5f (type 4), server csum 376d86ab (type 4)
[1298053.537722] LustreError: Skipped 14 previous similar messages
[1298224.085441] Lustre: oak-OST0030: Connection restored to 77bd6f8f-dfb2-10d9-8d00-57a663f314f1 (at 10.210.45.13@o2ib3)
[1298224.085939] Lustre: Skipped 67 previous similar messages
[1298457.781388] Lustre: oak-OST003a: haven't heard from client 77bd6f8f-dfb2-10d9-8d00-57a663f314f1 (at 10.210.45.13@o2ib3) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883d272be800, cur 1519431109 expire 1519430959 last 1519430882
[1298457.782376] Lustre: Skipped 35 previous similar messages
[1299834.891585] Lustre: 141924:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519432479/real 1519432479] req@ffff88005c226f00 x1592481984075968/t0(0) o106->oak-OST0039@10.8.2.20@o2ib6:15/16 lens 296/280 e 0 to 1 dl 1519432486 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
[1299834.891589] Lustre: 17272:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519432479/real 1519432479] req@ffff880053ce7800 x1592481984075984/t0(0) o106->oak-OST0046@10.8.2.20@o2ib6:15/16 lens 296/280 e 0 to 1 dl 1519432486 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
[1299834.891592] Lustre: 17272:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 36 previous similar messages
[1301272.084851] Lustre: oak-OST0030: Connection restored to f30803fa-607a-2a67-3576-43087423cbf2 (at 10.8.8.34@o2ib6)
[1301272.084852] Lustre: oak-OST0032: Connection restored to f30803fa-607a-2a67-3576-43087423cbf2 (at 10.8.8.34@o2ib6)
[1301272.084853] Lustre: Skipped 2 previous similar messages
[1301272.086055] Lustre: Skipped 15 previous similar messages
[1301297.281790] Lustre: oak-OST0031: Connection restored to f30803fa-607a-2a67-3576-43087423cbf2 (at 10.8.8.34@o2ib6)
[1301297.282267] Lustre: Skipped 100 previous similar messages
[1301589.729956] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1301589.730451] Lustre: Skipped 102 previous similar messages
[1301614.757183] Lustre: oak-OST004b: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1301614.757184] Lustre: oak-OST0053: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1301614.757185] Lustre: oak-OST0043: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1301614.757186] Lustre: oak-OST0047: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1301614.757188] Lustre: Skipped 8 previous similar messages
[1301614.757188] Lustre: Skipped 8 previous similar messages
[1301614.757189] Lustre: Skipped 8 previous similar messages
[1301614.760024] Lustre: Skipped 5 previous similar messages
[1301959.916553] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1301959.917038] Lustre: Skipped 13 previous similar messages
[1301991.630173] Lustre: oak-OST0030: haven't heard from client 121dcbd3-8e5d-9112-5d3b-a9c0f4eb8bb9 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff882eebc35c00, cur 1519434643 expire 1519434493 last 1519434416
[1301991.631115] Lustre: Skipped 17 previous similar messages
[1301994.610602] Lustre: oak-OST0043: haven't heard from client 121dcbd3-8e5d-9112-5d3b-a9c0f4eb8bb9 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88083ad28000, cur 1519434646 expire 1519434496 last 1519434419
[1301994.617008] Lustre: Skipped 32 previous similar messages
[1302000.616624] Lustre: oak-OST0039: haven't heard from client 121dcbd3-8e5d-9112-5d3b-a9c0f4eb8bb9 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8808bd4e9400, cur 1519434652 expire 1519434502 last 1519434425
[1302360.576355] Lustre: oak-OST003d: haven't heard from client 6bafd216-b326-77e1-e921-89a03111fc84 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8823dd911400, cur 1519435012 expire 1519434862 last 1519434785
[1302360.577313] Lustre: Skipped 1 previous similar message
[1302490.967154] Lustre: oak-OST0032: Connection restored to e79fe59f-4a6e-3dd9-6e7e-c0551da1735f (at 10.210.46.123@o2ib3)
[1302490.967155] Lustre: oak-OST0030: Connection restored to e79fe59f-4a6e-3dd9-6e7e-c0551da1735f (at 10.210.46.123@o2ib3)
[1302490.967157] Lustre: Skipped 22 previous similar messages
[1302490.968386] Lustre: Skipped 13 previous similar messages
[1303084.550688] Lustre: oak-OST0040: haven't heard from client 7eb29d38-a42b-5ed7-c22f-5aa05669584c (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88067fc92800, cur 1519435736 expire 1519435586 last 1519435509
[1303084.551711] Lustre: Skipped 35 previous similar messages
[1303406.653746] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1303406.653747] Lustre: oak-OST0038: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1303406.653749] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4)
[1303406.653750] Lustre: Skipped 53 previous similar messages
[1303406.653750] Lustre: Skipped 53 previous similar messages
[1303406.655728] Lustre: Skipped 12 previous similar messages
[1303603.125876] Lustre: 249468:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519436247/real 1519436247] req@ffff8809e272ec00 x1592481985110224/t0(0) o106->oak-OST0049@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519436254 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
[1303603.125878] Lustre: 249474:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519436247/real 1519436247] req@ffff880028afbf00 x1592481985110208/t0(0) o106->oak-OST0042@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519436254 ref 1 fl Rpc:X/0/ffffffff rc 0/-1
[1303610.125532] Lustre: 249474:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519436254/real 1519436254] req@ffff880028afbf00 x1592481985110208/t0(0) o106->oak-OST0042@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519436261 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[1303617.126182] Lustre: 249474:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519436261/real 1519436261] req@ffff880028afbf00 x1592481985110208/t0(0) o106->oak-OST0042@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519436268 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[1303617.127289] Lustre: 249474:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
[1303624.126861] Lustre: 249468:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519436268/real 1519436268] req@ffff8809e272ec00 x1592481985110224/t0(0) o106->oak-OST0049@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519436275 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[1303624.128065] Lustre: 249468:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 1 previous similar message
[1303638.126243] Lustre: 249474:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519436282/real 1519436282] req@ffff880028afbf00 x1592481985110208/t0(0) o106->oak-OST0042@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519436289 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[1303638.127236] Lustre: 249474:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 3 previous similar messages
[1303659.126240] Lustre: 249468:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519436303/real 1519436303] req@ffff8809e272ec00 x1592481985110224/t0(0) o106->oak-OST0049@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519436310 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[1303659.126242] Lustre: 249474:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519436303/real 1519436303] req@ffff880028afbf00 x1592481985110208/t0(0) o106->oak-OST0042@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519436310 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[1303659.126244] Lustre: 249474:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 4 previous similar messages [1303701.124265] Lustre: 249474:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519436345/real 1519436345] req@ffff880028afbf00 x1592481985110208/t0(0) o106->oak-OST0042@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519436352 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1303701.125208] Lustre: 249474:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 10 previous similar messages [1303769.859556] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1303769.860055] Lustre: Skipped 30 previous similar messages [1303771.122413] LustreError: 249474:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) ### client (nid 10.9.112.15@o2ib4) returned error from glimpse AST (req@ffff880028afbf00 x1592481985110208 status -107 rc -107), evict it ns: filter-oak-OST0042_UUID lock: ffff8800669ca000/0x806f959362256fd7 lrc: 4/0,0 mode: PW/PW res: [0x421917:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0x40000000000000 nid: 10.9.112.15@o2ib4 remote: 0x574cd27843e52b31 expref: 5 pid: 249335 timeout: 0 lvb_type: 0 [1303771.123410] LustreError: 138-a: oak-OST0049: A client on nid 10.9.112.15@o2ib4 was evicted due to a lock glimpse callback time out: rc -107 [1303771.124692] LustreError: 249474:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) Skipped 1 previous similar message [1303808.522604] Lustre: oak-OST0033: haven't heard from client 383da4a6-81fa-285e-647b-4cc6d0a5aef5 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880cc63cd000, cur 1519436460 expire 1519436310 last 1519436233 [1303808.523577] Lustre: Skipped 35 previous similar messages [1304171.493076] Lustre: oak-OST0032: haven't heard from client f243424f-9c95-754e-10c7-7902bc37011f (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88159cd72800, cur 1519436823 expire 1519436673 last 1519436596 [1304171.494066] Lustre: Skipped 31 previous similar messages [1304854.385401] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1304854.385886] Lustre: Skipped 32 previous similar messages [1304879.649068] Lustre: oak-OST0033: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1304879.649571] Lustre: Skipped 9 previous similar messages [1305215.726369] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1305215.726370] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1305215.727353] Lustre: Skipped 14 previous similar messages [1305253.467430] Lustre: oak-OST003c: haven't heard from client b82d75f2-2bf9-91ed-5de2-71f8d922db6c (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883d3a970400, cur 1519437905 expire 1519437755 last 1519437678 [1305253.468381] Lustre: Skipped 35 previous similar messages [1305256.447428] Lustre: oak-OST003d: haven't heard from client b82d75f2-2bf9-91ed-5de2-71f8d922db6c (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff880b35160000, cur 1519437908 expire 1519437758 last 1519437681 [1305256.448477] Lustre: Skipped 1 previous similar message [1305262.439174] Lustre: oak-OST0030: haven't heard from client b82d75f2-2bf9-91ed-5de2-71f8d922db6c (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883d2e372c00, cur 1519437914 expire 1519437764 last 1519437687 [1305262.440133] Lustre: Skipped 30 previous similar messages [1305304.123746] Lustre: oak-OST0030: Connection restored to b90740e3-d61c-ff37-5776-dec50cda45bb (at 10.9.102.42@o2ib4) [1305304.124242] Lustre: Skipped 23 previous similar messages [1305440.432670] Lustre: oak-OST003b: haven't heard from client e95d51b7-d047-0aab-47c7-7b9fa69b0296 (at 10.9.113.6@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff882df0493c00, cur 1519438092 expire 1519437942 last 1519437865 [1305440.433616] Lustre: Skipped 2 previous similar messages [1305511.785459] Lustre: oak-OST0036: Connection restored to 06585daa-72d5-9293-64a7-1d0994d10593 (at 10.8.27.24@o2ib6) [1305511.785937] Lustre: Skipped 325 previous similar messages [1305617.440126] Lustre: oak-OST004e: haven't heard from client 569de15f-8859-521c-aecf-e9e2ad56118e (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883ccbe03800, cur 1519438269 expire 1519438119 last 1519438042 [1305617.441107] Lustre: Skipped 107 previous similar messages [1305937.004629] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1305937.005130] Lustre: Skipped 67 previous similar messages [1306126.688306] Lustre: 390779:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519438771/real 1519438771] req@ffff880b91e54b00 x1592481986161856/t0(0) o106->oak-OST0033@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519438778 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [1306126.689289] Lustre: 390779:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 21 previous similar messages [1306147.688362] Lustre: 390779:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519438792/real 1519438792] req@ffff880b91e54b00 x1592481986161856/t0(0) o106->oak-OST0033@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519438799 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1306147.689363] Lustre: 390779:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 4 previous similar messages [1306185.685602] Lustre: 390793:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519438826/real 1519438826] req@ffff88005721e300 x1592481986161872/t0(0) o106->oak-OST003a@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519438837 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1306185.686617] Lustre: 390793:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 8 previous similar messages [1306262.682959] Lustre: 390793:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519438903/real 1519438903] req@ffff88005721e300 x1592481986161872/t0(0) o106->oak-OST003a@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519438914 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1306262.684037] Lustre: 390793:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 17 previous similar messages [1306294.682908] LustreError: 390779:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) ### client (nid 10.9.112.15@o2ib4) returned error from glimpse AST (req@ffff880b91e54b00 x1592481986161856 status -107 rc -107), 
evict it ns: filter-oak-OST0033_UUID lock: ffff88004973f000/0x806f9593622e3c4f lrc: 4/0,0 mode: PW/PW res: [0x437cf4:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0x40000000000000 nid: 10.9.112.15@o2ib4 remote: 0x3895f6c5428b1997 expref: 5 pid: 362684 timeout: 0 lvb_type: 0 [1306294.684660] LustreError: 390779:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) Skipped 2 previous similar messages [1306294.690848] LustreError: 138-a: oak-OST0033: A client on nid 10.9.112.15@o2ib4 was evicted due to a lock glimpse callback time out: rc -107 [1306294.691356] LustreError: Skipped 3 previous similar messages [1306295.682676] LustreError: 390793:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) ### client (nid 10.9.112.15@o2ib4) returned error from glimpse AST (req@ffff88005721e300 x1592481986161872 status -107 rc -107), evict it ns: filter-oak-OST003a_UUID lock: ffff8808af16c800/0x806f9593622e452b lrc: 4/0,0 mode: PW/PW res: [0x434ae0:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0x40000000000000 nid: 10.9.112.15@o2ib4 remote: 0x3895f6c5428b19eb expref: 5 pid: 132963 timeout: 0 lvb_type: 0 [1306295.684606] LustreError: 138-a: oak-OST003a: A client on nid 10.9.112.15@o2ib4 was evicted due to a lock glimpse callback time out: rc -107 [1306332.414678] Lustre: oak-OST0033: haven't heard from client 37f505d3-7f6e-32cf-6b6b-ca723026d0a8 (at 10.8.27.24@o2ib6) in 196 seconds. I think it's dead, and I am evicting it. exp ffff881ff876e800, cur 1519438984 expire 1519438834 last 1519438788 [1306332.415643] Lustre: Skipped 35 previous similar messages [1306337.392610] Lustre: oak-OST0050: haven't heard from client 718597f3-70cf-fd4e-1061-fb6ca2fd6515 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d03b35000, cur 1519438989 expire 1519438839 last 1519438762 [1306337.393580] Lustre: Skipped 3 previous similar messages [1306341.437587] Lustre: oak-OST0040: haven't heard from client 37f505d3-7f6e-32cf-6b6b-ca723026d0a8 (at 10.8.27.24@o2ib6) in 205 seconds. I think it's dead, and I am evicting it. exp ffff881ecaa8f000, cur 1519438993 expire 1519438843 last 1519438788 [1306341.438800] Lustre: Skipped 92 previous similar messages [1306658.760988] Lustre: oak-OST0034: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1306658.760990] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1306658.760991] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1306658.760992] Lustre: Skipped 86 previous similar messages [1306658.760993] Lustre: Skipped 86 previous similar messages [1306658.762926] Lustre: Skipped 15 previous similar messages [1306696.375524] Lustre: oak-OST0041: haven't heard from client a479ea1b-a2f9-9649-f834-362d1d93f7e9 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880248a9c400, cur 1519439348 expire 1519439198 last 1519439121 [1306696.376498] Lustre: Skipped 5 previous similar messages [1307056.374038] Lustre: oak-OST0047: haven't heard from client 5c3b5bec-b026-9492-faa5-22fc5fad13fb (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff881e096ef400, cur 1519439708 expire 1519439558 last 1519439481 [1307056.375002] Lustre: Skipped 35 previous similar messages [1307378.844059] Lustre: oak-OST003c: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1307378.844060] Lustre: oak-OST0038: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1307378.844061] Lustre: oak-OST003e: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1307378.844062] Lustre: oak-OST0034: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1307378.844063] Lustre: Skipped 50 previous similar messages [1307378.844064] Lustre: Skipped 50 previous similar messages [1307378.844065] Lustre: Skipped 50 previous similar messages [1307378.846956] Lustre: Skipped 8 previous similar messages [1307421.336769] Lustre: oak-OST003f: haven't heard from client 4564b1e6-cbab-5450-0ad6-4f7e8b138df2 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880a029afc00, cur 1519440073 expire 1519439923 last 1519439846 [1307421.337769] Lustre: Skipped 35 previous similar messages [1307780.322105] Lustre: oak-OST004b: haven't heard from client 67357652-0bb5-c580-e574-58a776fb9161 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff8827d8aa0400, cur 1519440432 expire 1519440282 last 1519440205 [1307780.323069] Lustre: Skipped 35 previous similar messages [1308105.420624] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1308105.421121] Lustre: Skipped 98 previous similar messages [1308504.301623] Lustre: oak-OST0051: haven't heard from client 1d8dfad6-be37-8a9e-fe0f-43045062e957 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff88141e797400, cur 1519441156 expire 1519441006 last 1519440929 [1308504.302612] Lustre: Skipped 35 previous similar messages [1308658.848497] Lustre: 17265:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519441303/real 1519441303] req@ffff8819da8d0900 x1592481987629072/t0(0) o106->oak-OST0052@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519441310 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [1308658.849495] Lustre: 17265:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 8 previous similar messages [1308679.848534] Lustre: 17265:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519441324/real 1519441324] req@ffff8819da8d0900 x1592481987629072/t0(0) o106->oak-OST0052@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519441331 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1308679.849725] Lustre: 17265:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [1308721.847603] Lustre: 17265:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519441366/real 1519441366] req@ffff8819da8d0900 x1592481987629072/t0(0) o106->oak-OST0052@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519441373 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1308721.848610] Lustre: 17265:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [1308798.845995] Lustre: 17265:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519441443/real 1519441443] req@ffff8819da8d0900 x1592481987629072/t0(0) o106->oak-OST0052@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519441450 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1308798.846997] Lustre: 17265:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 10 previous similar messages [1308852.453480] LNet: Service thread pid 17265 was inactive for 200.61s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [1308852.454210] Pid: 17265, comm: ll_ost00_032 [1308852.454456] Call Trace: [1308852.454923] [] schedule+0x29/0x70 [1308852.455171] [] schedule_timeout+0x174/0x2c0 [1308852.455420] [] ? process_timeout+0x0/0x10 [1308852.455718] [] ptlrpc_set_wait+0x4c0/0x910 [ptlrpc] [1308852.456054] [] ? default_wake_function+0x0/0x20 [1308852.456357] [] ldlm_run_ast_work+0xd3/0x3a0 [ptlrpc] [1308852.456639] [] ldlm_glimpse_locks+0x3b/0x100 [ptlrpc] [1308852.456902] [] ofd_intent_policy+0x444/0xa30 [ofd] [1308852.457165] [] ldlm_lock_enqueue+0x387/0x970 [ptlrpc] [1308852.457430] [] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc] [1308852.457920] [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc] [1308852.458417] [] tgt_enqueue+0x62/0x210 [ptlrpc] [1308852.458691] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1308852.458959] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1308852.459446] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1308852.459694] [] ? default_wake_function+0x12/0x20 [1308852.459962] [] ? __wake_up_common+0x58/0x90 [1308852.460232] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1308852.460528] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1308852.460788] [] kthread+0xcf/0xe0 [1308852.461056] [] ? kthread+0x0/0xe0 [1308852.461323] [] ret_from_fork+0x58/0x90 [1308852.461683] [] ? 
kthread+0x0/0xe0 [1308852.461920] [1308852.462149] LustreError: dumping log to /tmp/lustre-log.1519441504.17265 [1308854.285265] Lustre: oak-OST0033: haven't heard from client 4b6463cc-fa45-ef77-7f7b-f455bd7e7e80 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d2cb22000, cur 1519441506 expire 1519441356 last 1519441279 [1308854.286235] Lustre: Skipped 35 previous similar messages [1308865.290474] LNet: Service thread pid 17265 completed after 213.45s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [1308871.274564] Lustre: oak-OST0051: haven't heard from client 4b6463cc-fa45-ef77-7f7b-f455bd7e7e80 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881c062bcc00, cur 1519441523 expire 1519441373 last 1519441296 [1308871.275540] Lustre: Skipped 34 previous similar messages [1309190.040564] Lustre: oak-OST0034: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1309190.040565] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1309190.040567] Lustre: Skipped 53 previous similar messages [1309190.042055] Lustre: Skipped 15 previous similar messages [1309591.241135] Lustre: oak-OST0035: haven't heard from client dc520497-0ed3-7fcc-0ff7-ef6979cd6d72 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881e033ab400, cur 1519442243 expire 1519442093 last 1519442016 [1309943.244530] Lustre: oak-OST0038: haven't heard from client 57fe3b57-b29a-46d5-c4e3-434d3c2841a8 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883edce3bc00, cur 1519442595 expire 1519442445 last 1519442368 [1309943.245578] Lustre: Skipped 35 previous similar messages [1309954.234381] Lustre: oak-OST0046: haven't heard from client 57fe3b57-b29a-46d5-c4e3-434d3c2841a8 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88153b1e7000, cur 1519442606 expire 1519442456 last 1519442379 [1309954.235343] Lustre: Skipped 32 previous similar messages [1310635.996818] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1310635.997297] Lustre: Skipped 66 previous similar messages [1311037.171396] Lustre: oak-OST003a: haven't heard from client 88674c29-77b1-6988-7736-4d705a1911ef (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff880bfacdb800, cur 1519443689 expire 1519443539 last 1519443462 [1311037.172369] Lustre: Skipped 2 previous similar messages [1312083.322689] Lustre: oak-OST0034: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1312083.323181] Lustre: Skipped 32 previous similar messages [1312108.498988] Lustre: oak-OST0035: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1312108.498989] Lustre: oak-OST0043: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1312108.498991] Lustre: Skipped 7 previous similar messages [1312108.505787] Lustre: Skipped 7 previous similar messages [1312442.857491] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1312442.857493] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1312442.858449] Lustre: Skipped 16 previous similar messages [1312485.105471] Lustre: oak-OST0034: haven't heard from client 5352dd19-0cc0-609f-cada-baa09ad688bb (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881e7a36f800, cur 1519445137 expire 1519444987 last 1519444910 [1312485.106440] Lustre: Skipped 35 previous similar messages [1312637.731153] Lustre: 132016:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519445282/real 1519445282] req@ffff88005f1c1800 x1592481989308912/t0(0) o106->oak-OST0042@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519445289 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [1312637.732416] Lustre: 132016:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 9 previous similar messages [1312658.731172] Lustre: 132016:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519445303/real 1519445303] req@ffff88005f1c1800 x1592481989308912/t0(0) o106->oak-OST0042@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519445310 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1312658.732159] Lustre: 132016:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [1312700.730209] Lustre: 132016:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519445345/real 1519445345] req@ffff88005f1c1800 x1592481989308912/t0(0) o106->oak-OST0042@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519445352 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1312700.731165] Lustre: 132016:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [1312777.727619] Lustre: 132016:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519445422/real 1519445422] req@ffff88005f1c1800 x1592481989308912/t0(0) o106->oak-OST0042@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519445429 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1312777.728579] Lustre: 132016:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 10 previous similar messages [1312831.020139] LNet: Service thread pid 132016 was inactive for 200.29s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [1312831.020877] Pid: 132016, comm: ll_ost00_005 [1312831.021150] Call Trace: [1312831.021634] [] schedule+0x29/0x70 [1312831.021909] [] schedule_timeout+0x174/0x2c0 [1312831.022176] [] ? process_timeout+0x0/0x10 [1312831.022472] [] ptlrpc_set_wait+0x4c0/0x910 [ptlrpc] [1312831.022720] [] ? 
default_wake_function+0x0/0x20 [1312831.023019] [] ldlm_run_ast_work+0xd3/0x3a0 [ptlrpc] [1312831.023306] [] ldlm_glimpse_locks+0x3b/0x100 [ptlrpc] [1312831.023566] [] ofd_intent_policy+0x444/0xa30 [ofd] [1312831.023875] [] ldlm_lock_enqueue+0x387/0x970 [ptlrpc] [1312831.024179] [] ldlm_handle_enqueue0+0x9c3/0x1680 [ptlrpc] [1312831.024731] [] ? lustre_swab_ldlm_request+0x0/0x30 [ptlrpc] [1312831.025276] [] tgt_enqueue+0x62/0x210 [ptlrpc] [1312831.025550] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1312831.025873] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1312831.026403] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1312831.026655] [] ? default_wake_function+0x12/0x20 [1312831.026906] [] ? __wake_up_common+0x58/0x90 [1312831.027223] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1312831.027502] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1312831.027765] [] kthread+0xcf/0xe0 [1312831.028024] [] ? kthread+0x0/0xe0 [1312831.028291] [] ret_from_fork+0x58/0x90 [1312831.028533] [] ? kthread+0x0/0xe0 [1312831.028818] [1312831.029085] LustreError: dumping log to /tmp/lustre-log.1519445482.132016 [1312840.096816] Lustre: oak-OST0040: haven't heard from client 3e76c9eb-9a6e-7b92-cae4-6e1294a0c730 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881cffa65400, cur 1519445492 expire 1519445342 last 1519445265 [1312840.097866] Lustre: Skipped 35 previous similar messages [1312843.110588] Lustre: oak-OST0042: haven't heard from client 3e76c9eb-9a6e-7b92-cae4-6e1294a0c730 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88153b1e0000, cur 1519445495 expire 1519445345 last 1519445268 [1312843.111639] LNet: Service thread pid 132016 completed after 212.38s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources). [1312848.090196] Lustre: oak-OST0039: haven't heard from client 3e76c9eb-9a6e-7b92-cae4-6e1294a0c730 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883cac56bc00, cur 1519445500 expire 1519445350 last 1519445273 [1312848.091170] Lustre: Skipped 29 previous similar messages [1313165.495330] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1313165.495332] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1313165.495334] Lustre: Skipped 17 previous similar messages [1313165.496614] Lustre: Skipped 16 previous similar messages [1313561.050308] Lustre: oak-OST0030: haven't heard from client 6473208f-773b-b394-bc39-e21e7c112935 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883d71402c00, cur 1519446213 expire 1519446063 last 1519445986 [1313561.051272] Lustre: Skipped 4 previous similar messages [1313888.182910] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1313888.183401] Lustre: Skipped 23 previous similar messages [1314289.030740] Lustre: oak-OST0052: haven't heard from client 90c8b69b-478a-5303-d4ab-541f49a976aa (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff880d1230bc00, cur 1519446941 expire 1519446791 last 1519446714 [1314289.031778] Lustre: Skipped 35 previous similar messages [1314608.666423] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1314608.666930] Lustre: Skipped 75 previous similar messages [1314650.005241] Lustre: oak-OST0045: haven't heard from client 668312ba-212f-c5b1-7aec-99f47a16bd8d (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881cb497dc00, cur 1519447302 expire 1519447152 last 1519447075 [1314650.006225] Lustre: Skipped 35 previous similar messages [1314803.903502] Lustre: 17266:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519447448/real 1519447448] req@ffff8837b1ac1200 x1592481989729392/t0(0) o106->oak-OST003c@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519447455 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [1314803.904455] Lustre: 17266:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 10 previous similar messages [1314824.902574] Lustre: 249330:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519447469/real 1519447469] req@ffff8812804c5d00 x1592481989729408/t0(0) o106->oak-OST0035@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519447476 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1314824.903625] Lustre: 249330:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [1314866.901582] Lustre: 17266:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519447511/real 1519447511] req@ffff8837b1ac1200 x1592481989729392/t0(0) o106->oak-OST003c@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519447518 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1314866.902528] Lustre: 17266:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 10 previous similar messages [1314943.899031] Lustre: 249330:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519447588/real 1519447588] req@ffff8812804c5d00 x1592481989729408/t0(0) o106->oak-OST0035@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519447595 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1314943.899033] Lustre: 17266:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519447588/real 1519447588] req@ffff8837b1ac1200 x1592481989729392/t0(0) o106->oak-OST003c@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519447595 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1314943.899035] Lustre: 17266:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 21 previous similar messages [1314971.898121] LustreError: 17266:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) ### client (nid 10.9.112.15@o2ib4) returned error from glimpse AST (req@ffff8837b1ac1200 x1592481989729392 status -107 rc -107), evict it ns: filter-oak-OST003c_UUID lock: ffff880789235200/0x806f959362687467 lrc: 4/0,0 mode: PW/PW res: [0x43b58d:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0x40000000000000 nid: 10.9.112.15@o2ib4 remote: 0x74a0d86312df1222 expref: 5 pid: 390784 timeout: 0 lvb_type: 0 [1314971.899869] LustreError: 17266:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) Skipped 3 previous similar messages [1314971.900030] LustreError: 138-a: oak-OST0035: A client on nid 10.9.112.15@o2ib4 was evicted due to a lock glimpse callback time out: rc -107 [1314971.900031] LustreError: Skipped 3 previous similar messages [1315005.991652] Lustre: oak-OST0040: haven't 
heard from client 72df71fd-c66e-637d-269e-2f412d3603ee (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff88229768ec00, cur 1519447658 expire 1519447508 last 1519447431 [1315005.992688] Lustre: Skipped 35 previous similar messages [1315010.001985] Lustre: oak-OST0033: haven't heard from client 72df71fd-c66e-637d-269e-2f412d3603ee (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880a029a9000, cur 1519447662 expire 1519447512 last 1519447435 [1315010.002981] Lustre: Skipped 2 previous similar messages [1315014.982999] Lustre: oak-OST0052: haven't heard from client 72df71fd-c66e-637d-269e-2f412d3603ee (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883563ce8c00, cur 1519447667 expire 1519447517 last 1519447440 [1315014.983973] Lustre: Skipped 25 previous similar messages [1315335.280734] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1315335.281329] Lustre: Skipped 70 previous similar messages [1315370.974342] Lustre: oak-OST0039: haven't heard from client 7a5304b5-b571-74f9-bf8a-674cf180b432 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881869b9e800, cur 1519448023 expire 1519447873 last 1519447796 [1315370.975314] Lustre: Skipped 1 previous similar message [1315736.959451] Lustre: oak-OST0038: haven't heard from client 24b17af5-2128-c8c3-c3ed-7493547f8575 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881e7b360400, cur 1519448389 expire 1519448239 last 1519448162 [1315736.960422] Lustre: Skipped 35 previous similar messages [1316057.366649] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1316057.366650] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1316057.366652] Lustre: Skipped 51 previous similar messages [1316057.367871] Lustre: Skipped 16 previous similar messages [1316094.934999] Lustre: oak-OST0043: haven't heard from client e9b7bdc4-9bf6-3312-81d7-dec225b70671 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880bfacda000, cur 1519448747 expire 1519448597 last 1519448520 [1316094.935975] Lustre: Skipped 35 previous similar messages [1316458.928786] Lustre: oak-OST0033: haven't heard from client ed722225-0e01-df72-d145-529efd0ad48c (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883caae1cc00, cur 1519449111 expire 1519448961 last 1519448884 [1316458.929758] Lustre: Skipped 35 previous similar messages [1316820.905465] Lustre: oak-OST0035: haven't heard from client de1e5c6a-3a5b-696c-19b9-25d8c5b308d5 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff883cd82cd400, cur 1519449473 expire 1519449323 last 1519449246 [1316820.906520] Lustre: Skipped 35 previous similar messages [1317865.225991] Lustre: oak-OST0034: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1317865.225992] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1317865.225994] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1317865.225995] Lustre: Skipped 52 previous similar messages [1317865.225996] Lustre: Skipped 52 previous similar messages [1317865.227923] Lustre: Skipped 15 previous similar messages [1318223.202431] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1318223.202432] Lustre: oak-OST0036: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1318223.202434] Lustre: oak-OST0034: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1318223.202435] Lustre: Skipped 17 previous similar messages [1318223.202435] Lustre: Skipped 18 previous similar messages [1318223.204405] Lustre: Skipped 13 previous similar messages [1318266.870412] Lustre: oak-OST0030: haven't heard from client 28f13573-ae57-18b0-0efb-2f0b23d85238 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff883e99d54800, cur 1519450919 expire 1519450769 last 1519450692 [1318266.871372] Lustre: Skipped 35 previous similar messages [1318420.960468] Lustre: 390794:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519451066/real 1519451066] req@ffff88000ff00900 x1592481990160112/t0(0) o106->oak-OST0034@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519451073 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [1318420.961431] Lustre: 390794:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 9 previous similar messages [1318462.958549] Lustre: 137164:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519451108/real 1519451108] req@ffff8800048bfb00 x1592481990160128/t0(0) o106->oak-OST003e@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519451115 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1318462.959584] Lustre: 137164:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 11 previous similar messages [1318539.955946] Lustre: 137164:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519451185/real 1519451185] req@ffff8800048bfb00 x1592481990160128/t0(0) o106->oak-OST003e@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519451192 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1318539.957017] Lustre: 137164:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 21 previous similar messages [1318588.321942] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1318588.322431] Lustre: Skipped 34 previous similar messages [1318588.954036] LustreError: 390794:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) ### client (nid 10.9.112.15@o2ib4) returned error from glimpse AST (req@ffff88000ff00900 x1592481990160112 status -107 rc -107), evict it ns: filter-oak-OST0034_UUID lock: ffff883c89ab3000/0x806f959362778eac lrc: 4/0,0 mode: PW/PW res: [0x41775f:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0x40000000000000 nid: 10.9.112.15@o2ib4 remote: 0x4e1f71a39d43afab expref: 5 pid: 21428 timeout: 0 lvb_type: 0 [1318588.954907] 
LustreError: 138-a: oak-OST003e: A client on nid 10.9.112.15@o2ib4 was evicted due to a lock glimpse callback time out: rc -107 [1318588.954924] LustreError: Skipped 4 previous similar messages [1318588.956664] LustreError: 390794:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) Skipped 4 previous similar messages [1318623.826374] Lustre: oak-OST0042: haven't heard from client 1997f5e9-9072-a03d-9e54-3126cf681923 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881e7beec800, cur 1519451276 expire 1519451126 last 1519451049 [1318623.827343] Lustre: Skipped 35 previous similar messages [1318977.805245] Lustre: oak-OST0040: haven't heard from client d8b5411c-c28b-e928-4336-d13d9af3c72c (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff880d47c36c00, cur 1519451630 expire 1519451480 last 1519451403 [1318977.806223] Lustre: Skipped 30 previous similar messages [1319351.805174] Lustre: oak-OST0037: haven't heard from client 4a891d62-cdbb-063d-10a3-fc27f7ee3a74 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff882eb9988800, cur 1519452004 expire 1519451854 last 1519451777 [1319351.806151] Lustre: Skipped 35 previous similar messages [1319668.515571] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1319668.515572] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1319668.515574] Lustre: Skipped 51 previous similar messages [1319668.516870] Lustre: Skipped 16 previous similar messages [1320069.751793] Lustre: oak-OST0033: haven't heard from client d0deae9a-d1e1-4f0d-9522-2d67f83a5746 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff881fdeab0c00, cur 1519452722 expire 1519452572 last 1519452495 [1320069.752781] Lustre: Skipped 35 previous similar messages [1320225.270473] Lustre: 135933:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519452870/real 1519452870] req@ffff880012957500 x1592481990366880/t0(0) o106->oak-OST004e@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519452877 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [1320225.271439] Lustre: 135933:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 15 previous similar messages [1320246.269464] Lustre: 249494:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519452891/real 1519452891] req@ffff8819da8d0900 x1592481990366896/t0(0) o106->oak-OST003d@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519452898 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1320246.270417] Lustre: 249494:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 4 previous similar messages [1320288.268545] Lustre: 249494:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519452933/real 1519452933] req@ffff8819da8d0900 x1592481990366896/t0(0) o106->oak-OST003d@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519452940 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1320288.269569] Lustre: 249494:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 12 previous similar messages [1320365.265937] Lustre: 249494:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519453010/real 1519453010] req@ffff8819da8d0900 x1592481990366896/t0(0) o106->oak-OST003d@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519453017 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1320365.266978] Lustre: 249494:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 20 previous similar messages [1320401.357982] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1320401.358473] Lustre: Skipped 65 previous similar messages [1320407.265292] LustreError: 249494:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) ### client (nid 10.9.112.15@o2ib4) returned error from glimpse AST (req@ffff8819da8d0900 x1592481990366896 status -107 rc -107), evict it ns: filter-oak-OST003d_UUID lock: ffff8837c0035400/0x806f9593628123ed lrc: 4/0,0 mode: PW/PW res: [0x438d06:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0x40000000000000 nid: 10.9.112.15@o2ib4 remote: 0xd7d0c605e8fec5a expref: 5 pid: 210324 timeout: 0 lvb_type: 0 [1320407.265344] LustreError: 138-a: oak-OST004e: A client on nid 10.9.112.15@o2ib4 was evicted due to a lock glimpse callback time out: rc -107 [1320407.265345] LustreError: Skipped 4 previous similar messages [1320407.267986] LustreError: 249494:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) Skipped 4 previous similar messages [1320802.725509] Lustre: oak-OST0037: haven't heard from client c6cf18b3-da6c-d9d0-72d7-ca69d6a5b32c (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff881d017af400, cur 1519453455 expire 1519453305 last 1519453228 [1320802.726471] Lustre: Skipped 67 previous similar messages [1320950.332631] Lustre: 17268:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519453595/real 1519453595] req@ffff881448137b00 x1592481990451312/t0(0) o106->oak-OST0051@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519453602 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [1320950.332634] Lustre: 17265:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519453595/real 1519453595] req@ffff880058676f00 x1592481990451296/t0(0) o106->oak-OST0036@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519453602 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [1320950.332636] Lustre: 17265:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 12 previous similar messages [1321118.870294] Lustre: oak-OST0034: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1321118.870794] Lustre: Skipped 70 previous similar messages [1321125.326883] LustreError: 17265:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) ### client (nid 10.9.112.15@o2ib4) returned error from glimpse AST (req@ffff880058676f00 x1592481990451296 status -107 rc -107), evict it ns: filter-oak-OST0036_UUID lock: ffff880b14d1b000/0x806f9593628697c5 lrc: 4/0,0 mode: PW/PW res: [0x40d8cf:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) flags: 0x40000000000000 nid: 10.9.112.15@o2ib4 remote: 0x4754545b3a1b127f expref: 5 pid: 135933 timeout: 0 lvb_type: 0 [1321125.326989] LustreError: 138-a: oak-OST0051: A client on nid 10.9.112.15@o2ib4 was evicted due to a lock glimpse callback time out: rc -107 [1321125.326990] LustreError: Skipped 3 previous similar messages [1321125.329478] LustreError: 17265:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) Skipped 3 previous similar messages [1321520.684471] Lustre: oak-OST0032: haven't heard from client 492c0ac2-f006-441d-6ac3-becadbe15e91 (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881577af3c00, cur 1519454173 expire 1519454023 last 1519453946 [1321520.690901] Lustre: Skipped 67 previous similar messages [1321840.869612] Lustre: oak-OST0032: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1321840.870166] Lustre: Skipped 32 previous similar messages [1322242.650946] Lustre: oak-OST003b: haven't heard from client fdb6af0d-770d-e835-152d-ab49ab53558f (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. exp ffff881d03b35c00, cur 1519454895 expire 1519454745 last 1519454668 [1322242.651928] Lustre: Skipped 35 previous similar messages [1322563.821504] Lustre: oak-OST0030: Connection restored to c8ed7df6-e4c7-6d43-dca5-2cc11a7d71fc (at 10.9.112.15@o2ib4) [1322563.821980] Lustre: Skipped 32 previous similar messages [1322964.619604] Lustre: oak-OST0047: haven't heard from client 8e16fa36-2b9e-961f-9969-c62621d2894d (at 10.9.112.15@o2ib4) in 227 seconds. I think it's dead, and I am evicting it. 
exp ffff881b251dd000, cur 1519455617 expire 1519455467 last 1519455390 [1322964.620577] Lustre: Skipped 35 previous similar messages [1322978.700084] Lustre: 137164:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519455624/real 1519455624] req@ffff88004b3ed400 x1592481990801536/t0(0) o106->oak-OST0034@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519455631 ref 1 fl Rpc:X/0/ffffffff rc 0/-1 [1322978.701056] Lustre: 137164:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 49 previous similar messages [1323055.697464] Lustre: 137164:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519455701/real 1519455701] req@ffff88004b3ed400 x1592481990801536/t0(0) o106->oak-OST0034@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519455708 ref 1 fl Rpc:X/2/ffffffff rc 0/-1 [1323055.698436] Lustre: 137164:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 10 previous similar messages [1323059.461269] INFO: task kswapd0:264 blocked for more than 120 seconds. [1323059.461525] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [1108681.327901] LustreError: 173725:0:(lustre_dlm.h:1097:ldlm_lvbo_fill()) lock ffff880040c4a000: delayed lvb init failed (rc -2) [1323059.461999] kswapd0 D [1323059.462237] ffff880151e49958 0 264 2 0x00000000 [1323059.462713] ffff881ffac6f8d0 0000000000000046 ffff881ffb7c2f70 ffff881ffac6ffd8 [1323059.463195] ffff881ffac6ffd8 ffff881ffac6ffd8 ffff881ffb7c2f70 ffff880151e49950 [1323059.463681] ffff880151e49954 ffff881ffb7c2f70 00000000ffffffff ffff880151e49958 [1323059.464163] Call Trace: [1323059.464409] [] schedule_preempt_disabled+0x29/0x70 [1323059.464654] [] __mutex_lock_slowpath+0xc7/0x1d0 [1323059.464917] [] ? jbd2__journal_start+0xf3/0x1e0 [jbd2] [1323059.465166] [] mutex_lock+0x1f/0x2f [1323059.465411] [] dquot_acquire+0x3a/0x130 [1323059.465668] [] ldiskfs_acquire_dquot+0x66/0xb0 [ldiskfs] [1323059.465912] [] dqget+0x3e4/0x440 [1323059.466152] [] dquot_get_dqblk+0x14/0x1f0 [1323059.466412] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs] [1323059.466882] [] lquota_disk_read+0x124/0x390 [lquota] [1323059.467130] [] qsd_refresh_usage+0x6a/0x2b0 [lquota] [1323059.467381] [] qsd_op_adjust+0x2f1/0x730 [lquota] [1323059.467628] [] osd_object_delete+0x230/0x330 [osd_ldiskfs] [1323059.468134] [] lu_object_free.isra.31+0x9d/0x1a0 [obdclass] [1323059.468613] [] lu_site_purge_objects+0x2fe/0x520 [obdclass] [1323059.469183] [] lu_cache_shrink+0x259/0x2d0 [obdclass] [1323059.469433] [] shrink_slab+0x163/0x330 [1323059.469674] [] ? vmpressure+0x87/0x90 [1323059.469916] [] balance_pgdat+0x4b1/0x5e0 [1323059.470160] [] kswapd+0x173/0x440 [1323059.470406] [] ? wake_up_atomic_t+0x30/0x30 [1323059.470649] [] ? balance_pgdat+0x5e0/0x5e0 [1323059.470893] [] kthread+0xcf/0xe0 [1323059.471129] [] ? insert_kthread_work+0x40/0x40 [1323059.471375] [] ret_from_fork+0x58/0x90 [1323059.471623] [] ? insert_kthread_work+0x40/0x40 [1323059.471867] INFO: task kswapd1:265 blocked for more than 120 seconds. [1323059.472110] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[1323059.472580] kswapd1 D ffff881ffac73b80 0 265 2 0x00000000 [1323059.473059] ffff881ffac73b58 0000000000000046 ffff881ffb7c3f40 ffff881ffac73fd8 [1323059.473548] ffff881ffac73fd8 ffff881ffac73fd8 ffff881ffb7c3f40 ffff881ffb7c3f40 [1323059.474041] ffffffffc0a826a0 fffffffeffffffff ffffffffc0a826a8 ffff881ffac73b80 [1323059.474534] Call Trace: [1323059.474771] [] schedule+0x29/0x70 [1323059.475013] [] rwsem_down_read_failed+0x10d/0x1a0 [1323059.475264] [] call_rwsem_down_read_failed+0x18/0x30 [1323059.475523] [] down_read+0x20/0x40 [1323059.475775] [] lu_cache_shrink+0x6a/0x2d0 [obdclass] [1323059.476020] [] shrink_slab+0xa9/0x330 [1323059.476265] [] ? compaction_suitable+0x5b/0xb0 [1323059.476508] [] balance_pgdat+0x4b1/0x5e0 [1323059.476755] [] kswapd+0x173/0x440 [1323059.476996] [] ? wake_up_atomic_t+0x30/0x30 [1323059.477244] [] ? balance_pgdat+0x5e0/0x5e0 [1323059.477486] [] kthread+0xcf/0xe0 [1323059.477726] [] ? insert_kthread_work+0x40/0x40 [1323059.477971] [] ret_from_fork+0x58/0x90 [1323059.478214] [] ? insert_kthread_work+0x40/0x40 [1323059.478516] INFO: task multipathd:66017 blocked for more than 120 seconds. [1323059.478761] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [1323059.479236] multipathd D ffffffff81a8ae08 0 66017 1 0x00000080 [1323059.479724] ffff881fe5357b50 0000000000000082 ffff881fe533cf10 ffff881fe5357fd8 [1323059.480213] ffff881fe5357fd8 ffff881fe5357fd8 ffff881fe533cf10 ffffffff81a8ae00 [1323059.480703] ffffffff81a8ae04 ffff881fe533cf10 00000000ffffffff ffffffff81a8ae08 [1323059.481185] Call Trace: [1323059.481422] [] schedule_preempt_disabled+0x29/0x70 [1323059.481666] [] __mutex_lock_slowpath+0xc7/0x1d0 [1323059.481909] [] mutex_lock+0x1f/0x2f [1323059.482152] [] sysfs_permission+0x32/0x60 [1323059.482396] [] __inode_permission+0x6e/0xc0 [1323059.482635] [] inode_permission+0x18/0x50 [1323059.482970] [] link_path_walk+0x27e/0x8b0 [1323059.483208] [] path_lookupat+0x6b/0x7b0 [1323059.483449] [] ? plist_del+0x46/0x70 [1323059.483687] [] ? __unqueue_futex+0x2c/0x60 [1323059.483928] [] ? cpupri_set+0x98/0x100 [1323059.484171] [] ? kmem_cache_alloc+0x35/0x1e0 [1323059.484415] [] ? getname_flags+0x4f/0x1a0 [1323059.484660] [] filename_lookup+0x2b/0xc0 [1323059.484901] [] user_path_at_empty+0x67/0xc0 [1323059.485144] [] ? __schedule+0x39d/0x8b0 [1323059.485386] [] user_path_at+0x11/0x20 [1323059.485627] [] vfs_fstatat+0x63/0xc0 [1323059.485867] [] SYSC_newstat+0x2e/0x60 [1323059.486106] [] ? do_nanosleep+0x72/0xf0 [1323059.486353] [] ? __audit_syscall_exit+0x1e6/0x280 [1323059.486606] [] SyS_newstat+0xe/0x10 [1323059.486845] [] system_call_fastpath+0x16/0x1b [1323059.487255] INFO: task dsm_sa_datamgrd:327053 blocked for more than 120 seconds. [1323059.487726] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[1323059.488195] dsm_sa_datamgrd D ffffffff81a8ae08 0 327053 1 0x00000080 [1323059.488678] ffff881ff975bb50 0000000000000086 ffff881f9ab28000 ffff881ff975bfd8 [1323059.489169] ffff881ff975bfd8 ffff881ff975bfd8 ffff881f9ab28000 ffffffff81a8ae00 [1323059.489661] ffffffff81a8ae04 ffff881f9ab28000 00000000ffffffff ffffffff81a8ae08 [1323059.490144] Call Trace: [1323059.490382] [] schedule_preempt_disabled+0x29/0x70 [1323059.490625] [] __mutex_lock_slowpath+0xc7/0x1d0 [1323059.490869] [] mutex_lock+0x1f/0x2f [1323059.491116] [] sysfs_permission+0x32/0x60 [1323059.491365] [] __inode_permission+0x6e/0xc0 [1323059.491611] [] inode_permission+0x18/0x50 [1323059.491853] [] link_path_walk+0x27e/0x8b0 [1323059.492099] [] ? __d_lookup+0x7a/0x160 [1323059.492344] [] path_lookupat+0x6b/0x7b0 [1323059.492606] [] ? kmem_cache_alloc+0x35/0x1e0 [1323059.492860] [] ? getname_flags+0x4f/0x1a0 [1323059.493102] [] filename_lookup+0x2b/0xc0 [1323059.493361] [] user_path_at_empty+0x67/0xc0 [1323059.493611] [] ? sched_clock_cpu+0x85/0xc0 [1323059.493872] [] ? check_preempt_curr+0x78/0xa0 [1323059.494116] [] user_path_at+0x11/0x20 [1323059.494376] [] vfs_fstatat+0x63/0xc0 [1323059.494618] [] ? try_to_wake_up+0x183/0x340 [1323059.494876] [] SYSC_newstat+0x2e/0x60 [1323059.495116] [] ? wake_up_process+0x15/0x20 [1323059.495378] [] ? wake_up_sem_queue_do+0x37/0x60 [1323059.495639] [] ? __audit_syscall_exit+0x1e6/0x280 [1323059.495908] [] SyS_newstat+0xe/0x10 [1323059.496147] [] system_call_fastpath+0x16/0x1b [1323059.496410] INFO: task md30_raid6:340220 blocked for more than 120 seconds. [1323059.496654] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [1323059.497274] md30_raid6 D ffffffff81a8ae08 0 340220 2 0x00000080 [1323059.497773] ffff881fde99bc10 0000000000000046 ffff88015367af70 ffff881fde99bfd8 [1323059.498265] ffff881fde99bfd8 ffff881fde99bfd8 ffff88015367af70 ffffffff81a8ae00 [1323059.504384] ffffffff81a8ae04 ffff88015367af70 00000000ffffffff ffffffff81a8ae08 [1323059.504868] Call Trace: [1323059.505103] [] schedule_preempt_disabled+0x29/0x70 [1323059.505351] [] __mutex_lock_slowpath+0xc7/0x1d0 [1323059.505598] [] mutex_lock+0x1f/0x2f [1323059.505838] [] sysfs_notify+0x24/0x90 [1323059.506085] [] md_update_sb+0x686/0x690 [1323059.506330] [] ? mutex_lock+0x12/0x2f [1323059.506574] [] md_check_recovery+0x1ca/0x4c0 [1323059.506823] [] raid5d+0x542/0x7f0 [raid456] [1323059.507066] [] md_thread+0x155/0x1a0 [1323059.507313] [] ? wake_up_atomic_t+0x30/0x30 [1323059.507562] [] ? find_pers+0x80/0x80 [1323059.507805] [] kthread+0xcf/0xe0 [1323059.508047] [] ? insert_kthread_work+0x40/0x40 [1323059.508292] [] ret_from_fork+0x58/0x90 [1323059.508535] [] ? insert_kthread_work+0x40/0x40 [1323059.508793] INFO: task md2_raid6:131162 blocked for more than 120 seconds. [1323059.509038] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [1323059.509521] md2_raid6 D ffffffff81a8ae08 0 131162 2 0x00000080 [1323059.510002] ffff881ed4b93c10 0000000000000046 ffff881ed82a8000 ffff881ed4b93fd8 [1323059.510494] ffff881ed4b93fd8 ffff881ed4b93fd8 ffff881ed82a8000 ffffffff81a8ae00 [1323059.510979] ffffffff81a8ae04 ffff881ed82a8000 00000000ffffffff ffffffff81a8ae08 [1323059.511551] Call Trace: [1323059.511784] [] schedule_preempt_disabled+0x29/0x70 [1323059.512021] [] __mutex_lock_slowpath+0xc7/0x1d0 [1323059.512263] [] mutex_lock+0x1f/0x2f [1323059.512503] [] sysfs_notify+0x24/0x90 [1323059.512743] [] md_update_sb+0x686/0x690 [1323059.512986] [] ? 
mutex_lock+0x12/0x2f [1323059.513227] [] md_check_recovery+0x1ca/0x4c0 [1323059.513473] [] raid5d+0x56/0x7f0 [raid456] [1323059.513720] [] ? del_timer_sync+0x52/0x60 [1323059.513963] [] ? schedule_timeout+0x17c/0x2c0 [1323059.514205] [] ? internal_add_timer+0x70/0x70 [1323059.514453] [] md_thread+0x155/0x1a0 [1323059.514699] [] ? wake_up_atomic_t+0x30/0x30 [1323059.514941] [] ? find_pers+0x80/0x80 [1323059.515182] [] kthread+0xcf/0xe0 [1323059.515425] [] ? insert_kthread_work+0x40/0x40 [1323059.515670] [] ret_from_fork+0x58/0x90 [1323059.515912] [] ? insert_kthread_work+0x40/0x40 [1323059.516157] INFO: task md0_raid6:131228 blocked for more than 120 seconds. [1323059.516405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [1323059.516879] md0_raid6 D ffffffff81a8ae08 0 131228 2 0x00000080 [1323059.517359] ffff881ebafabc10 0000000000000046 ffff881ed82a9fa0 ffff881ebafabfd8 [1323059.517842] ffff881ebafabfd8 ffff881ebafabfd8 ffff881ed82a9fa0 ffffffff81a8ae00 [1323059.518330] ffffffff81a8ae04 ffff881ed82a9fa0 00000000ffffffff ffffffff81a8ae08 [1323059.518816] Call Trace: [1323059.519054] [] schedule_preempt_disabled+0x29/0x70 [1323059.519305] [] __mutex_lock_slowpath+0xc7/0x1d0 [1323059.519552] [] mutex_lock+0x1f/0x2f [1323059.519796] [] sysfs_notify+0x24/0x90 [1323059.520037] [] md_update_sb+0x686/0x690 [1323059.520282] [] ? mutex_lock+0x12/0x2f [1323059.520524] [] md_check_recovery+0x1ca/0x4c0 [1323059.520769] [] raid5d+0x56/0x7f0 [raid456] [1323059.521010] [] ? del_timer_sync+0x52/0x60 [1323059.521251] [] ? schedule_timeout+0x17c/0x2c0 [1323059.521495] [] ? internal_add_timer+0x70/0x70 [1323059.521737] [] md_thread+0x155/0x1a0 [1323059.521982] [] ? wake_up_atomic_t+0x30/0x30 [1323059.522226] [] ? find_pers+0x80/0x80 [1323059.522473] [] kthread+0xcf/0xe0 [1323059.522719] [] ? insert_kthread_work+0x40/0x40 [1323059.522961] [] ret_from_fork+0x58/0x90 [1323059.523200] [] ? insert_kthread_work+0x40/0x40 [1323059.523443] INFO: task md8_raid6:131281 blocked for more than 120 seconds. [1323059.523689] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [1323059.524158] md8_raid6 D ffffffff81a8ae08 0 131281 2 0x00000080 [1323059.524634] ffff881eabfefc10 0000000000000046 ffff881ebd281fa0 ffff881eabfeffd8 [1323059.525118] ffff881eabfeffd8 ffff881eabfeffd8 ffff881ebd281fa0 ffffffff81a8ae00 [1323059.525686] ffffffff81a8ae04 ffff881ebd281fa0 00000000ffffffff ffffffff81a8ae08 [1323059.526163] Call Trace: [1323059.526396] [] schedule_preempt_disabled+0x29/0x70 [1323059.526637] [] __mutex_lock_slowpath+0xc7/0x1d0 [1323059.526881] [] mutex_lock+0x1f/0x2f [1323059.527122] [] sysfs_notify+0x24/0x90 [1323059.527368] [] md_update_sb+0x686/0x690 [1323059.527613] [] md_check_recovery+0x1ca/0x4c0 [1323059.527855] [] raid5d+0x56/0x7f0 [raid456] [1323059.528099] [] ? del_timer_sync+0x52/0x60 [1323059.528346] [] ? schedule_timeout+0x17c/0x2c0 [1323059.528589] [] ? internal_add_timer+0x70/0x70 [1323059.528837] [] md_thread+0x155/0x1a0 [1323059.529079] [] ? wake_up_atomic_t+0x30/0x30 [1323059.529327] [] ? find_pers+0x80/0x80 [1323059.529571] [] kthread+0xcf/0xe0 [1323059.529815] [] ? insert_kthread_work+0x40/0x40 [1323059.530065] [] ret_from_fork+0x58/0x90 [1323059.530311] [] ? insert_kthread_work+0x40/0x40 [1323059.530558] INFO: task md18_raid6:131341 blocked for more than 120 seconds. [1323059.530804] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
[1323059.531274] md18_raid6 D ffffffff81a8ae08 0 131341 2 0x00000080
[1323059.531750] ffff881eaa6e7c10 0000000000000046 ffff881fb0342f70 ffff881eaa6e7fd8
[1323059.532233] ffff881eaa6e7fd8 ffff881eaa6e7fd8 ffff881fb0342f70 ffffffff81a8ae00
[1323059.532716] ffffffff81a8ae04 ffff881fb0342f70 00000000ffffffff ffffffff81a8ae08
[1323059.533198] Call Trace:
[1323059.533435] [] schedule_preempt_disabled+0x29/0x70
[1323059.533679] [] __mutex_lock_slowpath+0xc7/0x1d0
[1323059.533922] [] mutex_lock+0x1f/0x2f
[1323059.534162] [] sysfs_notify+0x24/0x90
[1323059.534407] [] md_update_sb+0x686/0x690
[1323059.534650] [] ? mutex_lock+0x12/0x2f
[1323059.534890] [] md_check_recovery+0x1ca/0x4c0
[1323059.535131] [] raid5d+0x542/0x7f0 [raid456]
[1323059.535377] [] md_thread+0x155/0x1a0
[1323059.535619] [] ? wake_up_atomic_t+0x30/0x30
[1323059.535863] [] ? find_pers+0x80/0x80
[1323059.536103] [] kthread+0xcf/0xe0
[1323059.536347] [] ? insert_kthread_work+0x40/0x40
[1323059.536597] [] ret_from_fork+0x58/0x90
[1323059.536840] [] ? insert_kthread_work+0x40/0x40
[1323059.537088] INFO: task kmmpd-md2:131859 blocked for more than 120 seconds.
[1323059.537332] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1323059.537831] kmmpd-md2 D ffff881e60747cc0 0 131859 2 0x00000080
[1323059.538325] ffff881e60747b38 0000000000000046 ffff881fde7b0fd0 ffff881e60747fd8
[1323059.538842] ffff881e60747fd8 ffff881e60747fd8 ffff881fde7b0fd0 ffff883f235c0800
[1323059.539346] ffff883f235c0a98 ffff883f235c083c ffff8814a9db3500 ffff881e60747cc0
[1323059.539954] Call Trace:
[1323059.540185] [] schedule+0x29/0x70
[1323059.540443] [] md_write_start+0xd5/0x220
[1323059.540683] [] ? wake_up_atomic_t+0x30/0x30
[1323059.540957] [] raid5_make_request+0xc0/0xd10 [raid456]
[1323059.541201] [] ? update_curr+0x104/0x190
[1323059.541463] [] ? wake_up_atomic_t+0x30/0x30
[1323059.541710] [] ? dequeue_entity+0x11c/0x5d0
[1323059.541986] [] ? ktime_get_ts64+0x4c/0xf0
[1323059.542233] [] md_make_request+0x104/0x290
[1323059.542491] [] generic_make_request+0x105/0x310
[1323059.542737] [] submit_bio+0x70/0x150
[1323059.543014] [] ? bio_alloc_bioset+0x115/0x310
[1323059.543271] [] _submit_bh+0x127/0x160
[1323059.543514] [] submit_bh+0x10/0x20
[1323059.543797] [] write_mmp_block+0x111/0x170 [ldiskfs]
[1323059.544047] [] kmmpd+0x1a8/0x430 [ldiskfs]
[1323059.544308] [] ? __schedule+0x39d/0x8b0
[1323059.544562] [] ? __dump_mmp_msg+0x70/0x70 [ldiskfs]
[1323059.544840] [] kthread+0xcf/0xe0
[1323059.545085] [] ? insert_kthread_work+0x40/0x40
[1323059.545347] [] ret_from_fork+0x58/0x90
[1323059.545594] [] ? insert_kthread_work+0x40/0x40
[1323100.750361] LNet: Service thread pid 251689 was inactive for 200.06s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1323100.751102] Pid: 251689, comm: ll_ost_io01_029
[1323100.751355] Call Trace:
[1323100.751895] [] schedule_preempt_disabled+0x29/0x70
[1323100.752139] [] __mutex_lock_slowpath+0xc7/0x1d0
[1323100.752404] [] mutex_lock+0x1f/0x2f
[1323100.752647] [] dquot_commit+0x29/0xc0
[1323100.752933] [] ldiskfs_write_dquot+0x6c/0x90 [ldiskfs]
[1323100.758635] [] ldiskfs_mark_dquot_dirty+0x3f/0x60 [ldiskfs]
[1323100.759164] [] __dquot_alloc_space+0x1f2/0x250
[1323100.759426] [] ? __percpu_counter_add+0x51/0x70
[1323100.759672] [] ldiskfs_mb_new_blocks+0xf1/0xb20 [ldiskfs]
[1323100.760175] [] ? __read_extent_tree_block+0x55/0x1e0 [ldiskfs]
[1323100.760692] [] ? __kmalloc+0x1e3/0x230
[1323100.760972] [] ? ldiskfs_ext_find_extent+0x12c/0x2f0 [ldiskfs]
[1323100.761491] [] ldiskfs_ext_map_blocks+0x496/0xf50 [ldiskfs]
[1323100.761996] [] ldiskfs_map_blocks+0x163/0x700 [ldiskfs]
[1323100.762256] [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs]
[1323100.762778] [] osd_write_commit+0x3b2/0x8d0 [osd_ldiskfs]
[1323100.763342] [] ? osd_trans_start+0x1f2/0x490 [osd_ldiskfs]
[1323100.763848] [] ofd_commitrw_write+0xfe3/0x1c50 [ofd]
[1323100.764096] [] ofd_commitrw+0x4b9/0xac0 [ofd]
[1323100.764420] [] obd_commitrw+0x2ed/0x330 [ptlrpc]
[1323100.764734] [] tgt_brw_write+0xff1/0x17c0 [ptlrpc]
[1323100.764980] [] ? update_curr+0x104/0x190
[1323100.765222] [] ? __enqueue_entity+0x78/0x80
[1323100.765483] [] ? enqueue_entity+0x26c/0xb60
[1323100.765743] [] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[1323100.766043] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1323100.766310] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1323100.766851] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1323100.767122] [] ? default_wake_function+0x12/0x20
[1323100.767494] [] ? __wake_up_common+0x58/0x90
[1323100.767758] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1323100.768018] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1323100.768289] [] kthread+0xcf/0xe0
[1323100.768542] [] ? kthread+0x0/0xe0
[1323100.768785] [] ret_from_fork+0x58/0x90
[1323100.769051] [] ? kthread+0x0/0xe0
[1323100.769285]
[1323100.769535] LustreError: dumping log to /tmp/lustre-log.1519455753.251689
[1323100.771954] Pid: 251676, comm: ll_ost_io00_018
[1323100.772206] Call Trace:
[1323100.772711] [] ? bit_wait_io+0x0/0x50
[1323100.772981] [] schedule+0x29/0x70
[1323100.773251] [] schedule_timeout+0x239/0x2c0
[1323100.773510] [] ? bit_wait_io+0x0/0x50
[1323100.773751] [] io_schedule_timeout+0xad/0x130
[1323100.774022] [] io_schedule+0x18/0x20
[1323100.774262] [] bit_wait_io+0x11/0x50
[1323100.774519] [] __wait_on_bit_lock+0x5f/0xc0
[1323100.774761] [] ? bit_wait_io+0x0/0x50
[1323100.775035] [] out_of_line_wait_on_bit_lock+0x81/0xb0
[1323100.775280] [] ? wake_bit_function_rh+0x0/0x40
[1323100.775543] [] __lock_buffer+0x32/0x40
[1323100.775803] [] do_get_write_access+0x42f/0x4c0 [jbd2]
[1323100.776081] [] ? ldiskfs_getblk+0xa6/0x200 [ldiskfs]
[1323100.776348] [] jbd2_journal_get_write_access+0x27/0x40 [jbd2]
[1323100.776850] [] __ldiskfs_journal_get_write_access+0x3b/0x80 [ldiskfs]
[1323100.777355] [] ldiskfs_quota_write+0xc6/0x1e0 [ldiskfs]
[1323100.777608] [] qtree_write_dquot+0xc9/0x170
[1323100.777900] [] v2_write_dquot+0x2b/0x30
[1323100.778139] [] dquot_commit+0xb7/0xc0
[1323100.778405] [] ldiskfs_write_dquot+0x6c/0x90 [ldiskfs]
[1323100.778653] [] ldiskfs_mark_dquot_dirty+0x3f/0x60 [ldiskfs]
[1323100.779147] [] __dquot_alloc_space+0x1f2/0x250
[1323100.779405] [] ? dqget+0x1a7/0x440
[1323100.779652] [] ? ldiskfs_try_to_write_inline_data+0x1b0/0x5d0 [ldiskfs]
[1323100.780156] [] ldiskfs_mb_new_blocks+0xf1/0xb20 [ldiskfs]
[1323100.780674] [] ? ldiskfs_es_insert_extent+0xdb/0x1b0 [ldiskfs]
[1323100.781205] [] ? lquota_disk_read+0xec/0x390 [lquota]
[1323100.781459] [] ? __kmalloc+0x1e3/0x230
[1323100.781795] [] ? ldiskfs_ext_find_extent+0x249/0x2f0 [ldiskfs]
[1323100.782283] [] ? ldiskfs_ext_find_extent+0x249/0x2f0 [ldiskfs]
[1323100.782816] [] ? ldiskfs_ext_put_gap_in_cache+0x83/0x90 [ldiskfs]
[1323100.783360] [] ldiskfs_ext_map_blocks+0x496/0xf50 [ldiskfs]
[1323100.783862] [] ldiskfs_map_blocks+0x163/0x700 [ldiskfs]
[1323100.784110] [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs]
[1323100.784616] [] osd_write_commit+0x3b2/0x8d0 [osd_ldiskfs]
[1323100.785148] [] ? osd_trans_start+0x1f2/0x490 [osd_ldiskfs]
[1323100.785658] [] ofd_commitrw_write+0xfe3/0x1c50 [ofd]
[1323100.785936] [] ofd_commitrw+0x4b9/0xac0 [ofd]
[1323100.786234] [] obd_commitrw+0x2ed/0x330 [ptlrpc]
[1323100.786525] [] tgt_brw_write+0xff1/0x17c0 [ptlrpc]
[1323100.786777] [] ? update_curr+0x104/0x190
[1323100.787052] [] ? account_entity_dequeue+0xae/0xd0
[1323100.787301] [] ? dequeue_entity+0x11c/0x5d0
[1323100.787583] [] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[1323100.787890] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1323100.788187] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1323100.788725] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1323100.789001] [] ? default_wake_function+0x0/0x20
[1323100.789293] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1323100.789576] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1323100.789849] [] kthread+0xcf/0xe0
[1323100.790116] [] ? kthread+0x0/0xe0
[1323100.790376] [] ret_from_fork+0x58/0x90
[1323100.790616] [] ? kthread+0x0/0xe0
[1323100.790885]
[1323100.791118] Pid: 63775, comm: ll_ost_io00_118
[1323100.791372] Call Trace:
[1323100.791862] [] schedule_preempt_disabled+0x29/0x70
[1323100.792106] [] __mutex_lock_slowpath+0xc7/0x1d0
[1323100.792367] [] mutex_lock+0x1f/0x2f
[1323100.792607] [] dquot_commit+0x29/0xc0
[1323100.792880] [] ldiskfs_write_dquot+0x6c/0x90 [ldiskfs]
[1323100.793131] [] ldiskfs_mark_dquot_dirty+0x3f/0x60 [ldiskfs]
[1323100.793625] [] __dquot_alloc_space+0x1f2/0x250
[1323100.793902] [] ? __percpu_counter_add+0x51/0x70
[1323100.794175] [] ldiskfs_mb_new_blocks+0xf1/0xb20 [ldiskfs]
[1323100.794693] [] ? __read_extent_tree_block+0x55/0x1e0 [ldiskfs]
[1323100.795223] [] ? __kmalloc+0x1e3/0x230
[1323100.795488] [] ? ldiskfs_ext_find_extent+0x12c/0x2f0 [ldiskfs]
[1323100.796072] [] ldiskfs_ext_map_blocks+0x496/0xf50 [ldiskfs]
[1323100.796555] [] ldiskfs_map_blocks+0x163/0x700 [ldiskfs]
[1323100.796835] [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs]
[1323100.797353] [] osd_write_commit+0x3b2/0x8d0 [osd_ldiskfs]
[1323100.797858] [] ? osd_trans_start+0x1f2/0x490 [osd_ldiskfs]
[1323100.798347] [] ofd_commitrw_write+0xfe3/0x1c50 [ofd]
[1323100.798608] [] ofd_commitrw+0x4b9/0xac0 [ofd]
[1323100.798920] [] obd_commitrw+0x2ed/0x330 [ptlrpc]
[1323100.799187] [] tgt_brw_write+0xff1/0x17c0 [ptlrpc]
[1323100.799449] [] ? update_curr+0x104/0x190
[1323100.799708] [] ? __enqueue_entity+0x78/0x80
[1323100.799975] [] ? enqueue_entity+0x26c/0xb60
[1323100.800216] [] ? ___slab_alloc+0x209/0x4f0
[1323100.800495] [] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[1323100.800766] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1323100.801062] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1323100.801576] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1323100.801851] [] ? default_wake_function+0x12/0x20
[1323100.802126] [] ? __wake_up_common+0x58/0x90
[1323100.802407] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1323100.802672] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1323100.802944] [] kthread+0xcf/0xe0
[1323100.803183] [] ? kthread+0x0/0xe0
[1323100.803440] [] ret_from_fork+0x58/0x90
[1323100.803683] [] ? kthread+0x0/0xe0
[1323100.803951]
[1323100.804182] Pid: 301783, comm: ll_ost_io00_040
[1323100.804435] Call Trace:
[1323100.804924] [] schedule_preempt_disabled+0x29/0x70
[1323100.805169] [] __mutex_lock_slowpath+0xc7/0x1d0
[1323100.805431] [] mutex_lock+0x1f/0x2f
[1323100.805668] [] dquot_commit+0x29/0xc0
[1323100.805940] [] ldiskfs_write_dquot+0x6c/0x90 [ldiskfs]
[1323100.806188] [] ldiskfs_mark_dquot_dirty+0x3f/0x60 [ldiskfs]
[1323100.806701] [] __dquot_alloc_space+0x1f2/0x250
[1323100.806977] [] ? __percpu_counter_add+0x51/0x70
[1323100.807254] [] ldiskfs_mb_new_blocks+0xf1/0xb20 [ldiskfs]
[1323100.807780] [] ? __read_extent_tree_block+0x55/0x1e0 [ldiskfs]
[1323100.813719] [] ? __kmalloc+0x1e3/0x230
[1323100.813995] [] ? ldiskfs_ext_find_extent+0x12c/0x2f0 [ldiskfs]
[1323100.814533] [] ldiskfs_ext_map_blocks+0x496/0xf50 [ldiskfs]
[1323100.815053] [] ldiskfs_map_blocks+0x163/0x700 [ldiskfs]
[1323100.815304] [] osd_ldiskfs_map_inode_pages+0x143/0x420 [osd_ldiskfs]
[1323100.815816] [] osd_write_commit+0x3b2/0x8d0 [osd_ldiskfs]
[1323100.816368] [] ? osd_trans_start+0x1f2/0x490 [osd_ldiskfs]
[1323100.816871] [] ofd_commitrw_write+0xfe3/0x1c50 [ofd]
[1323100.817124] [] ofd_commitrw+0x4b9/0xac0 [ofd]
[1323100.817416] [] obd_commitrw+0x2ed/0x330 [ptlrpc]
[1323100.817703] [] tgt_brw_write+0xff1/0x17c0 [ptlrpc]
[1323100.817974] [] ? update_curr+0x104/0x190
[1323100.818211] [] ? __enqueue_entity+0x78/0x80
[1323100.818471] [] ? enqueue_entity+0x26c/0xb60
[1323100.818732] [] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[1323100.819031] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1323100.819296] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1323100.819831] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1323100.820106] [] ? default_wake_function+0x12/0x20
[1323100.820395] [] ? __wake_up_common+0x58/0x90
[1323100.820658] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1323100.820950] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1323100.821191] [] kthread+0xcf/0xe0
[1323100.821451] [] ? kthread+0x0/0xe0
[1323100.821691] [] ret_from_fork+0x58/0x90
[1323100.821958] [] ? kthread+0x0/0xe0
[1323100.822200]
[1323100.822452] Pid: 184234, comm: ll_ost_io01_066
[1323100.822689] Call Trace:
[1323100.823184] [] schedule+0x29/0x70
[1323100.823450] [] osd_trans_stop+0x205/0x830 [osd_ldiskfs]
[1323100.823699] [] ? ldiskfs_dirty_inode+0x54/0x60 [ldiskfs]
[1323100.823971] [] ? autoremove_wake_function+0x0/0x40
[1323100.824214] [] ofd_trans_stop+0x1f/0x60 [ofd]
[1323100.824552] [] ofd_commitrw_write+0x7e4/0x1c50 [ofd]
[1323100.824795] [] ofd_commitrw+0x4b9/0xac0 [ofd]
[1323100.825088] [] obd_commitrw+0x2ed/0x330 [ptlrpc]
[1323100.825371] [] tgt_brw_write+0xff1/0x17c0 [ptlrpc]
[1323100.825614] [] ? update_curr+0x104/0x190
[1323100.825884] [] ? __enqueue_entity+0x78/0x80
[1323100.826125] [] ? enqueue_entity+0x26c/0xb60
[1323100.826402] [] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[1323100.826673] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1323100.826974] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1323100.827488] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1323100.827733] [] ? default_wake_function+0x12/0x20
[1323100.828006] [] ? __wake_up_common+0x58/0x90
[1323100.828268] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1323100.828551] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1323100.828795] [] kthread+0xcf/0xe0
[1323100.829064] [] ? kthread+0x0/0xe0
[1323100.829304] [] ret_from_fork+0x58/0x90
[1323100.829562] [] ? kthread+0x0/0xe0
[1323100.829831]
[1323100.830096] LNet: Service thread pid 316476 was inactive for 200.14s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1323101.774312] LNet: Service thread pid 251691 was inactive for 200.51s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1323101.774803] LNet: Skipped 2 previous similar messages
[1323101.775055] LustreError: dumping log to /tmp/lustre-log.1519455754.251691
[1323102.798252] LNet: Service thread pid 40338 was inactive for 200.14s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1323102.798757] LNet: Skipped 36 previous similar messages
[1323102.798993] LustreError: dumping log to /tmp/lustre-log.1519455755.40338
[1323103.822183] LustreError: dumping log to /tmp/lustre-log.1519455756.301765
[1323104.846161] LNet: Service thread pid 322467 was inactive for 200.56s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1323104.846688] LNet: Skipped 70 previous similar messages
[1323104.846930] LustreError: dumping log to /tmp/lustre-log.1519455757.322467
[1323105.870108] LustreError: dumping log to /tmp/lustre-log.1519455758.39072
[1323106.894045] LustreError: dumping log to /tmp/lustre-log.1519455759.183386
[1323107.918009] LustreError: dumping log to /tmp/lustre-log.1519455760.316480
[1323108.941978] LNet: Service thread pid 185054 was inactive for 200.24s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1323108.942478] LNet: Skipped 60 previous similar messages
[1323108.942721] LustreError: dumping log to /tmp/lustre-log.1519455761.185054
[1323111.501861] LustreError: dumping log to /tmp/lustre-log.1519455763.184441
[1323112.013834] LustreError: dumping log to /tmp/lustre-log.1519455764.188984
[1323113.037787] LustreError: dumping log to /tmp/lustre-log.1519455765.63777
[1323115.085687] LustreError: dumping log to /tmp/lustre-log.1519455767.362684
[1323119.693454] LNet: Service thread pid 249482 was inactive for 200.48s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1323119.693957] LNet: Skipped 7 previous similar messages
[1323119.694193] LustreError: dumping log to /tmp/lustre-log.1519455772.249482
[1323121.229378] LustreError: dumping log to /tmp/lustre-log.1519455773.251662
[1323121.741383] LustreError: dumping log to /tmp/lustre-log.1519455774.301769
[1323123.277304] LustreError: dumping log to /tmp/lustre-log.1519455775.190094
[1323130.444974] LustreError: dumping log to /tmp/lustre-log.1519455782.251692
[1323130.956953] LustreError: dumping log to /tmp/lustre-log.1519455783.40335
[1323133.516839] LustreError: dumping log to /tmp/lustre-log.1519455785.362697
[1323139.148567] LNet: Service thread pid 251688 was inactive for 200.47s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1323139.149050] LNet: Skipped 13 previous similar messages
[1323139.149295] LustreError: dumping log to /tmp/lustre-log.1519455791.251688
[1323145.804264] LustreError: dumping log to /tmp/lustre-log.1519455798.210331
[1323153.995852] LustreError: dumping log to /tmp/lustre-log.1519455806.6922
[1323172.427013] LNet: Service thread pid 137164 was inactive for 200.73s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1323172.427489] LNet: Skipped 5 previous similar messages
[1323172.427725] LustreError: dumping log to /tmp/lustre-log.1519455824.137164
[1323179.594685] LustreError: dumping log to /tmp/lustre-log.1519455831.249478
[1323209.691251] Lustre: 137164:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519455855/real 1519455855] req@ffff88004b3ed400 x1592481990801536/t0(0) o106->oak-OST0034@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519455862 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[1323209.692232] Lustre: 137164:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 21 previous similar messages
[1323214.361064] LustreError: 362684:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519455566, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0030_UUID lock: ffff88005db96200/0x806f959362a6eaba lrc: 3/0,1 mode: --/PW res: [0x443f8c:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 362684 timeout: 0 lvb_type: 0
[1323214.362832] LustreError: dumping log to /tmp/lustre-log.1519455866.362684
[1323219.217845] LustreError: 21426:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519455571, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0030_UUID lock: ffff88002200c200/0x806f959362a6f579 lrc: 3/0,1 mode: --/PW res: [0x443f8d:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 21426 timeout: 0 lvb_type: 0
[1323233.489177] LustreError: 362697:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519455585, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0032_UUID lock: ffff8817a30da200/0x806f959362a702b5 lrc: 3/0,1 mode: --/PW res: [0x4386ad:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 362697 timeout: 0 lvb_type: 0
[1323245.127603] LNet: Service thread pid 132011 was inactive for 238.53s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1323245.128089] LNet: Skipped 1 previous similar message
[1323245.128329] LustreError: dumping log to /tmp/lustre-log.1519455897.132011
[1323245.565584] LustreError: 210328:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519455597, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0038_UUID lock: ffff880022b18800/0x806f959362a70563 lrc: 3/0,1 mode: --/PW res: [0x42f8b9:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 210328 timeout: 0 lvb_type: 0
[1323245.567313] LustreError: 210328:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 1 previous similar message
[1323246.663529] LustreError: dumping log to /tmp/lustre-log.1519455899.210344
[1323253.592216] LustreError: 6922:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519455605, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0030_UUID lock: ffff88007200ba00/0x806f959362a70873 lrc: 3/0,1 mode: --/PW res: [0x443f90:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 6922 timeout: 0 lvb_type: 0
[1323253.593883] LustreError: 6922:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 1 previous similar message
[1323306.585738] LustreError: 132011:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519455658, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0041_UUID lock: ffff8817a30de000/0x806f959362a70b91 lrc: 3/0,1 mode: --/PW res: [0x43c75f:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 132011 timeout: 0 lvb_type: 0
[1323326.019885] LustreError: dumping log to /tmp/lustre-log.1519455978.249476
[1323370.582781] LustreError: 132969:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519455722, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0045_UUID lock: ffff88004c832200/0x806f959362a70e54 lrc: 3/0,1 mode: --/PW res: [0x437798:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 132969 timeout: 0 lvb_type: 0
[1323408.959964] LNet: Service thread pid 132969 was inactive for 338.37s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1323408.960685] LNet: Skipped 4 previous similar messages
[1323408.966369] Pid: 132969, comm: ll_ost01_011
[1323408.966609] Call Trace:
[1323408.967136] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
[1323408.967613] [] schedule+0x29/0x70
[1323408.967878] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
[1323408.968130] [] ? default_wake_function+0x0/0x20
[1323408.968398] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
[1323408.968890] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1323408.969158] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1323408.969409] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd]
[1323408.969676] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1323408.969946] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1323408.970194] [] ofd_destroy_hdl+0x267/0x970 [ofd]
[1323408.970477] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1323408.970749] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1323408.971250] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1323408.971501] [] ? default_wake_function+0x12/0x20
[1323408.971748] [] ? __wake_up_common+0x58/0x90
[1323408.972018] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1323408.972290] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1323408.972539] [] kthread+0xcf/0xe0
[1323408.972780] [] ? kthread+0x0/0xe0
[1323408.973030] [] ret_from_fork+0x58/0x90
[1323408.973270] [] ? kthread+0x0/0xe0
[1323408.973510]
[1323408.973743] LustreError: dumping log to /tmp/lustre-log.1519456061.132969
[1323485.756407] LNet: Service thread pid 249485 was inactive for 386.56s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1323485.757124] Pid: 249485, comm: ll_ost00_066
[1323485.757360] Call Trace:
[1323485.757827] [] schedule+0x29/0x70
[1323485.758087] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1323485.758334] [] ? autoremove_wake_function+0x0/0x40
[1323485.758583] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1323485.759071] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1323485.759351] [] start_this_handle+0x1a1/0x430 [jbd2]
[1323485.759605] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1323485.759849] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1323485.760094] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1323485.760352] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1323485.760827] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1323485.761296] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1323485.761554] [] dqget+0x3e4/0x440
[1323485.761812] [] dquot_get_dqblk+0x14/0x1f0
[1323485.762088] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1323485.762562] [] lquotactl_slv+0x286/0xac0 [lquota]
[1323485.762813] [] ofd_quotactl+0x13c/0x380 [ofd]
[1323485.763110] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1323485.763381] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1323485.763871] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1323485.764114] [] ? default_wake_function+0x12/0x20
[1323485.764356] [] ? __wake_up_common+0x58/0x90
[1323485.764624] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1323485.764909] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1323485.765154] [] kthread+0xcf/0xe0
[1323485.765411] [] ? kthread+0x0/0xe0
[1323485.765655] [] ret_from_fork+0x58/0x90
[1323485.765896] [] ? kthread+0x0/0xe0
[1323485.766136]
[1323485.766368] LustreError: dumping log to /tmp/lustre-log.1519456138.249485
[1323494.760000] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8814d92c2c50 x1593152211412416/t0(0) o4->664a7c90-84ba-06c2-9654-f9c1cbd5a207@10.8.2.15@o2ib6:532/0 lens 608/448 e 23 to 0 dl 1519456152 ref 2 fl Interpret:H/0/0 rc 0/0
[1323495.531935] Lustre: 360834:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff883cad2a9050 x1593220766694640/t4295755399(0) o4->f1347269-cc7b-9c70-cadb-2cc85f7a8a70@10.9.114.10@o2ib4:532/0 lens 608/448 e 23 to 0 dl 1519456152 ref 2 fl Interpret:/0/0 rc 0/0
[1323495.533220] Lustre: 360834:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 2 previous similar messages
[1323496.533880] Lustre: 360834:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff883cb8925050 x1593152416260592/t0(0) o4->0d0e28ee-47ab-6a9b-a21b-20d577530272@10.9.102.32@o2ib4:533/0 lens 608/448 e 23 to 0 dl 1519456153 ref 2 fl Interpret:/0/0 rc 0/0
[1323496.535080] Lustre: 360834:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 19 previous similar messages
[1323498.537823] Lustre: 360834:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8831c345f450 x1593214014939440/t0(0) o3->3326eb74-fc4c-e39b-1783-8c55bbb22498@10.9.112.6@o2ib4:535/0 lens 608/432 e 23 to 0 dl 1519456155 ref 2 fl Interpret:/0/0 rc 0/0
[1323498.539020] Lustre: 360834:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 55 previous similar messages
[1323500.588300] Lustre: oak-OST0042: Client 221c25d8-1036-b285-2495-b7639654d19c (at 10.9.102.17@o2ib4) reconnecting
[1323500.588830] Lustre: oak-OST0042: Connection restored to 13de3f5f-65b7-a3ba-803f-cddc50d5eb14 (at 10.9.102.17@o2ib4)
[1323500.589311] Lustre: Skipped 17 previous similar messages
[1323501.459262] Lustre: oak-OST0042: Client 664a7c90-84ba-06c2-9654-f9c1cbd5a207 (at 10.8.2.15@o2ib6) reconnecting
[1323502.545596] Lustre: 184236:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff883cad260450 x1593126948529424/t0(0) o3->0dc7dd6f-f224-4477-c533-f0c7ba30d61b@10.9.104.17@o2ib4:539/0 lens 608/432 e 23 to 0 dl 1519456159 ref 2 fl Interpret:/0/0 rc 0/0
[1323502.546803] Lustre: 184236:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 91 previous similar messages
[1323502.613245] Lustre: oak-OST0034: Client 6c5035e3-5c55-5738-2ecd-8ce793ddb381 (at 10.9.102.21@o2ib4) reconnecting
[1323502.613727] Lustre: Skipped 15 previous similar messages
[1323504.953219] Lustre: oak-OST0032: Client fadeddec-8594-ff20-3bec-7ad4bb33bece (at 10.8.18.23@o2ib6) reconnecting
[1323504.953692] Lustre: Skipped 25 previous similar messages
[1323509.609986] Lustre: oak-OST0042: Client oak-MDT0000-mdtlov_UUID (at 10.0.2.52@o2ib5) reconnecting
[1323509.610470] Lustre: Skipped 8 previous similar messages
[1323509.610863] Lustre: oak-OST0042: deleting orphan objects from 0x0:4333563 to 0x0:4333601
[1323510.678273] Lustre: 137164:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1519456156/real 1519456156] req@ffff88004b3ed400 x1592481990801536/t0(0) o106->oak-OST0034@10.9.112.15@o2ib4:15/16 lens 296/280 e 0 to 1 dl 1519456163 ref 1 fl Rpc:X/2/ffffffff rc 0/-1
[1323510.679244] Lustre: 137164:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 42 previous similar messages
[1323513.739090] Lustre: 210320:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8839e383a050 x1593042641062864/t0(0) o6->oak-MDT0000-mdtlov_UUID@10.0.2.52@o2ib5:551/0 lens 664/432 e 23 to 0 dl 1519456171 ref 2 fl Interpret:/0/0 rc 0/0
[1323513.740293] Lustre: 210320:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 13 previous similar messages
[1323514.855247] Lustre: oak-OST0030: deleting orphan objects from 0x0:4472721 to 0x0:4472737
[1323520.396102] Lustre: oak-OST0030: Client cd9072f9-71b4-1cfb-a759-fd6823f1a4a9 (at 10.0.2.3@o2ib5) reconnecting
[1323520.396602] Lustre: Skipped 2 previous similar messages
[1323529.824376] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff88073a8d4c50 x1593126948621472/t0(0) o3->0dc7dd6f-f224-4477-c533-f0c7ba30d61b@10.9.104.17@o2ib4:567/0 lens 608/0 e 12 to 0 dl 1519456187 ref 2 fl New:/0/ffffffff rc 0/-1
[1323529.825567] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 142 previous similar messages
[1323534.310248] Lustre: oak-OST0032: deleting orphan objects from 0x0:4425393 to 0x0:4425409
[1323536.605264] Lustre: oak-OST003f: Client 1a8fdd02-e840-cd78-d015-ebdfc48eadda (at 10.9.102.48@o2ib4) reconnecting
[1323536.605739] Lustre: Skipped 88 previous similar messages
[1323541.208827] LustreError: 362691:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519455893, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0035_UUID lock: ffff88005e575400/0x806f959362a712de lrc: 3/0,1 mode: --/PW res: [0x428644:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 362691 timeout: 0 lvb_type: 0
[1323546.029747] Lustre: oak-OST0038: deleting orphan objects from 0x0:4389051 to 0x0:4389089
[1323554.469305] Lustre: oak-OST004d: deleting orphan objects from 0x0:3465762 to 0x0:3465793
[1323561.890896] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8812f8306850 x1593152182170096/t0(0) o4->385ec7a6-34dc-1efa-0d7b-713ce9bf54e6@10.9.101.65@o2ib4:599/0 lens 624/0 e 7 to 0 dl 1519456219 ref 2 fl New:/0/ffffffff rc 0/-1
[1323561.892094] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 569 previous similar messages
[1323569.048818] Lustre: oak-OST0044: Client aa35134f-8565-3554-fe57-f6d38e77256e (at 10.8.18.21@o2ib6) reconnecting
[1323569.049285] Lustre: Skipped 161 previous similar messages
[1323606.677521] Lustre: oak-OST0041: deleting orphan objects from 0x0:4441952 to 0x0:4441985
[1323634.101548] Lustre: 174518:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8806eadcbc50 x1591002486533584/t0(0) o19->cd9072f9-71b4-1cfb-a759-fd6823f1a4a9@10.0.2.3@o2ib5:671/0 lens 336/336 e 4 to 0 dl 1519456291 ref 2 fl Interpret:/0/0 rc 0/0
[1323634.108283] Lustre: 174518:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 24 previous similar messages
[1323645.671931] Lustre: oak-OST0035: Client 98f13d8f-bb6e-bfe1-8000-a39226051fc5 (at 10.8.18.34@o2ib6) reconnecting
[1323645.672405] Lustre: Skipped 9 previous similar messages
[1323671.143967] Lustre: oak-OST0045: deleting orphan objects from 0x0:4421529 to 0x0:4421569
[1323696.690574] LNet: Service thread pid 249332 was inactive for 537.46s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1323696.691292] Pid: 249332, comm: ll_ost00_045
[1323696.691525] Call Trace:
[1323696.692020] [] schedule+0x29/0x70
[1323696.692278] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1323696.692520] [] ? autoremove_wake_function+0x0/0x40
[1323696.692765] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1323696.693250] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1323696.693508] [] start_this_handle+0x1a1/0x430 [jbd2]
[1323696.693761] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1323696.694004] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1323696.694246] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1323696.694500] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1323696.694964] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1323696.695432] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1323696.695677] [] dqget+0x3e4/0x440
[1323696.695918] [] dquot_get_dqblk+0x14/0x1f0
[1323696.696178] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1323696.696656] [] lquotactl_slv+0x286/0xac0 [lquota]
[1323696.696926] [] ofd_quotactl+0x13c/0x380 [ofd]
[1323696.697234] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1323696.697503] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1323696.697992] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1323696.698242] [] ? default_wake_function+0x12/0x20
[1323696.698486] [] ? __wake_up_common+0x58/0x90
[1323696.698756] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1323696.699024] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1323696.699265] [] kthread+0xcf/0xe0
[1323696.699500] [] ? kthread+0x0/0xe0
[1323696.699741] [] ret_from_fork+0x58/0x90
[1323696.699998] [] ? kthread+0x0/0xe0
[1323696.700247]
[1323696.700478] LustreError: dumping log to /tmp/lustre-log.1519456349.249332
[1323769.303195] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff881d02798c50 x1591169517243856/t0(0) o4->a7b231a5-a7d7-2f9d-4b5d-464beea873b1@10.8.0.63@o2ib6:51/0 lens 608/0 e 3 to 0 dl 1519456426 ref 2 fl New:/0/ffffffff rc 0/-1
[1323769.304376] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 7 previous similar messages
[1323776.301675] Lustre: oak-OST003e: Client a7b231a5-a7d7-2f9d-4b5d-464beea873b1 (at 10.8.0.63@o2ib6) reconnecting
[1323776.302199] Lustre: Skipped 4 previous similar messages
[1323787.178358] LustreError: 210343:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519456139, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0049_UUID lock: ffff880050855000/0x806f959362a71347 lrc: 3/0,1 mode: --/PW res: [0x34c6f1:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 210343 timeout: 0 lvb_type: 0
[1323842.498456] Lustre: oak-OST0035: deleting orphan objects from 0x0:4361536 to 0x0:4361570
[1323856.427149] LNet: Service thread pid 132016 was inactive for 637.26s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1323856.427909] Pid: 132016, comm: ll_ost00_005
[1323856.428164] Call Trace:
[1323856.428640] [] schedule+0x29/0x70
[1323856.428899] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1323856.429150] [] ? autoremove_wake_function+0x0/0x40
[1323856.429395] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1323856.429869] [] ? lnet_select_pathway+0x4d6/0x1150 [lnet]
[1323856.430114] [] start_this_handle+0x1a1/0x430 [jbd2]
[1323856.430466] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1323856.430705] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1323856.430970] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1323856.431430] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1323856.431899] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1323856.432145] [] dqget+0x3e4/0x440
[1323856.432382] [] dquot_get_dqblk+0x14/0x1f0
[1323856.432634] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1323856.433103] [] lquotactl_slv+0x286/0xac0 [lquota]
[1323856.433368] [] ofd_quotactl+0x13c/0x380 [ofd]
[1323856.433672] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1323856.433937] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1323856.434428] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1323856.434669] [] ? default_wake_function+0x12/0x20
[1323856.434912] [] ? __wake_up_common+0x58/0x90
[1323856.435179] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1323856.435444] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1323856.435691] [] kthread+0xcf/0xe0
[1323856.435929] [] ? kthread+0x0/0xe0
[1323856.436173] [] ret_from_fork+0x58/0x90
[1323856.436427] [] ? kthread+0x0/0xe0
[1323856.436679]
[1323856.436909] LustreError: dumping log to /tmp/lustre-log.1519456508.132016
[1323930.151703] LNet: Service thread pid 362691 was inactive for 688.96s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1323930.152445] Pid: 362691, comm: ll_ost01_082
[1323930.152688] Call Trace:
[1323930.153258] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
[1323930.153765] [] schedule+0x29/0x70
[1323930.154024] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
[1323930.154299] [] ? default_wake_function+0x0/0x20
[1323930.154561] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
[1323930.155096] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1323930.155392] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1323930.155651] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd]
[1323930.155930] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1323930.156206] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1323930.156452] [] ofd_destroy_hdl+0x267/0x970 [ofd]
[1323930.156751] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1323930.157016] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1323930.157529] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1323930.157805] [] ? default_wake_function+0x0/0x20
[1323930.158070] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1323930.158336] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1323930.158575] [] kthread+0xcf/0xe0
[1323930.158908] [] ? kthread+0x0/0xe0
[1323930.159144] [] ret_from_fork+0x58/0x90
[1323930.159410] [] ? kthread+0x0/0xe0
[1323930.159664]
[1323930.159890] LustreError: dumping log to /tmp/lustre-log.1519456582.362691
[1324014.655966] LustreError: 137164:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) ### client (nid 10.9.112.15@o2ib4) returned error from glimpse AST (req@ffff88004b3ed400 x1592481990801536 status -107 rc -107), evict it ns: filter-oak-OST0034_UUID lock: ffff883a8366ae00/0x806f9593628e8f8f lrc: 4/0,0 mode: PW/PW res: [0x3d20d7:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40000080020000 nid: 10.9.112.15@o2ib4 remote: 0x47ee55642915aecc expref: 5 pid: 4910 timeout: 0 lvb_type: 0
[1324014.657634] LustreError: 137164:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) Skipped 2 previous similar messages
[1324014.658114] LustreError: 138-a: oak-OST0034: A client on nid 10.9.112.15@o2ib4 was evicted due to a lock glimpse callback time out: rc -107
[1324014.658589] LustreError: Skipped 3 previous similar messages
[1324014.658854] Lustre: 137164:0:(service.c:2112:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:443s); client may timeout. req@ffff881534771850 x1591002486520240/t0(0) o101->cd9072f9-71b4-1cfb-a759-fd6823f1a4a9@10.0.2.3@o2ib5:604/0 lens 328/368 e 7 to 0 dl 1519456224 ref 1 fl Complete:/0/0 rc 301/301
[1324014.660052] LNet: Service thread pid 137164 completed after 1043.00s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
[1324016.163679] LNet: Service thread pid 249492 was inactive for 737.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1324016.164424] Pid: 249492, comm: ll_ost00_073
[1324016.164664] Call Trace:
[1324016.165152] [] schedule+0x29/0x70
[1324016.165410] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1324016.165656] [] ? autoremove_wake_function+0x0/0x40
[1324016.165931] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1324016.166424] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1324016.166684] [] start_this_handle+0x1a1/0x430 [jbd2]
[1324016.166964] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1324016.167221] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1324016.167461] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1324016.167713] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1324016.168206] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1324016.168671] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1324016.174381] [] dqget+0x3e4/0x440
[1324016.174635] [] dquot_get_dqblk+0x14/0x1f0
[1324016.174908] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1324016.175412] [] lquotactl_slv+0x286/0xac0 [lquota]
[1324016.175663] [] ofd_quotactl+0x13c/0x380 [ofd]
[1324016.175985] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1324016.176247] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1324016.176728] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1324016.176985] [] ? default_wake_function+0x12/0x20
[1324016.177239] [] ? __wake_up_common+0x58/0x90
[1324016.177497] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1324016.177756] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1324016.178027] [] kthread+0xcf/0xe0
[1324016.178292] [] ? kthread+0x0/0xe0
[1324016.178527] [] ret_from_fork+0x58/0x90
[1324016.178765] [] ? kthread+0x0/0xe0
[1324016.179015]
[1324016.179257] LustreError: dumping log to /tmp/lustre-log.1519456668.249492
[1324060.740587] Lustre: oak-OST0030: Export ffff8801d6e14c00 already connecting from 10.9.112.15@o2ib4
[1324060.741070] Lustre: Skipped 3 previous similar messages
[1324067.105341] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff881d0279fc50 x1591169520451984/t0(0) o4->a7b231a5-a7d7-2f9d-4b5d-464beea873b1@10.8.0.63@o2ib6:349/0 lens 608/0 e 1 to 0 dl 1519456724 ref 2 fl New:/0/ffffffff rc 0/-1
[1324067.106533] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 8 previous similar messages
[1324073.759772] Lustre: oak-OST0052: Client a7b231a5-a7d7-2f9d-4b5d-464beea873b1 (at 10.8.0.63@o2ib6) reconnecting
[1324073.760261] Lustre: Skipped 9 previous similar messages
[1324087.932487] Lustre: oak-OST0049: deleting orphan objects from 0x0:3458802 to 0x0:3458817
[1324102.048009] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6)
[1324102.048516] Lustre: Skipped 364 previous similar messages
[1324110.737469] Lustre: oak-OST0030: Export ffff8801d6e14c00 already connecting from 10.9.112.15@o2ib4
[1324110.737954] Lustre: Skipped 2 previous similar messages
[1324145.538630] LustreError: 21429:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519456497, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST003f_UUID lock: ffff880047362000/0x806f959362a717d1 lrc: 3/0,1 mode: --/PW res: [0x4388fd:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 21429 timeout: 0 lvb_type: 0
[1324155.273359] Lustre: oak-OST004d: deleting orphan objects from 0x0:3465762 to 0x0:3465825
[1324160.735164] Lustre: oak-OST0030: Export ffff8801d6e14c00 already connecting from 10.9.112.15@o2ib4
[1324160.735165] Lustre: oak-OST0032: Export ffff883d13fdc400 already connecting from 10.9.112.15@o2ib4
[1324160.736121] Lustre: Skipped 2 previous similar messages
[1324207.606910] Lustre: oak-OST0041: deleting orphan objects from 0x0:4441952 to 0x0:4442017
[1324227.097812] LNet: Service thread pid 390779 was inactive for 887.88s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1324227.098515] Pid: 390779, comm: ll_ost00_082
[1324227.098747] Call Trace:
[1324227.099208] [] schedule+0x29/0x70
[1324227.099466] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1324227.099708] [] ? autoremove_wake_function+0x0/0x40
[1324227.099954] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1324227.100421] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1324227.100666] [] start_this_handle+0x1a1/0x430 [jbd2]
[1324227.100922] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1324227.101167] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1324227.101412] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1324227.101665] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1324227.102135] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1324227.102604] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1324227.102850] [] dqget+0x3e4/0x440
[1324227.103085] [] dquot_get_dqblk+0x14/0x1f0
[1324227.103341] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1324227.103814] [] lquotactl_slv+0x286/0xac0 [lquota]
[1324227.104062] [] ofd_quotactl+0x13c/0x380 [ofd]
[1324227.104351] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1324227.104617] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1324227.105108] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1324227.105351] [] ? default_wake_function+0x12/0x20
[1324227.105597] [] ? __wake_up_common+0x58/0x90
[1324227.105877] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1324227.106148] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1324227.106393] [] kthread+0xcf/0xe0
[1324227.106632] [] ? kthread+0x0/0xe0
[1324227.106877] [] ret_from_fork+0x58/0x90
[1324227.107165] [] ? kthread+0x0/0xe0
[1324227.107437]
[1324227.107665] LustreError: dumping log to /tmp/lustre-log.1519456879.390779
[1324271.787828] Lustre: oak-OST0045: deleting orphan objects from 0x0:4421529 to 0x0:4421601
[1324388.882276] LNet: Service thread pid 174522 was inactive for 989.73s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1324388.882980] Pid: 174522, comm: ll_ost00_115
[1324388.883213] Call Trace:
[1324388.883680] [] schedule+0x29/0x70
[1324388.883959] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1324388.884226] [] ? autoremove_wake_function+0x0/0x40
[1324388.884505] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1324388.885025] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1324388.885303] [] start_this_handle+0x1a1/0x430 [jbd2]
[1324388.885570] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1324388.885841] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1324388.886081] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1324388.886348] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1324388.886970] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1324388.887476] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1324388.887733] [] dqget+0x3e4/0x440
[1324388.887998] [] dquot_get_dqblk+0x14/0x1f0
[1324388.888268] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1324388.888766] [] lquotactl_slv+0x286/0xac0 [lquota]
[1324388.889028] [] ofd_quotactl+0x13c/0x380 [ofd]
[1324388.889341] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1324388.889655] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1324388.890166] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1324388.890422] [] ? default_wake_function+0x12/0x20
[1324388.890678] [] ? __wake_up_common+0x58/0x90
[1324388.890967] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1324388.891227] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1324388.891478] [] kthread+0xcf/0xe0
[1324388.891729] [] ? kthread+0x0/0xe0
[1324388.891985] [] ret_from_fork+0x58/0x90
[1324388.892236] [] ? kthread+0x0/0xe0
[1324388.892523]
[1324388.892800] LustreError: dumping log to /tmp/lustre-log.1519457041.174522
[1324418.274262] LustreError: 131985:0:(ldlm_lockd.c:697:ldlm_handle_ast_error()) ### client (nid 10.9.112.15@o2ib4) returned error from glimpse AST (req@ffff881d2a687b00 x1592481990838992 status -107 rc -107), evict it ns: filter-oak-OST0034_UUID lock: ffff8835a9404600/0x806f959362a71a94 lrc: 4/0,0 mode: PW/PW res: [0x3d20d7:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40000080000000 nid: 10.9.112.15@o2ib4 remote: 0xe4db7b3653a7cbbf expref: 6 pid: 296683 timeout: 0 lvb_type: 0
[1324418.275991] LustreError: 138-a: oak-OST0034: A client on nid 10.9.112.15@o2ib4 was evicted due to a lock glimpse callback time out: rc -107
[1324422.978880] Lustre: oak-OST0030: Export ffff881ecaa8ac00 already connecting from 10.9.112.15@o2ib4
[1324422.978881] Lustre: oak-OST0032: Export ffff880526fd6400 already connecting from 10.9.112.15@o2ib4
[1324422.979817] Lustre: Skipped 3 previous similar messages
[1324442.667931] Lustre: oak-OST0035: deleting orphan objects from 0x0:4361572 to 0x0:4361602
[1324446.219701] Lustre: oak-OST003f: deleting orphan objects from 0x0:4425982 to 0x0:4426020
[1324447.087530] LustreError: 6918:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519456799, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0035_UUID lock: ffff8800480e6e00/0x806f959362a71c54 lrc: 3/0,1 mode: --/PW res: [0x428d63:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 6918 timeout: 0 lvb_type: 0
[1324447.089195] LustreError: 6918:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 2 previous similar messages
[1324472.976263] Lustre: oak-OST0030: Export ffff881ecaa8ac00 already connecting from 10.9.112.15@o2ib4
[1324472.976745] Lustre: Skipped 4 previous similar messages
[1324512.272513] Lustre: oak-OST0039: deleting orphan objects from 0x0:4347818 to 0x0:4347841
[1324522.973920] Lustre: oak-OST0030: Export ffff881ecaa8ac00 already connecting from 10.9.112.15@o2ib4
[1324522.974405] Lustre: Skipped 4 previous similar messages
[1324567.806012] Lustre: oak-OST004f: deleting orphan objects from 0x0:175045 to 0x0:175074
[1324597.768510] LNet: Service thread pid 131981 was inactive for 1138.63s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1324597.774563] Pid: 131981, comm: ll_ost00_003
[1324597.774797] Call Trace:
[1324597.775256] [] schedule+0x29/0x70
[1324597.775521] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1324597.775770] [] ? autoremove_wake_function+0x0/0x40
[1324597.776049] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1324597.776551] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1324597.776810] [] start_this_handle+0x1a1/0x430 [jbd2]
[1324597.777076] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1324597.777315] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1324597.777562] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1324597.777848] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1324597.778344] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1324597.778810] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1324597.779062] [] dqget+0x3e4/0x440
[1324597.779297] [] dquot_get_dqblk+0x14/0x1f0
[1324597.779570] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1324597.780072] [] lquotactl_slv+0x286/0xac0 [lquota]
[1324597.780318] [] ofd_quotactl+0x13c/0x380 [ofd]
[1324597.780615] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1324597.780902] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1324597.781404] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1324597.781652] [] ? default_wake_function+0x12/0x20
[1324597.781933] [] ? __wake_up_common+0x58/0x90
[1324597.782230] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1324597.782510] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1324597.782772] [] kthread+0xcf/0xe0
[1324597.783027] [] ? kthread+0x0/0xe0
[1324597.783263] [] ret_from_fork+0x58/0x90
[1324597.783503] [] ? kthread+0x0/0xe0
[1324597.783768]
[1324597.784025] LustreError: dumping log to /tmp/lustre-log.1519457250.131981
[1324615.431672] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff881ebd332450 x1593265787771344/t0(0) o10->f0e3dca9-4f94-7859-aae2-78241dcced78@10.9.112.15@o2ib4:142/0 lens 560/0 e 1 to 0 dl 1519457272 ref 2 fl New:/0/ffffffff rc 0/-1
[1324615.432905] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 967 previous similar messages
[1324632.086109] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting
[1324632.086588] Lustre: Skipped 335 previous similar messages
[1324679.684699] Pid: 210343, comm: ll_ost01_070
[1324679.684941] Call Trace:
[1324679.685478] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
[1324679.686047] [] schedule+0x29/0x70
[1324679.686337] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
[1324679.686582] [] ? default_wake_function+0x0/0x20
[1324679.686845] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
[1324679.687336] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1324679.687602] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1324679.687861] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd]
[1324679.688126] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1324679.688392] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1324679.688651] [] ofd_destroy_hdl+0x267/0x970 [ofd]
[1324679.688950] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1324679.689221] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1324679.689720] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1324679.689965] [] ? default_wake_function+0x12/0x20
[1324679.690204] [] ? __wake_up_common+0x58/0x90
[1324679.690469] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1324679.690740] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1324679.690987] [] kthread+0xcf/0xe0
[1324679.691233] [] ? kthread+0x0/0xe0
[1324679.691478] [] ret_from_fork+0x58/0x90
[1324679.691724] [] ? kthread+0x0/0xe0
[1324679.691961]
[1324679.692193] LustreError: dumping log to /tmp/lustre-log.1519457332.210343
[1324691.600275] Lustre: oak-OST0049: deleting orphan objects from 0x0:3458802 to 0x0:3458849
[1324702.704976] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6)
[1324702.705468] Lustre: Skipped 368 previous similar messages
[1324704.259530] Pid: 184438, comm: ll_ost_io01_073
[1324704.259774] Call Trace:
[1324704.260262] [] ? bit_wait_io+0x0/0x50
[1324704.260511] [] schedule+0x29/0x70
[1324704.260786] [] schedule_timeout+0x239/0x2c0
[1324704.261061] [] ? mlx4_ib_post_send+0x510/0xb50 [mlx4_ib]
[1324704.261308] [] ? bit_wait_io+0x0/0x50
[1324704.261569] [] io_schedule_timeout+0xad/0x130
[1324704.261813] [] io_schedule+0x18/0x20
[1324704.262055] [] bit_wait_io+0x11/0x50
[1324704.262312] [] __wait_on_bit_lock+0x5f/0xc0
[1324704.262595] [] __lock_page+0x74/0x90
[1324704.262859] [] ? wake_bit_function_rh+0x0/0x40
[1324704.263121] [] __find_lock_page+0x54/0x70
[1324704.263366] [] find_or_create_page+0x34/0xa0
[1324704.263626] [] osd_bufs_get+0x235/0x430 [osd_ldiskfs]
[1324704.263896] [] ofd_preprw+0x6a8/0x1150 [ofd]
[1324704.264209] [] ? __req_capsule_get+0x15d/0x700 [ptlrpc]
[1324704.264484] [] ? null_alloc_rs+0x15d/0x330 [ptlrpc]
[1324704.264782] [] tgt_brw_read+0x96f/0x1850 [ptlrpc]
[1324704.265090] [] ? lprocfs_counter_add+0xf9/0x160 [obdclass]
[1324704.265617] [] ? null_alloc_rs+0x176/0x330 [ptlrpc]
[1324704.265885] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1324704.266390] [] ? lustre_pack_reply_flags+0x6f/0x1e0 [ptlrpc]
[1324704.266940] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1324704.267247] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1324704.267551] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1324704.268077] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1324704.268338] [] ? default_wake_function+0x12/0x20
[1324704.268603] [] ? __wake_up_common+0x58/0x90
[1324704.268874] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1324704.269156] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1324704.269401] [] kthread+0xcf/0xe0
[1324704.269680] [] ? kthread+0x0/0xe0
[1324704.269944] [] ret_from_fork+0x58/0x90
[1324704.270202] [] ? kthread+0x0/0xe0
[1324704.270444]
[1324704.270694] LustreError: dumping log to /tmp/lustre-log.1519457356.184438
[1324704.271783] Pid: 184435, comm: ll_ost_io01_070
[1324704.272020] Call Trace:
[1324704.272526] [] ? bit_wait_io+0x0/0x50
[1324704.272765] [] schedule+0x29/0x70
[1324704.273027] [] schedule_timeout+0x239/0x2c0
[1324704.273294] [] ? mlx4_ib_post_send+0x510/0xb50 [mlx4_ib]
[1324704.273546] [] ? ktime_get_ts64+0x4c/0xf0
[1324704.273794] [] ? bit_wait_io+0x0/0x50
[1324704.274034] [] io_schedule_timeout+0xad/0x130
[1324704.274277] [] io_schedule+0x18/0x20
[1324704.274523] [] bit_wait_io+0x11/0x50
[1324704.274769] [] __wait_on_bit_lock+0x5f/0xc0
[1324704.275016] [] __lock_page+0x74/0x90
[1324704.275276] [] ? wake_bit_function_rh+0x0/0x40
[1324704.275537] [] __find_lock_page+0x54/0x70
[1324704.275779] [] find_or_create_page+0x34/0xa0
[1324704.276027] [] osd_bufs_get+0x235/0x430 [osd_ldiskfs]
[1324704.276275] [] ofd_preprw_write.isra.30+0x1f7/0xd80 [ofd]
[1324704.276751] [] ofd_preprw+0x422/0x1150 [ofd]
[1324704.277032] [] tgt_brw_write+0xc34/0x17c0 [ptlrpc]
[1324704.277277] [] ? update_curr+0x104/0x190
[1324704.277523] [] ? __enqueue_entity+0x78/0x80
[1324704.277765] [] ? enqueue_entity+0x26c/0xb60
[1324704.278030] [] ? ___slab_alloc+0x209/0x4f0
[1324704.278292] [] ? mutex_lock+0x12/0x2f
[1324704.278572] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1324704.278844] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1324704.279343] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1324704.279595] [] ? default_wake_function+0x12/0x20
[1324704.279842] [] ? __wake_up_common+0x58/0x90
[1324704.280112] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1324704.280380] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1324704.280630] [] kthread+0xcf/0xe0
[1324704.280873] [] ? kthread+0x0/0xe0
[1324704.281118] [] ret_from_fork+0x58/0x90
[1324704.281363] [] ? kthread+0x0/0xe0
[1324704.281610]
[1324704.281845] Pid: 185057, comm: ll_ost_io01_093
[1324704.282084] Call Trace:
[1324704.282558] [] ? bit_wait_io+0x0/0x50
[1324704.282800] [] schedule+0x29/0x70
[1324704.283041] [] schedule_timeout+0x239/0x2c0
[1324704.283287] [] ? mlx4_ib_post_send+0x510/0xb50 [mlx4_ib]
[1324704.283539] [] ? ktime_get_ts64+0x4c/0xf0
[1324704.283786] [] ? bit_wait_io+0x0/0x50
[1324704.284032] [] io_schedule_timeout+0xad/0x130
[1324704.284275] [] io_schedule+0x18/0x20
[1324704.284520] [] bit_wait_io+0x11/0x50
[1324704.284761] [] __wait_on_bit_lock+0x5f/0xc0
[1324704.290595] [] __lock_page+0x74/0x90
[1324704.290835] [] ? wake_bit_function_rh+0x0/0x40
[1324704.291079] [] __find_lock_page+0x54/0x70
[1324704.291321] [] find_or_create_page+0x34/0xa0
[1324704.291576] [] osd_bufs_get+0x235/0x430 [osd_ldiskfs]
[1324704.291825] [] ofd_preprw_write.isra.30+0x1f7/0xd80 [ofd]
[1324704.292293] [] ofd_preprw+0x422/0x1150 [ofd]
[1324704.292566] [] tgt_brw_write+0xc34/0x17c0 [ptlrpc]
[1324704.292813] [] ? update_curr+0x104/0x190
[1324704.293054] [] ? __enqueue_entity+0x78/0x80
[1324704.293296] [] ? enqueue_entity+0x26c/0xb60
[1324704.293542] [] ? mutex_lock+0x12/0x2f
[1324704.293817] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1324704.294092] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1324704.294595] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1324704.294843] [] ? default_wake_function+0x12/0x20
[1324704.295087] [] ? __wake_up_common+0x58/0x90
[1324704.295352] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1324704.295622] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1324704.295869] [] kthread+0xcf/0xe0
[1324704.296110] [] ? kthread+0x0/0xe0
[1324704.296353] [] ret_from_fork+0x58/0x90
[1324704.296601] [] ? kthread+0x0/0xe0
[1324704.296847]
[1324704.297084] LNet: Service thread pid 184443 was inactive for 1201.37s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1324704.297571] LNet: Skipped 2 previous similar messages
[1324708.355368] LustreError: dumping log to /tmp/lustre-log.1519457360.301771
[1324712.451133] LustreError: dumping log to /tmp/lustre-log.1519457364.210332
[1324716.546973] LustreError: dumping log to /tmp/lustre-log.1519457369.21430
[1324720.642762] LNet: Service thread pid 135933 was inactive for 1200.23s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1324720.643346] LNet: Skipped 48 previous similar messages
[1324720.643581] LustreError: dumping log to /tmp/lustre-log.1519457373.135933
[1324737.025985] LustreError: dumping log to /tmp/lustre-log.1519457389.131960
[1324749.313407] LustreError: dumping log to /tmp/lustre-log.1519457401.6919
[1324755.653270] Lustre: oak-OST004d: deleting orphan objects from 0x0:3465827 to 0x0:3465857
[1324782.079932] LNet: Service thread pid 249471 was inactive for 1201.85s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1324782.080423] LNet: Skipped 3 previous similar messages [1324782.080662] LustreError: dumping log to /tmp/lustre-log.1519457434.249471 [1324783.421338] Lustre: oak-OST0032: Export ffff883ca70bd800 already connecting from 10.9.112.15@o2ib4 [1324783.421815] Lustre: Skipped 3 previous similar messages [1324808.730848] Lustre: oak-OST0041: deleting orphan objects from 0x0:4441952 to 0x0:4442049 [1324843.517069] LustreError: dumping log to /tmp/lustre-log.1519457495.174525 [1324872.863880] Lustre: oak-OST0045: deleting orphan objects from 0x0:4421529 to 0x0:4421633 [1324890.694919] Lustre: oak-OST004e: deleting orphan objects from 0x0:176029 to 0x0:176065 [1324891.398843] Lustre: oak-OST0034: deleting orphan objects from 0x0:4290565 to 0x0:4290593 [1324900.858341] LNet: Service thread pid 141214 was inactive for 1201.72s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [1324900.859045] LNet: Skipped 4 previous similar messages [1324900.859312] Pid: 141214, comm: ll_ost00_017 [1324900.859553] Call Trace: [1324900.860010] [] schedule+0x29/0x70 [1324900.860267] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1324900.860512] [] ? autoremove_wake_function+0x0/0x40 [1324900.860755] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1324900.861219] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd] [1324900.861463] [] start_this_handle+0x1a1/0x430 [jbd2] [1324900.861712] [] ? lnet_ni_send+0x3b/0xd0 [lnet] [1324900.861953] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1324900.862194] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1324900.862446] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1324900.862976] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1324900.863498] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1324900.863756] [] dqget+0x3e4/0x440 [1324900.863991] [] dquot_get_dqblk+0x14/0x1f0 [1324900.864242] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs] [1324900.864720] [] lquotactl_slv+0x286/0xac0 [lquota] [1324900.864974] [] ofd_quotactl+0x13c/0x380 [ofd] [1324900.865268] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1324900.865538] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1324900.866019] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1324900.866260] [] ? default_wake_function+0x12/0x20 [1324900.866503] [] ? __wake_up_common+0x58/0x90 [1324900.866761] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1324900.867020] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1324900.867258] [] kthread+0xcf/0xe0 [1324900.867496] [] ? do_exit+0x6bb/0xa40 [1324900.867735] [] ? kthread+0x0/0xe0 [1324900.867974] [] ret_from_fork+0x58/0x90 [1324900.868214] [] ? kthread+0x0/0xe0 [1324900.868456] [1324900.868686] LustreError: dumping log to /tmp/lustre-log.1519457553.141214 [1324900.869553] Pid: 131957, comm: ll_ost00_000 [1324900.869788] Call Trace: [1324900.870276] [] schedule+0x29/0x70 [1324900.870542] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1324900.870830] [] ? autoremove_wake_function+0x0/0x40 [1324900.871070] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1324900.871660] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd] [1324900.871904] [] start_this_handle+0x1a1/0x430 [jbd2] [1324900.872145] [] ? lnet_ni_send+0x3b/0xd0 [lnet] [1324900.872405] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1324900.872665] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1324900.872911] [] ? 
ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1324900.873393] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1324900.873875] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1324900.874118] [] dqget+0x3e4/0x440 [1324900.874372] [] dquot_get_dqblk+0x14/0x1f0 [1324900.874634] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs] [1324900.875098] [] lquotactl_slv+0x286/0xac0 [lquota] [1324900.875360] [] ofd_quotactl+0x13c/0x380 [ofd] [1324900.875642] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1324900.875955] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1324900.876486] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1324900.876745] [] ? default_wake_function+0x12/0x20 [1324900.876989] [] ? __wake_up_common+0x58/0x90 [1324900.877247] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1324900.877542] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1324900.877803] [] kthread+0xcf/0xe0 [1324900.878042] [] ? kthread+0x0/0xe0 [1324900.878281] [] ret_from_fork+0x58/0x90 [1324900.878540] [] ? kthread+0x0/0xe0 [1324900.878792] [1324962.295515] Pid: 174518, comm: ll_ost00_111 [1324962.295753] Call Trace: [1324962.296222] [] schedule+0x29/0x70 [1324962.296503] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1324962.296764] [] ? autoremove_wake_function+0x0/0x40 [1324962.297022] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1324962.297520] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd] [1324962.297767] [] start_this_handle+0x1a1/0x430 [jbd2] [1324962.298065] [] ? lnet_ni_send+0x3b/0xd0 [lnet] [1324962.298345] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1324962.298607] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1324962.298867] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1324962.299355] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1324962.299924] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1324962.300163] [] dqget+0x3e4/0x440 [1324962.300425] [] dquot_get_dqblk+0x14/0x1f0 [1324962.300696] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs] [1324962.301225] [] lquotactl_slv+0x286/0xac0 [lquota] [1324962.301505] [] ofd_quotactl+0x13c/0x380 [ofd] [1324962.301815] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1324962.302108] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1324962.302614] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1324962.302870] [] ? default_wake_function+0x12/0x20 [1324962.303138] [] ? __wake_up_common+0x58/0x90 [1324962.303397] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1324962.303679] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1324962.303928] [] kthread+0xcf/0xe0 [1324962.304179] [] ? kthread+0x0/0xe0 [1324962.304441] [] ret_from_fork+0x58/0x90 [1324962.304698] [] ? kthread+0x0/0xe0 [1324962.304972] [1324962.305235] LustreError: dumping log to /tmp/lustre-log.1519457614.174518 [1324962.306156] Pid: 249487, comm: ll_ost00_068 [1324962.306427] Call Trace: [1324962.306944] [] schedule+0x29/0x70 [1324962.307189] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1324962.307431] [] ? autoremove_wake_function+0x0/0x40 [1324962.313115] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1324962.313627] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd] [1324962.314008] [] start_this_handle+0x1a1/0x430 [jbd2] [1324962.314249] [] ? lnet_ni_send+0x3b/0xd0 [lnet] [1324962.314504] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1324962.314769] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1324962.315031] [] ? 
ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1324962.315534] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1324962.316047] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1324962.316287] [] dqget+0x3e4/0x440 [1324962.316540] [] dquot_get_dqblk+0x14/0x1f0 [1324962.316802] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs] [1324962.317293] [] lquotactl_slv+0x286/0xac0 [lquota] [1324962.317554] [] ofd_quotactl+0x13c/0x380 [ofd] [1324962.317842] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1324962.318135] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1324962.318638] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1324962.318897] [] ? default_wake_function+0x12/0x20 [1324962.319169] [] ? __wake_up_common+0x58/0x90 [1324962.319432] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1324962.319718] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1324962.319992] [] kthread+0xcf/0xe0 [1324962.320273] [] ? kthread+0x0/0xe0 [1324962.320528] [] ret_from_fork+0x58/0x90 [1324962.320779] [] ? kthread+0x0/0xe0 [1324962.321044] [1325023.732592] Pid: 174527, comm: ll_ost00_119 [1325023.732848] Call Trace: [1325023.733351] [] schedule+0x29/0x70 [1325023.733633] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1325023.733901] [] ? autoremove_wake_function+0x0/0x40 [1325023.734177] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1325023.734674] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd] [1325023.734916] [] start_this_handle+0x1a1/0x430 [jbd2] [1325023.735178] [] ? lnet_ni_send+0x3b/0xd0 [lnet] [1325023.735420] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1325023.735680] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1325023.735931] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1325023.736407] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1325023.736902] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1325023.737176] [] dqget+0x3e4/0x440 [1325023.737441] [] dquot_get_dqblk+0x14/0x1f0 [1325023.737711] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs] [1325023.738194] [] lquotactl_slv+0x286/0xac0 [lquota] [1325023.738438] [] ofd_quotactl+0x13c/0x380 [ofd] [1325023.738743] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1325023.739007] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1325023.739496] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1325023.739756] [] ? default_wake_function+0x12/0x20 [1325023.740012] [] ? __wake_up_common+0x58/0x90 [1325023.740300] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1325023.740558] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1325023.740816] [] kthread+0xcf/0xe0 [1325023.741097] [] ? kthread+0x0/0xe0 [1325023.741361] [] ret_from_fork+0x58/0x90 [1325023.741611] [] ? kthread+0x0/0xe0 [1325023.741847] [1325023.742103] LustreError: dumping log to /tmp/lustre-log.1519457676.174527 [1325023.743141] LNet: Service thread pid 390793 was inactive for 1204.55s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[1325023.743679] LNet: Skipped 3 previous similar messages
[1325044.391899] Lustre: oak-OST0035: deleting orphan objects from 0x0:4361572 to 0x0:4361634
[1325046.671736] Lustre: oak-OST003f: deleting orphan objects from 0x0:4425982 to 0x0:4426052
[1325048.307464] LustreError: dumping log to /tmp/lustre-log.1519457700.21429
[1325081.073959] LustreError: dumping log to /tmp/lustre-log.1519457733.17221
[1325109.644950] Lustre: oak-OST0031: deleting orphan objects from 0x0:4401615 to 0x0:4401633
[1325112.900664] Lustre: oak-OST0039: deleting orphan objects from 0x0:4347818 to 0x0:4347873
[1325113.840403] LustreError: dumping log to /tmp/lustre-log.1519457766.21425
[1325114.294396] LustreError: 6923:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519457466, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST004c_UUID lock: ffff880019c05e00/0x806f959362a72cd1 lrc: 3/0,1 mode: --/PW res: [0x34bb90:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 6923 timeout: 0 lvb_type: 0
[1325114.296067] LustreError: 6923:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 4 previous similar messages
[1325141.903214] Lustre: oak-OST0030: Export ffff883f7ce53400 already connecting from 10.9.112.15@o2ib4
[1325141.903712] Lustre: Skipped 11 previous similar messages
[1325142.511042] LustreError: dumping log to /tmp/lustre-log.1519457795.133118
[1325168.865950] Lustre: oak-OST004f: deleting orphan objects from 0x0:175045 to 0x0:175106
[1325171.181734] LNet: Service thread pid 136078 was inactive for 1203.96s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1325171.182221] LNet: Skipped 6 previous similar messages
[1325171.182466] LustreError: dumping log to /tmp/lustre-log.1519457823.136078
[1325203.948230] Pid: 174521, comm: ll_ost00_114
[1325203.948482] Call Trace:
[1325203.948938] [] schedule+0x29/0x70
[1325203.949204] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1325203.949463] [] ? autoremove_wake_function+0x0/0x40
[1325203.949706] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1325203.950201] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1325203.950444] [] start_this_handle+0x1a1/0x430 [jbd2]
[1325203.950710] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1325203.950952] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1325203.951206] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1325203.951459] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325203.951922] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1325203.952390] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325203.952632] [] dqget+0x3e4/0x440
[1325203.952882] [] dquot_get_dqblk+0x14/0x1f0
[1325203.953133] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1325203.953615] [] lquotactl_slv+0x286/0xac0 [lquota]
[1325203.953863] [] ofd_quotactl+0x13c/0x380 [ofd]
[1325203.954194] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1325203.954464] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325203.954957] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325203.955206] [] ? default_wake_function+0x12/0x20
[1325203.955452] [] ? __wake_up_common+0x58/0x90
[1325203.955718] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325203.955981] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325203.956221] [] kthread+0xcf/0xe0
[1325203.956550] [] ? kthread+0x0/0xe0
[1325203.956786] [] ret_from_fork+0x58/0x90
[1325203.957034] [] ? kthread+0x0/0xe0
[1325203.957283]
[1325203.957510] LustreError: dumping log to /tmp/lustre-log.1519457856.174521
[1325209.352161] Lustre: oak-OST003c: deleting orphan objects from 0x0:4437666 to 0x0:4437697
[1325212.139844] Pid: 249483, comm: ll_ost00_064
[1325212.140117] Call Trace:
[1325212.140593] [] schedule+0x29/0x70
[1325212.140859] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1325212.141138] [] ? autoremove_wake_function+0x0/0x40
[1325212.141410] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1325212.141886] [] ? __percpu_counter_sum+0x70/0x80
[1325212.142221] [] start_this_handle+0x1a1/0x430 [jbd2]
[1325212.142496] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs]
[1325212.142978] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1325212.143218] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1325212.143462] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1325212.143973] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1325212.144464] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1325212.144755] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc]
[1325212.145294] [] tgt_client_new+0x41b/0x610 [ptlrpc]
[1325212.145546] [] ofd_obd_connect+0x3a3/0x4c0 [ofd]
[1325212.145810] [] target_handle_connect+0x1146/0x2a70 [ptlrpc]
[1325212.146283] [] ? dequeue_entity+0x11c/0x5d0
[1325212.146522] [] ? dequeue_task_fair+0x3d0/0x660
[1325212.146761] [] ? __switch_to+0xd7/0x510
[1325212.147048] [] tgt_request_handle+0x402/0x1370 [ptlrpc]
[1325212.147315] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325212.147840] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325212.148082] [] ? default_wake_function+0x0/0x20
[1325212.148361] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325212.148621] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325212.148879] [] kthread+0xcf/0xe0
[1325212.149114] [] ? kthread+0x0/0xe0
[1325212.149351] [] ret_from_fork+0x58/0x90
[1325212.155034] [] ? kthread+0x0/0xe0
[1325212.155269]
[1325212.155497] LustreError: dumping log to /tmp/lustre-log.1519457864.249483
[1325212.156341] Pid: 249486, comm: ll_ost00_067
[1325212.156574] Call Trace:
[1325212.157028] [] schedule+0x29/0x70
[1325212.157271] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1325212.157510] [] ? autoremove_wake_function+0x0/0x40
[1325212.157753] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1325212.158216] [] ? __percpu_counter_sum+0x70/0x80
[1325212.158460] [] start_this_handle+0x1a1/0x430 [jbd2]
[1325212.158708] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs]
[1325212.159171] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1325212.159416] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1325212.159660] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1325212.160132] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1325212.160605] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1325212.160898] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc]
[1325212.161396] [] tgt_client_new+0x41b/0x610 [ptlrpc]
[1325212.161645] [] ofd_obd_connect+0x3a3/0x4c0 [ofd]
[1325212.161920] [] target_handle_connect+0x1146/0x2a70 [ptlrpc]
[1325212.162385] [] ? __enqueue_entity+0x78/0x80
[1325212.162627] [] ? enqueue_entity+0x26c/0xb60
[1325212.162910] [] tgt_request_handle+0x402/0x1370 [ptlrpc]
[1325212.163177] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325212.163661] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325212.163904] [] ? default_wake_function+0x12/0x20
[1325212.164145] [] ? __wake_up_common+0x58/0x90
[1325212.164407] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325212.164669] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325212.164911] [] kthread+0xcf/0xe0
[1325212.165149] [] ? kthread+0x0/0xe0
[1325212.165386] [] ret_from_fork+0x58/0x90
[1325212.165622] [] ? kthread+0x0/0xe0
[1325212.165860]
[1325212.166090] Pid: 390794, comm: ll_ost00_097
[1325212.166323] Call Trace:
[1325212.166784] [] schedule+0x29/0x70
[1325212.167028] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1325212.167271] [] ? autoremove_wake_function+0x0/0x40
[1325212.167519] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1325212.167986] [] ? __percpu_counter_sum+0x70/0x80
[1325212.168232] [] start_this_handle+0x1a1/0x430 [jbd2]
[1325212.168508] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs]
[1325212.168987] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1325212.169232] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1325212.169508] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1325212.169993] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1325212.170587] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1325212.170884] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc]
[1325212.171397] [] tgt_client_new+0x41b/0x610 [ptlrpc]
[1325212.171637] [] ofd_obd_connect+0x3a3/0x4c0 [ofd]
[1325212.171926] [] target_handle_connect+0x1146/0x2a70 [ptlrpc]
[1325212.172420] [] ? __enqueue_entity+0x78/0x80
[1325212.172658] [] ? enqueue_entity+0x26c/0xb60
[1325212.172956] [] tgt_request_handle+0x402/0x1370 [ptlrpc]
[1325212.173223] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325212.173738] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325212.173997] [] ? default_wake_function+0x12/0x20
[1325212.174238] [] ? __wake_up_common+0x58/0x90
[1325212.174531] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325212.174821] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325212.175062] [] kthread+0xcf/0xe0
[1325212.175330] [] ? kthread+0x0/0xe0
[1325212.175566] [] ret_from_fork+0x58/0x90
[1325212.175820] [] ? kthread+0x0/0xe0
[1325212.176061]
[1325212.176313] Pid: 174514, comm: ll_ost00_107
[1325212.176547] Call Trace:
[1325212.177024] [] schedule+0x29/0x70
[1325212.177268] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1325212.177542] [] ? autoremove_wake_function+0x0/0x40
[1325212.177805] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1325212.178295] [] ? __percpu_counter_sum+0x70/0x80
[1325212.178536] [] start_this_handle+0x1a1/0x430 [jbd2]
[1325212.178805] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs]
[1325212.179297] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1325212.179538] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1325212.179804] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1325212.180272] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1325212.180742] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1325212.181032] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc]
[1325212.181550] [] tgt_client_new+0x41b/0x610 [ptlrpc]
[1325212.181817] [] ofd_obd_connect+0x3a3/0x4c0 [ofd]
[1325212.182079] [] target_handle_connect+0x1146/0x2a70 [ptlrpc]
[1325212.182576] [] ? __enqueue_entity+0x78/0x80
[1325212.182839] [] ? enqueue_entity+0x26c/0xb60
[1325212.183107] [] tgt_request_handle+0x402/0x1370 [ptlrpc]
[1325212.183402] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325212.183917] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325212.184158] [] ? default_wake_function+0x12/0x20
[1325212.184427] [] ? __wake_up_common+0x58/0x90
[1325212.184684] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325212.185048] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325212.185286] [] kthread+0xcf/0xe0
[1325212.185551] [] ? kthread+0x0/0xe0
[1325212.185804] [] ret_from_fork+0x58/0x90
[1325212.186043] [] ? kthread+0x0/0xe0
[1325212.186310]
[1325224.427259] LustreError: dumping log to /tmp/lustre-log.1519457876.249495
[1325225.259229] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff883cbc43f450 x1592239615714592/t0(0) o4->3ee95ef7-4278-ead7-52a3-bdca1c47a323@10.9.112.3@o2ib4:752/0 lens 608/0 e 1 to 0 dl 1519457882 ref 2 fl New:H/0/ffffffff rc 0/-1
[1325225.260416] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 968 previous similar messages
[1325233.467798] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting
[1325233.468303] Lustre: Skipped 355 previous similar messages
[1325261.289553] LustreError: dumping log to /tmp/lustre-log.1519457913.249472
[1325292.956275] Lustre: oak-OST0049: deleting orphan objects from 0x0:3458851 to 0x0:3458881
[1325303.740787] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6)
[1325303.741263] Lustre: Skipped 420 previous similar messages
[1325322.726663] LustreError: dumping log to /tmp/lustre-log.1519457975.132963
[1325347.301485] LustreError: dumping log to /tmp/lustre-log.1519457999.249473
[1325351.397327] LustreError: dumping log to /tmp/lustre-log.1519458003.6918
[1325357.713154] Lustre: oak-OST004d: deleting orphan objects from 0x0:3465827 to 0x0:3465889
[1325380.068004] LustreError: dumping log to /tmp/lustre-log.1519458032.131958
[1325410.958697] Lustre: oak-OST0041: deleting orphan objects from 0x0:4441952 to 0x0:4442081
[1325414.558540] Lustre: oak-OST004c: deleting orphan objects from 0x0:3455889 to 0x0:3455905
[1325441.505131] LNet: Service thread pid 390785 was inactive for 1202.40s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1325441.505638] LNet: Skipped 6 previous similar messages
[1325441.505883] LustreError: dumping log to /tmp/lustre-log.1519458094.390785
[1325470.175765] LustreError: dumping log to /tmp/lustre-log.1519458122.4898
[1325474.271587] LustreError: dumping log to /tmp/lustre-log.1519458126.4896
[1325474.475775] Lustre: oak-OST0045: deleting orphan objects from 0x0:4421529 to 0x0:4421665
[1325478.367399] LustreError: dumping log to /tmp/lustre-log.1519458130.17222
[1325490.654779] LustreError: dumping log to /tmp/lustre-log.1519458143.196786
[1325491.499016] Lustre: oak-OST0034: deleting orphan objects from 0x0:4290565 to 0x0:4290625
[1325494.750614] LustreError: dumping log to /tmp/lustre-log.1519458147.362707
[1325502.942251] LustreError: dumping log to /tmp/lustre-log.1519458155.249330
[1325507.038042] LNet: Service thread pid 210324 was inactive for 1204.10s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1325507.038855] LNet: Skipped 9 previous similar messages
[1325507.039111] Pid: 210324, comm: ll_ost01_051
[1325507.039348] Call Trace:
[1325507.039820] [] schedule_preempt_disabled+0x29/0x70
[1325507.040070] [] __mutex_lock_slowpath+0xc7/0x1d0
[1325507.040315] [] mutex_lock+0x1f/0x2f
[1325507.040563] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1325507.040870] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1325507.041357] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1325507.041941] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1325507.047596] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1325507.047867] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325507.048365] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325507.048612] [] ? default_wake_function+0x12/0x20
[1325507.048856] [] ? __wake_up_common+0x58/0x90
[1325507.049125] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325507.049392] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325507.049637] [] kthread+0xcf/0xe0
[1325507.049876] [] ? kthread+0x0/0xe0
[1325507.050121] [] ret_from_fork+0x58/0x90
[1325507.050359] [] ? kthread+0x0/0xe0
[1325507.050599]
[1325507.050832] LustreError: dumping log to /tmp/lustre-log.1519458159.210324
[1325515.145848] Lustre: oak-OST0048: deleting orphan objects from 0x0:3442464 to 0x0:3442497
[1325539.804510] Pid: 249494, comm: ll_ost00_075
[1325539.804765] Call Trace:
[1325539.805249] [] schedule+0x29/0x70
[1325539.805511] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1325539.805802] [] ? autoremove_wake_function+0x0/0x40
[1325539.806075] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1325539.806567] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1325539.806827] [] start_this_handle+0x1a1/0x430 [jbd2]
[1325539.807106] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1325539.807347] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1325539.807624] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1325539.807878] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325539.808378] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1325539.808876] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325539.809119] [] dqget+0x3e4/0x440
[1325539.809407] [] dquot_get_dqblk+0x14/0x1f0
[1325539.809682] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1325539.810196] [] lquotactl_slv+0x286/0xac0 [lquota]
[1325539.810465] [] ofd_quotactl+0x13c/0x380 [ofd]
[1325539.810780] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1325539.811089] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325539.811625] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325539.811868] [] ? default_wake_function+0x12/0x20
[1325539.812108] [] ? __wake_up_common+0x58/0x90
[1325539.812368] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325539.812632] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325539.812962] [] kthread+0xcf/0xe0
[1325539.813221] [] ? kthread+0x0/0xe0
[1325539.813475] [] ret_from_fork+0x58/0x90
[1325539.813724] [] ? kthread+0x0/0xe0
[1325539.813958]
[1325539.814212] LustreError: dumping log to /tmp/lustre-log.1519458192.249494
[1325539.815122] Pid: 390786, comm: ll_ost00_089
[1325539.815358] Call Trace:
[1325539.815836] [] schedule+0x29/0x70
[1325539.816095] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1325539.816335] [] ? autoremove_wake_function+0x0/0x40
[1325539.816581] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1325539.817061] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1325539.817304] [] start_this_handle+0x1a1/0x430 [jbd2]
[1325539.817554] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1325539.817793] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1325539.818050] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1325539.818312] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325539.818831] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1325539.819317] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325539.819566] [] dqget+0x3e4/0x440
[1325539.819801] [] dquot_get_dqblk+0x14/0x1f0
[1325539.820062] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1325539.820558] [] lquotactl_slv+0x286/0xac0 [lquota]
[1325539.820800] [] ofd_quotactl+0x13c/0x380 [ofd]
[1325539.821075] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1325539.821342] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325539.821889] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325539.822176] [] ? default_wake_function+0x12/0x20
[1325539.822443] [] ? __wake_up_common+0x58/0x90
[1325539.822734] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325539.823052] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325539.823335] [] kthread+0xcf/0xe0
[1325539.823591] [] ? kthread+0x0/0xe0
[1325539.823830] [] ret_from_fork+0x58/0x90
[1325539.824112] [] ? kthread+0x0/0xe0
[1325539.824349]
[1325560.283550] Pid: 261357, comm: ll_ost00_022
[1325560.283788] Call Trace:
[1325560.284331] [] schedule+0x29/0x70
[1325560.284608] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1325560.284864] [] ? autoremove_wake_function+0x0/0x40
[1325560.285105] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1325560.285601] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1325560.285846] [] start_this_handle+0x1a1/0x430 [jbd2]
[1325560.286094] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1325560.286352] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1325560.286614] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1325560.286867] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325560.287347] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1325560.287873] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325560.288130] [] dqget+0x3e4/0x440
[1325560.288384] [] dquot_get_dqblk+0x14/0x1f0
[1325560.288661] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1325560.289132] [] lquotactl_slv+0x286/0xac0 [lquota]
[1325560.289399] [] ofd_quotactl+0x13c/0x380 [ofd]
[1325560.289716] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1325560.289990] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325560.290496] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325560.290752] [] ? default_wake_function+0x12/0x20
[1325560.290992] [] ? __wake_up_common+0x58/0x90
[1325560.291272] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325560.291555] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325560.291795] [] kthread+0xcf/0xe0
[1325560.292030] [] ? kthread+0x0/0xe0
[1325560.292280] [] ret_from_fork+0x58/0x90
[1325560.292536] [] ? kthread+0x0/0xe0
[1325560.292773]
[1325560.293002] LustreError: dumping log to /tmp/lustre-log.1519458212.261357
[1325572.570962] Pid: 174516, comm: ll_ost00_109
[1325572.571214] Call Trace:
[1325572.571685] [] schedule+0x29/0x70
[1325572.571958] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1325572.572221] [] ? autoremove_wake_function+0x0/0x40
[1325572.572466] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1325572.572974] [] ? __percpu_counter_sum+0x70/0x80
[1325572.573219] [] start_this_handle+0x1a1/0x430 [jbd2]
[1325572.573475] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs]
[1325572.573988] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1325572.574231] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1325572.574509] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1325572.574993] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1325572.575468] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1325572.575757] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc]
[1325572.576312] [] tgt_client_new+0x41b/0x610 [ptlrpc]
[1325572.576582] [] ofd_obd_connect+0x3a3/0x4c0 [ofd]
[1325572.576875] [] target_handle_connect+0x1146/0x2a70 [ptlrpc]
[1325572.577391] [] ? __enqueue_entity+0x78/0x80
[1325572.577668] [] ? enqueue_entity+0x26c/0xb60
[1325572.577990] [] tgt_request_handle+0x402/0x1370 [ptlrpc]
[1325572.578261] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325572.578783] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325572.579042] [] ? default_wake_function+0x12/0x20
[1325572.579285] [] ? __wake_up_common+0x58/0x90
[1325572.579556] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325572.579831] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325572.580088] [] kthread+0xcf/0xe0
[1325572.580327] [] ? kthread+0x0/0xe0
[1325572.580596] [] ret_from_fork+0x58/0x90
[1325572.580831] [] ? kthread+0x0/0xe0
[1325572.581085]
[1325572.581315] LustreError: dumping log to /tmp/lustre-log.1519458225.174516
[1325597.145836] LustreError: dumping log to /tmp/lustre-log.1519458249.390778
[1325621.720688] LustreError: dumping log to /tmp/lustre-log.1519458274.249468
[1325645.475876] Lustre: oak-OST0035: deleting orphan objects from 0x0:4361572 to 0x0:4361666
[1325648.067687] Lustre: oak-OST003f: deleting orphan objects from 0x0:4426054 to 0x0:4426084
[1325658.582947] LustreError: dumping log to /tmp/lustre-log.1519458311.17263
[1325683.157833] LustreError: dumping log to /tmp/lustre-log.1519458335.267671
[1325711.828454] LustreError: dumping log to /tmp/lustre-log.1519458364.4899
[1325713.312735] Lustre: oak-OST0031: deleting orphan objects from 0x0:4401635 to 0x0:4401665
[1325713.976594] Lustre: oak-OST0039: deleting orphan objects from 0x0:4347875 to 0x0:4347905
[1325715.924284] LustreError: dumping log to /tmp/lustre-log.1519458368.390790
[1325720.020098] LustreError: dumping log to /tmp/lustre-log.1519458372.390795
[1325740.499156] LustreError: dumping log to /tmp/lustre-log.1519458393.17242
[1325752.702781] Lustre: oak-OST0037: deleting orphan objects from 0x0:4441495 to 0x0:4441538
[1325769.725964] Lustre: oak-OST004f: deleting orphan objects from 0x0:175045 to 0x0:175138
[1325777.361392] LustreError: dumping log to /tmp/lustre-log.1519458429.249474
[1325801.936274] LustreError: dumping log to /tmp/lustre-log.1519458454.174524
[1325809.996097] Lustre: oak-OST003c: deleting orphan objects from 0x0:4437666 to 0x0:4437729
[1325810.127862] Pid: 133753, comm: ll_ost01_015
[1325810.128100] Call Trace:
[1325810.128624] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
[1325810.129104] [] schedule+0x29/0x70
[1325810.129367] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
[1325810.129616] [] ? default_wake_function+0x0/0x20
[1325810.129885] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
[1325810.130392] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1325810.130657] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1325810.130913] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd]
[1325810.131190] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1325810.131451] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1325810.131699] [] ofd_destroy_hdl+0x267/0x970 [ofd]
[1325810.131985] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1325810.132263] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325810.132786] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325810.133039] [] ? default_wake_function+0x12/0x20
[1325810.133285] [] ? __wake_up_common+0x58/0x90
[1325810.133551] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325810.133818] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325810.134065] [] kthread+0xcf/0xe0
[1325810.134325] [] ? do_exit+0x6bb/0xa40
[1325810.134565] [] ? kthread+0x0/0xe0
[1325810.134809] [] ret_from_fork+0x58/0x90
[1325810.135054] [] ? kthread+0x0/0xe0
[1325810.135300]
[1325810.135535] LustreError: dumping log to /tmp/lustre-log.1519458462.133753
[1325826.255150] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff881a40848c50 x1592239615714592/t0(0) o4->3ee95ef7-4278-ead7-52a3-bdca1c47a323@10.9.112.3@o2ib4:598/0 lens 608/0 e 0 to 0 dl 1519458483 ref 2 fl New:H/2/ffffffff rc 0/-1
[1325826.256404] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1002 previous similar messages
[1325834.365383] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting
[1325834.365871] Lustre: Skipped 369 previous similar messages
[1325838.798575] Pid: 17262, comm: ll_ost00_029
[1325838.798807] Call Trace:
[1325838.799323] [] schedule+0x29/0x70
[1325838.799602] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1325838.799893] [] ? autoremove_wake_function+0x0/0x40
[1325838.800153] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1325838.800683] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1325838.800976] [] start_this_handle+0x1a1/0x430 [jbd2]
[1325838.801255] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1325838.801497] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1325838.801755] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1325838.802058] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325838.802567] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1325838.803122] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325838.803367] [] dqget+0x3e4/0x440
[1325838.803623] [] dquot_get_dqblk+0x14/0x1f0
[1325838.803897] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1325838.804395] [] lquotactl_slv+0x286/0xac0 [lquota]
[1325838.804659] [] ofd_quotactl+0x13c/0x380 [ofd]
[1325838.804981] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1325838.805271] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325838.805780] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325838.806083] [] ? default_wake_function+0x12/0x20
[1325838.806370] [] ? __wake_up_common+0x58/0x90
[1325838.806649] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325838.806929] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325838.807197] [] kthread+0xcf/0xe0
[1325838.807431] [] ? kthread+0x0/0xe0
[1325838.807685] [] ret_from_fork+0x58/0x90
[1325838.807942] [] ? kthread+0x0/0xe0
[1325838.808218]
[1325838.808450] LustreError: dumping log to /tmp/lustre-log.1519458491.17262
[1325838.809375] Pid: 17272, comm: ll_ost00_039
[1325838.809638] Call Trace:
[1325838.810142] [] schedule+0x29/0x70
[1325838.810386] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1325838.810650] [] ? autoremove_wake_function+0x0/0x40
[1325838.810930] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1325838.811441] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1325838.811706] [] start_this_handle+0x1a1/0x430 [jbd2]
[1325838.811982] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1325838.812361] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1325838.812619] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1325838.812865] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325838.813371] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1325838.813896] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325838.814191] [] dqget+0x3e4/0x440
[1325838.814473] [] dquot_get_dqblk+0x14/0x1f0
[1325838.814742] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1325838.815288] [] lquotactl_slv+0x286/0xac0 [lquota]
[1325838.815548] [] ofd_quotactl+0x13c/0x380 [ofd]
[1325838.815827] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1325838.816146] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325838.816656] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325838.816898] [] ? default_wake_function+0x12/0x20
[1325838.817185] [] ? __wake_up_common+0x58/0x90
[1325838.817445] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325838.817731] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325838.818005] [] kthread+0xcf/0xe0
[1325838.818292] [] ? kthread+0x0/0xe0
[1325838.818547] [] ret_from_fork+0x58/0x90
[1325838.818788] [] ? kthread+0x0/0xe0
[1325838.819075]
[1325863.373421] Pid: 131959, comm: ll_ost00_002
[1325863.373711] Call Trace:
[1325863.374201] [] schedule+0x29/0x70
[1325863.374487] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1325863.374748] [] ? autoremove_wake_function+0x0/0x40
[1325863.375056] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1325863.375578] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1325863.375860] [] start_this_handle+0x1a1/0x430 [jbd2]
[1325863.376192] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1325863.376453] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1325863.376697] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1325863.377011] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325863.377514] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1325863.378059] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325863.378306] [] dqget+0x3e4/0x440
[1325863.378562] [] dquot_get_dqblk+0x14/0x1f0
[1325863.378848] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1325863.379388] [] lquotactl_slv+0x286/0xac0 [lquota]
[1325863.379652] [] ofd_quotactl+0x13c/0x380 [ofd]
[1325863.380003] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1325863.380284] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325863.380819] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325863.381112] [] ? default_wake_function+0x12/0x20
[1325863.381355] [] ? __wake_up_common+0x58/0x90
[1325863.381637] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325863.381947] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325863.382234] [] kthread+0xcf/0xe0
[1325863.382488] [] ? kthread+0x0/0xe0
[1325863.382760] [] ret_from_fork+0x58/0x90
[1325863.383044] [] ? kthread+0x0/0xe0
[1325863.383276]
[1325863.383521] LustreError: dumping log to /tmp/lustre-log.1519458515.131959
[1325885.536621] Lustre: oak-OST0050: deleting orphan objects from 0x0:175410 to 0x0:175425
[1325894.464134] Lustre: oak-OST0049: deleting orphan objects from 0x0:3458851 to 0x0:3458913
[1325896.139899] Pid: 174511, comm: ll_ost00_105
[1325896.140137] Call Trace:
[1325896.140588] [] schedule+0x29/0x70
[1325896.140943] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1325896.141184] [] ? autoremove_wake_function+0x0/0x40
[1325896.141426] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1325896.147147] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1325896.147393] [] start_this_handle+0x1a1/0x430 [jbd2]
[1325896.147644] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1325896.147890] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1325896.148134] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1325896.148386] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325896.148850] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1325896.149318] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1325896.149560] [] dqget+0x3e4/0x440
[1325896.149798] [] dquot_get_dqblk+0x14/0x1f0
[1325896.150061] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1325896.150539] [] lquotactl_slv+0x286/0xac0 [lquota]
[1325896.150802] [] ofd_quotactl+0x13c/0x380 [ofd]
[1325896.151122] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1325896.151389] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1325896.151913] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1325896.152159] [] ? default_wake_function+0x12/0x20
[1325896.152422] [] ? __wake_up_common+0x58/0x90
[1325896.152686] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1325896.152968] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1325896.153210] [] kthread+0xcf/0xe0
[1325896.153446] [] ? kthread+0x0/0xe0
[1325896.153700] [] ret_from_fork+0x58/0x90
[1325896.153956] [] ? kthread+0x0/0xe0
[1325896.154193]
[1325896.154421] LustreError: dumping log to /tmp/lustre-log.1519458548.174511
[1325900.235715] LustreError: dumping log to /tmp/lustre-log.1519458552.133661
[1325904.774836] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6)
[1325904.775311] Lustre: Skipped 404 previous similar messages
[1325917.352138] Lustre: oak-OST0030: Export ffff880d4614a000 already connecting from 10.9.112.15@o2ib4
[1325917.352701] Lustre: Skipped 9 previous similar messages
[1325920.714741] LustreError: dumping log to /tmp/lustre-log.1519458573.137847
[1325930.448279] LustreError: 4897:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519458282, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0053_UUID lock: ffff880019c07600/0x806f959362a74884 lrc: 3/0,1 mode: --/PW res: [0x2b046:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 4897 timeout: 0 lvb_type: 0
[1325930.450072] LustreError: 4897:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 5 previous similar messages
[1325937.097929] LustreError: dumping log to /tmp/lustre-log.1519458589.362709
[1325957.576973] LNet: Service thread pid 174512 was inactive for 1201.32s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1325957.577458] LNet: Skipped 31 previous similar messages
[1325957.577698] LustreError: dumping log to /tmp/lustre-log.1519458610.174512
[1325959.349066] Lustre: oak-OST004d: deleting orphan objects from 0x0:3465827 to 0x0:3465921
[1325982.151863] LustreError: dumping log to /tmp/lustre-log.1519458634.174515
[1326013.778649] Lustre: oak-OST0041: deleting orphan objects from 0x0:4441952 to 0x0:4442113
[1326014.918298] LustreError: dumping log to /tmp/lustre-log.1519458667.6923
[1326015.786485] Lustre: oak-OST004c: deleting orphan objects from 0x0:3455889 to 0x0:3455937
[1326019.014150] LustreError: dumping log to /tmp/lustre-log.1519458671.249470
[1326039.493248] LustreError: dumping log to /tmp/lustre-log.1519458692.390782
[1326074.879751] Lustre: oak-OST0045: deleting orphan objects from 0x0:4421667 to 0x0:4421697
[1326076.355469] LustreError: dumping log to /tmp/lustre-log.1519458728.390791
[1326092.726935] Lustre: oak-OST0034: deleting orphan objects from 0x0:4290565 to 0x0:4290657
[1326092.738719] LustreError: dumping log to /tmp/lustre-log.1519458745.210347
[1326100.930308] LustreError: dumping log to /tmp/lustre-log.1519458753.17265
[1326105.026127] LustreError: dumping log to /tmp/lustre-log.1519458757.174509
[1326116.477822] Lustre: oak-OST0048: deleting orphan objects from 0x0:3442499 to 0x0:3442529
[1326117.313552] LNet: Service thread pid 362692 was inactive for 1202.90s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1326117.314269] LNet: Skipped 9 previous similar messages
[1326117.314512] Pid: 362692, comm: ll_ost01_083
[1326117.314749] Call Trace:
[1326117.315277] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
[1326117.315757] [] schedule+0x29/0x70
[1326117.316022] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
[1326117.316271] [] ? default_wake_function+0x0/0x20
[1326117.316551] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
[1326117.317042] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1326117.317313] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1326117.317598] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd]
[1326117.317930] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1326117.318204] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1326117.318458] [] ofd_destroy_hdl+0x267/0x970 [ofd]
[1326117.318754] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1326117.319032] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1326117.319541] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1326117.319789] [] ? default_wake_function+0x12/0x20
[1326117.320034] [] ? __wake_up_common+0x58/0x90
[1326117.320311] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1326117.320591] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1326117.320838] [] kthread+0xcf/0xe0
[1326117.321077] [] ? kthread+0x0/0xe0
[1326117.321323] [] ret_from_fork+0x58/0x90
[1326117.321568] [] ? kthread+0x0/0xe0
[1326117.321809]
[1326117.322041] LustreError: dumping log to /tmp/lustre-log.1519458769.362692
[1326137.792644] Pid: 17264, comm: ll_ost00_031
[1326137.792884] Call Trace:
[1326137.793346] [] schedule+0x29/0x70
[1326137.793613] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1326137.793860] [] ? autoremove_wake_function+0x0/0x40
[1326137.794105] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1326137.794616] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1326137.794865] [] start_this_handle+0x1a1/0x430 [jbd2]
[1326137.795133] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1326137.795380] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1326137.795645] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1326137.795901] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326137.796400] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1326137.796875] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326137.797142] [] dqget+0x3e4/0x440
[1326137.797470] [] dquot_get_dqblk+0x14/0x1f0
[1326137.797729] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1326137.798192] [] lquotactl_slv+0x286/0xac0 [lquota]
[1326137.798451] [] ofd_quotactl+0x13c/0x380 [ofd]
[1326137.798771] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1326137.799039] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1326137.799577] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1326137.799819] [] ? default_wake_function+0x12/0x20
[1326137.800092] [] ? __wake_up_common+0x58/0x90
[1326137.800357] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1326137.800638] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1326137.800880] [] kthread+0xcf/0xe0
[1326137.801135] [] ? kthread+0x0/0xe0
[1326137.801390] [] ret_from_fork+0x58/0x90
[1326137.801663] [] ? kthread+0x0/0xe0
[1326137.801901]
[1326137.802142] LustreError: dumping log to /tmp/lustre-log.1519458790.17264
[1326162.367475] Pid: 249481, comm: ll_ost00_062
[1326162.367711] Call Trace:
[1326162.368166] [] schedule+0x29/0x70
[1326162.368429] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1326162.368806] [] ? autoremove_wake_function+0x0/0x40
[1326162.369048] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1326162.369564] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1326162.369822] [] start_this_handle+0x1a1/0x430 [jbd2]
[1326162.370118] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1326162.370361] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1326162.370621] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1326162.370894] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326162.371421] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1326162.371937] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326162.372180] [] dqget+0x3e4/0x440
[1326162.372434] [] dquot_get_dqblk+0x14/0x1f0
[1326162.372709] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1326162.373221] [] lquotactl_slv+0x286/0xac0 [lquota]
[1326162.373503] [] ofd_quotactl+0x13c/0x380 [ofd]
[1326162.373799] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1326162.374114] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1326162.374631] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1326162.374905] [] ? default_wake_function+0x12/0x20
[1326162.375193] [] ? __wake_up_common+0x58/0x90
[1326162.375474] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1326162.375756] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1326162.376042] [] kthread+0xcf/0xe0
[1326162.376278] [] ? kthread+0x0/0xe0
[1326162.376537] [] ret_from_fork+0x58/0x90
[1326162.376790] [] ? kthread+0x0/0xe0
[1326162.382691]
[1326162.383027] LustreError: dumping log to /tmp/lustre-log.1519458814.249481
[1326195.133919] Pid: 390789, comm: ll_ost00_092
[1326195.134156] Call Trace:
[1326195.134619] [] schedule+0x29/0x70
[1326195.134902] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1326195.135165] [] ? autoremove_wake_function+0x0/0x40
[1326195.135408] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1326195.135912] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1326195.136154] [] start_this_handle+0x1a1/0x430 [jbd2]
[1326195.136405] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1326195.136677] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1326195.136937] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1326195.137190] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326195.137682] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1326195.138166] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326195.138451] [] dqget+0x3e4/0x440
[1326195.138701] [] dquot_get_dqblk+0x14/0x1f0
[1326195.138986] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1326195.139485] [] lquotactl_slv+0x286/0xac0 [lquota]
[1326195.139728] [] ofd_quotactl+0x13c/0x380 [ofd]
[1326195.140040] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1326195.140390] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1326195.140905] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1326195.141144] [] ? default_wake_function+0x12/0x20
[1326195.141416] [] ? __wake_up_common+0x58/0x90
[1326195.141681] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1326195.141964] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1326195.142202] [] kthread+0xcf/0xe0
[1326195.142437] [] ? kthread+0x0/0xe0
[1326195.142696] [] ret_from_fork+0x58/0x90
[1326195.142951] [] ? kthread+0x0/0xe0
[1326195.143186]
[1326195.143436] LustreError: dumping log to /tmp/lustre-log.1519458847.390789
[1326219.708831] Pid: 25988, comm: ll_ost00_079
[1326219.709094] Call Trace:
[1326219.709552] [] schedule+0x29/0x70
[1326219.709836] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1326219.710093] [] ? autoremove_wake_function+0x0/0x40
[1326219.710378] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1326219.710892] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1326219.711149] [] start_this_handle+0x1a1/0x430 [jbd2]
[1326219.711551] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1326219.711824] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1326219.712079] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1326219.712375] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326219.712853] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1326219.713381] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326219.713622] [] dqget+0x3e4/0x440
[1326219.713876] [] dquot_get_dqblk+0x14/0x1f0
[1326219.714147] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1326219.714660] [] lquotactl_slv+0x286/0xac0 [lquota]
[1326219.714924] [] ofd_quotactl+0x13c/0x380 [ofd]
[1326219.715228] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1326219.715532] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1326219.716061] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1326219.716350] [] ? default_wake_function+0x12/0x20
[1326219.716639] [] ? __wake_up_common+0x58/0x90
[1326219.716920] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1326219.717185] [] ? __switch_to+0xd7/0x510
[1326219.717473] [] ? __schedule+0x2f0/0x8b0
[1326219.717757] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1326219.718018] [] kthread+0xcf/0xe0
[1326219.718303] [] ? kthread+0x0/0xe0
[1326219.718543] [] ret_from_fork+0x58/0x90
[1326219.718799] [] ? kthread+0x0/0xe0
[1326219.719049]
[1326219.719318] LustreError: dumping log to /tmp/lustre-log.1519458872.25988
[1326227.900396] LustreError: dumping log to /tmp/lustre-log.1519458880.296683
[1326231.996188] LustreError: dumping log to /tmp/lustre-log.1519458884.390783
[1326236.091990] LustreError: dumping log to /tmp/lustre-log.1519458888.174523
[1326245.471763] Lustre: oak-OST0035: deleting orphan objects from 0x0:4361668 to 0x0:4361698
[1326249.479670] Lustre: oak-OST003f: deleting orphan objects from 0x0:4426054 to 0x0:4426116
[1326252.475212] LustreError: dumping log to /tmp/lustre-log.1519458905.210340
[1326253.975335] Lustre: oak-OST003e: deleting orphan objects from 0x0:4288655 to 0x0:4288673
[1326256.571057] LustreError: dumping log to /tmp/lustre-log.1519458909.249469
[1326260.666829] LustreError: dumping log to /tmp/lustre-log.1519458913.196795
[1326281.145944] LustreError: dumping log to /tmp/lustre-log.1519458933.131985
[1326293.433303] LustreError: dumping log to /tmp/lustre-log.1519458945.362694
[1326313.876570] Lustre: oak-OST0031: deleting orphan objects from 0x0:4401667 to 0x0:4401697
[1326314.764468] Lustre: oak-OST0039: deleting orphan objects from 0x0:4347907 to 0x0:4347937
[1326318.008252] LustreError: dumping log to /tmp/lustre-log.1519458970.390781
[1326334.391421] LustreError: dumping log to /tmp/lustre-log.1519458986.196783
[1326342.583058] LustreError: dumping log to /tmp/lustre-log.1519458995.390780
[1326353.938651] Lustre: oak-OST0037: deleting orphan objects from 0x0:4441495 to 0x0:4441570
[1326354.870464] LustreError: dumping log to /tmp/lustre-log.1519459007.21427
[1326370.593983] Lustre: oak-OST004f: deleting orphan objects from 0x0:175140 to 0x0:175170
[1326375.349515] LustreError: dumping log to /tmp/lustre-log.1519459027.249475
[1326385.497320] Lustre: oak-OST0043: deleting orphan objects from 0x0:4403912 to 0x0:4403937
[1326388.977188] Lustre: oak-OST0053: deleting orphan objects from 0x0:176199 to 0x0:176258
[1326399.924362] LustreError: dumping log to /tmp/lustre-log.1519459052.249489
[1326410.928027] Lustre: oak-OST003c: deleting orphan objects from 0x0:4437666 to 0x0:4437761
[1326412.211799] LustreError: dumping log to /tmp/lustre-log.1519459064.249479
[1326416.307591] LustreError: dumping log to /tmp/lustre-log.1519459068.249493
[1326426.483157] Lustre: 132528:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff88097d184450 x1591002487081696/t0(0) o101->cd9072f9-71b4-1cfb-a759-fd6823f1a4a9@10.0.2.3@o2ib5:444/0 lens 328/0 e 0 to 0 dl 1519459084 ref 2 fl New:/0/ffffffff rc 0/-1
[1326426.484356] Lustre: 132528:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1017 previous similar messages
[1326435.431380] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting
[1326435.431878] Lustre: Skipped 390 previous similar messages
[1326436.786642] Pid: 17270, comm: ll_ost00_037
[1326436.786875] Call Trace:
[1326436.787330] [] schedule+0x29/0x70
[1326436.787593] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1326436.787884] [] ? autoremove_wake_function+0x0/0x40
[1326436.788178] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1326436.788663] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1326436.788906] [] start_this_handle+0x1a1/0x430 [jbd2]
[1326436.789218] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1326436.789474] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1326436.789735] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1326436.789988] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326436.790495] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1326436.791029] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326436.791309] [] dqget+0x3e4/0x440
[1326436.791572] [] dquot_get_dqblk+0x14/0x1f0
[1326436.791851] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1326436.792373] [] lquotactl_slv+0x286/0xac0 [lquota]
[1326436.792633] [] ofd_quotactl+0x13c/0x380 [ofd]
[1326436.792931] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1326436.793242] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1326436.793743] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1326436.793984] [] ? default_wake_function+0x12/0x20
[1326436.794270] [] ? __wake_up_common+0x58/0x90
[1326436.794528] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1326436.794809] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1326436.795048] [] kthread+0xcf/0xe0
[1326436.795329] [] ? kthread+0x0/0xe0
[1326436.795565] [] ret_from_fork+0x58/0x90
[1326436.795820] [] ? kthread+0x0/0xe0
[1326436.796055]
[1326436.796292] LustreError: dumping log to /tmp/lustre-log.1519459089.17270
[1326452.990145] Lustre: oak-OST0051: deleting orphan objects from 0x0:176434 to 0x0:176449
[1326461.361494] Pid: 17245, comm: ll_ost00_027
[1326461.361733] Call Trace:
[1326461.362195] [] schedule+0x29/0x70
[1326461.362461] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1326461.362704] [] ? autoremove_wake_function+0x0/0x40
[1326461.362949] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1326461.363412] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1326461.363659] [] start_this_handle+0x1a1/0x430 [jbd2]
[1326461.363909] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1326461.364149] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1326461.364420] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1326461.364679] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326461.370644] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1326461.371191] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326461.371437] [] dqget+0x3e4/0x440
[1326461.371699] [] dquot_get_dqblk+0x14/0x1f0
[1326461.371961] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1326461.372486] [] lquotactl_slv+0x286/0xac0 [lquota]
[1326461.372732] [] ofd_quotactl+0x13c/0x380 [ofd]
[1326461.373084] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1326461.373378] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1326461.373956] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1326461.374199] [] ? default_wake_function+0x12/0x20
[1326461.374483] [] ? __wake_up_common+0x58/0x90
[1326461.374746] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1326461.375070] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1326461.375320] [] kthread+0xcf/0xe0
[1326461.375590] [] ? kthread+0x0/0xe0
[1326461.375831] [] ret_from_fork+0x58/0x90
[1326461.376069] [] ? kthread+0x0/0xe0
[1326461.376334]
[1326461.376580] LustreError: dumping log to /tmp/lustre-log.1519459113.17245
[1326473.648964] Pid: 390787, comm: ll_ost00_090
[1326473.649203] Call Trace:
[1326473.649707] [] schedule+0x29/0x70
[1326473.649985] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1326473.650263] [] ? autoremove_wake_function+0x0/0x40
[1326473.650564] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1326473.651058] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1326473.651322] [] start_this_handle+0x1a1/0x430 [jbd2]
[1326473.651622] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1326473.651866] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1326473.652124] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1326473.652431] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326473.652965] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1326473.653494] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326473.653732] [] dqget+0x3e4/0x440
[1326473.654080] [] dquot_get_dqblk+0x14/0x1f0
[1326473.654354] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1326473.654860] [] lquotactl_slv+0x286/0xac0 [lquota]
[1326473.655125] [] ofd_quotactl+0x13c/0x380 [ofd]
[1326473.655466] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1326473.655775] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1326473.656310] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1326473.656604] [] ? default_wake_function+0x12/0x20
[1326473.656905] [] ? __wake_up_common+0x58/0x90
[1326473.657192] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1326473.657501] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1326473.657742] [] kthread+0xcf/0xe0
[1326473.657998] [] ? kthread+0x0/0xe0
[1326473.658253] [] ret_from_fork+0x58/0x90
[1326473.658537] [] ? kthread+0x0/0xe0
[1326473.658776]
[1326473.659025] LustreError: dumping log to /tmp/lustre-log.1519459126.390787
[1326473.660019] Pid: 17269, comm: ll_ost00_036
[1326473.660271] Call Trace:
[1326473.660772] [] schedule+0x29/0x70
[1326473.661039] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1326473.661322] [] ? autoremove_wake_function+0x0/0x40
[1326473.661613] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1326473.662104] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1326473.662383] [] start_this_handle+0x1a1/0x430 [jbd2]
[1326473.662679] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1326473.662937] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1326473.663181] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1326473.663476] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326473.663961] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1326473.664478] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326473.664721] [] dqget+0x3e4/0x440
[1326473.664978] [] dquot_get_dqblk+0x14/0x1f0
[1326473.665226] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1326473.665738] [] lquotactl_slv+0x286/0xac0 [lquota]
[1326473.666000] [] ofd_quotactl+0x13c/0x380 [ofd]
[1326473.666302] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1326473.666613] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1326473.667120] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1326473.667368] [] ? default_wake_function+0x12/0x20
[1326473.667663] [] ? __wake_up_common+0x58/0x90
[1326473.667956] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1326473.668315] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1326473.668596] [] kthread+0xcf/0xe0
[1326473.668840] [] ? kthread+0x0/0xe0
[1326473.669106] [] ret_from_fork+0x58/0x90
[1326473.669379] [] ? kthread+0x0/0xe0
[1326473.669667]
[1326495.396177] Lustre: oak-OST0049: deleting orphan objects from 0x0:3458851 to 0x0:3458945
[1326498.223811] Pid: 174520, comm: ll_ost00_113
[1326498.224066] Call Trace:
[1326498.224525] [] schedule+0x29/0x70
[1326498.224788] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1326498.225070] [] ? autoremove_wake_function+0x0/0x40
[1326498.225451] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1326498.225930] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd]
[1326498.226202] [] start_this_handle+0x1a1/0x430 [jbd2]
[1326498.226484] [] ? lnet_ni_send+0x3b/0xd0 [lnet]
[1326498.226785] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1326498.227029] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1326498.227328] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326498.227813] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1326498.228348] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs]
[1326498.228593] [] dqget+0x3e4/0x440
[1326498.228848] [] dquot_get_dqblk+0x14/0x1f0
[1326498.229121] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs]
[1326498.229634] [] lquotactl_slv+0x286/0xac0 [lquota]
[1326498.229898] [] ofd_quotactl+0x13c/0x380 [ofd]
[1326498.230207] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1326498.230517] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1326498.231046] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1326498.231340] [] ? default_wake_function+0x12/0x20
[1326498.231621] [] ? __wake_up_common+0x58/0x90
[1326498.231906] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1326498.232187] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1326498.232468] [] kthread+0xcf/0xe0
[1326498.232702] [] ? kthread+0x0/0xe0
[1326498.232958] [] ret_from_fork+0x58/0x90
[1326498.233272] [] ? kthread+0x0/0xe0
[1326498.233568]
[1326498.233831] LustreError: dumping log to /tmp/lustre-log.1519459150.174520
[1326505.775515] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6)
[1326505.775998] Lustre: Skipped 423 previous similar messages
[1326522.798650] LustreError: dumping log to /tmp/lustre-log.1519459175.249480
[1326535.086074] LustreError: dumping log to /tmp/lustre-log.1519459187.134046
[1326555.565111] LustreError: dumping log to /tmp/lustre-log.1519459208.17273
[1326559.660951] LNet: Service thread pid 249334 was inactive for 1203.62s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1326559.661468] LNet: Skipped 37 previous similar messages
[1326559.661705] LustreError: dumping log to /tmp/lustre-log.1519459212.249334
[1326560.289091] Lustre: oak-OST004d: deleting orphan objects from 0x0:3465827 to 0x0:3465953
[1326580.139987] LustreError: dumping log to /tmp/lustre-log.1519459232.174517
[1326588.331530] LustreError: dumping log to /tmp/lustre-log.1519459240.362688
[1326592.427361] LustreError: dumping log to /tmp/lustre-log.1519459244.174505
[1326596.523179] LustreError: dumping log to /tmp/lustre-log.1519459249.137742
[1326615.214515] Lustre: oak-OST0041: deleting orphan objects from 0x0:4441952 to 0x0:4442145
[1326616.854415] Lustre: oak-OST004c: deleting orphan objects from 0x0:3455939 to 0x0:3455969
[1326617.002255] LustreError: dumping log to /tmp/lustre-log.1519459269.134225
[1326641.577101] LustreError: dumping log to /tmp/lustre-log.1519459294.249537
[1326642.085444] Lustre: oak-OST0050: deleting orphan objects from 0x0:175410 to 0x0:175457
[1326653.864503] LustreError: dumping log to /tmp/lustre-log.1519459306.17266
[1326676.403650] Lustre: oak-OST0045: deleting orphan objects from 0x0:4421667 to 0x0:4421729
[1326678.439347] LustreError: dumping log to /tmp/lustre-log.1519459331.174519
[1326693.666879] Lustre: oak-OST0034: deleting orphan objects from 0x0:4290565 to 0x0:4290689
[1326703.014242] LustreError: dumping log to /tmp/lustre-log.1519459355.141924
[1326715.301650] LustreError: dumping log to /tmp/lustre-log.1519459367.249331
[1326719.489635] Lustre: oak-OST0048: deleting orphan objects from 0x0:3442499 to 0x0:3442561
[1326735.780681] LustreError: dumping log to /tmp/lustre-log.1519459388.390788
[1326760.355510] LNet: Service thread pid 249490 was inactive for 1200.27s. The thread might be hung, or it might only be slow and will resume later.
Dumping the stack trace for debugging purposes: [1326760.356213] LNet: Skipped 9 previous similar messages [1326760.356450] Pid: 249490, comm: ll_ost00_071 [1326760.356691] Call Trace: [1326760.362645] [] schedule+0x29/0x70 [1326760.362903] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1326760.363177] [] ? autoremove_wake_function+0x0/0x40 [1326760.363418] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1326760.363933] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd] [1326760.364194] [] start_this_handle+0x1a1/0x430 [jbd2] [1326760.364466] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1326760.364725] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1326760.364976] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1326760.365468] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1326760.365983] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1326760.366244] [] dqget+0x3e4/0x440 [1326760.366521] [] dquot_get_dqblk+0x14/0x1f0 [1326760.366780] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs] [1326760.367281] [] lquotactl_slv+0x286/0xac0 [lquota] [1326760.367625] [] ofd_quotactl+0x13c/0x380 [ofd] [1326760.367914] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1326760.368208] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1326760.368706] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1326760.368947] [] ? default_wake_function+0x12/0x20 [1326760.369220] [] ? __wake_up_common+0x58/0x90 [1326760.369484] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1326760.369760] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1326760.370002] [] kthread+0xcf/0xe0 [1326760.370270] [] ? kthread+0x0/0xe0 [1326760.370528] [] ret_from_fork+0x58/0x90 [1326760.370764] [] ? kthread+0x0/0xe0 [1326760.371032] [1326760.371259] LustreError: dumping log to /tmp/lustre-log.1519459412.249490 [1326772.642967] Pid: 17267, comm: ll_ost00_034 [1326772.643206] Call Trace: [1326772.643667] [] schedule+0x29/0x70 [1326772.643952] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1326772.644231] [] ? autoremove_wake_function+0x0/0x40 [1326772.644506] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1326772.645011] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd] [1326772.645257] [] start_this_handle+0x1a1/0x430 [jbd2] [1326772.645555] [] ? lnet_ni_send+0x3b/0xd0 [lnet] [1326772.645799] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1326772.646060] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1326772.646315] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1326772.646837] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1326772.647376] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1326772.647653] [] dqget+0x3e4/0x440 [1326772.647950] [] dquot_get_dqblk+0x14/0x1f0 [1326772.648209] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs] [1326772.648729] [] lquotactl_slv+0x286/0xac0 [lquota] [1326772.648995] [] ofd_quotactl+0x13c/0x380 [ofd] [1326772.649293] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1326772.649609] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1326772.650117] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1326772.650360] [] ? default_wake_function+0x12/0x20 [1326772.650648] [] ? __wake_up_common+0x58/0x90 [1326772.650911] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1326772.651199] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1326772.651502] [] kthread+0xcf/0xe0 [1326772.651785] [] ? kthread+0x0/0xe0 [1326772.652043] [] ret_from_fork+0x58/0x90 [1326772.652283] [] ? 
kthread+0x0/0xe0 [1326772.652567] [1326772.652796] LustreError: dumping log to /tmp/lustre-log.1519459425.17267 [1326797.217850] Pid: 249335, comm: ll_ost00_048 [1326797.218096] Call Trace: [1326797.218573] [] schedule+0x29/0x70 [1326797.218839] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1326797.219115] [] ? autoremove_wake_function+0x0/0x40 [1326797.219390] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1326797.219872] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd] [1326797.220116] [] start_this_handle+0x1a1/0x430 [jbd2] [1326797.220395] [] ? lnet_ni_send+0x3b/0xd0 [lnet] [1326797.220636] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1326797.220896] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1326797.221152] [] ? ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1326797.221642] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1326797.222155] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1326797.222433] [] dqget+0x3e4/0x440 [1326797.222697] [] dquot_get_dqblk+0x14/0x1f0 [1326797.222970] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs] [1326797.223468] [] lquotactl_slv+0x286/0xac0 [lquota] [1326797.223713] [] ofd_quotactl+0x13c/0x380 [ofd] [1326797.224027] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1326797.224317] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1326797.224908] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1326797.225150] [] ? default_wake_function+0x12/0x20 [1326797.225417] [] ? __wake_up_common+0x58/0x90 [1326797.225674] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1326797.225954] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1326797.226195] [] kthread+0xcf/0xe0 [1326797.226459] [] ? kthread+0x0/0xe0 [1326797.226698] [] ret_from_fork+0x58/0x90 [1326797.226954] [] ? kthread+0x0/0xe0 [1326797.227193] [1326797.227454] LustreError: dumping log to /tmp/lustre-log.1519459449.249335 [1326807.430311] LustreError: 210342:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519459159, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0039_UUID lock: ffff88258777d200/0x806f959362a77dc8 lrc: 3/0,1 mode: --/PW res: [0x4257e2:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 210342 timeout: 0 lvb_type: 0 [1326807.432121] LustreError: 210342:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 5 previous similar messages [1326821.792685] Pid: 174507, comm: ll_ost00_101 [1326821.792928] Call Trace: [1326821.793410] [] schedule+0x29/0x70 [1326821.793679] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1326821.793990] [] ? autoremove_wake_function+0x0/0x40 [1326821.794253] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1326821.794789] [] ? kiblnd_send+0x357/0xa10 [ko2iblnd] [1326821.795051] [] start_this_handle+0x1a1/0x430 [jbd2] [1326821.795333] [] ? lnet_ni_send+0x3b/0xd0 [lnet] [1326821.795576] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1326821.795834] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1326821.796184] [] ? 
ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1326821.796720] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1326821.797212] [] ldiskfs_acquire_dquot+0x53/0xb0 [ldiskfs] [1326821.797466] [] dqget+0x3e4/0x440 [1326821.797737] [] dquot_get_dqblk+0x14/0x1f0 [1326821.798023] [] osd_acct_index_lookup+0x22f/0x470 [osd_ldiskfs] [1326821.798527] [] lquotactl_slv+0x286/0xac0 [lquota] [1326821.798793] [] ofd_quotactl+0x13c/0x380 [ofd] [1326821.799108] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1326821.799398] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1326821.799904] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1326821.800193] [] ? default_wake_function+0x12/0x20 [1326821.800481] [] ? __wake_up_common+0x58/0x90 [1326821.800795] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1326821.801106] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1326821.801379] [] kthread+0xcf/0xe0 [1326821.801619] [] ? kthread+0x0/0xe0 [1326821.801878] [] ret_from_fork+0x58/0x90 [1326821.802169] [] ? kthread+0x0/0xe0 [1326821.802439] [1326821.802689] LustreError: dumping log to /tmp/lustre-log.1519459474.174507 [1326834.080067] Pid: 132143, comm: ll_ost01_009 [1326834.080311] Call Trace: [1326834.080821] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] [1326834.081301] [] schedule+0x29/0x70 [1326834.081559] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc] [1326834.081895] [] ? default_wake_function+0x0/0x20 [1326834.082155] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc] [1326834.082632] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] [1326834.082894] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc] [1326834.083152] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd] [1326834.083415] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc] [1326834.083674] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] [1326834.083922] [] ofd_destroy_hdl+0x267/0x970 [ofd] [1326834.084206] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1326834.084474] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1326834.084968] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1326834.085216] [] ? default_wake_function+0x0/0x20 [1326834.085482] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1326834.091139] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1326834.091387] [] kthread+0xcf/0xe0 [1326834.091651] [] ? kthread+0x0/0xe0 [1326834.091897] [] ret_from_fork+0x58/0x90 [1326834.092144] [] ? 
kthread+0x0/0xe0 [1326834.092387] [1326834.092622] LustreError: dumping log to /tmp/lustre-log.1519459486.132143 [1326847.019841] Lustre: oak-OST0035: deleting orphan objects from 0x0:4361668 to 0x0:4361730 [1326850.463336] LustreError: dumping log to /tmp/lustre-log.1519459503.21431 [1326850.699562] Lustre: oak-OST003f: deleting orphan objects from 0x0:4426054 to 0x0:4426148 [1326854.539278] Lustre: oak-OST003e: deleting orphan objects from 0x0:4288655 to 0x0:4288705 [1326854.559141] LustreError: dumping log to /tmp/lustre-log.1519459507.4890 [1326858.654985] LustreError: dumping log to /tmp/lustre-log.1519459511.390796 [1326883.229826] LustreError: dumping log to /tmp/lustre-log.1519459535.362705 [1326895.517228] LustreError: dumping log to /tmp/lustre-log.1519459548.196785 [1326899.613003] LustreError: dumping log to /tmp/lustre-log.1519459552.362693 [1326915.640429] Lustre: oak-OST0031: deleting orphan objects from 0x0:4401667 to 0x0:4401729 [1326916.008545] Lustre: oak-OST0039: deleting orphan objects from 0x0:4347907 to 0x0:4347969 [1326955.878585] Lustre: oak-OST0037: deleting orphan objects from 0x0:4441495 to 0x0:4441602 [1326972.197826] Lustre: oak-OST004f: deleting orphan objects from 0x0:175140 to 0x0:175202 [1326981.529185] LustreError: dumping log to /tmp/lustre-log.1519459634.210337 [1326989.720837] LustreError: dumping log to /tmp/lustre-log.1519459642.132017 [1327006.104065] LustreError: dumping log to /tmp/lustre-log.1519459658.4910 [1327011.860054] Lustre: oak-OST003c: deleting orphan objects from 0x0:4437666 to 0x0:4437793 [1327018.391468] LustreError: dumping log to /tmp/lustre-log.1519459670.196796 [1327026.999117] Lustre: 132528:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff8806ca79a850 x1591169533815232/t0(0) o101->a7b231a5-a7d7-2f9d-4b5d-464beea873b1@10.8.0.63@o2ib6:289/0 lens 328/0 e 0 to 0 dl 1519459684 ref 2 fl New:/0/ffffffff rc 0/-1 [1327027.000288] Lustre: 132528:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1041 previous similar messages [1327036.407367] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting [1327036.407891] Lustre: Skipped 390 previous similar messages [1327071.637021] Pid: 362696, comm: ll_ost01_087 [1327071.637266] Call Trace: [1327071.637732] [] schedule+0x29/0x70 [1327071.638003] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1327071.638259] [] ? autoremove_wake_function+0x0/0x40 [1327071.638611] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1327071.639075] [] ? __percpu_counter_sum+0x70/0x80 [1327071.639315] [] start_this_handle+0x1a1/0x430 [jbd2] [1327071.639571] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs] [1327071.640049] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1327071.640313] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1327071.640562] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327071.641077] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1327071.641613] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327071.641933] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc] [1327071.642439] [] tgt_client_new+0x41b/0x610 [ptlrpc] [1327071.642693] [] ofd_obd_connect+0x3a3/0x4c0 [ofd] [1327071.642959] [] target_handle_connect+0x1146/0x2a70 [ptlrpc] [1327071.643439] [] ? dequeue_entity+0x11c/0x5d0 [1327071.643686] [] ? enqueue_entity+0x26c/0xb60 [1327071.643930] [] ? dequeue_task_fair+0x3d0/0x660 [1327071.644178] [] ? 
__switch_to+0xd7/0x510 [1327071.644451] [] tgt_request_handle+0x402/0x1370 [ptlrpc] [1327071.644727] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327071.645230] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327071.645479] [] ? default_wake_function+0x0/0x20 [1327071.645742] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327071.646016] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327071.646259] [] kthread+0xcf/0xe0 [1327071.646505] [] ? kthread+0x0/0xe0 [1327071.646748] [] ret_from_fork+0x58/0x90 [1327071.646998] [] ? kthread+0x0/0xe0 [1327071.647240] [1327071.647474] LustreError: dumping log to /tmp/lustre-log.1519459724.362696 [1327071.648357] Pid: 132313, comm: ll_ost01_010 [1327071.648601] Call Trace: [1327071.649082] [] schedule+0x29/0x70 [1327071.649330] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1327071.649580] [] ? autoremove_wake_function+0x0/0x40 [1327071.649831] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1327071.650312] [] ? __percpu_counter_sum+0x70/0x80 [1327071.650561] [] start_this_handle+0x1a1/0x430 [jbd2] [1327071.650817] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs] [1327071.651293] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1327071.651544] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1327071.651804] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327071.652281] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1327071.652844] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327071.653124] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc] [1327071.653617] [] tgt_client_new+0x41b/0x610 [ptlrpc] [1327071.653865] [] ofd_obd_connect+0x3a3/0x4c0 [ofd] [1327071.654138] [] target_handle_connect+0x1146/0x2a70 [ptlrpc] [1327071.654610] [] ? __enqueue_entity+0x78/0x80 [1327071.654855] [] ? enqueue_entity+0x26c/0xb60 [1327071.655137] [] tgt_request_handle+0x402/0x1370 [ptlrpc] [1327071.655410] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327071.655911] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327071.656162] [] ? default_wake_function+0x12/0x20 [1327071.656409] [] ? __wake_up_common+0x58/0x90 [1327071.656682] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327071.656953] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327071.657201] [] kthread+0xcf/0xe0 [1327071.657442] [] ? kthread+0x0/0xe0 [1327071.657684] [] ret_from_fork+0x58/0x90 [1327071.657928] [] ? kthread+0x0/0xe0 [1327071.658174] [1327071.658411] Pid: 210330, comm: ll_ost01_057 [1327071.658651] Call Trace: [1327071.659117] [] schedule+0x29/0x70 [1327071.659364] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1327071.659610] [] ? autoremove_wake_function+0x0/0x40 [1327071.659857] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1327071.660348] [] ? __percpu_counter_sum+0x70/0x80 [1327071.660613] [] start_this_handle+0x1a1/0x430 [jbd2] [1327071.660870] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs] [1327071.661343] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1327071.661589] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1327071.661839] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327071.662333] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1327071.662824] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327071.663097] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc] [1327071.663600] [] tgt_client_new+0x41b/0x610 [ptlrpc] [1327071.663850] [] ofd_obd_connect+0x3a3/0x4c0 [ofd] [1327071.664122] [] target_handle_connect+0x1146/0x2a70 [ptlrpc] [1327071.664598] [] ? __enqueue_entity+0x78/0x80 [1327071.664842] [] ? 
enqueue_entity+0x26c/0xb60 [1327071.665121] [] tgt_request_handle+0x402/0x1370 [ptlrpc] [1327071.665395] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327071.665896] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327071.666150] [] ? default_wake_function+0x12/0x20 [1327071.666396] [] ? __wake_up_common+0x58/0x90 [1327071.666661] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327071.667025] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327071.667267] [] kthread+0xcf/0xe0 [1327071.667504] [] ? kthread+0x0/0xe0 [1327071.667741] [] ret_from_fork+0x58/0x90 [1327071.667981] [] ? kthread+0x0/0xe0 [1327071.668223] [1327071.668458] Pid: 135199, comm: ll_ost01_021 [1327071.668696] Call Trace: [1327071.669163] [] schedule+0x29/0x70 [1327071.669410] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1327071.669658] [] ? autoremove_wake_function+0x0/0x40 [1327071.669909] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1327071.670389] [] ? __percpu_counter_sum+0x70/0x80 [1327071.670632] [] start_this_handle+0x1a1/0x430 [jbd2] [1327071.670885] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs] [1327071.671386] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1327071.671651] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1327071.671902] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327071.678014] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1327071.678497] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327071.678774] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc] [1327071.679310] [] tgt_client_new+0x41b/0x610 [ptlrpc] [1327071.679563] [] ofd_obd_connect+0x3a3/0x4c0 [ofd] [1327071.679829] [] target_handle_connect+0x1146/0x2a70 [ptlrpc] [1327071.680328] [] ? dequeue_entity+0x11c/0x5d0 [1327071.680590] [] ? dequeue_task_fair+0x3d0/0x660 [1327071.680828] [] ? __switch_to+0xd7/0x510 [1327071.681198] [] tgt_request_handle+0x402/0x1370 [ptlrpc] [1327071.681473] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327071.681962] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327071.682209] [] ? default_wake_function+0x0/0x20 [1327071.682475] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327071.682745] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327071.682997] [] kthread+0xcf/0xe0 [1327071.683239] [] ? kthread+0x0/0xe0 [1327071.683482] [] ret_from_fork+0x58/0x90 [1327071.683725] [] ? kthread+0x0/0xe0 [1327071.683970] [1327096.087994] Lustre: oak-OST0049: deleting orphan objects from 0x0:3458851 to 0x0:3458977 [1327100.307692] Pid: 362702, comm: ll_ost01_093 [1327100.307934] Call Trace: [1327100.308461] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] [1327100.308942] [] schedule+0x29/0x70 [1327100.309204] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc] [1327100.309447] [] ? default_wake_function+0x0/0x20 [1327100.309708] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc] [1327100.310279] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] [1327100.310541] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc] [1327100.310793] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd] [1327100.311055] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc] [1327100.311318] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] [1327100.311568] [] ofd_destroy_hdl+0x267/0x970 [ofd] [1327100.311858] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1327100.312125] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327100.312620] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327100.312870] [] ? default_wake_function+0x0/0x20 [1327100.313137] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327100.313405] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327100.313659] [] kthread+0xcf/0xe0 [1327100.313902] [] ? 
kthread+0x0/0xe0 [1327100.314150] [] ret_from_fork+0x58/0x90 [1327100.314390] [] ? kthread+0x0/0xe0 [1327100.314636] [1327100.314872] LustreError: dumping log to /tmp/lustre-log.1519459752.362702 [1327106.569372] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6) [1327106.569840] Lustre: Skipped 388 previous similar messages [1327141.669887] Lustre: oak-OST0043: deleting orphan objects from 0x0:4403912 to 0x0:4403969 [1327144.925819] Lustre: oak-OST0053: deleting orphan objects from 0x0:176199 to 0x0:176290 [1327161.213055] Lustre: oak-OST004d: deleting orphan objects from 0x0:3465827 to 0x0:3465985 [1327208.578755] Lustre: oak-OST0051: deleting orphan objects from 0x0:176434 to 0x0:176481 [1327217.114425] Lustre: oak-OST0041: deleting orphan objects from 0x0:4441952 to 0x0:4442177 [1327218.290337] Lustre: oak-OST004c: deleting orphan objects from 0x0:3455939 to 0x0:3456001 [1327277.335650] Lustre: oak-OST0045: deleting orphan objects from 0x0:4421667 to 0x0:4421761 [1327295.286810] Lustre: oak-OST0034: deleting orphan objects from 0x0:4290565 to 0x0:4290721 [1327310.217700] Lustre: oak-OST0032: Export ffff880612719800 already connecting from 10.9.112.15@o2ib4 [1327310.218187] Lustre: Skipped 6 previous similar messages [1327320.517550] Lustre: oak-OST0048: deleting orphan objects from 0x0:3442499 to 0x0:3442593 [1327382.918465] LNet: Service thread pid 196789 was inactive for 1203.30s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [1327382.919189] LNet: Skipped 9 previous similar messages [1327382.919432] Pid: 196789, comm: ll_ost01_030 [1327382.919678] Call Trace: [1327382.920155] [] schedule+0x29/0x70 [1327382.920423] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1327382.920676] [] ? autoremove_wake_function+0x0/0x40 [1327382.920927] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1327382.921404] [] ? __percpu_counter_sum+0x70/0x80 [1327382.921654] [] start_this_handle+0x1a1/0x430 [jbd2] [1327382.921917] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs] [1327382.922399] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1327382.922652] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1327382.922912] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327382.923386] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1327382.923952] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327382.924265] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc] [1327382.924763] [] tgt_client_new+0x41b/0x610 [ptlrpc] [1327382.925017] [] ofd_obd_connect+0x3a3/0x4c0 [ofd] [1327382.925284] [] target_handle_connect+0x1146/0x2a70 [ptlrpc] [1327382.925763] [] ? __enqueue_entity+0x78/0x80 [1327382.926008] [] ? enqueue_entity+0x26c/0xb60 [1327382.926286] [] tgt_request_handle+0x402/0x1370 [ptlrpc] [1327382.926563] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327382.927060] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327382.927306] [] ? default_wake_function+0x12/0x20 [1327382.927557] [] ? __wake_up_common+0x58/0x90 [1327382.927826] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327382.928099] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327382.928343] [] kthread+0xcf/0xe0 [1327382.928590] [] ? kthread+0x0/0xe0 [1327382.928835] [] ret_from_fork+0x58/0x90 [1327382.929075] [] ? 
kthread+0x0/0xe0 [1327382.929321] [1327382.929558] LustreError: dumping log to /tmp/lustre-log.1519460035.196789 [1327382.931914] Pid: 196797, comm: ll_ost01_035 [1327382.932154] Call Trace: [1327382.932624] [] schedule+0x29/0x70 [1327382.932871] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1327382.933116] [] ? autoremove_wake_function+0x0/0x40 [1327382.933363] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1327382.933835] [] ? __percpu_counter_sum+0x70/0x80 [1327382.934080] [] start_this_handle+0x1a1/0x430 [jbd2] [1327382.934330] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs] [1327382.934801] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1327382.935047] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1327382.935296] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327382.935776] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1327382.936256] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327382.936530] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc] [1327382.937027] [] tgt_client_new+0x41b/0x610 [ptlrpc] [1327382.937278] [] ofd_obd_connect+0x3a3/0x4c0 [ofd] [1327382.937541] [] target_handle_connect+0x1146/0x2a70 [ptlrpc] [1327382.938097] [] ? __enqueue_entity+0x78/0x80 [1327382.938335] [] ? enqueue_entity+0x26c/0xb60 [1327382.938605] [] tgt_request_handle+0x402/0x1370 [ptlrpc] [1327382.938872] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327382.939364] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327382.939609] [] ? default_wake_function+0x12/0x20 [1327382.939854] [] ? __wake_up_common+0x58/0x90 [1327382.940120] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327382.940385] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327382.940632] [] kthread+0xcf/0xe0 [1327382.940871] [] ? kthread+0x0/0xe0 [1327382.941112] [] ret_from_fork+0x58/0x90 [1327382.941355] [] ? kthread+0x0/0xe0 [1327382.941600] [1327382.941834] Pid: 4887, comm: ll_ost01_101 [1327382.942071] Call Trace: [1327382.942539] [] schedule+0x29/0x70 [1327382.942787] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1327382.943035] [] ? autoremove_wake_function+0x0/0x40 [1327382.943282] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1327382.943753] [] ? __percpu_counter_sum+0x70/0x80 [1327382.943999] [] start_this_handle+0x1a1/0x430 [jbd2] [1327382.944249] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs] [1327382.944719] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1327382.944965] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1327382.945212] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327382.951018] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1327382.951494] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327382.951763] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc] [1327382.952349] [] tgt_client_new+0x41b/0x610 [ptlrpc] [1327382.952595] [] ofd_obd_connect+0x3a3/0x4c0 [ofd] [1327382.952855] [] target_handle_connect+0x1146/0x2a70 [ptlrpc] [1327382.953320] [] ? __enqueue_entity+0x78/0x80 [1327382.953566] [] ? enqueue_entity+0x26c/0xb60 [1327382.953834] [] tgt_request_handle+0x402/0x1370 [ptlrpc] [1327382.954102] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327382.954598] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327382.954842] [] ? default_wake_function+0x12/0x20 [1327382.955083] [] ? __wake_up_common+0x58/0x90 [1327382.955344] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327382.955610] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327382.955853] [] kthread+0xcf/0xe0 [1327382.956093] [] ? kthread+0x0/0xe0 [1327382.956335] [] ret_from_fork+0x58/0x90 [1327382.956579] [] ? 
kthread+0x0/0xe0 [1327382.956818] [1327382.957053] Pid: 21434, comm: ll_ost01_046 [1327382.957289] Call Trace: [1327382.957752] [] schedule+0x29/0x70 [1327382.957996] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1327382.958241] [] ? autoremove_wake_function+0x0/0x40 [1327382.958488] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1327382.958951] [] ? __percpu_counter_sum+0x70/0x80 [1327382.959195] [] start_this_handle+0x1a1/0x430 [jbd2] [1327382.959444] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs] [1327382.959919] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1327382.960166] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1327382.960412] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327382.960891] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1327382.961365] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1327382.961643] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc] [1327382.962157] [] tgt_client_new+0x41b/0x610 [ptlrpc] [1327382.962421] [] ofd_obd_connect+0x3a3/0x4c0 [ofd] [1327382.962684] [] target_handle_connect+0x1146/0x2a70 [ptlrpc] [1327382.963154] [] ? __enqueue_entity+0x78/0x80 [1327382.963400] [] ? enqueue_entity+0x26c/0xb60 [1327382.963672] [] tgt_request_handle+0x402/0x1370 [ptlrpc] [1327382.963940] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327382.964430] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327382.964680] [] ? default_wake_function+0x12/0x20 [1327382.964924] [] ? __wake_up_common+0x58/0x90 [1327382.965205] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327382.965473] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327382.965718] [] kthread+0xcf/0xe0 [1327382.965968] [] ? kthread+0x0/0xe0 [1327382.966298] [] ret_from_fork+0x58/0x90 [1327382.966552] [] ? kthread+0x0/0xe0 [1327382.966804] [1327385.586579] Lustre: oak-OST0033: deleting orphan objects from 0x0:4424462 to 0x0:4424481 [1327397.521972] Lustre: oak-OST0050: deleting orphan objects from 0x0:175410 to 0x0:175489 [1327448.519724] Lustre: oak-OST0035: deleting orphan objects from 0x0:4361668 to 0x0:4361762 [1327451.407520] Lustre: oak-OST003f: deleting orphan objects from 0x0:4426054 to 0x0:4426180 [1327455.695309] Lustre: oak-OST003e: deleting orphan objects from 0x0:4288655 to 0x0:4288737 [1327516.636434] Lustre: oak-OST0031: deleting orphan objects from 0x0:4401667 to 0x0:4401761 [1327517.636315] Lustre: oak-OST0039: deleting orphan objects from 0x0:4347907 to 0x0:4348001 [1327556.570595] Lustre: oak-OST0037: deleting orphan objects from 0x0:4441495 to 0x0:4441634 [1327572.857845] Lustre: oak-OST004f: deleting orphan objects from 0x0:175140 to 0x0:175234 [1327608.187993] Pid: 362695, comm: ll_ost01_086 [1327608.188254] Call Trace: [1327608.188743] [] schedule_preempt_disabled+0x29/0x70 [1327608.188991] [] __mutex_lock_slowpath+0xc7/0x1d0 [1327608.189251] [] mutex_lock+0x1f/0x2f [1327608.189498] [] ofd_create_hdl+0xdcb/0x2090 [ofd] [1327608.189801] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] [1327608.190346] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc] [1327608.190853] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc] [1327608.191140] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1327608.191440] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327608.191933] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327608.192184] [] ? default_wake_function+0x12/0x20 [1327608.192428] [] ? __wake_up_common+0x58/0x90 [1327608.192693] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327608.192983] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327608.193246] [] kthread+0xcf/0xe0 [1327608.193494] [] ? 
kthread+0x0/0xe0 [1327608.193823] [] ret_from_fork+0x58/0x90 [1327608.194074] [] ? kthread+0x0/0xe0 [1327608.194310] [1327608.194637] LustreError: dumping log to /tmp/lustre-log.1519460260.362695 [1327612.911991] Lustre: oak-OST003c: deleting orphan objects from 0x0:4437666 to 0x0:4437825 [1327628.731028] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff882a7d196450 x1592239615714592/t0(0) o4->3ee95ef7-4278-ead7-52a3-bdca1c47a323@10.9.112.3@o2ib4:136/0 lens 608/0 e 0 to 0 dl 1519460286 ref 2 fl New:H/2/ffffffff rc 0/-1 [1327628.732233] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1041 previous similar messages [1327636.748404] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting [1327636.748888] Lustre: Skipped 403 previous similar messages [1327697.076156] Lustre: oak-OST0049: deleting orphan objects from 0x0:3458851 to 0x0:3459009 [1327707.540372] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6) [1327707.540851] Lustre: Skipped 430 previous similar messages [1327710.583225] Pid: 210342, comm: ll_ost01_069 [1327710.583464] Call Trace: [1327710.583995] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc] [1327710.584479] [] schedule+0x29/0x70 [1327710.584738] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc] [1327710.584987] [] ? default_wake_function+0x0/0x20 [1327710.585254] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc] [1327710.585778] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] [1327710.586044] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc] [1327710.586297] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd] [1327710.586558] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc] [1327710.586820] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc] [1327710.587067] [] ofd_destroy_hdl+0x267/0x970 [ofd] [1327710.587354] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1327710.587623] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327710.588111] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327710.588357] [] ? default_wake_function+0x12/0x20 [1327710.588602] [] ? __wake_up_common+0x58/0x90 [1327710.588864] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327710.589128] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327710.589377] [] kthread+0xcf/0xe0 [1327710.589616] [] ? kthread+0x0/0xe0 [1327710.589859] [] ret_from_fork+0x58/0x90 [1327710.590099] [] ? kthread+0x0/0xe0 [1327710.590343] [1327710.590576] LustreError: dumping log to /tmp/lustre-log.1519460363.210342 [1327739.253839] Pid: 134397, comm: ll_ost01_016 [1327739.254081] Call Trace: [1327739.254547] [] schedule_preempt_disabled+0x29/0x70 [1327739.254791] [] __mutex_lock_slowpath+0xc7/0x1d0 [1327739.255043] [] mutex_lock+0x1f/0x2f [1327739.255293] [] ofd_create_hdl+0xdcb/0x2090 [ofd] [1327739.255591] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] [1327739.256100] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc] [1327739.256592] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc] [1327739.256886] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1327739.257156] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327739.257648] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327739.257901] [] ? default_wake_function+0x12/0x20 [1327739.258153] [] ? __wake_up_common+0x58/0x90 [1327739.258419] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327739.258683] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327739.258934] [] kthread+0xcf/0xe0 [1327739.259177] [] ? do_exit+0x6bb/0xa40 [1327739.259415] [] ? 
kthread+0x0/0xe0 [1327739.259660] [] ret_from_fork+0x58/0x90 [1327739.265282] [] ? kthread+0x0/0xe0 [1327739.265521] [1327739.265848] LustreError: dumping log to /tmp/lustre-log.1519460391.134397 [1327743.349656] Pid: 210336, comm: ll_ost01_063 [1327743.349898] Call Trace: [1327743.350360] [] schedule_preempt_disabled+0x29/0x70 [1327743.350603] [] __mutex_lock_slowpath+0xc7/0x1d0 [1327743.350851] [] mutex_lock+0x1f/0x2f [1327743.351110] [] ofd_create_hdl+0xdcb/0x2090 [ofd] [1327743.351395] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] [1327743.351946] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc] [1327743.352428] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc] [1327743.352708] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1327743.352975] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327743.353467] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327743.353719] [] ? default_wake_function+0x12/0x20 [1327743.353961] [] ? __wake_up_common+0x58/0x90 [1327743.354223] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327743.354487] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327743.354734] [] kthread+0xcf/0xe0 [1327743.354975] [] ? kthread+0x0/0xe0 [1327743.355215] [] ret_from_fork+0x58/0x90 [1327743.355456] [] ? kthread+0x0/0xe0 [1327743.355698] [1327743.355930] LustreError: dumping log to /tmp/lustre-log.1519460395.210336 [1327761.424994] Lustre: oak-OST004d: deleting orphan objects from 0x0:3465827 to 0x0:3466017 [1327763.828688] Pid: 210333, comm: ll_ost01_060 [1327763.828925] Call Trace: [1327763.829394] [] schedule_preempt_disabled+0x29/0x70 [1327763.829644] [] __mutex_lock_slowpath+0xc7/0x1d0 [1327763.829897] [] mutex_lock+0x1f/0x2f [1327763.830148] [] ofd_create_hdl+0xdcb/0x2090 [ofd] [1327763.830450] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] [1327763.830947] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc] [1327763.831435] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc] [1327763.831721] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1327763.831992] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327763.832483] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327763.832732] [] ? default_wake_function+0x12/0x20 [1327763.832976] [] ? __wake_up_common+0x58/0x90 [1327763.833239] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327763.833501] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327763.833748] [] kthread+0xcf/0xe0 [1327763.833987] [] ? kthread+0x0/0xe0 [1327763.834228] [] ret_from_fork+0x58/0x90 [1327763.834473] [] ? kthread+0x0/0xe0 [1327763.834714] [1327763.834949] LustreError: dumping log to /tmp/lustre-log.1519460416.210333 [1327776.116142] Pid: 210329, comm: ll_ost01_056 [1327776.116386] Call Trace: [1327776.116854] [] schedule_preempt_disabled+0x29/0x70 [1327776.117097] [] __mutex_lock_slowpath+0xc7/0x1d0 [1327776.117346] [] mutex_lock+0x1f/0x2f [1327776.117593] [] ofd_create_hdl+0xdcb/0x2090 [ofd] [1327776.117891] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] [1327776.118391] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc] [1327776.118883] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc] [1327776.119173] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1327776.119444] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1327776.119938] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1327776.120190] [] ? default_wake_function+0x12/0x20 [1327776.120439] [] ? __wake_up_common+0x58/0x90 [1327776.120710] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1327776.120978] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1327776.121228] [] kthread+0xcf/0xe0 [1327776.121466] [] ? 
kthread+0x0/0xe0 [1327776.121709] [] ret_from_fork+0x58/0x90 [1327776.121948] [] ? kthread+0x0/0xe0 [1327776.122191] [1327776.122421] LustreError: dumping log to /tmp/lustre-log.1519460428.210329 [1327818.030407] Lustre: oak-OST0041: deleting orphan objects from 0x0:4442179 to 0x0:4442209 [1327821.262208] Lustre: oak-OST004c: deleting orphan objects from 0x0:3455939 to 0x0:3456033 [1327829.361671] LNet: Service thread pid 133446 was inactive for 1201.99s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. [1327829.362162] LNet: Skipped 25 previous similar messages [1327829.362407] LustreError: dumping log to /tmp/lustre-log.1519460481.133446 [1327878.075554] Lustre: oak-OST0045: deleting orphan objects from 0x0:4421667 to 0x0:4421793 [1327896.346665] Lustre: oak-OST0034: deleting orphan objects from 0x0:4290565 to 0x0:4290753 [1327897.418697] Lustre: oak-OST0043: deleting orphan objects from 0x0:4403912 to 0x0:4404001 [1327901.034530] Lustre: oak-OST0053: deleting orphan objects from 0x0:176199 to 0x0:176322 [1327922.425459] Lustre: oak-OST0048: deleting orphan objects from 0x0:3442499 to 0x0:3442625 [1327965.335551] Lustre: oak-OST0051: deleting orphan objects from 0x0:176434 to 0x0:176513 [1328036.703471] Lustre: oak-OST0030: Export ffff8835775b9000 already connecting from 10.9.112.15@o2ib4 [1328036.703950] Lustre: Skipped 2 previous similar messages [1328050.203468] Lustre: oak-OST0035: deleting orphan objects from 0x0:4361668 to 0x0:4361794 [1328052.795415] Lustre: oak-OST003f: deleting orphan objects from 0x0:4426054 to 0x0:4426212 [1328056.819181] Lustre: oak-OST003e: deleting orphan objects from 0x0:4288655 to 0x0:4288769 [1328086.700084] Lustre: oak-OST0030: Export ffff8835775b9000 already connecting from 10.9.112.15@o2ib4 [1328086.700582] Lustre: Skipped 1 previous similar message [1328117.920336] Lustre: oak-OST0031: deleting orphan objects from 0x0:4401667 to 0x0:4401793 [1328118.920376] Lustre: oak-OST0039: deleting orphan objects from 0x0:4347907 to 0x0:4348033 [1328136.697746] Lustre: oak-OST0030: Export ffff8835775b9000 already connecting from 10.9.112.15@o2ib4 [1328136.698224] Lustre: Skipped 2 previous similar messages [1328141.951249] Lustre: oak-OST0033: deleting orphan objects from 0x0:4424462 to 0x0:4424513 [1328154.510654] Lustre: oak-OST0050: deleting orphan objects from 0x0:175410 to 0x0:175521 [1328157.438516] Lustre: oak-OST0037: deleting orphan objects from 0x0:4441495 to 0x0:4441666 [1328174.069822] Lustre: oak-OST004f: deleting orphan objects from 0x0:175140 to 0x0:175266 [1328213.843797] Lustre: oak-OST003c: deleting orphan objects from 0x0:4437666 to 0x0:4437857 [1328229.577098] Lustre: 132528:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff8816e486e450 x1591002487469040/t0(0) o19->cd9072f9-71b4-1cfb-a759-fd6823f1a4a9@10.0.2.3@o2ib5:737/0 lens 336/0 e 0 to 0 dl 1519460887 ref 2 fl New:/0/ffffffff rc 0/-1 [1328229.578314] Lustre: 132528:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1055 previous similar messages [1328237.422323] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting [1328237.422829] Lustre: Skipped 409 previous similar messages [1328285.345436] LustreError: 196788:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519460637, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST004b_UUID lock: 
ffff8800402bd200/0x806f959362a7a79e lrc: 3/0,1 mode: --/PW res: [0x34cace:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 196788 timeout: 0 lvb_type: 0 [1328285.347182] LustreError: 196788:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) Skipped 1 previous similar message [1328298.200017] Lustre: oak-OST0049: deleting orphan objects from 0x0:3458851 to 0x0:3459041 [1328308.421691] Lustre: oak-OST0033: Connection restored to 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) [1328308.422165] Lustre: Skipped 439 previous similar messages [1328362.668954] Lustre: oak-OST004d: deleting orphan objects from 0x0:3465827 to 0x0:3466049 [1328365.912664] LNet: Service thread pid 147130 was inactive for 1204.37s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [1328365.913389] LNet: Skipped 9 previous similar messages [1328365.913632] Pid: 147130, comm: ll_ost01_018 [1328365.913879] Call Trace: [1328365.914351] [] schedule_preempt_disabled+0x29/0x70 [1328365.914599] [] __mutex_lock_slowpath+0xc7/0x1d0 [1328365.914858] [] mutex_lock+0x1f/0x2f [1328365.915109] [] ofd_create_hdl+0xdcb/0x2090 [ofd] [1328365.915410] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc] [1328365.915911] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc] [1328365.916403] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc] [1328365.916692] [] tgt_request_handle+0x925/0x1370 [ptlrpc] [1328365.916961] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1328365.917455] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1328365.917709] [] ? default_wake_function+0x12/0x20 [1328365.917962] [] ? __wake_up_common+0x58/0x90 [1328365.923715] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1328365.923983] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1328365.924233] [] kthread+0xcf/0xe0 [1328365.924478] [] ? do_exit+0x6bb/0xa40 [1328365.924725] [] ? kthread+0x0/0xe0 [1328365.924983] [] ret_from_fork+0x58/0x90 [1328365.925226] [] ? kthread+0x0/0xe0 [1328365.925470] [1328365.925710] LustreError: dumping log to /tmp/lustre-log.1519461018.147130 [1328419.922261] Lustre: oak-OST0041: deleting orphan objects from 0x0:4442179 to 0x0:4442241 [1328422.066186] Lustre: oak-OST004c: deleting orphan objects from 0x0:3455939 to 0x0:3456065 [1328464.212097] Pid: 196790, comm: ll_ost01_031 [1328464.212336] Call Trace: [1328464.212835] [] schedule+0x29/0x70 [1328464.213119] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1328464.213385] [] ? autoremove_wake_function+0x0/0x40 [1328464.213633] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1328464.214139] [] start_this_handle+0x1a1/0x430 [jbd2] [1328464.214398] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs] [1328464.214889] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1328464.215153] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1328464.215420] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1328464.215913] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1328464.216450] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1328464.216812] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc] [1328464.217345] [] tgt_client_new+0x41b/0x610 [ptlrpc] [1328464.217610] [] ofd_obd_connect+0x3a3/0x4c0 [ofd] [1328464.217894] [] target_handle_connect+0x1146/0x2a70 [ptlrpc] [1328464.218433] [] ? dequeue_entity+0x11c/0x5d0 [1328464.218708] [] ? dequeue_task_fair+0x3d0/0x660 [1328464.218982] [] ? 
__switch_to+0xd7/0x510 [1328464.219281] [] tgt_request_handle+0x402/0x1370 [ptlrpc] [1328464.219551] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1328464.220044] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1328464.220290] [] ? default_wake_function+0x0/0x20 [1328464.220556] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1328464.220822] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1328464.221063] [] kthread+0xcf/0xe0 [1328464.221302] [] ? kthread+0x0/0xe0 [1328464.221635] [] ret_from_fork+0x58/0x90 [1328464.221872] [] ? kthread+0x0/0xe0 [1328464.222111] [1328464.222340] LustreError: dumping log to /tmp/lustre-log.1519461116.196790 [1328464.223343] Pid: 362685, comm: ll_ost01_076 [1328464.223583] Call Trace: [1328464.224045] [] schedule+0x29/0x70 [1328464.224299] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1328464.224546] [] ? autoremove_wake_function+0x0/0x40 [1328464.224792] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1328464.225265] [] start_this_handle+0x1a1/0x430 [jbd2] [1328464.225518] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs] [1328464.225984] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1328464.226235] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1328464.226485] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1328464.226956] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1328464.227435] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1328464.227714] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc] [1328464.228231] [] tgt_client_new+0x41b/0x610 [ptlrpc] [1328464.228483] [] ofd_obd_connect+0x3a3/0x4c0 [ofd] [1328464.228749] [] target_handle_connect+0x1146/0x2a70 [ptlrpc] [1328464.229221] [] ? __enqueue_entity+0x78/0x80 [1328464.229465] [] ? enqueue_entity+0x26c/0xb60 [1328464.229736] [] tgt_request_handle+0x402/0x1370 [ptlrpc] [1328464.230005] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1328464.230512] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1328464.230759] [] ? default_wake_function+0x12/0x20 [1328464.231004] [] ? __wake_up_common+0x58/0x90 [1328464.231285] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1328464.231552] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1328464.231794] [] kthread+0xcf/0xe0 [1328464.232034] [] ? kthread+0x0/0xe0 [1328464.232277] [] ret_from_fork+0x58/0x90 [1328464.232520] [] ? kthread+0x0/0xe0 [1328464.232758] [1328464.232990] Pid: 210345, comm: ll_ost01_072 [1328464.233227] Call Trace: [1328464.233690] [] schedule+0x29/0x70 [1328464.233936] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1328464.234190] [] ? autoremove_wake_function+0x0/0x40 [1328464.234444] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1328464.234916] [] start_this_handle+0x1a1/0x430 [jbd2] [1328464.235174] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs] [1328464.235731] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1328464.235974] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1328464.236222] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1328464.236687] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1328464.237164] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1328464.237437] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc] [1328464.237935] [] tgt_client_new+0x41b/0x610 [ptlrpc] [1328464.238189] [] ofd_obd_connect+0x3a3/0x4c0 [ofd] [1328464.238455] [] target_handle_connect+0x1146/0x2a70 [ptlrpc] [1328464.238927] [] ? __enqueue_entity+0x78/0x80 [1328464.239177] [] ? enqueue_entity+0x26c/0xb60 [1328464.239450] [] tgt_request_handle+0x402/0x1370 [ptlrpc] [1328464.239722] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1328464.240233] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1328464.240479] [] ? 
default_wake_function+0x12/0x20 [1328464.240728] [] ? __wake_up_common+0x58/0x90 [1328464.240996] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1328464.241278] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1328464.241523] [] kthread+0xcf/0xe0 [1328464.241762] [] ? kthread+0x0/0xe0 [1328464.242002] [] ret_from_fork+0x58/0x90 [1328464.242245] [] ? kthread+0x0/0xe0 [1328464.242486] [1328464.242719] Pid: 196781, comm: ll_ost01_023 [1328464.242955] Call Trace: [1328464.243417] [] schedule+0x29/0x70 [1328464.243664] [] wait_transaction_locked+0x85/0xd0 [jbd2] [1328464.243908] [] ? autoremove_wake_function+0x0/0x40 [1328464.244159] [] add_transaction_credits+0x268/0x2f0 [jbd2] [1328464.244631] [] start_this_handle+0x1a1/0x430 [jbd2] [1328464.244883] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs] [1328464.245352] [] ? kmem_cache_alloc+0x1ba/0x1e0 [1328464.245601] [] jbd2__journal_start+0xf3/0x1e0 [jbd2] [1328464.245850] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1328464.246328] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs] [1328464.246801] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs] [1328464.247087] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc] [1328464.247586] [] tgt_client_new+0x41b/0x610 [ptlrpc] [1328464.247836] [] ofd_obd_connect+0x3a3/0x4c0 [ofd] [1328464.248114] [] target_handle_connect+0x1146/0x2a70 [ptlrpc] [1328464.248589] [] ? __enqueue_entity+0x78/0x80 [1328464.248830] [] ? enqueue_entity+0x26c/0xb60 [1328464.249073] [] ? ___slab_alloc+0x209/0x4f0 [1328464.249345] [] tgt_request_handle+0x402/0x1370 [ptlrpc] [1328464.249612] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc] [1328464.250204] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc] [1328464.250444] [] ? default_wake_function+0x12/0x20 [1328464.250684] [] ? __wake_up_common+0x58/0x90 [1328464.250946] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [1328464.251228] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc] [1328464.251473] [] kthread+0xcf/0xe0 [1328464.251713] [] ? kthread+0x0/0xe0 [1328464.251953] [] ret_from_fork+0x58/0x90 [1328464.252197] [] ? kthread+0x0/0xe0 [1328464.252438] [1328478.479600] Lustre: oak-OST0045: deleting orphan objects from 0x0:4421667 to 0x0:4421825 [1328496.574719] Lustre: oak-OST0034: deleting orphan objects from 0x0:4290565 to 0x0:4290785 [1328496.978585] LNet: Service thread pid 362708 was inactive for 1202.59s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one. 
[1328496.979061] LustreError: dumping log to /tmp/lustre-log.1519461149.362708
[1328501.074382] LustreError: dumping log to /tmp/lustre-log.1519461153.362701
[1328521.553422] LustreError: dumping log to /tmp/lustre-log.1519461174.362690
[1328524.397406] Lustre: oak-OST0048: deleting orphan objects from 0x0:3442499 to 0x0:3442657
[1328533.840875] LustreError: dumping log to /tmp/lustre-log.1519461186.21428
[1328587.210477] Lustre: oak-OST004b: deleting orphan objects from 0x0:3459791 to 0x0:3459819
[1328650.487545] Lustre: oak-OST0035: deleting orphan objects from 0x0:4361668 to 0x0:4361826
[1328653.423427] Lustre: oak-OST0043: deleting orphan objects from 0x0:4403912 to 0x0:4404033
[1328654.119299] Lustre: oak-OST003f: deleting orphan objects from 0x0:4426054 to 0x0:4426244
[1328657.591158] Lustre: oak-OST003e: deleting orphan objects from 0x0:4288655 to 0x0:4288801
[1328657.879112] Lustre: oak-OST0053: deleting orphan objects from 0x0:176199 to 0x0:176354
[1328718.364335] Lustre: oak-OST0031: deleting orphan objects from 0x0:4401667 to 0x0:4401825
[1328719.924357] Lustre: oak-OST0039: deleting orphan objects from 0x0:4347907 to 0x0:4348065
[1328720.876347] Lustre: oak-OST0051: deleting orphan objects from 0x0:176434 to 0x0:176545
[1328758.482498] Lustre: oak-OST0037: deleting orphan objects from 0x0:4441495 to 0x0:4441698
[1328774.705755] Lustre: oak-OST004f: deleting orphan objects from 0x0:175140 to 0x0:175298
[1328814.783882] Lustre: oak-OST003c: deleting orphan objects from 0x0:4437666 to 0x0:4437889
[1328830.979038] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff882883a71450 x1592239615714592/t0(0) o4->3ee95ef7-4278-ead7-52a3-bdca1c47a323@10.9.112.3@o2ib4:583/0 lens 608/0 e 0 to 0 dl 1519461488 ref 2 fl New:H/2/ffffffff rc 0/-1
[1328830.980238] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1063 previous similar messages
[1328838.426519] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting
[1328838.427058] Lustre: Skipped 404 previous similar messages
[1328897.883933] Lustre: oak-OST0033: deleting orphan objects from 0x0:4424462 to 0x0:4424545
[1328899.707902] Lustre: oak-OST0049: deleting orphan objects from 0x0:3458851 to 0x0:3459073
[1328909.389795] Lustre: oak-OST0033: Connection restored to 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4)
[1328909.390277] Lustre: Skipped 407 previous similar messages
[1328912.955292] Lustre: oak-OST0050: deleting orphan objects from 0x0:175410 to 0x0:175553
[1328963.570884] Lustre: oak-OST004d: deleting orphan objects from 0x0:3465827 to 0x0:3466081
[1329020.694263] Lustre: oak-OST0041: deleting orphan objects from 0x0:4442179 to 0x0:4442273
[1329023.062154] Lustre: oak-OST004c: deleting orphan objects from 0x0:3455939 to 0x0:3456097
[1329079.923553] Lustre: oak-OST0045: deleting orphan objects from 0x0:4421667 to 0x0:4421857
[1329098.834647] Lustre: oak-OST0034: deleting orphan objects from 0x0:4290565 to 0x0:4290817
[1329119.541585] LNet: Service thread pid 210322 was inactive for 1201.04s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1329119.542303] LNet: Skipped 4 previous similar messages
[1329119.542549] Pid: 210322, comm: ll_ost01_050
[1329119.542787] Call Trace:
[1329119.543271] [] schedule_preempt_disabled+0x29/0x70
[1329119.543514] [] __mutex_lock_slowpath+0xc7/0x1d0
[1329119.543764] [] mutex_lock+0x1f/0x2f
[1329119.544011] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1329119.544313] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1329119.544823] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1329119.545312] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1329119.545596] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1329119.545865] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1329119.546355] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1329119.546605] [] ? default_wake_function+0x12/0x20
[1329119.546867] [] ? __wake_up_common+0x58/0x90
[1329119.547132] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1329119.547393] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1329119.547641] [] kthread+0xcf/0xe0
[1329119.547877] [] ? kthread+0x0/0xe0
[1329119.548117] [] ret_from_fork+0x58/0x90
[1329119.548357] [] ? kthread+0x0/0xe0
[1329119.548595]
[1329119.548924] LustreError: dumping log to /tmp/lustre-log.1519461772.210322
[1329126.153309] Lustre: oak-OST0048: deleting orphan objects from 0x0:3442499 to 0x0:3442689
[1329159.288722] LustreError: 131982:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519461511, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0041_UUID lock: ffff88003bfc9c00/0x806f959362a7c3a5 lrc: 3/0,1 mode: --/PW res: [0x43c842:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 131982 timeout: 0 lvb_type: 0
[1329188.334639] Lustre: oak-OST004b: deleting orphan objects from 0x0:3459791 to 0x0:3459851
[1329189.170307] Pid: 4902, comm: ll_ost01_113
[1329189.170546] Call Trace:
[1329189.171030] [] schedule+0x29/0x70
[1329189.171312] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1329189.171572] [] ? autoremove_wake_function+0x0/0x40
[1329189.171820] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1329189.172302] [] ? cfs_hash_buckets_realloc+0x1b3/0x660 [libcfs]
[1329189.172787] [] start_this_handle+0x1a1/0x430 [jbd2]
[1329189.173045] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs]
[1329189.173522] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1329189.173786] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1329189.174035] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1329189.174513] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1329189.174985] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1329189.175300] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc]
[1329189.175804] [] tgt_client_new+0x41b/0x610 [ptlrpc]
[1329189.176056] [] ofd_obd_connect+0x3a3/0x4c0 [ofd]
[1329189.176330] [] target_handle_connect+0x1146/0x2a70 [ptlrpc]
[1329189.176815] [] ? __enqueue_entity+0x78/0x80
[1329189.177054] [] ? enqueue_entity+0x26c/0xb60
[1329189.177421] [] tgt_request_handle+0x402/0x1370 [ptlrpc]
[1329189.177687] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1329189.178167] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1329189.178412] [] ? default_wake_function+0x12/0x20
[1329189.178660] [] ? __wake_up_common+0x58/0x90
[1329189.178957] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1329189.179225] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1329189.179471] [] kthread+0xcf/0xe0
[1329189.179713] [] ? kthread+0x0/0xe0
[1329189.179953] [] ret_from_fork+0x58/0x90
[1329189.180191] [] ? kthread+0x0/0xe0
[1329189.180433]
[1329189.180662] LustreError: dumping log to /tmp/lustre-log.1519461841.4902
[1329189.181518] Pid: 133457, comm: ll_ost01_013
[1329189.181758] Call Trace:
[1329189.182237] [] schedule+0x29/0x70
[1329189.182490] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1329189.182759] [] ? autoremove_wake_function+0x0/0x40
[1329189.183023] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1329189.183517] [] ? cfs_hash_buckets_realloc+0x1b3/0x660 [libcfs]
[1329189.184024] [] start_this_handle+0x1a1/0x430 [jbd2]
[1329189.184312] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs]
[1329189.184799] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1329189.185061] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1329189.185315] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1329189.185790] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1329189.186272] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1329189.186552] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc]
[1329189.187051] [] tgt_client_new+0x41b/0x610 [ptlrpc]
[1329189.187301] [] ofd_obd_connect+0x3a3/0x4c0 [ofd]
[1329189.187564] [] target_handle_connect+0x1146/0x2a70 [ptlrpc]
[1329189.188031] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1329189.188289] [] ? lnet_md_alloc.isra.5+0x145/0x300 [lnet]
[1329189.188536] [] ? LNetMDAttach+0x3f4/0x450 [lnet]
[1329189.188807] [] tgt_request_handle+0x402/0x1370 [ptlrpc]
[1329189.189075] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1329189.189572] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1329189.189841] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1329189.190104] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1329189.190352] [] kthread+0xcf/0xe0
[1329189.190593] [] ? kthread+0x0/0xe0
[1329189.190833] [] ret_from_fork+0x58/0x90
[1329189.191070] [] ? kthread+0x0/0xe0
[1329189.191308]
[1329189.191631] Pid: 196788, comm: ll_ost01_029
[1329189.197215] Call Trace:
[1329189.197700] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
[1329189.198170] [] schedule+0x29/0x70
[1329189.198433] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
[1329189.198677] [] ? default_wake_function+0x0/0x20
[1329189.198938] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
[1329189.199426] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1329189.199691] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1329189.199938] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd]
[1329189.200198] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1329189.200463] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1329189.200713] [] ofd_destroy_hdl+0x267/0x970 [ofd]
[1329189.200986] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1329189.201254] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1329189.201751] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1329189.201996] [] ? default_wake_function+0x12/0x20
[1329189.202240] [] ? __wake_up_common+0x58/0x90
[1329189.202509] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1329189.202776] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1329189.203020] [] kthread+0xcf/0xe0
[1329189.203261] [] ? kthread+0x0/0xe0
[1329189.203502] [] ret_from_fork+0x58/0x90
[1329189.203743] [] ? kthread+0x0/0xe0
[1329189.203981]
[1329189.204213] Pid: 196787, comm: ll_ost01_028
[1329189.204452] Call Trace:
[1329189.204911] [] schedule+0x29/0x70
[1329189.205152] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1329189.205394] [] ? autoremove_wake_function+0x0/0x40
[1329189.205733] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1329189.206196] [] start_this_handle+0x1a1/0x430 [jbd2]
[1329189.206447] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs]
[1329189.206908] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1329189.207151] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1329189.207400] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1329189.207871] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1329189.208345] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1329189.208616] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc]
[1329189.209110] [] tgt_client_new+0x41b/0x610 [ptlrpc]
[1329189.209356] [] ofd_obd_connect+0x3a3/0x4c0 [ofd]
[1329189.209618] [] target_handle_connect+0x1146/0x2a70 [ptlrpc]
[1329189.210091] [] ? __enqueue_entity+0x78/0x80
[1329189.210337] [] ? enqueue_entity+0x26c/0xb60
[1329189.210609] [] tgt_request_handle+0x402/0x1370 [ptlrpc]
[1329189.210880] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1329189.211376] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1329189.211619] [] ? default_wake_function+0x12/0x20
[1329189.211863] [] ? __wake_up_common+0x58/0x90
[1329189.212128] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1329189.212397] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1329189.212639] [] kthread+0xcf/0xe0
[1329189.212879] [] ? kthread+0x0/0xe0
[1329189.213117] [] ret_from_fork+0x58/0x90
[1329189.213360] [] ? kthread+0x0/0xe0
[1329189.213599]
[1329189.213831] LNet: Service thread pid 131961 was inactive for 1202.75s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1329189.214312] LNet: Skipped 3 previous similar messages
[1329251.331542] Lustre: oak-OST0035: deleting orphan objects from 0x0:4361668 to 0x0:4361858
[1329254.703224] LustreError: dumping log to /tmp/lustre-log.1519461907.159218
[1329255.195326] Lustre: oak-OST003f: deleting orphan objects from 0x0:4426054 to 0x0:4426276
[1329258.491151] Lustre: oak-OST003e: deleting orphan objects from 0x0:4288655 to 0x0:4288833
[1329258.799072] LustreError: dumping log to /tmp/lustre-log.1519461911.131984
[1329279.278081] LustreError: dumping log to /tmp/lustre-log.1519461931.362706
[1329291.565546] LustreError: dumping log to /tmp/lustre-log.1519461944.4889
[1329319.584312] Lustre: oak-OST0031: deleting orphan objects from 0x0:4401667 to 0x0:4401857
[1329320.488315] Lustre: oak-OST0039: deleting orphan objects from 0x0:4347907 to 0x0:4348097
[1329360.246472] Lustre: oak-OST0037: deleting orphan objects from 0x0:4441495 to 0x0:4441730
[1329376.253684] Lustre: oak-OST004f: deleting orphan objects from 0x0:175140 to 0x0:175330
[1329410.644150] Lustre: oak-OST0043: deleting orphan objects from 0x0:4403912 to 0x0:4404065
[1329413.523931] Lustre: oak-OST0053: deleting orphan objects from 0x0:176199 to 0x0:176386
[1329416.251838] Lustre: oak-OST003c: deleting orphan objects from 0x0:4437666 to 0x0:4437921
[1329431.732994] Lustre: 132528:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff8800223c1450 x1591002487779920/t0(0) o19->cd9072f9-71b4-1cfb-a759-fd6823f1a4a9@10.0.2.3@o2ib5:429/0 lens 336/0 e 0 to 0 dl 1519462089 ref 2 fl New:/0/ffffffff rc 0/-1
[1329431.734163] Lustre: 132528:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1036 previous similar messages
[1329439.398479] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting
[1329439.398974] Lustre: Skipped 404 previous similar messages
[1329477.248953] Lustre: oak-OST0051: deleting orphan objects from 0x0:176434 to 0x0:176577
[1329500.783901] Lustre: oak-OST0049: deleting orphan objects from 0x0:3458851 to 0x0:3459105
[1329510.456830] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6)
[1329510.457337] Lustre: Skipped 407 previous similar messages
[1329564.556952] Lustre: oak-OST004d: deleting orphan objects from 0x0:3465827 to 0x0:3466113
[1329621.962361] Lustre: oak-OST0041: deleting orphan objects from 0x0:4442179 to 0x0:4442305
[1329623.994154] Lustre: oak-OST004c: deleting orphan objects from 0x0:3455939 to 0x0:3456129
[1329653.944844] Lustre: oak-OST0033: deleting orphan objects from 0x0:4424547 to 0x0:4424577
[1329669.128086] Lustre: oak-OST0050: deleting orphan objects from 0x0:175410 to 0x0:175585
[1329681.863452] Lustre: oak-OST0045: deleting orphan objects from 0x0:4421667 to 0x0:4421889
[1329699.862644] Lustre: oak-OST0034: deleting orphan objects from 0x0:4290565 to 0x0:4290849
[1329729.685212] Lustre: oak-OST0048: deleting orphan objects from 0x0:3442499 to 0x0:3442721
[1329790.266365] Lustre: oak-OST004b: deleting orphan objects from 0x0:3459791 to 0x0:3459883
[1329813.956128] LustreError: 362689:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519462166, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0046_UUID lock: ffff88004720d200/0x806f959362a7de32 lrc: 3/0,1 mode: --/PW res: [0x44152a:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 362689 timeout: 0 lvb_type: 0
[1329852.999451] Lustre: oak-OST0035: deleting orphan objects from 0x0:4361668 to 0x0:4361890
[1329856.575198] Lustre: oak-OST003f: deleting orphan objects from 0x0:4426054 to 0x0:4426308
[1329859.695232] Lustre: oak-OST003e: deleting orphan objects from 0x0:4288655 to 0x0:4288865
[1329877.266201] LNet: Service thread pid 131983 was inactive for 1201.80s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1329877.266926] LNet: Skipped 4 previous similar messages
[1329877.267173] Pid: 131983, comm: ll_ost01_004
[1329877.267414] Call Trace:
[1329877.267887] [] schedule_preempt_disabled+0x29/0x70
[1329877.268133] [] __mutex_lock_slowpath+0xc7/0x1d0
[1329877.268383] [] mutex_lock+0x1f/0x2f
[1329877.268632] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1329877.268942] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1329877.269447] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1329877.269944] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1329877.270234] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1329877.270507] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1329877.271001] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1329877.271250] [] ? default_wake_function+0x12/0x20
[1329877.271496] [] ? __wake_up_common+0x58/0x90
[1329877.271764] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1329877.272032] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1329877.272276] [] kthread+0xcf/0xe0
[1329877.272514] [] ? kthread+0x0/0xe0
[1329877.272756] [] ret_from_fork+0x58/0x90
[1329877.272996] [] ? kthread+0x0/0xe0
[1329877.273239]
[1329877.273470] LustreError: dumping log to /tmp/lustre-log.1519462529.131983
[1329921.468282] Lustre: oak-OST0031: deleting orphan objects from 0x0:4401667 to 0x0:4401889
[1329921.692246] Lustre: oak-OST0039: deleting orphan objects from 0x0:4347907 to 0x0:4348129
[1329961.274470] Lustre: oak-OST0037: deleting orphan objects from 0x0:4441495 to 0x0:4441762
[1329977.169709] Lustre: oak-OST004f: deleting orphan objects from 0x0:175140 to 0x0:175362
[1330012.427860] Pid: 362703, comm: ll_ost01_094
[1330012.428113] Call Trace:
[1330012.428584] [] schedule_preempt_disabled+0x29/0x70
[1330012.428832] [] __mutex_lock_slowpath+0xc7/0x1d0
[1330012.429087] [] mutex_lock+0x1f/0x2f
[1330012.429338] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1330012.429639] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1330012.430140] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1330012.430633] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1330012.430919] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1330012.431189] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1330012.431682] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1330012.431932] [] ? default_wake_function+0x12/0x20
[1330012.432177] [] ? __wake_up_common+0x58/0x90
[1330012.432437] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1330012.432792] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1330012.433037] [] kthread+0xcf/0xe0
[1330012.433273] [] ? kthread+0x0/0xe0
[1330012.433513] [] ret_from_fork+0x58/0x90
[1330012.433750] [] ? kthread+0x0/0xe0
[1330012.433992]
[1330012.434225] LustreError: dumping log to /tmp/lustre-log.1519462665.362703
[1330016.523714] Pid: 6921, comm: ll_ost01_121
[1330016.523957] Call Trace:
[1330016.524429] [] schedule_preempt_disabled+0x29/0x70
[1330016.524680] [] __mutex_lock_slowpath+0xc7/0x1d0
[1330016.524925] [] mutex_lock+0x1f/0x2f
[1330016.525176] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1330016.525460] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1330016.525959] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1330016.526450] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1330016.526731] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1330016.527001] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1330016.527468] [] ? default_wake_function+0x12/0x20
[1330016.527716] [] ? __wake_up_common+0x58/0x90
[1330016.527979] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1330016.528243] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1330016.528482] [] kthread+0xcf/0xe0
[1330016.528726] [] ? kthread+0x0/0xe0
[1330016.528967] [] ret_from_fork+0x58/0x90
[1330016.529207] [] ? kthread+0x0/0xe0
[1330016.529444]
[1330016.529679] LustreError: dumping log to /tmp/lustre-log.1519462669.6921
[1330017.167794] Lustre: oak-OST003c: deleting orphan objects from 0x0:4437666 to 0x0:4437953
[1330033.098941] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff883974e4ac50 x1592239615714592/t0(0) o4->3ee95ef7-4278-ead7-52a3-bdca1c47a323@10.9.112.3@o2ib4:275/0 lens 608/0 e 0 to 0 dl 1519462690 ref 2 fl New:H/2/ffffffff rc 0/-1
[1330033.100278] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1059 previous similar messages
[1330037.002743] Pid: 210339, comm: ll_ost01_066
[1330037.002983] Call Trace:
[1330037.003448] [] schedule_preempt_disabled+0x29/0x70
[1330037.003691] [] __mutex_lock_slowpath+0xc7/0x1d0
[1330037.003956] [] mutex_lock+0x1f/0x2f
[1330037.004309] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1330037.004615] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1330037.005104] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1330037.005596] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1330037.005884] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1330037.006156] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1330037.006649] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1330037.006899] [] ? default_wake_function+0x12/0x20
[1330037.007144] [] ? __wake_up_common+0x58/0x90
[1330037.007412] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1330037.007679] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1330037.007926] [] kthread+0xcf/0xe0
[1330037.008165] [] ? kthread+0x0/0xe0
[1330037.008407] [] ret_from_fork+0x58/0x90
[1330037.008647] [] ? kthread+0x0/0xe0
[1330037.008888]
[1330037.009121] LustreError: dumping log to /tmp/lustre-log.1519462689.210339
[1330040.544595] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting
[1330040.545088] Lustre: Skipped 403 previous similar messages
[1330045.194326] Pid: 4907, comm: ll_ost01_114
[1330045.194565] Call Trace:
[1330045.195027] [] schedule_preempt_disabled+0x29/0x70
[1330045.195271] [] __mutex_lock_slowpath+0xc7/0x1d0
[1330045.195520] [] mutex_lock+0x1f/0x2f
[1330045.195766] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1330045.196055] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1330045.196546] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1330045.197048] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1330045.197330] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1330045.197610] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1330045.198114] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1330045.198362] [] ? default_wake_function+0x12/0x20
[1330045.198605] [] ? __wake_up_common+0x58/0x90
[1330045.198867] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1330045.199130] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1330045.199377] [] kthread+0xcf/0xe0
[1330045.199631] [] ? kthread+0x0/0xe0
[1330045.199889] [] ret_from_fork+0x58/0x90
[1330045.200132] [] ? kthread+0x0/0xe0
[1330045.200391]
[1330045.200644] LustreError: dumping log to /tmp/lustre-log.1519462697.4907
[1330061.577594] LNet: Service thread pid 131982 was inactive for 1202.33s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1330061.578080] LNet: Skipped 4 previous similar messages
[1330061.578321] LustreError: dumping log to /tmp/lustre-log.1519462714.131982
[1330105.123729] Lustre: oak-OST0049: deleting orphan objects from 0x0:3458851 to 0x0:3459137
[1330111.429825] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6)
[1330111.430392] Lustre: Skipped 402 previous similar messages
[1330165.418853] Lustre: oak-OST004d: deleting orphan objects from 0x0:3465827 to 0x0:3466145
[1330166.744890] Lustre: oak-OST0043: deleting orphan objects from 0x0:4403912 to 0x0:4404097
[1330169.776752] Lustre: oak-OST0053: deleting orphan objects from 0x0:176199 to 0x0:176418
[1330222.934158] Lustre: oak-OST0041: deleting orphan objects from 0x0:4442179 to 0x0:4442337
[1330225.182118] Lustre: oak-OST004c: deleting orphan objects from 0x0:3455939 to 0x0:3456161
[1330232.773753] Lustre: oak-OST0051: deleting orphan objects from 0x0:176434 to 0x0:176609
[1330269.772092] Lustre: oak-OST0046: deleting orphan objects from 0x0:4461867 to 0x0:4461889
[1330282.995447] Lustre: oak-OST0045: deleting orphan objects from 0x0:4421667 to 0x0:4421921
[1330300.994576] Lustre: oak-OST0034: deleting orphan objects from 0x0:4290565 to 0x0:4290881
[1330330.393244] Lustre: oak-OST0048: deleting orphan objects from 0x0:3442499 to 0x0:3442753
[1330391.038317] Lustre: oak-OST004b: deleting orphan objects from 0x0:3459791 to 0x0:3459915
[1330410.469405] Lustre: oak-OST0033: deleting orphan objects from 0x0:4424580 to 0x0:4424609
[1330413.230156] LustreError: 4900:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519462765, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0033_UUID lock: ffff880003197e00/0x806f959362a80935 lrc: 3/0,1 mode: --/PW res: [0x438362:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 4900 timeout: 0 lvb_type: 0
[1330425.476685] Lustre: oak-OST0050: deleting orphan objects from 0x0:175410 to 0x0:175617
[1330453.899504] Lustre: oak-OST0035: deleting orphan objects from 0x0:4361668 to 0x0:4361922
[1330457.675189] Lustre: oak-OST003f: deleting orphan objects from 0x0:4426054 to 0x0:4426340
[1330460.339106] Lustre: oak-OST003e: deleting orphan objects from 0x0:4288655 to 0x0:4288897
[1330522.976277] Lustre: oak-OST0039: deleting orphan objects from 0x0:4347907 to 0x0:4348161
[1330525.312074] Lustre: oak-OST0031: deleting orphan objects from 0x0:4401667 to 0x0:4401921
[1330536.922481] LustreError: 210321:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519462889, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0033_UUID lock: ffff88066706bc00/0x806f959362a80ebb lrc: 3/0,1 mode: --/PW res: [0x438382:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 210321 timeout: 0 lvb_type: 0
[1330562.030417] Lustre: oak-OST0037: deleting orphan objects from 0x0:4441495 to 0x0:4441794
[1330578.061682] Lustre: oak-OST004f: deleting orphan objects from 0x0:175140 to 0x0:175394
[1330613.004946] Lustre: oak-OST0032: Export ffff88227aea4800 already connecting from 10.9.112.15@o2ib4
[1330613.010855] Lustre: Skipped 1 previous similar message
[1330618.027822] Lustre: oak-OST003c: deleting orphan objects from 0x0:4437666 to 0x0:4437985
[1330634.158861] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff88378efa6050 x1592239615714592/t0(0) o4->3ee95ef7-4278-ead7-52a3-bdca1c47a323@10.9.112.3@o2ib4:121/0 lens 608/0 e 0 to 0 dl 1519463291 ref 2 fl New:H/2/ffffffff rc 0/-1
[1330634.160055] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1064 previous similar messages
[1330634.990848] LNet: Service thread pid 210326 was inactive for 1202.56s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1330634.991588] LNet: Skipped 4 previous similar messages
[1330634.991834] Pid: 210326, comm: ll_ost01_053
[1330634.992109] Call Trace:
[1330634.992592] [] schedule_preempt_disabled+0x29/0x70
[1330634.992840] [] __mutex_lock_slowpath+0xc7/0x1d0
[1330634.993089] [] mutex_lock+0x1f/0x2f
[1330634.993337] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1330634.993638] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1330634.994176] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1330634.994668] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1330634.994956] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1330634.995224] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1330634.995722] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1330634.995977] [] ? default_wake_function+0x12/0x20
[1330634.996223] [] ? __wake_up_common+0x58/0x90
[1330634.996488] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1330634.996757] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1330634.997009] [] kthread+0xcf/0xe0
[1330634.997250] [] ? kthread+0x0/0xe0
[1330634.997497] [] ret_from_fork+0x58/0x90
[1330634.997740] [] ? kthread+0x0/0xe0
[1330634.997984]
[1330634.998218] LustreError: dumping log to /tmp/lustre-log.1519463287.210326
[1330641.499699] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting
[1330641.500206] Lustre: Skipped 409 previous similar messages
[1330653.186959] LustreError: 4892:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519463005, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0047_UUID lock: ffff880022b19600/0x806f959362a8121f lrc: 3/0,1 mode: --/PW res: [0x4331b4:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 4892 timeout: 0 lvb_type: 0
[1330663.002892] Lustre: oak-OST0030: Export ffff882e8fc84800 already connecting from 10.9.112.15@o2ib4
[1330663.003368] Lustre: Skipped 2 previous similar messages
[1330705.719655] Lustre: oak-OST0049: deleting orphan objects from 0x0:3459139 to 0x0:3459169
[1330712.400977] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6)
[1330712.401466] Lustre: Skipped 432 previous similar messages
[1330716.907013] Pid: 362689, comm: ll_ost01_080
[1330716.907261] Call Trace:
[1330716.907791] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
[1330716.908268] [] schedule+0x29/0x70
[1330716.908531] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
[1330716.908779] [] ? default_wake_function+0x0/0x20
[1330716.909045] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
[1330716.909532] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1330716.909799] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1330716.910056] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd]
[1330716.910320] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1330716.910585] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1330716.910832] [] ofd_destroy_hdl+0x267/0x970 [ofd]
[1330716.911121] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1330716.911393] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1330716.911886] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1330716.912134] [] ? default_wake_function+0x12/0x20
[1330716.912379] [] ? __wake_up_common+0x58/0x90
[1330716.912645] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1330716.912912] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1330716.913160] [] kthread+0xcf/0xe0
[1330716.913399] [] ? kthread+0x0/0xe0
[1330716.913653] [] ret_from_fork+0x58/0x90
[1330716.913892] [] ? kthread+0x0/0xe0
[1330716.914134]
[1330716.914366] LustreError: dumping log to /tmp/lustre-log.1519463369.362689
[1330766.056760] Pid: 210327, comm: ll_ost01_054
[1330766.057001] Call Trace:
[1330766.057465] [] schedule_preempt_disabled+0x29/0x70
[1330766.057717] [] __mutex_lock_slowpath+0xc7/0x1d0
[1330766.057996] [] mutex_lock+0x1f/0x2f
[1330766.058244] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1330766.058552] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1330766.059083] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1330766.059573] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1330766.059951] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1330766.060226] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1330766.060726] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1330766.060970] [] ? default_wake_function+0x12/0x20
[1330766.061216] [] ? __wake_up_common+0x58/0x90
[1330766.061492] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1330766.061772] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1330766.062019] [] kthread+0xcf/0xe0
[1330766.062260] [] ? kthread+0x0/0xe0
[1330766.062500] [] ret_from_fork+0x58/0x90
[1330766.062744] [] ? kthread+0x0/0xe0
[1330766.062984]
[1330766.063218] LustreError: dumping log to /tmp/lustre-log.1519463418.210327
[1330766.348852] Lustre: oak-OST004d: deleting orphan objects from 0x0:3465827 to 0x0:3466177
[1330774.248316] Pid: 4895, comm: ll_ost01_106
[1330774.248557] Call Trace:
[1330774.249024] [] schedule_preempt_disabled+0x29/0x70
[1330774.249270] [] __mutex_lock_slowpath+0xc7/0x1d0
[1330774.249520] [] mutex_lock+0x1f/0x2f
[1330774.249766] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1330774.250059] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1330774.250564] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1330774.251062] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1330774.251349] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1330774.251628] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1330774.252132] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1330774.252384] [] ? default_wake_function+0x12/0x20
[1330774.252630] [] ? __wake_up_common+0x58/0x90
[1330774.252906] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1330774.253182] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1330774.253428] [] kthread+0xcf/0xe0
[1330774.253668] [] ? kthread+0x0/0xe0
[1330774.253909] [] ret_from_fork+0x58/0x90
[1330774.254152] [] ? kthread+0x0/0xe0
[1330774.254398]
[1330774.254631] LustreError: dumping log to /tmp/lustre-log.1519463427.4895
[1330777.229187] LustreError: 210338:0:(ldlm_request.c:130:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1519463129, 300s ago); not entering recovery in server code, just going back to sleep ns: filter-oak-OST0033_UUID lock: ffff8815a477d000/0x806f959362a81774 lrc: 3/0,1 mode: --/PW res: [0x438383:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010080000000 nid: local remote: 0x0 expref: -99 pid: 210338 timeout: 0 lvb_type: 0
[1330790.631589] Pid: 4914, comm: ll_ost01_117
[1330790.631826] Call Trace:
[1330790.632285] [] schedule_preempt_disabled+0x29/0x70
[1330790.632534] [] __mutex_lock_slowpath+0xc7/0x1d0
[1330790.632781] [] mutex_lock+0x1f/0x2f
[1330790.633030] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1330790.633329] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1330790.633826] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1330790.634319] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1330790.634606] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1330790.634877] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1330790.635370] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1330790.635623] [] ? default_wake_function+0x12/0x20
[1330790.635867] [] ? __wake_up_common+0x58/0x90
[1330790.636134] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1330790.636401] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1330790.636649] [] kthread+0xcf/0xe0
[1330790.636889] [] ? kthread+0x0/0xe0
[1330790.637134] [] ret_from_fork+0x58/0x90
[1330790.637378] [] ? kthread+0x0/0xe0
[1330790.637621]
[1330790.637857] LustreError: dumping log to /tmp/lustre-log.1519463443.4914
[1330802.919070] LNet: Service thread pid 210335 was inactive for 1201.22s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
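Every stack trace dumped in this stretch has one of three shapes: an OST thread stuck in jbd2 wait_transaction_locked under osd_trans_start, a thread queued on the mutex taken in ofd_create_hdl, or a thread parked in ldlm_completion_ast waiting on an extent lock. A rough companion sketch, under the same assumption of a saved console.log; the split on the "Pid: " header that precedes each "Call Trace:" is a heuristic, not an exact parser:

    from collections import Counter

    # Bucket each dumped stack trace by the frame that shows what the
    # thread is blocked on. The markers are the signatures that recur
    # in this log; anything else lands in "other".
    SIGNATURES = [
        ("blocked in jbd2 (wait_transaction_locked)", "wait_transaction_locked"),
        ("blocked on ofd_create_hdl mutex", "__mutex_lock_slowpath"),
        ("waiting in ldlm_completion_ast", "ldlm_completion_ast"),
    ]

    def classify(path="console.log"):
        chunks = open(path, errors="replace").read().split("Pid: ")[1:]
        buckets = Counter()
        for chunk in chunks:
            label = next((name for name, marker in SIGNATURES if marker in chunk),
                         "other")
            buckets[label] += 1
        for label, count in buckets.most_common():
            print(f"{count:4d}  {label}")

    if __name__ == "__main__":
        classify()

The three buckets are consistent with a single stall: threads serializing behind a stuck journal transaction, with connect and create handlers piling up behind them.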
[1330802.919553] LustreError: dumping log to /tmp/lustre-log.1519463455.210335
[1330823.434115] Lustre: oak-OST0041: deleting orphan objects from 0x0:4442179 to 0x0:4442369
[1330826.097988] Lustre: oak-OST004c: deleting orphan objects from 0x0:3455939 to 0x0:3456193
[1330885.999202] Lustre: oak-OST0045: deleting orphan objects from 0x0:4421667 to 0x0:4421953
[1330901.614492] Lustre: oak-OST0034: deleting orphan objects from 0x0:4290565 to 0x0:4290913
[1330922.653682] Lustre: oak-OST0043: deleting orphan objects from 0x0:4403912 to 0x0:4404129
[1330926.269480] Lustre: oak-OST0053: deleting orphan objects from 0x0:176199 to 0x0:176450
[1330931.549078] Lustre: oak-OST0048: deleting orphan objects from 0x0:3442499 to 0x0:3442785
[1330988.762552] Lustre: oak-OST0051: deleting orphan objects from 0x0:176434 to 0x0:176641
[1330991.370346] Lustre: oak-OST004b: deleting orphan objects from 0x0:3459791 to 0x0:3459947
[1331025.768721] Lustre: oak-OST0046: deleting orphan objects from 0x0:4461867 to 0x0:4461921
[1331054.831407] Lustre: oak-OST0035: deleting orphan objects from 0x0:4361668 to 0x0:4361954
[1331058.799129] Lustre: oak-OST003f: deleting orphan objects from 0x0:4426054 to 0x0:4426372
[1331061.479006] Lustre: oak-OST003e: deleting orphan objects from 0x0:4288655 to 0x0:4288929
[1331234.962859] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff88012009ec50 x1592239615714592/t0(0) o4->3ee95ef7-4278-ead7-52a3-bdca1c47a323@10.9.112.3@o2ib4:722/0 lens 608/0 e 0 to 0 dl 1519463892 ref 2 fl New:H/2/ffffffff rc 0/-1
[1331234.964078] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1050 previous similar messages
[1331237.075132] LNet: Service thread pid 21433 was inactive for 1202.67s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1331237.075850] LNet: Skipped 4 previous similar messages
[1331237.076089] Pid: 21433, comm: ll_ost01_045
[1331237.076327] Call Trace:
[1331237.076802] [] schedule_preempt_disabled+0x29/0x70
[1331237.077047] [] __mutex_lock_slowpath+0xc7/0x1d0
[1331237.077290] [] mutex_lock+0x1f/0x2f
[1331237.077537] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1331237.077843] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1331237.078342] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1331237.078843] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1331237.079129] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1331237.079399] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1331237.079894] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1331237.080143] [] ? default_wake_function+0x12/0x20
[1331237.080391] [] ? __wake_up_common+0x58/0x90
[1331237.080657] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1331237.080929] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1331237.081176] [] kthread+0xcf/0xe0
[1331237.081421] [] ? kthread+0x0/0xe0
[1331237.081668] [] ret_from_fork+0x58/0x90
[1331237.081913] [] ? kthread+0x0/0xe0
[1331237.082156]
[1331237.082394] LustreError: dumping log to /tmp/lustre-log.1519463889.21433
[1331243.187920] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting
[1331243.188448] Lustre: Skipped 417 previous similar messages
[1331313.373809] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6)
[1331313.374285] Lustre: Skipped 417 previous similar messages
[1331314.895170] Pid: 4900, comm: ll_ost01_111
[1331314.895416] Call Trace:
[1331314.895952] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
[1331314.896724] [] schedule+0x29/0x70
[1331314.896990] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
[1331314.897244] [] ? default_wake_function+0x0/0x20
[1331314.897515] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
[1331314.898010] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1331314.898287] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1331314.898539] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd]
[1331314.898810] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1331314.899085] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1331314.899335] [] ofd_destroy_hdl+0x267/0x970 [ofd]
[1331314.899621] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1331314.899896] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1331314.900401] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1331314.900648] [] ? default_wake_function+0x12/0x20
[1331314.900898] [] ? __wake_up_common+0x58/0x90
[1331314.901172] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1331314.901440] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1331314.901775] [] kthread+0xcf/0xe0
[1331314.902011] [] ? kthread+0x0/0xe0
[1331314.902252] [] ret_from_fork+0x58/0x90
[1331314.902489] [] ? kthread+0x0/0xe0
[1331314.902726]
[1331314.902964] LustreError: dumping log to /tmp/lustre-log.1519463967.4900
[1331437.769390] Pid: 210321, comm: ll_ost01_049
[1331437.769637] Call Trace:
[1331437.770173] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
[1331437.770660] [] schedule+0x29/0x70
[1331437.770931] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
[1331437.771191] [] ? default_wake_function+0x0/0x20
[1331437.771474] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
[1331437.771973] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1331437.772247] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1331437.772500] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd]
[1331437.772771] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1331437.773121] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1331437.773371] [] ofd_destroy_hdl+0x267/0x970 [ofd]
[1331437.773659] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1331437.773936] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1331437.774450] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1331437.774697] [] ? default_wake_function+0x12/0x20
[1331437.774941] [] ? __wake_up_common+0x58/0x90
[1331437.775220] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1331437.775504] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1331437.775754] [] kthread+0xcf/0xe0
[1331437.775997] [] ? kthread+0x0/0xe0
[1331437.776241] [] ret_from_fork+0x58/0x90
[1331437.776488] [] ? kthread+0x0/0xe0
[1331437.776731]
[1331437.776966] LustreError: dumping log to /tmp/lustre-log.1519464090.210321
[1331523.781339] Pid: 21424, comm: ll_ost01_036
[1331523.781583] Call Trace:
[1331523.782057] [] schedule_preempt_disabled+0x29/0x70
[1331523.782305] [] __mutex_lock_slowpath+0xc7/0x1d0
[1331523.782574] [] mutex_lock+0x1f/0x2f
[1331523.782825] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1331523.783133] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1331523.783631] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1331523.784148] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1331523.784466] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1331523.784738] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1331523.785245] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1331523.785502] [] ? default_wake_function+0x12/0x20
[1331523.785750] [] ? __wake_up_common+0x58/0x90
[1331523.786018] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1331523.786286] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1331523.786535] [] kthread+0xcf/0xe0
[1331523.786778] [] ? do_exit+0x6bb/0xa40
[1331523.787111] [] ? kthread+0x0/0xe0
[1331523.787356] [] ret_from_fork+0x58/0x90
[1331523.787594] [] ? kthread+0x0/0xe0
[1331523.787830]
[1331523.788059] LustreError: dumping log to /tmp/lustre-log.1519464176.21424
[1331527.877197] Pid: 362687, comm: ll_ost01_078
[1331527.877436] Call Trace:
[1331527.877898] [] schedule_preempt_disabled+0x29/0x70
[1331527.878160] [] __mutex_lock_slowpath+0xc7/0x1d0
[1331527.878405] [] mutex_lock+0x1f/0x2f
[1331527.878652] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1331527.878950] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1331527.879450] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1331527.879949] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1331527.880232] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1331527.880503] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1331527.880994] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1331527.881244] [] ? default_wake_function+0x12/0x20
[1331527.881487] [] ? __wake_up_common+0x58/0x90
[1331527.881769] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1331527.882032] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1331527.882294] [] kthread+0xcf/0xe0
[1331527.882533] [] ? kthread+0x0/0xe0
[1331527.882775] [] ret_from_fork+0x58/0x90
[1331527.883014] [] ? kthread+0x0/0xe0
[1331527.883256]
[1331527.883486] LustreError: dumping log to /tmp/lustre-log.1519464180.362687
[1331548.356191] Pid: 362700, comm: ll_ost01_091
[1331548.356436] Call Trace:
[1331548.356908] [] schedule_preempt_disabled+0x29/0x70
[1331548.357155] [] __mutex_lock_slowpath+0xc7/0x1d0
[1331548.357412] [] mutex_lock+0x1f/0x2f
[1331548.357661] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1331548.357959] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1331548.358549] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1331548.359031] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1331548.359314] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1331548.359585] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1331548.360081] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1331548.360337] [] ? default_wake_function+0x12/0x20
[1331548.360599] [] ? __wake_up_common+0x58/0x90
[1331548.360861] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1331548.361126] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1331548.361373] [] kthread+0xcf/0xe0
[1331548.361615] [] ? kthread+0x0/0xe0
[1331548.361858] [] ret_from_fork+0x58/0x90
[1331548.362099] [] ? kthread+0x0/0xe0
[1331548.362341]
[1331548.362573] LustreError: dumping log to /tmp/lustre-log.1519464201.362700
[1331556.547810] Pid: 4892, comm: ll_ost01_105
[1331556.548059] Call Trace:
[1331556.548572] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
[1331556.549054] [] schedule+0x29/0x70
[1331556.549319] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
[1331556.549564] [] ? default_wake_function+0x0/0x20
[1331556.549844] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
[1331556.550334] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1331556.550600] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1331556.550858] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd]
[1331556.551127] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1331556.551390] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1331556.551642] [] ofd_destroy_hdl+0x267/0x970 [ofd]
[1331556.551943] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1331556.552216] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1331556.552717] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1331556.552968] [] ? default_wake_function+0x12/0x20
[1331556.553219] [] ? __wake_up_common+0x58/0x90
[1331556.553485] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1331556.553752] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1331556.554004] [] kthread+0xcf/0xe0
[1331556.554251] [] ? kthread+0x0/0xe0
[1331556.554497] [] ret_from_fork+0x58/0x90
[1331556.554739] [] ? kthread+0x0/0xe0
[1331556.554982]
[1331556.555217] LustreError: dumping log to /tmp/lustre-log.1519464209.4892
[1331560.643615] Pid: 132004, comm: ll_ost01_006
[1331560.643851] Call Trace:
[1331560.644406] [] schedule_preempt_disabled+0x29/0x70
[1331560.644650] [] __mutex_lock_slowpath+0xc7/0x1d0
[1331560.644892] [] mutex_lock+0x1f/0x2f
[1331560.645134] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1331560.645416] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1331560.645913] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1331560.646403] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1331560.646684] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1331560.646953] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1331560.647448] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1331560.647697] [] ? default_wake_function+0x12/0x20
[1331560.647942] [] ? __wake_up_common+0x58/0x90
[1331560.648212] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1331560.648480] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1331560.648726] [] kthread+0xcf/0xe0
[1331560.648965] [] ? kthread+0x0/0xe0
[1331560.649207] [] ret_from_fork+0x58/0x90
[1331560.649449] [] ? kthread+0x0/0xe0
[1331560.649695]
[1331560.649929] LustreError: dumping log to /tmp/lustre-log.1519464213.132004
[1331679.422155] Pid: 210338, comm: ll_ost01_065
[1331679.422398] Call Trace:
[1331679.422928] [] ? ldlm_expired_completion_wait+0x0/0x240 [ptlrpc]
[1331679.423406] [] schedule+0x29/0x70
[1331679.423667] [] ldlm_completion_ast+0x63d/0x920 [ptlrpc]
[1331679.423912] [] ? default_wake_function+0x0/0x20
[1331679.424180] [] ldlm_cli_enqueue_local+0x230/0x860 [ptlrpc]
[1331679.424669] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1331679.424936] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1331679.425201] [] ofd_destroy_by_fid+0x1cc/0x4a0 [ofd]
[1331679.425465] [] ? ldlm_blocking_ast+0x0/0x170 [ptlrpc]
[1331679.425728] [] ? ldlm_completion_ast+0x0/0x920 [ptlrpc]
[1331679.425976] [] ofd_destroy_hdl+0x267/0x970 [ofd]
[1331679.426268] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1331679.426548] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1331679.427054] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1331679.427299] [] ? default_wake_function+0x12/0x20
[1331679.427544] [] ? __wake_up_common+0x58/0x90
[1331679.427820] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1331679.428100] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1331679.428345] [] kthread+0xcf/0xe0
[1331679.428585] [] ? kthread+0x0/0xe0
[1331679.428828] [] ret_from_fork+0x58/0x90
[1331679.429074] [] ? kthread+0x0/0xe0
[1331679.429311]
[1331679.429639] LustreError: dumping log to /tmp/lustre-log.1519464332.210338
[1331765.434070] Pid: 131962, comm: ll_ost01_002
[1331765.434313] Call Trace:
[1331765.434776] [] schedule+0x29/0x70
[1331765.435040] [] wait_transaction_locked+0x85/0xd0 [jbd2]
[1331765.435296] [] ? autoremove_wake_function+0x0/0x40
[1331765.435545] [] add_transaction_credits+0x268/0x2f0 [jbd2]
[1331765.436017] [] ? __percpu_counter_sum+0x70/0x80
[1331765.436263] [] start_this_handle+0x1a1/0x430 [jbd2]
[1331765.436522] [] ? osd_declare_qid+0x1f0/0x480 [osd_ldiskfs]
[1331765.436994] [] ? kmem_cache_alloc+0x1ba/0x1e0
[1331765.437244] [] jbd2__journal_start+0xf3/0x1e0 [jbd2]
[1331765.437492] [] ? osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1331765.437969] [] __ldiskfs_journal_start_sb+0x69/0xe0 [ldiskfs]
[1331765.438462] [] osd_trans_start+0x1b4/0x490 [osd_ldiskfs]
[1331765.438794] [] tgt_client_data_update+0x2ed/0x5d0 [ptlrpc]
[1331765.439292] [] tgt_client_new+0x41b/0x610 [ptlrpc]
[1331765.439543] [] ofd_obd_connect+0x3a3/0x4c0 [ofd]
[1331765.439810] [] target_handle_connect+0x1146/0x2a70 [ptlrpc]
[1331765.440289] [] ? dequeue_entity+0x11c/0x5d0
[1331765.440534] [] ? dequeue_task_fair+0x3d0/0x660
[1331765.440781] [] ? __switch_to+0xd7/0x510
[1331765.441063] [] tgt_request_handle+0x402/0x1370 [ptlrpc]
[1331765.441332] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1331765.441822] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1331765.442079] [] ? default_wake_function+0x0/0x20
[1331765.442346] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1331765.442614] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1331765.442859] [] kthread+0xcf/0xe0
[1331765.443101] [] ? kthread+0x0/0xe0
[1331765.443337] [] ret_from_fork+0x58/0x90
[1331765.443666] [] ? kthread+0x0/0xe0
[1331765.443900]
[1331765.444133] LustreError: dumping log to /tmp/lustre-log.1519464418.131962
[1331765.444997] LNet: Service thread pid 362699 was inactive for 1202.49s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
[1331834.950840] Lustre: 210334:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply req@ffff88229eede850 x1593042643189984/t0(0) o5->oak-MDT0000-mdtlov_UUID@10.0.2.52@o2ib5:567/0 lens 432/432 e 0 to 0 dl 1519464492 ref 2 fl Interpret:/0/0 rc 0/0
[1331834.952035] Lustre: 210334:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1067 previous similar messages
[1331843.491129] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting
[1331843.491693] Lustre: Skipped 414 previous similar messages
[1331914.345289] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6)
[1331914.345791] Lustre: Skipped 416 previous similar messages
[1331994.799393] LNet: Service thread pid 4911 was inactive for 1203.43s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[1331994.800118] LNet: Skipped 9 previous similar messages
[1331994.800454] Pid: 4911, comm: ll_ost01_116
[1331994.800688] Call Trace:
[1331994.801147] [] schedule_preempt_disabled+0x29/0x70
[1331994.801392] [] __mutex_lock_slowpath+0xc7/0x1d0
[1331994.801639] [] mutex_lock+0x1f/0x2f
[1331994.801886] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1331994.802192] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1331994.808115] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1331994.808624] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1331994.808910] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1331994.809180] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1331994.809677] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1331994.809924] [] ? default_wake_function+0x12/0x20
[1331994.810171] [] ? __wake_up_common+0x58/0x90
[1331994.810445] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1331994.810728] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1331994.810974] [] kthread+0xcf/0xe0
[1331994.811213] [] ? kthread+0x0/0xe0
[1331994.811465] [] ret_from_fork+0x58/0x90
[1331994.811707] [] ? kthread+0x0/0xe0
[1331994.811946]
[1331994.812177] LustreError: dumping log to /tmp/lustre-log.1519464647.4911
[1332281.505983] Pid: 196794, comm: ll_ost01_032
[1332281.506227] Call Trace:
[1332281.506693] [] schedule_preempt_disabled+0x29/0x70
[1332281.506954] [] __mutex_lock_slowpath+0xc7/0x1d0
[1332281.507205] [] mutex_lock+0x1f/0x2f
[1332281.507453] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1332281.507755] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1332281.508246] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1332281.508735] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1332281.509022] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1332281.509290] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1332281.509781] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1332281.510028] [] ? default_wake_function+0x12/0x20
[1332281.510271] [] ? __wake_up_common+0x58/0x90
[1332281.510538] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1332281.510803] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1332281.511048] [] kthread+0xcf/0xe0
[1332281.511285] [] ? kthread+0x0/0xe0
[1332281.511529] [] ret_from_fork+0x58/0x90
[1332281.511771] [] ? kthread+0x0/0xe0
[1332281.512011]
[1332281.512244] LustreError: dumping log to /tmp/lustre-log.1519464934.196794
[1332285.601823] Pid: 210325, comm: ll_ost01_052
[1332285.602062] Call Trace:
[1332285.602524] [] schedule_preempt_disabled+0x29/0x70
[1332285.602767] [] __mutex_lock_slowpath+0xc7/0x1d0
[1332285.603020] [] mutex_lock+0x1f/0x2f
[1332285.603265] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1332285.603547] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1332285.604043] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1332285.604533] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1332285.604819] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1332285.605093] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1332285.605585] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1332285.605833] [] ? default_wake_function+0x12/0x20
[1332285.606075] [] ? __wake_up_common+0x58/0x90
[1332285.606341] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1332285.606607] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1332285.606855] [] kthread+0xcf/0xe0
[1332285.607094] [] ? kthread+0x0/0xe0
[1332285.607334] [] ret_from_fork+0x58/0x90
[1332285.607575] [] ? kthread+0x0/0xe0
[1332285.607818]
[1332285.608051] LustreError: dumping log to /tmp/lustre-log.1519464938.210325
[1332306.080840] Pid: 21432, comm: ll_ost01_044
[1332306.081079] Call Trace:
[1332306.081544] [] schedule_preempt_disabled+0x29/0x70
[1332306.081789] [] __mutex_lock_slowpath+0xc7/0x1d0
[1332306.082041] [] mutex_lock+0x1f/0x2f
[1332306.082293] [] ofd_create_hdl+0xdcb/0x2090 [ofd]
[1332306.082599] [] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[1332306.083100] [] ? lustre_pack_reply_v2+0x14f/0x280 [ptlrpc]
[1332306.083592] [] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[1332306.083881] [] tgt_request_handle+0x925/0x1370 [ptlrpc]
[1332306.084155] [] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[1332306.084656] [] ? ptlrpc_wait_event+0x98/0x340 [ptlrpc]
[1332306.084912] [] ? default_wake_function+0x12/0x20
[1332306.085156] [] ? __wake_up_common+0x58/0x90
[1332306.085512] [] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[1332306.085774] [] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[1332306.086019] [] kthread+0xcf/0xe0
[1332306.086254] [] ? kthread+0x0/0xe0
[1332306.086493] [] ret_from_fork+0x58/0x90
[1332306.086735] [] ? kthread+0x0/0xe0
[1332306.086980]
[1332306.087214] LustreError: dumping log to /tmp/lustre-log.1519464958.21432
[1332436.442817] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8826bea0cc50 x1592239615714592/t0(0) o4->3ee95ef7-4278-ead7-52a3-bdca1c47a323@10.9.112.3@o2ib4:414/0 lens 608/0 e 0 to 0 dl 1519465094 ref 2 fl New:H/2/ffffffff rc 0/-1
[1332436.444025] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1087 previous similar messages
[1332444.590156] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting
[1332444.590657] Lustre: Skipped 401 previous similar messages
[1332515.318368] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6)
[1332515.318852] Lustre: Skipped 404 previous similar messages
[1333037.246792] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff8837b4e10050 x1592239615714592/t0(0) o4->3ee95ef7-4278-ead7-52a3-bdca1c47a323@10.9.112.3@o2ib4:260/0 lens 608/0 e 0 to 0 dl 1519465695 ref 2 fl New:H/2/ffffffff rc 0/-1
[1333037.247959] Lustre: 39067:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1077 previous similar messages
[1333044.695263] Lustre: oak-OST003b: Client 3326eb74-fc4c-e39b-1783-8c55bbb22498 (at 10.9.112.6@o2ib4) reconnecting
[1333044.695749] Lustre: Skipped 411 previous similar messages
[1333116.289455] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6)
[1333116.289961] Lustre: Skipped 410 previous similar messages
[1333638.562716] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply req@ffff882b61b06c50 x1592239615714592/t0(0) o4->3ee95ef7-4278-ead7-52a3-bdca1c47a323@10.9.112.3@o2ib4:106/0 lens 608/0 e 0 to 0 dl 1519466296 ref 2 fl New:H/2/ffffffff rc 0/-1
[1333638.563991] Lustre: 251666:0:(service.c:1346:ptlrpc_at_send_early_reply()) Skipped 1083 previous similar messages
[1333646.651278] Lustre: oak-OST0035: Client 3ee95ef7-4278-ead7-52a3-bdca1c47a323 (at 10.9.112.3@o2ib4) reconnecting
[1333646.651784] Lustre: Skipped 421 previous similar messages
[1333717.262433] Lustre: oak-OST0042: Connection restored to 9f3e786c-91bd-0b59-f474-23ccea62b4f0 (at 10.8.2.15@o2ib6)
[1333717.262902] Lustre: Skipped 417 previous similar messages
[1333975.901889] Uhhuh. NMI received for unknown reason 39 on CPU 0.
[1333975.902130] Do you have a strange power saving mode enabled?
[1333975.902384] Kernel panic - not syncing: NMI: Not continuing
[1333975.902621] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE ------------ 3.10.0-693.2.2.el7_lustre.pl1.x86_64 #1
[1333975.903087] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.6.0 10/26/2017
[1333975.903546] ffff881ffce05e00 ba7860647ee2c77b ffff881ffce05e18 ffffffff816a3db1
[1333975.904019] ffff881ffce05e98 ffffffff8169dc74 ffffffff00000010 ffff881ffce05ea8
[1333975.904504] ffff881ffce05e48 ba7860647ee2c77b ffff881ffce05ea8 ffffffff818e7858
[1333975.904983] Call Trace:
[1333975.905214] [] dump_stack+0x19/0x1b
[1333975.905469] [] panic+0xe8/0x20d
[1333975.905710] [] nmi_panic+0x3f/0x40
[1333975.905954] [] do_nmi+0x3e6/0x450
[1333975.906192] [] end_repeat_nmi+0x1e/0x2e
[1333975.906433] [] ? intel_idle+0xd5/0x15a
[1333975.906673] [] ? intel_idle+0xd5/0x15a
[1333975.906911] [] ? intel_idle+0xd5/0x15a
[1333975.907150] <> [] cpuidle_enter_state+0x40/0xc0
[1333975.907405] [] cpuidle_idle_call+0xd8/0x210
[1333975.907650] [] arch_cpu_idle+0xe/0x30
[1333975.907896] [] cpu_startup_entry+0x14a/0x1c0
[1333975.908140] [] rest_init+0x77/0x80
[1333975.908384] [] start_kernel+0x439/0x45a
[1333975.908624] [] ? repair_env_string+0x5c/0x5c
[1333975.908866] [] ? early_idt_handler_array+0x120/0x120
[1333975.909108] [] x86_64_start_reservations+0x24/0x26
[1333975.909351] [] x86_64_start_kernel+0x14f/0x172
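The printk timestamps throughout are seconds since boot, while the /tmp/lustre-log.<epoch>.<pid> file names embed UNIX time, so pairing one "dumping log" line with its file name gives the boot-to-wallclock offset and lets any entry be dated, including the final NMI panic. A minimal sketch; the constants are read off the log above, and the result is only as accurate as the two stamps (clock drift aside):

    from datetime import datetime, timezone

    # Pair a dump line's uptime stamp with the epoch in its file name
    # to recover the boot-to-wallclock offset.
    UPTIME_AT_DUMP = 1328496.979   # "[1328496.979061] LustreError: dumping log ..."
    EPOCH_AT_DUMP = 1519461149     # ... to /tmp/lustre-log.1519461149.362708
    PANIC_UPTIME = 1333975.902     # "[1333975.902384] Kernel panic - not syncing"

    def wallclock(uptime_stamp):
        offset = EPOCH_AT_DUMP - UPTIME_AT_DUMP
        return datetime.fromtimestamp(uptime_stamp + offset, tz=timezone.utc)

    print("panic at about", wallclock(PANIC_UPTIME).isoformat())

This places the panic roughly 90 minutes after the last dumped stack trace, in the morning of 24 Feb 2018 UTC, consistent with the epoch stamps in the surrounding lustre-log file names.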