[LU-9845] ost-pools test_22 hangs with ‘WARNING: Pool 'lustre-mdt1' has encountered an uncorrectable I/O failure and has been suspended.’ Created: 08/Aug/17 Updated: 11/Sep/20 Resolved: 22/May/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.11.0, Lustre 2.10.2, Lustre 2.10.3, Lustre 2.10.7, Lustre 2.12.3 |
| Fix Version/s: | Lustre 2.11.0, Lustre 2.10.4 |
| Type: | Bug | Priority: | Critical |
| Reporter: | James Nunez (Inactive) | Assignee: | Nathaniel Clark |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | zfs |
| Environment: | ZFS 0.7.x |
| Attachments: | |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
ost-pools test 22 hangs when testing with ZFS 0.7.0. The last lines in the test_log before the hang are: loop for testpool2 complete total: 100 open/close in 0.09 seconds: 1061.34 ops/second CMD: onyx-35vm9 lctl pool_remove lustre.testpool OST[0001-0001/1] onyx-35vm9: OST lustre-OST0001_UUID removed from pool lustre.testpool Pool testpool, iteration 2 CMD: onyx-35vm9 lctl pool_remove lustre.testpool OST[0001-0001/1] onyx-35vm9: OST lustre-OST0001_UUID removed from pool lustre.testpool Pool testpool, iteration 3 CMD: onyx-35vm9 lctl pool_remove lustre.testpool OST[0001-0001/1] onyx-35vm9: OST lustre-OST0001_UUID removed from pool lustre.testpool Pool testpool, iteration 4 In the MDS console logs, we see: 04:17:47:[11669.022658] Lustre: DEBUG MARKER: lctl pool_remove lustre.testpool OST[0001-0001/1] 04:17:47:[11677.206802] Lustre: DEBUG MARKER: lctl pool_add lustre.testpool OST[0000-0001/1] 04:17:47:[11687.391683] Lustre: DEBUG MARKER: lctl pool_remove lustre.testpool OST[0001-0001/1] 04:17:47:[11693.557227] WARNING: Pool 'lustre-mdt1' has encountered an uncorrectable I/O failure and has been suspended. 04:17:47:[11693.557227] 04:17:47:[11697.575911] Lustre: DEBUG MARKER: lctl pool_add lustre.testpool OST[0000-0001/1] 04:17:47:[15221.223579] SysRq : Show State 04:17:47:[15221.224071] task PC stack pid father 04:17:47:[15221.224729] systemd S 0000000000000000 0 1 0 0x00000000 04:17:47:[15221.225617] ffff88007c7bbdb8 0000000000000082 ffff88007c7b0000 ffff88007c7bbfd8 04:17:47:[15221.226469] ffff88007c7bbfd8 ffff88007c7bbfd8 ffff88007c7b0000 0000000000000000 04:17:47:[15221.227325] 0000000000000000 ffff8800369d1a20 ffff88007c7b0000 0000000000000000 04:17:47:[15221.228217] Call Trace: 04:17:47:[15221.228581] [<ffffffff8168ca49>] schedule+0x29/0x70 04:17:47:[15221.229094] [<ffffffff8168bb4d>] schedule_hrtimeout_range_clock+0x12d/0x150 04:17:47:[15221.229864] [<ffffffff812486f9>] ? ep_scan_ready_list.isra.7+0x1b9/0x1f0 04:17:47:[15221.230669] [<ffffffff8168bb83>] schedule_hrtimeout_range+0x13/0x20 04:17:47:[15221.231363] [<ffffffff8124898e>] ep_poll+0x23e/0x360 04:17:47:[15221.231927] [<ffffffff81200873>] ? __fput+0x183/0x260 04:17:47:[15221.232481] [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20 04:17:47:[15221.233121] [<ffffffff81249bfd>] SyS_epoll_wait+0xed/0x120 04:17:47:[15221.233737] [<ffffffff81697a49>] system_call_fastpath+0x16/0x1b … 04:17:47:[15222.536161] cfs_rh_00 S ffff88007a8667b0 0 2617 2 0x00000080 04:17:47:[15222.536916] ffff8800793ebe28 0000000000000046 ffff88000001bec0 ffff8800793ebfd8 04:17:47:[15222.537791] ffff8800793ebfd8 ffff8800793ebfd8 ffff88000001bec0 ffff88007a866780 04:17:47:[15222.538646] ffff88007a866798 ffff88007a866790 0000000000000000 ffff88007a8667b0 04:17:47:[15222.539495] Call Trace: 04:17:47:[15222.539749] [<ffffffff8168ca49>] schedule+0x29/0x70 04:17:47:[15222.540295] [<ffffffffa05cab58>] cfs_wi_scheduler+0x2f8/0x440 [libcfs] 04:17:47:[15222.541047] [<ffffffff810b1b20>] ? wake_up_atomic_t+0x30/0x30 04:17:47:[15222.541647] [<ffffffffa05ca860>] ? cfs_wi_schedule+0x190/0x190 [libcfs] 04:17:47:[15222.542384] [<ffffffff810b0a4f>] kthread+0xcf/0xe0 04:17:47:[15222.542905] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.543561] [<ffffffff81697998>] ret_from_fork+0x58/0x90 04:17:47:[15222.544161] [<ffffffff810b0980>] ? 
kthread_create_on_node+0x140/0x140 04:17:47:[15222.544827] cfs_rh_01 S ffff88007a8667b0 0 2618 2 0x00000080 04:17:47:[15222.545646] ffff88007b4ffe28 0000000000000046 ffff880000018fb0 ffff88007b4fffd8 04:17:47:[15222.546465] ffff88007b4fffd8 ffff88007b4fffd8 ffff880000018fb0 ffff88007a866780 04:17:47:[15222.547284] ffff88007a866798 ffff88007a866790 0000000000000000 ffff88007a8667b0 04:17:47:[15222.548152] Call Trace: 04:17:47:[15222.548437] [<ffffffff8168ca49>] schedule+0x29/0x70 04:17:47:[15222.549020] [<ffffffffa05cab58>] cfs_wi_scheduler+0x2f8/0x440 [libcfs] 04:17:47:[15222.549696] [<ffffffff810b1b20>] ? wake_up_atomic_t+0x30/0x30 04:17:47:[15222.550348] [<ffffffffa05ca860>] ? cfs_wi_schedule+0x190/0x190 [libcfs] 04:17:47:[15222.551049] [<ffffffff810b0a4f>] kthread+0xcf/0xe0 04:17:47:[15222.551554] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.552281] [<ffffffff81697998>] ret_from_fork+0x58/0x90 04:17:47:[15222.552858] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.553513] spl_system_task S ffff8800791e43c0 0 3362 2 0x00000080 04:17:47:[15222.554288] ffff88007bae3e00 0000000000000046 ffff88007a24af10 ffff88007bae3fd8 04:17:47:[15222.555149] ffff88007bae3fd8 ffff88007bae3fd8 ffff88007a24af10 ffff8800791e4300 04:17:47:[15222.556011] ffff88007a9c6de0 ffff8800791e4390 ffff8800366fbd30 ffff8800791e43c0 04:17:47:[15222.556870] Call Trace: 04:17:47:[15222.557125] [<ffffffff8168ca49>] schedule+0x29/0x70 04:17:47:[15222.557656] [<ffffffffa061f0a9>] taskq_thread+0x429/0x470 [spl] 04:17:47:[15222.558369] [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20 04:17:47:[15222.558965] [<ffffffffa061ec80>] ? taskq_thread_spawn+0x60/0x60 [spl] 04:17:47:[15222.559630] [<ffffffff810b0a4f>] kthread+0xcf/0xe0 04:17:47:[15222.560181] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.560842] [<ffffffff81697998>] ret_from_fork+0x58/0x90 04:17:47:[15222.561451] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.562133] spl_delay_taskq S ffff88007b8fe4c0 0 3363 2 0x00000080 04:17:47:[15222.562894] ffff88007a907e00 0000000000000046 ffff88007a24bec0 ffff88007a907fd8 04:17:47:[15222.563748] ffff88007a907fd8 ffff88007a907fd8 ffff88007a24bec0 ffff88007b8fe400 04:17:47:[15222.564605] ffff88007a9c6f00 ffff88007b8fe490 0000000000000000 ffff88007b8fe4c0 04:17:47:[15222.565474] Call Trace: 04:17:47:[15222.565733] [<ffffffff8168ca49>] schedule+0x29/0x70 04:17:47:[15222.566242] [<ffffffffa061f0a9>] taskq_thread+0x429/0x470 [spl] 04:17:47:[15222.566941] [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20 04:17:47:[15222.567515] [<ffffffffa061ec80>] ? taskq_thread_spawn+0x60/0x60 [spl] 04:17:47:[15222.568229] [<ffffffff810b0a4f>] kthread+0xcf/0xe0 04:17:47:[15222.568761] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.569429] [<ffffffff81697998>] ret_from_fork+0x58/0x90 04:17:47:[15222.570037] [<ffffffff810b0980>] ? 
kthread_create_on_node+0x140/0x140 04:17:47:[15222.570703] spl_dynamic_tas S ffff88007b8fe2c0 0 3364 2 0x00000080 04:17:47:[15222.571482] ffff88007b28fe00 0000000000000046 ffff88007a248000 ffff88007b28ffd8 04:17:47:[15222.572328] ffff88007b28ffd8 ffff88007b28ffd8 ffff88007a248000 ffff88007b8fe200 04:17:47:[15222.573178] ffff88007a9c6c60 ffff88007b8fe290 ffff8800668c9670 ffff88007b8fe2c0 04:17:47:[15222.574045] Call Trace: 04:17:47:[15222.574314] [<ffffffff8168ca49>] schedule+0x29/0x70 04:17:47:[15222.574893] [<ffffffffa061f0a9>] taskq_thread+0x429/0x470 [spl] 04:17:47:[15222.575505] [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20 04:17:47:[15222.576122] [<ffffffffa061ec80>] ? taskq_thread_spawn+0x60/0x60 [spl] 04:17:47:[15222.576785] [<ffffffff810b0a4f>] kthread+0xcf/0xe0 04:17:47:[15222.577352] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.578036] [<ffffffff81697998>] ret_from_fork+0x58/0x90 04:17:47:[15222.578596] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.579309] spl_kmem_cache S ffff88007b8feec0 0 3365 2 0x00000080 04:17:47:[15222.580047] ffff88007ba3fe00 0000000000000046 ffff88007a24ce70 ffff88007ba3ffd8 04:17:47:[15222.580902] ffff88007ba3ffd8 ffff88007ba3ffd8 ffff88007a24ce70 ffff88007b8fee00 04:17:47:[15222.581750] ffff88007a9c6a20 ffff88007b8fee90 ffff8800387fe800 ffff88007b8feec0 04:17:47:[15222.582597] Call Trace: 04:17:47:[15222.582853] [<ffffffff8168ca49>] schedule+0x29/0x70 04:17:47:[15222.583358] [<ffffffffa061f0a9>] taskq_thread+0x429/0x470 [spl] 04:17:47:[15222.584057] [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20 04:17:47:[15222.584645] [<ffffffffa061ec80>] ? taskq_thread_spawn+0x60/0x60 [spl] 04:17:47:[15222.585372] [<ffffffff810b0a4f>] kthread+0xcf/0xe0 04:17:47:[15222.585891] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.586549] [<ffffffff81697998>] ret_from_fork+0x58/0x90 04:17:47:[15222.587153] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.587820] zvol S ffff88007b5abbc0 0 3366 2 0x00000080 04:17:47:[15222.588648] ffff88007b29be00 0000000000000046 ffff88007a249f60 ffff88007b29bfd8 04:17:47:[15222.589473] ffff88007b29bfd8 ffff88007b29bfd8 ffff88007a249f60 ffff88007b5abb00 04:17:47:[15222.590300] ffff8800364fa6c0 ffff88007b5abb90 0000000000000000 ffff88007b5abbc0 04:17:47:[15222.591167] Call Trace: 04:17:47:[15222.591437] [<ffffffff8168ca49>] schedule+0x29/0x70 04:17:47:[15222.592015] [<ffffffffa061f0a9>] taskq_thread+0x429/0x470 [spl] 04:17:47:[15222.592643] [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20 04:17:47:[15222.593264] [<ffffffffa061ec80>] ? taskq_thread_spawn+0x60/0x60 [spl] 04:17:47:[15222.593951] [<ffffffff810b0a4f>] kthread+0xcf/0xe0 04:17:47:[15222.594450] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.595156] [<ffffffff81697998>] ret_from_fork+0x58/0x90 04:17:47:[15222.595709] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.596425] arc_prune S ffff880079bc1ac0 0 3372 2 0x00000080 04:17:47:[15222.597166] ffff88007b8a7e00 0000000000000046 ffff88007a9d8000 ffff88007b8a7fd8 04:17:47:[15222.598033] ffff88007b8a7fd8 ffff88007b8a7fd8 ffff88007a9d8000 ffff880079bc1a00 04:17:47:[15222.598901] ffff8800796aad80 ffff880079bc1a90 0000000000000000 ffff880079bc1ac0 04:17:47:[15222.599755] Call Trace: 04:17:47:[15222.600008] [<ffffffff8168ca49>] schedule+0x29/0x70 04:17:47:[15222.600537] [<ffffffffa061f0a9>] taskq_thread+0x429/0x470 [spl] 04:17:47:[15222.601256] [<ffffffff810c54e0>] ? 
wake_up_state+0x20/0x20 04:17:47:[15222.601857] [<ffffffffa061ec80>] ? taskq_thread_spawn+0x60/0x60 [spl] 04:17:47:[15222.602514] [<ffffffff810b0a4f>] kthread+0xcf/0xe0 04:17:47:[15222.603059] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.603722] [<ffffffff81697998>] ret_from_fork+0x58/0x90 04:17:47:[15222.604329] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.605006] arc_reclaim S ffffffffa0914a08 0 3373 2 0x00000080 04:17:47:[15222.605764] ffff88007b933cf8 0000000000000046 ffff88007a9dedd0 ffff88007b933fd8 04:17:47:[15222.606622] ffff88007b933fd8 ffff88007b933fd8 ffff88007a9dedd0 ffff88007b933dc8 04:17:47:[15222.607473] 00000000000186a0 0000000000000001 ffffffffa0914a88 ffffffffa0914a08 04:17:47:[15222.608292] Call Trace: 04:17:47:[15222.608635] [<ffffffff8168ca49>] schedule+0x29/0x70 04:17:47:[15222.609129] [<ffffffff8168bad2>] schedule_hrtimeout_range_clock+0xb2/0x150 04:17:47:[15222.609843] [<ffffffff810b4640>] ? hrtimer_get_res+0x50/0x50 04:17:47:[15222.610414] [<ffffffff8168bac6>] ? schedule_hrtimeout_range_clock+0xa6/0x150 04:17:47:[15222.611121] [<ffffffff8168bb83>] schedule_hrtimeout_range+0x13/0x20 04:17:47:[15222.611769] [<ffffffffa06236ab>] cv_timedwait_hires_common+0x12b/0x220 [spl] 04:17:47:[15222.612503] [<ffffffff810b1b20>] ? wake_up_atomic_t+0x30/0x30 04:17:47:[15222.613080] [<ffffffffa06237d4>] cv_timedwait_sig_hires+0x14/0x20 [spl] 04:17:47:[15222.613844] [<ffffffffa071ad31>] arc_reclaim_thread+0x121/0x260 [zfs] 04:17:47:[15222.614529] [<ffffffffa071ac10>] ? arc_shrink+0xc0/0xc0 [zfs] 04:17:47:[15222.615337] [<ffffffffa061df91>] thread_generic_wrapper+0x71/0x80 [spl] 04:17:47:[15222.616077] [<ffffffffa061df20>] ? __thread_exit+0x20/0x20 [spl] 04:17:47:[15222.616766] [<ffffffff810b0a4f>] kthread+0xcf/0xe0 04:17:47:[15222.617319] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.618001] [<ffffffff81697998>] ret_from_fork+0x58/0x90 04:17:47:[15222.618548] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.619262] dbu_evict S ffff88003641fbc0 0 3374 2 0x00000080 04:17:47:[15222.620014] ffff88000005be00 0000000000000046 ffff88007a9daf10 ffff88000005bfd8 04:17:47:[15222.620866] ffff88000005bfd8 ffff88000005bfd8 ffff88007a9daf10 ffff88003641fb00 04:17:47:[15222.621722] ffff8800796aad20 ffff88003641fb90 ffff88004caa2070 ffff88003641fbc0 04:17:47:[15222.622573] Call Trace: 04:17:47:[15222.622826] [<ffffffff8168ca49>] schedule+0x29/0x70 04:17:47:[15222.623336] [<ffffffffa061f0a9>] taskq_thread+0x429/0x470 [spl] 04:17:47:[15222.624031] [<ffffffff810c54e0>] ? wake_up_state+0x20/0x20 04:17:47:[15222.624606] [<ffffffffa061ec80>] ? taskq_thread_spawn+0x60/0x60 [spl] 04:17:47:[15222.625315] [<ffffffff810b0a4f>] kthread+0xcf/0xe0 04:17:47:[15222.625832] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 04:17:47:[15222.626491] [<ffffffff81697998>] ret_from_fork+0x58/0x90 04:17:47:[15222.627083] [<ffffffff810b0980>] ? kthread_create_on_node+0x140/0x140 From ost-pools.sh, 1196 test_22() {
1197 local POOL_ROOT=${POOL_ROOT:-$DIR/$tdir}
1198 [[ $OSTCOUNT -le 1 ]] && skip_env "Need at least 2 OSTs" && return
1199
1200 local numfiles=100
1201
1202 create_pool_nofail $POOL
1203 add_pool $POOL "OST0000" "$FSNAME-OST0000_UUID "
1204 create_pool_nofail $POOL2
1205 add_pool $POOL2 "OST0000" "$FSNAME-OST0000_UUID "
1206
1207 add_loop $POOL 1 &
1208 add_loop $POOL2 2 &
1209 sleep 5
1210 create_dir $POOL_ROOT $POOL
1211 createmany -o $POOL_ROOT/${tfile} $numfiles ||
1212 error "createmany $POOL_ROOT/${tfile} failed!"
1213 wait
1214
1215 return 0
1216 }
1217 run_test 22 "Simultaneous manipulation of a pool"
There’s only one instance of this failure. Logs are at: |
| Comments |
| Comment by Jinshan Xiong (Inactive) [ 10/Aug/17 ] |
|
new failure at: https://testing.hpdd.intel.com/test_sets/1831f49c-7d82-11e7-9d88-5254006e85c2 This time it was in recovery-small test. |
| Comment by Nathaniel Clark [ 13/Sep/17 ] |
|
sanity/test_116a with ZFS 0.7.1:
18:47:36:[ 2630.525032] Lustre: DEBUG MARKER: == sanity test 116a: stripe QOS: free space balance ================================================== 17:47:09 (1505238429)
18:47:36:[ 2664.291335] WARNING: Pool 'lustre-ost1' has encountered an uncorrectable I/O failure and has been suspended.
18:47:36:[ 2664.291335]
18:47:36:[ 2664.364291] WARNING: Pool 'lustre-ost2' has encountered an uncorrectable I/O failure and has been suspended. |
| Comment by Andreas Dilger [ 15/Sep/17 ] |
|
Hit this again https://testing.hpdd.intel.com/test_sets/426d4a64-99b0-11e7-ba20-5254006e85c2 |
| Comment by Jian Yu [ 02/Oct/17 ] |
|
sanity test_101d: https://testing.hpdd.intel.com/test_sets/5e3dbaaa-a73c-11e7-b786-5254006e85c2 |
| Comment by Jian Yu [ 03/Oct/17 ] |
|
replay-single test_70c: https://testing.hpdd.intel.com/test_sets/eb3eccaa-a7b4-11e7-bb19-5254006e85c2 |
| Comment by Bob Glossman (Inactive) [ 25/Oct/17 ] |
|
similar error seen on master in conf-sanity, test_106: |
| Comment by Bob Glossman (Inactive) [ 25/Oct/17 ] |
|
another on master, this time in conf-sanity, test_50f: seems to be happening a lot now. |
| Comment by John Hammond [ 13/Nov/17 ] |
|
Nathaniel, it's really strange to see the "Pool 'blahblah' has encountered..." message with no other messages indicating an I/O error before or after it. Is this a known phenomenon to the ZFS folks? |
| Comment by Nathaniel Clark [ 13/Nov/17 ] |
|
I'm not seeing this as a known issue, but I also haven't gotten to dig into one of the fubar'd zpools. |
| Comment by Nathaniel Clark [ 15/Nov/17 ] |
|
Reproduced with Lustre 2.10.55, ZFS 0.7.3 (test conf-sanity/test_6):
[root@onyx-121vm4 ~]# zpool status
pool: lustre-ost1
state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: http://zfsonlinux.org/msg/ZFS-8000-HC
scan: none requested
config:
NAME STATE READ WRITE CKSUM
lustre-ost1 ONLINE 0 0 0
/tmp/lustre-ost1 ONLINE 0 0 0
Corresponding messages:
Nov 15 15:40:49 onyx-121vm4.onyx.hpdd.intel.com kernel: WARNING: Pool 'lustre-ost1' has encountered an uncorrectable I/O failure and has been suspended.
Nov 15 15:40:54 onyx-121vm4.onyx.hpdd.intel.com kernel: Lustre: lustre-OST0000 is waiting for obd_unlinked_exports more than 8 seconds. The obd refcount = 6. Is it stuck?
Nov 15 15:40:54 onyx-121vm4.onyx.hpdd.intel.com kernel: Lustre: lustre-OST0000: UNLINKED ffff880017838c00 lustre-MDT0000-mdtlov_UUID 10.2.103.29@tcp 1 (0 1 0) 1 0 0 0: (null) 0 stale:0
Nov 15 15:40:54 onyx-121vm4.onyx.hpdd.intel.com kernel: Lustre: lustre-OST0000: UNLINKED ffff88007bb6f400 eb0ec782-9708-eeb0-2cc9-b935c49ad433 10.2.103.27@tcp 1 (0 1 0) 1 0 0 0: (null) 51539607553 stale:0
Nov 15 15:41:10 onyx-121vm4.onyx.hpdd.intel.com kernel: Lustre: lustre-OST0000 is waiting for obd_unlinked_exports more than 16 seconds. The obd refcount = 2. Is it stuck?
A ZDB dump of the pool is attached: zdb-lustre-ost1.log
Doing a zpool clear does remove the error condition. It's weird that there's no actual error count on any of the devices. |
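For reference, a minimal sketch of the recovery step mentioned above; the pool name is the one from the zpool status output and would differ on other nodes:
# clear the suspended state so I/O can resume, then re-check pool health
zpool clear lustre-ost1
zpool status lustre-ost1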
| Comment by Jian Yu [ 17/Nov/17 ] |
|
replay-ost-single test 1 also hit this issue. This is affecting patch review testing on the master branch. |
| Comment by Jian Yu [ 22/Nov/17 ] |
|
sanity test 181 and conf-sanity test 106 in ZFS test sessions hit this issue: |
| Comment by Sarah Liu [ 29/Nov/17 ] |
|
sanity-benchmark test_bonnie and ost-pools test_23b hit a similar issue with 2.10.2 RC1 ZFS DNE testing. |
| Comment by John Hammond [ 08/Dec/17 ] |
|
https://testing.hpdd.intel.com/test_sets/81c7aa0e-db9d-11e7-8027-52540065bddc |
| Comment by Mikhail Pershin [ 08/Dec/17 ] |
|
sanity 101d this time: |
| Comment by Kim Sebo [ 09/Dec/17 ] |
|
We're also seeing what seems to be this problem. It only seems to occur during very high MDS loads. It can be reproduced reliably within minutes during an mdtest run. It occurs with all the following software combinations, all with the CentOS 7.4 3.10.0-693.2.2 kernel: Lustre 2.10.1 DNE, ZFS 0.7.3, Mellanox OFED 4.2.1. The pool that faults seems random and occurs on all four MDS/MDTs. I/O will resume after a "zpool clear", until another random MDT pool faults soon after. Syslog has only a single entry:
# zpool status
pool: zmdt1
state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: http://zfsonlinux.org/msg/ZFS-8000-HC
scan: none requested
config:
NAME STATE READ WRITE CKSUM
zmdt1 ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
MDT_LUN_03 ONLINE 0 0 0
MDT_LUN_04 ONLINE 0 0 0
errors: No known data errors
Attached is an excerpt from /proc/spl/kstat/zfs/dbgmsg. The pool faults in between timestamps ...3292 and ...3298. Both logging and "zpool status" were hung for ~5 seconds whilst this occurred. |
| Comment by John Hammond [ 09/Dec/17 ] |
|
https://testing.hpdd.intel.com/test_sets/ee2fac14-dc7e-11e7-8027-52540065bddc |
| Comment by John Hammond [ 18/Dec/17 ] |
|
https://testing.hpdd.intel.com/test_sets/185deada-e22b-11e7-8027-52540065bddc |
| Comment by John Hammond [ 21/Dec/17 ] |
|
https://testing.hpdd.intel.com/test_sets/17be7004-e22b-11e7-8027-52540065bddc |
| Comment by John Hammond [ 21/Dec/17 ] |
|
I was able to reproduce the "Pool 'lustre-mdt1' has encountered an uncorrectable I/O failure and has been suspended" message on a local setup by running the root FS out of space.
m:~# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 24G 20G 2.7G 88% /
...
lustre-mdt1/mdt1 72M 3.4M 67M 5% /mnt/lustre-mds1
lustre-ost1/ost1 239M 4.0M 233M 2% /mnt/lustre-ost1
lustre-ost2/ost2 239M 8.0M 229M 4% /mnt/lustre-ost2
192.168.122.121@tcp:/lustre 478M 12M 462M 3% /mnt/lustre
m:~# du -hs /tmp/lustre-mdt1
40M /tmp/lustre-mdt1
m:~# ls -lh /tmp/lustre-mdt1
-rw-r--r-- 1 root root 196M Dec 21 10:12 /tmp/lustre-mdt1
m:~# dd if=/dev/zero of=/tmp/f0 bs=1M
dd: error writing ‘/tmp/f0’: No space left on device
3982+0 records in
3981+0 records out
Consistent with what we see in AT test logs there were no other messages besides the uncorrectable I/O failure message:
[692069.367224] Lustre: DEBUG MARKER: == sanity-hsm test 13: Recursively import and restore a directory ==================================== 10:13:00 (1513872780)
[692095.705099] WARNING: Pool 'lustre-mdt1' has encountered an uncorrectable I/O failure and has been suspended.
[692095.709069] WARNING: Pool 'lustre-ost1' has encountered an uncorrectable I/O failure and has been suspended.
Perhaps we can avoid this issue by not using sparse backing files for ZFS. |
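As a follow-up to the sparse-file observation, here is a minimal, hypothetical sketch of the difference between a sparse and a fully allocated backing file; the file names and the 200M size are illustrative only and are not taken from the test framework:
# sparse file: ls reports the full size but no blocks are reserved on the host FS,
# so the pool can hit ENOSPC on /tmp long after the file was created
truncate -s 200M /tmp/lustre-mdt1-sparse
# fully allocated file: all blocks are reserved up front, so the host filesystem
# filling up later cannot make writes to the backing file fail
fallocate -l 200M /tmp/lustre-mdt1-prealloc   # or: dd if=/dev/zero of=/tmp/lustre-mdt1-prealloc bs=1M count=200
du -h --apparent-size /tmp/lustre-mdt1-sparse /tmp/lustre-mdt1-prealloc
du -h /tmp/lustre-mdt1-sparse /tmp/lustre-mdt1-prealloc   # allocated blocks differ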
| Comment by John Hammond [ 21/Dec/17 ] |
|
Would it be useful to set the pool failure mode to panic to get more information here? |
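For reference, the failure mode referred to here is the zpool 'failmode' property; its default, 'wait', is what produces the suspend behaviour seen in this ticket. A short sketch of switching it on a test pool, with the pool name taken from the earlier logs:
# panic the node instead of suspending I/O when an uncorrectable failure is hit
zpool set failmode=panic lustre-mdt1
zpool get failmode lustre-mdt1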
| Comment by John Hammond [ 21/Dec/17 ] |
|
Or patch zfs to print more of the zio? |
| Comment by Gerrit Updater [ 22/Dec/17 ] |
|
John L. Hammond (john.hammond@intel.com) uploaded a new patch: https://review.whamcloud.com/30649 |
| Comment by John Hammond [ 22/Dec/17 ] |
|
This could also be due to a blocked mmp write. Running locally:
m:~# export FSTYPE=zfs
m:~# bash $LUSTRE/tests/llmount.sh
...
m:~# cat /sys/module/zfs/parameters/zfs_multihost_interval
1000
m:~# echo 100 > /sys/module/zfs/parameters/zfs_multihost_interval # set mmp interval to 100ms
m:~# chrt -f 20 dd if=/dev/zero of=/dev/null &
m:~# chrt -f 20 dd if=/dev/zero of=/dev/null &
m:~# dmesg | tail
[ 122.404321] Lustre: Skipped 1 previous similar message
[ 122.406378] Lustre: Mounted lustre-client
[ 123.540533] Lustre: DEBUG MARKER: Using TIMEOUT=20
[ 171.949008] sched: RT throttling activated
[ 308.950021] WARNING: Pool 'lustre-ost2' has encountered an uncorrectable I/O failure and has been suspended.
[ 308.950055] WARNING: Pool 'lustre-ost1' has encountered an uncorrectable I/O failure and has been suspended.
[ 308.950070] WARNING: Pool 'lustre-mdt1' has encountered an uncorrectable I/O failure and has been suspended.
m:~# zpool status lustre-mdt1
pool: lustre-mdt1
state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: http://zfsonlinux.org/msg/ZFS-8000-HC
scan: none requested
config:
NAME STATE READ WRITE CKSUM
lustre-mdt1 ONLINE 0 0 0
/tmp/lustre-mdt1 ONLINE 0 0 0
errors: No known data errors |
| Comment by Gerrit Updater [ 05/Jan/18 ] |
|
Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/30649/ |
| Comment by Oleg Drokin [ 07/Jan/18 ] |
|
I had two panic events with the landed patch, both very similar - a sanityn test 77/78 triggers a "Sleep under spinlock" warning and then shortly after the zpool panic hits. [164923.664467] Lustre: DEBUG MARKER: == sanityn test 77c: check ORR NRS policy ============================================================ 23:24:29 (1515212669) [164937.316648] BUG: sleeping function called from invalid context at /home/green/git/lustre-release/libcfs/libcfs/hash.c:1120 [164937.319085] in_atomic(): 1, irqs_disabled(): 0, pid: 9459, name: lctl [164937.319812] INFO: lockdep is turned off. [164937.320473] CPU: 0 PID: 9459 Comm: lctl Tainted: P W OE ------------ 3.10.0-debug #2 [164937.322225] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [164937.322915] ffff8802e79d4b40 000000007ad8fd8d ffff8802c0183d18 ffffffff816fd3e4 [164937.324369] ffff8802c0183d30 ffffffff810b00e9 0000000000000018 ffff8802c0183d80 [164937.325689] ffffffffa0295233 ffffffffa0299cf6 ffffc90015e68000 0000000000001000 [164937.336350] Call Trace: [164937.337079] [<ffffffff816fd3e4>] dump_stack+0x19/0x1b [164937.337746] [<ffffffff810b00e9>] __might_sleep+0xe9/0x110 [164937.338436] [<ffffffffa0295233>] cfs_hash_putref+0x233/0x500 [libcfs] [164937.339120] [<ffffffffa0299cf6>] ? cfs_binheap_destroy+0x396/0x830 [libcfs] [164937.339927] [<ffffffffa11c1909>] nrs_orr_stop+0x69/0x270 [ptlrpc] [164937.340647] [<ffffffffa11b9d19>] nrs_policy_stop0+0x39/0x1c0 [ptlrpc] [164937.354153] [<ffffffffa11ba688>] nrs_policy_stop_primary.isra.9+0x78/0x1f0 [ptlrpc] [164937.355884] [<ffffffffa11bb9c4>] nrs_policy_start_locked+0x624/0x670 [ptlrpc] [164937.357170] [<ffffffffa11bbb3b>] nrs_policy_ctl+0x12b/0x2b0 [ptlrpc] [164937.357910] [<ffffffffa11be12f>] ptlrpc_nrs_policy_control+0x10f/0x2d0 [ptlrpc] [164937.359173] [<ffffffffa11a05b5>] ptlrpc_lprocfs_nrs_seq_write+0x3a5/0x560 [ptlrpc] [164937.360594] [<ffffffff8125d3cd>] proc_reg_write+0x3d/0x80 [164937.362287] [<ffffffff811ed36d>] vfs_write+0xbd/0x1e0 [164937.362943] [<ffffffff811fe3dd>] ? putname+0x3d/0x60 [164937.363595] [<ffffffff811ede34>] SyS_write+0x84/0xf0 [164937.364292] [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b [164937.364958] BUG: scheduling while atomic: lctl/9459/0x10000002 [164937.365627] INFO: lockdep is turned off. 
[164937.366239] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) ext4 mbcache loop zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate jbd2 syscopyarea sysfillrect sysimgblt ttm ata_generic drm_kms_helper pata_acpi drm i2c_piix4 ata_piix pcspkr virtio_balloon serio_raw virtio_blk virtio_console i2c_core libata floppy nfsd ip_tables rpcsec_gss_krb5 [last unloaded: libcfs] [164937.412757] CPU: 0 PID: 9459 Comm: lctl Tainted: P W OE ------------ 3.10.0-debug #2 [164937.414008] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [164937.416003] ffff8802e79d4b40 000000007ad8fd8d ffff8802c0183c90 ffffffff816fd3e4 [164937.418242] ffff8802c0183ca8 ffffffff816f9218 ffff8802c0183fd8 ffff8802c0183d08 [164937.419596] ffffffff817041a7 ffff8802e79d4b40 ffff8802c0183fd8 ffff8802c0183fd8 [164937.421025] Call Trace: [164937.421640] [<ffffffff816fd3e4>] dump_stack+0x19/0x1b [164937.422370] [<ffffffff816f9218>] __schedule_bug+0x59/0x68 [164937.423092] [<ffffffff817041a7>] __schedule+0x807/0x950 [164937.423866] [<ffffffff810b4936>] __cond_resched+0x26/0x30 [164937.424823] [<ffffffff817045da>] _cond_resched+0x3a/0x50 [164937.425472] [<ffffffffa0295238>] cfs_hash_putref+0x238/0x500 [libcfs] [164937.426355] [<ffffffffa0299cf6>] ? cfs_binheap_destroy+0x396/0x830 [libcfs] [164937.427348] [<ffffffffa11c1909>] nrs_orr_stop+0x69/0x270 [ptlrpc] [164937.428483] [<ffffffffa11b9d19>] nrs_policy_stop0+0x39/0x1c0 [ptlrpc] [164937.429204] [<ffffffffa11ba688>] nrs_policy_stop_primary.isra.9+0x78/0x1f0 [ptlrpc] [164937.430526] [<ffffffffa11bb9c4>] nrs_policy_start_locked+0x624/0x670 [ptlrpc] [164937.440376] [<ffffffffa11bbb3b>] nrs_policy_ctl+0x12b/0x2b0 [ptlrpc] [164937.440915] [<ffffffffa11be12f>] ptlrpc_nrs_policy_control+0x10f/0x2d0 [ptlrpc] [164937.441796] [<ffffffffa11a05b5>] ptlrpc_lprocfs_nrs_seq_write+0x3a5/0x560 [ptlrpc] [164937.442696] [<ffffffff8125d3cd>] proc_reg_write+0x3d/0x80 [164937.443151] [<ffffffff811ed36d>] vfs_write+0xbd/0x1e0 [164937.443621] [<ffffffff811fe3dd>] ? putname+0x3d/0x60 [164937.444235] [<ffffffff811ede34>] SyS_write+0x84/0xf0 [164937.444903] [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b [164937.510313] BUG: scheduling while atomic: lctl/9459/0x10000002 [164937.511243] INFO: lockdep is turned off. 
[164937.511901] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) ext4 mbcache loop zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate jbd2 syscopyarea sysfillrect sysimgblt ttm ata_generic drm_kms_helper pata_acpi drm i2c_piix4 ata_piix pcspkr virtio_balloon serio_raw virtio_blk virtio_console i2c_core libata floppy nfsd ip_tables rpcsec_gss_krb5 [last unloaded: libcfs] [164937.539224] CPU: 15 PID: 9459 Comm: lctl Tainted: P W OE ------------ 3.10.0-debug #2 [164937.540796] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [164937.544056] ffff8802e79d4b40 000000007ad8fd8d ffff8802c0183c90 ffffffff816fd3e4 [164937.578747] ffff8802c0183ca8 ffffffff816f9218 ffff8802c0183fd8 ffff8802c0183d08 [164937.581213] ffffffff817041a7 ffff8802e79d4b40 ffff8802c0183fd8 ffff8802c0183fd8 [164937.582630] Call Trace: [164937.583224] [<ffffffff816fd3e4>] dump_stack+0x19/0x1b [164937.583905] [<ffffffff816f9218>] __schedule_bug+0x59/0x68 [164937.584571] [<ffffffff817041a7>] __schedule+0x807/0x950 [164937.585220] [<ffffffff810b4936>] __cond_resched+0x26/0x30 [164937.585885] [<ffffffff817045da>] _cond_resched+0x3a/0x50 [164937.586567] [<ffffffffa0295238>] cfs_hash_putref+0x238/0x500 [libcfs] [164937.587246] [<ffffffffa0299cf6>] ? cfs_binheap_destroy+0x396/0x830 [libcfs] [164937.588052] [<ffffffffa11c1909>] nrs_orr_stop+0x69/0x270 [ptlrpc] [164937.588774] [<ffffffffa11b9d19>] nrs_policy_stop0+0x39/0x1c0 [ptlrpc] [164937.589496] [<ffffffffa11ba688>] nrs_policy_stop_primary.isra.9+0x78/0x1f0 [ptlrpc] [164937.590765] [<ffffffffa11bb9c4>] nrs_policy_start_locked+0x624/0x670 [ptlrpc] [164937.592025] [<ffffffffa11bbb3b>] nrs_policy_ctl+0x12b/0x2b0 [ptlrpc] [164937.592729] [<ffffffffa11be12f>] ptlrpc_nrs_policy_control+0x10f/0x2d0 [ptlrpc] [164937.593993] [<ffffffffa11a05b5>] ptlrpc_lprocfs_nrs_seq_write+0x3a5/0x560 [ptlrpc] [164937.595191] [<ffffffff8125d3cd>] proc_reg_write+0x3d/0x80 [164937.596143] [<ffffffff811ed36d>] vfs_write+0xbd/0x1e0 [164937.596779] [<ffffffff811fe3dd>] ? putname+0x3d/0x60 [164937.597413] [<ffffffff811ede34>] SyS_write+0x84/0xf0 [164937.598038] [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b [164937.614894] BUG: scheduling while atomic: lctl/9459/0x10000002 [164937.615623] INFO: lockdep is turned off. 
[164937.616240] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) ext4 mbcache loop zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate jbd2 syscopyarea sysfillrect sysimgblt ttm ata_generic drm_kms_helper pata_acpi drm i2c_piix4 ata_piix pcspkr virtio_balloon serio_raw virtio_blk virtio_console i2c_core libata floppy nfsd ip_tables rpcsec_gss_krb5 [last unloaded: libcfs] [164937.622741] CPU: 15 PID: 9459 Comm: lctl Tainted: P W OE ------------ 3.10.0-debug #2 [164937.623998] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [164937.624663] ffff8802e79d4b40 000000007ad8fd8d ffff8802c0183c90 ffffffff816fd3e4 [164937.626597] ffff8802c0183ca8 ffffffff816f9218 ffff8802c0183fd8 ffff8802c0183d08 [164937.627859] ffffffff817041a7 ffff8802e79d4b40 ffff8802c0183fd8 ffff8802c0183fd8 [164937.630788] Call Trace: [164937.631507] [<ffffffff816fd3e4>] dump_stack+0x19/0x1b [164937.632306] [<ffffffff816f9218>] __schedule_bug+0x59/0x68 [164937.633014] [<ffffffff817041a7>] __schedule+0x807/0x950 [164937.634560] [<ffffffff810b4936>] __cond_resched+0x26/0x30 [164937.635082] [<ffffffff817045da>] _cond_resched+0x3a/0x50 [164937.635744] [<ffffffffa0295238>] cfs_hash_putref+0x238/0x500 [libcfs] [164937.636594] [<ffffffffa0299cf6>] ? cfs_binheap_destroy+0x396/0x830 [libcfs] [164937.643706] [<ffffffffa11c1909>] nrs_orr_stop+0x69/0x270 [ptlrpc] [164937.644453] [<ffffffffa11b9d19>] nrs_policy_stop0+0x39/0x1c0 [ptlrpc] [164937.645174] [<ffffffffa11ba688>] nrs_policy_stop_primary.isra.9+0x78/0x1f0 [ptlrpc] [164937.656192] [<ffffffffa11bb9c4>] nrs_policy_start_locked+0x624/0x670 [ptlrpc] [164937.657543] [<ffffffffa11bbb3b>] nrs_policy_ctl+0x12b/0x2b0 [ptlrpc] [164937.658320] [<ffffffffa11be12f>] ptlrpc_nrs_policy_control+0x10f/0x2d0 [ptlrpc] [164937.659782] [<ffffffffa11a05b5>] ptlrpc_lprocfs_nrs_seq_write+0x3a5/0x560 [ptlrpc] [164937.661211] [<ffffffff8125d3cd>] proc_reg_write+0x3d/0x80 [164937.661885] [<ffffffff811ed36d>] vfs_write+0xbd/0x1e0 [164937.663536] [<ffffffff811fe3dd>] ? putname+0x3d/0x60 [164937.664179] [<ffffffff811ede34>] SyS_write+0x84/0xf0 [164937.666440] [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b [164937.681884] BUG: scheduling while atomic: lctl/9459/0x10000002 [164937.682619] INFO: lockdep is turned off. 
[164937.683241] Modules linked in: lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_zfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) ext4 mbcache loop zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate jbd2 syscopyarea sysfillrect sysimgblt ttm ata_generic drm_kms_helper pata_acpi drm i2c_piix4 ata_piix pcspkr virtio_balloon serio_raw virtio_blk virtio_console i2c_core libata floppy nfsd ip_tables rpcsec_gss_krb5 [last unloaded: libcfs] [164937.721046] CPU: 15 PID: 9459 Comm: lctl Tainted: P W OE ------------ 3.10.0-debug #2 [164937.722320] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [164937.722980] ffff8802e79d4b40 000000007ad8fd8d ffff8802c0183c90 ffffffff816fd3e4 [164937.724242] ffff8802c0183ca8 ffffffff816f9218 ffff8802c0183fd8 ffff8802c0183d08 [164937.748057] ffffffff817041a7 ffff8802e79d4b40 ffff8802c0183fd8 ffff8802c0183fd8 [164937.755299] Call Trace: [164937.755938] [<ffffffff816fd3e4>] dump_stack+0x19/0x1b [164937.757446] [<ffffffff816f9218>] __schedule_bug+0x59/0x68 [164937.764367] [<ffffffff817041a7>] __schedule+0x807/0x950 [164937.765081] [<ffffffff810b4936>] __cond_resched+0x26/0x30 [164937.765847] [<ffffffff817045da>] _cond_resched+0x3a/0x50 [164937.766658] [<ffffffffa0295238>] cfs_hash_putref+0x238/0x500 [libcfs] [164937.767358] [<ffffffffa0299cf6>] ? cfs_binheap_destroy+0x396/0x830 [libcfs] [164937.768142] [<ffffffffa11c1909>] nrs_orr_stop+0x69/0x270 [ptlrpc] [164937.768970] [<ffffffffa11b9d19>] nrs_policy_stop0+0x39/0x1c0 [ptlrpc] [164937.769730] [<ffffffffa11ba688>] nrs_policy_stop_primary.isra.9+0x78/0x1f0 [ptlrpc] [164937.772095] [<ffffffffa11bb9c4>] nrs_policy_start_locked+0x624/0x670 [ptlrpc] [164937.775802] [<ffffffffa11bbb3b>] nrs_policy_ctl+0x12b/0x2b0 [ptlrpc] [164937.776892] [<ffffffffa11be12f>] ptlrpc_nrs_policy_control+0x10f/0x2d0 [ptlrpc] [164937.778593] [<ffffffffa11a05b5>] ptlrpc_lprocfs_nrs_seq_write+0x3a5/0x560 [ptlrpc] [164937.779895] [<ffffffff8125d3cd>] proc_reg_write+0x3d/0x80 [164937.780630] [<ffffffff811ed36d>] vfs_write+0xbd/0x1e0 [164937.781701] [<ffffffff811fe3dd>] ? putname+0x3d/0x60 [164937.784249] [<ffffffff811ede34>] SyS_write+0x84/0xf0 [164937.784959] [<ffffffff8170fc49>] system_call_fastpath+0x16/0x1b [164941.981909] Kernel panic - not syncing: Pool 'lustre-ost2' has encountered an uncorrectable I/O failure and the failure mode property for this pool is set to panic. [164941.985000] CPU: 0 PID: 13624 Comm: mmp Tainted: P W OE ------------ 3.10.0-debug #2 [164941.986459] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [164941.987216] ffffffffa0c844c8 00000000ff7aa3be ffff8802cf5a3cc0 ffffffff816fd3e4 [164941.988699] ffff8802cf5a3d40 ffffffff816f8c34 ffff880200000010 ffff8802cf5a3d50 [164941.990160] ffff8802cf5a3cf0 00000000ff7aa3be ffffffffa0565213 ffff8802266ec000 [164941.991644] Call Trace: [164941.992359] [<ffffffff816fd3e4>] dump_stack+0x19/0x1b [164941.993104] [<ffffffff816f8c34>] panic+0xd8/0x1e7 [164941.993863] [<ffffffffa0565213>] ? __cv_timedwait_common+0x143/0x240 [spl] [164942.007808] [<ffffffffa0bdf279>] zio_suspend+0x1c9/0x1d0 [zfs] [164942.008631] [<ffffffffa0b56242>] mmp_thread+0x9c2/0xb30 [zfs] [164942.009436] [<ffffffffa0b55640>] ? metaslab_check_free+0x190/0x190 [zfs] [164942.010251] [<ffffffffa0b55880>] ? 
mmp_write_done+0x240/0x240 [zfs] [164942.011057] [<ffffffffa055f091>] thread_generic_wrapper+0x71/0x80 [spl] [164942.011793] [<ffffffffa055f020>] ? __thread_exit+0x20/0x20 [spl] [164942.012514] [<ffffffff810a2eba>] kthread+0xea/0xf0 [164942.013119] [<ffffffff810a2dd0>] ? kthread_create_on_node+0x140/0x140 [164942.013767] [<ffffffff8170fb98>] ret_from_fork+0x58/0x90 [164942.014394] [<ffffffff810a2dd0>] ? kthread_create_on_node+0x140/0x140 Hm, I was under the full impression there was a ticket for that scheduling-while-atomic issue in nrs_orr_stop but cannot find it. The other crash was in test 78 with a lot more warnings in between before hitting the panic, and Jira does not like such a long comment. |
| Comment by nasf (Inactive) [ 08/Jan/18 ] |
|
+1 on master: |
| Comment by John Hammond [ 08/Jan/18 ] |
|
The panic in https://testing.hpdd.intel.com/test_sets/15f738d8-f463-11e7-8c23-52540065bddc is due to MMP: [ 7799.830025] Kernel panic - not syncing: Pool 'lustre-mdt1' has encountered an uncorrectable I/O failure and the failure mode property for this pool is set to panic. [ 7799.831007] CPU: 1 PID: 15805 Comm: mmp Tainted: P OE ------------ 3.10.0-693.11.1.el7_lustre.x86_64 #1 [ 7799.831007] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 [ 7799.831007] ffff880041bda000 000000002c0bba5e ffff880056c73d28 ffffffff816a3e3d [ 7799.831007] ffff880056c73da8 ffffffff8169dd24 ffff880000000010 ffff880056c73db8 [ 7799.831007] ffff880056c73d58 000000002c0bba5e ffff880056c73dc0 ffff880041bda000 [ 7799.831007] Call Trace: [ 7799.831007] [<ffffffff816a3e3d>] dump_stack+0x19/0x1b [ 7799.831007] [<ffffffff8169dd24>] panic+0xe8/0x20d [ 7799.831007] [<ffffffffc08bbcb6>] zio_suspend+0x106/0x110 [zfs] [ 7799.831007] [<ffffffffc08427ea>] mmp_thread+0x7ea/0x820 [zfs] [ 7799.831007] [<ffffffffc0841e90>] ? metaslab_check_free+0x190/0x190 [zfs] [ 7799.831007] [<ffffffffc0842000>] ? mmp_write_done+0x170/0x170 [zfs] [ 7799.831007] [<ffffffffc070cfa1>] thread_generic_wrapper+0x71/0x80 [spl] [ 7799.831007] [<ffffffffc070cf30>] ? __thread_exit+0x20/0x20 [spl] [ 7799.831007] [<ffffffff810b099f>] kthread+0xcf/0xe0 [ 7799.831007] [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40 [ 7799.831007] [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90 [ 7799.831007] [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40 |
| Comment by Oleg Drokin [ 09/Jan/18 ] |
|
aha, LU-6849 is where I was putting my reports |
| Comment by Sebastien Buisson (Inactive) [ 10/Jan/18 ] |
|
Hit with sanity test_64d: Panic is due to MMP: [ 2212.716735] Kernel panic - not syncing: Pool 'lustre-ost1' has encountered an uncorrectable I/O failure and the failure mode property for this pool is set to panic. [ 2212.718347] CPU: 0 PID: 777 Comm: mmp Tainted: P OE ------------ 3.10.0-693.11.1.el7_lustre.x86_64 #1 [ 2212.719366] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 2212.719946] ffff880058852000 00000000554e0bda ffff88005335bd28 ffffffff816a3e3d [ 2212.720772] ffff88005335bda8 ffffffff8169dd24 ffff880000000010 ffff88005335bdb8 [ 2212.721598] ffff88005335bd58 00000000554e0bda ffff88005335bdc0 ffff880058852000 [ 2212.722421] Call Trace: [ 2212.722687] [<ffffffff816a3e3d>] dump_stack+0x19/0x1b [ 2212.723211] [<ffffffff8169dd24>] panic+0xe8/0x20d [ 2212.723744] [<ffffffffc07facb6>] zio_suspend+0x106/0x110 [zfs] [ 2212.724356] [<ffffffffc07817ea>] mmp_thread+0x7ea/0x820 [zfs] [ 2212.724969] [<ffffffffc0780e90>] ? metaslab_check_free+0x190/0x190 [zfs] [ 2212.725676] [<ffffffffc0781000>] ? mmp_write_done+0x170/0x170 [zfs] [ 2212.726312] [<ffffffffc064bfa1>] thread_generic_wrapper+0x71/0x80 [spl] [ 2212.726988] [<ffffffffc064bf30>] ? __thread_exit+0x20/0x20 [spl] [ 2212.727604] [<ffffffff810b099f>] kthread+0xcf/0xe0 [ 2212.728092] [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40 [ 2212.728724] [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90 [ 2212.729261] [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40 ZFS 0.7.3 |
| Comment by Nathaniel Clark [ 11/Jan/18 ] |
|
vdev_random_leaf() (which is used to pick a vdev to write to in mmp_thread) was rewritten as mmp_random_leaf() (see ZFS pull request 6665) for ZFS 0.7.5/0.7.4. Looking at the code, I don't see this being the issue that was fixed, since that change only prevents an infinite loop when an upper vdev is writeable but has no writable children. |
| Comment by Nathaniel Clark [ 11/Jan/18 ] |
|
b2_10, ZFS 0.7.5, sanityn/test_40d (MMP trips during test 40c, but nothing hangs until beginning of test 40d):
Jan 9 23:47:51 trevis-4vm7 kernel: WARNING: Pool 'lustre-ost2' has encountered an uncorrectable I/O failure and has been suspended.
Jan 9 23:47:52 trevis-4vm7 kernel: WARNING: Pool 'lustre-ost1' has encountered an uncorrectable I/O failure and has been suspended. |
| Comment by Sarah Liu [ 11/Jan/18 ] |
|
another one on 2.10.3:
[ 4280.251316] Lustre: DEBUG MARKER: == replay-ost-single test 5: Fail OST during iozone ================================================== 17:57:51 (1515607071)
[ 4288.629724] Lustre: DEBUG MARKER: grep -c /mnt/lustre-ost1' ' /proc/mounts
[ 4289.211154] Lustre: DEBUG MARKER: umount -d /mnt/lustre-ost1
[ 4289.591211] Lustre: Failing over lustre-OST0000
[ 4292.839279] Lustre: server umount lustre-OST0000 complete
[ 4293.002308] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
[ 4293.343927] Lustre: DEBUG MARKER: ! zpool list -H lustre-ost1 >/dev/null 2>&1 ||
grep -q ^lustre-ost1/ /proc/mounts ||
zpool export lustre-ost1
[ 4304.792431] Lustre: DEBUG MARKER: hostname
[ 4305.199104] Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs;
zpool list -H lustre-ost1 >/dev/null 2>&1 ||
zpool import -f -o cachefile=none -d /dev/lvm-Role_OSS lustre-ost1
[ 4306.781168] Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-ost1/ost1
[ 4307.103538] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-ost1; mount -t lustre lustre-ost1/ost1 /mnt/lustre-ost1
[ 4307.465621] Lustre: lustre-OST0000: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-180
[ 4307.633766] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
[ 4307.958667] Lustre: lustre-OST0000: Will be in recovery for at least 1:00, or until 6 clients reconnect
[ 4307.960237] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
[ 4308.555493] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-40vm8.onyx.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 4308.567312] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-40vm8.onyx.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 4308.764786] Lustre: DEBUG MARKER: onyx-40vm8.onyx.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 4308.765360] Lustre: DEBUG MARKER: onyx-40vm8.onyx.hpdd.intel.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 4308.940402] Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-ost1/ost1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
[ 4309.261238] Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-ost1/ost1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
[ 4309.590687] Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-ost1/ost1 2>/dev/null
[ 4310.483010] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-40vm7.onyx.hpdd.intel.com: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-*.ost_server_uuid
[ 4310.555992] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-40vm6.onyx.hpdd.intel.com: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-*.ost_server_uuid
[ 4310.686461] Lustre: DEBUG MARKER: onyx-40vm7.onyx.hpdd.intel.com: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-*.ost_server_uuid
[ 4310.773159] Lustre: DEBUG MARKER: onyx-40vm6.onyx.hpdd.intel.com: executing wait_import_state_mount FULL osc.lustre-OST0000-osc-*.ost_server_uuid
[ 4310.799378] Lustre: lustre-OST0000: Recovery over after 0:03, of 6 clients 6 recovered and 0 were evicted.
[ 4310.803867] Lustre: lustre-OST0000: deleting orphan objects from 0x300000401:34 to 0x300000401:417
[ 4310.808957] Lustre: lustre-OST0000: deleting orphan objects from 0x0:2436 to 0x0:2465
[ 4310.810587] Lustre: lustre-OST0000: deleting orphan objects from 0x300000400:34 to 0x300000400:417
[ 4310.812828] Lustre: lustre-OST0000: deleting orphan objects from 0x300000402:34 to 0x300000402:417
[ 4311.105228] Lustre: DEBUG MARKER: /usr/sbin/lctl mark osc.lustre-OST0000-osc-*.ost_server_uuid in FULL state after 0 sec
[ 4311.497754] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-*.ost_server_uuid in FULL state after 0 sec
[ 4312.069870] Lustre: DEBUG MARKER: /usr/sbin/lctl mark osc.lustre-OST0000-osc-*.ost_server_uuid in FULL state after 1 sec
[ 4312.375301] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-*.ost_server_uuid in FULL state after 1 sec
[ 6614.243576] WARNING: Pool 'lustre-ost1' has encountered an uncorrectable I/O failure and has been suspended.
[ 6614.285585] WARNING: Pool 'lustre-ost3' has encountered an uncorrectable I/O failure and has been suspended.
[ 6614.397584] WARNING: Pool 'lustre-ost7' has encountered an uncorrectable I/O failure and has been suspended.
[ 6614.501582] WARNING: Pool 'lustre-ost6' has encountered an uncorrectable I/O failure and has been suspended.
[ 6614.504581] WARNING: Pool 'lustre-ost8' has encountered an uncorrectable I/O failure and has been suspended.
[ 6614.510587] WARNING: Pool 'lustre-ost5' has encountered an uncorrectable I/O failure and has been suspended.
[ 6614.796590] WARNING: Pool 'lustre-ost2' has encountered an uncorrectable I/O failure and has been suspended.
[ 6615.011582] WARNING: Pool 'lustre-ost4' has encountered an uncorrectable I/O failure and has been suspended.
[ 6649.202587] LNet: Service thread pid 30548 was inactive for 40.12s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
[ 6649.204384] Pid: 30548, comm: ll_ost_io00_003
[ 6649.204827]
Call Trace:
[ 6649.205292] [<ffffffff816ab6b9>] schedule+0x29/0x70
[ 6649.205810] [<ffffffffc0647515>] cv_wait_common+0x125/0x150 [spl]
[ 6649.206485] [<ffffffff810b34b0>] ? autoremove_wake_function+0x0/0x40
[ 6649.207126] [<ffffffffc0647555>] __cv_wait+0x15/0x20 [spl]
[ 6649.207776] [<ffffffffc0796a2f>] txg_wait_synced+0xef/0x140 [zfs]
[ 6649.208409] [<ffffffffc074bdf5>] dmu_tx_wait+0x275/0x3c0 [zfs]
[ 6649.209018] [<ffffffffc074bfd1>] dmu_tx_assign+0x91/0x490 [zfs]
[ 6649.209628] [<ffffffffc0fbcf3a>] osd_trans_start+0xaa/0x3c0 [osd_zfs]
[ 6649.210273] [<ffffffffc11101cb>] ofd_trans_start+0x6b/0xe0 [ofd]
[ 6649.210880] [<ffffffffc11165eb>] ofd_commitrw_write+0x94b/0x1c50 [ofd]
[ 6649.211528] [<ffffffffc111a5a9>] ofd_commitrw+0x4b9/0xac0 [ofd]
[ 6649.212188] [<ffffffffc0e12eb7>] obd_commitrw+0x2ed/0x330 [ptlrpc]
[ 6649.212847] [<ffffffffc0de5d91>] tgt_brw_write+0xff1/0x17c0 [ptlrpc]
[ 6649.213489] [<ffffffff813300bb>] ? string.isra.7+0x3b/0xf0
[ 6649.214075] [<ffffffffc0d37e60>] ? target_bulk_timeout+0x0/0xb0 [ptlrpc]
[ 6649.214787] [<ffffffffc0de1da5>] tgt_request_handle+0x925/0x1370 [ptlrpc]
[ 6649.215494] [<ffffffffc0d8ab16>] ptlrpc_server_handle_request+0x236/0xa90 [ptlrpc]
[ 6649.216262] [<ffffffff810bc0f8>] ? __wake_up_common+0x58/0x90
[ 6649.216871] [<ffffffffc0d8e252>] ptlrpc_main+0xa92/0x1e40 [ptlrpc]
[ 6649.217527] [<ffffffffc0d8d7c0>] ? ptlrpc_main+0x0/0x1e40 [ptlrpc]
[ 6649.218155] [<ffffffff810b252f>] kthread+0xcf/0xe0
[ 6649.218647] [<ffffffff810b2460>] ? kthread+0x0/0xe0
[ 6649.219151] [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[ 6649.219687] [<ffffffff810b2460>] ? kthread+0x0/0xe0
[ 6649.220352] LustreError: dumping log to /tmp/lustre-log.1515609441.30548
[ 7204.082593] Lustre: 11314:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
req@ffff8800694ce050 x1589224788664784/t0(0) o4->1a4123cb-0667-4fc1-a984-578de9d574bd@10.2.8.124@tcp:350/0 lens 784/488 e 24 to 0 dl 1515610000 ref 2 fl Interpret:/0/0 rc 0/0
[ 7209.956336] Lustre: lustre-OST0000: Client 1a4123cb-0667-4fc1-a984-578de9d574bd (at 10.2.8.124@tcp) reconnecting
[ 7209.957438] Lustre: Skipped 3 previous similar messages
[ 7209.957986] Lustre: lustre-OST0000: Connection restored to 1bcb3ced-3143-3f53-430e-80a587c85a02 (at 10.2.8.124@tcp)
[ 7209.959086] Lustre: Skipped 23 previous similar messages
[ 7806.066639] Lustre: 11899:0:(service.c:1346:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (4/4), not sending early reply
req@ffff880069229c50 x1589224788689760/t0(0) o4->1a4123cb-0667-4fc1-a984-578de9d574bd@10.2.8.124@tcp:196/0 lens 608/448 e 1 to 0 dl 1515610601 ref 2 fl Interpret:/2/0 rc 0/0
[ 7810.960502] Lustre: lustre-OST0000: Client 1a4123cb-0667-4fc1-a984-578de9d574bd (at 10.2.8.124@tcp) reconnecting
[ 7810.961613] Lustre: lustre-OST0000: Connection restored to 1bcb3ced-3143-3f53-430e-80a587c85a02 (at 10.2.8.124@tcp)
|
| Comment by Bob Glossman (Inactive) [ 12/Jan/18 ] |
|
another on master: |
| Comment by John Hammond [ 15/Jan/18 ] |
|
Upstream zfsonlinux issue to track message improvement: https://github.com/zfsonlinux/zfs/issues/7045 |
| Comment by Andreas Dilger [ 16/Jan/18 ] |
|
Digging through the ZFS code a bit, I see there is a zfs_multihost_history tunable (default = 0) that can be set to track the history of MMP überblock writes. If the patches on https://github.com/zfsonlinux/zfs/issues/7045 do not solve the problem, it might be useful to collect the multihost stats in case of a test timeout/failure to see if there is some useful information (e.g. whether mmp_delay is increasing steadily or if it is a step function, what mmp_delay * vdev_count is vs. mmp_interval, etc). |
| Comment by Brian Behlendorf [ 16/Jan/18 ] |
|
Yes, it would be great if you could enable the multihost history so we can get a clear idea of exactly what's happening with the delay. You'll want to set zfs_multihost_history=100, where 100 means keep a history of the last 100 MMP writes. You can then dump the log by reading /proc/spl/kstat/zfs/<pool>/multihost.
echo 100 >/sys/module/zfs/parameters/zfs_multihost_history
cat /proc/spl/kstat/zfs/jet22-1/multihost
38 0 0x01 13 832 12020281026402 14414698950019
txg timestamp mmp_delay vdev_guid vdev_label vdev_path
...
28271769 1516145833 102524916 13587132990859797684 1 /dev/disk/by-vdev/L1
28271769 1516145833 102498562 18032937067254868643 0 /dev/disk/by-vdev/L56
28271769 1516145833 102470510 18032937067254868643 0 /dev/disk/by-vdev/L56
28271769 1516145833 102443525 18235850673591164513 1 /dev/disk/by-vdev/L15
28271769 1516145833 102435221 8353277438441160016 2 /dev/disk/by-vdev/L28 |
| Comment by Jian Yu [ 20/Jan/18 ] |
|
+1 on master branch: |
| Comment by Minh Diep [ 22/Jan/18 ] |
|
+1 on b2_10 |
| Comment by Minh Diep [ 23/Jan/18 ] |
|
FYI, we just added the parameter below in our infrastructure:
options zfs zfs_multihost_history=100 |
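For reference, a minimal sketch of how a module option like this is usually persisted; the file path is illustrative, and only the "options zfs ..." line comes from the comment above:
# /etc/modprobe.d/zfs.conf — picked up the next time the zfs module is loaded
options zfs zfs_multihost_history=100
# the same tunable can also be changed on a running system:
echo 100 > /sys/module/zfs/parameters/zfs_multihost_history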
| Comment by nasf (Inactive) [ 26/Jan/18 ] |
|
+1 on master: |
| Comment by Jian Yu [ 15/Feb/18 ] |
|
sanity test 124a hit this failure on master branch: |
| Comment by Jian Yu [ 21/Mar/18 ] |
|
+1 on master branch: |
| Comment by Joe Grund [ 06/Jun/18 ] |
|
Looks like we saw this in IML testing: https://github.com/intel-hpdd/intel-manager-for-lustre/issues/645 |
| Comment by Nathaniel Clark [ 13/Sep/18 ] |
|
With the upgrade to ZFS 0.7.9 and the increase in the multihost fail interval, this is no longer an issue. |
| Comment by James Nunez (Inactive) [ 12/Mar/19 ] |
|
We're seeing this error message again with ZFS testing. Looking at a recent 2.10.7 RC1 test session using ZFS 0.7.9-1, https://testing.whamcloud.com/test_sessions/a35ede3e-36a0-4a9f-92f6-e22a3cda1bec , we see this error in the OST (vm3) console for ost-pools test 23b ============================================================= 23:14:20 \(1552259660\) [14220.079856] Lustre: DEBUG MARKER: == ost-pools test 23b: OST pools and OOS ============================================================= 23:14:20 (1552259660) [14414.872497] WARNING: MMP writes to pool 'lustre-ost4' have not succeeded in over 20s; suspending pool [14414.874142] WARNING: Pool 'lustre-ost4' has encountered an uncorrectable I/O failure and has been suspended. [14414.874142] [14414.878399] WARNING: MMP writes to pool 'lustre-ost8' have not succeeded in over 20s; suspending pool [14414.880115] WARNING: Pool 'lustre-ost8' has encountered an uncorrectable I/O failure and has been suspended. [14414.880115] [14415.175128] WARNING: MMP writes to pool 'lustre-ost7' have not succeeded in over 20s; suspending pool [14415.176759] WARNING: Pool 'lustre-ost7' has encountered an uncorrectable I/O failure and has been suspended. [14415.176759] [14415.251535] WARNING: MMP writes to pool 'lustre-ost2' have not succeeded in over 20s; suspending pool [14415.253116] WARNING: Pool 'lustre-ost2' has encountered an uncorrectable I/O failure and has been suspended. [14415.253116] [14465.182431] LNet: Service thread pid 16167 was inactive for 78.05s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [14465.185338] Pid: 16167, comm: ll_ost_io00_006 3.10.0-957.1.3.el7_lustre.x86_64 #1 SMP Fri Mar 8 06:17:38 UTC 2019 [14465.187046] Call Trace: [14465.187545] [<ffffffffc03ae2b5>] cv_wait_common+0x125/0x150 [spl] [14465.188787] [<ffffffffc03ae2f5>] __cv_wait+0x15/0x20 [spl] [14465.189791] [<ffffffffc0505c0f>] txg_wait_synced+0xef/0x140 [zfs] [14465.190919] [<ffffffffc04bba15>] dmu_tx_wait+0x275/0x3c0 [zfs] [14465.191992] [<ffffffffc04bbbf1>] dmu_tx_assign+0x91/0x490 [zfs] [14465.193078] [<ffffffffc1123f57>] osd_trans_start+0xa7/0x3a0 [osd_zfs] [14465.194216] [<ffffffffc124e31e>] ofd_trans_start+0x6e/0xf0 [ofd] [14465.195327] [<ffffffffc125482b>] ofd_commitrw_write+0x94b/0x1c90 [ofd] [14465.196500] [<ffffffffc1258879>] ofd_commitrw+0x4c9/0xae0 [ofd] [14465.197563] [<ffffffffc0f8fd84>] obd_commitrw+0x2f3/0x336 [ptlrpc] [14465.198979] [<ffffffffc0f627cd>] tgt_brw_write+0xffd/0x17d0 [ptlrpc] [14465.200198] [<ffffffffc0f5e7ca>] tgt_request_handle+0x92a/0x1370 [ptlrpc] [14465.201458] [<ffffffffc0f0705b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc] [14465.202814] [<ffffffffc0f0a7a2>] ptlrpc_main+0xa92/0x1e40 [ptlrpc] [14465.203954] [<ffffffff92ac1c31>] kthread+0xd1/0xe0 [14465.204831] [<ffffffff93174c37>] ret_from_fork_nospec_end+0x0/0x39 [14465.205926] [<ffffffffffffffff>] 0xffffffffffffffff [14465.206820] LustreError: dumping log to /tmp/lustre-log.1552259905.16167 Do you think this is a new issue? Other failures for the same test, ost-pools test 23b hanging, this year are at |
| Comment by Andreas Dilger [ 13/Mar/19 ] |
|
A patch has been submitted to upstream ZFS master that increases the MMP interval/retry interval and improves other aspects of the code (https://github.com/zfsonlinux/zfs/issues/7709), and a smaller patch that just updates the interval/retry count, which seems suitable for inclusion into 0.7.x, is at https://github.com/zfsonlinux/zfs/pull/8495 . |
| Comment by Andreas Dilger [ 16/Mar/19 ] |
|
If this is becoming really problematic, we could disable MMP in our VM test configurations by setting "zfs_multihost_fail_intervals=0", but that should be a last resort as it will make our testing less representative of real deployments. At least the MMP feature would still be enabled and running, but a timeout would not take the pool offline. |
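For reference, a hedged sketch of how the MMP tunables interact; the 1000 ms interval default is from the ZFS 0.7.x module parameters, and the exact suspend condition should be verified against the installed zfs-module-parameters(5) man page:
# zfs_multihost_interval: milliseconds between MMP uberblock writes (default 1000)
# zfs_multihost_fail_intervals: number of missed intervals after which the pool is
#   suspended; 0 keeps MMP writes running but never suspends the pool on a timeout
cat /sys/module/zfs/parameters/zfs_multihost_interval
cat /sys/module/zfs/parameters/zfs_multihost_fail_intervals
echo 0 > /sys/module/zfs/parameters/zfs_multihost_fail_intervals   # the last-resort option described above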
| Comment by Minh Diep [ 20/Mar/19 ] |
|
+1 on b2_12 https://testing.whamcloud.com/test_sessions/e73dd201-fae1-4123-92ab-1961d681f34e |
| Comment by Andreas Dilger [ 20/Apr/19 ] |
|
+1 on master conf-sanity test_32c: |
| Comment by Nathaniel Clark [ 13/May/19 ] |
|
I've asked in ATM-827 that Autotest bump zfs_multihost_fail_intervals to 60, which is twice the default SCSI timeout. |
| Comment by Nathaniel Clark [ 22/May/19 ] |
|
zfs_multihost_fail_intervals has been bumped to 60 seconds in autotest via ATM-827. This should address the zpool MMP timeouts. |
| Comment by Homer Li (Inactive) [ 23/May/19 ] |
|
Hi Nathaniel, does this cause zpool import to be too slow on the other OSS of an HA pair? I mean that an HA failover would have to wait a long time in an environment with 100+ SAS devices. |
| Comment by Nathaniel Clark [ 23/May/19 ] |
|
This setting is for our test environment only. Since they are VMs, they can be pretty heavily I/O-contended on the underlying host. |
| Comment by Homer Li (Inactive) [ 24/May/19 ] |
|
Hi Nathaniel. Oh, got it. |
| Comment by Sarah Liu [ 07/Aug/19 ] |
|
Hit the same problem when doing soak testing on b2_12-ib build #31. The soak-7 console shows:
WARNING: MMP writes to pool 'soaked-ost11' have not succeeded in over 5s; suspending pool
WARNING: Pool 'soaked-ost11' has encountered an uncorrectable I/O failure and has been suspended. |
| Comment by Andreas Dilger [ 28/Aug/20 ] |
It looks like the zfs_multihost_fail_intervals=60 setting was not in effect, if the suspend happened after only 5s. |