Details
- Type: Bug
- Resolution: Cannot Reproduce
- Priority: Major
- Affects Version/s: Lustre 2.10.0, Lustre 2.10.1, Lustre 2.11.0, Lustre 2.10.7
- Environment: onyx-32vm1-8, Full Group test, RHEL7.3/zfs, branch master, v2.9.54, b3541
- Severity: 3
Description
https://testing.hpdd.intel.com/test_sessions/afc7f4b0-0af4-11e7-8c9f-5254006e85c2
It appears that ZFS was hung, which caused this timeout. A couple of indications of this:
test_log:
Starting ost1: lustre-ost1/ost1 /mnt/lustre-ost1
CMD: onyx-32vm8 mkdir -p /mnt/lustre-ost1; mount -t lustre lustre-ost1/ost1 /mnt/lustre-ost1
onyx-32vm8: e2label: No such file or directory while trying to open lustre-ost1/ost1
onyx-32vm8: Couldn't find valid filesystem superblock.
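When a ZFS-backed target fails to mount like this, the state of the backing pool can be checked directly on the OSS before blaming the Lustre mount itself. A minimal sketch, assuming the standard OpenZFS `zpool`/`zfs` CLI is installed and the dataset name matches the one in the log above:

```shell
# Pool health: a suspended pool or hung I/O shows up here first.
zpool status -v

# Confirm the backing dataset from the mount command actually exists.
zfs list lustre-ost1/ost1
```

Note that the `e2label` complaint is from the ext-oriented mount helper path and is not by itself the root failure here; the hang shown on the OST console below is.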
OST console:
10:35:06:[31399.498089] txg_sync        D 0000000000000001     0 27626      2 0x00000080
10:35:06:[31399.498090]  ffff880049607ac0 0000000000000046 ffff88003d98edd0 ffff880049607fd8
10:35:06:[31399.498091]  ffff880049607fd8 ffff880049607fd8 ffff88003d98edd0 ffff88007fc16c40
10:35:06:[31399.498092]  0000000000000000 7fffffffffffffff ffff88005ac587a8 0000000000000001
10:35:06:[31399.498092] Call Trace:
10:35:06:[31399.498093]  [<ffffffff8168bac9>] schedule+0x29/0x70
10:35:06:[31399.498095]  [<ffffffff81689519>] schedule_timeout+0x239/0x2d0
10:35:06:[31399.498096]  [<ffffffff810c4fe2>] ? default_wake_function+0x12/0x20
10:35:06:[31399.498098]  [<ffffffff810ba238>] ? __wake_up_common+0x58/0x90
10:35:06:[31399.498101]  [<ffffffff81060c1f>] ? kvm_clock_get_cycles+0x1f/0x30
10:35:06:[31399.498103]  [<ffffffff8168b06e>] io_schedule_timeout+0xae/0x130
10:35:06:[31399.498104]  [<ffffffff810b1416>] ? prepare_to_wait_exclusive+0x56/0x90
10:35:06:[31399.498106]  [<ffffffff8168b108>] io_schedule+0x18/0x20
10:35:06:[31399.498109]  [<ffffffffa0677617>] cv_wait_common+0xa7/0x130 [spl]
10:35:06:[31399.498111]  [<ffffffff810b1720>] ? wake_up_atomic_t+0x30/0x30
10:35:06:[31399.498114]  [<ffffffffa06776f8>] __cv_wait_io+0x18/0x20 [spl]
10:35:06:[31399.498150]  [<ffffffffa07d151b>] zio_wait+0x10b/0x1f0 [zfs]
10:35:06:[31399.498169]  [<ffffffffa075acdf>] dsl_pool_sync+0xbf/0x440 [zfs]
10:35:06:[31399.498187]  [<ffffffffa0775868>] spa_sync+0x388/0xb50 [zfs]
10:35:06:[31399.498189]  [<ffffffff810b174b>] ? autoremove_wake_function+0x2b/0x40
10:35:06:[31399.498191]  [<ffffffff81689c72>] ? mutex_lock+0x12/0x2f
10:35:06:[31399.498208]  [<ffffffffa07874e5>] txg_sync_thread+0x3c5/0x620 [zfs]
10:35:06:[31399.498226]  [<ffffffffa0787120>] ? txg_init+0x280/0x280 [zfs]
10:35:06:[31399.498229]  [<ffffffffa0672851>] thread_generic_wrapper+0x71/0x80 [spl]
10:35:06:[31399.498232]  [<ffffffffa06727e0>] ? __thread_exit+0x20/0x20 [spl]
10:35:06:[31399.498234]  [<ffffffff810b064f>] kthread+0xcf/0xe0
10:35:06:[31399.498235]  [<ffffffff810b0580>] ? kthread_create_on_node+0x140/0x140
10:35:06:[31399.498237]  [<ffffffff81696958>] ret_from_fork+0x58/0x90
10:35:06:[31399.498239]  [<ffffffff810b0580>] ? kthread_create_on_node+0x140/0x140
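A `txg_sync` thread stuck in `zio_wait` like this sits in uninterruptible sleep (task state `D`), so on a live node the hang can be spotted without waiting for the hung-task watchdog. A minimal sketch, assuming a Linux node with procps `ps` available:

```shell
# List threads in uninterruptible (D) state, as txg_sync is in the trace above.
# Columns: pid, state, wchan (kernel function the task sleeps in), command.
ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /^D/'
```

On a node where the hang is still live, `echo w > /proc/sysrq-trigger` (as root) asks the kernel to dump the stacks of all blocked tasks to the console/dmesg, which is what produces traces like the OST console output above.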
Issue Links
- is related to: LU-8601 sanity test_230d: Timeout on ZFS backed MDSs (Resolved)
- is related to: LU-4950 sanity-benchmark test fsx hung: txg_sync was stuck on OSS (Closed)