Details
-
Bug
-
Resolution: Fixed
-
Major
-
None
-
Lustre 2.8.0
-
Hyperion/SWL 2.7.61 review build 35536 (patch http://review.whamcloud.com/17053 - Revert "LU-4865 zfs: grow block size by write pattern")
-
3
-
9223372036854775807
Description
While running SWL, the OSS hits repeated service thread timeouts:
Nov 5 15:23:57 iws9 kernel: LNet: Service thread pid 23042 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Nov 5 15:23:57 iws9 kernel: Pid: 23042, comm: ll_ost00_004
Nov 5 15:23:57 iws9 kernel:
Nov 5 15:23:57 iws9 kernel: Call Trace:
Nov 5 15:23:57 iws9 kernel: [<ffffffffa067c380>] ? vdev_mirror_child_done+0x0/0x30 [zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffff815395c3>] io_schedule+0x73/0xc0
Nov 5 15:23:57 iws9 kernel: [<ffffffffa05b2f8f>] cv_wait_common+0xaf/0x130 [spl]
Nov 5 15:23:57 iws9 kernel: [<ffffffff810a1460>] ? autoremove_wake_function+0x0/0x40
Nov 5 15:23:57 iws9 kernel: [<ffffffffa05b3028>] __cv_wait_io+0x18/0x20 [spl]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa06bd2eb>] zio_wait+0x10b/0x1e0 [zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa0614939>] dbuf_read+0x439/0x850 [zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa0614ef1>] __dbuf_hold_impl+0x1a1/0x4f0 [zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa06152bd>] dbuf_hold_impl+0x7d/0xb0 [zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa0616790>] dbuf_hold+0x20/0x30 [zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa061d0d7>] dmu_buf_hold_noread+0x87/0x140 [zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa061d1cb>] dmu_buf_hold+0x3b/0x90 [zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa0612fb8>] ? dbuf_rele_and_unlock+0x268/0x400 [zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa0686e5a>] zap_lockdir+0x5a/0x770 [zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffff81178fcd>] ? kmem_cache_alloc_node_trace+0x1cd/0x200
Nov 5 15:23:57 iws9 kernel: [<ffffffffa06889ca>] zap_lookup_norm+0x4a/0x190 [zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa0688ba3>] zap_lookup+0x33/0x40 [zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa062cc76>] dmu_tx_hold_zap+0x146/0x210 [zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa1034255>] osd_declare_object_create+0x2a5/0x440 [osd_zfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa11738e4>] ofd_precreate_objects+0x4e4/0x19d0 [ofd]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa04b4b61>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa1180a9b>] ? ofd_grant_create+0x23b/0x3e0 [ofd]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa116384e>] ofd_create_hdl+0x56e/0x2640 [ofd]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa0c28e80>] ? lustre_pack_reply_v2+0x220/0x280 [ptlrpc]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa0c930ec>] tgt_request_handle+0x8bc/0x12e0 [ptlrpc]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa0c3a9e1>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
Nov 5 15:23:57 iws9 kernel: [<ffffffffa0c39ba0>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
Nov 5 15:23:57 iws9 kernel: [<ffffffff810a0fce>] kthread+0x9e/0xc0
Nov 5 15:23:57 iws9 kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
Nov 5 15:23:57 iws9 kernel: [<ffffffff810a0f30>] ? kthread+0x0/0xc0
Nov 5 15:23:57 iws9 kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
Lustre-log dump attached
Attachments
Issue Links
- is duplicated by
  - LU-7602 Repeated timeouts with ZFS 0.6.5.2 (Resolved)
- is related to
  - LU-6750 missing stop callback in osd-zfs (Resolved)
  - LU-7987 Lustre 2.8 OSS with zfs 0.6.5 backend hitting most schedule_timeout (Closed)
- is related to
  - LU-7153 Update ZFS/SPL version to 0.6.5.2 (Resolved)
  - LU-4865 osd-zfs: increase object block size dynamically as object grows (Resolved)
Hi Nathaniel,
We tried 0.6.5.4 before and it didn't help.
Only ZFS master includes the patches the upstream ZFS developer mentioned. We tried that on Hyperion yesterday; unfortunately, it didn't help either.