[LU-7602] Repeated timeouts with ZFS 0.6.5.2 Created: 23/Dec/15 Updated: 23/Dec/15 Resolved: 23/Dec/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Cliff White (Inactive) | Assignee: | Jian Yu |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Hyperion/SWL - |
||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Rank (Obsolete): | 9223372036854775807 | ||||||||
| Description |
|
This bug was created to track activity from http://review.whamcloud.com/17712
ZFS 0.6.5.2 is known to introduce I/O problems.
Dec 23 11:47:33 iws2 kernel: LNet: Service thread pid 30734 was inactive for 200.00s. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Dec 23 11:47:33 iws2 kernel: Pid: 30734, comm: ll_ost00_000
Dec 23 11:47:33 iws2 kernel:
Dec 23 11:47:33 iws2 kernel: Call Trace:
Dec 23 11:47:33 iws2 kernel: [<ffffffffa06cb330>] ? vdev_mirror_child_done+0x0/0x30 [zfs]
Dec 23 11:47:33 iws2 kernel: [<ffffffff815395d3>] io_schedule+0x73/0xc0
Dec 23 11:47:33 iws2 kernel: [<ffffffffa05a3eaf>] cv_wait_common+0xaf/0x130 [spl]
Dec 23 11:47:33 iws2 kernel: [<ffffffff810a1460>] ? autoremove_wake_function+0x0/0x40
Dec 23 11:47:33 iws2 kernel: [<ffffffffa05a3f48>] __cv_wait_io+0x18/0x20 [spl]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa070c29b>] zio_wait+0x10b/0x1e0 [zfs]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa06638a9>] dbuf_read+0x439/0x850 [zfs]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa066c168>] dmu_buf_hold+0x68/0x90 [zfs]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa0661fa8>] ? dbuf_rele_and_unlock+0x268/0x390 [zfs]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa06d5e0a>] zap_lockdir+0x5a/0x770 [zfs]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa06d797a>] zap_lookup_norm+0x4a/0x190 [zfs]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa06d7b53>] zap_lookup+0x33/0x40 [zfs]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa067bbe6>] dmu_tx_hold_zap+0x146/0x210 [zfs]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa107b3b5>] osd_declare_object_create+0x2d5/0x440 [osd_zfs]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa11bba24>] ofd_precreate_objects+0x4e4/0x19d0 [ofd]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa04bc6c1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa11c8bdb>] ? ofd_grant_create+0x23b/0x3e0 [ofd]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa11ab83e>] ofd_create_hdl+0x56e/0x2640 [ofd]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa0bbefe0>] ? lustre_pack_reply_v2+0x220/0x280 [ptlrpc]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa0c294cc>] tgt_request_handle+0x8ec/0x1470 [ptlrpc]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa0bd0b41>] ptlrpc_main+0xe41/0x1910 [ptlrpc]
Dec 23 11:47:33 iws2 kernel: [<ffffffffa0bcfd00>] ? ptlrpc_main+0x0/0x1910 [ptlrpc]
Dec 23 11:47:33 iws2 kernel: [<ffffffff810a0fce>] kthread+0x9e/0xc0
Dec 23 11:47:33 iws2 kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
Dec 23 11:47:33 iws2 kernel: [<ffffffff810a0f30>] ? kthread+0x0/0xc0
Dec 23 11:47:33 iws2 kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
Dec 23 11:47:33 iws2 kernel: |
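The trace above is easier to scan once each frame is split into its address, symbol, and module. What follows is a minimal Python sketch, not part of the original report. The regular expression, the `Frame` type, and the `parse_frame` helper are illustrative assumptions based only on the line format shown in this log. Frames printed with a leading `?` are ones the kernel unwinder could not confirm, so the sketch flags them as unreliable.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    address: str
    reliable: bool          # False when the kernel printed "?" before the symbol
    symbol: str
    offset: int             # offset of the return address inside the function
    size: int               # total size of the function
    module: Optional[str]   # None for symbols built into the core kernel


# Assumed format: "[<addr>] [? ]symbol+0xOFF/0xSIZE [module]"
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<q>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_frame(line: str) -> Optional[Frame]:
    """Return a Frame for a call-trace line, or None for any other log line."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return Frame(
        address=m.group("addr"),
        reliable=m.group("q") is None,
        symbol=m.group("sym"),
        offset=int(m.group("off"), 16),
        size=int(m.group("size"), 16),
        module=m.group("mod"),
    )


# Three lines copied from the trace in this ticket.
sample = [
    "Dec 23 11:47:33 iws2 kernel: [<ffffffffa06cb330>] ? vdev_mirror_child_done+0x0/0x30 [zfs]",
    "Dec 23 11:47:33 iws2 kernel: [<ffffffff815395d3>] io_schedule+0x73/0xc0",
    "Dec 23 11:47:33 iws2 kernel: [<ffffffffa070c29b>] zio_wait+0x10b/0x1e0 [zfs]",
]
frames = [f for f in map(parse_frame, sample) if f is not None]
for f in frames:
    mark = " " if f.reliable else "?"
    print(f"{mark} {f.symbol:<28} {f.module or 'kernel'}")
```

Filtering on `reliable` leaves the chain `io_schedule`, `cv_wait_common`, `__cv_wait_io`, `zio_wait`, `dbuf_read`, which shows the thread blocked in a ZFS read issued while `osd_declare_object_create` declares the transaction.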
| Comments |
| Comment by Jian Yu [ 23/Dec/15 ] |
|
Hi Cliff, patch http://review.whamcloud.com/17712 hit a build failure on the sles11sp2 server. I created TEI-4369 to disable the build. In the meantime, since builds on other distros passed, could you please verify whether the timeout issue is resolved after resetting the ZFS baseline to 0.6.4.2? Thank you. |
| Comment by Andreas Dilger [ 23/Dec/15 ] |
|
Cliff, do you have the stack traces for all the threads on the OSS? It seems this ll_ost00_000 thread is waiting for the ZFS TXG to commit, but it would be useful to know what the other threads are doing in the meantime. |
| Comment by Andreas Dilger [ 23/Dec/15 ] |
|
Closing this as a duplicate of |
| Comment by Cliff White (Inactive) [ 23/Dec/15 ] |
|
I dumped the stacks on iws2. It has been a while since the error; this file includes all the timeout stacks.
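A dump that holds many traces is easier to review when grouped by thread. The Python sketch below is a hedged illustration, not a tool used in this ticket. It assumes the file follows the same `Pid: N, comm: NAME` header and frame format as the trace in the description, and the `summarize` helper is a hypothetical name. The sample lines are copied from that trace, so only one thread appears here.

```python
import re

# Assumed header and frame formats, taken from the trace in the description.
HEADER_RE = re.compile(r"Pid: (?P<pid>\d+), comm: (?P<comm>\S+)")
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(?P<q>\?\s+)?(?P<sym>[\w.]+)\+0x")


def summarize(lines):
    """Map (pid, comm) to its reliable symbols, innermost frame first."""
    threads = {}
    current = None
    for line in lines:
        header = HEADER_RE.search(line)
        if header:
            current = (int(header.group("pid")), header.group("comm"))
            threads.setdefault(current, [])
            continue
        frame = FRAME_RE.search(line)
        # Skip frames marked "?" and any frame seen before a thread header.
        if frame and current is not None and frame.group("q") is None:
            threads[current].append(frame.group("sym"))
    return threads


log = [
    "Dec 23 11:47:33 iws2 kernel: Pid: 30734, comm: ll_ost00_000",
    "Dec 23 11:47:33 iws2 kernel: Call Trace:",
    "Dec 23 11:47:33 iws2 kernel: [<ffffffffa06cb330>] ? vdev_mirror_child_done+0x0/0x30 [zfs]",
    "Dec 23 11:47:33 iws2 kernel: [<ffffffff815395d3>] io_schedule+0x73/0xc0",
    "Dec 23 11:47:33 iws2 kernel: [<ffffffffa05a3eaf>] cv_wait_common+0xaf/0x130 [spl]",
    "Dec 23 11:47:33 iws2 kernel: [<ffffffffa070c29b>] zio_wait+0x10b/0x1e0 [zfs]",
]
summary = summarize(log)
for (pid, comm), symbols in summary.items():
    print(pid, comm, "->", " <- ".join(symbols))
```

Run over a full dump, the same mapping makes it quick to see how many service threads share an innermost frame such as `io_schedule`.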