Lustre - LU-8806

LFSCK hangs on MDT - osp_precreate_cleanup_orphans - cannot cleanup orphans: rc = -107

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • None
    • Affects Version/s: Lustre 2.9.0, Lustre 2.10.0, Lustre 2.10.3
    • Environment: Soak cluster, tip of Lustre-master lustre: 2.8.59_79_gb8811a0
    • Severity: 3

    Description

      After a failover, lfsck is started.
      2016-11-04 15:18:59,065:fsmgmt.fsmgmt:INFO lfsck started on lola-8
      lfsck never completes.
      2016-11-06 01:08:25,976:fsmgmt.fsmgmt:INFO lfsck still in progress for soaked-MDT0000 after 122865s
      Examining the logs, we see that the connection to OST0002 is going up and down, and orphans cannot be cleaned up.

      Nov  4 15:18:32 lola-8 kernel: LustreError: 11-0: soaked-OST0017-osc-MDT0000: operation obd_ping to node 192.168.1.106@o2ib10 failed: rc = -107
      Nov  4 15:18:36 lola-8 kernel: LustreError: 11-0: soaked-OST0005-osc-MDT0000: operation ost_statfs to node 192.168.1.106@o2ib10 failed: rc = -107
      Nov  4 15:18:46 lola-8 kernel: LustreError: 6010:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) soaked-OST0002-osc-MDT0000: cannot cleanup orphans: rc = -107
      Nov  4 15:18:57 lola-8 kernel: LustreError: 11-0: soaked-OST000b-osc-MDT0000: operation obd_ping to node 192.168.1.106@o2ib10 failed: rc = -107
      Nov  4 15:18:57 lola-8 kernel: LustreError: Skipped 1 previous similar message
      Nov  4 15:19:27 lola-8 kernel: Lustre: 5851:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478297366/real 1478297366]  req@ffff8807e61689c0 x1550005401109488/t0(0) o6->soaked-OST0002-osc-MDT0000@192.168.1.104@o2ib10:28/4 lens 664/432 e 23 to 1 dl 1478297967 ref 1 fl Rpc:X/2/ffffffff rc -11/-1
      Nov  4 15:19:27 lola-8 kernel: Lustre: soaked-OST0002-osc-MDT0000: Connection to soaked-OST0002 (at 192.168.1.104@o2ib10) was lost; in progress operations using this service will wait for recovery to complete
      Nov  4 15:19:27 lola-8 kernel: Lustre: Skipped 9 previous similar messages
      Nov  4 15:19:27 lola-8 kernel: Lustre: soaked-OST0002-osc-MDT0000: Connection restored to 192.168.1.104@o2ib10 (at 192.168.1.104@o2ib10)
      Nov  4 15:19:27 lola-8 kernel: Lustre: Skipped 18 previous similar messages
      Nov  4 15:19:27 lola-8 kernel: Lustre: 5851:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 38 previous similar messages
      Nov  4 15:29:28 lola-8 kernel: Lustre: 5830:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478297967/real 1478297967]  req@ffff8803e013d380 x1550005401094688/t0(0) o1000->soaked-OST0002-osc-MDT0000@192.168.1.104@o2ib10:24/4 lens 520/4320 e 23 to 1 dl 1478298568 ref 1 fl Rpc:X/2/ffffffff rc -11/-1
      Nov  4 15:29:28 lola-8 kernel: Lustre: 5830:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 1 previous similar message
      Nov  4 15:29:28 lola-8 kernel: Lustre: soaked-OST0002-osc-MDT0000: Connection to soaked-OST0002 (at 192.168.1.104@o2ib10) was lost; in progress operations using this service will wait for recovery to complete
      Nov  4 15:29:28 lola-8 kernel: Lustre: Skipped 1 previous similar message
      Nov  4 15:29:28 lola-8 kernel: Lustre: soaked-OST0002-osc-MDT0000: Connection restored to 192.168.1.104@o2ib10 (at 192.168.1.104@o2ib10)
      Nov  4 15:29:28 lola-8 kernel: Lustre: Skipped 1 previous similar message
      Nov  4 15:31:22 lola-8 kernel: LustreError: 6010:0:(osp_precreate.c:903:osp_precreate_cleanup_orphans()) soaked-OST0002-osc-MDT0000: cannot cleanup orphans: rc = -107
      Nov  4 15:39:29 lola-8 kernel: Lustre: 5840:0:(client.c:2111:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1478298568/real 1478298568]  req@ffff8803c0c980c0 x1550005401096032/t0(0) o1000->soaked-OST0002-osc-MDT0000@192.168.1.104@o2ib10:24/4 lens 520/4320 e 23 to 1 dl 1478299169 ref 1 fl Rpc:X/2/ffffffff rc -11/-1
      Nov  4 15:39:29 lola-8 kernel: Lustre: 5840:0:(client.c:2111:ptlrpc_expire_one_request()) Skipped 2 previous similar messages
      Nov  4 15:39:29 lola-8 kernel: Lustre: soaked-OST0002-osc-MDT0000: Connection to soaked-OST0002 (at 192.168.1.104@o2ib10) was lost; in progress operations using this service will wait for recovery to complete
      

      During this time, lola-4 was experiencing difficulties communicating with the MDS; these had begun prior to the start of the lfsck.

      Nov  4 14:38:26 lola-4 kernel: Lustre: soaked-OST0002: Export ffff8802e106a800 already connecting from 192.168.1.113@o2ib100
      Nov  4 14:38:26 lola-4 kernel: Lustre: Skipped 12 previous similar messages
      Nov  4 14:39:17 lola-4 kernel: Lustre: 33277:0:(service.c:1331:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
      Nov  4 14:39:17 lola-4 kernel:  req@ffff88052335ec50 x1550005401085472/t0(0) o1000->soaked-MDT0000-mdtlov_UUID@192.168.1.108@o2ib10:-1/-1 lens 520/0 e 0 to 0 dl 1478295562 ref 2 fl New:/2/ffffffff rc 0/-1
      Nov  4 14:39:17 lola-4 kernel: Lustre: 33277:0:(service.c:1331:ptlrpc_at_send_early_reply()) Skipped 95 previous similar messages
      Nov  4 14:39:23 lola-4 kernel: Lustre: soaked-OST0002: Client soaked-MDT0000-mdtlov_UUID (at 192.168.1.108@o2ib10) reconnecting
      Nov  4 14:39:23 lola-4 kernel: Lustre: Skipped 13 previous similar messages
      Nov  4 14:39:23 lola-4 kernel: Lustre: soaked-OST0002: Connection restored to soaked-MDT0000-mdtlov_UUID (at 192.168.1.108@o2ib10)
      Nov  4 14:49:16 lola-4 kernel: Lustre: soaked-OST0002: Export ffff8802e106a800 already connecting from 192.168.1.113@o2ib100
      Nov  4 14:49:16 lola-4 kernel: Lustre: Skipped 12 previous similar messages
      Nov  4 14:49:18 lola-4 kernel: Lustre: 33277:0:(service.c:1331:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
      Nov  4 14:49:18 lola-4 kernel:  req@ffff88050f9c2450 x1550005401098800/t0(0) o1000->soaked-MDT0000-mdtlov_UUID@192.168.1.108@o2ib10:-1/-1 lens 520/0 e 0 to 0 dl 1478296163 ref 2 fl New:/2/ffffffff rc 0/-1
      Nov  4 14:49:18 lola-4 kernel: Lustre: 33277:0:(service.c:1331:ptlrpc_at_send_early_reply()) Skipped 96 previous similar messages
      Nov  4 14:49:24 lola-4 kernel: Lustre: soaked-OST0002: Client soaked-MDT0000-mdtlov_UUID (at 192.168.1.108@o2ib10) reconnecting
      

      lola-4 continued dumping stacks; finally, the cluster was restarted.
      Prior to the restart, lctl debug logs were dumped from lola-4 and lola-8; these are attached along with the message logs.
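
      For reference, the rc values in these messages are negative Linux errno codes. A minimal standalone C snippet decodes the ones that recur in this ticket:

      /* Decode the recurring return codes from the logs (negative errno values). */
      #include <stdio.h>
      #include <string.h>

      int main(void)
      {
              const int codes[] = { 107, 110, 11, 4 };   /* rc = -107, -110, -11, -4 */

              for (size_t i = 0; i < sizeof(codes) / sizeof(codes[0]); i++)
                      printf("rc = -%d -> %s\n", codes[i], strerror(codes[i]));
              return 0;
      }

      On Linux this prints ENOTCONN ("Transport endpoint is not connected"), ETIMEDOUT, EAGAIN and EINTR respectively, which matches the pattern above: every rc = -107 coincides with a lost connection to the OST.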

      Attachments

        1. lola-4.syslog.txt
          1.56 MB
        2. lola-8.lustre-log.txt.gz
          889 kB
        3. lola-8.syslog.txt
          3.46 MB
        4. oss.lola-4.lustre-log.txt.gz
          2.63 MB

        Issue Links

          Activity

            [LU-8806] LFSCK hangs on MDT - osp_precreate_cleanup_orphans - cannot cleanup orphans: rc = -107

            Peter,

            I see no updates for this patch for a year. If you are talking about Alex's patch, it is just a part of this solution, and I will add myself to its watchers.
            As for an alternative solution, I don't see a good one for now. This is just a notification that 2.7, 2.8 and 2.9 are affected, not 2.10/2.11.

            shadow Alexey Lyashkov added a comment -
            pjones Peter Jones added a comment -

            Alexey

            This issue is presently targeted for the 2.11 release and a draft patch of one possible patch exists already. Feel free to provide an alternative approach or feedback on any testing you conduct on the existing prototype.

            Peter


            We hit this issue with 2.7 (IEEL3) code.
            The OST deadlocked during an lfsck operation.

            Stack : 
            __schedule
            schedule
            rwsem_down_write_failed
            call_rwsem_down_write_failed
            osd_object_write_lock
            out_tx_attr_set_exec
            out_tx_end
            out_handle
            tgt_request_handle
            ptlrpc_server_handle_request
            ptlrpc_main
            kthread
            ret_from_fork
            Progs:  2213 "ll_ost_out00_00
            

            Any plans to fix it?

            shadow Alexey Lyashkov added a comment -

            Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/23689
            Subject: LU-8806 target: lock ostobj before transaction start
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 28bf0565e025ca20c72fb8d813274c142ed273fe

            gerrit Gerrit Updater added a comment -
            yong.fan nasf (Inactive) added a comment - - edited

            As I remember, there was some discussion about the difference between MDT and OST operations with respect to transaction and lock ordering, but I cannot remember the details. Currently, the MDT side starts the transaction before locking the object, while the OST side uses the reverse order. Because the OUT handler is shared by both sides, there will always be some potential for deadlock. Is there any special reason for that? Can we unify them?


            Looks like a lock/tx ordering issue.

            Here we take the object's lock before starting the TX:

            Nov 4 09:31:32 lola-4 kernel: [<ffffffffa038c81d>] dmu_tx_wait+0x21d/0x400 [zfs]
            Nov 4 09:31:32 lola-4 kernel: [<ffffffffa03d1ef2>] ? txg_kick+0x92/0xd0 [zfs]
            Nov 4 09:31:32 lola-4 kernel: [<ffffffffa038cbf1>] dmu_tx_assign+0xa1/0x570 [zfs]
            Nov 4 09:31:32 lola-4 kernel: [<ffffffffa0f06d7d>] osd_trans_start+0xed/0x430 [osd_zfs]
            Nov 4 09:31:32 lola-4 kernel: [<ffffffffa102156c>] ofd_trans_start+0x7c/0x100 [ofd]
            Nov 4 09:31:32 lola-4 kernel: [<ffffffffa1023980>] ofd_object_destroy+0x3d0/0x8e0 [ofd]
            Nov 4 09:31:32 lola-4 kernel: [<ffffffffa101d2cd>] ofd_destroy_by_fid+0x35d/0x620 [ofd]
            Nov 4 09:31:32 lola-4 kernel: [<ffffffffa098bab0>] ? ldlm_blocking_ast+0x0/0x180 [ptlrpc]
            Nov 4 09:31:32 lola-4 kernel: [<ffffffffa098db10>] ? ldlm_completion_ast+0x0/0x9a0 [ptlrpc]
            Nov 4 09:31:32 lola-4 kernel: [<ffffffffa10176a2>] ofd_destroy_hdl+0x2f2/0xb60 [ofd]

            and here we take it with the TX already started:

            Nov 4 09:32:15 lola-4 kernel: Pid: 33274, comm: ll_ost_out01_00
            Nov 4 09:32:15 lola-4 kernel:
            Nov 4 09:32:15 lola-4 kernel: Call Trace:
            Nov 4 09:32:15 lola-4 kernel: [<ffffffffa03a22ce>] ? dsl_dir_tempreserve_impl+0x2ae/0x330 [zfs]
            Nov 4 09:32:15 lola-4 kernel: [<ffffffff8153cc55>] rwsem_down_failed_common+0x95/0x1d0
            Nov 4 09:32:15 lola-4 kernel: [<ffffffff8153cdb3>] rwsem_down_write_failed+0x23/0x30
            Nov 4 09:32:15 lola-4 kernel: [<ffffffff8129e923>] call_rwsem_down_write_failed+0x13/0x20
            Nov 4 09:32:15 lola-4 kernel: [<ffffffff8153c2b2>] ? down_write+0x32/0x40
            Nov 4 09:32:15 lola-4 kernel: [<ffffffffa0f0f021>] osd_object_write_lock+0x61/0x70 [osd_zfs]
            Nov 4 09:32:15 lola-4 kernel: [<ffffffffa0a2bbf9>] out_tx_attr_set_exec+0x69/0x480 [ptlrpc]
            Nov 4 09:32:15 lola-4 kernel: [<ffffffffa0a2296a>] out_tx_end+0xda/0x5d0 [ptlrpc]
            Nov 4 09:32:15 lola-4 kernel: [<ffffffffa0a282b9>] out_handle+0x11d9/0x18d0 [ptlrpc]

            bzzz Alex Zhuravlev added a comment -
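
            To make the ordering concrete, the following is a minimal standalone pthread sketch, not Lustre code: obj_lock stands in for the per-object write lock (osd_object_write_lock) and tx_lock for admission to a transaction (osd_trans_start/dmu_tx_assign). As written, both threads lock the object before the "transaction" and the program always finishes; compiling with -DREVERSE_ORDER makes the second thread take the two in the opposite order, like the OUT stack above, and the threads can then deadlock exactly as the ll_ost/ll_ost_out threads did. Build with cc -pthread.

            /* Minimal stand-alone illustration of the lock/tx ordering issue (not Lustre code). */
            #include <pthread.h>
            #include <stdio.h>
            #include <unistd.h>

            static pthread_mutex_t obj_lock = PTHREAD_MUTEX_INITIALIZER;   /* ~ osd_object_write_lock */
            static pthread_mutex_t tx_lock  = PTHREAD_MUTEX_INITIALIZER;   /* ~ transaction slot */

            /* ofd-style path from the first stack: object lock first, then start the TX. */
            static void *ofd_style_destroy(void *arg)
            {
                    (void)arg;
                    pthread_mutex_lock(&obj_lock);
                    usleep(1000);                           /* widen the race window */
                    pthread_mutex_lock(&tx_lock);
                    puts("ofd-style thread done");
                    pthread_mutex_unlock(&tx_lock);
                    pthread_mutex_unlock(&obj_lock);
                    return NULL;
            }

            /* OUT-style path from the second stack: with REVERSE_ORDER the TX is taken
             * first and the object lock second (out_tx_end -> out_tx_attr_set_exec ->
             * osd_object_write_lock), which allows the classic AB-BA deadlock. */
            static void *out_style_attr_set(void *arg)
            {
                    (void)arg;
            #ifdef REVERSE_ORDER
                    pthread_mutex_lock(&tx_lock);
                    usleep(1000);
                    pthread_mutex_lock(&obj_lock);
            #else
                    pthread_mutex_lock(&obj_lock);          /* same order as above: no deadlock */
                    pthread_mutex_lock(&tx_lock);
            #endif
                    puts("out-style thread done");
                    pthread_mutex_unlock(&tx_lock);
                    pthread_mutex_unlock(&obj_lock);
                    return NULL;
            }

            int main(void)
            {
                    pthread_t a, b;

                    pthread_create(&a, NULL, ofd_style_destroy, NULL);
                    pthread_create(&b, NULL, out_style_attr_set, NULL);
                    pthread_join(a, NULL);
                    pthread_join(b, NULL);
                    puts("both threads finished");
                    return 0;
            }

            Making both paths take the object lock before starting the transaction (the default above) is the direction the "lock ostobj before transaction start" patch subject points to; the alternative would be to make the OST path match the MDT convention of starting the transaction first.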

            There are no detailed Lustre debug logs for the txg_quiesce_thread hang. Simply comparing the stack traces, there are some similar tickets in Jira, such as LU-5165.

            Adding Alex for more suggestions.

            yong.fan nasf (Inactive) added a comment -

            The logs on lola-4 show that, at the beginning, the txg_quiesce_thread was blocked for some unknown reason.

            Nov  4 09:31:32 lola-4 kernel: INFO: task txg_quiesce:12998 blocked for more than 120 seconds.
            Nov  4 09:31:32 lola-4 kernel:      Tainted: P           -- ------------    2.6.32-573.26.1.el6_lustre.x86_64 #1
            Nov  4 09:31:32 lola-4 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            Nov  4 09:31:32 lola-4 kernel: txg_quiesce   D 000000000000000c     0 12998      2 0x00000080
            Nov  4 09:31:32 lola-4 kernel: ffff88040ae23d20 0000000000000046 0000000000000000 ffffffff810674be
            Nov  4 09:31:32 lola-4 kernel: 0000465576d5e718 ffff880400000000 0000465576dd3cd4 000000010497fc9b
            Nov  4 09:31:32 lola-4 kernel: ffff88040e8725f8 000000010497fc9d ffff88040e8725f8 ffff88040ae23fd8
            Nov  4 09:31:32 lola-4 kernel: Call Trace:
            Nov  4 09:31:32 lola-4 kernel: [<ffffffff810674be>] ? try_to_wake_up+0x24e/0x3e0
            Nov  4 09:31:32 lola-4 kernel: [<ffffffffa0314e5d>] cv_wait_common+0x11d/0x130 [spl]
            Nov  4 09:31:32 lola-4 kernel: [<ffffffff810a1820>] ? autoremove_wake_function+0x0/0x40
            Nov  4 09:31:32 lola-4 kernel: [<ffffffff812a0d8c>] ? __bitmap_weight+0x8c/0xb0
            Nov  4 09:31:32 lola-4 kernel: [<ffffffffa0314ec5>] __cv_wait+0x15/0x20 [spl]
            Nov  4 09:31:32 lola-4 kernel: [<ffffffffa03d29fb>] txg_quiesce_thread+0x32b/0x470 [zfs]
            Nov  4 09:31:32 lola-4 kernel: [<ffffffffa03d26d0>] ? txg_quiesce_thread+0x0/0x470 [zfs]
            Nov  4 09:31:32 lola-4 kernel: [<ffffffffa03d26d0>] ? txg_quiesce_thread+0x0/0x470 [zfs]
            Nov  4 09:31:32 lola-4 kernel: [<ffffffffa0310238>] thread_generic_wrapper+0x68/0x80 [spl]
            Nov  4 09:31:32 lola-4 kernel: [<ffffffffa03101d0>] ? thread_generic_wrapper+0x0/0x80 [spl]
            Nov  4 09:31:32 lola-4 kernel: [<ffffffff810a138e>] kthread+0x9e/0xc0
            Nov  4 09:31:32 lola-4 kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
            Nov  4 09:31:32 lola-4 kernel: [<ffffffff810a12f0>] ? kthread+0x0/0xc0
            Nov  4 09:31:32 lola-4 kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
            

            The txg_quiesce_thread() is used for handling ZFS transactions. This hang caused many subsequent transactions to be blocked. For example:

            Nov  4 09:31:33 lola-4 kernel: INFO: task ll_ost_io01_002:13089 blocked for more than 120 seconds.
            Nov  4 09:31:33 lola-4 kernel:      Tainted: P           -- ------------    2.6.32-573.26.1.el6_lustre.x86_64 #1
            Nov  4 09:31:33 lola-4 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
            Nov  4 09:31:33 lola-4 kernel: ll_ost_io01_0 D 0000000000000008     0 13089      2 0x00000080
            Nov  4 09:31:33 lola-4 kernel: ffff880409027790 0000000000000046 0000000000000000 0000000000000000
            Nov  4 09:31:33 lola-4 kernel: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
            Nov  4 09:31:33 lola-4 kernel: 0000000000000000 0000000000000000 ffff8804090025f8 ffff880409027fd8
            Nov  4 09:31:33 lola-4 kernel: Call Trace:
            Nov  4 09:31:33 lola-4 kernel: [<ffffffff810a1a6e>] ? prepare_to_wait_exclusive+0x4e/0x80
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa0314e5d>] cv_wait_common+0x11d/0x130 [spl]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffff810a1820>] ? autoremove_wake_function+0x0/0x40
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa0314ec5>] __cv_wait+0x15/0x20 [spl]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa038c81d>] dmu_tx_wait+0x21d/0x400 [zfs]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa03d1ef2>] ? txg_kick+0x92/0xd0 [zfs]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa038cbf1>] dmu_tx_assign+0xa1/0x570 [zfs]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa0f06d7d>] osd_trans_start+0xed/0x430 [osd_zfs]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa10280d5>] ofd_write_attr_set+0x2c5/0x9e0 [ofd]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa102933f>] ofd_commitrw_write+0x23f/0x10a0 [ofd]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa102df1d>] ? ofd_fmd_find_nolock+0xad/0xd0 [ofd]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa102a75f>] ofd_commitrw+0x5bf/0xb10 [ofd]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffff81149461>] ? kzfree+0x31/0x40
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa079d5c1>] ? lprocfs_counter_add+0x151/0x1c0 [obdclass]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa0a17ee4>] obd_commitrw+0x114/0x380 [ptlrpc]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa0a20c80>] tgt_brw_write+0xc70/0x1530 [ptlrpc]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffff8105e9b6>] ? enqueue_task+0x66/0x80
            Nov  4 09:31:33 lola-4 kernel: [<ffffffff8105ab8d>] ? check_preempt_curr+0x6d/0x90
            Nov  4 09:31:33 lola-4 kernel: [<ffffffff810674be>] ? try_to_wake_up+0x24e/0x3e0
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa0974eb0>] ? target_bulk_timeout+0x0/0xc0 [ptlrpc]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa0a1f4bc>] tgt_request_handle+0x8ec/0x1440 [ptlrpc]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa09cbbc1>] ptlrpc_main+0xd31/0x1800 [ptlrpc]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffff81539b0e>] ? thread_return+0x4e/0x7d0
            Nov  4 09:31:33 lola-4 kernel: [<ffffffffa09cae90>] ? ptlrpc_main+0x0/0x1800 [ptlrpc]
            Nov  4 09:31:33 lola-4 kernel: [<ffffffff810a138e>] kthread+0x9e/0xc0
            Nov  4 09:31:33 lola-4 kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
            Nov  4 09:31:33 lola-4 kernel: [<ffffffff810a12f0>] ? kthread+0x0/0xc0
            Nov  4 09:31:33 lola-4 kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
            

            These hung OST RPC service threads caused the related MDT/client RPCs to time out, which then triggered reconnections from the MDT/client.

            Nov  4 09:48:54 lola-4 kernel: Lustre: soaked-OST0002: Client soaked-MDT0000-mdtlov_UUID (at 192.168.1.108@o2ib10) reconnecting
            Nov  4 09:48:54 lola-4 kernel: Lustre: Skipped 1 previous similar message
            Nov  4 09:48:54 lola-4 kernel: Lustre: soaked-OST0002: Connection restored to soaked-MDT0000-mdtlov_UUID (at 192.168.1.108@o2ib10)
            Nov  4 09:48:54 lola-4 kernel: Lustre: Skipped 12 previous similar messages
            Nov  4 09:48:58 lola-4 kernel: Lustre: soaked-OST0002: Client e093a05c-49f1-1f4e-3cd0-0381aa4e6173 (at 192.168.1.130@o2ib100) reconnecting
            Nov  4 09:48:58 lola-4 kernel: Lustre: Skipped 2 previous similar messages
            Nov  4 09:48:58 lola-4 kernel: Lustre: soaked-OST0008: deleting orphan objects from 0x0:27891162 to 0x0:27891185
            

            For the MDT, after it reconnected, it sent an orphan cleanup RPC to the OST to sync the last_id. Unfortunately, the related RPC service thread was blocked on the mutex ofd_seq::os_create_lock, which was held by another RPC service thread that was itself blocked by the transaction trouble. So there was a chain of blocking, and the blocked orphan cleanup RPCs timed out and repeatedly triggered more and more reconnections.

            Nov  4 10:24:05 lola-4 kernel: Pid: 21738, comm: ll_ost01_103
            Nov  4 10:24:05 lola-4 kernel: 
            Nov  4 10:24:05 lola-4 kernel: Call Trace:
            Nov  4 10:24:05 lola-4 kernel: [<ffffffff8153b4b6>] __mutex_lock_slowpath+0x96/0x210
            Nov  4 10:24:05 lola-4 kernel: [<ffffffffa102086b>] ? ofd_seq_load+0xbb/0x9c0 [ofd]
            Nov  4 10:24:05 lola-4 kernel: [<ffffffff8153afdb>] mutex_lock+0x2b/0x50
            Nov  4 10:24:05 lola-4 kernel: [<ffffffffa1015a28>] ofd_create_hdl+0xba8/0x2530 [ofd]
            Nov  4 10:24:05 lola-4 kernel: [<ffffffffa09baf9b>] ? lustre_pack_reply_v2+0x1eb/0x280 [ptlrpc]
            Nov  4 10:24:05 lola-4 kernel: [<ffffffffa09bb0d6>] ? lustre_pack_reply_flags+0xa6/0x1e0 [ptlrpc]
            Nov  4 10:24:05 lola-4 kernel: [<ffffffffa09bb221>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
            Nov  4 10:24:05 lola-4 kernel: [<ffffffffa0a1f4bc>] tgt_request_handle+0x8ec/0x1440 [ptlrpc]
            Nov  4 10:24:05 lola-4 kernel: [<ffffffffa09cbbc1>] ptlrpc_main+0xd31/0x1800 [ptlrpc]
            Nov  4 10:24:05 lola-4 kernel: [<ffffffff81539b0e>] ? thread_return+0x4e/0x7d0
            Nov  4 10:24:05 lola-4 kernel: [<ffffffffa09cae90>] ? ptlrpc_main+0x0/0x1800 [ptlrpc]
            Nov  4 10:24:05 lola-4 kernel: [<ffffffff810a138e>] kthread+0x9e/0xc0
            Nov  4 10:24:05 lola-4 kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
            Nov  4 10:24:05 lola-4 kernel: [<ffffffff810a12f0>] ? kthread+0x0/0xc0
            Nov  4 10:24:05 lola-4 kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
            

            So the root question is why the txg_quiesce_thread was blocked.

            yong.fan nasf (Inactive) added a comment -
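
            The blocking chain described above can be sketched outside of Lustre as well; in this minimal standalone pthread sketch, seq_lock stands in for ofd_seq::os_create_lock and the condition variable for the txg making room for new transactions (dmu_tx_wait). Here the "txg" is eventually released so the program finishes; in the incident it never was, so the MDT's orphan cleanup RPC could only time out and reconnect.

            /* Stand-alone sketch of the blocking chain (not Lustre code). */
            #include <pthread.h>
            #include <stdbool.h>
            #include <stdio.h>
            #include <unistd.h>

            static pthread_mutex_t seq_lock = PTHREAD_MUTEX_INITIALIZER;   /* ~ ofd_seq::os_create_lock */
            static pthread_mutex_t tx_mtx   = PTHREAD_MUTEX_INITIALIZER;
            static pthread_cond_t  tx_cv    = PTHREAD_COND_INITIALIZER;    /* ~ dmu_tx_wait/txg progress */
            static bool tx_ready;

            /* A service thread that already took the sequence lock and is now stuck
             * waiting for a transaction, like the threads blocked in dmu_tx_wait(). */
            static void *holder_stuck_in_tx(void *arg)
            {
                    (void)arg;
                    pthread_mutex_lock(&seq_lock);          /* held across the whole wait */
                    pthread_mutex_lock(&tx_mtx);
                    while (!tx_ready)
                            pthread_cond_wait(&tx_cv, &tx_mtx);
                    pthread_mutex_unlock(&tx_mtx);
                    pthread_mutex_unlock(&seq_lock);
                    puts("holder: transaction finally assigned, lock released");
                    return NULL;
            }

            /* The orphan cleanup request (ofd_create_hdl above) just queues on the mutex. */
            static void *orphan_cleanup(void *arg)
            {
                    (void)arg;
                    sleep(1);                               /* let the other thread win the lock */
                    puts("orphan cleanup: waiting on seq lock ...");
                    pthread_mutex_lock(&seq_lock);
                    puts("orphan cleanup: got seq lock, last_id synced");
                    pthread_mutex_unlock(&seq_lock);
                    return NULL;
            }

            int main(void)
            {
                    pthread_t t1, t2;

                    pthread_create(&t1, NULL, holder_stuck_in_tx, NULL);
                    pthread_create(&t2, NULL, orphan_cleanup, NULL);
                    sleep(3);                               /* simulate the txg finally quiescing */
                    pthread_mutex_lock(&tx_mtx);
                    tx_ready = true;
                    pthread_cond_signal(&tx_cv);
                    pthread_mutex_unlock(&tx_mtx);
                    pthread_join(t1, NULL);
                    pthread_join(t2, NULL);
                    return 0;
            }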

            Here are the LFSCK logs from lola-8:

            69088:00100000:10000000:31.0:1478532776.517351:0:18252:0:(lfsck_lib.c:2573:lfsck_post_generic()) soaked-MDT0000-osd: waiting for assistant to do lfsck_layout post, rc = 0
            69089:00100000:10000000:16.0:1478532776.518230:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 6 for lfsck_layout: event = 6, rc = -4
            69090:00100000:10000000:16.0:1478532776.518242:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 15 for lfsck_layout: event = 6, rc = -4
            69091:00100000:10000000:16.0:1478532776.518248:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 8 for lfsck_layout: event = 6, rc = -4
            69092:00100000:10000000:16.0:1478532776.518253:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 10 for lfsck_layout: event = 6, rc = -4
            69093:00100000:10000000:16.0:1478532776.518258:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 4 for lfsck_layout: event = 6, rc = -4
            69094:00100000:10000000:16.0:1478532776.518262:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 9 for lfsck_layout: event = 6, rc = -4
            69095:00100000:10000000:16.0:1478532776.518265:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 13 for lfsck_layout: event = 6, rc = -4
            69096:00100000:10000000:16.0:1478532776.518268:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST f for lfsck_layout: event = 6, rc = -4
            69097:00100000:10000000:16.0:1478532776.518271:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST d for lfsck_layout: event = 6, rc = -4
            69098:00100000:10000000:16.0:1478532776.518274:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 0 for lfsck_layout: event = 6, rc = -4
            69099:00100000:10000000:16.0:1478532776.518280:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST e for lfsck_layout: event = 6, rc = -4
            69100:00100000:10000000:16.0:1478532776.518283:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 16 for lfsck_layout: event = 6, rc = -4
            69101:00100000:10000000:16.0:1478532776.518287:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST c for lfsck_layout: event = 6, rc = -4
            69102:00100000:10000000:16.0:1478532776.518290:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 1 for lfsck_layout: event = 6, rc = -4
            69103:00100000:10000000:16.0:1478532776.518293:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 11 for lfsck_layout: event = 6, rc = -4
            69104:00100000:10000000:16.0:1478532776.518296:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 3 for lfsck_layout: event = 6, rc = -4
            69105:00100000:10000000:16.0:1478532776.518299:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 14 for lfsck_layout: event = 6, rc = -4
            69106:00100000:10000000:16.0:1478532776.518302:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 2 for lfsck_layout: event = 6, rc = -4
            69107:00100000:10000000:16.0:1478532776.518305:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST b for lfsck_layout: event = 6, rc = -4
            69108:00100000:10000000:16.0:1478532776.518309:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 7 for lfsck_layout: event = 6, rc = -4
            69109:00100000:10000000:16.0:1478532776.518312:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 5 for lfsck_layout: event = 6, rc = -4
            69110:00100000:10000000:16.0:1478532776.518315:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 17 for lfsck_layout: event = 6, rc = -4
            69111:00100000:10000000:16.0:1478532776.518318:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST a for lfsck_layout: event = 6, rc = -4
            69112:00100000:10000000:16.0:1478532776.518322:0:18254:0:(lfsck_lib.c:2127:lfsck_async_interpret_common()) soaked-MDT0000-osd: fail to notify OST 12 for lfsck_layout: event = 6, rc = -4
            ...
            69151:00100000:10000000:16.0:1478532876.526908:0:18254:0:(lfsck_engine.c:1811:lfsck_assistant_engine()) soaked-MDT0000-osd: LFSCK assistant synced before exit: rc = -110
            69152:00100000:10000000:16.0:1478532876.526916:0:18254:0:(lfsck_engine.c:1838:lfsck_assistant_engine()) soaked-MDT0000-osd: lfsck_layout LFSCK assistant thread exit: rc = 0
            69153:00100000:10000000:31.0:1478532876.526967:0:18252:0:(lfsck_lib.c:2585:lfsck_post_generic()) soaked-MDT0000-osd: the assistant has done lfsck_layout post, rc = 0
            69154:00100000:10000000:31.0:1478532876.527030:0:18252:0:(lfsck_layout.c:4680:lfsck_layout_master_post()) soaked-MDT0000-osd: layout LFSCK master post done: rc = 0
            69155:00100000:10000000:31.0:1478532876.527035:0:18252:0:(lfsck_lib.c:2573:lfsck_post_generic()) soaked-MDT0000-osd: waiting for assistant to do lfsck_namespace post, rc = 0
            69156:00100000:10000000:2.0:1478532876.527095:0:18256:0:(lfsck_engine.c:1805:lfsck_assistant_engine()) soaked-MDT0000-osd: LFSCK assistant sync before exit
            69193:00100000:10000000:2.0:1478532976.535859:0:18256:0:(lfsck_engine.c:1811:lfsck_assistant_engine()) soaked-MDT0000-osd: LFSCK assistant synced before exit: rc = -110
            69194:00100000:10000000:2.0:1478532976.535867:0:18256:0:(lfsck_engine.c:1838:lfsck_assistant_engine()) soaked-MDT0000-osd: lfsck_namespace LFSCK assistant thread exit: rc = 0
            69195:00100000:10000000:31.0:1478532976.535918:0:18252:0:(lfsck_lib.c:2585:lfsck_post_generic()) soaked-MDT0000-osd: the assistant has done lfsck_namespace post, rc = 0
            69196:00100000:10000000:31.0:1478532976.535987:0:18252:0:(lfsck_namespace.c:4303:lfsck_namespace_post()) soaked-MDT0000-osd: namespace LFSCK post done: rc = 0
            

            That means both the layout LFSCK assistant thread and the namespace LFSCK assistant thread eventually exited, although there were a lot of failures when communicating with remote peers. So what Cliff saw about the LFSCK on the MDT occurred during the connection issues with lola-4. MDT0000 needs to talk with the OSTs on lola-4 for the layout LFSCK; since there was connection trouble with lola-4, the LFSCK will retry until it recovers, is manually stopped, or exits. So the LFSCK behaviour is as expected. What we need to make clear is why the connection between lola-8 and lola-4 was repeatedly going down and up.

            yong.fan nasf (Inactive) added a comment -
            pjones Peter Jones added a comment -

            Fan Yong

            Could you please advise on this issue?

            Thanks

            Peter


            People

              Assignee: bzzz Alex Zhuravlev
              Reporter: cliffw Cliff White (Inactive)
              Votes: 1
              Watchers: 11