[LU-9202] lfsck_layout_assistant_sync_failures()) ASSERTION( ltd != ((void *)0) ) failed
Created: 10/Mar/17  Updated: 03/Aug/18  Resolved: 16/May/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.10.0

Type: Bug
Priority: Minor
Reporter: nasf (Inactive)
Assignee: nasf (Inactive)
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Duplicate
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The logs show that:

Mar  6 08:36:33 lfs-sa-mds kernel: LustreError: 9819:0:(fld_handler.c:260:fld_server_lookup()) srv-lfs1-MDT0000: Cannot find sequence 0x200000e84012d82: rc = -2
Mar  6 08:36:33 lfs-sa-mds kernel: LustreError: 9819:0:(fld_handler.c:260:fld_server_lookup()) srv-lfs1-MDT0000: Cannot find sequence 0x200000f3600fa05: rc = -2
Mar  6 08:36:33 lfs-sa-mds kernel: LustreError: 9819:0:(fld_handler.c:260:fld_server_lookup()) Skipped 64 previous similar messages
Mar  6 08:36:34 lfs-sa-mds kernel: LustreError: 9819:0:(fld_handler.c:260:fld_server_lookup()) srv-lfs1-MDT0000: Cannot find sequence 0x200000eb0015479: rc = -2
Mar  6 08:36:34 lfs-sa-mds kernel: LustreError: 9819:0:(fld_handler.c:260:fld_server_lookup()) Skipped 1167 previous similar messages
Mar  6 08:36:40 lfs-sa-mds kernel: LustreError: 9819:0:(fld_handler.c:260:fld_server_lookup()) srv-lfs1-MDT0000: Cannot find sequence 0x200000e98017922: rc = -2
Mar  6 08:36:40 lfs-sa-mds kernel: LustreError: 9819:0:(fld_handler.c:260:fld_server_lookup()) Skipped 3870 previous similar messages
Mar  6 08:36:44 lfs-sa-mds kernel: LustreError: 9819:0:(lfsck_layout.c:284:lfsck_layout_assistant_sync_failures()) ASSERTION( ltd != ((void *)0) ) failed: 
Mar  6 08:36:44 lfs-sa-mds kernel: LustreError: 9819:0:(lfsck_layout.c:284:lfsck_layout_assistant_sync_failures()) LBUG
Mar  6 08:36:44 lfs-sa-mds kernel: Pid: 9819, comm: lfsck_layout
Mar  6 08:36:44 lfs-sa-mds kernel: 
Mar  6 08:36:44 lfs-sa-mds kernel: Call Trace:
Mar  6 08:36:44 lfs-sa-mds kernel: [<ffffffffa02cd875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
Mar  6 08:36:44 lfs-sa-mds kernel: [<ffffffffa02cde77>] lbug_with_loc+0x47/0xb0 [libcfs]
Mar  6 08:36:44 lfs-sa-mds kernel: [<ffffffffa0cceb91>] lfsck_layout_assistant_sync_failures+0x4c1/0x4d0 [lfsck]
Mar  6 08:36:44 lfs-sa-mds kernel: [<ffffffffa0ca64b2>] lfsck_assistant_notify_others+0x5d2/0x1490 [lfsck]
Mar  6 08:36:44 lfs-sa-mds kernel: [<ffffffffa0ca9cf0>] lfsck_assistant_engine+0x930/0x1e50 [lfsck]
Mar  6 08:36:44 lfs-sa-mds kernel: [<ffffffff81060530>] ? __dequeue_entity+0x30/0x50
Mar  6 08:36:44 lfs-sa-mds kernel: [<ffffffff810672b0>] ? default_wake_function+0x0/0x20
Mar  6 08:36:44 lfs-sa-mds kernel: [<ffffffffa0ca93c0>] ? lfsck_assistant_engine+0x0/0x1e50 [lfsck]
Mar  6 08:36:44 lfs-sa-mds kernel: [<ffffffff810a0fce>] kthread+0x9e/0xc0
Mar  6 08:36:44 lfs-sa-mds kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
Mar  6 08:36:44 lfs-sa-mds kernel: [<ffffffff810a0f30>] ? kthread+0x0/0xc0
Mar  6 08:36:44 lfs-sa-mds kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
Mar  6 08:36:44 lfs-sa-mds kernel: 


 Comments   
Comment by Gerrit Updater [ 10/Mar/17 ]

Fan Yong (fan.yong@intel.com) uploaded a new patch: https://review.whamcloud.com/25931
Subject: LU-9202 lfsck: skip unavailable targets when sync failures
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 0817aa25ea2117a0fcfa5e7763ec47f790bfcc72
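
Note on the approach named in the patch subject: the ticket does not include the patch itself, so the following is only a minimal sketch of the general idea of skipping a target whose descriptor lookup returns NULL instead of asserting on it. Every name here (target_desc, target_table, target_find, sync_failures) is hypothetical and is not taken from the Lustre source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-target descriptor (not the Lustre struct). */
struct target_desc {
	int index;
	int notified;
};

/* Hypothetical lookup table: a NULL slot means the target is unavailable. */
struct target_table {
	struct target_desc **slots;
	size_t count;
};

/* Return the descriptor for idx, or NULL if it is out of range or missing. */
static struct target_desc *target_find(const struct target_table *tbl,
				       size_t idx)
{
	if (idx >= tbl->count)
		return NULL;
	return tbl->slots[idx];
}

/*
 * Walk the list of failed target indices. Targets that can still be found
 * are marked as notified; targets whose lookup returns NULL are logged and
 * skipped rather than treated as a fatal invariant violation.
 * Returns the number of skipped targets.
 */
static size_t sync_failures(const struct target_table *tbl,
			    const size_t *failed, size_t nfailed)
{
	size_t skipped = 0;

	/* The table pointer itself is a caller bug if NULL, so assert on it. */
	assert(tbl != NULL);

	for (size_t i = 0; i < nfailed; i++) {
		struct target_desc *td = target_find(tbl, failed[i]);

		if (td == NULL) {
			/* Assumed behavior: a missing target is a runtime
			 * condition, so report it and move on. */
			fprintf(stderr, "target %zu unavailable, skipping\n",
				failed[i]);
			skipped++;
			continue;
		}
		td->notified = 1;
	}
	return skipped;
}
```

The design choice illustrated is that an assertion is kept only for a condition the caller controls, while a failed lookup for an individual target is handled as a recoverable case.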

Comment by Gerrit Updater [ 16/May/17 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/25931/
Subject: LU-9202 lfsck: skip unavailable targets when sync failures
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e96c70e04ff2644909b96d94d4718117f7402a40

Comment by Peter Jones [ 16/May/17 ]

Landed for 2.10

Comment by James A Simmons [ 03/Aug/18 ]

Is this safe to backport to b2_8? We see this on our 2.8 servers.
