[LU-11392] osp_sync_thread()) ASSERTION( thread->t_flags != SVC_RUNNING ) failed: 64767 changes, 41295 in progress, 1 in flight Created: 18/Sep/18  Updated: 23/Oct/18  Resolved: 23/Oct/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.12.0

Type: Bug Priority: Major
Reporter: Alexander Boyko Assignee: Alexander Boyko
Resolution: Fixed Votes: 0
Labels: patch

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   
[209677.208444] LustreError: 4511:0:(llog_cat.c:269:llog_cat_id2handle()) snx11205-OST0000-osc-MDT0001: error opening log id [0x724a:0x1:0x0]:0: rc = -2
[209677.224367] LustreError: 4511:0:(osp_sync.c:1258:osp_sync_thread()) ASSERTION( thread->t_flags != SVC_RUNNING ) failed: 33830 changes, 284 in progress, 0 in flight
[209677.241177] LustreError: 4511:0:(osp_sync.c:1258:osp_sync_thread()) LBUG
[209677.249045] Pid: 4511, comm: osp-syn-0-1
[209677.254171] 
Call Trace:
[209677.260460]  [<ffffffffc0b347ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
[209677.268138]  [<ffffffffc0b3483c>] lbug_with_loc+0x4c/0xb0 [libcfs]
[209677.275498]  [<ffffffffc1710f9b>] osp_sync_thread+0xa1b/0xa70 [osp]
[209677.282872]  [<ffffffff816b3e2c>] ? __schedule+0x47c/0xa30
[209677.289558]  [<ffffffffc1710580>] ? osp_sync_thread+0x0/0xa70 [osp]
[209677.296915]  [<ffffffff810b4031>] kthread+0xd1/0xe0
[209677.302876]  [<ffffffff810b3f60>] ? kthread+0x0/0xe0
[209677.308942]  [<ffffffff816c155d>] ret_from_fork+0x5d/0xb0
[209677.315441]  [<ffffffff810b3f60>] ? kthread+0x0/0xe0
[209677.321485] 
[209677.324090] Kernel panic - not syncing: LBUG

This is closely related to LU-7001 and can happen only with a wrapped catalog. It is a race between llog_process_thread and llog_add. LU-7001 addressed this particular problem in the past, but unfortunately it left a small window for the same race.
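
As a rough illustration only, the toy model below shows one way a reader working from a stale view of a wrapped catalog can end up with rc = -2 (ENOENT) when opening a plain log, as in the first LustreError line above. This is not Lustre code. ToyCatalog, replay_race and the exact interleaving are assumptions made for the sketch; the ticket does not spell out the precise window that the patch closes.

```python
import errno


class ToyCatalog:
    """Hypothetical fixed-size ring of slots, each naming one plain log."""

    def __init__(self, size):
        self.slots = [None] * size   # catalog entries (plain log ids)
        self.logs = set()            # plain logs that currently exist
        self.head = 0                # slot used by the next add; wraps
        self.next_id = 1

    def add(self):
        """Writer side: create a plain log in the next free slot."""
        idx = self.head
        if self.slots[idx] is not None:
            raise RuntimeError("catalog full")
        log_id = self.next_id
        self.next_id += 1
        self.slots[idx] = log_id
        self.logs.add(log_id)
        self.head = (idx + 1) % len(self.slots)
        return idx, log_id

    def destroy(self, idx):
        """Drop the plain log in a slot and free the slot for reuse."""
        self.logs.discard(self.slots[idx])
        self.slots[idx] = None

    def open(self, log_id):
        """Return 0 if the plain log exists, -ENOENT otherwise."""
        return 0 if log_id in self.logs else -errno.ENOENT


def replay_race():
    cat = ToyCatalog(size=3)
    for _ in range(3):
        cat.add()                  # fill the ring; head wraps back to slot 0
    snapshot = list(cat.slots)     # reader buffers the catalog contents
    cat.destroy(0)                 # oldest plain log goes away
    cat.add()                      # writer reuses slot 0 after the wrap
    stale = [cat.open(log_id) for log_id in snapshot]
    live = cat.open(cat.slots[0])  # the entry the reader should have seen
    return stale, live


if __name__ == "__main__":
    print(replay_race())
```

In this model the reader's buffered copy still names the destroyed log in slot 0, so opening it fails with -ENOENT, while the entry actually stored in that slot opens fine. The real fix is in the llog code referenced by the patches below.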



 Comments   
Comment by Gerrit Updater [ 18/Sep/18 ]

Alexandr Boyko (c17825@cray.com) uploaded a new patch: https://review.whamcloud.com/33192
Subject: LU-11392 tests: check race for llog_process_thread
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 607589044ca2f789f1c545c3cfd527999600bcdd

Comment by Gerrit Updater [ 18/Sep/18 ]

Alexandr Boyko (c17825@cray.com) uploaded a new patch: https://review.whamcloud.com/33193
Subject: LU-11392 llog: fix race llog_process_thread vs llog_add
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: b5399949083b8b4a1ecc2642f8686423ecda8bae

Comment by Gerrit Updater [ 05/Oct/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33193/
Subject: LU-11392 llog: fix race llog_process_thread vs llog_add
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 38c5f9aa6cb5f98f684e8bbe67ec3bd8e2204467

Comment by Gerrit Updater [ 23/Oct/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33192/
Subject: LU-11392 tests: check race for llog_process_thread
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 0f2d8154a888e3db0739ebec2279a0cafb6a0afd

Comment by Peter Jones [ 23/Oct/18 ]

Landed for 2.12
