[LU-8496] Race in changelog clear path Created: 11/Aug/16  Updated: 16/Apr/18  Resolved: 16/Apr/18

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Rahul Deshmukh (Inactive) Assignee: WC Triage
Resolution: Won't Do Votes: 0
Labels: patch

Issue Links:
Related
is related to LU-6699 LustreError: 7605:0:(osd_handler.c:25... Closed
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The following error is seen with the test patch, which exposes a race in the changelog clear path.

LustreError: 18671:0:(llog_osd.c:854:llog_osd_next_block()) ASSERTION( dt ) failed: 
LustreError: 18671:0:(llog_osd.c:854:llog_osd_next_block()) LBUG
Pid: 18671, comm: mdt00_001

Call Trace:
 [<ffffffffa0457875>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa0457e77>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa05b04f0>] llog_osd_declare_destroy+0x0/0x740 [obdclass]
 [<ffffffffa059f03e>] llog_process_thread+0x28e/0x1050 [obdclass]
 [<ffffffffa059fec8>] llog_process_or_fork+0xc8/0x490 [obdclass]
 [<ffffffffa0dc0b60>] ? llog_changelog_cancel_cb+0x0/0x1d0 [mdd]
 [<ffffffffa05a36f8>] llog_cat_process_cb+0x458/0x600 [obdclass]
 [<ffffffffa059f6aa>] llog_process_thread+0x8fa/0x1050 [obdclass]
 [<ffffffffa059fec8>] llog_process_or_fork+0xc8/0x490 [obdclass]
 [<ffffffffa05a32a0>] ? llog_cat_process_cb+0x0/0x600 [obdclass]
 [<ffffffffa05a32a0>] ? llog_cat_process_cb+0x0/0x600 [obdclass]
 [<ffffffffa05a1999>] llog_cat_process_or_fork+0x1a9/0x2f0 [obdclass]
 [<ffffffffa0dc0b60>] ? llog_changelog_cancel_cb+0x0/0x1d0 [mdd]
 [<ffffffffa0dc0b60>] ? llog_changelog_cancel_cb+0x0/0x1d0 [mdd]
 [<ffffffffa0463021>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [<ffffffffa05a1b6e>] llog_cat_process+0x2e/0x30 [obdclass]
 [<ffffffffa0dc098f>] llog_changelog_cancel+0x5f/0x230 [mdd]
 [<ffffffffa0dc4be5>] ? mdd_changelog_write_header+0x3a5/0x4d0 [mdd]
 [<ffffffffa05a5d88>] llog_cancel+0x58/0x230 [obdclass]
 [<ffffffffa0dc6d02>] mdd_changelog_user_purge+0x472/0x770 [mdd]
 [<ffffffffa0dc71ca>] mdd_iocontrol+0x1ca/0xaf0 [mdd]
 [<ffffffffa0e2d8b9>] mdt_ioc_child+0x149/0x1d0 [mdt]


 Comments   
Comment by Gerrit Updater [ 11/Aug/16 ]

Rahul Deshmukh (rahul.deshmukh@seagate.com) uploaded a new patch: http://review.whamcloud.com/21881
Subject: LU-8496 tests: lfs changelog_clear race test
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2997c5b6cda1e96993389df8840cb214578b5c04

Comment by Gerrit Updater [ 11/Aug/16 ]

Rahul Deshmukh (rahul.deshmukh@seagate.com) uploaded a new patch: http://review.whamcloud.com/21882
Subject: LU-8496 llog: avoid llog cancel race
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 03767e03e419f7efb80d3436dff8b8c82d12b0e3

Comment by Rahul Deshmukh (Inactive) [ 11/Aug/16 ]

The first patch is a reproducer. The second patch is pushed on top of the first to show that it fixes the problem. I have tested this locally. The reproducer is a slow test, so I am not sure whether it will be run in a normal test run (or I need help on how to run it).

Comment by Vladimir Saveliev [ 16/Apr/18 ]

This fix is not needed since the llog API has been changed to use OSD devices.

Generated at Sat Feb 10 02:18:04 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.