[LU-10527] LustreError: 7830:0:(llog_cat.c:313:llog_cat_current_log()) ASSERTION( llh ) Created: 17/Jan/18  Updated: 26/Sep/19  Resolved: 05/Mar/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.12.0, Lustre 2.10.7

Type: Bug Priority: Major
Reporter: Bruno Faccini (Inactive) Assignee: Patrick Farrell (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-11943 many input/output error after soak ru... Resolved
is related to LU-7340 ChangeLogs catalog full condition sho... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   
[894175.107791] LustreError: 21232:0:(mdd_dir.c:835:mdd_changelog_ns_store()) Skipped 71 previous similar messages
[894200.594288] Lustre: 21803:0:(llog_cat.c:91:llog_cat_new_log()) snx11253-MDD0000: there are no more free slots in catalog
[894200.606214] Lustre: 21803:0:(llog_cat.c:91:llog_cat_new_log()) Skipped 15668 previous similar messages
[894207.538039] LustreError: 21789:0:(mdd_dir.c:835:mdd_changelog_ns_store()) snx11253-MDD0000: cannot store changelog record: type = 2, name = 'LTPfstat01.9.PxDvLp.1496380725', t = [0x2800484d7:0xca6c:0x0], p = [0x280048380:0x1:0x0]: rc = -28
[894207.561210] LustreError: 21789:0:(mdd_dir.c:835:mdd_changelog_ns_store()) Skipped 166 previous similar messages
[894264.576106] Lustre: 19178:0:(llog_cat.c:91:llog_cat_new_log()) snx11253-MDD0000: there are no more free slots in catalog
[894264.588022] Lustre: 19178:0:(llog_cat.c:91:llog_cat_new_log()) Skipped 341 previous similar messages
[894271.531429] LustreError: 18644:0:(mdd_dir.c:835:mdd_changelog_ns_store()) snx11253-MDD0000: cannot store changelog record: type = 2, name = 'CL_LTPfcntl04.3.o1pIM1.1496380789', t = [0x2800484d7:0xcae9:0x0], p = [0x280048380:0x1:0x0]: rc = -28
[894271.554925] LustreError: 18644:0:(mdd_dir.c:835:mdd_changelog_ns_store()) Skipped 312 previous similar messages
[894333.383425] LustreError: 17029:0:(llog_cat.c:329:llog_cat_current_log()) snx11253-MDD0000: next log does not exist!
[894333.395645] LustreError: 7830:0:(llog_cat.c:313:llog_cat_current_log()) ASSERTION( llh ) failed:
[894333.405845] LustreError: 7830:0:(llog_cat.c:313:llog_cat_current_log()) LBUG
[894333.414189] Pid: 7830, comm: mdt_rdpg00_016
[894333.419667]
Call Trace:
[894333.426088]  [<ffffffffa0912853>] libcfs_debug_dumpstack+0x53/0x80 [libcfs]
[894333.434292]  [<ffffffffa0912f95>] lbug_with_loc+0x45/0xc0 [libcfs]
[894333.441728]  [<ffffffffa0a288c0>] llog_cat_add_rec+0xea0/0xf10 [obdclass]
[894333.449715]  [<ffffffff811c18ae>] ? kmem_cache_alloc_trace+0x1ce/0x1f0
[894333.457475]  [<ffffffffa0a2109a>] llog_add+0x7a/0x1a0 [obdclass]
[894333.464707]  [<ffffffffa114c8f2>] mdd_changelog_store+0xf2/0x270 [mdd]
[894333.472425]  [<ffffffffa115ea1a>] mdd_changelog_data_store+0x23a/0x330 [mdd]
[894333.480641]  [<ffffffffa115f95d>] mdd_close+0x29d/0xc90 [mdd]
[894333.487534]  [<ffffffffa1241500>] ? mdt_reint_create+0xcd0/0xd10 [mdt]
[894333.495203]  [<ffffffffa124be29>] mdt_mfd_close+0x2a9/0xa30 [mdt]
[894333.502390]  [<ffffffffa0a4f6d3>] ? class_handle2object+0xb3/0x1b0 [obdclass]
[894333.510584]  [<ffffffffa1251abb>] mdt_close+0x30b/0xd30 [mdt]
[894333.517578]  [<ffffffffa0d48dfb>] tgt_request_handle+0x8fb/0x11f0 [ptlrpc]
[894333.525370]  [<ffffffffa0ceb54b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[894333.533906]  [<ffffffffa0924848>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[894333.541595]  [<ffffffffa0ce8618>] ? ptlrpc_wait_event+0x98/0x330 [ptlrpc]
[894333.549282]  [<ffffffffa0ceee78>] ptlrpc_main+0xc08/0x1f40 [ptlrpc]
[894333.556423]  [<ffffffffa0cee270>] ? ptlrpc_main+0x0/0x1f40 [ptlrpc]
[894333.563524]  [<ffffffff810a5b8f>] kthread+0xcf/0xe0
[894333.569214]  [<ffffffff810a5ac0>] ? kthread+0x0/0xe0
[894333.574986]  [<ffffffff81644e18>] ret_from_fork+0x58/0x90
[894333.581176]  [<ffffffff810a5ac0>] ? kthread+0x0/0xe0
[894333.586921]
[894333.589184] Kernel panic - not syncing: LBUG
[894333.594215] CPU: 0 PID: 7830 Comm: mdt_rdpg00_016 Tainted: P           OE  ------------   3.10.0-327.36.3.x3.0.15.x86_64 #1
[894333.606087] Hardware name: Seagate Laguna Seca/Laguna Seca, BIOS v02.0024 02/26/2016
[894333.614589]  ffffffffa0937ca3 00000000efc7dcf7 ffff880d5fc6b9c8 ffffffff816349d1
[894333.622804]  ffff880d5fc6ba48 ffffffff8162e260 ffffffff00000008 ffff880d5fc6ba58
[894333.631015]  ffff880d5fc6b9f8 00000000efc7dcf7 ffffffffa0aa9d00 ffffffffa0943525
[894333.639218] Call Trace:
[894333.642397]  [<ffffffff816349d1>] dump_stack+0x19/0x1b
[894333.648253]  [<ffffffff8162e260>] panic+0xd8/0x1e7
[894333.653749]  [<ffffffffa0912ffb>] lbug_with_loc+0xab/0xc0 [libcfs]
[894333.660640]  [<ffffffffa0a288c0>] llog_cat_add_rec+0xea0/0xf10 [obdclass]
[894333.668112]  [<ffffffff811c18ae>] ? kmem_cache_alloc_trace+0x1ce/0x1f0
[894333.675322]  [<ffffffffa0a2109a>] llog_add+0x7a/0x1a0 [obdclass]
[894333.681990]  [<ffffffffa114c8f2>] mdd_changelog_store+0xf2/0x270 [mdd]
[894333.689168]  [<ffffffffa115ea1a>] mdd_changelog_data_store+0x23a/0x330 [mdd]
[894333.696864]  [<ffffffffa115f95d>] mdd_close+0x29d/0xc90 [mdd]
[894333.703245]  [<ffffffffa1241500>] ? mdt_reint_create+0xcd0/0xd10 [mdt]
[894333.710397]  [<ffffffffa124be29>] mdt_mfd_close+0x2a9/0xa30 [mdt]
[894333.717101]  [<ffffffffa0a4f6d3>] ? class_handle2object+0xb3/0x1b0 [obdclass]
[894333.724834]  [<ffffffffa1251abb>] mdt_close+0x30b/0xd30 [mdt]
[894333.731188]  [<ffffffffa0d48dfb>] tgt_request_handle+0x8fb/0x11f0 [ptlrpc]
[894333.738669]  [<ffffffffa0ceb54b>] ptlrpc_server_handle_request+0x21b/0xa90 [ptlrpc]
[894333.746918]  [<ffffffffa0924848>] ? lc_watchdog_touch+0x68/0x180 [libcfs]
[894333.754319]  [<ffffffffa0ce8618>] ? ptlrpc_wait_event+0x98/0x330 [ptlrpc]
[894333.761770]  [<ffffffffa0ceee78>] ptlrpc_main+0xc08/0x1f40 [ptlrpc]




 Comments   
Comment by Bruno Faccini (Inactive) [ 17/Jan/18 ]

The original problem was reported by a customer (error messages and LBUG with stack in the Description of this ticket), but I have been able to reproduce this LBUG on a recent master build by simulating/injecting the catalog-full/ENOSPC situation in llog_cat_new_log() with fail_loc=0x137/OBD_FAIL_MDS_LLOG_CREATE_FAILED (i.e., a permanent -ENOSPC failure), and then forcing multiple concurrent attempts to create ChangeLog records by running just 6 instances of a "while true; do touch /mnt/lustre/foodir/<file>; done &" loop, each using a different file (to avoid lock contention and thus maximize concurrency), on a combined client+server single-node setup.
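For reference, a minimal sketch of that reproducer (the device name, mount point and the changelog_register step are assumptions, not taken from the original report):

# Assumed setup: a single node acting as client+server, filesystem mounted at /mnt/lustre
lctl --device lustre-MDT0000 changelog_register   # make the MDD try to store changelog records
lctl set_param fail_loc=0x137                     # OBD_FAIL_MDS_LLOG_CREATE_FAILED: llog_cat_new_log() keeps failing (-ENOSPC)

mkdir -p /mnt/lustre/foodir
# 6 concurrent writers, each touching its own file to avoid lock contention
for i in $(seq 1 6); do
    ( while true; do touch /mnt/lustre/foodir/file$i; done ) &
done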

As a possible fix, I wonder whether llog_cat_add_rec() really needs to reset "cathandle->u.chd.chd_current_log" to NULL when llog_cat_new_log() returns -ENOSPC.
Not doing so should prevent llog_cat_declare_add_rec() from repeatedly and unnecessarily creating new, partially initialized LLOGs/llog_handles and assigning them to "cathandle->u.chd.chd_current_log" without llog_init_handle() ever being called to initialize "loghandle->lgh_hdr".

Comment by Gerrit Updater [ 17/Jan/18 ]

Faccini Bruno (bruno.faccini@intel.com) uploaded a new patch: https://review.whamcloud.com/30897
Subject: LU-10527 obdclass: don't recycle loghandle upon ENOSPC
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 693e2e9e33302b49b08dd1054e6418945121f81c

Comment by Bruno Faccini (Inactive) [ 01/Feb/18 ]

With my patch, I have not been able to get any occurrence of the crash, even running the known reproducer for hours.

Comment by Sergey Cheremencev [ 22/May/18 ]

Just for history.

I also faced a similar problem, but with a slightly different stack trace:

[7620028.883060] Lustre: 130784:0:(llog_cat.c:93:llog_cat_new_log()) snx11035-MDD0000: there are no more free slots in catalog
[7620028.895696] LustreError: 130969:0:(llog_cat.c:329:llog_cat_current_log()) ASSERTION( llh ) failed: 
[7620028.906765] LustreError: 130969:0:(llog_cat.c:329:llog_cat_current_log()) LBUG
[7620028.915204] Pid: 130969, comm: mdt00_056
[7620028.919888] 
[7620028.919888] Call Trace:
[7620028.924859]  [<ffffffffa0503895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[7620028.932940]  [<ffffffffa0504007>] lbug_with_loc+0x47/0xb0 [libcfs]
[7620028.940167]  [<ffffffffa0644a02>] llog_cat_current_log+0x2c2/0x2d0 [obdclass]
[7620028.948527]  [<ffffffffa0647d81>] llog_cat_add_rec+0x61/0x450 [obdclass]
[7620028.956320]  [<ffffffffa063f169>] llog_add+0x89/0x1c0 [obdclass]
[7620028.963317]  [<ffffffffa0f80ab6>] ? osd_attr_set+0x166/0x460 [osd_ldiskfs]
[7620028.971293]  [<ffffffffa0dc54e2>] mdd_changelog_store+0xf2/0x260 [mdd]
[7620028.978887]  [<ffffffffa0dd2b96>] mdd_changelog_data_store+0x1a6/0x280 [mdd]
[7620028.987145]  [<ffffffffa0ddb171>] mdd_attr_set+0x1081/0x17a0 [mdd]
[7620028.994357]  [<ffffffffa0e430dd>] mdt_attr_set+0x25d/0x5c0 [mdt]
[7620029.001373]  [<ffffffffa0e43a3d>] mdt_reint_setattr+0x5fd/0xe40 [mdt]
[7620029.008868]  [<ffffffffa0e3c6dd>] mdt_reint_rec+0x5d/0x200 [mdt]
[7620029.015878]  [<ffffffffa0e22abb>] mdt_reint_internal+0x4cb/0x760 [mdt]
[7620029.023468]  [<ffffffffa0e231eb>] mdt_reint+0x6b/0x120 [mdt]
[7620029.031496]  [<ffffffffa09285de>] tgt_request_handle+0x6fe/0xaf0 [ptlrpc]
[7620029.039414]  [<ffffffffa08d7979>] ptlrpc_main+0xe49/0x1930 [ptlrpc]
[7620029.046727]  [<ffffffff810096f0>] ? __switch_to+0xd0/0x320
[7620029.053162]  [<ffffffff81525eee>] ? thread_return+0x4e/0x760
[7620029.059805]  [<ffffffffa08d6b30>] ? ptlrpc_main+0x0/0x1930 [ptlrpc]
[7620029.067103]  [<ffffffff8109ac66>] kthread+0x96/0xa0
[7620029.072842]  [<ffffffff8100c20a>] child_rip+0xa/0x20
[7620029.078677]  [<ffffffff8109abd0>] ? kthread+0x0/0xa0
[7620029.084519]  [<ffffffff8100c200>] ? child_rip+0x0/0x20

The problem is caused by a race between three threads: t1 and t2 in llog_cat_current_log(), and t3 in llog_cat_declare_add_rec().

  1. t3: llog_cat_declare_add_rec :: chd_next_log = loghandle /* loghandle is not initialized yet */
  2. t1: llog_cat_current_log :: llog is full, thus up_read LLOGH_CAT sem and try to use the next log with the write sem
  3. t2: llog_cat_current_log :: llog is full, thus up_read LLOGH_CAT sem and try to use the next log with the write sem
  4. t1: llog_cat_current_log :: holds write LLOGH_CAT semaphore /* llog is still full -> try to use next */
  5. t1: llog_cat_current_log :: chd_current_log = chd_next_log /* chd_next_log is not initialized yet */
  6. t1: llog_cat_current_log :: up_write LLOGH_CAT sem
  7. t2: llog_cat_current_log :: holds write LLOGH_CAT sem

I have successfully reproduced the problem on the latest master with a small delay in llog_cat_current_log():

diff --git a/lustre/obdclass/llog_cat.c b/lustre/obdclass/llog_cat.c
index 5f8b9d1..4587d94 100644
--- a/lustre/obdclass/llog_cat.c
+++ b/lustre/obdclass/llog_cat.c
@@ -377,6 +377,7 @@ static struct llog_handle *llog_cat_current_log(struct llog_handle *cathandle,
        up_read(&cathandle->lgh_lock);
 
        /* time to use next log */
+       schedule_timeout_interruptible(msecs_to_jiffies(2000));

Example of a reproducer:

lctl --device lustre-MDT0000 changelog_register
#define OBD_FAIL_CAT_RECORDS                        0x1312
lctl set_param fail_loc=0x1312
lctl set_param fail_val=2

for i in $(seq 0 50); do mkdir /mnt/lustre/dir$i; done

for i in $(seq 0 32768);
do
	for j in $(seq 0 50);
	do
		touch /mnt/lustre/dir$j/f$i&
	done
done

I have also reproduced it on master + https://review.whamcloud.com/30897.
I believe the LASSERT(llh) is no longer needed and can be removed. Combined with the existing 30897 patch, this should completely solve the issue.

The problem also exists in Lustre 2.x.

Comment by Bruno Faccini (Inactive) [ 09/Jun/18 ]

Thanks Sergey,
As we have already discussed in #30897, I have also removed the LASSERT(llh) in llog_cat_current_log() in my original patch for this ticket, and successfully tested it against both your reproducer and mine.

Comment by Gerrit Updater [ 03/Jul/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/30897/
Subject: LU-10527 obdclass: don't recycle loghandle upon ENOSPC
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 5761b9576d39c44c02455b86eb86ce1276930e60

Comment by Peter Jones [ 03/Jul/18 ]

Landed for 2.12

Comment by Gerrit Updater [ 13/Dec/18 ]

Sebastien Piechurski (sebastien.piechurski@atos.net) uploaded a new patch: https://review.whamcloud.com/33850
Subject: LU-10527 obdclass: don't recycle loghandle upon ENOSPC
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: 8e4d631e08409e7e42d92453c1117cdd8fc6f374

Comment by Stephane Thiell [ 29/Dec/18 ]

We just hit this MDT crash with 2.10, and the MDT wouldn't start. I had to clear changelog_users/catalog to be able to start the MDT again. It would be nice to backport this patch to b2_10 ASAP, thanks. Our average changelog rate on oak-MDT0000 is 6800 records/sec. We also still see tons of occurrences of LU-11205 "Failure to clear the changelog for user 1", which might be related. Peter, can this be considered critical/urgent?
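(For the record, a hedged sketch of the kind of manual cleanup mentioned above, assuming an ldiskfs-backed MDT that can be taken offline; the device path is an assumption, and removing these files discards all pending changelog data, so treat it strictly as a last resort:)

# Assumption: /dev/mdt_dev is the MDT backing device; adjust device and paths to your setup
umount /mnt/mdt                                      # stop the MDT
mount -t ldiskfs /dev/mdt_dev /mnt/mdt_ldiskfs       # mount the backing filesystem directly
rm /mnt/mdt_ldiskfs/changelog_catalog /mnt/mdt_ldiskfs/changelog_users   # discards all pending changelogs
umount /mnt/mdt_ldiskfs
mount -t lustre /dev/mdt_dev /mnt/mdt                # restart the MDT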

Comment by Gerrit Updater [ 05/Jan/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33850/
Subject: LU-10527 obdclass: don't recycle loghandle upon ENOSPC
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: 51e962be60cf599ecf154ea3a6b1c0f9882daac2

Comment by Stephane Thiell [ 05/Jan/19 ]

Thanks! 

Comment by Gerrit Updater [ 27/Feb/19 ]

Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34335
Subject: LU-10527 obdclass: don't recycle loghandle upon ENOSPC
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: 1307d8af3536514781e0a87c8d0ff3b19caae50d

Comment by Patrick Farrell (Inactive) [ 27/Feb/19 ]

This patch has come up as a possible culprit for dramatically increased MDT failover issues in soak testing; see LU-11943. The revert is just for testing at the moment; I am not actually proposing to revert it.

Comment by Patrick Farrell (Inactive) [ 28/Feb/19 ]

Removing the patch from the tip of b2_10 brings the abort rate on MDT failover from 50% down to 0%, so it's pretty clearly the culprit.

So:
The LU-10527 patch is causing failures on MDT failover (recovery aborting), reporting apparent llog corruption.

Messages vary, but this is a common example:
Feb 22 16:03:22 soak-10 kernel: LustreError: 12847:0:(llog.c:588:llog_process_thread()) soaked-MDT0003-osp-MDT0002: [0x1:0xc002abb4:0x2] Invalid record: index 4 but expected 2

Comment by Patrick Farrell (Inactive) [ 28/Feb/19 ]

Unless someone has a quick suggestion for a fix, the intention is to revert the patch while we look for a solution.

Comment by Gerrit Updater [ 28/Feb/19 ]

Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34346
Subject: LU-10527 llog: Reset current log on ENOSPC
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: 155f4e866fc8f14da2b082898c00c0643c8bc919

Comment by Gerrit Updater [ 28/Feb/19 ]

Patrick Farrell (pfarrell@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34347
Subject: LU-10527 llog: Reset current log on ENOSPC
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6ab95bb7d53a7c6831bcfc56ce0c29481fa39189

Comment by Patrick Farrell (Inactive) [ 28/Feb/19 ]

Switched from revert to hopeful fix...

Comment by Peter Jones [ 05/Mar/19 ]

So I think that this ticket can be closed - the fixes to the original issue were tracked under LU-11943
