[LU-15671] conf-sanity test_30b: MDS assertion in osp_precreate_send req->rq_transno == 0
Created: 22/Mar/22  Updated: 25/Oct/23  Resolved: 20/Jun/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.16.0, Lustre 2.15.3
Fix Version/s: Lustre 2.16.0

Type: Bug
Priority: Critical
Reporter: Maloo
Assignee: Sergey Cheremencev
Resolution: Fixed
Votes: 0
Labels: LTS15, failing_tests

Issue Links:
Duplicate: is duplicated by LU-16264 "assertion in osp_precreate_send req->..." (Reopened)
Related: is related to LU-10594 "conf-sanity test_30b: FAIL: check lus..." (Open)
Severity: 3

Description

This issue was created by maloo for S Buisson <sbuisson@ddn.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/d3ce7316-3e9e-4249-abb1-799a5eb3934c

test_30b failed with the following error:

trevis-67vm4, trevis-67vm5 crashed during conf-sanity test_30b

Both MDSes crashed with the following stack trace:

[ 5524.915839] LustreError: 167-0: lustre-OST0000-osc-MDT0002: This client was evicted by lustre-OST0000; in progress operations using this service will fail.
[ 5524.918788] LustreError: Skipped 1 previous similar message
[ 5524.920502] Lustre: lustre-OST0000-osc-MDT0002: Connection restored to 10.240.41.194@tcp (at 10.240.41.194@tcp)
[ 5524.922306] Lustre: Skipped 1 previous similar message
[ 5524.954841] LustreError: 17882:0:(osp_precreate.c:683:osp_precreate_send()) ASSERTION( req->rq_transno == 0 ) failed: 
[ 5524.956895] LustreError: 17882:0:(osp_precreate.c:683:osp_precreate_send()) LBUG
[ 5524.958304] Pid: 17882, comm: osp-pre-0-2 3.10.0-1160.59.1.el7_lustre.ddn16.x86_64 #1 SMP Wed Mar 9 19:03:32 UTC 2022
[ 5524.960268] Call Trace:
[ 5524.960880] [<0>] libcfs_call_trace+0x90/0xf0 [libcfs]
[ 5524.961867] [<0>] lbug_with_loc+0x4c/0xa0 [libcfs]
[ 5524.962863] [<0>] osp_precreate_send+0x1126/0x11d0 [osp]
[ 5524.963851] [<0>] osp_precreate_thread+0x6ba/0x13b0 [osp]
[ 5524.964945] [<0>] kthread+0xd1/0xe0
[ 5524.965682] [<0>] ret_from_fork_nospec_begin+0x21/0x21
[ 5524.966771] [<0>] 0xfffffffffffffffe
[ 5524.967479] Kernel panic - not syncing: LBUG
[ 5524.968288] CPU: 1 PID: 17882 Comm: osp-pre-0-2 Kdump: loaded Tainted: G           OE  ------------   3.10.0-1160.59.1.el7_lustre.ddn16.x86_64 #1
[ 5524.970699] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 5524.971780] Call Trace:
[ 5524.972295]  [<ffffffff9d9865b9>] dump_stack+0x19/0x1b
[ 5524.973283]  [<ffffffff9d9802c1>] panic+0xe8/0x21f
[ 5524.974207]  [<ffffffffc0845a1b>] lbug_with_loc+0x9b/0xa0 [libcfs]
[ 5524.975364]  [<ffffffffc13c3326>] osp_precreate_send+0x1126/0x11d0 [osp]
[ 5524.976632]  [<ffffffffc13c47ca>] osp_precreate_thread+0x6ba/0x13b0 [osp]
[ 5524.977901]  [<ffffffff9d2c6f50>] ? wake_up_atomic_t+0x30/0x30
[ 5524.978998]  [<ffffffffc13c4110>] ? osp_init_pre_fid+0x630/0x630 [osp]
[ 5524.980214]  [<ffffffff9d2c5e61>] kthread+0xd1/0xe0
[ 5524.981143]  [<ffffffff9d2c5d90>] ? insert_kthread_work+0x40/0x40
[ 5524.982283]  [<ffffffff9d999df7>] ret_from_fork_nospec_begin+0x21/0x21
[ 5524.983492]  [<ffffffff9d2c5d90>] ? insert_kthread_work+0x40/0x40

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
conf-sanity test_30b - trevis-67vm4, trevis-67vm5 crashed during conf-sanity test_30b



Comments
Comment by Etienne Aujames [ 29/Aug/22 ]

+1 on b2_12: https://testing.whamcloud.com/test_sets/81998c03-2335-4086-b176-45c24e98ea62

Comment by Serguei Smirnov [ 12/Sep/22 ]

+1 on master: https://testing.whamcloud.com/test_sets/14bf8217-d3dc-47b3-88df-7a1af8dcff45

Comment by Peter Jones [ 15/Sep/22 ]

Lai

Could you please investigate when time permits?

Thanks

Peter

Comment by Peter Jones [ 15/Sep/22 ]

Actually, maybe Alex Z has been more active in this area recently...

Comment by Gerrit Updater [ 18/May/23 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51056
Subject: LU-15671 mds: do not send OST_CREATE transno interop
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: c41d1fabdc2a76f8b400efd14e8928bf550635ac

Comment by Gerrit Updater [ 20/Jun/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51056/
Subject: LU-15671 mds: do not send OST_CREATE transno interop
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 9ee1281060d0a00a9c5d715a9a6d9b99c27123ff

Comment by Peter Jones [ 20/Jun/23 ]

Landed for 2.16

Comment by Gerrit Updater [ 19/Sep/23 ]

"Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52418
Subject: LU-15671 llite: cleanup code style in xattr.c
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 7f0332817a3f05bc780a6b08615c26092197f1a9

Comment by Gerrit Updater [ 25/Oct/23 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52418/
Subject: LU-15671 llite: cleanup code style in xattr.c
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 40ce52f02cf964327440804abb5da31f750476eb
