[LU-5257] Rolling upgrade from 2.4 to master failed as: LustreError: 6579:0:(lustre_log.h:440:llog_group_get_ctxt()) ASSERTION( index >= 0 && index < LLOG_MAX_CTXTS ) failed Created: 25/Jun/14  Updated: 25/Jun/14  Resolved: 25/Jun/14

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Sarah Liu
Assignee: WC Triage
Resolution: Duplicate
Votes: 0
Labels: None
Environment:

before upgrade client and server: 2.4
after upgrade server: master-lustre build # 2091
client 1: 2.4
client 2: master


Issue Links:
Duplicate
is duplicated by LU-5218 Interop 2.5.1<->2.6 failure on test s... Resolved
Severity: 3
Rank (Obsolete): 14666

 Description   

The system is configured with 1 MDS, 1 OST, and 2 clients. After upgrading both servers and one client from 2.4 to master, the MDS hit an LBUG while running sanity test 160a.

MDS shows:

[root@fat-amd-1 ~]# Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect
Lustre: lustre-MDT0000: Denying connection for new client lustre-MDT0000-lwp-OST0000_UUID (at 10.10.4.133@tcp), waiting for all 2 known clients (0 recovered, 0 in progress, and 0 evicted) to recover in 0:59
Lustre: lustre-MDT0000: Denying connection for new client lustre-MDT0000-lwp-OST0000_UUID (at 10.10.4.133@tcp), waiting for all 2 known clients (0 recovered, 0 in progress, and 0 evicted) to recover in 0:55
Lustre: MGS: non-config logname received: params
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0000: Denying connection for new client lustre-MDT0000-lwp-OST0000_UUID (at 10.10.4.133@tcp), waiting for all 2 known clients (0 recovered, 0 in progress, and 0 evicted) to recover in 0:50
Lustre: lustre-MDT0000: Denying connection for new client lustre-MDT0000-lwp-OST0000_UUID (at 10.10.4.133@tcp), waiting for all 2 known clients (0 recovered, 0 in progress, and 0 evicted) to recover in 0:45
Lustre: lustre-MDT0000: Denying connection for new client lustre-MDT0000-lwp-OST0000_UUID (at 10.10.4.133@tcp), waiting for all 2 known clients (0 recovered, 0 in progress, and 0 evicted) to recover in 0:40
Lustre: lustre-MDT0000: Recovery over after 0:28, of 2 clients 2 recovered and 0 were evicted.
LustreError: 6579:0:(lustre_log.h:440:llog_group_get_ctxt()) ASSERTION( index >= 0 && index < LLOG_MAX_CTXTS ) failed: 
LustreError: 6579:0:(lustre_log.h:440:llog_group_get_ctxt()) LBUG
Pid: 6579, comm: mdt02_003

Call Trace:
 [<ffffffffa0398895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa0398e97>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa0740868>] llog_origin_handle_open+0x668/0x670 [ptlrpc]
 [<ffffffffa0784b35>] tgt_llog_open+0x35/0xd0 [ptlrpc]
 [<ffffffffa078b2cc>] tgt_request_handle+0x23c/0xac0 [ptlrpc]
 [<ffffffffa073ad3a>] ptlrpc_main+0xd1a/0x1980 [ptlrpc]
 [<ffffffffa073a020>] ? ptlrpc_main+0x0/0x1980 [ptlrpc]
 [<ffffffff8109ab56>] kthread+0x96/0xa0
 [<ffffffff8100c20a>] child_rip+0xa/0x20
 [<ffffffff8109aac0>] ? kthread+0x0/0xa0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20

Message from syslogd@fat-amd-1 at Jun 25 09:53:09 ...
 kernel:LustreError: 6579:0:(lustre_log.h:440:llog_group_get_ctxt()) ASSERTION( index >= 0 && index < LLOG_MAX_CTXTS ) failed: 


Message from syslogd@fat-amd-1 at Jun 25 09:53:09 ...
 kernel:LustreError: 6579:0:(lustre_log.h:440:llog_group_get_ctxt()) LBUG

Kernel panic - not syncing: LBUG
Pid: 6579, comm: mdt02_003 Not tainted 2.6.32-431.17.1.el6_lustre.g8d5344f.x86_64 #1
Call Trace:
 [<ffffffff8152795f>] ? panic+0xa7/0x16f
 [<ffffffffa0398eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
 [<ffffffffa0740868>] ? llog_origin_handle_open+0x668/0x670 [ptlrpc]
 [<ffffffffa0784b35>] ? tgt_llog_open+0x35/0xd0 [ptlrpc]
 [<ffffffffa078b2cc>] ? tgt_request_handle+0x23c/0xac0 [ptlrpc]
 [<ffffffffa073ad3a>] ? ptlrpc_main+0xd1a/0x1980 [ptlrpc]
 [<ffffffffa073a020>] ? ptlrpc_main+0x0/0x1980 [ptlrpc]
 [<ffffffff8109ab56>] ? kthread+0x96/0xa0
 [<ffffffff8100c20a>] ? child_rip+0xa/0x20
 [<ffffffff8109aac0>] ? kthread+0x0/0xa0
 [<ffffffff8100c200>] ? child_rip+0x0/0x20
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
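
The failed assertion in the log above is a bounds check on an array index (index >= 0 && index < LLOG_MAX_CTXTS). The call trace (tgt_request_handle -> tgt_llog_open -> llog_origin_handle_open) suggests the index was derived from an incoming llog open request, though the ticket does not state this. The sketch below is not Lustre code: every name in it (SKETCH_MAX_CTXTS, sketch_group, sketch_get_ctxt_asserting, sketch_get_ctxt_checked) is hypothetical, and the limit of 16 is a placeholder because the ticket does not give the real value. It only contrasts a lookup that aborts on a bad index, as the LBUG did, with one that rejects the index and lets the caller return an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for LLOG_MAX_CTXTS; the real value is not in this ticket. */
#define SKETCH_MAX_CTXTS 16

struct sketch_ctxt {
    int in_use;
};

struct sketch_group {
    struct sketch_ctxt *ctxts[SKETCH_MAX_CTXTS];
};

/* Same shape as the failing check: an out-of-range index aborts the caller. */
static struct sketch_ctxt *
sketch_get_ctxt_asserting(struct sketch_group *grp, int index)
{
    assert(index >= 0 && index < SKETCH_MAX_CTXTS);
    return grp->ctxts[index];
}

/* Alternative for an index supplied by a peer: reject it instead of aborting. */
static struct sketch_ctxt *
sketch_get_ctxt_checked(struct sketch_group *grp, int index)
{
    if (index < 0 || index >= SKETCH_MAX_CTXTS)
        return NULL; /* caller can turn this into an error reply */
    return grp->ctxts[index];
}
```

With the checked variant, a request carrying an index the server does not recognize would fail with an error on that one request rather than taking down the whole node, which matters when peers of different versions can disagree about the valid index range.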

Generated at Sat Feb 10 01:49:54 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.