[LU-7364] conf-sanity test_84 fails with llog_process_thread()) invalid length 0 in llog record for index 0/68 'Restart of mds1 failed!' Created: 30/Oct/15  Updated: 01/Nov/18

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: James Nunez (Inactive) Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None
Environment: autotest

Issue Links:
Related
is related to LU-7222 conf-sanity test_84: invalid llog tai... Resolved
is related to LU-7428 conf-sanity test_84, replay-dual 0a: ... Resolved
Severity: 3

 Description   

conf-sanity test_84 fails when trying to restart the MDS. Logs are at https://testing.hpdd.intel.com/test_sets/d729ab02-7ed7-11e5-a7b1-5254006e85c2

The problem seems to be in the llog record. From the MDS1 console:

19:18:35:LDISKFS-fs (dm-0): recovery complete
19:18:35:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts: 
19:18:35:Lustre: 12673:0:(llog.c:520:llog_process_thread()) invalid length 0 in llog record for index 0/68
19:18:35:LustreError: 15b-f: MGC10.2.4.149@tcp: The configuration from log 'lustre-MDT0000'failed from the MGS (-22).  Make sure this client and the MGS are running compatible versions of Lustre.
19:18:35:LustreError: 12627:0:(obd_mount_server.c:1306:server_start_targets()) failed to start server lustre-MDT0000: -22
19:18:35:LustreError: 12627:0:(obd_mount_server.c:1794:server_fill_super()) Unable to start targets: -22
19:18:35:Lustre: Failing over lustre-MDT0000
19:18:35:LustreError: 12627:0:(obd_mount.c:1342:lustre_fill_super()) Unable to mount  (-22)
19:18:35:Lustre: DEBUG MARKER: /usr/sbin/lctl mark  conf-sanity test_84: @@@@@@ FAIL: Restart of mds1 failed! 
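
For reference, the -22 in the messages above is the Linux errno EINVAL. Below is a minimal sketch for pulling the reporting function and the errno name out of console lines like these when triaging similar failures. It assumes the lines follow the format shown above; the summarize() helper and both regular expressions are illustrative only and are not part of Lustre or the test framework.

```python
import errno
import re

# Matches the "(file.c:line:function())" tag found in Lustre console messages.
LOCATION = re.compile(r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)")
# Matches a negative return code written as ": -22" or "(-22)".
RETCODE = re.compile(r"(?::\s|\()-(?P<code>\d+)\b")


def summarize(line):
    """Return (function, errno name) for one console line; either may be None."""
    loc = LOCATION.search(line)
    rc = RETCODE.search(line)
    func = loc.group("func") if loc else None
    name = errno.errorcode.get(int(rc.group("code"))) if rc else None
    return func, name


if __name__ == "__main__":
    sample = ("19:18:35:LustreError: 12627:0:(obd_mount_server.c:1306:"
              "server_start_targets()) failed to start server lustre-MDT0000: -22")
    print(summarize(sample))
```

Lines that carry no return code, such as the llog_process_thread() message, yield None for the errno name, so they remain distinguishable from the follow-on mount failures.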

This issue looks similar to LU-7222, but we don’t see the error message about the llog tail.



 Comments   
Comment by Andreas Dilger [ 24/Nov/15 ]

I suspect that this is still the same problem as LU-7222, due to the "llog_process_thread()) invalid length 0 in llog record for index 0/68" message, just a slightly different form of corrupt llog.

Comment by Gu Zheng (Inactive) [ 01/Nov/18 ]

This seems to be another instance:

https://testing.whamcloud.com/test_sets/5a8502de-dd0e-11e8-b46b-52540065bddc
