[LU-7222] conf-sanity test_84: invalid llog tail at log id 0x4:10/0 offset 16384 Created: 28/Sep/15  Updated: 13/Oct/21  Resolved: 04/Dec/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Critical
Reporter: Maloo Assignee: Di Wang
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-7364 conf-sanity test_84 fails with llog_p... Open
is related to LU-7428 conf-sanity test_84, replay-dual 0a: ... Resolved
is related to LU-7097 conf-sanity test_84 (check recovery_t... Resolved
is related to LU-6789 Interop 2.5.3<->master conf-sanity te... Resolved
is related to LU-7100 conf-sanity test_84 LBUGS with “(llog... Closed
Severity: 3

Description

This issue was created by maloo for wangdi <di.wang@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/55681176-63f7-11e5-bcf0-5254006e85c2.

The sub-test test_84 failed with the following error:

00:04:11:Lustre: DEBUG MARKER: == conf-sanity test 84: check recovery_hard_time == 00:03:26 (1443225806)
00:04:11:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1
00:04:11:Lustre: DEBUG MARKER: test -b /dev/lvm-Role_MDS/P1
00:04:11:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre -o recovery_time_hard=60,recovery_time_soft=60  		                   /dev/lvm-Role_MDS/P1 /mnt/mds1
00:04:11:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts: 
00:04:11:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts: 
00:04:11:Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lust
00:04:11:Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
00:04:11:Lustre: DEBUG MARKER: e2label /dev/lvm-Role_MDS/P1 2>/dev/null
00:04:11:Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
00:04:11:Lustre: DEBUG MARKER: sync; sync; sync
00:04:11:Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno
00:04:11:Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly
00:04:11:LustreError: 27572:0:(osd_handler.c:1380:osd_ro()) *** setting lustre-MDT0000 read-only ***
00:04:11:Turning device dm-0 (0xfd00000) read-only
00:04:11:Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000
00:04:11:Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
00:04:11:Lustre: DEBUG MARKER: lctl set_param fail_loc=0x20000709 fail_val=5
00:04:11:Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
00:04:11:Lustre: DEBUG MARKER: umount -d /mnt/mds1
00:04:11:Lustre: Failing over lustre-MDT0000
00:04:11:Removing read-only on unknown block (0xfd00000)
00:04:11:Lustre: server umount lustre-MDT0000 complete
00:04:11:Lustre: Skipped 4 previous similar messages
00:04:11:Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
00:04:11:Lustre: DEBUG MARKER: hostname
00:04:11:Lustre: DEBUG MARKER: test -b /dev/lvm-Role_MDS/P1
00:04:11:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre -o recovery_time_hard=60,recovery_time_soft=60  		                   /dev/lvm-Role_MDS/P1 /mnt/mds1
00:04:11:LDISKFS-fs (dm-0): recovery complete
00:04:11:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts: 
00:04:11:LustreError: 28074:0:(llog_osd.c:866:llog_osd_next_block()) lustre-MDT0000-osd: invalid llog tail at log id 0x4:10/0 offset 16384
00:04:11:LustreError: 28055:0:(mgs_llog.c:457:mgs_find_or_make_fsdb()) Can't get db from client log -22
00:04:11:LustreError: 28055:0:(mgs_llog.c:496:mgs_check_index()) Can't get db for lustre
00:04:11:Lustre: MGS: Unable to add client 0@lo to file system lustre: -22
00:04:11:LustreError: 15b-f: MGC10.1.4.27@tcp: The configuration from log 'lustre-MDT0000' failed from the MGS (-22).  Make sure this client and the MGS are running compatible versions of Lustre.
00:04:42:LustreError: 28028:0:(obd_mount_server.c:1306:server_start_targets()) failed to start server lustre-MDT0000: -22
00:04:42:LustreError: 28028:0:(obd_mount_server.c:1794:server_fill_super()) Unable to start targets: -22
00:04:42:LustreError: 28028:0:(obd_mount_server.c:1509:server_put_super()) no obd lustre-MDT0000
00:04:42:LustreError: 28028:0:(obd_mount.c:1342:lustre_fill_super()) Unable to mount  (-22)
00:04:42:Lustre: DEBUG MARKER: /usr/sbin/lctl mark  conf-sanity test_84: @@@@@@ FAIL: Restart of mds1 failed! 
00:04:42:Lustre: DEBUG MARKER: conf-sanity test_84: @@@@@@ FAIL: Restart of mds1 failed!
00:04:42:Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /logdir/test_logs/2015-09-25/lustre-reviews-el6_7-x86_64--review-dne-part-1--1_6_1__34823__-69992788388800-135455/conf-sanity.test_84.debug_log.$(hostname -s).1443225849.log;
00:04:43:         dmesg > /logdir/test_logs/2015-09-25/lustre-reviews
00:04:43:Lustre: DEBUG MARKER: /usr/sbin/lctl mark == conf-sanity test 85: osd_ost init: fail ea_fid_set == 00:04:11 \(1443225851\)
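
The key failure is the "invalid llog tail" message from llog_osd_next_block(): each llog chunk is terminated by a struct llog_rec_tail whose length and index must match the record it closes, and when that check fails the MGS cannot parse the client config log, so mgs_find_or_make_fsdb() returns -EINVAL (-22) and the remount aborts. For readers unfamiliar with the replay-barrier pattern the test uses, the DEBUG MARKER commands above condense to the following shell steps (a reconstruction from the console log, not the test script itself; conf-sanity drives these through test-framework.sh helpers):

# Reconstructed failing sequence (all commands run on the MDS node):
mount -t lustre -o recovery_time_hard=60,recovery_time_soft=60 \
    /dev/lvm-Role_MDS/P1 /mnt/mds1
sync; sync; sync                              # flush pending updates
lctl --device lustre-MDT0000 notransno        # stop assigning transnos
lctl --device lustre-MDT0000 readonly         # freeze the backing device
lctl set_param fail_loc=0x20000709 fail_val=5 # inject the test's failure point
umount -d /mnt/mds1                           # fail over MDT0000
mount -t lustre -o recovery_time_hard=60,recovery_time_soft=60 \
    /dev/lvm-Role_MDS/P1 /mnt/mds1            # fails with -22 (invalid llog tail)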


Comments
Comment by Gerrit Updater [ 28/Sep/15 ]

wangdi (di.wang@intel.com) uploaded a new patch: http://review.whamcloud.com/16662
Subject: LU-7222 tests: add Mulitple MDTs to test_84
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2fac3142cc66d49991fc6a2c63611cb637d61962

Comment by Di Wang [ 28/Sep/15 ]

This patch does not fix the issue; it only fixes the test script and adds more information to the error message. The root cause of the error is still unknown.
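
For context, the test-script half of the patch follows the usual conf-sanity pattern of looping a test body over all configured MDTs. A minimal sketch of that pattern, assuming the standard test-framework.sh helpers (replay_barrier, fail) and the $MDSCOUNT variable, rather than the literal content of the merged patch:

# Hypothetical multi-MDT loop (illustration only, not the merged patch):
for num in $(seq $MDSCOUNT); do
    replay_barrier mds$num        # sync, notransno, device read-only
done
# ... generate updates that must be replayed during recovery ...
for num in $(seq $MDSCOUNT); do
    fail mds$num                  # unmount/remount to force recovery
done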

Comment by Gerrit Updater [ 07/Oct/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/16662/
Subject: LU-7222 tests: add Mulitple MDTs to test_84
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 881b288d70318644098c335b92f07388e9e2d3a5

Comment by Andreas Dilger [ 24/Nov/15 ]

Linking to two other common failures in conf-sanity test_84 which may be related.

Comment by Peter Jones [ 04/Dec/15 ]

This particular variation of this test failure has been addressed in 2.8
