Details
- Bug
- Resolution: Duplicate
- Critical
- Lustre 2.8.0
Description
https://testing.hpdd.intel.com/test_sessions/e089506c-5bf0-11e5-9dac-5254006e85c2
LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
Lustre: Setting parameter lustre-MDT0000-mdtlov.lov.stripesize in log lustre-MDT0000
Lustre: Skipped 79 previous similar messages
Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400):0:mdt
Lustre: Skipped 26 previous similar messages
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests//usr/lib64/lustre/tests:/usr/lib64/lustre/tests:/usr/lib64/lustre/tests/../utils:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lust
Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
Lustre: DEBUG MARKER: e2label /dev/lvm-Role_MDS/P1 2>/dev/null
Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
Lustre: DEBUG MARKER: sync; sync; sync
Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno
Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly
LustreError: 14276:0:(osd_handler.c:1380:osd_ro()) *** setting lustre-MDT0000 read-only ***
Turning device dm-0 (0xfd00000) read-only
Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000
Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
Lustre: DEBUG MARKER: lctl set_param fail_loc=0x20000709 fail_val=5
Lustre: DEBUG MARKER: grep -c /mnt/mds1' ' /proc/mounts
Lustre: DEBUG MARKER: umount -d /mnt/mds1
Lustre: Failing over lustre-MDT0000
Removing read-only on unknown block (0xfd00000)
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && lctl dl | grep ' ST '
Lustre: DEBUG MARKER: hostname
Lustre: DEBUG MARKER: test -b /dev/lvm-Role_MDS/P1
Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre -o recovery_time_hard=60,recovery_time_soft=60 /dev/lvm-Role_MDS/P1 /mnt/mds1
LDISKFS-fs (dm-0): recovery complete
LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts:
LDISKFS-fs error (device dm-0): ldiskfs_lookup: deleted inode referenced: 75023
Aborting journal on device dm-0-8.
LDISKFS-fs (dm-0): Remounting filesystem read-only
LDISKFS-fs error (device dm-0): ldiskfs_put_super: Couldn't clean up the journal
LustreError: 14732:0:(obd_config.c:575:class_setup()) setup lustre-MDT0000-osd failed (-30)
LustreError: 14732:0:(obd_mount.c:203:lustre_start_simple()) lustre-MDT0000-osd setup error -30
LustreError: 14732:0:(obd_mount_server.c:1760:server_fill_super()) Unable to start osd on /dev/mapper/lvm--Role_MDS-P1: -30
LustreError: 14732:0:(obd_mount.c:1342:lustre_fill_super()) Unable to mount (-30)
Lustre: DEBUG MARKER: /usr/sbin/lctl mark conf-sanity test_84: @@@@@@ FAIL: Restart of mds1 failed!
It looks like the filesystem is corrupted somehow.
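The -30 in the LustreError lines is a negated errno value; on Linux, errno 30 is EROFS (read-only file system), which matches the "Remounting filesystem read-only" message just before the failed setup. As a minimal sketch, the return codes can be pulled out of console log lines and decoded with Python's standard `errno` module. The helper name `decode_rc` and the regular expression are illustrative assumptions, not part of Lustre or its test framework.

```python
import errno
import os
import re

# Sample lines copied from the console log above.
LOG_LINES = [
    "LustreError: 14732:0:(obd_config.c:575:class_setup()) setup lustre-MDT0000-osd failed (-30)",
    "LustreError: 14732:0:(obd_mount.c:203:lustre_start_simple()) lustre-MDT0000-osd setup error -30",
    "LustreError: 14732:0:(obd_mount.c:1342:lustre_fill_super()) Unable to mount (-30)",
]

# Hypothetical pattern: a negative integer at the very end of the line,
# optionally wrapped in parentheses.
RC_PATTERN = re.compile(r"\(?(-\d+)\)?\s*$")


def decode_rc(line):
    """Return (rc, errno name, message) for a trailing negative return code, or None."""
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    rc = int(match.group(1))
    # errno.errorcode maps positive errno numbers to their symbolic names.
    name = errno.errorcode.get(-rc, "UNKNOWN")
    return rc, name, os.strerror(-rc)


for line in LOG_LINES:
    print(decode_rc(line))
```

Lines without a trailing negative number (for example the LDISKFS-fs messages) return None, so the helper can be run over a whole log without filtering first.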
Issue Links
- is related to
  - LU-7428 conf-sanity test_84, replay-dual 0a: /dev/lvm-Role_MDS/P1 failed to initialize! (Resolved)
  - LU-6789 Interop 2.5.3<->master conf-sanity test_84: completed_clients != 1/2: 2/2 (Resolved)
  - LU-6895 sanity-lfsck test 4 hung: bad entry in directory: rec_len is smaller than minimal - inode=3925999616 (Resolved)
There are many, many failures of this test, but unfortunately they have all been assigned different bugs because the error messages are different.
In the tests I've seen, the e2fsck run is clean, except for the superblock inode and block counts, which is expected.
I pushed a patch under LU-7428 that may fix the problem. I think it is caused by test_84() setting the MDS read-only right after mount, which causes some of the recently written data to be discarded (e.g. superblock label, llog records, etc.). Unfortunately, it will take a few days to be tested.
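To check where the read-only switch falls relative to the other operations in a given console log, one option is to list the DEBUG MARKER commands in the order they were logged. The following is only a sketch: the function name `marker_commands` is hypothetical, and the excerpt is copied from the log in the description rather than from a new run.

```python
# Excerpt of the console log from the description, one message per line.
LOG = """\
Lustre: DEBUG MARKER: sync; sync; sync
Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno
Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly
LustreError: 14276:0:(osd_handler.c:1380:osd_ro()) *** setting lustre-MDT0000 read-only ***
Turning device dm-0 (0xfd00000) read-only
Lustre: DEBUG MARKER: umount -d /mnt/mds1
"""

MARKER = "Lustre: DEBUG MARKER: "


def marker_commands(log_text):
    """Return the commands recorded in DEBUG MARKER lines, in log order."""
    return [
        line[len(MARKER):]
        for line in log_text.splitlines()
        if line.startswith(MARKER)
    ]


for number, command in enumerate(marker_commands(LOG), start=1):
    print(number, command)
```

This only recovers the order of the commands the test framework logged; it says nothing about which writes had actually reached disk when the device was made read-only.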