Details
Type: Bug
Resolution: Unresolved
Priority: Minor
Description
This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/57f0b610-268d-4d1e-be2b-d1898e8431b1
test_21c failed with the following error in the test logs:
trevis-98vm3 mkdir -p /mnt/lustre-ost2; mount -t lustre -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2
trevis-98vm3: mount.lustre: mount /dev/mapper/ost2_flakey at /mnt/lustre-ost2 failed: Invalid argument
trevis-98vm3: This may have multiple causes.
trevis-98vm3: Are the mount options correct?
trevis-98vm3: Check the syslog for more info.
The OST console log shows:
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-ost2; mount -t lustre -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2
LDISKFS-fs (dm-11): file extents enabled, maximum tree depth=5
LDISKFS-fs (dm-11): mounted filesystem with ordered data mode. Opts: errors=remount-ro
LDISKFS-fs (dm-11): file extents enabled, maximum tree depth=5
LDISKFS-fs (dm-11): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc
LustreError: 160-7: lustre-OST0001: the MGS refuses to allow this server to start: rc = -22. Please see messages on the MGS.
LustreError: 30487:0:(tgt_mount.c:2216:server_fill_super()) Unable to start targets: -22
LustreError: 30487:0:(tgt_mount.c:1752:server_put_super()) no obd lustre-OST0001
LustreError: 30487:0:(tgt_mount.c:132:server_deregister_mount()) lustre-OST0001 not registered
Lustre: server umount lustre-OST0001 complete
LustreError: 30487:0:(super25.c:188:lustre_fill_super()) llite: Unable to mount <unknown>: rc = -22
The MGS console log shows:
LustreError: 14228:0:(mgs_llog.c:4541:mgs_write_log_target()) Can't write logs for lustre-OST0001 (-22)
LustreError: 14228:0:(mgs_handler.c:518:mgs_target_reg()) Failed to write lustre-OST0001 log (-22)
Lustre: DEBUG MARKER: /usr/sbin/lctl mark conf-sanity test_21c: @@@@@@ FAIL: Unable to start OST2
Lustre: DEBUG MARKER: conf-sanity test_21c: @@@@@@ FAIL: Unable to start OST2
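For anyone triaging similar runs, the LustreError lines quoted in both console logs can be split into fields with a small script. The sketch below is only an illustration: the layout (process ID, a second numeric field, source file, line number, function, message) is inferred from the lines in this ticket rather than taken from Lustre documentation, and the name parse_lustre_error is hypothetical.

```python
import re

# Layout inferred from the lines quoted in this ticket only (an assumption):
#   LustreError: <pid>:<n>:(<file>:<line>:<function>()) <message>
LUSTRE_ERROR_RE = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"(?P<msg>.*)"
)


def parse_lustre_error(line):
    """Return a dict of fields, or None if the line does not match the layout."""
    m = LUSTRE_ERROR_RE.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Convert the numeric fields so they can be compared or sorted.
    fields["pid"] = int(fields["pid"])
    fields["line"] = int(fields["line"])
    return fields


sample = ("LustreError: 14228:0:(mgs_llog.c:4541:mgs_write_log_target()) "
          "Can't write logs for lustre-OST0001 (-22)")
print(parse_lustre_error(sample))
```

Lines that use a different prefix, such as the "LustreError: 160-7:" message in the OST log, do not match this pattern and return None.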
Test session details:
clients: https://build.whamcloud.com/job/lustre-master/4455 - 4.18.0-477.15.1.el8_8.x86_64
servers: https://build.whamcloud.com/job/lustre-master/4455 - 4.18.0-477.15.1.el8_lustre.x86_64
It appears that the "Unable to start OST2" problem started on 2023-08-19. There were previous intermittent failures with "Unable to start OST1" but these were all fallout from earlier failures.
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
conf-sanity test_21c - Unable to start OST2
Issue Links
is related to: LU-13306 allow clients to accept mgs_nidtbl_entry with IPv6 NIDs (Resolved)
Still failing. I looked at the logs and I think it's some LNet issue on the MGS. I see this in the logs on the MGS:
This leads to a timeout (-110) for the OST registration with the MGS. On the OST it's a ptlrpc timeout issue.
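For reference, the two return codes seen in this ticket (-22 in the original failure, -110 in the later one) are negated Linux errno values. A minimal sketch that decodes them is below; the table is hard-coded with the Linux values because errno numbering differs between platforms, and the helper name describe_rc is hypothetical.

```python
# Linux errno values for the two return codes mentioned in this ticket.
# Hard-coded rather than read from the errno module, because errno
# numbering is platform-specific.
LINUX_ERRNO = {
    22: ("EINVAL", "Invalid argument"),
    110: ("ETIMEDOUT", "Connection timed out"),
}


def describe_rc(rc):
    """Describe a negative kernel-style return code such as -22."""
    name, text = LINUX_ERRNO.get(abs(rc), ("UNKNOWN", "not in this table"))
    return f"rc = {rc}: {name} ({text})"


for rc in (-22, -110):
    print(describe_rc(rc))
```

The -22 case matches the "Invalid argument" text that mount.lustre printed in the test log, while -110 points at a timeout rather than a rejected request.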