Details
- Type: Bug
- Resolution: Fixed
- Priority: Minor
- None
- None
- 3
- 9223372036854775807

Description
== sanity-quota test 18: MDS failover while writing, no watchdog triggered (b14840) ========================================================== 08:41:17 (1686832877)
sleep 5 for ZFS zfs
Waiting for MDT destroys to complete
Creating test directory
fail_val=0
fail_loc=0
Waiting 90s for 'u'
Updated after 2s: want 'u' got 'u'
User quota (limit: 200)
Disk quotas for usr quota_usr (uid 60000):
Filesystem kbytes quota limit grace files quota limit grace
/mnt/lustre 0 0 204800 - 0 0 0 -
lustre-MDT0000_UUID
0 - 0 - 0 - 0 -
lustre-OST0000_UUID
0 - 0 - - - - -
lustre-OST0001_UUID
0 - 0 - - - - -
Total allocated inode limit: 0, total allocated block limit: 0
sysctl: cannot stat /proc/sys/lustre/timeout: No such file or directory
Write 100M (buffered) ...
running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
[dd] [if=/dev/zero] [bs=1M] [of=/mnt/lustre/d18.sanity-quota/f18.sanity-quota] [count=100]
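The bracketed tokens above are the test framework's echo of the dd command it runs as quota_usr. Reassembled, the write step looks roughly like the sketch below; the /tmp path and the smaller count are substitutions so it runs outside a Lustre setup (the test itself writes 100 MiB to /mnt/lustre/d18.sanity-quota/f18.sanity-quota):

```shell
# Sketch of the test's buffered write step. The real test runs this under
# runas as uid/gid 60000 against the Lustre mount; here /tmp stands in.
out=/tmp/f18.demo
dd if=/dev/zero of="$out" bs=1M count=4 status=none   # test uses count=100
stat -c %s "$out"    # 4 MiB written = 4194304 bytes
rm -f "$out"
```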
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 2210688 4096 2204544 1% /mnt/lustre[MDT:0]
lustre-OST0000_UUID 3771392 3072 3748864 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1]
filesystem_summary: 7542784 6144 7515136 1% /mnt/lustre
Fail mds for 0 seconds
Failing mds1 on oleg365-server
Stopping /mnt/lustre-mds1 (opts:) on oleg365-server
08:41:31 (1686832891) shut down
Failover mds1 to oleg365-server
mount facets: mds1
Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1
oleg365-server: oleg365-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 8
pdsh@oleg365-client: oleg365-server: ssh exited with exit code 1
Started lustre-MDT0000
08:41:48 (1686832908) targets are mounted
08:41:48 (1686832908) facet_failover done
oleg365-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 48.3932 s, 2.2 MB/s
(dd_pid=1833, time=25, timeout=600)
Disk quotas for usr quota_usr (uid 60000):
Filesystem kbytes quota limit grace files quota limit grace
/mnt/lustre 98310 0 204800 - 1 0 0 -
lustre-MDT0000_UUID
2* - 2 - 1 - 0 -
lustre-OST0000_UUID
98309 - 114688 - - - - -
lustre-OST0001_UUID
0 - 0 - - - - -
Total allocated inode limit: 0, total allocated block limit: 114688
Delete files...
Wait for unlink objects finished...
sleep 5 for ZFS zfs
sleep 5 for ZFS zfs
Waiting for MDT destroys to complete
sleep 5 for ZFS zfs
Waiting for MDT destroys to complete
Creating test directory
fail_val=0
fail_loc=0
User quota (limit: 200)
Disk quotas for usr quota_usr (uid 60000):
Filesystem kbytes quota limit grace files quota limit grace
/mnt/lustre 0 0 204800 - 0 0 0 -
lustre-MDT0000_UUID
0 - 0 - 0 - 0 -
lustre-OST0000_UUID
0 - 0 - - - - -
lustre-OST0001_UUID
0 - 0 - - - - -
Total allocated inode limit: 0, total allocated block limit: 0
sysctl: cannot stat /proc/sys/lustre/timeout: No such file or directory
Write 100M (directio) ...
running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
[dd] [if=/dev/zero] [bs=1M] [of=/mnt/lustre/d18.sanity-quota/f18.sanity-quota] [count=100] [oflag=direct]
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 2210560 3840 2204672 1% /mnt/lustre[MDT:0]
lustre-OST0000_UUID 3771392 3072 3758080 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1]
filesystem_summary: 7542784 6144 7524352 1% /mnt/lustre
Fail mds for 0 seconds
Failing mds1 on oleg365-server
Stopping /mnt/lustre-mds1 (opts:) on oleg365-server
08:42:46 (1686832966) shut down
Failover mds1 to oleg365-server
mount facets: mds1
Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1
oleg365-server: oleg365-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 8
pdsh@oleg365-client: oleg365-server: ssh exited with exit code 1
Started lustre-MDT0000
08:43:02 (1686832982) targets are mounted
08:43:02 (1686832982) facet_failover done
oleg365-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 52.5656 s, 2.0 MB/s
(dd_pid=4187, time=30, timeout=600)
Disk quotas for usr quota_usr (uid 60000):
Filesystem kbytes quota limit grace files quota limit grace
/mnt/lustre 102407 0 204800 - 1 0 0 -
lustre-MDT0000_UUID
2* - 2 - 1 - 0 -
lustre-OST0000_UUID
102406 - 107525 - - - - -
lustre-OST0001_UUID
0 - 0 - - - - -
Total allocated inode limit: 0, total allocated block limit: 107525
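As a cross-check on the quota figures above (a back-of-the-envelope calculation, not part of the test): dd reports 104857600 bytes copied, which is exactly 102400 KiB, while the quota dump shows 102407 kbytes charged to quota_usr; the small excess is block-allocation overhead:

```python
# Cross-check of the quota numbers reported in the log above.
bytes_written = 100 * 1024 * 1024   # dd: 100 records of 1 MiB = 104857600
kib = bytes_written // 1024         # quota accounting is in 1K blocks: 102400
reported = 102407                   # kbytes shown in the usr quota dump
overhead = reported - kib           # 7 KiB of allocation overhead
print(bytes_written, kib, overhead)
```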
Delete files...
Wait for unlink objects finished...
sleep 5 for ZFS zfs
sleep 5 for ZFS zfs
Waiting for MDT destroys to complete
sanity-quota test_18: @@@@@@ FAIL: [ 2836.180747] Lustre: ll_ost_io00_004: service thread pid 27906 was inactive for 40.067 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
Trace dump:
= /home/green/git/lustre-release/lustre/tests/test-framework.sh:6566:error()
= /home/green/git/lustre-release/lustre/tests/sanity-quota.sh:2945:test_18()
= /home/green/git/lustre-release/lustre/tests/test-framework.sh:6906:run_one()
= /home/green/git/lustre-release/lustre/tests/test-framework.sh:6955:run_one_logged()
= /home/green/git/lustre-release/lustre/tests/test-framework.sh:6792:run_test()
= /home/green/git/lustre-release/lustre/tests/sanity-quota.sh:2948:main()
Dumping lctl log to /tmp/testlogs//sanity-quota.test_18.*.1686833039.log
Delete files...
Wait for unlink objects finished...
rsync: chown "/tmp/testlogs/.sanity-quota.test_18.debug_log.oleg365-server.1686833039.log.4knRXN" failed: Operation not permitted (1)
rsync: chown "/tmp/testlogs/.sanity-quota.test_18.dmesg.oleg365-server.1686833039.log.Nvxt9o" failed: Operation not permitted (1)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1651) [generator=3.1.2]
sleep 5 for ZFS zfs
Waiting for MDT destroys to complete
Delete files...
Wait for unlink objects finished...
sleep 5 for ZFS zfs
Waiting for MDT destroys to complete