[LU-12831] sanity test_413b timed out Created: 04/Oct/19 Updated: 28/Oct/21 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.13.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | dne, zfs |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for jianyu <yujian@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/a660d22a-e346-11e9-a0ba-52540065bddc

test_413b failed with the following error:

weight diff=-25% must be > 50% ...Fill MDT0 with 45712 files
weight diff=-25% must be > 50% ...Fill MDT0 with 45670 files
weight diff=-25% must be > 50% ...Fill MDT0 with 45712 files
Timeout occurred after 447 mins, last suite running was sanity, restarting cluster to continue tests

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV |
| Comments |
| Comment by Andreas Dilger [ 04/Oct/19 ] |
|
There should probably be a limit on the number of files that need to be created to make the MDTs imbalanced. That limit might be increased for SLOW=yes tests. |
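The cap suggested above could look roughly like the following. This is a minimal sketch in Python, not the actual sanity.sh logic: the function name `capped_fill_count`, the default `base_limit` of 20000 files, and the `slow_multiplier` of 5 are all hypothetical values chosen for illustration.

```python
def capped_fill_count(files_needed, slow=False, base_limit=20000, slow_multiplier=5):
    """Return (files_to_create, capped) for one fill pass.

    files_needed    -- files the test estimates it needs to unbalance the MDTs
    slow            -- True when the run is a SLOW=yes run (assumed flag)
    base_limit      -- hypothetical cap for normal runs
    slow_multiplier -- hypothetical factor by which SLOW=yes raises the cap
    """
    if files_needed < 0:
        raise ValueError("files_needed must be non-negative")
    limit = base_limit * slow_multiplier if slow else base_limit
    # 'capped' tells the caller the imbalance may not be reachable, so the
    # test can skip or fail quickly instead of looping until the suite times out.
    capped = files_needed > limit
    return min(files_needed, limit), capped


if __name__ == "__main__":
    print(capped_fill_count(45712))
    print(capped_fill_count(45712, slow=True))
```

Returning the `capped` flag alongside the count lets the caller decide whether to skip the test rather than retrying the fill indefinitely.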
| Comment by Andreas Dilger [ 07/May/20 ] |
|
In recent failures, this doesn't look like a timeout issue because of too many inodes, but rather a problem on the OST from an earlier test that causes the filesystem to be mounted read-only:

[ 9991.380259] Lustre: DEBUG MARKER: == sanity test 409: Large amount of cross-MDTs hard links on the same file =========================== 23:26:58 (1588807618)
[10021.751428] LDISKFS-fs error (device dm-17) in ldiskfs_free_blocks:5463: IO failure
[10021.755577] Aborting journal on device dm-17-8.
[10021.757692] LDISKFS-fs error (device dm-17): ldiskfs_journal_check_start:56:
[10021.757918] LDISKFS-fs (dm-17): Remounting filesystem read-only
[10021.757922] LDISKFS-fs error (device dm-17) in ldiskfs_reserve_inode_write:5332: Journal has aborted
[10021.758024] LDISKFS-fs error (device dm-17) in ldiskfs_reserve_inode_write:5332: Journal has aborted

This happened on two separate test runs on b2_12: |
| Comment by Andreas Dilger [ 11/May/20 ] |
|
+1 on b2_12 https://testing.whamcloud.com/test_sets/3d7196ad-cecb-420d-8a02-a79f02ebc7ca

[ 9848.724723] Lustre: DEBUG MARKER: == sanity test 409: Large amount of cross-MDTs hard links on the same file ==== 14:53:05 (1589035985)
[ 9904.040345] LDISKFS-fs error (device dm-16) in ldiskfs_free_blocks:5463: IO failure
[ 9904.062336] Aborting journal on device dm-16-8.
[ 9904.063288] LDISKFS-fs error (device dm-16) in ldiskfs_orphan_add:3370: Journal has aborted
[ 9904.063461] LDISKFS-fs (dm-16): Remounting filesystem read-only
[ 9904.161794] LustreError: 28847:0:(ofd_dev.c:1804:ofd_destroy_hdl()) lustre-OST0005: error destroying object [0x440000402:0x11e:0x0]: -30 |
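The -30 in the last log line is EROFS (read-only file system) on Linux, which matches the read-only remount just before it. To find which test was running when an OST went read-only, a console log can be scanned for the remount message and paired with the most recent DEBUG MARKER. The following is a rough sketch only: the function name `find_readonly_remounts` and the regular expressions are assumptions based on the log lines quoted in this ticket, not part of any Lustre or Maloo tooling.

```python
import re

# Patterns inferred from the console lines quoted in this ticket (assumption).
MARKER_RE = re.compile(r"DEBUG MARKER: == (\S+) test (\S+):")
READONLY_RE = re.compile(r"LDISKFS-fs \(([^)]+)\): Remounting filesystem read-only")
TIMESTAMP_RE = re.compile(r"\[\s*(\d+\.\d+)\]")


def find_readonly_remounts(lines):
    """Return one dict per read-only remount, tagged with the last test marker seen."""
    suite, test = None, None
    hits = []
    for line in lines:
        marker = MARKER_RE.search(line)
        if marker:
            suite, test = marker.group(1), marker.group(2)
            continue
        remount = READONLY_RE.search(line)
        if remount:
            stamp = TIMESTAMP_RE.search(line)
            hits.append({
                "device": remount.group(1),
                "timestamp": float(stamp.group(1)) if stamp else None,
                "suite": suite,
                "test": test,
            })
    return hits


# Lines copied from the comment above.
SAMPLE = [
    "[ 9848.724723] Lustre: DEBUG MARKER: == sanity test 409: Large amount of cross-MDTs hard links on the same file ==== 14:53:05 (1589035985)",
    "[ 9904.040345] LDISKFS-fs error (device dm-16) in ldiskfs_free_blocks:5463: IO failure",
    "[ 9904.062336] Aborting journal on device dm-16-8.",
    "[ 9904.063461] LDISKFS-fs (dm-16): Remounting filesystem read-only",
]

if __name__ == "__main__":
    for hit in find_readonly_remounts(SAMPLE):
        print(hit)
```

Run against a full OSS console log, this would show whether the remount happened during test 409 rather than during test_413b itself.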