[LU-6739] EL7 mds-survey test_1: mds-survey failed Created: 17/Jun/15 Updated: 23/Jun/15 Resolved: 18/Jun/15
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.8.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Environment: | server and client: lustre-master build # 3071 EL7 |
| Issue Links: | |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
This issue was created by maloo for sarah_lw <wei3.liu@intel.com>.

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/ea948bde-135d-11e5-b4b0-5254006e85c2.

The sub-test test_1 failed with the following error:

mds-survey failed

There is no log for the MDS at all, and this failure blocks all of the following tests from being run.

The test log shows:

====> Destroy 4 directories on onyx-42vm3:lustre-MDT0000_ecc
ssh_exchange_identification: Connection closed by remote host
Mon Jun 15 05:04:43 PDT 2015 /usr/bin/mds-survey from onyx-42vm6.onyx.hpdd.intel.com
mdt 1 file 103011 dir 4 thr 4 create 18976.80 [ 11998.67, 23975.59] lookup 397752.68 [ 397752.68, 397752.68] md_getattr 294970.99 [ 294970.99, 294970.99] setxattr 1069.79 [ 0.00, 7999.06] destroy ERROR
mdt 1 file 103011 dir 4 thr 8 create ERROR lookup ERROR md_getattr ERROR setxattr ERROR destroy ERROR
starting run for config: test: create file: 103011 threads: 4 directories: 4
starting run for config: test: lookup file: 103011 threads: 4 directories: 4
starting run for config: test: md_getattr file: 103011 threads: 4 directories: 4
starting run for config: test: setxattr file: 103011 threads: 4 directories: 4
starting run for config: test: destroy file: 103011 threads: 4 directories: 4
starting run for config: test: create file: 103011 threads: 8 directories: 4
starting run for config: test: lookup file: 103011 threads: 8 directories: 4
starting run for config: test: md_getattr file: 103011 threads: 8 directories: 4
starting run for config: test: setxattr file: 103011 threads: 8 directories: 4
starting run for config: test: destroy file: 103011 threads: 8 directories: 4
mds-survey test_1: @@@@@@ FAIL: mds-survey failed
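For anyone reproducing this outside autotest: the mds-survey script from lustre-iokit is driven by environment variables, so an invocation roughly matching the parameters in the log above would look like the sketch below. The variable names come from the lustre-iokit script documentation; the MDT target name is illustrative and would need to match the local setup.

    # minimal sketch, run as root on the MDS node; target name is an assumption
    thrlo=4 thrhi=8 \
    file_count=103011 dir_count=4 \
    targets=lustre-MDT0000 \
    tests_str="create lookup md_getattr setxattr destroy" \
        /usr/bin/mds-survey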
| Comments |
| Comment by Andreas Dilger [ 18/Jun/15 ] |
The console logs are under "lustre-provisioning" from before mds-survey:

12:06:51:[ 1721.271559] WARNING: at lustre-2.7.55/ldiskfs/ext4_jbd2.c:260 __ldiskfs_handle_dirty_metadata+0x1c2/0x220 [ldiskfs]()
12:06:51:[ 1721.294943] CPU: 1 PID: 5479 Comm: lctl Tainted: GF O-------------- 3.10.0-229.4.2.el7_lustre.x86_64 #1
12:06:51:[ 1721.297354] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
12:06:52:[ 1721.306555] Call Trace:
12:06:52:[ 1721.308459] [<ffffffff816050da>] dump_stack+0x19/0x1b
12:06:52:[ 1721.310520] [<ffffffff8106e34b>] warn_slowpath_common+0x6b/0xb0
12:06:52:[ 1721.312659] [<ffffffff8106e49a>] warn_slowpath_null+0x1a/0x20
12:06:52:[ 1721.314897] [<ffffffffa05616b2>] __ldiskfs_handle_dirty_metadata+0x1c2/0x220 [ldiskfs]
12:06:52:[ 1721.319529] [<ffffffffa0584659>] ldiskfs_free_blocks+0x5c9/0xb90 [ldiskfs]
12:06:53:[ 1721.321814] [<ffffffffa0578f75>] ldiskfs_xattr_release_block+0x275/0x330 [ldiskfs]
12:06:53:[ 1721.324060] [<ffffffffa057c1ab>] ldiskfs_xattr_delete_inode+0x2bb/0x300 [ldiskfs]
12:06:53:[ 1721.326316] [<ffffffffa0576ad5>] ldiskfs_evict_inode+0x1b5/0x610 [ldiskfs]
12:06:53:[ 1721.328683] [<ffffffff811e23d7>] evict+0xa7/0x170
12:06:53:[ 1721.330790] [<ffffffff811e2c15>] iput+0xf5/0x180
12:06:53:[ 1721.332864] [<ffffffffa0ba3e73>] osd_object_delete+0x1d3/0x300 [osd_ldiskfs]
12:06:53:[ 1721.335175] [<ffffffffa07586ad>] lu_object_free.isra.30+0x9d/0x1a0 [obdclass]
12:06:53:[ 1721.337494] [<ffffffffa0758872>] lu_object_put+0xc2/0x320 [obdclass]
12:06:54:[ 1721.339735] [<ffffffffa0f2a6d7>] echo_md_destroy_internal+0xe7/0x520 [obdecho]
12:06:54:[ 1721.342007] [<ffffffffa0f3217a>] echo_md_handler.isra.43+0x191a/0x2250 [obdecho]
12:06:54:[ 1721.348581] [<ffffffffa0f34766>] echo_client_iocontrol+0x1146/0x1d10 [obdecho]
12:06:54:[ 1721.354898] [<ffffffffa0724d1c>] class_handle_ioctl+0x1b3c/0x22b0 [obdclass]
12:06:54:[ 1721.358813] [<ffffffffa070a5e2>] obd_class_ioctl+0xd2/0x170 [obdclass]
12:06:54:[ 1721.360799] [<ffffffff811da2c5>] do_vfs_ioctl+0x2e5/0x4c0
12:06:54:[ 1721.364564] [<ffffffff811da541>] SyS_ioctl+0xa1/0xc0
12:06:55:[ 1721.366357] [<ffffffff81615029>] system_call_fastpath+0x16/0x1b
12:06:55:[ 1721.368204] ---[ end trace aed93badbc88e370 ]---
12:06:55:[ 1721.370058] LDISKFS-fs: ldiskfs_free_blocks:5107: aborting transaction: error 28 in __ldiskfs_handle_dirty_metadata
12:06:55:[ 1721.372395] LDISKFS: jbd2_journal_dirty_metadata failed: handle type 5 started at line 240, credits 3/0, errcode -28
12:06:55:[ 1721.385026] LDISKFS-fs error (device dm-0) in ldiskfs_free_blocks:5123: error 28

Looks like the same journal size problem as
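For context, errcode -28 is -ENOSPC returned by jbd2 when a running handle cannot get the journal credits it needs, which is consistent with an undersized journal. A minimal sketch for inspecting, and on an unmounted clean target enlarging, the journal of an ldiskfs device using standard e2fsprogs tools; the device name is taken from the log above and the 400 MB size is only an example, not a recommended value.

    # inspect the current journal settings (device name from the log)
    dumpe2fs -h /dev/dm-0 | grep -i journal
    # on an unmounted, fsck-clean target: remove and recreate a larger journal
    tune2fs -O ^has_journal /dev/dm-0
    tune2fs -J size=400 /dev/dm-0    # size in megabytes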