[LU-8729] conf-sanity test_84: FAIL: /dev/mapper/mds1_flakey failed to initialize! Created: 19/Oct/16 Updated: 14/Sep/17 Resolved: 14/Sep/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.10.1, Lustre 2.11.0 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Jian Yu | Assignee: | Hongchao Zhang |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Attachments: |
|
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
With patch http://review.whamcloud.com/7200 on the master branch, conf-sanity test 84 failed as follows: CMD: onyx-31vm7 e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Update not seen after 90s: wanted '' got 'lustre:MDT0000'
conf-sanity test_84: @@@@@@ FAIL: /dev/mapper/mds1_flakey failed to initialize!
https://testing.hpdd.intel.com/test_sets/e88a61c2-89bf-11e6-a8b7-5254006e85c2 |
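The failure above comes from the test framework polling `e2label` until the Lustre label disappears from the flakey device. A minimal sketch of that polling loop is below; the real helper is `wait_update()` in Lustre's test-framework.sh, and the function name and interval here are assumptions for illustration only.

```shell
# Illustrative re-creation of the polling check behind the failure message
# "Update not seen after 90s: wanted '' got 'lustre:MDT0000'".
# NOTE: wait_for_label_clear is a hypothetical name, not the real helper.
wait_for_label_clear() {
    dev="$1"; max="${2:-90}"; elapsed=0; got=""
    while [ "$elapsed" -lt "$max" ]; do
        # The test expects the Lustre label (e.g. 'lustre:MDT0000') to be gone.
        got=$(e2label "$dev" 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}')
        [ -z "$got" ] && return 0
        sleep 5
        elapsed=$((elapsed + 5))
    done
    echo "Update not seen after ${max}s: wanted '' got '$got'" >&2
    return 1
}
```

In the failing runs the label never cleared, so the loop timed out and test_84 reported "failed to initialize!".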
| Comments |
| Comment by Jian Yu [ 19/Oct/16 ] |
|
More failure instances: |
| Comment by Gerrit Updater [ 19/Oct/16 ] |
|
Jian Yu (jian.yu@intel.com) uploaded a new patch: http://review.whamcloud.com/23251 |
| Comment by Jian Yu [ 13/Jan/17 ] |
|
Hi Hongchao, I tested the dm-flakey patch on the latest master branch in https://review.whamcloud.com/24733 and found the following issues. On CentOS 6.8, replay-single test 0c consistently hit the following errors on the MDS:
LDISKFS-fs (dm-6): ldiskfs_check_descriptors: Checksum for group 4 failed (61905!=6807)
LDISKFS-fs (dm-6): group descriptors corrupted!
LustreError: 6423:0:(osd_handler.c:6825:osd_mount()) lustre-MDT0000-osd: can't mount /dev/mapper/mds1_flakey: -22
On CentOS 7.3, conf-sanity test 84, replay-single tests 0a, 4b, 5, 18, 19 and replay-ost-single test 9 consistently hit the following errors on the MDS or OSS:
Lustre: DEBUG MARKER: dmsetup load /dev/mapper/mds1_flakey --table "0 4194304 flakey 252:0 0 0 1800 1 drop_writes"
Lustre: DEBUG MARKER: dmsetup resume /dev/mapper/mds1_flakey
Buffer I/O error on dev dm-6, logical block 524272, async page read
The "Buffer I/O error" didn't occur on CentOS 6.8. It also didn't occur on CentOS 7.3 before patch set 21 for https://review.whamcloud.com/7200. The CentOS 7.3 kernel version for patch set 20 is 3.10.0-327.36.3.el7, and the kernel version for patch set 21 is 3.10.0-514.el7. Is there something wrong with dm-flakey in different kernels? |
| Comment by Hongchao Zhang [ 18/Jan/17 ] |
|
On my VM, I set up dm-flakey manually on 7.3 (3.10.0-514.2.2.el7) and ran some basic tests (say, creation, deletion, |
| Comment by Oleg Drokin [ 10/Feb/17 ] |
|
This whole 'dmsetup load /dev/mapper/mds1_flakey --table "0 4194304 flakey 252:0 0 0 1800 1 drop_writes"' looks a bit suspicious to me. Looking into the kernel documentation, we can see: Table parameters
----------------
<dev path> <offset> <up interval> <down interval> \
[<num_features> [<feature arguments>]]
Mandatory parameters:
<dev path>: Full pathname to the underlying block-device, or a
"major:minor" device-number.
<offset>: Starting sector within the device.
<up interval>: Number of seconds device is available.
<down interval>: Number of seconds device returns errors.
Optional feature parameters:
If no feature parameters are present, during the periods of
unreliability, all I/O returns errors.
drop_writes:
All write I/O is silently ignored.
Read I/O is handled correctly.
According to this, the "1" at the end of the table above is spurious? And if a target is unknown then it just makes I/O errors be returned, which seems to match what is seen on RHEL 7.3? |
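The table string quoted above can be decomposed field by field against the documented format. A small sketch that assembles it from named parameters (the helper name is hypothetical; the field layout follows the dm-flakey documentation quoted above):

```shell
# Build a dm-flakey table line: <start> <len> flakey <dev> <offset> <up> <down>
#                               <num_features> [<features...>]
# build_flakey_table is an illustrative helper, not part of any real tool.
build_flakey_table() {
    start="$1"; len="$2"; dev="$3"; offset="$4"; up="$5"; down="$6"
    shift 6
    nfeat=$#
    table="$start $len flakey $dev $offset $up $down $nfeat"
    [ "$nfeat" -gt 0 ] && table="$table $*"
    echo "$table"
}

# up=0, down=1800: the device is "up" for 0s and drops writes for 1800s.
build_flakey_table 0 4194304 252:0 0 0 1800 drop_writes
# -> 0 4194304 flakey 252:0 0 0 1800 1 drop_writes
```

This makes the trailing "1" unambiguous: it is the num_features count preceding the single drop_writes feature, as Oleg notes in the following comment.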
| Comment by Oleg Drokin [ 10/Feb/17 ] |
|
oh, I guess 1 is the "num_features" that I missed |
| Comment by Andreas Dilger [ 15/Mar/17 ] |
|
Jian, when is the last time this was tested? |
| Comment by Jian Yu [ 15/Mar/17 ] |
|
Hi Andreas, |
| Comment by Jian Yu [ 12/Apr/17 ] |
|
Hi Andreas and Oleg, The "Buffer I/O error" on the device-mapper flakey device is still blocking testing of patch https://review.whamcloud.com/7200. Hongchao looked at the code of dm-flakey.c in CentOS 7.3 (kernel version 3.10.0-514.2.2.el7) but didn't find any clue to the problem. Could you please advise what we need to do to move this forward? Thank you. |
| Comment by James A Simmons [ 12/Apr/17 ] |
|
While testing I started to ask myself why we are using dm-flakey for every test. It appears this is only needed for recovery testing, to replace 'lctl --device readonly'. Should we look to reduce the scope of dm-flakey to only the tests that need it? |
| Comment by James A Simmons [ 13/Apr/17 ] |
|
After working on this patch today I believe I got a handle on what is going on. The "Buffer I/O errors" are correct. In the test you have:
mount_client $MOUNT1 || error "mount $MOUNT1 failed"
replay_barrier $SINGLEMDS
This mounts and syncs all the data. Then replay_barrier, using the power of the flakey driver, sets the disk /dev/mapper/mds1_flakey to read-only mode. The creation attempts by createmany on a read-only file system are what generated the "Buffer I/O errors". This is what is supposed to happen. Next is:
do_facet $SINGLEMDS "lctl set_param fail_loc=0x20000709 fail_val=5"
facet_failover --fsck $SINGLEMDS || error "failover: $?"
The facet_failover part is what is important. Here we set a special flag to make sure e2fsck is run on the MDT. Looking at the Maloo logs: https://testing.hpdd.intel.com/test_logs/2cb1e62c-1f0c-11e7-8920-5254006e85c2/show_text I noticed the error: [26120.388883] Lustre: DEBUG MARKER: /usr/sbin/lctl mark conf-sanity test_84: \ Running e2fsck shouldn't work on a read-only device, mds1_flakey. I fixed up |
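The suspend/load/resume sequence that replay_barrier drives can be seen in the debug markers throughout this ticket. The dry-run sketch below reconstructs it from those markers; RUN=echo prints each command instead of executing it (set RUN to empty and run as root to drive a real device). This is not the actual test-framework code, only an illustration of the sequence.

```shell
# Dry-run sketch of the dmsetup sequence behind replay_barrier, taken from
# the debug markers in this ticket. RUN=echo makes every command a no-op print.
RUN=echo
DM=/dev/mapper/mds1_flakey
TABLE="0 4194304 flakey 252:0 0 0 1800 1 drop_writes"

$RUN dmsetup suspend --nolockfs --noflush "$DM"   # quiesce the device
$RUN dmsetup load "$DM" --table "$TABLE"          # stage the drop_writes table
$RUN dmsetup resume "$DM"                         # switch to the new table
```

The "--noflush" option here is what Andreas later questions below: it leaves in-flight I/O unflushed across the table swap.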
| Comment by Jian Yu [ 13/Apr/17 ] |
|
Thank you very much James for helping debug this failure.
Besides recovery test suites (replay-single, replay-dual, etc.), recovery testing also exists in other test suites (sanity, insanity, conf-sanity, etc.). The changes would be extensive. What's more, we would like to make mount and unmount transparent, so that developers do not need to know they have to use a special option to create flakey devices when writing a recovery-related sub-test. |
| Comment by Jian Yu [ 23/Apr/17 ] |
|
After doing some experiments, I found out the commits that caused the "Buffer I/O error". They are related to RHEL 7.3 support: commit adc9592d1820d5086e52e387008263b4dace9b0e
Author: Christopher J. Morrone <morrone2@llnl.gov>
CommitDate: Fri Nov 18 15:35:18 2016 +0000
LU-8534 ldiskfs: Add patch series for RHEL7.3
Reviewed-on: http://review.whamcloud.com/22113
commit 5763c175c9a1a330b664a01bc08f329b2bee54f7
Author: Bob Glossman <bob.glossman@intel.com>
CommitDate: Fri Nov 18 15:35:26 2016 +0000
LU-8796 kernel: kernel upgrade RHEL7.3 [3.10.0-514.el7]
Reviewed-on: http://review.whamcloud.com/23560
The error messages on the MDS console were:
Lustre: DEBUG MARKER: dmsetup suspend --nolockfs --noflush /dev/mapper/mds1_flakey
Buffer I/O error on dev dm-6, logical block 524272, async page read
In RHEL 7.3 kernel 3.10.0-514.el7, the "Buffer I/O error" message came from fs/buffer.c:
static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
{
	//……
	page = bh->b_page;
	if (uptodate) {
		set_buffer_uptodate(bh);
	} else {
		clear_buffer_uptodate(bh);
		buffer_io_error(bh, ", async page read");
		SetPageError(page);
	}
	//……
} |
| Comment by Andreas Dilger [ 28/Apr/17 ] |
|
One option is to add a WARN_ON() in end_buffer_async_read(), so that this will dump the full stack. Note that the callpath for end_buffer_async_read() does not come from mark_buffer_async_read(); that just sets the bh->b_end_io pointer so that the end_buffer_async_read() callback is called when the read from disk is actually finished. It is probably coming through:
->dmio_complete()
    bio.bi_error = error ? -EIO : 0;
or
->bio_complete()
    ->bio.bi_end_io = end_bio_bh_io_sync();
    ->bh->b_end_io(bh, !bio->bi_error);
I do see some comments in dm-flakey.c that might indicate problems if the DROP_WRITES state is being changed:
} else if (!test_bit(DROP_WRITES, &fc->flags) &&
           !test_bit(ERROR_WRITES, &fc->flags)) {
	/*
	 * Error read during the down_interval if drop_writes
	 * and error_writes were not configured.
	 */
	return -EIO;
}
so that may be a result of changing the drop_writes state in the middle of a read. I see just before configuring the dm-flakey device:
dmsetup suspend --nolockfs --noflush /dev/mapper/mds1_flakey
Does "--noflush" mean there may still be partially-completed reads? At a higher level, have we considered using the dm-flakey device only for cases when lctl readonly is actually used? It appears that this is only called from replay_barrier(), replay_barrier_nosync(), and replay_barrier_nodf(). I see one call to replay_barrier in conf-sanity.sh::test_84(), but we might consider moving this test over to replay-single.sh. It is also worthwhile to note that dm-flakey is only needed for ldiskfs, not for ZFS filesystems. |
| Comment by Jian Yu [ 01/May/17 ] |
The full stack is: Lustre: DEBUG MARKER: dmsetup suspend --nolockfs --noflush /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: dmsetup load /dev/mapper/mds1_flakey --table "0 4194304 flakey 252:0 0 0 1800 1 drop_writes" Lustre: DEBUG MARKER: dmsetup resume /dev/mapper/mds1_flakey Buffer I/O error on dev dm-6, logical block 524272, async page read ------------[ cut here ]------------ WARNING: at fs/buffer.c:295 end_buffer_async_read+0x119/0x140() Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel ppdev lrw gf128mul glue_helper ablk_helper cryptd pcspkr virtio_balloon i2c_piix4 parport_pc parport nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk cirrus 8139too drm_kms_helper crct10dif_pclmul crct10dif_common syscopyarea sysfillrect crc32c_intel serio_raw sysimgblt fb_sys_fops ttm ata_piix virtio_pci 8139cp virtio_ring mii virtio drm libata i2c_core floppy CPU: 1 PID: 14902 Comm: systemd-udevd Tainted: G W OE ------------ 3.10.0-514.16.1.el7_lustre.x86_64 #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 0000000000000000 0000000081bc25b0 ffff880064f4b990 ffffffff81686d2f ffff880064f4b9c8 ffffffff81085cb0 ffff8800642a7e00 ffff8800631e5068 ffff8800642a7e00 ffffea00017e7640 ffff880077f24800 ffff880064f4b9d8 Call Trace: [<ffffffff81686d2f>] dump_stack+0x19/0x1b [<ffffffff81085cb0>] warn_slowpath_common+0x70/0xb0 [<ffffffff81085dfa>] warn_slowpath_null+0x1a/0x20 [<ffffffff81234269>] end_buffer_async_read+0x119/0x140 [<ffffffff8123263c>] 
end_bio_bh_io_sync+0x2c/0x60 [<ffffffff81239454>] bio_endio+0x64/0xa0 [<ffffffffa063c8da>] dec_pending+0x18a/0x2e0 [dm_mod] [<ffffffffa063e359>] __split_and_process_bio+0x2d9/0x500 [dm_mod] [<ffffffff812e0000>] ? aes_encrypt+0xb30/0xe00 [<ffffffffa063e68e>] dm_make_request+0x10e/0x170 [dm_mod] [<ffffffff812eee19>] generic_make_request+0x109/0x1e0 [<ffffffff812eef61>] submit_bio+0x71/0x150 [<ffffffff812381dd>] ? bio_alloc_bioset+0x1fd/0x350 [<ffffffff81233283>] _submit_bh+0x143/0x210 [<ffffffff81235bd7>] block_read_full_page+0x217/0x350 [<ffffffff81239670>] ? I_BDEV+0x10/0x10 [<ffffffff8118e219>] ? force_page_cache_readahead+0x99/0xe0 [<ffffffff8123a0a8>] blkdev_readpage+0x18/0x20 [<ffffffff811829df>] generic_file_aio_read+0x3cf/0x790 [<ffffffff8123a46c>] blkdev_aio_read+0x4c/0x70 [<ffffffff811fde6d>] do_sync_read+0x8d/0xd0 [<ffffffff811fe61e>] vfs_read+0x9e/0x170 [<ffffffff811ff1ef>] SyS_read+0x7f/0xe0 [<ffffffff816973c9>] system_call_fastpath+0x16/0x1b ---[ end trace 5808138721c1994b ]--- Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 I'm running the test again with "--noflush" removed. |
| Comment by Jian Yu [ 02/May/17 ] |
|
With "--noflush" removed, the full stack is: Lustre: DEBUG MARKER: dmsetup suspend --nolockfs /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: dmsetup load /dev/mapper/mds1_flakey --table "0 4194304 flakey 252:0 0 0 1800 1 drop_writes" Lustre: DEBUG MARKER: dmsetup resume /dev/mapper/mds1_flakey Buffer I/O error on dev dm-6, logical block 524272, async page read ------------[ cut here ]------------ WARNING: at fs/buffer.c:295 end_buffer_async_read+0x119/0x140() Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core ppdev iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr virtio_balloon i2c_piix4 parport_pc parport nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi virtio_blk cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops crct10dif_pclmul 8139too crct10dif_common ttm crc32c_intel serio_raw virtio_pci 8139cp virtio_ring mii virtio drm ata_piix i2c_core libata floppy CPU: 1 PID: 14904 Comm: systemd-udevd Tainted: G W OE ------------ 3.10.0-514.16.1.el7_lustre.x86_64 #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 0000000000000000 00000000cc24690a ffff88007b733990 ffffffff81686d2f ffff88007b7339c8 ffffffff81085cb0 ffff88007a994f00 ffff88006290fa90 ffff88007a994f00 ffffea0001ea3400 ffff880078178800 ffff88007b7339d8 Call Trace: [<ffffffff81686d2f>] dump_stack+0x19/0x1b [<ffffffff81085cb0>] warn_slowpath_common+0x70/0xb0 [<ffffffff81085dfa>] warn_slowpath_null+0x1a/0x20 [<ffffffff81234269>] end_buffer_async_read+0x119/0x140 
[<ffffffff8123263c>] end_bio_bh_io_sync+0x2c/0x60 [<ffffffff81239454>] bio_endio+0x64/0xa0 [<ffffffffa063b8da>] dec_pending+0x18a/0x2e0 [dm_mod] [<ffffffffa063d359>] __split_and_process_bio+0x2d9/0x500 [dm_mod] [<ffffffff812e0000>] ? aes_encrypt+0xb30/0xe00 [<ffffffffa063d68e>] dm_make_request+0x10e/0x170 [dm_mod] [<ffffffff812eee19>] generic_make_request+0x109/0x1e0 [<ffffffff812eef61>] submit_bio+0x71/0x150 [<ffffffff812381dd>] ? bio_alloc_bioset+0x1fd/0x350 [<ffffffff81233283>] _submit_bh+0x143/0x210 [<ffffffff81235bd7>] block_read_full_page+0x217/0x350 [<ffffffff81239670>] ? I_BDEV+0x10/0x10 [<ffffffff8118e219>] ? force_page_cache_readahead+0x99/0xe0 [<ffffffff8123a0a8>] blkdev_readpage+0x18/0x20 [<ffffffff811829df>] generic_file_aio_read+0x3cf/0x790 [<ffffffff8123a46c>] blkdev_aio_read+0x4c/0x70 [<ffffffff811fde6d>] do_sync_read+0x8d/0xd0 [<ffffffff811fe61e>] vfs_read+0x9e/0x170 [<ffffffff811ff1ef>] SyS_read+0x7f/0xe0 [<ffffffff816973c9>] system_call_fastpath+0x16/0x1b ---[ end trace 0cce06cbf33ea2b1 ]--- Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 LustreError: 12953:0:(osd_io.c:1460:osd_ldiskfs_read()) lustre-MDT0000: can't read 4096@0 on ino 265: rc = -5 LustreError: 12953:0:(osd_io.c:1460:osd_ldiskfs_read()) Skipped 6 previous similar messages LustreError: 12953:0:(llog_osd.c:262:llog_osd_read_header()) lustre-MDT0000-osd: bad log [0xa:0xb:0x0] header magic: 0x0 (expected 0x10645539) LustreError: 12953:0:(llog_osd.c:262:llog_osd_read_header()) Skipped 6 previous similar messages Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 |
| Comment by Jian Yu [ 04/May/17 ] |
|
Hi Andreas, I also removed "--nolockfs" and got the same failure and full stack. So, the failure is not related to the "--nolockfs" and/or "--noflush" options. |
| Comment by Jian Yu [ 12/May/17 ] |
|
Hi Hongchao, Could you please proceed with this blocker? Thank you. |
| Comment by Hongchao Zhang [ 01/Jun/17 ] |
|
Status Update: I have tested it in Maloo by disabling different patches contained in http://review.whamcloud.com/22113, and I looked at the test results for Patch Set 20 of the original patch: https://testing.hpdd.intel.com/test_sessions/c01176f8-a5ef-11e6-b605-5254006e85c2
(3.10.0-327.36.3.el7.x86_64). The issue could therefore be caused by the RHEL 7.3 kernel (3.10.0-514.el7.x86_64). |
| Comment by Hongchao Zhang [ 08/Jun/17 ] |
|
I have set up RHEL 7.3 (3.10.0-514.2.2.el7.x86_64) and reproduced the problem using the ext4 and dm-flakey modules |
| Comment by James A Simmons [ 10/Jun/17 ] |
|
Can you try linux commit 299f6230bc6d0ccd5f95bb0fb865d80a9c7d5ccc ("dm flakey: fix reads to be issued if drop_writes configured")? It fixes the v4.8-rc3 commit 99f3c90d0d ("dm flakey: error READ bios during the down_interval"). |
| Comment by Jian Yu [ 11/Jun/17 ] |
|
Hi James,
I tried this in https://review.whamcloud.com/26788 (patch set 25). The same error still occurred: Buffer I/O error on dev dm-6, logical block 524272, async page read |
| Comment by Hongchao Zhang [ 15/Jun/17 ] |
|
I have tested the commit locally; the commit can fix the problem on 3.10.0-514.2.2.el7.x86_64, |
| Comment by Jian Yu [ 15/Jun/17 ] |
|
The autotest system uses the build with the following kernel version supported by the latest master branch:
$ head -n2 lustre/kernel_patches/targets/3.10-rhel7.target.in
lnxmaj="3.10.0"
lnxrel="514.21.1.el7"
So the commit doesn't work on that kernel and the failure still exists. |
| Comment by James A Simmons [ 14/Sep/17 ] |
|
So the good news is this is fixed in RHEL 7.4. If someone really wants to work on RHEL 7.3, they can compare the sources in RHEL 7.4 to RHEL 7.3 to see which change fixed this problem, so we can include the patch for RHEL 7.3 if so desired. I updated the patch for |
| Comment by Peter Jones [ 14/Sep/17 ] |
|
No - let's just focus on RHEL 7.4 |