[LU-5420] Failure on test suite sanity test_17m: mount MDS failed, Input/output error Created: 26/Jul/14  Updated: 19/Mar/19  Resolved: 11/May/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0, Lustre 2.7.0
Fix Version/s: Lustre 2.8.0

Type: Bug Priority: Blocker
Reporter: Sarah Liu Assignee: Di Wang
Resolution: Fixed Votes: 0
Labels: HB, dne, patch
Environment:

client and server: lustre-b2_6-rc2 RHEL6 ldiskfs DNE mode


Issue Links:
Duplicate
Related
is related to LU-4913 mgc import reconnect race Resolved
is related to LU-5404 sanity test_228b FAIL: Fail to start ... Open
is related to LU-5077 insanity test_1: out of memory on MDT... Resolved
is related to LU-5407 Failover failure on test suite replay... Resolved
is related to LU-5130 Test failure sanity test_17n: destroy... Resolved
is related to LU-8206 ptlrpc_invalidate_import ( ASSERTION(... Closed
Severity: 3
Rank (Obsolete): 15076

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/16302020-14ed-11e4-bb6a-5254006e85c2.

The sub-test test_17m failed with the following error:

test failed to respond and timed out

Hit this bug in many tests; the environment is configured as 1 MDS hosting 2 MDTs. Did not hit this error when the configuration is 2 MDSs with 2 MDTs.
client console:

CMD: onyx-46vm7 mkdir -p /mnt/mds1
CMD: onyx-46vm7 test -b /dev/lvm-Role_MDS/P1
Starting mds1:   /dev/lvm-Role_MDS/P1 /mnt/mds1
CMD: onyx-46vm7 mkdir -p /mnt/mds1; mount -t lustre   		                   /dev/lvm-Role_MDS/P1 /mnt/mds1
onyx-46vm7: mount.lustre: mount /dev/mapper/lvm--Role_MDS-P1 at /mnt/mds1 failed: Input/output error
onyx-46vm7: Is the MGS running?
Start of /dev/lvm-Role_MDS/P1 on mds1 failed 5


 Comments   
Comment by Sarah Liu [ 26/Jul/14 ]

mds console

11:01:42:Lustre: DEBUG MARKER: mkdir -p /mnt/mds1; mount -t lustre   		                   /dev/lvm-Role_MDS/P1 /mnt/mds1
11:01:42:LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. quota=on. Opts: 
11:01:42:LustreError: 166-1: MGC10.2.4.243@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
11:01:43:Lustre: Evicted from MGS (at MGC10.2.4.243@tcp_0) after server handle changed from 0x5efc70d9d01b7154 to 0x5efc70d9d01e1f23
11:01:43:LustreError: 18197:0:(obd_mount_server.c:1165:server_register_target()) lustre-MDT0000: error registering with the MGS: rc = -108 (not fatal)
11:01:45:LustreError: 15c-8: MGC10.2.4.243@tcp: The configuration from log 'lustre-MDT0000' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
11:01:45:Lustre: MGC10.2.4.243@tcp: Connection restored to MGS (at 0@lo)
11:01:46:LustreError: 18197:0:(obd_mount_server.c:1297:server_start_targets()) failed to start server lustre-MDT0000: -5
11:01:46:LustreError: 18197:0:(obd_mount_server.c:1769:server_fill_super()) Unable to start targets: -5
11:01:46:LustreError: 18197:0:(obd_mount_server.c:1496:server_put_super()) no obd lustre-MDT0000
11:01:47:LustreError: 18197:0:(obd_mount.c:1342:lustre_fill_super()) Unable to mount  (-5)
11:01:47:Lustre: DEBUG MARKER: lctl set_param -n mdt.lustre*.enable_remote_dir=1
11:01:47:LustreError: 137-5: lustre-MDT0000_UUID: not available for connect from 10.2.4.244@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
11:01:47:LustreError: Skipped 11 previous similar messages
Comment by Di Wang [ 26/Jul/14 ]

Sigh, this was brought in by http://review.whamcloud.com/#/c/9967/6

LU-4913 mgc: mgc import reconnect race

mgc import can be reconnected by pinger or
ptlrpc_reconnect_import().
ptlrpc_invalidate_import() isn't protected against
alteration of imp_invalid state. Import can be
reconnected by pinger which makes imp_invalid
equal to false. Thus LASSERT(imp->imp_invalid) fails
in ptlrpc_invalidate_import().

It is safe to call ptlrpc_invalidate_import() when
import is deactivated, but ptlrpc_reconnect_import() doesn't
deactivate it.
Let's use only pinger when available to reconnect import

Hmm, in ptlrpc_reconnect_import() the patch does not force the import to reconnect to the server; instead it only checks the import status, which seems wrong, given that the import status might change again soon afterwards. I think the intention here is to make sure the import is refreshed and connected after this call.

Though I am not so sure what the patch is trying to resolve here, since the commit message is a bit confusing to me. I think there are two options to fix the problem:

1. Revert patch 9967, or just revert the change in ptlrpc_reconnect_import() so that it forces a reconnect anyway.
2. The MGC should retry the enqueue and fetch the config log when it meets an invalid import (by checking for the return value -ESHUTDOWN; see ptlrpc_import_delay_req()). A rough sketch of this idea follows below.

I will cook these two patches to see which one is better.
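
As a rough sketch of option 2, the MGC could wrap the enqueue in a bounded retry loop and treat the -ESHUTDOWN mentioned above as a transient condition. This is only an illustration of the idea, not the actual patch; mgc_enqueue_config_log() and MGC_ENQUEUE_RETRIES are hypothetical names:

        #define MGC_ENQUEUE_RETRIES     5       /* hypothetical retry bound */

        /* Sketch only: retry the config-log enqueue while the shared MGC
         * import is temporarily invalid (reported as -ESHUTDOWN). */
        static int mgc_enqueue_with_retry(struct obd_export *exp,
                                          struct config_llog_data *cld)
        {
                int retries = 0;
                int rc;

                do {
                        rc = mgc_enqueue_config_log(exp, cld);
                        if (rc != -ESHUTDOWN)
                                break;
                        /* import not usable yet; give the pinger/reconnect
                         * a chance before trying again */
                        msleep(1000);
                } while (++retries < MGC_ENQUEUE_RETRIES);

                return rc;
        }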

Comment by Di Wang [ 26/Jul/14 ]

Btw: we did not hit this until now because our current Maloo DNE configuration is a bit different from this final FULL release test. Here we are using 2 MDTs on one MDS, whereas in the Maloo DNE test we also use 2 MDSes, but one MDS has only 1 MDT and the other MDS has 3 MDTs.

This failure will only be hit when there are multiple MDTs on the first MDS.

Comment by Di Wang [ 26/Jul/14 ]

Option 1: http://review.whamcloud.com/11241

Option 2: http://review.whamcloud.com/11240

Personally I prefer option 1, but I am not sure whether reverting the patch will bring LU-4913 back; the patch description is a bit confusing to me.

Comment by Andreas Dilger [ 28/Jul/14 ]

The option #2 patch is testing well on my local system (single-node 2x MDT, 3x OST, client) which was having solid test failures in sanity.sh test_17m and test_17o (which I'd incorrectly attributed to LU-1538 patch http://review.whamcloud.com/10481 that was reverted).

I've pushed an updated version of the 11240 patch at http://review.whamcloud.com/11258 with improved comments and removing some noise from the console. Since this might be a blocker I didn't refresh the original 11240 patch so that it could continue testing, but I'd prefer that the 11258 version land if it is ready.

Comment by Andreas Dilger [ 30/Jul/14 ]

It seems that this patch is repeatedly failing insanity, even when it is running on b2_6. The failures are marked as LU-5077, but I don't think that is the real reason. I suspect there is some other problem with this patch that needs to be investigated.

Comment by Andreas Dilger [ 30/Jul/14 ]

I verified that virtually all of the test failures marked LU-5077 are actually from the three versions of the LU-5420 patches, which fail "insanity" and "conf-sanity" repeatedly. Due to the presence of LU-5425, I'm not 100% positive that all of those are caused by this patch, but the insanity failures definitely are.

Comment by Di Wang [ 30/Jul/14 ]

Hmm, I think the insanity failures are related to the fix for LU-5420. I am looking at it now.

Comment by Di Wang [ 30/Jul/14 ]

[7/30/14, 12:54:36 PM] wangdi: the insanity failure is because, with the fix for LU-5425, the MDT will insist that the MGS must be started, i.e. the MDS setup process will wait until the MGS is set up. But insanity test_1 starts mdt2 first, then mdt1/mgs; that is why the test fails
[7/30/14, 12:55:32 PM] wangdi: so can we just fix the test case here, because it seems to me the MGS must be set up first

Comment by Di Wang [ 31/Jul/14 ]

Sigh, most of the insanity tests start an MDT or OST before the MGS, which is why this patch causes so many insanity failures. So if "starting the MGS before other targets" is a requirement, then we need to fix insanity.

Comment by John Hammond [ 11/Aug/14 ]

Test-specific issues aside, we need to fix this, as putting 2 MDTs from one filesystem on a single node will be a likely failover configuration.

t:lustre-release# export LUSTRE=$HOME/lustre-release/lustre
t:lustre-release# export MDSCOUNT=2
t:lustre-release# llmount.sh
...
Starting mds1:   -o loop /tmp/lustre-mdt1 /mnt/mds1
Started lustre-MDT0000
Starting mds2:   -o loop /tmp/lustre-mdt2 /mnt/mds2
Started lustre-MDT0001
Starting ost1:   -o loop /tmp/lustre-ost1 /mnt/ost1
Started lustre-OST0000
Starting ost2:   -o loop /tmp/lustre-ost2 /mnt/ost2
Started lustre-OST0001
Starting client: t:  -o user_xattr,flock t@tcp:/lustre /mnt/lustre
Using TIMEOUT=20
seting jobstats to procname_uid
Setting lustre.sys.jobid_var from disable to procname_uid
Waiting 90 secs for update
Updated after 3s: wanted 'procname_uid' got 'procname_uid'
disable quota as required
t:lustre-release# umount /mnt/mds1
t:lustre-release# mount /tmp/lustre-mdt1 /mnt/mds1 -o loop -t lustre
mount.lustre: mount /dev/loop0 at /mnt/mds1 failed: Input/output error
Is the MGS running?
Comment by Andreas Dilger [ 09/Oct/14 ]

Without this patch I'm also not able to test past sanity.sh test_17m and test_17n without a shared MGS+MDS failing to mount due to -EIO and causing testing to hang until I remount the MDS. I'm able to mount it manually after 2 or 3 tries, so there must be some kind of startup race between the MDS and the MGS. Once I applied this patch I made it through all of sanity.sh and sanityn.sh with multiple MDS remounts without problems until I hit a memory allocation deadlock running dbench that looks unrelated.

Comment by Andreas Dilger [ 14/Oct/14 ]

The patch is still failing with a hang at unmount time (this failed in four separate conf-sanity runs in different subtests):

01:52:25:INFO: task umount:26263 blocked for more than 120 seconds.
01:52:25:      Tainted: G        W  ---------------    2.6.32-431.23.3.el6_lustre.g9f5284f.x86_64 #1
01:52:26:umount        D 0000000000000000     0 26263  26262 0x00000080
01:52:27:Call Trace:
01:52:27: [<ffffffff8152b6e5>] rwsem_down_failed_common+0x95/0x1d0
01:52:27: [<ffffffff8152b843>] rwsem_down_write_failed+0x23/0x30
01:52:28: [<ffffffff8128f7f3>] call_rwsem_down_write_failed+0x13/0x20
01:52:28: [<ffffffffa0b13cd1>] client_disconnect_export+0x61/0x460 [ptlrpc]
01:52:28: [<ffffffffa058975a>] lustre_common_put_super+0x28a/0xbf0 [obdclass]
01:52:28: [<ffffffffa05bc508>] server_put_super+0x198/0xe50 [obdclass]
01:52:29: [<ffffffff8118b23b>] generic_shutdown_super+0x5b/0xe0
01:52:29: [<ffffffff8118b326>] kill_anon_super+0x16/0x60
01:52:29: [<ffffffffa0580d06>] lustre_kill_super+0x36/0x60 [obdclass]
01:52:29: [<ffffffff8118bac7>] deactivate_super+0x57/0x80
01:52:29: [<ffffffff811ab4cf>] mntput_no_expire+0xbf/0x110
01:52:29: [<ffffffff811ac01b>] sys_umount+0x7b/0x3a0
Comment by Di Wang [ 15/Oct/14 ]

Just updated the patch.

Comment by Sergey Cheremencev [ 31/Oct/14 ]

Hello

We hit this problem at Xyratex and have another solution: http://review.whamcloud.com/#/c/12515/.
Hope it could be helpful.

Comment by Gerrit Updater [ 17/Nov/14 ]

Sergey Cheremencev (sergey_cheremencev@xyratex.com) uploaded a new patch: http://review.whamcloud.com/12515
Subject: LU-5420 mgc: process config logs only in mgc_requeue_thread()
Project: fs/lustre-release
Branch: master
Current Patch Set: 3
Commit: 1c3148dd8645cfa94bf3c36cfbe41176334ad4c5

Comment by Jian Yu [ 17/Nov/14 ]

While running replay-dual tests on master branch with MDSCOUNT=4, the same failure occurred:
https://testing.hpdd.intel.com/test_sets/33dfc794-6dba-11e4-9d65-5254006e85c2
https://testing.hpdd.intel.com/test_sets/5cb7b7f8-6dba-11e4-9d65-5254006e85c2

Comment by Sergey Cheremencev [ 18/Nov/14 ]

At Seagate this bug occurs only when the MDT and MGS are on the same node, in the case where the MDT starts earlier than the MGS.

When the MDT is on a separate node it uses the LOCAL configuration if it can't retrieve it from the MGS.
But when the MDT and MGS are on the same node the MDT can't use the LOCAL configuration:

        /* Copy the setup log locally if we can. Don't mess around if we're
         * running an MGS though (logs are already local). */
        if (lctxt && lsi && IS_SERVER(lsi) && !IS_MGS(lsi) &&
            cli->cl_mgc_configs_dir != NULL &&
            lu2dt_dev(cli->cl_mgc_configs_dir->do_lu.lo_dev) ==
            lsi->lsi_dt_dev) {
....
        } else {
                /* An MDT co-located with the MGS skips the copy above
                 * (IS_MGS(lsi)), so with local_only set it lands here
                 * and the mount fails with -EIO. */
                if (local_only) /* no local log at client side */
                        GOTO(out_pop, rc = -EIO);
        }
Comment by Di Wang [ 18/Nov/14 ]

Hmm, I think there are two problems here:
1. It is not just the case where the MGS and MDT share the same node; if several targets share the same MGC you will hit a similar problem, because after http://review.whamcloud.com/#/c/9967 landed we cannot make sure the import is FULL before the MGC enqueues the lock and retrieves the logs, unless the MGC is new (a sketch of the import-state check follows below).
2. How can we make sure whether the local config log is stale or not? I think that is the reason we saw LU-5658, where the local config is stale.
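
As a rough illustration of the "import is FULL" condition in problem 1, a check like the following could gate the MGC enqueue. This is only a sketch; mgc_import_is_usable() is a hypothetical helper, while imp_state, imp_invalid and imp_lock are existing fields of struct obd_import:

        /* Sketch only: a hypothetical helper the MGC could use to decide
         * whether the shared import is ready before enqueueing the config
         * lock and fetching logs. */
        static bool mgc_import_is_usable(struct obd_import *imp)
        {
                bool usable;

                spin_lock(&imp->imp_lock);
                usable = imp->imp_state == LUSTRE_IMP_FULL &&
                         !imp->imp_invalid;
                spin_unlock(&imp->imp_lock);

                return usable;
        }

If the import is not usable, the caller would have to wait for the pinger/reconnect to finish or fall back to retrying, which is essentially the direction of the option 2 patch (http://review.whamcloud.com/11258).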

Comment by James A Simmons [ 06/Feb/15 ]

As a note, I don't see this in my regular RHEL testing, but I can consistently reproduce this problem with my 3.12 kernel setup. This is with the MGS and MDS each on separate nodes.

Comment by Gerrit Updater [ 09/Feb/15 ]

Alexey Lyashkov (alexey.lyashkov@seagate.com) uploaded a new patch: http://review.whamcloud.com/13693
Subject: LU-5420 mgc: fix reconnect
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ccfca18ad2ae9acb84dbfc4c0b2217bd10a0589d

Comment by Jodi Levi (Inactive) [ 17/Feb/15 ]

http://review.whamcloud.com/#/c/12515/

Comment by Gerrit Updater [ 20/Feb/15 ]

Oleg Drokin (oleg.drokin@intel.com) uploaded a new patch: http://review.whamcloud.com/13832
Subject: LU-5420 revert part of LU-4913
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 260e150f98f07fa68fb124348ca9540e77fed100

Comment by Gerrit Updater [ 22/Feb/15 ]

Oleg Drokin (oleg.drokin@intel.com) uploaded a new patch: http://review.whamcloud.com/13838
Subject: LU-5420 revert part of LU-4913
Project: fs/lustre-release
Branch: b2_7
Current Patch Set: 1
Commit: 77856caa2468dd69cfa5796bceb22c32aacf402f

Comment by Gerrit Updater [ 27/Feb/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13838/
Subject: LU-5420 ptlrpc: revert ptlrpc_reconnect_import() changes
Project: fs/lustre-release
Branch: b2_7
Current Patch Set:
Commit: 02739a078f54b5ccdf49456fd0d1daea90472a8d

Comment by Jodi Levi (Inactive) [ 27/Feb/15 ]

Patches landed to Master.

Comment by Peter Jones [ 27/Feb/15 ]

Actually, Jodi, the patches for master are still in flight. It is simply a workaround fix that has landed on b2_7.

Comment by Jodi Levi (Inactive) [ 27/Feb/15 ]

Yes, my apologies.

Comment by James A Simmons [ 18/Mar/15 ]

I see many patches for this. Which patches are valid?

Comment by Di Wang [ 18/Mar/15 ]

For now, you can use http://review.whamcloud.com/13838/, but that only reverts the patch (http://review.whamcloud.com/#/c/9967/) which caused the problem; it is not a real fix.

There are patches trying to fix this problem, but none of them satisfies everyone. So we will leave it to 2.8 for now.
http://review.whamcloud.com/13693
http://review.whamcloud.com/11258

Comment by Gerrit Updater [ 06/May/15 ]

Andreas Dilger (andreas.dilger@intel.com) merged in patch http://review.whamcloud.com/11258/
Subject: LU-5420 mgc: MGC should retry for invalid import
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 77d406a0699307e8e633ef41f8984f45c09db9b8

Comment by Peter Jones [ 11/May/15 ]

Landed for 2.8
