[LU-7171] Hard Failover recovery-small test_65: Inappropriate ioctl for device Created: 16/Sep/15  Updated: 28/Apr/17  Resolved: 28/Apr/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0, Lustre 2.10.0
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: Hongchao Zhang
Resolution: Cannot Reproduce Votes: 0
Labels: None
Environment:

client and server: lustre-master build #3175 RHEL7 zfs


Severity: 3
Rank (Obsolete): 9223372036854775807

Description

This issue was created by maloo for sarah_lw <wei3.liu@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/2a433f74-55bb-11e5-8784-5254006e85c2.

The sub-test test_65 failed with the following error:

test_65 failed with 1

test log

== recovery-small test 65: lock enqueue for destroyed export ========================================= 09:27:59 (1441618079)
Starting client: shadow-49vm1.shadow.whamcloud.com:  -o user_xattr,flock shadow-49vm3:shadow-49vm7:/lustre /mnt/lustre2
CMD: shadow-49vm1.shadow.whamcloud.com mkdir -p /mnt/lustre2
CMD: shadow-49vm1.shadow.whamcloud.com mount -t lustre -o user_xattr,flock shadow-49vm3:shadow-49vm7:/lustre /mnt/lustre2
mount.lustre: mount shadow-49vm3:shadow-49vm7:/lustre at /mnt/lustre2 failed: Input/output error
Is the MGS running?
error on ioctl 0x4008669a for '/mnt/lustre2/f65.recovery-small' (3): Inappropriate ioctl for device
error: setstripe: create file '/mnt/lustre2/f65.recovery-small' failed
 recovery-small test_65: @@@@@@ FAIL: test_65 failed with 1 

client dmesg

[ 6969.281626] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == recovery-small test 65: lock enqueue for destroyed export ========================================= 09:27:59 \(1441618079\)
[ 6969.448135] Lustre: DEBUG MARKER: == recovery-small test 65: lock enqueue for destroyed export ========================================= 09:27:59 (1441618079)
[ 6969.499596] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre2
[ 6969.510144] Lustre: DEBUG MARKER: mount -t lustre -o user_xattr,flock shadow-49vm3:shadow-49vm7:/lustre /mnt/lustre2
[ 6972.666189] LustreError: 166-1: MGC10.1.6.57@tcp: Connection to MGS (at 10.1.6.61@tcp) was lost; in progress operations using this service will fail
[ 6972.669074] LustreError: Skipped 2 previous similar messages
[ 6972.671309] LustreError: 15c-8: MGC10.1.6.57@tcp: The configuration from log 'lustre-client' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
[ 6972.677497] Lustre: Unmounted lustre-client
[ 6972.679317] LustreError: 13842:0:(obd_mount.c:1342:lustre_fill_super()) Unable to mount  (-5)
[ 6972.916786] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  recovery-small test_65: @@@@@@ FAIL: test_65 failed with 1 
[ 6973.085666] Lustre: DEBUG MARKER: recovery-small test_65: @@@@@@ FAIL: test_65 failed with 1
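
The sequence in the logs explains the misleading ioctl error: the client loses its connection to the MGS (LustreError 166-1), fetching the 'lustre-client' configuration log fails with -5 (EIO), and lustre_fill_super() aborts the mount. The subsequent setstripe therefore runs against a plain (non-Lustre) directory, so the LOV setstripe ioctl (0x4008669a) returns ENOTTY ("Inappropriate ioctl for device"). The ioctl failure is a symptom of the failed mount, not the root cause. As a minimal illustrative sketch (not the actual test script; paths are taken from the log above), the setstripe step could be guarded like this:

#!/bin/bash
MNT=/mnt/lustre2
# Only attempt setstripe if the mount actually succeeded; otherwise the
# LOV ioctl is issued against a non-Lustre directory and fails with ENOTTY.
if ! mountpoint -q "$MNT"; then
    echo "error: $MNT is not mounted, skipping setstripe" >&2
    exit 1
fi
lfs setstripe -c 1 "$MNT/f65.recovery-small"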


Comments
Comment by Saurabh Tandan (Inactive) [ 14/Oct/15 ]

Another instance for EL6.7 Server/Client - ZFS, tag 2.7.61:
https://testing.hpdd.intel.com/test_sets/4964bc54-6d42-11e5-bf10-5254006e85c2

Comment by Saurabh Tandan (Inactive) [ 14/Jan/16 ]

Also seen on master for tag 2.7.65 with an SELinux-enabled client.
1 Client, 1 OSS - 2 OSTs, 1 MDS - 1 MDT
Build #3301

== recovery-small test 65: lock enqueue for destroyed export == 00:09:54 (1452730194)
Starting client: eagle-52vm5.eagle.hpdd.intel.com:  -o user_xattr,flock eagle-52vm2@tcp:/lustre /mnt/lustre2
mount.lustre: mount eagle-52vm2@tcp:/lustre at /mnt/lustre2 failed: Input/output error
Is the MGS running?
error on ioctl 0x4008669a for '/mnt/lustre2/f65.recovery-small' (3): Inappropriate ioctl for device
error: setstripe: create file '/mnt/lustre2/f65.recovery-small' failed
 recovery-small test_65: @@@@@@ FAIL: test_65 failed with 1 
Comment by Saurabh Tandan (Inactive) [ 20/Jan/16 ]

Another instance found for hard failover: EL7 Server/Client - ZFS
Build #3305
https://testing.hpdd.intel.com/test_sets/fbbee064-bbc6-11e5-8506-5254006e85c2

Comment by Peter Jones [ 22/Jan/16 ]

Hongchao

Could you please look into this one?

Peter

Comment by Hongchao Zhang [ 26/Jan/16 ]

I have analyzed these failed cases. They failed because the connection between the client and the MGS was lost, and the Lustre mount
then failed as a result. No logs were found to explain what triggered the disconnection; the MGS simply did NOT
receive the request (PING request) from the client. I'm afraid it could be related to the network itself.
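
A minimal client-side diagnostic sketch for this kind of disconnection, assuming the NIDs from the log above (illustrative commands, not part of the test):

# Check raw LNet reachability of the MGS NID
lctl ping 10.1.6.61@tcp
# Show the MGC import state (FULL, DISCONN, ...) and recent connection attempts
lctl get_param mgc.*.import
# Dump the kernel debug buffer for later analysis
lctl dk /tmp/lustre-debug.log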

Comment by Saurabh Tandan (Inactive) [ 05/Feb/16 ]

Another instance for master, build #3316
https://testing.hpdd.intel.com/test_sets/fe17414c-cc2b-11e5-b519-5254006e85c2

Comment by Saurabh Tandan (Inactive) [ 09/Feb/16 ]

Another instance found for hard failover: EL7 Server/Client - ZFS, tag 2.7.66, master build #3314
https://testing.hpdd.intel.com/test_sessions/f0dd9616-ca6e-11e5-9609-5254006e85c2

Comment by Saurabh Tandan (Inactive) [ 24/Feb/16 ]

Another instance found on b2_8 for failover testing, build #6.
https://testing.hpdd.intel.com/test_sessions/54ec62da-d99d-11e5-9ebe-5254006e85c2
https://testing.hpdd.intel.com/test_sessions/c5a8e44c-d9c7-11e5-85dd-5254006e85c2

Comment by Hongchao Zhang [ 28/Apr/17 ]

There has been no occurrence of this problem since Jun 11, 2016.

Comment by Peter Jones [ 28/Apr/17 ]

OK, then let's close the ticket.
