[LU-545] 1.8<->2.1 interop: replay-vbr: Can't lstat /mnt/lustre/d0.replay-vbr/d0/f0b: Cannot send after transport endpoint shutdown
Created: 28/Jul/11  Updated: 27/Feb/12  Resolved: 27/Feb/12

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.0.0, Lustre 1.8.6
Fix Version/s: Lustre 2.1.0, Lustre 1.8.7

Type: Bug
Priority: Minor
Reporter: Jian Yu
Assignee: WC Triage
Resolution: Duplicate
Votes: 0
Labels: None
Environment:

Lustre Clients:
Tag: 1.8.6-wc1
Distro/Arch: RHEL6/x86_64 (kernel version: 2.6.32_131.2.1.el6)
Build: http://newbuild.whamcloud.com/job/lustre-b1_8/100/arch=x86_64,build_type=client,distro=el6,ib_stack=inkernel/
Network: IB (inkernel OFED)
ENABLE_QUOTA=yes

Lustre Servers:
Tag: v2_0_66_0
Distro/Arch: RHEL6/x86_64 (kernel version: 2.6.32-131.2.1.el6_lustre)
Build: http://newbuild.whamcloud.com/job/lustre-master/228/arch=x86_64,build_type=server,distro=el6,ib_stack=inkernel/
Network: IB (inkernel OFED)


Severity: 3
Rank (Obsolete): 6575

 Description   

replay-vbr test 0b failed as follows:

== test 0b: VBR: open (O_CREAT) checks version of parent == 11:20:15
mdd.lustre-MDT0000.sync_permission=0
Filesystem           1K-blocks      Used Available Use% Mounted on
fat-amd-1-ib@o2ib:/lustre
                        224904     30064    182456  15% /mnt/lustre
Succeed in opening file "/mnt/lustre/d0.replay-vbr/d0/f0b"(flags=O_RDWR)
Stopping client client-12-ib /mnt/lustre (opts:)
Failing mds on node fat-amd-1-ib
Stopping /mnt/mds (opts:)
affected facets: mds
df pid is 16562
Failover mds to fat-amd-1-ib
11:20:32 (1311790832) waiting for fat-amd-1-ib network 900 secs ...
11:20:32 (1311790832) network interface is UP
Starting mds: -o user_xattr,acl  /dev/sdb5 /mnt/mds
fat-amd-1-ib: debug=-1
fat-amd-1-ib: subsystem_debug=0xffb7e3ff
fat-amd-1-ib: debug_mb=48
Started lustre-MDT0000
fat-amd-3-ib: stat: cannot read file system information for `/mnt/lustre': Interrupted system call
fat-amd-3-ib: stat: cannot read file system information for `/mnt/lustre': Interrupted system call
Can't lstat /mnt/lustre/d0.replay-vbr/d0/f0b: Cannot send after transport endpoint shutdown
 replay-vbr test_0b: @@@@@@ FAIL: open succeeded unexpectedly 
Dumping lctl log to /home/yujian/test_logs/2011-07-27/072321/replay-vbr.test_0b.*.1311790926.log
tar: Removing leading `/' from member names
/home/yujian/test_logs/2011-07-27/072321/replay-vbr-1311790926.tar.bz2
Starting client: client-12-ib: -o user_xattr,acl,flock fat-amd-1-ib@o2ib:/lustre /mnt/lustre
client-12-ib: lnet.debug=-1
client-12-ib: lnet.subsystem_debug=0xffb7e3ff
client-12-ib: lnet.debug_mb=48
Resetting fail_loc on all nodes...done.
FAIL   (163s)

Dmesg on the client node fat-amd-3-ib showed:

LustreError: 16693:0:(llite_lib.c:1760:ll_statfs_internal()) mdc_statfs fails: rc = -4
LustreError: 16693:0:(llite_lib.c:1760:ll_statfs_internal()) Skipped 1 previous similar message
LustreError: 16705:0:(client.c:858:ptlrpc_import_delay_req()) @@@ IMP_INVALID  req@ffff88031b1e1c00 x1375512328929572/t0 o101->lustre-MDT0000_UUID@192.168.4.132@o2ib:12/10 lens 544/1184 e 0 to 1 dl 0 ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 16705:0:(mdc_locks.c:652:mdc_enqueue()) ldlm_cli_enqueue error: -108
Lustre: lustre-MDT0000-mdc-ffff88021b174400: Connection restored to service lustre-MDT0000 using nid 192.168.4.132@o2ib.
Lustre: DEBUG MARKER: replay-vbr test_0b: @@@@@@ FAIL: open succeeded unexpectedly
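
The negative return codes in the dmesg output above are negated Linux errno values, and they match the messages the test printed: -4 is EINTR ("Interrupted system call") and -108 is ESHUTDOWN ("Cannot send after transport endpoint shutdown"). The following is a minimal Python sketch for pulling such codes out of log lines. The helper name decode_rc and the small lookup table are illustrative assumptions, not part of Lustre or its test framework.

```python
import re

# Linux errno values seen in this report (hardcoded so the sketch
# behaves the same on any platform; this is not a complete table).
LINUX_ERRNO = {
    4: ("EINTR", "Interrupted system call"),
    108: ("ESHUTDOWN", "Cannot send after transport endpoint shutdown"),
}

# Matches the "rc = -4" and "error: -108" suffixes of LustreError lines.
RC_PATTERN = re.compile(r"(?:rc\s*=|error:)\s*(-\d+)")


def decode_rc(line):
    """Return (rc, name, message) for the first return code in line, or None."""
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    rc = int(match.group(1))
    name, message = LINUX_ERRNO.get(-rc, ("UNKNOWN", "not in table"))
    return rc, name, message


if __name__ == "__main__":
    lines = [
        "LustreError: 16693:0:(llite_lib.c:1760:ll_statfs_internal()) mdc_statfs fails: rc = -4",
        "LustreError: 16705:0:(mdc_locks.c:652:mdc_enqueue()) ldlm_cli_enqueue error: -108",
    ]
    for line in lines:
        print(decode_rc(line))
```

Hardcoding the two codes keeps the sketch independent of the errno table on the machine where the log is analyzed, which may not be Linux.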

Maloo report: https://maloo.whamcloud.com/test_sets/e102b110-b8c4-11e0-8bdf-52540025f9af

This is a known issue: bug 23465



 Comments   
Comment by Jian Yu [ 28/Jul/11 ]

replay-vbr test 0e also failed with the same issue:

fat-amd-3-ib: stat: cannot read file system information for `/mnt/lustre': Interrupted system call
fat-amd-3-ib: stat: cannot read file system information for `/mnt/lustre': Interrupted system call
Can't lstat /mnt/lustre/d0.replay-vbr/d0/f0e: Cannot send after transport endpoint shutdown
 replay-vbr test_0e: @@@@@@ FAIL: create succeeded unexpectedly 

Maloo report: https://maloo.whamcloud.com/test_sets/e102b110-b8c4-11e0-8bdf-52540025f9af
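
Both failures reported so far end in a line of the form "<suite> test_<id>: @@@@@@ FAIL: <reason>". As a rough sketch, assuming that line format holds for other subtests as well, a log could be scanned for failed subtests as follows. The helper name parse_fail is hypothetical and is not part of the Lustre test framework.

```python
import re

# Matches "<suite> test_<id>: @@@@@@ FAIL: <reason>" anywhere in a line,
# so it also works on the "Lustre: DEBUG MARKER: ..." copy in dmesg.
FAIL_PATTERN = re.compile(r"(\S+) (test_\w+): @@@@@@ FAIL: (.*?)\s*$")


def parse_fail(line):
    """Return (suite, test, reason) for a FAIL line, or None otherwise."""
    match = FAIL_PATTERN.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)


if __name__ == "__main__":
    log = [
        " replay-vbr test_0b: @@@@@@ FAIL: open succeeded unexpectedly ",
        " replay-vbr test_0e: @@@@@@ FAIL: create succeeded unexpectedly ",
        "Resetting fail_loc on all nodes...done.",
    ]
    for line in log:
        result = parse_fail(line)
        if result is not None:
            print(result)
```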

Comment by Jian Yu [ 28/Aug/11 ]

Lustre Clients:
Tag: 1.8.6-wc1
Distro/Arch: RHEL6/x86_64 (kernel version: 2.6.32_131.2.1.el6)
Build: http://newbuild.whamcloud.com/job/lustre-b1_8/100/arch=x86_64,build_type=client,distro=el6,ib_stack=inkernel/
Network: IB (inkernel OFED)
ENABLE_QUOTA=yes

Lustre Servers:
Tag: v2_1_0_0_RC1
Distro/Arch: RHEL6/x86_64 (kernel version: 2.6.32-131.6.1.el6_lustre)
Build: http://newbuild.whamcloud.com/job/lustre-master/271/arch=x86_64,build_type=server,distro=el6,ib_stack=inkernel/
Network: IB (inkernel OFED)

replay-vbr tests 0j and 0u failed with the same issue: https://maloo.whamcloud.com/test_sets/2519ed2e-cfbc-11e0-8d02-52540025f9af

Comment by Yang Sheng [ 27/Feb/12 ]

This is a duplicate of LU-891.

Generated at Sat Feb 10 01:08:06 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.