[LU-536] recovery-mds-scale: (llite_lib.c:1142:ll_md_setattr()) md_setattr fails: rc = -30 Created: 26/Jul/11  Updated: 16/Aug/16  Resolved: 16/Aug/16

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0
Fix Version/s: Lustre 2.1.0

Type: Bug
Priority: Minor
Reporter: Jian Yu
Assignee: WC Triage
Resolution: Won't Fix
Votes: 0
Labels: None
Environment:

Lustre Tag: v2_0_65_0
Lustre Build: http://newbuild.whamcloud.com/job/lustre-master/204/
e2fsprogs Build: http://newbuild.whamcloud.com/job/e2fsprogs-master/42/
Distro/Arch: RHEL6/x86_64(in-kernel OFED, kernel version: 2.6.32-131.2.1.el6)
ENABLE_QUOTA=yes
FAILURE_MODE=HARD

MGS/MDS Nodes: client-7-ib(active), client-8-ib(passive)
\ /
1 combined MGS/MDT

OSS Nodes: fat-amd-1-ib(active), fat-amd-2-ib(active)
\ /
OST1 (active in fat-amd-1-ib)
OST2 (active in fat-amd-2-ib)
OST3 (active in fat-amd-1-ib)
OST4 (active in fat-amd-2-ib)
OST5 (active in fat-amd-1-ib)
OST6 (active in fat-amd-2-ib)

Client Nodes: fat-amd-3-ib, client-[9,11,12,13]-ib


Attachments: recovery-mds-scale-1311587892.tar.bz2
Severity: 3
Rank (Obsolete): 9003

 Description   

While running the recovery-mds-scale test, the run failed as follows after the MDS had failed over 12 times:

==== Checking the clients loads AFTER  failover -- failure NOT OK
Client load failed on node client-13-ib, rc=1
Client load failed during failover. Exiting
Found the END_RUN_FILE file: /home/yujian/test_logs/end_run_file
client-13-ib
Client load failed on node client-13-ib

client client-13-ib load stdout and debug files :
              /tmp/recovery-mds-scale.log_run_dbench.sh-client-13-ib
              /tmp/recovery-mds-scale.log_run_dbench.sh-client-13-ib.debug
2011-07-25 02:58:07 Terminating clients loads ...
Duration:                43200
Server failover period: 600 seconds
Exited after:           6723 seconds
Number of failovers before exit:
mds1: 12 times

/tmp/recovery-mds-scale.log_run_dbench.sh-client-13-ib:

copying /usr/share/dbench/client.txt to /mnt/lustre/d0.dbench-client-13-ib/client.txt
running 'dbench 2' on /mnt/lustre/d0.dbench-client-13-ib at Mon Jul 25 02:56:38 PDT 2011
dbench PID=8460
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004

Running for 600 seconds with load 'client.txt' and minimum warmup 120 secs
0 of 2 processes prepared for launch   0 sec
2 of 2 processes prepared for launch   0 sec
releasing clients
[678] open ./clients/client1/~dmtmp/PWRPNT/NEWPCB.PPT failed for handle 10013 (No such file or directory)
(679) ERROR: handle 10013 was not found
Child failed with status 1

/tmp/recovery-mds-scale.log_run_dbench.sh-client-13-ib.debug:

<~snip~>
2011-07-25 02:56:38: dbench run starting
+ mkdir -p /mnt/lustre/d0.dbench-client-13-ib
+ load_pid=8452
+ wait 8452
+ rundbench -D /mnt/lustre/d0.dbench-client-13-ib 2
touch: missing file operand
Try `touch --help' for more information.
+ '[' 1 -eq 0 ']'
++ date '+%F %H:%M:%S'
+ echoerr '2011-07-25 02:56:39: dbench failed'
+ echo '2011-07-25 02:56:39: dbench failed'
2011-07-25 02:56:39: dbench failed

Syslog on the client node client-13-ib showed the following errors:

Jul 25 02:56:39 client-13 kernel: LustreError: 8461:0:(llite_lib.c:1142:ll_md_setattr()) md_setattr fails: rc = -30
Jul 25 02:56:39 client-13 kernel: LustreError: 8461:0:(llite_lib.c:1142:ll_md_setattr()) Skipped 1 previous similar message
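
For readers triaging similar reports, the return code in the message above can be decoded mechanically. The sketch below assumes that rc follows the usual kernel convention of a negated errno value; on Linux, errno 30 is EROFS ("Read-only file system"). The regular expression and the helper name decode_rc are illustrative only and are not part of Lustre or its test suite.

```python
import errno
import os
import re

# Syslog line copied verbatim from this ticket.
LINE = (
    "Jul 25 02:56:39 client-13 kernel: LustreError: 8461:0:"
    "(llite_lib.c:1142:ll_md_setattr()) md_setattr fails: rc = -30"
)

# Hypothetical pattern for "(file:line:function()) ... rc = N" messages.
RC_PATTERN = re.compile(
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\).*rc = (?P<rc>-?\d+)"
)


def decode_rc(line):
    """Extract the rc value from a log line and map it to an errno name.

    Assumes rc is a negated errno, as is conventional for kernel code.
    Returns None when the line does not match the pattern.
    """
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    rc = int(match.group("rc"))
    code = abs(rc)
    return {
        "func": match.group("func"),
        "rc": rc,
        "name": errno.errorcode.get(code, "UNKNOWN"),
        "message": os.strerror(code),
    }


if __name__ == "__main__":
    print(decode_rc(LINE))
```

If the assumption holds, the failed setattr was rejected because the target was read-only at that moment. The log excerpt alone does not show why it was read-only.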

Maloo report: https://maloo.whamcloud.com/test_sets/c18c51da-b750-11e0-8bdf-52540025f9af

Please find more logs in the attached recovery-mds-scale-1311587892.tar.bz2.



 Comments   
Comment by James A Simmons [ 16/Aug/16 ]

Old ticket for unsupported version
