Lustre / LU-3175

recovery-mds-scale test_failover_mds: unlink ./clients/client1/~dmtmp/PWRPNT/PPTC112.TMP failed (Read-only file system)

Details

    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Blocker
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.4.0
    • Labels: None

    • Lustre Branch: master
      Lustre Build: http://build.whamcloud.com/job/lustre-master/1406/
      Distro/Arch: RHEL6.3/x86_64
      Test Group: failover
      FAILURE_MODE=HARD
    • Severity: 3
    • Rank (Obsolete): 7735

    Description

      While running recovery-mds-scale test_failover_mds, dbench and iozone operations failed on client nodes as follows:

      copying /usr/share/dbench/client.txt to /mnt/lustre/d0.dbench-wtm-79/client.txt
      running 'dbench 2' on /mnt/lustre/d0.dbench-wtm-79 at Mon Apr 15 08:12:41 PDT 2013
      dbench PID=11113
      dbench version 4.00 - Copyright Andrew Tridgell 1999-2004
      
      Running for 600 seconds with load 'client.txt' and minimum warmup 120 secs
      0 of 2 processes prepared for launch   0 sec
      2 of 2 processes prepared for launch   0 sec
      releasing clients
         2       241    18.48 MB/sec  warmup   1 sec  latency 20.124 ms
         2       496    17.34 MB/sec  warmup   2 sec  latency 16.341 ms
         2       664    14.10 MB/sec  warmup   3 sec  latency 608.471 ms
         2       666    10.58 MB/sec  warmup   4 sec  latency 1093.980 ms
         2       722     8.52 MB/sec  warmup   5 sec  latency 649.730 ms
         2       724     7.10 MB/sec  warmup   6 sec  latency 1189.957 ms
         2       724     6.09 MB/sec  warmup   7 sec  latency 1332.253 ms
         2       724     5.32 MB/sec  warmup   8 sec  latency 2332.481 ms
         2       727     4.73 MB/sec  warmup   9 sec  latency 3176.583 ms
         2       729     4.26 MB/sec  warmup  10 sec  latency 632.289 ms
         2       731     3.87 MB/sec  warmup  11 sec  latency 804.657 ms
         2       731     3.55 MB/sec  warmup  12 sec  latency 1804.771 ms
         2       761     3.29 MB/sec  warmup  13 sec  latency 2337.010 ms
         2       791     3.07 MB/sec  warmup  14 sec  latency 1105.492 ms
      [811] unlink ./clients/client1/~dmtmp/PWRPNT/PPTC112.TMP failed (Read-only file system) - expected NT_STATUS_OK
      ERROR: child 1 failed at line 811
      [811] unlink ./clients/client0/~dmtmp/PWRPNT/PPTC112.TMP failed (Read-only file system) - expected NT_STATUS_OK
      ERROR: child 0 failed at line 811
      Child failed with status 1
      
        Machine = Linux wtm-81 2.6.32-279.19.1.el6.x86_64 #1 SMP Wed Dec 19 07:05:20 U
        Excel chart generation enabled
              Verify Mode. Pattern 3a3a3a3a
              Performance measurements are invalid in this mode.
              Using maximum file size of 102400 kilobytes.
              Using Maximum Record Size 512 KB
              Command line used: iozone -a -M -R -V 0xab -g 100M -q 512k -i0 -i1 -f /mnt/lustre/d0.iozone-wtm-81/iozone-file
              Output is in Kbytes/sec
              Time Resolution = 0.000001 seconds.
              Processor cache size set to 1024 Kbytes.
              Processor cache line size set to 32 bytes.
              File stride size set to 17 * record size.
                                                                  random  random    bkwd   record   stride                                   
                    KB  reclen   write rewrite    read    reread    read   write    read  rewrite     read   fwrite frewrite   fread  freread
                    64       4
      Can not open temp file: /mnt/lustre/d0.iozone-wtm-81/iozone-file
      

      Dmesg on the client nodes showed the following:

      Lustre: DEBUG MARKER: Starting failover on mds1
      LustreError: 11115:0:(llite_lib.c:1294:ll_md_setattr()) md_setattr fails: rc = -30
      LustreError: 11114:0:(llite_lib.c:1294:ll_md_setattr()) md_setattr fails: rc = -30
      LustreError: 11115:0:(file.c:158:ll_close_inode_openhandle()) inode 144115205289279635 mdc close failed: rc = -30
      LustreError: 11115:0:(file.c:158:ll_close_inode_openhandle()) inode 144115205289279635 mdc close failed: rc = -30
      

      The test results are still in the Maloo import queue.

      Attachments

        Activity

          yujian Jian Yu made changes -
          Resolution New: Not a Bug [ 6 ]
          Status Original: Open [ 1 ] New: Closed [ 6 ]
          yujian Jian Yu added a comment -

          The "pm -h powerman --off" command on the Rosso cluster did not power off the physical test node directly; it just brought the test node down gracefully and safely (like the shutdown command). On VM nodes, the "pm -h powerman --off" command worked correctly, which is why autotest did not hit the issue in this ticket.

          After I changed to the "--reset" option, which really powers off the test node, I did not hit the issue again.
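The distinction above can be captured in a small dry-run wrapper. This is a hedged sketch, not from the ticket: the `power_down_node` helper, the `POWERMAN_HOST` variable, and the `mds1` node name are hypothetical, and `echo` keeps the `pm` invocations from actually running. The `--off` and `--reset` options are the ones named in the comment.

```shell
#!/bin/sh
# Hypothetical helper illustrating the change Yujian describes: for a HARD
# failure-mode test, use "--reset" (a real power cut) rather than "--off"
# (which on the Rosso hardware only triggered a graceful shutdown).
# All names here are illustrative; 'echo' makes this a dry run.
POWERMAN_HOST=${POWERMAN_HOST:-powerman}

power_down_node() {
    node=$1
    mode=$2   # "soft" -> --off (graceful on Rosso), "hard" -> --reset
    case "$mode" in
        hard) echo "pm -h $POWERMAN_HOST --reset $node" ;;
        soft) echo "pm -h $POWERMAN_HOST --off $node" ;;
        *)    echo "usage: power_down_node <node> soft|hard" >&2; return 1 ;;
    esac
}

power_down_node mds1 hard
```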


          niu Niu Yawei (Inactive) added a comment -

          Thank you, Yujian!

          Shutting down system logger: [  OK  ]
          
          Stopping iscsi: sd 17:0:0:1: [sdm] Synchronizing SCSI cache
          sd 9:0:0:1: [sdn] Synchronizing SCSI cache
          sd 16:0:0:1: [sdg] Synchronizing SCSI cache
          sd 12:0:0:1: [sdk] Synchronizing SCSI cache
          sd 13:0:0:1: [sdf] Synchronizing SCSI cache
          sd 14:0:0:1: [sdj] Synchronizing SCSI cache
          sd 15:0:0:1: [sdh] Synchronizing SCSI cache
          sd 10:0:0:1: [sdl] Synchronizing SCSI cache
          sd 11:0:0:1: [sde] Synchronizing SCSI cache
          Aborting journal on device sdi-8.
          JBD2: I/O error detected when updating journal superblock for sdi-8.
          LustreError: 9599:0:(osd_handler.c:636:osd_trans_commit_cb()) transaction @0xffff88061c03e780 commit error: 2
          LDISKFS-fs error (device sdi): ldiskfs_journal_start_sb: Detected aborted journal
          LDISKFS-fs (sdi): Remounting filesystem read-only
          LustreError: 9599:0:(osd_handler.c:636:osd_trans_commit_cb()) transaction @0xffff880c18e70880 commit error: 2
          journal commit I/O error
          journal commit I/O error
          LDISKFS-fs error (device sdi) in osd_trans_stop: IO failure
          LustreError: 12393:0:(osd_handler.c:846:osd_trans_stop()) Failure to stop transaction: -5
          LDISKFS-fs error (device sdi): ldiskfs_find_entry: 
          sd 8:0:0:1: [sdi] Synchronizing SCSI cache
          

          When rebooting the MDS, we stopped iscsi before unmounting the MDS; that's why EROFS happened. Maybe there is something wrong in the (iscsi) shutdown script? I think all filesystems should be unmounted before stopping iscsi.
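The ordering suggested above can be sketched as follows. This is a minimal illustration, not the actual shutdown script: the `order_shutdown` function and the sample mounts table are hypothetical, and `echo` keeps the `umount`/`service` commands as a dry run. The point is simply that the ldiskfs filesystems come off before the iSCSI sessions go down, so the journal never sees I/O errors.

```shell
#!/bin/sh
# Hypothetical sketch of a safe shutdown ordering: unmount ldiskfs/Lustre
# filesystems (in reverse mount order) BEFORE stopping iscsi, so the
# backing devices are still reachable while the journal is closed.
order_shutdown() {
    mounts_file=$1
    # unmount ldiskfs/lustre filesystems first, newest mount first
    awk '$3 == "ldiskfs" || $3 == "lustre" {print $2}' "$mounts_file" | tac | \
        while read -r mnt; do
            echo "umount $mnt"          # dry run: would be 'umount $mnt'
        done
    # only then bring down the iSCSI sessions
    echo "service iscsi stop"           # dry run
}

# illustrative mounts table (device, mountpoint, fstype)
cat > /tmp/mounts.sample <<'EOF'
/dev/sdi /mnt/mds1 ldiskfs rw 0 0
/dev/sdj /mnt/ost1 ldiskfs rw 0 0
EOF
order_shutdown /tmp/mounts.sample
```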

          yujian Jian Yu added a comment -

          Lustre Branch: master
          Lustre Build: http://build.whamcloud.com/job/lustre-master/1406/

          A new test result: https://maloo.whamcloud.com/test_sessions/ec1c08ae-a737-11e2-b3cc-52540035b04c
          yujian Jian Yu added a comment -

          Unfortunately, it looks like the log for the MDS is truncated; what we got is the post-test log: the mds (wtm-82) log starts at 1366038913, but the last -EROFS seen in the client (wtm-79) log is at 1366038775.

          This is because of the hard failure mode, i.e., wtm-82 was powered off and on during the testing, so the debug log on wtm-82 was gathered only after it came back up.


          niu Niu Yawei (Inactive) added a comment -

          Thank you, Yujian. Unfortunately, it looks like the log for the MDS is truncated; what we got is the post-test log: the mds (wtm-82) log starts at 1366038913, but the last -EROFS seen in the client (wtm-79) log is at 1366038775.

          yujian Jian Yu added a comment - edited

          Yujian, is it possible to get the mds & client debug log?

          Hi Niu, the debug logs are in /scratch/logs/2.4.0/recovery-mds-scale.test_failover_mds.debug_log.tar.bz2 on brent node. The debug level is -1.

          The Maloo report is https://maloo.whamcloud.com/test_sets/3d09403c-a5f9-11e2-b0a9-52540035b04c.
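For context, a debug level of -1 (all debug flags) and a post-run dump are typically set up with `lctl`, roughly as sketched below. This is a hedged illustration, not the ticket's actual test harness; the output path is hypothetical, and `echo` keeps it a dry run since `lctl` only exists on a Lustre node.

```shell
#!/bin/sh
# Illustrative sketch of collecting a full Lustre debug log, matching
# "The debug level is -1" above. Dry run: 'echo' stands in for running
# the commands on a real Lustre node; the path is hypothetical.
collect_debug_log() {
    out=$1
    echo "lctl set_param debug=-1"   # enable all debug flags
    echo "lctl dk $out"              # dump and clear the kernel debug buffer
}

collect_debug_log /tmp/recovery-mds-scale.debug_log
```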


          niu Niu Yawei (Inactive) added a comment -

          Yujian, is it possible to get the mds & client debug logs? It looks like b2_1 has a similar problem (see LU-536); I checked the attached logs in LU-536, but didn't find logs for the client & active mds.

          pjones Peter Jones made changes -
          Assignee Original: WC Triage [ wc-triage ] New: Niu Yawei [ niu ]
          pjones Peter Jones added a comment -

          Niu

          Could you please look into this one?

          Thanks

          Peter


          People

            Assignee: niu Niu Yawei (Inactive)
            Reporter: yujian Jian Yu
            Votes: 0
            Watchers: 6
