[LU-657] recovery-mds-scale (FLAVOR=MDS): client load failed Created: 02/Sep/11  Updated: 09/Apr/13  Resolved: 15/Jan/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0, Lustre 2.3.0, Lustre 2.4.0, Lustre 2.1.1, Lustre 2.1.4
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Blocker
Reporter: Jian Yu Assignee: Hongchao Zhang
Resolution: Fixed Votes: 0
Labels: MB
Environment:

Lustre Tag: v2_1_0_0_RC1
Lustre Build: http://newbuild.whamcloud.com/job/lustre-master/274/
e2fsprogs Build: http://newbuild.whamcloud.com/job/e2fsprogs-master/42/
Distro/Arch: RHEL6/x86_64(in-kernel OFED, kernel version: 2.6.32-131.6.1.el6.x86_64)
ENABLE_QUOTA=yes
FAILURE_MODE=HARD
FLAVOR=MDS

MGS/MDS Nodes: fat-amd-1-ib(active), fat-amd-2-ib(passive)
\ /
1 combined MGS/MDT

OSS Nodes: fat-amd-3-ib(active), fat-amd-4-ib(active)
\ /
OST1 (lustre-OST0000, active in fat-amd-3-ib)
OST2 (lustre-OST0001, active in fat-amd-4-ib)
OST3 (lustre-OST0002, active in fat-amd-3-ib)
OST4 (lustre-OST0003, active in fat-amd-4-ib)
OST5 (lustre-OST0004, active in fat-amd-3-ib)
OST6 (lustre-OST0005, active in fat-amd-4-ib)

Client Nodes: client-[1,2,4,5,12,13,15],fat-intel-4

Network Addresses:
fat-amd-1-ib: 192.168.4.132
fat-amd-2-ib: 192.168.4.133
fat-amd-3-ib: 192.168.4.134
fat-amd-4-ib: 192.168.4.135
client-1-ib: 192.168.4.1
client-2-ib: 192.168.4.2
client-4-ib: 192.168.4.4
client-5-ib: 192.168.4.5
client-12-ib: 192.168.4.12
client-13-ib: 192.168.4.13
client-15-ib: 192.168.4.15
fat-intel-4-ib: 192.168.4.131


Attachments: File recovery-mds-scale.1316453521.log.tar.bz2    
Issue Links:
Related
is related to LU-1824 Test failure on test suite obdfilter-... Resolved
Severity: 3
Rank (Obsolete): 4239

 Description   

After running for 8 hours (the MDS failed over 51 times), the recovery-mds-scale test failed as follows:

Starting mds1: -o user_xattr,acl  /dev/disk/by-id/scsi-1IET_00010001 /mnt/mds1
fat-amd-1-ib: debug=0x33f0404
fat-amd-1-ib: subsystem_debug=0xffb7e3ff
fat-amd-1-ib: debug_mb=48
Started lustre-MDT0000
==== Checking the clients loads AFTER  failover -- failure NOT OK
Client load failed on node client-12-ib, rc=1
Client load failed during failover. Exiting
Found the END_RUN_FILE file: /home/yujian/test_logs/end_run_file
client-12-ib
client-2-ib
Client load failed on node client-12-ib

client client-12-ib load stdout and debug files :
              /tmp/recovery-mds-scale.log_run_dd.sh-client-12-ib
              /tmp/recovery-mds-scale.log_run_dd.sh-client-12-ib.debug
2011-09-01 08:15:46 Terminating clients loads ...
Duration:                43200
Server failover period: 600 seconds
Exited after:           30428 seconds
Number of failovers before exit:
mds1: 51 times
ost1: 0 times
ost2: 0 times
ost3: 0 times
ost4: 0 times
ost5: 0 times
ost6: 0 times
Status: FAIL: rc=1

/tmp/recovery-mds-scale.log_run_dd.sh-client-12-ib.debug:

2011-09-01 08:07:51: dd succeeded
+ cd /tmp
+ rm -rf /mnt/lustre/d0.dd-client-12-ib
++ date '+%F %H:%M:%S'
+ echoerr '2011-09-01 08:07:53: dd run finished'
+ echo '2011-09-01 08:07:53: dd run finished'
2011-09-01 08:07:53: dd run finished
+ '[' '!' -e /home/yujian/test_logs/end_run_file ']'
+ true
++ date '+%F %H:%M:%S'
+ echoerr '2011-09-01 08:07:53: dd run starting'
+ echo '2011-09-01 08:07:53: dd run starting'
2011-09-01 08:07:53: dd run starting
+ mkdir -p /mnt/lustre/d0.dd-client-12-ib
+ cd /mnt/lustre/d0.dd-client-12-ib
+ load_pid=26714
+ wait 26714
+ dd bs=4k count=1000000 status=noxfer if=/dev/zero of=/mnt/lustre/d0.dd-client-12-ib/dd-file
dd: writing `/mnt/lustre/d0.dd-client-12-ib/dd-file': No space left on device
805559+0 records in
805558+0 records out
+ '[' 1 -eq 0 ']'
++ date '+%F %H:%M:%S'
+ echoerr '2011-09-01 08:09:10: dd failed'
+ echo '2011-09-01 08:09:10: dd failed'
2011-09-01 08:09:10: dd failed

Syslog on client-12-ib showed:

Sep  1 08:09:10 client-12 kernel: Lustre: DEBUG MARKER: Checking clients are in FULL state before doing next failover
Sep  1 08:09:10 client-12 kernel: LustreError: 11-0: an error occurred while communicating with 192.168.4.135@o2ib. The ost_write operation failed with -28
Sep  1 08:09:10 client-12 kernel: LustreError: 26714:0:(vvp_io.c:1001:vvp_io_commit_write()) Write page 805558 of inode ffff880312852a38 failed -28

The dd error and the syslog on client-12-ib showed "No space left on device" from the OSS node fat-amd-4-ib (192.168.4.135); however, df on that node showed:

[root@fat-amd-4-ib ~]# lctl dl
  0 UP mgc MGC192.168.4.132@o2ib f176795b-1295-66f4-d018-e6c09ba5b112 5
  1 UP ost OSS OSS_uuid 3
  2 UP obdfilter lustre-OST0001 lustre-OST0001_UUID 21
  3 UP obdfilter lustre-OST0003 lustre-OST0003_UUID 21
  4 UP obdfilter lustre-OST0005 lustre-OST0005_UUID 21

[root@fat-amd-4-ib ~]# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             20642428   2223180  17370672  12% /
tmpfs                  8165072         0   8165072   0% /dev/shm
/dev/sdk              15939896   6847620   8292276  46% /mnt/ost2
/dev/sdg              15939896    460468  14679428   4% /mnt/ost4
/dev/sdh              15939896    463104  14676792   4% /mnt/ost6

"lfs df" and "lfs df -i" on the client nodes also showed that there were free enough spaces and inodes on the OSTs. What's more, for each load (dd, tar, dbench, iozone) running on different client node, the load dir would always be removed after a successful run and before a new run (please refer to the recovery-mds-scale.log_run_

{dd,tar,dbench,iozone}

.sh-*.debug files), so there should not be ENOSPC error.

Syslog on fat-amd-4-ib showed that:

Sep  1 08:00:05 fat-amd-4-ib kernel: Pid: 4286, comm: ll_ost_io_127
Sep  1 08:00:05 fat-amd-4-ib kernel:
Sep  1 08:00:05 fat-amd-4-ib kernel: Call Trace:
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffff81247e8f>] ? __generic_unplug_device+0x1f/0x40
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffff814dc995>] schedule_timeout+0x215/0x2e0
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffff814dc613>] wait_for_common+0x123/0x180
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffff8105dc60>] ? default_wake_function+0x0/0x20
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffff81249a5d>] ? submit_bio+0x8d/0x120
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffff814dc72d>] wait_for_completion+0x1d/0x20
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffff8124cedd>] __blkdev_issue_flush+0xad/0xe0
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffff8124cf26>] blkdev_issue_flush+0x16/0x20
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa041f8ab>] ldiskfs_sync_file+0x17b/0x250 [ldiskfs]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa0a15865>] filter_sync+0x285/0x3e0 [obdfilter]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffff8115a340>] ? cache_alloc_refill+0x1c0/0x240
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa04fe98c>] ? lprocfs_counter_add+0x12c/0x196 [lvfs]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa09cfe0c>] ost_blocking_ast+0x58c/0xe40 [ost]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa064db73>] ldlm_cancel_callback+0x63/0xf0 [ptlrpc]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa064dc59>] ldlm_lock_cancel+0x59/0x190 [ptlrpc]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa066fe36>] ldlm_request_cancel+0x256/0x420 [ptlrpc]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa09d9720>] ost_handle+0x3d60/0x4b90 [ost]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa049b6b1>] ? libcfs_debug_vmsg2+0x4d1/0xb50 [libcfs]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa0694104>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa069684c>] ? lustre_msg_get_status+0x3c/0xa0 [ptlrpc]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa06a4c7e>] ptlrpc_main+0xb8e/0x1900 [ptlrpc]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa06a40f0>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffff8100c1ca>] child_rip+0xa/0x20
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa06a40f0>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffffa06a40f0>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
Sep  1 08:00:05 fat-amd-4-ib kernel: [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
Sep  1 08:00:05 fat-amd-4-ib kernel:
Sep  1 08:00:05 fat-amd-4-ib kernel: LustreError: dumping log to /tmp/lustre-log.1314889205.4286
Sep  1 08:00:05 fat-amd-4-ib kernel: Lustre: Service thread pid 4210 was inactive for 156.04s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Sep  1 08:00:05 fat-amd-4-ib kernel: Lustre: Skipped 2 previous similar messages
Sep  1 08:00:05 fat-amd-4-ib kernel: LustreError: dumping log to /tmp/lustre-log.1314889205.4270
Sep  1 08:00:05 fat-amd-4-ib kernel: LustreError: dumping log to /tmp/lustre-log.1314889205.4269
Sep  1 08:00:06 fat-amd-4-ib kernel: LustreError: dumping log to /tmp/lustre-log.1314889206.4237
Sep  1 08:00:06 fat-amd-4-ib kernel: LustreError: dumping log to /tmp/lustre-log.1314889206.4267
Sep  1 08:00:06 fat-amd-4-ib kernel: LustreError: dumping log to /tmp/lustre-log.1314889206.4283
Sep  1 08:00:12 fat-amd-4-ib kernel: Lustre: Service thread pid 4218 was inactive for 156.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Sep  1 08:00:12 fat-amd-4-ib kernel: Lustre: Skipped 12 previous similar messages
Sep  1 08:00:12 fat-amd-4-ib kernel: LustreError: dumping log to /tmp/lustre-log.1314889212.4218
Sep  1 08:00:15 fat-amd-4-ib kernel: LustreError: dumping log to /tmp/lustre-log.1314889215.4285
Sep  1 08:00:18 fat-amd-4-ib kernel: Lustre: Service thread pid 4201 was inactive for 156.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Sep  1 08:00:18 fat-amd-4-ib kernel: Lustre: Skipped 2 previous similar messages
Sep  1 08:00:18 fat-amd-4-ib kernel: LustreError: dumping log to /tmp/lustre-log.1314889218.4201
Sep  1 08:00:25 fat-amd-4-ib kernel: Lustre: Service thread pid 4204 completed after 177.20s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
Sep  1 08:00:25 fat-amd-4-ib kernel: Lustre: Skipped 27 previous similar messages
Sep  1 08:00:35 fat-amd-4-ib kernel: Lustre: Service thread pid 4249 was inactive for 156.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.
Sep  1 08:00:35 fat-amd-4-ib kernel: LustreError: dumping log to /tmp/lustre-log.1314889235.4249
Sep  1 08:00:36 fat-amd-4-ib kernel: LustreError: dumping log to /tmp/lustre-log.1314889236.4275
Sep  1 08:00:43 fat-amd-4-ib kernel: LustreError: dumping log to /tmp/lustre-log.1314889243.4222
Sep  1 08:00:48 fat-amd-4-ib kernel: Lustre: Service thread pid 4226 completed after 200.23s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).

Maloo report: https://maloo.whamcloud.com/test_sets/b4bac0f4-d523-11e0-8d02-52540025f9af

Since all of the log files are large, I could not upload them to this ticket. Please find /scratch/logs/2.1.0/recovery-mds-scale.1314890151.log.tar.bz2 on node brent.whamcloud.com.



 Comments   
Comment by Jian Yu [ 13/Sep/11 ]

Lustre Branch: master
Lustre Build: http://build.whamcloud.com/job/lustre-reviews/2161/
e2fsprogs Build: http://newbuild.whamcloud.com/job/e2fsprogs-master/54/
Distro/Arch: RHEL6/x86_64(in-kernel OFED, kernel version: 2.6.32-131.6.1.el6.x86_64)
ENABLE_QUOTA=yes
FAILURE_MODE=HARD
FLAVOR=OSS

Lustre configuration:

MGS/MDS Node: fat-amd-1-ib

OSS Nodes: fat-amd-3-ib(active), fat-amd-4-ib(active)
                               \ /
                               OST1 (active in fat-amd-3-ib)
                               OST2 (active in fat-amd-4-ib)
                               OST3 (active in fat-amd-3-ib)
                               OST4 (active in fat-amd-4-ib)
                               OST5 (active in fat-amd-3-ib)
                               OST6 (active in fat-amd-4-ib)
           fat-amd-2-ib(OST7)

Client Nodes: client-[1,2,4,5,12,13,15],fat-intel-4 

After the OSSs failed over 9 times, recovery-mds-scale (FLAVOR=OSS) failed as follows:

==== Checking the clients loads AFTER  failover -- failure NOT OK
Client load failed on node client-12-ib, rc=1
Client load failed during failover. Exiting
Found the END_RUN_FILE file: /home/yujian/test_logs/end_run_file
client-12-ib
Client load failed on node client-12-ib

client client-12-ib load stdout and debug files :
              /tmp/recovery-mds-scale.log_run_dd.sh-client-12-ib
              /tmp/recovery-mds-scale.log_run_dd.sh-client-12-ib.debug
2011-09-13 03:17:20 Terminating clients loads ...
Duration:                7200
Server failover period: 600 seconds
Exited after:           6452 seconds
Number of failovers before exit:
mds1: 0 times
ost1: 2 times
ost2: 2 times
ost3: 0 times
ost4: 1 times
ost5: 0 times
ost6: 2 times
ost7: 2 times
Status: FAIL: rc=1

/tmp/recovery-mds-scale.log_run_dd.sh-client-12-ib.debug:

2011-09-13 03:12:09: dd succeeded
+ cd /tmp
+ rm -rf /mnt/lustre/d0.dd-client-12-ib
++ date '+%F %H:%M:%S'
+ echoerr '2011-09-13 03:12:13: dd run finished'
+ echo '2011-09-13 03:12:13: dd run finished'
2011-09-13 03:12:13: dd run finished
+ '[' '!' -e /home/yujian/test_logs/end_run_file ']'
+ true
++ date '+%F %H:%M:%S'
+ echoerr '2011-09-13 03:12:13: dd run starting'
+ echo '2011-09-13 03:12:13: dd run starting'
2011-09-13 03:12:13: dd run starting
+ mkdir -p /mnt/lustre/d0.dd-client-12-ib
+ cd /mnt/lustre/d0.dd-client-12-ib
+ load_pid=8828
+ wait 8828
+ dd bs=4k count=1000000 status=noxfer if=/dev/zero of=/mnt/lustre/d0.dd-client-12-ib/dd-file
dd: writing `/mnt/lustre/d0.dd-client-12-ib/dd-file': No space left on device
654964+0 records in
654963+0 records out
+ '[' 1 -eq 0 ']'
++ date '+%F %H:%M:%S'
+ echoerr '2011-09-13 03:13:21: dd failed'
+ echo '2011-09-13 03:13:21: dd failed'
2011-09-13 03:13:21: dd failed

Syslog on client-12-ib showed that:

Sep 13 03:13:12 client-12 kernel: Lustre: 2186:0:(client.c:1773:ptlrpc_expire_one_request()) @@@ Request  sent has failed due to network error: [sent 1315908792/real 1315908792]  req@ffff8800ae432800 x1379823408037596/t0(0) o8->lustre-OST0002-osc-ffff880315e7c400@192.168.4.134@o2ib:28/4 lens 368/512 e 0 to 1 dl 1315908818 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Sep 13 03:13:12 client-12 kernel: Lustre: 2186:0:(client.c:1773:ptlrpc_expire_one_request()) Skipped 135 previous similar messages
Sep 13 03:13:21 client-12 kernel: LustreError: 8828:0:(vvp_io.c:1001:vvp_io_commit_write()) Write page 654963 of inode ffff88028e371bf8 failed -28

[root@client-12-ib ~]# lfs df -h
UUID                       bytes        Used   Available Use% Mounted on
lustre-MDT0000_UUID         7.2G      435.4M        6.2G   6% /mnt/lustre[MDT:0]
lustre-OST0000_UUID        15.2G      440.1M       14.0G   3% /mnt/lustre[OST:0]
lustre-OST0001_UUID        15.2G      440.1M       14.0G   3% /mnt/lustre[OST:1]
lustre-OST0002_UUID        15.2G      440.1M       14.0G   3% /mnt/lustre[OST:2]
lustre-OST0003_UUID        15.2G      440.1M       14.0G   3% /mnt/lustre[OST:3]
lustre-OST0004_UUID        15.2G      440.1M       14.0G   3% /mnt/lustre[OST:4]
lustre-OST0005_UUID        15.2G      440.1M       14.0G   3% /mnt/lustre[OST:5]
lustre-OST0006_UUID        15.2G        6.6G        7.9G  45% /mnt/lustre[OST:6]

filesystem summary:       106.4G        9.1G       91.9G   9% /mnt/lustre

[root@client-12-ib ~]# lfs df -i
UUID                      Inodes       IUsed       IFree IUse% Mounted on
lustre-MDT0000_UUID      5000040          63     4999977   0% /mnt/lustre[MDT:0]
lustre-OST0000_UUID       236160          89      236071   0% /mnt/lustre[OST:0]
lustre-OST0001_UUID       236160          94      236066   0% /mnt/lustre[OST:1]
lustre-OST0002_UUID       236160          94      236066   0% /mnt/lustre[OST:2]
lustre-OST0003_UUID       236160          88      236072   0% /mnt/lustre[OST:3]
lustre-OST0004_UUID       236160          87      236073   0% /mnt/lustre[OST:4]
lustre-OST0005_UUID       236160          89      236071   0% /mnt/lustre[OST:5]
lustre-OST0006_UUID       236160         100      236060   0% /mnt/lustre[OST:6]

filesystem summary:      5000040          63     4999977   0% /mnt/lustre

[root@fat-amd-2-ib ~]# lctl dl
  0 UP mgc MGC192.168.4.132@o2ib 87b4991e-f76f-5234-3bfb-544ca6cfbb6f 5
  1 UP ost OSS OSS_uuid 3
  2 UP obdfilter lustre-OST0006 lustre-OST0006_UUID 19

[root@fat-amd-2-ib ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              20G  1.8G   17G  10% /
tmpfs                 7.8G     0  7.8G   0% /dev/shm
/dev/sde               16G  6.6G  7.9G  46% /mnt/ost7

[root@fat-amd-2-ib ~]# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sda1            1310720   42667 1268053    4% /
tmpfs                2041268       1 2041267    1% /dev/shm
/dev/sde              236160     100  236060    1% /mnt/ost7

Syslog on fat-amd-2-ib showed that:

Sep 13 03:12:59 fat-amd-2-ib kernel: INFO: task jbd2/sde-8:2487 blocked for more than 120 seconds.
Sep 13 03:12:59 fat-amd-2-ib kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 13 03:12:59 fat-amd-2-ib kernel: jbd2/sde-8    D 0000000000000004     0  2487      2 0x00000080
Sep 13 03:12:59 fat-amd-2-ib kernel: ffff88011a5a3c10 0000000000000046 0000000000000000 ffff88011ac4cc70
Sep 13 03:12:59 fat-amd-2-ib kernel: ffff88011a5a3b80 ffffffff81012979 ffff88011a5a3bc0 ffffffff81098d09
Sep 13 03:12:59 fat-amd-2-ib kernel: ffff880117e00678 ffff88011a5a3fd8 000000000000f598 ffff880117e00678
Sep 13 03:12:59 fat-amd-2-ib kernel: Call Trace:
Sep 13 03:12:59 fat-amd-2-ib kernel: [<ffffffff81012979>] ? read_tsc+0x9/0x20
Sep 13 03:12:59 fat-amd-2-ib kernel: [<ffffffff81098d09>] ? ktime_get_ts+0xa9/0xe0
Sep 13 03:12:59 fat-amd-2-ib kernel: [<ffffffff81098d09>] ? ktime_get_ts+0xa9/0xe0
Sep 13 03:12:59 fat-amd-2-ib kernel: [<ffffffff811a3de0>] ? sync_buffer+0x0/0x50
Sep 13 03:12:59 fat-amd-2-ib kernel: [<ffffffff814dc403>] io_schedule+0x73/0xc0
Sep 13 03:12:59 fat-amd-2-ib kernel: [<ffffffff811a3e20>] sync_buffer+0x40/0x50
Sep 13 03:12:59 fat-amd-2-ib kernel: [<ffffffff814dcc6f>] __wait_on_bit+0x5f/0x90
Sep 13 03:12:59 fat-amd-2-ib kernel: [<ffffffff811a3de0>] ? sync_buffer+0x0/0x50
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff814dcd18>] out_of_line_wait_on_bit+0x78/0x90
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff8108e1a0>] ? wake_bit_function+0x0/0x50
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff811a3dd6>] __wait_on_buffer+0x26/0x30
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa0922971>] jbd2_journal_commit_transaction+0x11c1/0x1530 [jbd2]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff810096e0>] ? __switch_to+0xd0/0x320
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff8107a17b>] ? try_to_del_timer_sync+0x7b/0xe0
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa0927b48>] kjournald2+0xb8/0x220 [jbd2]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa0927a90>] ? kjournald2+0x0/0x220 [jbd2]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff8108ddf6>] kthread+0x96/0xa0
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff8100c1ca>] child_rip+0xa/0x20
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff8108dd60>] ? kthread+0x0/0xa0
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
Sep 13 03:13:00 fat-amd-2-ib kernel: INFO: task ll_ost_io_89:2736 blocked for more than 120 seconds.
Sep 13 03:13:00 fat-amd-2-ib kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 13 03:13:00 fat-amd-2-ib kernel: ll_ost_io_89  D 0000000000000004     0  2736      2 0x00000080
Sep 13 03:13:00 fat-amd-2-ib kernel: ffff88041158b790 0000000000000046 0000000000000000 ffff880217fa8aa0
Sep 13 03:13:00 fat-amd-2-ib kernel: ffff88041158b710 ffffffff81247e8f 0000000000000001 0000000100237e3a
Sep 13 03:13:00 fat-amd-2-ib kernel: ffff880411589a78 ffff88041158bfd8 000000000000f598 ffff880411589a78
Sep 13 03:13:00 fat-amd-2-ib kernel: Call Trace:
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff81247e8f>] ? __generic_unplug_device+0x1f/0x40
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff814dc995>] schedule_timeout+0x215/0x2e0
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff814dc613>] wait_for_common+0x123/0x180
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff8105dc60>] ? default_wake_function+0x0/0x20
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff81249a5d>] ? submit_bio+0x8d/0x120
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff814dc72d>] wait_for_completion+0x1d/0x20
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff8124cedd>] __blkdev_issue_flush+0xad/0xe0
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff8124cf26>] blkdev_issue_flush+0x16/0x20
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa094a8ab>] ldiskfs_sync_file+0x17b/0x250 [ldiskfs]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa0a1c8f5>] filter_sync+0x285/0x3e0 [obdfilter]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa05b56d6>] ? _ldlm_lock_debug+0x446/0x680 [ptlrpc]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa046998c>] ? lprocfs_counter_add+0x12c/0x196 [lvfs]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa09d6e0c>] ost_blocking_ast+0x58c/0xe40 [ost]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa05b8b73>] ldlm_cancel_callback+0x63/0xf0 [ptlrpc]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa05b8c59>] ldlm_lock_cancel+0x59/0x190 [ptlrpc]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa05dae36>] ldlm_request_cancel+0x256/0x420 [ptlrpc]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa09e0720>] ost_handle+0x3d60/0x4b90 [ost]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa04066b1>] ? libcfs_debug_vmsg2+0x4d1/0xb50 [libcfs]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa05fee94>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa06015fc>] ? lustre_msg_get_status+0x3c/0xa0 [ptlrpc]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa060fb4e>] ptlrpc_main+0xb8e/0x1900 [ptlrpc]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa060efc0>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff8100c1ca>] child_rip+0xa/0x20
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa060efc0>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffffa060efc0>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
Sep 13 03:13:00 fat-amd-2-ib kernel: [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
<~snip~>

Maloo report: https://maloo.whamcloud.com/test_sets/64ba1fae-ddfd-11e0-9909-52540025f9af

Please find /scratch/logs/2.1.0/recovery-oss-scale.1315909046.log.tar.bz2 on node brent.whamcloud.com for more logs.

Comment by Jian Yu [ 19/Sep/11 ]

Lustre Tag: v2_1_0_0_RC2
Lustre Build: http://newbuild.whamcloud.com/job/lustre-master/283/
e2fsprogs Build: http://newbuild.whamcloud.com/job/e2fsprogs-master/54/
Distro/Arch: RHEL6/x86_64(in-kernel OFED, kernel version: 2.6.32-131.6.1.el6.x86_64)
ENABLE_QUOTA=yes
FAILURE_MODE=HARD
FLAVOR=MDS

After running for about 7 hours (the MDS failed over 29 times), the recovery-mds-scale test failed with the same issue:
https://maloo.whamcloud.com/test_sets/b3f0c344-e32e-11e0-9909-52540025f9af

/tmp/recovery-mds-scale.log_run_dd.sh-client-12-ib.debug:

2011-09-19 10:16:49: dd succeeded
+ cd /tmp
+ rm -rf /mnt/lustre/d0.dd-client-12-ib
++ date '+%F %H:%M:%S'
+ echoerr '2011-09-19 10:16:51: dd run finished'
+ echo '2011-09-19 10:16:51: dd run finished'
2011-09-19 10:16:51: dd run finished
+ '[' '!' -e /home/yujian/test_logs/end_run_file ']'
+ true
++ date '+%F %H:%M:%S'
+ echoerr '2011-09-19 10:16:51: dd run starting'
+ echo '2011-09-19 10:16:51: dd run starting'
2011-09-19 10:16:51: dd run starting
+ mkdir -p /mnt/lustre/d0.dd-client-12-ib
+ cd /mnt/lustre/d0.dd-client-12-ib
+ load_pid=17460
+ wait 17460
+ dd bs=4k count=1000000 status=noxfer if=/dev/zero of=/mnt/lustre/d0.dd-client-12-ib/dd-file
dd: writing `/mnt/lustre/d0.dd-client-12-ib/dd-file': No space left on device
589960+0 records in
589959+0 records out
+ '[' 1 -eq 0 ']'
++ date '+%F %H:%M:%S'
+ echoerr '2011-09-19 10:24:03: dd failed'
+ echo '2011-09-19 10:24:03: dd failed'
2011-09-19 10:24:03: dd failed

Syslog on the client node client-12-ib:

Sep 19 10:17:48 client-12 kernel: LustreError: 11-0: an error occurred while communicating with 192.168.4.134@o2ib. The ost_write operation failed with -28
Sep 19 10:17:48 client-12 kernel: LustreError: 17460:0:(vvp_io.c:1001:vvp_io_commit_write()) Write page 589959 of inode ffff88029f2b7cb8 failed -28

Please refer to the attached recovery-mds-scale.1316453521.log.tar.bz2 for more logs.

Comment by Jian Yu [ 13/Feb/12 ]

Lustre Tag: v2_1_1_0_RC2
Lustre Build: http://build.whamcloud.com/job/lustre-b2_1/41/
Distro/Arch: RHEL6/x86_64 (kernel version: 2.6.32-220.el6)
Network: TCP (1GigE)
FAILURE_MODE=HARD
FLAVOR=MDS

The same issue occurred: https://maloo.whamcloud.com/test_sets/8384ecca-55b2-11e1-9aa8-5254004bbbd3

Comment by Jian Yu [ 13/Jun/12 ]

Another instance: https://maloo.whamcloud.com/test_sets/92648b78-b3d7-11e1-8808-52540035b04c

Comment by Jian Yu [ 13/Jun/12 ]

Another instance: https://maloo.whamcloud.com/test_sets/f09a5f58-b3bc-11e1-a2dd-52540035b04c

Comment by Peter Jones [ 18/Jul/12 ]

Hongchao

Could you please look into this one?

Thanks

Peter

Comment by Hongchao Zhang [ 23/Jul/12 ]

There was no obvious error in the logs, and obd_osfs was updated by fsfilt_statfs before the -ENOSPC.
I checked the recent occurrences of this bug in Maloo, but the log of the corresponding OST had been overwritten
by the extra logs produced after the -ENOSPC error was triggered on the OST.

The OST could actually have been full at the time if the loads running on the clients all used the same OST;
I will create a debug patch to produce more logs to check whether that is the case.

Comment by Hongchao Zhang [ 23/Jul/12 ]

the possible patch is tracked at http://review.whamcloud.com/#change,3446

Comment by Hongchao Zhang [ 26/Jul/12 ]

This issue occurs after the client received OBD_FL_NO_USRQUOTA or OBD_FL_NO_GRPQUOTA. Is recovery-mds-scale run under a user ID with a quota limit?

Comment by Jian Yu [ 01/Aug/12 ]

This issue occurs after the client received OBD_FL_NO_USRQUOTA or OBD_FL_NO_GRPQUOTA. Is recovery-mds-scale run under a user ID with a quota limit?

The test was run by the root user.

Comment by Peter Jones [ 09/Aug/12 ]

The proposed patch has been landed for 2.3. Closing this ticket for now but we should reopen it if it transpires that the issue fixed was not the one reported and this problem still occurs on the next tag.

Comment by Alex Zhuravlev [ 16/Aug/12 ]

just got this on the master:

Lustre: DEBUG MARKER: == sanity test 49: Change max_pages_per_rpc won't break osc extent =================================== 07:44:43 (1345088683)
format at filter_io.c:786:filter_preprw_write doesn't end in newline
LustreError: 15266:0:(ost_handler.c:2240:ost_handle()) ASSERTION( get_current()->journal_info == ((void *)0) ) failed:
LustreError: 15266:0:(ost_handler.c:2240:ost_handle()) LBUG

filter_io.c:786: CDEBUG(D_INODE, "retry after commit pending journals");

Comment by Hongchao Zhang [ 16/Aug/12 ]

there is a bug in the merged patch

Comment by Hongchao Zhang [ 16/Aug/12 ]

Thanks bzzz!

Sorry, there is indeed a bug in the merged patch; the fix will be pushed soon.

Comment by Hongchao Zhang [ 16/Aug/12 ]

the fix patch is tracked at http://review.whamcloud.com/#change,3692

Comment by Jian Yu [ 20/Aug/12 ]

Lustre Branch: b2_1
Lustre Build: http://build.whamcloud.com/job/lustre-b2_1/117

Another failure instance: https://maloo.whamcloud.com/test_sets/6533657c-ea21-11e1-b794-52540035b04c

Comment by Peter Jones [ 23/Aug/12 ]

Landed for 2.3 and 2.4

Comment by Jian Yu [ 05/Sep/12 ]

Lustre Build: http://build.whamcloud.com/job/lustre-b2_3/12

The issue still exists:
https://maloo.whamcloud.com/test_sets/5daabc00-f760-11e1-8b95-52540035b04c

Comment by Hongchao Zhang [ 06/Sep/12 ]

Hi YuJian,
Could you please help reproduce this issue with D_QUOTA added to PTLDEBUG? It is suspected that it could be related to quota. Thanks!

Comment by Jian Yu [ 06/Sep/12 ]

Could you please help reproduce this issue with D_QUOTA added to PTLDEBUG? It is suspected that it could be related to quota. Thanks!

Hi Chris,

Could you please set the following PTLDEBUG value in autotest_config.sh to run the hard failover tests?

PTLDEBUG="vfstrace rpctrace dlmtrace neterror ha config ioctl super quota"

I could not reproduce this issue in manual test runs.

Comment by Chris Gearing (Inactive) [ 06/Sep/12 ]

Added to b2_3 failover tests

Comment by Jian Yu [ 09/Sep/12 ]

Added to b2_3 failover tests

Thanks Chris.

Hi Hongchao,
The issue was reproduced in autotest run:
https://maloo.whamcloud.com/test_sets/7eca53cc-f92d-11e1-a1b8-52540035b04c

Comment by Hongchao Zhang [ 11/Sep/12 ]

An autotest run has been queued to verify whether this issue is fixed or not after the patch (http://review.whamcloud.com/#change,3913) was merged, which fixed a bug in the previous patch on this ticket.

Comment by Jian Yu [ 13/Sep/12 ]

Lustre Build: http://build.whamcloud.com/job/lustre-b2_3/17

https://maloo.whamcloud.com/test_sets/22a91122-fd70-11e1-afe5-52540035b04c
https://maloo.whamcloud.com/test_sets/28a35b3a-fd72-11e1-a1b4-52540035b04c

The issue still occurred.

Comment by Hongchao Zhang [ 13/Sep/12 ]

the debug patch is at http://review.whamcloud.com/#change,3979

Hi Chris,
Could you please queue an autotest (only b2_3 failover tests) against the build with this debug patch? Thanks!

Comment by Chris Gearing (Inactive) [ 13/Sep/12 ]

You are able to do this using the Test-Requirements line described on the wiki.

You will need something like this:

Test-Requirements: fortestonly envdefinitions="SLOW=yes" clientcount=4 osscount=2 mdscount=2 austeroptions=-R failover=true useiscsi=true testgroup=failover

I should probably introduce a class that allows something like that to be done more easily; this example is on the wiki page.

Comment by Jian Yu [ 13/Sep/12 ]

Hi Chris and Hongchao,
Please also add testlist=recovery-mds-scale so that only this test in the failover group is run; otherwise it will take a very long time to finish the whole group.

Comment by Jian Yu [ 15/Sep/12 ]

Hi Hongchao,
I applied the debug patch on b2_3 branch: http://review.whamcloud.com/#change,4002.

Comment by Jian Yu [ 16/Sep/12 ]

I applied the debug patch on b2_3 branch: http://review.whamcloud.com/#change,4002.

Chris, could you please take a look at the above change? The recovery-mds-scale test somehow was not run.

Comment by Chris Gearing (Inactive) [ 17/Sep/12 ]

There is a bug in autotest, which I have tried to fix on the fly. You might want to try it again.

The bug was that austeroptions was defined as nil and so had no type, meaning the parser did not know what to do with the option.

I fixed this by defaulting austeroptions to "-v", so it now has a type.

Comment by Chris Gearing (Inactive) [ 17/Sep/12 ]

I've restarted this but don't expect it to work; I've not actually been successful with the long set of Test-Requirements. In fact, it's possible I've broken things by trying to get it to work.

Comment by Hongchao Zhang [ 17/Sep/12 ]

The recovery-mds-scale test ran. Thanks Chris and Yujian!

From the output, the OST indeed had no space:
https://maloo.whamcloud.com/test_logs/9a8991da-00e5-11e2-860a-52540035b04c
...
LustreError: 9656:0:(filter_io.c:787:filter_preprw_write()) retry after commit pending journals
LustreError: 9656:0:(filter_io.c:820:filter_preprw_write()) lustre-OST0001: cli 70f1591f-2616-ae38-9797-12038e2d355c/ffff880037547c00 free: 246771712 avail: 37392384 grant 37253120 left: 4096 pending: 0 osfs_age 4295175522, current 4295175522
...

Hi Chris, what is the disk size of "/dev/dm-1" on fat-intel-3vm4 (the OSS node)? Is it 2G? (I just tried to check it, but it had already started running another test.) Also, 'dd' hit -ENOSPC (-28) at about 3.5G (897749*4096); is the stripe_count 2?

Comment by Chris Gearing (Inactive) [ 20/Sep/12 ]

We really need to make this info easier to access; we have a ticket about this.

I went and found the original logs, and the values are:

MDSSIZE=2097152
OSTSIZE=4089446

I don't specify the stripe count.

config was this:

09:00:57:cat /root/autotest_config.sh
09:00:57:#!/bin/bash
09:00:57:#Auto Generated By Whamcloud Autotest
09:00:57:#Key Exports
09:00:57:export mgs_HOST=fat-intel-3vm3
09:00:58:export mds_HOST=fat-intel-3vm3
09:00:58:export MGSDEV=/dev/lvm-MDS/P1
09:00:58:export MDSDEV=/dev/lvm-MDS/P1
09:00:58:export mds1_HOST=fat-intel-3vm3
09:00:58:export MDSDEV1=/dev/lvm-MDS/P1
09:00:58:export MDSCOUNT=1
09:00:58:export MDSSIZE=2097152
09:00:58:export MGSSIZE=2097152
09:00:58:export MDSFSTYPE=ldiskfs
09:00:58:export MGSFSTYPE=ldiskfs
09:00:58:export mdsfailover_HOST=fat-intel-3vm7
09:00:58:export mds1failover_HOST=fat-intel-3vm7
09:00:58:export MGSNID=fat-intel-3vm3:fat-intel-3vm7
09:00:58:export FAILURE_MODE=HARD
09:00:58:export POWER_DOWN="pm -h powerman --off"
09:00:58:export POWER_UP="pm -h powerman --on"
09:00:58:export ost_HOST=fat-intel-3vm4
09:00:58:export ostfailover_HOST=fat-intel-3vm8
09:00:58:export ost1_HOST=fat-intel-3vm4
09:00:58:export OSTDEV1=/dev/lvm-OSS/P1
09:00:58:export ost1failover_HOST=fat-intel-3vm8
09:00:58:export ost2_HOST=fat-intel-3vm4
09:00:58:export OSTDEV2=/dev/lvm-OSS/P2
09:00:58:export ost2failover_HOST=fat-intel-3vm8
09:00:58:export ost3_HOST=fat-intel-3vm4
09:00:58:export OSTDEV3=/dev/lvm-OSS/P3
09:00:58:export ost3failover_HOST=fat-intel-3vm8
09:00:58:export ost4_HOST=fat-intel-3vm4
09:00:58:export OSTDEV4=/dev/lvm-OSS/P4
09:00:58:export ost4failover_HOST=fat-intel-3vm8
09:00:58:export ost5_HOST=fat-intel-3vm4
09:00:58:export OSTDEV5=/dev/lvm-OSS/P5
09:00:58:export ost5failover_HOST=fat-intel-3vm8
09:00:58:export ost6_HOST=fat-intel-3vm4
09:00:58:export OSTDEV6=/dev/lvm-OSS/P6
09:00:58:export ost6failover_HOST=fat-intel-3vm8
09:00:58:export ost7_HOST=fat-intel-3vm4
09:00:58:export OSTDEV7=/dev/lvm-OSS/P7
09:00:58:export ost7failover_HOST=fat-intel-3vm8
09:00:58:# some setup for conf-sanity test 24a, 24b, 33a
09:00:58:export fs2mds_DEV=/dev/lvm-MDS/S1
09:00:58:export fs2ost_DEV=/dev/lvm-OSS/S1
09:00:58:export fs3ost_DEV=/dev/lvm-OSS/S2
09:00:58:export RCLIENTS="fat-intel-3vm6 fat-intel-3vm5"
09:00:58:export OSTCOUNT=7
09:00:58:export NETTYPE=tcp
09:00:59:export OSTSIZE=4089446
09:00:59:export OSTFSTYPE=ldiskfs
09:00:59:export FSTYPE=ldiskfs
09:00:59:export SHARED_DIRECTORY=/home/autotest/.autotest/shared_dir/2012-09-17/080224-7f44ee54dbb8
09:00:59:export SLOW=yes
09:00:59:VERBOSE=true
09:00:59:
09:00:59:# Entries above here come are created by configurecluster.rb
09:00:59:# Entries below here come from mecturk.h
09:00:59:FSNAME=lustre
09:00:59:
09:00:59:TMP=${TMP:-/tmp}
09:00:59:
09:00:59:DAEMONSIZE=${DAEMONSIZE:-500}
09:00:59:
09:00:59:MDSOPT=${MDSOPT:-""}
09:00:59:MGSOPT=${MGSOPT:-""}
09:00:59:
09:00:59:# sgpdd-survey requires these to be set. They apprarently have no side affect.
09:00:59:SGPDD_YES=true
09:00:59:REFORMAT=true
09:00:59:
09:00:59:# some bits for liblustre tcp connecttions
09:00:59:export LNET_ACCEPT_PORT=7988
09:00:59:export ACCEPTOR_PORT=7988
09:00:59:
09:00:59:OSTOPT=${OSTOPT:-""}
09:00:59:
09:00:59:STRIPE_BYTES=${STRIPE_BYTES:-1048576}
09:00:59:STRIPES_PER_OBJ=${STRIPES_PER_OBJ:-0}
09:00:59:SINGLEMDS=${SINGLEMDS:-"mds1"}
09:00:59:TIMEOUT=${TIMEOUT:-20}
09:00:59:PTLDEBUG=${PTLDEBUG:-0x33f0404}
09:00:59:DEBUG_SIZE=${DEBUG_SIZE:-32}
09:00:59:SUBSYSTEM=${SUBSYSTEM:- 0xffb7e3ff}
09:00:59:
09:00:59:MKFSOPT=""
09:00:59:MOUNTOPT=""
09:00:59:[ "x$MDSJOURNALSIZE" != "x" ] &&
09:00:59: MKFSOPT=$MKFSOPT" -J size=$MDSJOURNALSIZE"
09:00:59:[ "x$MDSISIZE" != "x" ] &&
09:00:59: MKFSOPT=$MKFSOPT" -i $MDSISIZE"
09:00:59:[ "x$MKFSOPT" != "x" ] &&
09:00:59: MKFSOPT="--mkfsoptions=\\\"$MKFSOPT\\\""
09:00:59:[ "x$MDSCAPA" != "x" ] &&
09:00:59: MKFSOPT="--param mdt.capa=$MDSCAPA"
09:00:59:[ "$MDSFSTYPE" = "ldiskfs" ] &&
09:00:59: MDSOPT=$MDSOPT" --mountfsoptions=errors=remount-ro,iopen_nopriv,user_xattr,acl"
09:00:59:[ "x$mdsfailover_HOST" != "x" ] &&
09:00:59: MDSOPT=$MDSOPT" --failnode=`h2$NETTYPE $mdsfailover_HOST`"
09:00:59:[ "x$STRIPE_BYTES" != "x" ] &&
09:00:59: MOUNTOPT=$MOUNTOPT" --param lov.stripesize=$STRIPE_BYTES"
09:00:59:[ "x$STRIPES_PER_OBJ" != "x" ] &&
09:00:59: MOUNTOPT=$MOUNTOPT" --param lov.stripecount=$STRIPES_PER_OBJ"
09:00:59:[ "x$L_GETIDENTITY" != "x" ] &&
09:00:59: MOUNTOPT=$MOUNTOPT" --param mdt.identity_upcall=$L_GETIDENTITY"
09:00:59:
09:00:59:MDS_MKFS_OPTS="--mdt --fsname=$FSNAME $MKFSOPT $MDSOPT"
09:00:59:[ "$MDSFSTYPE" = "ldiskfs" ] &&
09:00:59: MDS_MKFS_OPTS=$MDS_MKFS_OPTS" --param sys.timeout=$TIMEOUT --device-size=$MDSSIZE"
09:00:59:[ "$MDSFSTYPE" = "zfs" ] &&
09:00:59: MDS_MKFS_OPTS=$MDS_MKFS_OPTS" --vdev-size=$MDSSIZE"
09:00:59:
09:00:59:if combined_mgs_mds ; then
09:00:59: [ "$MDSCOUNT" = "1" ] &&
09:00:59: MDS_MKFS_OPTS="--mgs $MDS_MKFS_OPTS"
09:00:59:else
09:00:59: MDS_MKFS_OPTS="--mgsnode=$MGSNID $MDS_MKFS_OPTS"
09:00:59: [ "$MGSFSTYPE" = "ldiskfs" ] &&
09:00:59: MGS_MKFS_OPTS="--mgs --device-size=$MGSSIZE"
09:00:59: [ "$MGSFSTYPE" = "zfs" ] &&
09:00:59: MGS_MKFS_OPTS="--mgs --vdev-size=$MGSSIZE"
09:00:59:fi
09:00:59:
09:00:59:if [ "$MDSDEV1" != "$MGSDEV" ]; then
09:00:59: if [ "$MGSFSTYPE" == "ldiskfs" ]; then
09:00:59: MGS_MOUNT_OPTS=${MGS_MOUNT_OPTS:-"-o loop"}
09:00:59: else
09:00:59: MGS_MOUNT_OPTS=${MGS_MOUNT_OPTS:-""}
09:00:59: fi
09:00:59:else
09:00:59: MGS_MOUNT_OPTS=${MGS_MOUNT_OPTS:-$MDS_MOUNT_OPTS}
09:00:59:fi
09:00:59:
09:00:59:MKFSOPT=""
09:00:59:MOUNTOPT=""
09:00:59:[ "x$OSTJOURNALSIZE" != "x" ] &&
09:00:59: MKFSOPT=$MKFSOPT" -J size=$OSTJOURNALSIZE"
09:00:59:[ "x$MKFSOPT" != "x" ] &&
09:00:59: MKFSOPT="--mkfsoptions=\\\"$MKFSOPT\\\""
09:00:59:[ "x$OSSCAPA" != "x" ] &&
09:00:59: MKFSOPT="--param ost.capa=$OSSCAPA"
09:00:59:[ "x$ostfailover_HOST" != "x" ] &&
09:00:59: OSTOPT=$OSTOPT" --failnode=`h2$NETTYPE $ostfailover_HOST`"
09:00:59:
09:00:59:OST_MKFS_OPTS="--ost --fsname=$FSNAME --mgsnode=$MGSNID $MKFSOPT $OSTOPT"
09:00:59:[ "$OSTFSTYPE" = "ldiskfs" ] &&
09:00:59: OST_MKFS_OPTS=$OST_MKFS_OPTS" --param sys.timeout=$TIMEOUT --device-size=$OSTSIZE"
09:00:59:[ "$OSTFSTYPE" = "zfs" ] &&
09:00:59: OST_MKFS_OPTS=$OST_MKFS_OPTS" --vdev-size=$OSTSIZE"
09:01:00:
09:01:00:MDS_MOUNT_OPTS=${MDS_MOUNT_OPTS:-"-o user_xattr,acl"}
09:01:00:OST_MOUNT_OPTS=${OST_MOUNT_OPTS:-""}
09:01:00:
09:01:00:# TT-430
09:01:00:SERVER_FAILOVER_PERIOD=$((60 * 15))
09:01:00:
09:01:00:#RUNAS_ID=840000017
09:01:00:#client
09:01:00:MOUNT=${MOUNT:-/mnt/${FSNAME}}
09:01:00:MOUNT1=${MOUNT1:-$MOUNT}
09:01:00:MOUNT2=${MOUNT2:-${MOUNT}2}
09:01:00:MOUNTOPT=${MOUNTOPT:-"-o user_xattr,acl,flock"}
09:01:00:[ "x$RMTCLIENT" != "x" ] &&
09:01:00: MOUNTOPT=$MOUNTOPT",remote_client"
09:01:00:DIR=${DIR:-$MOUNT}
09:01:00:DIR1=${DIR:-$MOUNT1}
09:01:00:DIR2=${DIR2:-$MOUNT2}
09:01:00:
09:01:00:if [ $UID -ne 0 ]; then
09:01:00: log "running as non-root uid $UID"
09:01:00: RUNAS_ID="$UID"
09:01:00: RUNAS_GID=`id -g $USER`
09:01:00: RUNAS=""
09:01:00:else
09:01:00: RUNAS_ID=${RUNAS_ID:-500}
09:01:00: RUNAS_GID=${RUNAS_GID:-$RUNAS_ID}
09:01:00: RUNAS=${RUNAS:-"runas -u $RUNAS_ID"}
09:01:00:fi
09:01:00:
09:01:00:PDSH="pdsh -t 120 -S -Rrsh -w"
09:01:00:export RSYNC_RSH=rsh
09:01:00:FAILURE_MODE=${FAILURE_MODE:-SOFT} # or HARD
09:01:00:POWER_DOWN=${POWER_DOWN:-"powerman --off"}
09:01:00:POWER_UP=${POWER_UP:-"powerman --on"}
09:01:00:SLOW=${SLOW:-no}
09:01:00:FAIL_ON_ERROR=${FAIL_ON_ERROR:-true}
09:01:00:
09:01:01:# error: conf_param: No such device" issue in every test suite logs
09:01:01:# sanity-quota test_32 hash_lqs_cur_bits isn't set properly
09:01:01:QUOTA_TYPE=${QUOTA_TYPE:-"ug3"}
09:01:01:QUOTA_USERS=${QUOTA_USERS:-"quota_usr quota_2usr sanityusr sanityusr1"}
09:01:01:LQUOTAOPTS=${LQUOTAOPTS:-"hash_lqs_cur_bits=3"}
09:01:01:
09:01:01:# SKIP: parallel-scale test_compilebench compilebench not found
09:01:01:# SKIP: parallel-scale test_connectathon connectathon dir not found
09:01:01:# ------
09:01:01:cbench_DIR=/usr/bin
09:01:01:cnt_DIR=/opt/connectathon
09:01:01:
09:01:01:MPIRUN=$(which mpirun 2>/dev/null) || true
09:01:01:MPIRUN_OPTIONS="-mca boot ssh"
09:01:01:MPI_USER=${MPI_USER:-mpiuser}
09:01:01:SINGLECLIENT=$(hostname)
09:01:01:#cbench_DIR=/data/src/benchmarks/compilebench.hg
09:01:01:#cnt_DIR=/data/src/benchmarks/cthon04
09:01:01:
09:01:01:# For multiple clients testing, we need use the cfg/ncli.sh config file, and
09:01:01:# only need specify the "RCLIENTS" variable. The "CLIENTS" and "CLIENTCOUNT"
09:01:01:# variables are defined in init_clients_lists(), which is called from cfg/ncli.sh.
09:01:01:# So, if we add the contents of cfg/ncli.sh into autotest_config.sh, we would not
09:01:01:# need specify "CLIENTS" and "CLIENTCOUNT", and the above two issues (#3 and #4) would also be fixed.
09:01:01:# Start of contents of cfg/ncli.sh
09:01:01:CLIENT1=${CLIENT1:-`hostname`}
09:01:01:SINGLECLIENT=$CLIENT1
09:01:01:RCLIENTS=${RCLIENTS:-""}
09:01:01:
09:01:01:init_clients_lists
09:01:01:
09:01:01:[ -n "$RCLIENTS" -a "$PDSH" = "no_dsh" ] && \
09:01:01: error "tests for remote clients $RCLIENTS needs pdsh != do_dsh " || true
09:01:01:
09:01:01:[ -n "$FUNCTIONS" ] && . $FUNCTIONS || true
09:01:01:
09:01:01:# for recovery scale tests
09:01:01:# default boulder cluster iozone location
09:01:01:export PATH=/opt/iozone/bin:$PATH
09:01:01:
09:01:01:LOADS=${LOADS:-"dd tar dbench iozone"}
09:01:01:for i in $LOADS; do
09:01:01: [ -f $LUSTRE/tests/run_${i}.sh ] || \
09:01:01: error "incorrect load: $i"
09:01:01:done
09:01:01:CLIENT_LOADS=($LOADS)
09:01:01:# End of contents of cfg/ncli.sh
09:01:01:

Comment by Hongchao Zhang [ 21/Sep/12 ]

This explains the issue: the default stripe count is 1 and the OSTSIZE is 4G, and the -ENOSPC occurs right at writing 3.5G, with about 0.3G of disk space remaining (free: 246771712 avail: 37392384 grant 37253120 left: 4096 pending: 0).

So this is actually expected behavior, and it can be fixed by modifying recovery-mds-scale to increase the default stripe count or to enlarge the OST size.
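As a minimal illustration of the first option (the actual change landed via the patch below and may differ), the load script could stripe its working directory over all OSTs before writing, so that a single large dd file is not limited to one OST's free space; $TESTDIR here is the per-client load directory from run_dd.sh:

lfs setstripe -c -1 $TESTDIR   # -1 = stripe new files across all available OSTs
dd bs=4k count=1000000 status=noxfer if=/dev/zero of=$TESTDIR/dd-file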

Comment by Hongchao Zhang [ 21/Sep/12 ]

the patch is tracked at http://review.whamcloud.com/#change,4064

Comment by Jian Yu [ 21/Sep/12 ]

I went and found the original logs, and the values are:

MDSSIZE=2097152
OSTSIZE=4089446

I just checked the run_dd.sh script and found that it always creates/deletes a 4000000k file:

dd bs=4k count=1000000 status=noxfer if=/dev/zero of=$TESTDIR/dd-file 1>$LOG &

An OSTSIZE of 4089446k is not enough for a 4000000k file with stripe count 0 (i.e. the default single stripe).
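For reference, dd writes 1,000,000 x 4k = 4,000,000k (roughly 3.81 GiB), while an OSTSIZE of 4089446k is only about 3.90 GiB, so with a single stripe the dd-file alone leaves well under 100 MB of headroom on that OST before the ldiskfs journal, metadata and grant overhead are even counted.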

I just updated http://review.whamcloud.com/#change,4002 with "ostsizegb=6" to see whether the recovery-mds-scale test will hit the out of space error or not.

Comment by Peter Jones [ 22/Sep/12 ]

Dropping priority as this is a test-only issue. Once the test correction is finalized we should land it to master, and we may also land it to b2_3 if we have another RC.

Comment by Jian Yu [ 22/Sep/12 ]

I just updated http://review.whamcloud.com/#change,4002 with "ostsizegb=6" to see whether the recovery-mds-scale test will hit the out of space error or not.

Hi Chris,
Could you please check whether the "ostsizegb" test parameter works or not? In the latest run for the above patch, although I set "ostsizegb=6", I found that "OSTSIZE=4089446" was still used.

Comment by Jian Yu [ 25/Sep/12 ]

Lustre Version: v2_3_0_RC1
Lustre Build: http://build.whamcloud.com/job/lustre-b2_3/24
Distro/Arch: RHEL5.8/x86_64(client), RHEL6.3/x86_64(server)

The recovery-mds-scale test failed with the out-of-space issue again:
https://maloo.whamcloud.com/test_sets/50d94020-068b-11e2-9e80-52540035b04c

Comment by Hongchao Zhang [ 25/Sep/12 ]

what is the OSTSIZE for this test? the dd stopped at about 1.7G.

Comment by Jian Yu [ 25/Sep/12 ]

what is the OSTSIZE for this test? the dd stopped at about 1.7G.

From the above Maloo report, we can enter "go to session" and find the autotest configuration info in lustre-initialization-1:

export OSTSIZE=2097152

Comment by Hongchao Zhang [ 25/Sep/12 ]

Oh, thanks!

OSTSIZE is 2G, so it is the same situation of insufficient disk space on the OST.

Comment by Jian Yu [ 06/Nov/12 ]

Marking this ticket as a blocker since it's blocking the recovery-*-scale tests.

Comment by Hongchao Zhang [ 06/Nov/12 ]

How about adapting the data size in 'dd' to the actual free space in Lustre, then? I will try to create a patch for it.
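A minimal sketch of that idea (hypothetical, not the actual patch at http://review.whamcloud.com/#change,4599): run_dd.sh could size the dd file from the space currently reported as available instead of using a fixed count=1000000. The awk field index and the choice of writing at most half of the free space are assumptions here; $TESTDIR is the load directory from run_dd.sh.

# available space in 1k blocks on the Lustre filesystem holding $TESTDIR
AVAIL_KB=$(lfs df $TESTDIR | awk '/filesystem summary/ { print $5 }')
# write at most half of what is currently free, expressed in 4k blocks
COUNT=$(( AVAIL_KB / 2 / 4 ))
dd bs=4k count=$COUNT status=noxfer if=/dev/zero of=$TESTDIR/dd-file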

Comment by Hongchao Zhang [ 12/Nov/12 ]

Status update

This should be a TT ticket; the patch is under creation and test, and I will attach it soon.

Comment by Hongchao Zhang [ 16/Nov/12 ]

the patch is tracked at http://review.whamcloud.com/#change,4599

Comment by Hongchao Zhang [ 10/Dec/12 ]

While testing the patch in a loop (with small OSTs), I found that it will fail with -ENOSPC after several loops, and it seems to be related to grant...
I will spend some more time tracing where the problem is.

By the way, this is not the same issue as this ticket.

Comment by Hongchao Zhang [ 16/Dec/12 ]

There is a bug in osd_trans_stop, which uses thandle->th_sync after "dt_txn_hook_stop", which could change the value:
int ofd_txn_stop_cb(const struct lu_env *env, struct thandle *txn,
                    void *cookie)
{
        ...

        /* if can't add callback, do sync write */
        txn->th_sync = !!tgt_last_commit_cb_add(txn, &ofd->ofd_lut,
                                                info->fti_exp,
                                                info->fti_transno);

        return ofd_last_rcvd_update(info, txn);
}

then the transactions won't be committed, which caused the issue (the client can't use just-deleted disk space, LU-456)

Comment by Mikhail Pershin [ 17/Dec/12 ]

The transaction will be committed in any case, or did you mean 'won't be forced to commit'?
This is not a bug in osd_trans_stop; th_sync must be used exactly after the hook_stop call precisely because a hook may set the 'sync' flag. The bug is in ofd_txn_stop_cb(), which shouldn't overwrite th_sync but should set it using an 'OR' operation: txn->th_sync |= !!tgt_last_commit_cb_add(...).

Comment by Hongchao Zhang [ 20/Dec/12 ]

Yes, thanks, the patch has been updated accordingly.

Comment by Jodi Levi (Inactive) [ 15/Jan/13 ]

Patch landed. Closing ticket. Please reopen if more work is needed.

Generated at Sat Feb 10 01:09:10 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.