[LU-6088] racer test_1: dir_create.sh mutex deadlock in sys_open->do_lookup Created: 07/Jan/15  Updated: 02/May/23  Resolved: 23/Jan/15

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.7.0
Fix Version/s: Lustre 2.7.0

Type: Bug
Priority: Critical
Reporter: Maloo
Assignee: Di Wang
Resolution: Fixed
Votes: 0
Labels: HB

Issue Links:
Duplicate
is duplicated by LU-5936 lmv_merge_attr() and callees ignore i... Resolved
Related
is related to LU-6085 racer stuck on mutex_lock in ll_setat... Resolved
is related to LU-4712 racer test_1: oops at __d_lookup+0x8c Resolved
is related to LU-5285 mdt_reconstruct_setattr() calls mdt_a... Resolved
Severity: 3
Rank (Obsolete): 16947

 Description   

This issue was created by maloo for Andreas Dilger <andreas.dilger@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/9ec8122a-9608-11e4-af28-5254006e85c2.

The sub-test test_1 failed with the following logs on the client console:

INFO: task dir_create.sh:5686 was blocked for more than 120s
Call trace:
mutex_lock+0x2b/0x50
do_lookup+0x11b/0x230
__link_path_walk+0x200/0x1000
path_walk+0x6a/0xe0
do_filp_open+0x1fa/0xd20
do_sys_open+0x69/0x140
sys_open+0x20/0x30

It looks like this is only being hit when both the client and the server are running master (pre-2.7.0), so it is very likely related to DNE striped directories and is a regression on master (possibly due to the addition of a new racer test for striped directories?). Other combinations of 2.4/2.5/2.6/master clients and servers do not hit this problem.

It would be nice to get the LU-4712 patch http://review.whamcloud.com/9689 landed to clean up the DNE striped directory console messages, but this case doesn't have the client oops, just stuck threads.

Info required for matching: racer 1
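
To check whether another console log shows the same symptom as the hung-task report above, the reports can be counted per command name. This is only a rough sketch: the function name hung_task_summary and the sample log file are made up for illustration, and it assumes each report starts with a line of the form "INFO: task <comm>:<pid> ... blocked for more than ...", as in the excerpt above.

```shell
# Count hung-task reports per command name in a saved console log.
# hung_task_summary is a hypothetical helper, not part of the Lustre test suite.
# $1: path to a console log file.
hung_task_summary() {
    awk '/INFO: task [^ ]+:[0-9]+ .*blocked for more than/ {
             # The field after "task" is "<comm>:<pid>"; keep only <comm>.
             for (i = 1; i <= NF; i++)
                 if ($i == "task") { split($(i + 1), a, ":"); count[a[1]]++; break }
         }
         END { for (c in count) print count[c], c }' "$1" | sort -rn
}

# Demo on a sample line copied from the report above.
sample=$(mktemp)
echo 'INFO: task dir_create.sh:5686 was blocked for more than 120s' > "$sample"
hung_task_summary "$sample"
rm -f "$sample"
```

A log in which dir_create.sh dominates the summary would be a candidate match for this ticket, although the call trace still has to be compared by hand.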



 Comments   
Comment by Jinshan Xiong (Inactive) [ 07/Jan/15 ]

The same stack trace has been seen in LU-6085.

Comment by Di Wang [ 16/Jan/15 ]

Hmm, this trace caught my attention:

ls            S 0000000000000001     0 33694      1 0x00000080
 ffff8801d21fd2f8 0000000000000082 ffff8801d21fd288 ffffffffa092ac0c
 ffffffffa0a0b460 ffff8801e5393000 ffff8801d21fd268 ffffffffa093d4f5
 ffff8801d21fd2f8 ffffffffa0964af2 ffff8801e5ed25f8 ffff8801d21fdfd8
Call Trace:
 [<ffffffffa092ac0c>] ? ptlrpc_request_bufs_pack+0x5c/0x80 [ptlrpc]
 [<ffffffffa093d4f5>] ? lustre_msg_buf+0x55/0x60 [ptlrpc]
 [<ffffffffa0964af2>] ? __req_capsule_get+0x162/0x6d0 [ptlrpc]
 [<ffffffffa0941d40>] ? lustre_swab_mdt_body+0x0/0x140 [ptlrpc]
 [<ffffffffa06e8fe4>] obd_get_request_slot+0x1a4/0x280 [obdclass]
 [<ffffffff81064b90>] ? default_wake_function+0x0/0x20
 [<ffffffffa0ba11a5>] mdc_enqueue+0x275/0x1a40 [mdc]
 [<ffffffffa0b9f25b>] ? mdc_lock_match+0xbb/0x170 [mdc]
 [<ffffffffa0ba2b52>] mdc_intent_lock+0x1e2/0x5f9 [mdc]
 [<ffffffffa1174af0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
 [<ffffffffa0912840>] ? ldlm_completion_ast+0x0/0x9b0 [ptlrpc]
 [<ffffffffa0b59b32>] lmv_revalidate_slaves+0x482/0x1130 [lmv]
 [<ffffffffa1174af0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
 [<ffffffffa0b40a7a>] lmv_update_lsm_md+0x1a/0x20 [lmv]
 [<ffffffffa11562da>] ll_update_inode+0x134a/0x1e60 [lustre]
 [<ffffffffa0b5c3d1>] ? lmv_fld_lookup+0xf1/0x440 [lmv]
 [<ffffffff8129456a>] ? strlcpy+0x4a/0x60
 [<ffffffffa1156e78>] ll_read_inode2+0x88/0x470 [lustre]
 [<ffffffffa11720fb>] ll_iget+0x13b/0x3c0 [lustre]
 [<ffffffffa0b3e4b8>] ? lmv_get_lustre_md+0x88/0x300 [lmv]
 [<ffffffffa1164fe5>] ll_prep_inode+0x6c5/0xe80 [lustre]
 [<ffffffffa0929c4f>] ? ptlrpc_request_cache_free+0xbf/0x100 [ptlrpc]
 [<ffffffffa0b59064>] ? lmv_intent_remote+0x444/0xa90 [lmv]
 [<ffffffffa0941d40>] ? lustre_swab_mdt_body+0x0/0x140 [ptlrpc]
 [<ffffffffa11755d1>] ll_lookup_it_finish+0x2f1/0x11b0 [lustre]
 [<ffffffff811749e3>] ? kmem_cache_alloc_trace+0x1a3/0x1b0
 [<ffffffffa1171ed9>] ? ll_i2suppgid+0x19/0x30 [lustre]
 [<ffffffffa115748c>] ? ll_prep_md_op_data+0x22c/0x530 [lustre]
 [<ffffffffa1174af0>] ? ll_md_blocking_ast+0x0/0x7f0 [lustre]
 [<ffffffffa1176737>] ll_lookup_it+0x2a7/0x9a0 [lustre]
 [<ffffffffa1176eb9>] ll_lookup_nd+0x89/0x5e0 [lustre]
 [<ffffffff8119e0f5>] do_lookup+0x1a5/0x230
 [<ffffffff8119ed84>] __link_path_walk+0x7a4/0x1000
 [<ffffffff8114f89f>] ? handle_pte_fault+0x4af/0xb00
 [<ffffffff8119f89a>] path_walk+0x6a/0xe0
 [<ffffffff8119faab>] filename_lookup+0x6b/0xc0
 [<ffffffff8122db26>] ? security_file_alloc+0x16/0x20
 [<ffffffff811a0f84>] do_filp_open+0x104/0xd20
 [<ffffffff8129980a>] ? strncpy_from_user+0x4a/0x90
 [<ffffffff811ae432>] ? alloc_fd+0x92/0x160
 [<ffffffff8118b237>] do_sys_open+0x67/0x130
 [<ffffffff8118b340>] sys_open+0x20/0x30
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

Right now, when a slave lock is being revalidated (enqueue, etc.), we do not release the master lock, which is probably not right. I will cook up a patch.

Comment by Gerrit Updater [ 16/Jan/15 ]

wangdi (di.wang@intel.com) uploaded a new patch: http://review.whamcloud.com/13432
Subject: LU-6088 lmv: Do not revalidate strips with master lock
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 715252971aa4839930492520a72eb50b4fa2936b

Comment by Di Wang [ 16/Jan/15 ]

With this patch and a minor change to racer, racer passes in my local test setup (4 MDTs, 2 OSTs).

diff --git a/lustre/tests/cfg/local.sh b/lustre/tests/cfg/local.sh
index 6d16312..73c85a3 100644
--- a/lustre/tests/cfg/local.sh
+++ b/lustre/tests/cfg/local.sh
@@ -13,7 +13,7 @@ TMP=${TMP:-/tmp}
 DAEMONSIZE=${DAEMONSIZE:-500}
 MDSCOUNT=${MDSCOUNT:-1}
 MDSDEVBASE=${MDSDEVBASE:-$TMP/${FSNAME}-mdt}
-MDSSIZE=${MDSSIZE:-200000}
+MDSSIZE=${MDSSIZE:-2000000}
 #
 # Format options of facets can be specified with these variables:
 #
@@ -39,7 +39,7 @@ MGS_MOUNT_OPTS=${MGS_MOUNT_OPTS:-}

 OSTCOUNT=${OSTCOUNT:-2}
 OSTDEVBASE=${OSTDEVBASE:-$TMP/${FSNAME}-ost}
-OSTSIZE=${OSTSIZE:-200000}
+OSTSIZE=${OSTSIZE:-2000000}
 OSTOPT=${OSTOPT:-}
 OST_FS_MKFS_OPTS=${OST_FS_MKFS_OPTS:-}
 OST_MOUNT_OPTS=${OST_MOUNT_OPTS:-}
diff --git a/lustre/tests/racer/racer.sh b/lustre/tests/racer/racer.sh
index deef18e..3ed624e 100755
--- a/lustre/tests/racer/racer.sh
+++ b/lustre/tests/racer/racer.sh
@@ -17,7 +17,7 @@ file_list file_concat file_exec file_chown file_chmod file_mknod file_truncate \
 file_delxattr file_getxattr file_setxattr"

 if [ $MDSCOUNT -gt 1 ]; then
-       RACER_PROGS="${RACER_PROGS} dir_remote dir_migrate"
+       RACER_PROGS="${RACER_PROGS} dir_remote"
 fi

 racer_cleanup()
== racer test 1: racer on clients: testnode DURATION=300 == 17:12:47 (1421284367)
racers pids: 77271 77272 77273 77275 77278 77281 77285 77289
./file_exec.sh: line 12: 86522 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 90169 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 95671 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 95382 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 98325 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 108967 Segmentation fault      $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 115776 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 122252 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 122042 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 130326 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 12459 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 16796 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 36610 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 40410 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 41512 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 44574 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 54088 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 53171 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 55728 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 56052 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 58816 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 65907 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 68804 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 74136 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 73959 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 78464 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 93150 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 98385 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 102325 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 101154 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 104116 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 106750 Segmentation fault      $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 113145 Segmentation fault      $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 113147 Segmentation fault      $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 113766 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 115832 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 117893 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 122696 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12:   421 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12:  9260 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 11327 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 11681 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 17643 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 20928 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 24589 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 38846 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 49343 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 64789 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 64736 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 89863 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 103920 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 105115 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 107463 Segmentation fault      $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 116236 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 116222 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 117338 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 125041 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 126624 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null

./file_exec.sh: line 12:  5690 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 10494 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 21195 Segmentation fault      $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 21185 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 28111 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 30436 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 37965 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 39235 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 42352 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 45837 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 46220 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 47790 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 52719 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 53661 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 53811 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 54609 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 55437 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 60614 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 65795 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 66303 Segmentation fault      $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 71790 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 74574 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 80161 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 86993 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 86851 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 86691 Segmentation fault      (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 90074 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 91221 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 88688 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 104944 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 110164 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 115085 Bus error               $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 12: 121941 Bus error               (core dumped) $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
file_create.sh: no process killed
dir_create.sh: no process killed
file_rm.sh: no process killed
file_create.sh: no process killed
file_create.sh: no process killed
dir_create.sh: no process killed
file_rename.sh: no process killed
dir_create.sh: no process killed
file_rm.sh: no process killed
file_link.sh: no process killed
file_rm.sh: no process killed
file_rename.sh: no process killed
file_symlink.sh: no process killed
file_create.sh: no process killed
file_rename.sh: no process killed
file_link.sh: no process killed
file_create.sh: no process killed
file_list.sh: no process killed
file_create.sh: no process killed
dir_create.sh: no process killed
file_link.sh: no process killed
file_symlink.sh: no process killed
dir_create.sh: no process killed
file_concat.sh: no process killed
file_create.sh: no process killed
file_symlink.sh: no process killed
file_rm.sh: no process killed
file_rm.sh: no process killed
dir_create.sh: no process killed
file_list.sh: no process killed
file_create.sh: no process killed
file_exec.sh: no process killed
dir_create.sh: no process killed
file_list.sh: no process killed
file_rename.sh: no process killed
file_concat.sh: no process killed
file_rm.sh: no process killed
file_rename.sh: no process killed
file_chown.sh: no process killed
dir_create.sh: no process killed
file_rm.sh: no process killed
file_concat.sh: no process killed
file_link.sh: no process killed
file_exec.sh: no process killed
file_rename.sh: no process killed
file_link.sh: no process killed
file_chmod.sh: no process killed
file_rm.sh: no process killed
file_rename.sh: no process killed
file_exec.sh: no process killed
file_symlink.sh: no process killed
file_mknod.sh: no process killed
file_symlink.sh: no process killed
file_link.sh: no process killed
file_rename.sh: no process killed
file_chown.sh: no process killed
file_link.sh: no process killed
file_chown.sh: no process killed
file_truncate.sh: no process killed
file_list.sh: no process killed
file_list.sh: no process killed
file_symlink.sh: no process killed
file_link.sh: no process killed
file_symlink.sh: no process killed
file_chmod.sh: no process killed
file_delxattr.sh: no process killed
file_chmod.sh: no process killed
file_concat.sh: no process killed
file_list.sh: no process killed
file_list.sh: no process killed
file_concat.sh: no process killed
file_symlink.sh: no process killed
file_getxattr.sh: no process killed
file_mknod.sh: no process killed
file_mknod.sh: no process killed
file_exec.sh: no process killed
file_concat.sh: no process killed
file_concat.sh: no process killed
file_list.sh: no process killed
file_exec.sh: no process killed
file_setxattr.sh: no process killed
file_truncate.sh: no process killed
file_truncate.sh: no process killed
file_chown.sh: no process killed
file_exec.sh: no process killed
file_delxattr.sh: no process killed
file_concat.sh: no process killed
file_chown.sh: no process killed
file_exec.sh: no process killed
file_delxattr.sh: no process killed
file_chown.sh: no process killed
dir_remote.sh: no process killed
file_chmod.sh: no process killed
file_chmod.sh: no process killed
file_exec.sh: no process killed
file_getxattr.sh: no process killed
file_getxattr.sh: no process killed
file_chown.sh: no process killed
file_chmod.sh: no process killed
file_mknod.sh: no process killed
file_setxattr.sh: no process killed
file_mknod.sh: no process killed
file_setxattr.sh: no process killed
file_mknod.sh: no process killed
file_chown.sh: no process killed
file_chmod.sh: no process killed
file_truncate.sh: no process killed
dir_remote.sh: no process killed
dir_remote.sh: no process killed
file_truncate.sh: no process killed
file_truncate.sh: no process killed
file_chmod.sh: no process killed
file_mknod.sh: no process killed
file_delxattr.sh: no process killed
file_delxattr.sh: no process killed
file_delxattr.sh: no process killed
file_getxattr.sh: no process killed
file_mknod.sh: no process killed
file_getxattr.sh: no process killed
file_getxattr.sh: no process killed
file_setxattr.sh: no process killed
file_truncate.sh: no process killed
Running /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds. CTRL-C to exit
racer cleanup
sleeping 5 sec ...
there should be NO racer processes:
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
Filesystem           1K-blocks   Used Available Use% Mounted on
testnode@tcp:/lustre   3777312 143096   3417764   5% /mnt/lustre2
We survived /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds.
Running /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds. CTRL-C to exit
racer cleanup
sleeping 5 sec ...
there should be NO racer processes:
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
Filesystem           1K-blocks   Used Available Use% Mounted on
testnode@tcp:/lustre   3777312 143096   3417764   5% /mnt/lustre2
We survived /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds.
file_truncate.sh: no process killed
file_setxattr.sh: no process killed
file_setxattr.sh: no process killed
file_delxattr.sh: no process killed
dir_remote.sh: no process killed
dir_remote.sh: no process killed
Running /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds. CTRL-C to exit
racer cleanup
sleeping 5 sec ...
there should be NO racer processes:
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
Filesystem           1K-blocks   Used Available Use% Mounted on
testnode@tcp:/lustre   3777312 143096   3417764   5% /mnt/lustre2
We survived /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds.
dir_remote.sh: no process killed
file_delxattr.sh: no process killed
file_getxattr.sh: no process killed
file_getxattr.sh: no process killed
file_setxattr.sh: no process killed
file_setxattr.sh: no process killed
dir_remote.sh: no process killed
dir_remote.sh: no process killed
Running /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds. CTRL-C to exit
racer cleanup
sleeping 5 sec ...
there should be NO racer processes:
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
Filesystem           1K-blocks   Used Available Use% Mounted on
testnode@tcp:/lustre   3777312 143096   3417764   5% /mnt/lustre
We survived /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds.
Running /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds. CTRL-C to exit
racer cleanup
sleeping 5 sec ...
there should be NO racer processes:
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
Filesystem           1K-blocks   Used Available Use% Mounted on
testnode@tcp:/lustre   3777312 143096   3417764   5% /mnt/lustre
We survived /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds.
pid=77271 rc=0
Running /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds. CTRL-C to exit
racer cleanup
sleeping 5 sec ...
there should be NO racer processes:
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
Filesystem           1K-blocks   Used Available Use% Mounted on
testnode@tcp:/lustre   3777312 143096   3417764   5% /mnt/lustre
We survived /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds.
pid=77272 rc=0
Running /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds. CTRL-C to exit
racer cleanup
sleeping 5 sec ...
there should be NO racer processes:
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
Filesystem           1K-blocks   Used Available Use% Mounted on
testnode@tcp:/lustre   3777312 143096   3417764   5% /mnt/lustre2
We survived /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds.
Running /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds. CTRL-C to exit
racer cleanup
sleeping 5 sec ...
there should be NO racer processes:
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
Filesystem           1K-blocks   Used Available Use% Mounted on
testnode@tcp:/lustre   3777312 143096   3417764   5% /mnt/lustre
We survived /work/lustre-release_new/lustre/tests/racer/racer.sh for 300 seconds.
pid=77273 rc=0
pid=77275 rc=0
pid=77278 rc=0
pid=77281 rc=0
pid=77285 rc=0
pid=77289 rc=0
Resetting fail_loc on all nodes...done.
PASS 1 (306s)
== racer test complete, duration 308 sec == 17:17:53 (1421284673)
Stopping clients: testnode /mnt/lustre2 (opts:)
Stopping client testnode /mnt/lustre2 opts:
[root@testnode tests]# 
[root@testnode tests]# MDSCOUNT=4 sh llmountcleanup.sh 
Stopping clients: testnode /mnt/lustre (opts:-f)
Stopping client testnode /mnt/lustre opts:-f
Stopping clients: testnode /mnt/lustre2 (opts:-f)
Stopping /mnt/mds1 (opts:-f) on testnode
Stopping /mnt/mds2 (opts:-f) on testnode
Stopping /mnt/mds3 (opts:-f) on testnode
Stopping /mnt/mds4 (opts:-f) on testnode
Stopping /mnt/ost1 (opts:-f) on testnode
Stopping /mnt/ost2 (opts:-f) on testnode
modules unloaded.

Note: I ran racer on a node with 8 cores and 8 GB of memory.
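
The local.sh change in the diff above only raises defaults, because each setting uses the shell's ${VAR:-default} expansion. A minimal sketch of that behaviour follows. It uses a stand-in function, show_sizes, which is hypothetical and is not the real local.sh.

```shell
# Stand-in for the size defaults in lustre/tests/cfg/local.sh (illustrative
# only): ${VAR:-default} keeps a value that is already set and falls back to
# the default when VAR is unset or empty.
show_sizes() {
    MDSSIZE=${MDSSIZE:-200000}
    OSTSIZE=${OSTSIZE:-200000}
    echo "MDSSIZE=$MDSSIZE OSTSIZE=$OSTSIZE"
}

# No override: the built-in defaults apply.
( unset MDSSIZE OSTSIZE; show_sizes )

# Caller-supplied values win over the defaults.
( MDSSIZE=2000000 OSTSIZE=2000000; show_sizes )
```

Assuming the test scripts source local.sh unmodified, the larger sizes could therefore probably be passed in from the environment, in the same way MDSCOUNT=4 is passed to llmountcleanup.sh above, instead of editing the file.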

Comment by Gerrit Updater [ 23/Jan/15 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/13432/
Subject: LU-6088 lmv: Do not revalidate stripes with master lock
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 0860eda0544a507754b3af4aadcd651e1120ded5

Comment by Peter Jones [ 23/Jan/15 ]

Landed for 2.7

Generated at Sat Feb 10 01:57:07 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.