Details
- Bug
- Resolution: Unresolved
- Minor
- None
- Lustre 2.12.0
- 3
- 9223372036854775807
Description
Test 201 was added to sanity-flr by the patch https://review.whamcloud.com/#/c/29097/. The same patch also added test 201 to the ALWAYS_EXCEPT list. Bobijam says test 201 is “… a data mover watcher example monitoring FLR file change and resync the changed file and will not quit the loop.”
The last thing seen in the test log is
== sanity-flr test 201: FLR data mover =============================================================== 21:23:44 (1536960224)
CMD: trevis-47vm12 /usr/sbin/lctl --device lustre-MDT0000 changelog_register -n
Starting client: trevis-47vm9.trevis.whamcloud.com: -o user_xattr,flock trevis-47vm12@tcp:/lustre /mnt/lustre2
CMD: trevis-47vm9.trevis.whamcloud.com mkdir -p /mnt/lustre2
CMD: trevis-47vm9.trevis.whamcloud.com mount -t lustre -o user_xattr,flock trevis-47vm12@tcp:/lustre /mnt/lustre2
There's nothing obviously wrong in the console logs.
The code for test 201 is
test_201() {
	local delay=${RESYNC_DELAY:-5}

	MDT0=$($LCTL get_param -n mdc.*.mds_server_uuid |
	       awk '{ gsub(/_UUID/,""); print $1 }' | head -n1)

	trap cleanup_test_201 EXIT

	CL_USER=$(do_facet $SINGLEMDS $LCTL --device $MDT0 \
			changelog_register -n)

	mkdir -p $MOUNT2 && mount_client $MOUNT2

	local index=0
	while :; do
		local log=$($LFS changelog $MDT0 $index | grep FLRW)
		[ -z "$log" ] && { sleep 1; continue; }

		index=$(echo $log | awk '{print $1}')
		local ts=$(date -d "$(echo $log | awk '{print $3}')" "+%s" -u)
		local fid=$(echo $log | awk '{print $6}' | sed -e 's/t=//')
		local file=$($LFS fid2path $MOUNT2 $fid 2> /dev/null)

		((++index))
		[ -z "$file" ] && continue

		local now=$(date +%s)

		echo "file: $file $fid was modified at $ts, now: $now, " \
			"will be resynced at $((ts+delay))"

		[ $now -lt $((ts + delay)) ] && sleep $((ts + delay - now))

		mirror_io resync $file
		echo "$file resync done"
	done

	cleanup_test_201
}
run_test 201 "FLR data mover"
This ticket is to track the issues and the solutions for this test.
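As the quoted comment notes, the `while :` loop in test_201 has no exit condition, so the test can only end by being killed. One possible direction, shown here only as a minimal sketch and not as the fix adopted for this ticket, is to bound the watcher with a deadline. The names `fetch_flrw_records`, `resync_wait`, and `watch_bounded` are hypothetical and do not exist in the Lustre test framework; `fetch_flrw_records` is a stub standing in for `$LFS changelog $MDT0 $index | grep FLRW` so that the sketch runs without a Lustre file system.

```shell
#!/bin/bash
# Hypothetical stand-in for "$LFS changelog $MDT0 $index | grep FLRW".
# It prints nothing, which models a changelog with no FLRW records.
fetch_flrw_records() {
	:
}

# Print how many seconds remain before a file modified at time $1 is
# due for resync, given a delay of $2 seconds and the current time $3.
# Prints 0 when the resync is already due.
resync_wait() {
	local ts=$1 delay=$2 now=$3
	local due=$((ts + delay))

	if [ "$now" -lt "$due" ]; then
		echo $((due - now))
	else
		echo 0
	fi
}

# Poll for records until one appears or $1 seconds have elapsed,
# sleeping $2 seconds (default 1) between polls.
# Prints the record and returns 0 if one is found, returns 1 on timeout.
watch_bounded() {
	local max_wait=$1 interval=${2:-1}
	local deadline=$((SECONDS + max_wait))
	local log

	while [ "$SECONDS" -lt "$deadline" ]; do
		log=$(fetch_flrw_records)
		if [ -n "$log" ]; then
			echo "$log"
			return 0
		fi
		sleep "$interval"
	done
	return 1
}
```

With the stub as written, `watch_bounded 1` returns non-zero after about one second instead of spinning forever. `resync_wait` mirrors the `ts + delay - now` arithmetic of the original loop but never yields a negative sleep time.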
Logs for the sanity-flr test 201 hang are at
https://jira.whamcloud.com/browse/LU-11381