Details
Bug
Resolution: Unresolved
Minor
None
Lustre 2.12.0
3
9223372036854775807
Description
Test 201 was added to sanity-flr by the patch https://review.whamcloud.com/#/c/29097/. The same patch also added test 201 to the ALWAYS_EXCEPT list. Bobijam says test 201 is “… a data mover watcher example monitoring FLR file change and resync the changed file and will not quit the loop.”
The last thing seen in the test log is:
== sanity-flr test 201: FLR data mover =============================================================== 21:23:44 (1536960224)
CMD: trevis-47vm12 /usr/sbin/lctl --device lustre-MDT0000 changelog_register -n
Starting client: trevis-47vm9.trevis.whamcloud.com: -o user_xattr,flock trevis-47vm12@tcp:/lustre /mnt/lustre2
CMD: trevis-47vm9.trevis.whamcloud.com mkdir -p /mnt/lustre2
CMD: trevis-47vm9.trevis.whamcloud.com mount -t lustre -o user_xattr,flock trevis-47vm12@tcp:/lustre /mnt/lustre2
There's nothing obviously wrong in the console logs.
The code for test 201 is:
test_201() {
	local delay=${RESYNC_DELAY:-5}

	MDT0=$($LCTL get_param -n mdc.*.mds_server_uuid |
	       awk '{ gsub(/_UUID/,""); print $1 }' | head -n1)

	trap cleanup_test_201 EXIT

	CL_USER=$(do_facet $SINGLEMDS $LCTL --device $MDT0 \
			changelog_register -n)

	mkdir -p $MOUNT2 && mount_client $MOUNT2

	local index=0
	while :; do
		local log=$($LFS changelog $MDT0 $index | grep FLRW)
		[ -z "$log" ] && { sleep 1; continue; }

		index=$(echo $log | awk '{print $1}')
		local ts=$(date -d "$(echo $log | awk '{print $3}')" "+%s" -u)
		local fid=$(echo $log | awk '{print $6}' | sed -e 's/t=//')
		local file=$($LFS fid2path $MOUNT2 $fid 2> /dev/null)

		((++index))
		[ -z "$file" ] && continue

		local now=$(date +%s)

		echo "file: $file $fid was modified at $ts, now: $now, " \
		     "will be resynced at $((ts+delay))"

		[ $now -lt $((ts + delay)) ] && sleep $((ts + delay - now))

		mirror_io resync $file
		echo "$file resync done"
	done

	cleanup_test_201
}
run_test 201 "FLR data mover"
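For reference, the loop in test_201 reads three whitespace-separated fields from each FLRW changelog record: the record index (field 1), the time of day (field 3), and the target FID (field 6, with the "t=" prefix stripped). The sketch below repeats that parsing on a made-up record. The helper name parse_flrw_record and the sample line are illustrative assumptions, not output from a real run.

```shell
#!/bin/bash
# Hypothetical helper (not part of sanity-flr): extract the fields that
# test_201 uses from a single changelog record.
parse_flrw_record() {
	local log=$1
	local index time fid

	index=$(echo "$log" | awk '{print $1}')                 # record index
	time=$(echo "$log" | awk '{print $3}')                  # time of day
	fid=$(echo "$log" | awk '{print $6}' | sed -e 's/t=//') # target FID

	echo "$index $time $fid"
}

# Made-up sample record, for illustration only.
sample='13 24FLRW 21:23:50.123456789 2018.09.14 0x0 t=[0x200000401:0x1:0x0]'
parse_flrw_record "$sample"
```

Because the parsing depends on field positions, any change in the record layout would silently produce a wrong index or FID rather than an error.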
This ticket is to track the issues and the solutions for this test.
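One possible direction, offered only as a sketch and not as the agreed fix: the "while :" loop in test_201 has no exit condition, so when no FLRW record ever appears it sleeps and retries forever. Bounding the wait with a deadline would let the test finish. In the example below, poll_changelog is a hypothetical stand-in for "$LFS changelog $MDT0 $index | grep FLRW", and watch_with_deadline and its time limit are likewise assumptions.

```shell
#!/bin/bash
# Hypothetical stand-in for "$LFS changelog $MDT0 $index | grep FLRW".
# It prints nothing, modelling the case where no FLRW record ever arrives.
poll_changelog() {
	return 0
}

# Poll until a record shows up or the time limit (in seconds) expires.
# Returns 0 if a record was seen, 1 on timeout.
watch_with_deadline() {
	local limit=$1
	local deadline=$(( $(date +%s) + limit ))
	local log

	while [ "$(date +%s)" -lt "$deadline" ]; do
		log=$(poll_changelog)
		if [ -z "$log" ]; then
			sleep 1
			continue
		fi
		echo "got record: $log"
		return 0
	done

	echo "no FLRW record within ${limit}s, giving up"
	return 1
}

watch_with_deadline 2 || true
```

A deadline keeps the data mover usable as an example while letting an automated run terminate. Whether a timeout should count as a pass or a failure is left open here.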
Logs for the sanity-flr test 201 hang are at
https://jira.whamcloud.com/browse/LU-11381