[LU-6764] Test directory access during migration for DNE2 Created: 24/Jun/15 Updated: 22/Dec/15 Resolved: 24/Jun/15 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.8.0 |
| Type: | Task | Priority: | Blocker |
| Reporter: | Richard Henwood (Inactive) | Assignee: | Di Wang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Description |
|
Access the directory during migration.
Set up Lustre with 4 MDTs, 4 OSTs, and 2 clients. Create one directory and some files under it:
mkdir /mnt/lustre/migrate_dir
for F in {1,2,3,4,5}; do
echo "$F$F$F$F$F" > /mnt/lustre/migrate_dir/file$F
done
On one client, migrate the directory among the 4 MDTs:
while true; do
mdt_idx=$((RANDOM % MDTCOUNT))
lfs migrate -m $mdt_idx /mnt/lustre/migrate_dir || break
done
echo "migrate directory failed"
return 1
Simultaneously, on another client, access the files under the migrating directory:
while true; do
N=$(((N + 1) % 5))
stat /mnt/lustre/migrate_dir/file1 > /dev/null || break
cat /mnt/lustre/migrate_dir/file2 > /dev/null || break
> /mnt/lustre/migrate_dir/file3 || break
echo "aaaaa" > /mnt/lustre/migrate_dir/file4 || break
stat /mnt/lustre/migrate_dir/file5 > /dev/null || break
done
echo "access migrating files failed"
return 1
Both loops (migration and access) should keep running for at least 5 minutes without returning an error. |
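The procedure above runs two loops side by side for a fixed time and fails if either loop exits early. A minimal bash sketch of that pattern follows. It assumes both loops run from a single node, whereas the description uses two clients. `loop_until_failure` and `run_concurrently` are hypothetical helper names, not part of the Lustre test framework.

```shell
#!/usr/bin/env bash
# Sketch only: loop_until_failure and run_concurrently are hypothetical
# helpers, not part of the Lustre test framework.

# Call the given step repeatedly; return 1 as soon as it fails.
loop_until_failure() {
    while true; do
        "$@" || return 1
    done
}

# run_concurrently DURATION MIGRATE_STEP ACCESS_STEP
# Runs MIGRATE_STEP in a background loop and ACCESS_STEP in the foreground
# for DURATION seconds. Returns 0 only if neither step failed.
run_concurrently() {
    local duration=$1 migrate_step=$2 access_step=$3
    local deadline=$((SECONDS + duration))

    loop_until_failure "$migrate_step" &
    local bg_pid=$!

    while [ "$SECONDS" -lt "$deadline" ]; do
        # The background loop only exits on error, so a missing PID means failure.
        if ! kill -0 "$bg_pid" 2>/dev/null; then
            echo "migrate directory failed"
            return 1
        fi
        if ! "$access_step"; then
            kill "$bg_pid" 2>/dev/null || true
            wait "$bg_pid" 2>/dev/null || true
            echo "access migrating files failed"
            return 1
        fi
    done

    # Time is up: stop the background loop and report success.
    kill "$bg_pid" 2>/dev/null || true
    wait "$bg_pid" 2>/dev/null || true
    return 0
}
```

For this scenario, `migrate_step` could be a function wrapping the `lfs migrate -m` call and `access_step` a function wrapping the five file operations, invoked as `run_concurrently 300 migrate_step access_step`.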
| Comments |
| Comment by Richard Henwood (Inactive) [ 24/Jun/15 ] |
|
See the patch for the test.
== sanityn test 80b: Accessing directory during migration == 04:32:18 (1434861138)
start migration thread 2958
accessing the migrating directory for 5 minutes...
...10 seconds
...20 seconds
...30 seconds
...40 seconds
...50 seconds
...60 seconds
...70 seconds
...80 seconds
...90 seconds
...100 seconds
...110 seconds
...120 seconds
...130 seconds
...140 seconds
...150 seconds
...160 seconds
...170 seconds
...180 seconds
...190 seconds
...200 seconds
...210 seconds
...220 seconds
...230 seconds
...240 seconds
...250 seconds
...260 seconds
...270 seconds
...280 seconds
...290 seconds
...300 seconds
Resetting fail_loc on all nodes.../usr/lib64/lustre/tests/test-framework.sh: line 2969: 2958 Killed ( while true; do
mdt_idx=$((RANDOM % MDSCOUNT)); $LFS mv -M $mdt_idx $migrate_dir1 2 &>/dev/null || rc=$?; [ $rc -ne 0 -o $rc -ne 16 ] || break;
done )
CMD: shadow-17vm10.shadow.whamcloud.com,shadow-17vm11,shadow-17vm12,shadow-17vm8,shadow-17vm9 lctl set_param -n fail_loc=0 fail_val=0 2>/dev/null || true
done.
CMD: shadow-17vm10.shadow.whamcloud.com,shadow-17vm11,shadow-17vm12,shadow-17vm8,shadow-17vm9 rc=0;
val=\$(/usr/sbin/lctl get_param -n catastrophe 2>&1);
if [[ \$? -eq 0 && \$val -ne 0 ]]; then
echo \$(hostname -s): \$val;
rc=\$val;
fi;
exit \$rc
PASS 80b (300s)
|
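The migration loop printed in the log ends each pass with `[ $rc -ne 0 -o $rc -ne 16 ] || break`. That test is true for every value of `rc`, since no value equals both 0 and 16, so the `break` is never reached and the loop runs until it is killed. This matches the "Killed" line in the log. The sketch below contrasts the logged condition with one possible stricter check. It assumes the intent was to tolerate 0 and 16 (EBUSY on Linux) and stop on anything else. That intent is a guess, and `rc_is_fatal` and `logged_condition` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are hypothetical.

# The condition as it appears in the log. It succeeds for every rc,
# because rc cannot be equal to 0 and to 16 at the same time.
logged_condition() {
    local rc=$1
    [ "$rc" -ne 0 -o "$rc" -ne 16 ]
}

# Assumed intent: treat 0 (success) and 16 (EBUSY on Linux) as tolerable,
# and report any other return code as fatal.
rc_is_fatal() {
    local rc=$1
    [ "$rc" -ne 0 ] && [ "$rc" -ne 16 ]
}
```

With the stricter form, the loop body could end with `rc_is_fatal "$rc" && break`, so that only an unexpected return code stops the migration loop.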