[LU-9601] recovery-mds-scale test_failover_mds: test_failover_mds returned 1 Created: 05/Jun/17 Updated: 24/Sep/20 |
|
| Status: | Reopened |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.10.0, Lustre 2.10.1, Lustre 2.11.0, Lustre 2.12.0, Lustre 2.10.4, Lustre 2.10.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | James Casper | Assignee: | Zhenyu Xu |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Environment: |
trevis, failover |
||
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
https://testing.hpdd.intel.com/test_sessions/e6b87235-1ff0-4e96-a53f-ca46ffe5ed7e

From suite_log:

CMD: trevis-38vm1,trevis-38vm5,trevis-38vm6,trevis-38vm7,trevis-38vm8 PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/lib64/lustre/utils:/usr/lib64/mpi/gcc/openmpi/bin:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/sbin:/sbin::/sbin:/bin:/usr/sbin: NAME=autotest_config sh rpc.sh check_logdir /shared_test/autotest2/2017-05-24/051508-70323187606440
trevis-38vm1: trevis-38vm1: executing check_logdir /shared_test/autotest2/2017-05-24/051508-70323187606440
trevis-38vm7: trevis-38vm7.trevis.hpdd.intel.com: executing check_logdir /shared_test/autotest2/2017-05-24/051508-70323187606440
trevis-38vm8: trevis-38vm8.trevis.hpdd.intel.com: executing check_logdir /shared_test/autotest2/2017-05-24/051508-70323187606440
pdsh@trevis-38vm1: trevis-38vm6: mcmd: connect failed: No route to host
pdsh@trevis-38vm1: trevis-38vm5: mcmd: connect failed: No route to host
CMD: trevis-38vm1 uname -n
CMD: trevis-38vm5 uname -n
pdsh@trevis-38vm1: trevis-38vm5: mcmd: connect failed: No route to host
SKIP: recovery-double-scale SHARED_DIRECTORY should be specified with a shared directory which is accessable on all of the nodes
Stopping clients: trevis-38vm1,trevis-38vm5,trevis-38vm6 /mnt/lustre (opts:)
CMD: trevis-38vm1,trevis-38vm5,trevis-38vm6 running=\$(grep -c /mnt/lustre' ' /proc/mounts); and
pdsh@trevis-38vm1: trevis-38vm5: mcmd: connect failed: No route to host
pdsh@trevis-38vm1: trevis-38vm6: mcmd: connect failed: No route to host
auster : @@@@@@ FAIL: clients environments are insane!
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:4952:error()
= /usr/lib64/lustre/tests/test-framework.sh:1736:sanity_mount_check_clients()
= /usr/lib64/lustre/tests/test-framework.sh:1741:sanity_mount_check()
= /usr/lib64/lustre/tests/test-framework.sh:3796:setupall()
= auster:114:reset_lustre()
= auster:217:run_suite()
= auster:234:run_suite_logged()
= auster:298:run_suites()
= auster:334:main() |
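The immediate failure is that pdsh cannot reach trevis-38vm5/trevis-38vm6 ("No route to host"), so sanity_mount_check() inside setupall() declares the client environments insane. A pre-flight reachability check along the following lines could confirm that every node answers before auster is started; this is a minimal sketch, not part of the test framework, and the node list is only illustrative:

#!/bin/bash
# Hypothetical pre-flight check: verify every test node answers over pdsh
# before launching auster, so "No route to host" is caught up front.
NODES="trevis-38vm1,trevis-38vm5,trevis-38vm6,trevis-38vm7,trevis-38vm8"

for node in ${NODES//,/ }; do
    # pdsh -w restricts the command to a single node; uname -n is a cheap probe
    if ! pdsh -w "$node" uname -n >/dev/null 2>&1; then
        echo "ERROR: $node is unreachable via pdsh" >&2
        exit 1
    fi
done
echo "all nodes reachable"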
| Comments |
| Comment by James Casper [ 05/Jun/17 ] |
|
subsequent test sets: recovery-small, replay-ost-single, replay-dual, replay-vbr, replay-single: mmp: |
| Comment by Sarah Liu [ 08/Jun/17 ] |
|
Dup of LU-9600; all of these failures are caused by DCO-7216. |
| Comment by James Casper [ 11/Aug/17 ] |
|
This is not a pdsh issue.
subtest 1 (test_failover_mds): 2 MDSs, 2 OSTs, 3 clients
subtest 2 (test_failover_ost): 2 MDSs, 2 OSTs, 2 clients
We are only seeing this with SLES failover configs, and DCO-7324 was also opened for this. |
| Comment by James Nunez (Inactive) [ 05/Sep/17 ] |
|
Looking at the client1 test_log at https://testing.hpdd.intel.com/test_sets/4ef0bae8-8860-11e7-b4b0-5254006e85c2 , we can see that there is some network issue on client2: we get the mcmd "no route to host" error and then an error saying dump_kernel is an invalid parameter:
16:23:08 (1503530588) waiting for trevis-66vm5 network 5 secs ...
Network not available!
2017-08-23 16:23:11 Terminating clients loads ...
Duration: 86400
Server failover period: 1200 seconds
Exited after: 66 seconds
Number of failovers before exit:
mds1: 1 times
ost1: 0 times
ost2: 0 times
ost3: 0 times
ost4: 0 times
ost5: 0 times
ost6: 0 times
ost7: 0 times
Status: FAIL: rc=1
CMD: trevis-66vm5,trevis-66vm6 test -f /tmp/client-load.pid &&
{ kill -s TERM \$(cat /tmp/client-load.pid); rm -f /tmp/client-load.pid; }
pdsh@trevis-66vm1: trevis-66vm5: mcmd: connect failed: No route to host
/usr/lib64/lustre/tests/recovery-mds-scale.sh: line 103: 13774 Killed do_node $client "PATH=$PATH MOUNT=$MOUNT ERRORS_OK=$ERRORS_OK BREAK_ON_ERROR=$BREAK_ON_ERROR END_RUN_FILE=$END_RUN_FILE LOAD_PID_FILE=$LOAD_PID_FILE TESTLOG_PREFIX=$TESTLOG_PREFIX TESTNAME=$TESTNAME DBENCH_LIB=$DBENCH_LIB DBENCH_SRC=$DBENCH_SRC CLIENT_COUNT=$((CLIENTCOUNT - 1)) LFS=$LFS LCTL=$LCTL FSNAME=$FSNAME run_${load}.sh"
/usr/lib64/lustre/tests/recovery-mds-scale.sh: line 103: 13914 Killed do_node $client "PATH=$PATH MOUNT=$MOUNT ERRORS_OK=$ERRORS_OK BREAK_ON_ERROR=$BREAK_ON_ERROR END_RUN_FILE=$END_RUN_FILE LOAD_PID_FILE=$LOAD_PID_FILE TESTLOG_PREFIX=$TESTLOG_PREFIX TESTNAME=$TESTNAME DBENCH_LIB=$DBENCH_LIB DBENCH_SRC=$DBENCH_SRC CLIENT_COUNT=$((CLIENTCOUNT - 1)) LFS=$LFS LCTL=$LCTL FSNAME=$FSNAME run_${load}.sh"
Dumping lctl log to /test_logs/2017-08-23/lustre-b2_10-el7-x86_64-vs-lustre-b2_10-sles12sp2-x86_64--failover--2_1_1__5__-69876128994560-223221/recovery-mds-scale.test_failover_mds.*.1503530603.log
CMD: trevis-66vm3,trevis-66vm4,trevis-66vm7,trevis-66vm8 /usr/sbin/lctl dk > /test_logs/2017-08-23/lustre-b2_10-el7-x86_64-vs-lustre-b2_10-sles12sp2-x86_64--failover--2_1_1__5__-69876128994560-223221/recovery-mds-scale.test_failover_mds.debug_log.\$(hostname -s).1503530603.log;
dmesg > /test_logs/2017-08-23/lustre-b2_10-el7-x86_64-vs-lustre-b2_10-sles12sp2-x86_64--failover--2_1_1__5__-69876128994560-223221/recovery-mds-scale.test_failover_mds.dmesg.\$(hostname -s).1503530603.log
trevis-66vm7: invalid parameter 'dump_kernel'
trevis-66vm7: open(dump_kernel) failed: No such file or directory
If we look at client2 (trevis-66vm5), we get an OOM error:
23:07:00:[ 908.305066] Lustre: Evicted from MGS (at MGC10.9.6.214@tcp_1) after server handle changed from 0xd3eb788f77c2284b to 0x5174f2bcf88f87f5
23:07:00:[ 908.310726] Lustre: MGC10.9.6.214@tcp: Connection restored to MGC10.9.6.214@tcp_1 (at 10.9.6.210@tcp)
23:07:00:[ 908.369629] LustreError: 8944:0:(client.c:2982:ptlrpc_replay_interpret()) @@@ status 301, old was 0 req@ffff8800d4e20040 x1576564775978512/t4294967305(4294967305) o101->lustre-MDT0000-mdc-ffff8800db2f9800@10.9.6.210@tcp:12/10 lens 952/544 e 0 to 0 dl 1503529448 ref 2 fl Interpret:RP/4/0 rc 301/301
23:07:00:[ 1077.515462] dd invoked oom-killer: gfp_mask=0x24200ca(GFP_HIGHUSER_MOVABLE), nodemask=0, order=0, oom_score_adj=0
23:07:00:[ 1077.515475] dd cpuset=/ mems_allowed=0
23:07:00:[ 1077.515480] CPU: 0 PID: 11285 Comm: dd Tainted: G OE N 4.4.59-92.24-default #1
23:07:00:[ 1077.515484] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
23:07:00:[ 1077.515491] 0000000000000000 ffffffff8130f0d0 ffff880119f43bf0 0000000000000000
23:07:00:[ 1077.515493] ffffffff811f739e 0000000000000000 0000000100000000 000000fae0e8fdf6
23:07:00:[ 1077.515494] 0000000001320122 ffff88011fc15c00 ffff8800d81dd880 ffff8800d916c240
23:07:00:[ 1077.515495] Call Trace:
23:07:00:[ 1077.515564] [<ffffffff81019a99>] dump_trace+0x59/0x310
23:07:00:[ 1077.515568] [<ffffffff81019e3a>] show_stack_log_lvl+0xea/0x170
23:07:00:[ 1077.515571] [<ffffffff8101abc1>] show_stack+0x21/0x40
23:07:00:[ 1077.515583] [<ffffffff8130f0d0>] dump_stack+0x5c/0x7c
23:07:00:[ 1077.515609] [<ffffffff811f739e>] dump_header+0x82/0x215
23:07:00:[ 1077.515633] [<ffffffff811887a4>] oom_kill_process+0x214/0x3f0
23:07:00:[ 1077.515643] [<ffffffff81188e0d>] out_of_memory+0x43d/0x4a0
23:07:00:[ 1077.515650] [<ffffffff8118d556>] __alloc_pages_nodemask+0xaf6/0xb20
23:07:00:[ 1077.515666] [<ffffffff811d5804>] alloc_pages_vma+0xa4/0x220
23:07:00:[ 1077.515679] [<ffffffff811c5f90>] __read_swap_cache_async+0xf0/0x150
23:07:00:[ 1077.515685] [<ffffffff811c6004>] read_swap_cache_async+0x14/0x30
23:07:00:[ 1077.515687] [<ffffffff811c611d>] swapin_readahead+0xfd/0x190
23:07:00:[ 1077.515697] [<ffffffff811b3bde>] handle_pte_fault+0x10fe/0x14b0
23:07:00:[ 1077.515703] [<ffffffff811b4e7e>] handle_mm_fault+0x29e/0x550
23:07:00:[ 1077.515714] [<ffffffff8106469a>] __do_page_fault+0x18a/0x410
23:07:00:[ 1077.515721] [<ffffffff8106494b>] do_page_fault+0x2b/0x70
23:07:00:[ 1077.515741] [<ffffffff815e63e8>] page_fault+0x28/0x30
23:07:00:[ 1077.516007] DWARF2 unwinder stuck at page_fault+0x28/0x30
23:07:00:[ 1077.516007]
23:07:00:[ 1077.516007] Leftover inexact backtrace:
23:07:00:[ 1077.516007]
23:07:00:[ 1077.520402] Mem-Info:
23:07:00:[ 1077.520411] active_anon:0 inactive_anon:18 isolated_anon:0
23:07:00:[ 1077.520411] active_file:258394 inactive_file:672357 isolated_file:64
23:07:00:[ 1077.520411] unevictable:20 dirty:34 writeback:3183 unstable:0
23:07:00:[ 1077.520411] slab_reclaimable:2810 slab_unreclaimable:11643
23:07:00:[ 1077.520411] mapped:6475 shmem:0 pagetables:852 bounce:0
23:07:00:[ 1077.520411] free:21107 free_pcp:0 free_cma:0
23:07:00:[ 1077.520420] Node 0 DMA free:15372kB min:276kB low:344kB high:412kB active_anon:0kB inactive_anon:0kB active_file:144kB inactive_file:264kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:72kB shmem:0kB slab_reclaimable:8kB slab_unreclaimable:32kB kernel_stack:16kB pagetables:36kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:3216 all_unreclaimable? yes
23:07:00:[ 1077.520421] lowmem_reserve[]: 0 3335 3782 3782 3782
23:07:00:[ 1077.520425] Node 0 DMA32 free:61184kB min:59364kB low:74204kB high:89044kB active_anon:0kB inactive_anon:72kB active_file:909496kB inactive_file:2385320kB unevictable:56kB isolated(anon):0kB isolated(file):256kB present:3653620kB managed:3442064kB mlocked:56kB dirty:136kB writeback:12616kB mapped:22140kB shmem:0kB slab_reclaimable:9568kB slab_unreclaimable:37996kB kernel_stack:1808kB pagetables:2828kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:20051824 all_unreclaimable? yes
23:07:00:[ 1077.520426] lowmem_reserve[]: 0 0 446 446 446
23:07:00:[ 1077.520430] Node 0 Normal free:7872kB min:7940kB low:9924kB high:11908kB active_anon:0kB inactive_anon:0kB active_file:123936kB inactive_file:303844kB unevictable:24kB isolated(anon):0kB isolated(file):0kB present:524288kB managed:457064kB mlocked:24kB dirty:0kB writeback:116kB mapped:3688kB shmem:0kB slab_reclaimable:1664kB slab_unreclaimable:8544kB kernel_stack:672kB pagetables:544kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2658164 all_unreclaimable? yes
23:07:00:[ 1077.520431] lowmem_reserve[]: 0 0 0 0 0
23:07:00:[ 1077.520446] Node 0 DMA: 5*4kB (UE) 3*8kB (ME) 0*16kB 3*32kB (UME) 2*64kB (ME) 2*128kB (UE) 2*256kB (UE) 2*512kB (ME) 3*1024kB (UME) 1*2048kB (E) 2*4096kB (M) = 15372kB
23:07:00:[ 1077.520450] Node 0 DMA32: 1115*4kB (UME) 780*8kB (UME) 682*16kB (UME) 419*32kB (UME) 216*64kB (UME) 79*128kB (UE) 7*256kB (UME) 1*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 61260kB
23:07:00:[ 1077.520454] Node 0 Normal: 222*4kB (UME) 191*8kB (UME) 129*16kB (UME) 62*32kB (UME) 22*64kB (UM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7872kB
23:07:00:[ 1077.520462] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
23:07:00:[ 1077.520470] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
23:07:00:[ 1077.520471] 16822 total pagecache pages
23:07:00:[ 1077.520474] 13 pages in swap cache
23:07:00:[ 1077.520475] Swap cache stats: add 10940, delete 10927, find 313/551
23:07:00:[ 1077.520475] Free swap = 14296096kB
23:07:00:[ 1077.520475] Total swap = 14338044kB
23:07:00:[ 1077.520476] 1048473 pages RAM
23:07:00:[ 1077.520476] 0 pages HighMem/MovableOnly
23:07:00:[ 1077.520477] 69718 pages reserved
23:07:00:[ 1077.520477] 0 pages hwpoisoned
23:07:00:[ 1077.520478] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
23:07:00:[ 1077.520484] [ 437] 0 437 10929 674 25 3 1115 0 systemd-journal
23:07:00:[ 1077.520485] [ 506] 0 506 3006 363 10 3 800 0 haveged
23:07:00:[ 1077.520487] [ 508] 0 508 9209 719 20 3 165 -1000 systemd-udevd
23:07:00:[ 1077.520488] [ 510] 495 510 13123 929 30 3 121 0 rpcbind
23:07:00:[ 1077.520489] [ 632] 499 632 10912 877 27 3 150 -900 dbus-daemon
23:07:00:[ 1077.520490] [ 655] 0 655 4812 594 14 3 59 0 irqbalance
23:07:00:[ 1077.520492] [ 657] 0 657 7448 1031 20 3 269 0 wickedd-dhcp4
23:07:00:[ 1077.520493] [ 659] 0 659 7448 1036 21 3 262 0 wickedd-dhcp6
23:07:00:[ 1077.520494] [ 660] 0 660 7447 1020 21 3 261 0 wickedd-auto4
23:07:00:[ 1077.520495] [ 713] 0 713 87480 965 38 3 266 0 rsyslogd
23:07:00:[ 1077.520496] [ 717] 0 717 22580 1020 46 3 221 0 sssd
23:07:00:[ 1077.520497] [ 722] 0 722 29731 2007 61 3 300 0 sssd_be
23:07:00:[ 1077.520498] [ 750] 0 750 19684 1504 43 3 188 0 sssd_nss
23:07:00:[ 1077.520499] [ 751] 0 751 20740 1528 43 3 179 0 sssd_pam
23:07:00:[ 1077.520500] [ 752] 0 752 19108 1302 41 3 180 0 sssd_ssh
23:07:00:[ 1077.520502] [ 896] 0 896 7480 1040 19 3 299 0 wickedd
23:07:00:[ 1077.520503] [ 909] 0 909 1663 439 9 3 29 0 agetty
23:07:00:[ 1077.520504] [ 910] 0 910 1663 412 9 3 30 0 agetty
23:07:00:[ 1077.520506] [ 916] 0 916 7454 1029 19 3 273 0 wickedd-nanny
23:07:00:[ 1077.520507] [ 1579] 0 1579 2139 425 10 3 40 0 xinetd
23:07:00:[ 1077.520508] [ 1598] 0 1598 164140 1845 65 4 1181 0 automount
23:07:00:[ 1077.520509] [ 1606] 0 1606 11801 1281 28 3 154 -1000 sshd
23:07:00:[ 1077.520510] [ 1608] 74 1608 5863 965 16 3 167 0 ntpd
23:07:00:[ 1077.520511] [ 1610] 74 1610 6916 567 17 3 153 0 ntpd
23:07:00:[ 1077.520512] [ 1635] 493 1635 55351 607 20 3 229 0 munged
23:07:00:[ 1077.520514] [ 1691] 0 1691 5510 601 16 3 67 0 systemd-logind
23:07:00:[ 1077.520515] [ 1793] 0 1793 5218 600 14 3 83 0 master
23:07:00:[ 1077.520516] [ 1798] 51 1798 6256 601 17 3 84 0 pickup
23:07:00:[ 1077.520517] [ 1799] 51 1799 6352 884 18 3 124 0 qmgr
23:07:00:[ 1077.520518] [ 1823] 0 1823 5195 540 16 3 150 0 cron
23:07:00:[ 1077.520521] [11253] 0 11253 14910 657 34 4 172 0 in.mrshd
23:07:00:[ 1077.520522] [11254] 0 11254 2893 569 12 3 77 0 bash
23:07:00:[ 1077.520523] [11259] 0 11259 2893 393 11 3 78 0 bash
23:07:00:[ 1077.520524] [11260] 0 11260 3000 575 11 3 189 0 run_dd.sh
23:07:00:[ 1077.520527] [11285] 0 11285 1061 177 8 3 30 0 dd
23:07:00:[ 1077.520529] Out of memory: Kill process 1598 (automount) score 0 or sacrifice child
23:07:00:[ 1077.520545] Killed process 1598 (automount) total-vm:656560kB, anon-rss:0kB, file-rss:7380kB, shmem-rss:0kB
23:07:00:[ 1077.541270] ntpd invoked oom-killer: gfp_mask=0x24200ca(GFP_HIGHUSER_MOVABLE), nodemask=0, order=0, oom_score_adj=0
23:07:00
Note:
1. Charlie ran the failover test group on VMs with 4GB of memory, double what is normally run in our autotesting, and those tests also hit OOM; see ATM-606.
We see this error with master tags 2.10.51 and 2.10.52:
b2_10 build 5 and 18: |
| Comment by James Casper [ 26/Sep/17 ] |
|
2.10.1: |
| Comment by James Casper [ 20/Oct/17 ] |
|
2.10.54: |
| Comment by James Casper [ 20/Oct/17 ] |
|
Looks like dumps on an SLES VM are saved in /var/crash, while EL7 saves them in /scratch/dumps. Getting a copy of the SLES dumps onto an NFS share does not appear to be set up.

el7:
sles:
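For reference, the dump target is configured differently on the two distros. The excerpts below are only a sketch of typical settings; the paths and the NFS export name are assumptions, not values taken from these nodes:

# EL7: /etc/kdump.conf -- dumps are written under "path", or to NFS if that directive is enabled
path /scratch/dumps
# nfs nfs-server.example.com:/export/dumps    # hypothetical NFS target

# SLES12: /etc/sysconfig/kdump -- KDUMP_SAVEDIR controls where dumps are written
KDUMP_SAVEDIR="file:///var/crash"
# KDUMP_SAVEDIR="nfs://nfs-server.example.com/export/dumps"    # would need to be set up to copy dumps off the VM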
|
| Comment by Andreas Dilger [ 06/Dec/17 ] |
|
The system has about 4GB of RAM. There is not a lot of memory in slab objects (only about 40MB). Most of the memory is tied up in inactive_file (about 3GB) and active_file (about 0.5GB), but none of it is reclaimable. It makes sense that a bunch of pages are tied up in active_file for dirty pages and RPC bulk replay, but the pages in inactive_file should be reclaimable. I suspect that there is some bad interaction between how CLIO is tracking pages and the VM page state in the newer SLES kernel that makes it appear to the VM that none of the pages can be reclaimed (e.g. extra page references from DLM locks, OSC extents, etc). We do have slab callbacks for DLM locks that would release pages, but I'm wondering if dd is using a single large lock on the whole file, such that the lock cannot be cancelled while it still has dirty pages? This might also relate to |
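One way to probe that hypothesis on a client that is nearing OOM is to compare what the VM sees with what the Lustre client is holding, then force the client DLM locks to be cancelled; if Inactive(file) only shrinks after the locks are dropped, the pages really were pinned by CLIO/DLM state. A rough sketch using the standard client tunables (output names and availability vary by release):

# How much the client thinks it may cache, and current dirty state
lctl get_param llite.*.max_cached_mb
lctl get_param osc.*.max_dirty_mb osc.*.cur_dirty_bytes
grep -E 'Active\(file\)|Inactive\(file\)|Dirty|Writeback' /proc/meminfo

# Drop all client DLM locks; pages covered by those locks should become reclaimable
lctl set_param ldlm.namespaces.*.lru_size=clear

# Ask the kernel to reclaim, then re-check whether the file pages were freed
echo 3 > /proc/sys/vm/drop_caches
grep -E 'Active\(file\)|Inactive\(file\)' /proc/meminfo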
| Comment by Brad Hoagland (Inactive) [ 08/Dec/17 ] |
|
Hi YangSheng,
Can you take a look at this one?
Thanks,
Brad |
| Comment by Yang Sheng [ 12/Dec/17 ] |
|
Looks like sles12sp3 has slightly different alloc_page logic from upstream. It introduces two proc parameters:

/proc/sys/vm/pagecache_limit_mb
This tunable sets a limit to the unmapped pages in the pagecache in megabytes. If non-zero, it should not be set below 4 (4MB), or the system might behave erratically. In real life, much larger limits (a few percent of system RAM / a hundred MBs) will be useful.
Examples:
echo 512 >/proc/sys/vm/pagecache_limit_mb
This sets a baseline limit for the page cache (not the buffer cache!) of 0.5GiB. As we only consider pagecache pages that are unmapped, currently mapped pages (files that are mmap'ed, such as e.g. binaries and libraries, as well as SysV shared memory) are not limited by this.
NOTE: The real limit depends on the amount of free memory. Every existing free page allows the page cache to grow 8x the amount of free memory above the set baseline. As soon as the free memory is needed, we free up page cache.

/proc/sys/vm/pagecache_limit_ignore_dirty

But these should have little effect when left at their default values.

Thanks, |
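If we want to rule this SLES-specific page cache limiter in or out, it can be exercised directly. A minimal sketch follows; the 512 MB value is just the example from the SUSE documentation quoted above, not a recommendation, and these tunables exist only on SLES kernels:

# Current values (SLES-only sysctls; absent on mainline/EL7 kernels)
sysctl vm.pagecache_limit_mb vm.pagecache_limit_ignore_dirty

# Cap unmapped page cache at 512 MB and include dirty pages in the accounting
echo 512 > /proc/sys/vm/pagecache_limit_mb
echo 0 > /proc/sys/vm/pagecache_limit_ignore_dirty

# Make the setting persistent across reboots
cat >> /etc/sysctl.conf <<'EOF'
vm.pagecache_limit_mb = 512
vm.pagecache_limit_ignore_dirty = 0
EOF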
| Comment by Andreas Dilger [ 22/Feb/18 ] |
|
Bobijam, it looks like the client is having problems releasing pages from the page cache. I suspect something is going wrong with the CLIO page reference/dirty state on the new kernel that is preventing the pages from being released. Are you able to reproduce something like this in a VM (e.g. dd a very large single file) for debugging? |
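A self-contained reproducer along the lines Andreas suggests might look like this. It is only a sketch: the mount point, file size, and the choice of watching /proc/meminfo are assumptions, not the exact autotest load (which uses run_dd.sh):

#!/bin/bash
# Hypothetical reproducer: stream a single large file through dd on a
# small-memory SLES client and watch whether file pages pile up as unreclaimable.
MNT=/mnt/lustre          # assumed Lustre client mount point
SIZE_MB=30720            # ~30 GiB, well beyond the ~4 GB of client RAM

# Background monitor: print memory state every 10 seconds
( while sleep 10; do
      grep -E 'MemFree|Active\(file\)|Inactive\(file\)' /proc/meminfo
  done ) &
MONITOR=$!

dd if=/dev/zero of=$MNT/oom_repro bs=1M count=$SIZE_MB

kill $MONITOR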
| Comment by Sarah Liu [ 02/May/18 ] |
|
+1 on master SLES12 sp3 server/client failover, client hit "page allocation failure" https://testing.hpdd.intel.com/test_sets/50068624-4679-11e8-960d-52540065bddc |
| Comment by Sarah Liu [ 17/May/18 ] |
|
+2 on b2_10 https://testing.hpdd.intel.com/test_sets/b7026366-5880-11e8-abc3-52540065bddc https://testing.hpdd.intel.com/test_sets/d93466da-5878-11e8-b9d3-52540065bddc |
| Comment by Jian Yu [ 27/Sep/18 ] |
|
Hi Andreas,
I provisioned 3 SLES12 SP3 VMs (1 client + 1 MGS/MDS + 1 OSS) on the trevis cluster with the latest master build #3795, and ran dd to create a 30G single file. The command passed:
trevis-59vm1:/usr/lib64/lustre/tests # lfs df -h
UUID bytes Used Available Use% Mounted on
lustre-MDT0000_UUID 5.6G 45.7M 5.0G 1% /mnt/lustre[MDT:0]
lustre-OST0000_UUID 39.0G 49.0M 36.9G 0% /mnt/lustre[OST:0]
lustre-OST0001_UUID 39.0G 49.0M 36.9G 0% /mnt/lustre[OST:1]
filesystem_summary: 78.0G 98.1M 73.9G 0% /mnt/lustre
trevis-59vm1:/usr/lib64/lustre/tests # dd if=/dev/urandom of=/mnt/lustre/large_file_10G bs=1M count=30720
30720+0 records in
30720+0 records out
32212254720 bytes (32 GB, 30 GiB) copied, 2086.88 s, 15.4 MB/s |