[LU-10970] sanity test_255b: FAIL: Ladvise willread should use more memory than 76800 KiB Created: 30/Apr/18  Updated: 24/Jul/18  Resolved: 24/Jul/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.12.0

Type: Bug Priority: Minor
Reporter: Patrick Farrell (Inactive) Assignee: Patrick Farrell (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment: Lustre 2.11, CentOS 7

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   
stdout.log
sanity test_255b: @@@@@@ FAIL: Ladvise willread should use more memory than 76800 KiB
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:5790:error()
= /usr/lib64/lustre/tests/sanity.sh:15757:test_255b()
= /usr/lib64/lustre/tests/test-framework.sh:6069:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:6108:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:5904:run_test()
= /usr/lib64/lustre/tests/sanity.sh:15766:main()
Dumping lctl log to /tmp/test_logs/1524567862/sanity.test_255b.*.1524567882.log
bluepill05: ssh: connect to host bluepill-client10.ext port 22: Connection timed out
bluepill05: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
bluepill05: rsync error: unexplained error (code 255) at io.c(605) [sender=3.0.9]
pdsh@bluepill-client10: bluepill05: ssh exited with exit code 255
bluepill02: ssh: connect to host bluepill-client10.ext port 22: Connection timed out
bluepill02: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
bluepill02: rsync error: unexplained error (code 255) at io.c(605) [sender=3.0.9]
pdsh@bluepill-client10: bluepill02: ssh exited with exit code 255
bluepill04: ssh: connect to host bluepill-client10.ext port 22: Connection timed out
bluepill04: rsync: connection unexpectedly closed (0 bytes received so far) [sender]
bluepill04: rsync error: unexplained error (code 255) at io.c(605) [sender=3.0.9]
 
 
stderr.log
Using TIMEOUT=300
bluepill02: bluepill02: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 288
bluepill04: bluepill04: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 288
bluepill05: bluepill05: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 288
bluepill03: bluepill03: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 288
bluepill-client01: bluepill-client01.ext: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 288
bluepill-client05: bluepill-client05.ext: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 288
bluepill-client03: bluepill-client03.ext: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 288
bluepill-client07: bluepill-client07.ext: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 288
bluepill-client04: bluepill-client04.ext: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 288
bluepill-client06: bluepill-client06.ext: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 288
bluepill-client09: bluepill-client09.ext: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 288
bluepill-client02: bluepill-client02.ext: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 288
bluepill-client08: bluepill-client08.ext: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 288
running as uid/gid/euid/egid 500/500/500/500, groups:
[touch] [/mnt/fs1/d0_runas_test/f103157]
excepting tests: 407 253 312 160f 160g 42a 42b 42c
mke2fs 1.42.13.x5 (23-Mar-2017)
== sanity test complete, duration 150 sec ============================================================ 11:06:52 (1524568012)
 
 


 Comments   
Comment by Patrick Farrell (Inactive) [ 30/Apr/18 ]

The test appears to have a race in it.

I suspect a sync on the client is required. We're seeing this failure occasionally internally at Cray when testing the latest release.

Nothing in the test guarantees that the write has completed to the server by the time we do sync/drop_caches.

I'll provide a patch.
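
A minimal sketch of the ordering described above, assuming a plain buffered write followed by a client-side sync. The function name write_and_flush and the temporary file are hypothetical, chosen only for illustration. This is not the content of the patch and does not use the sanity.sh helpers.

```shell
#!/bin/bash
# Hypothetical sketch: write_and_flush and the temp file are illustrative
# names, not taken from sanity.sh or from the patch.

# Write test data, then flush it from the client's page cache before any
# server-side measurement is taken.
write_and_flush() {
    local file=$1
    local size_mb=$2

    # Buffered write: when dd returns, the data may still be dirty in
    # client memory and not yet on the server.
    dd if=/dev/zero of="$file" bs=1M count="$size_mb" 2>/dev/null || return 1

    # Flush dirty pages from the client. Without this step, a later
    # server-side sync/drop_caches could run before the write arrives.
    sync
}

tmpfile=$(mktemp)
if write_and_flush "$tmpfile" 4; then
    echo "write flushed from client: $tmpfile"
fi
rm -f "$tmpfile"
```

On a real Lustre client the file would live under the mount point (for example /mnt/fs1 in the log above), and the server-side sync/drop_caches would only follow once the function returns.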

Comment by Gerrit Updater [ 30/Apr/18 ]

Patrick Farrell (paf@cray.com) uploaded a new patch: https://review.whamcloud.com/32203
Subject: LU-10970 tests: make sure write is complete
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 7e7975b62817e92a6d0498ba3d1fc0cd000593ff

Comment by Gerrit Updater [ 24/Jul/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32203/
Subject: LU-10970 tests: make sure write is complete
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: a55af59ff73b678638375032abb7ec3baf7841a6

Comment by Peter Jones [ 24/Jul/18 ]

Landed for 2.12

Generated at Sat Feb 10 02:39:48 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.