[LUDOC-174] sgpdd-survey normal output produces errors (says failure when all is well) Created: 22/Aug/13  Updated: 17/Oct/16  Resolved: 17/Oct/16

Status: Resolved
Project: Lustre Documentation
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Dan Ferber (Inactive) Assignee: Dan Cobb (Inactive)
Resolution: Not a Bug Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9876

 Description   

See the output below. At high thread counts the survey reports "failed" even though nothing is actually wrong with the setup.

We could fix up the script so that it reports something more informative than a bare "failed" in this case. It could say something like "out of memory for this config", or simply pass the unexpected log line along to the end user so they can parse it.
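A minimal sketch of the kind of change being proposed, in the script's own bash: capture the sgp_dd output for a run and echo a recognizable cause (or the raw log line) instead of a bare failure count. The helper name and variables below are illustrative, not sgpdd-survey's actual internals:

    # Hypothetical reporting helper -- not sgpdd-survey's real code.
    # $1 is the number of failed sgp_dd instances, $2 a file holding the
    # captured sgp_dd stderr for this run.
    report_failure() {
        local count=$1 logf=$2
        if grep -qi 'cannot allocate memory\|out of memory' "$logf"; then
            echo "$count failed: out of memory for this config"
        else
            # pass the unexpected log line along so the user can parse it
            echo "$count failed: $(head -n 1 "$logf")"
        fi
    }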

>> Background:
>>
>> Running the script "sgpdd-survey.john" with the header below:
>> LUNs look to be: b, d, f, g, and i
>>
>> #!/bin/bash
>>
>> ######################################################################
>> # customize per survey
>>
>> # CHOOSE EITHER scsidevs or rawdevs
>> # the SCSI devices to measure - WARNING: will be erased.
>> # The raw devices to use
>> # rawdevs=${rawdevs:-"/dev/raw/raw1"}
>> # scsidevs=`ls /dev/sd[a-z] /dev/sd[a-z][a-z]` # all devices, if you use udev
>> scsidevs='/dev/sdb /dev/sdd /dev/sdf /dev/sdi /dev/sdg'
>>
>> # result file prefix. date/time+hostname makes unique
>> # NB ensure the path exists if it includes subdirs
>> rslt_loc=${rslt_loc:-"/tmp"}
>> rslt=${rslt:-"$rslt_loc/sgpdd_survey_`date +%F@%R`_`uname -n`"}
>>
>> # what to do (read or write)
>> actions=${actions:-"write read"}
>>
>> # total size per device (MBytes)
>> # NB bigger than device cache is good
>> size=${size:-8192}
>>
>> # record size (KBytes)
>> rszlo=${rszlo:-1024}
>> rszhi=${rszhi:-1024}
>>
>> # Concurrent regions per device
>> crglo=${crglo:-1}
>> crghi=${crghi:-256}
>>
>> # boundary blocks between concurrent regions per device
>> boundary=${boundary:-1024}
>>
>> # threads to share between concurrent regions per device
>> # multiple threads per region simulates a deeper request queue
>> # NB survey skips over #thr < #regions and #thr/#regions > SG_MAX_QUEUE
>> thrlo=${thrlo:-1}
>> thrhi=${thrhi:-4096}
>>
>> =================
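For reference, the sweep those crg/thr settings imply can be enumerated with a short loop. This is an illustrative sketch, not the script's actual logic: it assumes SG_MAX_QUEUE=16 (the sg driver queue depth the survey checks against) and that the crg/thr columns in the output below are totals across the five devices:

    # Illustrative sketch of the (crg, thr) combinations the header above
    # should generate; SG_MAX_QUEUE=16 is an assumption about the sg limit.
    SG_MAX_QUEUE=16
    ndevs=5                                   # sdb sdd sdf sdi sdg
    crglo=1 crghi=256 thrlo=1 thrhi=4096
    for ((crg = crglo; crg <= crghi; crg *= 2)); do
        for ((thr = thrlo; thr <= thrhi; thr *= 2)); do
            # survey skips #thr < #regions and #thr/#regions > SG_MAX_QUEUE
            ((thr < crg)) && continue
            ((thr / crg > SG_MAX_QUEUE)) && continue
            echo "crg $((crg * ndevs)) thr $((thr * ndevs))"
        done
    done

Note that in the output below every "failed" run has a total thread count of 1280 or more, which is consistent with the out-of-memory interpretation rather than a device problem.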
>>
>> The errors he is seeing are:
>>
>> Wed Aug 21 10:03:23 PDT 2013 sgpdd-survey on /dev/sdb /dev/sdd /dev/sdf /dev/sdi /dev/sdg from Lustre-oss1.isvlab.local
>> total_size 41943040K rsz 1024 crg 5 thr 5 write 891.17 MB/s 5 x 178.29 = 891.45 MB/s read 1053.69 MB/s 5 x 210.78 = 1053.91 MB/s
>> total_size 41943040K rsz 1024 crg 5 thr 10 write 1358.47 MB/s 5 x 271.85 = 1359.27 MB/s read 1143.60 MB/s 5 x 229.00 = 1144.98 MB/s
>> total_size 41943040K rsz 1024 crg 5 thr 20 write 1305.13 MB/s 5 x 261.29 = 1306.44 MB/s read 1113.26 MB/s 5 x 222.80 = 1113.99 MB/s
>> total_size 41943040K rsz 1024 crg 5 thr 40 write 1282.39 MB/s 5 x 256.68 = 1283.41 MB/s read 994.86 MB/s 5 x 199.09 = 995.45 MB/s
>> total_size 41943040K rsz 1024 crg 5 thr 80 write 1281.24 MB/s 5 x 256.40 = 1282.02 MB/s read 1109.08 MB/s 5 x 222.03 = 1110.17 MB/s
>> total_size 41943040K rsz 1024 crg 10 thr 10 write 1332.48 MB/s 10 x 133.34 = 1333.43 MB/s read 764.70 MB/s 10 x 76.48 = 764.85 MB/s
>> total_size 41943040K rsz 1024 crg 10 thr 20 write 1362.39 MB/s 10 x 136.34 = 1363.37 MB/s read 982.14 MB/s 10 x 98.31 = 983.14 MB/s
>> total_size 41943040K rsz 1024 crg 10 thr 40 write 1339.71 MB/s 10 x 134.05 = 1340.48 MB/s read 1012.83 MB/s 10 x 101.35 = 1013.47 MB/s
>> total_size 41943040K rsz 1024 crg 10 thr 80 write 1065.92 MB/s 10 x 106.64 = 1066.40 MB/s read 1061.93 MB/s 10 x 106.80 = 1068.02 MB/s
>> total_size 41943040K rsz 1024 crg 10 thr 160 write 956.98 MB/s 10 x 95.81 = 958.06 MB/s read 1832.06 MB/s 10 x 183.42 = 1834.20 MB/s
>> total_size 41943040K rsz 1024 crg 20 thr 20 write 1330.72 MB/s 20 x 66.60 = 1331.90 MB/s read 667.91 MB/s 20 x 33.41 = 668.14 MB/s
>> total_size 41943040K rsz 1024 crg 20 thr 40 write 1330.28 MB/s 20 x 67.15 = 1342.96 MB/s read 946.57 MB/s 20 x 47.36 = 947.19 MB/s
>> total_size 41943040K rsz 1024 crg 20 thr 80 write 885.61 MB/s 20 x 44.30 = 885.96 MB/s read 1172.73 MB/s 20 x 58.95 = 1178.93 MB/s
>> total_size 41943040K rsz 1024 crg 20 thr 160 write 1014.48 MB/s 20 x 50.78 = 1015.66 MB/s read 1775.75 MB/s 20 x 88.92 = 1778.41 MB/s
>> total_size 41943040K rsz 1024 crg 20 thr 320 write 772.73 MB/s 20 x 38.65 = 773.05 MB/s read 1138.71 MB/s 20 x 56.97 = 1139.45 MB/s
>> total_size 41943040K rsz 1024 crg 40 thr 40 write 1130.65 MB/s 40 x 28.85 = 1153.95 MB/s read 733.74 MB/s 40 x 18.35 = 733.95 MB/s
>> total_size 41943040K rsz 1024 crg 40 thr 80 write 912.06 MB/s 40 x 23.03 = 921.25 MB/s read 1025.11 MB/s 40 x 25.64 = 1025.77 MB/s
>> total_size 41943040K rsz 1024 crg 40 thr 160 write 1250.21 MB/s 40 x 31.33 = 1253.13 MB/s read 1179.10 MB/s 40 x 29.50 = 1179.89 MB/s
>> total_size 41943040K rsz 1024 crg 40 thr 320 write 1020.84 MB/s 40 x 25.57 = 1022.72 MB/s read 1241.09 MB/s 40 x 31.22 = 1248.93 MB/s
>> total_size 41943040K rsz 1024 crg 40 thr 640 write 1055.14 MB/s 40 x 26.45 = 1057.82 MB/s read 1249.91 MB/s 40 x 31.27 = 1250.84 MB/s
>> total_size 41943040K rsz 1024 crg 80 thr 80 write 1240.89 MB/s 80 x 15.54 = 1242.83 MB/s read 1011.95 MB/s 80 x 12.66 = 1013.18 MB/s
>> total_size 41943040K rsz 1024 crg 80 thr 160 write 1130.85 MB/s 80 x 14.15 = 1132.20 MB/s read 1154.84 MB/s 80 x 14.47 = 1157.38 MB/s
>> total_size 41943040K rsz 1024 crg 80 thr 320 write 1071.42 MB/s 80 x 13.41 = 1072.69 MB/s read 1204.97 MB/s 80 x 15.08 = 1206.21 MB/s
>> total_size 41943040K rsz 1024 crg 80 thr 640 write 985.84 MB/s 80 x 12.44 = 994.87 MB/s read 1520.15 MB/s 80 x 19.03 = 1522.06 MB/s
>> total_size 41943040K rsz 1024 crg 80 thr 1280 write 79 failed read 48 failed
>> total_size 41943040K rsz 1024 crg 160 thr 160 write 764.63 MB/s 160 x 4.85 = 776.67 MB/s read 1509.01 MB/s 160 x 9.46 = 1513.67 MB/s
>> total_size 41943040K rsz 1024 crg 160 thr 320 write 726.86 MB/s 160 x 4.56 = 729.37 MB/s read 1558.96 MB/s 160 x 9.77 = 1562.50 MB/s
>> total_size 41943040K rsz 1024 crg 160 thr 640 write 727.54 MB/s 160 x 4.55 = 727.84 MB/s read 1356.87 MB/s 160 x 8.49 = 1358.03 MB/s
>> total_size 41943040K rsz 1024 crg 160 thr 1280 write 86 failed read 90 failed
>> total_size 41943040K rsz 1024 crg 160 thr 2560 write 151 failed read 130 failed
>> total_size 41943040K rsz 1024 crg 320 thr 320 write 797.18 MB/s 320 x 2.51 = 802.61 MB/s read 1725.78 MB/s 320 x 5.43 = 1736.45 MB/s
>> total_size 41943040K rsz 1024 crg 320 thr 640 write 814.87 MB/s 320 x 2.57 = 820.92 MB/s read 1443.16 MB/s 320 x 4.52 = 1446.53 MB/s
>> total_size 41943040K rsz 1024 crg 320 thr 1280 write 47 failed read 55 failed
>> total_size 41943040K rsz 1024 crg 320 thr 2560 write 152 failed read 121 failed
>> total_size 41943040K rsz 1024 crg 320 thr 5120 write 295 failed read 205 failed
>> total_size 41943040K rsz 1024 crg 640 thr 640 write 1116.96 MB/s 640 x 1.75 = 1123.05 MB/s read 1862.02 MB/s 640 x 2.92 = 1867.68 MB/s
>> total_size 41943040K rsz 1024 crg 640 thr 1280 write 177 failed read 133 failed
>> total_size 41943040K rsz 1024 crg 640 thr 2560 write 311 failed read 227 failed
>> total_size 41943040K rsz 1024 crg 640 thr 5120 write 490 failed read 338 failed
>> total_size 41943040K rsz 1024 crg 640 thr 10240 write 622 failed read 557 failed
>> total_size 41943040K rsz 1024 crg 1280 thr 1280 write 640 failed read 640 failed
>> total_size 41943040K rsz 1024 crg 1280 thr 2560 write 802 failed read 787 failed
>> total_size 41943040K rsz 1024 crg 1280 thr 5120 write 974 failed read 902 failed
>> total_size 41943040K rsz 1024 crg 1280 thr 10240 write 1151 failed read 1091 failed
>> total_size 41943040K rsz 1024 crg 1280 thr 20480 write 1246 failed read 1226 failed
>>
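Until the script is changed, the underlying cause can be recovered from the files the survey already writes and from the kernel log. The .detail suffix below matches common sgpdd-survey versions but is an assumption; check what actually appears under the configured rslt_loc:

    # File names follow the rslt= setting in the header; the .detail
    # suffix is assumed from common sgpdd-survey versions.
    ls /tmp/sgpdd_survey_*
    grep -i 'memory\|error' /tmp/sgpdd_survey_*.detail | head
    dmesg | tail -n 20    # allocation failures usually show up here as well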



 Comments   
Comment by Dan Cobb (Inactive) [ 30/Sep/13 ]

Why would this be assigned to Dan Cobb?
