Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-11157

sanity test_42e: invalid arithmetic operator (error token is ".9")

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.13.0, Lustre 2.12.4
    • Lustre 2.12.0
    • None
    • 3
    • 9223372036854775807

    Description

      This issue was created by maloo for John Hammond <jhammond@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/f2a78f4c-8aa6-11e8-808e-52540065bddc

      test_42e failed with the following error:

      test_42e returned 1
      
      == sanity test 42e: verify sub-RPC writes are not done synchronously ================================= 14:42:04 (1531924924)
      total: 3500 open/close in 6.19 seconds: 565.32 ops/second
      /usr/lib64/lustre/tests/sanity.sh: line 3978: 209.9: syntax error: invalid arithmetic operator (error token is ".9")
      

      Some proc/sys/... files are returning decimal values. This one is from max_dirty_mb.

      Looks like 14 instances of this failure today.

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_42e - test_42e returned 1

      Attachments

        Issue Links

          Activity

            [LU-11157] sanity test_42e: invalid arithmetic operator (error token is ".9")

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36393/
            Subject: LU-11157 obd: round values to nearest MiB for *_mb syfs files
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 72c2d383c3e77206b0d169d894a18b5e43b7b72b

            gerrit Gerrit Updater added a comment - Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36393/ Subject: LU-11157 obd: round values to nearest MiB for *_mb syfs files Project: fs/lustre-release Branch: b2_12 Current Patch Set: Commit: 72c2d383c3e77206b0d169d894a18b5e43b7b72b

            Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36393
            Subject: LU-11157 obd: round values to nearest MiB for *_mb syfs files
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: 350b4ee688585c034f0c0e520e2587aa4d389eda

            gerrit Gerrit Updater added a comment - Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36393 Subject: LU-11157 obd: round values to nearest MiB for *_mb syfs files Project: fs/lustre-release Branch: b2_12 Current Patch Set: 1 Commit: 350b4ee688585c034f0c0e520e2587aa4d389eda
            pjones Peter Jones added a comment -

            Landed for 2.13

            pjones Peter Jones added a comment - Landed for 2.13

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34317/
            Subject: LU-11157 obd: round values to nearest MiB for *_mb syfs files
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: ba2817fe3ead1b8e32be6d6c6ce25b490626118a

            gerrit Gerrit Updater added a comment - Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34317/ Subject: LU-11157 obd: round values to nearest MiB for *_mb syfs files Project: fs/lustre-release Branch: master Current Patch Set: Commit: ba2817fe3ead1b8e32be6d6c6ce25b490626118a

            James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/34317
            Subject: LU-11157 obd: round values to nearest MiB for *_mb sysfs files
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 3e9d90e56042d5ff9076ae057c1ff117c348b2fa

            gerrit Gerrit Updater added a comment - James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/34317 Subject: LU-11157 obd: round values to nearest MiB for *_mb sysfs files Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 3e9d90e56042d5ff9076ae057c1ff117c348b2fa

            I'm fine with rounding the output to a whole number of MB. This is the only code that is using lprocfs_read_frac_helper() so it could just be removed. There are other places that are using lprocfs_seq_read_frac_helper() that will have the same issues. Back when this code was written, a few MB was a lot of memory, but now it is a rounding error for a client.

            adilger Andreas Dilger added a comment - I'm fine with rounding the output to a whole number of MB. This is the only code that is using lprocfs_read_frac_helper() so it could just be removed. There are other places that are using lprocfs_seq_read_frac_helper() that will have the same issues. Back when this code was written, a few MB was a lot of memory, but now it is a rounding error for a client.

            So I started to look into this and see the reason as John pointed out for the failures Nunez posted is due to the test treating the values returned by say "max_dirty_mb" as a real integer and not a float point string. We could update the test to feed this result into bc since bash can't do floating point math. The question to ask is do we want to make sites do the same kind of crazy? If we don't that means we end implementing round_up() handling like we did for dirty_max_pages. Is that okay with people?

            simmonsja James A Simmons added a comment - So I started to look into this and see the reason as John pointed out for the failures Nunez posted is due to the test treating the values returned by say "max_dirty_mb" as a real integer and not a float point string. We could update the test to feed this result into bc since bash can't do floating point math. The question to ask is do we want to make sites do the same kind of crazy? If we don't that means we end implementing round_up() handling like we did for dirty_max_pages. Is that okay with people?

            In addition to sanity test 42e failing, we also see the following tests fail:
            sanity 64d with the same error message; '209.9: syntax error: invalid arithmetic operator (error token is ".9")'
            conf-sanity test 76a with error message ''209.9: syntax error: invalid arithmetic operator (error token is ".9")'
            recovery-small test 55 with error 'error: set_param: setting /sys/fs/lustre/osc/lustre-OST0001-osc-ffff9b4353f50800/max_dirty_mb=209.9: Invalid argument'
            sanity-dom test 42e, which is running sanity test 42e

            jamesanunez James Nunez (Inactive) added a comment - In addition to sanity test 42e failing, we also see the following tests fail: sanity 64d with the same error message; '209.9: syntax error: invalid arithmetic operator (error token is ".9")' conf-sanity test 76a with error message ''209.9: syntax error: invalid arithmetic operator (error token is ".9")' recovery-small test 55 with error 'error: set_param: setting /sys/fs/lustre/osc/lustre-OST0001-osc-ffff9b4353f50800/max_dirty_mb=209.9: Invalid argument' sanity-dom test 42e, which is running sanity test 42e

            Andreas Dilger (adilger@whamcloud.com) merged in patch https://review.whamcloud.com/32831/
            Subject: LU-11157 obd: keep dirty_max_pages a round number of MB
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: d3f88d376c49e4520a0d695a4b4e9b0c2dbebaaf

            gerrit Gerrit Updater added a comment - Andreas Dilger (adilger@whamcloud.com) merged in patch https://review.whamcloud.com/32831/ Subject: LU-11157 obd: keep dirty_max_pages a round number of MB Project: fs/lustre-release Branch: master Current Patch Set: Commit: d3f88d376c49e4520a0d695a4b4e9b0c2dbebaaf
            jhammond John Hammond added a comment - - edited

            > Ouch, it been broken upstream for some time.

            It was only when LU-10990 and LU-8066 were combined that this started failing.

            I am really not a fan of displaying decimal fractional values in these files. Generally it means that we cannot faithfully save and restore the parameter values.

            However if we do keep fractional values then I think there is more to do here. There are substantial differences between lprocfs_seq_read_frac_helper() and lprocfs_read_frac_helper(). I don't know why the second function is so complicated. But this is bad. Switching from proc to sys shouldn't be changing the format used to display the values.

            jhammond John Hammond added a comment - - edited > Ouch, it been broken upstream for some time. It was only when LU-10990 and LU-8066 were combined that this started failing. I am really not a fan of displaying decimal fractional values in these files. Generally it means that we cannot faithfully save and restore the parameter values. However if we do keep fractional values then I think there is more to do here. There are substantial differences between lprocfs_seq_read_frac_helper() and lprocfs_read_frac_helper() . I don't know why the second function is so complicated. But this is bad. Switching from proc to sys shouldn't be changing the format used to display the values.

            People

              simmonsja James A Simmons
              maloo Maloo
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: