  1. Lustre
  2. LU-11823

sanity-flr test 0b and 0c fail with 'pool_new failed lustre.test_0b' on ARM


Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.12.0
    • Lustre 2.12.0
    • Ubuntu 18.04 clients
    • 3
    • 9223372036854775807

    Description

      sanity-flr test_0b and test_0c fail to create a new OST pool and report the error 'pool_new failed lustre.test_0b'. So far, we have seen this error only on ARM/PPC clients.

      The client test_log for https://testing.whamcloud.com/test_sets/d794bc4c-fdd4-11e8-a97c-52540065bddc shows:

      == sanity-flr test 0b: lfs mirror create plain layout mirrors ======================================== 00:37:49 (1544488669)
      CMD: trevis-19vm4 lctl pool_new lustre.test_0b
      trevis-19vm4: Pool lustre.test_0b created
      CMD: trevis-19vm4 lctl get_param -n lod.lustre-MDT0000-mdtlov.pools.test_0b 				2>/dev/null || echo foo
      CMD: trevis-19vm4 lctl get_param -n lod.lustre-MDT0000-mdtlov.pools.test_0b 				2>/dev/null || echo foo
      CMD: trevis-19vm1.trevis.whamcloud.com lctl get_param -n lov.lustre-*.pools.test_0b 		2>/dev/null || echo foo
      CMD: trevis-19vm1.trevis.whamcloud.com lctl get_param -n lov.lustre-*.pools.test_0b 		2>/dev/null || echo foo
      Waiting 90 secs for update
      CMD: trevis-19vm1.trevis.whamcloud.com lctl get_param -n lov.lustre-*.pools.test_0b 		2>/dev/null || echo foo
      …
      Waiting 10 secs for update
      CMD: trevis-19vm1.trevis.whamcloud.com lctl get_param -n lov.lustre-*.pools.test_0b 		2>/dev/null || echo foo
      CMD: trevis-19vm1.trevis.whamcloud.com lctl get_param -n lov.lustre-*.pools.test_0b 		2>/dev/null || echo foo
      CMD: trevis-19vm1.trevis.whamcloud.com lctl get_param -n lov.lustre-*.pools.test_0b 		2>/dev/null || echo foo
      CMD: trevis-19vm1.trevis.whamcloud.com lctl get_param -n lov.lustre-*.pools.test_0b 		2>/dev/null || echo foo
      Update not seen after 90s: wanted '' got 'foo'
       sanity-flr test_0b: @@@@@@ FAIL: pool_new failed lustre.test_0b 
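
      For reference, the wait that times out in the log above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the test framework's actual code: wait_for_output is a hypothetical helper name, and the one-second poll interval is an assumption. It shows why the test reports 'foo': the fallback string is echoed whenever the polled command fails, which is what happens on the client while the pool parameter does not exist.

```shell
#!/bin/sh
# Hypothetical helper (not the test framework's own code): poll a command
# once per second until its output equals the wanted string or the
# timeout expires.
wait_for_output() {
    wanted=$1      # expected output; '' means "pool exists and is empty"
    timeout=$2     # seconds to keep polling
    shift 2        # remaining arguments are the command to poll
    elapsed=0
    while :; do
        # Mirror the logged command: a failing command yields "foo".
        got=$("$@" 2>/dev/null || echo foo)
        [ "$got" = "$wanted" ] && return 0
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "Update not seen after ${timeout}s: wanted '$wanted' got '$got'"
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

      On a client this could be invoked as: wait_for_output '' 90 lctl get_param -n 'lov.lustre-*.pools.test_0b'. The pattern is quoted so that lctl, not the shell, expands the wildcard.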
      

      We see some errors in the Client 1 (vm1) console log:

      [  379.473969] Lustre: DEBUG MARKER: == sanity-flr test 0b: lfs mirror create plain layout mirrors ======================================== 00:37:49 (1544488669)
      [  380.882615] LustreError: 10433:0:(obd_config.c:1264:class_process_config()) no device for: lustre-clilov-000000007c485d00
      [  380.883815] Lustre: 10433:0:(obd_config.c:1351:class_process_config()) Ignoring error -22 on optional command 0xce020
      [  390.562916] Lustre: DEBUG MARKER: lctl get_param -n lov.lustre-*.pools.test_0b 2>/dev/null || echo foo
      [  390.574202] Lustre: DEBUG MARKER: lctl get_param -n lov.lustre-*.pools.test_0b 2>/dev/null || echo foo
      [  391.586752] Lustre: DEBUG MARKER: lctl get_param -n lov.lustre-*.pools.test_0b 2>/dev/null || echo foo
      [  392.598782] Lustre: DEBUG MARKER: lctl get_param -n lov.lustre-*.pools.test_0b 2>/dev/null || echo foo
      [  393.610433] Lustre: DEBUG MARKER: lctl get_param -n lov.lustre-*.pools.test_0b 2>/dev/null || echo foo
      [  394.622561] Lustre: DEBUG MARKER: lctl get_param -n lov.lustre-*.pools.test_0b 2>/dev/null || echo foo
      [  395.471515] Lustre: lustre-OST0000-osc-        (ptrval): disconnect after 22s idle
      [  395.472390] Lustre: Skipped 5 previous similar messages
      [  395.634478] Lustre: DEBUG MARKER: lctl get_param -n lov.lustre-*.pools.test_0b 2>/dev/null || echo foo
      

      We see similar errors for the same test in another test session, but the failure message is different: 'destroy pool failed lustre.test_0b'.

      In the Client 2 (vm9) dmesg log for https://testing.whamcloud.com/test_sets/147f0c12-e2f4-11e8-b67f-52540065bddc we see:

      [  210.247534] Lustre: DEBUG MARKER: == sanity-flr test 0b: lfs mirror create plain layout mirrors ======================================== 09:05:40 (1541495140)
      [  217.367773] Lustre: DEBUG MARKER: lctl get_param -n lov.lustre-*.pools.test_0b 2>/dev/null || echo foo
      [  217.379702] Lustre: DEBUG MARKER: lctl get_param -n lov.lustre-*.pools.test_0b 2>/dev/null || echo foo
      [  225.070588] Lustre: lustre-OST0000-osc-        (ptrval): disconnect after 21s idle
      [  228.360951] Lustre: DEBUG MARKER: lctl get_param -n lov.lustre-*.pools.test_0b | sort -u | tr '\n' ' ' 
      [  228.372621] Lustre: DEBUG MARKER: lctl get_param -n lov.lustre-*.pools.test_0b | sort -u | tr '\n' ' ' 
      [  250.803910] random: crng init done
      [  260.423743] LustreError: 10730:0:(obd_config.c:1264:class_process_config()) no device for: lustre-clilov-00000000b93a97e7
      [  260.425715] Lustre: 10730:0:(obd_config.c:1351:class_process_config()) Ignoring error -22 on optional command 0xce022
      [  270.916901] LustreError: 10742:0:(obd_config.c:1264:class_process_config()) no device for: lustre-clilov-00000000b93a97e7
      [  270.918884] LustreError: 10742:0:(obd_config.c:1264:class_process_config()) Skipped 1 previous similar message
      [  270.920584] Lustre: 10742:0:(obd_config.c:1351:class_process_config()) Ignoring error -22 on optional command 0xce022
      [  270.922396] Lustre: 10742:0:(obd_config.c:1351:class_process_config()) Skipped 1 previous similar message
      [  285.248874] LustreError: 10765:0:(obd_config.c:1264:class_process_config()) no device for: lustre-clilov-00000000b93a97e7
      [  285.250873] Lustre: 10765:0:(obd_config.c:1351:class_process_config()) Ignoring error -22 on optional command 0xce022
      [  290.469224] Lustre: DEBUG MARKER: lctl get_param -n lov.lustre-*.pools.test_0b 2>/dev/null || echo foo
      

            People

              simmonsja James A Simmons
              adilger Andreas Dilger