  Lustre / LU-17951

sanity-pfl test_20a: test_20a returned 1

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.15.5
    • Severity: 3

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/2fed3bdb-f743-4311-ab8d-21385e3f3935

      test_20a failed with the following error:

      test_20a returned 1
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-b2_15/88 - 4.18.0-513.5.1.el8_9.x86_64
      servers: https://build.whamcloud.com/job/lustre-b2_15/88 - 4.18.0-513.24.1.el8_lustre.x86_64

      <<Please provide additional information about the failure here>>
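      For context, per the test summary in the OSS dmesg below ("Test out of space, spillover to defined component"), test_20a creates a composite (PFL) file, drives the OST backing its first component out of space, and checks that further writes land in the next defined component instead of failing with ENOSPC. A rough sketch of that kind of layout and trigger is below; the mount point, file name, and component sizes are illustrative, not the exact values used by sanity-pfl test_20a:

      # Illustrative two-component PFL layout (not the exact layout used by the
      # test): first 64 MiB on a single OST, the remainder striped over all OSTs.
      lfs setstripe -E 64M -c 1 -E -1 -c -1 /mnt/lustre/pfl_spill_demo

      # Writing past 64 MiB instantiates the second component; in the test the
      # first component's OST is also filled up to force the spillover case.
      dd if=/dev/zero of=/mnt/lustre/pfl_spill_demo bs=1M count=128

      # Show which components were instantiated and on which OSTs.
      lfs getstripe /mnt/lustre/pfl_spill_demo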

      OSS dmesg

      [ 2055.927580] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == sanity-pfl test 20a: Test out of space, spillover to defined component ========================================================== 18:22:28 \(1718043748\)
      [ 2056.103186] Lustre: DEBUG MARKER: == sanity-pfl test 20a: Test out of space, spillover to defined component ========================================================== 18:22:28 (1718043748)
      [ 2108.977341] Lustre: 38610:0:(service.c:2333:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (135/20s); client may timeout  req@00000000f5f1a170 x1801497925444160/t4294967830(0) o6->lustre-MDT0000-mdtlov_UUID@10.240.44.184@tcp:512/0 lens 544/432 e 0 to 0 dl 1718043782 ref 1 fl Complete:/0/0 rc 0/0 job:'osp-syn-0-0.0'
      [ 2145.352426] Lustre: 41610:0:(service.c:2333:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (135/36s); client may timeout  req@00000000ea5982df x1801497925455424/t4294967939(0) o6->lustre-MDT0000-mdtlov_UUID@10.240.44.184@tcp:532/0 lens 544/432 e 0 to 0 dl 1718043802 ref 1 fl Complete:/0/0 rc 0/0 job:'osp-syn-6-0.0'
      [ 2213.339841] Lustre: ll_ost_io00_009: service thread pid 47073 was inactive for 67.559 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
      [ 2213.339856] Pid: 18190, comm: ll_ost_io00_002 4.18.0-513.24.1.el8_lustre.x86_64 #1 SMP Thu May 30 22:41:54 UTC 2024
      [ 2213.343031] Lustre: Skipped 1 previous similar message
      [ 2213.344714] Call Trace TBD:
      [ 2213.344809] [<0>] cv_wait_common+0xaf/0x130 [spl]
      [ 2213.346892] [<0>] txg_wait_synced_impl+0xc6/0x110 [zfs]
      [ 2213.347874] [<0>] txg_wait_synced+0xc/0x40 [zfs]
      [ 2213.348759] [<0>] dmu_tx_wait+0x1e4/0x3f0 [zfs]
      [ 2213.349591] [<0>] dmu_tx_assign+0x157/0x4d0 [zfs]
      [ 2213.350444] [<0>] osd_trans_start+0x1b1/0x430 [osd_zfs]
      [ 2213.351339] [<0>] ofd_write_attr_set+0x13b/0x1020 [ofd]
      [ 2213.352315] [<0>] ofd_commitrw_write+0x226/0x1ad0 [ofd]
      [ 2213.353201] [<0>] ofd_commitrw+0x5b4/0xd20 [ofd]
      [ 2213.353995] [<0>] obd_commitrw+0x1b6/0x370 [ptlrpc]
      [ 2213.355250] [<0>] tgt_brw_write+0x1374/0x1cb0 [ptlrpc]
      [ 2213.356188] [<0>] tgt_request_handle+0xccd/0x1a20 [ptlrpc]
      [ 2213.357172] [<0>] ptlrpc_server_handle_request+0x323/0xbe0 [ptlrpc]
      [ 2213.358291] [<0>] ptlrpc_main+0xbec/0x1530 [ptlrpc]
      [ 2213.359182] [<0>] kthread+0x134/0x150
      [ 2213.359822] [<0>] ret_from_fork+0x35/0x40
      [ 2213.360505] Pid: 47073, comm: ll_ost_io00_009 4.18.0-513.24.1.el8_lustre.x86_64 #1 SMP Thu May 30 22:41:54 UTC 2024
      [ 2213.362230] Call Trace TBD:
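      The dumped stack shows the OST I/O thread blocked in txg_wait_synced() via dmu_tx_wait()/dmu_tx_assign(), i.e. the bulk write cannot assign its ZFS transaction to a transaction group until the current one finishes syncing to disk. If this reproduces, txg sync times and dmu_tx throttling counters on the OSS can be sampled while the ll_ost_io threads are stalled; a rough sketch, with the pool name as a placeholder:

      # Placeholder pool name; substitute the OST zpool actually in use.
      POOL=lustre-ost0
      cat /proc/spl/kstat/zfs/dmu_tx       # dmu_tx assign/delay counters
      cat /proc/spl/kstat/zfs/$POOL/txgs   # per-txg open/quiesce/sync times
      zpool iostat -v "$POOL" 5 6          # device throughput/latency, 6 samples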
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity-pfl test_20a - test_20a returned 1

          Activity

            adilger Andreas Dilger added a comment - This has been hit a few times in the past 6 months with ZFS, on master as well. It looks like the server is stuck trying to write to the storage, so it could be caused by high disk usage on the VM host.
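
            If VM-host disk contention is the suspect, device utilization on the OSS during the stall window is one way to narrow it down; a quick sketch using standard sysstat tools (the sampling interval is arbitrary):

            # Sustained %util near 100 with large await on the OST backing device
            # during the hang would point at the storage or a busy VM host rather
            # than at Lustre itself.
            iostat -x 5
            # Per-task I/O accounting; useful to see whether the ZFS txg_sync
            # thread is the one waiting on the device.
            pidstat -d 5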

            People

              Assignee: WC Triage
              Reporter: Maloo
              Votes: 0
              Watchers: 3
