Lustre / LU-15344

replay-single test_68: 2nd cp failed 1

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • 3

    Description

      This issue was created by maloo for Sergey Cheremencev <c17829@cray.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/765157db-a02f-4c7c-9047-6bafbeb8771f

      test_68 failed with the following error:

      2nd cp failed 1
      

      The second cp failed with EIO (Input/output error):

      cp: error writing '/mnt/lustre/d68.replay-single/f68.replay-single_2': Input/output error
       replay-single test_68: @@@@@@ FAIL: 2nd cp failed 1 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:6336:error()
        = /usr/lib64/lustre/tests/replay-single.sh:2031:test_68()
        = /usr/lib64/lustre/tests/test-framework.sh:6640:run_one()
        = /usr/lib64/lustre/tests/test-framework.sh:6687:run_one_logged()
        = /usr/lib64/lustre/tests/test-framework.sh:6528:run_test()
        = /usr/lib64/lustre/tests/replay-single.sh:2039:main()
      

      Client1 was evicted by lustre-OST0000 after the lock callback timer expired (20s):

      OST0:
      [ 5530.395758] Lustre: DEBUG MARKER: echo 20 >> /sys/module/ptlrpc/parameters/ldlm_enqueue_min
      [ 5550.908551] LustreError: 22440:0:(ldlm_lockd.c:259:expired_lock_main()) ### lock callback timer expired after 20s: evicting client at 10.240.29.77@tcp  ns: filter-lustre-OST0000_UUID lock: 00000000ff793f75/0x5f354f68b67d3784 lrc: 3/0,0 mode: PR/PR res: [0x300000400:0xa3:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) gid 0 flags: 0x60000400000020 nid: 10.240.29.77@tcp remote: 0x56b69374bce0b1db expref: 10 pid: 122059 timeout: 5552 lvb_type: 1
      [ 5556.416584] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  replay-single test_68: @@@@@@ FAIL: 2nd cp failed 1 
      client1:
      [ 5637.164451] Lustre: lustre-OST0000-osc-ffff969b66b59000: Connection to lustre-OST0000 (at 10.240.28.228@tcp) was lost; in progress operations using this service will wait for recovery to complete
      [ 5637.167860] Lustre: Skipped 7 previous similar messages
      [ 5637.169419] LustreError: 167-0: lustre-OST0000-osc-ffff969b66b59000: This client was evicted by lustre-OST0000; in progress operations using this service will fail.
      [ 5637.172217] LustreError: Skipped 1 previous similar message
      [ 5637.173588] Lustre: 7849:0:(llite_lib.c:3356:ll_dirty_page_discard_warn()) lustre: dirty page discard: 10.240.30.129@tcp:/lustre/fid: [0x2c0000405:0x2:0x0]// may get corrupted (rc -108)
      [ 5637.177005] LustreError: 218205:0:(ldlm_resource.c:1124:ldlm_resource_complain()) lustre-OST0000-osc-ffff969b66b59000: namespace resource [0x300000400:0xa3:0x0].0x0 (000000009ffb33da) refcount nonzero (2) after lock cleanup; forcing cleanup.
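
For triage, the relationship between the quoted log lines can be checked mechanically. The following is a minimal sketch, assuming dmesg-style lines in the format quoted above (the OST eviction line is truncated after its `type:` field for brevity). The helper `summarize` and the regex names are hypothetical and are not part of the Lustre test framework. It extracts the value written to `ldlm_enqueue_min`, the callback timer reported in the eviction message, the time elapsed between the two messages, the evicted client NID, and whether the resource named on the OST matches the one the client complains about.

```python
import re

# Copies of the log lines quoted above; the eviction line is truncated after "type: EXT".
OST_LOG = """\
[ 5530.395758] Lustre: DEBUG MARKER: echo 20 >> /sys/module/ptlrpc/parameters/ldlm_enqueue_min
[ 5550.908551] LustreError: 22440:0:(ldlm_lockd.c:259:expired_lock_main()) ### lock callback timer expired after 20s: evicting client at 10.240.29.77@tcp  ns: filter-lustre-OST0000_UUID lock: 00000000ff793f75/0x5f354f68b67d3784 lrc: 3/0,0 mode: PR/PR res: [0x300000400:0xa3:0x0].0x0 rrc: 3 type: EXT
"""
CLIENT_LOG = """\
[ 5637.177005] LustreError: 218205:0:(ldlm_resource.c:1124:ldlm_resource_complain()) lustre-OST0000-osc-ffff969b66b59000: namespace resource [0x300000400:0xa3:0x0].0x0 (000000009ffb33da) refcount nonzero (2) after lock cleanup; forcing cleanup.
"""

# Hypothetical patterns written against the message formats shown in this ticket.
MARKER_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\].*echo (?P<val>\d+) >> \S*ldlm_enqueue_min"
)
EVICT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\].*lock callback timer expired after (?P<secs>\d+)s: "
    r"evicting client at (?P<nid>\S+).*?res: (?P<res>\[[^\]]+\])"
)
COMPLAIN_RE = re.compile(r"namespace resource (?P<res>\[[^\]]+\])")


def summarize(ost_log: str, client_log: str) -> dict:
    """Collect the fields needed to relate the OST eviction to the client log."""
    marker = MARKER_RE.search(ost_log)
    evict = EVICT_RE.search(ost_log)
    complain = COMPLAIN_RE.search(client_log)
    if not (marker and evict and complain):
        raise ValueError("expected log lines not found")
    return {
        "enqueue_min": int(marker["val"]),
        "timer_secs": int(evict["secs"]),
        # Seconds between the DEBUG MARKER and the eviction message on the OST.
        "elapsed": round(float(evict["ts"]) - float(marker["ts"]), 3),
        "client_nid": evict["nid"],
        # True when the OST lock resource equals the one in the client complaint.
        "same_resource": evict["res"] == complain["res"],
    }


if __name__ == "__main__":
    for key, value in summarize(OST_LOG, CLIENT_LOG).items():
        print(f"{key}: {value}")
```

On the lines quoted in this ticket, the eviction message follows the `ldlm_enqueue_min` marker by about 20.5 seconds, and both sides name the same resource `[0x300000400:0xa3:0x0]`. The sketch only reports these values; it does not establish why the callback was not answered in time.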
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      replay-single test_68 - 2nd cp failed 1

          People

            Assignee: WC Triage (wc-triage)
            Reporter: Maloo (maloo)