Lustre / LU-3205

Interop 2.1.5<->2.4 failure on test suite sanity test_24u

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Lustre 2.4.0
    • Lustre 2.4.0
    • 3
    • 7825

    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: http://maloo.whamcloud.com/test_sets/ec0b91f2-a6bc-11e2-9b48-52540035b04c.

      The sub-test test_24u failed with the following error:

      test failed to respond and timed out

      Info required for matching: sanity 24u

        Activity

          pjones Peter Jones added a comment -

          Landed for 2.4

          jay Jinshan Xiong (Inactive) added a comment -

          patch is at: http://review.whamcloud.com/6137

          jay Jinshan Xiong (Inactive) added a comment -

          This problem is due to a compatibility check. I'll work out a fix.

          adilger Andreas Dilger added a comment -

          It looks like multiop got stuck on the client during the write trying to get cl_lock:

          Apr 15 03:04:25 client-24vm2 kernel: multiop       R  running task        0 14143  13996 0x00000080
          Apr 15 03:04:25 client-24vm2 kernel: ffff88006fbe3c58 0000000000000086 0000000000000052 0000000100000020
          Apr 15 03:04:25 client-24vm2 kernel: 516bd0a700000000 00000000000d14d1 0000373f00000000 000005f400000000
          Apr 15 03:04:25 client-24vm2 kernel: ffff88007c3b25f8 ffff88006fbe3fd8 000000000000fb88 ffff88007c3b2600
          Apr 15 03:04:25 client-24vm2 kernel: Call Trace:
          Apr 15 03:04:25 client-24vm2 kernel: [<ffffffff81064d6a>] __cond_resched+0x2a/0x40
          Apr 15 03:04:25 client-24vm2 kernel: [<ffffffff8150e320>] _cond_resched+0x30/0x40
          Apr 15 03:04:25 client-24vm2 kernel: [<ffffffff8150ee4e>] mutex_lock+0x1e/0x50
          Apr 15 03:04:25 client-24vm2 kernel: [<ffffffffa0590e6f>] cl_lock_mutex_get+0x6f/0xd0 [obdclass]
          Apr 15 03:04:25 client-24vm2 kernel: [<ffffffffa05937a9>] cl_wait+0x39/0x250 [obdclass]
          Apr 15 03:04:25 client-24vm2 kernel: [<ffffffffa0599cc5>] cl_io_lock+0x485/0x560 [obdclass]
          Apr 15 03:04:25 client-24vm2 kernel: [<ffffffffa0599e42>] cl_io_loop+0xa2/0x1b0 [obdclass]
          Apr 15 03:04:25 client-24vm2 kernel: [<ffffffffa0a6f7f0>] ll_file_io_generic+0x450/0x600 [lustre]
          Apr 15 03:04:25 client-24vm2 kernel: [<ffffffffa0a70c12>] ll_file_aio_write+0x142/0x2c0 [lustre]
          Apr 15 03:04:25 client-24vm2 kernel: [<ffffffffa0a70efc>] ll_file_write+0x16c/0x2a0 [lustre]
          Apr 15 03:04:25 client-24vm2 kernel: [<ffffffff81181078>] vfs_write+0xb8/0x1a0
          Apr 15 03:04:25 client-24vm2 kernel: [<ffffffff81181971>] sys_write+0x51/0x90
          

          Sarah, can you please submit another test run with this config to see if this problem will repeat?


          People

            Assignee: jay Jinshan Xiong (Inactive)
            Reporter: maloo Maloo
            Votes: 0
            Watchers: 6
