Lustre / LU-16261

sanity test_244b: Timeout occurred after 245 minutes, last suite running was sanity


Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • 3

    Description

      This issue was created by maloo for Chris Horn <chris.horn@hpe.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/3f6fa410-0bcb-4668-9c54-750275ee2044

      test_244b failed with the following error:

      Timeout occurred after 245 minutes, last suite running was sanity
      

      Client reports stuck threads:

      [Sat Oct 22 01:26:58 2022] Lustre: DEBUG MARKER: == sanity test 244b: multi-threaded write with group lock ========================================================== 01:26:59 (1666402019)
      [Sat Oct 22 01:30:46 2022] INFO: task multiop:941148 blocked for more than 120 seconds.
      [Sat Oct 22 01:30:46 2022]       Tainted: G        W  OE    --------- -  - 4.18.0-348.7.1.el8_5.x86_64 #1
      [Sat Oct 22 01:30:46 2022] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [Sat Oct 22 01:30:46 2022] task:multiop         state:D stack:    0 pid:941148 ppid:940902 flags:0x00004080
      [Sat Oct 22 01:30:46 2022] Call Trace:
      [Sat Oct 22 01:30:46 2022]  __schedule+0x2bd/0x760
      [Sat Oct 22 01:30:46 2022]  ? vsnprintf+0x297/0x520
      [Sat Oct 22 01:30:46 2022]  schedule+0x37/0xa0
      [Sat Oct 22 01:30:46 2022]  schedule_preempt_disabled+0xa/0x10
      [Sat Oct 22 01:30:46 2022]  __mutex_lock.isra.6+0x2b5/0x4a0
      [Sat Oct 22 01:30:46 2022]  ll_layout_refresh+0x19b/0x330 [lustre]
      [Sat Oct 22 01:30:46 2022]  vvp_io_init+0x22a/0x370 [lustre]
      [Sat Oct 22 01:30:46 2022]  __cl_io_init.isra.14+0x86/0x150 [obdclass]
      [Sat Oct 22 01:30:46 2022]  ll_file_io_generic+0x388/0xd70 [lustre]
      [Sat Oct 22 01:30:46 2022]  ll_file_write_iter+0x64b/0x8a0 [lustre]
      [Sat Oct 22 01:30:46 2022]  new_sync_write+0x112/0x160
      [Sat Oct 22 01:30:46 2022]  vfs_write+0xa5/0x1a0
      [Sat Oct 22 01:30:46 2022]  ksys_write+0x4f/0xb0
      [Sat Oct 22 01:30:46 2022]  do_syscall_64+0x5b/0x1a0
      [Sat Oct 22 01:30:46 2022]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      [Sat Oct 22 01:30:46 2022] RIP: 0033:0x7fb6fffd0915
      [Sat Oct 22 01:30:46 2022] Code: Unable to access opcode bytes at RIP 0x7fb6fffd08eb.
      [Sat Oct 22 01:30:46 2022] RSP: 002b:00007ffcc51be938 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [Sat Oct 22 01:30:46 2022] RAX: ffffffffffffffda RBX: 00007ffcc51bfe6e RCX: 00007fb6fffd0915
      [Sat Oct 22 01:30:46 2022] RDX: 0000000000000001 RSI: 0000000001280000 RDI: 0000000000000003
      [Sat Oct 22 01:30:46 2022] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000003
      [Sat Oct 22 01:30:46 2022] R10: 000000000000000f R11: 0000000000000246 R12: 0000000000403348
      [Sat Oct 22 01:30:46 2022] R13: 0000000000000001 R14: 0000000000000003 R15: 0000000000604a60
      [Sat Oct 22 01:30:46 2022] INFO: task multiop:941150 blocked for more than 120 seconds.
      [Sat Oct 22 01:30:46 2022]       Tainted: G        W  OE    --------- -  - 4.18.0-348.7.1.el8_5.x86_64 #1
      [Sat Oct 22 01:30:46 2022] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [Sat Oct 22 01:30:46 2022] task:multiop         state:D stack:    0 pid:941150 ppid:940902 flags:0x00004080
      [Sat Oct 22 01:30:46 2022] Call Trace:
      [Sat Oct 22 01:30:46 2022]  __schedule+0x2bd/0x760
      [Sat Oct 22 01:30:46 2022]  ? lprocfs_counter_add+0xd2/0x140 [obdclass]
      [Sat Oct 22 01:30:46 2022]  schedule+0x37/0xa0
      [Sat Oct 22 01:30:46 2022]  schedule_preempt_disabled+0xa/0x10
      [Sat Oct 22 01:30:46 2022]  __mutex_lock.isra.6+0x2b5/0x4a0
      [Sat Oct 22 01:30:46 2022]  ll_layout_refresh+0x19b/0x330 [lustre]
      [Sat Oct 22 01:30:46 2022]  vvp_io_init+0x22a/0x370 [lustre]
      [Sat Oct 22 01:30:46 2022]  __cl_io_init.isra.14+0x86/0x150 [obdclass]
      [Sat Oct 22 01:30:46 2022]  cl_get_grouplock+0xd2/0x200 [lustre]
      [Sat Oct 22 01:30:46 2022]  ll_get_grouplock+0x378/0x700 [lustre]
      [Sat Oct 22 01:30:46 2022]  ll_file_ioctl+0x7f4/0x4480 [lustre]
      [Sat Oct 22 01:30:46 2022]  ? __handle_mm_fault+0x4d5/0x820
      [Sat Oct 22 01:30:46 2022]  do_vfs_ioctl+0xa4/0x680
      [Sat Oct 22 01:30:46 2022]  ? handle_mm_fault+0xbe/0x1e0
      [Sat Oct 22 01:30:46 2022]  ? syscall_trace_enter+0x1d3/0x2c0
      [Sat Oct 22 01:30:46 2022]  ksys_ioctl+0x60/0x90
      [Sat Oct 22 01:30:46 2022]  __x64_sys_ioctl+0x16/0x20
      [Sat Oct 22 01:30:46 2022]  do_syscall_64+0x5b/0x1a0
      [Sat Oct 22 01:30:46 2022]  entry_SYSCALL_64_after_hwframe+0x65/0xca
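
      Both reports above stall in __mutex_lock called from ll_layout_refresh: PID 941148 arrives via the write path (ll_file_write_iter) and PID 941150 via the group-lock ioctl path (ll_get_grouplock). For triage on longer console logs, a minimal Python sketch that groups hung-task reports by their innermost module frame might look like the following. It assumes the dmesg line format shown in this ticket; parse_hung_tasks and summarize are hypothetical helper names, not part of Lustre or Maloo.

```python
import re
from collections import defaultdict

# Matches e.g. "INFO: task multiop:941148 blocked for more than 120 seconds."
TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")
# Matches a stack frame such as "]  ll_layout_refresh+0x19b/0x330 [lustre]".
# Group 1 is set for "? " frames, which the kernel marks as unreliable.
FRAME_RE = re.compile(
    r"\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)

# Excerpt copied from the console log in this ticket (frames abbreviated).
SAMPLE = """
[Sat Oct 22 01:30:46 2022] INFO: task multiop:941148 blocked for more than 120 seconds.
[Sat Oct 22 01:30:46 2022] Call Trace:
[Sat Oct 22 01:30:46 2022]  __schedule+0x2bd/0x760
[Sat Oct 22 01:30:46 2022]  ? vsnprintf+0x297/0x520
[Sat Oct 22 01:30:46 2022]  __mutex_lock.isra.6+0x2b5/0x4a0
[Sat Oct 22 01:30:46 2022]  ll_layout_refresh+0x19b/0x330 [lustre]
[Sat Oct 22 01:30:46 2022]  ll_file_write_iter+0x64b/0x8a0 [lustre]
[Sat Oct 22 01:30:46 2022] INFO: task multiop:941150 blocked for more than 120 seconds.
[Sat Oct 22 01:30:46 2022] Call Trace:
[Sat Oct 22 01:30:46 2022]  __schedule+0x2bd/0x760
[Sat Oct 22 01:30:46 2022]  ? lprocfs_counter_add+0xd2/0x140 [obdclass]
[Sat Oct 22 01:30:46 2022]  __mutex_lock.isra.6+0x2b5/0x4a0
[Sat Oct 22 01:30:46 2022]  ll_layout_refresh+0x19b/0x330 [lustre]
[Sat Oct 22 01:30:46 2022]  cl_get_grouplock+0xd2/0x200 [lustre]
"""


def parse_hung_tasks(log_text):
    """Return one dict per hung-task report: comm, pid, and reliable frames."""
    reports = []
    current = None
    for line in log_text.splitlines():
        task = TASK_RE.search(line)
        if task:
            current = {"comm": task.group(1), "pid": int(task.group(2)), "frames": []}
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        # Skip frames outside a report and frames prefixed with "?".
        if frame and current is not None and not frame.group(1):
            current["frames"].append((frame.group(2), frame.group(3)))
    return reports


def summarize(reports):
    """Group PIDs by the innermost reliable frame that belongs to a module."""
    by_frame = defaultdict(list)
    for report in reports:
        module_frames = [sym for sym, mod in report["frames"] if mod]
        key = module_frames[0] if module_frames else "<no module frame>"
        by_frame[key].append(report["pid"])
    return dict(by_frame)


if __name__ == "__main__":
    print(summarize(parse_hung_tasks(SAMPLE)))
```

      On the excerpt above, both PIDs are grouped under ll_layout_refresh, which matches the observation that the two multiop threads wait on the same mutex from different entry points. The sketch only reads the log; it does not identify which task holds the mutex.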
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_244b - Timeout occurred after 245 minutes, last suite running was sanity


          People

            Assignee: WC Triage (wc-triage)
            Reporter: Maloo (maloo)