Lustre / LU-10800

Mount hangs on clients.


Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Affects Version: Lustre 2.11.0
    • Fix Version: Lustre 2.11.0
    • Environment: Soak stress cluster, lustre-master-next-ib build 1
    • 9223372036854775807

Description

Mounts frequently hang on clients.

      Mar  9 18:14:41 soak-36 kernel: INFO: task mount.lustre:2807 blocked for more than 120 seconds.
      Mar  9 18:14:41 soak-36 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      Mar  9 18:14:41 soak-36 kernel: mount.lustre    D ffff88085b7a0000     0  2807   2806 0x00000080
      Mar  9 18:14:41 soak-36 kernel: Call Trace:
      Mar  9 18:14:41 soak-36 kernel: [<ffffffff816ab8a9>] schedule+0x29/0x70
      Mar  9 18:14:41 soak-36 kernel: [<ffffffff816a92b9>] schedule_timeout+0x239/0x2c0
      Mar  9 18:14:41 soak-36 kernel: [<ffffffff81050b4c>] ? native_smp_send_reschedule+0x4c/0x70
      Mar  9 18:14:41 soak-36 kernel: [<ffffffff810c2358>] ? resched_curr+0xa8/0xc0
      Mar  9 18:14:41 soak-36 kernel: [<ffffffff810c30d8>] ? check_preempt_curr+0x78/0xa0
      Mar  9 18:14:41 soak-36 kernel: [<ffffffff810c3119>] ? ttwu_do_wakeup+0x19/0xd0
      Mar  9 18:14:41 soak-36 kernel: [<ffffffff816abc5d>] wait_for_completion+0xfd/0x140
      Mar  9 18:14:42 soak-36 kernel: [<ffffffff810c6620>] ? wake_up_state+0x20/0x20
      Mar  9 18:14:42 soak-36 kernel: [<ffffffffc0b28854>] llog_process_or_fork+0x244/0x450 [obdclass]
      Mar  9 18:14:42 soak-36 kernel: [<ffffffffc0b28a74>] llog_process+0x14/0x20 [obdclass]
      Mar  9 18:14:42 soak-36 kernel: [<ffffffffc0b5b1c5>] class_config_parse_llog+0x125/0x350 [obdclass]
      Mar  9 18:14:42 soak-36 kernel: [<ffffffffc06501c8>] mgc_process_cfg_log+0x788/0xc40 [mgc]
      Mar  9 18:14:42 soak-36 kernel: [<ffffffffc0652243>] mgc_process_log+0x3d3/0x8b0 [mgc]
      Mar  9 18:14:42 soak-36 kernel: [<ffffffffc0b63240>] ? class_config_dump_handler+0x7e0/0x7e0 [obdclass]
      Mar  9 18:14:42 soak-36 kernel: [<ffffffffc0652968>] ? do_config_log_add+0x248/0x580 [mgc]
      Mar  9 18:14:42 soak-36 kernel: [<ffffffffc0653840>] mgc_process_config+0x890/0x13f0 [mgc]
      Mar  9 18:14:42 soak-36 kernel: [<ffffffffc0b66c85>] lustre_process_log+0x2d5/0xae0 [obdclass]
      Mar  9 18:14:42 soak-36 kernel: [<ffffffffc0855e27>] ? libcfs_debug_msg+0x57/0x80 [libcfs]
      Mar  9 18:14:43 soak-36 kernel: [<ffffffffc0f3e3bb>] ll_fill_super+0x45b/0x1100 [lustre]
      Mar  9 18:14:43 soak-36 kernel: [<ffffffffc0b6caa6>] lustre_fill_super+0x286/0x910 [obdclass]
      Mar  9 18:14:43 soak-36 kernel: [<ffffffffc0b6c820>] ? lustre_common_put_super+0x270/0x270 [obdclass]
      Mar  9 18:14:43 soak-36 kernel: [<ffffffff81206abd>] mount_nodev+0x4d/0xb0
      Mar  9 18:14:43 soak-36 kernel: [<ffffffffc0b64ab8>] lustre_mount+0x38/0x60 [obdclass]
      Mar  9 18:14:43 soak-36 kernel: [<ffffffff81207549>] mount_fs+0x39/0x1b0
      Mar  9 18:14:43 soak-36 kernel: [<ffffffff81224177>] vfs_kern_mount+0x67/0x110
      Mar  9 18:14:43 soak-36 kernel: [<ffffffff81226683>] do_mount+0x233/0xaf0
      Mar  9 18:14:43 soak-36 kernel: [<ffffffff811894ee>] ? __get_free_pages+0xe/0x40
      Mar  9 18:14:43 soak-36 kernel: [<ffffffff812272c6>] SyS_mount+0x96/0xf0
      Mar  9 18:14:43 soak-36 kernel: [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
      Mar  9 18:16:43 soak-36 kernel: INFO: task mount.lustre:2807 blocked for more than 120 seconds.
      Mar  9 18:16:43 soak-36 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      

I dumped the Lustre debug log during the hang (attached). I also crash-dumped the client; the dump files are available on soak.

Attachments

Issue Links

Activity

People

    Assignee: Amir Shehata (ashehata) (Inactive)
    Reporter: Cliff White (cliffw) (Inactive)
