  Lustre / LU-5469

Intermittent Clients hang during IO

Details

    • Bug
    • Resolution: Cannot Reproduce
    • Critical
    • None
    • Lustre 2.4.3
    • None
    • Clients: SLES 11 SP3, Lustre 2.4.3
      Servers: CentOS, Lustre 2.4.3
    • 3
    • 15236

    Description

      Clients hang intermittently during IO. This is a typical trace for a hung process:

      [287554.530174] ld              S ffff880989104a30     0 17536      1 0x00000000
      [287554.551603]  ffff88096e0759a8 0000000000000082 ffff88096e074010 0000000000011800
      [287554.574176]  0000000000011800 0000000000011800 0000000000011800 ffff88096e075fd8
      [287554.596747]  ffff88096e075fd8 0000000000011800 ffff8809de16c640 ffff8802733aa040
      [287554.619319] Call Trace:
      [287554.626926]  [<ffffffffa08916a5>] cl_sync_io_wait+0x365/0x450 [obdclass]
      [287554.647277]  [<ffffffffa0cc9169>] vvp_page_sync_io+0x59/0x120 [lustre]
      [287554.667083]  [<ffffffffa0cc9731>] vvp_io_commit_write+0x501/0x640 [lustre]
      [287554.687953]  [<ffffffffa0891ebc>] cl_io_commit_write+0x9c/0x1d0 [obdclass]
      [287554.708816]  [<ffffffffa0c9df14>] ll_commit_write+0x104/0x1f0 [lustre]
      [287554.728608]  [<ffffffffa0cb6fda>] ll_write_end+0x2a/0x60 [lustre]
      [287554.747098]  [<ffffffff810f83e2>] generic_perform_write+0x122/0x1c0
      [287554.766095]  [<ffffffff810f84e1>] generic_file_buffered_write+0x61/0xa0
      [287554.786125]  [<ffffffff810fb476>] __generic_file_aio_write+0x296/0x490
      [287554.805895]  [<ffffffff810fb6bc>] generic_file_aio_write+0x4c/0xb0
      [287554.824646]  [<ffffffffa0ccc111>] vvp_io_write_start+0xc1/0x2e0 [lustre]
      [287554.844973]  [<ffffffffa088ea09>] cl_io_start+0x69/0x140 [obdclass]
      [287554.864028]  [<ffffffffa0892dc3>] cl_io_loop+0xa3/0x190 [obdclass]
      [287554.882804]  [<ffffffffa0c71d91>] ll_file_io_generic+0x461/0x600 [lustre]
      [287554.903365]  [<ffffffffa0c72166>] ll_file_aio_write+0x236/0x290 [lustre]
      [287554.923680]  [<ffffffffa0c73373>] ll_file_write+0x203/0x290 [lustre]
      [287554.942941]  [<ffffffff8115b03e>] vfs_write+0xce/0x140
      [287554.958555]  [<ffffffff8115b1b3>] sys_write+0x53/0xa0
      [287554.973915]  [<ffffffff81479c92>] system_call_fastpath+0x16/0x1b
      [287554.992131]  [<00007ffe3f20e0b0>] 0x7ffe3f20e0af
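
When several hung processes need to be compared, it can help to reduce each trace to its symbol and module columns. The following is a minimal Python sketch and is not part of the original report. The names `FRAME_RE` and `parse_frames` are hypothetical, the label "kernel" for frames without a module tag is an assumption of this sketch, and the sample lines are copied from the trace above.

```python
import re

# Matches one symbolized stack-frame line of a Linux kernel call trace, e.g.
#   [287554.626926]  [<ffffffffa08916a5>] cl_sync_io_wait+0x365/0x450 [obdclass]
# FRAME_RE and parse_frames are illustrative names, not part of any Lustre tool.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(trace_text):
    """Return (symbol, module) pairs in trace order, innermost frame first.

    Frames without a module tag are labeled "kernel" (an assumption of this
    sketch). Lines that are not symbolized frames, such as the stack dump or
    a raw user-space address, are skipped.
    """
    frames = []
    for line in trace_text.splitlines():
        # Drop carriage-return debris from serial console captures, if present.
        line = line.replace("^M", "").rstrip("\r")
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("symbol"), match.group("module") or "kernel"))
    return frames


sample = """\
[287554.626926]  [<ffffffffa08916a5>] cl_sync_io_wait+0x365/0x450 [obdclass]
[287554.647277]  [<ffffffffa0cc9169>] vvp_page_sync_io+0x59/0x120 [lustre]
[287554.766095]  [<ffffffff810f84e1>] generic_file_buffered_write+0x61/0xa0
[287554.992131]  [<00007ffe3f20e0b0>] 0x7ffe3f20e0af
"""

frames = parse_frames(sample)
for symbol, module in frames:
    print(f"{module:10s} {symbol}")
```

For the trace in this report, the innermost symbolized frame is `cl_sync_io_wait` in `obdclass`, reached from `sys_write`, so the task appears to be sleeping inside a write call while waiting for a synchronous page IO to complete.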
      

      We don't see any errors.

      OSS trace is attached.

      Attachments

        1. service194.gz
          102 kB
          Mahmoud Hanafi
        2. service64.trace.gz
          87 kB
          Mahmoud Hanafi

            People

              Assignee: Niu Yawei [niu] (Inactive)
              Reporter: Mahmoud Hanafi [mhanafi]