Details


    Description

      It looks like LU-2576 is back again. The problem went away for a while, seemingly thanks to the patch from LU-2576. I note that a later fix for a stack overflow in LU-2859 changed the same lines where that patch was applied, so perhaps it reintroduced the problem?

      We are seeing hangs during writes on BG/Q hardware. We find tasks that appear to be stuck sleeping indefinitely here:

      2013-05-29 16:15:36.363547  sysiod        S 00000fffa85363bc     0  4111   3070 0x00000000
      2013-05-29 16:15:36.363582  Call Trace:
      2013-05-29 16:15:36.363617  [c0000002e1e12440] [c000000302e95cc8] 0xc000000302e95cc8 (unreliable)
      2013-05-29 16:15:36.363653  [c0000002e1e12610] [c000000000008de0] .__switch_to+0xc4/0x100
      2013-05-29 16:15:36.363688  [c0000002e1e126a0] [c00000000044dc68] .schedule+0x858/0x9c0
      2013-05-29 16:15:36.363723  [c0000002e1e12950] [80000000004820a0] .cfs_waitq_wait+0x10/0x30 [libcfs]
      2013-05-29 16:15:36.363758  [c0000002e1e129c0] [80000000015b3ccc] .osc_enter_cache+0xb6c/0x1410 [osc]
      2013-05-29 16:15:36.363793  [c0000002e1e12ba0] [80000000015bbf30] .osc_queue_async_io+0xcd0/0x2690 [osc]
      2013-05-29 16:15:36.363828  [c0000002e1e12db0] [8000000001598598] .osc_page_cache_add+0xf8/0x2a0 [osc]
      2013-05-29 16:15:36.363863  [c0000002e1e12e70] [8000000000a04248] .cl_page_cache_add+0xf8/0x420 [obdclass]
      2013-05-29 16:15:36.363898  [c0000002e1e12fa0] [800000000179ed28] .lov_page_cache_add+0xc8/0x340 [lov]
      2013-05-29 16:15:36.363934  [c0000002e1e13070] [8000000000a04248] .cl_page_cache_add+0xf8/0x420 [obdclass]
      2013-05-29 16:15:36.363968  [c0000002e1e131a0] [8000000001d2ac74] .vvp_io_commit_write+0x464/0x910 [lustre]
      2013-05-29 16:15:36.364003  [c0000002e1e132c0] [8000000000a1df6c] .cl_io_commit_write+0x11c/0x2d0 [obdclass]
      2013-05-29 16:15:36.364038  [c0000002e1e13380] [8000000001cebc00] .ll_commit_write+0x120/0x3e0 [lustre]
      2013-05-29 16:15:36.364074  [c0000002e1e13450] [8000000001d0f134] .ll_write_end+0x34/0x80 [lustre]
      2013-05-29 16:15:36.364109  [c0000002e1e134e0] [c000000000097238] .generic_file_buffered_write+0x1f4/0x388
      2013-05-29 16:15:36.364143  [c0000002e1e13620] [c000000000097928] .__generic_file_aio_write+0x374/0x3d8
      2013-05-29 16:15:36.364178  [c0000002e1e13720] [c000000000097a04] .generic_file_aio_write+0x78/0xe8
      2013-05-29 16:15:36.364213  [c0000002e1e137d0] [8000000001d2df00] .vvp_io_write_start+0x170/0x3b0 [lustre]
      2013-05-29 16:15:36.364248  [c0000002e1e138a0] [8000000000a1849c] .cl_io_start+0xcc/0x220 [obdclass]
      2013-05-29 16:15:36.364283  [c0000002e1e13940] [8000000000a202a4] .cl_io_loop+0x194/0x2c0 [obdclass]
      2013-05-29 16:15:36.364317  [c0000002e1e139f0] [8000000001ca0780] .ll_file_io_generic+0x4f0/0x850 [lustre]
      2013-05-29 16:15:36.364352  [c0000002e1e13b30] [8000000001ca0f64] .ll_file_aio_write+0x1d4/0x3a0 [lustre]
      2013-05-29 16:15:36.364387  [c0000002e1e13c00] [8000000001ca1280] .ll_file_write+0x150/0x320 [lustre]
      2013-05-29 16:15:36.364422  [c0000002e1e13ce0] [c0000000000d4328] .vfs_write+0xd0/0x1c4
      2013-05-29 16:15:36.364458  [c0000002e1e13d80] [c0000000000d4518] .SyS_write+0x54/0x98
      2013-05-29 16:15:36.364492  [c0000002e1e13e30] [c000000000000580] syscall_exit+0x0/0x2c
      

      This was with Lustre 2.4.0-RC1_3chaos.
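
      The pattern implicated by this backtrace is a writer that must reserve dirty-cache grant before copying data, and that sleeps until the write-completion path gives grant back. The userspace C sketch below models only that pattern; it is not the Lustre implementation, and every name in it (cache_state, grant_reserve, grant_release, DIRTY_MAX, CHUNK) is hypothetical. Its point is that the writer's sleep has no timeout, so if the completion side never runs, the writer sleeps forever, which is what the stuck sysiod task above looks like.

      /*
       * Illustrative sketch only -- NOT the Lustre implementation.  All names
       * here (cache_state, grant_reserve, grant_release, DIRTY_MAX, CHUNK) are
       * hypothetical.  It models the shape of the hang in the backtrace above:
       * a writer must reserve dirty-cache grant before copying data and sleeps,
       * with no timeout, until the write-completion path (brw_interpret() in
       * the real client) gives grant back and wakes it.
       */
      #include <pthread.h>
      #include <stdio.h>
      #include <unistd.h>

      #define DIRTY_MAX   (32UL << 20)        /* hypothetical per-target dirty cap */
      #define CHUNK       (64UL << 10)        /* one write's reservation */

      struct cache_state {
              pthread_mutex_t lock;
              pthread_cond_t  has_grant;      /* signalled when grant is returned */
              pthread_cond_t  has_data;       /* signalled when dirty data exists */
              unsigned long   dirty;          /* bytes currently held dirty */
      };

      static struct cache_state cs = {
              .lock      = PTHREAD_MUTEX_INITIALIZER,
              .has_grant = PTHREAD_COND_INITIALIZER,
              .has_data  = PTHREAD_COND_INITIALIZER,
      };

      /* Writer side: analogous to the wait seen in osc_enter_cache(). */
      static void grant_reserve(unsigned long bytes)
      {
              pthread_mutex_lock(&cs.lock);
              while (cs.dirty + bytes > DIRTY_MAX)
                      /* No timeout: if nothing ever returns grant, this sleep
                       * is indefinite -- the state the stuck task is in. */
                      pthread_cond_wait(&cs.has_grant, &cs.lock);
              cs.dirty += bytes;
              pthread_cond_broadcast(&cs.has_data);
              pthread_mutex_unlock(&cs.lock);
      }

      /* Completion side: the analogue of a write RPC completing and crediting
       * grant back to waiting writers. */
      static void grant_release(unsigned long bytes)
      {
              pthread_mutex_lock(&cs.lock);
              while (cs.dirty < bytes)        /* wait for something to complete */
                      pthread_cond_wait(&cs.has_data, &cs.lock);
              cs.dirty -= bytes;
              pthread_cond_broadcast(&cs.has_grant);
              pthread_mutex_unlock(&cs.lock);
      }

      static void *writer(void *arg)
      {
              (void)arg;
              for (int i = 0; i < 1024; i++) {
                      grant_reserve(CHUNK);
                      /* ... copy user data into the page cache here ... */
              }
              return NULL;
      }

      static void *completer(void *arg)
      {
              (void)arg;
              for (int i = 0; i < 1024; i++) {
                      usleep(100);            /* pretend a write RPC completed */
                      grant_release(CHUNK);
              }
              return NULL;
      }

      int main(void)
      {
              pthread_t w, c;

              pthread_create(&w, NULL, writer, NULL);
              pthread_create(&c, NULL, completer, NULL);
              pthread_join(w, NULL);
              pthread_join(c, NULL);
              printf("dirty bytes at exit: %lu\n", cs.dirty);
              return 0;
      }

      Build with 'cc -pthread sketch.c'. If the completer thread is never started (or, in the real system, the completion path stops running), the writer blocks forever once DIRTY_MAX bytes are outstanding.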

      Attachments

        1. rzuseqio13_drop_caches.txt.bz2
          0.2 kB
        2. rzuseqio14_drop_caches.txt.bz2
          4.62 MB
        3. rzuseqio14_lustre_log.txt
          0.2 kB
        4. rzuseqio15_console.txt.bz2
          79 kB
        5. rzuseqio15_drop_caches.txt.bz2
          0.2 kB
        6. rzuseqio16_drop_caches.txt.bz2
          3.46 MB
        7. rzuseqio16_dump_page_cache.txt.bz2
          783 kB


          Activity

            [LU-3416] Hangs on write in osc_enter_cache()

            See the attached rzuseq*_drop_caches.bz2 files. When I got a write hang, I enabled full lustre debugging, issued 'echo 2 > drop_caches', waited a few seconds, and then dumped the logs on all four clients.

            Hopefully there is something useful in there.

            morrone Christopher Morrone (Inactive) added a comment

            Yes, I haven't been able to reproduce the problem again. Performance is in the acceptable range.

            I should be able to get that log tomorrow.

            morrone Christopher Morrone (Inactive) added a comment

            No news means good news, I assume?

            If this patch holds up, can you please do me a favor: roll back the patch, make the hang happen again, and collect logs while doing 'echo 3 > drop_caches'? I don't understand why that makes it go forward.

            jay Jinshan Xiong (Inactive) added a comment

            I have been running the reproducer, and haven't had a single hang yet with patch set 5 applied. I will leave it running overnight.

            morrone Christopher Morrone (Inactive) added a comment

            Can you please try patch set 5?

            jay Jinshan Xiong (Inactive) added a comment

            See new attachment rzuseqio15_console.txt.bz2. It shows backtraces from all processes after a hang began, and then the messages that were printed by patch set 4.

            morrone Christopher Morrone (Inactive) added a comment

            This is awkward - I thought it must be sleeping for LRU slots while holding some dirty pages. The problem is obvious: cl_dirty couldn't be deducted for some reason. One possible reason would be that the ptlrpc threads were blocked somewhere so brw_interpret couldn't be called, or that the processes are waiting for something while holding active osc_extents. Can you please take a look at the backtraces of the running processes and tell me whether any other process is blocked on something?

            Also, I worked out patch set 4 which dumps the extent tree when this problem happens. This can help us isolate the root cause.

            jay Jinshan Xiong (Inactive) added a comment

            Here's output from one of the debug messages before I leave for the day:

            LustreError: 3709:0:(osc_cache.c:1563:osc_enter_cache()) lsrzb-OST0004-osc-c0000003c70ca400: grant { dirty: 33554432/33554432 dirty_pages: 6934/131071 unstable_pages: 0/131071 dropped: 0 avail: 45613056, reserved: 0, flight: 0 } lru {in list: 2407, left: 515, waiters: 0 }try to reserve 65536.
            LustreError: 3736:0:(osc_cache.c:1563:osc_enter_cache()) lsrzb-OST0004-osc-c0000003c70ca400: grant { dirty: 33554432/33554432 dirty_pages: 6934/131071 unstable_pages: 0/131071 dropped: 0 avail: 45613056, reserved: 0, flight: 0 } lru {in list: 2407, left: 515, waiters: 0 }try to reserve 65536.
            
            morrone Christopher Morrone (Inactive) added a comment
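
            To make the grant line above easier to read, here is a small standalone C snippet; the parsing code and its names are mine, not part of Lustre. It only restates what the message already shows: dirty is at its cap (33554432/33554432) while "avail" grant is still large, so new writes must wait in osc_enter_cache() for dirty bytes to be written back.

            /*
             * Standalone snippet for reading the LustreError grant message
             * above; not part of Lustre.
             */
            #include <stdio.h>
            #include <string.h>

            int main(void)
            {
                    /* Fields copied verbatim from the LustreError line above. */
                    const char *msg =
                        "grant { dirty: 33554432/33554432 dirty_pages: 6934/131071 "
                        "unstable_pages: 0/131071 dropped: 0 avail: 45613056, "
                        "reserved: 0, flight: 0 }";
                    unsigned long dirty = 0, dirty_max = 0, avail = 0;
                    const char *p;

                    if ((p = strstr(msg, "dirty:")) != NULL)
                            sscanf(p, "dirty: %lu/%lu", &dirty, &dirty_max);
                    if ((p = strstr(msg, "avail:")) != NULL)
                            sscanf(p, "avail: %lu", &avail);

                    printf("dirty %lu of %lu, avail grant %lu\n", dirty, dirty_max, avail);
                    if (dirty >= dirty_max)
                            printf("dirty cap exhausted: writers must sleep until "
                                   "dirty bytes are written back\n");
                    return 0;
            }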

            The performance problem (and presumably the negative number) is gone, but it is still hanging for ten minutes like with patch one. I'll post some info tomorrow.

            morrone Christopher Morrone (Inactive) added a comment

            Will you please try patch set 3 and see if it helps?

            jay Jinshan Xiong (Inactive) added a comment

            People

              niu Niu Yawei (Inactive)
              morrone Christopher Morrone (Inactive)