Lustre / LU-2576

Hangs in osc_enter_cache due to dirty pages not being flushed

Details


    Description

      We've had reports of I/O writes hanging on Sequoia, and after some initial debugging we've narrowed the hung threads down to being stuck with the following stack trace:

       sysiod        S 00000fffa842633c     0 27910   3135 0x00000000                    
       Call Trace:                                                                       
       [c0000003ee34e480] [c0000003ee34e590] 0xc0000003ee34e590 (unreliable)             
       [c0000003ee34e650] [c000000000009b8c] .__switch_to+0xc4/0x100                     
       [c0000003ee34e6e0] [c0000000004364c8] .schedule+0x858/0x9c0                       
       [c0000003ee34e990] [8000000000a720c0] .cfs_waitq_wait+0x10/0x30 [libcfs]          
       [c0000003ee34ea00] [8000000004533560] .osc_enter_cache+0x880/0x12c0 [osc]         
       [c0000003ee34ebd0] [800000000453b210] .osc_queue_async_io+0xd10/0x1a40 [osc]   
       [c0000003ee34edf0] [8000000004516fe8] .osc_page_cache_add+0xf8/0x2a0 [osc]        
       [c0000003ee34eeb0] [80000000024468b8] .cl_page_cache_add+0xe8/0x3b0 [obdclass] 
       [c0000003ee34efe0] [8000000004facae8] .lov_page_cache_add+0xc8/0x340 [lov]        
       [c0000003ee34f0b0] [80000000024468b8] .cl_page_cache_add+0xe8/0x3b0 [obdclass] 
       [c0000003ee34f1e0] [80000000067ecda4] .vvp_io_commit_write+0x474/0x8a0 [lustre]
       [c0000003ee34f300] [800000000246098c] .cl_io_commit_write+0x11c/0x2d0 [obdclass]
       [c0000003ee34f3c0] [80000000067af410] .ll_commit_write+0x120/0x3e0 [lustre]       
       [c0000003ee34f490] [80000000067d1634] .ll_write_end+0x34/0x80 [lustre]            
       [c0000003ee34f520] [c000000000098bac] .generic_file_buffered_write+0x1ec/0x374 
       [c0000003ee34f660] [c000000000099290] .__generic_file_aio_write+0x374/0x3d8       
       [c0000003ee34f760] [c00000000009936c] .generic_file_aio_write+0x78/0xe8           
       [c0000003ee34f810] [80000000067eff9c] .vvp_io_write_start+0xfc/0x3e0 [lustre]  
       [c0000003ee34f8e0] [800000000245aedc] .cl_io_start+0xcc/0x220 [obdclass]          
       [c0000003ee34f980] [8000000002462cf4] .cl_io_loop+0x194/0x2c0 [obdclass]          
       [c0000003ee34fa30] [800000000676a278] .ll_file_io_generic+0x498/0x670 [lustre] 
       [c0000003ee34fb30] [800000000676a8d4] .ll_file_aio_write+0x1d4/0x3a0 [lustre]  
       [c0000003ee34fc00] [800000000676abf0] .ll_file_write+0x150/0x320 [lustre]         
       [c0000003ee34fce0] [c0000000000d5968] .vfs_write+0xd0/0x1c4                       
       [c0000003ee34fd80] [c0000000000d5b58] .SyS_write+0x54/0x98                        
       [c0000003ee34fe30] [c000000000000580] syscall_exit+0x0/0x2c
      

      The specific line it's stuck at is here:

      (gdb) l *osc_enter_cache+0x880
      0x47f50 is in osc_enter_cache (/builddir/build/BUILD/lustre-2.3.56/lustre/osc/osc_cache.c:1526).
      1521    /builddir/build/BUILD/lustre-2.3.56/lustre/osc/osc_cache.c: No such file or directory.
              in /builddir/build/BUILD/lustre-2.3.56/lustre/osc/osc_cache.c
      
      1523                 CDEBUG(D_CACHE, "%s: sleeping for cache space @ %p for %p\n",   
      1524                        cli->cl_import->imp_obd->obd_name, &ocw, oap);           
      1525                                                                                 
      1526                 rc = l_wait_event(ocw.ocw_waitq,                                
      1527                                   cfs_list_empty(&ocw.ocw_entry), &lwi);        
      1528                                                                                 
      1529                 client_obd_list_lock(&cli->cl_loi_list_lock);                   
      1530                 cfs_list_del_init(&ocw.ocw_entry); 
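
      For context, here is a simplified sketch of the wait path around those lines in osc_enter_cache(), reconstructed from the listing above; identifiers that do not appear in the listing (osc_cache_waiter, cl_cache_waiters, osc_wake_cache_waiters) are approximate and may not match the exact source. The thread puts an entry on the client's cache-waiter list and sleeps until another code path, normally the one that completes writeback of dirty pages and releases dirty/grant space, removes the entry and wakes the waitq. If those dirty pages are never flushed, the entry is never removed and the thread sleeps here indefinitely:

       /* Simplified sketch of the wait path in osc_enter_cache() (lustre 2.3.x era).
        * Reconstructed around the lines listed above; identifiers not shown in
        * the gdb listing are approximate. */
       struct osc_cache_waiter ocw;

       cfs_waitq_init(&ocw.ocw_waitq);
       cfs_list_add_tail(&ocw.ocw_entry, &cli->cl_cache_waiters); /* join waiter list */
       client_obd_list_unlock(&cli->cl_loi_list_lock);

       CDEBUG(D_CACHE, "%s: sleeping for cache space @ %p for %p\n",
              cli->cl_import->imp_obd->obd_name, &ocw, oap);

       /* Sleep until the entry is taken off the waiter list and the waitq is
        * signalled, normally by osc_wake_cache_waiters() once dirty pages
        * complete writeback and space is freed.  If the dirty pages are never
        * flushed, this wait never returns. */
       rc = l_wait_event(ocw.ocw_waitq,
                         cfs_list_empty(&ocw.ocw_entry), &lwi);

       client_obd_list_lock(&cli->cl_loi_list_lock);
       cfs_list_del_init(&ocw.ocw_entry);

      This matches the symptom below: cur_dirty_bytes staying nonzero on specific OSCs means the flush that would wake these waiters is not happening.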
      

      There were specific OSCs on a couple of the problem clients showing outstanding dirty pages for a significant amount of time:

      seqlac2@root:pdsh -w vulcanio[42-43] 'grep -v 0 /proc/fs/lustre/osc/*/cur_dirty_bytes' | dshbak -c
      ----------------                                                                
      vulcanio42                                                                      
      ----------------                                                                
      /proc/fs/lustre/osc/ls1-OST0026-osc-c0000003c1100000/cur_dirty_bytes:65536         
      /proc/fs/lustre/osc/ls1-OST0027-osc-c0000003c1100000/cur_dirty_bytes:65536         
      /proc/fs/lustre/osc/ls1-OST0028-osc-c0000003c1100000/cur_dirty_bytes:65536         
      /proc/fs/lustre/osc/ls1-OST002b-osc-c0000003c1100000/cur_dirty_bytes:65536         
      ----------------                                                                
      vulcanio43                                                                      
      ----------------                                                                
      /proc/fs/lustre/osc/ls1-OST0053-osc-c0000003e336b000/cur_dirty_bytes:65536
      

      Manually dropping caches on the problem clients seems to have cleared things up:

      # echo 3 > /proc/sys/vm/drop_caches
      

      In case it is useful, attached is a dump of the Lustre pages taken prior to dropping caches on one of the problem clients:

      # cat /proc/fs/lustre/llite/ls1-c0000003e336b000/dump_page_cache > vulcanio43-dump_page_cache-`date +%s`
      

      I also ran the same command after dropping caches, but the file was empty, so there's no reason to post it.

          Activity

            lflis Lukasz Flis added a comment -

            Hi Jinshan,

            Thank you very much for pointing this one out and for the patch for 2.4;
            we'll give it a try ASAP. There was no sign of this change in 2.4.0 or 2.4.1-RC2.

            I see that 2.4.1 is ready for release (packages are built and published) - maybe it's worth
            considering holding (or updating) the release a bit, as this bug can manifest itself within a few hours
            (I have the impression that OSS recoveries raise the chance of hitting this one).

            Thank you very much for the quick patch. I'll try to give feedback ASAP.


            Lukasz Flis

            jay Jinshan Xiong (Inactive) added a comment -

            patch for b2_4 is at: http://review.whamcloud.com/7657

            jay Jinshan Xiong (Inactive) added a comment -

            Hi Lukasz,

            Do you have http://review.whamcloud.com/6554 in your tree? It looks like we need to cherry pick this patch into b2_4 as well.
            lflis Lukasz Flis added a comment -

            Dear All,

            Has this patch (#7) really been landed in 2.4 and 2.4.1-RC2?
            http://review.whamcloud.com/#/c/4963/

            We have observed the issue on both 2.4.0 and 2.4.1-RC2, so it looks like the problem is still there.

            Here's our process stack:
            Sep 14 13:40:12 n0269-g6l kernel: ost_check S 0000000000000005 0 10821 1 0x00000080
            Sep 14 13:40:12 n0269-g6l kernel: ffff880127051548 0000000000000086 0000000000000000 0000000000000000
            Sep 14 13:40:12 n0269-g6l kernel: 0000000000000000 0000000000000082 ffff880127051528 ffffffff8106327e
            Sep 14 13:40:12 n0269-g6l kernel: ffff88020d5a3af8 ffff880127051fd8 000000000000fb88 ffff88020d5a3af8
            Sep 14 13:40:12 n0269-g6l kernel: Call Trace:
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffff8106327e>] ? try_to_wake_up+0x24e/0x3e0
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa03236fe>] cfs_waitq_wait+0xe/0x10 [libcfs]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa08d09ea>] osc_enter_cache+0x85a/0xb40 [osc]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa0338917>] ? cfs_hash_bd_lookup_intent+0x37/0x130 [libcfs]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffff81063410>] ? default_wake_function+0x0/0x20
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa08d7fd1>] osc_queue_async_io+0x1081/0x1c18 [osc]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffff810573c5>] ? select_idle_sibling+0x95/0x150
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffff8121ba89>] ? security_capable+0x29/0x30
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffff8107baba>] ? capable+0x2a/0x60
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa04841c5>] ? cl_page_slice_add+0x55/0x140 [obdclass]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa08bd2ec>] ? osc_page_init+0xec/0x890 [osc]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa08be2e9>] osc_page_cache_add+0xc9/0x1d0 [osc]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa04848af>] cl_page_cache_add+0x7f/0x2a0 [obdclass]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa094e335>] lov_page_cache_add+0x85/0x200 [lov]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa04848af>] cl_page_cache_add+0x7f/0x2a0 [obdclass]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa04889fc>] ? cl_page_find0+0x44c/0x810 [obdclass]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa0481439>] ? cl_env_hops_keycmp+0x19/0x70 [obdclass]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa03389bd>] ? cfs_hash_bd_lookup_intent+0xdd/0x130 [libcfs]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa0a26225>] vvp_io_commit_write+0x3e5/0x5b0 [lustre]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa0493cbd>] cl_io_commit_write+0xad/0x1d0 [obdclass]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa09fb37e>] ll_commit_write+0xee/0x320 [lustre]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa0a13650>] ll_write_end+0x30/0x60 [lustre]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffff8111a81a>] generic_file_buffered_write+0x18a/0x2e0
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffff81075887>] ? current_fs_time+0x27/0x30
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffff8111c210>] __generic_file_aio_write+0x260/0x490
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffff8111c4c8>] generic_file_aio_write+0x88/0x100
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa0a283db>] vvp_io_write_start+0xcb/0x2e0 [lustre]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa049072a>] cl_io_start+0x6a/0x140 [obdclass]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa0494e64>] cl_io_loop+0xb4/0x1b0 [obdclass]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa09ce9e0>] ll_file_io_generic+0x450/0x600 [lustre]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa09cf922>] ll_file_aio_write+0x142/0x2c0 [lustre]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffffa09cfc0c>] ll_file_write+0x16c/0x2a0 [lustre]
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffff81181368>] vfs_write+0xb8/0x1a0
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffff81181c61>] sys_write+0x51/0x90
            Sep 14 13:40:12 n0269-g6l kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

            Unfortunately, dropping caches doesn't unblock the process.
            The only way to get OST communication working again is to deactivate and reactivate the OSC:

            [root@n0269-g6l ~]# echo 0 > /proc/fs/lustre/osc/scratch-OST0003-osc-ffff8802126ba400/active
            [root@n0269-g6l ~]# echo 1 > /proc/fs/lustre/osc/scratch-OST0003-osc-ffff8802126ba400/active

            Could you please have a look?

            In case more debugging info is needed, please let us know. We are still keeping some
            clients in the hung state for debugging purposes.

            Best Regards

            Lukasz Flis
            ACC Cyfronet


            niu Niu Yawei (Inactive) added a comment -

            patch landed for 2.4

            morrone Christopher Morrone (Inactive) added a comment -

            Yes, so far no repeats of this bug.

            jlevi Jodi Levi (Inactive) added a comment -

            Has the testing continued to be successful, and can we close this ticket?

            morrone Christopher Morrone (Inactive) added a comment -

            Testing for this issue went well yesterday. We'll see how the patch holds up to real users over the next week, but I am cautiously optimistic that the problem is fixed.

            morrone Christopher Morrone (Inactive) added a comment -

            We don't know yet. Other bugs keep preventing reasonable testing of the LU-2576 patch. I see it landed to master. My hope is to get a good test run in today and leave a version of Lustre with the LU-2576 patch installed for users to try and see if things improve.

            niu Niu Yawei (Inactive) added a comment -

            Prakash, does the patch fix your problem? Thanks.

            People

              niu Niu Yawei (Inactive)
              prakash Prakash Surya (Inactive)