
Failure on test suite sanity-benchmark test_fsx: fsx bus error, core dumped

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Affects Version/s: Lustre 2.4.0
    • Fix Version/s: Lustre 2.4.0
    • 3
    • 6999
    Description

      This issue was created by maloo for sarah <sarah@whamcloud.com>

      This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/5a852a8a-85b1-11e2-9f8d-52540035b04c.

      The sub-test test_fsx failed with the following error:

      fsx failed

      truncating to largest ever: 0x25701f
      018000 1362442021.951758 trunc from 0x169dd8  to  0x2e6079	(0x17c2a2 bytes)
      019000 1362442023.494191 mapwrite   0x1dd935 thru 0x1ec6de	(0xedaa bytes)
      020000 1362442025.029135 read       0x17d39e thru 0x1848a8	(0x750b bytes)
      021000 1362442026.608768 mapwrite   0x21696e thru 0x223441	(0xcad4 bytes)
      022000 1362442028.147407 write      0x065ab3 thru 0x071b3c	(0xc08a bytes)
      023000 1362442029.708216 write      0x142a88 thru 0x149749	(0x6cc2 bytes)
      024000 1362442031.298952 trunc from 0x16cef5  to  0x19c5bf	(0x2f6cb bytes)
      025000 1362442032.760608 mapread    0x050a4e thru 0x05ea37	(0xdfea bytes)
      026000 1362442034.337654 write      0x100d42 thru 0x10c4b8	(0xb777 bytes)
      027000 1362442035.889125 mapread    0x0a0142 thru 0x0a0412	(0x2d1 bytes)
      028000 1362442037.474502 read       0x1556d1 thru 0x15c086	(0x69b6 bytes)
      029000 1362442039.086174 write      0x009b42 thru 0x00e01c	(0x44db bytes)
      030000 1362442040.692829 mapwrite   0x18135d thru 0x184f86	(0x3c2a bytes)
      031000 1362442042.344050 write      0x142774 thru 0x14e54f	(0xbddc bytes)
      032000 1362442043.844112 mapwrite   0x0a2628 thru 0x0a295f	(0x338 bytes)
      033000 1362442045.438181 write      0x18f5c4 thru 0x19c18c	(0xcbc9 bytes)
      034000 1362442047.032646 trunc from 0x1e89cc  to  0x3b16df	(0x1c8d14 bytes)
      035000 1362442048.552844 write      0x1e22ba thru 0x1eb190	(0x8ed7 bytes)
      036000 1362442050.084864 read       0x0abc33 thru 0x0b5f8c	(0xa35a bytes)
      037000 1362442051.712007 trunc from 0x0af05a  to  0x0b8740	(0x96e7 bytes)
      038000 1362442053.320923 mapread    0x055a8e thru 0x05a120	(0x4693 bytes)
      039000 1362442054.897862 write      0x236d76 thru 0x23733f	(0x5ca bytes)
      040000 1362442056.462382 mapread    0x116f2e thru 0x12395c	(0xca2f bytes)
      041000 1362442058.009744 mapwrite   0x1493c5 thru 0x156198	(0xcdd4 bytes)
      042000 1362442059.557940 write      0x1cc6db thru 0x1cf588	(0x2eae bytes)
      /usr/lib64/lustre/tests/sanity-benchmark.sh: line 186: 22372 Bus error               (core dumped) $CMD
       sanity-benchmark test_fsx: @@@@@@ FAIL: fsx failed 
        Trace dump:
        = /usr/lib64/lustre/tests/test-framework.sh:3973:error_noexit()
        = /usr/lib64/lustre/tests/test-framework.sh:3996:error()
        = /usr/lib64/lustre/tests/sanity-benchmark.sh:199:test_fsx()
        = /usr/lib64/lustre/tests/test-framework.sh:4251:run_one()
        = /usr/lib64/lustre/tests/test-framework.sh:4284:run_one_logged()
        = /usr/lib64/lustre/tests/test-framework.sh:4139:run_test()
        = /usr/lib64/lustre/tests/sanity-benchmark.sh:203:main()
      Dumping lctl log to /logdir/test_logs/2013-03-04/lustre-master-el6-x86_64--full--2_1_1__1286__-70011898087180-120645/sanity-benchmark.test_fsx.*.1362442060.log
      CMD: client-20-ib,client-21-ib,client-22-ib,client-23-ib.lab.whamcloud.com /usr/sbin/lctl dk > /logdir/test_logs/2013-03-04/lustre-master-el6-x86_64--full--2_1_1__1286__-70011898087180-120645/sanity-benchmark.test_fsx.debug_log.\$(hostname -s).1362442060.log;
               dmesg > /logdir/test_logs/2013-03-04/lustre-master-el6-x86_64--full--2_1_1__1286__-70011898087180-120645/sanity-benchmark.test_fsx.dmesg.\$(hostname -s).1362442060.log
      

      Attachments

        Issue Links

          Activity

            [LU-2909] Failure on test suite sanity-benchmark test_fsx: fsx bus error, core dumped

            Well, with the change I seem to be seeing 1.5 - 3 hours between bus errors, but I don't have many runs in. It is hard to say whether it makes things better at this point.

            keith Keith Mannthey (Inactive) added a comment

            In the original case the issue reproduces somewhere between 30 minutes and 8 hours in (yes, a big window).

            cat /proc/sys/lustre/max_dirty_mb
            127.9

            now

            cat /proc/sys/lustre/max_dirty_mb
            460.7

            The test is running; I will update when I know more. To confirm the code: this client/MDS/OST trio is running RHEL6-based stock 2.3.62.

            keith Keith Mannthey (Inactive) added a comment

            Sorry for the slow progress on this. I am collecting a few reproduction timings before I change values. I will have enough data to move forward with the test tomorrow.

            I will double /proc/sys/lustre/max_dirty_mb.

            keith Keith Mannthey (Inactive) added a comment
            niu Niu Yawei (Inactive) added a comment - - edited

            Per discussion with Xiong, the object should be on the write list after osc_extent_release() is called, but the async flush problem still has to be addressed.

            This should address the problem caused by async flush in osc_enter_cache(): http://review.whamcloud.com/5732


            Actually, if an object has dirty pages, it should be on the client_obd::cl_loi_write_list, unless its extents are being used by someone - apparently this is not the case for fsx. Please let me know if I missed something.

            It looks like the object is added onto the cl_loi_write_list only when osc_list_maint() (or __osc_list_maint()) is called on the object, and we call osc_io_unplug(obj) -> __osc_list_maint(obj) in osc_enter_cache() for this purpose. Do we add the object onto the write list anywhere else? I didn't see it in the code.

            niu Niu Yawei (Inactive) added a comment

            1. the current object isn't being added onto the ready list (so the dirty pages of the current object may not be flushed)

            Actually, if an object has dirty pages, it should be on the client_obd::cl_loi_write_list, unless its extents are being used by someone - apparently this is not the case for fsx. Please let me know if I missed something.

            jay Jinshan Xiong (Inactive) added a comment

            Keith, if your original test didn't include the change (66be2e6014a7bf20b2f749c877cb141b209f15d9), I guess your case could be a little different. Please make sure to enlarge the global dirty max, /proc/sys/lustre/max_dirty_mb (not only the per-OSC dirty max), as well when you are testing.

            niu Niu Yawei (Inactive) added a comment
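            As a side note for anyone retrying this: the two dirty-cache limits discussed above can be raised together. A minimal sketch, written as a dry run that only prints the commands (lctl and /proc/sys/lustre exist only on a live Lustre client, and the values 64 and 256 MB are illustrative, not taken from this ticket):

            ```shell
            # Dry run: print the tuning commands instead of executing them.
            # Both limits should be raised; enlarging only the per-OSC one is not enough.
            per_osc_mb=64     # per-OSC limit, doubled from the 32 seen in this ticket
            global_mb=256     # global client limit; illustrative value

            echo "lctl set_param osc.*.max_dirty_mb=${per_osc_mb}"
            echo "echo ${global_mb} > /proc/sys/lustre/max_dirty_mb"
            ```

            On a real client, drop the outer echo wrappers to apply the settings.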

            Xiong, your log shows osc_enter_cache() returned -EDQUOT because the per-OSC dirty count exceeded 32M. I think it's caused by the change in (66be2e6014a7bf20b2f749c877cb141b209f15d9):

            diff --git a/lustre/osc/osc_cache.c b/lustre/osc/osc_cache.c
            index 37b249f..581beb6 100644
            --- a/lustre/osc/osc_cache.c
            +++ b/lustre/osc/osc_cache.c
            @@ -1533,12 +1533,7 @@ static int osc_enter_cache(const struct lu_env *env, struct client_obd *cli,
                            ocw.ocw_rc = 0;
                            client_obd_list_unlock(&cli->cl_loi_list_lock);
            
            -               /* First osc_io_unplug() tries to put current object
            -                * on ready list, second osc_io_unplug() makes sure that
            -                * dirty flush can still be triggered even if current
            -                * object hasn't any dirty pages */
            -               osc_io_unplug(env, cli, osc, PDL_POLICY_ROUND);
            -               osc_io_unplug(env, cli, NULL, PDL_POLICY_ROUND);
            +               osc_io_unplug_async(env, cli, NULL);
            
                            CDEBUG(D_CACHE, "%s: sleeping for cache space @ %p for %p\n",
                                   cli->cl_import->imp_obd->obd_name, &ocw, oap);
            

            There are two problems:

            1. the current object isn't being added onto the ready list (so the dirty pages of the current object may not be flushed);
            2. the flush was changed to asynchronous, so we can't be sure the dirty pages are flushed in time;

            I'm not sure if Keith's test included this fix.

            niu Niu Yawei (Inactive) added a comment

            This issue can be easily reproduced on my node with command: ./fsx -l $((4*1024*1024)) /mnt/lustre/fsxfile

            Please check the attachment for debug log.

            jay Jinshan Xiong (Inactive) added a comment
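            For reference, the -l argument in the reproducer above is ordinary shell arithmetic for the maximum file length in bytes; a standalone check:

            ```shell
            # The reproducer caps the fsx file at 4*1024*1024 bytes, i.e. 4 MiB.
            length=$((4*1024*1024))
            echo "$length"   # prints 4194304
            ```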

            I managed to accidentally reboot my debug client that was in the failing state. I restarted testing a while ago. Sorry for the delay.

            I also wanted to confirm my current settings:

            lctl get_param osc.*.max_dirty_mb
            osc.lustre-OST0000-osc-ffff88001d064000.max_dirty_mb=32

            My plan is to get into the failing state again, then double the number to 64. I hope to observe a cleanly running fsx loop after that.

            I also tried setting osc.*.max_dirty_mb to 4 MB in an attempt to shorten the time to reproduction.

            keith Keith Mannthey (Inactive) added a comment

            I will kick off a test with master and see how a larger obd_max_dirty_pages affects things.

            keith Keith Mannthey (Inactive) added a comment

            People

              Assignee: niu Niu Yawei (Inactive)
              Reporter: maloo Maloo
              Votes: 0
              Watchers: 12

              Dates

                Created:
                Updated:
                Resolved: