[LU-7382] (vvp_io.c:573:vvp_io_update_iov()) ASSERTION( vio->vui_tot_nrsegs >= vio->vui_iter->nr_segs ) failed

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: Lustre 2.8.0
    • Affects Version/s: Lustre 2.7.0, Lustre 2.8.0, Lustre 2.5.4
    • Labels: None
    • Severity: 3
    • Rank: 9223372036854775807

    Description

      kernel:LustreError: 29035:0:(vvp_io.c:573:vvp_io_update_iov()) ASSERTION( vio->vui_tot_nrsegs >= vio->vui_iter->nr_segs ) failed: tot_nrsegs: 1, nrsegs: 2

      Message from syslogd@test1 at Nov 4 13:01:37 ...
      kernel:LustreError: 29035:0:(vvp_io.c:573:vvp_io_update_iov()) LBUG

          Activity


            The patch that landed on b2_8 seems to handle most cases, but we have recently found one application on our systems that triggers this problem at random times.

            Correction: It looks close to LU-7067.

            simmonsja James A Simmons added a comment (edited)

            Just as a "me too", we hit that same LBUG (trace below) with IEEL 3.0

            # cat /proc/fs/lustre/version 
            lustre: 2.7.15.3 
            kernel: patchless_client 
            build:  jenkins-arch=x86_64,build_type=client,distro=el6.7,ib_stack=inkernel-3843-ga11db72-PRISTINE-2.6.32-573.12.1.el6.x86_64
            
            2017-02-24 08:29:10 [3183317.732931] LustreError: 50671:0:(vvp_io.c:573:vvp_io_update_iov()) ASSERTION( vio->vui_tot_nrsegs >= vio->vui_iter->nr_segs ) failed: tot_nrsegs: 1, nrsegs: 2 
            2017-02-24 08:29:10 [3183317.749876] LustreError: 50671:0:(vvp_io.c:573:vvp_io_update_iov()) LBUG 
            2017-02-24 08:29:10 [3183317.757841] Pid: 50671, comm: jellyfish 
            2017-02-24 08:29:11 [3183317.762581] 
            2017-02-24 08:29:11 [3183317.762581] Call Trace: 
            2017-02-24 08:29:11 [3183317.767941]  [<ffffffffa030c895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs] 
            2017-02-24 08:29:11 [3183317.776222]  [<ffffffffa030ce97>] lbug_with_loc+0x47/0xb0 [libcfs] 
            2017-02-24 08:29:11 [3183317.783662]  [<ffffffffa09e1d19>] vvp_io_rw_lock+0x6f9/0x790 [lustre] 
            2017-02-24 08:29:11 [3183317.791356]  [<ffffffffa09e1de5>] vvp_io_write_lock+0x35/0x40 [lustre] 
            2017-02-24 08:29:11 [3183317.799184]  [<ffffffffa0526893>] cl_io_lock+0x63/0x3c0 [obdclass] 
            2017-02-24 08:29:11 [3183317.806589]  [<ffffffffa0526c92>] cl_io_loop+0xa2/0x1b0 [obdclass] 
            2017-02-24 08:29:11 [3183317.813985]  [<ffffffffa097d470>] ll_file_io_generic+0x5d0/0xae0 [lustre] 
            2017-02-24 08:29:11 [3183317.822061]  [<ffffffff8105e173>] ? __wake_up+0x53/0x70 
            2017-02-24 08:29:11 [3183317.828391]  [<ffffffffa0987dbb>] ll_file_aio_write+0x21b/0x9d0 [lustre] 
            2017-02-24 08:29:11 [3183317.836376]  [<ffffffffa0987ba0>] ? ll_file_aio_write+0x0/0x9d0 [lustre] 
            2017-02-24 08:29:11 [3183317.844351]  [<ffffffff811917db>] do_sync_readv_writev+0xfb/0x140 
            2017-02-24 08:29:11 [3183317.851649]  [<ffffffff810a1460>] ? autoremove_wake_function+0x0/0x40 
            2017-02-24 08:29:11 [3183317.859347]  [<ffffffffa051ae0d>] ? cl_env_put+0x16d/0x200 [obdclass] 
            2017-02-24 08:29:11 [3183317.867019]  [<ffffffff81231a56>] ? security_file_permission+0x16/0x20 
            2017-02-24 08:29:11 [3183317.874791]  [<ffffffff81192886>] do_readv_writev+0xd6/0x1f0 
            2017-02-24 08:29:11 [3183317.881616]  [<ffffffffa098a4d3>] ? ll_file_read+0x143/0x260 [lustre] 
            2017-02-24 08:29:11 [3183317.889288]  [<ffffffff811929e6>] vfs_writev+0x46/0x60 
            2017-02-24 08:29:11 [3183317.895509]  [<ffffffff81192b11>] sys_writev+0x51/0xd0 
            2017-02-24 08:29:11 [3183317.901747]  [<ffffffff8153c64e>] ? do_device_not_available+0xe/0x10 
            2017-02-24 08:29:11 [3183317.909337]  [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b 
            2017-02-24 08:29:11 [3183317.916521] 
            2017-02-24 08:29:11 [3183317.919054] Kernel panic - not syncing: LBUG 
            2017-02-24 08:29:11 [3183317.924280] Pid: 50671, comm: jellyfish Not tainted 2.6.32-573.12.1.el6.noc0w.x86_64 #1 
            2017-02-24 08:29:11 [3183317.933932] Call Trace: 
            2017-02-24 08:29:11 [3183317.937145]  [<ffffffff81538271>] ? panic+0xa7/0x16f 
            2017-02-24 08:29:11 [3183317.943174]  [<ffffffffa030ceeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs] 
            2017-02-24 08:29:11 [3183317.950774]  [<ffffffffa09e1d19>] ? vvp_io_rw_lock+0x6f9/0x790 [lustre] 
            2017-02-24 08:29:11 [3183317.958640]  [<ffffffffa09e1de5>] ? vvp_io_write_lock+0x35/0x40 [lustre] 
            2017-02-24 08:29:11 [3183317.966634]  [<ffffffffa0526893>] ? cl_io_lock+0x63/0x3c0 [obdclass] 
            2017-02-24 08:29:11 [3183317.974209]  [<ffffffffa0526c92>] ? cl_io_loop+0xa2/0x1b0 [obdclass] 
            2017-02-24 08:29:11 [3183317.981778]  [<ffffffffa097d470>] ? ll_file_io_generic+0x5d0/0xae0 [lustre] 
            2017-02-24 08:29:11 [3183317.990012]  [<ffffffff8105e173>] ? __wake_up+0x53/0x70 
            2017-02-24 08:29:11 [3183317.996318]  [<ffffffffa0987dbb>] ? ll_file_aio_write+0x21b/0x9d0 [lustre] 
            2017-02-24 08:29:11 [3183318.004469]  [<ffffffffa0987ba0>] ? ll_file_aio_write+0x0/0x9d0 [lustre] 
            2017-02-24 08:29:11 [3183318.012424]  [<ffffffff811917db>] ? do_sync_readv_writev+0xfb/0x140 
            2017-02-24 08:29:11 [3183318.019891]  [<ffffffff810a1460>] ? autoremove_wake_function+0x0/0x40 
            2017-02-24 08:29:11 [3183318.027561]  [<ffffffffa051ae0d>] ? cl_env_put+0x16d/0x200 [obdclass] 
            2017-02-24 08:29:11 [3183318.035212]  [<ffffffff81231a56>] ? security_file_permission+0x16/0x20 
            2017-02-24 08:29:11 [3183318.042963]  [<ffffffff81192886>] ? do_readv_writev+0xd6/0x1f0 
            2017-02-24 08:29:11 [3183318.049950]  [<ffffffffa098a4d3>] ? ll_file_read+0x143/0x260 [lustre] 
            2017-02-24 08:29:11 [3183318.057606]  [<ffffffff811929e6>] ? vfs_writev+0x46/0x60 
            2017-02-24 08:29:11 [3183318.063997]  [<ffffffff81192b11>] ? sys_writev+0x51/0xd0 
            2017-02-24 08:29:11 [3183318.070389]  [<ffffffff8153c64e>] ? do_device_not_available+0xe/0x10 
            2017-02-24 08:29:11 [3183318.077955]  [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
            

            Cheers,
            -- 
            Kilian

            srcc Stanford Research Computing Center added a comment

            I pushed a b2_8_fe patch of LU-4257 at http://review.whamcloud.com/#/c/21198. Do you have a good reproducer? We only see it once in a while and haven't figured out what is causing the LBUG or whether it can be triggered consistently.

            simmonsja James A Simmons added a comment
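
Since the thread asks for a reproducer, here is a minimal sketch of the I/O pattern visible in the traces: a `writev(2)` call with two segments, which reaches `ll_file_aio_write` through `sys_writev` with `nrsegs: 2`. This is not a confirmed reproducer; nothing in this ticket establishes which access pattern triggers the LBUG. The function name `write_two_segments` and the buffer contents are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from the ticket): write two separate
 * buffers with a single writev(2), so the kernel receives an iovec
 * array with two segments. Short writes are handled by advancing
 * through the array and retrying.
 * Returns the total number of bytes written, or -1 on error.
 */
static ssize_t write_two_segments(const char *path,
                                  const char *first, const char *second)
{
        struct iovec iov[2];
        struct iovec *cur = iov;
        int remaining = 2;
        ssize_t total = 0;
        int fd;

        assert(path != NULL && first != NULL && second != NULL);

        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
                return -1;

        iov[0].iov_base = (void *)first;
        iov[0].iov_len = strlen(first);
        iov[1].iov_base = (void *)second;
        iov[1].iov_len = strlen(second);

        while (remaining > 0) {
                ssize_t n = writev(fd, cur, remaining);

                if (n < 0) {
                        if (errno == EINTR)
                                continue;
                        close(fd);
                        return -1;
                }
                /* No progress on a non-empty segment: give up. */
                if (n == 0 && cur->iov_len > 0) {
                        close(fd);
                        return -1;
                }
                total += n;

                /* Skip fully written segments, then trim a partial one. */
                while (remaining > 0 && (size_t)n >= cur->iov_len) {
                        n -= (ssize_t)cur->iov_len;
                        cur++;
                        remaining--;
                }
                if (remaining > 0) {
                        cur->iov_base = (char *)cur->iov_base + n;
                        cur->iov_len -= (size_t)n;
                }
        }

        if (close(fd) < 0)
                return -1;
        return total;
}
```

To exercise the Lustre client path, the helper would be called with a file on a Lustre mount, for example `write_two_segments("/mnt/lustre/testfile", "hello ", "world\n")`, where the mount point is an assumption. Because the failures reported here are intermittent, a single call like this is unlikely to hit the assertion on its own.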

            I have a patch that cleans up this piece of code. Please check out commit 1101120d3258509fa74f952cd8664bfdc17bd97d in the master branch; it would be worth porting that patch over to see whether it fixes the problem.

            jay Jinshan Xiong (Inactive) added a comment

            This problem still exists on the b2_8_fe branch. We have hit it twice while running in production.

            2016-07-01T20:09:21.362206-04:00 c11-3c1s2n3 LustreError: 32026:0:(vvp_io.c:573:vvp_io_update_iov()) ASSERTION( vio->vui_tot_nrsegs >= vio->vui_iter->nr_segs ) failed: tot_nrsegs: 1, nrsegs: 2
            2016-07-01T20:09:21.362293-04:00 c11-3c1s2n3 LustreError: 32026:0:(vvp_io.c:573:vvp_io_update_iov()) LBUG
            2016-07-01T20:09:21.362336-04:00 c11-3c1s2n3 Pid: 32026, comm: bowtie2-build-s
            2016-07-01T20:09:21.362344-04:00 c11-3c1s2n3 Call Trace:
            2016-07-01T20:09:21.362359-04:00 c11-3c1s2n3 [<ffffffff81006651>] try_stack_unwind+0x161/0x1a0
            2016-07-01T20:09:21.362367-04:00 c11-3c1s2n3 [<ffffffff81004eb9>] dump_trace+0x89/0x430
            2016-07-01T20:09:21.391856-04:00 c11-3c1s2n3 [<ffffffffa011aac0>] lbug_with_loc+0x90/0x1d0 [libcfs]
            2016-07-01T20:09:21.391877-04:00 c11-3c1s2n3 [<ffffffffa06cb3f8>] vvp_io_rw_lock+0x738/0x860 [lustre]
            2016-07-01T20:09:21.391893-04:00 c11-3c1s2n3 [<ffffffffa06cb556>] vvp_io_write_lock+0x36/0x40 [lustre]
            2016-07-01T20:09:21.391902-04:00 c11-3c1s2n3 [<ffffffffa030a514>] cl_io_lock+0x74/0x400 [obdclass]
            2016-07-01T20:09:21.422510-04:00 c11-3c1s2n3 [<ffffffffa030be27>] cl_io_loop+0x2b7/0x710 [obdclass]
            2016-07-01T20:09:21.422529-04:00 c11-3c1s2n3 [<ffffffffa0675574>] ll_file_io_generic+0x364/0xab0 [lustre]
            2016-07-01T20:09:21.422544-04:00 c11-3c1s2n3 [<ffffffffa0676290>] ll_file_aio_write+0x5d0/0x6a0 [lustre]
            2016-07-01T20:09:21.422580-04:00 c11-3c1s2n3 [<ffffffff8114097b>] do_sync_readv_writev+0xdb/0x120
            2016-07-01T20:09:21.422594-04:00 c11-3c1s2n3 [<ffffffff81141854>] do_readv_writev+0xd4/0x1e0
            2016-07-01T20:09:21.422601-04:00 c11-3c1s2n3 [<ffffffff8114199e>] vfs_writev+0x3e/0x60
            2016-07-01T20:09:21.422608-04:00 c11-3c1s2n3 [<ffffffff81141ae5>] sys_writev+0x55/0xc0
            2016-07-01T20:09:21.503878-04:00 c11-3c1s2n3 [<ffffffff8133ac2b>] system_call_fastpath+0x16/0x1b
            2016-07-01T20:09:21.503911-04:00 c11-3c1s2n3 [<00002aaaab7581be>] 0x2aaaab7581be
            2016-07-01T20:09:21.503920-04:00 c11-3c1s2n3 Kernel panic - not syncing: LBUG
            2016-07-01T20:09:21.503944-04:00 c11-3c1s2n3 Pid: 32026, comm: bowtie2-build-s Tainted: P 3.0.101-0.46.1_1.0502.8871-cray_gem_c #1
            2016-07-01T20:09:21.503961-04:00 c11-3c1s2n3 Call Trace:
            2016-07-01T20:09:21.503970-04:00 c11-3c1s2n3 [<ffffffff81006651>] try_stack_unwind+0x161/0x1a0
            2016-07-01T20:09:21.503977-04:00 c11-3c1s2n3 [<ffffffff81004eb9>] dump_trace+0x89/0x430
            2016-07-01T20:09:21.503990-04:00 c11-3c1s2n3 [<ffffffff810060bc>] show_trace_log_lvl+0x5c/0x80
            2016-07-01T20:09:21.503996-04:00 c11-3c1s2n3 [<ffffffff810060f5>] show_trace+0x15/0x20
            2016-07-01T20:09:21.504011-04:00 c11-3c1s2n3 [<ffffffff81336d32>] dump_stack+0x79/0x84
            2016-07-01T20:09:21.504018-04:00 c11-3c1s2n3 [<ffffffff81336dd1>] panic+0x94/0x1da
            2016-07-01T20:09:21.504051-04:00 c11-3c1s2n3 [<ffffffffa011abf1>] lbug_with_loc+0x1c1/0x1d0 [libcfs]
            2016-07-01T20:09:21.504070-04:00 c11-3c1s2n3 [<ffffffffa06cb3f8>] vvp_io_rw_lock+0x738/0x860 [lustre]
            2016-07-01T20:09:21.504079-04:00 c11-3c1s2n3 [<ffffffffa06cb556>] vvp_io_write_lock+0x36/0x40 [lustre]
            2016-07-01T20:09:21.504121-04:00 c11-3c1s2n3 [<ffffffffa030a514>] cl_io_lock+0x74/0x400 [obdclass]
            2016-07-01T20:09:21.504139-04:00 c11-3c1s2n3 [<ffffffffa030be27>] cl_io_loop+0x2b7/0x710 [obdclass]
            2016-07-01T20:09:21.504157-04:00 c11-3c1s2n3 [<ffffffffa0675574>] ll_file_io_generic+0x364/0xab0 [lustre]
            2016-07-01T20:09:21.504174-04:00 c11-3c1s2n3 [<ffffffffa0676290>] ll_file_aio_write+0x5d0/0x6a0 [lustre]
            2016-07-01T20:09:21.504185-04:00 c11-3c1s2n3 [<ffffffff8114097b>] do_sync_readv_writev+0xdb/0x120
            2016-07-01T20:09:21.532355-04:00 c11-3c1s2n3 [<ffffffff81141854>] do_readv_writev+0xd4/0x1e0
            2016-07-01T20:09:21.532377-04:00 c11-3c1s2n3 [<ffffffff8114199e>] vfs_writev+0x3e/0x60
            2016-07-01T20:09:21.532387-04:00 c11-3c1s2n3 [<ffffffff81141ae5>] sys_writev+0x55/0xc0
            2016-07-01T20:09:21.532405-04:00 c11-3c1s2n3 [<ffffffff8133ac2b>] system_call_fastpath+0x16/0x1b
            2016-07-01T20:09:21.532417-04:00 c11-3c1s2n3 [<00002aaaab7581be>] 0x2aaaab7581bd
            2016-07-01T20:09:21.662130-04:00 c6-0c2s1n2 LustreError: 24943:0:(vvp_io.c:573:vvp_io_update_iov()) ASSERTION( vio->vui_tot_nrsegs >= vio->vui_iter->nr_segs ) failed: tot_nrsegs: 1, nrsegs: 2
            2016-07-01T20:09:21.662163-04:00 c6-0c2s1n2 LustreError: 24943:0:(vvp_io.c:573:vvp_io_update_iov()) LBUG
            2016-07-01T20:09:21.662205-04:00 c6-0c2s1n2 Pid: 24943, comm: bowtie2-build-s
            2016-07-01T20:09:21.662214-04:00 c6-0c2s1n2 Call Trace:
            2016-07-01T20:09:21.662222-04:00 c6-0c2s1n2 [<ffffffff81006651>] try_stack_unwind+0x161/0x1a0
            2016-07-01T20:09:21.662229-04:00 c6-0c2s1n2 [<ffffffff81004eb9>] dump_trace+0x89/0x430
            2016-07-01T20:09:21.662241-04:00 c6-0c2s1n2 [<ffffffffa011aac0>] lbug_with_loc+0x90/0x1d0 [libcfs]
            2016-07-01T20:09:21.692133-04:00 c6-0c2s1n2 [<ffffffffa06cb3f8>] vvp_io_rw_lock+0x738/0x860 [lustre]
            2016-07-01T20:09:21.692165-04:00 c6-0c2s1n2 [<ffffffffa06cb556>] vvp_io_write_lock+0x36/0x40 [lustre]
            2016-07-01T20:09:21.692204-04:00 c6-0c2s1n2 [<ffffffffa030a514>] cl_io_lock+0x74/0x400 [obdclass]
            2016-07-01T20:09:21.692212-04:00 c6-0c2s1n2 [<ffffffffa030be27>] cl_io_loop+0x2b7/0x710 [obdclass]
            2016-07-01T20:09:21.692236-04:00 c6-0c2s1n2 [<ffffffffa0675574>] ll_file_io_generic+0x364/0xab0 [lustre]
            2016-07-01T20:09:21.692257-04:00 c6-0c2s1n2 [<ffffffffa0676290>] ll_file_aio_write+0x5d0/0x6a0 [lustre]
            2016-07-01T20:09:21.742635-04:00 c6-0c2s1n2 [<ffffffff8114097b>] do_sync_readv_writev+0xdb/0x120
            2016-07-01T20:09:21.742668-04:00 c6-0c2s1n2 [<ffffffff81141854>] do_readv_writev+0xd4/0x1e0
            2016-07-01T20:09:21.742678-04:00 c6-0c2s1n2 [<ffffffff8114199e>] vfs_writev+0x3e/0x60
            2016-07-01T20:09:21.742731-04:00 c6-0c2s1n2 [<ffffffff81141ae5>] sys_writev+0x55/0xc0
            2016-07-01T20:09:21.742745-04:00 c6-0c2s1n2 [<ffffffff8133ac2b>] system_call_fastpath+0x16/0x1b
            2016-07-01T20:09:21.742754-04:00 c6-0c2s1n2 [<00002aaaab7581be>] 0x2aaaab7581be
            2016-07-01T20:09:21.742800-04:00 c6-0c2s1n2 Kernel panic - not syncing: LBUG
            2016-07-01T20:09:21.742827-04:00 c6-0c2s1n2 Pid: 24943, comm: bowtie2-build-s Tainted: P 3.0.101-0.46.1_1.0502.8871-cray_gem_c #1
            2016-07-01T20:09:21.742849-04:00 c6-0c2s1n2 Call Trace:
            2016-07-01T20:09:21.742904-04:00 c6-0c2s1n2 [<ffffffff81006651>] try_stack_unwind+0x161/0x1a0
            2016-07-01T20:09:21.742940-04:00 c6-0c2s1n2 [<ffffffff81004eb9>] dump_trace+0x89/0x430
            2016-07-01T20:09:21.742948-04:00 c6-0c2s1n2 [<ffffffff810060bc>] show_trace_log_lvl+0x5c/0x80
            2016-07-01T20:09:21.772093-04:00 c6-0c2s1n2 [<ffffffff810060f5>] show_trace+0x15/0x20
            2016-07-01T20:09:21.772123-04:00 c6-0c2s1n2 [<ffffffff81336d32>] dump_stack+0x79/0x84
            2016-07-01T20:09:21.772132-04:00 c6-0c2s1n2 [<ffffffff81336dd1>] panic+0x94/0x1da
            2016-07-01T20:09:21.772205-04:00 c6-0c2s1n2 [<ffffffffa011abf1>] lbug_with_loc+0x1c1/0x1d0 [libcfs]
            2016-07-01T20:09:21.772224-04:00 c6-0c2s1n2 [<ffffffffa06cb3f8>] vvp_io_rw_lock+0x738/0x860 [lustre]
            2016-07-01T20:09:21.772241-04:00 c6-0c2s1n2 [<ffffffffa06cb556>] vvp_io_write_lock+0x36/0x40 [lustre]
            2016-07-01T20:09:21.772256-04:00 c6-0c2s1n2 [<ffffffffa030a514>] cl_io_lock+0x74/0x400 [obdclass]
            2016-07-01T20:09:21.802196-04:00 c6-0c2s1n2 [<ffffffffa030be27>] cl_io_loop+0x2b7/0x710 [obdclass]
            2016-07-01T20:09:21.802260-04:00 c6-0c2s1n2 [<ffffffffa0675574>] ll_file_io_generic+0x364/0xab0 [lustre]
            2016-07-01T20:09:21.802270-04:00 c6-0c2s1n2 [<ffffffffa0676290>] ll_file_aio_write+0x5d0/0x6a0 [lustre]
            2016-07-01T20:09:21.802279-04:00 c6-0c2s1n2 [<ffffffff8114097b>] do_sync_readv_writev+0xdb/0x120
            2016-07-01T20:09:21.802299-04:00 c6-0c2s1n2 [<ffffffff81141854>] do_readv_writev+0xd4/0x1e0
            2016-07-01T20:09:21.802317-04:00 c6-0c2s1n2 [<ffffffff8114199e>] vfs_writev+0x3e/0x60
            2016-07-01T20:09:21.832209-04:00 c6-0c2s1n2 [<ffffffff81141ae5>] sys_writev+0x55/0xc0
            2016-07-01T20:09:21.832229-04:00 c6-0c2s1n2 [<ffffffff8133ac2b>] system_call_fastpath+0x16/0x1b
            2016-07-01T20:09:21.832259-04:00 c6-0c2s1n2 [<00002aaaab7581be>] 0x2aaaab7581bd

            simmonsja James A Simmons added a comment

            Landed for 2.8.0

            jgmitter Joseph Gmitter (Inactive) added a comment

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/17632/
            Subject: LU-7382 llite: Fix iovec references accounting in ll_file_aio_read/write
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 57f055f8d0df80e140724b00d1729f454222a83a

            gerrit Gerrit Updater added a comment

            I've uploaded a new set of four dumps with Andriy's patch here:
            ftp.whamcloud.com:/uploads/LU-7382/151223_dumps.tar.gz

            All four nodes which failed had debug enabled. I've included the extracted (and sorted) logs for one of them:
            c0-0c1s0n1-1512222203_log.sort

            paf Patrick Farrell (Inactive) added a comment

            Andriy Skulysh (andriy.skulysh@seagate.com) uploaded a new patch: http://review.whamcloud.com/17632
            Subject: LU-7382 llite: vvp_io_update_iov() ASSERTION failure
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 8431fec8ba5434b76aa994d93d5fa44b850be689

            gerrit Gerrit Updater added a comment

            Thank you Ann for the crash dump.

            bobijam Zhenyu Xu added a comment

            People

              bobijam Zhenyu Xu
              bobijam Zhenyu Xu
              Votes: 1
              Watchers: 13
