[LU-12503] LustreError: 19435:0:(vvp_io.c:1056:vvp_io_write_start()) LBUG

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.14.0, Lustre 2.12.4
    • Affects Version/s: Lustre 2.10.6, Lustre 2.12.2
    • Labels: None
    • Environment: Server: PowerEdge R640 with 64 GB memory and Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz
      OS: CentOS 7.5.1804
      Lustre client: 2.12.2
    • Severity: 3

    Description

      We run our Lustre file system on 1 MDS and 8 OSS nodes, with Lustre 2.10.6 on both the servers and the clients.

      One of the clients exports Lustre via NFSv3 and SMB. This had been working fine for more than a year, but recently that client started to crash due to a Lustre bug, as follows:

       

      [ 2014.148312] LustreError: 19435:0:(vvp_io.c:1056:vvp_io_write_start()) ASSERTION( vio->vui_iocb->ki_pos == pos ) failed: ki_pos 1209601876 [1209597952, 1210056704)
      [ 2014.148338] LustreError: 19435:0:(vvp_io.c:1056:vvp_io_write_start()) LBUG
      [ 2014.148352] Pid: 19435, comm: nfsd 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019
      [ 2014.148353] Call Trace:
      [ 2014.148376] [<ffffffffc0a0d7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [ 2014.148389] [<ffffffffc0a0d87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [ 2014.148394] [<ffffffffc1061270>] vvp_io_write_start+0x790/0x820 [lustre]
      [ 2014.148419] [<ffffffffc0cb5328>] cl_io_start+0x68/0x130 [obdclass]
      [ 2014.148449] [<ffffffffc0cb74fc>] cl_io_loop+0xcc/0x1c0 [obdclass]
      [ 2014.148462] [<ffffffffc101765b>] ll_file_io_generic+0x63b/0xcb0 [lustre]
      [ 2014.148470] [<ffffffffc10182f2>] ll_file_aio_write+0x442/0x590 [lustre]
      [ 2014.148476] [<ffffffff8d040e6b>] do_sync_readv_writev+0x7b/0xd0
      [ 2014.148480] [<ffffffff8d042aae>] do_readv_writev+0xce/0x260
      [ 2014.148482] [<ffffffff8d042cd5>] vfs_writev+0x35/0x60
      [ 2014.148484] [<ffffffffc0699f90>] nfsd_vfs_write+0xc0/0x3a0 [nfsd]
      [ 2014.148492] [<ffffffffc069c962>] nfsd_write+0x112/0x2a0 [nfsd]
      [ 2014.148498] [<ffffffffc06a3070>] nfsd3_proc_write+0xc0/0x160 [nfsd]
      [ 2014.148504] [<ffffffffc0694810>] nfsd_dispatch+0xe0/0x290 [nfsd]
      [ 2014.148509] [<ffffffffc0610cf3>] svc_process_common+0x493/0x760 [sunrpc]
      [ 2014.148523] [<ffffffffc06110c3>] svc_process+0x103/0x190 [sunrpc]
      [ 2014.148531] [<ffffffffc069416f>] nfsd+0xdf/0x150 [nfsd]
      [ 2014.148535] [<ffffffff8cec1da1>] kthread+0xd1/0xe0
      [ 2014.148539] [<ffffffff8d575c1d>] ret_from_fork_nospec_begin+0x7/0x21
      [ 2014.148543] [<ffffffffffffffff>] 0xffffffffffffffff
      [ 2014.148551] Kernel panic - not syncing: LBUG
      [ 2014.148561] CPU: 2 PID: 19435 Comm: nfsd Kdump: loaded Tainted: G OE ------------ 3.10.0-957.21.3.el7.x86_64 #1
      [ 2014.148579] Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 1.4.8 05/21/2018
      [ 2014.148592] Call Trace:
      [ 2014.148603] [<ffffffff8d563107>] dump_stack+0x19/0x1b
      [ 2014.148615] [<ffffffff8d55c810>] panic+0xe8/0x21f
      [ 2014.148629] [<ffffffffc0a0d8cb>] lbug_with_loc+0x9b/0xa0 [libcfs]
      [ 2014.148650] [<ffffffffc1061270>] vvp_io_write_start+0x790/0x820 [lustre]
      [ 2014.148675] [<ffffffffc0cb3357>] ? cl_lock_request+0x67/0x1f0 [obdclass]
      [ 2014.148699] [<ffffffffc0cb5328>] cl_io_start+0x68/0x130 [obdclass]
      [ 2014.148722] [<ffffffffc0cb74fc>] cl_io_loop+0xcc/0x1c0 [obdclass]
      [ 2014.148739] [<ffffffffc101765b>] ll_file_io_generic+0x63b/0xcb0 [lustre]
      [ 2014.148753] [<ffffffff8ced3250>] ? check_preempt_curr+0x80/0xa0
      [ 2014.148771] [<ffffffffc10182f2>] ll_file_aio_write+0x442/0x590 [lustre]
      [ 2014.148784] [<ffffffff8d040e6b>] do_sync_readv_writev+0x7b/0xd0
      [ 2014.148914] [<ffffffff8d042aae>] do_readv_writev+0xce/0x260
      [ 2014.149049] [<ffffffffc1017eb0>] ? ll_file_splice_read+0x1e0/0x1e0 [lustre]
      [ 2014.149185] [<ffffffffc1018440>] ? ll_file_aio_write+0x590/0x590 [lustre]
      [ 2014.149318] [<ffffffff8d11e003>] ? ima_get_action+0x23/0x30
      [ 2014.149447] [<ffffffff8d11d51e>] ? process_measurement+0x8e/0x250
      [ 2014.149578] [<ffffffff8d03f087>] ? do_dentry_open+0x1e7/0x2e0
      [ 2014.149708] [<ffffffff8d042cd5>] vfs_writev+0x35/0x60
      [ 2014.149841] [<ffffffffc0699f90>] nfsd_vfs_write+0xc0/0x3a0 [nfsd]
      [ 2014.149975] [<ffffffffc069c962>] nfsd_write+0x112/0x2a0 [nfsd]
      [ 2014.150109] [<ffffffffc06a3070>] nfsd3_proc_write+0xc0/0x160 [nfsd]
      [ 2014.150243] [<ffffffffc0694810>] nfsd_dispatch+0xe0/0x290 [nfsd]
      [ 2014.150381] [<ffffffffc0610cf3>] svc_process_common+0x493/0x760 [sunrpc]
      [ 2014.150489] LustreError: 19462:0:(vvp_io.c:1056:vvp_io_write_start()) ASSERTION( vio->vui_iocb->ki_pos == pos ) failed: ki_pos 1211699028 [1211695104, 1212153856)
      [ 2014.150491] LustreError: 19462:0:(vvp_io.c:1056:vvp_io_write_start()) LBUG
      [ 2014.150492] Pid: 19462, comm: nfsd 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019
      [ 2014.150492] Call Trace:
      [ 2014.150514] [<ffffffffc0a0d7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [ 2014.150519] [<ffffffffc0a0d87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [ 2014.150533] [<ffffffffc1061270>] vvp_io_write_start+0x790/0x820 [lustre]
      [ 2014.150551] [<ffffffffc0cb5328>] cl_io_start+0x68/0x130 [obdclass]
      [ 2014.150564] [<ffffffffc0cb74fc>] cl_io_loop+0xcc/0x1c0 [obdclass]
      [ 2014.150571] [<ffffffffc101765b>] ll_file_io_generic+0x63b/0xcb0 [lustre]
      [ 2014.150577] [<ffffffffc10182f2>] ll_file_aio_write+0x442/0x590 [lustre]
      [ 2014.150580] [<ffffffff8d040e6b>] do_sync_readv_writev+0x7b/0xd0
      [ 2014.150581] [<ffffffff8d042aae>] do_readv_writev+0xce/0x260
      [ 2014.150583] [<ffffffff8d042cd5>] vfs_writev+0x35/0x60
      [ 2014.150589] [<ffffffffc0699f90>] nfsd_vfs_write+0xc0/0x3a0 [nfsd]
      [ 2014.150594] [<ffffffffc069c962>] nfsd_write+0x112/0x2a0 [nfsd]
      [ 2014.150599] [<ffffffffc06a3070>] nfsd3_proc_write+0xc0/0x160 [nfsd]
      [ 2014.150603] [<ffffffffc0694810>] nfsd_dispatch+0xe0/0x290 [nfsd]
      [ 2014.150613] [<ffffffffc0610cf3>] svc_process_common+0x493/0x760 [sunrpc]
      [ 2014.150621] [<ffffffffc06110c3>] svc_process+0x103/0x190 [sunrpc]
      [ 2014.150625] [<ffffffffc069416f>] nfsd+0xdf/0x150 [nfsd]
      [ 2014.150627] [<ffffffff8cec1da1>] kthread+0xd1/0xe0
      [ 2014.150630] [<ffffffff8d575c1d>] ret_from_fork_nospec_begin+0x7/0x21
      [ 2014.150634] [<ffffffffffffffff>] 0xffffffffffffffff
      [ 2014.152515] LustreError: 19480:0:(vvp_io.c:1056:vvp_io_write_start()) ASSERTION( vio->vui_iocb->ki_pos == pos ) failed: ki_pos 1213796180 [1213792256, 1214251008)
      [ 2014.152517] LustreError: 19480:0:(vvp_io.c:1056:vvp_io_write_start()) LBUG
      [ 2014.152518] Pid: 19480, comm: nfsd 3.10.0-957.21.3.el7.x86_64 #1 SMP Tue Jun 18 16:35:19 UTC 2019
      [ 2014.152519] Call Trace:
      [ 2014.152542] [<ffffffffc0a0d7cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
      [ 2014.152548] [<ffffffffc0a0d87c>] lbug_with_loc+0x4c/0xa0 [libcfs]
      [ 2014.152569] [<ffffffffc1061270>] vvp_io_write_start+0x790/0x820 [lustre]
      [ 2014.152593] [<ffffffffc0cb5328>] cl_io_start+0x68/0x130 [obdclass]
      [ 2014.152610] [<ffffffffc0cb74fc>] cl_io_loop+0xcc/0x1c0 [obdclass]
      [ 2014.152620] [<ffffffffc101765b>] ll_file_io_generic+0x63b/0xcb0 [lustre]
      [ 2014.152630] [<ffffffffc10182f2>] ll_file_aio_write+0x442/0x590 [lustre]
      [ 2014.152632] [<ffffffff8d040e6b>] do_sync_readv_writev+0x7b/0xd0
      [ 2014.152634] [<ffffffff8d042aae>] do_readv_writev+0xce/0x260
      [ 2014.152635] [<ffffffff8d042cd5>] vfs_writev+0x35/0x60
      [ 2014.152643] [<ffffffffc0699f90>] nfsd_vfs_write+0xc0/0x3a0 [nfsd]
      [ 2014.152649] [<ffffffffc069c962>] nfsd_write+0x112/0x2a0 [nfsd]
      [ 2014.152655] [<ffffffffc06a3070>] nfsd3_proc_write+0xc0/0x160 [nfsd]
      [ 2014.152661] [<ffffffffc0694810>] nfsd_dispatch+0xe0/0x290 [nfsd]
      [ 2014.152671] [<ffffffffc0610cf3>] svc_process_common+0x493/0x760 [sunrpc]
      [ 2014.152679] [<ffffffffc06110c3>] svc_process+0x103/0x190 [sunrpc]
      [ 2014.152685] [<ffffffffc069416f>] nfsd+0xdf/0x150 [nfsd]
      [ 2014.152687] [<ffffffff8cec1da1>] kthread+0xd1/0xe0
      [ 2014.152689] [<ffffffff8d575c1d>] ret_from_fork_nospec_begin+0x7/0x21
      [ 2014.152693] [<ffffffffffffffff>] 0xffffffffffffffff
      [ 2014.157437] [<ffffffffc06110c3>] svc_process+0x103/0x190 [sunrpc]
      [ 2014.157572] [<ffffffffc069416f>] nfsd+0xdf/0x150 [nfsd]
      [ 2014.157704] [<ffffffffc0694090>] ? nfsd_destroy+0x80/0x80 [nfsd]
      [ 2014.157835] [<ffffffff8cec1da1>] kthread+0xd1/0xe0
      [ 2014.157963] [<ffffffff8cec1cd0>] ? insert_kthread_work+0x40/0x40
      [ 2014.158094] [<ffffffff8d575c1d>] ret_from_fork_nospec_begin+0x7/0x21
      [ 2014.158224] [<ffffffff8cec1cd0>] ? insert_kthread_work+0x40/0x40

       

      We have since updated that client to Lustre 2.12.2, but it did not help.
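
      To compare the three assertion messages in the log above, the following sketch extracts the numbers from each message and reports how far ki_pos sits past the start of the bracketed range. It assumes the bracketed pair is the [start, end) byte range of the write being processed, which the log itself does not state. The names ASSERT_RE and summarize are illustrative only and are not part of Lustre.

```python
import re

# Matches the tail of the assertion message: "ki_pos <actual> [<start>, <end>)"
ASSERT_RE = re.compile(r"ki_pos (\d+) \[(\d+), (\d+)\)")


def summarize(line):
    """Return (ki_pos - start, end - start) for one assertion line, or None."""
    m = ASSERT_RE.search(line)
    if m is None:
        return None
    ki_pos, start, end = (int(g) for g in m.groups())
    return ki_pos - start, end - start


# The three assertion messages from the log, trimmed to the relevant part.
lines = [
    "failed: ki_pos 1209601876 [1209597952, 1210056704)",
    "failed: ki_pos 1211699028 [1211695104, 1212153856)",
    "failed: ki_pos 1213796180 [1213792256, 1214251008)",
]

for line in lines:
    print(summarize(line))
```

      For all three messages the result is the same: ki_pos is 3924 bytes past the start of the range, and the range is 458752 bytes long.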

          Activity

            lixi_wc Li Xi made changes -
            Link New: This issue is related to DDN-1151 [ DDN-1151 ]
            lixi_wc Li Xi made changes -
            Labels Original: exap
            wshilong Wang Shilong (Inactive) made changes -
            Labels New: exap
            mdiep Minh Diep made changes -
            Link Original: This issue is related to JFC-28 [ JFC-28 ]
            pjones Peter Jones made changes -
            Link Original: This issue is related to JFC-17 [ JFC-17 ]
            pjones Peter Jones made changes -
            Link New: This issue is related to JFC-20 [ JFC-20 ]
            pjones Peter Jones made changes -
            Labels Original: LTS12
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.12.4 [ 14690 ]
            mdiep Minh Diep made changes -
            Link New: This issue is related to JFC-28 [ JFC-28 ]
            pjones Peter Jones made changes -
            Link New: This issue is related to JFC-17 [ JFC-17 ]

            People

              Assignee: bobijam Zhenyu Xu
              Reporter: halifu Saerda Halifu (Inactive)