[LU-753] kernel:LustreError: 4998:0:(filter_io_26.c:791:filter_commitrw_write()) LBUG Created: 11/Oct/11  Updated: 16/Apr/13  Resolved: 10/Jan/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.2.0, Lustre 2.4.0, Lustre 2.1.1

Type: Bug Priority: Critical
Reporter: Jinshan Xiong (Inactive) Assignee: Niu Yawei (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Attachments: File 4998.log.gz    
Issue Links:
Related
is related to LU-433 remove jbd2-jcberr patch from kernel Resolved
Severity: 3
Rank (Obsolete): 4873

 Description   

I hit this bug when I do the following thing:

On a fresh lustre env:
1. lfs setstripe -c -1 /mnt/lustre
2. dd if=/dev/zero of=/mnt/lustre/testfile bs=1M count=1500 oflag=direct
3. do rewrite: dd if=/dev/zero of=/mnt/lustre/testfile bs=1M count=1500 oflag=direct

it hit this bug, and the stack trace on the OSS is:

LustreError: 4998:0:(filter_io_26.c:791:filter_commitrw_write()) ASSERTION(oti->oti_transno <= obd->obd_last_committed) failed: oti_transno 752 last_committed 750
LustreError: 4998:0:(filter_io_26.c:791:filter_commitrw_write()) LBUG
Pid: 4998, comm: ll_ost_io_79

Call Trace:
[<ffffffffa02c2855>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa02c2e95>] lbug_with_loc+0x75/0xe0 [libcfs]
[<ffffffffa08c3b11>] filter_commitrw_write+0x2dc1/0x2ec8 [obdfilter]
[<ffffffffa0339fd1>] ? lnet_ni_send+0x51/0xe0 [lnet]
[<ffffffffa0338cf7>] ? LNetMDBind+0x1c7/0x410 [lnet]
[<ffffffffa050a35a>] ? ptlrpc_start_bulk_transfer+0x15a/0x620 [ptlrpc]
[<ffffffffa08b45bd>] filter_commitrw+0x2bd/0x2e0 [obdfilter]
[<ffffffffa050f6b5>] ? lustre_msg_buf+0x85/0x90 [ptlrpc]
[<ffffffffa053b82b>] ? __req_capsule_get+0x14b/0x6b0 [ptlrpc]
[<ffffffffa03c398c>] ? lprocfs_counter_add+0x12c/0x196 [lvfs]
[<ffffffffa087c4aa>] obd_commitrw+0x11a/0x410 [ost]
[<ffffffffa088674b>] ost_brw_write+0xf2b/0x18a0 [ost]
[<ffffffffa05109d1>] ? lustre_pack_reply_flags+0xd1/0x1f0 [ptlrpc]
[<ffffffff8126ac7c>] ? put_dec+0x10c/0x110
[<ffffffff8126af6e>] ? number+0x2ee/0x320
[<ffffffffa04d1890>] ? target_bulk_timeout+0x0/0xe0 [ptlrpc]
[<ffffffffa088ad35>] ost_handle+0x3325/0x4b90 [ost]
[<ffffffffa02cd6b1>] ? libcfs_debug_vmsg2+0x4d1/0xb50 [libcfs]
[<ffffffffa050f314>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
[<ffffffffa0511a5c>] ? lustre_msg_get_status+0x3c/0xa0 [ptlrpc]
[<ffffffffa051fe8e>] ptlrpc_main+0xb8e/0x1900 [ptlrpc]
[<ffffffffa051f300>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
[<ffffffff8100c1ca>] child_rip+0xa/0x20
[<ffffffffa051f300>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
[<ffffffffa051f300>] ? ptlrpc_main+0x0/0x1900 [ptlrpc]
[<ffffffff8100c1c0>] ? child_rip+0x0/0x20

LustreError: dumping log to /tmp/lustre-log.1318354849.4998


Info required for matching: sanity-quota test_23a 23a



 Comments   
Comment by Jinshan Xiong (Inactive) [ 11/Oct/11 ]

I didn't look into this bug yet so I'm not sure if there is a quick reproduce path.

Comment by Jinshan Xiong (Inactive) [ 13/Oct/11 ]

I took a look at this issue today because it stops my test frequently.

The root cause of this issue is that fsfilt_commit_wait() returns before j_commit_callback is called for that specific commit sequence.

Let's say proc1 is journal commit process, proc2 is the waiting process(filter write process), and there are two transaction t1 and t2 pending, proc2 wants to wait for t2 to be finished.

proc1:                                                                proc2:
                                                                          wait for t2
finish t1
call commit callback for t1, update committed_transno
wake up proc2
finish t2
                                                                          see t2 is finished, return
                                                                          hit LBUG since transno is actually less then committed_transno

call commit callback for t2, update committed_transno

I have no idea to fix this issue gracefully. Maybe we should patch jbd2 to let it call commit_callback before updating j_commit_sequence; or modify lustre to actually wait for committed_transno.

Comment by Jinshan Xiong (Inactive) [ 13/Oct/11 ]

I made a hot fix at: http://review.whamcloud.com/1519

So if anybody else has the same issue, this fix can be applied to move things forward. The real fix should be contributed by ldiskfs expert.

Comment by Zhenyu Xu [ 20/Oct/11 ]

another hit at https://maloo.whamcloud.com/test_sets/fa85f7c8-fad2-11e0-bbc0-52540025f9af

Comment by Mikhail Pershin [ 21/Oct/11 ]

Another hit:
https://maloo.whamcloud.com/test_sets/a6ab3b88-fb9d-11e0-88a9-52540025f9af

Comment by Peter Jones [ 21/Oct/11 ]

Niu

Could you please look into this one? It is causing regular autotest failures

Thanks

Peter

Comment by Jinshan Xiong (Inactive) [ 21/Oct/11 ]

It would be fine to just comment it out.

Comment by Andreas Dilger [ 21/Oct/11 ]

I think the root of the problem is that fsfilt_commit_wait() is returning too early. Any workarounds at the higher level are just wrong.

Maybe I'm missing something, but jbd2_journal_commit_transaction() on my system looks like:

        if (journal->j_commit_callback)
                journal->j_commit_callback(journal, commit_transaction);

        trace_jbd2_end_commit(journal, commit_transaction);
        jbd_debug(1, "JBD: commit %d complete, head %d\n",
                  journal->j_commit_sequence, journal->j_tail_sequence);
        if (to_free)
                kfree(commit_transaction);

        wake_up(&journal->j_wait_done_commit);

so it is calling the commit callback before waking up the threads waiting in jbd2_log_wait_commit():

       while (tid_gt(tid, journal->j_commit_sequence)) {
                jbd_debug(1, "JBD: want %d, j_commit_sequence=%d\n",
                                  tid, journal->j_commit_sequence);
                wake_up(&journal->j_wait_commit);
                read_unlock(&journal->j_state_lock);
                wait_event(journal->j_wait_done_commit,
                                !tid_gt(tid, journal->j_commit_sequence));
                read_lock(&journal->j_state_lock);
        }

However, I guess if this thread is getting spurious wakeups for some reason, it might just wake up after jbd2_journal_commit_transaction() updated j_commit_sequence and dropped j_state_lock and exit before the commit callback is called.

It looks like the root cause of this problem is the new jbd2 commit callback patch (LU-433, http://review.whamcloud.com/983). It looks like the commit callback is now at the end of jbd2_journal_commit_transaction(), while the old commit callback was just before "restart_loop:" before j_commit_sequence was updated.

That said, I'm not sure the right way to fix this. Technically, the code is correct in that jbd2_log_wait_commit() is waiting until the transaction is complete, but it is racy if one considers the commit callbacks as part of the transaction. It doesn't seem very correct to move the commit callback earlier, and it may even improve performance now that the transaction can complete before the callbacks are run.

However, we lose out on some valuable sanity checking in the obdfilter::filter_commitrw_write() that the data is on disk after we have returned from fsfilt_commit_wait(). Maybe the value of this LASSERT() is declining, since this code will all disappear once we move to OFD/OSD?

Comment by Andreas Dilger [ 21/Oct/11 ]

Hmm, going further, if we remove this LASSERT() then it will push the problem back to the client. The client reply will have transno < last_committed, and it would have to keep the bulk RPC for replay, even though it may have asked for a sync IO.

It looks like the right solution is to wait for the commit callback to be run before returning to the client:

        /* In rare cases fsfilt_commit_wait() will wake up and return after
         * the transaction has finished its work and updated j_commit_sequence
         * but the commit callbacks have not been run yet.  Wait here until
         * that is finished so that clients requesting sync IO don't see the
         * reply transno < last_committed.  LU-753 */
        if (unlikely(obd->obd_replayable && !rc && wait_handle &&
                     oti->oti_transno <= obd->obd_last_committed)) {
                struct l_wait_info lwi = { 0 };
                l_wait_event(&waitq, oti->oti_transno >=obd->obd_last_committed,
                             &lwi);
        }
Comment by Jinshan Xiong (Inactive) [ 21/Oct/11 ]

Does clients check if comitted_transno is not less than RPC transno, I tend to this it will be okay they don't. The only side effect is that clients have to hold those pages for longer time until up-to-date committed transno is broadcasted by next RPC.

Comment by Niu Yawei (Inactive) [ 22/Oct/11 ]

Does clients check if comitted_transno is not less than RPC transno, I tend to this it will be okay they don't. The only side effect is that clients have to hold those pages for longer time until up-to-date committed transno is broadcasted by next RPC.

indeed, seems keeping the replay a little bit longer in very rare case isn't harmful.Andreas, any objections?

Comment by Andreas Dilger [ 22/Oct/11 ]

If the client requested sync RPCs for some reason (e.g. liblustre, some other client that doesnt handle bulk RPC replay, if the client RAM is short) then the OST should honor this request.

I don't think this will delay RPC replies in most cases. Consider how many bulk writes have been sent in every test run since LU-433 has been landed agains how few times this LASSERT() has been hit. This means the performance implact of implementing correct sync RPCs is basically zero.

Please add the l_wait_event() as suggested.

Comment by Niu Yawei (Inactive) [ 23/Oct/11 ]

http://review.whamcloud.com/#change,1583

Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » x86_64,client,el5,ofa #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » i686,client,el6,inkernel #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » x86_64,server,el6,inkernel #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » x86_64,client,el5,inkernel #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » x86_64,server,el5,inkernel #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » x86_64,client,sles11,inkernel #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » x86_64,client,el6,inkernel #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » x86_64,server,el5,ofa #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » i686,client,el5,ofa #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » x86_64,client,ubuntu1004,inkernel #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » i686,server,el6,inkernel #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » i686,server,el5,ofa #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » i686,server,el5,inkernel #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Build Master (Inactive) [ 26/Oct/11 ]

Integrated in lustre-master » i686,client,el5,inkernel #312
LU-753 obdfilter: improper LASSERT in filter_commitrw_write()

Oleg Drokin : 0935fc8eae15696820f87951d8955a9c8bc66177
Files :

  • lustre/obdfilter/filter_io_26.c
Comment by Peter Jones [ 03/Nov/11 ]

Landed for 2.2

Comment by Build Master (Inactive) [ 02/May/12 ]

Integrated in lustre-dev » x86_64,client,el5,inkernel #340
LU-753 obd: remove obsolete commit callback code (Revision a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9)

Result = SUCCESS
Mikhail Pershin : a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9
Files :

  • lustre/include/obd.h
Comment by Build Master (Inactive) [ 02/May/12 ]

Integrated in lustre-dev » i686,client,el6,inkernel #340
LU-753 obd: remove obsolete commit callback code (Revision a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9)

Result = SUCCESS
Mikhail Pershin : a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9
Files :

  • lustre/include/obd.h
Comment by Build Master (Inactive) [ 02/May/12 ]

Integrated in lustre-dev » i686,server,el5,inkernel #340
LU-753 obd: remove obsolete commit callback code (Revision a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9)

Result = SUCCESS
Mikhail Pershin : a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9
Files :

  • lustre/include/obd.h
Comment by Build Master (Inactive) [ 02/May/12 ]

Integrated in lustre-dev » x86_64,server,el6,inkernel #340
LU-753 obd: remove obsolete commit callback code (Revision a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9)

Result = SUCCESS
Mikhail Pershin : a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9
Files :

  • lustre/include/obd.h
Comment by Build Master (Inactive) [ 02/May/12 ]

Integrated in lustre-dev » i686,client,el5,inkernel #340
LU-753 obd: remove obsolete commit callback code (Revision a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9)

Result = SUCCESS
Mikhail Pershin : a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9
Files :

  • lustre/include/obd.h
Comment by Build Master (Inactive) [ 02/May/12 ]

Integrated in lustre-dev » x86_64,server,el5,inkernel #340
LU-753 obd: remove obsolete commit callback code (Revision a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9)

Result = SUCCESS
Mikhail Pershin : a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9
Files :

  • lustre/include/obd.h
Comment by Build Master (Inactive) [ 02/May/12 ]

Integrated in lustre-dev » x86_64,client,el6,inkernel #340
LU-753 obd: remove obsolete commit callback code (Revision a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9)

Result = SUCCESS
Mikhail Pershin : a5f2ebc2f26186ace7cdd17c3f6c934747fa1db9
Files :

  • lustre/include/obd.h
Generated at Sat Feb 10 01:10:05 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.