[LU-2129] ASSERTION( last_rec->lrh_index == tail->lrt_index ) Created: 09/Oct/12  Updated: 20/Nov/12  Resolved: 20/Nov/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Blocker
Reporter: Minh Diep Assignee: Mikhail Pershin
Resolution: Fixed Votes: 0
Labels: NFBlocker
Environment:

https://build.whamcloud.com/job/lustre-master/951/
On opensfs with 1 MDS, 4 OSS, 24 clients


Severity: 3
Rank (Obsolete): 5131

 Description   

Lustre: DEBUG MARKER: ----============= acceptance-small: parallel-scale ============---- Tue Oct 9 09:38:01 PDT 2012
Lustre: DEBUG MARKER: excepting tests: parallel_grouplock
Lustre: DEBUG MARKER: Using TIMEOUT=20
LustreError: 13309:0:(osd_io.c:1062:osd_ldiskfs_write_record()) sdd1: error reading offset 12288 (block 3): rc = -28
LustreError: 13309:0:(llog_osd.c:175:llog_osd_write_blob()) lustre-MDT0000-osd: error writing log buffer: rc = -28
LustreError: 13309:0:(mgs_llog.c:738:record_lcfg()) failed -28
LustreError: 13309:0:(mgs_llog.c:1074:mgs_write_log_direct_all()) MGS: writing log lustre-params: rc = -28
LustreError: 13309:0:(osd_io.c:1062:osd_ldiskfs_write_record()) sdd1: error reading offset 12288 (block 3): rc = -28
LustreError: 13309:0:(llog_osd.c:175:llog_osd_write_blob()) lustre-MDT0000-osd: error writing log buffer: rc = -28
LustreError: 13309:0:(mgs_llog.c:738:record_lcfg()) failed -28
LustreError: 13309:0:(mgs_llog.c:1074:mgs_write_log_direct_all()) MGS: writing log lustre-OST0000: rc = -28
LustreError: 13309:0:(mgs_llog.c:2956:mgs_write_log_param()) err -28 on param 'sys.jobid_var=procname_uid'
LustreError: 13309:0:(mgs_handler.c:926:mgs_iocontrol()) MGS: setparam err: rc = -28
LustreError: 11580:0:(llog_osd.c:579:llog_osd_next_block()) ASSERTION( last_rec->lrh_index == tail->lrt_index ) failed:
LustreError: 11580:0:(llog_osd.c:579:llog_osd_next_block()) LBUG
Pid: 11580, comm: ll_mgs_0003

Call Trace:
[<ffffffffa0991905>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa0991f17>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa0cd6435>] llog_osd_next_block+0xa55/0xa60 [obdclass]
[<ffffffffa151d436>] ? lustre_pack_reply_flags+0xb6/0x210 [ptlrpc]
[<ffffffffa153a80c>] llog_origin_handle_next_block+0x50c/0x780 [ptlrpc]
[<ffffffffa1747f93>] mgs_handle+0xb13/0x11e0 [mgs]
[<ffffffffa09a2621>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[<ffffffffa152d8dc>] ptlrpc_server_handle_request+0x41c/0xe00 [ptlrpc]
[<ffffffff8104e309>] ? __wake_up_common+0x59/0x90
[<ffffffff810533f3>] ? __wake_up+0x53/0x70
[<ffffffffa152eecc>] ptlrpc_main+0xc0c/0x19f0 [ptlrpc]
[<ffffffffa152e2c0>] ? ptlrpc_main+0x0/0x19f0 [ptlrpc]
[<ffffffff8100c14a>] child_rip+0xa/0x20
[<ffffffffa152e2c0>] ? ptlrpc_main+0x0/0x19f0 [ptlrpc]
[<ffffffffa152e2c0>] ? ptlrpc_main+0x0/0x19f0 [ptlrpc]
[<ffffffff8100c140>] ? child_rip+0x0/0x20

Kernel panic - not syncing: LBUG
Pid: 11580, comm: ll_mgs_0003 Not tainted 2.6.32-279.5.1.el6_lustre.x86_64 #1
Call Trace:
[<ffffffff814fd58a>] ? panic+0xa0/0x168
[<ffffffffa0991f6b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
[<ffffffffa0cd6435>] ? llog_osd_next_block+0xa55/0xa60 [obdclass]
[<ffffffffa151d436>] ? lustre_pack_reply_flags+0xb6/0x210 [ptlrpc]
[<ffffffffa153a80c>] ? llog_origin_handle_next_block+0x50c/0x780 [ptlrpc]
[<ffffffffa1747f93>] ? mgs_handle+0xb13/0x11e0 [mgs]
[<ffffffffa09a2621>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[<ffffffffa152d8dc>] ? ptlrpc_server_handle_request+0x41c/0xe00 [ptlrpc]
[<ffffffff8104e309>] ? __wake_up_common+0x59/0x90
[<ffffffff810533f3>] ? __wake_up+0x53/0x70
[<ffffffffa152eecc>] ? ptlrpc_main+0xc0c/0x19f0 [ptlrpc]
[<ffffffffa152e2c0>] ? ptlrpc_main+0x0/0x19f0 [ptlrpc]
[<ffffffff8100c14a>] ? child_rip+0xa/0x20
[<ffffffffa152e2c0>] ? ptlrpc_main+0x0/0x19f0 [ptlrpc]
[<ffffffffa152e2c0>] ? ptlrpc_main+0x0/0x19f0 [ptlrpc]
[<ffffffff8100c140>] ? child_rip+0x0/0x20



 Comments   
Comment by Oleg Drokin [ 09/Oct/12 ]

This happens in my testing at random tests.
The last occurrence was this morning; a crashdump is available:

[10933.785890] LustreError: 11126:0:(lmv_obd.c:1190:lmv_statfs()) can't stat MDS #0 (lustre-MDT0000-mdc-ffff880096259bf0), error -5
[10933.786846] LustreError: 11126:0:(llite_lib.c:1558:ll_statfs_internal()) md_statfs fails: rc = -5
[10935.038782] Lustre: DEBUG MARKER: == replay-single test 21: |X| open(O_CREAT), unlink touch new, replay, close (test mds_cleanup_orphans) == 06:04:11 (1349777051)
[10935.209519] Turning device loop0 (0x700000) read-only
[10935.233110] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
[10935.239897] Lustre: DEBUG MARKER: local REPLAY BARRIER on lustre-MDT0000
[10935.555060] Removing read-only on unknown block (0x700000)
[10945.911054] LDISKFS-fs (loop0): recovery complete
[10945.912650] LDISKFS-fs (loop0): mounted filesystem with ordered data mode. quota=on. Opts: 
[10945.944018] LustreError: 11387:0:(llog_osd.c:579:llog_osd_next_block()) ASSERTION( last_rec->lrh_index == tail->lrt_index ) failed: 
[10945.944868] LustreError: 11387:0:(llog_osd.c:579:llog_osd_next_block()) LBUG
[10945.945387] Pid: 11387, comm: ll_mgs_0001
[10945.945389] 
[10945.945389] Call Trace:
[10945.945410]  [<ffffffffa0c24915>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[10945.945424]  [<ffffffffa0c24f27>] lbug_with_loc+0x47/0xb0 [libcfs]
[10945.945469]  [<ffffffffa1124af5>] llog_osd_next_block+0xa55/0xa60 [obdclass]
[10945.945521]  [<ffffffffa1288dc6>] ? lustre_pack_reply_flags+0xb6/0x210 [ptlrpc]
[10945.945542]  [<ffffffffa12a542c>] llog_origin_handle_next_block+0x55c/0x780 [ptlrpc]
[10945.945549]  [<ffffffffa066ef73>] mgs_handle+0xb13/0x11e0 [mgs]
[10945.945557]  [<ffffffffa0c356d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[10945.945576]  [<ffffffffa1297853>] ptlrpc_server_handle_request+0x463/0xe70 [ptlrpc]
[10945.945587]  [<ffffffffa0c2566e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
[10945.945614]  [<ffffffffa1290541>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc]
[10945.945619]  [<ffffffff81051f73>] ? __wake_up+0x53/0x70
[10945.945646]  [<ffffffffa129a3ea>] ptlrpc_main+0xb9a/0x1960 [ptlrpc]
[10945.945673]  [<ffffffffa1299850>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
[10945.945676]  [<ffffffff8100c14a>] child_rip+0xa/0x20
[10945.945702]  [<ffffffffa1299850>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
[10945.945744]  [<ffffffffa1299850>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
[10945.945748]  [<ffffffff8100c140>] ? child_rip+0x0/0x20
[10945.945749] 
[10945.948136] Kernel panic - not syncing: LBUG
[10945.948138] Pid: 11387, comm: ll_mgs_0001 Not tainted 2.6.32-debug #6
[10945.948139] Call Trace:
[10945.948146]  [<ffffffff814f75e4>] ? panic+0xa0/0x168
[10945.948155]  [<ffffffffa0c24f7b>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
[10945.948198]  [<ffffffffa1124af5>] ? llog_osd_next_block+0xa55/0xa60 [obdclass]
[10945.948242]  [<ffffffffa1288dc6>] ? lustre_pack_reply_flags+0xb6/0x210 [ptlrpc]
[10945.948285]  [<ffffffffa12a542c>] ? llog_origin_handle_next_block+0x55c/0x780 [ptlrpc]
[10945.948296]  [<ffffffffa066ef73>] ? mgs_handle+0xb13/0x11e0 [mgs]
[10945.948312]  [<ffffffffa0c356d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[10945.948354]  [<ffffffffa1297853>] ? ptlrpc_server_handle_request+0x463/0xe70 [ptlrpc]
[10945.948368]  [<ffffffffa0c2566e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
[10945.948409]  [<ffffffffa1290541>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc]
[10945.948413]  [<ffffffff81051f73>] ? __wake_up+0x53/0x70
[10945.948452]  [<ffffffffa129a3ea>] ? ptlrpc_main+0xb9a/0x1960 [ptlrpc]
[10945.948492]  [<ffffffffa1299850>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
[10945.948496]  [<ffffffff8100c14a>] ? child_rip+0xa/0x20
[10945.948540]  [<ffffffffa1299850>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
[10945.948620]  [<ffffffffa1299850>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
[10945.948624]  [<ffffffff8100c140>] ? child_rip+0x0/0x20
Comment by Mikhail Pershin [ 10/Oct/12 ]

I've added a debug patch to get more information about this issue:
http://review.whamcloud.com/#change,4240

Oleg, please use it if you can reproduce this bug.

Comment by Oleg Drokin [ 15/Oct/12 ]

Just reproduced

[49201.778332] Lustre: DEBUG MARKER: == replay-single test 40: cause recovery in ptlrpc, ensure IO continues == 12:28:41 (1350318521)
[49201.795828] Lustre: DEBUG MARKER: multiop /mnt/lustre/f.replay-single.40 OS_c
[49214.016506] LDISKFS-fs (loop0): mounted filesystem with ordered data mode. quota=on. Opts: 
[49214.071707] LustreError: 2626:0:(llog_osd.c:588:llog_osd_next_block()) ASSERTION( last_rec->lrh_index == tail->lrt_index ) failed: Rec idx 0, tail idx -30720 in llog #0x2#0x1#00000000 offset 92312, len 2200
[49214.072538] LustreError: 2626:0:(llog_osd.c:588:llog_osd_next_block()) LBUG
[49214.072817] Pid: 2626, comm: ll_mgs_0001
[49214.073065] 
[49214.073065] Call Trace:
[49214.073473]  [<ffffffffa0b34915>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[49214.073788]  [<ffffffffa0b34f27>] lbug_with_loc+0x47/0xb0 [libcfs]
[49214.074118]  [<ffffffffa0ecf5f9>] llog_osd_next_block+0xa89/0xa90 [obdclass]
[49214.074455]  [<ffffffffa1092a96>] ? lustre_pack_reply_flags+0xb6/0x210 [ptlrpc]
[49214.074931]  [<ffffffffa10af05c>] llog_origin_handle_next_block+0x55c/0x780 [ptlrpc]
[49214.075406]  [<ffffffffa06bef73>] mgs_handle+0xb13/0x11e0 [mgs]
[49214.075679]  [<ffffffffa0b456d1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[49214.075980]  [<ffffffffa10a1483>] ptlrpc_server_handle_request+0x463/0xe70 [ptlrpc]
[49214.076443]  [<ffffffffa0b3566e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
[49214.076736]  [<ffffffffa109a171>] ? ptlrpc_wait_event+0xb1/0x2a0 [ptlrpc]
[49214.077051]  [<ffffffff81051f73>] ? __wake_up+0x53/0x70
[49214.077330]  [<ffffffffa10a401a>] ptlrpc_main+0xb9a/0x1960 [ptlrpc]
[49214.077710]  [<ffffffffa10a3480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
[49214.077993]  [<ffffffff8100c14a>] child_rip+0xa/0x20
[49214.078267]  [<ffffffffa10a3480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
[49214.078583]  [<ffffffffa10a3480>] ? ptlrpc_main+0x0/0x1960 [ptlrpc]
[49214.078860]  [<ffffffff8100c140>] ? child_rip+0x0/0x20

Comment by Mikhail Pershin [ 17/Oct/12 ]

As I suspected, the llog contains partially written records. This must be the result of llog_osd_write_blob() being called with a separate buffer to write. In that case there are actually three separate writes - record header, buffer, and record tail - so the first write can succeed while the others fail, which triggers the assertion upon read.
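
For illustration, here is a minimal userspace sketch of why a partial write trips that check. The two structs are simplified from the real llog_rec_hdr/llog_rec_tail (only the relevant fields), and llog_record_consistent() is a hypothetical helper standing in for the LASSERT in llog_osd_next_block():

#include <stdint.h>
#include <errno.h>

/* Simplified record layout, modeled on the real llog_rec_hdr and
 * llog_rec_tail (only the fields relevant to this assertion). */
struct rec_hdr {
	uint32_t lrh_len;	/* total record length */
	uint32_t lrh_index;	/* record index within the llog */
};

struct rec_tail {
	uint32_t lrt_len;
	uint32_t lrt_index;	/* must equal lrh_index of the same record */
};

/* Hypothetical helper mirroring the failed LASSERT: if the header was
 * written but the buffer/tail writes never made it to disk (e.g. the
 * -28/-ENOSPC errors above), the tail still carries stale bytes and
 * the two indices disagree. */
static int llog_record_consistent(const struct rec_hdr *hdr,
				  const struct rec_tail *tail)
{
	return hdr->lrh_index == tail->lrt_index ? 0 : -EINVAL;
}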

Comment by Mikhail Pershin [ 18/Oct/12 ]

A possible solution would be locking: take a write lock around those three writes and a read lock in llog_osd_next_block(). Given that the "3 writes" case is not common (it is used only for config llogs), that shouldn't cause much contention.

Meanwhile I wonder whether we can avoid those three writes altogether by requiring that llog_write not be called with an external buffer but only with a fully prepared llog record. That means the MGS would have to prepare the llog record by copying the lustre_cfg buffer. Such overhead is not critical for the MGS, and we would avoid that buggy case with three separate writes in llog_osd.
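
The general shape of the first, locking-based option, as a self-contained userspace sketch using a pthread rwlock in place of the actual dt-object locking (all names here are illustrative, not the eventual patch):

#include <pthread.h>
#include <string.h>

/* Illustrative model of the locking proposal: the writer holds a write
 * lock across all three partial writes (header, body, tail), so a
 * reader holding the read lock can never observe a half-written
 * record.  In the real code the lock would cover the writes issued by
 * llog_osd_write_blob() and the read in llog_osd_next_block(). */
static pthread_rwlock_t llog_lock = PTHREAD_RWLOCK_INITIALIZER;

static void write_record(char *block, const char *hdr, size_t hlen,
			 const char *body, size_t blen,
			 const char *tail, size_t tlen)
{
	pthread_rwlock_wrlock(&llog_lock);
	memcpy(block, hdr, hlen);			/* write 1: header */
	memcpy(block + hlen, body, blen);		/* write 2: body   */
	memcpy(block + hlen + blen, tail, tlen);	/* write 3: tail   */
	pthread_rwlock_unlock(&llog_lock);
}

static void read_block(char *dst, const char *block, size_t len)
{
	pthread_rwlock_rdlock(&llog_lock);		/* reader side */
	memcpy(dst, block, len);
	pthread_rwlock_unlock(&llog_lock);
}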

Comment by Mikhail Pershin [ 19/Oct/12 ]

Fix with locking: the write lock wraps the problematic multiple writes, while dt_read() is protected with a read lock.
http://review.whamcloud.com/4303

Another, and probably better, approach would be to remove those multiple writes and write each llog record as a single block. That would require more work, but it would clean up the llog code nicely.
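
The single-block alternative would amount to something like the sketch below: assemble header, payload, and tail into one contiguous buffer so the log layer issues exactly one write. The struct and function names are illustrative, not the eventual implementation:

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copy the caller's payload (e.g. a lustre_cfg buffer on the MGS) into
 * one fully formed record - header, body and tail in a single
 * contiguous allocation - so header and tail can never disagree on
 * disk. */
struct rec_hdr  { uint32_t len; uint32_t index; };
struct rec_tail { uint32_t len; uint32_t index; };

static void *build_full_record(uint32_t index, const void *payload,
			       uint32_t plen, uint32_t *reclen)
{
	uint32_t len = sizeof(struct rec_hdr) + plen + sizeof(struct rec_tail);
	char *buf = calloc(1, len);
	struct rec_hdr *hdr;
	struct rec_tail *tail;

	if (buf == NULL)
		return NULL;

	hdr  = (struct rec_hdr *)buf;
	tail = (struct rec_tail *)(buf + len - sizeof(*tail));
	hdr->len   = tail->len   = len;
	hdr->index = tail->index = index;	/* always consistent */
	memcpy(buf + sizeof(*hdr), payload, plen);

	*reclen = len;
	return buf;	/* caller issues a single write, then frees it */
}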

Comment by Mikhail Pershin [ 12/Nov/12 ]

The patch is ready for landing; I need confirmation from Oleg that the bug is not seen anymore.

Comment by Andreas Dilger [ 20/Nov/12 ]

Patch landed, closing bug.
