[LU-11195] LBUG: (lod_lov.c:633:lod_free_comp_buffer()) ASSERTION( entry->llc_stripe == ((void *)0) ) Created: 30/Jul/18  Updated: 13/Apr/21  Resolved: 04/Sep/18

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.0, Lustre 2.11.0
Fix Version/s: Lustre 2.12.0, Lustre 2.10.7

Type: Bug Priority: Major
Reporter: Ann Koehler (Inactive) Assignee: Ann Koehler (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Attachments: File dklog_plus_extracts.tgz    
Issue Links:
Duplicate
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The primary MDS and then its backup panic when replaying layout change requests on a PFL file.

[  842.061520] LustreError: 22966:0:(lod_lov.c:633:lod_free_comp_buffer()) ASSERTION( entry->llc_stripe == ((void *)0) ) failed: 
[  842.075148] LustreError: 22966:0:(lod_lov.c:633:lod_free_comp_buffer()) LBUG 
[  842.083308] Pid: 22966, comm: mdt03_001

crash_x86_64> bt
PID: 22966  TASK: ffff881f0cbfdee0  CPU: 18  COMMAND: "mdt03_001"
 #0 [ffff881f0c4f7680] machine_kexec at ffffffff8105d77b
 #1 [ffff881f0c4f76e0] __crash_kexec at ffffffff81108742
 #2 [ffff881f0c4f77b0] panic at ffffffff816a863f
 #3 [ffff881f0c4f7830] lbug_with_loc at ffffffffc0b65854 [libcfs]
 #4 [ffff881f0c4f7850] lod_free_comp_buffer at ffffffffc151b62f [lod]
 #5 [ffff881f0c4f7898] lod_free_comp_entries at ffffffffc151ff57 [lod]
 #6 [ffff881f0c4f78b0] lod_parse_striping at ffffffffc1522140 [lod]
 #7 [ffff881f0c4f7920] lod_load_striping_locked at ffffffffc152262a [lod]
 #8 [ffff881f0c4f7958] lod_declare_layout_change at ffffffffc153d0bf [lod]
 #9 [ffff881f0c4f79c0] mdd_declare_layout_change at ffffffffc13e9102 [mdd]
#10 [ffff881f0c4f79e8] mdd_layout_change at ffffffffc13f110d [mdd]
#11 [ffff881f0c4f7a40] mdt_layout_change at ffffffffc145aaf8 [mdt]
#12 [ffff881f0c4f7a88] mdt_intent_layout at ffffffffc146220a [mdt]
#13 [ffff881f0c4f7b40] mdt_intent_opc at ffffffffc1458315 [mdt]
#14 [ffff881f0c4f7ba0] mdt_intent_policy at ffffffffc145ff63 [mdt]
#15 [ffff881f0c4f7be0] ldlm_lock_enqueue at ffffffffc0e92f0e [ptlrpc]
#16 [ffff881f0c4f7c48] ldlm_handle_enqueue0 at ffffffffc0ebb753 [ptlrpc]
#17 [ffff881f0c4f7cd8] tgt_enqueue at ffffffffc0f41b32 [ptlrpc]
#18 [ffff881f0c4f7cf8] tgt_request_handle at ffffffffc0f454da [ptlrpc]
#19 [ffff881f0c4f7d40] ptlrpc_server_handle_request at ffffffffc0eed733 [ptlrpc]
#20 [ffff881f0c4f7de0] ptlrpc_main at ffffffffc0ef0ee2 [ptlrpc]
#21 [ffff881f0c4f7ec8] kthread at ffffffff810b4031
#22 [ffff881f0c4f7f50] ret_from_fork at ffffffff816c155d

During recovery, the MDS can not handle the fact that the file is already striped. The client is replaying a layout operation which completed. The server tries to free the component buffers but asserts when it finds striping info defined that it doesn't expect.

We have a dump of the LBUG with -1 debug logs. The scenario in the dump is the replay of 2 layout change rpcs on a PFL file with 4 components:


Process A - pid 22961

00000020:00000040:4.0:1531845858.765709:000000:22961:0:(tgt_handler.c:578:tgt_handle_recovery()) @@@ Got new replay  req@ffff881f1041ce00 x1605550737582320/t0(1065152839981) o101->5fa746bc-5d86-4a17-630c-889714734378@175@gni6:148/0 lens 848/0 e 0 to 0 dl 1531845938 ref 1 fl Interpret:/4/ffffffff rc 0/-1

00000004:00000040:4.0:1531845858.765745:000000:22961:0:(mdt_handler.c:3807:mdt_intent_layout()) [0x340077935:0x99b3:0x0]got layout change request from client: opc:2 flags:0x0 extent [0x1000000, 0x1100000)

Process B - pid 

00000020:00000040:18.0:1531845858.766672:000000:22966:0:(tgt_handler.c:578:tgt_handle_recovery()) @@@ Got new replay  req@ffff881e83a8bf00 x1605550737582560/t0(1065152849881) o101->5fa746bc-5d86-4a17-630c-889714734378@175@gni6:148/0 lens 920/0 e 0 to 0 dl 1531845938 ref 1 fl Interpret:/4/ffffffff rc 0/-1

00000004:00000040:18.0:1531845858.766710:000000:22966:0:(mdt_handler.c:3807:mdt_intent_layout()) [0x340077935:0x99b3:0x0]got layout change request from client: opc:2 flags:0x0 extent [0x2000000, 0x2100000)

Process A runs first. It:

  • finds object O for [0x340077935:0x99b3:0x0] in the lu_site object cache,
  • follows the normal layout request path and loads component definitions from disk into O
  • detects that the request is for replay: deletes all components and loads component definitions from the rpc buffers into O. Only the first 2 of the 4 components are flagged as initialized.
  • defines striping info for those two components

Process A leaves the object in cache. Some components have striping info, some don't. The O.ldo_comp_cached flag is NOT set.

Process B then runs. It tries to follow the same path as Process A. It:

  • finds O in the object cache
  • tries to remove the components of O before loading them from disk
  • finds the llc_stripe info set on the first component and LBUGs

If O.lldo_comp_cached flag was set, then Process B would not try to load the components from disk. Instead it would remove the components defined for O and replace them with the layout from the rpc buffer, which would be correct.

The following notes outline the function flow involved in the bug. Process A and Process B both call lod_declare_layout_change(); the calls are sequential.

lod_declare_layout_change() - layout from rpc
    lod_load_striping_locked() - returns if obj striping is cached
                                               - creates obj.comps; lbuf from lu_env
        lod_parse_striping() - allocs stripes according to obj <---  Process B LBUGs
        sets obj.ldo_comp_cached
    layout_lock_data_init()
  
    lod_declare_update_plain() - comp info from mlc_buf(rpc)
        if replay
            lod_object_free_striping() - removes stripes from obj
                unsets obj.ldo_comp_cached
            lod_use_defined_striping() - obj, mlc_buf(rpc), 
                                                       - builds comps,stripes from rpc
                  lod_alloc_comp_entries()                               
            lod_get_lov_ea() - gets ea from disk puts in lov_env_info
        loop over obj components 
            if obj.e_start > rpc.e_end break from loop
            if replay skip comps already initialized on disk
            add comp to reconstruct list
        if no comps added to reconstruct list return EALREADY  <--- Process A exits through this path
        lod_obj_inc_layout_gen()
        lod_declare_instantiate_components()

The replay logic in lod_declare_update plain leaves the object in an unexpected state: components have llc_stripe set but ldo_comp_cached is false. The simplest fix is just to add "obj.ldo_comp_cached = 1" following the call to lod_use_defined_striping().



 Comments   
Comment by Ann Koehler (Inactive) [ 30/Jul/18 ]

I've attached the full dklog along with extractions of the problematic replay requests. If you want the vmcore, let me know where to put it.

Comment by Gerrit Updater [ 30/Jul/18 ]

Ann Koehler (amk@cray.com) uploaded a new patch: https://review.whamcloud.com/32904
Subject: LU-11195 lod: Mark comps cached on replay of layout change
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 6533ded780cacd36c7ecff86e0aba32c7efab18e

Comment by Ann Koehler (Inactive) [ 06/Aug/18 ]

An additional requirement to trigger this bug is that the components on disk must be flagged as initialized when Process A replays the layout change request. lod_declare_update_plain() must return -EALREADY. If lod_declare_instantiate_components() is called then obj.ldo_comp_cached will be set and Process B will not LBUG on the next replay.

Comment by Gerrit Updater [ 04/Sep/18 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/32904/
Subject: LU-11195 lod: Mark comps cached on replay of layout change
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: e021026d0c37d8806d16dbaad6a9d4f47844c999

Comment by Peter Jones [ 04/Sep/18 ]

Landed for 2.12

Comment by Gerrit Updater [ 04/Sep/18 ]

Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/33110
Subject: LU-11195 lod: Mark comps cached on replay of layout change
Project: fs/lustre-release
Branch: b2_10
Current Patch Set: 1
Commit: c93c7902048497a40dcb94977f7c47949dd36b3a

Comment by Gerrit Updater [ 16/Jan/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/33110/
Subject: LU-11195 lod: Mark comps cached on replay of layout change
Project: fs/lustre-release
Branch: b2_10
Current Patch Set:
Commit: de87598eaf5eea309faa845209148b787ed29c10

Generated at Sat Feb 10 02:41:46 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.