[LU-14430] Amount of default ACLs is limited by 31 for new files Created: 12/Feb/21  Updated: 16/Jul/23  Resolved: 29/Jul/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.5
Fix Version/s: Lustre 2.15.0

Type: Bug Priority: Minor
Reporter: Mikhail Pershin Assignee: Mikhail Pershin
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
is related to LU-15575 Interop sanity test_103e: failed to c... Open
Severity: 3

 Description   

A directory may have many default ACLs, but a newly created file cannot inherit more than 31 of them. This is an internal MDD issue caused by a buffer size limitation during ACL processing.
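
For context on how entry count relates to buffer size, here is a minimal sketch of the arithmetic, assuming the standard Linux POSIX ACL xattr layout (a 4-byte version header followed by 8-byte entries). The struct and function names are made up for illustration and are not Lustre identifiers. The ticket does not say which buffer produced the 31-entry limit, so the sketch takes the buffer length as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of a POSIX ACL xattr: one version word, then entries. */
struct acl_xattr_header { uint32_t a_version; };
struct acl_xattr_entry  { uint16_t e_tag; uint16_t e_perm; uint32_t e_id; };

/* The arithmetic below relies on these sizes. */
static_assert(sizeof(struct acl_xattr_header) == 4, "header is 4 bytes");
static_assert(sizeof(struct acl_xattr_entry) == 8, "entry is 8 bytes");

/* Bytes needed to store an ACL with `count` entries. */
size_t acl_xattr_size(size_t count)
{
    return sizeof(struct acl_xattr_header) +
           count * sizeof(struct acl_xattr_entry);
}

/* Largest entry count that fits in `buf_len` bytes (0 if the header
 * alone does not fit). */
size_t acl_max_entries(size_t buf_len)
{
    if (buf_len < sizeof(struct acl_xattr_header))
        return 0;
    return (buf_len - sizeof(struct acl_xattr_header)) /
           sizeof(struct acl_xattr_entry);
}
```

Under this layout, 31 entries need 252 bytes, and a 64K buffer holds at most 8191 entries. Any fixed-size intermediate buffer smaller than the parent's default ACL truncates or blocks inheritance.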



 Comments   
Comment by Gerrit Updater [ 12/Feb/21 ]

Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41494
Subject: LU-14430 mdd: fix inheritance of big default ACLs
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 30ba8c97f0e1e5ffa06c7e9f68a47c79ae8b2d59

Comment by Gerrit Updater [ 22/Feb/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41494/
Subject: LU-14430 mdd: fix inheritance of big default ACLs
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: f3d03bc38a3afdef83635d578ee0b2ffdd985685

Comment by Peter Jones [ 22/Feb/21 ]

Landed for 2.15

Comment by Alex Zhuravlev [ 26/Feb/21 ]

just got this on fresh master:

LustreError: 13185:0:(mdd_dir.c:2322:mdd_acl_init()) ASSERTION( def_acl_buf->lb_len <= acl_buf->lb_len ) failed: in sanity / 103e

Trace:

PID: 13185  TASK: ffff8802066e8000  CPU: 1   COMMAND: "mdt01_001"
 #0 [ffff880247133960] panic at ffffffff810af9a3
    /home/lustre/linux-4.18.0-32.el8/kernel/panic.c: 265
 #1 [ffff8802471339f0] mdd_create at ffffffffa0c2997e [mdd]
    /home/lustre/master-mine/libcfs/include/libcfs/libcfs_fail.h: 95
 #2 [ffff880247133ab8] mdt_reint_open at ffffffffa0cd02f4 [mdt]
    /home/lustre/master-mine/lustre/include/md_object.h: 616
 #3 [ffff880247133c10] mdt_intent_open at ffffffffa0ca3dff [mdt]
    /home/lustre/master-mine/lustre/mdt/mdt_handler.c: 4469
 #4 [ffff880247133c50] mdt_intent_policy at ffffffffa0ca1b89 [mdt]
    /home/lustre/master-mine/lustre/mdt/mdt_handler.c: 4616
 #5 [ffff880247133cb8] ldlm_lock_enqueue at ffffffffa0553cf8 [ptlrpc]
    /home/lustre/master-mine/lustre/ptlrpc/../../lustre/ldlm/ldlm_lock.c: 1776
 #6 [ffff880247133d18] ldlm_handle_enqueue0 at ffffffffa0578f98 [ptlrpc]
    /home/lustre/master-mine/lustre/ptlrpc/../../lustre/ldlm/ldlm_lockd.c: 1390
 #7 [ffff880247133d90] tgt_enqueue at ffffffffa05fcddf [ptlrpc]
    /home/lustre/master-mine/lustre/ptlrpc/../../lustre/target/tgt_handler.c: 1393
 #8 [ffff880247133da8] tgt_request_handle at ffffffffa0602c70 [ptlrpc]
    /home/lustre/master-mine/lustre/include/lu_target.h: 618
 #9 [ffff880247133e20] ptlrpc_main at ffffffffa05ae915 [ptlrpc]
    /home/lustre/master-mine/lustre/include/lustre_net.h: 2448
#10 [ffff880247133f10] kthread at ffffffff810d0350
    /home/lustre/linux-4.18.0-32.el8/kernel/kthread.c: 246
#11 [ffff880247133f50] ret_from_fork at ffffffff818001c4
    /home/lustre/linux-4.18.0-32.el8/arch/x86/entry/entry_64.S: 422

Comment by Gerrit Updater [ 26/Feb/21 ]

Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41775
Subject: LU-14430 mdd: don't assert on default ACL big buffer
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8d4c140e8a8a46c797e977bcfb2c7ee1ee5dbad4

Comment by Mikhail Pershin [ 26/Feb/21 ]

Alex, could you check that the latest patch fixes that assertion?
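
The assertion reported above checks `def_acl_buf->lb_len <= acl_buf->lb_len`. The ticket does not show the patch contents, so the following is only a sketch of one way to avoid asserting: grow the destination buffer and return an error code if that fails. `struct buf`, `buf_ensure()` and `acl_inherit()` are hypothetical stand-ins, not the Lustre `lu_buf` API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a pointer-plus-length buffer. */
struct buf {
    void  *lb_buf;
    size_t lb_len;
};

/* Grow `b` to at least `len` bytes. Returns 0 or -ENOMEM. */
int buf_ensure(struct buf *b, size_t len)
{
    void *p;

    assert(b != NULL);
    if (b->lb_len >= len)
        return 0;
    p = realloc(b->lb_buf, len);
    if (p == NULL)
        return -ENOMEM;
    b->lb_buf = p;
    b->lb_len = len;
    return 0;
}

/* Copy the parent's default ACL into the new file's ACL buffer.
 * A too-small destination is grown instead of tripping an assertion;
 * allocation failure is reported to the caller. */
int acl_inherit(const struct buf *def_acl, struct buf *acl)
{
    int rc;

    assert(def_acl != NULL && acl != NULL);
    if (def_acl->lb_len == 0)
        return 0;               /* nothing to inherit */
    rc = buf_ensure(acl, def_acl->lb_len);
    if (rc != 0)
        return rc;
    memcpy(acl->lb_buf, def_acl->lb_buf, def_acl->lb_len);
    return 0;
}
```

The point of the design is that an oversized default ACL becomes a recoverable condition for the caller and does not crash the server thread.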

Comment by Mikhail Pershin [ 04/Mar/21 ]

More work is needed here. After allowing the maximum ACL buffer size (64K) to be filled, a simple test that sets as many ACLs as possible fails in several respects. More patches will be added here.

Comment by Mikhail Pershin [ 06/Mar/21 ]

While testing the maximum number of ACLs with ldiskfs, I encountered a problem with transaction credits on file creation. Alex's investigation showed that ldiskfs_new_inode() also calls ldiskfs_init_acl(), which copies all parent ACLs to the new file and adds transaction credits for that. This causes an LBUG() each time I try to create a file in a directory with the maximum number of default ACLs:

osd_trans_dump_creds())   create: 1/8/16, destroy: 0/0/0
...
[25034.738128] LustreError: 30925:0:(osd_internal.h:1319:osd_trans_exec_check()) LBUG
[25034.738368] Pid: 30925, comm: mdt01_003 3.10.0 #5 SMP Sun Jun 2 15:04:32 EDT 2019
[25034.738369] Call Trace:
[25034.738375]  [<ffffffffa00d77ad>] libcfs_call_trace+0x7d/0xa0 [libcfs]
[25034.738384]  [<ffffffffa00d784c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[25034.738390]  [<ffffffffa0a7d346>] cfs_fail_check_set.part.51.constprop.95+0x0/0x79 [osd_ldiskfs]
[25034.738401]  [<ffffffffa0a512a2>] osd_create+0x972/0x13c0 [osd_ldiskfs]
[25034.738410]  [<ffffffffa0c949d5>] lod_sub_create+0x1e5/0x470 [lod]
[25034.738422]  [<ffffffffa0c85189>] lod_create+0x69/0x360 [lod]
[25034.738430]  [<ffffffffa0b391c3>] mdd_create_object_internal+0xc3/0x300 [mdd]
[25034.738440]  [<ffffffffa0b2189c>] mdd_create_object+0x5c/0x800 [mdd]
[25034.738447]  [<ffffffffa0b2c44d>] mdd_create+0xe6d/0x1600 [mdd]
[25034.738453]  [<ffffffffa0bbd3a0>] mdt_reint_open+0x2470/0x32c0 [mdt]
[25034.738468]  [<ffffffffa0bb00b3>] mdt_reint_rec+0x83/0x220 [mdt]
[25034.738479]  [<ffffffffa0b8c2c1>] mdt_reint_internal+0x6e1/0xb00 [mdt]
[25034.738488]  [<ffffffffa0b98eb2>] mdt_intent_open+0x82/0x3a0 [mdt]
[25034.738498]  [<ffffffffa0b96fd5>] mdt_intent_policy+0x445/0xd90 [mdt]
[25034.738508]  [<ffffffffa04b6636>] ldlm_lock_enqueue+0x366/0x9c0 [ptlrpc]
[25034.738540]  [<ffffffffa04ddd26>] ldlm_handle_enqueue0+0xa66/0x1620 [ptlrpc]
[25034.738568]  [<ffffffffa0566b12>] tgt_enqueue+0x62/0x210 [ptlrpc]
[25034.738604]  [<ffffffffa056fede>] tgt_request_handle+0xade/0x15e0 [ptlrpc]
[25034.738744]  [<ffffffffa051171b>] ptlrpc_server_handle_request+0x25b/0xad0 [ptlrpc]
[25034.738776]  [<ffffffffa0515aa3>] ptlrpc_main+0xbe3/0x21e0 [ptlrpc]
[25034.738805]  [<ffffffff8110aad4>] kthread+0xd4/0xe0
[25034.738810]  [<ffffffff81839777>] ret_from_fork_nospec_end+0x0/0x39
[25034.738813]  [<ffffffffffffffff>] 0xffffffffffffffff

This happens because ldiskfs adds more credits to the transaction and MDD is not aware of that. It also means the ACL EA is always copied first, so the LMA and LOV EAs could go into an extra block if the ACL is big enough.

Since MDD handles all ACLs by itself because of ZFS, ACL handling in ldiskfs is not needed at all, so I tried to disable ACLs in ldiskfs and let MDD work with it as it does with ZFS. Unfortunately, that is not possible without patching ldiskfs: it uses a set of xattr handlers that internally check the ACL mount option and deny any setxattr/getxattr for the ACL EA.

At the moment I have no good, simple solution for that LBUG. Reserving twice as many credits in MDD would cause a big overhead with many stripes, and doing double work while handling ACLs with ldiskfs is wasteful in terms of performance and resources.
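
The mismatch described above can be modelled in a few lines. This is a simplified sketch with hypothetical names, not the osd-ldiskfs implementation. It also assumes that the `create: 1/8/16` line reads as operations / declared credits / used credits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-operation credit accounting for one transaction. */
struct op_credits {
    unsigned int ops;       /* operations declared */
    unsigned int declared;  /* credits reserved at declare time */
    unsigned int used;      /* credits consumed at execution time */
};

/* Declare phase: reserve credits for one more operation. */
void credits_declare(struct op_credits *c, unsigned int credits)
{
    assert(c != NULL);
    c->ops++;
    c->declared += credits;
}

/* Execution phase: record credits actually consumed. */
void credits_use(struct op_credits *c, unsigned int credits)
{
    assert(c != NULL);
    c->used += credits;
}

/* True while execution stays within the declared reservation. */
bool credits_ok(const struct op_credits *c)
{
    assert(c != NULL);
    return c->used <= c->declared;
}
```

If the upper layer declares 8 credits for a create and the lower layer then spends another 8 copying the parent ACLs on its own, `used` reaches 16 against a declaration of 8 and the check fails. Under the assumed reading, that is the situation shown in the dump.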

Comment by Gerrit Updater [ 11/Mar/21 ]

Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/42013
Subject: LU-14430 mdt: fix maximum ACL handling
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 8c877d9e677968cebca4812a487b8a120611fed2

Comment by Gerrit Updater [ 13/Mar/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41775/
Subject: LU-14430 mdd: don't assert on default ACL big buffer
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: b66b530c18c910ded562e279c9db02fcdad42176

Comment by Gerrit Updater [ 28/Apr/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/42013/
Subject: LU-14430 mdt: fix maximum ACL handling
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: aa92caa21fa2a4473dce5889de7fcd17e171c1a0

Comment by Gerrit Updater [ 12/May/21 ]

Mike Pershin (mpershin@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43672
Subject: LU-14430 mdd: use own buffer for changelog
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 84b681ae814710b22f66fc69ddee997bfe34c181

Comment by Andreas Dilger [ 13/May/21 ]

I added some debugging code to the users of mti_big_buf and tracked the double-use problem down to the code in mdd_declare_changelog_store() using it internally while only declaring the operation:

[ 6181.942124] Lustre: testfs-MDD0000: changelog on
[ 6185.985737] Lustre: testfs-MDD0001: changelog on
[ 6201.030925] LustreError: 6461:0:(mdd_dir.c:766:mdd_declare_changelog_store()) ASSERTION( !mdd_env_info(env)->mdi_big_buf_used ) failed: mdi_big_buf used in mdd_dir.c:2630:mdd_create()
[ 6201.043070] LustreError: 6461:0:(mdd_dir.c:766:mdd_declare_changelog_store()) LBUG
[ 6201.049962] Pid: 6461, comm: mdt00_003 3.10.0-1160.21.1.el7_lustre.ddn13.x86_64 #1 SMP Fri Mar 19 20:56:15 UTC 2021
[ 6201.059974] Call Trace:
[ 6201.062767]  [<ffffffffc06317cc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[ 6201.067814]  [<ffffffffc063187c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[ 6201.070500]  [<ffffffffc10f189c>] mdd_declare_changelog_store+0x39c/0x410 [mdd]
[ 6201.077335]  [<ffffffffc10f1d9d>] mdd_declare_create+0x48d/0xdf0 [mdd]
[ 6201.083021]  [<ffffffffc10f55b1>] mdd_create+0x8f1/0x1790 [mdd]
[ 6201.085670]  [<ffffffffc1188bf8>] mdt_reint_open+0x2578/0x33d0 [mdt]
[ 6201.090106]  [<ffffffffc117b7d3>] mdt_reint_rec+0x83/0x210 [mdt]
[ 6201.092757]  [<ffffffffc1157481>] mdt_reint_internal+0x6e1/0xb00 [mdt]
[ 6201.094711]  [<ffffffffc11641a2>] mdt_intent_open+0x82/0x3a0 [mdt]
[ 6201.099578]  [<ffffffffc11622c5>] mdt_intent_policy+0x435/0xd80 [mdt]
[ 6201.104017]  [<ffffffffc0a7a686>] ldlm_lock_enqueue+0x376/0x9b0 [ptlrpc]
[ 6201.107773]  [<ffffffffc0aa2236>] ldlm_handle_enqueue0+0xaa6/0x1630 [ptlrpc]
[ 6201.113342]  [<ffffffffc0b2c012>] tgt_enqueue+0x62/0x210 [ptlrpc]
[ 6201.118300]  [<ffffffffc0b30bee>] tgt_request_handle+0xaee/0x15f0 [ptlrpc]
[ 6201.132348]  [<ffffffffc0ad75db>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[ 6201.137484]  [<ffffffffc0adaf44>] ptlrpc_main+0xb34/0x1470 [ptlrpc]

That code doesn't need a large lu_buf for the whole changelog record to declare the transaction size, just rec->cr_hdr (struct llog_rec_hdr), which could be allocated directly on the stack. That would also avoid some overhead in that function since it doesn't need to check/allocate mti_big_buf.
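
A rough sketch of that suggestion follows, with hypothetical stand-in types; `struct rec_hdr`, `struct env_info`, `declare_add()` and `declare_changelog()` are not the Lustre definitions. The declare step fills in a small header on the stack and never touches the shared large buffer, so it does not matter whether the caller is still using that buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record header; only the length matters when declaring. */
struct rec_hdr {
    uint32_t len;
    uint32_t index;
    uint32_t type;
    uint32_t id;
};

/* Hypothetical per-thread scratch state with a shared large buffer. */
struct env_info {
    char big_buf[4096];
    bool big_buf_used;      /* set while some caller owns big_buf */
};

/* Stand-in for the storage layer's declare call: returns the number
 * of bytes that would be reserved for the record. */
size_t declare_add(const struct rec_hdr *hdr)
{
    return hdr->len;
}

/* Declare a changelog record of `rec_len` bytes using a stack header. */
size_t declare_changelog(struct env_info *info, size_t rec_len)
{
    struct rec_hdr hdr = { .len = (uint32_t)rec_len };

    assert(rec_len <= UINT32_MAX);
    (void)info;             /* big_buf is deliberately left alone */
    return declare_add(&hdr);
}
```

Because the declaration only needs the record length, a 16-byte header on the stack is enough here, and the in-use flag on the shared buffer is neither checked nor changed.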

Comment by Gerrit Updater [ 13/May/21 ]

Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43683
Subject: LU-14430 mdd: use own rec_hdr for changelog declare
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 69003b78fb8da081a1d09d072b52e1fbb997059b

Comment by Gerrit Updater [ 19/May/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43672/
Subject: LU-14430 mdd: use own buffer for changelog
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 48aac6f1d9c2dd05328023c39d0dc95be92aa0fe

Comment by Gerrit Updater [ 19/May/21 ]

Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43738
Subject: LU-14430 mdd: rename mti_big_buf to mdi_big_buf
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 9e117587e5932e02ceffa0fe4d1323e7fcd2754e

Comment by Gerrit Updater [ 19/May/21 ]

Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43739
Subject: LU-14430 mdd: rename mti_oa to mdi_oa and friends
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: bda500aa02236c0b31721cc1e3f70b9ed9a63c7a

Comment by Gerrit Updater [ 19/May/21 ]

Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/43740
Subject: LU-14430 mdd: rename mti_fid to mdi_fid and friends
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 5e7eb51652b5fa612b86ab045532168c66522691

Comment by Peter Jones [ 19/May/21 ]

Still some patches being tracked under this ticket

Comment by Gerrit Updater [ 27/May/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43683/
Subject: LU-14430 mdd: use own rec_hdr for changelog declare
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: ff52f8c1736ad7ef2621d23366a1ca6572aa7f22

Comment by Gerrit Updater [ 12/Jul/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43738/
Subject: LU-14430 mdd: rename mti_big_buf to mdi_big_buf
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: f9f38c33ab8484102cdb3736868f4e7bece594ae

Comment by Gerrit Updater [ 12/Jul/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43739/
Subject: LU-14430 mdd: rename mti_oa to mdi_oa and friends
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 9a23a5de12164f9d50db9e602f085bb0c3cc9d8a

Comment by Gerrit Updater [ 27/Jul/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/43740/
Subject: LU-14430 mdd: rename mti_fid to mdi_fid and friends
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: b1ed8e57da67feddb9c5e67abaf6db1b70333fa0

Comment by Peter Jones [ 29/Jul/21 ]

Landed for 2.15
