[LU-11247] chgrp, OST mount, MDT/MGS journal deadlock Created: 14/Aug/18  Updated: 14/Aug/18

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: John Hammond Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

Description

This is similar to LU-11119, LU-11227, and LU-11236. However, those issues are fixed by https://review.whamcloud.com/#/c/32964/, whereas this one is not. If an OST holding an object of a file is unmounted when chgrp is run on that file, then chgrp hangs (which is fine), and remounting the OST hangs as well (which is not):

o:~# export OSTCOUNT=2
o:~# $LUSTRE/tests/llmount.sh
...
o:~# lfs setstripe -c2 /mnt/lustre/f0
o:~# chown sanity: /mnt/lustre/f0
o:~# umount /mnt/lustre-ost1
o:~# sudo -u sanity chgrp gsanity1 /mnt/lustre/f0 &
[1] 31691
o:~# mount /tmp/lustre-ost1 /mnt/lustre-ost1 -t lustre -o loop
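
When scripting this reproducer, it may help to bound the final mount so that a hang is reported instead of blocking the shell. The following is a minimal sketch, not part of the original report: expect_completion is a hypothetical helper built on coreutils timeout(1), and the 120-second limit in the example is an arbitrary assumption. Whether a stuck mount.lustre actually reacts to the SIGTERM that timeout sends is not stated in this report; if it does not, timeout itself will keep waiting.

```shell
#!/bin/bash
# expect_completion SECONDS CMD [ARGS...]
# Hypothetical helper: run CMD under coreutils timeout(1) and report
# whether it finished within SECONDS.
expect_completion() {
    local limit="$1"
    shift
    timeout "$limit" "$@"
    local rc=$?
    # timeout(1) exits with status 124 when the time limit was reached.
    if [ "$rc" -eq 124 ]; then
        echo "HANG: '$*' did not finish within ${limit}s" >&2
        return 1
    fi
    # Otherwise pass the command's own exit status through.
    return "$rc"
}

# Example, mirroring the last reproducer step (limit is an assumption):
# expect_completion 120 mount /tmp/lustre-ost1 /mnt/lustre-ost1 -t lustre -o loop
```

A non-zero return with the HANG message would indicate the behaviour described here; any other non-zero status is the wrapped command's own failure.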

Stack traces:

31692 chgrp
[<ffffffffc0de8520>] ptlrpc_set_wait+0x480/0x790 [ptlrpc]
[<ffffffffc0de88ad>] ptlrpc_queue_wait+0x7d/0x220 [ptlrpc]
[<ffffffffc10a8c57>] mdc_reint+0x57/0x160 [mdc]
[<ffffffffc10a91ae>] mdc_setattr+0x1ae/0x4a0 [mdc]
[<ffffffffc0d544ff>] lmv_setattr+0x20f/0x3b0 [lmv]
[<ffffffffc16a42f7>] ll_setattr_raw+0x7e7/0x1290 [lustre]
[<ffffffffc16a4e0c>] ll_setattr+0x6c/0xd0 [lustre]
[<ffffffffb9239af4>] notify_change+0x2c4/0x420
[<ffffffffb921840c>] chown_common+0x19c/0x1d0
[<ffffffffb92199ef>] SyS_fchownat+0xcf/0x120
[<ffffffffb972082f>] system_call_fastpath+0x1c/0x21
[<ffffffffffffffff>] 0xffffffffffffffff

30381 mdt01_001
[<ffffffffc0de8520>] ptlrpc_set_wait+0x480/0x790 [ptlrpc]
[<ffffffffc0de88ad>] ptlrpc_queue_wait+0x7d/0x220 [ptlrpc]
[<ffffffffc15ec733>] osp_remote_sync+0xd3/0x200 [osp]
[<ffffffffc15d3cef>] osp_attr_set+0x4bf/0x5d0 [osp]
[<ffffffffc15846b8>] lod_sub_attr_set+0x1c8/0x460 [lod]
[<ffffffffc15630e0>] lod_obj_stripe_attr_set_cb+0x40/0x100 [lod]
[<ffffffffc156f91e>] lod_obj_for_each_stripe+0x11e/0x2d0 [lod]
[<ffffffffc157104b>] lod_attr_set+0x3db/0x9e0 [lod]
[<ffffffffc1438e40>] mdd_attr_set_internal+0x120/0x2a0 [mdd]
[<ffffffffc1439c2d>] mdd_attr_set+0x8bd/0xcf0 [mdd]
[<ffffffffc14aa31f>] mdt_attr_set+0x19f/0xbb0 [mdt]
[<ffffffffc14ab589>] mdt_reint_setattr+0x609/0xa90 [mdt]
[<ffffffffc14aba93>] mdt_reint_rec+0x83/0x210 [mdt]
[<ffffffffc148b1d2>] mdt_reint_internal+0x6b2/0xa80 [mdt]
[<ffffffffc14961e7>] mdt_reint+0x67/0x140 [mdt]
[<ffffffffc0e5e2aa>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[<ffffffffc0e0140b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[<ffffffffc0e04c44>] ptlrpc_main+0xb14/0x1fb0 [ptlrpc]
[<ffffffffb90bb161>] kthread+0xd1/0xe0
[<ffffffffb9720677>] ret_from_fork_nospec_end+0x0/0x39
[<ffffffffffffffff>] 0xffffffffffffffff

31708 mount.lustre
[<ffffffffc0de8520>] ptlrpc_set_wait+0x480/0x790 [ptlrpc]
[<ffffffffc0de88ad>] ptlrpc_queue_wait+0x7d/0x220 [ptlrpc]
[<ffffffffc0d7eb44>] mgc_target_register+0x134/0x4c0 [mgc]
[<ffffffffc0d81e3b>] mgc_set_info_async+0x37b/0x1610 [mgc]
[<ffffffffc0bde93b>] server_start_targets+0x116b/0x2a30 [obdclass]
[<ffffffffc0be12fc>] server_fill_super+0x10fc/0x18c0 [obdclass]
[<ffffffffc0bb65f8>] lustre_fill_super+0x328/0x950 [obdclass]
[<ffffffffb921ef3f>] mount_nodev+0x4f/0xb0
[<ffffffffc0bae748>] lustre_mount+0x38/0x60 [obdclass]
[<ffffffffb921fabe>] mount_fs+0x3e/0x1b0
[<ffffffffb923d097>] vfs_kern_mount+0x67/0x110
[<ffffffffb923f6bf>] do_mount+0x1ef/0xce0
[<ffffffffb92404f3>] SyS_mount+0x83/0xd0
[<ffffffffb972082f>] system_call_fastpath+0x1c/0x21
[<ffffffffffffffff>] 0xffffffffffffffff

30373 ll_mgs_0001
[<ffffffffc0240495>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
[<ffffffffc0241987>] __jbd2_journal_force_commit+0x57/0xb0 [jbd2]
[<ffffffffc0241a21>] jbd2_journal_force_commit+0x21/0x30 [jbd2]
[<ffffffffc12b9539>] ldiskfs_force_commit+0x29/0x30 [ldiskfs]
[<ffffffffc1342290>] osd_sync+0x50/0x180 [osd_ldiskfs]
[<ffffffffc13c3bed>] mgs_target_reg+0x62d/0x1320 [mgs]
[<ffffffffc0e5e2aa>] tgt_request_handle+0xaea/0x1580 [ptlrpc]
[<ffffffffc0e0140b>] ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
[<ffffffffc0e04c44>] ptlrpc_main+0xb14/0x1fb0 [ptlrpc]
[<ffffffffb90bb161>] kthread+0xd1/0xe0
[<ffffffffb9720677>] ret_from_fork_nospec_end+0x0/0x39
[<ffffffffffffffff>] 0xffffffffffffffff
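
The report does not say how these traces were captured. One way to collect output in the same "pid name, then stack" layout is to read /proc/<pid>/comm and /proc/<pid>/stack for the tasks of interest. The sketch below assumes a kernel that exposes /proc/<pid>/stack and a root shell; dump_stacks and the PROC_ROOT override are hypothetical names introduced only for illustration.

```shell
#!/bin/bash
# dump_stacks NAME [NAME...]
# Hypothetical helper: print "<pid> <comm>" followed by the kernel stack
# of every task whose command name equals one of the given names.
# PROC_ROOT (assumption, default /proc) allows pointing at another tree.
dump_stacks() {
    local root="${PROC_ROOT:-/proc}"
    local dir pid comm name
    for dir in "$root"/[0-9]*; do
        # Skip entries that vanished or have no readable comm file.
        [ -r "$dir/comm" ] || continue
        pid="${dir##*/}"
        comm="$(cat "$dir/comm" 2>/dev/null)" || continue
        for name in "$@"; do
            if [ "$comm" = "$name" ]; then
                echo "$pid $comm"
                # Reading /proc/<pid>/stack normally requires root.
                cat "$dir/stack" 2>/dev/null
                echo
            fi
        done
    done
}

# Example (as root on the test node):
# dump_stacks chgrp mdt01_001 mount.lustre ll_mgs_0001
```

Matching on exact command names keeps the output limited to the four tasks involved here rather than every service thread on the node.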

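Note that this requires the MGS and MDT to be shared. In the traces above, ll_mgs_0001 is blocked in jbd2_log_wait_commit while mdt01_001 waits on the unmounted OST, which is consistent with the two services contending on one journal.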

