[LU-12206] error handling on mdt_init0() Created: 19/Apr/19  Updated: 25/May/19  Resolved: 25/May/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.13.0

Type: Bug Priority: Minor
Reporter: Vladimir Saveliev Assignee: Vladimir Saveliev
Resolution: Fixed Votes: 0
Labels: patch

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

When mdt_init0() returns an error, it must guarantee that the destruction of exports has completed before mdt_device_free() frees the mdt_device:

static struct lu_device *mdt_device_alloc(const struct lu_env *env,
...
                rc = mdt_init0(env, m, t, cfg);
                if (rc != 0) {
                        mdt_device_free(env, l);

Otherwise, the obd_zombid work queue may access freed memory, resulting in a general protection fault such as:

[  472.711559] general protection fault: 0000 [#1] SMP 
[  472.714047] Modules linked in: loop lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) dm_flakey libcfs(OE) ext4 mbcache jbd2 ip6table_filter ip6_tables devlink iptable_filter sunrpc intel_powerclamp iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel linear lrw gf128mul glue_helper ablk_helper cryptd sg pcspkr video ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic sr_mod cdrom ata_generic pata_acpi crct10dif_pclmul crct10dif_common crc32c_intel serio_raw ahci libahci ata_piix e1000 libata dm_mirror dm_region_hash dm_log dm_mod
[  472.737938] CPU: 0 PID: 44 Comm: kworker/0:2 Kdump: loaded Tainted: G           OE  ------------   3.10.0-862.14.4.el7.x86_64 #10
[  472.741324] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  472.743571] Workqueue: obd_zombid obd_zombie_exp_cull [obdclass]
[  472.745032] task: ffff9a64378b2f40 ti: ffff9a643791c000 task.ti: ffff9a643791c000
[  472.747274] RIP: 0010:[<ffffffffc08829c5>]  [<ffffffffc08829c5>] tgt_client_free+0x1e5/0x3c0 [ptlrpc]
[  472.749950] RSP: 0018:ffff9a643791fda8  EFLAGS: 00010206
[  472.751172] RAX: 5a5a5a5a5a5a5a5a RBX: ffff9a640354af28 RCX: 000000018020001a
[  472.752730] RDX: 0000000000000000 RSI: ffffeb4e0112a7c0 RDI: 0000000040000000
[  472.754243] RBP: ffff9a643791fdd0 R08: ffff9a6404a9f880 R09: 000000018020001a
[  472.755651] R10: 0000000004a9f801 R11: ffffeb4e0112a7c0 R12: ffff9a640354ac00
[  472.757045] R13: ffff9a640354aec8 R14: ffff9a64118530b8 R15: ffff9a640354af28
[  472.758840] FS:  0000000000000000(0000) GS:ffff9a643fc00000(0000) knlGS:0000000000000000
[  472.761192] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  472.763423] CR2: 00007f04d8ed3000 CR3: 000000007850c000 CR4: 00000000000606f0
[  472.765355] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  472.775769] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  472.778600] Call Trace:
[  472.779383]  [<ffffffffc0ec2327>] mdt_destroy_export+0x57/0x200 [mdt]
[  472.781139]  [<ffffffffc05bf20e>] class_export_destroy+0xee/0x490 [obdclass]
[  472.783158]  [<ffffffffc05bf5c5>] obd_zombie_exp_cull+0x15/0x20 [obdclass]
[  472.784599]  [<ffffffff93ab1d2f>] process_one_work+0x17f/0x440
[  472.785836]  [<ffffffff93ab2dc6>] worker_thread+0x126/0x3c0
[  472.787047]  [<ffffffff93ab2ca0>] ? manage_workers.isra.24+0x2a0/0x2a0
[  472.788592]  [<ffffffff93ab9b51>] kthread+0xd1/0xe0
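
The ordering requirement described above can be illustrated with a small user-space sketch. This is not Lustre code and not the landed patch: every name in it (sketch_device, sketch_export_cull, sketch_exports_barrier, sketch_failed_init) is hypothetical, and pthreads stand in for the kernel work queue. It only shows the general pattern of draining pending export destruction before the device memory is freed on the init failure path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for an mdt_device; not Lustre code. */
struct sketch_device {
	pthread_mutex_t lock;
	pthread_cond_t  drained;
	int             pending_exports; /* exports still queued for destruction */
	int             destroyed;       /* exports already destroyed */
};

/* Plays the role of the zombie worker: destroying an export touches the device. */
static void *sketch_export_cull(void *arg)
{
	struct sketch_device *dev = arg;

	pthread_mutex_lock(&dev->lock);
	dev->destroyed++;
	if (--dev->pending_exports == 0)
		pthread_cond_broadcast(&dev->drained);
	pthread_mutex_unlock(&dev->lock);
	return NULL;
}

/* Barrier: block until every queued export has been destroyed. */
static void sketch_exports_barrier(struct sketch_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	while (dev->pending_exports > 0)
		pthread_cond_wait(&dev->drained, &dev->lock);
	pthread_mutex_unlock(&dev->lock);
}

/*
 * Simulated failing init: queue nr_exports destructions, "fail", wait for
 * them to drain, and only then free the device.
 * Returns the number of exports destroyed, or -1 on bad input or allocation failure.
 */
static int sketch_failed_init(int nr_exports)
{
	struct sketch_device *dev;
	pthread_t *workers;
	int *started;
	int i, destroyed;

	if (nr_exports <= 0)
		return -1;

	dev = calloc(1, sizeof(*dev));
	workers = calloc(nr_exports, sizeof(*workers));
	started = calloc(nr_exports, sizeof(*started));
	if (dev == NULL || workers == NULL || started == NULL) {
		free(dev);
		free(workers);
		free(started);
		return -1;
	}

	pthread_mutex_init(&dev->lock, NULL);
	pthread_cond_init(&dev->drained, NULL);
	dev->pending_exports = nr_exports;

	for (i = 0; i < nr_exports; i++) {
		if (pthread_create(&workers[i], NULL, sketch_export_cull, dev) == 0) {
			started[i] = 1;
		} else {
			/* Worker never ran: drop its pending count ourselves. */
			pthread_mutex_lock(&dev->lock);
			dev->pending_exports--;
			pthread_mutex_unlock(&dev->lock);
		}
	}

	/* Error path: drain export destruction BEFORE freeing the device. */
	sketch_exports_barrier(dev);

	/* Reap thread handles; this is user-space housekeeping only. */
	for (i = 0; i < nr_exports; i++)
		if (started[i])
			pthread_join(workers[i], NULL);

	destroyed = dev->destroyed;
	pthread_cond_destroy(&dev->drained);
	pthread_mutex_destroy(&dev->lock);
	free(dev);
	free(workers);
	free(started);
	return destroyed;
}
```

Calling sketch_failed_init(4) from a test program runs four simulated export destructions and frees the device only after the barrier returns. Without the barrier call, a worker could still be dereferencing dev after free(dev), which is the same class of use-after-free shown in the trace above.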


 Comments   
Comment by Gerrit Updater [ 19/Apr/19 ]

Vladimir Saveliev (c17830@cray.com) uploaded a new patch: https://review.whamcloud.com/34724
Subject: LU-12206 mdt: mdt_init0 failure handling
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 49806583020b38d32c5c535fda975fda965aa3b6

Comment by Gerrit Updater [ 25/May/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34724/
Subject: LU-12206 mdt: mdt_init0 failure handling
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: d1b5146eda4fdaa77dd44bc2195435bda0f83a94

Comment by Peter Jones [ 25/May/19 ]

Landed for 2.13
