[LU-2282] Oops mounting zfs osd after ldiskfs Created: 27/Sep/12  Updated: 18/Dec/12  Resolved: 18/Dec/12

Status: Closed
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Blocker
Reporter: Ned Bass Assignee: Li Wei (Inactive)
Resolution: Cannot Reproduce Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 4213

 Description   

Sometimes it's useful during testing/debugging to compare behavior between the ldiskfs and ZFS OSDs. However, if I mount an ldiskfs MDT, unmount it, and then mount a ZFS MDT, I get the following crash.

mkfs.lustre --mdt --mgs --index=0 --reformat /dev/loop0
mount -t lustre /dev/loop0 /mnt/mdt
umount /mnt/mdt
mkfs.lustre --mdt --mgs --index=0 --reformat --backfstype zfs m/mdt /dev/loop0
mount -t lustre m/mdt /mnt/mdt
BUG: unable to handle kernel NULL pointer dereference at 0000000000000248
IP: [<ffffffffa0be7501>] osd_ldiskfs_it_fill+0x81/0x1e0 [osd_ldiskfs]
PGD 0 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/module/zfs/initstate
CPU 0 
Modules linked in: osd_zfs(U) mdt(U) lod(U) mgs(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) mdd(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) autofs4 sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate vhost_net macvtap macvlan tun virtio_balloon virtio_net snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 i2c_core sg ext4 mbcache jbd2 sr_mod cdrom virtio_blk pata_acpi ata_generic ata_piix virtio_pci virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]

Pid: 2355, comm: tgt_recov Tainted: P           ----------------   2.6.32-220.23.1.1chaos.ch5.x86_64 #1 Bochs Bochs
RIP: 0010:[<ffffffffa0be7501>]  [<ffffffffa0be7501>] osd_ldiskfs_it_fill+0x81/0x1e0 [osd_ldiskfs]
RSP: 0018:ffff8800dadc9c90  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8800dda17938 RCX: 0000000000000000
RDX: ffff880113fe3780 RSI: ffffffffa0c089e0 RDI: ffff8800dadc9e50
RBP: ffff8800dadc9cb0 R08: 7fffffffffffffff R09: 0000000000000000
R10: 0000000000000002 R11: 0000000000000000 R12: ffff8800daec7d40
R13: ffff8800dadc9e50 R14: 0000000900000001 R15: ffff88011270e7b0
FS:  00007f669c65f700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000248 CR3: 0000000001a85000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process tgt_recov (pid: 2355, threadinfo ffff8800dadc8000, task ffff8800dae20080)
Stack:
 ffff8800dda17938 ffff8800dadc9e50 ffff8800dda17938 ffff880112cf5960
<0> ffff8800dadc9ce0 ffffffffa0be76a1 000000000000f4e8 00000000000005c0
<0> ffffffffa0cb0a20 ffff8800dadc9e50 ffff8800dadc9de0 ffffffffa0ba67b2
Call Trace:
 [<ffffffffa0be76a1>] osd_it_ea_load+0x41/0xa0 [osd_ldiskfs]
 [<ffffffffa0ba67b2>] __mdd_orphan_cleanup+0xa2/0xa80 [mdd]
 [<ffffffffa05a5993>] ? cfs_alloc+0x63/0x90 [libcfs]
 [<ffffffffa092b6dc>] ? osd_key_init+0x5c/0x150 [osd_zfs]
 [<ffffffffa06a5b1f>] ? keys_fill+0x6f/0x1a0 [obdclass]
 [<ffffffffa0bb50d8>] mdd_recovery_complete+0xb8/0x100 [mdd]
 [<ffffffffa0ce0b5f>] mdt_postrecov+0x3f/0x90 [mdt]
 [<ffffffffa0ce23d8>] mdt_obd_postrecov+0x78/0x90 [mdt]
 [<ffffffffa07d56c0>] ? ldlm_reprocess_res+0x0/0x20 [ptlrpc]
 [<ffffffffa07e28d9>] target_recovery_thread+0xa89/0x1440 [ptlrpc]
 [<ffffffffa07e1e50>] ? target_recovery_thread+0x0/0x1440 [ptlrpc]
 [<ffffffff8100c14a>] child_rip+0xa/0x20
 [<ffffffffa07e1e50>] ? target_recovery_thread+0x0/0x1440 [ptlrpc]
 [<ffffffffa07e1e50>] ? target_recovery_thread+0x0/0x1440 [ptlrpc]
 [<ffffffff8100c140>] ? child_rip+0x0/0x20
Code: 83 c8 00 00 00 00 00 00 00 48 89 83 d0 00 00 00 49 83 7c 24 60 00 0f 84 fe 00 00 00 4c 89 ef 48 c7 c6 e0 89 c0 a0 e8 9f e8 ab ff <4c> 8b a8 48 02 00 00 49 8b 74 24 60 b9 03 00 00 00 4c 89 f2 4c 
RIP  [<ffffffffa0be7501>] osd_ldiskfs_it_fill+0x81/0x1e0 [osd_ldiskfs]

I see it's calling into osd_ldiskfs_it_fill(), so maybe there's some missing finalization or deregistration when the ldiskfs OSD is stopped. Running rmmod lod osd_ldiskfs mdt mdd before mounting the ZFS OSD avoids the crash.
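
A minimal sketch of that workaround sequence, assuming the loop device, mountpoint, and module names from the reproducer and the comment above:

# ldiskfs mount/unmount as in the reproducer
mkfs.lustre --mdt --mgs --index=0 --reformat /dev/loop0
mount -t lustre /dev/loop0 /mnt/mdt
umount /mnt/mdt
# unload the MDT stack modules left behind by the ldiskfs mount
rmmod lod osd_ldiskfs mdt mdd
# reformat and mount with the ZFS backend; no oops observed with this ordering
mkfs.lustre --mdt --mgs --index=0 --reformat --backfstype zfs m/mdt /dev/loop0
mount -t lustre m/mdt /mnt/mdt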



 Comments   
Comment by Andreas Dilger [ 27/Sep/12 ]

Probably the ext2 filesystem superblock is still on the disk after the ZFS pool is created. The ZFS tools should really zero out the ext2 superblock when formatting the filesystem.
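
For illustration, one way to clear a stale ext2/ldiskfs superblock by hand before reformatting the device with a ZFS backend (the wipefs/dd usage here is an assumption, not something from this ticket):

# hypothetical manual cleanup of /dev/loop0 before the ZFS reformat
wipefs -a /dev/loop0                          # erase known filesystem signatures
# or simply overwrite the start of the device, where the ext2 superblock lives
dd if=/dev/zero of=/dev/loop0 bs=1M count=16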

Comment by Ned Bass [ 28/Sep/12 ]

The problem appears even if I zero out the device between mounts or use different devices. Reversing the order (zfs then ldiskfs) also crashes, with the RIP in osd_zap_it_load(). So this really looks like incomplete deregistration of the old OSD to me.
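
A sketch of the "zero the device between mounts" variant described here; the dd invocation is an assumption, the rest follows the original reproducer:

umount /mnt/mdt
dd if=/dev/zero of=/dev/loop0 bs=1M           # assumed way of zeroing the loop device
mkfs.lustre --mdt --mgs --index=0 --reformat --backfstype zfs m/mdt /dev/loop0
mount -t lustre m/mdt /mnt/mdt                # still oopses, per this comment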

Comment by Alex Zhuravlev [ 28/Sep/12 ]

This should be easy to check with lsmod?

Comment by Ned Bass [ 28/Sep/12 ]

Actually, just removing the lod module between mounts seems sufficient to avoid the crash.
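
A compact sketch of that minimal workaround, assuming the same devices and paths as the reproducer in the description:

umount /mnt/mdt
rmmod lod                                     # unloading only lod seems to be enough
mkfs.lustre --mdt --mgs --index=0 --reformat --backfstype zfs m/mdt /dev/loop0
mount -t lustre m/mdt /mnt/mdt                # no oops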

This should be easy to check with lsmod?

$ mkfs.lustre --mdt --mgs --index=0 --reformat /dev/loop0
$ mount -t lustre /dev/loop0 /mnt/mdt
$ umount /mnt/mdt
$ lsmod
Module                  Size  Used by
mdt                   509398  0 
lod                   245078  0 
mgs                   252181  0 
mgc                    62634  1 mgs
fsfilt_ldiskfs          6769  0 
osd_ldiskfs           279826  0 
mdd                   289703  2 mdt,osd_ldiskfs
lustre                820677  0 
lov                   475002  1 lustre
osc                   398611  1 lov
mdc                   179197  1 lustre
fid                    62413  3 mdt,mdd,mdc
fld                    77389  2 mdt,fid
ksocklnd              326232  1 
ptlrpc               1404175  10 mdt,lod,mgs,mgc,lustre,lov,osc,mdc,fid,fld
obdclass             1138145  22 mdt,lod,mgs,mgc,osd_ldiskfs,mdd,lustre,lov,osc,mdc,fid,fld,ptlrpc
lnet                  330720  4 lustre,ksocklnd,ptlrpc,obdclass
lvfs                   21900  16 mdt,lod,mgs,mgc,fsfilt_ldiskfs,osd_ldiskfs,mdd,lustre,lov,osc,mdc,fid,fld,ptlrpc,obdclass
libcfs                436749  18 mdt,lod,mgs,mgc,fsfilt_ldiskfs,osd_ldiskfs,mdd,lustre,lov,osc,mdc,fid,fld,ksocklnd,ptlrpc,obdclass,lnet,lvfs
ldiskfs               415391  2 fsfilt_ldiskfs,osd_ldiskfs
Comment by Li Wei (Inactive) [ 06/Nov/12 ]

http://review.whamcloud.com/4476

This patch resolves an issue that was preventing me from reproducing the oops on my single-node setup.

Comment by Li Wei (Inactive) [ 20/Nov/12 ]

With the latest master (12a1b23) and the patch above, I can't reproduce the oops. Also, I am able to run

FSTYPE=ldiskfs ... sh llmount.sh
FSTYPE=zfs ... sh llmount.sh
FSTYPE=ldiskfs ... sh llmount.sh

without obvious issues. (The osd-{ldiskfs,zfs} kernel modules are not removed during that process.)

Comment by Li Wei (Inactive) [ 18/Dec/12 ]

The side patch above has made it to master. Given that I couldn't reproduce it, I'm closing this ticket for the moment.
