[LU-137] ioctl passthrough mechanism for Lustre OST/MDT mountpoints

Details

    • Type: New Feature
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.16.0
    • Affects Version/s: Lustre 2.1.0, Lustre 2.5.0
    • 14,489
    • 8383

    Description

      Implement an interface for sending I/O control (ioctl) commands from userspace through the Lustre mount point to the underlying ldiskfs filesystem, so that filesystem-wide ioctl() commands such as resize can be executed. This will allow userspace tools that operate via ioctl() commands on the filesystem mountpoint to be used on Lustre MDT and OST filesystems while they are mounted and in use, subject to any limitations of the original ioctl() commands themselves.
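As a rough illustration of the kind of userspace call this feature is meant to support, the sketch below opens a mountpoint directory and issues a read-only ioctl on it. It uses the generic FS_IOC_GETFLAGS and FS_IOC_GETVERSION numbers from <linux/fs.h>. The helper names are illustrative assumptions, and whether a given Lustre server mountpoint accepts these calls depends on the passthrough support being present.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* FS_IOC_GETFLAGS, FS_IOC_GETVERSION */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Open `path` read-only and issue one "get" ioctl that fills an int.
 * Both ioctls used below are declared with a long argument, but the
 * kernel reads and writes an int, so an int is passed here.
 * Returns 0 on success or a negative errno value on failure.
 */
static int query_int_ioctl(const char *path, unsigned long cmd, int *out)
{
	int fd, rc = 0;

	assert(path != NULL && out != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	if (ioctl(fd, cmd, out) < 0)
		rc = -errno;   /* saved before close() can change errno */

	close(fd);
	return rc;
}

/* Inode flags of the directory at `path` (hypothetical helper name). */
static int get_fs_flags(const char *path, int *flags)
{
	return query_int_ioctl(path, FS_IOC_GETFLAGS, flags);
}

/* Inode version (generation) of the directory at `path`. */
static int get_fs_version(const char *path, int *version)
{
	return query_int_ioctl(path, FS_IOC_GETVERSION, version);
}
```

Calling get_fs_flags() on a server mountpoint such as /mnt/ost1 would return 0 and fill in the flags where the call is supported, and a negative errno (typically -ENOTTY) where it is not.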

          Activity

            Linking this since I plan on fixing the ioctl direction issue which will provide a proper interface for this as well.

            simmonsja James A Simmons added the comment above.

            Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/23092/
            Subject: LU-137 obdclass: add dt_object_put() and use it
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 5963af745b3aa14410d5ceb66f8a7b7d6aaf576a

            gerrit Gerrit Updater added the comment above.

            Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/23092
            Subject: LU-137 obdclass: add dt_object_put() and use it
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: cf4f59a5a7253e18ac98983216791cb730511d18

            gerrit Gerrit Updater added the comment above.

            Andreas Dilger (andreas.dilger@intel.com) uploaded a new patch: http://review.whamcloud.com/20161
            Subject: LU-137 osd: better stat info for server mountpoints
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: db5b4aaa6177d9ab179b46a8d3e5d13c5d2c2883

            gerrit Gerrit Updater added the comment above.

            Alex, what about adding a new dt_ioctl() method for the OSD API? It would of course be fine if the underlying OSD doesn't support the given ioctl (e.g. returns -ENOTTY), but this gives a way to add features like this. I see that this was handled for FIEMAP by adding a dbo_fiemap_get() method (though it could return -EOPNOTSUPP a LOT earlier in the request processing), but that might result in an explosion of different methods if we add one for every ioctl. Other candidates that we might need in the future include FITRIM to trim unallocated space for SSD or thin-provisioned devices, EXT4_IOC_PRECACHE_EXTENTS to prefetch file extent metadata, and EXT4_IOC_MOVE_EXT or EXT4_IOC_MIGRATE for data migration within the OST.

            It isn't yet clear to me if we want separate ioctl methods for the whole device and per file, or if it is OK to just do the "device" ioctls on the root inode.

            adilger Andreas Dilger added the comment above.
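The single-method idea in the comment above can be sketched as a small userspace model: one optional ioctl hook per backend, with the caller returning -ENOTTY when a backend has no hook or does not recognise the command. Every name here (sketch_dev, sketch_dt_ioctl, SKETCH_GETVERSION and the two ops tables) is hypothetical and is not Lustre's actual OSD API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sketch_dev;

/* Hypothetical per-backend operations table with one generic hook. */
struct sketch_dev_ops {
	/* Optional: NULL when the backend supports no ioctls at all. */
	int (*ioctl)(struct sketch_dev *dev, unsigned int cmd, void *arg);
};

/* Hypothetical stand-in for an OSD device. */
struct sketch_dev {
	const struct sketch_dev_ops *ops;
	long version;   /* toy state returned by SKETCH_GETVERSION */
};

enum { SKETCH_GETVERSION = 1 };   /* made-up command number */

/* Single entry point: forward if a handler exists, else -ENOTTY. */
static int sketch_dt_ioctl(struct sketch_dev *dev, unsigned int cmd, void *arg)
{
	assert(dev != NULL && dev->ops != NULL);

	if (dev->ops->ioctl == NULL)
		return -ENOTTY;
	return dev->ops->ioctl(dev, cmd, arg);
}

/* Backend that handles one command and rejects everything else. */
static int ldiskfs_like_ioctl(struct sketch_dev *dev, unsigned int cmd, void *arg)
{
	switch (cmd) {
	case SKETCH_GETVERSION:
		*(long *)arg = dev->version;
		return 0;
	default:
		return -ENOTTY;
	}
}

static const struct sketch_dev_ops ldiskfs_like_ops = { .ioctl = ldiskfs_like_ioctl };
static const struct sketch_dev_ops zfs_like_ops = { .ioctl = NULL };
```

With this shape, adding support for another command only touches the backend's switch statement, which avoids the per-ioctl method growth the comment is concerned about.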

            I don't think you want to manage ZFS pools (or btrfs devices) directly. We can provide the vfsmount or superblock via osd_conf_get(), but again that will work for ldiskfs only.

            bzzz Alex Zhuravlev added the comment above.

            Hi Alex,

            This patch: http://review.whamcloud.com/#/c/8286/2 removes lsi_srv_mnt, lmi_mnt and ddp_mnt. The commit message says that the vfsmount has become redundant because of the introduction of the local storage device.
            The ioctl passthrough patch needs a valid vfsmount.

            The following is per discussion with Andreas:
            It seems ext4_ioctl() needs filp for mnt_{want,drop}_write_file(), which uses file->f_path.mnt for a number of things, so it really needs a valid vfsmnt structure. The vfsmount still appears to be available in osd_dt_dev(lsi->lsi_dt_dev)->od_mnt, but od_mnt is specific to the underlying OSD device (ldiskfs or ZFS). It looks like there are no direct ioctls for the DMU; they are all handled via /dev/zfs. There are two ioctls for ZFS files in ZPL, but we don't use that, so this only needs to work for ldiskfs mountpoints for now. However, it isn't safe to dig into the OSD structure directly, since this could crash on a ZFS-backed filesystem. It might make sense to add the vfsmnt into dt_device_param. We might need to add a ->dt_ioctl() method to dt_device_operations, since accessing the superblock directly is also bad.

            Sadly, looking at this patch (LU-137) in light of ZFS-backed devices (i.e. anything 2.4 and later), it seems that there are quite a few things that are not "right" about it. ZFS doesn't even have a superblock, so that makes much of the patch invalid. Some of the info can be fetched via the osd_conf_get() interface, in particular s_dev is important for the resize ioctl. Note that even accessing the osd_device outside of the OSD code isn't possible, because the osd_device structure is different for each OSD.

            What would be the best way to proceed in this case?

            spimpale Swapnil Pimpale (Inactive) added the comment above (edited).
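The constraint discussed in the comment above can be modelled in a few lines: the forwarded ioctl dereferences file->f_path.mnt, so the passthrough layer needs a mount pointer from the backend and must refuse when there is none. This is a userspace toy, not kernel code. All struct and function names are hypothetical stand-ins, and returning -EOPNOTSUPP for a backend without a mount is a choice made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types named in the comment above. */
struct sketch_vfsmount { int mnt_readonly; };
struct sketch_path { struct sketch_vfsmount *mnt; };
struct sketch_file { struct sketch_path f_path; };

/*
 * Hypothetical per-device parameters exported by a backend.
 * sp_mnt stays NULL for a backend that has no VFS mount to offer.
 */
struct sketch_dev_param { struct sketch_vfsmount *sp_mnt; };

/* Models a callee that dereferences filp->f_path.mnt unconditionally. */
static int sketch_backend_ioctl(struct sketch_file *filp)
{
	return filp->f_path.mnt->mnt_readonly ? -EROFS : 0;
}

/* Fill in f_path.mnt before forwarding; refuse if no mount is available. */
static int sketch_passthrough(const struct sketch_dev_param *param)
{
	struct sketch_file filp = { .f_path = { .mnt = NULL } };

	assert(param != NULL);

	if (param->sp_mnt == NULL)
		return -EOPNOTSUPP;   /* error code chosen for this sketch */

	filp.f_path.mnt = param->sp_mnt;
	return sketch_backend_ioctl(&filp);
}
```

The check keeps the decision inside the generic layer, so nothing outside the backend has to inspect backend-private structures to find out whether a mount exists.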

            Yes, GETVERSION and GETFLAGS ioctls return correct values which are as follows:

            MDS:
            GETVERSION: 0
            GETFLAGS: 0x0
            
            OST:
            GETVERSION: 0
            GETFLAGS: 0x80000
            

            The crash occurred because of a NULL pointer dereference in mnt_want_write().
            The stack is as follows:

            <1>BUG: unable to handle kernel NULL pointer dereference at 00000000000000e2
            <1>IP: [<ffffffff81193c34>] mnt_want_write+0x14/0x80
            <4>PGD 1512d067 PUD c0d4067 PMD 0 
            <4>Oops: 0000 [#1] SMP 
            <4>last sysfs file: /sys/devices/pci0000:00/0000:00:11.0/0000:02:00.0/irq
            <4>CPU 1 
            <4>Modules linked in: lustre(U) ofd(U) osp(U) lod(U) ost(U) mdt(U) osd_ldiskfs(U) fsfilt_ldiskfs(U) ldiskfs(U) mdd(U) mgs(U) lquota(U) lfsck(U) obdecho(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) libcfs(U) jbd2 sha512_generic sha256_generic crc32c_intel nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc ipv6 dm_mirror dm_region_hash dm_log uinput ppdev parport_pc parport e1000 sg vmware_balloon i2c_piix4 i2c_core shpchp ext3 jbd mbcache sd_mod crc_t10dif sr_mod cdrom mptspi mptscsih mptbase scsi_transport_spi pata_acpi ata_generic ata_piix dm_mod [last unloaded: libcfs]
            <4>
            <4>Pid: 26460, comm: ioctl_passthru Not tainted 2.6.32-279.19.1.el6_lustre.x86_64 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
            <4>RIP: 0010:[<ffffffff81193c34>]  [<ffffffff81193c34>] mnt_want_write+0x14/0x80
            <4>RSP: 0018:ffff880020525cc8  EFLAGS: 00010246
            <4>RAX: 0000000000000000 RBX: ffff880020525d78 RCX: 0000000000000003
            <4>RDX: 0000000000000001 RSI: 0000000040086604 RDI: 0000000000000002
            <4>RBP: ffff880020525cc8 R08: ffffffffa0462c80 R09: 0000000000000000
            <4>R10: 0000000000000000 R11: 0000000000000206 R12: 00007fff78321100
            <4>R13: ffff88003c9be5b0 R14: 00007fff78321100 R15: 0000000000000000
            <4>FS:  00007f94d8a74700(0000) GS:ffff880002280000(0000) knlGS:0000000000000000
            <4>CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
            <4>CR2: 00000000000000e2 CR3: 000000001dcfc000 CR4: 00000000000006e0
            <4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
            <4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
            <4>Process ioctl_passthru (pid: 26460, threadinfo ffff880020524000, task ffff880020622aa0)
            <4>Stack:
            <4> ffff880020525d58 ffffffffa07198f4 0000000000000000 ffff88001dc499a8
            <4><d> 0000000000000286 ffff88003c9be5b0 ffff880020525d18 ffffffff811902c0
            <4><d> ffff880020525d18 ffff88003c9be5b0 ffff880020525d58 ffffffff8118ee10
            <4>Call Trace:
            <4> [<ffffffffa07198f4>] ldiskfs_ioctl+0xe4/0x940 [ldiskfs]
            <4> [<ffffffff811902c0>] ? iput+0x30/0x70
            <4> [<ffffffff8118ee10>] ? d_obtain_alias+0xc0/0x230
            <4> [<ffffffffa0451b2a>] server_ioctl+0xba/0xf0 [obdclass]
            <4> [<ffffffff81312cb3>] ? pty_write+0x73/0x80
            <4> [<ffffffff8130c34e>] ? do_output_char+0x1de/0x210
            <4> [<ffffffff81090d4c>] ? remove_wait_queue+0x3c/0x50
            <4> [<ffffffff81052223>] ? __wake_up+0x53/0x70
            <4> [<ffffffff81189012>] vfs_ioctl+0x22/0xa0
            <4> [<ffffffff81310efe>] ? tty_ldisc_deref+0xe/0x10
            <4> [<ffffffff81309e93>] ? tty_write+0x233/0x2a0
            <4> [<ffffffff811891b4>] do_vfs_ioctl+0x84/0x580
            <4> [<ffffffff81176602>] ? vfs_write+0x132/0x1a0
            <4> [<ffffffff81189731>] sys_ioctl+0x81/0xa0
            <4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
            <4>Code: 8b 40 58 83 e0 01 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 65 8b 14 25 b8 e0 00 00 48 63 d2 <48> 8b 87 e0 00 00 00 48 03 04 d5 20 81 bf 81 83 00 01 0f ae f0
            <1>RIP  [<ffffffff81193c34>] mnt_want_write+0x14/0x80
            <4> RSP <ffff880020525cc8>
            <4>CR2: 00000000000000e2
            

            This happened because active_filp.f_path.mnt was not filled in before calling the ioctl.
            I have fixed this problem in the next patchset (http://review.whamcloud.com/#/c/7354/4).
            With this fix, I am able to test the SETVERSION and SETFLAGS ioctls.

            I also tested the old online resizefs ioctl LDISKFS_IOC_GROUP_EXTEND (latest e2fsprogs) as follows and it seems to be working:

            # df -kh
            Filesystem            Size  Used Avail Use% Mounted on
            /dev/sdb1              28G   20G  6.3G  76% /
            tmpfs                 435M   88K  435M   1% /dev/shm
            /dev/sda1             486M  109M  352M  24% /boot
            /dev/loop0            147M   18M  120M  13% /mnt/mds1
            /dev/loop1            184M   26M  149M  15% /mnt/ost1
            /dev/loop2            184M   26M  149M  15% /mnt/ost2
            TM2@tcp:/lustre       367M   51M  297M  15% /mnt/lustre
            
            # dd if=/dev/zero of=/tmp/lustre-ost-x bs=1M count=200
            
            # mkfs.lustre --fsname=lustre --mgsnode=192.168.100.26@tcp0 --ost --index=2 --device-size=100000 /tmp/lustre-ost-x
            
            # mount -t lustre -o loop /tmp/lustre-ost-x /mnt/ost-x/
            
            # df -kh
            Filesystem            Size  Used Avail Use% Mounted on
            /dev/sdb1              28G   20G  6.3G  76% /
            tmpfs                 435M   88K  435M   1% /dev/shm
            /dev/sda1             486M  109M  352M  24% /boot
            /dev/loop0            147M   18M  120M  13% /mnt/mds1
            /dev/loop1            184M   26M  149M  15% /mnt/ost1
            /dev/loop2            184M   26M  149M  15% /mnt/ost2
            TM2@tcp:/lustre       458M   56M  378M  13% /mnt/lustre
            /dev/loop3             92M  5.3M   82M   7% /mnt/ost-x
            
            ~/e2fsprogs# ./build/resize/resize2fs /dev/loop3 125M
            resize2fs 1.43-WIP (21-Jan-2013)
            Filesystem at /dev/loop3 is mounted on /mnt/ost-x; on-line resizing required
            old_desc_blocks = 1, new_desc_blocks = 1
            Performing an on-line resize of /dev/loop3 to 32000 (4k) blocks.
            The filesystem on /dev/loop3 is now 32000 blocks long.
            
            # df -kh
            Filesystem            Size  Used Avail Use% Mounted on
            /dev/sdb1              28G   20G  6.3G  76% /
            tmpfs                 435M   88K  435M   1% /dev/shm
            /dev/sda1             486M  109M  352M  24% /boot
            /dev/loop0            147M   18M  120M  13% /mnt/mds1
            /dev/loop1            184M   26M  149M  15% /mnt/ost1
            /dev/loop2            184M   26M  149M  15% /mnt/ost2
            TM2@tcp:/lustre       486M   56M  406M  13% /mnt/lustre
            /dev/loop3            119M  5.3M  109M   5% /mnt/ost-x
            
            # umount /mnt/ost-x/
            
            # mount -t lustre -o loop /tmp/lustre-ost-x /mnt/ost-x/
            
            # df -kh
            Filesystem            Size  Used Avail Use% Mounted on
            /dev/sdb1              28G   20G  6.3G  76% /
            tmpfs                 435M   88K  435M   1% /dev/shm
            /dev/sda1             486M  109M  352M  24% /boot
            /dev/loop0            147M   18M  120M  13% /mnt/mds1
            /dev/loop1            184M   26M  149M  15% /mnt/ost1
            /dev/loop2            184M   26M  149M  15% /mnt/ost2
            TM2@tcp:/lustre       486M   56M  406M  13% /mnt/lustre
            /dev/loop3            119M  5.3M  109M   5% /mnt/ost-x
            
            spimpale Swapnil Pimpale (Inactive) added the comment above.
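The block count reported by resize2fs in the comment above follows from the requested size: 125 MiB divided by the 4 KiB block size gives 32000 blocks. A minimal sketch of that arithmetic, with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a size given in mebibytes (the "M" suffix) to bytes. */
static uint64_t mib_to_bytes(uint64_t mib)
{
	return mib << 20;
}

/* Whole filesystem blocks that fit in size_bytes (rounds down). */
static uint64_t size_to_blocks(uint64_t size_bytes, uint32_t block_size)
{
	assert(block_size != 0);
	return size_bytes / block_size;
}
```

Here size_to_blocks(mib_to_bytes(125), 4096) evaluates to 32000, matching the "32000 (4k) blocks" line in the output.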

            It definitely shouldn't crash regardless of what ioctl is used, though I don't necessarily expect it to do anything useful. Presumably the GETVERSION and GETFLAGS ioctls return the correct values from the underlying root inode?

            Next step is to figure out why it crashed and fix that.

            adilger Andreas Dilger added the comment above.

            Hi Andreas,

            I have ported the ioctl_passthru-1_8.patch to the latest master.
            With this patch I tested the EXT4_IOC_GETFLAGS and EXT4_IOC_GETVERSION ioctls on OST and MDT mountpoints. These ioctls work and return the expected values.
            I have added this as a test case in sanity.sh.
            The patch can be found here: http://review.whamcloud.com/#/c/7354/

            I tried the EXT4_IOC_SETVERSION ioctl but that resulted in a crash.
            Is that expected?

            spimpale Swapnil Pimpale (Inactive) added the comment above.

            Unfortunately, I expect that the 1.8 version of the patch is completely useless for the current 2.1 and master code...

            You might want to experiment with some simple ioctl (e.g. EXT4_IOC_GETFLAGS or EXT4_IOC_GETVERSION) to get that working before you try to have resize2fs calling the online resizer. The end goal is that at least EXT4_IOC_GROUP_EXTEND/EXT4_IOC_GROUP_ADD (old resize), EXT4_IOC_RESIZE_FS (new resize), and FITRIM work.

            adilger Andreas Dilger added the comment above.
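One way to stage the work described in the comment above is to forward only an explicit list of commands and grow that list as each one is verified. The sketch below is an assumption about how such a check could look, not part of any patch referenced here. FS_IOC_GETFLAGS, FS_IOC_GETVERSION and FITRIM come from <linux/fs.h>; the two SKETCH_ macros are local definitions that are assumed to mirror the values in the kernel's ext4 header, and sketch_ioctl_allowed() is a hypothetical name.

```c
#include <assert.h>
#include <linux/fs.h>      /* FS_IOC_GETFLAGS, FS_IOC_GETVERSION, FITRIM */
#include <linux/types.h>   /* __u64 */
#include <stdbool.h>
#include <sys/ioctl.h>     /* _IOW */

/* Assumed to match EXT4_IOC_GROUP_EXTEND and EXT4_IOC_RESIZE_FS. */
#define SKETCH_IOC_GROUP_EXTEND  _IOW('f', 7, unsigned long)
#define SKETCH_IOC_RESIZE_FS     _IOW('f', 16, __u64)

/* Return true only for commands that have been explicitly enabled. */
static bool sketch_ioctl_allowed(unsigned long cmd)
{
	switch (cmd) {
	case FS_IOC_GETFLAGS:           /* simple read-only queries first */
	case FS_IOC_GETVERSION:
	case SKETCH_IOC_GROUP_EXTEND:   /* old online resize */
	case SKETCH_IOC_RESIZE_FS:      /* new online resize */
	case FITRIM:                    /* discard unused space */
		return true;
	default:
		return false;
	}
}
```

Anything not on the list, such as FS_IOC_SETFLAGS in this sketch, would be rejected before it reaches the underlying filesystem.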

            People

              adilger Andreas Dilger
              adilger Andreas Dilger
              Votes: 0
              Watchers: 14