
Slab corruption using fiemap ioctl with fm_extent_count==0

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.16.0
    • 2.15.3 clients
      FS1 2.12 servers (clusterstore)
      FS2 2.15.3 servers

    Description

      We first hit this in a production environment with 2.15 clients running mpifileutils dsync.
      dsync's first fiemap call only determines the number of extents in the file, so it passes a fiemap structure with no extents allocated (.fm_extent_count = 0).

      Reproducer (reproduced on master branch):

      #include <sys/types.h>
      #include <sys/stat.h>
      #include <sys/ioctl.h>
      #include <fcntl.h>
      #include <stdio.h>
      #include <unistd.h>
      
      #include <linux/fs.h>
      #include <linux/fiemap.h>
      
      int main(int argc, char **argv)
      {
              char *fname;
              off_t fsize;
              int fd;
              int i = 0;
              /* fm_extent_count == 0: request only the extent count */
              struct fiemap fiemap = {
                      .fm_start  = 0,
                      .fm_flags  = FIEMAP_FLAG_SYNC,
                      .fm_extent_count   = 0,
                      .fm_mapped_extents = 0,
              };
      
              if (argc <= 1)
                      return 1;
      
              fname = argv[1];
      
              fd = open(fname, O_RDONLY);
              if (fd < 0) {
                      perror("Failed to open");
                      return 1;
              }
      
              /* Use the file size as the length of the range to map */
              fsize = lseek(fd, 0, SEEK_END);
              if (fsize < 0) {
                      perror("Failed to seek");
                      close(fd);
                      return 1;
              }
              lseek(fd, 0, SEEK_SET);
      
              fiemap.fm_length = fsize;
      
              /* Issue the same fiemap call repeatedly; exits only on ioctl failure */
              while (1) {
                      printf("iter: %i\n", ++i);
                      if (ioctl(fd, FS_IOC_FIEMAP, &fiemap) < 0) {
                              perror("FS_IOC_FIEMAP ioctl failed");
                              close(fd);
                              return 1;
                      }
                      usleep(1000);
              }
      
              return 0;
      }
      
      [  116.791028] WARNING: CPU: 1 PID: 13475 at lib/list_debug.c:33 __list_add+0xac/0xc0
      [  116.791035] list_add corruption. prev->next should be next (ffff8848b7c2d390), but was ffff8848b7c2d391. (prev=ffff8848a584fe28).
      [  116.791039] Modules linked in: loop zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) joydev libcfs(OE) dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd cuse grace fuse fscache sunrpc ext4 mbcache jbd2 ppdev iosf_mbi crc32_pclmul ghash_clmulni_intel snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg pcspkr parport_pc snd_timer vboxguest(OE) snd parport video soundcore i2c_piix4 binfmt_misc ip_tables xfs libcrc32c sr_mod
      [  116.791250]  cdrom sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix ahci libahci crct10dif_pclmul crct10dif_common crc32c_intel serio_raw e1000 libata drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod
      [  116.791317] CPU: 1 PID: 13475 Comm: fiemap_test Kdump: loaded Tainted: P        W  OE  ------------   3.10.0-1160.59.1.el7.centos.plus.x86_64 #1
      [  116.791322] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  116.791327] Call Trace:
      [  116.791348]  [<ffffffff89d975b9>] dump_stack+0x19/0x1b
      [  116.791358]  [<ffffffff8969b278>] __warn+0xd8/0x100
      [  116.791364]  [<ffffffff8969b2ff>] warn_slowpath_fmt+0x5f/0x80
      [  116.791374]  [<ffffffff899b745c>] __list_add+0xac/0xc0
      [  116.791417]  [<ffffffffc08964ba>] libcfs_debug_msg+0x2da/0xac0 [libcfs]
      [  116.791431]  [<ffffffff899a351b>] ? string.isra.7+0x3b/0xf0
      [  116.791669]  [<ffffffffc0d55040>] ? lock_matches+0x230/0x230 [ptlrpc]
      [  116.791837]  [<ffffffffc0d51cc7>] _ldlm_lock_debug+0x647/0x830 [ptlrpc]
      [  116.791944]  [<ffffffffc0d5358d>] ? ldlm_lock_remove_from_lru_nolock+0x3d/0xe0 [ptlrpc]
      [  116.792046]  [<ffffffffc0d55040>] ? lock_matches+0x230/0x230 [ptlrpc]
      [  116.792157]  [<ffffffffc0d54d10>] ldlm_lock_addref_internal_nolock+0x80/0x100 [ptlrpc]
      [  116.792282]  [<ffffffffc0d5503b>] lock_matches+0x22b/0x230 [ptlrpc]
      [  116.792391]  [<ffffffffc0d5508e>] itree_overlap_cb+0x4e/0x70 [ptlrpc]
      [  116.792511]  [<ffffffffc0a7ae3b>] interval_search+0x8b/0x220 [obdclass]
      [  116.792735]  [<ffffffffc0d51534>] search_itree+0x94/0xd0 [ptlrpc]
      [  116.792878]  [<ffffffffc0d5612f>] ldlm_lock_match_with_skip+0x29f/0x9a0 [ptlrpc]
      [  116.792892]  [<ffffffff899a4c64>] ? vsnprintf+0x234/0x6a0
      [  116.792908]  [<ffffffff899a4c64>] ? vsnprintf+0x234/0x6a0
      [  116.792944]  [<ffffffffc0fa0fbd>] osc_object_fiemap+0x15d/0x6a0 [osc]
      [  116.793033]  [<ffffffffc0a66313>] cl_object_fiemap+0x73/0x160 [obdclass]
      [  116.793066]  [<ffffffffc10324f0>] lov_object_fiemap+0x1300/0x18f0 [lov]
      [  116.793131]  [<ffffffffc1701c50>] ? vvp_io_fini+0x410/0x710 [lustre]
      [  116.793215]  [<ffffffffc0a66313>] cl_object_fiemap+0x73/0x160 [obdclass]
      [  116.793267]  [<ffffffffc169cb5c>] ll_do_fiemap+0x2bc/0x390 [lustre]
      [  116.793318]  [<ffffffffc169d057>] ll_fiemap+0x427/0x5f0 [lustre]
      [  116.793332]  [<ffffffff89863934>] do_vfs_ioctl+0x204/0x5b0
      [  116.793343]  [<ffffffff89863d81>] SyS_ioctl+0xa1/0xc0
      [  116.793356]  [<ffffffff89daaec9>] ? system_call_after_swapgs+0x96/0x13a
      [  116.793366]  [<ffffffff89daaf92>] system_call_fastpath+0x25/0x2a
      [  116.793379]  [<ffffffff89daaed5>] ? system_call_after_swapgs+0xa2/0x13a
      [  116.793388] ---[ end trace d89bc4ba123f5eec ]---
      [  117.442039] ------------[ cut here ]------------
      [  117.442047] WARNING: CPU: 1 PID: 13475 at lib/list_debug.c:62 __list_del_entry+0x82/0xd0
      [  117.442049] list_del corruption. next->prev should be ffff8848a584f1a8, but was a5ffff8848a584f1
      [  117.442050] Modules linked in: loop zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) joydev libcfs(OE) dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd cuse grace fuse fscache sunrpc ext4 mbcache jbd2 ppdev iosf_mbi crc32_pclmul ghash_clmulni_intel snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg pcspkr parport_pc snd_timer vboxguest(OE) snd parport video soundcore i2c_piix4 binfmt_misc ip_tables xfs libcrc32c sr_mod
      [  117.442116]  cdrom sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix ahci libahci crct10dif_pclmul crct10dif_common crc32c_intel serio_raw e1000 libata drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod
      [  117.442139] CPU: 1 PID: 13475 Comm: fiemap_test Kdump: loaded Tainted: P        W  OE  ------------   3.10.0-1160.59.1.el7.centos.plus.x86_64 #1
      [  117.442141] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  117.442143] Call Trace:
      [  117.442149]  [<ffffffff89d975b9>] dump_stack+0x19/0x1b
      [  117.442152]  [<ffffffff8969b278>] __warn+0xd8/0x100
      [  117.442155]  [<ffffffff8969b2ff>] warn_slowpath_fmt+0x5f/0x80
      [  117.442174]  [<ffffffffc092ffff>] ? lnet_rtrpools_alloc+0x17f/0x320 [lnet]
      [  117.442177]  [<ffffffff899b74f2>] __list_del_entry+0x82/0xd0
      [  117.442187]  [<ffffffffc0896185>] cfs_tage_to_tail+0x25/0x80 [libcfs]
      [  117.442195]  [<ffffffffc0896abd>] libcfs_debug_msg+0x8dd/0xac0 [libcfs]
      [  117.442199]  [<ffffffff899a351b>] ? string.isra.7+0x3b/0xf0
      [  117.442255]  [<ffffffffc0d55040>] ? lock_matches+0x230/0x230 [ptlrpc]
      [  117.442300]  [<ffffffffc0d51cc7>] _ldlm_lock_debug+0x647/0x830 [ptlrpc]
      [  117.442345]  [<ffffffffc0d5358d>] ? ldlm_lock_remove_from_lru_nolock+0x3d/0xe0 [ptlrpc]
      [  117.442388]  [<ffffffffc0d55040>] ? lock_matches+0x230/0x230 [ptlrpc]
      [  117.442432]  [<ffffffffc0d54d10>] ldlm_lock_addref_internal_nolock+0x80/0x100 [ptlrpc]
      [  117.442473]  [<ffffffffc0d5503b>] lock_matches+0x22b/0x230 [ptlrpc]
      [  117.442514]  [<ffffffffc0d5508e>] itree_overlap_cb+0x4e/0x70 [ptlrpc]
      [  117.442568]  [<ffffffffc0a7ae3b>] interval_search+0x8b/0x220 [obdclass]
      [  117.442653]  [<ffffffffc0d51534>] search_itree+0x94/0xd0 [ptlrpc]
      [  117.442698]  [<ffffffffc0d5612f>] ldlm_lock_match_with_skip+0x29f/0x9a0 [ptlrpc]
      [  117.442704]  [<ffffffff899a4c64>] ? vsnprintf+0x234/0x6a0
      [  117.442707]  [<ffffffff899a4c64>] ? vsnprintf+0x234/0x6a0
      [  117.442719]  [<ffffffffc0fa0fbd>] osc_object_fiemap+0x15d/0x6a0 [osc]
      [  117.442746]  [<ffffffffc0a66313>] cl_object_fiemap+0x73/0x160 [obdclass]
      [  117.442758]  [<ffffffffc10324f0>] lov_object_fiemap+0x1300/0x18f0 [lov]
      [  117.442792]  [<ffffffffc1701c50>] ? vvp_io_fini+0x410/0x710 [lustre]
      [  117.442832]  [<ffffffffc0a66313>] cl_object_fiemap+0x73/0x160 [obdclass]
      [  117.442847]  [<ffffffffc169cb5c>] ll_do_fiemap+0x2bc/0x390 [lustre]
      [  117.442862]  [<ffffffffc169d057>] ll_fiemap+0x427/0x5f0 [lustre]
      [  117.442867]  [<ffffffff89863934>] do_vfs_ioctl+0x204/0x5b0
      [  117.442870]  [<ffffffff89863d81>] SyS_ioctl+0xa1/0xc0
      [  117.442875]  [<ffffffff89daaec9>] ? system_call_after_swapgs+0x96/0x13a
      [  117.442881]  [<ffffffff89daaf92>] system_call_fastpath+0x25/0x2a
      [  117.442885]  [<ffffffff89daaed5>] ? system_call_after_swapgs+0xa2/0x13a
      [  117.442887] ---[ end trace d89bc4ba123f5eed ]---
      [  118.280802] general protection fault: 0000 [#1] SMP 
      [  118.280822] Modules linked in: loop zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) joydev libcfs(OE) dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd cuse grace fuse fscache sunrpc ext4 mbcache jbd2 ppdev iosf_mbi crc32_pclmul ghash_clmulni_intel snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg pcspkr parport_pc snd_timer vboxguest(OE) snd parport video soundcore i2c_piix4 binfmt_misc ip_tables xfs libcrc32c sr_mod
      [  118.281074]  cdrom sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix ahci libahci crct10dif_pclmul crct10dif_common crc32c_intel serio_raw e1000 libata drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod
      [  118.281166] CPU: 1 PID: 13478 Comm: abrt-server Kdump: loaded Tainted: P        W  OE  ------------   3.10.0-1160.59.1.el7.centos.plus.x86_64 #1
      [  118.281195] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  118.281214] task: ffff8848b767a100 ti: ffff8848b8b60000 task.ti: ffff8848b8b60000
      [  118.281231] RIP: 0010:[<ffffffffc04519e0>]  [<ffffffffc04519e0>] xfs_trans_buf_item_match+0x60/0xa0 [xfs]
      [  118.281271] RSP: 0018:ffff8848b8b63948  EFLAGS: 00010212
      [  118.281284] RAX: 60ffff8848b8b873 RBX: ffff8848ba1181b0 RCX: ffff88487633f5c1
      [  118.281300] RDX: ffff8848b8b639b8 RSI: ffff8848b69e93c0 RDI: ffff8848ba118260
      [  118.281317] RBP: ffff8848b8b63948 R08: 0000000000000008 R09: ffff8848b59f1f28
      [  118.281333] R10: 0000000002c37820 R11: 0000000000000001 R12: ffff8848b6ab9000
      [  118.281349] R13: ffff8848b8b63a00 R14: ffff8848b8b639b8 R15: ffff8848b69e93c0
      [  118.281366] FS:  00007fa5e1243900(0000) GS:ffff8848bfd00000(0000) knlGS:0000000000000000
      [  118.281384] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  118.281399] CR2: 00007fa5e126f000 CR3: 000000003bece000 CR4: 00000000000606e0
      [  118.281438] Call Trace:
      [  118.281460]  [<ffffffffc0451da2>] xfs_trans_read_buf_map+0x52/0x2c0 [xfs]
      [  118.281485]  [<ffffffffc03f6934>] xfs_btree_read_buf_block.constprop.33+0xa4/0xe0 [xfs]
      [  118.281513]  [<ffffffffc03faa45>] xfs_btree_lookup_get_block+0x95/0x1a0 [xfs]
      [  118.281543]  [<ffffffffc03facff>] xfs_btree_lookup+0xdf/0x420 [xfs]
      [  118.282076]  [<ffffffffc03df60b>] xfs_alloc_lookup_eq+0x1b/0x20 [xfs]
      [  118.282591]  [<ffffffffc03e0ed8>] xfs_free_ag_extent+0x278/0x780 [xfs]
      [  118.283094]  [<ffffffffc03e33da>] xfs_free_extent+0xaa/0x140 [xfs]
      [  118.283604]  [<ffffffffc045272a>] xfs_trans_free_extent+0x4a/0x100 [xfs]
      [  118.284103]  [<ffffffffc04527fe>] xfs_extent_free_finish_item+0x1e/0x40 [xfs]
      [  118.284596]  [<ffffffffc0401738>] xfs_defer_finish+0x128/0x3d0 [xfs]
      [  118.285077]  [<ffffffffc0434cf5>] xfs_itruncate_extents+0xf5/0x220 [xfs]
      [  118.285553]  [<ffffffffc0434ed7>] xfs_inactive_truncate+0xb7/0x110 [xfs]
      [  118.286014]  [<ffffffffc0435528>] xfs_inactive+0x108/0x130 [xfs]
      [  118.286463]  [<ffffffffc043cb15>] xfs_fs_destroy_inode+0x95/0x190 [xfs]
      [  118.286899]  [<ffffffff8986c85b>] destroy_inode+0x3b/0x60
      [  118.287317]  [<ffffffff8986c995>] evict+0x115/0x180
      [  118.287727]  [<ffffffff8986cd6c>] iput+0xfc/0x190
      [  118.288120]  [<ffffffff89860b3e>] do_unlinkat+0x1ae/0x2d0
      [  118.288511]  [<ffffffff89daaed5>] ? system_call_after_swapgs+0xa2/0x13a
      [  118.288892]  [<ffffffff89daaec9>] ? system_call_after_swapgs+0x96/0x13a
      [  118.289254]  [<ffffffff89daaed5>] ? system_call_after_swapgs+0xa2/0x13a
      [  118.289607]  [<ffffffff89daaec9>] ? system_call_after_swapgs+0x96/0x13a
      [  118.289944]  [<ffffffff89daaed5>] ? system_call_after_swapgs+0xa2/0x13a
      [  118.290281]  [<ffffffff89daaec9>] ? system_call_after_swapgs+0x96/0x13a
      [  118.290601]  [<ffffffff89861bbb>] SyS_unlinkat+0x1b/0x40
      [  118.290900]  [<ffffffff89daaf92>] system_call_fastpath+0x25/0x2a
      [  118.291189]  [<ffffffff89daaed5>] ? system_call_after_swapgs+0xa2/0x13a
      [  118.291476] Code: 48 8b 87 b0 00 00 00 48 81 c7 b0 00 00 00 48 39 c7 48 8d 48 f8 75 11 eb 42 66 90 48 8b 41 08 48 39 c7 48 8d 48 f8 74 33 48 8b 01 <81> 78 30 3c 12 00 00 75 e7 48 8b 80 88 00 00 00 48 39 b0 98 00 
      [  118.292459] RIP  [<ffffffffc04519e0>] xfs_trans_buf_item_match+0x60/0xa0 [xfs]
      

      It seems that "LU-16480 lov: fiemap improperly handles fm_extent_count=0" did not fix all the cases.
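
      For context, the counting call described above is normally the first half of a two-pass FIEMAP query: with fm_extent_count set to 0 the kernel fills in only fm_mapped_extents, and the caller then allocates that many fiemap_extent records and repeats the call. The sketch below illustrates that generic pattern; it is not taken from dsync or from the Lustre patch, and the helper names fiemap_alloc and fiemap_query are hypothetical.

```c
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/types.h>
#include <linux/fs.h>
#include <linux/fiemap.h>

/* Hypothetical helper: allocate a zeroed fiemap header followed by room
 * for nr_extents extent records. Returns NULL on allocation failure. */
static struct fiemap *fiemap_alloc(__u32 nr_extents)
{
        struct fiemap *fm;

        fm = calloc(1, sizeof(*fm) +
                    (size_t)nr_extents * sizeof(struct fiemap_extent));
        if (fm != NULL)
                fm->fm_extent_count = nr_extents;
        return fm;
}

/* Hypothetical helper: two-pass FIEMAP query over [0, length).
 * Returns a buffer the caller must free(), or NULL on any error. */
static struct fiemap *fiemap_query(int fd, __u64 length)
{
        struct fiemap *fm;
        __u32 nr;

        /* Pass 1: fm_extent_count == 0, so only the extent count is
         * reported back, in fm_mapped_extents. */
        fm = fiemap_alloc(0);
        if (fm == NULL)
                return NULL;
        fm->fm_flags = FIEMAP_FLAG_SYNC;
        fm->fm_length = length;
        if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
                free(fm);
                return NULL;
        }
        nr = fm->fm_mapped_extents;
        free(fm);

        /* Pass 2: repeat the call with room for nr extents. */
        fm = fiemap_alloc(nr);
        if (fm == NULL)
                return NULL;
        fm->fm_flags = FIEMAP_FLAG_SYNC;
        fm->fm_length = length;
        if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
                free(fm);
                return NULL;
        }
        return fm;
}
```

      The file can change between the two passes, so a caller should rely on fm_mapped_extents from the second call rather than on the count from the first. The reproducer in this ticket exercises only the first pass.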

          Activity

            Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52512/
            Subject: LU-17110 llite: fix slab corruption with fm_extent_count=0
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set:
            Commit: 147632c2c5246a1ad5e5e4259d7f8f84211dba30

            Peter Jones added a comment -

            Landed for 2.16

            Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/52352/
            Subject: LU-17110 llite: fix slab corruption with fm_extent_count=0
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: a81dc7d0e158894e905ab3d309f7b92864a94378

            Gerrit Updater added a comment -

            "Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52512
            Subject: LU-17110 llite: fix slab corruption with fm_extent_count=0
            Project: fs/lustre-release
            Branch: b2_15
            Current Patch Set: 1
            Commit: aa0718aaf5a66aa83e61e37fefc6b8ad9adda559

            Gerrit Updater added a comment -

            "Etienne AUJAMES <eaujames@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/52352
            Subject: LU-17110 llite: fix slab corruption with fm_extent_count=0
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: d31ab8180f2ee85054f872333ab19ba36f11294b

            People

              eaujames Etienne Aujames
              eaujames Etienne Aujames