Lustre / LU-17110

Slab corruption using fiemap ioctl with fm_extent_count==0

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 2.16.0
    • None
    • None
    • 2.15.3 clients
      FS1 2.12 servers (clusterstore)
      FS2 2.15.3 servers
    • 3
    • 9223372036854775807

    Description

      We first hit this in a production environment with 2.15 clients running mpifileutils dsync.
      dsync's first fiemap call passes no extent array (.fm_extent_count = 0); it is only used to determine the number of extents in the file.
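
      For context, the count-then-fetch pattern described above can be sketched as follows. This is a minimal illustration of the generic two-pass FIEMAP usage, not dsync's actual code; the helper names fiemap_buf_size, fiemap_count_extents and fiemap_fetch are hypothetical. On an affected Lustre client, the first pass is the kind of call that triggers the corruption reported here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/fiemap.h>

/* Hypothetical helper: bytes needed for a struct fiemap plus `count` extents. */
static size_t fiemap_buf_size(uint32_t count)
{
        return sizeof(struct fiemap) +
               (size_t)count * sizeof(struct fiemap_extent);
}

/*
 * Pass 1: ask only for the extent count. With fm_extent_count == 0 the
 * kernel reports the count in fm_mapped_extents and returns no extent
 * records. Returns the count, or -1 on error (errno set by ioctl).
 */
static long fiemap_count_extents(int fd)
{
        struct fiemap probe = {
                .fm_start        = 0,
                .fm_length       = FIEMAP_MAX_OFFSET,
                .fm_flags        = FIEMAP_FLAG_SYNC,
                .fm_extent_count = 0,
        };

        if (ioctl(fd, FS_IOC_FIEMAP, &probe) < 0)
                return -1;
        return (long)probe.fm_mapped_extents;
}

/*
 * Pass 2: allocate room for `count` extents and fetch them.
 * Returns a heap buffer the caller must free(), or NULL on error.
 */
static struct fiemap *fiemap_fetch(int fd, uint32_t count)
{
        struct fiemap *fm;

        assert(count > 0);      /* a zero count belongs to pass 1 */
        fm = calloc(1, fiemap_buf_size(count));
        if (fm == NULL)
                return NULL;

        fm->fm_length = FIEMAP_MAX_OFFSET;
        fm->fm_flags = FIEMAP_FLAG_SYNC;
        fm->fm_extent_count = count;
        if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
                free(fm);
                return NULL;
        }
        return fm;
}
```

      A caller would typically run fiemap_count_extents(fd) first, then pass the result to fiemap_fetch(fd, n) when n > 0 and read fm_mapped_extents entries from fm_extents[]. The file can change between the two calls, so the second result may hold fewer extents than were allocated.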

      Reproducer (reproduced on master branch):

      #include <sys/types.h>
      #include <sys/stat.h>
      #include <sys/ioctl.h>
      #include <fcntl.h>
      #include <stdio.h>
      #include <unistd.h>
      
      #include <linux/fs.h>
      #include <linux/fiemap.h>
      
      int main(int argc, char **argv)
      {
              char *fname;
              off_t fsize;
              int fd;
              int i = 0;      /* iteration counter */
              struct fiemap fiemap = { 
                      .fm_start  = 0,
                      .fm_flags  = FIEMAP_FLAG_SYNC,
                      .fm_extent_count   = 0,
                      .fm_mapped_extents = 0,
              };  
      
              if (argc <= 1)
                      return 1;
      
              fname = argv[1];
      
              fd = open(fname, O_RDONLY);
              if (fd < 0) {
                      perror("Failed to open");
                      return 1;
              }   
      
              fsize = lseek(fd, 0, SEEK_END);
              if (fsize < 0)
                      return 1;
              lseek(fd, 0, SEEK_SET);
      
              fiemap.fm_length = fsize;
      
              while (1) {
                      printf("iter: %i\n", ++i);
                      if (ioctl(fd, FS_IOC_FIEMAP, &fiemap) < 0) {
                              perror("FS_IOC_FIEMAP ioctl failed");
                              return 1;
                      }   
                      usleep(1000);
              }   
      
              return 0;
      }
      
      [  116.791028] WARNING: CPU: 1 PID: 13475 at lib/list_debug.c:33 __list_add+0xac/0xc0
      [  116.791035] list_add corruption. prev->next should be next (ffff8848b7c2d390), but was ffff8848b7c2d391. (prev=ffff8848a584fe28).
      [  116.791039] Modules linked in: loop zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) joydev libcfs(OE) dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd cuse grace fuse fscache sunrpc ext4 mbcache jbd2 ppdev iosf_mbi crc32_pclmul ghash_clmulni_intel snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg pcspkr parport_pc snd_timer vboxguest(OE) snd parport video soundcore i2c_piix4 binfmt_misc ip_tables xfs libcrc32c sr_mod
      [  116.791250]  cdrom sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix ahci libahci crct10dif_pclmul crct10dif_common crc32c_intel serio_raw e1000 libata drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod
      [  116.791317] CPU: 1 PID: 13475 Comm: fiemap_test Kdump: loaded Tainted: P        W  OE  ------------   3.10.0-1160.59.1.el7.centos.plus.x86_64 #1
      [  116.791322] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  116.791327] Call Trace:
      [  116.791348]  [<ffffffff89d975b9>] dump_stack+0x19/0x1b
      [  116.791358]  [<ffffffff8969b278>] __warn+0xd8/0x100
      [  116.791364]  [<ffffffff8969b2ff>] warn_slowpath_fmt+0x5f/0x80
      [  116.791374]  [<ffffffff899b745c>] __list_add+0xac/0xc0
      [  116.791417]  [<ffffffffc08964ba>] libcfs_debug_msg+0x2da/0xac0 [libcfs]
      [  116.791431]  [<ffffffff899a351b>] ? string.isra.7+0x3b/0xf0
      [  116.791669]  [<ffffffffc0d55040>] ? lock_matches+0x230/0x230 [ptlrpc]
      [  116.791837]  [<ffffffffc0d51cc7>] _ldlm_lock_debug+0x647/0x830 [ptlrpc]
      [  116.791944]  [<ffffffffc0d5358d>] ? ldlm_lock_remove_from_lru_nolock+0x3d/0xe0 [ptlrpc]
      [  116.792046]  [<ffffffffc0d55040>] ? lock_matches+0x230/0x230 [ptlrpc]
      [  116.792157]  [<ffffffffc0d54d10>] ldlm_lock_addref_internal_nolock+0x80/0x100 [ptlrpc]
      [  116.792282]  [<ffffffffc0d5503b>] lock_matches+0x22b/0x230 [ptlrpc]
      [  116.792391]  [<ffffffffc0d5508e>] itree_overlap_cb+0x4e/0x70 [ptlrpc]
      [  116.792511]  [<ffffffffc0a7ae3b>] interval_search+0x8b/0x220 [obdclass]
      [  116.792735]  [<ffffffffc0d51534>] search_itree+0x94/0xd0 [ptlrpc]
      [  116.792878]  [<ffffffffc0d5612f>] ldlm_lock_match_with_skip+0x29f/0x9a0 [ptlrpc]
      [  116.792892]  [<ffffffff899a4c64>] ? vsnprintf+0x234/0x6a0
      [  116.792908]  [<ffffffff899a4c64>] ? vsnprintf+0x234/0x6a0
      [  116.792944]  [<ffffffffc0fa0fbd>] osc_object_fiemap+0x15d/0x6a0 [osc]
      [  116.793033]  [<ffffffffc0a66313>] cl_object_fiemap+0x73/0x160 [obdclass]
      [  116.793066]  [<ffffffffc10324f0>] lov_object_fiemap+0x1300/0x18f0 [lov]
      [  116.793131]  [<ffffffffc1701c50>] ? vvp_io_fini+0x410/0x710 [lustre]
      [  116.793215]  [<ffffffffc0a66313>] cl_object_fiemap+0x73/0x160 [obdclass]
      [  116.793267]  [<ffffffffc169cb5c>] ll_do_fiemap+0x2bc/0x390 [lustre]
      [  116.793318]  [<ffffffffc169d057>] ll_fiemap+0x427/0x5f0 [lustre]
      [  116.793332]  [<ffffffff89863934>] do_vfs_ioctl+0x204/0x5b0
      [  116.793343]  [<ffffffff89863d81>] SyS_ioctl+0xa1/0xc0
      [  116.793356]  [<ffffffff89daaec9>] ? system_call_after_swapgs+0x96/0x13a
      [  116.793366]  [<ffffffff89daaf92>] system_call_fastpath+0x25/0x2a
      [  116.793379]  [<ffffffff89daaed5>] ? system_call_after_swapgs+0xa2/0x13a
      [  116.793388] ---[ end trace d89bc4ba123f5eec ]---
      [  117.442039] ------------[ cut here ]------------
      [  117.442047] WARNING: CPU: 1 PID: 13475 at lib/list_debug.c:62 __list_del_entry+0x82/0xd0
      [  117.442049] list_del corruption. next->prev should be ffff8848a584f1a8, but was a5ffff8848a584f1
      [  117.442050] Modules linked in: loop zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) joydev libcfs(OE) dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd cuse grace fuse fscache sunrpc ext4 mbcache jbd2 ppdev iosf_mbi crc32_pclmul ghash_clmulni_intel snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg pcspkr parport_pc snd_timer vboxguest(OE) snd parport video soundcore i2c_piix4 binfmt_misc ip_tables xfs libcrc32c sr_mod
      [  117.442116]  cdrom sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix ahci libahci crct10dif_pclmul crct10dif_common crc32c_intel serio_raw e1000 libata drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod
      [  117.442139] CPU: 1 PID: 13475 Comm: fiemap_test Kdump: loaded Tainted: P        W  OE  ------------   3.10.0-1160.59.1.el7.centos.plus.x86_64 #1
      [  117.442141] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  117.442143] Call Trace:
      [  117.442149]  [<ffffffff89d975b9>] dump_stack+0x19/0x1b
      [  117.442152]  [<ffffffff8969b278>] __warn+0xd8/0x100
      [  117.442155]  [<ffffffff8969b2ff>] warn_slowpath_fmt+0x5f/0x80
      [  117.442174]  [<ffffffffc092ffff>] ? lnet_rtrpools_alloc+0x17f/0x320 [lnet]
      [  117.442177]  [<ffffffff899b74f2>] __list_del_entry+0x82/0xd0
      [  117.442187]  [<ffffffffc0896185>] cfs_tage_to_tail+0x25/0x80 [libcfs]
      [  117.442195]  [<ffffffffc0896abd>] libcfs_debug_msg+0x8dd/0xac0 [libcfs]
      [  117.442199]  [<ffffffff899a351b>] ? string.isra.7+0x3b/0xf0
      [  117.442255]  [<ffffffffc0d55040>] ? lock_matches+0x230/0x230 [ptlrpc]
      [  117.442300]  [<ffffffffc0d51cc7>] _ldlm_lock_debug+0x647/0x830 [ptlrpc]
      [  117.442345]  [<ffffffffc0d5358d>] ? ldlm_lock_remove_from_lru_nolock+0x3d/0xe0 [ptlrpc]
      [  117.442388]  [<ffffffffc0d55040>] ? lock_matches+0x230/0x230 [ptlrpc]
      [  117.442432]  [<ffffffffc0d54d10>] ldlm_lock_addref_internal_nolock+0x80/0x100 [ptlrpc]
      [  117.442473]  [<ffffffffc0d5503b>] lock_matches+0x22b/0x230 [ptlrpc]
      [  117.442514]  [<ffffffffc0d5508e>] itree_overlap_cb+0x4e/0x70 [ptlrpc]
      [  117.442568]  [<ffffffffc0a7ae3b>] interval_search+0x8b/0x220 [obdclass]
      [  117.442653]  [<ffffffffc0d51534>] search_itree+0x94/0xd0 [ptlrpc]
      [  117.442698]  [<ffffffffc0d5612f>] ldlm_lock_match_with_skip+0x29f/0x9a0 [ptlrpc]
      [  117.442704]  [<ffffffff899a4c64>] ? vsnprintf+0x234/0x6a0
      [  117.442707]  [<ffffffff899a4c64>] ? vsnprintf+0x234/0x6a0
      [  117.442719]  [<ffffffffc0fa0fbd>] osc_object_fiemap+0x15d/0x6a0 [osc]
      [  117.442746]  [<ffffffffc0a66313>] cl_object_fiemap+0x73/0x160 [obdclass]
      [  117.442758]  [<ffffffffc10324f0>] lov_object_fiemap+0x1300/0x18f0 [lov]
      [  117.442792]  [<ffffffffc1701c50>] ? vvp_io_fini+0x410/0x710 [lustre]
      [  117.442832]  [<ffffffffc0a66313>] cl_object_fiemap+0x73/0x160 [obdclass]
      [  117.442847]  [<ffffffffc169cb5c>] ll_do_fiemap+0x2bc/0x390 [lustre]
      [  117.442862]  [<ffffffffc169d057>] ll_fiemap+0x427/0x5f0 [lustre]
      [  117.442867]  [<ffffffff89863934>] do_vfs_ioctl+0x204/0x5b0
      [  117.442870]  [<ffffffff89863d81>] SyS_ioctl+0xa1/0xc0
      [  117.442875]  [<ffffffff89daaec9>] ? system_call_after_swapgs+0x96/0x13a
      [  117.442881]  [<ffffffff89daaf92>] system_call_fastpath+0x25/0x2a
      [  117.442885]  [<ffffffff89daaed5>] ? system_call_after_swapgs+0xa2/0x13a
      [  117.442887] ---[ end trace d89bc4ba123f5eed ]---
      [  118.280802] general protection fault: 0000 [#1] SMP 
      [  118.280822] Modules linked in: loop zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zcommon(POE) znvpair(POE) zavl(POE) icp(POE) spl(OE) lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) joydev libcfs(OE) dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd cuse grace fuse fscache sunrpc ext4 mbcache jbd2 ppdev iosf_mbi crc32_pclmul ghash_clmulni_intel snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg pcspkr parport_pc snd_timer vboxguest(OE) snd parport video soundcore i2c_piix4 binfmt_misc ip_tables xfs libcrc32c sr_mod
      [  118.281074]  cdrom sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix ahci libahci crct10dif_pclmul crct10dif_common crc32c_intel serio_raw e1000 libata drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod
      [  118.281166] CPU: 1 PID: 13478 Comm: abrt-server Kdump: loaded Tainted: P        W  OE  ------------   3.10.0-1160.59.1.el7.centos.plus.x86_64 #1
      [  118.281195] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [  118.281214] task: ffff8848b767a100 ti: ffff8848b8b60000 task.ti: ffff8848b8b60000
      [  118.281231] RIP: 0010:[<ffffffffc04519e0>]  [<ffffffffc04519e0>] xfs_trans_buf_item_match+0x60/0xa0 [xfs]
      [  118.281271] RSP: 0018:ffff8848b8b63948  EFLAGS: 00010212
      [  118.281284] RAX: 60ffff8848b8b873 RBX: ffff8848ba1181b0 RCX: ffff88487633f5c1
      [  118.281300] RDX: ffff8848b8b639b8 RSI: ffff8848b69e93c0 RDI: ffff8848ba118260
      [  118.281317] RBP: ffff8848b8b63948 R08: 0000000000000008 R09: ffff8848b59f1f28
      [  118.281333] R10: 0000000002c37820 R11: 0000000000000001 R12: ffff8848b6ab9000
      [  118.281349] R13: ffff8848b8b63a00 R14: ffff8848b8b639b8 R15: ffff8848b69e93c0
      [  118.281366] FS:  00007fa5e1243900(0000) GS:ffff8848bfd00000(0000) knlGS:0000000000000000
      [  118.281384] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  118.281399] CR2: 00007fa5e126f000 CR3: 000000003bece000 CR4: 00000000000606e0
      [  118.281438] Call Trace:
      [  118.281460]  [<ffffffffc0451da2>] xfs_trans_read_buf_map+0x52/0x2c0 [xfs]
      [  118.281485]  [<ffffffffc03f6934>] xfs_btree_read_buf_block.constprop.33+0xa4/0xe0 [xfs]
      [  118.281513]  [<ffffffffc03faa45>] xfs_btree_lookup_get_block+0x95/0x1a0 [xfs]
      [  118.281543]  [<ffffffffc03facff>] xfs_btree_lookup+0xdf/0x420 [xfs]
      [  118.282076]  [<ffffffffc03df60b>] xfs_alloc_lookup_eq+0x1b/0x20 [xfs]
      [  118.282591]  [<ffffffffc03e0ed8>] xfs_free_ag_extent+0x278/0x780 [xfs]
      [  118.283094]  [<ffffffffc03e33da>] xfs_free_extent+0xaa/0x140 [xfs]
      [  118.283604]  [<ffffffffc045272a>] xfs_trans_free_extent+0x4a/0x100 [xfs]
      [  118.284103]  [<ffffffffc04527fe>] xfs_extent_free_finish_item+0x1e/0x40 [xfs]
      [  118.284596]  [<ffffffffc0401738>] xfs_defer_finish+0x128/0x3d0 [xfs]
      [  118.285077]  [<ffffffffc0434cf5>] xfs_itruncate_extents+0xf5/0x220 [xfs]
      [  118.285553]  [<ffffffffc0434ed7>] xfs_inactive_truncate+0xb7/0x110 [xfs]
      [  118.286014]  [<ffffffffc0435528>] xfs_inactive+0x108/0x130 [xfs]
      [  118.286463]  [<ffffffffc043cb15>] xfs_fs_destroy_inode+0x95/0x190 [xfs]
      [  118.286899]  [<ffffffff8986c85b>] destroy_inode+0x3b/0x60
      [  118.287317]  [<ffffffff8986c995>] evict+0x115/0x180
      [  118.287727]  [<ffffffff8986cd6c>] iput+0xfc/0x190
      [  118.288120]  [<ffffffff89860b3e>] do_unlinkat+0x1ae/0x2d0
      [  118.288511]  [<ffffffff89daaed5>] ? system_call_after_swapgs+0xa2/0x13a
      [  118.288892]  [<ffffffff89daaec9>] ? system_call_after_swapgs+0x96/0x13a
      [  118.289254]  [<ffffffff89daaed5>] ? system_call_after_swapgs+0xa2/0x13a
      [  118.289607]  [<ffffffff89daaec9>] ? system_call_after_swapgs+0x96/0x13a
      [  118.289944]  [<ffffffff89daaed5>] ? system_call_after_swapgs+0xa2/0x13a
      [  118.290281]  [<ffffffff89daaec9>] ? system_call_after_swapgs+0x96/0x13a
      [  118.290601]  [<ffffffff89861bbb>] SyS_unlinkat+0x1b/0x40
      [  118.290900]  [<ffffffff89daaf92>] system_call_fastpath+0x25/0x2a
      [  118.291189]  [<ffffffff89daaed5>] ? system_call_after_swapgs+0xa2/0x13a
      [  118.291476] Code: 48 8b 87 b0 00 00 00 48 81 c7 b0 00 00 00 48 39 c7 48 8d 48 f8 75 11 eb 42 66 90 48 8b 41 08 48 39 c7 48 8d 48 f8 74 33 48 8b 01 <81> 78 30 3c 12 00 00 75 e7 48 8b 80 88 00 00 00 48 39 b0 98 00 
      [  118.292459] RIP  [<ffffffffc04519e0>] xfs_trans_buf_item_match+0x60/0xa0 [xfs]
      

      It seems that "LU-16480 lov: fiemap improperly handles fm_extent_count=0" did not fix all the cases.

            People

              eaujames Etienne Aujames
              eaujames Etienne Aujames