Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-15081

set_nlink() can race with itself corrupting s_remove_count on the client

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.15.0
    • None
    • 3
    • 9223372036854775807

    Description

      Mar  3 16:08:57 c-lmo161 kernel: ------------[ cut here ]------------
      Mar  3 16:08:57 c-lmo161 kernel: WARNING: CPU: 5 PID: 195090 at fs/inode.c:241 __destroy_inode+0xdb/0xf0
      Mar  3 16:08:57 c-lmo161 kernel: Modules linked in: mgc(OE) lustre(OE) lmv(OE) mdc(OE) fid(OE) lov(OE) fld(OE) osc(OE) ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache mlx5_ib(OE) mlx5_core(OE) mlxfw sunrpc ib_isert(OE) ib_srpt(OE) ib_srp(OE) ib_ucm(OE) sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel iTCO_wdt iTCO_vendor_support lrw gf128mul glue_helper ablk_helper cryptd rpcrdma(OE) svcrdma(OE) xprtrdma(OE) ib_umad(OE) rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_iser(OE) ib_ipoib(OE) ib_cm(OE) pcspkr joydev lpc_ich mei_me sg i2c_i801 mei wmi ioatdma ipmi_si ipmi_devintf ipmi_msghandler ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mlx4_ib(OE) ib_uverbs(OE) mlx4_en(OE)
      Mar  3 16:08:57 c-lmo161 kernel: ib_core(OE) mgag200 mlx4_core(OE) drm_kms_helper syscopyarea isci sysfillrect igb crct10dif_pclmul crct10dif_common sysimgblt fb_sys_fops ahci crc32c_intel ttm libsas libahci scsi_transport_sas ptp drm pps_core libata dca mlx_compat(OE) drm_panel_orientation_quirks i2c_algo_bit devlink dm_mirror dm_region_hash dm_log dm_mod
      Mar  3 16:08:57 c-lmo161 kernel: CPU: 5 PID: 195090 Comm: mdtest Tainted: G        W  OE  ------------   3.10.0-957.5.1.el7.x86_64 #1
      Mar  3 16:08:57 c-lmo161 kernel: Hardware name: Intel Corporation S2600JF/S2600JF, BIOS SE5C600.86B.02.04.0003.102320141138 10/23/2014
      Mar  3 16:08:57 c-lmo161 kernel: Call Trace:
      Mar  3 16:08:57 c-lmo161 kernel: [<ffffffff9ff61e41>] dump_stack+0x19/0x1b
      Mar  3 16:08:57 c-lmo161 kernel: [<ffffffff9f897688>] __warn+0xd8/0x100
      Mar  3 16:08:57 c-lmo161 kernel: [<ffffffff9f8977cd>] warn_slowpath_null+0x1d/0x20
      Mar  3 16:08:57 c-lmo161 kernel: [<ffffffff9fa5ecbb>] __destroy_inode+0xdb/0xf0
      Mar  3 16:08:57 c-lmo161 kernel: [<ffffffff9fa5ecf2>] destroy_inode+0x22/0x60
      Mar  3 16:08:57 c-lmo161 kernel: [<ffffffff9fa5ee45>] evict+0x115/0x180
      Mar  3 16:08:57 c-lmo161 kernel: [<ffffffff9fa5f6ec>] iput+0xfc/0x190
      Mar  3 16:08:57 c-lmo161 kernel: [<ffffffff9fa5385e>] do_unlinkat+0x1ae/0x2d0
      Mar  3 16:08:57 c-lmo161 kernel: [<ffffffff9fa54916>] SyS_unlink+0x16/0x20
      Mar  3 16:08:57 c-lmo161 kernel: [<ffffffff9ff74ddb>] system_call_fastpath+0x22/0x27
      Mar  3 16:08:57 c-lmo161 kernel: ---[ end trace f53c1dca536ee43c ]---
      

      Despite the fact that s_remove_counter itself is an atomic, set_nlink() handling is not atomic. Two racing set_nlink()'s, depending on the race, will lead to one of the two nlink values. However, this race can corrupt s_remove_counter which will lead to spam in the kernel logs.

      Attachments

        Activity

          [LU-15081] set_nlink() can race with itself corrupting s_remove_count on the client
          pjones Peter Jones added a comment -

          Landed for 2.15

          pjones Peter Jones added a comment - Landed for 2.15

          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45191/
          Subject: LU-15081 vfs: set_nlink() is not race-safe
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 12b05772fdb6d080819b6c213fcd7f8705278412

          gerrit Gerrit Updater added a comment - "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45191/ Subject: LU-15081 vfs: set_nlink() is not race-safe Project: fs/lustre-release Branch: master Current Patch Set: Commit: 12b05772fdb6d080819b6c213fcd7f8705278412

          Although we did not capture the logs for this issue in the field. I believe, a possible scenario is dropping n_link to zero in the context of ll_unlink() and then raising it to 1 in some getattr context as a result of delayed getattr processing:

          void set_nlink(struct inode *inode, unsigned int nlink)
          {
                  if (!nlink) {
                          clear_nlink(inode);
                  } else {
                          /* Yes, some filesystems do change nlink from zero to one */
                          if (inode->i_nlink == 0)
                                  atomic_long_dec(&inode->i_sb->s_remove_count);
          
                          inode->__i_nlink = nlink;
                  }
          }
          EXPORT_SYMBOL(set_nlink);
          

          If two threads are setting nlink to 1 and simultaneously processing "if (inode->i_nlink == 0)", then both of them can drop s_remove_count while it was incremented only once when nlink initially dropped to zero in ll_unlink().

          panda Andrew Perepechko added a comment - Although we did not capture the logs for this issue in the field. I believe, a possible scenario is dropping n_link to zero in the context of ll_unlink() and then raising it to 1 in some getattr context as a result of delayed getattr processing: void set_nlink(struct inode *inode, unsigned int nlink) { if (!nlink) { clear_nlink(inode); } else { /* Yes, some filesystems do change nlink from zero to one */ if (inode->i_nlink == 0) atomic_long_dec(&inode->i_sb->s_remove_count); inode->__i_nlink = nlink; } } EXPORT_SYMBOL(set_nlink); If two threads are setting nlink to 1 and simultaneously processing "if (inode->i_nlink == 0)", then both of them can drop s_remove_count while it was incremented only once when nlink initially dropped to zero in ll_unlink().

          "Andrew Perepechko <andrew.perepechko@hpe.com>" uploaded a new patch: https://review.whamcloud.com/45191
          Subject: LU-15081 vfs: set_nlink() is not race-safe
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 0cf621e17c0d035ff33edf56035bce5ecb8beef2

          gerrit Gerrit Updater added a comment - "Andrew Perepechko <andrew.perepechko@hpe.com>" uploaded a new patch: https://review.whamcloud.com/45191 Subject: LU-15081 vfs: set_nlink() is not race-safe Project: fs/lustre-release Branch: master Current Patch Set: 1 Commit: 0cf621e17c0d035ff33edf56035bce5ecb8beef2

          A patch will be uploaded shortly.

          panda Andrew Perepechko added a comment - A patch will be uploaded shortly.

          People

            panda Andrew Perepechko
            panda Andrew Perepechko
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: