Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.10.0

    Description

      Ever since the first batch of landings in 2.8, I have been seeing racer clients lock up on unmount.

      Typical trace:

      [86663.285011] umount        D 0000000000000001 11016 14155  14148 0x00000000
      [86663.285011]  ffff88008a16fd88 0000000000000086 ffff88008a16fd50 ffff88008a16fd4c
      [86663.285011]  ffff88008a16fcf8 ffff8800bcc23280 00004ed2e0f1ebbc ffff880006315c00
      [86663.285011]  0000000000000000 0000000101497763 ffff880081c0e6b8 ffff88008a16ffd8
      [86663.285011] Call Trace:
      [86663.285011]  [<ffffffff8152d911>] schedule_timeout+0x191/0x2e0
      [86663.285011]  [<ffffffff81088290>] ? process_timeout+0x0/0x10
      [86663.285011]  [<ffffffffa0fa4961>] ll_kill_super+0x91/0x180 [lustre]
      [86663.285011]  [<ffffffffa05d94c2>] lustre_kill_super+0x42/0x60 [obdclass]
      [86663.285011]  [<ffffffff81195ed7>] deactivate_super+0x57/0x80
      [86663.285011]  [<ffffffff811b5e2f>] mntput_no_expire+0xbf/0x110
      [86663.285011]  [<ffffffff811b699b>] sys_umount+0x7b/0x3a0
      [86663.285011]  [<ffffffff8108e64d>] ? sigprocmask+0x8d/0x110
      [86663.285011]  [<ffffffff8100b112>] system_call_fastpath+0x16/0x1b
      
      (gdb) l *(ll_kill_super+0x91)
      0x24991 is in ll_kill_super (/home/green/bk/linux-2.6.32-573.3.1.el6-debug/arch/x86/include/asm/atomic_64.h:23).
      18	 *
      19	 * Atomically reads the value of @v.
      20	 */
      21	static inline int atomic_read(const atomic_t *v)
      22	{
      23		return v->counter;
      24	}
      25	
      26	/**
      27	 * atomic_set - set atomic variable
      (gdb) l *(ll_kill_super+0x8f)
      0x2498f is in ll_kill_super (/home/green/git/lustre-release/lustre/llite/llite_lib.c:787).
      782			sbi->ll_umounting = 1;
      783	
      784			/* wait running statahead threads to quit */
      785			while (atomic_read(&sbi->ll_sa_running) > 0) {
      786				set_current_state(TASK_UNINTERRUPTIBLE);
      787				schedule_timeout(msecs_to_jiffies(MSEC_PER_SEC >> 3));
      788			}
      789		}
      790	
      791		EXIT;
      

      Initially it looked like the cause was this patch: http://review.whamcloud.com/18475

      But I am not sure.

      Whatever the trigger is, it seems to be exposing some sort of reference leak in the statahead code.
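
The wait loop in the listing above exits only when ll_sa_running drops to zero, so a single reference that is never released keeps umount spinning. A minimal user-space sketch of that pattern follows, with a bounded wait so that a leak shows up as a timeout instead of a hang. It is not the Lustre code: struct sb_info, sb_init(), sa_thread_enter(), sa_thread_exit() and wait_sa_quit() are hypothetical names, and C11 atomics with nanosleep() stand in for the kernel primitives.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical user-space stand-in for the per-mount state in the trace. */
struct sb_info {
	atomic_int  sa_running;  /* models ll_sa_running */
	atomic_bool umounting;   /* models ll_umounting */
};

static void sb_init(struct sb_info *sbi)
{
	atomic_init(&sbi->sa_running, 0);
	atomic_init(&sbi->umounting, false);
}

/* A statahead-like worker takes a reference when it starts... */
static void sa_thread_enter(struct sb_info *sbi)
{
	atomic_fetch_add(&sbi->sa_running, 1);
}

/* ...and has to drop it on every exit path. */
static void sa_thread_exit(struct sb_info *sbi)
{
	int old = atomic_fetch_sub(&sbi->sa_running, 1);

	assert(old > 0);  /* catches an unbalanced put */
	(void)old;
}

/*
 * Bounded variant of the umount wait loop: poll every 125 ms
 * (the same interval as MSEC_PER_SEC >> 3) for at most max_polls rounds.
 * Returns true if every worker has quit, false if a reference is still held.
 */
static bool wait_sa_quit(struct sb_info *sbi, int max_polls)
{
	const struct timespec tick = { .tv_sec = 0, .tv_nsec = 125 * 1000 * 1000 };

	atomic_store(&sbi->umounting, true);
	for (int i = 0; i < max_polls; i++) {
		if (atomic_load(&sbi->sa_running) <= 0)
			return true;
		nanosleep(&tick, NULL);
	}
	return atomic_load(&sbi->sa_running) <= 0;
}
```

Calling sa_thread_enter() without a matching sa_thread_exit() makes wait_sa_quit() return false after the polling budget is used up, which is the user-space equivalent of the stuck umount in the trace.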

        Activity

          [LU-7994] statahead loop in umount
          pjones Peter Jones added a comment -

          Landed for 2.10

          gerrit Gerrit Updater added a comment -

          Oleg Drokin (oleg.drokin@intel.com) merged in patch https://review.whamcloud.com/23040/
          Subject: LU-7994 statahead: add smp_mb() to serialize ops
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: c5c8d0623a60ee54bd11588a391b3dcd43c1abcf
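
The patch subject above refers to smp_mb(), the kernel's full memory barrier. The patch itself is not reproduced in this ticket, so what follows is only a generic user-space sketch of the store-then-load ordering that such a barrier provides, with C11 atomic_thread_fence(memory_order_seq_cst) as a stand-in. The names flag_a, flag_b, side_a(), side_b() and run_round() are hypothetical and do not come from the Lustre source.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int flag_a, flag_b;  /* hypothetical shared flags */
static int saw_a, saw_b;           /* what each side observed; read after join */

/* Side A publishes its flag, then checks the other side's flag. */
static void *side_a(void *arg)
{
	(void)arg;
	atomic_store_explicit(&flag_a, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);  /* plays the role of smp_mb() */
	saw_b = atomic_load_explicit(&flag_b, memory_order_relaxed);
	return NULL;
}

/* Side B does the mirror image. */
static void *side_b(void *arg)
{
	(void)arg;
	atomic_store_explicit(&flag_b, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);  /* plays the role of smp_mb() */
	saw_a = atomic_load_explicit(&flag_a, memory_order_relaxed);
	return NULL;
}

/* Run both sides once; returns 1 if at least one side saw the other's store. */
static int run_round(void)
{
	pthread_t ta, tb;
	int rc;

	atomic_store(&flag_a, 0);
	atomic_store(&flag_b, 0);

	rc = pthread_create(&ta, NULL, side_a, NULL);
	assert(rc == 0);
	rc = pthread_create(&tb, NULL, side_b, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_join(ta, NULL);
	pthread_join(tb, NULL);
	return saw_a || saw_b;
}
```

With the fence on both sides, the two threads cannot both miss each other's store. Without it, the store and the following load may be reordered, and each side can act on a stale view of the other.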

          gerrit Gerrit Updater added a comment -

          Lai Siyao (lai.siyao@intel.com) uploaded a new patch: http://review.whamcloud.com/23040
          Subject: LU-7994 statahead: add smp_mb() to serialize ops
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: a939946e936f3ada6c8c6d39dcd25aaf60e366d0

          jgmitter Joseph Gmitter (Inactive) added a comment -

          Hi Lai,

          Can you please have a look into this issue?

          Thanks.
          Joe

          People

            Assignee: laisiyao Lai Siyao
            Reporter: green Oleg Drokin