[LU-8508] kernel:LustreError: 3842:0:(lu_object.c:1243:lu_device_fini()) ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Fix Version/s: Lustre 2.9.0
    • Affects Version/s: Lustre 2.8.0
    • Labels: None
    • Severity: 3

    Description

      Lustre DNE2 testing: I noticed an issue with the latest master builds. When mounting storage targets on servers other than the one hosting the MGT, I get a kernel panic with the output below. To the best of my ability, I have validated that this is not a network problem. I have also tried an FE build, which works, and another master build (3419), which also works:

       
      [root@zlfs2-oss1 ~]# mount -vvv -t lustre /dev/nvme0n1 /mnt/MDT0000
      arg[0] = /sbin/mount.lustre
      arg[1] = -v
      arg[2] = -o
      arg[3] = rw
      arg[4] = /dev/nvme0n1
      arg[5] = /mnt/MDT0000
      source = /dev/nvme0n1 (/dev/nvme0n1), target = /mnt/MDT0000
      options = rw
      checking for existing Lustre data: found
      Reading CONFIGS/mountdata
      Writing CONFIGS/mountdata
      mounting device /dev/nvme0n1 at /mnt/MDT0000, flags=0x1000000 options=osd=osd-ldiskfs,user_xattr,errors=remount-ro,mgsnode=192.168.5.21@o2ib,virgin,update,param=mgsnode=192.168.5.21@o2ib,svname=zlfs2-MDT0000,device=/dev/nvme0n1
      mount.lustre: cannot parse scheduler options for '/sys/block/nvme0n1/queue/scheduler'
      
      Message from syslogd@zlfs2-oss1 at Aug 16 21:52:33 ...
       kernel:LustreError: 3842:0:(lu_object.c:1243:lu_device_fini()) ASSERTION( atomic_read(&d->ld_ref) == 0 ) failed: Refcount is 1
      
      Message from syslogd@zlfs2-oss1 at Aug 16 21:52:33 ...
       kernel:LustreError: 3842:0:(lu_object.c:1243:lu_device_fini()) LBUG
      
      Message from syslogd@zlfs2-oss1 at Aug 16 21:52:33 ...
       kernel:Kernel panic - not syncing: LBUG
      

      Attached is some debugging output with more information.

      Builds Tried:
      master b3424 - issues
      master b3423 - issues
      master b3420 - issues
      master b3419 - works
      fe 2.8 b18 - works
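
      For readers unfamiliar with the assertion in the panic above: the message indicates that lu_device_fini() expects the device reference count (ld_ref) to be zero at teardown and found 1, meaning some reference was still held. The following is a minimal userspace sketch of that general pattern, not Lustre's code. The names demo_device, demo_device_get, demo_device_put, and demo_device_fini are hypothetical, and unlike the kernel path (which hits LBUG and panics) this sketch simply returns an error.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-in for a refcounted device; NOT Lustre's struct lu_device. */
struct demo_device {
    atomic_int ref; /* number of references currently held by users */
};

static void demo_device_init(struct demo_device *d)
{
    assert(d != NULL);
    atomic_init(&d->ref, 0);
}

/* Take a reference; every get must be balanced by a put before fini. */
static void demo_device_get(struct demo_device *d)
{
    assert(d != NULL);
    atomic_fetch_add(&d->ref, 1);
}

/* Drop a reference previously taken with demo_device_get(). */
static void demo_device_put(struct demo_device *d)
{
    assert(d != NULL);
    atomic_fetch_sub(&d->ref, 1);
}

/*
 * Tear-down check: returns 0 when no references remain, or -1 (after
 * logging the leaked count) when a reference is still held. The kernel
 * code in the report asserts instead of returning an error.
 */
static int demo_device_fini(struct demo_device *d)
{
    assert(d != NULL);
    int refs = atomic_load(&d->ref);
    if (refs != 0) {
        fprintf(stderr, "demo_device_fini: Refcount is %d\n", refs);
        return -1;
    }
    return 0;
}
```

      In this sketch, calling demo_device_fini() after one demo_device_get() without a matching demo_device_put() reproduces the "Refcount is 1" condition. Which code path leaks the reference in the failing master builds is not established by this report.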

          Activity

            standan Saurabh Tandan (Inactive) made changes -
            Remote Link New: This issue links to "Page (HPDD Community Wiki)" [ 18365 ]
            pjones Peter Jones made changes -
            Resolution New: Fixed [ 1 ]
            Status Original: In Progress [ 3 ] New: Resolved [ 5 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is duplicated by LU-8611 [ LU-8611 ]
            pjones Peter Jones made changes -
            Assignee Original: nasf [ yong.fan ] New: Kit Westneat [ kit.westneat ]
            simmonsja James A Simmons made changes -
            Link New: This issue is related to LU-3291 [ LU-3291 ]
            yong.fan nasf (Inactive) made changes -
            Status Original: Open [ 1 ] New: In Progress [ 3 ]
            adam.j.roe Adam Roe (Inactive) made changes -
            Description: added "master b3420 - issues" to Builds Tried; the rest of the text is unchanged.
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.9.0 [ 11891 ]
            Fix Version/s Original: Lustre 2.5.5 [ 11394 ]
            pjones Peter Jones made changes -
            Priority Original: Minor [ 4 ] New: Critical [ 2 ]
            pjones Peter Jones made changes -
            Fix Version/s New: Lustre 2.5.5 [ 11394 ]

            People

              Assignee: kit.westneat Kit Westneat (Inactive)
              Reporter: adam.j.roe Adam Roe (Inactive)
              Votes: 0
              Watchers: 6
