
LU-15586: ZFS VERIFY3(sa.sa_magic == SA_MAGIC) failed


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: Lustre 2.14.0
    • Environment: CentOS 8.5, kernel 4.18.0-348.2.1.el8_5.x86_64,
      ZFS 2.1.0/2.1.3-staging,
      Lustre 2.14 (latest b2_14 build)
    • Severity: 3

    Description

      On one of our pre-production file systems, which is used for testing some future functionality, we run Lustre 2.14 with ZFS 2.1.0, as we use dRAID on the OSS nodes. The same OS/ZFS/Lustre stack is used on the MDS nodes, but there the setup is basic: a single vdev consisting of one LUN from an all-flash array:

      ~# zpool status -Lv
        pool: mdt0-bkp
       state: ONLINE
      config:

              NAME        STATE     READ WRITE CKSUM
              mdt0-bkp    ONLINE       0     0     0
                dm-2      ONLINE       0     0     0 
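
      For completeness, a pool and MDT dataset like this would typically be
      created along the following lines. This is only a sketch: the fsname,
      index and MGS NID are illustrative, and only the pool and device names
      come from the zpool status output above.

      ~# zpool create mdt0-bkp dm-2
      ~# mkfs.lustre --mdt --backfstype=zfs --fsname=ascratch --index=0 \
           --mgsnode=<mgs_nid> mdt0-bkp/mdt0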

      The system had been running in the pre-production environment for a few months without any issues. Last week, however, we observed a problem that started from a single event on the MDS causing:

      kernel: [16066.783270] list_del corruption, ffff9ed0cc31e028->next is LIST_POISON1 (dead000000000100)-
      kernel: [16066.783679] kernel BUG at lib/list_debug.c:47!

      After rebooting the MDS server and running the workload for some time, we observed:

      kernel: VERIFY3(sa.sa_magic == SA_MAGIC) failed (8 == 3100762)
      kernel: PANIC at zfs_quota.c:89:zpl_get_file_info() 

      The full trace looks like this:

      Feb 16 21:51:53 ascratch-mds01 kernel: VERIFY3(sa.sa_magic == SA_MAGIC) failed (8 == 3100762)
      Feb 16 21:51:53 ascratch-mds01 kernel: PANIC at zfs_quota.c:89:zpl_get_file_info()
      Feb 16 21:51:53 ascratch-mds01 kernel: Showing stack for process 30151
      Feb 16 21:51:53 ascratch-mds01 kernel: CPU: 8 PID: 30151 Comm: mdt00_096 Tainted: P          IOE    --------- -  - 4.18.0-348.2.1.el8_5.x86_64 #1
      Feb 16 21:51:53 ascratch-mds01 kernel: Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 7.99 03/11/2021
      Feb 16 21:51:53 ascratch-mds01 kernel: Call Trace:
      Feb 16 21:51:53 ascratch-mds01 kernel: dump_stack+0x5c/0x80
      Feb 16 21:51:53 ascratch-mds01 kernel: spl_panic+0xd3/0xfb [spl]
      Feb 16 21:51:53 ascratch-mds01 kernel: ? sg_init_table+0x11/0x30
      Feb 16 21:51:54 ascratch-mds01 kernel: ? __sg_alloc_table+0x6e/0x170
      Feb 16 21:51:54 ascratch-mds01 kernel: ? sg_alloc_table+0x1f/0x50
      Feb 16 21:51:54 ascratch-mds01 kernel: ? sg_init_one+0x80/0x80
      Feb 16 21:51:54 ascratch-mds01 kernel: ? _cond_resched+0x15/0x30
      Feb 16 21:51:54 ascratch-mds01 kernel: ? _cond_resched+0x15/0x30
      Feb 16 21:51:54 ascratch-mds01 kernel: ? mutex_lock+0xe/0x30
      Feb 16 21:51:54 ascratch-mds01 kernel: ? spl_kmem_cache_alloc+0x5d/0x160 [spl]
      Feb 16 21:51:54 ascratch-mds01 kernel: ? dbuf_rele_and_unlock+0x13d/0x6a0 [zfs]
      Feb 16 21:51:54 ascratch-mds01 kernel: ? kmem_cache_alloc+0x12e/0x270
      Feb 16 21:51:54 ascratch-mds01 kernel: ? __cv_init+0x3d/0x60 [spl]
      Feb 16 21:51:54 ascratch-mds01 kernel: zpl_get_file_info+0x1ea/0x230 [zfs]
      Feb 16 21:51:54 ascratch-mds01 kernel: dmu_objset_userquota_get_ids+0x1f8/0x480 [zfs]
      Feb 16 21:51:54 ascratch-mds01 kernel: dnode_setdirty+0x2f/0xe0 [zfs]
      Feb 16 21:51:54 ascratch-mds01 kernel: dnode_allocate+0x11d/0x180 [zfs]
      Feb 16 21:51:54 ascratch-mds01 kernel: dmu_object_alloc_impl+0x32c/0x3c0 [zfs]
      Feb 16 21:51:54 ascratch-mds01 kernel: dmu_object_alloc_dnsize+0x1c/0x30 [zfs]
      Feb 16 21:51:54 ascratch-mds01 kernel: __osd_object_create+0x78/0x120 [osd_zfs]
      Feb 16 21:51:54 ascratch-mds01 kernel: osd_mkreg+0x98/0x250 [osd_zfs]
      Feb 16 21:51:54 ascratch-mds01 kernel: ? __osd_xattr_declare_set+0x190/0x260 [osd_zfs]
      Feb 16 21:51:54 ascratch-mds01 kernel: osd_create+0x2c6/0xc90 [osd_zfs]
      Feb 16 21:51:54 ascratch-mds01 kernel: ? __kmalloc_node+0x10e/0x2f0
      Feb 16 21:51:54 ascratch-mds01 kernel: lod_sub_create+0x244/0x4a0 [lod]
      Feb 16 21:51:54 ascratch-mds01 kernel: lod_create+0x4b/0x330 [lod] 
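
      For what it is worth, the two values in the VERIFY3 line can be decoded
      as follows (our interpretation): 3100762 is 0x2F505A, i.e. SA_MAGIC, the
      magic value expected at the start of a system-attribute (SA) header,
      while 8 is what zpl_get_file_info() actually found in the dnode's bonus
      buffer when reading the owner/group/project IDs for quota accounting.

      ~# printf '0x%X\n' 3100762
      0x2F505A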

      Originally the issue appeared after running a few hundred ADF simulations in parallel. The issue became more frequent over time (at first it took hours to trigger, which changed to minutes after a few crashes), and eventually we were able to trigger the problem with a simple untar of the Linux kernel on the Lustre filesystem.

      Interestingly, even though such an untar would crash the system within seconds when run over Lustre, the same test run directly against the MDT dataset, mounted as a regular ZFS filesystem (canmount=on) on the MDS server, did not crash it.
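
      The reproducer itself is trivial; roughly the following (the paths, the
      dataset name under the mdt0-bkp pool and the kernel tarball version are
      illustrative):

      # untarring a kernel tree from a Lustre client was enough to hit the
      # panic within seconds
      ~# cd /lustre/ascratch/test
      ~# tar xf /tmp/linux-5.16.tar.xz

      # the same untar run directly on the MDS against the MDT dataset,
      # temporarily mounted as a regular ZFS filesystem, did not crash
      ~# zfs set canmount=on mdt0-bkp/mdt0
      ~# zfs mount mdt0-bkp/mdt0
      ~# cd /mdt0-bkp/mdt0 && tar xf /tmp/linux-5.16.tar.xz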

      As a mitigation, a zpool scrub was invoked, but it detected no errors. We then decided to restore the MDT via zfs send/recv to a fresh pool, created with dnodesize and xattr changed from the defaults (legacy, on) to dnodesize=auto and xattr=sa. This solved the problem for a few hours, after which it came back.
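
      Roughly, that recovery attempt looked like the following (the snapshot,
      dataset and target device names are illustrative):

      # scrub completed without finding any errors
      ~# zpool scrub mdt0-bkp
      ~# zpool status mdt0-bkp

      # copy the MDT to a fresh pool created with non-default dnodesize/xattr
      ~# zpool create -O dnodesize=auto -O xattr=sa mdt0-new dm-3
      ~# zfs snapshot mdt0-bkp/mdt0@restore
      ~# zfs send -p mdt0-bkp/mdt0@restore | zfs recv mdt0-new/mdt0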

      We don't see any hardware issues on the system, either at the disk level or at the server level (no ECC errors, etc.), so presumably this is a bug somewhere at the Lustre/ZFS level.

      The only operation done at the Lustre level a few days before the issue appeared was tagging some of the directories with project IDs for the purpose of project quota accounting; however, no enforcement was enabled. Once the issue appeared, all IDs were cleared back to the original state.
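
      The tagging and the later cleanup boil down to standard lfs project
      operations, roughly as follows (the directory and the project ID are
      illustrative):

      # tag a directory tree with a project ID and mark it inheritable
      ~# lfs project -p 1001 -s -r /lustre/ascratch/projects/groupA

      # after the crashes started, clear the project IDs again
      ~# lfs project -C -r /lustre/ascratch/projects/groupA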

      We have opened two related bug reports on the OpenZFS GitHub:
      https://github.com/openzfs/zfs/issues/13143

      https://github.com/openzfs/zfs/issues/13144

          People

            Assignee: WC Triage (wc-triage)
            Reporter: Marek Magrys (m.magrys)
            Votes: 0
            Watchers: 4
