[LU-11986] After partial lustre_rmmod, lnet panics on debugfs read

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.14.0
    • Affects Version/s: Lustre 2.12.0
    • Labels: None
    • Environment: IML 5-devel on VirtualBox, Lustre 2.12.0
    • Severity: 3

    Description

      While investigating LU-9525 I ran across this behavior.

      # modprobe osd-zfs
      # lctl list_nids
      192.168.56.20@tcp
      # lustre_rmmod lnet
      

      Wait about a minute for iml-agent (I assume) to query the lnet status; the following panic then results:

      [   68.383777] BUG: unable to handle kernel paging request at ffffffffc0959ca0
      [   68.388689] IP: [<ffffffffc089b654>] lnet_debugfs_read+0x24/0x40 [libcfs]
      [   68.391817] PGD 11c14067 PUD 11c16067 PMD 16472067 PTE 0
      [   68.402787] Oops: 0000 [#1] SMP
      [   68.404489] Modules linked in: libcfs(OE) nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc zfs(POE) zunicode(POE) zavl(POE) icp(POE) ppdev iosf_mbi crc32_pclmul zcommon(POE) znvpair(POE) ghash_clmulni_intel spl(OE) aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr sg parport_pc parport video i2c_piix4 ip_tables ext4 mbcache jbd2 sr_mod cdrom ata_generic sd_mod crc_t10dif crct10dif_generic pata_acpi crct10dif_pclmul crct10dif_common crc32c_intel serio_raw ahci ata_piix libahci libata e1000 dm_mirror dm_region_hash dm_log dm_mod [last unloaded: lnet]
      [   68.438921] CPU: 1 PID: 4853 Comm: lctl Tainted: P           OE  ------------   3.10.0-957.el7_lustre.x86_64 #1
      [   68.444438] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   68.448776] task: ffff893597914100 ti: ffff893596f44000 task.ti: ffff893596f44000
      [   68.452887] RIP: 0010:[<ffffffffc089b654>]  [<ffffffffc089b654>] lnet_debugfs_read+0x24/0x40 [libcfs]
      [   68.458006] RSP: 0018:ffff893596f47ec8  EFLAGS: 00010246
      [   68.460928] RAX: ffffffffc089b630 RBX: ffff89359ead7900 RCX: ffff893596f47ec8
      [   68.465870] RDX: 0000000000cd50c0 RSI: 0000000000000000 RDI: ffffffffc0959c80
      [   68.475879] RBP: ffff893596f47ed0 R08: ffff893596f47f18 R09: 0000000000000000
      [   68.479744] R10: 00007fff4e9a0f60 R11: 0000000000000246 R12: 0000000000cd50c0
      [   68.483282] R13: ffff893596f47f18 R14: 0000000000001000 R15: 0000000000000000
      [   68.487155] FS:  00007fd435e89740(0000) GS:ffff89359fd00000(0000) knlGS:0000000000000000
      [   68.492060] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   68.495707] CR2: ffffffffc0959ca0 CR3: 0000000017940000 CR4: 00000000000606e0
      [   68.499746] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   68.503180] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   68.507283] Call Trace:
      [   68.508720]  [<ffffffff94640f0f>] vfs_read+0x9f/0x170
      [   68.511351]  [<ffffffff94641dcf>] SyS_read+0x7f/0xf0
      [   68.513924]  [<ffffffff94b74d21>] ? system_call_after_swapgs+0xae/0x146
      [   68.519936]  [<ffffffff94b74ddb>] system_call_fastpath+0x22/0x27
      [   68.524797]  [<ffffffff94b74d21>] ? system_call_after_swapgs+0xae/0x146
      [   68.529162] Code: 5b 41 5c 41 5d 5d c3 0f 1f 44 00 00 55 49 89 c8 48 89 e5 48 83 ec 08 48 8b bf a8 00 00 00 48 89 55 f8 48 8d 4d f8 48 89 f2 31 f6 <48> 8b 47 20 e8 23 b5 ee d3 48 98 48 85 c0 48 0f 44 45 f8 c9 c3
      [   68.545905] RIP  [<ffffffffc089b654>] lnet_debugfs_read+0x24/0x40 [libcfs]
      [   68.549423]  RSP <ffff893596f47ec8>
      [   68.551714] CR2: ffffffffc0959ca0
      [   68.553976] ---[ end trace 31c5ad0e3a22fb28 ]---
      [   68.556862] Kernel panic - not syncing: Fatal exception
      [   68.559764] Kernel Offset: 0x13400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
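
The register dump above can be cross-checked with a little arithmetic. The sketch below is only a reading aid, not part of the original report: it decodes the four bytes starting at the `<48>` marker in the `Code:` line (`48 8b 47 20`, the x86-64 encoding of `mov 0x20(%rdi),%rax`) and confirms that RDI plus that displacement equals the faulting address in CR2. The helper name `decode_disp8_load` is hypothetical. The interpretation that RDI points into memory that belonged to the just-unloaded lnet module is an assumption; it is consistent with `[last unloaded: lnet]` and `PTE 0` in the trace but is not stated in the ticket.

```python
# Values copied verbatim from the oops above.
CR2 = 0xffffffffc0959ca0   # faulting address
RDI = 0xffffffffc0959c80   # RDI at the time of the fault
RAX = 0xffffffffc089b630   # RAX at the time of the fault
RIP = 0xffffffffc089b654   # reported as lnet_debugfs_read+0x24

# Bytes starting at the <48> marker in the "Code:" line.
FAULTING_INSN = bytes.fromhex("488b4720")


def decode_disp8_load(insn: bytes) -> int:
    """Return the displacement of a `mov disp8(%rdi),%rax` instruction.

    Hypothetical helper for this example only; it recognizes exactly
    this one encoding and rejects anything else.
    """
    rex, opcode, modrm, disp = insn
    assert rex == 0x48 and opcode == 0x8B   # REX.W + MOV r64, r/m64
    assert modrm == 0x47                    # mod=01 (disp8), reg=rax, rm=rdi
    return disp


offset = decode_disp8_load(FAULTING_INSN)
fault_matches = (RDI + offset == CR2)       # the load from 0x20(%rdi) faulted
func_start = RIP - 0x24                     # start of lnet_debugfs_read

print(hex(offset), fault_matches, hex(func_start), func_start == RAX)
# prints: 0x20 True 0xffffffffc089b630 True
```

So the read handler in libcfs, which is still listed under `Modules linked in`, dereferenced a pointer whose target page is no longer mapped.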
      

          Activity

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/39404/
            Subject: LU-11986 lnet: don't read debugfs lnet stats when shutting down
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: f53eea15d470c9bb29a7b867b733db9249aec95b

            James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/39404
            Subject: LU-11986 lnet: don't read debugfs lnet stats when shutting down
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: b6587af8415064ec8c305cea03abfccddaef1341

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38716/
            Subject: LU-11986 libcfs: lnet_remove_debugfs() compat for RHEL6
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: 7f821c9382c39fd16156593569737df27dfb0467

            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38716
            Subject: LU-11986 libcfs: lnet_remove_debugfs() compat for RHEL6
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: d38b26e84bccbb2c87a948e46408e67c884eeb76

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38529/
            Subject: LU-11986 libcfs: add compat for d_hash_and_lookup()
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: fa72fe50b9b4ee8ca2165607e32360a6bebd86e4

            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38529
            Subject: LU-11986 libcfs: add compat for d_hash_and_lookup()
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: e57f7dab3c9017a9ef979db0d4d2f685a150e0c5

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/38090/
            Subject: LU-11986 libcfs: provide QSTR_INIT compat macro
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set:
            Commit: fd50d435119f0993ee05a4653315e1a55627817b

            Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/38090
            Subject: LU-11986 libcfs: provide QSTR_INIT compat macro
            Project: fs/lustre-release
            Branch: b2_12
            Current Patch Set: 1
            Commit: c765cb2c59b39ce0706e88d7b4c6cec6ef75e313

            simmonsja James A Simmons added a comment - So we reduce the chance of this bug
            vitaly_fertman Vitaly Fertman added a comment - afaics, it happened again: https://testing.whamcloud.com/test_sets/4b204efc-bb02-11e9-9fc9-52540065bddc

            People

              Assignee: simmonsja James A Simmons
              Reporter: utopiabound Nathaniel Clark