[LU-3725] SELinux enabled on 1.8.9 client/CentOS 6.4 causes system deadlock when mounting 2.1.6 LFS

Details

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Minor
    • Fix Version/s: None
    • Affects Version/s: Lustre 1.8.9, Lustre 2.1.6
    • Environment:
      Client: CentOS 6.3, kernel 2.6.32-279.19.1.el6.x86_64, patchless client v1.8.9
      Server: CentOS 6.4, Lustre 2.1.6, kernel 2.6.32-358.11.1.el6_lustre.x86_64
      LNet over TCP on 10GbE (Intel X540-AT2)

    Description

      Mounting a 2.1.6 LFS from a 1.8.9 client over lnet@tcp (10GbE) causes an instant deadlock of the client.

      The core dump shows a kernel BUG in the SELinux code (security/selinux/ss/services.c:625), hit immediately after the mount completes.

      Relevant dmesg output from the core dump:

      ---cut---
      <5>Registering the id_resolver key type
      <5>FS-Cache: Netfs 'nfs' registered for caching
      <7>SELinux: initialized (dev 0:1b, type nfs), uses genfs_contexts
      <7>SELinux: initialized (dev 0:1c, type nfs), uses genfs_contexts
      <7>SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
      <7>SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
      <7>SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
      <7>SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
      <6>Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
      <7>SELinux: initialized (dev nfsd, type nfsd), uses genfs_contexts
      <4>NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
      <6>NFSD: starting 90-second grace period
      <6>Lustre: Build Version: jenkins-wc1--PRISTINE-2.6.32-279.19.1.el6.x86_64
      <6>Lustre: Added LNI 192.52.98.54@tcp [8/256/0/180]
      <6>Lustre: Accept secure, port 988
      <6>Lustre: Lustre Client File System; http://www.lustre.org/
      <4>Lustre: MGC192.52.98.30@tcp: Reactivating import
      <4>Lustre: Client hpfs-eg3-client(ffff880335b66800) mount complete
      <7>SELinux: initialized (dev lustre, type lustre), uses xattr
      <5>Bridge firewalling registered
      <6>device virbr0-nic entered promiscuous mode
      <6>virbr0: starting userspace STP failed, starting kernel STP
      <6>ip_tables: (C) 2000-2006 Netfilter Core Team
      <4>nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
      <6>Ebtables v2.0 registered
      <6>ip6_tables: (C) 2000-2006 Netfilter Core Team
      <7>SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
      <7>SELinux: initialized (dev proc, type proc), uses genfs_contexts
      <7>SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
      <7>SELinux: initialized (dev proc, type proc), uses genfs_contexts
      <7>SELinux: initialized (dev 0:22, type nfs4), uses genfs_contexts
      <7>SELinux: initialized (dev 0:23, type nfs4), uses genfs_contexts
      <7>SELinux: initialized (dev 0:24, type nfs4), uses genfs_contexts
      <6>fuse init (API version 7.13)
      <7>SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
      <4>Lustre: MGC192.52.98.142@tcp: Reactivating import
      <4>Lustre: Server MGS version (2.1.6.0) is much newer than client version (1.8.9)
      <4>Lustre: 9039:0:(obd_config.c:1127:class_config_llog_handler()) skipping 'lmv' config: cmd=cf001,clilmv:lmv
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: Skipped 1 previous similar message
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: Skipped 2 previous similar messages
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: Skipped 1 previous similar message
      <4>Lustre: Server hpfs2eg3-MDT0000_UUID version (2.1.6.0) is much newer than client version (1.8.9)
      <6>Lustre: client supports 64-bits dir hash/offset!
      <4>Lustre: Client hpfs2eg3-client(ffff880635520000) mount complete
      <7>SELinux: initialized (dev lustre, type lustre), uses xattr
      <4>------------[ cut here ]------------
      <2>kernel BUG at security/selinux/ss/services.c:625!
      <4>invalid opcode: 0000 [#1] SMP 
      <4>last sysfs file: /sys/kernel/mm/ksm/run
      <4>CPU 4 
      <4>Modules linked in: fuse ip6_tables ebtable_nat ebtables xt_state ipt_MASQUERADE ipt_REJECT iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack xt_CHECKSUM iptable_mangle nf_defrag_ipv4 iptable_filter ip_tables bridge stp llc mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) nfsd exportfs autofs4 nfs lockd fscache nfs_acl auth_rpcgss sunrpc ext3 jbd tcp_bic vhost_net macvtap macvlan tun kvm_intel kvm uinput nvidia(P)(U) sg microcode serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support e1000e ixgbe mdio snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc ioatdma dca i7core_edac edac_core shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom mpt2sas scsi_transport_sas raid_class pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
      <4>
      <4>Pid: 7217, comm: gvfsd-trash Tainted: P           ---------------    2.6.32-279.19.1.el6.centos.plus.x86_64 #1 Supermicro X8DTL/X8DTL
      <4>RIP: 0010:[<ffffffff8122a86b>]  [<ffffffff8122a86b>] context_struct_compute_av+0x40b/0x420
      <4>RSP: 0018:ffff8806366a5c08  EFLAGS: 00010246
      <4>RAX: 0000000000000000 RBX: ffff8806366a5d98 RCX: 0000000000000100
      <4>RDX: 0000000000000f3c RSI: 00000000ffffffff RDI: 0000000000000010
      <4>RBP: ffff8806366a5c88 R08: 0000000000013570 R09: ffff8806366a5d98
      <4>R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000007
      <4>R13: ffff880324960948 R14: 0000000000000769 R15: 00000000000007cb
      <4>FS:  00007fadb81f17a0(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
      <4>CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4>CR2: 000000000199e428 CR3: 0000000636fdd000 CR4: 00000000000006e0
      <4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      <4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      <4>Process gvfsd-trash (pid: 7217, threadinfo ffff8806366a4000, task ffff880634921500)
      <4>Stack:
      <4> ffff880637fc5380 0007ffff00000007 ffff88062b2a0548 ffff880324960948
      <4><d> ffff8806319fa060 ffff88031c981c00 0000000000000000 0000000000000000
      <4><d> 00070007366a5d28 0000000001002fce ffff8806366a5c68 ffff8806366a5d98
      <4>Call Trace:
      <4> [<ffffffff8122ad45>] security_compute_av+0xf5/0x2c0
      <4> [<ffffffff812142be>] avc_has_perm_noaudit+0x14e/0x470
      <4> [<ffffffffa124850b>] ? __ll_inode_revalidate_it+0x15b/0x640 [lustre]
      <4> [<ffffffff8121462b>] avc_has_perm+0x4b/0x90
      <4> [<ffffffff812163f4>] inode_has_perm+0x54/0xa0
      <4> [<ffffffff812164b2>] selinux_inode_permission+0x72/0xb0
      <4> [<ffffffff8120dc5f>] security_inode_permission+0x1f/0x30
      <4> [<ffffffff81182b5f>] inode_permission+0xaf/0xd0
      <4> [<ffffffff811b81c2>] sys_inotify_add_watch+0xa2/0x450
      <4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      <4>Code: ff ff ff e8 48 00 e4 ff 85 c0 0f 84 34 ff ff ff 0f b7 75 8e 48 c7 c7 70 7b 7b 81 31 c0 e8 e7 5d 2c 00 e9 1d ff ff ff 0f 0b eb fe <0f> 0b 0f 1f 00 eb fb 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 
      <1>RIP  [<ffffffff8122a86b>] context_struct_compute_av+0x40b/0x420
      <4> RSP <ffff8806366a5c08>
      ---end---
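
When triaging a dump like the one above, it can help to pull out just the BUG location, the faulting process, the call-trace symbols, and any failed RPC return codes. The following is a minimal sketch, not a Lustre or kernel tool: the function name `summarize_oops` and all regular expressions are illustrative assumptions based only on the line formats visible in this dump. It also assumes the negative return codes are Linux errno values, so -16 is reported as EBUSY.

```python
import errno
import re

# Each dmesg line in the dump starts with a "<N>" log-level tag, sometimes
# followed by "<d>". These patterns are assumptions drawn from this one dump.
PREFIX = re.compile(r"^\s*(?:<[0-9a-z]>)+\s?")
BUG_AT = re.compile(r"kernel BUG at (\S+):(\d+)!")
PROCESS = re.compile(r"Pid: (\d+), comm: (\S+)")
FRAME = re.compile(
    r"\[<[0-9a-f]+>\] (\? )?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)
RPC_FAIL = re.compile(r"The (\w+) operation failed with (-\d+)")


def summarize_oops(dmesg_text):
    """Collect the BUG site, process, call-trace frames, and RPC errors."""
    summary = {"bug": None, "process": None, "frames": [], "rpc_errors": set()}
    in_trace = False
    for raw in dmesg_text.splitlines():
        line = PREFIX.sub("", raw).strip()
        if in_trace:
            frame = FRAME.match(line)
            if frame:
                unreliable, symbol, module = frame.groups()
                # A leading "?" marks a frame the unwinder is unsure about.
                summary["frames"].append((symbol, module, unreliable is None))
                continue
            in_trace = False  # first non-frame line ends the trace
        if line.startswith("Call Trace:"):
            in_trace = True
            continue
        bug = BUG_AT.search(line)
        if bug:
            summary["bug"] = (bug.group(1), int(bug.group(2)))
        proc = PROCESS.search(line)
        if proc:
            summary["process"] = (int(proc.group(1)), proc.group(2))
        rpc = RPC_FAIL.search(line)
        if rpc:
            op, code = rpc.groups()
            # Map e.g. "-16" to its symbolic errno name where one is known.
            summary["rpc_errors"].add((op, errno.errorcode.get(-int(code), code)))
    return summary
```

Run against the dump above, the summary shows that the BUG is raised from a permission check issued by `gvfsd-trash` through `sys_inotify_add_watch`, not from the mount command itself.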
      

      The issue was resolved by disabling SELinux on the Lustre client node (selinux=0 on the kernel command line).

      Because disabling SELinux resolves the issue, this ticket is filed so that the workaround is searchable.
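
A client could check that the workaround is in place before attempting the mount. The sketch below is an illustration only: the helper names `selinux_disabled_on_cmdline` and `read_cmdline` are hypothetical, and it assumes that when `selinux=` appears more than once the last occurrence takes effect. Note that `enforcing=0` only switches SELinux to permissive mode and is not treated as disabled here.

```python
def selinux_disabled_on_cmdline(cmdline):
    """Return True if the last selinux= token on the command line is 0."""
    value = None
    for token in cmdline.split():
        if token.startswith("selinux="):
            # Assumption: a later occurrence overrides an earlier one.
            value = token[len("selinux="):]
    return value == "0"


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as fh:
        return fh.read()
```

On the client, `selinux_disabled_on_cmdline(read_cmdline())` returning False would indicate that the node was not booted with selinux=0 and may still hit the BUG described in this ticket.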


          People

            pjones Peter Jones
            aeonjeffj Jeff Johnson (Inactive)