  1. Lustre
  2. LU-3725

SELinux enabled on 1.8.9 client/CentOS 6.4 causes system deadlock when mounting 2.1.6 LFS


Details

    • Bug
    • Resolution: Won't Fix
    • Minor
    • None
    • Lustre 1.8.9, Lustre 2.1.6
    • Client: CentOS 6.3, kernel 2.6.32-279.19.1.el6.x86_64, patchless Lustre client v1.8.9.
      Server: CentOS 6.4, Lustre 2.1.6, kernel 2.6.32-358.11.1.el6_lustre.x86_64.
      LNet via TCP over 10GbE (Intel X540-AT2).

    Description

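      Mounting a 2.1.6 LFS from a 1.8.9 client over lnet@tcp (10GbE) causes an immediate deadlock of the client.

      The core dump shows a kernel BUG in the SELinux code (security/selinux/ss/services.c:625, in context_struct_compute_av), hit immediately after the Lustre mount completes.

      Relevant dmesg output from the core dump: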

      ---cut---
      <5>Registering the id_resolver key type
      <5>FS-Cache: Netfs 'nfs' registered for caching
      <7>SELinux: initialized (dev 0:1b, type nfs), uses genfs_contexts
      <7>SELinux: initialized (dev 0:1c, type nfs), uses genfs_contexts
      <7>SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
      <7>SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
      <7>SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
      <7>SELinux: initialized (dev autofs, type autofs), uses genfs_contexts
      <6>Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
      <7>SELinux: initialized (dev nfsd, type nfsd), uses genfs_contexts
      <4>NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
      <6>NFSD: starting 90-second grace period
      <6>Lustre: Build Version: jenkins-wc1--PRISTINE-2.6.32-279.19.1.el6.x86_64
      <6>Lustre: Added LNI 192.52.98.54@tcp [8/256/0/180]
      <6>Lustre: Accept secure, port 988
      <6>Lustre: Lustre Client File System; http://www.lustre.org/
      <4>Lustre: MGC192.52.98.30@tcp: Reactivating import
      <4>Lustre: Client hpfs-eg3-client(ffff880335b66800) mount complete
      <7>SELinux: initialized (dev lustre, type lustre), uses xattr
      <5>Bridge firewalling registered
      <6>device virbr0-nic entered promiscuous mode
      <6>virbr0: starting userspace STP failed, starting kernel STP
      <6>ip_tables: (C) 2000-2006 Netfilter Core Team
      <4>nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
      <6>Ebtables v2.0 registered
      <6>ip6_tables: (C) 2000-2006 Netfilter Core Team
      <7>SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
      <7>SELinux: initialized (dev proc, type proc), uses genfs_contexts
      <7>SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
      <7>SELinux: initialized (dev proc, type proc), uses genfs_contexts
      <7>SELinux: initialized (dev 0:22, type nfs4), uses genfs_contexts
      <7>SELinux: initialized (dev 0:23, type nfs4), uses genfs_contexts
      <7>SELinux: initialized (dev 0:24, type nfs4), uses genfs_contexts
      <6>fuse init (API version 7.13)
      <7>SELinux: initialized (dev fuse, type fuse), uses genfs_contexts
      <4>Lustre: MGC192.52.98.142@tcp: Reactivating import
      <4>Lustre: Server MGS version (2.1.6.0) is much newer than client version (1.8.9)
      <4>Lustre: 9039:0:(obd_config.c:1127:class_config_llog_handler()) skipping 'lmv' config: cmd=cf001,clilmv:lmv
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: Skipped 1 previous similar message
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: Skipped 2 previous similar messages
      <3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
      <3>LustreError: Skipped 1 previous similar message
      <4>Lustre: Server hpfs2eg3-MDT0000_UUID version (2.1.6.0) is much newer than client version (1.8.9)
      <6>Lustre: client supports 64-bits dir hash/offset!
      <4>Lustre: Client hpfs2eg3-client(ffff880635520000) mount complete
      <7>SELinux: initialized (dev lustre, type lustre), uses xattr
      <4>------------[ cut here ]------------
      <2>kernel BUG at security/selinux/ss/services.c:625!
      <4>invalid opcode: 0000 [#1] SMP 
      <4>last sysfs file: /sys/kernel/mm/ksm/run
      <4>CPU 4 
      <4>Modules linked in: fuse ip6_tables ebtable_nat ebtables xt_state ipt_MASQUERADE ipt_REJECT iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack xt_CHECKSUM iptable_mangle nf_defrag_ipv4 iptable_filter ip_tables bridge stp llc mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) nfsd exportfs autofs4 nfs lockd fscache nfs_acl auth_rpcgss sunrpc ext3 jbd tcp_bic vhost_net macvtap macvlan tun kvm_intel kvm uinput nvidia(P)(U) sg microcode serio_raw i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support e1000e ixgbe mdio snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc ioatdma dca i7core_edac edac_core shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom mpt2sas scsi_transport_sas raid_class pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
      <4>
      <4>Pid: 7217, comm: gvfsd-trash Tainted: P           ---------------    2.6.32-279.19.1.el6.centos.plus.x86_64 #1 Supermicro X8DTL/X8DTL
      <4>RIP: 0010:[<ffffffff8122a86b>]  [<ffffffff8122a86b>] context_struct_compute_av+0x40b/0x420
      <4>RSP: 0018:ffff8806366a5c08  EFLAGS: 00010246
      <4>RAX: 0000000000000000 RBX: ffff8806366a5d98 RCX: 0000000000000100
      <4>RDX: 0000000000000f3c RSI: 00000000ffffffff RDI: 0000000000000010
      <4>RBP: ffff8806366a5c88 R08: 0000000000013570 R09: ffff8806366a5d98
      <4>R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000007
      <4>R13: ffff880324960948 R14: 0000000000000769 R15: 00000000000007cb
      <4>FS:  00007fadb81f17a0(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
      <4>CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4>CR2: 000000000199e428 CR3: 0000000636fdd000 CR4: 00000000000006e0
      <4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      <4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      <4>Process gvfsd-trash (pid: 7217, threadinfo ffff8806366a4000, task ffff880634921500)
      <4>Stack:
      <4> ffff880637fc5380 0007ffff00000007 ffff88062b2a0548 ffff880324960948
      <4><d> ffff8806319fa060 ffff88031c981c00 0000000000000000 0000000000000000
      <4><d> 00070007366a5d28 0000000001002fce ffff8806366a5c68 ffff8806366a5d98
      <4>Call Trace:
      <4> [<ffffffff8122ad45>] security_compute_av+0xf5/0x2c0
      <4> [<ffffffff812142be>] avc_has_perm_noaudit+0x14e/0x470
      <4> [<ffffffffa124850b>] ? __ll_inode_revalidate_it+0x15b/0x640 [lustre]
      <4> [<ffffffff8121462b>] avc_has_perm+0x4b/0x90
      <4> [<ffffffff812163f4>] inode_has_perm+0x54/0xa0
      <4> [<ffffffff812164b2>] selinux_inode_permission+0x72/0xb0
      <4> [<ffffffff8120dc5f>] security_inode_permission+0x1f/0x30
      <4> [<ffffffff81182b5f>] inode_permission+0xaf/0xd0
      <4> [<ffffffff811b81c2>] sys_inotify_add_watch+0xa2/0x450
      <4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
      <4>Code: ff ff ff e8 48 00 e4 ff 85 c0 0f 84 34 ff ff ff 0f b7 75 8e 48 c7 c7 70 7b 7b 81 31 c0 e8 e7 5d 2c 00 e9 1d ff ff ff 0f 0b eb fe <0f> 0b 0f 1f 00 eb fb 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 
      <1>RIP  [<ffffffff8122a86b>] context_struct_compute_av+0x40b/0x420
      <4> RSP <ffff8806366a5c08>
      ---end---
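
      For triage, the key facts in an excerpt like the one above (the BUG location, the faulting symbol, and the repeated RPC failures) can be pulled out mechanically. The following is a minimal sketch, not part of the original report: the function name summarize_dmesg and the regular expressions are assumptions modelled only on the lines shown in this ticket, and other kernels may format these lines differently.

```python
import errno
import re
from collections import Counter

# Patterns modelled on the lines in this ticket's dmesg excerpt (assumption:
# other kernel versions may print these lines in a different layout).
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")
RIP_RE = re.compile(r"RIP\b.*\]\s+(\w+)\+0x")
RPC_FAIL_RE = re.compile(r"The (\w+) operation failed with -(\d+)")


def summarize_dmesg(text):
    """Return the BUG location, faulting symbol, and RPC failure counts."""
    summary = {"bug_at": None, "rip_symbol": None, "rpc_failures": Counter()}
    for line in text.splitlines():
        match = BUG_RE.search(line)
        if match:
            summary["bug_at"] = (match.group(1), int(match.group(2)))
        match = RIP_RE.search(line)
        if match:
            summary["rip_symbol"] = match.group(1)
        match = RPC_FAIL_RE.search(line)
        if match:
            # Translate the numeric errno into its symbolic name.
            code = int(match.group(2))
            name = errno.errorcode.get(code, "unknown")
            summary["rpc_failures"][(match.group(1), name)] += 1
    return summary


# Three lines copied from the excerpt in this ticket.
SAMPLE = """\
<3>LustreError: 11-0: an error occurred while communicating with 192.52.98.142@tcp. The mds_connect operation failed with -16
<2>kernel BUG at security/selinux/ss/services.c:625!
<1>RIP  [<ffffffff8122a86b>] context_struct_compute_av+0x40b/0x420
"""

if __name__ == "__main__":
    print(summarize_dmesg(SAMPLE))
```

      On Linux, errno 16 is EBUSY, so the repeated mds_connect failures before the mount completes indicate the MDS reporting busy; they are separate from the SELinux BUG that follows.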
      

      The issue was resolved by disabling SELinux on the Lustre client node (via selinux=0 on the kernel command line).

      Because disabling SELinux resolves the issue, this ticket was filed so that the problem and its resolution are searchable.
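
      Before mounting on a client, it can be useful to confirm that the selinux=0 workaround is actually in effect for the running kernel. The sketch below is illustrative only: the helper names are hypothetical, it inspects just the boot command line (normally /proc/cmdline), and it assumes that when selinux= appears more than once the last occurrence takes effect.

```python
def selinux_boot_setting(cmdline):
    """Return the value of the last selinux= token, or None if absent.

    Assumption: if the parameter is repeated, the last occurrence wins.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("selinux="):
            value = token[len("selinux="):]
    return value


def selinux_disabled_at_boot(cmdline):
    """True only when the command line explicitly carries selinux=0."""
    return selinux_boot_setting(cmdline) == "0"


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    if selinux_disabled_at_boot(read_cmdline()):
        print("selinux=0 is set on the kernel command line")
    else:
        print("selinux=0 is NOT set; the workaround is not in effect")
```

      Note that enforcing=0 only selects permissive mode, in which SELinux still evaluates policy, so the check deliberately treats it as not equivalent to selinux=0.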

          People

            pjones Peter Jones
            aeonjeffj Jeff Johnson (Inactive)