[LU-10118] exec start error for lustre-2.10.1_13_g2ee62fb Created: 13/Oct/17  Updated: 13/Oct/17  Resolved: 13/Oct/17

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.1
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: David Racily (Inactive) Assignee: WC Triage
Resolution: Duplicate Votes: 0
Labels: None
Environment:

Red Hat Enterprise Linux Workstation release 7.4 (Maipo)
3.10.0-693.2.2.el7.x86_64 #1 SMP Sat Sep 9 03:55:24 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux


Issue Links:
Duplicate
duplicates LU-10119 systemd Failed at step EXEC spawning ... Resolved
Epic/Theme: lustre-2.10.1
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

I have built lustre-2.10.1_13_g2ee62fb on 3.10.0-693.2.2.el7.x86_64 RHEL Workstation release 7.4 (Maipo).

After installation of kmod-lustre-client-2.10.1_13_g2ee62fb-1.el7.x86_64.rpm and lustre-client-2.10.1_13_g2ee62fb-1.el7.x86_64.rpm the lnet startup fails.

The error reported is:

-- Unit lnet.service has begun starting up.
Oct 12 13:21:53 kernel: libcfs: loading out-of-tree module taints kernel.
Oct 12 13:21:53 kernel: libcfs: module verification failed: signature and/or required key missing - tainting kernel
Oct 12 13:21:53 kernel: LNet: HW NUMA nodes: 1, HW CPU cores: 20, npartitions: 1
Oct 12 13:21:53 kernel: alg: No test for adler32 (adler32-zlib)
Oct 12 13:21:53 kernel: alg: No test for crc32 (crc32-table)
Oct 12 13:21:54 kernel: LNet: Using FMR for registration
Oct 12 13:21:54 lctl[135556]: LNET configured
Oct 12 13:21:54 kernel: LNet: Added LNI 172.17.1.92@o2ib [8/256/0/180]
Oct 12 13:21:54 systemd[135576]: Failed at step EXEC spawning /usr/sbin/lustre_routes_config: Exec format error
-- Subject: Process /usr/sbin/lustre_routes_config could not be executed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- The process /usr/sbin/lustre_routes_config could not be executed and failed.
--
-- The error number returned by this process is 8.
Oct 12 13:21:54 systemd[1]: lnet.service: main process exited, code=exited, status=203/EXEC
Oct 12 13:21:54 systemd[1]: Failed to start lnet management.
-- Subject: Unit lnet.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit lnet.service has failed.
--
-- The result is failed.
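
For reference, the "error number 8" in the log above is an errno value, and status 203/EXEC is the exit status systemd reports when the execve() call for the service's command fails. A minimal Python sketch for decoding the number, assuming a Linux host (the helper name describe_errno is illustrative and not part of Lustre or systemd):

```python
import errno
import os


def describe_errno(number: int) -> str:
    """Return the symbolic name and message for an errno value."""
    # errno.errorcode maps numeric values to names such as "ENOEXEC".
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


# Decode the value that systemd reported for lustre_routes_config.
print(describe_errno(8))
```

On Linux, errno 8 is ENOEXEC, which matches the "Exec format error" text in the systemd message.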

The fix for this was:
Google suggests that this error message has been associated with a missing “hashpling” in some cases. The lustre_routes_config script has “# !/bin/bash”, and I wonder if that space before the “!” isn’t the culprit?

Just a guess. You might try to remove that space from the lustre_routes_config script and try to restart lnet with systemctl.
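
The guess can be checked before editing anything. On Linux the kernel treats a file as a script only when its first two bytes are exactly `#!`; whitespace after those two bytes is tolerated, but anything between or before them is not. A minimal sketch of that check, where the function names are illustrative and the path in the usage note is the one from the log:

```python
def has_valid_shebang(data: bytes) -> bool:
    """True if the data begins with the two-byte '#!' script marker."""
    return data[:2] == b"#!"


def file_has_valid_shebang(path: str) -> bool:
    """Read the first two bytes of a file and test them for '#!'."""
    with open(path, "rb") as script:
        return has_valid_shebang(script.read(2))


# A space between '#' and '!' breaks the marker.
print(has_valid_shebang(b"# !/bin/bash\n"))
# A space after '#!' is fine.
print(has_valid_shebang(b"#! /bin/bash\n"))
```

Calling file_has_valid_shebang("/usr/sbin/lustre_routes_config") on the affected client would show whether the installed script passes this test. Note that an interactive shell may still run such a file by falling back to interpreting it itself, which is why the problem can appear only under systemd.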



 Comments   
Comment by James A Simmons [ 13/Oct/17 ]

I think Chris Horn's patch for LU-10119 fixed this. Can you try it?

Comment by Andreas Dilger [ 13/Oct/17 ]

Closing this issue, as LU-10119 has a patch.

Generated at Sat Feb 10 02:32:13 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.