Details
- Improvement
- Resolution: Fixed
- Minor
- Lustre 1.8.8, Lustre 1.8.6
- None
- 2
- 9739
Description
Some CERROR(...) or CWARN() messages clutter up the syslog and should be changed to CDEBUG(D_*, ...).
Ideally, it should be possible to mount then unmount Lustre under normal usage without getting a screen full of messages.
Simply turning off all of the messages is NOT an acceptable solution for all of them. Please try to make changes to both b1_8 and master in a similar manner where possible.
2011-06-23 14:54:31 LustreError: 9730:0:(mds_open.c:1693:mds_close()) @@@ no handle for file close ino 121636742: cookie 0x0 req@ffff812001bdf000 x1372347904386655/t0 o35->c339861d-0c0a-b9ee-a39d-9cfe591452c1@NET_0x50000c0a87223_UUID:0/0 lens 408/864 e 0 to 0 dl 1308866077 ref 1 fl Interpret:/0/0 rc 0/0
This happens many times for each evicted client, but there is nothing the administrator can do about it. This is already fixed on master to use CDEBUG(D_INFO, ...).
LustreError: 8747:0:(file.c:3143:ll_inode_revalidate_fini()) failure -2 inode 63486047
This should be quieted to CDEBUG(D_INODE, ...) for the -ENOENT case, since this can happen with racing "rm -r" vs. "rm -r" or "ls -l".
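A minimal sketch of the level selection described above, written as standalone C so it compiles outside the kernel tree. The `log_level` enum and `revalidate_log_level()` are hypothetical stand-ins for the choice between CERROR(...) and CDEBUG(D_INODE, ...); they are not part of Lustre.

```c
#include <errno.h>

/* Hypothetical stand-ins for the two logging paths; in Lustre these
 * would correspond to CERROR(...) and CDEBUG(D_INODE, ...). */
enum log_level { LOG_CONSOLE_ERROR, LOG_DEBUG_INODE };

/* Pick the log level for a failed revalidate. -ENOENT is expected when
 * another process unlinks the file concurrently, so it only goes to the
 * debug log; every other failure still reaches the console. */
static enum log_level revalidate_log_level(int rc)
{
    return rc == -ENOENT ? LOG_DEBUG_INODE : LOG_CONSOLE_ERROR;
}
```

With this shape, the "failure -2" case from the log line above would be routed to the debug log, while something like -EIO would still be reported as an error.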
LustreError: 11-0: an error occurred while communicating with 172.16.x.x@tcp. The ost_write operation failed with -28
LustreError: 11-0: an error occurred while communicating with 172.23.68.8@tcp. The mds_getattr_lock operation failed with -13
The "-28" (-ENOSPC), "-13" (-EACCES), and "-2" (-ENOENT) errors shouldn't print an error on the client console. However, in this case the problem isn't on the client, but rather that the server is returning an RPC with PTL_RPC_MSG_ERR set. The PTL_RPC_MSG_ERR flag should only be used for cases where there is an error in the RPC handling that prevented the server from even executing the RPC, and NOT for the case where the RPC was processed correctly but returned an error (e.g. -EPERM, -EACCES, or -ENOSPC). Fixing these requires looking into the MDS/OSS code and seeing where the server is returning rc != 0 to the handler, or calling ptlrpc_error() (except in the case where it is not possible to pack a reply message). Some of these may already be fixed on master.
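The distinction between "the RPC could not be executed" and "the RPC executed and returned an error" can be sketched as below. This is illustrative standalone C only: `reply_type`, `struct reply`, and `build_reply()` are hypothetical names, and `REPLY_MSG_ERR` merely plays the role of a reply sent with PTL_RPC_MSG_ERR set. It does not reflect the actual ptlrpc code paths.

```c
#include <stdbool.h>

/* Hypothetical reply kinds; REPLY_MSG_ERR stands in for a reply that
 * has PTL_RPC_MSG_ERR set. */
enum reply_type { REPLY_NORMAL, REPLY_MSG_ERR };

struct reply {
    enum reply_type type; /* how the reply is flagged on the wire */
    int status;           /* operation result carried in the reply */
};

/* Build a reply for a request. 'executed' says whether the server was
 * able to run the operation at all; 'rc' is the resulting error code.
 * Only a request that could not be executed is flagged as an RPC
 * error. An operation that ran and failed (e.g. -ENOSPC) is a normal
 * reply whose status carries the error. */
static struct reply build_reply(bool executed, int rc)
{
    struct reply r;

    r.type = executed ? REPLY_NORMAL : REPLY_MSG_ERR;
    r.status = rc;
    return r;
}
```

Under this assumption, an ost_write that fails with -28 yields a normal reply with status -28, so the client has no reason to log an "11-0" communication error for it.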
Lustre: myth-OST0002: haven't heard from client c06c5b51-02f2-e84d-3aaa-8cc820badd80 (at 192.168.20.159@tcp) in 249 seconds. I think it's dead, and I am evicting it.
The exp_client_uuid should be kept in a static "last_uuid" string, and if the same UUID is evicted by another target on this node it doesn't need to be printed to the console again, only to debug logs. Use CDEBUG_LIMIT(D_CONSOLE | (mask), ...).
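A rough sketch of the "last_uuid" idea, as standalone C. The function name `eviction_needs_console()` and the buffer size `UUID_MAX` are assumptions made for illustration; real kernel code would also need locking around the static buffer, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define UUID_MAX 40 /* assumed size, enough for a 36-character UUID */

/* Return true if the eviction of this client should be printed on the
 * console, or false if the same UUID was the one most recently
 * reported and the message should go to the debug log only. */
static bool eviction_needs_console(const char *uuid)
{
    static char last_uuid[UUID_MAX] = "";

    assert(uuid != NULL);

    if (strcmp(last_uuid, uuid) == 0)
        return false;

    /* Remember this UUID for the next eviction on this node. */
    snprintf(last_uuid, sizeof(last_uuid), "%s", uuid);
    return true;
}
```

Only the single most recent UUID is remembered, so evictions that alternate between two clients would still print each time; that matches the one-string approach described above rather than a full history.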
Just looking through https://maloo.whamcloud.com/test_logs/eb3fd5de-98f0-11e0-9a27-52540025f9af to see what messages are printed on every mount, and what can be removed:
Lustre: OBD class driver, http://www.lustre.org/
Lustre: Lustre Version: 1.8.6
Lustre: Build Version: jenkins-wc1--PRISTINE-2.6.32-131.2.1.el6.x86_64
This can just use the "#ifdef CRAY_XT3" version and print "Lustre: Build Version: "BUILD_VERSION"\n", and a separate project that Brian is working on will fix the build version string.
Lustre: Register global MR array, MR size: 0xffffffffffffffff, array size: 1
Please ask Liang how important this is. Maybe it shouldn't be printed if it is 0xffff...?
Lustre: Lustre Client File System; http://www.lustre.org/
Seems redundant with the message in obdclass.
LustreError: 152-6: Ignoring deprecated mount option 'acl'.
Why do our test scripts specify a mount option in local.sh that is no longer useful?
Lustre: Client lustre-client has started
Lustre: client ffff880410c8f000 umount complete
It would be good to make these messages consistent with each other, like:
Lustre: client lustre-client (ffff880410c8f000) mount complete
:
:
Lustre: client lustre-client (ffff880410c8f000) unmount complete
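One way to keep the two messages consistent is to build both from a single format string. The helper below is a hypothetical standalone sketch (`format_client_msg()` is not a Lustre function), and passing the superblock address as an integer is an assumption made so the example is portable.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a client lifecycle message. 'name' is the client name, 'sb'
 * is the address printed in the existing umount message, and 'event'
 * is "mount" or "unmount". Returns what snprintf() returns. */
static int format_client_msg(char *buf, size_t len, const char *name,
                             unsigned long long sb, const char *event)
{
    return snprintf(buf, len, "Lustre: client %s (%llx) %s complete",
                    name, sb, event);
}
```

Calling it with "lustre-client", 0xffff880410c8f000 and "mount" or "unmount" produces the pair of messages proposed above, differing only in the event word.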