[LU-14431] error messages that do not stop with newline Created: 13/Feb/21  Updated: 02/Nov/23  Resolved: 02/Nov/23

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.6
Fix Version/s: None

Type: Improvement Priority: Minor
Reporter: Li Xi Assignee: Feng Lei
Resolution: Fixed Votes: 0
Labels: easy

Issue Links:
Related
is related to LU-14475 Make too short log messages meanful Resolved
Rank (Obsolete): 9223372036854775807

 Description   

When I am using grep to check how many CERROR in the codes, I found there are still a dozen of error messages that do not end with a newline. That will cause annoying messages in system log.

lnet/klnds/gnilnd/gnilnd_cb.c(2378):

			CERROR("Unexpected connstamp %#llx(%#llx expected)"
				" from %s", rxmsg->gnm_connstamp,
				found_conn->gnc_peer_connstamp,
				libcfs_nid2str(peer->gnp_nid));

lnet/klnds/gnilnd/gnilnd_conn.c(867):

			CERROR("Looking up network: device is in shutdown");

lustre/ptlrpc/service.c(731):

				CERROR("%s: invalid CPT pattern string: %s",
				       conf->psc_name, cconf->cc_pattern);

lustre/ptlrpc/ptlrpcd.c(754):

			CERROR("%s: invalid CPT pattern string: %s",
			       "ptlrpcd_cpts", ptlrpcd_cpts);

lustre/include/lustre_dlm.h(1159):

			CERROR("lock %p: delayed lvb init failed (rc %d)",
			       lock, rc);

lustre/quota/qmt_lock.c(761):

		CERROR("%s: failed to init env.", qmt->qmt_svname);

lustre/osd-ldiskfs/osd_handler.c(2377):

				CERROR("%s: unsupported checksum type of "
				       "T10PI type '%s'",
				       d->od_svname, name);

lustre/osd-ldiskfs/osd_handler.c(2382):

			CERROR("%s: unsupported T10PI type '%s'",
			       d->od_svname, name);

lustre/target/tgt_handler.c(945):

		CERROR("No target passed");

lustre/obdclass/lprocfs_status.c(78):

		CERROR("LprocFS: No memory to create <debugfs> entry %s", name);


 Comments   
Comment by Andreas Dilger [ 13/Feb/21 ]

These should also be converted to be single-line strings. 

Comment by Li Xi [ 14/Feb/21 ]

The following message does not have space before "opendir_pid":

lustre/llite/statahead.c(1320):
			CERROR("%s: reading dir "DFID" at %llu"
			       "opendir_pid = %u : rc = %d\n",
			       ll_i2sbi(dir)->ll_fsname,
			       PFID(ll_inode2fid(dir)), pos,
			       lli->lli_opendir_pid, rc);
Comment by Li Xi [ 14/Feb/21 ]

The following message has tailing ":" but is missing "rc = %d\n" at the end:

lustre/llite/llite_lib.c(1467):
		CERROR("%s: "DFID" dir layout mismatch:\n",
		       ll_i2sbi(inode)->ll_fsname,
		       PFID(&lli->lli_fid));
Comment by Li Xi [ 14/Feb/21 ]

The following message does not have space after DFID

lustre/mdd/mdd_object.c(2574):
			CERROR("%s: unable to roll back layout swap. FIDs: "
			       DFID" and "DFID "error: %d/%d, steps: %d\n",
			       mdd_obj_dev_name(fst_o),
			       PFID(mdo2fid(snd_o)), PFID(mdo2fid(fst_o)),
			       rc, rc2, steps);
Comment by Li Xi [ 14/Feb/21 ]

The following two messages has weird ":" around DFID, change them to spaces would be better:

lustre/mdd/mdd_object.c(1586):
			CERROR("%s: failed to rollback of layout of: "DFID
			       ": %d, file state unknown\n",
			       mdd_obj_dev_name(obj), PFID(mdo2fid(obj)), rc2);
lustre/mdd/mdd_object.c(1818):
			CERROR("%s: failed to rollback of layout of: "DFID
			       ": %d, file state unkonwn.\n",
			       mdd_obj_dev_name(obj), PFID(mdo2fid(obj)), rc2);
Comment by Li Xi [ 14/Feb/21 ]

The following message missed a space after "string" and missing ": rc = %d" at the end

lustre/llite/symlink.c(102):
		CERROR("%s: inode "DFID": symlink not NULL terminated string"
		       "of length %d\n", sbi->ll_fsname,
		       PFID(ll_inode2fid(inode)), symlen - 1);
Comment by Li Xi [ 14/Feb/21 ]

The following message missed a space after "%llu"

lustre/llite/statahead.c(1320):
			CERROR("%s: reading dir "DFID" at %llu"
			       "opendir_pid = %u : rc = %d\n",
			       ll_i2sbi(dir)->ll_fsname,
			       PFID(ll_inode2fid(dir)), pos,
			       lli->lli_opendir_pid, rc);
Comment by Li Xi [ 14/Feb/21 ]

There is no way guess what the following message is related to without reading the codes:

lustre/obdclass/lu_object.c(2284):
                        CERROR("[%d]: %p %x (%p,%p,%p) %d %d \"%s\"@%p\n",
                               i, key, key->lct_tags,
                               key->lct_init, key->lct_fini, key->lct_exit,
			       key->lct_index, atomic_read(&key->lct_used),
                               key->lct_owner ? key->lct_owner->name : "",
                               key->lct_owner);

Adding a prefix string like "dumping context keys" would help a lot.

Comment by Li Xi [ 14/Feb/21 ]

The space before ":" should be removed

lustre/ofd/ofd_dev.c(1605):
				CERROR("%s : invalid o_seq "DOSTID"\n",
				       ofd_name(ofd), POSTID(&oa->o_oi));
Comment by Li Xi [ 14/Feb/21 ]

Space is missing after "response"

lnet/lnet/acceptor.c(296):
                         CERROR("Error sending magic+version in response"
                               "to version %d from %pI4h: %d\n",
                              peer_version, &peer_ip, rc);
lnet/lnet/acceptor.c(250):
                               CERROR("Error sending magic+version in response"
                                      "to LNET magic from %pI4h: %d\n",
                                       &peer_ip, rc);
Comment by Li Xi [ 14/Feb/21 ]

Space is missing after ":":

lnet/klnds/gnilnd/gnilnd_conn.c(64):
                                CERROR("FATAL:fmablk registration has failed "
                                       "for %ld seconds.\n",
                                       cfs_duration_sec(jiffies - reg_to) +
                                                rfto);
Comment by Li Xi [ 14/Feb/21 ]

Space is missing after ")":

lustre/ptlrpc/nodemap_handler.c(1134):
                 CERROR("cannot allocate memory (%zu bytes)"
                        "for nodemap '%s'\n", sizeof(*nodemap),
                        name);
Comment by Li Xi [ 14/Feb/21 ]

1) Space is missing after "%d."
2) Two spaces before "Not fatal"
3) "\" is not necessary

lustre/ptlrpc/llog_client.c(56):
                CERROR("ctxt->loc_imp == NULL for context idx %d."    \
                       "Unable to complete MDS/OSS recovery,"         \
                       "but I'll try again next time.  Not fatal.\n", \
                       ctxt->loc_idx);                                \
Comment by Andreas Dilger [ 14/Feb/21 ]

Rather than putting comments in this ticket, it would probably be the same effort to make a patch that fixes them all.

Comment by Li Xi [ 14/Feb/21 ]

Missing space after ":":

lustre/osp/osp_object.c(594):
                         CERROR("%s:osp_attr_get update error "DFID": rc = %d\n",
                                dev->dd_lu_dev.ld_obd->obd_name,
                                PFID(lu_object_fid(&dt->do_lu)), rc);
Comment by Li Xi [ 14/Feb/21 ]

Missing space after "but"

lustre/osd-ldiskfs/osd_scrub.c(2671):
                                CERROR("%s: UUID has been changed, but"
                                       "failed to allocate RAM for report\n",
                                       osd_dev2name(dev));
Comment by Andreas Dilger [ 14/Feb/21 ]

All of the missing space errors will go away with single-line error strings, which is one reason why they are recommended by checkpatch. The other is that it makes fining error strings in the code much easier, since words in the error will not be split across lines. 

Comment by James A Simmons [ 22/Feb/21 ]

Anjus is working on the patch. Can we reassign the ticket back to her?

Comment by Andreas Dilger [ 22/Feb/21 ]

Li Xi, James, it definitely doesn't make sense to have two developers duplicating work on this simple patch.

Li Xi, was your original intent for filing this ticket that Feng Lei have some simple task to become familiar with Lustre? It would have been good to assign it to him in that case.

James, I guess the same is true for Anjus, which was confusing since the ticket was not assigned to anyone. It makes sense at this point for both Anjus and Feng Lei to submit whatever patches they each currently have, and then we can work out if there are conflicts.

There are many other tickets labelled "easy" in JIRA, so there is plenty of work to go around without having duplication of effort:
https://jira.whamcloud.com/issues/?jql=project%3DLU%20and%20labels%3Deasy%20and%20resolution%3Dunresolved

Some useful ones to work on would be:

Comment by Li Xi [ 23/Feb/21 ]

Li Xi, was your original intent for filing this ticket that Feng Lei have some simple task to become familiar with Lustre? It would have been good to assign it to him in that case.

Yes, the reason why I didn't push a patch instantly was that it looks like a good change for Feng Lei to getting familiar with the patching/pushing/testing/reviewing/merging process of Lustre.

Feng Lei has assigned the ticket to himself when started working on it a few days ago. As far as I know, Feng Lei already has a patch. As well, there are other error messages that need clean up and Feng Lei is working on that too.

simmonsja, I guess Anjus has not finished the patch yet? Does it make sense for Anjus to working on the other tickets in the easy list that Andreas pointed out (or anything else?).

Comment by Gerrit Updater [ 23/Feb/21 ]

Feng, Lei (flei@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/41723
Subject: LU-14431 log: Add ending newline for some messages.
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: ae259e4e6f8944a1e87ccb8bfdb84d039ea230cd

Comment by James A Simmons [ 01/Mar/21 ]

The current patch missed a few so another patch will be coming.

Comment by Gerrit Updater [ 10/Mar/21 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/41723/
Subject: LU-14431 log: Add ending newline for some messages.
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 503bf7f29a49914082480a7cdceb914e42a95dd8

Comment by Peter Jones [ 11/Mar/21 ]

Landed for 2.15

Comment by Andreas Dilger [ 11/Mar/21 ]

I think there is still more to be done here.

Comment by Andreas Dilger [ 26/Nov/21 ]

I've gone through and used strikeout on the messages that have already been fixed. There are still a few that could be done before closing the ticket.

Generated at Sat Feb 10 03:09:41 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.