[LU-11205] Failure to clear the changelog for user 1 on MDT Created: 03/Aug/18 Updated: 16/Apr/20 Resolved: 21/Sep/19 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.0, Lustre 2.10.4, Lustre 2.10.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Stephane Thiell | Assignee: | Mikhail Pershin |
| Resolution: | Duplicate | Votes: | 3 |
| Labels: | ORNL |
| Environment: |
CentOS 7.4 (3.10.0-693.2.2.el7_lustre.pl1.x86_64) |
||
| Attachments: |
|
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
Hello,
We're seeing the following messages on Oak's MDT in 2.10.4:
Aug 03 09:21:39 oak-md1-s2 kernel: Lustre: 11137:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 03 09:31:38 oak-md1-s2 kernel: Lustre: 11271:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Robinhood (also running 2.10.4) shows this:
2018/08/03 10:00:47 [13766/22] ChangeLog | ERROR: llapi_changelog_clear("oak-MDT0000", "cl1", 13975842301) returned -22
2018/08/03 10:00:47 [13766/22] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
2018/08/03 10:00:47 [13766/16] llapi | cannot purge records for 'cl1'
2018/08/03 10:00:47 [13766/16] ChangeLog | ERROR: llapi_changelog_clear("oak-MDT0000", "cl1", 13975842303) returned -22
2018/08/03 10:00:47 [13766/16] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
2018/08/03 10:00:47 [13766/4] llapi | cannot purge records for 'cl1'
2018/08/03 10:00:47 [13766/4] ChangeLog | ERROR: llapi_changelog_clear("oak-MDT0000", "cl1", 13975842304) returned -22
2018/08/03 10:00:47 [13766/4] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
Oak's MDT usage is as follows:
[root@oak-md1-s2 ~]# df -h -t lustre
Filesystem                 Size  Used Avail Use% Mounted on
/dev/mapper/md1-rbod1-mdt0 1.3T  131G 1022G  12% /mnt/oak/mdt/0
[root@oak-md1-s2 ~]# df -i -t lustre
Filesystem                   Inodes     IUsed     IFree IUse% Mounted on
/dev/mapper/md1-rbod1-mdt0 873332736 266515673 606817063   31% /mnt/oak/mdt/0
I'm concerned that the MDT might fill up with changelogs. Could you please assist in troubleshooting this issue? |
| Comments |
| Comment by Peter Jones [ 03/Aug/18 ] |
|
John, could you please investigate? Thanks. Peter |
| Comment by Stephane Thiell [ 16/Aug/18 ] |
|
Quick update. I first thought that the errors were gone, but actually no...
Aug 13 12:07:03 oak-md1-s2 kernel: Lustre: 11259:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 13 14:02:28 oak-md1-s2 kernel: Lustre: 11167:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 13 16:41:14 oak-md1-s2 kernel: Lustre: 11281:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 13 17:08:53 oak-md1-s2 kernel: Lustre: 11299:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 10:04:01 oak-md1-s2 kernel: Lustre: 10793:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 10:09:03 oak-md1-s2 kernel: Lustre: 11167:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 11:23:46 oak-md1-s2 kernel: Lustre: 93611:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 12:08:46 oak-md1-s2 kernel: Lustre: 11280:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 12:26:47 oak-md1-s2 kernel: Lustre: 11289:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 12:39:25 oak-md1-s2 kernel: Lustre: 93614:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 13:02:00 oak-md1-s2 kernel: Lustre: 11178:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 13:07:09 oak-md1-s2 kernel: Lustre: 11178:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 13:12:12 oak-md1-s2 kernel: Lustre: 11181:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 14:25:42 oak-md1-s2 kernel: Lustre: 11127:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 14:25:44 oak-md1-s2 kernel: Lustre: 11127:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 14:25:47 oak-md1-s2 kernel: Lustre: 11257:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 15:47:17 oak-md1-s2 kernel: Lustre: 61745:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 15:47:18 oak-md1-s2 kernel: Lustre: 11181:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 15:47:18 oak-md1-s2 kernel: Lustre: 11181:0:(mdd_device.c:1577:mdd_changelog_clear()) Skipped 32 previous similar messages
Aug 14 15:47:19 oak-md1-s2 kernel: Lustre: 11283:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 15:47:19 oak-md1-s2 kernel: Lustre: 11283:0:(mdd_device.c:1577:mdd_changelog_clear()) Skipped 28 previous similar messages
Aug 14 15:47:21 oak-md1-s2 kernel: Lustre: 61745:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 15:47:21 oak-md1-s2 kernel: Lustre: 61745:0:(mdd_device.c:1577:mdd_changelog_clear()) Skipped 74 previous similar messages
Aug 14 16:58:14 oak-md1-s2 kernel: Lustre: 11178:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 16:58:14 oak-md1-s2 kernel: Lustre: 11178:0:(mdd_device.c:1577:mdd_changelog_clear()) Skipped 37 previous similar messages
Aug 14 23:30:07 oak-md1-s2 kernel: Lustre: 11167:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 14 23:51:21 oak-md1-s2 kernel: Lustre: 11127:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 15 10:10:08 oak-md1-s2 kernel: Lustre: 11169:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 15 10:34:25 oak-md1-s2 kernel: Lustre: 11142:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 15 10:34:25 oak-md1-s2 kernel: Lustre: 11142:0:(mdd_device.c:1577:mdd_changelog_clear()) Skipped 1 previous similar message
Robinhood client:
Aug 08 14:55:20 oak-rbh01 kernel: LustreError: 210395:0:(mdc_changelog.c:215:chlg_load()) oak-MDT0000-mdc-ffff883163481800: fail to open changelog catalog: rc = -108
Aug 08 14:55:20 oak-rbh01 kernel: LustreError: 210395:0:(mdc_changelog.c:215:chlg_load()) Skipped 1 previous similar message
Aug 08 14:55:21 oak-rbh01 kernel: Lustre: oak-MDT0000-mdc-ffff883163481800: Connection restored to 10.0.2.52@o2ib5 (at 10.0.2.52@o2ib5)
Aug 12 21:10:06 oak-rbh01 kernel: LustreError: 265170:0:(mdc_changelog.c:228:chlg_load()) oak-MDT0000-mdc-ffff883163481800: fail to process llog: rc = -5
Aug 14 22:05:59 oak-rbh01 kernel: LustreError: 314238:0:(mdc_changelog.c:228:chlg_load()) oak-MDT0000-mdc-ffff883163481800: fail to process llog: rc = -5
Aug 15 10:09:20 oak-rbh01 kernel: LustreError: 134695:0:(mdc_changelog.c:228:chlg_load()) oak-MDT0000-mdc-ffff883163481800: fail to process llog: rc = -5
Aug 15 13:29:35 oak-rbh01 kernel: LustreError: 193634:0:(mdc_changelog.c:228:chlg_load()) oak-MDT0000-mdc-ffff883163481800: fail to process llog: rc = -5 |
| Comment by John Hammond [ 22/Aug/18 ] |
|
Hi Stephane, Could you post the output of lctl get_param 'mdd.*.changelog*' on the MDS? |
| Comment by Stephane Thiell [ 27/Aug/18 ] |
|
Hi John,
[root@oak-md1-s2 ~]# lctl get_param 'mdd.*.changelog*'
mdd.oak-MDT0000.changelog_mask=
CREAT MKDIR HLINK SLINK MKNOD UNLNK RMDIR RENME RNMTO OPEN CLOSE LYOUT TRUNC SATTR HSM MTIME CTIME MIGRT
mdd.oak-MDT0000.changelog_size=5153840
mdd.oak-MDT0000.changelog_users=
current index: 14644158111
ID    index
cl1   14644156700
The frequency of the errors is varying but doesn't look that bad today (Aug 27) as there has been only a single error reported so far:
Aug 26 09:06:07 oak-md1-s2 kernel: Lustre: 11184:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 26 09:08:30 oak-md1-s2 kernel: Lustre: 11159:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 26 09:11:28 oak-md1-s2 kernel: Lustre: 11130:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 26 09:38:51 oak-md1-s2 kernel: Lustre: 11154:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 26 09:49:44 oak-md1-s2 kernel: Lustre: 11192:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 26 09:56:04 oak-md1-s2 kernel: Lustre: 11281:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 26 09:57:11 oak-md1-s2 kernel: Lustre: 11281:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 26 11:32:10 oak-md1-s2 kernel: Lustre: 11184:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Aug 27 10:37:06 oak-md1-s2 kernel: Lustre: 11274:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Let me know what you think... Thanks! |
| Comment by Kim Sebo [ 05/Nov/18 ] |
|
We're also seeing this problem at NCI. Filesystem is Lustre 2.10.4, DNE, MDTs are ZFS, MDT usage is ~33% full.
Nov 5 13:52:20 g1b-mds02 kernel: Lustre: 70004:0:(mdd_device.c:1577:mdd_changelog_clear()) gdata1b-MDD0001: Failure to clear the changelog for user 1: -22
Nov 5 13:53:59 g1b-mds02 kernel: Lustre: 70006:0:(mdd_device.c:1577:mdd_changelog_clear()) gdata1b-MDD0001: Failure to clear the changelog for user 1: -22
Nov 5 13:54:40 g1b-mds02 kernel: Lustre: 31712:0:(mdd_device.c:1577:mdd_changelog_clear()) gdata1b-MDD0001: Failure to clear the changelog for user 1: -22
robinhood log:
2018/11/05 13:52:20 [3729/3] llapi | cannot purge records for 'cl1'
2018/11/05 13:52:20 [3729/3] ChangeLog | ERROR: llapi_changelog_clear("gdata1b-MDT0001", "cl1", 606866988) returned -22
2018/11/05 13:52:20 [3729/3] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
mds changelog params:
mds02 ~]# lctl get_param 'mdd.*.changelog*'
mdd.gdata1b-MDT0001.changelog_mask=
MARK CREAT MKDIR HLINK SLINK MKNOD UNLNK RMDIR RENME RNMTO OPEN CLOSE LYOUT TRUNC SATTR XATTR HSM MTIME CTIME MIGRT
mdd.gdata1b-MDT0001.changelog_size=157118864
mdd.gdata1b-MDT0001.changelog_users=
current index: 624530992
ID    index
cl1   623048533
|
| Comment by Shawn Hall (Inactive) [ 08/Nov/18 ] |
|
We're seeing the same on 4 different file systems that are running 2.10.5. Robinhood reports errors about clearing the changelogs and the MDS reports the same errors as above. We do have Robinhood scans running simultaneously, so it's possible the scan load is a contributing factor. Interestingly, we are able to clear the logs by hand:
lfs changelog_clear fsname-MDT0000 cl1 0
This is probably a red herring, but I do find it odd that the Lustre MDS logs all refer to fsname-MDD0000 vs. all other references to fsname-MDT0000. Hopefully this is just a difference in nomenclature that doesn't result in code issues. |
| Comment by John Hammond [ 12/Nov/18 ] |
|
Shawn, could you enable trace and ioctl (lctl set_param debug="+trace ioctl") and capture the Lustre debug logs from the MDT when this occurs? |
| Comment by John Hammond [ 12/Nov/18 ] |
|
It's possible that this is due to |
| Comment by Peter Jones [ 02/Dec/18 ] |
|
Mike, what do you advise with this one? Peter |
| Comment by Stephane Thiell [ 15/Jan/19 ] |
|
Note: MDS is now running Lustre 2.10.6. This has started to impact us again and it looks like, due to this issue, Robinhood cannot keep up anymore:
Jan 15 08:05:11 oak-md1-s2 kernel: Lustre: 3889:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 08:15:12 oak-md1-s2 kernel: Lustre: 3851:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 08:25:33 oak-md1-s2 kernel: Lustre: 3884:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 08:35:51 oak-md1-s2 kernel: Lustre: 3859:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 08:45:56 oak-md1-s2 kernel: Lustre: 3855:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 08:56:11 oak-md1-s2 kernel: Lustre: 3927:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 09:06:36 oak-md1-s2 kernel: Lustre: 3887:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 09:16:41 oak-md1-s2 kernel: Lustre: 15913:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 09:26:51 oak-md1-s2 kernel: Lustre: 3855:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 09:37:02 oak-md1-s2 kernel: Lustre: 16626:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 09:52:15 oak-md1-s2 kernel: Lustre: 3887:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 10:02:49 oak-md1-s2 kernel: Lustre: 3771:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 10:13:38 oak-md1-s2 kernel: Lustre: 13466:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 10:26:10 oak-md1-s2 kernel: Lustre: 3886:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 10:36:24 oak-md1-s2 kernel: Lustre: 15911:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 10:48:08 oak-md1-s2 kernel: Lustre: 3884:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 10:58:12 oak-md1-s2 kernel: Lustre: 3853:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 11:08:40 oak-md1-s2 kernel: Lustre: 3851:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 11:20:10 oak-md1-s2 kernel: Lustre: 3887:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
Jan 15 11:31:07 oak-md1-s2 kernel: Lustre: 3900:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22
[root@oak-md1-s2 ~]# lctl get_param 'mdd.*.changelog*'
mdd.oak-MDT0000.changelog_mask=
CREAT MKDIR HLINK SLINK MKNOD UNLNK RMDIR RENME RNMTO OPEN CLOSE LYOUT TRUNC SATTR HSM MTIME CTIME MIGRT
mdd.oak-MDT0000.changelog_size=147587726208
mdd.oak-MDT0000.changelog_users=
current index: 6044589790
ID    index
cl1   4832349718
Robinhood logs:
2019/01/15 11:40:04 [36769/16] ChangeLog | ERROR: llapi_changelog_clear("oak-MDT0000", "cl1", 4832305700) returned -22
2019/01/15 11:40:04 [36769/16] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
2019/01/15 11:40:04 [36769/22] llapi | cannot purge records for 'cl1'
2019/01/15 11:40:04 [36769/22] ChangeLog | ERROR: llapi_changelog_clear("oak-MDT0000", "cl1", 4832305701) returned -22
2019/01/15 11:40:04 [36769/22] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
2019/01/15 11:40:04 [36769/22] llapi | cannot purge records for 'cl1'
2019/01/15 11:40:04 [36769/22] ChangeLog | ERROR: llapi_changelog_clear("oak-MDT0000", "cl1", 4832305702) returned -22
2019/01/15 11:40:04 [36769/22] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
2019/01/15 11:40:04 [36769/22] llapi | cannot purge records for 'cl1'
2019/01/15 11:40:04 [36769/22] ChangeLog | ERROR: llapi_changelog_clear("oak-MDT0000", "cl1", 4832305703) returned -22
2019/01/15 11:40:04 [36769/22] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
2019/01/15 11:40:04 [36769/21] llapi | cannot purge records for 'cl1'
2019/01/15 11:40:04 [36769/21] ChangeLog | ERROR: llapi_changelog_clear("oak-MDT0000", "cl1", 4832305705) returned -22
|
| Comment by James A Simmons [ 15/Jan/19 ] |
|
Sigh. So the special changelog character device is not scaling as well |
| Comment by Quentin Bouget [ 16/Jan/19 ] |
|
IMO this is not an issue with the changelog character device. Looking at the code, enabling ioctl debug as John advised would really help:
static int mdd_changelog_clear(...)
{
...
/* cur_endrec is the oldest purgeable record, make sure we're newer */
if (rec->cur_endrec > mcuc->mcuc_endrec) {
CDEBUG(D_IOCTL, "Request %llu out of range: %llu\n",
mcuc->mcuc_endrec, rec->cur_endrec);
RETURN(-EINVAL);
}
...
}
At least that way we can see what lustre thinks is happening. |
| Comment by Stephane Thiell [ 17/Jan/19 ] |
|
Hello Quentin! Thanks for the feedback. I tried that, but something is weird... FYI oak-md1-s2 is running oak-MDT0000 on Lustre 2.10.6 and I set the requested debug flags:
[root@oak-md1-s2 tmp]# lctl get_param debug
debug=trace ioctl warning error emerg console
Now, an occurrence is seen on the Robinhood server:
2019/01/17 09:45:35 [36769/13] ChangeLog | ERROR: llapi_changelog_clear("oak-MDT0000", "cl1", 5442377766) returned -22
The corresponding epoch timestamp is 1547747135. Around this time on the MDS there are almost no logs for several seconds, just some watchdog.c:120:lcw_cb() messages, as if logging were somewhat blocked at that time...
00000400:00000001:1.1:1547747072.834205:0:0:0:(watchdog.c:123:lcw_cb()) Process leaving
00000400:00000001:0.1F:1547747137.346202:0:0:0:(watchdog.c:120:lcw_cb()) Process entered
No logs from 1547747072 to 1547747137? Something's wrong. dk file attached as dk.1547747365.gz
Another occurrence:
2019/01/17 09:49:54 [36769/14] ChangeLog | ERROR: llapi_changelog_clear("oak-MDT0000", "cl1", 5443652339) returned -22
That's epoch timestamp 1547747394. On the MDS, there are some "missing" logs at that time:
00000400:00000001:19.1:1547747376.450231:0:0:0:(watchdog.c:123:lcw_cb()) Process leaving
00000400:00000001:14.1F:1547747537.730185:0:0:0:(watchdog.c:120:lcw_cb()) Process entered
No logs from 1547747376 to 1547747537? dk file attached as dk.1547747668.gz
The rest of the time, debug_kernel logging seems OK and we get several messages per second. Am I missing something obvious? Could this be related to this issue? Thanks!! Stephane
|
| Comment by Stephane Thiell [ 17/Jan/19 ] |
|
Some additional info: I have never seen the "out of range" CDEBUG message from mdd_changelog_clear(), but as you can see in the attached dk files, there are a lot of calls to llog_changelog_cancel_cb() ending with:
00000004:00000001:18.0:1547747364.771440:0:3358:0:(mdd_device.c:230:llog_changelog_cancel_cb()) Process leaving (rc=1 : 1 : 1)
00000004:00000001:18.0:1547747364.771443:0:3358:0:(mdd_device.c:265:llog_changelog_cancel()) Process leaving (rc=0 : 0 : 0)
00000004:00000001:18.0:1547747364.771444:0:3358:0:(mdd_device.c:1595:mdd_changelog_clear()) Process leaving
However, the timestamps do not match the errors on the Robinhood client. |
| Comment by Quentin Bouget [ 18/Jan/19 ] |
|
Hello Stephane, I don't think the llog_changelog_cancel_cb() entries ending with "rc=1" are faulty ones. I read a bit of code in obdclass/llog.c and it might just be that "1" indicates a record was deleted (or something like that). Did you notice how there is no "Process entered" logging entry for mdd_changelog_clear() in your dk files? Only "Process leaving"... Also, the fact that there is no return code associated with the message means the function went through the EXIT macro, which is not in the code path that prints the initial error message you reported: Failure to clear the changelog for user 1: -22. Other than that, the missing logs part sure is scary! I haven't been able to figure out an explanation for it... yet. |
| Comment by Stephane Thiell [ 18/Jan/19 ] |
|
Hey Quentin, after several unsuccessful attempts because each time some debug logs were missing, I was finally able to catch an occurrence (just now) of Failure to clear the changelog for user 1: -22 when debug enabled on the MDS in production. The result is: 00000004:00000001:12.0:1547828517.194668:0:4204:0:(mdd_device.c:1543:mdd_changelog_clear()) Process entered 00000004:00000080:12.0:1547828517.194668:0:4204:0:(mdd_device.c:1546:mdd_changelog_clear()) oak-MDD0000: Purge request: id=1, endrec=5796171010 00000040:00000001:12.0:1547828517.194669:0:4204:0:(llog_cat.c:887:llog_cat_process_or_fork()) Process entered 00000040:00000001:12.0:1547828517.194670:0:4204:0:(llog.c:717:llog_process_or_fork()) Process entered 00000040:00000001:12.0:1547828517.194670:0:4204:0:(llog.c:430:llog_process_thread()) Process entered 00000040:00000001:12.0:1547828517.194672:0:4204:0:(lustre_log.h:454:llog_next_block()) Process entered 00000040:00000001:12.0:1547828517.194672:0:4204:0:(llog_osd.c:853:llog_osd_next_block()) Process entered 00000040:00000001:12.0:1547828517.194674:0:4204:0:(llog_osd.c:987:llog_osd_next_block()) Process leaving via out (rc=0 : 0 : 0x0) 00000040:00000001:12.0:1547828517.194674:0:4204:0:(lustre_log.h:464:llog_next_block()) Process leaving (rc=0 : 0 : 0) 00000040:00000001:12.0:1547828517.194675:0:4204:0:(llog_cat.c:811:llog_cat_process_cb()) Process entered 00000040:00000001:12.0:1547828517.194675:0:4204:0:(llog_cat.c:237:llog_cat_id2handle()) Process entered 00000040:00000001:12.0:1547828517.194676:0:4204:0:(llog_cat.c:259:llog_cat_id2handle()) Process leaving via out (rc=0 : 0 : 0x0) 00000040:00000001:12.0:1547828517.194677:0:4204:0:(llog.c:717:llog_process_or_fork()) Process entered 00000040:00000001:12.0:1547828517.194677:0:4204:0:(llog.c:430:llog_process_thread()) Process entered 00000040:00000001:12.0:1547828517.194679:0:4204:0:(lustre_log.h:454:llog_next_block()) Process entered 00000040:00000001:12.0:1547828517.194679:0:4204:0:(llog_osd.c:853:llog_osd_next_block()) Process entered 00000040:00000001:12.0:1547828517.194680:0:4204:0:(llog_osd.c:987:llog_osd_next_block()) Process leaving via out (rc=0 : 0 : 0x0) 00000040:00000001:12.0:1547828517.194680:0:4204:0:(lustre_log.h:464:llog_next_block()) Process leaving (rc=0 : 0 : 0) 00000004:00000001:12.0:1547828517.194681:0:4204:0:(mdd_device.c:1491:mdd_changelog_clear_cb()) Process entered 00000004:00000080:12.0:1547828517.194681:0:4204:0:(mdd_device.c:1508:mdd_changelog_clear_cb()) Request 5796171010 out of range: 5796171012 00000004:00000001:12.0:1547828517.194682:0:4204:0:(mdd_device.c:1509:mdd_changelog_clear_cb()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea) 00000040:00000001:12.0:1547828517.194682:0:4204:0:(llog.c:614:llog_process_thread()) Process leaving via out (rc=18446744073709551594 : -22 : 0xffffffffffffffea) 00000040:00000001:12.0:1547828517.194683:0:4204:0:(llog.c:757:llog_process_or_fork()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea) 00000040:00000001:12.0:1547828517.194684:0:4204:0:(llog_cat.c:876:llog_cat_process_cb()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea) 00000040:00000001:12.0:1547828517.194684:0:4204:0:(llog.c:614:llog_process_thread()) Process leaving via out (rc=18446744073709551594 : -22 : 0xffffffffffffffea) 00000040:00000001:12.0:1547828517.194685:0:4204:0:(llog.c:757:llog_process_or_fork()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea) 
00000040:00000001:12.0:1547828517.194685:0:4204:0:(llog_cat.c:919:llog_cat_process_or_fork()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea) 00000004:00000400:12.0:1547828517.194686:0:4204:0:(mdd_device.c:1577:mdd_changelog_clear()) oak-MDD0000: Failure to clear the changelog for user 1: -22 00000004:00000001:12.0:1547828517.194686:0:4204:0:(mdd_device.c:1595:mdd_changelog_clear()) Process leaving 00000004:00000001:12.0:1547828517.194687:0:4204:0:(mdd_device.c:1630:mdd_iocontrol()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea) 00000004:00000001:12.0:1547828517.194690:0:4204:0:(mdt_handler.c:6245:mdt_iocontrol()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea) 00000004:00000001:12.0:1547828517.194690:0:4204:0:(mdt_handler.c:1846:mdt_set_info()) Process leaving (rc=18446744073709551594 : -22 : ffffffffffffffea) Note that there is an out of range error indeed! Full 1-min long log attached as dk.1547828521.gz if needed. Thanks! |
| Comment by Patrick Farrell (Inactive) [ 18/Jan/19 ] |
|
By the way, about the missing DK logs: This happens occasionally, I've seen it due to two issues: 2. debug_mb is divided into separate per-CPU buffers, and so if you have a lot of CPUs, it's easy for individual debug buffers to get pretty small and then be overrun. (For example, a KNL system with ~550 MB debug_mb would have approximately 2 MB of buffer for each CPU.) |
| Comment by Stephane Thiell [ 26/Jan/19 ] |
|
(Thanks Patrick for the info, that makes sense!) Regarding our current changelog clear errors, I just got the same errors on a brand new Lustre 2.12.0 system with DNE. I can see the error on all MDTs:
[687488.926705] Lustre: 129258:0:(mdd_device.c:1794:mdd_changelog_clear()) Skipped 1 previous similar message
[687626.354169] Lustre: 127970:0:(mdd_device.c:1794:mdd_changelog_clear()) fir-MDD0003: Failure to clear the changelog for user 1: -22
[687659.957219] Lustre: 127247:0:(mdd_device.c:1794:mdd_changelog_clear()) fir-MDD0003: Failure to clear the changelog for user 1: -22
[687817.192283] Lustre: 126623:0:(mdd_device.c:1794:mdd_changelog_clear()) fir-MDD0001: Failure to clear the changelog for user 1: -22
[688238.001376] Lustre: 127247:0:(mdd_device.c:1794:mdd_changelog_clear()) fir-MDD0003: Failure to clear the changelog for user 1: -22
client:
2019/01/26 08:21:32 [80692/10] ChangeLog | ERROR: llapi_changelog_clear("fir-MDT0000", "cl1", 29530963) returned -22
2019/01/26 08:35:17 [80692/11] ChangeLog | ERROR: llapi_changelog_clear("fir-MDT0003", "cl1", 69223505) returned -22
2019/01/26 08:48:24 [80692/12] ChangeLog | ERROR: llapi_changelog_clear("fir-MDT0002", "cl1", 70347408) returned -22
2019/01/26 09:03:59 [80692/17] ChangeLog | ERROR: llapi_changelog_clear("fir-MDT0002", "cl1", 71433250) returned -22
2019/01/26 09:03:59 [80692/16] ChangeLog | ERROR: llapi_changelog_clear("fir-MDT0002", "cl1", 71433251) returned -22
2019/01/26 09:04:17 [80692/9] ChangeLog | ERROR: llapi_changelog_clear("fir-MDT0002", "cl1", 71453756) returned -22
2019/01/26 09:04:17 [80692/4] ChangeLog | ERROR: llapi_changelog_clear("fir-MDT0002", "cl1", 71453757) returned -22
2019/01/26 09:09:31 [80692/17] ChangeLog | ERROR: llapi_changelog_clear("fir-MDT0001", "cl1", 81715534) returned -22
2019/01/26 09:12:49 [80692/12] ChangeLog | ERROR: llapi_changelog_clear("fir-MDT0000", "cl1", 30202391) returned -22
|
| Comment by Quentin Bouget [ 28/Jan/19 ] |
|
I just realised I read the logs wrong! I thought lustre was refusing to clear a record "in the future", but it is actually refusing to clear an "already cleared" one. Is it possible there are several processes feeding from the same changelog reader? Clearing records in parallel, out of order? |
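For illustration, a standalone model of that hypothesis (plain C with pthreads, no Lustre code; the record numbers and thread count are made up, and the check mirrors the mdd_changelog_clear() excerpt quoted earlier):
/*
 * Toy model of the suspected race: two "clearers" send endrec values
 * that may be applied out of order, and the "MDD" rejects any endrec
 * older than cur_endrec with -EINVAL, just like the excerpt above.
 */
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long long cur_endrec;            /* oldest purgeable record */

static int model_changelog_clear(long long endrec)
{
	int rc = 0;

	pthread_mutex_lock(&lock);
	if (cur_endrec > endrec)        /* already cleared past endrec */
		rc = -EINVAL;
	else
		cur_endrec = endrec;
	pthread_mutex_unlock(&lock);
	return rc;
}

static void *clearer(void *arg)
{
	long long endrec = *(long long *)arg;

	printf("clear to %lld: rc = %d\n", endrec, model_changelog_clear(endrec));
	return NULL;
}

int main(void)
{
	/* Two acknowledgements issued back to back; if the later one
	 * (102) happens to be applied first, the earlier one (101)
	 * gets -EINVAL, i.e. the -22 seen by robinhood. */
	long long first = 101, second = 102;
	pthread_t t1, t2;

	pthread_create(&t1, NULL, clearer, &second);
	pthread_create(&t2, NULL, clearer, &first);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return 0;
}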
| Comment by Stephane Thiell [ 05/Feb/19 ] |
|
Thanks, this is useful and made me think of |
| Comment by Scott Nolin [ 12/Mar/19 ] |
|
So this also affects 2.12 - can we get that added to the affects/versions?
We are in a bind on a couple of filesystems now, and were contemplating updating, but there is no point.
Scott |
| Comment by Scott Nolin [ 15/Mar/19 ] |
|
To be complete - I checked our version. This definitely affects 2.10.6 too. |
| Comment by James A Simmons [ 28/Mar/19 ] |
|
Note this is taking out our production file system at ORNL! Anything to try yet? |
| Comment by Mikhail Pershin [ 28/Mar/19 ] |
|
Is it possible to collect and attach the changelog_users llog itself? If the bottom fs is ldiskfs, it can be mounted as ldiskfs on some mountpoint, and then there is 'changelog_users' in the root directory. Run llog_reader on it:
# llog_reader <mountpoint>/changelog_users
rec #1 type=1064553b len=64 offset 8192
Header size : 8192
Time : Thu Mar 28 05:47:15 2019
Number of records: 1
Target uuid :
-----------------------
#01 (064)id=[0x6:0x1:0x0]:0 path=O/1/d6/6
Please provide the output from that llog_reader. Note that it is the llog catalog; the real records are located in the paths of the catalog records, in the example above 'O/1/d6/6'. Please attach the file from the path of the latest record (if there are many) for analysis. You can run llog_reader on it as well:
# llog_reader <mountpoint>/O/1/d6/6
rec #1 type=10670000 len=40 offset 8192
rec #2 type=10670000 len=40 offset 8232
rec #3 type=10670000 len=40 offset 8272
Header size : 8192
Time : Thu Mar 28 05:50:06 2019
Number of records: 3
Target uuid :
-----------------------
#01 (040)changelog_user record id:0x0
#02 (040)changelog_user record id:0x0
#03 (040)changelog_user record id:0x0
It would be very useful to get these llogs when the problem occurs, so we can check what is in the llog at that point. |
| Comment by Mikhail Pershin [ 28/Mar/19 ] |
|
To clear all changelogs, try the following: |
| Comment by Philip B Curtis [ 28/Mar/19 ] |
|
Mikhail, The filesystem that James is referencing is in production at the moment with limited downtime availability to try the above recommendations. We do have a test system that is a smaller scale to the production one, but without the scale we have been unable to reproduce the issue there. In an effort to stop this from happening, we attempted to de-register the changelog user on all 4 of the MDSs. The consumer in this instance is intended to be robinhood but we have not been running it until we can try to sort this out. The lustre version is 2.11 running on rhel7.6 servers and the MDTs and OSTs are zfs. -Philip |
| Comment by Mikhail Pershin [ 28/Mar/19 ] |
|
Philip, it doesn't actually require downtime and can be done on a running system; if you have access to the server fs you can mount it locally as ldiskfs or ZFS and read the local data. For ZFS that is zfs mount <dataset>, I think.
|
| Comment by Philip B Curtis [ 28/Mar/19 ] |
|
We were able to produce some of these errors on the TDS while hammering the FS with mdtest. We were getting messages like those described:
2019/03/28 10:24:53 robinhood@f2-util01[373768/6] ChangeLog | ERROR: llapi_changelog_clear("f2-tds-MDT0000", "cl1", 5425145) returned -22
2019/03/28 10:24:53 robinhood@f2-util01[373768/6] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
2019/03/28 10:24:53 robinhood@f2-util01[373768/6] llapi | cannot purge records for 'cl1'
2019/03/28 10:24:53 robinhood@f2-util01[373768/6] ChangeLog | ERROR: llapi_changelog_clear("f2-tds-MDT0000", "cl1", 5425146) returned -22
2019/03/28 10:24:53 robinhood@f2-util01[373768/6] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
2019/03/28 10:26:47 robinhood@f2-util01[373768/9] llapi | cannot purge records for 'cl1'
2019/03/28 10:26:47 robinhood@f2-util01[373768/9] ChangeLog | ERROR: llapi_changelog_clear("f2-tds-MDT0000", "cl1", 5470278) returned -22
2019/03/28 10:26:47 robinhood@f2-util01[373768/9] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
2019/03/28 10:26:47 robinhood@f2-util01[373768/9] llapi | cannot purge records for 'cl1'
2019/03/28 10:26:47 robinhood@f2-util01[373768/9] ChangeLog | ERROR: llapi_changelog_clear("f2-tds-MDT0000", "cl1", 5470276) returned -22
2019/03/28 10:26:47 robinhood@f2-util01[373768/9] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
2019/03/28 10:26:47 robinhood@f2-util01[373768/9] llapi | cannot purge records for 'cl1'
2019/03/28 10:26:47 robinhood@f2-util01[373768/9] ChangeLog | ERROR: llapi_changelog_clear("f2-tds-MDT0000", "cl1", 5470277) returned -22
2019/03/28 10:26:47 robinhood@f2-util01[373768/9] EntryProc | Error -22 performing callback at stage STAGE_CHGLOG_CLR.
I waited around for some of these to stack up a bit and then mounted the MDT. This is all that is showing in changelog_users, which seems correct as it was mostly caught up when I halted things.
[root@f2-tds-mds1 10:39:00][mdt0]# llog_reader changelog_users
rec #1 type=1064553b len=64 offset 8192
Header size : 8192
Time : Tue Mar 26 11:58:06 2019
Number of records: 1
Target uuid :
-----------------------
#01 (064)id=[0x4a:0x1:0x0]:0 path=oi.1/0x1:0x4a:0x0
I did grab the records referenced just in case it had something interesting.
[root@f2-tds-mds1 10:44:12][mdt0]# llog_reader oi.1/0x1\:0x4a\:0x0
rec #1 type=10670000 len=40 offset 8192
Header size : 8192
Time : Thu Mar 28 08:48:52 2019
Number of records: 1
Target uuid :
-----------------------
#01 (040)changelog_user record id:0x0
And just for completeness, I grabbed what was is in the changelog_catalog
[root@f2-tds-mds1 10:47:51][mdt0]# llog_reader changelog_catalog
Bit 1 of 6 not set
Bit 2 of 6 not set
Bit 3 of 6 not set
Bit 4 of 6 not set
Bit 5 of 6 not set
Bit 6 of 6 not set
Bit 7 of 6 not set
Bit 8 of 6 not set
Bit 9 of 6 not set
Bit 10 of 6 not set
Bit 11 of 6 not set
Bit 12 of 6 not set
Bit 13 of 6 not set
Bit 14 of 6 not set
Bit 15 of 6 not set
Bit 16 of 6 not set
Bit 17 of 6 not set
Bit 18 of 6 not set
Bit 19 of 6 not set
Bit 20 of 6 not set
Bit 21 of 6 not set
Bit 22 of 6 not set
Bit 23 of 6 not set
Bit 24 of 6 not set
Bit 25 of 6 not set
Bit 26 of 6 not set
Bit 27 of 6 not set
Bit 28 of 6 not set
Bit 29 of 6 not set
Bit 30 of 6 not set
Bit 31 of 6 not set
Bit 32 of 6 not set
Bit 33 of 6 not set
Bit 34 of 6 not set
Bit 35 of 6 not set
Bit 36 of 6 not set
Bit 37 of 6 not set
Bit 38 of 6 not set
Bit 39 of 6 not set
Bit 40 of 6 not set
Bit 41 of 6 not set
Bit 42 of 6 not set
Bit 43 of 6 not set
Bit 44 of 6 not set
Bit 45 of 6 not set
Bit 46 of 6 not set
Bit 47 of 6 not set
Bit 48 of 6 not set
Bit 49 of 6 not set
Bit 50 of 6 not set
Bit 51 of 6 not set
Bit 52 of 6 not set
Bit 53 of 6 not set
Bit 54 of 6 not set
Bit 55 of 6 not set
Bit 56 of 6 not set
Bit 57 of 6 not set
Bit 58 of 6 not set
Bit 59 of 6 not set
Bit 60 of 6 not set
Bit 61 of 6 not set
Bit 62 of 6 not set
Bit 63 of 6 not set
Bit 64 of 6 not set
Bit 65 of 6 not set
Bit 66 of 6 not set
Bit 67 of 6 not set
Bit 68 of 6 not set
Bit 69 of 6 not set
Bit 70 of 6 not set
Bit 71 of 6 not set
Bit 72 of 6 not set
Bit 73 of 6 not set
Bit 74 of 6 not set
Bit 75 of 6 not set
Bit 76 of 6 not set
Bit 77 of 6 not set
Bit 78 of 6 not set
Bit 79 of 6 not set
Bit 80 of 6 not set
Bit 81 of 6 not set
Bit 82 of 6 not set
Bit 83 of 6 not set
Bit 84 of 6 not set
Bit 85 of 6 not set
Bit 86 of 6 not set
Bit 87 of 6 not set
Bit 88 of 6 not set
rec #89 type=1064553b len=64 offset 13824
rec #90 type=1064553b len=64 offset 13888
rec #91 type=1064553b len=64 offset 13952
rec #92 type=1064553b len=64 offset 14016
rec #93 type=1064553b len=64 offset 14080
rec #94 type=1064553b len=64 offset 14144
Header size : 8192
Time : Tue Mar 26 11:58:06 2019
Number of records: 6
Target uuid :
-----------------------
#89 (064)id=[0xaf:0x1:0x0]:0 path=oi.1/0x1:0xaf:0x0
#90 (064)id=[0xb0:0x1:0x0]:0 path=oi.1/0x1:0xb0:0x0
#91 (064)id=[0xb1:0x1:0x0]:0 path=oi.1/0x1:0xb1:0x0
#92 (064)id=[0xb2:0x1:0x0]:0 path=oi.1/0x1:0xb2:0x0
#93 (064)id=[0xb3:0x1:0x0]:0 path=oi.1/0x1:0xb3:0x0
#94 (064)id=[0xb4:0x1:0x0]:0 path=oi.1/0x1:0xb4:0x0
If helpful, I can also provide what is in the records referenced there. -Philip |
| Comment by Mikhail Pershin [ 28/Mar/19 ] |
|
Philip, that may be too late, but do you also have a Lustre debug log? The debug output from mdd_changelog_clear() is needed, similar to the one provided by Stephane above.
Make sure the 'ioctl' debug level is enabled on the server (check with lctl get_param debug) and set it if needed with lctl set_param debug=+ioctl |
| Comment by Mikhail Pershin [ 28/Mar/19 ] |
|
At the moment, can you also check the last changelog entries in path=oi.1/0x1:0xb4:0x0? |
| Comment by Philip B Curtis [ 28/Mar/19 ] |
|
It looks like the ioctl debug flag is already set. I've uploaded the dk output. Also attached is the output from the oi.1/0x1:0xb4:0x0 path.
-Philip |
| Comment by Mikhail Pershin [ 28/Mar/19 ] |
|
Philip, I've checked the debug log output; it is too late and the needed messages are not there. If you are able to catch the moment when the -22 error happens, please try to get the debug log again if possible |
| Comment by Philip B Curtis [ 28/Mar/19 ] |
|
No worries. I'll start up the traffic again to try to catch it when it happens. Report back soon. -Philip |
| Comment by Philip B Curtis [ 28/Mar/19 ] |
|
Grabbed the output as soon as I started seeing them. I have the FS stopped if you need me to grab anything else. -Philip |
| Comment by Mikhail Pershin [ 28/Mar/19 ] |
|
Philip, I have the needed log, thanks. Please check that llog:
I think it is oi.1/0x1:0xd8:0x0, though I am not sure about the exact path; you can check it at index 106 of changelog_catalog |
| Comment by Mikhail Pershin [ 28/Mar/19 ] |
|
In the context of this issue, does anyone see any other bad behaviour with changelogs besides these messages? Does it really cause growing changelogs? From what I see so far, this is just a valid race between requests to clear changelogs when they are re-ordered, possibly while in flight. So the first request clears more records, and then there can be a late request which wants to clear lower records that are already cleared. I'd say this is a valid case and it is just too loud saying 'Failure to clean llogs'. I think it would be OK to just exit with no error if such a situation happens. The changelog should still be cleared as expected and not grow; moreover, this error indicates that the llog is being cleared regularly and these are just concurrent attempts to clear it. |
| Comment by Scott Nolin [ 28/Mar/19 ] |
|
Yes, our changelog grows without bound.
Scott |
| Comment by Mikhail Pershin [ 29/Mar/19 ] |
|
Scott, could you provide the output of lctl get_param mdd.*.changelog* from the MDT? And just to be sure, are you using robinhood to manage changelogs? |
| Comment by Philip B Curtis [ 29/Mar/19 ] |
|
Mikhail, Here is the requested llog. |
| Comment by Philip B Curtis [ 29/Mar/19 ] |
|
If I am understanding things correctly, then I agree with Scott's statement, unless changelog_size does not actually indicate the size of the changelog. On our production filesystem, after changelog issues bit us in another way, we de-registered the changelog users. Poking around a bit more, it doesn't look like the changelog user was completely cleaned up: I attempted to register again to manually run the changelog_clear and then de-register again as a test, and it did not look like that user was cleaned up completely either. I could just as well be misunderstanding things, and if so please correct me. This is the output of lctl get_param mdd.*.changelog* from one of the production MDSs:
[root@f2-mds1 06:56:14][~]# lctl get_param mdd.*.changelog*
mdd.f2-MDT0000.changelog_deniednext=60
mdd.f2-MDT0000.changelog_gc=0
mdd.f2-MDT0000.changelog_mask=
MARK CREAT MKDIR HLINK SLINK MKNOD UNLNK RMDIR RENME RNMTO OPEN CLOSE LYOUT TRUNC SATTR XATTR HSM MTIME CTIME ATIME MIGRT FLRW RESYNC GXATR NOPEN
mdd.f2-MDT0000.changelog_max_idle_indexes=2097446912
mdd.f2-MDT0000.changelog_max_idle_time=2592000
mdd.f2-MDT0000.changelog_min_free_cat_entries=2
mdd.f2-MDT0000.changelog_min_gc_interval=3600
mdd.f2-MDT0000.changelog_size=10333280
mdd.f2-MDT0000.changelog_users=
current index: 677996307
ID    index (idle seconds)
I would be glad to assist with any testing you might need to try to get this to resolution. -Philip |
| Comment by Scott Nolin [ 29/Mar/19 ] |
|
I am away from systems on travel until next week. For us, we are absolutely positive the changelog was not being cleared. We definitely saw this while using Robinhood. I don't know if we have logs handy, as we had to stop using Robinhood on the affected filesystems because of this. We may have one we can test on; I will try to get information next week.
Scott
|
| Comment by Jesse Stroik [ 29/Mar/19 ] |
|
Mikhail, I work with Scott. We monitor the changelog sizes on all of our Lustre file systems via mdd.*.changelog_size. Here's the output of changelog_size for that MDT:
mdd.scratch-MDT0000.changelog_size=23117357872
This is a scratch file system for a cluster we're waiting to put into production. The policy engine is necessary for this use case. We could work around it for a period of time, and that's one of the plans we're considering right now. The other is to just recreate the file system with 2.10.3. Jesse
[edit] Log attached to this issue as lu-11205-ssec.log.gz |
| Comment by Mikhail Pershin [ 29/Mar/19 ] |
|
Jesse, I see, and how many changelog users do you have registered? |
| Comment by Jesse Stroik [ 29/Mar/19 ] |
|
One:
=============
lctl get_param mdd.scratch-MDT0000.changelog_users |
| Comment by Mikhail Pershin [ 29/Mar/19 ] |
|
Jesse, I see, but it was re-registered many times; probably that is the reason. I will do more investigation in that area |
| Comment by Jesse Stroik [ 29/Mar/19 ] |
|
I think deregistering and registering the new user was an attempt to debug this issue. It's been a few months so I don't recall for sure. I have another file system on 2.10.6 that I will test with. I haven't configured robinhood for it yet or registered a changelog user. I'll let you know the results later this afternoon. Let me know if there's any other information or debugging I can provide.
|
| Comment by Stephane Thiell [ 29/Mar/19 ] |
|
Changelogs in Lustre 2.10.3 are not usable (see |
| Comment by Jesse Stroik [ 29/Mar/19 ] |
|
Mikhail, Hold off on further investigation of our issue at SSEC. It looks like it's just a simple FSNAME-MDT conflict. The name-MDT combination of our new file system and the about-to-be-replaced file system is the same. I'll move the new file system to a different robinhood server and expect it'll be fine. It's that damn 8 character limit stifling our creativity. I think we reported this as affecting 2.10.6 - that may be incorrect. I can follow up next week after we see how the changelog processor works over the weekend. Stephane, We're processing the changelogs for two 2.10.3 file systems, and while we do see a few instances of llapi_changelog_clear returning -22 on those file systems per day, I don't see any corresponding connectivity issues. This particular policy engine server is under quite a bit of pressure as well. Jesse |
| Comment by Stephane Thiell [ 29/Mar/19 ] |
|
Jesse, in 2.10.3, if I remember correctly, there is a changelog garbage collector that is enabled by default and is broken due to an uninitialized variable, so if for any reason you don't consume your changelogs quickly enough and trigger the GC, your MDS is likely to crash. |
| Comment by Gerrit Updater [ 31/May/19 ] |
|
James Simmons (uja.ornl@yahoo.com) uploaded a new patch: https://review.whamcloud.com/35013 |
| Comment by James A Simmons [ 31/May/19 ] |
|
Created a reproducer. Locally I see the same errors as reported here. |
| Comment by James A Simmons [ 18/Jun/19 ] |
|
I sat down and investigated this issue. I can see the logs can be properly cleared if the right variables are used for indexing. For example:
lctl get_param -n mdd.lustre-MDT0000.changelog_users
current index: 98306
ID    index (idle seconds)
cl2   32769 (702)
The index (32769) never changes until after lfs changelog_clear is called. So if you want to clear a range of changelogs, you look at "current index" to get the upper bound for clearing. When I did that, the changelogs were cleared as they should be and the index was set to this new lower bound. You can review https://review.whamcloud.com/#/c/35013/ to see how I got this to work. After I figured this out, the question became whether reading "current index" is the right approach. From reading the Robinhood code it looks like it's using the logs read from llapi_changelog_recv() to determine the upper bound. I haven't tried that case in my sanity test, but lfs changelog $FSNAME-MDT0000 32769 shows the correct information. I can update the test, but to me this is looking like a Robinhood bug. |
| Comment by Quentin Bouget [ 18/Jun/19 ] |
|
> From reading the robin hood code it looks like its using the logs read from llapi_changelog_recv() to determine the upper bound This is the only way for robinhood, and pretty much anything that processes changelog. The "current index" approach works well in tests but you really only want to dismiss changelog records you actually processed, and this rarely coincides with the value in "current index" (at least not on active filesystems). IMO, out of order changelog records look like the ideal suspects. |
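As a rough sketch of that rule (not robinhood's actual code; the MDT name "lustre-MDT0000", the reader id "cl1", and the batching interval are assumptions), a consumer that never acknowledges past a gap in the received index sequence might look like this:
/*
 * Minimal consumer sketch: process every record received from
 * llapi_changelog_recv(), but only clear up to the highest index that
 * is contiguous with everything processed so far.
 */
#include <lustre/lustreapi.h>

int main(void)
{
	const char *mdt = "lustre-MDT0000", *id = "cl1";
	struct changelog_rec *rec;
	long long safe = 0;	/* highest contiguously processed index */
	void *reader;
	int rc;

	rc = llapi_changelog_start(&reader, CHANGELOG_FLAG_BLOCK, mdt, 0);
	if (rc < 0)
		return 1;

	while ((rc = llapi_changelog_recv(reader, &rec)) == 0) {
		long long index = rec->cr_index;

		/* ... process the record here before acknowledging it ... */
		llapi_changelog_free(&rec);

		if (safe == 0 || index == safe + 1)
			safe = index;	/* still contiguous, safe to move up */
		/* else: out-of-order record; leave 'safe' alone so we never
		 * acknowledge an index we have not seen yet (a fuller
		 * implementation would remember the stragglers and let
		 * 'safe' catch up once the gap is filled) */

		if (safe != 0 && safe % 1024 == 0)
			llapi_changelog_clear(mdt, id, safe);
	}

	if (safe != 0)
		llapi_changelog_clear(mdt, id, safe);
	llapi_changelog_fini(&reader);
	return rc < 0 ? 1 : 0;
}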
| Comment by Olaf Weber [ 18/Jun/19 ] |
|
In the HPE DMF 7 Lustre changelog reader we do something similar: only clear records up to the point where we're absolutely certain, in the presence of out-of-order and outright missing records, that we're not clearing valid changelog records that we yet have to process. Something similar, with slightly more relaxed constraints applies to restarting reading the changelog. |
| Comment by James A Simmons [ 19/Jun/19 ] |
|
Currently I'm trying to get a grasp of what is going on, and this is a new area of Lustre for me. The reason I'm looking into this is that it's been many months with no progress and we have a production system that is losing 3% of its MDT disk space per week. I will look at updating the test to use lfs changelog $FSNAME-MDTXXXX since that is using llapi_changelog_recv(). From there I will need to think about what level of the software stack this needs to be resolved at. Looking at the problem at this current level of understanding, this does not look like a kernel-side solution. Userland should never expect guaranteed sequential data, IMNSHO. Now the question is whether this should be handled by llapi_changelog_recv() or in the external application. |
| Comment by Quentin Bouget [ 20/Jun/19 ] |
|
I for one would prefer if the char device could handle this complexity. I expect there is one – good – way to handle this, and I do not see any reason to leave it to applications or the llapi to do the reordering. If there are different possible tradeoffs, then maybe the llapi (but definitely not the applications themselves).
|
| Comment by James A Simmons [ 20/Jun/19 ] |
|
I just can't see this being completely resolved in the kernel code, but let's consider the cases. One approach: if a record is not sequential, the char device just blocks until the next proper record comes. The limitation of this approach is that it would require the missing record to exist, which might not be the case. The char device could send a request to resend and, if the record doesn't exist, return -EEXIST. Does that really handle this case? The question is how large can the gap be? The code would have to handle the case where the record does show up after the -EEXIST is already reported. This approach could be server side as well. Again, we might have a case in which a record shows up pretty late. Another approach for resolving the out-of-order issue would be to cache the records as they come so they can be sorted. This again could be done server side or client side. It would require some blocking until some amount is collected so it can be sorted. The question again is how far apart are the changelog record gaps? Do we wait until CDEV_CHLG_MAX_PREFETCH, or does that need to go up? Also, the same issue exists if a record just doesn't exist, or the record just comes really late. Lastly, this handling as described above could be done in llapi_changelog_recv(). Same issues again; mind you, I like this approach the most. |
| Comment by Olaf Weber [ 20/Jun/19 ] |
|
I'm pretty certain the missing records are an issue with the char device code not returning them to userspace despite them being present in the on-disk changelog. See the writeup in |
| Comment by James A Simmons [ 20/Jun/19 ] |
|
Ah, that is what I was looking for. It wasn't clear if the records were never coming or coming late. So the mdd layer actually drops the out-of-order records, so the char device never sees those records. Still, the issue of gaps, which could still happen, needs to be handled. Also, how large is the out-of-order window: 16, 1024, 64K, 1GB of records? |
| Comment by Olaf Weber [ 20/Jun/19 ] |
|
To clarify for the dropped records, as far as I can tell it is the char device code that drops them somehow. I base this on not seeing this issue when using pre-char-device code. In my experience out-of-order sequences are typically small, with the most common occurrence being two records in inverse order. I don't think I have ever seen sequences that didn't resolve within a few tens of records. For DMF 7 we default to assuming a record was dropped rather than out of order if we get 1000 records further along and still haven't seen it. At present I know of no cases where that heuristic failed. |
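A standalone sketch of that give-up rule (the 1000-record window comes from the comment above; the sample indices are made up, and a real consumer would also track which later indices it has already processed):
/*
 * Decide whether a still-unseen changelog index should be treated as
 * dropped rather than merely out of order: give up once we have read
 * GIVE_UP_WINDOW records past it without seeing it.
 */
#include <stdio.h>

#define GIVE_UP_WINDOW 1000

/* 'missing' is the oldest index not yet received,
 * 'highest_seen' is the largest index received so far. */
static int record_is_lost(long long missing, long long highest_seen)
{
	return highest_seen - missing >= GIVE_UP_WINDOW;
}

int main(void)
{
	long long missing = 42;

	/* Only 10 records past the gap: keep waiting, it may just be late. */
	printf("highest=52:   lost? %d\n", record_is_lost(missing, 52));
	/* 1000 records past the gap: assume it was dropped and move on. */
	printf("highest=1042: lost? %d\n", record_is_lost(missing, 1042));
	return 0;
}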
| Comment by James A Simmons [ 24/Jun/19 ] |
|
I think I see what the bug is in the char device. Need to play with it a bit to come up with a solution to the dropped records. |
| Comment by Quentin Bouget [ 26/Jun/19 ] |
|
I wrote a toy changelog reader that I think reproduces the bug described here. With a lustre file system mounted at /mnt/lustre, one MDT named lustre-MDT0000, and a changelog client registered on that MDT:
$ tar xf changelog-reader.tgz
$ rm changelog-reader.tgz
$ cd changelog-reader
$ make
$
$ # Generate a stream of changelog records
$ sudo while true; do touch /mnt/lustre/file; rm /mnt/lustre/file; done &
$
$ # Read and acknowledge those changelog records
$ sudo ./changelog_reader
The expected output looks something like:
cleared record 128
cleared record 256
cleared record 384
cleared record 512
...
cleared record 896
changelog-reader: cannot purge records for 'cl1': Invalid argument (22)
./changelog-reader: llapi_changelog_clear: Invalid argument |
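The changelog-reader.tgz attachment itself is not inlined in the ticket; purely for illustration, a minimal reader that produces output of this shape could look roughly like the following (the MDT name, the user id "cl1", and the batch size of 128 are assumptions, not the attachment's actual contents). Acknowledging up to each received cr_index without compensating for out-of-order delivery is exactly the pattern that ends with llapi_changelog_clear() returning -EINVAL:
/* Naive reader: clear up to every 128th received index as-is. */
#include <stdio.h>
#include <lustre/lustreapi.h>

int main(void)
{
	const char *mdt = "lustre-MDT0000", *id = "cl1";
	struct changelog_rec *rec;
	void *reader;
	int rc;

	rc = llapi_changelog_start(&reader, CHANGELOG_FLAG_BLOCK, mdt, 0);
	if (rc < 0) {
		fprintf(stderr, "llapi_changelog_start: %d\n", rc);
		return 1;
	}

	while ((rc = llapi_changelog_recv(reader, &rec)) == 0) {
		long long index = rec->cr_index;

		llapi_changelog_free(&rec);
		if (index % 128 == 0) {
			rc = llapi_changelog_clear(mdt, id, index);
			if (rc < 0) {
				fprintf(stderr, "llapi_changelog_clear: %d\n", rc);
				break;
			}
			printf("cleared record %lld\n", index);
		}
	}

	llapi_changelog_fini(&reader);
	return rc < 0 ? 1 : 0;
}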
| Comment by James A Simmons [ 26/Jun/19 ] |
|
Excellent. I see the reproducer. Will give it a try today. Thank you. |
| Comment by Olaf Weber [ 26/Jun/19 ] |
|
There are some kernel-side CDEBUG(D_IOCTL) that may provide useful information if enabled. |
| Comment by James A Simmons [ 26/Jun/19 ] |
|
Quentin, what licence do you want for your changelog reader, and who is the copyright holder? |
| Comment by Quentin Bouget [ 27/Jun/19 ] |
|
Er, whatever Lustre uses those days is fine. GPLv2? I wrote the code, during working hours. In French law, I believe I hold "parental" rights to the work (so I can ask to have my name associated with it) and my employer – CEA – holds the copyright. So, GPLv2 header + "author: Quentin Bouget <quentin.bouget@cea.fr>" + "Copyright (c) 2019 Commissariat a l'energie atomique et aux energies alternatives". |
| Comment by James A Simmons [ 27/Jun/19 ] |
|
So I have been trying your changelog_reader and find that the problem is very rare. I saw it only after 250K file creations and deletions. To really reproduce this, I might need to inject an out-of-order condition manually in the mdd layer. |
| Comment by Olaf Weber [ 27/Jun/19 ] |
|
Are you doing file creation/deletion on multiple nodes at the same time? In my experience that helps induce these problems. |
| Comment by Andreas Dilger [ 03/Jul/19 ] |
|
Having looked at this issue for the first time, my theory is definitely that this relates to out-of-order Changelog records:
My understanding of the out-of-order record problem is that the "out of orderness" is only within a single journal transaction, because two (or more) threads allocate transaction handles, then get a slot in the Changelog, but their RPC processing finished in a different order in which they allocated a slot in the Changelog file. So, it shouldn't be possible to have records like "A, F, E, D, B, C" that are not within the same disk transaction, so they should either all be committed before an MDS crash, or not at all. My estimate is that the upper limit for misordered records is likely the number of MDS threads running (possibly limited by the number of cores on the MDS), since there can't be more active RPCs than this within one transaction that might race with each other. That means it should be reasonable to reorder the records in the kernel or llapi so that they are in order to the application (not every application should have to handle this itself), and correspondingly it needs to drop the cancel for lower-numbered records that have been reordered until the later ones have been processed, or it would be possible to lose the higher-numbered records which appear earlier in the Changelog if there was a crash during processing. If the records are always presented to the application in order, there should never be a case where a higher-numbered record is cancelled without processing the lower-numbered one, as that would be broken even when the records are in order on disk. I think there are two potential workarounds for the cancellation issue, both of which could be implemented:
Separately, there should be a fix to either the kernel or userspace Changelog processing to detect and correct slightly misordered records. I think even if the ordering problem is fixed we would want to have one or both of the cancellation fixes to allow existing Changelogs to be processed. |
| Comment by Alexander Boyko [ 15/Jul/19 ] |
|
I want to add my thoughts. mdd_changelog_store() fills the record index in a synchronized way for the mdd:
rec->cr.cr_index = ++mdd->mdd_cl.mc_index;
and after that calls llog_add(). So assume we have two threads at the same time, with cr_index = 1 and cr_index = 2 before llog_add. There is no guarantee that the first cr_index would be the first for llog_write_rec. To guarantee the same transaction for a reordered cr_index, we need to set cr_index after
llog_th = thandle_get_sub(env, th, ctxt->loc_handle->lgh_obj);
diff --git a/lustre/mdd/mdd_dir.c b/lustre/mdd/mdd_dir.c
index 3a24e21..21ad94b 100644
--- a/lustre/mdd/mdd_dir.c
+++ b/lustre/mdd/mdd_dir.c
@@ -806,13 +806,6 @@ int mdd_changelog_store(const struct lu_env *env, struct mdd_device *mdd,
rec->cr_hdr.lrh_type = CHANGELOG_REC;
rec->cr.cr_time = cl_time();
- spin_lock(&mdd->mdd_cl.mc_lock);
- /* NB: I suppose it's possible llog_add adds out of order wrt cr_index,
- * but as long as the MDD transactions are ordered correctly for e.g.
- * rename conflicts, I don't think this should matter. */
- rec->cr.cr_index = ++mdd->mdd_cl.mc_index;
- spin_unlock(&mdd->mdd_cl.mc_lock);
-
ctxt = llog_get_context(obd, LLOG_CHANGELOG_ORIG_CTXT);
if (ctxt == NULL)
return -ENXIO;
@@ -821,6 +814,14 @@ int mdd_changelog_store(const struct lu_env *env, struct mdd_device *mdd,
if (IS_ERR(llog_th))
GOTO(out_put, rc = PTR_ERR(llog_th));
+ spin_lock(&mdd->mdd_cl.mc_lock);
+ /* NB: I suppose it's possible llog_add adds out of order wrt cr_index,
+ * but as long as the MDD transactions are ordered correctly for e.g.
+ * rename conflicts, I don't think this should matter. */
+ rec->cr.cr_index = ++mdd->mdd_cl.mc_index;
+ spin_unlock(&mdd->mdd_cl.mc_lock);
+
+
/* nested journal transaction */
rc = llog_add(env, ctxt->loc_handle, &rec->cr_hdr, NULL, llog_th);
|
| Comment by Stephane Thiell [ 19/Jul/19 ] |
|
Hi @aboyko This seems very promising!! I guess all Robinhood users are hitting this issue with recent versions of Lustre. It's unclear to me if it's wasting space on the MDTs, but it's definitely very confusing for Lustre sysadmins. Are you going to submit your patch to gerrit (http://review.whamcloud.com/) so that it can be properly reviewed by others? Thanks again! |
| Comment by James A Simmons [ 22/Jul/19 ] |
|
I tried the fix from Boyko and it didn't resolve the problem |
| Comment by Alexander Boyko [ 30/Jul/19 ] |
|
I've created a fix, submitting soon. |
| Comment by Gerrit Updater [ 30/Jul/19 ] |
|
Alexandr Boyko (c17825@cray.com) uploaded a new patch: https://review.whamcloud.com/35650 |
| Comment by Alexander Boyko [ 30/Jul/19 ] |
|
@James A Simmons, could you try the patch above? It also includes the reproducer for this issue.

== sanity test 160j: Verify that reordered records could be cancelled in changelog =================== 08:27:16 (1563352036)
mdd.lustre-MDT0000.changelog_mask=+hsm
mdd.lustre-MDT0001.changelog_mask=+hsm
Registered 2 changelog users: 'cl1 cl1'
fail_loc=0x8000015d
lustre-MDT0000.2 01CREAT 08:27:18.551297912 2019.07.17 0x0 t=[0x200000402:0x13:0x0] j=touch.0 ef=0xf u=0:0 nid=192.168.101.35@tcp p=[0x200000402:0x3:0x0] f2
lustre-MDT0000.3 06UNLNK 08:27:18.553672555 2019.07.17 0x1 t=[0x200000402:0xa:0x0] j=rm.0 ef=0xf u=0:0 nid=192.168.101.35@tcp p=[0x200000402:0x5:0x0] f3
lustre-MDT0000.4 14SATTR 08:27:18.554017844 2019.07.17 0x14 t=[0x200000402:0x8:0x0] j=chmod.0 ef=0xf u=0:0 nid=192.168.101.35@tcp
lustre-MDT0000.5 11CLOSE 08:27:18.555499959 2019.07.17 0x42 t=[0x200000402:0x13:0x0] j=touch.0 ef=0xf u=0:0 nid=192.168.101.35@tcp
lustre-MDT0000.1 01CREAT 08:27:18.550045424 2019.07.17 0x0 t=[0x200000402:0x12:0x0] j=touch.0 ef=0xf u=0:0 nid=192.168.101.35@tcp p=[0x200000402:0x7:0x0] 1
lustre-MDT0000.6 11CLOSE 08:27:19.562156592 2019.07.17 0x42 t=[0x200000402:0x12:0x0] j=touch.0 ef=0xf u=0:0 nid=192.168.101.35@tcp
2
lustre-MDT0000: clear the changelog for cl1 to record #2
3
lustre-MDT0000: clear the changelog for cl1 to record #3
4
lustre-MDT0000: clear the changelog for cl1 to record #4
5
lustre-MDT0000: clear the changelog for cl1 to record #5
1
lustre-MDT0000: clear the changelog for cl1 to record #1
lfs changelog_clear: cannot purge records for 'cl1': Invalid argument (22)
changelog_clear: record out of range: 1
sanity test_160j: @@@@@@ FAIL: Fail to clear changelog record 1
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:5919:error()
= /usr/lib64/lustre/tests/sanity.sh:13140:test_160j()
= /usr/lib64/lustre/tests/test-framework.sh:6203:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:6242:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:6038:run_test()
= /usr/lib64/lustre/tests/sanity.sh:13145:main()
Dumping lctl log to /tmp/test_logs/1563352024/sanity.test_160j.*.1563352042.log |
| Comment by James A Simmons [ 30/Jul/19 ] |
|
Yep, setting it up right now. Thank you. |
| Comment by James A Simmons [ 01/Aug/19 ] |
|
We did some testing and saw these results. Before we started robinhood to purge things we had:

lctl get_param ..changelog_size
mdd.f2-tds-MDT0000.changelog_size=4711216

Afterwards we saw:

lctl get_param ..changelog_size
mdd.f2-tds-MDT0000.changelog_size=928368

Does this patch also help resolve some of the orphan records that we couldn't clean up before? |
| Comment by James A Simmons [ 02/Aug/19 ] |
|
Running lfs changelog clear 0 restores the changelog_size back to 4711216. There is some strange issue there, but it is independent of this patch. The testing has gone very well!! Thanks Boyko. |
| Comment by Olaf Weber [ 05/Aug/19 ] |
|
When I tested the code posted in the review earlier today, I still saw out-of-order and lost changelog records. From this and other tests it does appear that lost changelog records are out-of-order records that are not reliably (and often reliably not) returned to userspace. |
| Comment by James A Simmons [ 07/Aug/19 ] |
|
Was it better? In our testing it did make the problem go away. I wonder if the problem is less common. |
| Comment by Olaf Weber [ 07/Aug/19 ] |
|
It is hard to say whether it really reduces the lost-records issue, as too much changed in the cluster I tested on to make it a truly apples-to-apples comparison. |
| Comment by Alexander Boyko [ 08/Aug/19 ] |
|
@Olaf Weber, the patch allows clearing of unordered records. Processing them is a different matter; I do think the consuming software should take care of that: read a number of records and process them in order. In most cases it can process unordered records too. We should not see unordered operations for the same file or directory, because they are synchronized by the parent lock.
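A minimal sketch of the "read a batch, then process in order" approach suggested here, assuming the same liblustreapi types as the earlier example; collecting the batch with llapi_changelog_recv() and deciding how far to clear are omitted, since they work as in that sketch, and process_one is a placeholder for the consumer's per-record handler:

/* Sort a batch of already-received changelog records by cr_index before
 * handing them to the consumer's callback. As noted above, records for the
 * same file or directory are already ordered by the parent lock, so sorting
 * a bounded batch is enough as long as the reordering distance is smaller
 * than the batch size. */
#include <stdlib.h>
#include <lustre/lustreapi.h>

static int cmp_cr_index(const void *a, const void *b)
{
	const struct changelog_rec *ra = *(const struct changelog_rec **)a;
	const struct changelog_rec *rb = *(const struct changelog_rec **)b;

	return (ra->cr_index > rb->cr_index) - (ra->cr_index < rb->cr_index);
}

static void process_batch(struct changelog_rec **recs, size_t nrecs,
			  void (*process_one)(struct changelog_rec *))
{
	size_t i;

	qsort(recs, nrecs, sizeof(recs[0]), cmp_cr_index);
	for (i = 0; i < nrecs; i++)
		process_one(recs[i]);
}

Sorting within a bounded batch lines up with the earlier observation that the reordering distance is bounded by the number of MDS service threads. |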
| Comment by Alexander Boyko [ 16/Sep/19 ] |
|
|
| Comment by Olaf Weber [ 16/Sep/19 ] |
|
Agreed that the proposed fix for

Related to the "orphan records" that James mentioned: in my testing of this patch it did not (fully) resolve that issue; the only way I could get them to go away was to rebuild the changelog by deregistering all readers. |
| Comment by Qian Yingjin [ 17/Sep/19 ] |
|
From my understanding of the current changelog mechanism in the source code, changelog records will not be deleted if there is more than one changelog user consuming them: they are only purged once every registered user has cleared them.
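If that reading is right, the purge point is effectively the minimum over all registered users' cleared indices, so a single idle or stuck reader pins every record behind it. A tiny illustration (not Lustre code; the user names and indices are made up):

/* Records can only be purged up to the smallest index that every
 * registered changelog user has cleared; one idle user pins them all. */
#include <stdio.h>

struct cl_user {
	const char *name;
	unsigned long long cleared;	/* highest index this user cleared */
};

static unsigned long long purge_point(const struct cl_user *users, int n)
{
	unsigned long long min = ~0ULL;
	int i;

	for (i = 0; i < n; i++)
		if (users[i].cleared < min)
			min = users[i].cleared;
	return min;	/* records with index <= min can be removed */
}

int main(void)
{
	struct cl_user users[] = {
		{ "cl1", 13975842304ULL },	/* active robinhood reader */
		{ "cl2", 0ULL },		/* idle second user */
	};

	printf("can purge up to record %llu\n", purge_point(users, 2));
	return 0;
}
 |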
| Comment by Peter Jones [ 21/Sep/19 ] |
|
The consensus seems to be that this can be closed as a duplicate of |
| Comment by Stephane Thiell [ 17/Oct/19 ] |
|
We just upgraded our Lustre 2.12 servers and the Robinhood client to 2.12.3 RC1 and we're still seeing these log messages:

fir-md1-s3: Oct 16 21:40:09 fir-md1-s3 kernel: Lustre: 18584:0:(mdd_device.c:1807:mdd_changelog_clear()) fir-MDD0002: Failure to clear the changelog for user 1: -22

Not sure about the real impact though. |
| Comment by Mikhail Pershin [ 25/Oct/19 ] |
|
Stephane, does it cause the changelog to grow as the original issue did? |
| Comment by Stephane Thiell [ 06/Dec/19 ] |
|
Still a problem with 2.12.3: robinhood hangs when reading the changelogs, and when this happens, an error appears on the console. I need to restart robinhood almost every day due to this problem:

Nov 28 14:58:11 fir-rbh01 kernel: LustreError: 80708:0:(mdc_changelog.c:249:chlg_load()) fir-MDT0002-mdc-ffff95cae9e38800: fail to process llog: rc = -5
Nov 28 20:08:16 fir-rbh01 kernel: LustreError: 70617:0:(mdc_changelog.c:249:chlg_load()) fir-MDT0003-mdc-ffff95cae9e38800: fail to process llog: rc = -5
Nov 29 02:17:01 fir-rbh01 kernel: LustreError: 123662:0:(mdc_changelog.c:249:chlg_load()) fir-MDT0001-mdc-ffff95cae9e38800: fail to process llog: rc = -5
Dec 01 14:06:55 fir-rbh01 kernel: LustreError: 78509:0:(mdc_changelog.c:249:chlg_load()) fir-MDT0000-mdc-ffff95cae9e38800: fail to process llog: rc = -2
Dec 02 15:39:14 fir-rbh01 kernel: LustreError: 887:0:(mdc_changelog.c:249:chlg_load()) fir-MDT0002-mdc-ffff95cae9e38800: fail to process llog: rc = -2
Dec 05 00:12:07 fir-rbh01 kernel: LustreError: 12066:0:(mdc_changelog.c:249:chlg_load()) fir-MDT0003-mdc-ffff95cae9e38800: fail to process llog: rc = -5

Note that this isn't a problem with 2.10.8.

And Mike, I don't know about the changelog growing issue; we have never noticed that specific problem (which doesn't mean it doesn't happen, I was concerned about it too), and I think LLNL reported it at some point. Because we're DoM-ready, our MDTs are very large. |
| Comment by Andrew Elwell [ 16/Apr/20 ] |
|
We're also seeing this with the following combination: SLES11 SP4 client against a 2.10.8 RHEL 7.6 server. Robinhood will run fine for a while, then stop clearing changelogs:

2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | ==================== Dumping stats at 2020/04/16 00:15:54 =====================
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | ======== General statistics =========
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | Daemon start time: 2020/04/15 12:45:52
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | Started modules: log_reader
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | ChangeLog reader #0:
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | fs_name = pgfs
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | mdt_name = MDT0000
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | reader_id = cl1
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | records read = 15682180
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | interesting records = 4444583
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | suppressed records = 11175370
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | records pending = 451
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | last received: rec_id=161344242, rec_time=2020/04/16 00:15:53.691544, received at 2020/04/16 00:15:53.738479
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | receive speed: 409.11 rec/sec, log/real time ratio: 1.00
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | last pushed: rec_id=161343461, rec_time=2020/04/16 00:15:48.603088, pushed at 2020/04/16 00:15:53.737393
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | push speed: 419.50 rec/sec, log/real time ratio: 1.00
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | last committed: rec_id=161343461, rec_time=2020/04/16 00:15:48.603088, committed at 2020/04/16 00:15:53.741704
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | commit speed: 419.50 rec/sec, log/real time ratio: 1.00
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | last cleared: rec_id=161343461, rec_time=2020/04/16 00:15:48.603088, cleared at 2020/04/16 00:15:53.742430
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | clear speed: 419.50 rec/sec, log/real time ratio: 1.00
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | ChangeLog stats:
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | MARK: 0, CREAT: 882918, MKDIR: 128738, HLINK: 501, SLINK: 22345, MKNOD: 0, UNLNK: 493670
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | RMDIR: 54927, RENME: 64817, RNMTO: 0, OPEN: 0, CLOSE: 13427698, LYOUT: 78, TRUNC: 64903
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | SATTR: 456295, XATTR: 48510, HSM: 0, MTIME: 36766, CTIME: 14, ATIME: 0, MIGRT: 0
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | FLRW: 0, RESYNC: 0, GXATR: 0, NOPEN: 0
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | ==== EntryProcessor Pipeline Stats ===
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | Idle threads: 24
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | Id constraints count: 0 (hash min=0/max=0/avg=0.0)
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | Name constraints count: 0 (hash min=0/max=0/avg=0.0)
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | Stage | Wait | Curr | Done | Total | ms/op |
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | 0: GET_FID | 0 | 0 | 0 | 0 | 0.00 |
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | 1: GET_INFO_DB | 0 | 0 | 0 | 40627 | 0.36 |
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | 2: GET_INFO_FS | 0 | 0 | 0 | 40475 | 0.07 |
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | 3: PRE_APPLY | 0 | 0 | 0 | 40597 | 0.00 |
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | 4: DB_APPLY | 0 | 0 | 0 | 40597 | 0.13 | 98.15% batched (avg batch size: 28.7)
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | 5: CHGLOG_CLR | 0 | 0 | 0 | 40627 | 0.02 |
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | 6: RM_OLD_ENTRIES | 0 | 0 | 0 | 0 | 0.00 |
2020/04/16 00:15:54 robinhood@bullion[14485/1] STATS | DB ops: get=3538391/ins=959210/upd=2832420/rm=566905
2020/04/16 00:16:00 robinhood@bullion[14485/2] ChangeLog | Error in llapi_changelog_recv(): -5: Input/output error
2020/04/16 00:16:01 robinhood@bullion[14485/2] ChangeLog | Error in llapi_changelog_recv(): -5: Input/output error
2020/04/16 00:16:02 robinhood@bullion[14485/2] ChangeLog | Error in llapi_changelog_recv(): -5: Input/output error
2020/04/16 00:16:03 robinhood@bullion[14485/2] ChangeLog | Error in llapi_changelog_recv(): -5: Input/output error
2020/04/16 00:16:04 robinhood@bullion[14485/2] ChangeLog | Error in llapi_changelog_recv(): -5: Input/output error
2020/04/16 00:16:05 robinhood@bullion[14485/2] ChangeLog | Error in llapi_changelog_recv(): -5: Input/output error
2020/04/16 00:16:06 robinhood@bullion[14485/2] ChangeLog | Error in llapi_changelog_recv(): -5: Input/output error

There was nothing untoward in the MDS logs at the same time, although there was a load peak and CPUs in 'wait' (grafana plots from collectd can be attached). LNet traffic from the MDS wasn't anything particularly impressive. The robinhood server is one of the few machines we've tested with a 2.12 client - we're still mostly 2.10.8 or 2.7 (Cray) as we've still got a 2.5.x filesystem (Sonexion) and don't want to move too far ahead with clients. |