Details
Type: Bug
Resolution: Cannot Reproduce
Priority: Major
Description
The customer uses changelogs to monitor the filesystem. This worked properly for several weeks, but we now observe several problems with the changelog functionality that prevent us from using it. A newly registered user (cl6) does not appear in changelog_users and cannot be deregistered:
[root@xxx14 ~]# lctl --device project-MDT0000 changelog_register
project-MDT0000: Registered changelog userid 'cl6'
[root@xxx14 ~]# cat /proc/fs/lustre/mdd/project-MDT0000/changelog_users
current index: 5077892
ID index
[root@helios14 ~]# lctl --device project-MDT0000 changelog_deregister cl6
error: changelog_deregister: Invalid argument
[root@helios14 ~]# cat /proc/fs/lustre/mdd/project-MDT0000/changelog_mask
MARK CREAT MKDIR HLINK SLINK MKNOD UNLNK RMDIR RNMFM RNMTO OPEN CLOSE IOCTL TRUNC SATTR XATTR HSM MTIME CTIME
When we read the changelog, the sequence starts at 3200000, as if one registered reader had still not cleared the entries beyond that point.
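To check which registered reader, if any, is holding back old records, the changelog_users output can be summarized. The following is a minimal sketch, assuming the file layout quoted above (a "current index" line, an "ID index" header, then one "<id> <index>" line per user). The function name summarize_changelog_users and its output format are illustrative and not part of Lustre.

```shell
#!/bin/sh
# Summarize changelog_users-style text read from stdin.
# Assumed layout (taken from the output quoted in this report):
#   current index: N
#   ID index
#   <id> <index>      (one line per registered user)
summarize_changelog_users() {
    awk '
        /^current index:/ { current = $3; next }
        $1 == "ID"        { next }   # skip the column header
        NF >= 2 {
            users++
            # track the user with the lowest (oldest) index
            if (min == "" || $2 + 0 < min + 0) { min = $2; holder = $1 }
        }
        END {
            printf "current=%s users=%d", current, users
            if (users > 0) printf " oldest=%s held_by=%s", min, holder
            printf "\n"
        }
    '
}

# Input as quoted in this report: no user lines follow the header.
printf 'current index: 5077892\nID index\n' | summarize_changelog_users
# prints: current=5077892 users=0
```

On the MDS, the same function could read the real file, for example `summarize_changelog_users < /proc/fs/lustre/mdd/project-MDT0000/changelog_users`. For the output shown in this report it reports zero users, which matches the symptom: cl6 was registered but is not listed, so no listed reader accounts for records being retained from 3200000.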
Trying to clear the records for every reader does not change anything:
for i in 1 2 3 4 5 6 ; do lfs changelog_clear project-MDT0000 cl$i 5000000 ; done
The MDS syslog shows:
1340367641 2012 Jun 22 21:20:41 helios14 kern warning kernel Lustre: 9270:0:(mdd_device.c:1478:mdd_changelog_user_purge()) Could not determine changelog records to purge; rc=-22
1340367641 2012 Jun 22 21:20:41 helios14 kern warning kernel Lustre: 9270:0:(mdd_device.c:1478:mdd_changelog_user_purge()) Skipped 3 previous similar messages
Changelogs can be read but not cleared, which breaks the tool that reads them.
This issue arises with lustre-2.1.1.
Does this sound like a known problem to you? How can we gather more debugging information while the system is in production?