[LU-1556] MDS does not register changelog readers Created: 22/Jun/12 Updated: 16/Mar/15 Resolved: 06/Jan/15 |
|
| Status: | Closed |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Diego Moreno (Inactive) | Assignee: | Yang Sheng |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Attachments: |
|
||||
| Issue Links: |
|
||||
| Severity: | 3 | ||||
| Rank (Obsolete): | 4047 | ||||
| Description |
|
Customer uses changelog to watch the filesystem. It worked properly for some weeks, but we now observe several problems with the changelog functionality that prevent us from using it properly:
[root@xxx14 ~]# lctl --device project-MDT0000 changelog_register
When trying to read the changelog, the sequence starts at 3200000, as if one client reader had still not cleared further entries. Trying to clear every client reader does not change anything:
And in the MDS syslog:
Changelogs can be read but not cleared, which affects the tool reading the changelogs. This issue arises with lustre-2.1.1. Does this sound like a known problem to you? How could we get more debugging information while the system is in production? |
| Comments |
| Comment by Peter Jones [ 22/Jun/12 ] |
|
Yangsheng, could you please look into this one? Thanks, Peter |
| Comment by Yang Sheng [ 24/Jun/12 ] |
|
Hi, Diego, Could you upload more logs here? Especially the context around this message:
1340367641 2012 Jun 22 21:20:41 helios14 kern warning kernel Lustre: 9270:0:(mdd_device.c:1478:mdd_changelog_user_purge()) Could not determine changelog records to purge; rc=-22 |
| Comment by Diego Moreno (Inactive) [ 25/Jun/12 ] |
|
As I couldn't upload the dmesg file via Jira, I uploaded it to Whamcloud's FTP. You will find the dmesg file there, with some llog errors I couldn't understand. We often see this message: |
| Comment by Yang Sheng [ 28/Jun/12 ] |
|
Thanks, Diego, This is very useful. |
| Comment by Diego Moreno (Inactive) [ 02/Jul/12 ] |
|
Some more logs (Lustre debug file). Logging was enabled just before running "lctl --device MDT0000 changelog_register". It seems there is an invalid entry in the changelog, but how could this entry have got there, and how can we remove it? If you need the entire file, please ask.
00000040:00020000:1.0:1340966614.416509:0:24262:0:(llog_lvfs.c:473:llog_lvfs_next_block()) Invalid llog tail at log id 670040609/1607149776 offset 16384
00000040:00020000:8.0:1340966614.427773:0:24264:0:(llog_lvfs.c:473:llog_lvfs_next_block()) Invalid llog tail at log id 670040609/1607149776 offset 16384
00000004:02000000:1.0:1340966671.265493:0:24267:0:(mdd_device.c:270:mdd_changelog_on()) mdd_obd-project-MDT0000: changelog on
00000004:00020000:1.0:1340966671.269665:0:24267:0:(mdd_device.c:274:mdd_changelog_on()) Changelogs cannot be enabled due to error condition (see mdd_obd-project-MDT0000 log).
00000040:00020000:1.0:1340966683.106474:0:24270:0:(llog_lvfs.c:473:llog_lvfs_next_block()) Invalid llog tail at log id 670040609/1607149776 offset 16384
00000040:00020000:1.0:1340966683.126397:0:24272:0:(llog_lvfs.c:473:llog_lvfs_next_block()) Invalid llog tail at log id 670040609/1607149776 offset 16384
00000004:02000000:1.0:1340966711.276400:0:24275:0:(mdd_device.c:270:mdd_changelog_on()) mdd_obd-project-MDT0000: changelog on
00000004:00020000:1.0:1340966711.280559:0:24275:0:(mdd_device.c:274:mdd_changelog_on()) Changelogs cannot be enabled due to error condition (see mdd_obd-project-MDT0000 log).
00000040:00020000:1.0:1340966721.286568:0:24279:0:(llog_lvfs.c:473:llog_lvfs_next_block()) Invalid llog tail at log id 670040609/1607149776 offset 16384
00000004:00000400:6.0:1340966721.306381:0:24277:0:(mdd_device.c:1478:mdd_changelog_user_purge()) Could not determine changelog records to purge; rc=-22
00000004:02000000:7.0:1340966758.786405:0:24281:0:(mdd_device.c:270:mdd_changelog_on()) mdd_obd-project-MDT0000: changelog on
00000004:00020000:7.0:1340966758.790585:0:24281:0:(mdd_device.c:274:mdd_changelog_on()) Changelogs cannot be enabled due to error condition (see mdd_obd-project-MDT0000 log).
00000040:00020000:1.0:1340967087.617075:0:24326:0:(llog_lvfs.c:473:llog_lvfs_next_block()) Invalid llog tail at log id 670040609/1607149776 offset 16384
00000040:00020000:8.0:1340967087.628338:0:24328:0:(llog_lvfs.c:473:llog_lvfs_next_block()) Invalid llog tail at log id 670040609/1607149776 offset 16384
00000004:02000000:1.0:1340967087.629959:0:24329:0:(mdd_device.c:270:mdd_changelog_on()) mdd_obd-project-MDT0000: changelog on
00000004:00020000:1.0:1340967087.635292:0:24329:0:(mdd_device.c:274:mdd_changelog_on()) Changelogs cannot be enabled due to error condition (see mdd_obd-project-MDT0000 log).
00000004:02000000:7.0:1340967117.856469:0:24346:0:(mdd_device.c:270:mdd_changelog_on()) mdd_obd-project-MDT0000: changelog on
00000004:00020000:7.0:1340967117.860651:0:24346:0:(mdd_device.c:274:mdd_changelog_on()) Changelogs cannot be enabled due to error condition (see mdd_obd-project-MDT0000 log). |
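For anyone triaging a similar debug dump, the records quoted in the comment above can be split and summarised with a short script. This is only a sketch: the field names (`subsystem`, `mask`, `cpu`, `stack`, `pid`, `extpid`) are labels assumed here for the colon-separated prefix as it appears in this ticket, and `parse_records` and `rc_errno_name` are hypothetical helpers, not part of any Lustre tool. The only external fact relied on is that kernel return codes are negated errno values, so `rc=-22` corresponds to `EINVAL`.

```python
import errno
import re

# Prefix of one debug record as it appears in this ticket, e.g.
# 00000040:00020000:1.0:1340966614.416509:0:24262:0:(llog_lvfs.c:473:llog_lvfs_next_block())
# The group names are illustrative labels, not official field names.
RECORD_RE = re.compile(
    r"(?P<subsystem>[0-9a-f]{8}):(?P<mask>[0-9a-f]{8}):(?P<cpu>\d+\.\d+):"
    r"(?P<timestamp>\d+\.\d+):(?P<stack>\d+):(?P<pid>\d+):(?P<extpid>\d+):"
    r"\((?P<file>[^:()]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s"
)


def parse_records(text):
    """Split text holding one or more concatenated debug records into dicts."""
    matches = list(RECORD_RE.finditer(text))
    records = []
    for i, match in enumerate(matches):
        # The message runs up to the start of the next record prefix.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        record = match.groupdict()
        record["message"] = text[match.end():end].strip()
        records.append(record)
    return records


def rc_errno_name(message):
    """Return the errno symbol for a trailing 'rc=-N' in a message, or None."""
    found = re.search(r"rc=(-?\d+)", message)
    if found is None:
        return None
    return errno.errorcode.get(abs(int(found.group(1))))


# Two records copied from the comment above, joined on one line as in the export.
SAMPLE = (
    "00000040:00020000:1.0:1340966614.416509:0:24262:0:"
    "(llog_lvfs.c:473:llog_lvfs_next_block()) "
    "Invalid llog tail at log id 670040609/1607149776 offset 16384 "
    "00000004:00000400:6.0:1340966721.306381:0:24277:0:"
    "(mdd_device.c:1478:mdd_changelog_user_purge()) "
    "Could not determine changelog records to purge; rc=-22"
)

if __name__ == "__main__":
    for rec in parse_records(SAMPLE):
        print(rec["timestamp"], rec["func"], rec["message"], rc_errno_name(rec["message"]))
```

Grouping the parsed records by `func` and `message` makes it easier to see that every failed `mdd_changelog_on()` call in the excerpt follows an "Invalid llog tail" error for the same log id.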
| Comment by Yang Sheng [ 20/Jul/12 ] |
|
Hi, Diego, Could you please upload the file CONFIGS/changelog_users from the MDT for me? Just use debugfs to open your MDT device, and then use 'dump CONFIGS/changelog_users <choose a filename>'. TIA. |
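The debugfs request above can also be issued non-interactively. The sketch below only builds the command line; it assumes an ldiskfs-backed MDT, and the device path `/dev/mdt_device` and output path `/tmp/changelog_users` are placeholders, not values from this ticket. `debugfs -c` opens the filesystem read-only in catastrophic mode and `-R` runs a single request; `debugfs_dump_command` is a hypothetical helper name.

```python
import shlex


def debugfs_dump_command(device, src, dest):
    """Build a debugfs invocation that copies `src` out of `device` to `dest`.

    -c opens the filesystem read-only (catastrophic mode);
    -R executes a single debugfs request and exits.
    """
    return ["debugfs", "-c", "-R", f"dump {src} {dest}", device]


if __name__ == "__main__":
    # Placeholder device and output paths; substitute the real MDT block device.
    cmd = debugfs_dump_command(
        "/dev/mdt_device", "CONFIGS/changelog_users", "/tmp/changelog_users"
    )
    print(shlex.join(cmd))
    # To actually run it (as root, on the MDS):
    #   import subprocess; subprocess.run(cmd, check=True)
```

Building the argument list separately keeps the request string as a single argument, so paths are not re-split by a shell.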
| Comment by Diego Moreno (Inactive) [ 03/Aug/12 ] |
|
I attach the changelog_users and changelog_catalog files from the MDT not registering new readers. |
| Comment by Yang Sheng [ 03/Aug/12 ] |
|
Thanks, Diego, it is very helpful. |
| Comment by Yang Sheng [ 24/Dec/14 ] |
|
Hi, Diego, Is this issue still relevant? I have not been able to reproduce it on my local box. |
| Comment by Diego Moreno (Inactive) [ 06/Jan/15 ] |
|
Hi, We've not seen this bug for a while. |