[LU-11581] Not all changelog entries are returned to userspace - Whamcloud Community JIRA

Details

Type: Bug
Resolution: Duplicate
Priority: Minor
Fix Version/s: None
Affects Version/s: Lustre 2.10.1
Labels:
- changelog
Environment:
Lustre 2.10 based virtual cluster

Epic/Theme:
- changelog
Severity:
2

Description

In a Lustre 2.10+ based cluster I have observed a problem where some changelog entries are not returned to userspace. Which entries are dropped varies from one read attempt to the next.

I can reproduce this by doing the following:

1. Register a changelog reader to enable the changelog.
2. On at least two client nodes, run a file creation/deletion loop. I use a recursive copy of /usr/include to a client-specific directory.
3. Wait until the changelog has grown to a couple million entries.
4. Stop the file creation/deletion loops, and ensure the filesystem is idle.
5. Run lfs changelog several times on a client and redirect the output to different files.
6. Compare the files.
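
The comparison in the last two steps could be sketched in Python roughly as follows. This is only an illustration, not the tool used for this report. It assumes that the first whitespace-separated field of each line of saved lfs changelog output is the numeric record index; the function names read_record_indices and missing_per_run are hypothetical.

```python
def read_record_indices(path):
    """Return the set of record indices found in one saved changelog dump."""
    indices = set()
    # errors="replace" keeps the scan going if a file name in a record
    # is not valid UTF-8.
    with open(path, errors="replace") as fh:
        for line in fh:
            fields = line.split()
            # Assumption: the first field of a record line is its index.
            if fields and fields[0].isdigit():
                indices.add(int(fields[0]))
    return indices


def missing_per_run(paths):
    """Map each dump to the indices seen in another dump but not in it."""
    runs = {str(p): read_record_indices(p) for p in paths}
    seen_anywhere = set().union(*runs.values()) if runs else set()
    return {name: sorted(seen_anywhere - found) for name, found in runs.items()}
```

Because the indices are collected into sets, this comparison does not depend on the order in which records appear in each file. A non-empty list for any file means that run failed to return records that another run did return.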

Every run of lfs changelog produced a different output file. Changelog records that are absent from one output file are present in another, and vice versa. No single run returned all of the entries that should be in the on-disk log.

In my (admittedly CPU-starved) virtual cluster the drop rate was approximately 1 entry per 16000 records, but in a test like the one above, a few million on-disk records are required to see the problem consistently.

Notes:

- I originally observed this with a changelog reader that was instrumented to detect this kind of issue. The description above shows how to reproduce it without relying on a proprietary tool.
- I have not been able to reproduce this in a 2.7+ based cluster. Admittedly, that cluster also has much more capable hardware.
- To compare the output files with tools like comp, you need to sort them first using 'sort -n'. This is necessary because of LU-11426 (changelog entries are emitted out of order).
- This issue may in fact be caused by LU-11426 interacting with the mechanism, new in 2.10, that returns changelog entries to userspace.
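
As a rough illustration of how a reader could detect dropped records within a single run, the sketch below orders the record indices numerically (the same idea as 'sort -n') and reports holes in the sequence. It is not the instrumented reader mentioned above. It assumes that record indices on the MDT are contiguous over the range being read; find_gaps is a hypothetical name.

```python
def find_gaps(indices):
    """Yield (first_missing, last_missing) for each hole in the indices."""
    # Sort numerically first, since records may arrive out of order.
    ordered = sorted(set(indices))
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > 1:
            yield prev + 1, cur - 1


# Records 4, 7 and 8 were never returned in this made-up sample.
print(list(find_gaps([1, 2, 3, 6, 5, 9])))  # [(4, 4), (7, 8)]
```

A hole reported here is only a dropped record if the contiguity assumption holds; comparing against a second run is a useful cross-check.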

Issue Links

is related to

LU-11426 2/2 Olafs agree: changelog entries are emitted out of order

Resolved

People

Assignee: WC Triage

Reporter: Olaf Weber (Inactive)

Votes: 0

Watchers: 7

Dates

Created: 29/Oct/18 4:19 PM

Updated: 06/Aug/19 2:31 PM

Resolved: 26/Mar/19 3:35 PM