Loading...

XML

Word

Printable

Details

Type: Bug
Resolution: Fixed
Priority: Minor
Fix Version/s: Lustre 2.11.0
Affects Version/s: None
Labels:
None

Severity:
3
Rank (Obsolete):
9223372036854775807

Description

I do not have much to share except the attached reproducer.

The key elements of the reproducer seem to be:

setup lustre with two mountpoints;
create a file;
launch a copytool on mountpoint A;
suspend the copytool;
archive the file created at step 1 from mountpoint A*;
delete the file on mountpoint B;
sync;
un-suspend the copytool (the output of the copytool should indicate that llapi_hsm_action_begin() failed with EIO, not ENOENT)
umount => the process hangs in an unkillable state.

*You can use mountpoint B at step 5, but only if you created the file from mountpoint A.

I added some debug in the reproducer that should be logged in /tmp.

I suspect those two lines in the dmesg are related to this issue (they are logged at umount time):

[  143.575078] LustreError: 3703:0:(ldlm_resource.c:1094:ldlm_resource_complain()) filter-lustre-OST0000_UUID: namespace resource [0x2:0x0:0x0].0x0 (ffff8806ab7b6900) refcount nonzero (1) after lock cleanup; forcing cleanup.
[  143.578233] LustreError: 3703:0:(ldlm_resource.c:1676:ldlm_resource_dump()) --- Resource: [0x2:0x0:0x0].0x0 (ffff8806ab7b6900) refcount = 2

Note: the title should probably be updated once we figure what the issue exactly is

Attachments

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

reproducer-lu-10302.sh
01/Dec/17 10:10 AM
1 kB
Quentin Bouget

Issue Links

is related to

LU-10723 Interop 2.10.3<->2.11 sanity test_232b: OSS hung

Resolved

is related to

LU-10357 ll_ioc_copy_{start,end}() depend on search_inode_for_lustre() which is bad

Resolved

Activity

People

Assignee:: John Hammond

Reporter:: CEA

Votes:: 0 Vote for this issue

Watchers:: 7 Start watching this issue

Dates

Created:: 30/Nov/17 3:29 PM

Updated:: 05/Aug/20 1:50 PM

Resolved:: 22/Dec/17 12:50 PM