Lustre / LU-16206

PCC crashes MDS: mdt_big_xattr_get()) ASSERTION( info->mti_big_lmm_used == 0 ) failed

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Affects Versions: Lustre 2.14.0, Lustre 2.15.1
    • Environment: Linux 5.4.0-1091-azure #96~18.04.1-Ubuntu SMP Tue Aug 30 19:15:32 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
    • Severity: 3

    Description

      Reproducible on 2.15.1 and 2.14.0.  Both clients and servers are running Ubuntu 18.04 as shown in Environment.

      Steps to reproduce:

      # confirm hsm is enabled
      mds-node:~# lctl get_param mdt.lustrefs-MDT0000.hsm_control
      mdt.lustrefs-MDT0000.hsm_control=enabled

      # setup pcc on client 0
      client-0:~# mkdir /pcc
      client-0:~# chmod 777 /pcc /lustre
      client-0:~# lhsmtool_posix --daemon --hsm-root /pcc --archive=2 /lustre < /dev/null > /tmp/copytool_log 2>&1
      client-0:~# lctl pcc add /lustre /pcc -p "gid={0},gid={2001} rwid=2"
      # setup pcc on client 1
      client-1:~# mkdir /pcc
      client-1:~# chmod 777 /pcc /lustre
      client-1:~# lhsmtool_posix --daemon --hsm-root /pcc --archive=3 /lustre < /dev/null > /tmp/copytool_log 2>&1
      client-1:~# lctl pcc add /lustre /pcc -p "gid={0},gid={2001} rwid=3"
      # create file on client 0 and confirm in-cache
      client-0:~# echo "test" > /lustre/test
      client-0:~# lfs pcc state /lustre/test
      file: /lustre/test, type: readwrite, PCC file: /pcc/0001/0000/0402/0000/0002/0000/0x200000402:0x1:0x0, user number: 0, flags: 0
      # read file from client 1
      client-1:~# lfs pcc state /lustre/test
      file: /lustre/test, type: none
      client-1:~# cat /lustre/test
      cat: /lustre/test: No data available
      client-1:~# cat /lustre/test
      test
      client-1:~# lfs pcc state /lustre/test
      file: /lustre/test, type: none
      # check pcc state, and attempt to attach again on client 0
      client-0:~# lfs pcc state /lustre/test
      file: /lustre/test, type: none
      client-0:~# lfs pcc attach -i 2 /lustre/test
      ^C^C^C^C^C^C^C^C^C   <---- hang
      # while client 0 is hanging, check state on client 1
      client-1:~# lfs pcc state /lustre/test
      ^C^C^C^C  <---- hang

      A few minutes later the stuck commands return. Examining the MDS shows that it crashed and rebooted. Relevant
      output from dmesg:

      [ 3266.211270] LustreError: 11458:0:(mdt_handler.c:960:mdt_big_xattr_get()) ASSERTION( info->mti_big_lmm_used == 0 ) failed:
      [ 3266.217023] LustreError: 11458:0:(mdt_handler.c:960:mdt_big_xattr_get()) LBUG
      [ 3266.220653] Pid: 11458, comm: mdt_rdpg02_001 5.4.0-1091-azure #96~18.04.1-Ubuntu SMP Tue Aug 30 19:15:32 UTC 2022
      [ 3266.220653] Call Trace TBD:
      [ 3266.220654] Kernel panic - not syncing: LBUG
      [ 3266.222778] CPU: 8 PID: 11458 Comm: mdt_rdpg02_001 Kdump: loaded Tainted: P           OE     5.4.0-1091-azure #96~18.04.1-Ubuntu
      [ 3266.224582] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
      [ 3266.224582] Call Trace:
      [ 3266.224582]  dump_stack+0x57/0x6d
      [ 3266.224582]  panic+0xf8/0x2d4
      [ 3266.224582]  lbug_with_loc+0x89/0x2c0 [libcfs]
      [ 3266.224582]  mdt_big_xattr_get+0x398/0x8b0 [mdt]
      [ 3266.224582]  ? mdd_read_unlock+0x2d/0xc0 [mdd]
      [ 3266.224582]  ? mdd_readpage+0x1919/0x1ed0 [mdd]
      [ 3266.224582]  __mdt_stripe_get+0x1d4/0x430 [mdt]
      [ 3266.224582]  mdt_attr_get_complex+0x56e/0x1af0 [mdt]
      [ 3266.224582]  mdt_mfd_close+0x2062/0x41c0 [mdt]
      [ 3266.224582]  ? lustre_msg_buf+0x17/0x50 [ptlrpc]
      [ 3266.224582]  ? __req_capsule_offset+0x5ae/0x6e0 [ptlrpc]
      [ 3266.224582]  mdt_close_internal+0x1f0/0x250 [mdt]
      [ 3266.259003]  mdt_close+0x483/0x13f0 [mdt]
      [ 3266.259003]  tgt_request_handle+0xc9a/0x1950 [ptlrpc]
      [ 3266.259003]  ? lustre_msg_get_transno+0x22/0xe0 [ptlrpc]
      [ 3266.259003]  ptlrpc_register_service+0x25e6/0x4610 [ptlrpc]
      [ 3266.259003]  ? __switch_to_asm+0x34/0x70
      [ 3266.259003]  kthread+0x121/0x140
      [ 3266.259003]  ? ptlrpc_register_service+0x1590/0x4610 [ptlrpc]
      [ 3266.259003]  ? kthread_park+0x90/0x90
      [ 3266.259003]  ret_from_fork+0x35/0x40
      [ 3266.259003] Kernel Offset: 0x1be00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

      Attachments

      Issue Links

      Activity

            elliswilson Ellis Wilson added a comment -

            I've linked three potentially related bugs.  The last one has a description that's particularly enlightening:

            "This is result of inappropriate usage of mti_big_lmm buffer in various places. Originally it was introduced to be used for getting big LOV/LMV EA and passing them to reply buffers. Meanwhile it is widely used now for internal server needs. These cases should be distinguished and if there is no intention to return EA in reply then flag mti_big_lmm_used should not be set. Maybe it is worth to rename it as mti_big_lmm_keep to mark that is to be kept until reply is packed."
             
            This aligns with a comment near the non-internal version of get_stripe:

              LASSERT(!info->mti_big_lmm_used);

              rc = __mdt_stripe_get(info, o, ma, name);
              /* since big_lmm is always used here, clear 'used' flag to avoid
               * assertion in mdt_big_xattr_get().
               */
              info->mti_big_lmm_used = 0;
             
            I wonder if some code path is (ab)using mti_big_lmm_used in a similar fashion but, unlike this code path, never clears the flag afterwards.

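            To make the suspected failure mode concrete, here is a minimal standalone C sketch of the buffer protocol described above. This is not Lustre source: the struct, field, and function names are simplified analogues of mti_big_lmm_used, mdt_big_xattr_get(), and the get_stripe path, and the kernel LASSERT is modeled with a plain assert().

            ```c
            #include <assert.h>
            #include <stdio.h>

            /* Simplified model of the per-thread big_lmm scratch buffer plus a
             * "used" flag marking its contents as reserved for the RPC reply. */
            struct thread_info {
                char big_lmm[64];   /* shared scratch buffer */
                int  big_lmm_used;  /* nonzero: contents reserved for the reply */
            };

            /* Analogue of mdt_big_xattr_get(): refuses to overwrite a buffer
             * still reserved for the reply -- this is the invariant whose
             * LASSERT fired on the MDS. */
            static void big_xattr_get(struct thread_info *info, const char *ea)
            {
                assert(info->big_lmm_used == 0);  /* the failed assertion */
                snprintf(info->big_lmm, sizeof(info->big_lmm), "%s", ea);
                info->big_lmm_used = 1;
            }

            /* Internal-use caller: fetches an EA for its own needs and clears
             * the flag afterwards, as the quoted get_stripe snippet does, so a
             * later fetch in the same RPC does not trip the assertion. */
            static void stripe_get_internal(struct thread_info *info)
            {
                big_xattr_get(info, "lov-ea");
                info->big_lmm_used = 0;  /* EA is not going into the reply */
            }

            int main(void)
            {
                struct thread_info info = { .big_lmm_used = 0 };

                stripe_get_internal(&info);      /* internal path clears the flag */
                big_xattr_get(&info, "lmv-ea");  /* so a reply-path fetch is safe */
                printf("flag after reply-path fetch: %d\n", info.big_lmm_used);
                /* An internal caller that skipped the clearing step -- the
                 * suspected bug here -- would abort on the assert instead. */
                return 0;
            }
            ```

            The sketch shows why a single missed clear is enough: the flag is per-thread state, so any earlier handler step that sets it for internal use and forgets to reset it leaves a trap for the next mdt_big_xattr_get() call on that service thread.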

            People

              Assignee: wc-triage WC Triage
              Reporter: elliswilson Ellis Wilson
              Votes: 0
              Watchers: 1