[LU-10493] sanity failed after rolling downgrade client and MDS to 2.9.0: unrecognized lsm_magic Created: 11/Jan/18  Updated: 06/Aug/21

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.10.3
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Sarah Liu Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

Test steps:
1. Set up the system with Lustre 2.9.0 on ldiskfs.
2. Rolling-upgrade the OSS, MDS, and clients to 2.10.3 and run sanity.sh.
3. Rolling-downgrade the clients and MDS to 2.9.0 again.
4. Run sanity and hit the following error:

[ 2703.795356] Lustre: 37531:0:(gss_mech_switch.c:72:lgss_mech_register()) Register gssnull mechanism
[ 2703.805447] Key type lgssc registered
[ 2703.918403] Lustre: Echo OBD driver; http://www.lustre.org/
[ 3442.176039] LustreError: 38823:0:(obd_config.c:1393:class_process_proc_param()) lov.: error writing proc entry 'stripeoffset': rc = -34
[ 3442.237926] LustreError: 11-0: lustre-MDT0000-mdc-ffff88105bd0d000: operation mds_connect to node 10.2.5.222@tcp failed: rc = -16
[ 3467.223275] LustreError: 11-0: lustre-MDT0000-mdc-ffff88105bd0d000: operation mds_connect to node 10.2.5.222@tcp failed: rc = -16
[ 3492.221534] LustreError: 11-0: lustre-MDT0000-mdc-ffff88105bd0d000: operation mds_connect to node 10.2.5.222@tcp failed: rc = -16
[ 3517.223253] LustreError: 11-0: lustre-MDT0000-mdc-ffff88105bd0d000: operation mds_connect to node 10.2.5.222@tcp failed: rc = -16
[ 3542.221080] LustreError: 11-0: lustre-MDT0000-mdc-ffff88105bd0d000: operation mds_connect to node 10.2.5.222@tcp failed: rc = -16
[ 3567.221487] LustreError: 11-0: lustre-MDT0000-mdc-ffff88105bd0d000: operation mds_connect to node 10.2.5.222@tcp failed: rc = -16
[ 3592.221469] LustreError: 11-0: lustre-MDT0000-mdc-ffff88105bd0d000: operation mds_connect to node 10.2.5.222@tcp failed: rc = -16
[ 3642.232950] LustreError: 11-0: lustre-MDT0000-mdc-ffff88105bd0d000: operation mds_connect to node 10.2.5.222@tcp failed: rc = -16
[ 3642.250857] LustreError: Skipped 1 previous similar message
[ 3717.232771] LustreError: 11-0: lustre-MDT0000-mdc-ffff88105bd0d000: operation mds_connect to node 10.2.5.222@tcp failed: rc = -16
[ 3717.250723] LustreError: Skipped 2 previous similar messages
[ 3767.218588] Lustre: Mounted lustre-client
[ 3866.722826] Lustre: Unmounted lustre-client
[ 4301.210942] LustreError: 39941:0:(obd_config.c:1393:class_process_proc_param()) lov.: error writing proc entry 'stripeoffset': rc = -34
[ 4301.270769] Lustre: Mounted lustre-client
[ 5063.673847] Lustre: Unmounted lustre-client
[ 5504.360486] LustreError: 41049:0:(obd_config.c:1393:class_process_proc_param()) lov.: error writing proc entry 'stripeoffset': rc = -34
[ 5504.438625] Lustre: Mounted lustre-client
[ 5504.538823] LustreError: 11-0: lustre-OST0001-osc-ffff880848cae000: operation ost_connect to node 10.2.2.44@tcp failed: rc = -16
[ 5504.538826] LustreError: Skipped 1 previous similar message
[ 5508.174035] Lustre: DEBUG MARKER: Using TIMEOUT=100
[ 5510.164684] LustreError: 41534:0:(lov_obd.c:1411:lov_quotactl()) ost 1 is inactive
[ 5529.474128] LustreError: 41534:0:(osc_quota.c:271:osc_quotactl()) ptlrpc_queue_wait failed, rc: -107
[ 5537.489411] Lustre: DEBUG MARKER: Client: Lustre version: 2.9.0
[ 5538.377136] Lustre: DEBUG MARKER: MDS: Lustre version: 2.9.0
[ 5539.489242] Lustre: DEBUG MARKER: OSS: Lustre version: 2.10.3_RC1
[ 5540.122761] Lustre: DEBUG MARKER: -----============= acceptance-small: sanity ============----- Thu Jan 11 19:23:21 UTC 2018
[ 5546.921267] Lustre: DEBUG MARKER: Using TIMEOUT=100
[ 5551.837739] LustreError: 44231:0:(lov_internal.h:100:lsm_op_find()) unrecognized lsm_magic 0bd60bd0
[ 5551.850507] LustreError: 44231:0:(lov_pack.c:213:lov_verify_lmm()) bad disk LOV MAGIC: 0x0BD60BD0; dumping LMM (size=344):
[ 5551.865235] LustreError: 44231:0:(lov_pack.c:222:lov_verify_lmm()) FF0BFF0B58010000030000000000030000000000000000000000000000000000010000001000000000000000000000000000000400000000FF0000003800000000000000000000000000000000000000020000000000000000000004000000000000004000000000FF000000380000000000000000000000000000000000000003000000000000000000004000000000FFFFFFFFFFFFFFFF200100003800000000000000000000000000000000000000FF0BFF0B01000000FF4600000000000011FF0000020000000000100001000000625A06000000000000000000000000000000000000000000FF0BFF0B01000000FF4600000000000011FF000002000000000010000100FFFF0000000000000000000000000000000000000000FFFFFFFFFF0BFF0B01000000FF4600000000000011FF000002000000000010000100FFFF0000000000000000000000000000000000000000FFFFFFFF
[ 5551.951767] LustreError: 44231:0:(lcommon_cl.c:181:cl_file_inode_init()) Failure to initialize cl object [0x20000a811:0x4680:0x0]: -22
[ 5551.970055] LustreError: 44231:0:(llite_lib.c:2300:ll_prep_inode()) new_inode -fatal: rc -22
[ 5555.470015] Lustre: Unmounted lustre-client
[ 5593.801397] Key type lgssc unregistered
[ 5593.808152] Lustre: 44968:0:(gss_mech_switch.c:81:lgss_mech_unregister()) Unregister krb5 mechanism
[ 5600.473651] LNet: Removed LNI 10.2.2.47@tcp
[root@onyx-76 tests]# 
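
For triage, the rc values in the log above are negated Linux errno numbers. A minimal sketch that maps the codes seen here to their symbolic names; the table is hardcoded with the standard Linux values rather than read from the running platform, and the helper name decode_rc is illustrative only, not part of Lustre.

```python
# Standard Linux errno values for the return codes that appear in the log.
# Hardcoded on purpose: errno numbering differs on non-Linux platforms.
LINUX_ERRNO = {
    16: ("EBUSY", "Device or resource busy"),
    22: ("EINVAL", "Invalid argument"),
    34: ("ERANGE", "Numerical result out of range"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
}


def decode_rc(rc):
    """Return 'NAME (description)' for a negative kernel-style return code."""
    name, text = LINUX_ERRNO.get(-rc, ("UNKNOWN", "not in table"))
    return f"{name} ({text})"


# Codes from the log: stripeoffset write, mds_connect/ost_connect,
# osc_quotactl, and cl_file_inode_init respectively.
for rc in (-34, -16, -107, -22):
    print(f"rc = {rc}: {decode_rc(rc)}")
```

Read this way, the 'stripeoffset' write fails with ERANGE, the connect attempts with EBUSY, and the inode initialization with EINVAL after the unrecognized lsm_magic.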



 Comments   
Comment by Andreas Dilger [ 12/Jan/18 ]

This appears to be a left-over composite file (PFL or FLR) from some other testing. There is nothing else wrong that I can see.

#define LOV_MAGIC_COMP_V1       0x0BD60000
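
The magic reported in the log (0x0BD60BD0) differs from the constant quoted above only in its low 16 bits. A small sketch of how the two relate, assuming the low half-word 0x0BD0 is the marker common to all LOV magics (named LOV_MAGIC_MAGIC here, as recalled from the Lustre headers; treat that name and the split_lov_magic helper as assumptions, not as quoted source):

```python
# Assumption: every LOV magic carries this marker in its low 16 bits.
LOV_MAGIC_MAGIC = 0x0BD0
LOV_MAGIC_MASK = 0xFFFF

OBSERVED_MAGIC = 0x0BD60BD0   # from the lsm_op_find() error in the log
QUOTED_COMP_V1 = 0x0BD60000   # value quoted in the comment above


def split_lov_magic(magic):
    """Split a 32-bit LOV magic into (layout type bits, common marker bits)."""
    return magic >> 16, magic & LOV_MAGIC_MASK


layout_bits, marker_bits = split_lov_magic(OBSERVED_MAGIC)
print(f"layout bits: 0x{layout_bits:04X}, marker bits: 0x{marker_bits:04X}")

# The on-disk value is the quoted constant OR-ed with the common marker.
print((QUOTED_COMP_V1 | LOV_MAGIC_MAGIC) == OBSERVED_MAGIC)
```

Under that assumption the on-disk magic is the composite-layout magic, which is consistent with a 2.9.0 client not recognizing a layout type introduced in 2.10.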
Comment by Ian Costello [ 06/Aug/21 ]

Is the stripeoffset set to 65535 rather than -1?
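
To illustrate the question: -1 and 65535 are the same bit pattern when a value passes through a 16-bit field, so a default offset of -1 could plausibly reappear as 65535. The sketch below only shows that arithmetic. Whether the stripe offset actually went through a 16-bit field here is an assumption and is not confirmed anywhere in this ticket.

```python
def as_u16(value):
    """Interpret the low 16 bits of value as an unsigned integer."""
    return value & 0xFFFF


def as_s16(value):
    """Interpret the low 16 bits of value as a signed (two's complement) integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


print(as_u16(-1))      # -1 seen through an unsigned 16-bit field
print(as_s16(65535))   # 65535 seen through a signed 16-bit field
```

If the older code range-checks the value as a plain integer, 65535 would fall outside the valid OST index range, which would match the rc = -34 (ERANGE) seen when writing 'stripeoffset'.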
