[LU-4806] Test failure sanity test_53: can not match last_seq/last_id for *OST*-osc-MDT0000 Created: 24/Mar/14  Updated: 19/Apr/22

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.6.0, Lustre 2.8.0, Lustre 2.10.1
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 13221

 Description   

This issue was created by maloo for Nathaniel Clark <nathaniel.l.clark@intel.com>

This issue relates to the following test suite run:
http://testing.hpdd.intel.com/test_sets/963b13e0-b144-11e3-808b-52540035b04c
https://testing.hpdd.intel.com/test_sets/bcc5032e-b147-11e3-98c2-52540035b04c

The sub-test test_53 failed with the following error:

can not match last_seq/last_id for *OST*-osc-MDT0000

Info required for matching: sanity 53



 Comments   
Comment by Oleg Drokin [ 24/Mar/14 ]

This is a regression from the unlanded patch at http://review.whamcloud.com/#/c/7936/

Comment by Yang Sheng [ 26/Mar/14 ]

Yes, this issue really is caused by the lprocfs patch. I also hit it in my local testing (comment-78888). I have commented on the root cause in the related patches.

Comment by James A Simmons [ 26/Mar/14 ]

I have been looking into it. I see what you mean about not freeing type->typ_procsym when creating a symlink fails. This is only done in the case of proc subdirectories like target_obds. I'm thinking the safest place to free type->typ_procsym is in class_unregister_type. I am testing new patches.

Comment by James A Simmons [ 27/Mar/14 ]

The latest LU-3319 patches seem to calm these problems down.

Comment by Sarah Liu [ 24/Jul/14 ]

Hit this error in a rolling upgrade test: after upgrading the server from 2.5.2 ldiskfs to b2_6-rc2, with clients still on 2.5.2, sanity test_53 failed with the same error.

Comment by James A Simmons [ 27/Aug/15 ]

Do we still see this problem? If not, we can close this ticket.

Comment by Justin Miller [ 27/Aug/15 ]

We're seeing this test failure for 2.7.0.

(debug.c:345:libcfs_debug_mark_buffer()) DEBUG MARKER: sanity test_53: @@@@@@ FAIL: can not match last_seq/last_id for *OST*-osc

Comment by Sarah Liu [ 01/Sep/15 ]

Hit in interop testing with a master server (DNE) and a 2.5.3 client:

== sanity test 53: verify that MDS and OSTs agree on pre-creation ====== 19:53:08 (1441075988)
Lustre: DEBUG MARKER: == sanity test 53: verify that MDS and OSTs agree on pre-creation ====== 19:53:08 (1441075988)
onyx-25: error: get_param: osc/*OST*-osc-/prealloc_last_id: Found no match
 sanity test_53: @@@@@@ FAIL: can not match last_seq/last_id for *OST*-osc- 
Lustre: DEBUG MARKER: sanity test_53: @@@@@@ FAIL: can not match last_seq/last_id for *OST*-osc-
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:4343:error_noexit()
  = /usr/lib64/lustre/tests/test-framework.sh:4374:error()
  = sanity.sh:3968:test_53()
  = /usr/lib64/lustre/tests/test-framework.sh:4613:run_one()
  = /usr/lib64/lustre/tests/test-framework.sh:4648:run_one_logged()
  = /usr/lib64/lustre/tests/test-framework.sh:4516:run_test()
  = sanity.sh:3971:main()
Dumping lctl log to /home/w3liu/toro_home/test_logs/sanity.test_53.*.1441075990.log
FAIL 53 (9s)

Comment by James A Simmons [ 01/Sep/15 ]

I see what the problem is. The param "osc/*OST*-osc-/prealloc_last_id" needs to be "osc/*OST*-osc-*/prealloc_last_id". The bug is in get_mdtosc_proc_path() in test-framework.sh.

The mdt_index variable is coming back null.
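
To illustrate the failure mode described above, here is a minimal sketch of how an empty index produces the non-matching pattern and how a wildcard fallback avoids it. The function build_mdtosc_pattern is hypothetical and written only for this illustration; it is not the actual get_mdtosc_proc_path() code from test-framework.sh, and the real fix may look different.

```shell
#!/bin/bash
# Hypothetical helper for illustration only; NOT the real
# get_mdtosc_proc_path() from test-framework.sh.
build_mdtosc_pattern() {
    local mdt_index="$1"   # e.g. "MDT0000"; empty if the lookup failed

    # A naive "osc/*OST*-osc-${mdt_index}/..." collapses to
    # "osc/*OST*-osc-/..." when mdt_index is empty, which matches nothing.
    # ${mdt_index:-*} substitutes a wildcard in that case (assumed fallback).
    printf '%s\n' "osc/*OST*-osc-${mdt_index:-*}/prealloc_last_id"
}

build_mdtosc_pattern "MDT0000"
build_mdtosc_pattern ""
```

With a populated index the pattern names the specific MDT, as in the ticket title; with an empty index it degrades to "osc/*OST*-osc-*/prealloc_last_id" rather than to the "osc/*OST*-osc-/prealloc_last_id" form seen in the log.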

Comment by Sarah Liu [ 11/Jan/16 ]

https://testing.hpdd.intel.com/test_sets/173b63f6-b575-11e5-bf32-5254006e85c2

server: lustre-master #3297 RHEL7 DNE(4 MDT)
client: 2.7.1

Comment by Sarah Liu [ 25/Sep/17 ]

https://testing.hpdd.intel.com/test_sets/806257c8-a212-11e7-b778-5254006e85c2 2.10.1 MOFED zfs

Comment by Andreas Dilger [ 08/Dec/17 ]

James, I see you identified the problem in your 01/Sep/15 comment; could you please make a patch for this?

Comment by James Nunez (Inactive) [ 11/Dec/17 ]

I've looked at sanity test_53 failures for the past year in Maloo for autotest results. For both master and b2_10, this test has not failed with the 'can not match last_seq/last_id for *OST*-osc' error this year. I did not look at other branches.

We do see this test fail with this error when we’re doing manual testing with all servers on a single node. Here are two examples:
https://testing.hpdd.intel.com/test_sessions/802c5d30-a212-11e7-b778-5254006e85c2
https://testing.hpdd.intel.com/test_sessions/128214f8-a7c8-11e7-b786-5254006e85c2

Generated at Sat Feb 10 01:46:00 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.