Lustre / LU-18091

sanity-lfsck test_18h: FAIL: (6) Data in /mnt/lustre/d18h.sanity-lfsck/f0 is broken

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor
    • Affects Version/s: Lustre 2.16.0
    • Environment: Ubuntu 24.04 client
    • Severity: 3

    Description

      This issue was created by maloo for jianyu <yujian@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/76b849f1-717c-4798-a705-f16312ffa149

      test_18h failed with the following error:

      == sanity-lfsck test 18h: LFSCK can repair crashed PFL extent range ========================================================== 06:35:47 (1721975747)
      #####
      The PFL extent crashed. During the first cycle LFSCK scanning,
      the layout LFSCK will keep the bad PFL file(s) there without
      scanning its OST-object(s). Then in the second stage scanning,
      the OST will return related OST-object(s) to the MDT as orphan.
      And then the LFSCK on the MDT can rebuild the PFL extent with
      the 'orphan(s)' stripe information.
      #####
      0+1 records in
      0+1 records out
      295280 bytes (295 kB, 288 KiB) copied, 0.00202335 s, 146 MB/s
      cp: error copying '/mnt/lustre/d18h.sanity-lfsck/f0' to '/mnt/lustre/d18h.sanity-lfsck/guard': No data available
      Inject failure stub to simulate bad PFL extent range
      CMD: onyx-35vm6 /usr/sbin/lctl set_param fail_loc=0x162f
      fail_loc=0x162f
      chown: warning: '.' should be ':': '1.1'
      CMD: onyx-35vm6 /usr/sbin/lctl set_param fail_loc=0
      fail_loc=0
      dd: error writing '/mnt/lustre/d18h.sanity-lfsck/f0': No data available
      1+0 records in
      0+0 records out
      0 bytes copied, 0.00104513 s, 0.0 kB/s
      Trigger layout LFSCK to find out the bad lmm_oi and fix them
      CMD: onyx-35vm6 /usr/sbin/lctl lfsck_start -M lustre-MDT0000 -t layout -r -o
      Started LFSCK on the device lustre-MDT0000: scrub layout
      CMD: onyx-35vm6 /usr/sbin/lctl get_param -n 			mdd.lustre-MDT0000.lfsck_layout |
      			awk '/^status/ { print \$2 }'
      CMD: onyx-35vm3 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0000.lfsck_layout
      CMD: onyx-35vm3 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0001.lfsck_layout
      CMD: onyx-35vm3 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0002.lfsck_layout
      CMD: onyx-35vm3 /usr/sbin/lctl get_param -n obdfilter.lustre-OST0003.lfsck_layout
      CMD: onyx-35vm6 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.lfsck_layout
      Data in /mnt/lustre/d18h.sanity-lfsck/f0 should not be broken
      Binary files /mnt/lustre/d18h.sanity-lfsck/f0 and /mnt/lustre/d18h.sanity-lfsck/guard differ
       sanity-lfsck test_18h: @@@@@@ FAIL: (6) Data in /mnt/lustre/d18h.sanity-lfsck/f0 is broken
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-master/4558 - 6.8.0-35-generic
      servers: https://build.whamcloud.com/job/lustre-master/4558 - 5.14.0-427.24.1_lustre.el9.x86_64


      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity-lfsck test_18h - (6) Data in /mnt/lustre/d18h.sanity-lfsck/f0 is broken

    Activity

            pjones Peter Jones added a comment -

            ok then let's close out the ticket as a duplicate of LU-18103 and reopen it if it still occurs once that patch has landed...


            hongchao.zhang Hongchao Zhang added a comment -

            The issue appears to be caused by the splice-read I/O path on Ubuntu 24.04: there is no
            corresponding CLIO state (cl_io, cl_page) set up when ll_readpage() is called, so it
            returns -ENODATA:

              ll_readpage+0x11a2/0x1290 [lustre]
              ? __pfx_ll_read_folio+0x10/0x10 [lustre]
              ll_read_folio+0xe/0x20 [lustre]
              filemap_read_folio+0x46/0xf0
              filemap_get_pages+0x27e/0x3b0
              filemap_splice_read+0x153/0x390
              do_splice_read+0x6d/0xf0
              splice_direct_to_actor+0xb2/0x270
              ? __pfx_direct_splice_actor+0x10/0x10
              do_splice_direct+0x6f/0xc0
              ? __pfx_direct_file_splice_eof+0x10/0x10
              vfs_copy_file_range+0x1b4/0x5e0
              __do_sys_copy_file_range+0xf1/0x230
              __x64_sys_copy_file_range+0x24/0x40
              x64_sys_call+0xc28/0x25c0
              do_syscall_64+0x7f/0x180
            

            It can be fixed by patch https://review.whamcloud.com/c/fs/lustre-release/+/56093 from LU-18103;
            see the test session: https://testing.whamcloud.com/test_sessions/41fe7925-22e4-4aa0-9bb1-21ed5551233f
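            For context, the cp failure reaches this path through copy_file_range(2), as the trace above
            shows. A minimal userspace sketch that should exercise the same path (not from the ticket;
            the file names are assumptions matching the reproduction later in this ticket):

            #define _GNU_SOURCE
            #include <fcntl.h>
            #include <stdio.h>
            #include <unistd.h>

            int main(int argc, char **argv)
            {
                    /* Assumed paths: a PFL file on a Lustre mount and a copy target. */
                    const char *src = argc > 1 ? argv[1] : "/mnt/lustre/f0";
                    const char *dst = argc > 2 ? argv[2] : "/mnt/lustre/f1";
                    int in = open(src, O_RDONLY);
                    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
                    ssize_t rc;

                    if (in < 0 || out < 0) {
                            perror("open");
                            return 1;
                    }

                    /* Same syscall cp uses; on the affected client this is expected to
                     * fail with ENODATA ("No data available") via the splice path in
                     * the stack trace above. */
                    rc = copy_file_range(in, NULL, out, NULL, 1 << 20, 0);
                    if (rc < 0)
                            perror("copy_file_range");
                    else
                            printf("copied %zd bytes\n", rc);

                    close(in);
                    close(out);
                    return rc < 0 ? 1 : 0;
            }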

            yujian Jian Yu added a comment -

            Comments from Andreas:

            It looks like the error is returned from the fast read path, which means the read is <= 4096
            bytes. I don't think the -ENODATA should be returned to userspace in this case; rather, a
            "slow" read (or similar) should be retried.
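            For illustration only, that retry idea has a userspace analogue (a hypothetical workaround
            sketch, not the kernel fix under discussion): try copy_file_range(2) first and, if it fails
            with ENODATA, fall back to a plain read(2)/write(2) loop, which goes through the regular
            read path rather than the splice path:

            #define _GNU_SOURCE
            #include <errno.h>
            #include <fcntl.h>
            #include <unistd.h>

            /* copy_with_fallback(): hypothetical helper, not from any patch in this
             * ticket.  Prefer copy_file_range(2); if the kernel surfaces ENODATA
             * (the failure discussed here), retry with ordinary reads and writes
             * from the current file offsets. */
            static int copy_with_fallback(int in, int out)
            {
                    char buf[65536];
                    ssize_t rc;

                    for (;;) {
                            rc = copy_file_range(in, NULL, out, NULL, sizeof(buf), 0);
                            if (rc == 0)
                                    return 0;               /* EOF, done */
                            if (rc > 0)
                                    continue;               /* keep copying */
                            if (errno != ENODATA)
                                    return -1;              /* real error */
                            break;                          /* fall back below */
                    }

                    /* Slow path: buffered reads do not go through the splice path
                     * and are not affected by the fast-read -ENODATA return. */
                    for (;;) {
                            ssize_t n = read(in, buf, sizeof(buf));

                            if (n == 0)
                                    return 0;
                            if (n < 0 || write(out, buf, (size_t)n) != n)
                                    return -1;
                    }
            }

            int main(int argc, char **argv)
            {
                    if (argc != 3)
                            return 2;

                    int in = open(argv[1], O_RDONLY);
                    int out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);

                    if (in < 0 || out < 0)
                            return 1;
                    return copy_with_fallback(in, out) ? 1 : 0;
            }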

            yujian Jian Yu added a comment -

            In lustre/llite/rw.c:

            ll_readpage()
                            CDEBUG(D_VFSTRACE, "fast read pgno: %ld\n", vmpage->index);
            
                            result = -ENODATA;
            
                            /* TODO: need to verify the layout version to make sure
                             * the page is not invalid due to layout change.
                             */
                            page = cl_vmpage_page(vmpage, clob);
                            if (page == NULL) {
                                    unlock_page(vmpage);
                                    CDEBUG(D_READA, "fast read: failed to find page %ld\n",
                                            vmpage->index);
                                    ll_ra_stats_inc_sbi(sbi, RA_STAT_FAILED_FAST_READ);
                                    RETURN(result);
                            }
            
            yujian Jian Yu added a comment -

            The failure can be easily reproduced on an Ubuntu 24.04 client:

            # lfs setstripe -E 2M -S 1M -c 1 -E -1 /mnt/lustre/f0
            # cat /etc/hosts > /mnt/lustre/f0
            # dd if=/etc/hosts of=/mnt/lustre/f0 bs=1M seek=2
            0+1 records in
            0+1 records out
            303 bytes copied, 0.00146371 s, 207 kB/s
            
            # cp /mnt/lustre/f0 /mnt/lustre/f1
            cp: error copying '/mnt/lustre/f0' to '/mnt/lustre/f1': No data available
            

            Lustre client debug log:

            00000020:00000001:0.0:1725913382.264477:0:84469:0:(cl_object.c:435:cl_object_maxbytes()) Process entered
            00020000:00000002:0.0:1725913382.264478:0:84469:0:(lov_object.c:1243:lov_conf_freeze()) To take share lov(00000000c946e6cf) owner 0000000000000000/000000009c172d67
            00020000:00000002:0.0:1725913382.264479:0:84469:0:(lov_object.c:2435:lov_lsm_addref()) lsm 00000000cec182bf addref 2/0 by 000000009c172d67.
            00020000:00000002:0.0:1725913382.264480:0:84469:0:(lov_object.c:1251:lov_conf_thaw()) To release share lov(00000000c946e6cf) owner 0000000000000000/000000009c172d67
            00000020:00000001:0.0:1725913382.264481:0:84469:0:(cl_object.c:443:cl_object_maxbytes()) Process leaving (rc=17592188137472 : 17592188137472 : 1000001ff000)
            00000080:00000001:0.0:1725913382.264481:0:84469:0:(file.c:4893:ll_file_seek()) Process entered
            00000080:00200000:0.0:1725913382.264482:0:84469:0:(file.c:4895:ll_file_seek()) VFS Op:inode=[0x200000404:0xc:0x0](00000000ae4a5dd4), to=0=0x0(0)
            00000080:00000001:0.0:1725913382.264483:0:84469:0:(file.c:4932:ll_file_seek()) Process leaving (rc=0 : 0 : 0)
            00000080:00000001:0.0:1725913382.264494:0:84469:0:(rw.c:1931:ll_readpage()) Process entered
            00000080:00200000:0.0:1725913382.264495:0:84469:0:(rw.c:1994:ll_readpage()) fast read pgno: 0
            00000020:00000001:0.0:1725913382.264496:0:84469:0:(cl_page.c:542:cl_vmpage_page()) Process entered
            00000020:00000001:0.0:1725913382.264496:0:84469:0:(cl_page.c:556:cl_vmpage_page()) Process leaving (rc=0 : 0 : 0)
            00000080:00400000:0.0:1725913382.264497:0:84469:0:(rw.c:2003:ll_readpage()) fast read: failed to find page 0
            00000080:00000001:0.0:1725913382.264498:0:84469:0:(rw.c:2006:ll_readpage()) Process leaving (rc=18446744073709551555 : -61 : ffffffffffffffc3)
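            The rc in the last line matches the userspace error: on Linux, -61 is -ENODATA, which
            strerror() renders as "No data available", exactly what cp reports. A quick check
            (illustrative only):

            #include <errno.h>
            #include <stdio.h>
            #include <string.h>

            int main(void)
            {
                    /* Prints "ENODATA = 61: No data available" on Linux, matching both
                     * the rc=-61 in the debug log and the cp error message. */
                    printf("ENODATA = %d: %s\n", ENODATA, strerror(ENODATA));
                    return 0;
            }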


            "Hongchao Zhang <hongchao@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/56263
            Subject: LU-18091 test: debug patch
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 8d4234c7953229541f429d45fb8f1b2923536605

            lixi_wc Li Xi added a comment -

            Hongchao, would you please take a look?


            People

              Assignee: Hongchao Zhang (hongchao.zhang)
              Reporter: Maloo (maloo)
              Votes: 0
              Watchers: 9
