[LU-6635] sanity-lfsck test_18e:FAIL: (8) .lustre/lost+found/MDT0000/ should not be empty Created: 24/May/15  Updated: 20/Oct/16  Resolved: 20/Oct/16

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.8.0
Fix Version/s: Lustre 2.9.0

Type: Bug Priority: Minor
Reporter: Maloo Assignee: nasf (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Related
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for wangdi <di.wang@intel.com>

This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/35e75e88-012d-11e5-9d1f-5254006e85c2.

The sub-test test_18e failed with the following error:

(8) .lustre/lost+found/MDT0000/ should not be empty
CMD: shadow-20vm12 /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.lfsck_layout
There should be stub file under .lustre/lost+found/MDT0000/
 sanity-lfsck test_18e: @@@@@@ FAIL: (8) .lustre/lost+found/MDT0000/ should not be empty 
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:4727:error_noexit()
  = /usr/lib64/lustre/tests/test-framework.sh:4758:error()
  = /usr/lib64/lustre/tests/sanity-lfsck.sh:2261:test_18e()
  = /usr/lib64/lustre/tests/test-framework.sh:5020:run_one()
  = /usr/lib64/lustre/tests/test-framework.sh:5057:run_one_logged()
  = /usr/lib64/lustre/tests/test-framework.sh:4907:run_test()
  = /usr/lib64/lustre/tests/sanity-lfsck.sh:2277:main()
Dumping lctl log to /logdir/test_logs/2015-05-22/lustre-reviews-el6_6-x86_64--review-dne-part-2--2_9_1__32432__-70061688987660-225248/sanity-lfsck.test_18e.*.1432352325.log
CMD: shadow-20vm10.shadow.whamcloud.com,shadow-20vm11,shadow-20vm12,shadow-20vm8,shadow-20vm9 /usr/sbin/lctl dk > /logdir/test_logs/2015-05-22/lustre-reviews-el6_6-x86_64--review-dne-part-2--2_9_1__32432__-70061688987660-225248/sanity-lfsck.test_18e.debug_log.\$(hostname -s).1432352325.log;
         dmesg > /logdir/test_logs/2015-05-22/lustre-reviews-el6_6-x86_64--review-dne-part-2--2_9_1__32432__-70061688987660-225248/sanity-lfsck.test_18e.dmesg.\$(hostname -s).1432352325.log


 Comments   
Comment by Di Wang [ 24/May/15 ]

Another failure

https://testing.hpdd.intel.com/test_sets/71b23c06-0116-11e5-8699-5254006e85c2

Comment by Peter Jones [ 25/May/15 ]

Fan Yong

Any ideas on this one?

Peter

Comment by nasf (Inactive) [ 27/May/15 ]

The main purpose of sanity-lfsck test_18e is the following:

During the first-stage scanning of the layout LFSCK, if some MDT-object holds a dangling reference to an OST-object, the LFSCK cannot yet know whether the dangling reference is valid, so it re-creates the specified (lost) OST-object. That decision may turn out to be wrong: if, during the second-stage scanning, the layout LFSCK finds an orphan OST-object that back-references the MDT-object with the former dangling reference, it will try to replace the re-created OST-object with the orphan OST-object. Before doing so, it checks whether the re-created OST-object has been modified since it was re-created. If not, it is replaced; otherwise there is a conflict, and the layout LFSCK links the orphan OST-object to a new MDT-object under .lustre/lost+found/MDTxxxx/. test_18e exercises the latter case.
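
As a rough sketch of what that conflict handling should leave behind (hypothetical commands, not the actual sanity-lfsck code; the client mount point /mnt/lustre is an assumption), the stub can be verified like this:

        # Hypothetical check mirroring the test_18e expectation: after the
        # LFSCK run, the conflicting orphan OST-object should have been
        # linked to a stub MDT-object under .lustre/lost+found/MDT0000/.
        LF=/mnt/lustre/.lustre/lost+found/MDT0000
        cnt=$(ls -A "$LF" 2>/dev/null | wc -l)
        if [ "$cnt" -eq 0 ]; then
                echo "FAIL: $LF should not be empty"
        else
                echo "PASS: found $cnt stub object(s) under $LF"
        fi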

The LFSCK logic may not be perfect, but the failure here is this: when the layout LFSCK on the OST checked whether the re-created OST-object had been modified since its re-creation, it found the object empty. That is strange, because according to the test script we modify the re-created OST-object as follows:

        echo "Trigger layout LFSCK on all devices to find out orphan OST-object"
        $START_LAYOUT -r -o -c || error "(2) Fail to start LFSCK for layout!"

        wait_update_facet mds1 "$LCTL get_param -n \
                mdd.$(facet_svc mds1).lfsck_layout |
                awk '/^status/ { print \\\$2 }'" "scanning-phase2" $LTIME ||
                error "(3) MDS1 is not the expected 'scanning-phase2'"

        # to guarantee all updates are synced.
        sync
        sleep 2

        echo "Write new data to f2 to modify the new created OST-object."
        echo "dummy" >> $DIR/$tdir/a1/f2

        do_facet $SINGLEMDS $LCTL set_param fail_val=0 fail_loc=0

And the logs show that the fail_loc was triggered. On the other hand, when the layout LFSCK checks whether the re-created OST-object has been modified since its re-creation, it first takes the LDLM lock. The log shows that the lock was taken, but the OST-object was empty:

00010000:00010000:1.0:1432342702.953829:0:15109:0:(ldlm_request.c:487:ldlm_cli_enqueue_local()) ### client-side local enqueue handler, new lock created ns: filter-lustre-OST0000_UUID lock: ffff880070e43080/0xbd508038ad6121c7 lrc: 3/0,1 mode: EX/EX res: [0x7ac:0x0:0x0].0 rrc: 1 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) flags: 0x40010000000000 nid: local remote: 0x0 expref: -99 pid: 15109 timeout: 0 lvb_type: 0
00100000:10000000:1.0:1432342702.953874:0:15109:0:(lfsck_layout.c:2037:lfsck_layout_slave_conditional_destroy()) lustre-OST0000-osd: layout LFSCK destroyed the empty OST-object [0x100000000:0x7ac:0x0] that was created for reparing dangling referenced case. But the original missing OST-object is found now.

Since the re-created OST-object had not been modified, the layout LFSCK replaced it with the orphan OST-object, which is NOT what test_18e expects.

Di, would you please check whether there is some logic in your patches that may affect the data being written back to the OST? Thanks!

Comment by nasf (Inactive) [ 15/Jun/15 ]

Di, is this still a problem for you? Or have you stopped hitting it because some other patch fixed it as a side effect?

Comment by Di Wang [ 15/Jun/15 ]

Nasf: yes, this still happens https://testing.hpdd.intel.com/test_sessions/3f3d32ea-12a4-11e5-b9a9-5254006e85c2

Comment by Di Wang [ 16/Jun/15 ]

would you please check whether there is some logic in your patches that may affect the data being written back to the OST? Thanks!

I do not remember doing anything like this.

Comment by nasf (Inactive) [ 16/Jun/15 ]

        wait_update_facet mds1 "$LCTL get_param -n \
                mdd.$(facet_svc mds1).lfsck_layout |
                awk '/^status/ { print \\\$2 }'" "scanning-phase2" $LTIME ||
                error "(3) MDS1 is not the expected 'scanning-phase2'"

        # to guarantee all updates are synced.
        sync
        sleep 2

        echo "Write new data to f2 to modify the new created OST-object."
        echo "dummy" >> $DIR/$tdir/a1/f2

The reason is that some very slow processing caused the layout LFSCK to wake up from OBD_FAIL_LFSCK_DELAY3 and handle the target object "f2" before 'echo "dummy" >> $DIR/$tdir/a1/f2' was performed. The suspect is the "sync". You can add a timer before and after the "sync" to measure how long it takes in your test.
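
For example (a sketch only, not part of the test suite), the timer could be as simple as:

        # Hypothetical instrumentation around the suspect "sync":
        # log the wall-clock seconds the call takes.
        start=$(date +%s)
        sync
        end=$(date +%s)
        echo "sync took $((end - start)) second(s)"
        sleep 2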

Comment by nasf (Inactive) [ 31/Jul/15 ]

Di, do you have the logs with the "sync" time measured?
Or, if the issue is no longer valid, can we close the ticket?

Comment by Di Wang [ 25/Aug/15 ]

Fan Yong: No, I do not have the logs; I have only seen these failures in Maloo testing. Here is another failure from yesterday. Please check, thanks.
https://testing.hpdd.intel.com/test_sets/7368a004-4b0c-11e5-bc8b-5254006e85c2

Comment by nasf (Inactive) [ 02/Sep/15 ]

According to the log https://testing.hpdd.intel.com/test_sets/7368a004-4b0c-11e5-bc8b-5254006e85c2, the client-side time is consumed inside the following wait_update_facet script:

        $START_LAYOUT -r -o -c || error "(2) Fail to start LFSCK for layout!"

        wait_update_facet mds1 "$LCTL get_param -n \
                mdd.$(facet_svc mds1).lfsck_layout |
                awk '/^status/ { print \\\$2 }'" "scanning-phase2" $LTIME ||
                error "(3) MDS1 is not the expected 'scanning-phase2'"

        # to guarantee all updates are synced.
        sync
        sleep 2
        
        echo "Write new data to f2 to modify the new created OST-object."
        echo "dummy" >> $DIR/$tdir/a1/f2
00000001:00000001:1.0:1440478103.384505:0:4206:0:(debug.c:334:libcfs_debug_mark_buffer()) ***************************************************
00000001:02000400:1.0:1440478103.384506:0:4206:0:(debug.c:335:libcfs_debug_mark_buffer()) DEBUG MARKER: /usr/sbin/lctl get_param -n             mdd.lustre-MDT0000.lfsck_layout |
                awk '/^status/ { print $2 }'
00000001:00000001:1.0:1440478103.385590:0:4206:0:(debug.c:336:libcfs_debug_mark_buffer()) ***************************************************
...
00000001:00000001:1.0:1440478124.343827:0:4318:0:(debug.c:334:libcfs_debug_mark_buffer()) ***************************************************
00000001:02000400:1.0:1440478124.343828:0:4318:0:(debug.c:335:libcfs_debug_mark_buffer()) DEBUG MARKER: /usr/sbin/lctl get_param -n                     mdd.lustre-MDT0000.lfsck_layout |
                        awk '/^status/ { print $2 }'
00000001:00000001:1.0:1440478124.344946:0:4318:0:(debug.c:336:libcfs_debug_mark_buffer()) ***************************************************

The status is expected to be detected in about 1 second, but here it took about 21 seconds. Such a long interval caused the subsequent write operation to be postponed until after the LFSCK had replaced the newly created OST-object.

It seems that the client was NOT under heavy load, so please check your test scripts to ensure that wait_update_facet() works as intended.
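
One way to make such stalls visible (a sketch only; the real wait_update lives in test-framework.sh and differs) is to timestamp every poll of the LFSCK status:

        # Hypothetical timestamped polling loop in the spirit of
        # wait_update_facet: check the status once per second and log when
        # each check happens, so a ~20-second stall between polls is obvious.
        expect="scanning-phase2"
        deadline=$(( $(date +%s) + ${LTIME:-32} ))
        while [ $(date +%s) -lt $deadline ]; do
                status=$($LCTL get_param -n mdd.lustre-MDT0000.lfsck_layout |
                        awk '/^status/ { print $2 }')
                echo "$(date +%s): lfsck_layout status=$status"
                [ "$status" = "$expect" ] && break
                sleep 1
        done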

Comment by James Nunez (Inactive) [ 26/Oct/15 ]

Another failure on master at https://testing.hpdd.intel.com/test_sets/b122edfe-7b2d-11e5-9650-5254006e85c2

Comment by Gerrit Updater [ 03/Nov/15 ]

Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/17025
Subject: LU-6635 tests: more log message for wait_update
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 68b20adc367826650b1c48a464b4fb500deee788

Comment by James Nunez (Inactive) [ 17/Dec/15 ]

More failures on master:
2015-12-12 07:31:16 - https://testing.hpdd.intel.com/test_sets/b1b6505e-a0cf-11e5-9d88-5254006e85c2
2015-12-15 04:11:00 - https://testing.hpdd.intel.com/test_sets/89178d04-a2f3-11e5-9b3d-5254006e85c2
2015-12-16 10:58:27 - https://testing.hpdd.intel.com/test_sets/bf7d399a-a413-11e5-b715-5254006e85c2
2015-12-16 22:16:33 - https://testing.hpdd.intel.com/test_sets/0b63ca3a-a451-11e5-8701-5254006e85c2

Comment by Gerrit Updater [ 26/Jan/16 ]

Fan Yong (fan.yong@intel.com) uploaded a new patch: http://review.whamcloud.com/18146
Subject: LU-6635 lfsck: block replacing the OST-object for test
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 183abfd4cb2186c1170cd1dfaac31d02df9ddeda

Comment by nasf (Inactive) [ 26/Jan/16 ]

The root cause is that the client-side write happened after the OST had already replaced the newly created OST-object with the old orphan. The solution is to hold off the replacement until the write has happened. The patch http://review.whamcloud.com/18146 does that.
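
In test terms, the usual fail_loc pattern for such a hold looks roughly like this (the fail_loc value below is a placeholder, NOT the constant that patch 18146 actually defines):

        # Hypothetical test-side sequence: block the LFSCK on the OST before
        # it replaces the re-created OST-object, let the client write land,
        # then release the fail_loc so the LFSCK can continue.
        do_facet ost1 $LCTL set_param fail_loc=0x1600    # placeholder value

        echo "dummy" >> $DIR/$tdir/a1/f2    # the write now reliably lands first
        sync

        do_facet ost1 $LCTL set_param fail_loc=0    # let the LFSCK resume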

Comment by Gerrit Updater [ 20/Oct/16 ]

Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/18146/
Subject: LU-6635 lfsck: block replacing the OST-object for test
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 7a814e94e065551ab79e2ba75df9626e4940efc5

Comment by Peter Jones [ 20/Oct/16 ]

Landed for 2.9
