Lustre / LU-18962

sanity test_27M: lod_fill_mirrors()) ASSERTION( (!!(!lo->ldo_is_composite) == !!(lo->ldo_mirror_count == 0)) )

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor

    Description

      This issue was created by maloo for Sergey Cheremencev <scherementsev@ddn.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/763b6c83-cfa9-42e3-99e7-4bfa54cae8ad

      test_27M failed with the following error:

      onyx-107vm6 crashed during sanity test_27M
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-reviews/112825 - 4.18.0-553.46.1.el8_10.x86_64
      servers: https://build.whamcloud.com/job/lustre-reviews/112825 - 4.18.0-553.46.1.el8_lustre.x86_64

      [ 1514.074237] Lustre: DEBUG MARKER: == sanity test 27M: test O_APPEND striping =============== 10:13:43 (1745662423)
      [ 1514.307798] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.append_pool
      [ 1514.631832] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param mdd.*.append_pool=LOV_MAXPOOLNAME*
      [ 1514.969093] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.append_stripe_count
      [ 1515.161685] LustreError: 96333:0:(lod_lov.c:588:lod_fill_mirrors()) ASSERTION( (!!(!lo->ldo_is_composite) == !!(lo->ldo_mirror_count == 0)) ) failed: 
      [ 1515.164259] LustreError: 96333:0:(lod_lov.c:588:lod_fill_mirrors()) LBUG
      [ 1515.165580] CPU: 0 PID: 96333 Comm: mdt00_005 Kdump: loaded Tainted: G           OE     -------- -  - 4.18.0-553.46.1.el8_lustre.x86_64 #1
      [ 1515.167899] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [ 1515.169021] Call Trace:
      [ 1515.169550]  dump_stack+0x41/0x60
      [ 1515.170249]  lbug_with_loc.cold.8+0x5/0x43 [libcfs]
      [ 1515.171261]  lod_fill_mirrors+0x5ea/0x760 [lod]
      [ 1515.172324]  lod_striped_create+0x2fa/0x5a0 [lod]
      [ 1515.173295]  lod_create+0x2ca/0x3f0 [lod]
      [ 1515.174131]  mdd_create_object_internal+0xad/0x330 [mdd]
      [ 1515.175285]  mdd_create_object+0xaf/0xa70 [mdd]
      [ 1515.176215]  mdd_create+0xdc5/0x1b30 [mdd]
      [ 1515.177058]  mdt_reint_open+0x2ec4/0x3110 [mdt]
      [ 1515.178180]  mdt_reint_rec+0x123/0x270 [mdt]
      [ 1515.179086]  mdt_reint_internal+0x4c5/0x970 [mdt]
      [ 1515.180060]  mdt_intent_open+0x13b/0x420 [mdt]
      [ 1515.180987]  ? mdt_intent_fixup_resent+0x220/0x220 [mdt]
      [ 1515.182066]  mdt_intent_opc.constprop.82+0x125/0xc50 [mdt]
      [ 1515.183172]  ? lprocfs_counter_add+0x117/0x180 [obdclass]
      [ 1515.184550]  mdt_intent_policy+0xfb/0x470 [mdt]
      [ 1515.185494]  ldlm_lock_enqueue+0x3e7/0x8f0 [ptlrpc]
      [ 1515.187028]  ? cfs_hash_multi_bd_lock+0xa0/0xa0 [obdclass]
      [ 1515.188147]  ldlm_handle_enqueue+0x3f3/0x15f0 [ptlrpc]
      [ 1515.189268]  tgt_enqueue+0xa8/0x230 [ptlrpc]
      [ 1515.190232]  tgt_request_handle+0x3f4/0x1b80 [ptlrpc]
      [ 1515.191328]  ptlrpc_server_handle_request+0x27b/0xcd0 [ptlrpc]
      [ 1515.192560]  ? lprocfs_counter_add+0x117/0x180 [obdclass]
      [ 1515.193669]  ptlrpc_main+0xc81/0x1560 [ptlrpc]
      [ 1515.194636]  ? __schedule+0x2d9/0x870
      [ 1515.195378]  ? ptlrpc_wait_event+0x5b0/0x5b0 [ptlrpc]
      [ 1515.196453]  kthread+0x134/0x150
      [ 1515.197126]  ? set_kthread_struct+0x50/0x50
      [ 1515.197969]  ret_from_fork+0x35/0x40
      [ 1515.198729] Kernel panic - not syncing: LBUG
      

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      sanity test_27M - onyx-107vm6 crashed during sanity test_27M

          Activity

            This comment was generated by Crash AI (GPT-4o).

            Issue Information

            • Type: Kernel Assertion Failure
            • Indicators: crash, Call Trace, Kernel panic
            • Summary: sanity test_27M: lod_fill_mirrors()) ASSERTION( (!!(!lo->ldo_is_composite) == !!(lo->ldo_mirror_count == 0)) )

            Test Information

            • Test: sanity
            • Status: CRASH
            • Failed Subtest: 'onyx-107vm6 crashed during sanity test_27M'
            • Possible Related Bugs: EX-12103, LU-18962

            Environment Information

            Lustre Details

            • Version: 2.16.54.48
            • Branch: master
            • Revision: ec72fecaaf95dcc8241105533e3c6650b73eb431

            System Information

            • Node: onyx-107vm6
            • Services: MDS 1
            • Architecture: x86_64
            • Distribution: RHEL 8.10
            • Memory: 2.6 GB
            • Kernel: 4.18.0-553.46.1.el8_lustre.x86_64
            • Networks: tcp

            Resource Links

            • Build: Build #112825
            • VMCore Path: your_user_name@ssh.onyx.whamcloud.int:/scratch/dumps/onyx-107vm6.onyx.whamcloud.com/10.240.28.209-2025-04-26-10:13:52/vmcore

            Crash Analysis

            Due to the lack of access to vmlinux or vmcore files, detailed crash analysis information could not be obtained. Basic information is provided based on available data.

            Technical Details

            The crash occurred in `lod_fill_mirrors` (`lod_lov.c:588`), part of the Lustre Logical Object Device (LOD) module. The assertion requires that an object be non-composite exactly when its mirror count is zero; it failed because `lo->ldo_is_composite` and `lo->ldo_mirror_count` disagreed. The call trace shows the assertion was reached while creating a file on open: `mdt_intent_open` -> `mdt_reint_open` -> `mdd_create` -> `mdd_create_object` -> `mdd_create_object_internal` -> `lod_create` -> `lod_striped_create` -> `lod_fill_mirrors`.
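
To make the failed condition concrete, here is a minimal userspace sketch of the same boolean expression. The `layout_state` struct, `mirror_invariant_holds()` and `CHECK_MIRROR_INVARIANT` are hypothetical names introduced only for illustration; they are not the real `struct lod_object` or the Lustre assertion macro.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for the two fields named in the
 * assertion. This is NOT the real struct lod_object from Lustre.
 */
struct layout_state {
	bool is_composite;         /* models lo->ldo_is_composite */
	unsigned int mirror_count; /* models lo->ldo_mirror_count */
};

/* Same boolean expression as the ASSERTION text in the console log. */
static bool mirror_invariant_holds(const struct layout_state *lo)
{
	return !!(!lo->is_composite) == !!(lo->mirror_count == 0);
}

/* Userspace analogue of the kernel assertion: aborts when the check fails. */
#define CHECK_MIRROR_INVARIANT(lo) assert(mirror_invariant_holds(lo))
```

Under this model, a non-composite state with zero mirrors and a composite state with one or more mirrors both pass. A composite state with zero mirrors, or a non-composite state with a nonzero mirror count, fails. The console log alone does not show which of the two failing states occurred here.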

            Recommendations

            1. Investigate the state of `lo->ldo_is_composite` and `lo->ldo_mirror_count` at the time of failure.
            2. Review recent changes to the LOD mirror handling code.
            3. Check for race conditions in mirror setup/teardown operations.
            4. Consider adding additional validation before the assertion point.
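
As a rough illustration of recommendation 4, the sketch below validates the two fields and returns an error code instead of asserting. It is a userspace model under stated assumptions: `layout_state`, `layout_state_validate()` and `layout_state_selftest()` are hypothetical names, and nothing in this ticket says that returning an error is the right behaviour at this point in the real create path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fields in the assertion, not Lustre code. */
struct layout_state {
	bool is_composite;         /* models lo->ldo_is_composite */
	unsigned int mirror_count; /* models lo->ldo_mirror_count */
};

/*
 * Returns 0 when the two fields agree, -EINVAL otherwise.
 * The choice of -EINVAL is an assumption made for this sketch.
 */
static int layout_state_validate(const struct layout_state *lo)
{
	if (lo->is_composite && lo->mirror_count == 0)
		return -EINVAL; /* composite layout but no mirrors counted */
	if (!lo->is_composite && lo->mirror_count != 0)
		return -EINVAL; /* plain layout but mirrors counted */
	return 0;
}

/* Small self-check covering all four combinations of the two fields. */
static void layout_state_selftest(void)
{
	const struct layout_state ok_plain = { false, 0 };
	const struct layout_state ok_mirrored = { true, 1 };
	const struct layout_state bad_composite = { true, 0 };
	const struct layout_state bad_plain = { false, 1 };

	assert(layout_state_validate(&ok_plain) == 0);
	assert(layout_state_validate(&ok_mirrored) == 0);
	assert(layout_state_validate(&bad_composite) == -EINVAL);
	assert(layout_state_validate(&bad_plain) == -EINVAL);
}
```

Splitting the check into two branches makes it possible to log which side of the invariant was violated, which the single combined expression in the assertion message does not reveal.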

            vkuznetsov Vitaliy Kuznetsov added the comment above (generated by Crash AI, GPT-4o).
            bobijam Zhenyu Xu added a comment - The patch https://review.whamcloud.com/c/fs/lustre-release/+/58981 causes this error.

            People

              scherementsev Sergey Cheremencev
              maloo Maloo