[LU-18496] sanityn test_80b: crash RIP: 0010:mutex_lock+0x19/0x30

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Minor
    • Severity: 3

    Description

      This issue was created by maloo for Qian Yingjin <qian@ddn.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/f71abdfd-dc72-46c4-9aec-8b5b5452f2fc

      test_80b failed with the following error:

      onyx-107vm4 crashed during sanityn test_80b
      [ 8855.553395] LustreError: 252476:0:(mdt_reint.c:2533:mdt_reint_migrate()) lustre-MDT0002: migrate [0x280000404:0x21b:0x0]/source_file failed: rc = -114
      [ 8855.556190] LustreError: 252476:0:(mdt_reint.c:2533:mdt_reint_migrate()) Skipped 1 previous similar message
      [ 8856.216711] LustreError: 252476:0:(mdd_dir.c:4457:mdd_migrate_cmd_check()) lustre-MDD0000: 'migrate_dir' migration was interrupted, run 'lfs migrate -m 3 -c 1 -H crush migrate_dir' to finish migration: rc = -1
      [ 8856.839105] LustreError: 252476:0:(mdd_dir.c:4457:mdd_migrate_cmd_check()) lustre-MDD0000: 'migrate_dir' migration was interrupted, run 'lfs migrate -m 2 -c 1 -H crush migrate_dir' to finish migration: rc = -1
      [ 8856.842675] LustreError: 252476:0:(mdd_dir.c:4457:mdd_migrate_cmd_check()) Skipped 6 previous similar messages
      [ 8856.844712] LustreError: 252476:0:(mdt_reint.c:2533:mdt_reint_migrate()) lustre-MDT0000: migrate [0x280000404:0x20f:0x0]/migrate_dir failed: rc = -1
      [ 8856.847255] LustreError: 252476:0:(mdt_reint.c:2533:mdt_reint_migrate()) Skipped 10 previous similar messages
      [ 8858.131298] LustreError: 252476:0:(mdd_dir.c:4457:mdd_migrate_cmd_check()) lustre-MDD0000: 'migrate_dir' migration was interrupted, run 'lfs migrate -m 1 -c 1 -H crush migrate_dir' to finish migration: rc = -1
      [ 8858.134987] LustreError: 252476:0:(mdd_dir.c:4457:mdd_migrate_cmd_check()) Skipped 3 previous similar messages
      [ 8858.865752] LustreError: 451159:0:(mdt_reint.c:2533:mdt_reint_migrate()) lustre-MDT0000: migrate [0x280000404:0x20f:0x0]/migrate_dir failed: rc = -1
      [ 8858.868387] LustreError: 451159:0:(mdt_reint.c:2533:mdt_reint_migrate()) Skipped 29 previous similar messages
      [ 8860.131770] LustreError: 451159:0:(mdd_dir.c:4457:mdd_migrate_cmd_check()) lustre-MDD0002: 'migrate_dir' migration was interrupted, run 'lfs migrate -m 0 -c 1 -H crush migrate_dir' to finish migration: rc = -1
      [ 8860.135463] LustreError: 451159:0:(mdd_dir.c:4457:mdd_migrate_cmd_check()) Skipped 20 previous similar messages
      [ 8862.109739] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
      [ 8862.112630] PGD 0 P4D 0 
      [ 8862.113234] Oops: 0002 [#1] SMP PTI
      [ 8862.114021] CPU: 0 PID: 252463 Comm: mdt00_002 Kdump: loaded Tainted: P        W  OE     -------- -  - 4.18.0-553.16.1.el8_lustre.x86_64 #1
      [ 8862.116409] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [ 8862.117554] RIP: 0010:mutex_lock+0x19/0x30
      [ 8862.118451] Code: 00 0f 1f 44 00 00 be 02 00 00 00 e9 d1 fb ff ff 90 0f 1f 44 00 00 53 48 89 fb e8 02 e0 ff ff 31 c0 65 48 8b 14 25 40 dc 01 00 <f0> 48 0f b1 13 74 06 48 89 df 5b eb ca 5b c3 cc cc cc cc 0f 1f 40
      [ 8862.121971] RSP: 0018:ffffbbfa4175f8f8 EFLAGS: 00010246
      [ 8862.123012] RAX: 0000000000000000 RBX: 00000000000000b8 RCX: dead000000000200
      [ 8862.124420] RDX: ffffa0076bf3d000 RSI: ffffa00791c43478 RDI: 00000000000000b8
      [ 8862.125833] RBP: 00000000000000b8 R08: ffffa00798860000 R09: 000000000000019c
      [ 8862.127213] R10: 8080808080808080 R11: 0000000000000000 R12: ffffa0076af38c78
      [ 8862.128608] R13: ffffa007625a6b00 R14: ffffa00791c42d80 R15: ffffa00791c42cf0
      [ 8862.129989] FS:  0000000000000000(0000) GS:ffffa007ffc00000(0000) knlGS:0000000000000000
      [ 8862.131552] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 8862.132682] CR2: 00000000000000b8 CR3: 0000000094a10006 CR4: 00000000001706f0
      [ 8862.134068] Call Trace:
      [ 8862.134654]  ? __die_body+0x1a/0x60
      [ 8862.135419]  ? no_context+0x1ba/0x3f0
      [ 8862.136186]  ? __bad_area_nosemaphore+0x157/0x180
      [ 8862.137135]  ? do_page_fault+0x37/0x12d
      [ 8862.137923]  ? page_fault+0x1e/0x30
      [ 8862.138667]  ? mutex_lock+0x19/0x30
      [ 8862.139390]  ? mutex_lock+0xe/0x30
      [ 8862.140096]  sa_spill_rele+0x19/0xa0 [zfs]
      [ 8862.141442]  osd_object_sa_dirty_rele+0x41/0x120 [osd_zfs]
      [ 8862.142624]  osd_trans_stop+0x3bc/0x550 [osd_zfs]
      [ 8862.143596]  top_trans_stop+0x9f/0x1120 [ptlrpc]
      [ 8862.144667]  lod_trans_stop+0x90/0x380 [lod]
      [ 8862.145678]  mdd_trans_stop+0x29/0x172 [mdd]
      [ 8862.146658]  mdd_dir_layout_shrink+0x6c6/0x1110 [mdd]
      [ 8862.147698]  ? mdd_layout_change+0x60a/0x1890 [mdd]
      [ 8862.148698]  mdd_layout_change+0x60a/0x1890 [mdd]
      [ 8862.149671]  ? __mdt_stripe_get+0xf7/0x570 [mdt]
      [ 8862.150796]  mdt_dir_layout_update+0x798/0x1120 [mdt]
      [ 8862.151860]  mdt_reint_setxattr+0xbd8/0x1080 [mdt]
      [ 8862.152862]  mdt_reint_rec+0x123/0x270 [mdt]
      [ 8862.153764]  mdt_reint_internal+0x4b9/0x810 [mdt]
      [ 8862.154751]  mdt_reint+0x5d/0x110 [mdt]
      [ 8862.155583]  tgt_request_handle+0x3f4/0x1a30 [ptlrpc]
      [ 8862.156711]  ptlrpc_server_handle_request+0x2aa/0xcf0 [ptlrpc]
      [ 8862.157950]  ? lprocfs_counter_add+0x10e/0x180 [obdclass]
      [ 8862.159324]  ptlrpc_main+0xc9e/0x15c0 [ptlrpc]
      [ 8862.160306]  ? __schedule+0x2d9/0x870
      [ 8862.161070]  ? ptlrpc_wait_event+0x5b0/0x5b0 [ptlrpc]
      [ 8862.162165]  kthread+0x134/0x150
      [ 8862.162873]  ? set_kthread_struct+0x50/0x50
      [ 8862.163733]  ret_from_fork+0x35/0x40
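
The key facts of the oops above can be extracted mechanically for triage. The following is a minimal sketch, not part of the original report: the helper name `summarize_oops` and the embedded excerpt are illustrative assumptions. It shows that the faulting address (0xb8) equals the value in RDI, the first argument register. That is consistent with `mutex_lock` receiving the address of a member at offset 0xb8 inside a NULL structure pointer passed down from `sa_spill_rele`. This reading is an inference from the register dump, not something the log states.

```python
import re

# Excerpt of the oops quoted above (timestamps kept verbatim).
OOPS = """\
[ 8862.109739] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
[ 8862.117554] RIP: 0010:mutex_lock+0x19/0x30
[ 8862.124420] RDX: ffffa0076bf3d000 RSI: ffffa00791c43478 RDI: 00000000000000b8
[ 8862.138667]  ? mutex_lock+0x19/0x30
[ 8862.140096]  sa_spill_rele+0x19/0xa0 [zfs]
[ 8862.141442]  osd_object_sa_dirty_rele+0x41/0x120 [osd_zfs]
[ 8862.142624]  osd_trans_stop+0x3bc/0x550 [osd_zfs]
"""


def summarize_oops(text):
    """Hypothetical triage helper: pull out the fault address, the RIP
    symbol, the RDI register and the reliable call-trace frames."""
    fault = re.search(r"NULL pointer dereference at ([0-9a-f]+)", text)
    rip = re.search(r"RIP: \w+:(\S+)", text)
    rdi = re.search(r"RDI: ([0-9a-f]+)", text)
    # Frames prefixed with "?" are unreliable stack leftovers; skip them.
    frames = re.findall(
        r"^\[\s*[\d.]+\]  (?!\? )(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+",
        text,
        re.M,
    )
    return {
        "fault_addr": int(fault.group(1), 16),
        "rip": rip.group(1),
        "rdi": int(rdi.group(1), 16),
        "frames": frames,
    }


summary = summarize_oops(OOPS)
print(hex(summary["fault_addr"]), summary["rip"], summary["frames"][0])
```

The first reliable frame below the faulting instruction is `sa_spill_rele` in the zfs module, which is where a fix or a duplicate search would most plausibly start.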
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-reviews/109192 - 4.18.0-553.16.1.el8_10.x86_64
      servers: https://build.whamcloud.com/job/lustre-reviews/109192 - 4.18.0-553.16.1.el8_lustre.x86_64


          Activity

            Alex Zhuravlev (bzzz) added a comment: This is a duplicate of LU-18153.

            People

              Assignee: WC Triage (wc-triage)
              Reporter: Maloo (maloo)