[LU-76] Racer kernel panic in _ldlm_lock_debug Created: 09/Feb/11 Updated: 29/May/17 Resolved: 29/May/17 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.0.0, Lustre 2.1.0 |
| Fix Version/s: | Lustre 1.8.6 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Oleg Drokin | Assignee: | Oleg Drokin |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None |
| Attachments: |
|
| Severity: | 3 |
| Bugzilla ID: | 24099 |
| Rank (Obsolete): | 10109 |
| Description |
|
Oracle reports this failure: We don't have any available debug data for this ourselves. |
| Comments |
| Comment by Jian Yu [ 12/Apr/11 ] |
|
Branch: b1_8

While running racer test on Toro cluster, one client node (client-8) hit kernel panic as follows:

Lustre: DEBUG MARKER: -----============= acceptance-small: racer ============----- Tue Apr 12 05:03:34 PDT 2011
Lustre: DEBUG MARKER: excepting tests:
Lustre: DEBUG MARKER: Using TIMEOUT=20
Lustre: DEBUG MARKER: == test 1: racer on clients: client-8-ib,client-9-ib DURATION=900 == 05:03:36 (1302609816)
LustreError: 10180:0:(file.c:3329:ll_inode_revalidate_fini()) failure -2 inode 94042
BUG: unable to handle kernel paging request at 0000000273713030
IP: [<ffffffffa09504c6>] _ldlm_lock_debug+0xf6/0x680 [ptlrpc]
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/block/lloop14/removable
CPU 3
Modules linked in: llite_lloop(U) lustre(U) mgc(U) lov(U) osc(U) mdc(U) lquota(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) libcfs(U) ext2 rdma_cm iw_cm ib_addr nfs lockd fscache nfs_acl auth_rpcgss autofs4 sunrpc ib_ipoib ib_cm ib_sa ipv6 serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core mlx4_ib ib_mad ib_core mlx4_en mlx4_core igb dca ext3 jbd mbcache sd_mod crc_t10dif ahci dm_mod [last unloaded: libcfs]
Modules linked in: llite_lloop(U) lustre(U) mgc(U) lov(U) osc(U) mdc(U) lquota(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) libcfs(U) ext2 rdma_cm iw_cm ib_addr nfs lockd fscache nfs_acl auth_rpcgss autofs4 sunrpc ib_ipoib ib_cm ib_sa ipv6 serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core mlx4_ib ib_mad ib_core mlx4_en mlx4_core igb dca ext3 jbd mbcache sd_mod crc_t10dif ahci dm_mod [last unloaded: libcfs]
Pid: 12199, comm: ldlm_bl_00 Not tainted 2.6.32-71.18.2.el6.x86_64 #1 X8DTT
RIP: 0010:[<ffffffffa09504c6>] [<ffffffffa09504c6>] _ldlm_lock_debug+0xf6/0x680 [ptlrpc]
RSP: 0018:ffff8802f7f1bcc0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8802f35b5800 RCX: ffffffffa09d2070
RDX: 0000000010000000 RSI: 0000000000010000 RDI: ffff8802f364e000
RBP: ffff8802f7f1be10 R08: ffffffffa09caa70 R09: 000000000000058d
R10: 0000000000010000 R11: ffffffffa09d1e40 R12: 000000005a5a5a5a
R13: 00000000ffffff9d R14: 0000000000007646 R15: 0000000000000000
FS: 00007fcf9ca59700(0000) GS:ffff880032e60000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000273713030 CR3: 0000000001001000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ldlm_bl_00 (pid: 12199, threadinfo ffff8802f7f1a000, task ffff8802f763b520)
Stack:
ffffffffa09da57c 0000000000026928 ffff8802f7f1bd10 ffffffff8105c806
<0> ffff880200000002 ffffffffa09da594 ffff880032e169f0 ffff8802f763b558
<0> 0000000000000001 ffff8802f763b520 ffff8802f7f1bd40 ffffffff81061c21
Call Trace:
[<ffffffff8105c806>] ? update_curr+0xe6/0x1e0
[<ffffffff81061c21>] ? dequeue_entity+0x1a1/0x1e0
[<ffffffff81059dc2>] ? finish_task_switch+0x42/0xd0
[<ffffffff814c8fb6>] ? thread_return+0x4e/0x778
[<ffffffffa0952fed>] ? ldlm_lock_put+0x19d/0x450 [ptlrpc]
[<ffffffffa09751dd>] ldlm_handle_bl_callback+0x1ad/0x260 [ptlrpc]
[<ffffffff810921ac>] ? remove_wait_queue+0x3c/0x50
[<ffffffffa097df71>] ldlm_bl_thread_main+0x1f1/0x440 [ptlrpc]
[<ffffffff8111f059>] ? free_pages+0x49/0x50
[<ffffffff8105c540>] ? default_wake_function+0x0/0x20
[<ffffffff810141ca>] child_rip+0xa/0x20
[<ffffffffa097dd80>] ? ldlm_bl_thread_main+0x0/0x440 [ptlrpc]
[<ffffffff810141c0>] ? child_rip+0x0/0x20
Code: 44 89 b4 24 90 00 00 00 44 89 ac 24 88 00 00 00 48 8b 97 c8 00 00 00 48 89 94 24 80 00 00 00 48 8b 97 f0 00 00 00 48 89 54 24 78 <4a> 8b 14 e5 60 5d 9e a0 48 89 54 24 70 48 8b 93 88 00 00 00 48
RIP [<ffffffffa09504c6>] _ldlm_lock_debug+0xf6/0x680 [ptlrpc]
RSP <ffff8802f7f1bcc0>
CR2: 0000000273713030
---[ end trace ec850569dd6fda5e ]---
Kernel panic - not syncing: Fatal exception

The console log of client-8 is in the attachment. |
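Editorial note on the trace above: the faulting address can be recomputed from the register dump and the bytes at the faulting RIP. The sketch below is not part of the original report. It assumes the eight bytes starting at the `<4a>` marker in the "Code:" line form one x86-64 instruction, `mov disp32(,%r12,8),%rdx`, which is how the REX, ModRM, and SIB bytes decode by hand. The helper names are hypothetical. All numeric values are copied from the log.

```python
# Values copied from the oops in this comment.
R12 = 0x000000005A5A5A5A
CR2 = 0x0000000273713030

# Assumption: the 8 bytes starting at the <4a> marker in "Code:" are one
# instruction: REX(4a) opcode(8b) ModRM(14) SIB(e5) disp32(60 5d 9e a0).
INSN = bytes.fromhex("4a8b14e5605d9ea0")

MASK64 = 0xFFFFFFFFFFFFFFFF


def decode_disp32(insn: bytes) -> int:
    """Return the sign-extended 32-bit displacement (last 4 bytes, little-endian)."""
    return int.from_bytes(insn[4:8], "little", signed=True)


def decode_scale(insn: bytes) -> int:
    """Return the SIB scale factor (top two bits of the SIB byte, as a power of 2)."""
    return 1 << (insn[3] >> 6)


def effective_address(index: int, scale: int, disp: int) -> int:
    """Compute index*scale + disp, wrapped to 64 bits (no base register here)."""
    return (index * scale + disp) & MASK64


disp = decode_disp32(INSN)
scale = decode_scale(INSN)
addr = effective_address(R12, scale, disp)

print(f"table base  = {disp & MASK64:#018x}")
print(f"scale       = {scale}")
print(f"fault addr  = {addr:#018x}")
print(f"matches CR2 = {addr == CR2}")
```

Under that decoding, the computed address equals CR2, so the bad access appears to be a table lookup indexed by R12. The value 0x5a5a5a5a in R12 looks like a memory-poison fill pattern rather than a valid index, which would be consistent with `_ldlm_lock_debug` reading a field from a lock that had already been freed. The report itself does not confirm this.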
| Comment by Andreas Dilger [ 29/May/17 ] |
|
Close old bug. |