[LU-4633] exception RIP: qsd_entry_iter_cb+29 Created: 14/Feb/14 Updated: 19/Feb/14 Resolved: 18/Feb/14 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.4.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Mahmoud Hanafi | Assignee: | Zhenyu Xu |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Environment: |
RELEASE: 2.6.32-358.23.2.el6.20140115.x86_64.lustre241 |
||
| Issue Links: |
|
||||||||
| Severity: | 3 | ||||||||
| Epic: | server | ||||||||
| Rank (Obsolete): | 12680 | ||||||||
| Description |
|
We have seen a number of crashes with a general protection fault (GPF).

PID: 20770  TASK: ffff881cf939caa0  CPU: 24  COMMAND: "lquota_wb_nbp7-"
 #0 [ffff881f97119770] machine_kexec at ffffffff81035e8b
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/arch/x86/kernel/machine_kexec_64.c: 336
 #1 [ffff881f971197d0] crash_kexec at ffffffff810c0492
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/kernel/kexec.c: 1121
 #2 [ffff881f971198a0] kdb_kdump_check at ffffffff812858d7
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/kdb/kdbmain.c: 1214
 #3 [ffff881f971198b0] kdb_main_loop at ffffffff81288ac7
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/kdb/kdbmain.c: 1322
 #4 [ffff881f971199c0] kdb_save_running at ffffffff81282c2e
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/kdb/kdbsupport.c: 798
 #5 [ffff881f971199d0] kdba_main_loop at ffffffff81463988
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/arch/x86/kdb/kdba_support.c: 980
 #6 [ffff881f97119a10] kdb at ffffffff81285dc6
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/kdb/kdbmain.c: 2165
 #7 [ffff881f97119a80] kdba_entry at ffffffff814632a7
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/arch/x86/kdb/kdba_support.c: 1264
 #8 [ffff881f97119a90] notifier_call_chain at ffffffff81545255
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/kernel/notifier.c: 95
 #9 [ffff881f97119ad0] atomic_notifier_call_chain at ffffffff815452ba
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/kernel/notifier.c: 192
#10 [ffff881f97119ae0] notify_die at ffffffff8109c28e
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/kernel/notifier.c: 573
#11 [ffff881f97119b10] __die at ffffffff81543122
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/arch/x86/kernel/dumpstack.c: 288
#12 [ffff881f97119b40] die at ffffffff8100f288
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/arch/x86/kernel/dumpstack.c: 325
#13 [ffff881f97119b70] do_general_protection at ffffffff81542d02
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/arch/x86/kernel/traps.c: 400
#14 [ffff881f97119ba0] general_protection at ffffffff81542495
    /usr/src/debug/kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/arch/x86_64/kernel/entry.S
    [exception RIP: qsd_entry_iter_cb+29]
    RIP: ffffffffa0cde9bd  RSP: ffff881f97119c50  RFLAGS: 00010206
    RAX: 5a5a5a5a5a5a5a5a  RBX: ffff880db16a9d80  RCX: ffff881f97119d1c
    RDX: ffff880db16a9d80  RSI: ffff881f97119c80  RDI: ffff881ffcc303c0
    RBP: ffff881f97119c60   R8: 00000000fffffffb   R9: ffff881cc8f56c00
    R10: 0000000000000000  R11: 00000000000000be  R12: ffff881f97119d1c
    R13: 0000000000000024  R14: 5a5a5a5a5a5a5a5a  R15: ffffffffa0cde9a0
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#15 [ffff881f97119c68] cfs_hash_for_each_tight at ffffffffa04e31c5 [libcfs]
    /usr/src/debug/lustre-2.4.1/libcfs/libcfs/hash.c: 1473
#16 [ffff881f97119cc8] cfs_hash_for_each_safe at ffffffffa04e33e3 [libcfs]
    /usr/src/debug/lustre-2.4.1/libcfs/libcfs/hash.c: 1540
#17 [ffff881f97119cd8] qsd_start_reint_thread at ffffffffa0cdf497 [lquota]
    /usr/src/debug/lustre-2.4.1/lustre/quota/lquota_internal.h: 275
#18 [ffff881f97119d58] qsd_ready at ffffffffa0ce69f8 [lquota]
    /usr/src/debug/lustre-2.4.1/lustre/quota/qsd_handler.c: 264
#19 [ffff881f97119d88] qsd_adjust at ffffffffa0ce7654 [lquota]
    /usr/src/debug/lustre-2.4.1/lustre/quota/lquota_internal.h: 275
#20 [ffff881f97119e08] qsd_upd_thread at ffffffffa0ce3a1f [lquota]
    /usr/src/debug/lustre-2.4.1/lustre/quota/qsd_writeback.c: 413
#21 [ffff881f97119f48] kernel_thread at ffffffff8100c0ca
    /usr/src/debug////////kernel-lustre241-2.6.32-358.23.2.el6/linux-2.6.32-358.23.2.el6.20140115.x86_64/arch/x86/kernel/entry_64.S: 1213 |
| Comments |
| Comment by Peter Jones [ 14/Feb/14 ] |
|
Bobijam, could you please look into this one? Thanks, Peter |
| Comment by Niu Yawei (Inactive) [ 17/Feb/14 ] |
|
Looks related to |
| Comment by Zhenyu Xu [ 18/Feb/14 ] |
|
dup of |
| Comment by Mahmoud Hanafi [ 18/Feb/14 ] |
|
Is this really a dup of |
| Comment by Zhenyu Xu [ 19/Feb/14 ] |
|
They are different occurrences of the same cause: the lqe hash entry is corrupted (use after release), so we think it's a dup. |
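
For readers of the trace, a minimal sketch of why a use-after-release can show up as RAX/R14 = 5a5a5a5a5a5a5a5a followed by a GPF. It assumes the 0x5a fill is a poison pattern written over memory when it is released, which the register values suggest but this ticket does not confirm. The struct and all `demo_*` names are hypothetical and are not the real lquota entry or libcfs hash code. On x86-64 with 48-bit virtual addresses, a pointer made of 0x5a bytes is non-canonical, so dereferencing it raises a general protection fault instead of a page fault.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hashed quota entry; NOT the real lqe layout. */
struct demo_entry {
    void     *owner; /* a pointer an iterator callback might dereference */
    uint64_t  id;
};

#define DEMO_POISON_BYTE 0x5a                  /* assumed free-time poison */
#define DEMO_POISON_WORD 0x5a5a5a5a5a5a5a5aULL

/* Overwrite the object with the poison byte, as a poisoning allocator might
 * do on release. The storage stays valid here so later reads are defined. */
static void demo_release(struct demo_entry *e)
{
    memset(e, DEMO_POISON_BYTE, sizeof(*e));
}

/* True if a 64-bit value consists only of poison bytes. */
static int demo_is_poison(uint64_t v)
{
    return v == DEMO_POISON_WORD;
}

/* True if v is a canonical x86-64 address with 48-bit virtual addressing:
 * bits 63..47 must be all zeros or all ones. */
static int demo_is_canonical(uint64_t v)
{
    uint64_t top = v >> 47; /* the upper 17 bits */
    return top == 0 || top == 0x1ffffULL;
}

/* Simulate a stale reference: the entry is released while a pointer to it
 * is still held, and a field is then read through that stale pointer. */
static uint64_t demo_stale_read(void)
{
    static struct demo_entry slot;
    struct demo_entry *stale = &slot; /* reference kept by, e.g., a hash */

    slot.owner = &slot;
    slot.id = 42;
    demo_release(&slot);              /* released while still referenced */
    return stale->id;                 /* reads poison, not 42 */
}
```

In this sketch the stale read returns the poison word, and `demo_is_canonical` rejects it while accepting a kernel stack address such as the RSP value from the trace.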