[LU-362] lu_kmem_init() needs error handling Created: 26/May/11 Updated: 20/Dec/12 Resolved: 20/Dec/12 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.1.0 |
| Fix Version/s: | Lustre 2.2.0 |
| Type: | Bug | Priority: | Major |
| Reporter: | Mikhail Pershin | Assignee: | Mikhail Pershin |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 4861 |
| Description |
|
Originally this issue was duplicate of |
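As a rough illustration of the error handling the ticket title asks for, the sketch below shows a table-driven cache initializer that rolls back on failure. It is a userspace stand-in, not the Lustre code: every name (`demo_cache`, `demo_cache_descr`, `demo_kmem_init`, `demo_kmem_fini`, `demo_fail_at`) is hypothetical, `malloc` replaces the kernel slab allocator, and it assumes every cache slot starts out NULL.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel slab cache. */
struct demo_cache {
    const char *name;
    size_t      obj_size;
};

/*
 * Hypothetical descriptor: where to store the created cache and how to
 * create it. A table of these ends with an entry whose slot is NULL.
 */
struct demo_cache_descr {
    struct demo_cache **slot;
    const char         *name;
    size_t              obj_size;
};

/* Fault injection for the demo: the Nth create call fails (0 = never). */
static int demo_calls;
static int demo_fail_at;

static struct demo_cache *demo_cache_create(const char *name, size_t size)
{
    struct demo_cache *c;

    if (++demo_calls == demo_fail_at)
        return NULL;                /* simulated allocation failure */

    c = malloc(sizeof(*c));
    if (c == NULL)
        return NULL;
    c->name = name;
    c->obj_size = size;
    return c;
}

/* Destroy every cache in the table; slots that are still NULL are skipped. */
static void demo_kmem_fini(struct demo_cache_descr *caches)
{
    for (; caches->slot != NULL; ++caches) {
        free(*caches->slot);        /* free(NULL) is a no-op */
        *caches->slot = NULL;
    }
}

/*
 * Create every cache in the table. On the first failure, release the
 * caches created so far and return -ENOMEM, so the caller never sees a
 * half-initialized table.
 */
static int demo_kmem_init(struct demo_cache_descr *caches)
{
    struct demo_cache_descr *iter;

    for (iter = caches; iter->slot != NULL; ++iter) {
        *iter->slot = demo_cache_create(iter->name, iter->obj_size);
        if (*iter->slot == NULL) {
            demo_kmem_fini(caches);
            return -ENOMEM;
        }
    }
    return 0;
}
```

A caller would be expected to check the return value and abort its own setup on a non-zero result, so that no later allocation runs against a NULL cache.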
| Comments |
| Comment by nasf (Inactive) [ 16/Aug/11 ] |
|
Is there any patch available? I hit it on master. |
| Comment by Mikhail Pershin [ 21/Aug/11 ] |
|
available and approved, but not yet landed: |
| Comment by Jian Yu [ 02/Sep/11 ] |
|
Lustre Tag: v2_1_0_0_RC1
While running obdfilter-survey test_1a, the client node crashed:
[38173.199256] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[38173.207250] IP: [<ffffffff810efd70>] kmem_cache_alloc+0x60/0x1b0
[38173.213346] PGD 0
[38173.215441] Oops: 0000 [#1] SMP
[38173.218770] last sysfs file: /sys/module/ptlrpc/initstate
[38173.224225] CPU 2
[38173.226325] Modules linked in: obdecho(N+) lustre(N) mgc(N) lov(N) osc(N) mdc(N) lmv(N) fid(N) fld(N) lquota(N) ko2iblnd(N) ptlrpc(N) obdclass(N) lvfs(N) ksocklnd(N) lnet(N) libcfs(N) ext2 nfs lockd fscache nfs_acl auth_rpcgss sunrpc autofs4 rdma_ucm rdma_cm iw_cm ib_addr ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_cm ib_sa ipv6 ib_uverbs ib_umad iw_cxgb3 cxgb3 mdio mlx4_en mlx4_ib ib_mthca ib_mad ib_core cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq microcode loop dm_mod joydev mlx4_core iTCO_wdt ioatdma pcspkr iTCO_vendor_support rtc_cmos igb rtc_core serio_raw tpm_tis i2c_i801 sg rtc_lib dca i2c_core tpm tpm_bios button usbhid hid uhci_hcd ehci_hcd sd_mod crc_t10dif usbcore edd ext3 mbcache jbd fan processor ahci libata scsi_mod thermal thermal_sys hwmon [last unloaded: lnet_selftest]
[38173.301209] Supported: Yes
[38173.303980] Pid: 30236, comm: ptlrpcd-rcv Tainted: G N 2.6.32.36-0.5-default #1 X8DTT
[38173.312793] RIP: 0010:[<ffffffff810efd70>] [<ffffffff810efd70>] kmem_cache_alloc+0x60/0x1b0
[38173.321407] RSP: 0018:ffff88031ef55da0 EFLAGS: 00010046
[38173.326774] RAX: 0000000000000002 RBX: 0000000000000078 RCX: 0000000000000000
[38173.333972] RDX: 000000000060a040 RSI: 0000000000000050 RDI: 0000000000000000
[38173.341169] RBP: 0000000000000050 R08: 0000000000000000 R09: 0000000000000000
[38173.348358] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[38173.355548] R13: 0000000000000206 R14: ffff880316f52540 R15: ffff88031ef55ee0
[38173.362739] FS: 00007fd1f9235700(0000) GS:ffff880014840000(0000) knlGS:0000000000000000
[38173.370937] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[38173.376744] CR2: 0000000000000010 CR3: 0000000001804000 CR4: 00000000000006e0
[38173.383937] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[38173.391130] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[38173.398318] Process ptlrpcd-rcv (pid: 30236, threadinfo ffff88031ef54000, task ffff880316f52540)
[38173.407209] Stack:
[38173.409287] ffffffff81ab4180 0000000000000078 ffffffffa0d2a4c0 ffff88031ef55ea0
[38173.416631] <0> 0000000000000000 ffffffffa0d1e121 000000010090b646 ffffffff81058ac0
[38173.424513] <0> ffff880316f52540 ffffffffa081d294 0000000000000000 0000000000000000
[38173.432653] Call Trace:
[38173.435171] [<ffffffffa0d1e121>] echo_thread_key_init+0x31/0x1d0 [obdecho]
[38173.442227] [<ffffffffa07302a5>] keys_fill+0x65/0x130 [obdclass]
[38173.448423] [<ffffffffa0730399>] lu_env_refill+0x9/0x30 [obdclass]
[38173.454805] [<ffffffffa0851810>] ptlrpcd+0x170/0x370 [ptlrpc]
[38173.460734] [<ffffffff81003fba>] child_rip+0xa/0x20
[38173.465760] Code: 00 00 00 12 74 1b 65 48 8b 04 25 88 b5 00 00 48 63 80 44 e0 ff ff a9 00 ff ff 07 0f 84 8a 00 00 00 65 8b 04 25 98 cd 00 00 48 98 <49> 8b 04 c4 f6 40 0c 02 0f 85 ea 00 00 00 65 8b 04 25 98 cd 00
[38173.486418] RIP [<ffffffff810efd70>] kmem_cache_alloc+0x60/0x1b0
[38173.492591] RSP <ffff88031ef55da0>
[38173.496136] CR2: 0000000000000010
Maloo report: https://maloo.whamcloud.com/test_sets/2a4d7ad4-d52c-11e0-8d02-52540025f9af |
| Comment by Jian Yu [ 08/Sep/11 ] |
|
I hit this issue again while running obdfilter-survey against lustre-master build #276 on RHEL5/x86_64 distro/arch. |
| Comment by nasf (Inactive) [ 18/Oct/11 ] |
|
Another failure: https://maloo.whamcloud.com/test_sets/3904eba0-f6da-11e0-a451-52540025f9af |
| Comment by Jinshan Xiong (Inactive) [ 19/Oct/11 ] |
|
I think you should use patch at: |
| Comment by Mikhail Pershin [ 19/Oct/11 ] |
|
Jinshan, patch in |
| Comment by Jinshan Xiong (Inactive) [ 19/Oct/11 ] |
|
Actually, your patch has nothing to do with the problem. Also, I don't think a patch that ONLY fixes a style issue is really needed; even if you'd really like it, it should go in another ticket. |
| Comment by Mikhail Pershin [ 19/Oct/11 ] |
|
yes, as we discussed the |
| Comment by Build Master (Inactive) [ 03/Nov/11 ] |
|
Integrated in Result = SUCCESS
|
| Comment by Mikhail Pershin [ 20/Dec/12 ] |
|
merged |