[LU-286] racer: general protection fault: 0000 [1] SMP RIP: __wake_up_common+60 Created: 05/May/11  Updated: 08/Aug/11  Resolved: 08/Aug/11

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0, Lustre 1.8.6
Fix Version/s: Lustre 2.1.0, Lustre 1.8.6

Type: Bug
Priority: Blocker
Reporter: Yang Sheng
Assignee: Yang Sheng
Resolution: Fixed
Votes: 0
Labels: None

Severity: 3
Bugzilla ID: 24508
Rank (Obsolete): 10125

 Description   

sfire8 console :

2011-04-23 07:28:17 general protection fault: 0000 [1] SMP
2011-04-23 07:28:17 last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
2011-04-23 07:28:17 CPU 2
2011-04-23 07:28:17 Modules linked in: mgc lustre lov mdc lquota osc ksocklnd ptlrpc obdclass lvfs
lnet libcfs cpufreq_ondemand cpufreq_userspace cpufreq_powersave powernow_k8 freq_table edd
ib_ipoib ib_cm ib_sa ipv6 ib_uverbs ib_umad mlx4_ib dock mlx4_core ib_mthca ib_mad ib_core thermal
processor fan button battery ac loop dm_mod usbhid sg ide_cd cdrom sd_mod generic amd74xx ide_core
ohci_hcd shpchp i2c_nforce2 i2c_core pci_hotplug ehci_hcd sata_nv libata forcedeth scsi_mod usbcore
tg3 af_packet nfs lockd nfs_acl sunrpc
2011-04-23 07:28:17 Pid: 1882, comm: ls Tainted: G U 2.6.16.60-0.69.1-smp #1
2011-04-23 07:28:17 RIP: 0010:[<ffffffff8012b90b>] <ffffffff8012b90b>{__wake_up_common+60}
2011-04-23 07:28:17 RSP: 0018:ffff810041583948 EFLAGS: 00010012
2011-04-23 07:28:17 RAX: 00000000c48348d8 RBX: 0000000000000001 RCX: 0000000000000000
2011-04-23 07:28:17 RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffffff8846d7c8
2011-04-23 07:28:17 RBP: ffff810041583978 R08: ffffffff8846d7e0 R09: 0000000000000001
2011-04-23 07:28:17 R10: ffff810041583788 R11: 0000000000000000 R12: 50478d4828ec8348
2011-04-23 07:28:17 R13: ffff81006dee1810 R14: 0000000000000000 R15: 0000000000000000
2011-04-23 07:28:17 FS: 00002b35b66d9d70(0000) GS:ffff81011d732240(0000) knlGS:0000000000000000
2011-04-23 07:28:17 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
2011-04-23 07:28:17 CR2: 00002b76c9ed6ac0 CR3: 0000000116f98000 CR4: 00000000000006e0
2011-04-23 07:28:17 Process ls (pid: 1882, threadinfo ffff810041582000, task ffff8100355f67d0)
2011-04-23 07:28:17 Stack: c48348d800000003 0000000000000000 0000000000000001 ffff81006dee1810
2011-04-23 07:28:17 0000000000000003 0000000000000296 ffff8100415839b8 ffffffff8012c92f
2011-04-23 07:28:17 ffff810041583b24 ffff8100636f85f0
2011-04-23 07:28:17 Call Trace: <ffffffff8012c92f>{__wake_up+56}
<ffffffff88642f48>{:mdc:mdc_exit_request+152}
2011-04-23 07:28:17 <ffffffff88648bdc>{:mdc:mdc_enqueue+3500}
<ffffffff8020010a>{vsnprintf+815}
2011-04-23 07:28:17 <ffffffff88649715>{:mdc:mdc_intent_lock+709}
<ffffffff886fddc0>{:lustre:ll_mdc_blocking_ast+0}
2011-04-23 07:28:17 <ffffffff88506660>{:ptlrpc:ldlm_completion_ast+0}
<ffffffff886fb187>{:lustre:ll_prepare_mdc_op_data+135}
2011-04-23 07:28:17 <ffffffff886c25c0>{:lustre:__ll_inode_revalidate_it+656}
2011-04-23 07:28:17 <ffffffff886fddc0>{:lustre:ll_mdc_blocking_ast+0}
<ffffffff80196ced>{__link_path_walk+3962}
2011-04-23 07:28:17 <ffffffff80196df0>{link_path_walk+218}
<ffffffff8017304a>{__vma_link+69}
2011-04-23 07:28:17 <ffffffff886c4600>{:lustre:ll_inode_revalidate_it+128}
2011-04-23 07:28:17 <ffffffff886c4769>{:lustre:ll_getattr_it+25}
<ffffffff886c4894>{:lustre:ll_getattr+52}
2011-04-23 07:28:17 <ffffffff8018fe51>{vfs_stat_fd+50}
<ffffffff801fe8e3>{rb_insert_color+97}
2011-04-23 07:28:17 <ffffffff8017304a>{__vma_link+69} <ffffffff80173b6c>{vma_link+113}
2011-04-23 07:28:17 <ffffffff80190005>{sys_newstat+25}
<ffffffff801feec6>{__up_write+20}
2011-04-23 07:28:17 <ffffffff8010bd0d>{error_exit+0}
<ffffffff8010ae36>{system_call+126}
2011-04-23 07:28:17
2011-04-23 07:28:17 Code: 41 ff 50 f8 85 c0 74 0a f6 45 d4 01 74 04 ff cb 74 10 4d 89
2011-04-23 07:28:17 RIP <ffffffff8012b90b>{__wake_up_common+60} RSP <ffff810041583948>



 Comments   
Comment by Yang Sheng [ 13/May/11 ]

patch on http://review.whamcloud.com/#change,506

Comment by Build Master (Inactive) [ 13/May/11 ]

Integrated in lustre-b1_8 » i686,client,el5,inkernel #53
LU-286 racer: general protection fault.

Johann Lombardi : 58a0c1a6c239774fc3f662d58b47cb57f4e53c6d
Files :

  • lustre/mdc/mdc_locks.c
  • lustre/mdc/mdc_lib.c
Comment by Build Master (Inactive) [ 13/May/11 ]

Integrated in lustre-b1_8 » x86_64,server,el5,ofa #53
LU-286 racer: general protection fault.

Johann Lombardi : 58a0c1a6c239774fc3f662d58b47cb57f4e53c6d
Files :

  • lustre/mdc/mdc_lib.c
  • lustre/mdc/mdc_locks.c
Comment by Build Master (Inactive) [ 13/May/11 ]

Integrated in lustre-b1_8 » x86_64,client,el5,ofa #53
LU-286 racer: general protection fault.

Johann Lombardi : 58a0c1a6c239774fc3f662d58b47cb57f4e53c6d
Files :

  • lustre/mdc/mdc_locks.c
  • lustre/mdc/mdc_lib.c
Comment by Build Master (Inactive) [ 13/May/11 ]

Integrated in lustre-b1_8 » x86_64,client,ubuntu1004,inkernel #53
LU-286 racer: general protection fault.

Johann Lombardi : 58a0c1a6c239774fc3f662d58b47cb57f4e53c6d
Files :

  • lustre/mdc/mdc_locks.c
  • lustre/mdc/mdc_lib.c
Comment by Build Master (Inactive) [ 13/May/11 ]

Integrated in lustre-b1_8 » x86_64,client,el6,inkernel #53
LU-286 racer: general protection fault.

Johann Lombardi : 58a0c1a6c239774fc3f662d58b47cb57f4e53c6d
Files :

  • lustre/mdc/mdc_lib.c
  • lustre/mdc/mdc_locks.c
Comment by Build Master (Inactive) [ 13/May/11 ]

Integrated in lustre-b1_8 » i686,client,el5,ofa #53
LU-286 racer: general protection fault.

Johann Lombardi : 58a0c1a6c239774fc3f662d58b47cb57f4e53c6d
Files :

  • lustre/mdc/mdc_locks.c
  • lustre/mdc/mdc_lib.c
Comment by Build Master (Inactive) [ 13/May/11 ]

Integrated in lustre-b1_8 » i686,client,el6,inkernel #53
LU-286 racer: general protection fault.

Johann Lombardi : 58a0c1a6c239774fc3f662d58b47cb57f4e53c6d
Files :

  • lustre/mdc/mdc_lib.c
  • lustre/mdc/mdc_locks.c
Comment by Build Master (Inactive) [ 13/May/11 ]

Integrated in lustre-b1_8 » x86_64,client,el5,inkernel #53
LU-286 racer: general protection fault.

Johann Lombardi : 58a0c1a6c239774fc3f662d58b47cb57f4e53c6d
Files :

  • lustre/mdc/mdc_locks.c
  • lustre/mdc/mdc_lib.c
Comment by Build Master (Inactive) [ 13/May/11 ]

Integrated in lustre-b1_8 » x86_64,server,el5,inkernel #53
LU-286 racer: general protection fault.

Johann Lombardi : 58a0c1a6c239774fc3f662d58b47cb57f4e53c6d
Files :

  • lustre/mdc/mdc_locks.c
  • lustre/mdc/mdc_lib.c
Comment by Build Master (Inactive) [ 13/May/11 ]

Integrated in lustre-b1_8 » i686,server,el5,inkernel #53
LU-286 racer: general protection fault.

Johann Lombardi : 58a0c1a6c239774fc3f662d58b47cb57f4e53c6d
Files :

  • lustre/mdc/mdc_lib.c
  • lustre/mdc/mdc_locks.c
Comment by Build Master (Inactive) [ 13/May/11 ]

Integrated in lustre-b1_8 » i686,server,el5,ofa #53
LU-286 racer: general protection fault.

Johann Lombardi : 58a0c1a6c239774fc3f662d58b47cb57f4e53c6d
Files :

  • lustre/mdc/mdc_lib.c
  • lustre/mdc/mdc_locks.c
Comment by Andreas Dilger [ 08/Aug/11 ]

This patch, along with the change from https://bugzilla.lustre.org/show_bug.cgi?id=18213, needs to be landed on master as well.

Comment by Andreas Dilger [ 08/Aug/11 ]

Hmm, I might be wrong, in that this patch is only needed together with b=18213 (which allows mdc_enter_request() to be interrupted), because before that change the mdc_enter_request() thread would stay blocked until it was actually removed from the list. That would still be a useful fix for master, but possibly not a blocker.

Even so, I like the b1_8 code better, where the mdc_enter_request() thread removes the stack-allocated mcw struct from the list itself, rather than having the mdc_exit_request() thread do it.
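To illustrate the ownership rule described above, here is a minimal single-threaded userspace sketch in C. It is not the Lustre code: the struct and function names (waiter, client, enter_request, abort_wait, exit_request, queued_waiters) are hypothetical stand-ins, locking and real sleeping are omitted, and an interrupted sleep is modeled by calling abort_wait() directly. The point it tries to show is that the side that owns the stack-allocated wait entry unlinks it before returning, so the releasing side never walks into memory that has gone out of scope.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT the Lustre definitions. */
struct waiter {
    struct waiter *prev, *next; /* links into client.waiters */
    bool granted;               /* set when a slot is handed to this waiter */
};

struct client {
    struct waiter waiters;      /* sentinel of a circular list */
    int in_flight;
    int max_in_flight;
};

void client_init(struct client *c, int max_in_flight)
{
    c->waiters.prev = c->waiters.next = &c->waiters;
    c->waiters.granted = false;
    c->in_flight = 0;
    c->max_in_flight = max_in_flight;
}

/* Unlink w and leave it self-linked, which here means "not queued".
 * Calling this on an already unlinked waiter is a harmless no-op. */
void waiter_unlink(struct waiter *w)
{
    w->prev->next = w->next;
    w->next->prev = w->prev;
    w->prev = w->next = w;
}

/* Take a slot if one is free; otherwise queue the caller-owned waiter.
 * Returns true when the slot was taken immediately. */
bool enter_request(struct client *c, struct waiter *w)
{
    w->prev = w->next = w;
    w->granted = false;
    if (c->in_flight < c->max_in_flight) {
        c->in_flight++;
        return true;
    }
    w->prev = c->waiters.prev;
    w->next = &c->waiters;
    c->waiters.prev->next = w;
    c->waiters.prev = w;
    return false;
}

/* Release a slot and hand it to the oldest queued waiter, if any. */
void exit_request(struct client *c)
{
    c->in_flight--;
    if (c->waiters.next != &c->waiters && c->in_flight < c->max_in_flight) {
        struct waiter *w = c->waiters.next;
        waiter_unlink(w);
        w->granted = true;
        c->in_flight++;
    }
}

/* Run by the waiting side when its sleep is interrupted, BEFORE its
 * stack frame (and therefore *w) goes away. */
void abort_wait(struct client *c, struct waiter *w)
{
    if (w->granted) {
        exit_request(c);        /* slot was already handed over: release it */
        return;
    }
    waiter_unlink(w);           /* still queued: remove our own entry */
}

/* Count queued waiters; used only to check the model. */
int queued_waiters(const struct client *c)
{
    int n = 0;
    for (const struct waiter *w = c->waiters.next; w != &c->waiters; w = w->next)
        n++;
    return n;
}
```

In this model, an interrupted waiter that skipped abort_wait() would leave a pointer to its dead stack frame on the list, and a later exit_request() would dereference it. That is one plausible way to end up with a fault like the one in the trace above, though the sketch does not claim to reproduce the actual bug.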

Comment by Peter Jones [ 08/Aug/11 ]

Johann, could you please comment on this one? Does the anomaly Andreas has spotted warrant being a 2.1 blocker?

Comment by Andreas Dilger [ 08/Aug/11 ]

My bad. It seems that this patch was landed to master via LU-234. I think I was looking at an old checkout of master on my test node.
