[LU-586] 1.8<->2.1 interop: list_add corruption Created: 10/Aug/11  Updated: 05/Dec/11  Resolved: 05/Dec/11

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.1.0, Lustre 1.8.6
Fix Version/s: Lustre 2.1.0, Lustre 1.8.7

Type: Bug Priority: Minor
Reporter: Jian Yu Assignee: Lai Siyao
Resolution: Fixed Votes: 0
Labels: None
Environment:

Old Lustre Version: 1.8.6-wc1
Lustre Build: http://newbuild.whamcloud.com/job/lustre-b1_8/100/

New Lustre Version: 2.0.66.0
Lustre Build: http://newbuild.whamcloud.com/job/lustre-master/228/

Clean upgrading (Lustre servers and clients were upgraded all at once) from Lustre 1.8.6-wc1 to Lustre 2.0.66.0 under the following configuration:

OSS1: RHEL5/x86_64
OSS2: RHEL5/x86_64
MDS: RHEL5/x86_64
Client1: RHEL6/x86_64
Client2: RHEL5/x86_64


Severity: 3
Rank (Obsolete): 4914

 Description   

After the upgrading, Lustre clients hit the following corruption while running racer test:

Aug  9 20:07:08 client-12 kernel: Lustre: DEBUG MARKER: == runracer test 1: racer on clients: client-12,client-13 DURATION=120 == 20:07:07 (1312945627)
Aug  9 20:07:08 client-12 rshd[27104]: root@client-12.lab.whamcloud.com as root: cmd='(PATH=$PATH:/usr/lib64/lustre/utils:/usr/lib64/lustre/tests:/sbin:/usr/sbin; cd /usr/lib64/lustre/tests; LUSTRE="/usr/lib64/lustre"  sh -c "DURATION=120 /usr/lib64/lustre/tests/racer/racer.sh /mnt/lustre/racer ");echo XXRETCODE:$?'
Aug  9 20:07:08 client-12 kernel: ------------[ cut here ]------------
Aug  9 20:07:08 client-12 kernel: WARNING: at lib/list_debug.c:26 __list_add+0x6d/0xa0() (Tainted: G           ---------------- T)
Aug  9 20:07:08 client-12 kernel: Hardware name: X8DTT
Aug  9 20:07:08 client-12 kernel: list_add corruption. next->prev should be prev (ffff8803179f3e80), but was ffff8801edb6bcd8. (next=ffff8801edb6bbf0).
Aug  9 20:07:08 client-12 kernel: Modules linked in: lustre(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) fid(U) fld(U) lquota(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) libcfs(U) nfs lockd fscache(T) nfs_acl auth_rpcgss autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 ib_sa dm_mirror dm_region_hash dm_log mlx4_ib ib_mad ib_core mlx4_en mlx4_core igb serio_raw ghes hed i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache sd_mod crc_t10dif ahci dm_mod [last unloaded: libcfs]
Aug  9 20:07:08 client-12 kernel: Pid: 27276, comm: ln Tainted: G           ---------------- T 2.6.32-131.2.1.el6.x86_64 #1
Aug  9 20:07:08 client-12 kernel: Call Trace:
Aug  9 20:07:08 client-12 kernel: [<ffffffff810670f7>] ? warn_slowpath_common+0x87/0xc0
Aug  9 20:07:08 client-12 kernel: [<ffffffff810671e6>] ? warn_slowpath_fmt+0x46/0x50
Aug  9 20:07:08 client-12 kernel: [<ffffffff8126a18c>] ? put_dec+0x10c/0x110
Aug  9 20:07:08 client-12 kernel: [<ffffffff81272abd>] ? __list_add+0x6d/0xa0
Aug  9 20:07:08 client-12 kernel: [<ffffffff8108e45b>] ? add_wait_queue+0x3b/0x60
Aug  9 20:07:08 client-12 kernel: [<ffffffffa08937ee>] ? cfs_waitq_add+0xe/0x10 [libcfs]
Aug  9 20:07:08 client-12 kernel: [<ffffffffa0276038>] ? seq_client_alloc_fid+0x98/0x8a0 [fid]
Aug  9 20:07:08 client-12 kernel: [<ffffffff8126b6e0>] ? string+0x40/0x100
Aug  9 20:07:08 client-12 kernel: [<ffffffff8105dc20>] ? default_wake_function+0x0/0x20
Aug  9 20:07:08 client-12 kernel: [<ffffffff8126c816>] ? vsnprintf+0x2b6/0x5f0
Aug  9 20:07:08 client-12 kernel: [<ffffffffa03644ee>] ? mdc_fid_alloc+0x3e/0x100 [mdc]
Aug  9 20:07:08 client-12 kernel: [<ffffffffa08965de>] ? cfs_curproc_is_in_groups+0xe/0x10 [libcfs]
Aug  9 20:07:08 client-12 kernel: [<ffffffffa04a5059>] ? ll_i2suppgid+0x19/0x30 [lustre]
Aug  9 20:07:08 client-12 kernel: [<ffffffffa037192d>] ? mdc_create+0x1ed/0x620 [mdc]
Aug  9 20:07:08 client-12 kernel: [<ffffffffa048a0d8>] ? ll_prep_md_op_data+0x148/0x410 [lustre]
Aug  9 20:07:08 client-12 kernel: [<ffffffffa04a4298>] ? ll_new_node+0x198/0x7b0 [lustre]
Aug  9 20:07:08 client-12 kernel: [<ffffffff8117e573>] ? generic_permission+0x23/0xb0
Aug  9 20:07:08 client-12 kernel: [<ffffffffa04a4d4e>] ? ll_symlink+0x6e/0x1c0 [lustre]
Aug  9 20:07:08 client-12 kernel: [<ffffffff8117f9da>] ? vfs_symlink+0x9a/0xc0
Aug  9 20:07:08 client-12 kernel: [<ffffffff8118202e>] ? sys_symlinkat+0xee/0x110
Aug  9 20:07:08 client-12 kernel: [<ffffffff810d1ac2>] ? audit_syscall_entry+0x272/0x2a0
Aug  9 20:07:08 client-12 kernel: [<ffffffff814e017e>] ? do_page_fault+0x3e/0xa0
Aug  9 20:07:08 client-12 kernel: [<ffffffff81182066>] ? sys_symlink+0x16/0x20
Aug  9 20:07:08 client-12 kernel: [<ffffffff8100b172>] ? system_call_fastpath+0x16/0x1b
Aug  9 20:07:08 client-12 kernel: ---[ end trace 904817c3d0504e64 ]---

Console log on client-13:

Lustre: DEBUG MARKER: == runracer test 1: racer on clients: client-12,client-13 DURATION=120 == 20:07:07 (1312945627)
list_add corruption. next->prev should be ffff81030d0df688, but was 00007fff85414ec0
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at lib/list_debug.c:26
invalid opcode: 0000 [1] SMP 
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
CPU 0 
Modules linked in: nfs fscache nfs_acl mgc(U) lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf be2iscsi iscsi_tcp bnx2i cnic uio cxgb3i iw_cxgb3 cxgb3 libiscsi_tcp ib_iser libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi ib_srp rds ib_sdp ib_ipoib ipoib_helper ipv6 xfrm_nalgo crypto_api rdma_ucm rdma_cm ib_ucm ib_uverbs ib_umad ib_cm iw_cm ib_addr ib_sa loop dm_mirror dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport mlx4_ib ib_mad ib_core mlx4_en sg joydev igb i2c_i801 shpchp tpm_tis tpm 8021q i7core_edac mlx4_core tpm_bios i2c_core edac_mc serio_raw dca pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 9598, comm: file_concat.sh Tainted: G      2.6.18-238.12.1.el5 #1
RIP: 0010:[<ffffffff80157c90>]  [<ffffffff80157c90>] __list_add+0x24/0x68
RSP: 0018:ffff8103173f3a28  EFLAGS: 00010082
RAX: 0000000000000058 RBX: ffff8103155a7ae0 RCX: ffffffff8031af28
RDX: ffffffff8031af28 RSI: 0000000000000000 RDI: ffffffff8031af20
RBP: ffff81030d0df688 R08: ffffffff8031af28 R09: 000000000000003e
R10: ffff8103173f36c8 R11: 0000000000000080 R12: ffff8103173f3ae0
R13: 0000000000000246 R14: ffff8103173f3ac8 R15: ffff81030d0df5e8
FS:  00002ae601cc1f80(0000) GS:ffffffff80426000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000006bb424 CR3: 000000030fcbd000 CR4: 00000000000006e0
Process file_concat.sh (pid: 9598, threadinfo ffff8103173f2000, task ffff810318a25860)
Stack:  ffff8103173f3ac8 ffff81030d0df5c8 ffff81030d0df680 ffffffff80049c8e
 ffff81030d0df5c0 ffff81030d0df608 ffff81030d0df680 ffffffff88a7d89b
 0000000000000001 ffffffff8001a870 0000003000000030 ffff8103173f3bf0
Call Trace:
 [<ffffffff80049c8e>] add_wait_queue+0x24/0x34
 [<ffffffff88a7d89b>] :fid:seq_client_alloc_fid+0x12b/0x860
 [<ffffffff8001a870>] vsnprintf+0x3f8/0x627
 [<ffffffff8008e430>] default_wake_function+0x0/0xe
 [<ffffffff887bca7b>] :libcfs:libcfs_debug_vmsg2+0x75b/0x9f0
 [<ffffffff88a93c6d>] :mdc:mdc_fid_alloc+0x7d/0x100
 [<ffffffff88aa5605>] :mdc:mdc_intent_lock+0x385/0x610
 [<ffffffff88bf4c7c>] :lustre:ll_i2gids+0x6c/0xa0
 [<ffffffff88bf4fd0>] :lustre:ll_md_blocking_ast+0x0/0x580
 [<ffffffff88946cd0>] :ptlrpc:ldlm_completion_ast+0x0/0x6a0
 [<ffffffff88bf4828>] :lustre:ll_lookup_it+0x8c8/0xc90
 [<ffffffff88bf4fd0>] :lustre:ll_md_blocking_ast+0x0/0x580
 [<ffffffff88bf5956>] :lustre:ll_lookup_nd+0x1e6/0x400
 [<ffffffff80022a76>] d_alloc+0x174/0x1a9
 [<ffffffff800370b8>] __lookup_hash+0x10b/0x12f
 [<ffffffff8001b227>] open_namei+0xf2/0x712
 [<ffffffff8006723e>] do_page_fault+0x4fe/0x874
 [<ffffffff8002768a>] do_filp_open+0x1c/0x38
 [<ffffffff8001a061>] do_sys_open+0x44/0xbe
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0


Code: 0f 0b 68 cb 3f 2c 80 c2 1a 00 48 8b 55 00 48 39 da 74 1b 48 
RIP  [<ffffffff80157c90>] __list_add+0x24/0x68
 RSP <ffff8103173f3a28>
 <0>Kernel panic - not syncing: Fatal exception

Maloo report: https://maloo.whamcloud.com/test_sets/4c7fa66c-c308-11e0-9a20-52540025f9af



 Comments   
Comment by Peter Jones [ 10/Aug/11 ]

Lai

Could you please look into this one as your top priority?

Thanks

Peter

Comment by Liang Zhen (Inactive) [ 10/Aug/11 ]

I made a quick look and think waitq using in seq_client_alloc_fid() is suspicious:

lustre/fid/fid_request.c
                rc = seq_fid_alloc_prep(seq, &link);
                if (rc)
                        continue;

                rc = seq_client_alloc_seq(seq, &seqnr);
                if (rc) {
                        CERROR("%s: Can't allocate new sequence, "
                               "rc %d\n", seq->lcs_name, rc);
                        cfs_up(&seq->lcs_sem);
                        RETURN(rc);
                }

if we got error returned by seq_client_alloc_seq(), we will leave @link on waitq (added by seq_fid_alloc_seq) and return, but @link is in stack of this function which can be overwritten anytime.

FYI
Liang

Comment by Lai Siyao [ 11/Aug/11 ]

Yes, this looks to be a bug, but maloo is not accessible at the moment, and I can't make certain whether this is the cause.

Comment by Lai Siyao [ 12/Aug/11 ]

I didn't find any error message for seq_client_alloc_seq() failure, so it's not the cause for this panic. After checking seq_fid_alloc_prep():

static int seq_fid_alloc_prep(struct lu_client_seq *seq,
                              cfs_waitlink_t *link)
{

        cfs_waitq_add(&seq->lcs_waitq, link);
        if (seq->lcs_update) {
                cfs_up(&seq->lcs_sem);
                cfs_set_current_state(CFS_TASK_UNINT);
                cfs_waitq_wait(link, CFS_TASK_UNINT);
                cfs_set_current_state(CFS_TASK_RUNNING);
                cfs_down(&seq->lcs_sem);
                return -EAGAIN;
        }
        ++seq->lcs_update;
        cfs_up(&seq->lcs_sem);
        return 0;
}

Once seq is updating, current thread will schedule and try again later, however it doesn't remove itself from waitq, therefore it may add itself more than once, and cause the panic.

Comment by Lai Siyao [ 12/Aug/11 ]

review is on http://review.whamcloud.com/#change,1216

Comment by Jian Yu [ 12/Aug/11 ]

After the upgrading, running metabench test also hit the similar issue:
https://maloo.whamcloud.com/test_sets/955e4d74-c4bc-11e0-9a20-52540025f9af

Comment by Niu Yawei (Inactive) [ 16/Aug/11 ]

I also got such list_add corruption error while running the interop tests locally, looks it's quite easy to trigger this bug, and it's not an interop bug actually, so I'm confused that why it was not hit in 2.1 daily tests, any idea?

Comment by Lai Siyao [ 16/Aug/11 ]

I also feel a bit strange of it. One possible cause might be that upgrading is tested with physical machine, while autotest using VM. And this panic is triggered by parallel metadata operation test like metabench (multiple threads race on fid seq allocation), which is easier to occur on physical machines.

Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » x86_64,client,el5,ofa #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » i686,client,el6,inkernel #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » x86_64,server,el6,inkernel #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » x86_64,client,el5,inkernel #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » x86_64,client,el6,inkernel #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » x86_64,server,el5,ofa #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » i686,client,el5,inkernel #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » i686,server,el5,ofa #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » i686,server,el6,inkernel #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » x86_64,client,sles11,inkernel #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » x86_64,client,ubuntu1004,inkernel #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » i686,client,el5,ofa #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » i686,server,el5,inkernel #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Comment by Build Master (Inactive) [ 16/Aug/11 ]

Integrated in lustre-master » x86_64,server,el5,inkernel #259
LU-586 1.8 <-> 2.1 interop: list_add corruption

Oleg Drokin : 2807ae2826aa10aca69beaa924f69093326f8351
Files :

  • lustre/fid/fid_request.c
Generated at Sat Feb 10 01:08:29 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.