Details
Type: Bug
Resolution: Won't Fix
Priority: Major
Fix Version/s: None
Affects Version/s: Lustre 2.9.0, Lustre 2.10.0, Lustre 2.10.1, Lustre 2.11.0, Lustre 2.10.2, Lustre 2.10.4, Lustre 2.12.1
Environment:
Interop: 2.8.0 EL7.2 Server/EL7.2 Client
Server: b2_8, build# 12
Client: master, build# 3431
Severity: 3
Rank: 9223372036854775807
Description
This issue was created by maloo for Saurabh Tandan <saurabh.tandan@intel.com>
This issue relates to the following test suite run: https://testing.hpdd.intel.com/test_sets/e91156c4-7276-11e6-8afd-5254006e85c2.
The sub-test test_racer_on_nfs failed with the following error:
test failed to respond and timed out
test log:
./file_exec.sh: line 13: 23336 Segmentation fault $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 13: 25007 Segmentation fault $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 13: 26378 Segmentation fault $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 13: 31986 Segmentation fault $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 13: 1896 Segmentation fault $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 13: 1981 Segmentation fault $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 13: 2975 Segmentation fault $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
./file_exec.sh: line 13: 6421 Segmentation fault $DIR/$file 0.$((RANDOM % 5 + 1)) 2> /dev/null
racer cleanup
sleeping 5 sec ...
file_create.sh: no process found
dir_create.sh: no process found
file_rm.sh: no process found
file_rename.sh: no process found
file_link.sh: no process found
file_symlink.sh: no process found
file_list.sh: no process found
file_concat.sh: no process found
file_exec.sh: no process found
file_chown.sh: no process found
file_chmod.sh: no process found
file_mknod.sh: no process found
file_truncate.sh: no process found
file_delxattr.sh: no process found
file_getxattr.sh: no process found
file_setxattr.sh: no process found
there should be NO racer processes:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
Client dmesg:
df D ffff88005170fa80 0 6802 12502 0x00000080
[ 7070.189011] ffff88005170fa08 0000000000000082 ffff8800799c5c00 ffff88005170ffd8
[ 7070.189011] ffff88005170ffd8 ffff88005170ffd8 ffff8800799c5c00 ffff8800799c5c00
[ 7070.189011] ffff88007ff66c58 0000000000000082 ffffffffa0225d20 ffff88005170fa80
[ 7070.189011] Call Trace:
[ 7070.189011] [<ffffffffa0225d20>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
[ 7070.189011] [<ffffffff8163bc59>] schedule+0x29/0x70
[ 7070.189011] [<ffffffffa0225d55>] rpc_wait_bit_killable+0x35/0x80 [sunrpc]
[ 7070.189011] [<ffffffff81639ad0>] __wait_on_bit+0x60/0x90
[ 7070.189011] [<ffffffffa0225d20>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
[ 7070.189011] [<ffffffff81639b87>] out_of_line_wait_on_bit+0x87/0xb0
[ 7070.189011] [<ffffffff810a6ba0>] ? wake_atomic_t_function+0x40/0x40
[ 7070.189011] [<ffffffffa021cfe0>] ? call_bind_status+0x290/0x290 [sunrpc]
[ 7070.189011] [<ffffffffa021cfe0>] ? call_bind_status+0x290/0x290 [sunrpc]
[ 7070.189011] [<ffffffffa0226c44>] __rpc_execute+0x154/0x420 [sunrpc]
[ 7070.189011] [<ffffffff810a6ae5>] ? wake_up_bit+0x25/0x30
[ 7070.189011] [<ffffffffa022877e>] rpc_execute+0x5e/0xa0 [sunrpc]
[ 7070.189011] [<ffffffffa021e240>] rpc_run_task+0x70/0x90 [sunrpc]
[ 7070.189011] [<ffffffffa021e2b0>] rpc_call_sync+0x50/0xc0 [sunrpc]
[ 7070.189011] [<ffffffffa06bf5cb>] nfs3_rpc_wrapper.constprop.10+0x6b/0xb0 [nfsv3]
[ 7070.189011] [<ffffffffa06c02a6>] nfs3_proc_getattr+0x56/0xb0 [nfsv3]
[ 7070.189011] [<ffffffffa0514f7f>] __nfs_revalidate_inode+0xbf/0x310 [nfs]
[ 7070.189011] [<ffffffffa0515722>] nfs_revalidate_inode+0x22/0x60 [nfs]
[ 7070.189011] [<ffffffff811e9bdc>] ? lookup_fast+0xcc/0x2e0
[ 7070.189011] [<ffffffffa050cc5b>] nfs_weak_revalidate+0x4b/0xf0 [nfs]
[ 7070.189011] [<ffffffff811e9257>] complete_walk+0x87/0xe0
[ 7070.189011] [<ffffffff811ec21f>] do_last+0x46f/0x1270
[ 7070.189011] [<ffffffff812f96f2>] ? radix_tree_lookup_slot+0x22/0x50
[ 7070.189011] [<ffffffff811eed02>] path_openat+0xc2/0x490
[ 7070.189011] [<ffffffff811f04cb>] do_filp_open+0x4b/0xb0
[ 7070.189011] [<ffffffff811fd057>] ? __alloc_fd+0xa7/0x130
[ 7070.189011] [<ffffffff811dde73>] do_sys_open+0xf3/0x1f0
[ 7070.189011] [<ffffffff811ddf8e>] SyS_open+0x1e/0x20
[ 7070.189011] [<ffffffff81646c49>] system_call_fastpath+0x16/0x1b
MDS console:
20:40:23:[ 3465.447651] Lustre: DEBUG MARKER: == parallel-scale-nfsv3 test racer_on_nfs: racer on NFS client ======================================= 20:35:50 (1472960150)
20:40:23:[ 3499.433180] LustreError: 10949:0:(mdt_handler.c:893:mdt_getattr_internal()) lustre-MDT0000: getattr error for [0x200000bd2:0x1800:0x0]: rc = -2
20:40:23:[ 3499.438171] LustreError: 10949:0:(mdt_handler.c:893:mdt_getattr_internal()) Skipped 1 previous similar message
20:40:23:[ 3735.618177] BUG: unable to handle kernel paging request at ffffffa07d7040ff
20:40:23:[ 3735.619144] IP: [<ffffffa07d7040ff>] 0xffffffa07d7040fe
20:40:23:[ 3735.619144] PGD 194d067 PUD 0
20:40:23:[ 3735.619144] Oops: 0010 [#1] SMP
20:40:23:[ 3735.619144] Modules linked in: osc(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) sha512_generic crypto_null libcfs(OE) ldiskfs(OE) dm_mod rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xprtrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod crc_t10dif crct10dif_generic crct10dif_common ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ppdev pcspkr virtio_balloon i2c_piix4 parport_pc parport nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 ata_generic pata_acpi cirrus syscopyarea sysfillrect sysimgblt virtio_blk drm_kms_helper ttm 8139too ata_piix libata drm serio_raw virtio_pci virtio_ring virtio 8139cp mii i2c_core floppy
20:40:23:[ 3735.619144] CPU: 0 PID: 13 Comm: rcuos/1 Tainted: G OE ------------ 3.10.0-327.3.1.el7_lustre.x86_64 #1
20:40:23:[ 3735.619144] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
20:40:23:[ 3735.619144] task: ffff88007c058b80 ti: ffff88007c060000 task.ti: ffff88007c060000
20:40:23:[ 3735.619144] RIP: 0010:[<ffffffa07d7040ff>] [<ffffffa07d7040ff>] 0xffffffa07d7040fe
20:40:23:[ 3735.619144] RSP: 0018:ffff88007c063e10 EFLAGS: 00010286
20:40:23:[ 3735.619144] RAX: ffff88007c063fd8 RBX: ffff880040551227 RCX: 0000000180400035
20:40:23:[ 3735.619144] RDX: ffffffa07d7040ff RSI: ffffea000188a4c0 RDI: ffff880040551227
20:40:23:[ 3735.619144] RBP: ffff88007c063ec0 R08: ffff880062293840 R09: 0000000180400035
20:40:23:[ 3735.619144] R10: ffffea000188a4c0 R11: ffffffff81123f70 R12: 0000000000000001
20:40:23:[ 3735.619144] R13: ff880008399070ff R14: 0000000000000039 R15: ffff880039ef1750
20:40:23:[ 3735.619144] FS: 0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
20:40:23:[ 3735.619144] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
20:40:23:[ 3735.619144] CR2: ffffffa07d7040ff CR3: 00000000364cf000 CR4: 00000000000006f0
20:40:23:[ 3735.619144] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
20:40:23:[ 3735.619144] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
This issue was first seen on 08/23/2016
Issue Links
- is related to
  - LU-11718 parallel-scale-nfsv3 test racer_on_nfs crashes with ‘BUG: unable to handle kernel paging request’ (Resolved)
  - LU-11766 parallel-scale-nfsv3 test racer_on_nfs crash MDS (Resolved)