[LU-721] Parallel writes to same file results in a file of zeroes Created: 25/Sep/11  Updated: 10/Jan/12  Resolved: 10/Oct/11

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 1.8.6
Fix Version/s: Lustre 2.2.0, Lustre 2.1.1, Lustre 1.8.7

Type: Bug Priority: Blocker
Reporter: Zhenyu Xu Assignee: Zhenyu Xu
Resolution: Fixed Votes: 0
Labels: None
Environment:

Compute node:

  • SLES 11 kernel 2.6.32.29-0.3-default
  • lustre 1.8.25 build 1.8.5-20110509165255-PRISTINE-2.6.32.29-0.3-default
    OSS:
  • SLES 10 SP2 kernerl 2.6.16.60-0.42.8_default
  • lustre 1.8.4

Severity: 1
Bugzilla ID: 24,548
Rank (Obsolete): 4901

 Description   

this is in an MPI environment.

Total (customer) is running MPI(SGI-MPT, actually) processes on different compute
nodes writing to the same file in pararallel would result in the file
contain zeros. this doesn't happen all the time, but enough to be a severe
issue.



 Comments   
Comment by Zhenyu Xu [ 25/Sep/11 ]

quote from https://bugzilla.lustre.org/show_bug.cgi?id=24548#c23

The following scenario seems to lead to wrong data in a file:

One client writes data under a lock, it completes the write and runs file closing.
Another client wants to write the file, requests a lock, server sees a conflict and sends blocking
AST to first client.
First client handles blocking AST (sending data protected by a lock to OST), the lock has
LDLM_FL_CANCELING flag set.

In the meantime, file close runs concurrently and enters dcache.c:ll_ddelete(), which contains the
following lines:

/* if not ldlm lock for this inode, set i_nlink to 0 so that

  • this inode can be recycled later b=20433 */
    LASSERT(atomic_read(&de->d_count) == 0);
    if (de->d_inode && !find_cbdata(de->d_inode))
    de->d_inode->i_nlink = 0;

dcache.c:find_cbdata() checks if there are any locks associated with the file using
dcache.c:return_if_equal() which currently does not count lock with LDLM_FL_CANCELING as such
locks. So, inode gets nlink set to 0, and delete_inode invalidates pages which have not been sent
to OST yet.

The proposed quick fix changes return_if_equal() to respect locks which are being canceled but
whose data are to be sent to servers.

Comment by Zhenyu Xu [ 25/Sep/11 ]

b18 patch tracking at http://review.whamcloud.com/1422
master patch tracking at http://review.whamcloud.com/1423

Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » i686,client,el6,inkernel #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » x86_64,server,el6,inkernel #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » i686,client,el5,ofa #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » x86_64,client,el5,inkernel #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » i686,server,el5,inkernel #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » i686,server,el6,inkernel #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » x86_64,client,sles11,inkernel #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » x86_64,client,el6,inkernel #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » i686,client,el5,inkernel #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » x86_64,server,el5,inkernel #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » x86_64,client,ubuntu1004,inkernel #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » x86_64,server,el5,ofa #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » i686,server,el5,ofa #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 03/Oct/11 ]

Integrated in lustre-master » x86_64,client,el5,ofa #284
LU-721 llite: Parallel writes to same file results in zero file

Oleg Drokin : 0356a2376375020abb861423ce557767dadf593e
Files :

  • lustre/llite/dcache.c
Comment by Sarah Liu [ 04/Oct/11 ]

got kernel panic when running conf-sanity test 22, 23a on build http://build.whamcloud.com/job/lustre-reviews/2387/ rhel5-x86_64

Lustre: DEBUG MARKER: == test 22: start a client before osts (should return errs) == 15:54:17
Lustre: DEBUG MARKER: ost now in FULL state
LustreError: 152-6: Ignoring deprecated mount option 'acl'.
Lustre: MGC192.168.4.2@o2ib: Reactivating import
Lustre: Client lustre-client has started
Lustre: 25423:0:(client.c:1487:ptlrpc_expire_one_request()) @@@ Request x1381780714160180 sent from lustre-OST0000-osc-ffff810326f68800 to NID 192.168.4.131@o2ib 5s ago has timed out (5s prior to deadline).
req@ffff810325487400 x1381780714160180/t0 o8->lustre-OST0000_UUID@192.168.4.131@o2ib:28/4 lens 368/584 e 0 to 1 dl 1317768871 ref 1 fl Rpc:N/0/0 rc 0/0
LustreError: 30513:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
LustreError: 30513:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Skipped 2 previous similar messages
LustreError: 30513:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
LustreError: 30513:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) Skipped 2 previous similar messages
Lustre: client ffff810326f68800 umount complete
LustreError: 152-6: Ignoring deprecated mount option 'acl'.
Lustre: MGC192.168.4.2@o2ib: Reactivating import
Lustre: Client lustre-client has started
LustreError: 30629:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
LustreError: 30629:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Skipped 1 previous similar message
LustreError: 30629:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
LustreError: 30629:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) Skipped 1 previous similar message
Lustre: client ffff81033eaf8c00 umount complete
LustreError: 152-6: Ignoring deprecated mount option 'acl'.
Lustre: MGC192.168.4.2@o2ib: Reactivating import
LustreError: 11-0: an error occurred while communicating with 192.168.4.2@o2ib. The mds_connect operation failed with -11
Lustre: Client lustre-client has started
Lustre: setting import lustre-MDT0000_UUID INACTIVE by administrator request
Lustre: setting import lustre-OST0000_UUID INACTIVE by administrator request
LustreError: 31531:0:(ldlm_request.c:1039:ldlm_cli_cancel_req()) Got rc -108 from cancel RPC: canceling anyway
LustreError: 31531:0:(ldlm_request.c:1597:ldlm_cli_cancel_list()) ldlm_cli_cancel_list: -108
Lustre: 31531:0:(client.c:1487:ptlrpc_expire_one_request()) @@@ Request x1381780836843537 sent from MGC192.168.4.2@o2ib to NID 192.168.4.2@o2ib 6s ago has timed out (6s prior to deadline).
req@ffff810326cbc800 x1381780836843537/t0 o251->MGS@MGC192.168.4.2@o2ib_0:26/25 lens 192/384 e 0 to 1 dl 1317768914 ref 1 fl Rpc:N/0/0 rc 0/0
Lustre: client ffff8103263fb000 umount complete
LustreError: 152-6: Ignoring deprecated mount option 'acl'.
Lustre: MGC192.168.4.2@o2ib: Reactivating import
LustreError: 11-0: an error occurred while communicating with 192.168.4.2@o2ib. The mds_connect operation failed with -16
LustreError: 31583:0:(lov_obd.c:1009:lov_cleanup()) lov tgt 0 not cleaned! deathrow=0, lovrc=1
LustreError: 31583:0:(ldlm_resource.c:636:ldlm_namespace_free_post()) ASSERTION(list_empty(&ns->ns_list_chain)) failed
LustreError: 31583:0:(ldlm_resource.c:636:ldlm_namespace_free_post()) LBUG
Pid: 31583, comm: mount.lustre

Call Trace:
[<ffffffff8007725f>] do_flush_tlb_all+0x0/0x6a
[<ffffffff887af6a1>] libcfs_debug_dumpstack+0x51/0x60 [libcfs]
[<ffffffff887afbda>] lbug_with_loc+0x7a/0xd0 [libcfs]
[<ffffffff887b7fc0>] tracefile_init+0x0/0x110 [libcfs]
[<ffffffff888ddc08>] ldlm_namespace_free_post+0x2f8/0x3f0 [ptlrpc]
[<ffffffff888e473c>] client_obd_cleanup+0x6c/0x100 [ptlrpc]
[<ffffffff88a02bc5>] osc_cleanup+0x255/0x2d0 [osc]
[<ffffffff8887157c>] class_decref+0x5dc/0x730 [obdclass]
[<ffffffff88859962>] obd_zombie_impexp_cull+0x82/0xa0 [obdclass]
[<ffffffff8887285d>] class_detach+0x2bd/0x350 [obdclass]
[<ffffffff8885da89>] class_name2obd+0x9/0x30 [obdclass]
[<ffffffff8887662e>] class_process_config+0x1dae/0x3180 [obdclass]
[<ffffffff88878547>] class_manual_cleanup+0xb47/0xe10 [obdclass]
[<ffffffff8885f66f>] class_devices_in_group+0x4f/0xa0 [obdclass]
[<ffffffff88b1c06d>] ll_put_super+0x1bd/0x2f0 [lustre]
[<ffffffff88b1e33a>] ll_fill_super+0x219a/0x2370 [lustre]
[<ffffffff888e469e>] client_connect_import+0x4ee/0x520 [ptlrpc]
[<ffffffff8888ab50>] lustre_fill_super+0x1770/0x1970 [obdclass]
[<ffffffff800ef592>] get_filesystem+0x12/0x3b
[<ffffffff800e6298>] sget+0x365/0x377
[<ffffffff800e69fd>] set_anon_super+0x0/0xab
[<ffffffff888893e0>] lustre_fill_super+0x0/0x1970 [obdclass]
[<ffffffff800e6c77>] get_sb_nodev+0x4f/0x97
[<ffffffff800e654f>] vfs_kern_mount+0x93/0x11a
[<ffffffff800e6618>] do_kern_mount+0x36/0x4d
[<ffffffff800f0ec1>] do_mount+0x6a9/0x719
[<ffffffff8000d044>] do_lookup+0x65/0x1e6
[<ffffffff8002cc16>] mntput_no_expire+0x19/0x89
[<ffffffff8000a81a>] __link_path_walk+0xf79/0xfb9
[<ffffffff8002cc16>] mntput_no_expire+0x19/0x89
[<ffffffff8000eb94>] link_path_walk+0xa6/0xb2
[<ffffffff800ce6ab>] zone_statistics+0x3e/0x6d
[<ffffffff8000f41e>] __alloc_pages+0x78/0x308
[<ffffffff8004c76c>] sys_mount+0x8a/0xcd
[<ffffffff8005d28d>] tracesys+0xd5/0xe0

Message fromLustreError: dumping log to /tmp/lustre-log.1317768920.31583
syslogd@ at Tue Oct 4 15:55:20 2011 ...
client-15 kernel: LustreError: 31583:0:(ldlm_resource.c:636:ldlm_namespace_free_post()) ASSERTION(list_empty(&ns->ns_list_chain)) failed

Message from syslogd@ at Tue Oct 4 15:55:20 2011 ...
client-15 kernel: LustreError: 31583:0:(ldlm_resource.c:636:ldlm_namespace_free_post()) LBUG
general protection fault: 0000 [1] SMP
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
CPU 2
Modules linked in: mgc(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf be2iscsi iscsi_tcp bnx2i cnic uio cxgb3i iw_cxgb3 cxgb3 libiscsi_tcp ib_iser libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi ib_srp rds ib_sdp ib_ipoib ipoib_helper ipv6 xfrm_nalgo crypto_api rdma_ucm rdma_cm ib_ucm ib_uverbs ib_umad ib_cm iw_cm ib_addr ib_sa loop dm_mirror dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport mlx4_ib ib_mad ib_core mlx4_en joydev sg igb 8021q shpchp dca mlx4_core i2c_i801 i2c_core tpm_tis tpm i7core_edac edac_mc serio_raw tpm_bios pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 31592, comm: ldlm_poold Tainted: G 2.6.18-238.12.1.el5 #1
RIP: 0010:[<ffffffff8890325f>] [<ffffffff8890325f>] :ptlrpc:ldlm_pool_recalc+0xdf/0x130
RSP: 0000:ffff8103271a5e20 EFLAGS: 00010246
RAX: 5a5a5a5a5a5a5a5a RBX: 0000000000000000 RCX: ffff810336983e40
RDX: ffffffff8898c6f0 RSI: ffff810336983e40 RDI: ffff810325bb452c
RBP: ffff810325bb44c0 R08: ffff8103271a4000 R09: 000000000000003e
R10: ffff81010b7e4038 R11: 0000000000000246 R12: 0000000000000002
R13: ffffffff88998660 R14: 0000000000000002 R15: 0000000000000002
FS: 00002b798882e6e0(0000) GS:ffff81010b763ec0(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b7fcca2a500 CR3: 00000003283af000 CR4: 00000000000006e0
Process ldlm_poold (pid: 31592, threadinfo ffff8103271a4000, task ffff8103255b57e0)
Stack: ffff810325bb4400 ffffffff88998660 0000000000000002 ffffffff88998660
0000000000000002 ffffffff889040bf 000000010008d605 ffffffff8005bbcb
000000010008d605 ffffffff80063893 0000000000000000 0000000000200200
Call Trace:
[<ffffffff889040bf>] :ptlrpc:ldlm_pools_recalc+0x2bf/0x2f0
[<ffffffff8005bbcb>] del_timer_sync+0xc/0x16
[<ffffffff80063893>] schedule_timeout+0x92/0xad
[<ffffffff88906226>] :ptlrpc:ldlm_pools_thread_main+0x106/0x2cf
[<ffffffff8008e430>] default_wake_function+0x0/0xe
[<ffffffff8005dfb1>] child_rip+0xa/0x11
[<ffffffff88906120>] :ptlrpc:ldlm_pools_thread_main+0x0/0x2cf
[<ffffffff8005dfa7>] child_rip+0x0/0x11

Code: 48 8b 00 48 85 c0 74 1b 48 89 ef ff d0 48 8b bd c0 00 00 00
RIP [<ffffffff8890325f>] :ptlrpc:ldlm_pool_recalc+0xdf/0x130
RSP <ffff8103271a5e20>
<0>Kernel panic - not syncing: Fatal exception

Comment by Zhenyu Xu [ 05/Oct/11 ]

Sarah, this panic is due to LU-607, while latest b1_8 codebase had reverted the wrong patch, which should solve the panic.

Comment by Build Master (Inactive) [ 10/Oct/11 ]

Integrated in lustre-b1_8 » i686,client,el5,ofa #136
LU-721 Parallel writes to same file results in a file of zeroes

Johann Lombardi : 0f7efbe4dcc8c78fcf73660ded113bcaa6d5ae68
Files :

  • lustre/llite/dcache.c
  • lustre/tests/conf-sanity.sh
Comment by Build Master (Inactive) [ 10/Oct/11 ]

Integrated in lustre-b1_8 » x86_64,client,el5,inkernel #136
LU-721 Parallel writes to same file results in a file of zeroes

Johann Lombardi : 0f7efbe4dcc8c78fcf73660ded113bcaa6d5ae68
Files :

  • lustre/tests/conf-sanity.sh
  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 10/Oct/11 ]

Integrated in lustre-b1_8 » x86_64,client,el5,ofa #136
LU-721 Parallel writes to same file results in a file of zeroes

Johann Lombardi : 0f7efbe4dcc8c78fcf73660ded113bcaa6d5ae68
Files :

  • lustre/tests/conf-sanity.sh
  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 10/Oct/11 ]

Integrated in lustre-b1_8 » x86_64,server,el5,ofa #136
LU-721 Parallel writes to same file results in a file of zeroes

Johann Lombardi : 0f7efbe4dcc8c78fcf73660ded113bcaa6d5ae68
Files :

  • lustre/tests/conf-sanity.sh
  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 10/Oct/11 ]

Integrated in lustre-b1_8 » x86_64,client,el6,inkernel #136
LU-721 Parallel writes to same file results in a file of zeroes

Johann Lombardi : 0f7efbe4dcc8c78fcf73660ded113bcaa6d5ae68
Files :

  • lustre/llite/dcache.c
  • lustre/tests/conf-sanity.sh
Comment by Build Master (Inactive) [ 10/Oct/11 ]

Integrated in lustre-b1_8 » x86_64,server,el5,inkernel #136
LU-721 Parallel writes to same file results in a file of zeroes

Johann Lombardi : 0f7efbe4dcc8c78fcf73660ded113bcaa6d5ae68
Files :

  • lustre/llite/dcache.c
  • lustre/tests/conf-sanity.sh
Comment by Build Master (Inactive) [ 10/Oct/11 ]

Integrated in lustre-b1_8 » i686,client,el5,inkernel #136
LU-721 Parallel writes to same file results in a file of zeroes

Johann Lombardi : 0f7efbe4dcc8c78fcf73660ded113bcaa6d5ae68
Files :

  • lustre/llite/dcache.c
  • lustre/tests/conf-sanity.sh
Comment by Build Master (Inactive) [ 10/Oct/11 ]

Integrated in lustre-b1_8 » i686,client,el6,inkernel #136
LU-721 Parallel writes to same file results in a file of zeroes

Johann Lombardi : 0f7efbe4dcc8c78fcf73660ded113bcaa6d5ae68
Files :

  • lustre/llite/dcache.c
  • lustre/tests/conf-sanity.sh
Comment by Build Master (Inactive) [ 10/Oct/11 ]

Integrated in lustre-b1_8 » i686,server,el5,ofa #136
LU-721 Parallel writes to same file results in a file of zeroes

Johann Lombardi : 0f7efbe4dcc8c78fcf73660ded113bcaa6d5ae68
Files :

  • lustre/tests/conf-sanity.sh
  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 10/Oct/11 ]

Integrated in lustre-b1_8 » x86_64,client,ubuntu1004,inkernel #136
LU-721 Parallel writes to same file results in a file of zeroes

Johann Lombardi : 0f7efbe4dcc8c78fcf73660ded113bcaa6d5ae68
Files :

  • lustre/tests/conf-sanity.sh
  • lustre/llite/dcache.c
Comment by Build Master (Inactive) [ 10/Oct/11 ]

Integrated in lustre-b1_8 » i686,server,el5,inkernel #136
LU-721 Parallel writes to same file results in a file of zeroes

Johann Lombardi : 0f7efbe4dcc8c78fcf73660ded113bcaa6d5ae68
Files :

  • lustre/llite/dcache.c
  • lustre/tests/conf-sanity.sh
Comment by Peter Jones [ 10/Oct/11 ]

Fixed for 1.8.7 and 2.2

Generated at Sat Feb 10 01:09:47 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.