
Crash lquota:dquot_create_oqaq+0x28f/0x510

Details

    • Bug
    • Resolution: Fixed
    • Major
    • Lustre 1.8.x (1.8.0 - 1.8.5)

    Description

      The Lustre infrastructure is based on two HP blade servers with
      Hitachi shared storage. The first server hosts the MDS, MGS, and
      OST0/1/2; the second server hosts OST3/4.
      The first server is osiride-lp-030 and the second is osiride-lp-031.
      These services are clustered with Red Hat Cluster Suite.
      The Lustre infrastructure crashes daily, and we see these dumps in the
      log:

      Dec 9 11:27:08 osiride-lp-030 kernel: BUG: soft lockup - CPU#8 stuck for 10s! [ll_mdt_06:21936]
      Dec 9 11:27:08 osiride-lp-030 kernel: CPU 8:
      Dec 9 11:27:08 osiride-lp-030 kernel: Modules linked in: obdfilter(U) ost(U) mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ldiskfs(U) crc16(U) lock_dlm(U) gfs2(U)
      dlm(U) configfs(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ptlrpc(U) obdclass(U) lvfs(U) lnet(U) libcfs(U) bonding(U) ipv6(U) xfrm_nalgo(U)
      crypto_api(U) video(U) backlight(U) sbs(U) power_meter(U) hwmon(U) i2c_ec(U) i2c_core(U) dell_wmi(U) wmi(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U)
      ac(U) dm_round_robin(U) dm_multipath(U) scsi_dh(U) parport_pc(U) lp(U) parport(U) joydev(U) bnx2x(U) sg(U) amd64_edac_mod(U) shpchp(U) bnx2(U) serio_raw(U)
      tg3(U) pcspkr(U) edac_mc(U) hpilo(U) dm_raid45(U) dm_message(U) dm_region_hash(U) dm_mem_cache(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_log(U) dm_mod(U)
      usb_storage(U) qla2xxx(U) scsi_transport_fc(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
      Dec 9 11:27:08 osiride-lp-030 kernel: Pid: 21936, comm: ll_mdt_06 Tainted: G 2.6.18-194.17.1.el5_lustre.20110315140510 #1
      Dec 9 11:27:08 osiride-lp-030 kernel: RIP: 0010:[<ffffffff8882a270>] [<ffffffff8882a270>] :lquota:dquot_create_oqaq+0x2b0/0x510
      Dec 9 11:27:08 osiride-lp-030 kernel: RSP: 0018:ffff8104484e3ac0 EFLAGS: 00000246
      Dec 9 11:27:08 osiride-lp-030 kernel: RAX: 0000000000000000 RBX: ffff81041eee3ef0 RCX: 000000000000000c
      Dec 9 11:27:08 osiride-lp-030 kernel: RDX: 0000000000000000 RSI: 0000000000001400 RDI: 0000000000001400
      Dec 9 11:27:08 osiride-lp-030 kernel: RBP: 0000000000000004 R08: 000000000000000c R09: 0000000001000000
      Dec 9 11:27:08 osiride-lp-030 kernel: R10: 000000000000000c R11: 0000000000500000 R12: ffffffffffffffff
      Dec 9 11:27:08 osiride-lp-030 kernel: R13: 003fffffffffffff R14: 0000000000000282 R15: ffff81041eee3f00
      Dec 9 11:27:08 osiride-lp-030 kernel: FS: 00002b6411676230(0000) GS:ffff81010fc954c0(0000) knlGS:00000000f6cf2b90
      Dec 9 11:27:08 osiride-lp-030 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      Dec 9 11:27:08 osiride-lp-030 kernel: CR2: 00000000f6140000 CR3: 0000000000201000 CR4: 00000000000006e0
      Dec 9 11:27:08 osiride-lp-030 kernel:
      Dec 9 11:27:08 osiride-lp-030 kernel: Call Trace:
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8882ad69>] :lquota:lustre_dqget+0x679/0x7e0
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8882b086>] :lquota:init_oqaq+0x56/0x1c0
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8883285e>] :lquota:mds_set_dqblk+0x8de/0x2010
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff88732fd3>] :ptlrpc:ptl_send_buf+0x3f3/0x5b0
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8873b94a>] :ptlrpc:lustre_pack_reply_flags+0x86a/0x950
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff80150d56>] __next_cpu+0x19/0x28
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff88823e9a>] :lquota:mds_quota_ctl+0x16a/0x3c0
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8873ba59>] :ptlrpc:lustre_pack_reply+0x29/0xb0
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff88afe78f>] :mds:mds_handle+0x3d7f/0x4d10
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff800767ae>] smp_send_reschedule+0x4e/0x53
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8008c92d>] enqueue_task+0x41/0x56
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8873da35>] :ptlrpc:lustre_msg_get_conn_cnt+0x35/0xf0
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff887473b9>] :ptlrpc:ptlrpc_server_handle_request+0x989/0xe00
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff88747b15>] :ptlrpc:ptlrpc_wait_event+0x2e5/0x310
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8008b3bd>] __wake_up_common+0x3e/0x68
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff88748ac8>] :ptlrpc:ptlrpc_main+0xf88/0x1150
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8008c92d>] enqueue_task+0x41/0x56
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8873da35>] :ptlrpc:lustre_msg_get_conn_cnt+0x35/0xf0
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff887473b9>] :ptlrpc:ptlrpc_server_handle_request+0x989/0xe00
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff88747b15>] :ptlrpc:ptlrpc_wait_event+0x2e5/0x310
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8008b3bd>] __wake_up_common+0x3e/0x68
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff88748ac8>] :ptlrpc:ptlrpc_main+0xf88/0x1150
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff88747b40>] :ptlrpc:ptlrpc_main+0x0/0x1150
      Dec 9 11:27:08 osiride-lp-030 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
      Dec 9 11:27:08 osiride-lp-030 kernel:
      Dec 9 11:27:15 osiride-lp-030 kernel: Lustre: Service thread pid 23639 was inactive for 218.00s. Watchdog stack traces are limited to 3 per 300 seconds, skipping this one.

      This saturates the server's resources, and the clients are unable to
      access the filesystem.

      Regards
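When triaging a dump like the one above, it can help to pull out the stack frames and tally them by module. The following is a minimal sketch, assuming the RHEL 5 style frame format seen in this log (`[<address>] :module:symbol+0xoff/0xsize`, with the `:module:` part absent for core-kernel symbols). The names `FRAME_RE`, `parse_frames`, and `count_by_module` are hypothetical helpers for illustration, not part of any Lustre tool.

```python
import re
from collections import Counter

# One frame looks like:
#   [<ffffffff8882ad69>] :lquota:lustre_dqget+0x679/0x7e0
# or, for core-kernel symbols, without the ":module:" prefix:
#   [<ffffffff80150d56>] __next_cpu+0x19/0x28
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+"
    r"(?::(?P<module>\w+):)?"
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(log_text):
    """Return (module, symbol, offset) tuples in the order they appear."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        module = match.group("module") or "kernel"  # label for core symbols
        frames.append((module, match.group("symbol"),
                       int(match.group("offset"), 16)))
    return frames


def count_by_module(log_text):
    """Count how many frames each module contributes to the trace."""
    return Counter(module for module, _, _ in parse_frames(log_text))


if __name__ == "__main__":
    sample = (
        "kernel: [<ffffffff8882ad69>] :lquota:lustre_dqget+0x679/0x7e0\n"
        "kernel: [<ffffffff8882b086>] :lquota:init_oqaq+0x56/0x1c0\n"
        "kernel: [<ffffffff80150d56>] __next_cpu+0x19/0x28\n"
    )
    for module, count in count_by_module(sample).most_common():
        print(module, count)
```

Feeding the full syslog excerpt to `count_by_module` gives a quick view of which modules dominate the stuck thread's stack; the regular expression is only known to fit the lines shown in this report.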

          Activity

            [LU-935] Crash lquota:dquot_create_oqaq+0x28f/0x510

            Integrated in lustre-b2_1 » i686,client,el5,inkernel #41
            LU-935 quota: break early when b/i_unit_sz exceeded upper limit (Revision ed57fd22280fe5d1e2f8a57f21e83922ad565b3a)

            Result = SUCCESS
            Oleg Drokin : ed57fd22280fe5d1e2f8a57f21e83922ad565b3a
            Files :

            • lustre/quota/quota_master.c
            Comment added by hudson Build Master (Inactive).

            Integrated in lustre-b2_1 » x86_64,client,el5,ofa #41
            LU-935 quota: break early when b/i_unit_sz exceeded upper limit (Revision ed57fd22280fe5d1e2f8a57f21e83922ad565b3a)

            Result = SUCCESS
            Oleg Drokin : ed57fd22280fe5d1e2f8a57f21e83922ad565b3a
            Files :

            • lustre/quota/quota_master.c
            Comment added by hudson Build Master (Inactive).

            Integrated in lustre-b2_1 » i686,server,el5,ofa #41
            LU-935 quota: break early when b/i_unit_sz exceeded upper limit (Revision ed57fd22280fe5d1e2f8a57f21e83922ad565b3a)

            Result = SUCCESS
            Oleg Drokin : ed57fd22280fe5d1e2f8a57f21e83922ad565b3a
            Files :

            • lustre/quota/quota_master.c
            Comment added by hudson Build Master (Inactive).

            Integrated in lustre-b2_1 » x86_64,server,el5,inkernel #41
            LU-935 quota: break early when b/i_unit_sz exceeded upper limit (Revision ed57fd22280fe5d1e2f8a57f21e83922ad565b3a)

            Result = SUCCESS
            Oleg Drokin : ed57fd22280fe5d1e2f8a57f21e83922ad565b3a
            Files :

            • lustre/quota/quota_master.c
            Comment added by hudson Build Master (Inactive).

            Integrated in lustre-b2_1 » i686,server,el5,inkernel #41
            LU-935 quota: break early when b/i_unit_sz exceeded upper limit (Revision ed57fd22280fe5d1e2f8a57f21e83922ad565b3a)

            Result = SUCCESS
            Oleg Drokin : ed57fd22280fe5d1e2f8a57f21e83922ad565b3a
            Files :

            • lustre/quota/quota_master.c
            Comment added by hudson Build Master (Inactive).

            Integrated in lustre-b2_1 » x86_64,client,el5,inkernel #41
            LU-935 quota: break early when b/i_unit_sz exceeded upper limit (Revision ed57fd22280fe5d1e2f8a57f21e83922ad565b3a)

            Result = SUCCESS
            Oleg Drokin : ed57fd22280fe5d1e2f8a57f21e83922ad565b3a
            Files :

            • lustre/quota/quota_master.c
            Comment added by hudson Build Master (Inactive).

            Integrated in lustre-b2_1 » i686,server,el6,inkernel #41
            LU-935 quota: break early when b/i_unit_sz exceeded upper limit (Revision ed57fd22280fe5d1e2f8a57f21e83922ad565b3a)

            Result = SUCCESS
            Oleg Drokin : ed57fd22280fe5d1e2f8a57f21e83922ad565b3a
            Files :

            • lustre/quota/quota_master.c
            Comment added by hudson Build Master (Inactive).

            Integrated in lustre-b2_1 » x86_64,client,el6,inkernel #41
            LU-935 quota: break early when b/i_unit_sz exceeded upper limit (Revision ed57fd22280fe5d1e2f8a57f21e83922ad565b3a)

            Result = SUCCESS
            Oleg Drokin : ed57fd22280fe5d1e2f8a57f21e83922ad565b3a
            Files :

            • lustre/quota/quota_master.c
            Comment added by hudson Build Master (Inactive).

            Integrated in lustre-b2_1 » x86_64,server,el5,ofa #41
            LU-935 quota: break early when b/i_unit_sz exceeded upper limit (Revision ed57fd22280fe5d1e2f8a57f21e83922ad565b3a)

            Result = SUCCESS
            Oleg Drokin : ed57fd22280fe5d1e2f8a57f21e83922ad565b3a
            Files :

            • lustre/quota/quota_master.c
            Comment added by hudson Build Master (Inactive).

            Integrated in lustre-b2_1 » i686,client,el5,ofa #41
            LU-935 quota: break early when b/i_unit_sz exceeded upper limit (Revision ed57fd22280fe5d1e2f8a57f21e83922ad565b3a)

            Result = SUCCESS
            Oleg Drokin : ed57fd22280fe5d1e2f8a57f21e83922ad565b3a
            Files :

            • lustre/quota/quota_master.c
            Comment added by hudson Build Master (Inactive).
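The commit summary in the build comments above says the fix is to "break early when b/i_unit_sz exceeded upper limit". As a rough illustration of why such a break matters, here is a simplified model in Python, not the C code from lustre/quota/quota_master.c. Everything in it is an assumption made for the sketch: the function name `grow_unit`, its parameters, the loop condition, and the use of 64-bit wraparound as the reason the unbounded loop never finishes.

```python
MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit arithmetic as in C


def grow_unit(limit, usage, unit, factor, slaves, upper, max_iters=1000):
    """Enlarge `unit` until the reserved slack covers `limit`.

    Hypothetical model: the loop multiplies `unit` by `factor` while
    `limit` is still larger than usage plus the slack held by `slaves`.
    The early break stops growth once `unit` reaches `upper`.
    Returns (unit, iterations, hit_cap).
    """
    iters = 0
    while limit > ((usage + 2 * unit * slaves) & MASK64):
        if unit >= upper:
            # Early break: never grow the unit past its ceiling.
            return unit, iters, True
        unit = (unit * factor) & MASK64  # may wrap to 0 for huge limits
        iters += 1
        if iters >= max_iters:
            # Demo-only guard so a non-terminating loop becomes visible
            # instead of hanging the interpreter.
            raise RuntimeError("unit growth loop did not terminate")
    return unit, iters, False


if __name__ == "__main__":
    # With a ceiling of 128 MiB the loop stops even for a huge limit.
    print(grow_unit(MASK64, 0, 1 << 20, 2, 5, 128 << 20))
    # With an effectively unreachable ceiling the same input spins
    # until the demo guard fires.
    try:
        grow_unit(MASK64, 0, 1 << 20, 2, 5, 1 << 70)
    except RuntimeError as exc:
        print("without cap:", exc)
```

In this model a very large limit makes the unit wrap to zero, after which the loop condition can never become false; a kernel thread stuck in such a loop would match the soft-lockup symptom reported here, though whether that is the exact mechanism in the real code is not confirmed by this ticket.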

            People

              niu Niu Yawei (Inactive)
              lustre.support Supporto Lustre Jnet2000 (Inactive)
              Votes: 0
              Watchers: 5
