Details
Bug
Resolution: Fixed
Major
Lustre 2.8.0
None
3
9223372036854775807
Description
Console logs:
===========
Jan 10 14:50:47 snx11000n004 XYRAID(snx11000n004_md1-jnlr)[11815]: INFO: snx11000n004_md1-jnlr stop exit : 0
Jan 10 14:50:48 snx11000n004 kernel: [340395.094979] __ratelimit: 1047 callbacks suppressed
Jan 10 14:50:48 snx11000n004 kernel: [340395.099992] Write to readonly device md139 (0x90008b) bi_flags: f000000000000001, bi_vcnt: 1, bi_idx: 0, bi->size: 4096, bi_cnt: 2, bi_private: ffff8802fede3678
Jan 10 14:50:48 snx11000n004 kernel: [340395.114672] Write to readonly device md139 (0x90008b) bi_flags: f000000000000001, bi_vcnt: 1, bi_idx: 0, bi->size: 4096, bi_cnt: 2, bi_private: ffff8802fede3748
Jan 10 14:50:48 snx11000n004 kernel: [340395.129363] Write to readonly device md139 (0x90008b) bi_flags: f000000000000001, bi_vcnt: 1, bi_idx: 0, bi->size: 4096, bi_cnt: 2, bi_private: ffff8802fede3748
Jan 10 14:50:48 snx11000n004 kernel: [340395.144056] Write to readonly device md5 (0x900005) bi_flags: f000000000000001, bi_vcnt: 1, bi_idx: 0, bi->size: 4096, bi_cnt: 2, bi_private: ffff88049c2fb610
Jan 10 14:50:48 snx11000n004 kernel: [340395.176736] LDISKFS-fs error (device md5): ldiskfs_mb_release_inode_pa: pa free mismatch: [pa ffff88066edde7b8] [phy 16646156] [logic 267] [len 117] [free 115] [error 0] [inode 117886005] [freed 117]
Jan 10 14:50:48 snx11000n004 kernel: [340395.194885] Aborting journal on device md139.
Jan 10 14:50:48 snx11000n004 kernel: [340395.199450] Write to readonly device md139 (0x90008b) bi_flags: f000000000000001, bi_vcnt: 1, bi_idx: 0, bi->size: 4096, bi_cnt: 2, bi_private: ffff880594188540
Jan 10 14:50:48 snx11000n004 kernel: [340395.214136] LDISKFS-fs (md5): Remounting filesystem read-only
Jan 10 14:50:48 snx11000n004 kernel: [340395.220182] Write to readonly device md5 (0x900005) bi_flags: f000000000000001, bi_vcnt: 1, bi_idx: 0, bi->size: 4096, bi_cnt: 2, bi_private: ffff880596d40a88
Jan 10 14:50:48 snx11000n004 kernel: [340395.234671] LDISKFS-fs error (device md5): ldiskfs_mb_release_inode_pa: free 117, pa_free 115
Jan 10 14:50:48 snx11000n004 kernel: [340395.243558] ----------[ cut here ]----------
Jan 10 14:50:48 snx11000n004 kernel: [340395.248362] kernel BUG at /builddir/build/BUILD/lustre-ldiskfs-3.3.0.x2/ldiskfs/mballoc.c:3799!
Jan 10 14:50:49 snx11000n004 kernel: [340395.256674] Write to readonly device md143 (0x90008f) bi_flags: f000000000000001, bi_vcnt: 1, bi_idx: 0, bi->size: 4096, bi_cnt: 2, bi_private: ffff8802fede3bc0
Jan 10 14:50:49 snx11000n004 kernel: [340395.256679] Write to readonly device md143 (0x90008f) bi_flags: f000000000000001, bi_vcnt: 1, bi_idx: 0, bi->size: 4096, bi_cnt: 2, bi_private: ffff8802fede30c8
Jan 10 14:50:49 snx11000n004 kernel: [340395.256695] Write to readonly device md143 (0x90008f) bi_flags: f000000000000001, bi_vcnt: 1, bi_idx: 0, bi->size: 4096, bi_cnt: 2, bi_private: ffff8802fede30c8
Jan 10 14:50:49 snx11000n004 kernel: [340395.256712] Write to readonly device md7 (0x900007) bi_flags: f000000000000001, bi_vcnt: 1, bi_idx: 0, bi->size: 4096, bi_cnt: 2, bi_private: ffff880483f645a8
Jan 10 14:50:49 snx11000n004 kernel: [340395.317126] invalid opcode: 0000 1 SMP
Jan 10 14:50:49 snx11000n004 kernel: [340395.321485] last sysfs file: /sys/devices/virtual/block/md131/uevent
Jan 10 14:50:49 snx11000n004 kernel: [340395.328034] CPU 0
Jan 10 14:50:49 snx11000n004 kernel: [340395.424861]
Jan 10 14:50:49 snx11000n004 kernel: [340395.520504] Pid: 11724, comm: umount Tainted: P W ---------------- 2.6.32-131.21.1.el6.lustre.3021.x86_64 #1 CS6000AC
Jan 10 14:50:49 snx11000n004 kernel: [340395.532218] RIP: 0010:[<ffffffffa0921ab6>] [<ffffffffa0921ab6>] ldiskfs_mb_release_inode_pa+0x346/0x360 [ldiskfs]
Jan 10 14:50:49 snx11000n004 kernel: [340395.542875] RSP: 0018:ffff8805de375a58 EFLAGS: 00010202
Jan 10 14:50:49 snx11000n004 kernel: [340395.548364] RAX: 0000000000000073 RBX: 0000000000000075 RCX: ffff8807d3b0bc00
Jan 10 14:50:49 snx11000n004 kernel: [340395.555743] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffff8806ebf95f00
Jan 10 14:50:49 snx11000n004 kernel: [340395.563142] RBP: ffff8805de375b08 R08: 0000000000000000 R09: 0000000000000080
Jan 10 14:50:49 snx11000n004 kernel: [340395.570521] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880324a63490
Jan 10 14:50:49 snx11000n004 kernel: [340395.577903] R13: ffff880596f74408 R14: 0000000000000082 R15: ffff88066edde7b8
Jan 10 14:50:49 snx11000n004 kernel: [340395.585286] FS: 00007f58836fe740(0000) GS:ffff880044600000(0000) knlGS:0000000000000000
Jan 10 14:50:49 snx11000n004 kernel: [340395.593619] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jan 10 14:50:49 snx11000n004 kernel: [340395.599539] CR2: 00007f6553fd90a0 CR3: 0000000779cc0000 CR4: 00000000000406f0
Jan 10 14:50:49 snx11000n004 kernel: [340395.606922] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 10 14:50:49 snx11000n004 kernel: [340395.614299] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 10 14:50:49 snx11000n004 kernel: [340395.621678] Process umount (pid: 11724, threadinfo ffff8805de374000, task ffff8806c96e80c0)
Jan 10 14:50:49 snx11000n004 kernel: [340395.630271] Stack:
Jan 10 14:50:49 snx11000n004 kernel: [340395.632464] ffff880500000075 0000000000000073 ffff880500000000 000000000706cc35
Jan 10 14:50:49 snx11000n004 kernel: [340395.639970] <0> 0000000000000075 000000000000007c ffff8805de375a98 ffffffff811a4066
Jan 10 14:50:49 snx11000n004 kernel: [340395.648046] <0> ffff8807d3b0bc00 ffff8807a0f1a800 ffff88066edde7b8 0000000000fe0000
Jan 10 14:50:49 snx11000n004 kernel: [340395.656371] Call Trace:
Jan 10 14:50:49 snx11000n004 kernel: [340395.659000] [<ffffffff811a4066>] ? __wait_on_buffer+0x26/0x30
Jan 10 14:50:49 snx11000n004 kernel: [340395.665024] [<ffffffffa092556e>] ldiskfs_discard_preallocations+0x1fe/0x490 [ldiskfs]
Jan 10 14:50:49 snx11000n004 kernel: [340395.673193] [<ffffffffa093e1c6>] ldiskfs_clear_inode+0x16/0x50 [ldiskfs]
Jan 10 14:50:49 snx11000n004 kernel: [340395.680168] [<ffffffff8118ceaf>] clear_inode+0x8f/0x110
Jan 10 14:50:49 snx11000n004 kernel: [340395.685655] [<ffffffff8118cf70>] dispose_list+0x40/0x120
Jan 10 14:50:49 snx11000n004 kernel: [340395.691236] [<ffffffff8118d41a>] invalidate_inodes+0xea/0x190
Jan 10 14:50:49 snx11000n004 kernel: [340395.697249] [<ffffffff81174f2c>] generic_shutdown_super+0x4c/0xe0
Jan 10 14:50:49 snx11000n004 kernel: [340395.703603] [<ffffffff81174ff1>] kill_block_super+0x31/0x50
Jan 10 14:50:49 snx11000n004 kernel: [340395.709455] [<ffffffff811760a0>] deactivate_super+0x70/0x90
Jan 10 14:50:49 snx11000n004 kernel: [340395.715291] [<ffffffff811915af>] mntput_no_expire+0xbf/0x110
Jan 10 14:50:49 snx11000n004 kernel: [340395.721253] [<ffffffffa10eb9c4>] unlock_mntput+0x64/0x70 [obdclass]
Jan 10 14:50:49 snx11000n004 kernel: [340395.727818] [<ffffffffa10f3ae3>] server_put_super+0x433/0x13e0 [obdclass]
Jan 10 14:50:49 snx11000n004 kernel: [340395.734875] [<ffffffff8108e120>] ? autoremove_wake_function+0x0/0x40
Jan 10 14:50:49 snx11000n004 kernel: [340395.741494] [<ffffffff8118d426>] ? invalidate_inodes+0xf6/0x190
Jan 10 14:50:49 snx11000n004 kernel: [340395.747672] [<ffffffff81174f3b>] generic_shutdown_super+0x5b/0xe0
Jan 10 14:50:49 snx11000n004 kernel: [340395.754054] [<ffffffff81175026>] kill_anon_super+0x16/0x60
Jan 10 14:50:49 snx11000n004 kernel: [340395.759856] [<ffffffffa10ea166>] lustre_kill_super+0x36/0x60 [obdclass]
Jan 10 14:50:49 snx11000n004 kernel: [340395.766760] [<ffffffff811760a0>] deactivate_super+0x70/0x90
Jan 10 14:50:49 snx11000n004 kernel: [340395.772612] [<ffffffff811915af>] mntput_no_expire+0xbf/0x110
Jan 10 14:50:49 snx11000n004 kernel: [340395.778555] [<ffffffff811919db>] sys_umount+0x7b/0x3a0
Jan 10 14:50:49 snx11000n004 kernel: [340395.783971] [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
The same crash was hit twice in 4 attempts. Logs are attached (kern, message, conman); the dump will be uploaded to the FTP server.
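For triage across the attached logs, the "pa free mismatch" line can be pulled out with a small script. This is a minimal sketch that assumes the line format is exactly as it appears in the console log above. The names `MISMATCH_RE`, `parse_mismatch`, and `delta` are illustrative and are not part of Lustre or ldiskfs.

```python
import re

# Pattern for the "pa free mismatch" line as it appears in this report.
# Assumption: other builds may format the message differently.
MISMATCH_RE = re.compile(
    r"LDISKFS-fs error \(device (?P<dev>\w+)\): ldiskfs_mb_release_inode_pa: "
    r"pa free mismatch: .*?\[len (?P<len>\d+)\] \[free (?P<pa_free>\d+)\] "
    r".*?\[inode (?P<inode>\d+)\] \[freed (?P<freed>\d+)\]"
)


def parse_mismatch(line):
    """Return the mismatch fields as a dict, or None if the line does not match."""
    m = MISMATCH_RE.search(line)
    if m is None:
        return None
    # The device name stays a string; every other captured field is numeric.
    fields = {k: (v if k == "dev" else int(v)) for k, v in m.groupdict().items()}
    # Difference between the blocks actually freed and the count recorded in the pa.
    fields["delta"] = fields["freed"] - fields["pa_free"]
    return fields


# The mismatch line copied from the console log in this report.
sample = (
    "Jan 10 14:50:48 snx11000n004 kernel: [340395.176736] LDISKFS-fs error (device md5): "
    "ldiskfs_mb_release_inode_pa: pa free mismatch: [pa ffff88066edde7b8] [phy 16646156] "
    "[logic 267] [len 117] [free 115] [error 0] [inode 117886005] [freed 117]"
)

if __name__ == "__main__":
    print(parse_mismatch(sample))
```

On the sample line this reports freed 117 against a recorded pa_free of 115 on md5, which agrees with the later "free 117, pa_free 115" message. The register dump appears consistent with that pair: RAX 0x73 is 115 and RBX 0x75 is 117 in decimal.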