[LU-3415] After upgrading servers from 1.8.9 to 2.4, hit (qsd_entry.c:198:qsd_refresh_usage()) ASSERTION( qqi->qqi_acct_obj ) failed Created: 29/May/13  Updated: 22/Jul/13  Resolved: 19/Jun/13

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.1, Lustre 2.5.0

Type: Bug Priority: Critical
Reporter: Sarah Liu Assignee: Niu Yawei (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

before upgrade: server and client run 1.8.9
after upgrade: server runs lustre-b2_4RC2
client runs 1.8.9


Severity: 3
Rank (Obsolete): 8470

 Description   

1. Set up Lustre with both servers and clients running 1.8.9.
2. Shut down the filesystem.
3. Upgrade the servers from 1.8.9 to 2.4.
4. Mount the filesystem and run sanity with servers=2.4RC2 and clients=1.8.9.

The MDS hit an LBUG when running sanity test_6c:

Lustre: DEBUG MARKER: == test 6c: touch .../f6c; chown .../f6c ====================== == 15:40:24
LustreError: 8176:0:(qsd_entry.c:198:qsd_refresh_usage()) ASSERTION( qqi->qqi_acct_obj ) failed: 
LustreError: 8176:0:(qsd_entry.c:198:qsd_refresh_usage()) LBUG
Pid: 8176, comm: mdt01_003

Call Trace:

 [<ffffffffa0374895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [<ffffffffa0374e97>] lbug_with_loc+0x47/0xb0 [libcfs]
 [<ffffffffa0b673a7>] qsd_refresh_usage+0x2c7/0x300 [lquota]
 [<ffffffffa0b67520>] qsd_lqe_read+0x140/0x2b0 [lquota]
 [<ffffffffa0b617e9>] lqe_locate+0x339/0x850 [lquota]
 [<ffffffffa0b742bf>] qsd_op_begin+0x3ff/0xb40 [lquota]
 [<ffffffffa0c73676>] osd_declare_qid+0xd6/0x3f0 [osd_ldiskfs]
 [<ffffffffa0c55344>] osd_declare_attr_set+0x204/0x7c0 [osd_ldiskfs]
 [<ffffffffa0e14638>] lod_declare_attr_set+0x128/0x4a0 [lod]
 [<ffffffffa0ad0a69>] mdd_attr_set+0x3f9/0x1390 [mdd]
 [<ffffffffa0d64498>] mdt_attr_set+0x268/0x560 [mdt]
 [<ffffffffa0d64d2c>] mdt_reint_setattr+0x59c/0xca0 [mdt]
 [<ffffffffa065e10e>] ? lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
 [<ffffffffa0d5e891>] mdt_reint_rec+0x41/0xe0 [mdt]
 [<ffffffffa0d43b03>] mdt_reint_internal+0x4c3/0x780 [mdt]
 [<ffffffffa0d43e04>] mdt_reint+0x44/0xe0 [mdt]
 [<ffffffffa0d48ab8>] mdt_handle_common+0x648/0x1660 [mdt]
 [<ffffffffa0d82165>] mds_regular_handle+0x15/0x20 [mdt]
 [<ffffffffa066e388>] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
 [<ffffffffa03755de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
 [<ffffffffa0386d8f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
 [<ffffffffa06656e9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
 [<ffffffff81055ab3>] ? __wake_up+0x53/0x70
 [<ffffffffa066f71e>] ptlrpc_main+0xace/0x1700 [ptlrpc]
 [<ffffffffa066ec50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
 [<ffffffff8100c0ca>] child_rip+0xa/0x20
 [<ffffffffa066ec50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
 [<ffffffffa066ec50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
 [<ffffffff8100c0c0>] ? child_rip+0x0/0x20

Message from syslogd@fat-amd-1 at May 29 15:40:24 ...
 kernel:LustreError: 8176:0:(qsd_entry.c:198:qsd_refresh_usage()) ASSERTION( qqi->qqi_acct_obj ) failed: 

Message from syslogd@fat-amd-1 at May 2
 kernel:LustreError: 8176:0:(qsd_entry.c:198:qsd_refresh_usage()) LBUG

Kernel panic - not syncing: LBUG
Pid: 8176, comm: mdt01_003 Not tainted 2.6.32-358.6.2.el6_lustre.g230b174.x86_64 #1
Call Trace:
 [<ffffffff8150d878>] ? panic+0xa7/0x16f
 [<ffffffffa0374eeb>] ? lbug_with_loc+0x9b/0xb0 [libcfs]
 [<ffffffffa0b673a7>] ? qsd_refresh_usage+0x2c7/0x300 [lquota]
 [<ffffffffa0b67520>] ? qsd_lqe_read+0x140/0x2b0 [lquota]
 [<ffffffffa0b617e9>] ? lqe_locate+0x339/0x850 [lquota]
 [<ffffffffa0b742bf>] ? qsd_op_begin+0x3ff/0xb40 [lquota]
 [<ffffffffa0c73676>] ? osd_declare_qid+0xd6/0x3f0 [osd_ldiskfs]
 [<ffffffffa0c55344>] ? osd_declare_attr_set+0x204/0x7c0 [osd_ldiskfs]
 [<ffffffffa0e14638>] ? lod_declare_attr_set+0x128/0x4a0 [lod]
 [<ffffffffa0ad0a69>] ? mdd_attr_set+0x3f9/0x1390 [mdd]
 [<ffffffffa0d64498>] ? mdt_attr_set+0x268/0x560 [mdt]
 [<ffffffffa0d64d2c>] ? mdt_reint_setattr+0x59c/0xca0 [mdt]
 [<ffffffffa065e10e>] ? lustre_pack_reply_flags+0xae/0x1f0 [ptlrpc]
 [<ffffffffa0d5e891>] ? mdt_reint_rec+0x41/0xe0 [mdt]
 [<ffffffffa0d43b03>] ? mdt_reint_internal+0x4c3/0x780 [mdt]
 [<ffffffffa0d43e04>] ? mdt_reint+0x44/0xe0 [mdt]
 [<ffffffffa0d48ab8>] ? mdt_handle_common+0x648/0x1660 [mdt]
 [<ffffffffa0d82165>] ? mds_regular_handle+0x15/0x20 [mdt]
 [<ffffffffa066e388>] ? ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
 [<ffffffffa03755de>] ? cfs_timer_arm+0xe/0x10 [libcfs]
 [<ffffffffa0386d8f>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
 [<ffffffffa06656e9>] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
 [<ffffffff81055ab3>] ? __wake_up+0x53/0x70
 [<ffffffffa066f71e>] ? ptlrpc_main+0xace/0x1700 [ptlrpc]
 [<ffffffffa066ec50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
 [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
 [<ffffffffa066ec50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
 [<ffffffffa066ec50>] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
 [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
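
The call path in the traces above (ptlrpc request handling, then mdt/mdd/lod/osd_ldiskfs setattr, then lquota) can be pulled out mechanically. The following is a minimal sketch, not part of the original report: `FRAME_RE` and `parse_frames` are hypothetical helper names, and the regular expression assumes the frame layout seen in this log (`[<address>] [? ]function+0xoff/0xsize [module]`). It scans the whole text rather than individual lines, so it should also tolerate console output interleaved between frames.

```python
import re

# Hypothetical helper: matches kernel stack frames in the layout seen above, e.g.
#   [<ffffffffa0b673a7>] ? qsd_refresh_usage+0x2c7/0x300 [lquota]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\][ \t]+(?:\?[ \t]+)?"
    r"(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:[ \t]+\[(?P<module>\w+)\])?"
)

def parse_frames(text):
    """Return (function, module) pairs for every stack frame found in text.

    Frames without a [module] suffix are reported as belonging to "kernel".
    """
    return [
        (m.group("func"), m.group("module") or "kernel")
        for m in FRAME_RE.finditer(text)
    ]

# A few frames copied from the trace in this ticket.
sample = """\
 [<ffffffffa0b673a7>] qsd_refresh_usage+0x2c7/0x300 [lquota]
 [<ffffffffa0b67520>] qsd_lqe_read+0x140/0x2b0 [lquota]
 [<ffffffffa0b617e9>] lqe_locate+0x339/0x850 [lquota]
 [<ffffffffa0b742bf>] qsd_op_begin+0x3ff/0xb40 [lquota]
 [<ffffffffa0c73676>] osd_declare_qid+0xd6/0x3f0 [osd_ldiskfs]
 [<ffffffff81055ab3>] ? __wake_up+0x53/0x70
"""

frames = parse_frames(sample)
for func, module in frames:
    print(f"{module:12s} {func}")
```

Feeding the full console log to `parse_frames` instead of `sample` would list every frame of both traces in order, innermost first.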


 Comments   
Comment by Niu Yawei (Inactive) [ 30/May/13 ]

http://review.whamcloud.com/6492

Comment by Niu Yawei (Inactive) [ 19/Jun/13 ]

patch landed

Generated at Sat Feb 10 01:33:43 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.