[LU-2273] failure on lustre-rsync-test test_1: malloc.c:3091: sYSMALLOc: Assertion '(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])' Created: 04/Nov/12  Updated: 30/Nov/12  Resolved: 30/Nov/12

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.4.0
Fix Version/s: Lustre 2.4.0

Type: Bug Priority: Blocker
Reporter: Maloo Assignee: Bob Glossman (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

lustre master build #1011 SLES11 SP2 client


Severity: 3
Rank (Obsolete): 5432

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://maloo.whamcloud.com/test_sets/f24a7d5e-2542-11e2-9e7c-52540035b04c.

The sub-test test_1 failed with the following error:

Error in replicating xattrs.

The test log shows:

== lustre-rsync-test test 1: Simple Replication ====================================================== 06:13:33 (1351862013)
CMD: client-26vm7 lctl --device lustre-MDT0000 changelog_register -n
lustre-MDT0000: Registered changelog user cl1
CMD: client-26vm7 lctl get_param -n mdd.lustre-MDT0000.changelog_users
CMD: client-26vm7 dumpe2fs -h /dev/lvm-MDS/P1 2>&1 | grep -q large_xattr
CMD: client-26vm7 dumpe2fs -h /dev/lvm-MDS/P1 2>&1
CMD: client-26vm7 dumpe2fs -h /dev/lvm-MDS/P1 2>&1 | grep -q large_xattr
Replication #1
lustre_rsync: malloc.c:3091: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) & ~((2 * (sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long)old_end & pagemask) == 0)' failed.
Lustre filesystem: lustre
MDT device: lustre-MDT0000
Source: /mnt/lustre
Target: /tmp/target
Target: /tmp/target2
Statuslog: /tmp/lustre_rsync.log
Changelog registration: cl1
Starting changelog record: 0
Clear changelog after use: no
/usr/lib64/lustre/tests/lustre-rsync-test.sh: line 124: 14508 Aborted                 $LRSYNC -s $DIR -t $TGT -t $TGT2 -m $MDT0 -u $CL_USER -l $LREPL_LOG -D $LRSYNC_LOG
Replication #2
	lustre_rsync -s <lustre_root_path> -t <target_path> -m <mdt> -r <user id> -l <status log>
lustre_rsync can also pick up parameters from a status log created earlier.
	lustre_rsync -l <log_file>
options:
	--xattr <yes|no> replicate EAs
	--abort-on-err   abort at first err
	--verbose
	--dry-run        don't write anything
Please specify changelog consumer registration id.
getfattr: /tmp/target/d0.lustre-rsync-test/d1/file5: No such file or directory
getfattr: /tmp/target2/d0.lustre-rsync-test/d1/file5: No such file or directory
 lustre-rsync-test test_1: @@@@@@ FAIL: Error in replicating xattrs. 
  Trace dump:


 Comments   
Comment by Bob Glossman (Inactive) [ 14/Nov/12 ]

I'm still not clear about exactly why this only shows up on SLES clients. I can only assume that the el6 malloc has better or different padding or packing. I did find some off-by-one errors in the lustre_rsync test program.

http://review.whamcloud.com/#change,4583
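
For readers unfamiliar with this class of bug, here is a minimal sketch of how an off-by-one in buffer sizing typically arises and how it is avoided. It is not the code from the change above; `join_path` is a hypothetical helper invented for illustration. Writing even one byte past a heap allocation can clobber glibc's bookkeeping for the neighbouring chunk, and whether that byte lands in padding or in live metadata depends on how the allocator rounds request sizes, which would be consistent with the failure appearing on one distribution and not another.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from lustre_rsync): build "dir/name" in a
 * freshly allocated buffer. The caller frees the result.
 *
 * The classic off-by-one is to size the buffer as
 *     strlen(dir) + 1 + strlen(name)
 * which counts the '/' but forgets the terminating NUL, so the final
 * write lands one byte past the end of the allocation.
 */
static char *join_path(const char *dir, const char *name)
{
    assert(dir != NULL && name != NULL);

    /* +1 for the '/' separator, +1 for the terminating NUL. */
    size_t len = strlen(dir) + 1 + strlen(name) + 1;
    char *buf = malloc(len);

    if (buf == NULL)
        return NULL;

    /* snprintf never writes more than len bytes, NUL included. */
    snprintf(buf, len, "%s/%s", dir, name);
    return buf;
}
```

Passing the allocated size to `snprintf` keeps the write bounded even if the length arithmetic is later changed incorrectly; the result would be a truncated string instead of heap corruption.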

Comment by Peter Jones [ 30/Nov/12 ]

Landed for 2.4
