[LU-14759] sanity-quota test_0: 'Timeout occurred after 273 mins, last suite running was sanity-quota' Created: 11/Jun/21  Updated: 22/Oct/21

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Severity: 3
Rank (Obsolete): 9223372036854775807

Description

This issue was created by maloo for Serguei Smirnov <ssmirnov@ddn.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/af21d98e-60c9-456c-9869-d9c3026da780

test_0 failed with the following error:

Timeout occurred after 273 mins, last suite running was sanity-quota
CMD: trevis-66vm4 /usr/sbin/lctl conf_param lustre.quota.ost=ugp
CMD: trevis-66vm3 /usr/sbin/lctl dl
CMD: trevis-66vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0000.quota_slave.enabled
Waiting 90s for 'ugp'
CMD: trevis-66vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0000.quota_slave.enabled
CMD: trevis-66vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0000.quota_slave.enabled
CMD: trevis-66vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0000.quota_slave.enabled
CMD: trevis-66vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0000.quota_slave.enabled
CMD: trevis-66vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0000.quota_slave.enabled
Updated after 9s: want 'ugp' got 'ugp'
CMD: trevis-66vm3 /usr/sbin/lctl dl
CMD: trevis-66vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0001.quota_slave.enabled
CMD: trevis-66vm3 /usr/sbin/lctl dl
CMD: trevis-66vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0002.quota_slave.enabled
CMD: trevis-66vm3 /usr/sbin/lctl dl
CMD: trevis-66vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0003.quota_slave.enabled
CMD: trevis-66vm3 /usr/sbin/lctl dl
CMD: trevis-66vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0004.quota_slave.enabled
CMD: trevis-66vm3 /usr/sbin/lctl dl
CMD: trevis-66vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0005.quota_slave.enabled
CMD: trevis-66vm3 /usr/sbin/lctl dl
CMD: trevis-66vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0006.quota_slave.enabled
CMD: trevis-66vm3 /usr/sbin/lctl dl
CMD: trevis-66vm3 /usr/sbin/lctl get_param -n osd-zfs.lustre-OST0007.quota_slave.enabled
running as uid/gid/euid/egid 60000/60000/60000/60000, groups:
 [dd] [if=/dev/zero] [bs=1M] [of=/mnt/lustre/d0.sanity-quota/f0.sanity-quota-0] [count=10] [conv=fsync]
10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 3.41233 s, 3.1 MB/s
Delete files...
Wait for unlink objects finished...
CMD: trevis-66vm4 /usr/sbin/lctl set_param -n os[cd]*.*MD*.force_sync 1
CMD: trevis-66vm4 /usr/sbin/lctl get_param -n osc.*MDT*.sync_*
CMD: trevis-66vm4 /usr/sbin/lctl get_param -n osc.*MDT*.sync_*
CMD: trevis-66vm4 /usr/sbin/lctl get_param -n osc.*MDT*.sync_*
sleep 5 for ZFS zfs
sleep 5 for ZFS zfs
Waiting for local destroys to complete
CMD: trevis-66vm4,trevis-66vm5 lctl set_param -n os[cd]*.*MDT*.force_sync=1
CMD: trevis-66vm3 lctl set_param -n osd*.*OS*.force_sync=1
CMD: trevis-66vm4 /usr/sbin/lctl get_param -n version 2>/dev/null
CMD: trevis-66vm4 zpool get all
Resetting fail_loc on all nodes...CMD: trevis-66vm1.trevis.whamcloud.com,trevis-66vm2,trevis-66vm3,trevis-66vm4,trevis-66vm5 lctl set_param -n fail_loc=0 	    fail_val=0 2>/dev/null
done.
CMD: trevis-66vm1.trevis.whamcloud.com /usr/sbin/lctl get_param catastrophe 2>&1
CMD: trevis-66vm2 /usr/sbin/lctl get_param catastrophe 2>&1
CMD: trevis-66vm3 /usr/sbin/lctl get_param catastrophe 2>&1
CMD: trevis-66vm4 /usr/sbin/lctl get_param catastrophe 2>&1
CMD: trevis-66vm5 /usr/sbin/lctl get_param catastrophe 2>&1
CMD: trevis-66vm1.trevis.whamcloud.com,trevis-66vm2,trevis-66vm3,trevis-66vm4,trevis-66vm5 dmesg
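The "Waiting 90s for 'ugp'" / "Updated after 9s" lines above come from the test framework polling a tunable until it reaches the expected value. A minimal sketch of that polling pattern is below; `wait_update` here is an illustrative stand-in, not Lustre's actual helper, and the stubbed `echo` call at the end merely demonstrates the loop (on a real OST the command would be the `lctl get_param` call from the log).

```shell
# wait_update CMD EXPECT TIMEOUT: run CMD once a second until its output
# equals EXPECT or TIMEOUT seconds elapse. Mirrors the "Waiting 90s for
# 'ugp'" behaviour in the log; the function name and structure are an
# illustration, not the test framework's real implementation.
wait_update() {
    cmd=$1; expect=$2; timeout=$3; elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        got=$(eval "$cmd" 2>/dev/null)
        if [ "$got" = "$expect" ]; then
            echo "Updated after ${elapsed}s: want '$expect' got '$got'"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "Timed out waiting ${timeout}s for '$expect'" >&2
    return 1
}

# On a real OST node this would be something like:
#   wait_update "/usr/sbin/lctl get_param -n osd-zfs.lustre-OST0000.quota_slave.enabled" ugp 90
# Stubbed here so the sketch is self-contained:
wait_update "echo ugp" ugp 90
```

Note that a timeout in this loop fails just one check, whereas the failure reported here is the outer 273-minute suite timeout, which the comments below attribute to disk I/O errors on the test VMs rather than to the quota setting itself.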

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
sanity-quota test_0 - Timeout occurred after 273 mins, last suite running was sanity-quota



Comments
Comment by Alex Zhuravlev [ 21/Jun/21 ]

I think something went wrong with the environment:

[13112.959919] blk_update_request: I/O error, dev vda, sector 21271440 op 0x1:(WRITE) flags 0x800 p
Comment by Elena Gryaznova [ 22/Oct/21 ]

One more instance:
https://testing.whamcloud.com/test_logs/8184077e-1558-4c47-ac7b-960740d540fe/show_text

[  975.962241] blk_update_request: I/O error, dev sda, sector 22957952 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  975.964075] blk_update_request: I/O error, dev sda, sector 22957952 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  975.965814] Buffer I/O error on dev dm-10, logical block 2869232, async page read
Generated at Sat Feb 10 03:12:32 UTC 2024 using Jira 9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c.