Details
- Bug
- Resolution: Fixed
- Minor
Description
This issue was created by maloo for James Simmons <uja.ornl@gmail.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/64bcf39a-0c5d-4ebe-aa8e-07c6118719cd
Test session details:
clients: https://build.whamcloud.com/job/lustre-reviews/107661 - 5.14.0-362.24.1.el9_3.x86_64
servers: https://build.whamcloud.com/job/lustre-reviews/107661 - 5.14.0-362.24.1_lustre.el9.x86_64
current_state: FULL
state_history:
- [ 1726673245, DISCONN ]
- [ 1726673245, CONNECTING ]
- [ 1726673245, RECOVER ]
- [ 1726673245, FULL ]
- [ 1726673262, DISCONN ]
- [ 1726673262, CONNECTING ]
- [ 1726673262, RECOVER ]
- [ 1726673262, FULL ]
- [ 1726673279, DISCONN ]
- [ 1726673279, CONNECTING ]
- [ 1726673279, RECOVER ]
- [ 1726673279, FULL ]
- [ 1726673297, DISCONN ]
- [ 1726673297, CONNECTING ]
- [ 1726673297, RECOVER ]
- [ 1726673297, FULL ]
mdc.lustre-MDT0001-mdc-ffff88f406920000.state=
current_state: FULL
state_history:
- [ 1726671294, CONNECTING ]
- [ 1726671295, FULL ]
mdc.lustre-MDT0002-mdc-ffff88f406920000.state=
current_state: FULL
state_history:
- [ 1726671294, CONNECTING ]
- [ 1726671295, FULL ]
mdc.lustre-MDT0003-mdc-ffff88f406920000.state=
current_state: FULL
state_history:
- [ 1726671294, CONNECTING ]
- [ 1726671295, FULL ]
recovery-small test_10a: @@@@@@ FAIL: no eviction: before:1726673307
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:7177:error()
= /usr/lib64/lustre/tests/recovery-small.sh:154:test_10a()
= /usr/lib64/lustre/tests/test-framework.sh:7522:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:7585:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:7408:run_test()
= /usr/lib64/lustre/tests/recovery-small.sh:171:main()
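The failure message reports no eviction timestamp at all, which matches the state dumps above: they show only DISCONN, CONNECTING, RECOVER and FULL entries. A minimal Python sketch of that kind of check follows. It is an approximation for reading the log, not the code in recovery-small.sh; the helper names are hypothetical, and it assumes that an eviction would appear as an EVICTED entry in state_history.

```python
import re

# Matches entries such as "[ 1726673245, DISCONN ]".
STATE_RE = re.compile(r"\[\s*(\d+),\s*(\w+)\s*\]")


def parse_state_history(text):
    """Return (timestamp, state) pairs found in an import state dump."""
    return [(int(ts), state) for ts, state in STATE_RE.findall(text)]


def latest_eviction(history):
    """Return the newest EVICTED timestamp, or None if there is none."""
    stamps = [ts for ts, state in history if state == "EVICTED"]
    return max(stamps) if stamps else None


def reconnect_intervals(history):
    """Return the seconds between consecutive DISCONN entries."""
    disc = [ts for ts, state in history if state == "DISCONN"]
    return [b - a for a, b in zip(disc, disc[1:])]


# First state_history block from this report, copied verbatim.
sample = """
state_history:
- [ 1726673245, DISCONN ]
- [ 1726673245, CONNECTING ]
- [ 1726673245, RECOVER ]
- [ 1726673245, FULL ]
- [ 1726673262, DISCONN ]
- [ 1726673262, CONNECTING ]
- [ 1726673262, RECOVER ]
- [ 1726673262, FULL ]
- [ 1726673279, DISCONN ]
- [ 1726673279, CONNECTING ]
- [ 1726673279, RECOVER ]
- [ 1726673279, FULL ]
- [ 1726673297, DISCONN ]
- [ 1726673297, CONNECTING ]
- [ 1726673297, RECOVER ]
- [ 1726673297, FULL ]
"""

before = 1726673307  # the "before:" value from the FAIL line
history = parse_state_history(sample)
evicted = latest_eviction(history)

print(evicted is not None and evicted > before)  # False: no EVICTED entry
print(reconnect_intervals(history))              # [17, 17, 18]
```

On this data the import reconnects every 17 to 18 seconds and returns to FULL each time, and the last recorded transition (1726673297) predates the "before" timestamp.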
Dumping lctl log to /autotest/autotest-1/2024-09-18/lustre-reviews_review-dne-part-5_107661_16_4f1e33f4-b320-40c7-8af0-801d1ab9dc56//recovery-small.test_10a.*.1726673420.log
CMD: trevis-24vm7,trevis-56vm1.trevis.whamcloud.com,trevis-56vm2,trevis-56vm3,trevis-83vm7 /usr/sbin/lctl dk > /autotest/autotest-1/2024-09-18/lustre-reviews_review-dne-part-5_107661_16_4f1e33f4-b320-40c7-8af0-801d1ab9dc56//recovery-small.test_10a.debug_log.\$(hostname -s).1726673420.log;
dmesg > /autotest/autotest-1/2024-09-18/lustre-reviews_review-dne-part-5_107661_16_4f1e33f4-b320-40c7-8af0-801d1ab9dc56//recovery-small.test_10a.dmesg.\$(hostname -s).1726673420.log
CMD: trevis-56vm1.trevis.whamcloud.com checkstat -v -p 0777 /mnt/lustre
/mnt/lustre has perms 0777 OK
CMD: trevis-83vm7 dmesg
[ 2314.424588] Lustre: mdt00_001: service thread pid 10192 was inactive for 42.474 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes:
CMD: trevis-83vm7 dmesg
[ 2383.543931] Lustre: mdt00_001: service thread pid 10192 completed after 111.597s. This likely indicates the system was overloaded (too many service threads, or not enough hardware resources).
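The two dmesg lines above can be cross-checked to confirm that they describe the same stall of the same service thread. The sketch below is only an illustration; the regular expression and the helper name are assumptions written against the two messages quoted here, not a general Lustre log parser.

```python
import re

# Written against the two watchdog messages quoted in this report.
LINE_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+Lustre: (?P<thread>\S+): service thread "
    r"pid (?P<pid>\d+) (?:was inactive for|completed after) "
    r"(?P<secs>\d+\.\d+)\s*s"
)


def parse_watchdog(line):
    """Extract uptime, thread name, pid and reported duration from one line."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "uptime": float(m["uptime"]),
        "thread": m["thread"],
        "pid": int(m["pid"]),
        "secs": float(m["secs"]),
    }


hung = parse_watchdog(
    "[ 2314.424588] Lustre: mdt00_001: service thread pid 10192 "
    "was inactive for 42.474 seconds."
)
done = parse_watchdog(
    "[ 2383.543931] Lustre: mdt00_001: service thread pid 10192 "
    "completed after 111.597s."
)

# Time between the two messages, measured two ways.
elapsed_by_uptime = done["uptime"] - hung["uptime"]
elapsed_by_report = done["secs"] - hung["secs"]

print(hung["pid"] == done["pid"])                        # True
print(abs(elapsed_by_uptime - elapsed_by_report) < 0.1)  # True
```

Both measurements give roughly 69 seconds between the warning and the completion, so the 111.597 s total is consistent with the thread that was first reported inactive at 42.474 s.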
Issue Links
- duplicates LU-15630 recovery-small test_10a: no eviction: before:1646723217 (Open)