<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<TITLE> [Mageia-sysadm] Jonund and Ecosse restarted...
</TITLE>
<LINK REL="Index" HREF="index.html" >
<LINK REL="made" HREF="mailto:mageia-sysadm%40mageia.org?Subject=Re%3A%20%5BMageia-sysadm%5D%20Jonund%20and%20Ecosse%20restarted...&In-Reply-To=%3C4E41587C.8030508%40mageia.org%3E">
<META NAME="robots" CONTENT="index,nofollow">
<META http-equiv="Content-Type" content="text/html; charset=us-ascii">
<LINK REL="Previous" HREF="003835.html">
<LINK REL="Next" HREF="003825.html">
</HEAD>
<BODY BGCOLOR="#ffffff">
<H1>[Mageia-sysadm] Jonund and Ecosse restarted...</H1>
<B>Thomas Backlund</B>
<A HREF="mailto:mageia-sysadm%40mageia.org?Subject=Re%3A%20%5BMageia-sysadm%5D%20Jonund%20and%20Ecosse%20restarted...&In-Reply-To=%3C4E41587C.8030508%40mageia.org%3E"
TITLE="[Mageia-sysadm] Jonund and Ecosse restarted...">tmb at mageia.org
</A><BR>
<I>Tue Aug 9 17:55:40 CEST 2011</I>
<P><UL>
<LI>Previous message: <A HREF="003835.html">[Mageia-sysadm] Jonund and Ecosse restarted...
</A></li>
<LI>Next message: <A HREF="003825.html">[Mageia-sysadm] Package removal request: gnu-free-mono-fonts
</A></li>
<LI> <B>Messages sorted by:</B>
<a href="date.html#3838">[ date ]</a>
<a href="thread.html#3838">[ thread ]</a>
<a href="subject.html#3838">[ subject ]</a>
<a href="author.html#3838">[ author ]</a>
</LI>
</UL>
<HR>
<!--beginarticle-->
<PRE>Pascal Terjan wrote on 9.8.2011 16:37:
><i> On Fri, Aug 5, 2011 at 00:33, Thomas Backlund &lt;<A HREF="https://www.mageia.org/mailman/listinfo/mageia-sysadm">tmb at mageia.org</A>&gt; wrote:
</I>>><i> Hi,
</I>>><i>
</I>>><i> Since both Jonund and Ecosse had dropped some of their build speed,
</I>>><i> I checked them out.
</I>>><i>
</I>>><i> Both had zombie rpmbuild processes, the oldest dating from about 8 days ago,
</I>>><i> slow disk I/O, and Ecosse hit ATA bus reset errors.
</I>>><i>
</I>>><i> So I restarted both to flush out the memory and re-init the disk
</I>>><i> controllers. Both are now running nicely again.
</I>><i>
</I>><i> ecosse is very, very slow, and looking at dmesg, it seems to have happened again:
</I>><i>
</I>><i> ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
</I>><i> ata3.00: failed command: FLUSH CACHE EXT
</I>><i> ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
</I>><i> res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
</I>><i> ata3.00: status: { DRDY }
</I>><i> ata3: soft resetting link
</I>><i> ata3.00: configured for UDMA/133
</I>><i> ata3.00: retrying FLUSH 0xea Emask 0x4
</I>><i> ata3: EH complete
</I>><i>
</I>><i> urpmi installs only a few packages per minute, spending most of its time
</I>><i> in D state, while nothing else is running on the machine
</I>
Hm,
this seems to be what starts the whole mess:

INFO: task jbd2/dm-0-8:633 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/dm-0-8 D 0000000000000002 0 633 2 0x00000000
ffff88022e3a7b40 0000000000000046 ffff88022e3a7b00 ffffffff8104d7ea
0000000000015840 0000000000000282 ffff88022e3a7af0 000000006c462645
ffff88022e3a7fd8 000000000000faf0 ffff88022f172da0 ffff88022dca16d0
Call Trace:
[&lt;ffffffff8104d7ea&gt;] ? try_to_wake_up+0x2da/0x410
[&lt;ffffffff81084609&gt;] ? ktime_get_ts+0xa9/0xe0
[&lt;ffffffff810e9680&gt;] ? sync_page+0x0/0x50
[&lt;ffffffff813c3b33&gt;] io_schedule+0x73/0xc0
[&lt;ffffffff810e96bd&gt;] sync_page+0x3d/0x50
[&lt;ffffffff813c418f&gt;] __wait_on_bit+0x5f/0x90
[&lt;ffffffff810e9843&gt;] wait_on_page_bit+0x73/0x80
[&lt;ffffffff8107a1d0&gt;] ? wake_bit_function+0x0/0x40
[&lt;ffffffff810f3bf5&gt;] ? pagevec_lookup_tag+0x25/0x40
[&lt;ffffffff810e9c5d&gt;] filemap_fdatawait_range+0x10d/0x1a0
[&lt;ffffffff810e9d1b&gt;] filemap_fdatawait+0x2b/0x30
[&lt;ffffffffa0008df5&gt;] jbd2_journal_commit_transaction+0x745/0x12f0 [jbd2]
[&lt;ffffffff8106b53b&gt;] ? try_to_del_timer_sync+0x7b/0xe0
[&lt;ffffffffa000f193&gt;] kjournald2+0xb3/0x200 [jbd2]
[&lt;ffffffff8107a190&gt;] ? autoremove_wake_function+0x0/0x40
[&lt;ffffffffa000f0e0&gt;] ? kjournald2+0x0/0x200 [jbd2]
[&lt;ffffffff81079c66&gt;] kthread+0x96/0xa0
[&lt;ffffffff8100ae24&gt;] kernel_thread_helper+0x4/0x10
[&lt;ffffffff81079bd0&gt;] ? kthread+0x0/0xa0
[&lt;ffffffff8100ae20&gt;] ? kernel_thread_helper+0x0/0x10

IIRC, this was fixed in newer kernels, but since Mandriva (mdv) still has
not released fixed kernels for 2010.1, I guess I'll either release a fixed
kernel myself for the Mageia build system (BS), or simply rebuild the
Mageia 1 kernel for it.

One other thing we could do for now is to run only one buildbot
until the kernel is fixed.
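
For anyone wanting to check whether the hang has recurred, here is a
minimal sketch that pulls hung-task reports out of saved dmesg text.
It is an illustration only, not something running on the build nodes;
the function name find_hung_tasks is made up for this example, and the
pattern assumes the message format shown in the report quoted above.

```python
import re

# Matches kernel hung-task reports such as:
#   INFO: task jbd2/dm-0-8:633 blocked for more than 120 seconds.
# Assumption: the message format is the one quoted in this thread.
HUNG_TASK = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")


def find_hung_tasks(log_text):
    """Return (task name, pid, seconds) for every hung-task report in log_text."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in HUNG_TASK.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = "INFO: task jbd2/dm-0-8:633 blocked for more than 120 seconds."
    print(find_hung_tasks(sample))
```

Feeding it the saved output of dmesg gives one tuple per report, which
makes it easy to see which tasks (here the jbd2 journal thread) get stuck.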
--
Thomas
</PRE>
<!--endarticle-->
<HR>
<hr>
<a href="https://www.mageia.org/mailman/listinfo/mageia-sysadm">More information about the Mageia-sysadm
mailing list</a><br>
</body></html>