Wednesday, April 04, 2007

EDS + Hibernate = 100% CPU

GW users have reported that Evolution hangs/suspends for a while when composing mail. We weren't able to reproduce the issue, and the respective gdb traces didn't give enough information. Lately, I started hibernating my laptop, and when I resumed from hibernation I noticed that EDS started taking 100% CPU. When I attached gdb to the process... voila! EDS was running with 264 threads: most of them were waiting to update the GW address book and the rest were waiting to update GW calendars.

A little further investigation revealed that the cause was the combination of g_timeout_add() and hibernation. GLib stores the last processed time for each g_timeout_add() source. On hibernation the memory image is saved, and on resume it is restored along with that stored time. After the restore, callbacks registered with g_timeout_add() get called, because the time that elapsed between hibernation and resume satisfies the g_timeout_add() timeout value, and not just once but in multiples of 100. Harinath (of Mono fame) helped me understand the GLib part, and the fixes have gone in for the GW and webcal providers.
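To make the effect concrete, here is a toy model of the behaviour described above. It is only a sketch under assumptions: it is not GLib's code and not the actual GW/webcal fix, and the function names, the 60-second interval and the 8-hour sleep are all made up for illustration. The first function fires once for every interval that elapsed since the stored time; the second fires at most once and re-arms relative to the current time.

```python
def fire_catching_up(last_fired, now, interval):
    """Toy model: fire once for every whole interval elapsed since last_fired.

    Returns (number of callbacks fired, new stored time).
    """
    fired = 0
    while now - last_fired >= interval:
        last_fired += interval
        fired += 1
    return fired, last_fired


def fire_coalesced(last_fired, now, interval):
    """Toy model: fire at most once, then re-arm relative to 'now'.

    Returns (number of callbacks fired, new stored time).
    """
    if now - last_fired >= interval:
        return 1, now
    return 0, last_fired


if __name__ == "__main__":
    interval = 60            # hypothetical refresh period, in seconds
    asleep = 8 * 60 * 60     # hypothetical 8 hours spent hibernated

    # The stored time (0) predates hibernation; 'asleep' is the clock on resume.
    print("catching up:", fire_catching_up(0, asleep, interval)[0])
    print("coalesced:  ", fire_coalesced(0, asleep, interval)[0])
```

In the first model a long sleep turns into hundreds of back-to-back callbacks. If each callback kicks off its own refresh, that would be consistent with the pile of waiting threads seen in gdb, though that link is my reading and not something the traces prove.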

6 comments:

rjamestaylor said...

I think this describes the "Evolution freezes at regular intervals" events I'm witnessing running Ubuntu 7.04, Evolution 2.10.1 while attaching to an Exchange 2003 server. At least the symptoms are the same... 100% CPU utilization, Evo cannot refresh GUI and becomes completely unresponsive for 5 to 10 seconds.

I cannot express how frustrating this "frozen app" experience is... Any way for a user to adjust the hibernation or whatever is freezing me out of my mailbox?

Thanks.

rjamestaylor said...

This is not pretty, but here's a snippet of strace from the evolution-2.2 process when the "freeze" occurs. (Oh, the freeze uses 100% of one of my desktop's 2 CPUs, not both.)

This poll / "Resource temporarily unavailable" / poll block repeats during the "freeze":

poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=8, events=POLLIN|POLLPRI}, {fd=10, events=POLLIN|POLLPRI}, {fd=24, events=POLLIN}, {fd=50, events=POLLIN}, {fd=52, events=POLLIN}, {fd=18, events=POLLIN}, {fd=20, events=POLLIN}, {fd=22, events=POLLIN}], 10, 964) = 0
gettimeofday({1182867077, 91181}, NULL) = 0
writev(11, [{"GIOP\1\2\1\0\204\0\0\0", 12}, {"\220~\371\277\3\0\0\0\0\0\0\0\34\0\0\0\0\0\0\0\33L\340"..., 132}], 2) = 144
futex(0x8088e58, FUTEX_WAKE, 1) = 1
gettimeofday({1182867077, 91703}, NULL) = 0
clock_gettime(CLOCK_REALTIME, {1182867077, 92013550}) = 0
futex(0x8090280, FUTEX_WAKE, 1) = 1
futex(0x8090284, FUTEX_WAIT, 5417, {29, 999689450}) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x8090138, FUTEX_WAKE, 1) = 0
writev(11, [{"GIOP\1\2\1\0\204\0\0\0", 12}, {"\220~\371\277\3\0\0\0\0\0\0\0\34\0\0\0\0\0\0\0\33L\340"..., 132}], 2) = 144
futex(0x8088e58, FUTEX_WAKE, 1) = 1
gettimeofday({1182867077, 92796}, NULL) = 0
futex(0x8090138, FUTEX_WAKE, 1) = 1
clock_gettime(CLOCK_REALTIME, {1182867077, 93229922}) = 0
futex(0x8090280, FUTEX_WAKE, 1) = 1
futex(0x8090284, FUTEX_WAIT, 5419, {29, 999566078}) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x8090138, FUTEX_WAKE, 1) = 0
ioctl(3, FIONREAD, [0]) = 0
gettimeofday({1182867077, 93603}, NULL) = 0

Veerapuram Varadhan said...

@rjamestaylor: Thanks for your comment. Can you get me a gdb trace of the process when it freezes? This issue was with the GW provider. A freeze of 5 to 10 seconds seems like some other issue to me.

rjamestaylor said...

Sorry for the long interval between posts -- I just now saw your request for the gdb output. Unfortunately, I have long since switched from Evolution to Thunderbird and am using Ubuntu 7.10 now as well.

osma said...

It's weird -- I thought I'd seen the last of the evolution freezes when I upgraded to F8 a couple of months ago (test release at the time), but no -- they're back. It's a 50% chance roughly that evolution freezes up upon resume, and I see this futex() call not returning problem in both evolution and e-d-s. I was Googling around and this post was the only thing I saw that discussed the problem. But 2.12 shouldn't have it any more?!

Veerapuram Varadhan said...

@osma: This fix was only for the GW provider, and your issue seems to be different, unless you are using the GW provider. Can you send a mail to evolution-list@gnome.org with a stack trace from when it hangs? Stack traces of both eds and evolution will be required to fix the issue. TIA.