Outage Log

12th January 2010

System accidentally rebooted by Steve, who was not paying attention to which window he typed "halt" into. I've installed molly-guard now.

19th December 2009

The same OOM problem again. Even with the 2.6.27.41 kernel and a newer kvm the host ran out of memory. I've now moved forward to the 2.6.32 kernel release and disabled the use of VNC.

10th December 2009

The same OOM condition occurred ahead of my planned reboot for tomorrow.

All guests were shut down cleanly and the host system was rebooted into a more recent kernel + kvm pair. The packages I installed are public.

9th December 2009

The host machine started giving OOM errors, and killing guests.

This was "solved" via a reboot of the host machine. I've also upgraded the version of kvm installed upon the host - in case the OOM was caused by memory leaks in the virtio layer.

Not all guests have been migrated yet, so I've suspended the migration until we can see whether the problem recurs. All the memory has already passed memtest, and it isn't yet obvious what the cause was. Perhaps a leak somewhere.
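
If a leak is the suspect, one way to look for it is to record the host's memory figures at intervals and watch for a steady decline. What follows is only a minimal sketch in Python, assuming the usual /proc/meminfo layout of "Name:   value kB" lines. The function names are hypothetical and are not part of anything installed on the host.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer.

    Values are returned as given in the file (kB for most fields).
    Lines that do not look like "Name: number ..." are skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # no colon, not a meminfo field
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file; the path is the standard Linux location."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Run periodically (from cron, say) and appended to a file with a timestamp, the MemFree value from read_meminfo() would show whether free memory on the host shrinks between reboots while the set of running guests stays the same.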