Derek Atkins
2011-Apr-05 14:27 UTC
ANNOUNCE: Unplanned Server Outage overnight last night due to power outage
Good morning everyone,

Atlanta got hit by some major storms last night, which caused a significant power loss starting around 11pm US/EDT. Power stayed down long enough that my UPSes expired. Power came back and dropped out again a few times over the course of the early morning, and came back completely around 1:30am. Unfortunately the VM server hosting code did not.

At around 4:30am I woke up, came down to the server, determined the issue, and was able to recover the system relatively quickly. I took the opportunity to update some software and reboot the system cleanly again to test it all. The final reboot happened around 5:45am and the VMs came back up shortly thereafter. All services should now be running normally.

I'm sorry for any inconvenience this outage may have caused.

For those who care about the gritty details, the failure was due to the way I added additional disk space to the VM server. When I added the disks I used Linux software raid; however, when I created the array I used md-raid metadata version 1.2. Unfortunately the running kernel was unable to build the full logical volume due to this "mistake".

I was able to correct the immediate issue by upgrading the kernel. This allows the new raid to load later; however, it now loads as md127 instead of md3, but that doesn't significantly affect the operation of the server. Upgrading the kernel included an upgraded vmware-server patch that will hopefully work better than the previous vmware-server patch I had tried.

If the system remains stable with the new kernel then all will be good. However, if the server reverts to the disk I/O issues that I observed before, then I might have to find new ways to work around the raid metadata issue, which may involve upgrading to grub2. Hopefully it won't come to that. Please keep your fingers crossed.
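For anyone curious how to tell which metadata version an md array uses, here is a minimal Python sketch that reads text in the /proc/mdstat format. The sample text and the function name md_metadata_versions are made up for illustration and are not taken from this server. The sketch assumes that an array without a "super X.Y" field on its status line uses the older 0.90 format, which /proc/mdstat does not label explicitly.

```python
import re

# Hypothetical /proc/mdstat content, for illustration only.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md127 : active raid1 sdc1[0] sdd1[1]
      976630336 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sda1[0] sdb1[1]
      104320 blocks [2/2] [UU]

unused devices: <none>
"""


def md_metadata_versions(mdstat_text):
    """Map each md device name (e.g. "md127") to its metadata version.

    Assumption: a status line with no "super X.Y" field means the
    array uses the old 0.90 metadata format.
    """
    versions = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            # Start of a new array entry; default to 0.90 until told otherwise.
            current = header.group(1)
            versions[current] = "0.90"
            continue
        if current and "blocks" in line:
            found = re.search(r"\bsuper\s+(\S+)", line)
            if found:
                versions[current] = found.group(1)
            current = None
    return versions


if __name__ == "__main__":
    for name, version in sorted(md_metadata_versions(SAMPLE_MDSTAT).items()):
        print(name, version)
```

On a real system the input would come from reading /proc/mdstat instead of the sample string. Running mdadm --detail on the device also reports the version.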
-derek

--
Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
Member, MIT Student Information Processing Board (SIPB)
URL: http://web.mit.edu/warlord/ PP-ASEL-IA N1NWH
warlord at MIT.EDU PGP key available