The problem turned out to be something other than a hardware failure. The good news is I have set up a completely new dedicated server with a completely new company (who will be managing updates and security for me). The bad news is, I lost ALL the data that was on the original server.
I do have local backups of web sites I did, but I lost databases, and people using webmail lost saved email. Clients who were doing their own web sites may have lost their data if they werent backing it up locally.
Brief recap of events:
1. Server was hacked at the end of July. I hired SeeksAdmin.com to go in and clean up the mess, patch everything, and lock it all down. Everything was great until the server somehow got rebooted (I had nothing to do with it), and it didn’t come back up. According to 1and1, my server provider, the machine was stuck booting up because it couldn’t load the kernel. They couldn’t select the previous kernel because SeeksAdmin had locked down lilo, the bootloader. I know SeeksAdmin had mentioned they had problems with the new kernel working, but they claimed they rebooted the machine multiple times and rolled it back to the older version. I can’t prove or disprove that, but the circumstances are a bit concerning.
2. I was 99% sure that my backups were being performed by 1and1, and that all was OK when it came down to re-imaging the hard drive. Unfortunately, 1and1 locks out the backup FTP server from being accessed except from your dedicated server. So, I had no way of verifying that the backups were OK since the server wouldn’t boot and I couldn’t log in to look at the backups.
3. SeeksAdmin re-imaged the server for me, and after getting it back up, I logged in and FTP’d to the backup server to check things out. Nothing was there. I was stunned, and I was very angry.
When all was said and done, I was left wondering what had happened. There is really no one person/company to blame, rather, a bunch of bad things conspired from different places to screw me over and cause a large nightmare for all the clients I had hosted on that server. Had 1and1 been doing the backups, which they were supposed to be doing, all would have been OK. But then, it seems the system had become unstable since SeeksAdmin had gone in to do their work, so I wouldn’t have run into the problem if I hadn’t hired them. But then, if I never was hacked in the first place, none of this would have happened in the first place.
The silver lining to all of this is that I had been itching to leave 1and1 for quite some time, as they are the Wal-mart of web hosting. I was stuck with them because the task of moving all my clients was just too time consuming to think about. The new company I settled with is all about service, and being supportive of their clients. They are smaller, personal, and responsive. I also got away from having to use Plesk, and am now happily setting up all the sites in WHM and CPanel.