Montreal Tech Watch

[Photo: fire near the iWeb-CL data center] Last night was a test for the iWeb Technologies support team. A fire broke out near their iWeb-CL data center, and as a result the power supply became spotty. In such a scenario, to protect hosted servers from a sudden power outage, the generators are started and iWeb waits for Hydro-Québec to restore regular electricity.

iWeb was especially unlucky, though. All generators started as expected, but one transfer switch failed to switch power over to one of the 3 generators, so a while later a whole segment housing 3,000 dedicated servers lost power.

It’s a web hosting company’s worst-case scenario. In less than a second, thousands of websites, web applications, services, databases and e-commerce sites are down. The support team must then race to restart all servers while making sure the restart process doesn’t overload the power infrastructure. Most servers are hopefully back a few minutes later, but the damage has already been done: data has been lost, software was stopped abruptly, and hard drives might have been corrupted by the sudden power loss. Many websites are still not back online at that point, since most of the time a sysadmin must log in to the server and start the web server and other services such as databases.
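To make the "restart without overloading the power infrastructure" step concrete, here is a minimal sketch of a staggered restart. It is not iWeb's actual procedure: the function names, the batch size and the pause between waves are all assumptions chosen for illustration, and `power_on` stands in for whatever remote power control a given data center really uses.

```python
import time
from typing import Callable, Iterator, List, Sequence


def plan_restart_batches(servers: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield servers in fixed-size batches so they are not all powered on at once."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(servers), batch_size):
        yield list(servers[start:start + batch_size])


def restart_in_waves(
    servers: Sequence[str],
    batch_size: int,
    power_on: Callable[[str], None],
    pause_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Power servers on wave by wave, pausing between waves.

    `power_on` is a hypothetical callback (for example a call to a remote
    power switch). Returns the number of waves that were started.
    """
    waves = 0
    for batch in plan_restart_batches(servers, batch_size):
        if waves > 0:
            # Let the previous wave's start-up power draw settle first.
            sleep(pause_seconds)
        for server in batch:
            power_on(server)
        waves += 1
    return waves
```

In this sketch the batch size is the knob that trades recovery time against peak load: smaller waves are gentler on the power infrastructure but keep customers offline longer.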

The incident is a reminder that while software plays the central role in tech, hardware and networking are still critical components.

As for iWeb and hosting at iWeb, the incident raises several questions.

First, could this have been prevented? As far as I know, no data center in the world is free from accidents. If you are a sysadmin, you just have to assume that a server WILL shut down one day, due to faulty networking, a power outage, a disk failure and so on. In the case of iWeb, I know they run weekly stress tests, all equipment, power supply and networking are redundant, and server components are tested before being delivered to customers. Even if your name is Google, Apple or Amazon, you can’t have a fault-free data center, and even solutions like cloud computing can’t prevent this. You can keep an image backup, but there will still be an outage during the time required to transfer the image and provision another instance in another data center. As far as I know, the only way to get close to 100% risk-free hosting is to buy hosting from 2 different providers and use one of them as a disaster recovery backup. An Internet-based business must then architect its hosting infrastructure around those probable scenarios.
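As a rough illustration of why a second provider helps, here is a small availability calculation. It assumes that outages at the two providers are independent and that failover is instantaneous, both of which are simplifications; the 99% figure in the usage note is a made-up example, not a number published by iWeb or anyone else.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(first: float, second: float) -> float:
    """Availability of a setup that is up whenever at least one provider is up.

    Assumes the two providers fail independently of each other.
    """
    for value in (first, second):
        if not 0.0 <= value <= 1.0:
            raise ValueError("availability must be between 0 and 1")
    # The setup is down only when both providers are down at the same time.
    return 1.0 - (1.0 - first) * (1.0 - second)


def expected_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per (non-leap) year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR
```

Under these assumptions, two providers at 99% each give about 99.99% combined, which cuts expected downtime from roughly 88 hours a year to under one hour. The result still never reaches exactly 100%, and real failover adds the transfer and provisioning time mentioned above.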

Second, in the case of iWeb, you can see lots of drama here: more than 500 comments in a few hours. Lots of angry customers, a few supporting iWeb, lots of impatience and so on. The iWeb support team was there though, which is good, with people like “Sylvain – iWeb” replying to each customer. I do wonder why there wasn’t any activity on iWeb’s Twitter or Facebook accounts. Those are 2 media you absolutely have to update, monitor and use to communicate. A frustrated customer on Twitter can do lots of damage, and that’s probably another customer you will have to phone in the next few days to win them back.

I shall leave the conclusion of the story to iWeb’s customers. In times of crisis like this, customers who were already looking to switch providers will do so in the next couple of days. Other customers, who feel that iWeb had a responsive and professional support team, will stay. It’s a test for iWeb, and one they didn’t need, since they were just launching the new iWeb SmartServers.

Photo: Photo-Media.ca

Comments

  • Stéphane Jose November 04, 2010

    Thanks Heri for your input. Please note that the picture shown is not iWeb’s building burning. The iWeb-CL facility was not physically affected by the fire. The unexpected power outage lasted about one hour in all and affected only a portion of the 12,000 servers hosted in CL. iWeb is currently working closely with the clients who need extra assistance with their servers (some of which needed manual intervention to complete their reboot). ^SJ.

  • Stéphane Jose November 04, 2010

    Also note that the 3,000 servers mentioned here were down only briefly. The current count of unavailable servers is now very low, all things considered (a little over 100), but we won’t give up until everybody’s server is back online.

  • Montreal Tech Watch November 04, 2010

    Fire starts near data center; 3000 servers down http://bit.ly/9A5xgI

  • François Maillet November 04, 2010

    That's where I'm hosted. Mine didn't go down phewf RT @mtw: Fire starts near data center; 3000 servers down #montreal http://bit.ly/bt1KQI

  • Canada Tech Eqentia November 04, 2010

    Fire starts near data center; 3000 servers down http://eqent.me/cZMHVd

  • CVCA Team November 04, 2010

    Fire starts near data center; 3000 servers down http://eqent.me/cZMHVd

  • Gabriel Dancause November 04, 2010

    There was a fire near @iWeb yesterday. That explains the outage. http://ow.ly/34Gyi

  • Denis Canuel November 05, 2010

    Disaster recovery and business continuity are not exact sciences. You can’t be ready for each and every possibility out there. You analyze and evaluate risk, then normally make decisions based on how much money you can afford to lose during that downtime.

    Trading companies and banks can’t afford to have long downtime so they invest millions to stay up at all times.

    I think for what people pay, iWeb gives them a pretty impressive architecture. Try duplicating it in your basement and it will cost you a lot more than $10 a month.

    Sure, we can complain all day if we want, but in the end, who didn’t do their due diligence? It doesn’t matter whose fault it was. If you need 100% uptime, design a proper solution.

  • Heri November 05, 2010

    Well yes, I was thinking about business continuity.

    And yes, iWeb invests a lot of time in network and power redundancy, as well as regular testing of their infrastructure. But in the end, no hosting company can guarantee 100% uptime, and it’s up to the buyer to design a business continuity solution (which in my mind means having servers at 2 different data centers).
