Montreal Tech Watch

Back!

By Heri Jul 17th 2008 in Design

For those who were not aware, MontrealTechWatch was down from early yesterday morning until 8:00 pm today, July 17th.

It might seem normal, in the order of things, that the server comes back; for most sysadmins, it’s just a matter of opening a ticket and tech support restarts the whole thing somehow. But this time was radically different. Just a few hours ago, the server was considered unrecoverable *sweats*, and with it the databases **shivers** plus all the files generated over the past 2 years ***faints***. We tried one last hack, which miraculously worked.

For those curious about the technical details, this server hosts many websites and services. It hosts, for instance, a RoR site (graciously hosted since it’s a friend’s) plus another experiment, using Phusion Passenger. I discovered that mod_rails has a big memory problem and leaves dead processes around, which I intended to solve by writing a god-like Ruby script that would kill and clean up processes, even when the parent process was defunct and couldn’t be killed. Fast-forward to yesterday morning: this script ran the system command kill -9 1 … with the script owned by the root user … which is the equivalent of shooting yourself in the head … while jumping from a plane at 30,000 feet. XenServer couldn’t even restart, restore snapshot backups, relaunch, or be set up again, and all files and databases were deemed lost and inaccessible.
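
For the curious, here is a minimal sketch of the kind of guard that would have stopped this. It is not the original script; the names `safe_to_kill?` and `kill_processes` are hypothetical, and the only real API used is Ruby’s `Process.kill`, `Process.pid` and `Process.ppid`. One plausible way to end up at kill -9 1 (an assumption, the script itself isn’t shown here) is that orphaned processes get re-parented to PID 1, so "kill the parent" quietly turns into "kill init".

```ruby
# Hypothetical guard for a process-cleanup script (illustrative names only).

# Returns true only for a PID that is an integer greater than 1 and is neither
# this script nor its parent. PID 1 is init; 0 and negative values address
# process groups (or every process), so they are refused as well.
def safe_to_kill?(pid, own_pid: Process.pid, parent_pid: Process.ppid)
  return false unless pid.is_a?(Integer)
  return false if pid <= 1
  return false if pid == own_pid || pid == parent_pid
  true
end

# Sends `signal` to every PID that passes the guard and returns the PIDs that
# were skipped. `killer` is injectable so the logic can be dry-run first.
def kill_processes(pids, signal: "TERM", killer: Process.method(:kill))
  skipped = []
  pids.each do |pid|
    unless safe_to_kill?(pid)
      skipped << pid
      next
    end
    begin
      killer.call(signal, pid)
    rescue Errno::ESRCH, Errno::EPERM
      # Process already gone, or not ours to signal; nothing more to do.
    end
  end
  skipped
end
```

Passing a lambda as `killer` that only logs its arguments gives a dry run, which is the sort of thing worth doing on a sandbox before letting root loose on a production box.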

MTW is taken very seriously and I know how important it is; this should never happen again. There’s one thing to blame here: trying out experimental scripts on a production server. If this were a company, I would have fired the Linux idiot who wrote the script. Oh wait… Anyway, thanks to everyone who was there; it’s much appreciated. I’ll look into getting an additional resource as a sandbox and a bulletproof environment for MTW.

Comments

  • Denis Canuel July 18, 2008

    Yeah, who hired that Linux guy..?? ;)
    Glad you’re back! That was a close one. I’ll take some time myself to make sure that I have an offline backup as well…

  • Mark MacLeod July 18, 2008

    Welcome back, Heri! We were jonesing without our daily dose of MTW

  • Heri July 18, 2008

    Thanks, guys,

    Hopefully will catch up and publish a couple more articles

  • Ed July 18, 2008

    We spotted something similar with v2.0.1 of Passenger. After debugging it with my colleagues and the modrails developers, they released two patches in v2.0.2 that solved the issue for us. Before the patch, our installation was eating all memory address space and eventually died. Consider upgrading?

    Glad to hear things are back!

  • Smiling Triton July 18, 2008

    Glad you’re back, your site is indeed very useful.
    SmTt

  • Mehdi Akiki July 20, 2008

    Welcome back Heri!

  • Fred Brunel July 22, 2008

    Glad MTW is back. That kind of stuff has happened to the very best lately.
