Conner's Family


Forum Outage

This time it wasn't the cable company nor the power.. :sigh:

One of the two hard drives in the server has developed a series of bad sectors.. Friday morning (around 3am ET) the file journaling system noticed them and, as an automatic safety measure, put that drive into read-only mode. Unfortunately, the MySQL database that runs these forums (and quite a few other things on here) became unusable, but because I was still out of town, I couldn't safely reboot/repair anything until I got back home tonight. When I finally did get home, I rebooted the server (at or about 9pm), but it required manually running something called fsck three times to "fix" the drive enough to get the system back up.. this process took almost four hours (I got it back up at or about 12:45am, do the math yourself..) so the mud was down along with the mud's forums, my family site, etc (everything I host) for all of that time and we didn't have any internet available to us to let anyone know by any other means either. I offer my apologies for any confusion or inconvenience this may have caused anyone, but it really was unavoidable.
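For anyone who would rather not do the math by hand, here is a minimal sketch of the repair-window arithmetic. The post only gives clock times (reboot at about 9pm, back up at about 12:45am), so the calendar dates below are placeholders picked for illustration, and `format_duration` is a made-up helper name.

```python
from datetime import datetime, timedelta

# Clock times come from the post; the calendar dates are placeholder
# assumptions, chosen only so the window crosses midnight.
reboot = datetime(2007, 10, 7, 21, 0)    # "at or about 9pm"
back_up = datetime(2007, 10, 8, 0, 45)   # "at or about 12:45am"

repair_window = back_up - reboot


def format_duration(delta: timedelta) -> str:
    """Render a timedelta as hours and minutes (hypothetical helper)."""
    total_minutes = int(delta.total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes:02d}m"


print(format_duration(repair_window))  # prints "3h 45m"
```

That works out to a quarter hour short of four hours, which matches the "almost four hours" above.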

At some time in the relatively near future, I will have to bring everything down again to replace that hard drive, though I'm still not entirely giving up on the idea of replacing the server itself instead as the hard drive is no older than the server and both are due for replacement anyway.

Posted by: Conner Oct 8, 2007, 8:56 am
Comments:
Posted by: Samson On Oct 13, 2007, 9:07 pm
Hopefully by "near future" you mean within the next few days. Don't play around with bad sectors, especially in Linux. They have a tendency to produce Arthmoor style collapses in very short order. They could build up to the point where data recovery is no longer possible.

You mentioned needing to run fsck three times. That's twice too many in my book. One time should have been enough, but needing two more runs means your drive is on the brink of death. I've been down that road before.

Posted by: Conner On Oct 13, 2007, 11:56 pm
:sigh: Alas, I know that you're right, and I am making appropriate back-ups accordingly, but I was really hoping this beastie would serve me a little longer than the "next few days" yet. Personally, I take having to manually run fsck even once as a very bad sign, but having sporadic income makes planning stuff like system upgrades a bit hairy. :(

Posted by: Samson On Oct 15, 2007, 5:32 am
Try planning for emergency hardware replacements when the server is down and you had other plans for the money. It sucks, I know, but if you want the data to survive you need to act. I didn't even get the courtesy of a warning from the hardware :(

Posted by: Conner On Oct 16, 2007, 4:28 am
I know, I really do.. in this case though it's more a matter of finding the funds, not even just reallocating them. :( ..as I said though, I am making backups accordingly, and I suppose I do have a hard drive or two sitting in some other older boxes around here that I could scavenge from, I just really wanted to build a whole new one already rather than piecemeal upgrade my existing server for a change. :sigh:

Posted by: Conner On Oct 29, 2007, 6:09 pm
This time it was Comcast.. Last night their service for us was less than Comcastic...

Alas, Comcast, according to their representatives at technical support, is doing some emergency maintenance this morning. At 12:50 AM ET, we lost internet; I contacted Comcast at 12:57 (after spending 5 minutes on hold and in their phone menu) and was told that service was expected to be restored within the hour. ...our internet seems to finally have been restored around 11:00 AM ET, so much for their "one hour emergency maintenance". :(

On a positive note, while the internet was down for us, I did a full server backup (and transferred the resultant 28.7 GB tarball to another computer on the network) and rebooted the server a couple of days early (I usually reboot the server the first of each month).
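The post doesn't say how the tarball was made, so the following is only a rough sketch of one way to script that kind of backup, using Python's standard `tarfile` module. The function names and the idea of date-stamping the archive name are assumptions for illustration, not a description of the actual setup here.

```python
import tarfile
from datetime import date
from pathlib import Path


def make_backup(source_dir: str, dest_dir: str) -> Path:
    """Pack source_dir into a date-stamped .tar.gz under dest_dir.

    Hypothetical helper: the naming scheme is an assumption.
    """
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / f"{source.name}-{date.today().isoformat()}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the source folder
        tar.add(source, arcname=source.name)
    return archive


def verify_backup(archive: Path) -> int:
    """Re-open the archive and return how many entries it holds."""
    with tarfile.open(archive, "r:gz") as tar:
        return len(tar.getmembers())
```

Calling `make_backup` with a source folder and a destination folder returns the path of the new archive, which could then be copied to another machine on the network. Re-opening it with `verify_backup` is a cheap sanity check that the file is at least readable before trusting it.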

Posted by: Conner On Jan 4, 2008, 1:13 am
Well, folks, once again Comcast has done their magic and left us without internet for about an hour, so far there's no word on why, but at least it appears we're back up. Hopefully the lag we've been suffering the last couple of days will have gone away with the outage finally.

Posted by: Conner On Jan 31, 2008, 9:10 am
Sorry, meant to post this back on the 28th when I posted it for the MUD:

Sorry that I'm a bit late on posting this one, folks, it's been a very busy weekend and I didn't want to say anything before that because I didn't know for certain what the status really was. Back on the 19th Comcast had an "outage" that actually amounted to a variable signal loss of up to 95% (as measured by Comcast's technical support representative), I called them 15 minutes into the problem and was told that a service repair would be scheduled for the earliest possible date of the 29th but that they already had people working on the problem from their end in the interim so it might be resolved sooner. From that time, we continued to have intermittent full outages and really bad lag through the 23rd when it seemed to have more or less cleared up. Finally, on the 25th, Comcast called me to let me know the problem was resolved and they apologized for the inconvenience but didn't offer a credit for the outage nor an explanation of the problem.

I suppose I really need to call them back to see if I can at least get a credit for the four days we suffered through, if not for the full six they seem to feel we should've suffered through.

Posted by: Conner On Jan 31, 2008, 9:19 am
Once again, the network will be unreachable from the outside, but this time it will be unreachable from the inside too and Comcast won't be to blame. Normally each month, on the morning of the first, I reboot the server, but this month, I'm doing it a day early because Dragona and I got new desks and in order to move everything from the old ones (I bought my desk back in 2000 and it barely managed to survive through five household moves plus four additional in-house rearrangements.. they needed replacement) I need to bring everything offline for what I'm choosing to call semi-scheduled maintenance. Anyway, I expect that this means that the mud, bbs, all my websites, and all the muds/websites that I host will be unreachable from approximately 4:30 AM EST until approximately 6:30 AM EST - if things go well, that might mean we'll be back sooner, if they don't, it might mean we'll be down longer. In any event, thanks for whatever patience you're able to spare me through this (and any positive wishes as well) and sorry for any inconvenience this may cause anyone.

Posted by: Conner On Feb 1, 2008, 4:59 am
Well, that was quite the experience, and we were down MUCH longer than I'd expected or hoped (apologies to all), but we're back up now and everything appears to be working reasonably well.

Posted by: Conner On Feb 29, 2008, 11:34 pm
Well, this time Comcast isn't saying what happened that interrupted our service, but they are at least admitting that the outage that ran from 4:09pm until 6:28pm (at least it seems to be back up now) did happen. Sorry for any inconvenience. :(

Never mind, apparently it wasn't really back yet. I'll let you all know as soon as I can once it does come back up and stays up for long enough for us to verify that it's for real.

Ok, as of 7:55pm, it appears that we really do finally have internet again. Yay! :)

..or, maybe as of 8:14pm it will be for real this time? :sigh:

Posted by: Conner On Jun 19, 2008, 1:15 am
Many of you may have noticed that the last three weeks have been a bit rough for us, we've experienced intermittent lag of very notable degrees. Last night, from about 12:30am - 2:15am EST we had a complete outage and I contacted Comcast about it and was told they were doing scheduled maintenance. (I really wish they'd let me know where they schedule these maintenances so I could be aware of them ahead of time... :bs:) This morning, the problem was so much worse that I contacted them several more times throughout the day to try to get the problem resolved. Finally, they have admitted that we've tested and ruled out every possible cause within our site and therefore they have agreed to send a technician on-site this Friday afternoon (6/20/08) to see if (s)he can resolve the issue from outside as they feel that it is a hardware issue that cannot be resolved remotely. I will add additional notes here when I know more to share with you. We deeply apologize for any inconvenience this may cause and very much appreciate any understanding/patience/support you can manage to muster for us.

Posted by: Conner On Jun 21, 2008, 6:25 pm
Well, Comcast's technician was here yesterday, as scheduled, for quite a while so he could replace all our cables and splitters (one at a time) and finally it was decided that the next step is to have "maintenance check the hard line" which basically means they've scheduled the actual network admins to check the feed for the whole neighborhood. This was supposed to happen either some time last night or some time this coming Monday, since today the connection seems to feel about the same as it has the last few days (more or less, since the problem is somewhat sporadic), I'm guessing that they decided to deal with it on Monday. Wish I had more to tell y'all, but that's the update for now. On the bright side, we have all new cabling and all new splitters here now, we just haven't replaced the cable modem itself nor the server itself yet, but we've also determined, officially, that the problem isn't my fault or even at my end directly.

Posted by: Conner On Jun 23, 2008, 8:01 pm
Comcast won't admit that they've actually done anything (thus it couldn't have been their problem) but their people must've done something because as of this afternoon, everything feels back to normal-ish to us and all our pings seem to be returning at decent speeds again.

Posted by: Conner On Jul 16, 2008, 12:29 am
Yet once again.. (Sorry I didn't post last night, I just had so many other things to catch up on first...) Comcast hath struck!

At or about 12:15 AM EST, we lost internet completely. I called Comcast at 12:25 (gave them a few in case it was something really minor) and waited on hold for ten minutes to talk to a representative who spent less than a minute with me to advise that their technicians were performing scheduled maintenance and I could expect to be without internet until at least 6:00 AM and that she was sorry that they hadn't notified me ahead of time but the terms of my service don't require them to do so. :no:



Thankfully, before I could even really get myself worked up to nearly as upset as this should've gotten me, internet was restored at or about 12:45 AM and seems to have been mostly stable and normal since.


:sb: I know that I really should either not complain or change providers but changing providers is a huge hassle (and my choices are rather limited given the rural area I live in) and would invariably result in a price increase. My main complaint though is simply this: would it really be so difficult and unreasonable for them to shoot an email to their customers or post to a web site that we, the customers, can access so we'd be aware of when they have scheduled maintenance?? :(

Posted by: Conner On Aug 7, 2008, 12:51 am
Apologies to anyone affected by today's outage, but it appears to finally have concluded. For the record, here's what went down as far as we know:

Internet went down for us at 5:39 PM EST.

Called Comcast at 5:56 PM EST and spoke to Maurice who couldn't do much because their provisioning system was down but he scheduled a technician to come visit tomorrow afternoon.

Called Comcast again at 6:23 PM EST and spoke to Joshua in hopes of getting more/better information, he confirmed that it was an area outage and filed the trouble report to activate their engineers. Existing service call is still scheduled but most likely unneeded and there is no ETA on the resolution of the problem nor, at this point, any indication as to what the problem really is. Joshua's speculation is that most likely a vehicle hit the wrong telephone pole and took out our area's cables.

At 7:39 PM EST our internet appears to be live again.

Posted by: Conner On Oct 14, 2008, 4:52 am
Sorry about this weekend, folks, looks like we goofed... seems that when we were disconnecting our laptops and such to leave Friday morning, one of us accidentally tapped the network cable between the firewall server and the cable modem without knowing it. The result is that our network had no internet from sometime Friday through about an hour ago when we got home and I was able to do something about it. (While we were gone, we did try repeatedly to see if it was some sort of problem we could resolve remotely, but alas, we were ~400 miles away and couldn't plug a cable back in until we were able to get back.. though you'd be amazed at what we did try because we didn't know what the problem really was.)

Posted by: Conner On Oct 26, 2009, 7:11 pm
I know that it looks great seeing that this is the first time I'm having to add an update to this thread in over a year, but it's not really because we've had no down time in that period, it's because I'd been lazy about updating this for each minor outage (lasted only a few minutes or so) since then. Unfortunately, last night we got hit by a storm that knocked out our power for a few seconds several times and once for about a minute and our satellite services completely. The TV came back within a few minutes each time, but the internet seemed to just stay unable to get good enough signal to come back so we went to bed assuming it'd be back by morning and we'd deal with our own internet stuff then. When we woke up it was still out but also still raining.. about half an hour or so ago Dragona went into our office to check on things and told me our satellite modem and three of our 8 computers in there seemed to be powerless so I checked. Still not sure what happened to the UPS powering those three, but it appears to have its battery completely drained and was powered off itself. I've got everything back up now, but I don't know if that UPS will recharge itself, as it should, or if I'll be needing to come up with some money to replace it somehow yet. Seems like there's always something (expensive) going on these days. :sigh:

Posted by: Conner On Oct 27, 2009, 1:13 am
Just a bit of follow-up, I contacted APC's tech support about the UPS I mentioned in the last post and they have determined that the batteries had some sort of issue while the unit was still in warranty so they're sending me a new set of batteries to replace the current one, entirely at their expense. This means that in 3-5 days everything will be back to normal again but in the meanwhile if weather steals our power again it's all without battery backup so yet more downtime. On a positive note, since the UPS uses hot-swappable batteries, there will be no need for "scheduled downtime" in order to replace the batteries when they arrive.

Posted by: Samson On Oct 31, 2009, 5:21 am
APC is sending you replacements, just like that? I've never had luck with those goofballs at all. They always tell me I need to buy new batteries and have always tried to goad me into buying whole new UPSs when things break.

Posted by: Conner On Oct 31, 2009, 8:53 am
After trying to troubleshoot it myself for a good twenty minutes, I called APC directly and spent another hour (roughly) on the phone with one of their techs who kept stating "please hold for at least three minutes while I check on that" but finally deemed, after walking me through several troubleshooting steps (most of which I'd already tried and which involved disconnecting the battery and unplugging the UPS then replugging in the UPS and then reconnecting the battery), that my UPS was still under warranty and its battery pack was bad so they were going to send me a new one with an RMA to return the old one on their tab even for shipping. :shrug:

On the negative side, he also told me that I was looking at 3-5 days until the new battery pack would arrive and, while I did get a confirmation email within a few minutes after the call that the order had been sent to their warehouse, I still have not gotten an email confirming that it's been shipped yet (which their initial email said I'd be getting) and it has not arrived either. :sigh:

APC's Initial Email sent Mon, Oct 26, 2009 at 16:14 said:
This email confirms your Service Order # 1-1267373630. You will get another email to confirm shipment that will contain tracking information.

Obviously there was a bit more information in the email, along with lovely APC logos and such, but.. on a positive side, it seems that APC is the very first company to be able to tell us what our zip+4 is here since the mailing address here was newly created for us and added to the 911/postal service databases when we bought the place. Even the USPS hadn't been able to do that before this. :D

(I'd paste the whole email here, but it's got my real name and our full street address and the guy's name and phone number with extension who I spoke to in it. My own information I don't mind sharing with those who want to know badly enough to ask, but... )
