Author Topic: Another day, another crash  (Read 497 times)

Freehaven

  • LOLS AND DONGS WHOLESALE
  • ***
  • Posts: 323
  • E-points: +12/-28
Another day, another crash
« on: December 19, 2011, 01:41:20 am »
It appears that the database has crashed and is currently in the process of recovery.
This is not something that should be happening, so an investigation into why it took place is underway. No ETA yet.

Yak, once again, shows us that nobody at FA has a goddamned clue what's going on with FA. The site's been slow-to-unreachable in the early morning hours for the past week or so, and you'd think they'd have looked into things because of that.

Conan

  • Sean Piche Wannabe Club
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 603
  • E-points: +33/-9
  • ¯\(°_o)/¯
Re: Another day, another crash
« Reply #1 on: December 19, 2011, 02:30:36 am »
Quote from: Yak
What was originally considered a hardware failure turned out to be:
Quote
Dec 19 03:25:40 novastorm kernel: swap_pager_getswapspace(16): failed
Dec 19 03:25:52 novastorm kernel: pid 90279 (mysqld), uid 88, was killed: out of swap space

Reconfigured MySQL to use less memory and restarted the server. Also restarted the backup process that got interrupted halfway with this crash.
FA will be on the slow side the first 15-20 minutes while the table data cache is being populated.

HARDWARE FAILURE!!!!!! Quick! Round up the donatio-.... oh it just ran out of memory.

Instead of thinking they need three file servers (as the plans announced earlier this summer called for), they should really be repurposing Trogdor (or maybe the backup server-that-is-responsible-for-storing-a-single-tar-file-and-that's-it) into a second database server. They're obsessive about anything to do with files, but when it comes to the database (which has been the root cause of like 80% of site issues), they continue to let it limp along on a single box.

Also, I'm guessing now that the past week of slowness probably has to do with the backup; the times seem to match up, at least...
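
For anyone who wants to test that guess against a log of their own, here's a rough Python sketch that pulls kill events out of kernel log lines shaped like the two Yak quoted. The regex is inferred from that one sample line only, and `KILL_RE`, `parse_kills`, and the `year` argument are names made up for illustration (the log lines carry no year, so it has to be supplied). It is not anything FA actually runs.

```python
import re
from datetime import datetime

# Pattern inferred from the single quoted sample line, e.g.:
#   Dec 19 03:25:52 novastorm kernel: pid 90279 (mysqld), uid 88, was killed: out of swap space
# Other kernels or syslog setups may format this differently.
KILL_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) kernel: "
    r"pid (?P<pid>\d+) \((?P<proc>[^)]+)\), uid (?P<uid>\d+), "
    r"was killed: (?P<reason>.+)$"
)


def parse_kills(lines, year):
    """Return one dict per 'was killed' kernel line found in `lines`.

    `year` must be passed in because syslog-style timestamps omit it.
    Lines that do not match the pattern are skipped.
    """
    events = []
    for line in lines:
        match = KILL_RE.match(line.strip())
        if match is None:
            continue
        # %b expects English month abbreviations under the default C locale.
        when = datetime.strptime(
            f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S"
        )
        events.append(
            {
                "when": when,
                "host": match["host"],
                "pid": int(match["pid"]),
                "proc": match["proc"],
                "uid": int(match["uid"]),
                "reason": match["reason"],
            }
        )
    return events
```

Feed it the lines of the system log, then compare each event's `when` against the backup window. If the kills cluster around the hours the backup runs, that supports the guess; if they don't, it doesn't.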

ProvincialTwit

  • Abuse Dept.
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 774
  • E-points: +72/-33
Re: Another day, another crash
« Reply #2 on: December 19, 2011, 09:53:49 am »
Ahahahaha, not only did they run out of RAM, they ran out of swap. That means, of course, the 'slow to unreachable in the early morning' is actually their poor server thrashing itself to death. I bet the iowait percentage on that thing is astronomical.

But as per usual, the place is run by idiots, and any advice from those 'in the know' will be summarily dismissed and ignored.