Author Topic: FA does another thing, screws it up  (Read 2457 times)

Conan

  • Sean Piche Wannabe Club
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 603
  • E-points: +33/-9
  • ¯\(°_o)/¯
    • View Profile
FA does another thing, screws it up
« on: August 13, 2013, 09:24:08 pm »
Early this morning, FA went down for planned database maintenance. In FA tradition, this was announced in the most general terms possible.

Quote
Administrator notice:
Fur Affinity will be going down for database maintenance on Aug 13, 2013 at 8:00 AM UTC / 4:00 AM EDT.
Downtime is estimated to be less than an hour.

An hour and a half after the downtime began, it came back up. But as you can tell by the graph below, it didn't stay up for very long.


Just hours later the site began throwing 503 and 502 errors before giving out entirely (for what appears to have been an hour before someone did something about it). The site was brought back online with the explanation:
Quote from: Yak
An issue was identified as related to the earlier database maintenance.
 Changes are being reverted. Unfortunately, this means that FA will be in for another downtime in the near future to actually see the changes applied.
 There will be an announcement about that downtime in advance.

Why all the problems? Why the downtime? We turn to Twitter:
We're working on resolving the issue. The downtime is due to an issue with our database migration to the new servers.

What new servers? The new ones people donated. In February. It took them nearly six months to finally get around to setting up the hardware they desperately needed, and they still botched something in the process of migrating to it.

What exactly happened? Our sources from IRC say:
Quote
19:23 < Pi> <yak[away]> bah. binlogs filled up the current db server's harddrive faster than a snapshot could be moved to the new one and have it sync up as a slave.
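
For anyone wondering what that looks like in numbers, here is a rough back-of-envelope sketch of the race described in the IRC line: binlogs pile up on the old server while the snapshot is copied over, and if the disk fills first, the new box never gets to sync up as a slave. The function names and every figure below are made up for illustration; the thread gives no real numbers.

```python
def hours_until_disk_full(free_gb: float, binlog_gb_per_hour: float) -> float:
    """Hours before binlogs eat the remaining free space on the old server."""
    if binlog_gb_per_hour <= 0:
        return float("inf")  # no binlog growth, the disk never fills
    return free_gb / binlog_gb_per_hour


def snapshot_wins_race(free_gb: float, binlog_gb_per_hour: float,
                       snapshot_hours: float) -> bool:
    """True if the snapshot can be moved before the old disk runs out."""
    return snapshot_hours < hours_until_disk_full(free_gb, binlog_gb_per_hour)


# Hypothetical figures: 60 GB free, 10 GB of binlogs per hour,
# and a snapshot that needs 8 hours to move and load.
print(snapshot_wins_race(free_gb=60, binlog_gb_per_hour=10, snapshot_hours=8))
```

With those assumed figures the disk fills after 6 hours and the snapshot needs 8, so the check comes back False. Running this kind of estimate before starting is the cheap way to find out you need more disk or a faster copy.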

Conan

  • Sean Piche Wannabe Club
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 603
  • E-points: +33/-9
  • ¯\(°_o)/¯
    • View Profile
Re: FA does another thing, screws it up
« Reply #1 on: September 10, 2013, 10:29:51 pm »
Looks like they're giving the migration another go, and once again: six months of prep time and they have only narrowed down the "read only mode" to 24-48 hours.

Quote
Administrator notice: On Thursday, Sep. 12 08:00 AM UTC FA will be upgrading our database servers. Server migration is expected to last between 24-48 hours. The site will be put in Read Only mode during this time. We apologize for any inconvenience this may cause. Please plan accordingly.

This after Plan A failed spectacularly.

winserv03fan

  • Dumbest Username Award - May 2012
  • *
  • Posts: 65
  • E-points: +4/-4
  • A Duck!
    • View Profile
Re: FA does another thing, screws it up
« Reply #2 on: September 12, 2013, 06:23:24 am »
Quote
FA will be upgrading our database servers.
Woah, pretty impressive having FA upgrade their servers for them...

I'm wondering how this would take anywhere near 2 days to complete though. All FA typical bullshit aside, can't it be as easy as copying the database from one drive to another? I don't think that should take anywhere near 2 days, unless they're using Wi-Fi, or like, Bluetooth to do it.

nrr

  • Sean Piche Fan Club
  • Cabalistic Fuckhead
  • *
  • Posts: 79
  • E-points: +7/-3
  • OMG SO CUTE ^__^
    • View Profile
    • lynxies :3
Re: FA does another thing, screws it up
« Reply #3 on: September 12, 2013, 06:39:03 am »
Quote from: winserv03fan
I'm wondering how this would take anywhere near 2 days to complete though. All FA typical bullshit aside, can't it be as easy as copying the database from one drive to another? I don't think that should take anywhere near 2 days, unless they're using Wi-Fi, or like, Bluetooth to do it.

Typically, in well-run ops shops, these kinds of migrations happen in stages that span a couple of weeks because of how ITSM change control works, amongst other things. FA eschews all of that in favor of just fucking doing it, which is nice, but you need to have competent people in place _and_ failsafes (like database backups and a lot of extra disk) handy in order to keep from causing a catastrophe.
im glad the "I saw a furry IRL" thread is so good at bringing goons together

YOUR PARTICIPLES AREN'T THE ONLY THINGS DANGLING

Conan

  • Sean Piche Wannabe Club
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 603
  • E-points: +33/-9
  • ¯\(°_o)/¯
    • View Profile
Re: FA does another thing, screws it up
« Reply #4 on: September 12, 2013, 12:04:25 pm »
Quote from: winserv03fan
Quote
FA will be upgrading our database servers.
Woah, pretty impressive having FA upgrade their servers for them...

I'm wondering how this would take anywhere near 2 days to complete though. All FA typical bullshit aside, can't it be as easy as copying the database from one drive to another? I don't think that should take anywhere near 2 days, unless they're using Wi-Fi, or like, Bluetooth to do it.

The database is known to be several hundred gigabytes in size, to the degree that it was once said the notification table was nearly 400GB alone. That's probably millions of rows. All because they refuse to force delete ancient notifications "just in case" someone needs one someday.

It takes 24-48 hours to complete because FA seems to want to do this sort of thing in the most inefficient ways possible, because Yak is a developer first and foremost who roleplays a sysadmin from time to time.
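
To put the "copy it from one drive to another" question in numbers, here is a minimal sketch that estimates raw copy time for a table of the size mentioned above. The throughput figures are ballpark assumptions picked for illustration, not measurements of FA's hardware, and `copy_hours` is a hypothetical helper.

```python
def copy_hours(size_gb: float, throughput_mb_per_s: float) -> float:
    """Hours to copy size_gb (decimal GB) at a sustained rate in MB/s."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gb * 1000 / throughput_mb_per_s / 3600


# Ballpark sustained rates, assumed for illustration only.
ASSUMED_RATES_MB_PER_S = {
    "gigabit ethernet": 100,
    "single spinning disk": 80,
    "100 Mbit ethernet": 10,
}

for name, rate in ASSUMED_RATES_MB_PER_S.items():
    print(f"{name}: {copy_hours(400, rate):.1f} h for 400 GB")
```

Under these assumptions a 400 GB table moves in somewhere between about one hour and about eleven hours, so the raw copy alone does not account for a 24-48 hour window. The thread does not say where the rest of the time goes.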

winserv03fan

  • Dumbest Username Award - May 2012
  • *
  • Posts: 65
  • E-points: +4/-4
  • A Duck!
    • View Profile
Re: FA does another thing, screws it up
« Reply #5 on: September 12, 2013, 06:21:05 pm »
Quote from: Conan
The database is known to be several hundred gigabytes in size, to the degree that it was once said the notification table was nearly 400GB alone. That's probably millions of rows. All because they refuse to force delete ancient notifications "just in case" someone needs one someday.

I find this fucking hilarious. Were they saving them with the whole string and HTML formatting for each row? Haha.

Fate

  • James Woods with a Handgun and a Hardon
  • *
  • Posts: 54
  • E-points: +9/-2
  • the fuck
    • View Profile
Re: FA does another thing, screws it up
« Reply #6 on: September 13, 2013, 06:46:12 am »
Quote from: Conan
The database is known to be several hundred gigabytes in size, to the degree that it was once said the notification table was nearly 400GB alone. That's probably millions of rows. All because they refuse to force delete ancient notifications "just in case" someone needs one someday.

good god. just periodically erase notifications every six weeks, fuck.


Then again, this is FA, bastion of inefficiency and using gaming rigs as servers.
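
A periodic purge like the one suggested above could look roughly like this. It is only a sketch: the `notifications` table and its `id` and `created_at` (Unix seconds) columns are hypothetical, since FA's real schema isn't public in this thread, and `sqlite3` is used only so the example runs anywhere. Deleting in small batches keeps each transaction short instead of locking the table for one giant delete.

```python
import sqlite3
import time

SIX_WEEKS_SECONDS = 6 * 7 * 24 * 3600


def purge_old_notifications(conn, now=None,
                            max_age_seconds=SIX_WEEKS_SECONDS,
                            batch_size=1000):
    """Delete notifications older than max_age_seconds, batch by batch.

    Returns the total number of rows removed.
    """
    now = time.time() if now is None else now
    cutoff = int(now) - max_age_seconds
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM notifications WHERE id IN ("
            " SELECT id FROM notifications WHERE created_at < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # one short transaction per batch
        if cur.rowcount == 0:
            break  # nothing old is left
        total += cur.rowcount
    return total
```

Run from a scheduled job, something like this would keep the table from growing without bound. Whether six weeks is the right retention window is a policy call, not a technical one.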

JigsawJones

  • Posts: 11
  • E-points: +0/-0
  • Uninitiated Rube
    • View Profile
Re: FA does another thing, screws it up
« Reply #7 on: September 13, 2013, 07:23:28 am »
THAT'S NINTENDO POWER!!!

Ketsuban

  • *
  • Posts: 48
  • E-points: +5/-1
  • Initiated Rube
    • View Profile
Re: FA does another thing, screws it up
« Reply #8 on: September 14, 2013, 03:22:10 pm »
FA is out of read-only mode. What does the load graph look like?

Pi

  • POOR IMPULSE CONTROL
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 614
  • E-points: +40/-10
  • <blink>yes hello</blink>
    • View Profile
    • Clan Spum userpage
Re: FA does another thing, screws it up
« Reply #9 on: September 14, 2013, 04:00:43 pm »
Bad.

The response times shot through the roof, and the SQL times got worse.
"we did farts.  now we do sperm.  we are cutting edge." — Theo DeRaadt

Conan

  • Sean Piche Wannabe Club
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 603
  • E-points: +33/-9
  • ¯\(°_o)/¯
    • View Profile
Re: FA does another thing, screws it up
« Reply #10 on: September 14, 2013, 04:38:08 pm »


Left side of the graph is from prior to the outage. Far right is after. Yeah, it got worse, and it looks like PHP is partially to blame.
« Last Edit: September 14, 2013, 06:14:34 pm by Conan »

winserv03fan

  • Dumbest Username Award - May 2012
  • *
  • Posts: 65
  • E-points: +4/-4
  • A Duck!
    • View Profile
Re: FA does another thing, screws it up
« Reply #11 on: September 16, 2013, 07:18:37 pm »
Via Twitter:
Quote
We made some changes this morning. How is performance holding up today for everyone? Better, faster?

FA's been completely unresponsive for near an hour now...
« Last Edit: September 16, 2013, 08:36:17 pm by winserv03fan »

Fate

  • James Woods with a Handgun and a Hardon
  • *
  • Posts: 54
  • E-points: +9/-2
  • the fuck
    • View Profile
Re: FA does another thing, screws it up
« Reply #12 on: September 16, 2013, 08:20:20 pm »
so this is an "upgrade" in the same way I "upgraded" my old PC with bullets.

Ketsuban

  • *
  • Posts: 48
  • E-points: +5/-1
  • Initiated Rube
    • View Profile
Re: FA does another thing, screws it up
« Reply #13 on: September 16, 2013, 09:14:17 pm »
Some two or three hours later, there's finally a recognition that the site isn't working.

Quote
We are aware of the current site outage and looking into the cause. We will keep you updated as more information comes to light.
We're not going to actually tell you anything, though - what do you take us for, professionals?
FURAFFINITY AND THE FENDER AND REDNEF CHARACTERS ARE THE PROPERTY OF FUR AFFINITY LLC

Conan

  • Sean Piche Wannabe Club
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 603
  • E-points: +33/-9
  • ¯\(°_o)/¯
    • View Profile
Re: FA does another thing, screws it up
« Reply #14 on: September 17, 2013, 01:16:29 am »
After six hours offline, it just came back up. As usual, no one seems to want to explain what happened.

Based on past events, I'd say the database crashed and since the redundancy of their new redundant servers isn't operational yet, the site toppled over.

Oh, the irony of it all.
« Last Edit: September 17, 2013, 02:33:55 am by Conan »

ProvincialTwit

  • Abuse Dept.
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 774
  • E-points: +72/-33
    • View Profile
Re: FA does another thing, screws it up
« Reply #15 on: September 17, 2013, 07:40:20 am »
BUT YOU GUYS ITS A FREE SITE AGLAHJGAHUWGHAULWGHARELUGAH FUCK MY FACE

ahem.

Christ, if that place were being run like any sort of real business these people would've been replaced ages ago. If this is how Piche thinks you're supposed to run a web service it's no wonder he needed favors from mommy and daddy to get a job.

Ketsuban

  • *
  • Posts: 48
  • E-points: +5/-1
  • Initiated Rube
    • View Profile
Re: FA does another thing, screws it up
« Reply #16 on: September 18, 2013, 07:25:50 am »
The database move has fucked up profiles for anyone who made use of Unicode characters e: as well as everywhere else - anytime someone said "character © person" the copyright symbol got mangled.

e: and as soon as it was there it's gone again.
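
The thread doesn't say what the mangled symbol looked like or which character sets were involved, so the following is only one common way this kind of breakage happens during a database move: text stored as UTF-8 gets read back as Latin-1. The helper names are hypothetical.

```python
def mangle(text: str) -> str:
    """Simulate UTF-8 bytes being misread as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def unmangle(text: str) -> str:
    """Reverse the misread, assuming nothing else touched the bytes."""
    return text.encode("latin-1").decode("utf-8")


original = "character © person"
broken = mangle(original)
print(broken)            # character Â© person
print(unmangle(broken))  # character © person
```

The copyright sign is two bytes in UTF-8, and each byte turns into its own Latin-1 character, which is why a stray extra character shows up in front of it. If the data was only misread and not re-saved, the damage is reversible, which would fit the glitch appearing and then vanishing again.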

Conan

  • Sean Piche Wannabe Club
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 603
  • E-points: +33/-9
  • ¯\(°_o)/¯
    • View Profile
Re: FA does another thing, screws it up
« Reply #17 on: September 19, 2013, 12:57:33 am »



SQL times and response times are still slower than they were prior to the upgrade.

Quote from: ProvincialTwit
BUT YOU GUYS ITS A FREE SITE AGLAHJGAHUWGHAULWGHARELUGAH FUCK MY FACE

It's like you predicted a post one of their forum mods made earlier today.

Quote
I love how everyone assumes FA is run like a full time business with employees instead of a hobby site that exists off ads and donations and relies on volunteers.
 
 Taking a day to fix something isn't some ridiculous amount of time, especially when there's only a small team knowledgeable about what's going on behind the scenes.

Geez guys it's completely unacceptable to think that the largest furry website with all sorts of really big numbers they like to throw around might restart the database in under six hours!!!!!!! what do you think this is?!?!?!?!?!

Pi

  • POOR IMPULSE CONTROL
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 614
  • E-points: +40/-10
  • <blink>yes hello</blink>
    • View Profile
    • Clan Spum userpage
Re: FA does another thing, screws it up
« Reply #18 on: September 19, 2013, 08:30:06 am »
Quote from: Conan
Quote from: ProvincialTwit
BUT YOU GUYS ITS A FREE SITE AGLAHJGAHUWGHAULWGHARELUGAH FUCK MY FACE

It's like you predicted a post one of their forum mods made earlier today.

Quote
I love how everyone assumes FA is run like a full time business with employees instead of a hobby site that exists off ads and donations and relies on volunteers.
 
 Taking a day to fix something isn't some ridiculous amount of time, especially when there's only a small team knowledgeable about what's going on behind the scenes.
And if there were a larger team that were more knowledgeable about what is going on behind the scenes, they might fix shit a little more quickly. But hey.
"we did farts.  now we do sperm.  we are cutting edge." — Theo DeRaadt

vigilante777

  • Posts: 20
  • E-points: +0/-1
  • Uninitiated Rube
    • View Profile
Re: FA does another thing, screws it up
« Reply #19 on: September 24, 2013, 12:39:43 pm »
And the site is tanking again with loading times of a minute. I don't really have any good analysis tools, so here's a random site with some info: http://loadimpact.com/load-test/www.furaffinity.net-7bf296068380db52e8b81b1c03e10b6c