Author Topic: FA implements comment hiding, exploits found, thrashing and flailing results  (Read 6547 times)

Jim Demintia

  • Postcount ate Whippany, NJ
  • ****
  • Posts: 628
  • E-points: +24/-6
  • Deflator Mouse
I'm guessing you found something along the lines of Xapian or Lucene. I created a database application a while back as a school project, and towards the end I looked into integrating one of those into it; had I had the time, I would have. You can't just drop them in, you have to integrate them into your code.

Anyway, search in computer science is something they devote entire courses to in school. It's not as simple as most people think it is, and some folks even make their careers out of it and related issues. But basically, to be able to efficiently search you have to either: a) have infinite resource capacity (say, like Google), b) not have a lot to search, or c) have a non-retarded database schema that allows efficient search.

I think FA has none of those things.
Can it be this sad design
Could be the very same
A wooly man without a face
And a beast without a name

nrr

  • Sean Piche Fan Club
  • Cabalistic Fuckhead
  • *
  • Posts: 79
  • E-points: +7/-3
  • OMG SO CUTE ^__^
    • lynxies :3
Speaking of FA, does anyone know if the search feature is completely coded by FA? I found an open source search software earlier and it looks horribly efficient (and wouldn't kill a server like FA search did).

There're lots of freely available indexers scattered about online.  The program I use for managing my mail at work, for example, uses Xapian internally to handle search functionality.  Our wiki, seeing as it's written to use Java/J2EE, uses Lucene.  There're no doubt various others, but these appear to be the big two.  Not a huge deal.  Pick one.

Nevertheless, FA's search functionality, near as I can tell, is done entirely in-house, yes.  As Jim already stated, there are people who dedicate their entire careers to devising search algorithms, so this isn't exactly the most trivial of problems to solve.
im glad the "I saw a furry IRL" thread is so good at bringing goons together

YOUR PARTICIPLES AREN'T THE ONLY THINGS DANGLING

Eevee

  • VAPOREONWARE
  • Cabalistic Fuckhead
  • *
  • Posts: 48
  • E-points: +8/-0
I'm almost certain that FA's search is against a naïve Sphinx index.  verix and I wrote the first (very simple) implementation years ago, and it used Sphinx; plus the stats look like the sort of stuff Sphinx spits out.

Jim Demintia

  • Postcount ate Whippany, NJ
  • ****
  • Posts: 628
  • E-points: +24/-6
  • Deflator Mouse
You know, it occurs to me that FA is a fairly large site for the types of amateur-hour coding techniques it uses—I mean from what I can tell this is seriously high-school level stuff...from 1998. I think it might have been Yak who said that there is no templating...at least not any that is completely separated. I had suspected as much but having random printf()s or whatever scattered throughout 2,000 lines of code (and who knows how well it's split into files—we've all heard the one-file 2k-line program horror stories), has got to be a bitch to work on.

I mean, shit. They're even coming up with specialized on-disk file systems that are optimized for the sort of things high-traffic web applications have to do.

Anyway, the search interface is rather odd, I thought—IIRC you have to search against keywords or something like that, you can't do a free-form Google-style query. It's all very state-of-the-90s (remember when search had extensive help that came with it, talking about boolean operators?) Which, I'm not saying they need the accuracy or quality of Google search results, but it'd be nice if it could just search the keywords, the submission description text....etc. etc.

Admittedly, though, even parsing a free-form query isn't as simple as it might appear. People expect search to "just work", for better or worse, thanks to Google. And there's a reason that Google will personally pursue just about every newly-minted CS Ph.D.
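To make the "not as simple as it might appear" point concrete, here is a rough sketch of what even a toy free-form query parser has to deal with: bare words, quoted phrases, -excluded terms, and unbalanced quotes. Python is used purely for illustration, and the syntax rules and the parse_query name are assumptions of this sketch, not anything FA or Google is known to do.

```python
import re

# One token is an optional leading "-", then either a "quoted phrase" or a
# run of non-whitespace characters.
TOKEN = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')

def parse_query(query):
    """Split a free-form query into plain terms, phrases, and excluded terms."""
    parsed = {"terms": [], "phrases": [], "excluded": []}
    for minus, phrase, word in TOKEN.findall(query):
        # Stray quotes (e.g. from an unbalanced quote) are dropped,
        # case is folded, and inner whitespace is collapsed.
        text = " ".join((phrase or word).strip('"').lower().split())
        if not text:
            continue
        if minus:
            parsed["excluded"].append(text)
        elif phrase:
            parsed["phrases"].append(text)
        else:
            parsed["terms"].append(text)
    return parsed
```

Even this ignores stemming, misspellings, synonyms, and ranking, which is where most of the real work in search goes.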

Apparently though, having relevant education disqualifies you from working on FA (not that you'd want to) so...

u63r

  • *
  • Posts: 33
  • E-points: +1/-7
Quote from: u63r
I tend to side with 'Neer in these sort of things, just because he's usually less passive-aggressive.

Haha what?
Funny; I was not so sleep deprived that I couldn't type a comprehensible comment--six, in fact--but enough to have no idea what I was saying.

I was wrong. I apologize for wasting everyone's time.

EDIT: See what I mean?
« Last Edit: November 12, 2010, 04:26:17 pm by u63r »

loki

  • **
  • Posts: 125
  • E-points: +2/-2
Quote from: Jim Demintia
You know, it occurs to me that FA is a fairly large site for the types of amateur-hour coding techniques it uses—I mean from what I can tell this is seriously high-school level stuff...from 1998. I think it might have been Yak who said that there is no templating...at least not any that is completely separated. I had suspected as much but having random printf()s or whatever scattered throughout 2,000 lines of code (and who knows how well it's split into files—we've all heard the one-file 2k-line program horror stories), has got to be a bitch to work on.

I mean, shit. They're even coming up with specialized on-disk file systems that are optimized for the sort of things high-traffic web applications have to do.

Anyway, the search interface is rather odd, I thought—IIRC you have to search against keywords or something like that, you can't do a free-form Google-style query. It's all very state-of-the-90s (remember when search had extensive help that came with it, talking about boolean operators?) Which, I'm not saying they need the accuracy or quality of Google search results, but it'd be nice if it could just search the keywords, the submission description text....etc. etc.

Admittedly, though, even parsing a free-form query isn't as simple as it might appear. People expect search to "just work", for better or worse, thanks to Google. And there's a reason that Google will personally pursue just about every newly-minted CS Ph.D.

Apparently though, having relevant education disqualifies you from working on FA (not that you'd want to) so...

2,000 lines of code isn't very much at all but if it's spaghetti code or just plain shit it might be impossible to understand. Being in PHP probably doesn't help either...

Also, Google has some really clever stuff like inverse bitmap indexes, which have had academic papers written about them; I doubt FA could come up with anything on that level. Then again, FA has a fraction of the data to index that Google does, so they don't need anything fancy. I doubt they have anyone with data management / DBA experience, though...
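
For anyone wondering what the indexers named in this thread (Lucene, Xapian, Sphinx) actually build: at the core it is an inverted index, a map from each term to the documents containing it. A minimal sketch in Python, assuming a made-up dict of submission texts keyed by id; real engines add ranking, compression, and on-disk storage on top of this.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """AND query: return the ids of documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)
```

The point is that a lookup touches only the posting sets for the query terms instead of scanning every row, which is why a site with a fraction of Google's data can get away with something this plain.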

ProvincialTwit

  • Abuse Dept.
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 774
  • E-points: +72/-33
These are people who think a 'search function' looks like
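// Anti-example: $field and $search_term are pasted straight into the SQL (injectable),
// and a leading-wildcard LIKE cannot use an ordinary index, so every row gets scanned.
$query = "select * from unnormalized_fa_table where $field like '%$search_term%'";
$result = mysql_query($query);
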
It kinda hurt to type that out.
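
For contrast, a rough sketch of the same kind of lookup with the search term bound as a parameter instead of pasted into the SQL string. Python and sqlite3 are used purely for illustration; the submissions table, its columns, and the field whitelist are all made up for this example. Column names cannot be bound as parameters, hence the whitelist.

```python
import sqlite3

# Hypothetical whitelist of searchable columns.
SEARCHABLE_FIELDS = {"title", "description", "keywords"}

def escape_like(term):
    """Escape LIKE wildcards so user input is matched literally."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")

def search(conn, field, term):
    """Substring search on one whitelisted column, with the term bound safely."""
    if field not in SEARCHABLE_FIELDS:
        raise ValueError(f"unsearchable field: {field!r}")
    sql = (
        f"SELECT id, title FROM submissions "
        f"WHERE {field} LIKE ? ESCAPE '\\' ORDER BY id"
    )
    return conn.execute(sql, (f"%{escape_like(term)}%",)).fetchall()

def make_demo_db():
    """Build a small in-memory table to try the search against."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE submissions ("
        "id INTEGER PRIMARY KEY, title TEXT, description TEXT, keywords TEXT)"
    )
    conn.executemany(
        "INSERT INTO submissions (title, description, keywords) VALUES (?, ?, ?)",
        [
            ("Red fox", "a quick fox sketch", "fox red"),
            ("Wolf", "100% wolf", "wolf"),
            ("Dragon", "about half done", "dragon wip"),
        ],
    )
    return conn
```

This only closes the injection hole. The leading-wildcard LIKE still scans the whole table, so it is no substitute for a real index.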

Jim Demintia

  • Postcount ate Whippany, NJ
  • ****
  • Posts: 628
  • E-points: +24/-6
  • Deflator Mouse
Haha, what's "norm-al-i-zation?" </Sean-Piche>

Conan

  • Sean Piche Wannabe Club
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 603
  • E-points: +33/-9
  • ¯\(°_o)/¯
http://forums.furaffinity.net/threads/86324-Hide-Comment-not-working

Lo and behold, the comment hiding feature is severely broken.

Why does Dragoneer refuse to find competent help? At this point I just don't see what's so special about Yak. How is releasing a vulnerable, buggy piece of software a GOOD thing?

ProvincialTwit

  • Abuse Dept.
  • Postcount ate Whippany, NJ
  • ****
  • Posts: 774
  • E-points: +72/-33
Because he's more interested in whether or not a 'programmer' (to use the term loosely) will do exactly what he wants, without wanting anything pesky like 'recognition' or 'money', as opposed to actually being competent and able to build a secure, functional art archive/pseudo-social-networking site.

Idiots and bad programmers:
-will work for free under some guise of 'doing it for the community'
-will do exactly what Peachy wants without questioning motives or the wisdom of implementing a given 'feature'
-will ignore trouble tickets complaining about the most basic of flaws at the root level

Good programmers:
-can build secure, functional websites in a language other than PHP
-are going to expect some manner of actual compensation
-will want to fix the most basic of flaws on the site from the ground up before implementing any extraneous add-ons
-probably wouldn't work on a furry porn website anyway

Eevee

  • VAPOREONWARE
  • Cabalistic Fuckhead
  • *
  • Posts: 48
  • E-points: +8/-0
Dragoneer doesn't know what he wants, can't make substantial decisions, and doesn't trust anyone else to decide or implement anything for him.  Under those conditions, the status quo becomes king.

Jim Demintia

  • Postcount ate Whippany, NJ
  • ****
  • Posts: 628
  • E-points: +24/-6
  • Deflator Mouse
Dude, look at Dragoneer. He lets his life happen to him. Passive people like that have basically built their lives around the avoidance of any decision-making, responsibility, or serious interpersonal relationships. The only irritating thing about him is that he's like this at 30-/+ years old and not only does he think he's a community "leader", but he expects the praise and accolades afforded to someone who has actually accomplished something, not bullied control of some domain name out of the hands of a kid using his military-contractor bonus or whatever.