Why Yahoo! Answers is broken.

The lunatics are in charge of the asylum, that’s why.

Someone asked a question wanting to know which codec they needed to import a music file into their application – they were getting an error “unknown codec”.

So, the choice of answers given:

1) A generic answer which would solve the problem for all people having this error, explaining what a codec was, and how to find out which codec would be needed for any file.

2) An answer that included the lucky guess “Ogg Vorbis”.

Guess which answer was given the “Best Answer”?  Number 2.  Huh???

It’s not an isolated incident, either.  There are more examples of this kind of lunacy.  That’s not to say that I don’t understand why the answer got rated higher; after all, it solved the questioner’s problem in a quick and easy manner.  It just didn’t add anything to the value of the site, which is meant to be a repository of questions and good answers.  Hey Ho.

Oh, and the quality of the questions leaves something to be desired, too.  Half the questioners don’t have a clue about the subject they’re asking about, yet they claim to need a desperately technical answer.  My personal favourite is:

Question Title: Should I use the uniprocessor HAL with a Quad Core (Q6600)?
Question content: Is there an easy way to switch HALs?

Erm. Where to begin? Is there an easy way to switch HALs? Why, yes, there is. It’s dead simple once you’ve read and understood the Windows DDK. Of course, once you’ve done that, you’ll not be wanting to swap HALs; there’s absolutely no point, beyond wanting to artificially limit the number of cores that Windows will use on the chip. That can be done so much more easily with the /onecpu setting in BOOT.INI.

But of course, if you put that in an answer, you’re wasting your time, even though it’s actually a pretty good answer to a dumb question.  I suspect that the best answer would be given to someone who answers with:

LOLZ, dude.  K3wln3ss for has changing HALs.  HAL FTW!!!11!!

And the number of people wanting their homework done is utterly astounding.  I’ve seen a number of multiple choice questions up on there.  How can that not be homework?

This site is not bad for your computer’s health part 2

Well, the site’s now clean (and has been for a number of hours now).
Google has done a sweep, and is telling me not only that the site is clean (bottom of picture) but also that the site is still linked to badware (top of picture).
My challenge to Google is: tell me which one is correct and, if it’s the bottom statement, unblock my site. This is getting beyond a joke.

How not to design a user interface

This site is not bad for your computer’s health

… despite Google telling you that it is.
Google’s not always right, it seems.

Google’s assertion that this site is hosting “badware” (warning – made up word) is at best misleading, and at worst an out-and-out untruth.

My reasoning:

  1. Misleading: The version of WordPress I use to power this blog had a security vulnerability in it, which meant people could maliciously edit posts I’d made and inject content. In all of these cases, the additional material injected was a link to another website.
  2. Untruth: As above, the link on my site was just that – a link to another website. My website had absolutely no “badware” on it at any time. It’s a small distinction to make, but suppose my business was using WordPress, and the world at large was presented with a page from Google saying my site could not be trusted and that it was trying to hack their computers. I think I, like many other business owners, would be upset at the loss of business (short term) and reputation (long term) that this causes.
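
For anyone cleaning up after the same kind of injection, here is a rough sketch of how you might list the off-site links hiding in a post. It uses only the Python standard library. The function name, the sample HTML and the host names are all made up for illustration; this is not how Google or WordPress check anything, just one way of seeing every distinct foreign host at once rather than stopping at the first.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a chunk of HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def offsite_hosts(post_html, trusted_hosts):
    """Return the sorted, distinct link hosts in post_html not in trusted_hosts."""
    collector = LinkCollector()
    collector.feed(post_html)
    hosts = set()
    for href in collector.hrefs:
        host = urlparse(href).hostname
        # Relative links have no hostname, so they stay on this site.
        if host and host not in trusted_hosts:
            hosts.add(host)
    return sorted(hosts)


# Hypothetical post body: one legitimate link and one injected, hidden link.
sample = (
    '<p>See <a href="http://myblog.example/about">about</a>.</p>'
    '<div style="display:none"><a href="http://bad.example/x">x</a></div>'
)
print(offsite_hosts(sample, {"myblog.example"}))
```

Run over every post in turn, the length of the returned list gives a crude count of distinct hosts to chase down. It only looks at ordinary links, so anything injected by other means would need a different check.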

Things that – in my opinion – Google needs to fix.

  • Correct the warning that users see after clicking on a link. The warning page – as pointed out above – is wrong. It should simply state that Google’s automated tool for protecting the internet community has found a link on this site that points to a “badware” distribution site. Even recasting the sentence “Warning – visiting this web site may harm your computer!” in a more passive form would help, since it would remove the implication that “this web site” is actively trying to do something bad.
  • Provide people whose websites get infected with at least a count of the distinct issues that have been found on their website. At the moment, the Google warning email just says that the website is serving up “badware”. This website actually suffered from two separate problems. After finding the first, I resubmitted for testing, and they came back with the same – unhelpful – statement: that my website was still hosting “badware”. If I had been told that there were 2 (or however many are found) instances of the problem, I wouldn’t have stopped after finding the first.
    The sheer number of people requesting help on the ‘Stop “badware”’ group pages is indicative that something is seriously wrong with the reporting mechanism.
  • Speed up their review process. If they are making wild accusations that a website is actively hosting “badware”, then they should be as quick to unblock a site as they are to block it. If an online company gets blocked – as has happened (sorry, can’t find the link to the page that I found before) – then Google will hold their website in the blocked state until a retest is done. This can take a long time, during which people cannot reach the site from Google’s search results and, because of the wording of the warning, harm is done to the business’s reputation.
  • Before the browsers start using this “badware” security alert mechanism to block websites in the browser, the process needs to be streamlined, so that an automatic check can not only condemn a website, but also give it a clean bill of health. The process of freeing a website from purgatory should be near real-time. I do not believe that an automatic check of the website cannot be done within half an hour – Google famously has a large number of servers and plenty of bandwidth available to it. If it cannot be done in this real-time manner, I think it is too flawed to be useful.

I would like to point out that this last point, about browsers using the Google “badware” database as a check, is in principle a very good idea; after all, protecting people from “badware” is something that would make the internet a much nicer place to play.
However, with the caveats listed above, the database becomes even more insidious – without turning the whole security mechanism off, I cannot access my own website, even though it is clean (I know it is, I’ve just finished cleaning it, and upgraded the software so it won’t happen again).
It is this real-time checking of the “badware” database by browsers that is the painful part when the database is too slow to de-list websites.

And don’t think that browsers aren’t going to do this. Firefox 3 Beta 3 has the feature turned on by default.

EDIT: Someone else has had the same problem, and has the same problems with the process: