Simple Sites Forum
Topic: How Google handles hacked sites
solidghost
« on: December 05, 2006, 12:06:09 AM »

http://www.mattcutts.com/blog/how-google-handles-hacked-sites/

If you’ve never read my blog before, welcome. I’m the head of the webspam team at Google. And I have a blog for days just like this.

Okay, first off you should go read this post. It’s entitled “Me Against Google” and the author is unhappy that talkorigins.org was nowhere to be found in Google for the last 5-6 days. After that post, go read this Slashdot post, entitled “Google De-indexes Talk.Origins, Won’t Say Why.” By the time you’re done, your pulse should be pounding. Hell, you should be angry. Damn that evil Google for not communicating with webmasters!! Or as Wesley put it in his blog:

    You might think that a company that prides itself upon advanced textual analysis and automated decision-making algorithms might provide helpful warning messages to webmasters concerning problems found in their sites. You would be wrong.

Okay, ready for my side of the story? Here’s the timeline of how things happened:
- talkorigins.org was hacked on November 18th. I know this because Wesley says so in his blog post.
- By November 27th, Google had detected spammy links and text on talkorigins.org. In case you’re wondering, here’s what the cracker added:


<script>document.write(String.fromCharCode(60,100,105,118,32,115,116,121,108,101,61,39,100,
105,115,112,108,97,121,58,110,111,110,101,39,62))</script><br><a href="http://vvu.edu.gh/images/?i=animal-porn">animal porn</a>, <a href="http://vvu.edu.gh/images/?i=animal-sex">animal sex</a>, <a href="http://vvu.edu.gh/images/?i=beastiality">beastiality</a>, <a href="http://vvu.edu.gh/images/?i=rape-sex">rape sex</a>, <a href="http://vvu.edu.gh/images/?i=sleeping-sex">sleeping sex</a>, <a href="http://deepx.com/images/?i=animal-porn">animal porn</a>, <a href="http://deepx.com/images/?i=beastiality">beastiality</a>, <a href="http://deepx.com/images/?i=dog-porn">dog porn</a>, <a href="http://deepx.com/images/?i=horse-porn">horse porn</a>, <a href="http://deepx.com/images/?i=rape-sex">rape sex</a>, <a href="http://deepx.com/images/?i=sleeping-sex">sleeping sex</a>, <a href="http://theoi.com/image/?i=animal-porn">animal porn</a>, <a href="http://theoi.com/image/?i=animal-sex">animal sex</a>, <a href="http://theoi.com/image/?i=beastiality">beastiality</a>, <a href="http://ugobe.com/media/?i=dvd-covers">dvd covers</a>, <a href="http://ugobe.com/media/?i=dvd-ripper">dvd ripper</a>, <a href="http://ugobe.com/media/?i=psp-downloads">psp downloads</a>, <a href="http://ugobe.com/media/?i=psp-games">psp games</a>, <a href="http://ugobe.com/media/?i=psp-movies">psp movies</a>

Not pretty stuff–lots of text about rape and animal porn. In case you’re wondering, that JavaScript at the beginning produces the string “<div style='display:none'>”, which hides the entire section of spammy junk. So talkorigins.org has these porn words and spammy links, and it’s all hidden via sneaky JavaScript.
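
For anyone who wants to check that decoding for themselves, here is a minimal sketch in JavaScript. It is not from the original post: the helper name `decodeCharCodeCalls` and its regular expression are assumptions made up for illustration, and the pattern only matches calls whose arguments are literal numbers, like the one in the injected snippet.

```javascript
// Hypothetical helper (not from the original post): find literal
// String.fromCharCode(...) calls in a chunk of markup and return the
// strings those calls would produce.
function decodeCharCodeCalls(html) {
  // Matches only calls whose arguments are digits, commas and whitespace.
  const pattern = /String\.fromCharCode\(([\d,\s]+)\)/g;
  return [...html.matchAll(pattern)].map((match) => {
    const codes = match[1].split(",").map((n) => Number(n.trim()));
    // String.fromCharCode turns each UTF-16 code unit into its character.
    return String.fromCharCode(...codes);
  });
}

// The call copied from the injected snippet, including its line break.
const injected = `document.write(String.fromCharCode(60,100,105,118,32,115,116,121,108,101,61,39,100,
105,115,112,108,97,121,58,110,111,110,101,39,62))`;

const decoded = decodeCharCodeCalls(injected);
console.log(decoded[0]); // <div style='display:none'>
```

Running a check like this over your own page source is one rough way to see what an obfuscated `document.write` call is really emitting. It will not catch other obfuscation tricks.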

We have pretty good reason to believe that this site was hacked, but it’s still causing problems for regular users, so Google has to take action. Here’s what we do:
- By November 27th, the site was classified as hacked and spammy. We stopped showing it for user queries.
- By November 27th, we started flagging this site as penalized in Google’s webmaster console. I believe that Google is the only search engine that will confirm to webmasters that their site does have penalties. No, we don’t confirm penalties if we think it might clue in web spammers that they’ve been caught. But yes, we do try to confirm penalties if we think a site is legitimate or has been hacked. You can read more about how we confirm penalties in this previous post.

I hear a few people ask, “It’s nice that I can sign up for Google’s webmaster console and learn that Google penalized my site. But couldn’t Google have done more?” Well, it turns out that we did do more:
- By November 28th, we emailed multiple addresses at talkorigins.org to let them know exactly what happened. According to the records I’m looking at, we tried to email contact at talkorigins.org, info at talkorigins.org, support at talkorigins.org, and webmaster at talkorigins.org with a timestamp of 2006-11-28 14:24:15. Here’s an excerpt from the email that we sent:

    Dear site owner or webmaster of talkorigins.org,

    While we were indexing your webpages, we detected that some of your
    pages were using techniques that were outside our quality guidelines,
    which can be found here: http://www.google.com/webmasters/guidelines.html
    In order to preserve the quality of our search engine, we have
    temporarily removed some webpages from our search results. Currently
    pages from talkorigins.org are scheduled to be removed for at least 60 days.

    Specifically, we detected the following practices on your webpages:

    * The following hidden text on talkorigins.org:

    e.g.
    animal porn, animal sex, beastiality, rape sex, sleeping sex, animal porn, beastiality, dog porn, horse porn, rape sex, sleeping sex, animal porn, animal sex, beastiality, dvd covers, dvd ripper, psp downloads, psp games, psp movies
    …

    We would prefer to have your pages in Google’s index. If you wish to be
    reincluded, please correct or remove all pages that are outside our
    quality guidelines. When you are ready, please visit:

    https://www.google.com/webmasters/sitemaps/reinclusion?hl=en

    to learn more and request a reinclusion request.
    …

You can read more about how we try to email webmasters about issues on their site in this previous post. According to his post, Wesley did a reinclusion request recently, and I’ve confirmed that the reinclusion request was approved, so I expect talkorigins.org to be back in Google within 24-48 hours.

But let’s take a step back. This site was hacked and stuffed with a bunch of hidden spammy porn words and links. Google detected the spam in less than 10 days; that’s faster than the site owner noticed it. We temporarily removed the site from our index so that users wouldn’t get the spammy porn back in response to queries. We made it possible for the webmaster to verify that their site was penalized. Then we emailed the site, with the exact page and the exact text that was causing problems. We provided a link to the correct place for the site owner to request reinclusion. We also made the penalty for a relatively short time (60 days), so that if the webmaster fixed the issue but didn’t contact Google, they would still be fine after a few weeks.

Ultimately, each site owner is responsible for making sure that their site isn’t spammy. If you pick a bad search engine optimizer (SEO) and they make a ton of spammy doorway pages on your domain, Google still needs to take action. Hacked sites are no different: lots of spammy/hacked sites will try to install malware on users’ computers. If your site is hacked and turns spammy, Google may need to remove your site, but we will also try to alert you via our webmaster console and even by emailing you to let you know what happened. To the best of my knowledge, no other search engine confirms any penalties to sites, nor do they email site owners.

Wesley and anyone else who works on talkorigins.org, I’m sorry that this was a stressful experience for you. Could Google do a better job? Absolutely, and we’ll keep working on it. For example, maybe we can show a more specific message for hacked sites in the webmaster console. Google could also try to identify better email addresses when writing to site owners. For example, for talkorigins.org, there are email addresses such as “archive@” and “submissions@” that we could have used instead that might have reached the right person. I’m open to other suggestions too. But please give Google a little bit of credit, because I do think we’re doing more to alert webmasters to issues than any other search engine.

* Good to know that Google doesn't just cut you off without any warning. Make sure your webmaster email account is active! And go sign up for Google's webmaster console.

sacx13
« Reply #1 on: December 07, 2006, 01:26:02 PM »

Quote
You can read more about how we try to email webmasters about issues on their site in this previous post. According to his post, Wesley did a reinclusion request recently, and I’ve confirmed that the reinclusion request was approved, so I expect talkorigins.org to be back in Google within 24-48 hours.

I think it will take more than 10 days to be back in the index.

Regards

Darksat
« Reply #2 on: December 07, 2006, 05:25:58 PM »

It's still cool that they let you know.
solidghost
« Reply #3 on: December 08, 2006, 03:45:20 AM »

Quote
You can read more about how we try to email webmasters about issues on their site in this previous post. According to his post, Wesley did a reinclusion request recently, and I’ve confirmed that the reinclusion request was approved, so I expect talkorigins.org to be back in Google within 24-48 hours.

I think it will take more than 10 days to be back in the index.

Regards

Ya, usually.
I think this is an exception.

Gracia
« Reply #4 on: March 30, 2007, 10:21:50 AM »

Matt Cutts wrote an excellent blog post called “How Google handles hacked sites”. This is very interesting to me, since I had a similar situation with a client in the past.

solidghost
« Reply #5 on: April 22, 2007, 12:02:51 PM »

That article is from Matt Cutts's blog, Gracia.