http://ekstreme.com/thingsofsorts/seosem/googlebot-requested-a-css-file
A recent SEO Refugee thread brought up the subject of whether Google (and the SE bots in general) check CSS files. Testing that is easy: download your log files and search for all requests to the CSS file(s) coming from Googlebot. I usually get a few hits every time I do this (every couple of months or so), but on closer inspection, the hits have always been from a spoofed user agent string, where some clown browses the web pretending to be Googlebot. This is easy enough to accomplish, for example, using Firefox and the user agent switcher extension.
Just now, I decided to check again, and one hit actually did come from a Googlebot IP address. The exact line from the log file is:
66.249.72.52 - - [24/Oct/2006:17:17:35 -0500] "GET /global/x.css HTTP/1.1" 200 8382 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
Sure enough, the requesting IP address really does belong to the Google IP block. Also notice that there is no HTTP referer header set, as you’d expect. As far as I am aware, this is the first time anyone has spotted Googlebot requesting a CSS file.
Digging deeper, I tried to find more requests to CSS files. The one requested (x.css) is the main CSS file for ekstreme.com. There is another stylesheet for the Socializer. I couldn’t find any Googlebot requests for that. I also checked my other sites, and I couldn’t find any requests there either. In short, this is the only CSS file request by Googlebot I could find.
Incidentally, that same Googlebot requested just over 3000 pages from eKstreme.com that day.
What does this mean? If Google is now interested in CSS files, could it also be interested in discovering hidden text? That would be interesting.
So can you please check your logs? I’m sure eKstreme.com is not special enough to be the only site whose CSS file got crawled. This is how I check my logs: I use grep (Windows users get Unix Utils) with the following command:
grep -F "x.css" logfile.txt > css-hits.txt
This picks up all requests to the CSS file. Make sure you replace ‘x.css’ with your CSS file’s name! Next, we fish out all hits that mention Google. To avoid any case sensitivity issues, we simply search for ‘oogle’:
grep -F "oogle" css-hits.txt > oogle-css.txt
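If you prefer a single step, the two commands can be chained into one pipeline. This is only a sketch: the function name css_googlebot_hits is made up for illustration, and it swaps the 'oogle' trick for grep's -i flag, which ignores case altogether.

```shell
# Hypothetical helper (not part of the original commands): print every log
# line that requests the given CSS file and mentions Google in any letter case.
# Usage: css_googlebot_hits LOGFILE CSSFILE
css_googlebot_hits() {
    # -F treats the pattern as a fixed string, so the dot in "x.css" is literal.
    # -i ignores case, so "Googlebot", "googlebot" and "GOOGLEBOT" all match.
    grep -F -- "$2" "$1" | grep -i -F "google"
}

# Example: css_googlebot_hits logfile.txt x.css > oogle-css.txt
```

The output is the same kind of list as oogle-css.txt, so the manual inspection step still applies.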
Now open up oogle-css.txt and look at each hit individually. You’ll usually have a few dozen so it won’t be that hard. For any hits claiming to be Googlebot, check the IP address to see if it is part of Google’s IP block or not.
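One way to do the IP check without eyeballing address ranges is a reverse DNS lookup followed by a forward lookup, which is the verification Google itself recommends. The sketch below is an assumption about how you might script it, not what I ran for this post: the function names hit_ips and verify_googlebot_ip are invented, and it assumes the host utility is installed and that your log lines start with the client IP.

```shell
# Hypothetical helper: list the unique client IPs in a file of log hits.
# Assumes the IP is the first whitespace-separated field on each line.
hit_ips() {
    awk '{ print $1 }' "$1" | sort -u
}

# Hypothetical helper: succeed only if the IP's reverse DNS name is under
# googlebot.com or google.com AND that name resolves back to the same IP.
# Needs the `host` command and a working network connection.
verify_googlebot_ip() {
    ip=$1
    # Reverse lookup: take the name from the "domain name pointer" line.
    name=$(host "$ip" | awk '/domain name pointer/ { print $NF; exit }')
    name=${name%.}   # drop the trailing dot of the fully qualified name
    case "$name" in
        *.googlebot.com|*.google.com) ;;
        *) return 1 ;;   # no PTR record, or a name outside Google's domains
    esac
    # Forward lookup: the name must map back to the IP we started with.
    host "$name" | awk -v ip="$ip" \
        '$(NF-1) == "address" && $NF == ip { found = 1 } END { exit !found }'
}

# Example:
# for ip in $(hit_ips oogle-css.txt); do
#     if verify_googlebot_ip "$ip"; then echo "$ip genuine"; else echo "$ip spoofed"; fi
# done
```

The forward lookup matters because anyone who controls the reverse DNS for their own IP range can make it claim a googlebot.com name; only Google can make that name resolve back to the address.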