Robots.txt, the White House, Iraq and My Old Website

Even now, 6 months after moving my blog from http://users.chariot.net.au/~jktaheny/blogger/blog.htm to http://taheny.com, Google still ranks my old address above this one for most searches. That is annoying, because I want visitors to come here and read ‘Joe’s up-to-date Ramblings’.

After some contemplation, today I added a robots.txt file to my old website. In theory, these two directives should stop all search engine robots from crawling the old blog:

User-agent: *
Disallow: /
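
As a quick sanity check (my addition, not something from the original robots.txt advice), Python's standard-library robotparser can confirm that those two lines refuse every user agent access to every path:

from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
# Parse the two directives directly rather than fetching them over HTTP.
rp.parse(["User-agent: *", "Disallow: /"])

# Every crawler, Googlebot included, should be refused every path.
print(rp.can_fetch("Googlebot", "/blogger/blog.htm"))  # False
print(rp.can_fetch("*", "/"))                          # False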


If the directives work and the search robots (including Google) stop visiting my old blog, I hope and expect it to drop out of the search rankings and be replaced by my current blog. We will see.

While looking for robots.txt advice, I came across the White House's robots.txt file, which runs to more than 2,000 lines. A file that long is not unusual for a large website (whitehouse.gov has over 600,000 pages).

What I did find strange was that almost all the paths listed in the file were Iraq-related. Here is a random screenshot:

I know Iraq has been a major issue, but surely most of the White House's web pages are not Iraq-related. If so, why does the White House not want so many of its Iraq pages spidered? Are they embarrassed by the mess they have made?
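
Out of curiosity, here is a short sketch (mine, not from the original post) that fetches a site's robots.txt and counts how many of its Disallow rules mention Iraq. The whitehouse.gov file has been changed many times since this was written, so treat the URL and any counts as illustrative only:

import urllib.request

# Assumed URL; the file's contents have changed since this post was written.
url = "https://www.whitehouse.gov/robots.txt"
with urllib.request.urlopen(url) as response:
    lines = response.read().decode("utf-8", errors="replace").splitlines()

# Count the Disallow rules, and how many of them mention "iraq".
disallows = [line for line in lines if line.lower().startswith("disallow")]
iraq_rules = [line for line in disallows if "iraq" in line.lower()]
print(f"{len(disallows)} Disallow rules, {len(iraq_rules)} mention 'iraq'")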

UPDATE: The White House/robots.txt/Iraq issue has been covered many times before.
