Sunday, June 26, 2011

SEO Tip 8: Tell the Search Engines What to Index


I may take a lot of heat from other SEOs for this one, especially because Google and the other search engines have already done a lot to reduce the amount of duplicate content they index. However, I run enough search queries that begin with "site:" to know that duplicate content is still a major issue. Worse, I see a lot of files showing up in the indexes that should be hidden from the world (case in point: all the free PDFs you're probably still downloading from SEO Tip #7).

Optimizing Your robots.txt File

By far the easiest of these top 10 SEO tips is to include a robots.txt file at the root of your website. Open a text editor such as Notepad, type "User-agent: *" on one line and "Disallow:" (with nothing after it) on the next, then save the file as robots.txt and upload it to the root directory of your domain. Those two lines tell any spider that hits your website to "please feel free to crawl every page of my website".
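For reference, that minimal allow-everything file is just two lines (the # line is an optional comment):

# Allow every crawler to index every page
User-agent: *
Disallow: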

Hopefully, you've already moved all that excess JavaScript and CSS into their own folders on your website to reduce the file size and load time of your pages. If you have, adding a simple "Disallow: /js/" to your robots.txt will tell the crawlers not to bother with the files in the JS folder and to focus on your content instead of unimportant source code. Here's an example of a robots.txt file like the one on this website:
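A sketch of such a file, assuming your scripts and styles live in /js/ and /css/ folders (adjust the paths to match your own site):

# Crawl the content, skip the supporting source files
User-agent: *
Disallow: /js/
Disallow: /css/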

Redirecting Duplicate Content

For consistency, it's better to have one version of each page collect all the inbound links and earn all of the points with the search engines. This means telling Google and Bing (in their respective Webmaster Tools) to index only the www version of your website (or the non-www version, if you're "one of those types of people"). You can also use IIS on your Windows Server, or an .htaccess file on your Apache server, to permanently redirect one version to the other.
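On Apache, a minimal .htaccess sketch that 301-redirects the non-www version to the www version looks like this (example.com is a placeholder for your own domain):

# Permanently redirect example.com to www.example.com
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]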

Next, add a new tag to every page of your website to prevent other versions of the page from appearing in the search results. Just think about all the different ways we display content: there are often "Print View" pages, "Flash Version" pages, and pages with reviews, ratings, and comments that append strings such as &rating=5 or &view=print to the URL. To correct this issue, we add a canonical tag to every page of the website. Here's the syntax:
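The tag goes in the <head> of each page and points at your preferred URL (the address below is a placeholder):

<link rel="canonical" href="http://www.example.com/page.html" />

No matter which variant of the page a visitor or crawler lands on, the tag tells the search engines which single URL should get the credit.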

Finally, you should round up all those extra domains you've bought and make sure they point to your one main website with a 301 permanent redirect. Bruce Clay created a way to do this efficiently, which he called an IP Funnel. I've been the victim of this so many times as an SEO expert. More than once, I've found myself scratching my head trying to figure out why a website wouldn't get Google PageRank™, only to find out later that an older domain held by the client had been displaying the same content and was the one Google gave the credit to.
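On Apache, a single line in the .htaccess file of each old domain is enough to funnel it to the main site (again, the URL is a placeholder):

# Send every request on this old domain to the main website
Redirect 301 / http://www.example.com/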
