Make Sure Your Site is Indexed
Search for your domain name in Google (a site:example.com query lists the pages Google has indexed). Has Google indexed all your site's pages? If not, check Google Webmaster Tools to find out why some of your pages are missing, and take the recommended steps.
If a search for your domain name in Google returns none of your website's pages, it could be a sign that Google has penalized you for some reason. Again, check the Manual Penalty section of Google Webmaster Tools to confirm this.
The Google penalty could be against specific pages in your site. Or it could be against your entire site. In general, the following behavior could get you into trouble with Google:
- Site with auto-generated content: In such a site, everything you read is created by a program.
- Site with thin affiliate content: This site carries product descriptions, reviews, etc. which are copied from the original merchant without adding any value.
- Site with scraped content: Here, content from other sites is copied and pasted using software.
- Site with doorway pages: In such a site, each page serves as a door leading to another site, which is the final destination.
- Site with unnatural links: This site is suspected to have either bought hundreds of links that point to the site or made deals for exchanging links on a large scale.
- Site with unnatural outgoing links: This site is believed to have provided links to hundreds of other sites for money.
- Site with hacked pages: The search engine thinks a third party has hacked into the site and modified its files.
- Site which spams: Such a site works as a spam-creation engine.
- Site which does cloaking and sneaky redirects: For the same URL, the site shows different pages to users and search engines.
- Site with hidden text and keywords: This site hides keyword-stuffed text from users (for example, white text on a white background) while keeping it visible to search engines.
Check for Duplicate Content
Copy a snippet of text from your home page. Do a phrase match search in Google by surrounding the text with double quotes. Study the results. Is Google returning more than one page from your site with the same text? If so, you are using the same content on more than one page. Avoid this and ensure all your pages have unique content.
Sometimes, the search results could return the same text from other sites. This could mean one of two things: either you are being copied, or you are copying them.
If you are being copied, contact the offending site and ask the webmaster to remove the content. If you are doing the copying, rewrite the text in your own words. Repeat the phrase match search with text samples from each of your pages. Avoid duplication. Google doesn't like it.
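Before running phrase match searches page by page, you can do a quick local check by fingerprinting each page's text and flagging pages that collapse to the same content. A minimal sketch in Python; the page paths and text below are hypothetical:

```python
import hashlib

def find_duplicate_pages(pages):
    """Group page URLs by a fingerprint of their normalized body text.

    `pages` maps a page path to its text. Paths whose text collapses to
    the same fingerprint are reported as duplicates.
    """
    seen = {}
    for url, text in pages.items():
        # Normalize whitespace and case so trivial formatting differences
        # don't hide a copy-and-paste duplicate.
        normalized = " ".join(text.lower().split())
        fingerprint = hashlib.sha256(normalized.encode()).hexdigest()
        seen.setdefault(fingerprint, []).append(url)
    return [urls for urls in seen.values() if len(urls) > 1]

# Hypothetical pages: /landing is a lightly reformatted copy of /home.
pages = {
    "/home": "Welcome to our store, the best shoes online.",
    "/about": "We have been selling shoes since 1999.",
    "/landing": "Welcome to our store,  the BEST shoes online.",
}
print(find_duplicate_pages(pages))  # [['/home', '/landing']]
```

This only catches exact copies after normalization; near-duplicates with reworded sentences still need the manual phrase match check described above.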
Check for Canonicalization
Sometimes the same web page can be loaded from different URLs. To illustrate, consider the website example.com.
It could be reached from:
- http://example.com
- http://www.example.com
- http://example.com/index.html
- http://www.example.com/index.html
This is a problem because, though each of the above URLs reaches the same page, search engines treat them as separate addresses and split the value of incoming links between them. So every site should decide on one preferred version and stick to it.
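To illustrate, here is a small Python sketch that maps common URL variants of example.com to a single canonical form. The normalization rules are hypothetical and for demonstration only; in practice, canonicalization is a policy decision — you pick one preferred version and redirect the others to it:

```python
from urllib.parse import urlsplit

def canonical_form(url):
    """Reduce common variants of the same page to one canonical string.

    Illustrative rules only: drop the scheme, strip a leading "www.",
    and fold "index.html" into the directory path.
    """
    parts = urlsplit(url.lower())
    host = parts.netloc
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path or "/"
    if path.endswith(("/index.html", "/index.htm")):
        path = path[: path.rfind("/") + 1]  # keep the trailing slash
    return host + path

variants = [
    "http://example.com",
    "http://www.example.com",
    "http://example.com/index.html",
    "http://www.example.com/index.html",
]
print({canonical_form(u) for u in variants})  # {'example.com/'}
```

All four variants collapse to one form, which is exactly what you want search engines (and your redirects) to enforce.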
Using the Correct Redirect
Your web page URLs can change, and the site itself can move to a new address, so visitors who reach the old address have to be guided to the new URL. The correct server redirect code to use is 301 ("Moved Permanently") in almost all situations, because it passes on nearly all of the link value of the original URL to the new URL. Make sure you use 301 redirects for all your pages that have moved.
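The rule can be sketched as a simple lookup that a server-side handler might consult. The paths below are hypothetical; the point is that a moved path gets status 301 plus its new location, while everything else is served normally:

```python
# Hypothetical mapping of moved pages to their new homes.
MOVED = {
    "/old-shoes": "/shoes/running",
    "/contact.php": "/contact",
}

def redirect_for(path):
    """Return the (status, location) a server should send for a path.

    301 ("Moved Permanently") tells search engines to transfer the old
    URL's link value to the new one; a temporary 302 would not.
    """
    if path in MOVED:
        return 301, MOVED[path]
    return 200, path

print(redirect_for("/old-shoes"))  # (301, '/shoes/running')
print(redirect_for("/contact"))   # (200, '/contact')
```

In production you would configure this in your web server (e.g. as rewrite rules) rather than in application code, but the status code is the part that matters to search engines.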
Meta Robots vs. Robots.txt
There are two ways of controlling how search engines index your site. The preferred way is the robots meta tag in the head element of each web page. This gives the website owner fine-grained, per-page control over how bot visits are handled.
Sometimes a robots.txt file is placed at the root of the domain to stop search engines from crawling pages. Use this option only when thousands of URLs have to be excluded at once. On the flip side, a page blocked in robots.txt is never crawled, which takes away the search engines' ability to value the links within it. Never exclude the same content using both the robots.txt file and the meta robots tag: if robots.txt blocks a page, Google never fetches it, so it never sees the meta tag at all. As a general rule, websites should prefer the meta robots tag and reserve robots.txt for large-scale exclusion of pages.
Hierarchy and Parameters in URLs
While creating your URL structure, it is good to use a category-based hierarchy. The idea is that users should understand what a page is about without even visiting it, merely by looking at the URL, e.g. a hypothetical path like www.example.com/shoes/running/ tells the reader at a glance that the page is about running shoes.
Sometimes Content Management Systems force websites to have URLs with symbols like '?' and '&', called parameters. It is better to use no more than two such parameters per URL and to rewrite the URL to keep it simple.
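A quick way to audit this is to count query parameters with the standard library. The URLs below are made up for illustration — one parameter-heavy CMS URL against its rewritten, clean equivalent:

```python
from urllib.parse import urlsplit, parse_qs

def parameter_count(url):
    """Count the query-string parameters (the '?key=value&...' parts)."""
    return len(parse_qs(urlsplit(url).query))

# Hypothetical CMS URL vs. a rewritten category-based URL.
ugly = "http://www.example.com/view?cat=12&item=909&sess=ab3&ref=home"
clean = "http://www.example.com/shoes/running/nimbus"

print(parameter_count(ugly))   # 4 -- well over the two-parameter guideline
print(parameter_count(clean))  # 0
```

Running a check like this over your site's URL list quickly surfaces the pages that exceed the two-parameter guideline and are candidates for URL rewriting.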
Two Sets of Site Maps
Include two sets of site maps. The regular sitemap page, made up of HTML links, is a road map for visitors and increases usability. Additionally, add a sitemap as an XML file at the root of your domain, e.g. www.example.com/sitemap.xml. This second sitemap helps search bots index your site.
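A minimal XML sitemap in the sitemaps.org format can be generated with the standard library. This sketch emits only the required <loc> element per URL; real sitemaps often also carry <lastmod>, <changefreq>, and <priority>. The URLs reuse the document's example domain:

```python
import xml.etree.ElementTree as ET

def build_sitemap(urls):
    """Build a minimal sitemap.xml body from a list of page URLs."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for url in urls:
        # Each page gets a <url> entry with a required <loc> child.
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = url
    return ET.tostring(urlset, encoding="unicode")

xml_body = build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/shoes/",
])
print(xml_body)
```

The resulting string would be saved as sitemap.xml at the site root and, typically, referenced from robots.txt or submitted through Google Webmaster Tools.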
The Three-Click Rule
A visitor to your site should be able to complete her interaction in three clicks. A sale should be completed in three clicks, and for those looking for information, the third click should open the contact page or an online enquiry form.
Different Language Versions
Consider whether you need to publish your site in other languages. Find out the volume of traffic coming from countries where English is not the main language. If it is high, you could publish your site in another language. Once done, register with the local Google site.
More Than One Route to Discovery
Provide different paths for site visitors to discover things in your site: index pages, search boxes, site maps, etc. Let them have the pleasure of finding out on their own.