HOW TO FIND DUPLICATE CONTENT IN YOUR WEBSITE
By Lucid Softech IT Solutions | Social Networking,
30 Mar 2016
Duplicate content is content that appears in more than one place on the internet. It may be identical, i.e. two or more places with exactly matching content, or it may be largely similar.
If multiple pages on your site contain duplicate content, Google may penalize your website and not display it in the search engine results.
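To make the difference between exactly matching and largely similar content concrete, here is a minimal Python sketch that scores two pieces of text by how many three-word sequences (shingles) they share. The function names and the shingle size of 3 are illustrative assumptions; search engines do not publish how they measure similarity, so treat this only as a rough way to compare your own pages.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences (shingles) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=3):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)
```

A score of 1.0 means the two texts match exactly after lowercasing and removing punctuation, and a score close to 1.0 suggests that one page is a light rewording of the other.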
Types of duplicate content:
There are two types of duplicate content:
- Malicious duplicate content: This refers to intentional duplication of content in order to rank higher in search results. It is not considered good practice, because the same content ends up spread across different websites, which results in a poor user experience. If search engines such as Google, Yahoo, or MSN detect this kind of duplication, the website may be “blacklisted” from the search results.
- Non-malicious duplicate content: This means variations of the same web page, such as a normal web version and a mobile version, or any item that is accessible through different URLs.
How to avoid duplicate content:
- You can include a copyright notice on every web page of your site in order to reduce the chances of your content being copied by anyone else.
- If people can reach your website or a web page in two or more ways, i.e. through different URLs (e.g. http://lucidsoftech.com/home and http://home.lucidsoftech.com), choose one of these URLs as your preferred URL and use 301 redirects to send traffic from all the other URLs to it.
- If you have to host someone else's content and it is likely to be treated as duplicate content, it is advisable to add a robots meta tag with a noindex value to the head section of that web page, which asks search engines not to index it.
- If you are publishing on a topic that must cover exactly the same points as content already on the web, rephrase it instead of copying it as is, and give users new, fresh, and interesting content to read.
- If two or more pages on your site have a considerable amount of similar content, combine them into a single web page, or keep the matching content on one page and only the unique content on the other pages.
- Before accepting new guest posts on your blog, check them for duplication, as plagiarism can lead to major penalties even for reputable websites.
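The preferred URL and 301 redirect advice in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the host names, the choice of https, and the function names preferred_url and redirect_for are hypothetical, and a real site would normally configure redirects in its web server or CMS rather than in a script like this.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed values for illustration only; replace them with your own site.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"
ALTERNATE_HOSTS = {"example.com", "home.example.com"}


def preferred_url(url):
    """Map any known variant of a page URL to the single preferred URL."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host != PREFERRED_HOST and host not in ALTERNATE_HOSTS:
        return None  # the URL does not belong to this site
    # Treat "/home" and "/home/" as the same page.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) if url should be redirected, otherwise None."""
    target = preferred_url(url)
    if target is None or target == url:
        return None
    return (301, target)
```

For example, redirect_for("http://example.com/home/") returns (301, "https://www.example.com/home"), while the preferred URL itself returns None because it needs no redirect.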
HOW TO FIND DUPLICATE CONTENT:
There are many tools available that will tell you whether your website has a duplicate content issue and help you address it. Some of these tools are:
- Google Webmaster Tools: Google Webmaster Tools can help you identify and reduce duplicate content on your site. If two or more URLs point to the same web page, search engines can get confused about which one should actually rank. Google Webmaster Tools lets you tell Google your preferred domain, that is, which version of your website is the true (canonical) one.
- Screaming Frog: The Screaming Frog SEO Spider is a small desktop program that you can easily install on your PC. To check your site, enter your website address in the tool and click the Start button. It will crawl your site and, after some time, display a full report of the analysis. It can help you find duplicate content on your blog, and once the tool has identified it you can remove or correct it.
  - Click the “Page Titles” tab and select “Duplicate” in the filter. It lists pages that share a title, which often points to duplicate content; you can then analyze those pages and correct them.
  - Click the “Page Titles” or “Meta Description” tab and select “Duplicate” in the filter to list duplicate page titles or meta descriptions.
  - Click the “URL” tab and select “Duplicate” in the filter to get information about pages that are reachable through multiple URLs.
- Siteliner: Enter the URL of your website and click “Go” to check for duplicate content and broken links. The tool generates a detailed report with information about duplicate content, broken links, and skipped pages. Under Site Details, click Duplicate Content to get an overview of the URLs, titles, match words, match pages, and match percentage.
- Virante Duplicate Content Checker: Enter your website domain and this tool will scan your site for internal duplication, if any. It performs a Google cache check, a www versus non-www check, a 404 check, and a PR dispersion check, and it also checks the headers returned by both versions of the URL and the supplemental pages in the Google index.
- Xenu: A tool for checking broken links. You can also check for identical titles by going through its results table: launch Xenu's Link Sleuth, go to File, click Check URL, and click OK. It will start crawling the URLs. When it has finished, save the file and export it to MS Excel, then analyze it for duplicate content issues.
- SmallSeoTools: This tool checks for plagiarism and tells you how original your content is. Copy the content you want to check, such as a blog post, and paste it into the yellow box on the tool. Type the captcha code and click “Check for Plagiarism.” The copied phrases will be marked in red. Click the highlighted text to see the source.
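If you would like to see roughly what a duplicate report from tools like these involves, the following Python sketch groups pages that share a title or have identical visible text. It is not how any of the tools above is implemented; it assumes you already have the HTML of your pages in a dictionary, and the names PageText and find_duplicates are made up for this example.

```python
import hashlib
from collections import defaultdict
from html.parser import HTMLParser


class PageText(HTMLParser):
    """Collect the <title> and the visible text of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.text = []
        self._in_title = False
        self._skip = 0  # depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip:
            self.text.append(data)


def find_duplicates(pages):
    """pages maps URL -> HTML. Return (duplicate_titles, duplicate_bodies)."""
    by_title, by_body = defaultdict(list), defaultdict(list)
    for url, html in pages.items():
        parser = PageText()
        parser.feed(html)
        parser.close()
        # Collapse whitespace and ignore case before comparing.
        title = " ".join(parser.title.split()).lower()
        body = " ".join(" ".join(parser.text).split()).lower()
        if title:
            by_title[title].append(url)
        by_body[hashlib.sha256(body.encode("utf-8")).hexdigest()].append(url)
    dup_titles = {t: urls for t, urls in by_title.items() if len(urls) > 1}
    dup_bodies = [urls for urls in by_body.values() if len(urls) > 1]
    return dup_titles, dup_bodies
```

The first result maps each repeated title to the URLs that use it, and the second lists groups of URLs whose visible text is exactly the same. Pages that are only similar, not identical, will not be grouped by this sketch.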
Closing Remarks
Duplicate content is a widespread problem on the internet. Your website will tend to rank higher if you maintain original, quality content on your site.
Never try to create duplicate content by intentionally copying from other websites. If Google or any other search engine detects it, you will be penalized: either your ranking will drop or your site might be removed from the Google index completely.
The tools discussed in this article can help you avoid such unfavorable circumstances and overcome these issues.