Topic
Posts:
2
April 02, 2013
1
upvotes

Duplicate content on collection pages

The SEO company we are working with is very concerned about the amount of duplicate content on the collection/tag pages that Shopify creates dynamically.

The following pages show exactly the same products and description content:

http://www.highoctanesport.co.uk/collections/mens-ski-jackets
http://www.highoctanesport.co.uk/collections/mens-ski-jackets/mens
http://www.highoctanesport.co.uk/collections/mens-ski-jackets/mens+ski
http://www.highoctanesport.co.uk/collections/mens-ski-jackets/mens+ski+jacket

The following pages show different products but the description content is duplicated:

http://www.highoctanesport.co.uk/collections/ski-snowboard
http://www.highoctanesport.co.uk/collections/ski-snowboard/mens
http://www.highoctanesport.co.uk/collections/ski-snowboard/mens+jacket
http://www.highoctanesport.co.uk/collections/ski-snowboard/goggles
http://www.highoctanesport.co.uk/collections/ski-snowboard/mens+goggles

And there are many other URLs using the same description.

Is there a way of inserting <meta name="robots" content="noindex,follow" /> onto every one of these tag-filtered collection pages, leaving just the parent collection page to be indexed by search engines?

Failing that, is there a way of removing the description content from all of these extra tag-filtered pages, except the original collection page?
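
For the second option, here is a minimal sketch. It assumes the theme prints the description with {{ collection.description }} somewhere in collection.liquid; the wrapper div and its class name are hypothetical, and the real markup varies by theme. Wrapping that output in an unless current_tags check should leave the description on the parent collection page only:

```liquid
{% comment %}
  Assumed location: collection.liquid (depends on the theme).
  current_tags is only set when the collection is filtered by a tag,
  so the description is skipped on every tag-filtered view.
  The div and its class name are placeholders, not part of any theme.
{% endcomment %}
{% unless current_tags %}
  <div class="collection-description">
    {{ collection.description }}
  </div>
{% endunless %}
```

This only removes the visible description text. It does not change the meta description in the head, which a theme usually sets separately.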

Replies
Posts:
5829
Last edited April 05, 2013
3
upvotes

The solution is to add the following code to your theme.liquid file inside the head element:

{% if template contains 'collection' and current_tags %}
<meta name="robots" content="noindex" />
<link rel="canonical" href="{{ shop.url }}{{ collection.url }}" />
{% else %}
<link rel="canonical" href="{{ canonical_url }}" />
{% endif %} 

Make sure to get rid of the following, as the above code will replace it:

<link rel="canonical" href="{{ canonical_url }}" />

Caroline from http://11heavens.com ∴ mllegeorgesand AT gmail DOT com
Posts:
2
April 07, 2013

Thanks Caroline, that works perfectly!

On a similar OCD-SEO note, is there any way of having rel="next" and rel="prev" tags on paginated collection pages, as explained here:

http://googlewebmastercentral.blogspot.co.uk/2011/09/pagination-with-relnext-and-relprev.html

Posts:
5829
April 07, 2013

On a similar OCD-SEO note

Ah ah ah, you funny man!

Is there any way of having rel="next" and rel="prev" tags on paginated collection pages.

Sure thing. I am writing some code now, but I have a question. If one is on this page:

http://www.highoctanesport.co.uk/collections/triathlon/mens

... and there is a page 2 to that (there is):

http://www.highoctanesport.co.uk/collections/triathlon/mens?page=2

On page 1, should the next meta tag point to: 

http://www.highoctanesport.co.uk/collections/triathlon?page=2

Or:

http://www.highoctanesport.co.uk/collections/triathlon/mens?page=2

From our OCD-SEO point of view, we don't want collection pages filtered by tags to be indexed, hence the previous code.

Either way, it's problematic, actually.

If using http://www.highoctanesport.co.uk/collections/triathlon/mens?page=2 then the browser hits a page with a canonical URL that points to page 1 of http://www.highoctanesport.co.uk/collections/triathlon.

If using http://www.highoctanesport.co.uk/collections/triathlon?page=2 then you're not really pointing to the next page after page 1, but to the second page of the unfiltered view. In the filtered view, page 3 may not have a 'next' page while the unfiltered view may have one. Do you see the clusterf##k I am describing?

Caroline from http://11heavens.com ∴ mllegeorgesand AT gmail DOT com
Posts:
5829
April 07, 2013

I was overthinking it. On filtered view pages, we should not even worry about prev and next meta tags. Let me write some code for collection pages that are not filtered with tags.

Caroline from http://11heavens.com ∴ mllegeorgesand AT gmail DOT com
Posts:
5829
Last edited April 07, 2013

Something like this should work — the code is a bit brutal:

{% if template contains 'collection' %}

  {% unless current_tags %}

    {% assign number_of_products_per_page = 12 %}

    {% comment %}
    If we have more products in this collection than we show per page.
    {% endcomment %}
  
    {% if collection.all_products_count > number_of_products_per_page %}

      {% assign previous_page = current_page | minus: 1 %}
      <!-- PREVIOUS PAGE: {{ previous_page }} -->
      {% assign next_page = current_page | plus: 1 %}
      <!-- NEXT PAGE: {{ next_page }} -->

      {% if previous_page > 0 %}
      <link rel="prev" href="{{ shop.url }}/collections/{{ collection.handle }}?page={{ previous_page }}" />
      {% endif %}

      {% assign number_of_products_covered = current_page | times: number_of_products_per_page %}
      {% if number_of_products_covered < collection.all_products_count %}
      <link rel="next" href="{{ shop.url }}/collections/{{ collection.handle }}?page={{ next_page }}" />
      {% endif %}
  
    {% endif %}
  
  {% endunless %}

{% endif %}

The challenge here was to do without all the paginate properties, because the paginate tag is not yet opened in the head of theme.liquid. All we have is the global variable current_page.
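
As an illustration of working with current_page alone, here is a hedged sketch that derives a page count from collection.all_products_count. The total_pages variable is my own name, not something Shopify provides, and the sketch assumes number_of_products_per_page matches the number used in the theme's paginate tag. It is meant to sit inside the same collection and current_tags checks as the code above:

```liquid
{% comment %}
  Assumption: 12 matches the per-page value in the theme's paginate tag.
  Adding (per page - 1) before the integer division rounds up,
  e.g. 30 products at 12 per page: (30 + 12 - 1) / 12 = 3 pages.
  total_pages is a hypothetical helper variable.
{% endcomment %}
{% assign number_of_products_per_page = 12 %}
{% assign total_pages = collection.all_products_count | plus: number_of_products_per_page | minus: 1 | divided_by: number_of_products_per_page %}

{% if current_page < total_pages %}
  <link rel="next" href="{{ shop.url }}/collections/{{ collection.handle }}?page={{ current_page | plus: 1 }}" />
{% endif %}
```

The check current_page < total_pages is equivalent to the number_of_products_covered comparison above. It is just another way to phrase the same test.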

Caroline from http://11heavens.com ∴ mllegeorgesand AT gmail DOT com
Posts:
5829
April 07, 2013
1
upvotes

The global current_page variable is documented here: http://wiki.shopify.com/Current_page

Caroline from http://11heavens.com ∴ mllegeorgesand AT gmail DOT com
Posts:
169
June 10, 2013

Hi Caroline - this is really great. I would swear I got hit by Google's Penguin update due to these types of page duplications (the pages that had more 'copies' due to Shopify got hit harder in rankings).

The solution above works great, except when there is no collection handle

eg

collections/types?page=2&q=Statues+and+Busts

Here is what is output in this example:

<link rel="canonical" href="http://www.redhottoys.co.uk/collections/types?page=2&q=Statues+and+Busts" />

<!-- PREVIOUS PAGE: 1 --> <!-- NEXT PAGE: 3 -->

<link rel="prev" href="http://www.redhottoys.co.uk/collections/?page=1" />
 

Is there a way to easily solve this?
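
One possible workaround, offered as a sketch rather than a confirmed fix: skip the prev/next links whenever collection.handle is blank. This assumes the handle really is empty on /collections/types (and similar) pages, which the "/collections/?page=1" output above suggests. The rest is the prev/next code posted earlier in the thread, lightly condensed:

```liquid
{% comment %}
  Assumption: collection.handle is blank on /collections/types pages,
  so the extra condition leaves those pages without prev/next links
  instead of printing links to /collections/?page=N.
{% endcomment %}
{% if template contains 'collection' and collection.handle != blank %}
  {% unless current_tags %}
    {% assign number_of_products_per_page = 12 %}
    {% assign previous_page = current_page | minus: 1 %}
    {% assign next_page = current_page | plus: 1 %}
    {% assign number_of_products_covered = current_page | times: number_of_products_per_page %}

    {% if previous_page > 0 %}
      <link rel="prev" href="{{ shop.url }}/collections/{{ collection.handle }}?page={{ previous_page }}" />
    {% endif %}
    {% if number_of_products_covered < collection.all_products_count %}
      <link rel="next" href="{{ shop.url }}/collections/{{ collection.handle }}?page={{ next_page }}" />
    {% endif %}
  {% endunless %}
{% endif %}
```

The trade-off is that type and vendor listings get no prev/next links at all. That seems safer than links pointing at a URL that does not match the page being viewed.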

Thanks for your help

Posts:
17
June 28, 2013

Thank you Caroline, that bit of code is exactly what I was looking for:


{% if template contains 'collection' and current_tags %}
<meta name="robots" content="noindex" />
<link rel="canonical" href="{{ shop.url }}{{ collection.url }}" />
{% else %}
<link rel="canonical" href="{{ canonical_url }}" />
{% endif %}
 

Does it matter where it is positioned within the head section? Please forgive my ignorance. I know it should not sit inside any existing if/endif statement, but I would greatly appreciate it if you could tell me the best place to put it.

Thanks in advance,

James.

logz05 Member
Posts:
1
October 10, 2013

Thanks Caroline -  have been looking for this.

Posts:
1
9 months ago

Hi - I added 

<meta name="robots" content="noindex,follow" />

just after the  

{% if previous_page > 0 %} 

in order to stop Google indexing Collection pages other than the first page, which has resulted in Webmaster Tools flagging duplicate meta descriptions for the Collection pages.

Can anyone tell me if they think this will prevent the duplicate warnings and is it a good idea/good way to do it? 
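
For reference, the same effect can be sketched as a standalone check on current_page, independent of the prev/next block. Whether noindexing paginated pages is a good idea is a separate judgment call; this only shows the mechanics, under the assumption that it goes in the head of theme.liquid next to the earlier snippets:

```liquid
{% comment %}
  current_page > 1 is the same test as previous_page > 0.
  Tag-filtered views are skipped here because the canonical snippet
  earlier in the thread already gives them a robots noindex tag.
{% endcomment %}
{% if template contains 'collection' %}
  {% unless current_tags %}
    {% if current_page > 1 %}
      <meta name="robots" content="noindex,follow" />
    {% endif %}
  {% endunless %}
{% endif %}
```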

Holly Member
Posts:
26
9 months ago

Good questions Halfdan!

I would like to know also:

What is the best fix for preventing paginated collection pages from being crawled?

Is there any fix to prevent search filter pages from being crawled? I thought robots.txt was stopping that. Confused!

I have been trying to implement this snippet someone came up with as a pagination fix, but I am not sure if it is the best fix or if I am doing it correctly.

{% comment %}
  The paginate object is only defined inside a paginate block.
  canonical, prev and next are link elements, not meta elements.
{% endcomment %}
{% if paginate.pages > 1 %}
  {% assign current_url = collection.handle %}
  {% if paginate.current_page == 1 %}
  <link rel="canonical" href="/collections/{{ current_url }}" />
  {% endif %}

  {% if paginate.current_page > 1 %}
  <meta name="robots" content="noindex, follow" />
  {% endif %}

  {% if paginate.previous %}
  <link rel="prev" href="{{ paginate.previous.url }}" />
  {% endif %}
  {% if paginate.next %}
  <link rel="next" href="{{ paginate.next.url }}" />
  {% endif %}
{% endif %}

 

Posts:
545
4 months ago

Would providing a sitemap for search engines be better?

<link rel="sitemap" type="application/xml" title="Sitemap" href="{{ shop.url }}/sitemap.xml">

... then they wouldn't have to scrape the entire front end of the site and get erroneous results. </thinking>

I'm a million different people
Posts:
2
about 1 month ago

Hi Caroline

I'm having the same issue with duplicate content due to tags within my collections. I took your advice and inserted the following code into the head element of my theme.liquid:


{% if template contains 'collection' and current_tags %}
<meta name="robots" content="noindex" />
<link rel="canonical" href="{{ shop.url }}{{ collection.url }}" />
{% else %}
<link rel="canonical" href="{{ canonical_url }}" />
{% endif %}
 

However, an HTML Improvements check through Google Webmaster is still showing that I have multiple pages with duplicate Meta Descriptions, for example:
www.milktooth.com.au/collections/wear-accessorise/art
www.milktooth.com.au/collections/wear-accessorise/bears
www.milktooth.com.au/collections/wear-accessorise/bib
www.milktooth.com.au/collections/wear-accessorise/bibs-&-booties
www.milktooth.com.au/collections/wear-accessorise/bibs
www.milktooth.com.au/collections/wear-accessorise/boys
www.milktooth.com.au/collections/wear-accessorise/bracelets
www.milktooth.com.au/collections/wear-accessorise/dance
www.milktooth.com.au/collections/wear-accessorise/dinosaurs
www.milktooth.com.au/collections/wear-accessorise/elephants

....and the list goes on.

I'm concerned, because the HTML Improvements list was last generated 6 November, and I inserted your code almost a week ago and I have also since uploaded my sitemap, so I was hoping to see no duplicate Meta Descriptions.

Do you have any idea why the code might not have worked for my website? Or do I need to wait a bit longer for Google to update things?

Thank you so much for any advice you can provide!!

Katherine
