Meetup Recap 8/20/2014

Thanks again to everyone who came out to our Developers meetup on Thursday night. Despite the absence of Ben and Andy for this one, we managed to cover a couple of interesting topics that I hope you found helpful.

Google Analytics and filtering out Spammy Results.

When reading reports in Google Analytics you’ll probably notice that you’re getting a lot of traffic from websites like:

floating-share-buttons.com
get-free-social-traffic.com
buttons-for-website.com
darodar.com
chinese-amezon.com
and a whole lot more that you’ve never heard of.

Check out this screenshot of some of the traffic being reported on the website of a local commercial design firm:

Screen Shot 08-22-15 at 01.13 AM

At a quick glance you might say 1091 sessions isn’t that bad for a local commercial design firm. Unfortunately, all of this referral traffic (including the 633 sessions from floating-share-buttons.com) is spoofed: the results are created by bots that simply submit information packets to Google’s servers using your analytics ID.

Why the hell would they do that?!

I’ve found that there are different motivations for each spammer. Some of them do it to generate leads, some to drive affiliate traffic, and others to earn ad revenue by increasing traffic. Not clear on how they generate traffic? Well, if you are looking at your report and see that floating-share-buttons.com is forwarding such a large amount of traffic to your site, do you think you might be motivated to visit the site to see who they are and how your business is being represented there? Of course you would (or at least you would have before this awesome blog post).

Ok, so again, most of these are not real visits; the spammers are simply using your analytics ID to submit information packets to Google. A good tell-tale sign of this is incomplete or (not set) fields showing up in your report. Here’s an example.

If you set the secondary dimension on your referral report to display the Hostname

Screen Shot 08-22-15 at 01.35 AM

You’ll see that for most of the referrers the Hostname is not set.

Screen Shot 08-22-15 at 01.39 AM

This is actually good news for us, because we can use these shortcuts to filter out the junk and get our reports showing us realistic results. So let’s do that and get rid of these things.

There are two parts to this. First we will build filters to prevent these results from showing up in our future reports. Those filters do not work retroactively, though, so I’ll also show you a strategy for stripping the junk out of any reports covering dates before we made this change.

Step 1: Let’s build a couple of filters.

In Google Analytics, go to your Admin section and select Filters.

Screen Shot 08-22-15 at 01.48 AM

Click on the button shown here (Screen Shot 08-22-15 at 01.49 AM) and enter the following information:

Filter Name:  Exclude common spam

Filter Type: Custom

In the Exclude section

Filter Field: Campaign source

Filter Pattern:  (Copy and paste the following) **Last updated 8/22/15**

darodar\.|semalt\.|buttons-for.*?website|blackhatworth|ilovevitaly|prodvigator|cenokos\.|ranksonic\.|adcash\.|share.?buttons\.|social.?buttons\.|hulfingtonpost\.|free.*traffic|buy-cheap-online|-seo|seo-|videos-for

Screen Shot 08-22-15 at 02.01 AM

Next we can see how this would affect our current results by clicking the “Verify this filter” link under the Filter Verification section.

Screen Shot 08-22-15 at 02.03 AM

You should see something like this, showing you that if the filter had been running already, it would have eliminated these referrers from the report.

Screen Shot 08-22-15 at 02.05 AM
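If you’d like to sanity-check the pattern outside of Google Analytics, here’s a minimal sketch in Python. It only approximates what the filter does: Python’s regex engine is not the one Google uses, the case-insensitive flag is my assumption about the filter’s default behavior, and the function name `is_spam_source` is just something I made up for illustration. The pattern itself is copied from above.

```python
import re

# Pattern copied verbatim from the "Exclude common spam" filter above.
SPAM_PATTERN = (
    r"darodar\.|semalt\.|buttons-for.*?website|blackhatworth|ilovevitaly|"
    r"prodvigator|cenokos\.|ranksonic\.|adcash\.|share.?buttons\.|"
    r"social.?buttons\.|hulfingtonpost\.|free.*traffic|buy-cheap-online|"
    r"-seo|seo-|videos-for"
)

# Assumption: the filter matches case-insensitively.
SPAM_RE = re.compile(SPAM_PATTERN, re.IGNORECASE)


def is_spam_source(source: str) -> bool:
    """Return True if the pattern matches anywhere in the source name."""
    return SPAM_RE.search(source) is not None


# Referrers mentioned in this post, plus one ordinary source for contrast.
sources = [
    "floating-share-buttons.com",
    "get-free-social-traffic.com",
    "buttons-for-website.com",
    "darodar.com",
    "chinese-amezon.com",
    "success-seo.com",
    "google.com",
]

for source in sources:
    label = "excluded" if is_spam_source(source) else "kept"
    print(f"{source}: {label}")
```

Running this, chinese-amezon.com is kept because nothing in the pattern matches it. That’s fine; the spam filter is only the first line of defense, and the Hostname filter we build next is meant to catch most of the ghost referrals this one misses.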

That’s it for this one. Click save and check back here periodically for updates to the filter.

Next we’ll build a filter that only allows results that include our Hostname. This will eliminate a majority of the ghost referrals.

Once again create a new filter and fill it out as follows:

Filter Name:  Only include hostname
Filter Type: Custom
*Go down and choose the “Include” radio button
Filter Field: Hostname
Filter Pattern: Insert your website here. For this example it will be: www\.zdesigninc\.com|zdesigninc\.com
**Notice that these filters must be written as regular expressions, which means you have to escape special characters like the ‘.’ by placing a backslash in front of them. In this example I’m including two versions of the domain. You can add as many as you’d like, as long as you separate them with a pipe character (|).

Screen Shot 08-22-15 at 02.18 AM
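If escaping by hand makes you nervous, a few lines of Python can build the pattern for you. This is just a sketch: `build_hostname_pattern` and `is_valid_hostname` are helper names I invented, and the case-insensitive match is an assumption on my part. Swap in your own domains.

```python
import re


def build_hostname_pattern(domains):
    """Escape each domain and join them with the pipe character."""
    return "|".join(re.escape(domain) for domain in domains)


pattern = build_hostname_pattern(["www.zdesigninc.com", "zdesigninc.com"])
print(pattern)  # www\.zdesigninc\.com|zdesigninc\.com

# Assumption: hostnames are compared case-insensitively.
hostname_re = re.compile(pattern, re.IGNORECASE)


def is_valid_hostname(hostname):
    """Return True if the hostname matches one of our own domains."""
    return hostname_re.search(hostname) is not None


print(is_valid_hostname("www.zdesigninc.com"))  # True
print(is_valid_hostname("(not set)"))           # False

# Why escaping matters: an unescaped '.' matches any character.
print(bool(re.search("zdesigninc.com", "zdesignincXcom")))    # True
print(bool(re.search(r"zdesigninc\.com", "zdesignincXcom")))  # False
```

The printed pattern is what you would paste into the Filter Pattern box.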

Once again verify the filter and you should see something like this:

Screen Shot 08-22-15 at 02.19 AM

WHOA!!! That’s more like it. Look at all that garbage that will no longer be skewing our results. Make sure you save the filter and from this point on, your analytics reports will be much cleaner and you will have actionable data that provides real insights into your business’ web traffic.

Step 2: Advanced filtering on existing reports

Remember, the new filters won’t affect our existing and historical data; they only take effect going forward. Does this mean that you are out of luck when looking at past reports? Of course not! You’ll only need to apply some on-the-fly advanced filters that are similar to the ones we just created. Let’s do that.

First make sure that you add a secondary dimension of Hostname (just like we did earlier) on the report you want to view

Screen Shot 08-22-15 at 01.35 AM Screen Shot 08-22-15 at 01.39 AM

Next click on the advanced link on the upper right header (just below the graph).

Screen Shot 08-22-15 at 02.30 AM

Now we can add some filters. Let’s start with Hostname since it will have the greatest impact.

Make sure that include is selected and in the Add Dimension box type Hostname.

Select “Containing” from the dropdown list (it should be the default) and then type your domain name into the box. (This should be in a normal format: do not escape any characters, and only add one domain per query.)

Screen Shot 08-22-15 at 02.37 AM

If you hit apply here you’ll see that a large number of bad results have now been filtered out of the report, but you’ll likely have some stuff that still needs to be removed. Unfortunately I am unaware of an easy way to do this without doing them individually. (If you know a better way, please share)

In my example I still have referrers like success-seo.com, buttons-for-website.com, etc. that I don’t want to see. Here’s how to get rid of them. I’ll demonstrate with success-seo.com, which seems to be the biggest culprit at the moment.

Screen Shot 08-22-15 at 02.45 AM

Go back up and click on the “edit” link that has replaced the “advanced” link in the top right header just beneath the graph. Your query should appear again, ready for more conditions.

Screen Shot 08-22-15 at 02.48 AM

Now click the “Add a dimension or metric” button. Then in the search box type “Source” and click the dimension to add it to the query.

In the text box next to “Containing”, start typing the name of the source you want to remove, and when it pops up in the results, select it to add it to the query. In my example I started typing “success” and success-seo.com came up, so I added it. Next, select EXCLUDE from the dropdown on the left side.

Screen Shot 08-22-15 at 02.55 AM

Now if you click apply you should see the result has been filtered out.

Screen Shot 08-22-15 at 02.56 AM

VOILA!!!!! Simply do this for each of the results that you don’t want to see and you’ll have a report that more accurately represents the traffic you’re getting and where it is actually coming from. The good news is that with the filters we set earlier, you won’t have to do this on your reports going forward.
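If you export a report and would rather clean it up in a script than click through each exclusion, the same two conditions are easy to sketch in Python. To be clear, this is not a Google Analytics feature: `clean_report` is a name I made up, the row layout is my assumption about what an export might look like, and apart from the 633 sessions mentioned earlier the numbers are invented purely for illustration.

```python
def clean_report(rows, hostname_contains, excluded_sources):
    """Keep rows whose hostname contains our domain and whose source
    does not contain any of the excluded names."""
    kept = []
    for row in rows:
        # Include condition: Hostname containing our domain.
        if hostname_contains.lower() not in row["hostname"].lower():
            continue
        # Exclude conditions: Source containing any unwanted name.
        source = row["source"].lower()
        if any(bad.lower() in source for bad in excluded_sources):
            continue
        kept.append(row)
    return kept


# Illustrative rows only; session counts other than 633 are made up.
rows = [
    {"source": "floating-share-buttons.com", "hostname": "(not set)", "sessions": 633},
    {"source": "success-seo.com", "hostname": "www.zdesigninc.com", "sessions": 12},
    {"source": "buttons-for-website.com", "hostname": "zdesigninc.com", "sessions": 7},
    {"source": "google.com", "hostname": "www.zdesigninc.com", "sessions": 40},
]

cleaned = clean_report(
    rows,
    hostname_contains="zdesigninc.com",
    excluded_sources=["success-seo.com", "buttons-for-website.com"],
)

for row in cleaned:
    print(row["source"], row["sessions"])
```

The Hostname condition knocks out the (not set) row, and the two Source exclusions remove the spammers that managed to report a real hostname, which mirrors the include-then-exclude order we used in the advanced filter.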

Again, I hope this was helpful. We also discussed different ways of adding the analytics script to your WordPress site, but this post was already a little lengthy, so I’ll summarize what we discussed on that topic in another post. As always, if you have any questions or feedback, please leave them in the comments or reach out to me directly. You can find my contact info at about.me/ronbrennan. I look forward to hearing your thoughts.