This post is part of Made @ HubSpot, an internal thought-leadership series in which we extract lessons from experiments conducted by our own HubSpotters.
Have you ever tried to carry your clean laundry upstairs by hand, only to have things keep falling out of the huge bundle of clothes in your arms? Growing a website's organic traffic can feel a lot like that.
Your content calendar is full of new ideas, but for every new page you publish, an older page slips down the search rankings.
Getting SEO traffic is hard, but keeping SEO traffic is a different ball game. Content tends to “decay” over time as competitors publish new content, search engine algorithms change, or for any number of other reasons.
You struggle to move the whole site forward, while traffic quietly leaks away behind you if you're not paying attention.
Recently, the two of us (Alex Birkett and Braden Becker) developed a way to spot this traffic loss before it even happens, automatically and at scale.
The problem with traffic growth
At HubSpot, we increase our organic traffic by taking two trips from the laundry room instead of one.
The first trip carries new content, targeting new keywords we don't yet rank for.
The second trip carries updated content: we use part of our editorial calendar to find the content that is losing the most traffic (and leads) and reinforce it with fresh material and SEO-minded tweaks that better serve specific keywords. It's a concept we (and many marketers) call “historical optimization.”
However, there is a problem with this growth strategy.
As our website grows, keeping track of each and every page becomes difficult, and choosing the right pages to update is even harder.
Last year, we wondered whether there was a way to find blog posts whose organic traffic was merely at risk of declining, so we could diversify our selection of updates and, ideally, keep traffic more stable as our blog grows.
Recovering traffic versus protecting traffic
Before we get into how odd it sounds to recover traffic that hasn't been lost yet, let's look at the benefits of doing so.
When you look at a page's performance, a drop in traffic is easy to spot. For most growth-minded marketers, that downward trend line is hard to ignore, and nothing is more satisfying than turning it back around.
But recovering traffic comes at a cost: because you can't know where you'll lose traffic until you've lost it, the time between the decline and the recovery sacrifices leads, demos, free users, subscribers, or whatever growth metric you derive from your most engaged visitors.
You can see this in the organic traffic trend below for a single blog post. Even though the traffic was eventually recovered, the dip still meant missed opportunities to support sales efforts downstream.
If you could find and protect (or even grow) a page's traffic before it needs to be recovered, you wouldn't have to make the sacrifice shown in the chart above. The question is: how do we do it?
How to predict falling traffic
To our delight, we didn't need a crystal ball to predict traffic decay. What we did need was SEO data suggesting that the traffic to certain blog posts might soon say goodbye, even though nothing looked wrong yet. (We also had to write a script that could pull this data for the entire website; more on that in a minute.)
High keyword rankings are what drive organic traffic to a website, and the lion's share of that traffic goes to the pages lucky enough to rank on the first page of results. The reward is even greater for keywords that get an especially high number of searches per month.
If a blog post slips off the first page of Google for one of those high-volume keywords, it's toast.
Given the relationship between keywords, search volume, ranking position, and organic traffic, we knew this was where we'd spot the beginnings of a traffic loss.
And luckily, the SEO tools at our disposal can show us how those rankings change over time:
The image above shows a table of keywords that a single blog post ranks for. For one of those keywords, the post sits at position 14 (page 1 of Google consists of positions 1-10). The red boxes highlight that ranking and the keyword's sizable volume of 40,000 monthly searches.
Even sadder than this post's position 14 is how it got there.
As the blue-green trend line above shows, this blog post was once a top-ranking result, but it slipped again and again over the following weeks. The post's traffic confirmed what we were seeing: a noticeable drop in organic pageviews shortly after it fell off page 1 for that keyword.
You can see where this is going. We wanted to catch these drops while posts were still on the verge of leaving page 1, and protect the traffic we were at risk of losing. And we wanted to do it automatically, for dozens of blog posts at once.
The “At Risk” traffic tool
The functionality of the “At Risk” tool is actually fairly simple. We thought about it in three parts:
- Where do we get our input data from?
- How do we clean it?
- What outputs from this data help us make better decisions when optimizing content?
First, where do we get the data from?
1. Keyword data from SEMrush
What we wanted was keyword research data at the property level: all of the keywords that hubspot.com ranks for, and blog.hubspot.com in particular, along with all of the related data for each of those keywords.
The fields most valuable to us are our current search ranking, our previous search ranking, the keyword's monthly search volume, and, where available, the keyword's value (estimated by keyword difficulty or CPC).
To get this data, we used the SEMrush API (specifically, the Domain Organic Search Keywords report):
Using R, a programming language popular with statisticians and analysts as well as marketers (specifically, the “httr” library for working with APIs), we then retrieved the top 10,000 keywords driving traffic to blog.hubspot.com (as well as our Spanish, German, French, and Portuguese properties). We currently do this once a quarter.
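As a rough illustration only, a minimal version of that request might look like the sketch below. It assumes the SEMrush Analytics API's domain_organic report, an API key stored in an environment variable, and a handful of export columns (keyword, current and previous position, search volume, CPC, ranking URL); our production script differs, so treat the parameters as assumptions to check against the SEMrush API documentation.

```r
# Minimal sketch (not our production script): fetch organic keyword data
# for a domain from the SEMrush Analytics API with httr.
library(httr)

resp <- GET(
  "https://api.semrush.com/",
  query = list(
    type           = "domain_organic",              # Domain Organic Search Keywords report
    key            = Sys.getenv("SEMRUSH_API_KEY"), # hypothetical env var holding the API key
    domain         = "blog.hubspot.com",
    database       = "us",
    display_limit  = 10000,
    # Ph = keyword, Po = current position, Pp = previous position,
    # Nq = monthly search volume, Cp = CPC, Ur = ranking URL
    export_columns = "Ph,Po,Pp,Nq,Cp,Ur"
  )
)

# The report comes back as semicolon-delimited text
raw_text <- content(resp, as = "text", encoding = "UTF-8")
```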
That's a lot of raw data, and on its own it isn't very useful. So we need to clean it up and get it into a format we can actually work with.
So how do we clean the data and build formulas that tell us which content to update?
2. Clean up the data and create the formulas
We do most of the data cleaning in that same R script. Before the data ever hits another storage layer (whether a spreadsheet or a database table), most of it has already been cleaned and formatted the way we want it.
We do this with a few short lines of code:
In the code above, we take the 10,000 rows of keyword data returned by the API, parse the response into something readable, and build it into a data table. We then subtract the current ranking from the previous ranking to get the ranking difference (if we previously ranked 4th and now rank 9th, the difference is -5). Finally, we filter so that only rows with a negative ranking difference remain (that is, only keywords where we lost ranking, not ones we gained or that stayed the same).
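A hedged sketch of that cleaning step, picking up the raw_text response from the earlier snippet and assuming the data.table package and the column order requested above, might look like this:

```r
# Sketch: parse the semicolon-delimited report, compute the ranking
# difference, and keep only the keywords that lost ground.
library(data.table)

keywords <- fread(
  text      = raw_text,   # response body from the API call above
  sep       = ";",
  header    = TRUE,
  col.names = c("keyword", "position", "previous_position",
                "search_volume", "cpc", "url")
)

# Previous rank minus current rank: ranked 4th before, 9th now -> -5
keywords[, position_change := previous_position - position]

# Keep only keywords where we lost ranking (negative change)
at_risk_candidates <- keywords[position_change < 0]
```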
We then send this cleaned and filtered data table to Google Sheets, where we apply tons of custom formulas and conditional formatting.
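We won't claim this is exactly how our script makes that hand-off, but as one way to do it, the googlesheets4 package can push a data frame into a spreadsheet in a couple of lines (the spreadsheet ID and tab name below are placeholders):

```r
# Sketch: write the filtered table to a Google Sheet for further analysis.
library(googlesheets4)

gs4_auth()  # authenticate with a Google account

sheet_write(
  at_risk_candidates,
  ss    = "YOUR-SPREADSHEET-ID",  # placeholder spreadsheet ID
  sheet = "at_risk_keywords"      # placeholder tab name
)
```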
Finally, we needed to know: what are the outputs, and how do we actually use them to make content optimization decisions?
3. The tool's outputs for at-risk content: how we make decisions
Based on the input columns (keyword, current position, historical position, position difference, and monthly search volume) and the formulas above, we calculate a categorical variable as the output. Each URL/row can be labeled one of the following:
- “AT RISK”
- “Volatile”
- Empty (no value)
Empty outputs, i.e. rows with no value, mean we can essentially ignore those URLs for now. They haven't lost any significant ranking, or they were already on page 2 of Google.
“Volatile” means the page is losing rank but isn't yet old enough to warrant action. New pages naturally bounce around in the rankings as they age; at a certain point, they build enough topical authority to generally hold their position for a while. For content that supports a product launch or another important marketing campaign, we may still give these posts some TLC while they mature, so it's worth flagging them.
“At risk” is mainly what we're looking for: blog posts published more than six months ago that have dropped in ranking and now sit in positions 8-10 for a high-volume keyword. We treat this as the “red zone,” where content is fewer than three positions away from falling from page 1 to page 2 of Google.
The spreadsheet formula for these three tags is shown below; it's basically a compound IF statement that checks for page 1 rankings, a negative ranking difference, and how far the publish date is from the current day.
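We won't reproduce the exact formula here, but a simplified sketch of that kind of compound IF could look like the following, assuming column B holds the current position, D the ranking difference, E the monthly search volume, and F the publish date (the column letters and the 1,000-search and 180-day thresholds are placeholders, not our real cutoffs):

```
=IF(AND(B2>=8, B2<=10, D2<0, E2>=1000, F2<=TODAY()-180), "AT RISK",
   IF(AND(B2<=10, D2<0, F2>TODAY()-180), "Volatile", ""))
```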
What we learned
In short, it works! The tool described above has earned a regular, if not frequent, place in our workflow. That said, not every preemptive update saves the traffic in time. In the example below, a blog post fell off page 1 after an update and only later climbed back to a higher position.
And that’s okay.
We have no control over when or how often Google chooses to recrawl and re-rank a page.
Of course, you can resubmit the URL to Google and request a recrawl (an extra step that may be worth it for critical or time-sensitive content). But the goal is to minimize the time this content underperforms and stop the bleeding, even if that means the speed of recovery is partly left to chance.
And while you never really know exactly how many page views, leads, signups, or subscribers each page would have lost, the precautions you take now save you the time you'd otherwise spend figuring out why your site's traffic suddenly nosedived last week.