Monday, October 16, 2017

Google Shares Details About the Technology Behind Googlebot

Posted by goralewicz

Crawling and indexing have been a hot topic over the last few years. As soon as Google launched Google Panda, people rushed to their server logs and crawling stats and began fixing their index bloat. All those problems didn’t exist in the “SEO = backlinks” era from a few years ago. With this exponential growth of technical SEO, we need to get more and more technical. That being said, we still don’t know exactly how Google crawls our websites. Many SEOs still can’t tell the difference between crawling and indexing.

The biggest problem, though, is that when we want to troubleshoot indexing problems, the only tool in our arsenal is Google Search Console and its Fetch and Render tool. Once your website includes more than HTML and CSS, a lot of guesswork goes into how your content will be indexed by Google. This approach is risky, expensive, and can fail multiple times. Even when you discover the pieces of your website that weren’t indexed properly, it's extremely difficult to get to the bottom of the problem and find the fragments of code responsible for the indexing problems.

Fortunately, this is about to change. Recently, Ilya Grigorik from Google shared one of the most valuable insights into how crawlers work:

Interestingly, this tweet didn’t get nearly as much attention as I would expect.

So what does Ilya’s revelation in this tweet mean for SEOs?

Knowing that Chrome 41 is the technology behind the Web Rendering Service (WRS) is a game-changer. Before this announcement, our only solution was to use Fetch and Render in Google Search Console to see our page rendered by the WRS. Now we can troubleshoot technical problems that would otherwise have required experimenting and creating staging environments: all you need to do is download and install Chrome 41 to see how your website loads in the browser. That’s it.

You can check the features and capabilities that Chrome 41 supports by visiting Caniuse.com or Chromestatus.com (Googlebot should support similar features). These two websites make a developer’s life much easier.

Even though we don’t know exactly which version Ilya had in mind, we can find the Chrome version used by the WRS by looking at the server logs. It’s Chrome 41.0.2272.118.
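As a rough illustration of that log check, here is a minimal Python sketch that pulls the Chrome version out of user-agent strings that identify themselves as Googlebot. The sample log lines, the IP addresses, and the function name are invented for the example, and real log formats may differ. Keep in mind that user-agent strings can be spoofed, so treat the result as a hint rather than proof.

```python
import re
from collections import Counter

# Matches the "Chrome/<version>" token inside a user-agent string.
CHROME_VERSION = re.compile(r"Chrome/(\d+(?:\.\d+)*)")

def googlebot_chrome_versions(log_lines):
    """Count the Chrome versions seen in log lines that mention Googlebot."""
    versions = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue  # skip requests that do not claim to be Googlebot
        match = CHROME_VERSION.search(line)
        if match:
            versions[match.group(1)] += 1
    return versions

# Hypothetical access-log lines, made up for this sketch.
sample = [
    '192.0.2.1 - - [16/Oct/2017:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html) Chrome/41.0.2272.118 Safari/537.36"',
    '203.0.113.7 - - [16/Oct/2017:10:00:05 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64) Chrome/61.0.3163.100 Safari/537.36"',
]

print(googlebot_chrome_versions(sample))
```

Only the first sample line is counted, because the second one does not mention Googlebot.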

It will be updated sometime in the future

Chrome 41 was created two years ago (in 2015), so it’s far removed from the current version of the browser. However, as Ilya Grigorik said, an update is coming:

I was lucky enough to get Ilya Grigorik to read this article before it was published, and he provided a ton of valuable feedback on this topic. He mentioned that they are hoping to have the WRS updated by 2018. Fingers crossed!

Google uses Chrome 41 for rendering. What does that mean?

We now have some interesting information about how Google renders websites. But what does that mean, practically, for site developers and their clients? Does this mean we can now ignore server-side rendering and deploy client-rendered, JavaScript-rich websites?

Not so fast. Here is what Ilya Grigorik had to say in response to this question:

We now know the WRS's capabilities for rendering JavaScript and how to debug them, which lets us troubleshoot and better diagnose problems. However, remember that not all crawlers support JavaScript: as of today, JavaScript crawling is only supported by Google and Ask (Ask is most likely powered by Google). Even if you don’t care about social media or search engines other than Google, one more thing to remember is that even with Chrome 41, not all JavaScript frameworks can be indexed by Google (read more about JavaScript frameworks crawling and indexing).

Don’t get your hopes up

All that said, there are a few reasons to keep your excitement at bay.

Remember that version 41 of Chrome is over two years old. It may not work very well with modern JavaScript frameworks. To test it yourself, open http://jsseo.expert/polymer/ using Chrome 41, and then open it in any up-to-date browser you are using.

The page in Chrome 41 looks like this:

The content parsed by Polymer is invisible (meaning it wasn’t processed correctly). This is also a perfect example for troubleshooting potential indexing issues. The problem you're seeing above can be solved if diagnosed properly. Let me quote Ilya:

"If you look at the raised Javascript error under the hood, the test page is throwing an error due to unsupported (in M41) ES6 syntax. You can test this yourself in M41, or use the debug snippet we provided in the blog post to log the error into the DOM to see it."

I believe this is another powerful tool for web developers willing to make their JavaScript websites indexable. We will definitely expand our experiment and work with Ilya’s feedback.

The Fetch and Render tool is the Chrome v. 41 preview

There's another interesting thing about Chrome 41. Google Search Console's Fetch and Render tool is simply the Chrome 41 preview. The right-hand view (“This is how a visitor to your website would have seen the page”) is generated by the Google Search Console bot, which is... Chrome 41.0.2272.118 (see screenshot below).


There's evidence that both Googlebot and Google Search Console Bot render pages using Chrome 41. Still, we don’t exactly know what the differences between them are. One noticeable difference is that the Google Search Console bot doesn’t respect the robots.txt file. There may be more, but for the time being, we're not able to point them out.

Chrome 41 vs Fetch as Google: A word of caution

Chrome 41 is a great tool for debugging Googlebot. However, sometimes (not often) there's a situation in which Chrome 41 renders a page properly, but the screenshots from Google Fetch and Render suggest that Google can’t handle the page. It could be caused by CSS animations and transitions, Googlebot timeouts, or the usage of features that Googlebot doesn’t support. Let me show you an example.

Chrome 41 preview:

Image blurred for privacy

The above page has quite a lot of content and images, but it looks completely different in Google Search Console.

Google Search Console preview for the same URL:

As you can see, Google Search Console’s preview of this URL is completely different than what you saw on the previous screenshot (Chrome 41). All the content is gone and all we can see is the search bar.

From what we've noticed, Google Search Console renders CSS a little differently than Chrome 41. This doesn’t happen often, but as with most tools, we need to double-check whenever possible.

This leads us to a question...

What features are supported by Googlebot and WRS?

According to the Rendering on Google Search guide:

  • Googlebot doesn't support IndexedDB, WebSQL, and WebGL.
  • HTTP cookies and local storage, as well as session storage, are cleared between page loads.
  • All features requiring user permissions (like Notifications API, clipboard, push, device-info) are disabled.
  • Google can’t index 3D and VR content.
  • Googlebot only supports HTTP/1.1 crawling.

The last point is really interesting. Despite statements from Google over the last 2 years, Google still only crawls using HTTP/1.1.

No HTTP/2 support (still)

We've mostly been covering how Googlebot uses Chrome, but there's another recent discovery to keep in mind.

There is still no support for HTTP/2 for Googlebot.

Since it's now clear that Googlebot doesn’t support HTTP/2, you can’t drop HTTP/1.1 optimization even if your website supports HTTP/2. Googlebot can crawl only using HTTP/1.1.

There were several announcements recently regarding Google’s HTTP/2 support. To read more about it, check out my HTTP/2 experiment here on the Moz Blog.

Via https://developers.google.com/search/docs/guides/r...

Googlebot’s future

Rumor has it that Chrome 59’s headless mode was created for Googlebot, or at least that it was discussed during the design process. It's hard to say if any of this chatter is true, but if it is, it means that to some extent, Googlebot will “see” the website in the same way as regular Internet users.

This would definitely make everything simpler for developers who wouldn’t have to worry about Googlebot’s ability to crawl even the most complex websites.

Chrome 41 vs. Googlebot’s crawling efficiency

Chrome 41 is a powerful tool for debugging JavaScript crawling and indexing. However, it's crucial not to jump on the hype train here and start launching websites that “pass the Chrome 41 test.”

Even if Googlebot can “see” our website, there are many other factors that will affect your site’s crawling efficiency. As an example, we already have proof showing that Googlebot can crawl and index JavaScript and many JavaScript frameworks. It doesn’t mean that JavaScript is great for SEO. I gathered significant evidence showing that JavaScript pages aren’t crawled even half as effectively as HTML-based pages.

In summary

Ilya Grigorik’s tweet sheds more light on how Google crawls pages and, thanks to that, we don’t have to build experiments for every feature we're testing — we can use Chrome 41 for debugging instead. This simple step will definitely save a lot of websites from indexing problems, like when Hulu.com’s JavaScript SEO backfired.

It's safe to assume that Chrome 41 will now be a part of every SEO’s toolset.


Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don't have time to hunt down but want to read!

Does Googlebot Support HTTP/2? Challenging Google's Indexing Claims – An Experiment

Posted by goralewicz

I was recently challenged with a question from a client, Robert, who runs a small PR firm and needed to optimize a client’s website. His question inspired me to run a small experiment in HTTP protocols. So what was Robert’s question? He asked...

Can Googlebot crawl using HTTP/2 protocols?

You may be asking yourself, why should I care about Robert and his HTTP protocols?

As a refresher, HTTP protocols are the basic set of standards allowing the World Wide Web to exchange information. They are the reason a web browser can display data stored on another server. The first was initiated back in 1989, which means, just like everything else, HTTP protocols are getting outdated. HTTP/2 is one of the latest versions of the HTTP protocol, created to replace these aging versions.

So, back to our question: why do you, as an SEO, care to know more about HTTP protocols? The short answer is that none of your SEO efforts matter or can even be done without a basic understanding of HTTP protocol. Robert knew that if his site wasn’t indexing correctly, his client would miss out on valuable web traffic from searches.

The hype around HTTP/2

HTTP/1.1 is a 17-year-old protocol (HTTP/1.0 is 21 years old). Both HTTP/1.0 and HTTP/1.1 have limitations, mostly related to performance. When HTTP/1.1 was getting too slow and out of date, Google introduced SPDY in 2009, which was the basis for HTTP/2. Side note: Starting from Chrome 53, Google decided to stop supporting SPDY in favor of HTTP/2.

HTTP/2 was a long-awaited protocol. Its main goal is to improve a website’s performance. It's currently used by 17% of websites (as of September 2017). Adoption rate is growing rapidly, as only 10% of websites were using HTTP/2 in January 2017. You can see the adoption rate charts here. HTTP/2 is getting more and more popular, and is widely supported by modern browsers (like Chrome or Firefox) and web servers (including Apache, Nginx, and IIS).

Its key advantages are:

  • Multiplexing: The ability to send multiple requests through a single TCP connection.
  • Server push: When a client requires some resource (let's say, an HTML document), a server can push CSS and JS files to a client cache. It reduces network latency and round-trips.
  • One connection per origin: With HTTP/2, only one connection is needed to load the website.
  • Stream prioritization: Requests (streams) are assigned a weight from 1 to 256 so that higher-priority resources are delivered faster.
  • Binary framing layer: HTTP/2 is easier to parse (for both the server and the client).
  • Header compression: This feature reduces overhead from plain text in HTTP/1.1 and improves performance.
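To make the binary framing and multiplexing points a bit more concrete, here is a minimal Python sketch of the 9-byte header that precedes every HTTP/2 frame, following the layout in the HTTP/2 specification (RFC 7540): a 24-bit payload length, an 8-bit type, 8-bit flags, and a reserved bit followed by a 31-bit stream identifier. The function names are illustrative, and this is not a working HTTP/2 implementation.

```python
import struct

HEADERS = 0x1      # frame type for HEADERS frames
END_HEADERS = 0x4  # flag marking the end of a header block

def pack_frame_header(length, frame_type, flags, stream_id):
    """Build the 9-byte header that precedes every HTTP/2 frame."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2**31:
        raise ValueError("stream id must fit in 31 bits")
    # 24-bit length, then 8-bit type, 8-bit flags, reserved bit + 31-bit stream id
    return length.to_bytes(3, "big") + struct.pack("!BBI", frame_type, flags, stream_id)

def unpack_frame_header(header):
    """Split a 9-byte HTTP/2 frame header into its fields."""
    if len(header) != 9:
        raise ValueError("an HTTP/2 frame header is exactly 9 bytes")
    length = int.from_bytes(header[:3], "big")
    frame_type, flags, raw_stream = struct.unpack("!BBI", header[3:])
    stream_id = raw_stream & 0x7FFFFFFF  # drop the reserved high bit
    return length, frame_type, flags, stream_id

# Frames that belong to different streams can share one TCP connection;
# the stream id in each header tells the receiver which request it is part of.
first = pack_frame_header(32, HEADERS, END_HEADERS, stream_id=1)
second = pack_frame_header(48, HEADERS, END_HEADERS, stream_id=3)
print(unpack_frame_header(first))
print(unpack_frame_header(second))
```

Because every frame carries its own stream identifier, a parser only needs fixed-width fields rather than text scanning, which is the sense in which the binary layer is easier to parse.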

For more information, I highly recommend reading “Introduction to HTTP/2” by Surma and Ilya Grigorik.

All these benefits suggest pushing for HTTP/2 support as soon as possible. However, my experience with technical SEO has taught me to double-check and experiment with solutions that might affect our SEO efforts.

So the question is: Does Googlebot support HTTP/2?

Google's promises

HTTP/2 represents a promised land, the technical SEO oasis everyone was searching for. By now, many websites have already added HTTP/2 support, and developers don’t want to optimize for HTTP/1.1 anymore. Before I could answer Robert’s question, I needed to know whether or not Googlebot supported HTTP/2-only crawling.

I was not alone in my query. This is a topic which comes up often on Twitter, Google Hangouts, and other such forums. And like Robert, I had clients pressing me for answers. The experiment needed to happen. Below I'll lay out exactly how we arrived at our answer, but here’s the spoiler: it doesn't. Google doesn’t crawl using the HTTP/2 protocol. If your website uses HTTP/2, you need to make sure you continue to optimize the HTTP/1.1 version for crawling purposes.

The question

It all started with a Google Hangouts in November 2015.

When asked about HTTP/2 support, John Mueller mentioned that HTTP/2-only crawling should be ready by early 2016, and he also mentioned that HTTP/2 would make it easier for Googlebot to crawl pages by bundling requests (images, JS, and CSS could be downloaded with a single bundled request).

"At the moment, Google doesn’t support HTTP/2-only crawling (...) We are working on that, I suspect it will be ready by the end of this year (2015) or early next year (2016) (...) One of the big advantages of HTTP/2 is that you can bundle requests, so if you are looking at a page and it has a bunch of embedded images, CSS, JavaScript files, theoretically you can make one request for all of those files and get everything together. So that would make it a little bit easier to crawl pages while we are rendering them for example."

Soon after, Twitter user Kai Spriestersbach also asked about HTTP/2 support:

His clients started dropping HTTP/1.1 connections optimization, just like most developers deploying HTTP/2, which was at the time supported by all major browsers.

After a few quiet months, Google Webmasters reignited the conversation, tweeting that Google won’t hold you back if you're setting up for HTTP/2. At this time, however, we still had no definitive word on HTTP/2-only crawling. Just because it won't hold you back doesn't mean it can handle it — which is why I decided to test the hypothesis.

The experiment

For months, as I was following this online debate, I still received questions from our clients who no longer wanted to spend money on HTTP/1.1 optimization. Thus, I decided to create a very simple (and bold) experiment.

I decided to disable HTTP/1.1 on my own website (https://goralewicz.com) and make it HTTP/2 only. I disabled HTTP/1.1 from March 7th until March 13th.

If you’re going to get bad news, at the very least it should come quickly. I didn’t have to wait long to see if my experiment “took.” Very shortly after disabling HTTP/1.1, I couldn’t fetch and render my website in Google Search Console; I was getting an error every time.

My website is fairly small, but I could clearly see that the crawling stats decreased after disabling HTTP/1.1. Google was no longer visiting my site.

While I could have kept going, I stopped the experiment after my website was partially de-indexed due to “Access Denied” errors.

The results

I didn’t need any more information; the proof was right there. Googlebot wasn’t supporting HTTP/2-only crawling. Should you choose to duplicate this at home with your own site, you’ll be happy to know that my site recovered very quickly.

I finally had Robert’s answer, but felt others may benefit from it as well. A few weeks after finishing my experiment, I decided to ask John about HTTP/2 crawling on Twitter and see what he had to say.

(I love that he responds.)

Knowing the results of my experiment, I have to agree with John: disabling HTTP/1 was a bad idea. However, I was seeing other developers discontinuing optimization for HTTP/1, which is why I wanted to test HTTP/2 on its own.

For those looking to run their own experiment, there are two ways of negotiating an HTTP/2 connection:

1. Over HTTP (unsecure) – Make an HTTP/1.1 request that includes an Upgrade header. This seems to be the method to which John Mueller was referring. However, it doesn't apply to my website (because it’s served via HTTPS). What is more, this is an old-fashioned way of negotiating, not supported by modern browsers. Below is a screenshot from Caniuse.com:

2. Over HTTPS (secure) – Connection is negotiated via the ALPN protocol (HTTP/1.1 is not involved in this process). This method is preferred and widely supported by modern browsers and servers.
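The two negotiation methods above can be sketched in Python. This is a minimal sketch under stated assumptions: the header names and the `h2c` / `h2` tokens come from the HTTP/2 specification (RFC 7540), while the function names, the single SETTINGS entry, and `example.com` are placeholders chosen for illustration.

```python
import base64
import socket
import ssl
import struct

def build_h2c_upgrade_request(host, path="/"):
    """Method 1: a cleartext HTTP/1.1 request that asks to upgrade to HTTP/2."""
    # One SETTINGS entry (illustrative): SETTINGS_MAX_CONCURRENT_STREAMS (0x3) = 100.
    settings_payload = struct.pack("!HI", 0x3, 100)
    # The payload is sent base64url-encoded, without trailing padding.
    token = base64.urlsafe_b64encode(settings_payload).rstrip(b"=").decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: Upgrade, HTTP2-Settings",
        "Upgrade: h2c",
        f"HTTP2-Settings: {token}",
        "",
        "",
    ]
    return "\r\n".join(lines)

def make_alpn_context(protocols=("h2", "http/1.1")):
    """Create a TLS client context that offers the given protocols via ALPN."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(list(protocols))
    return ctx

def negotiated_protocol(host, port=443, timeout=5.0):
    """Method 2: connect over TLS and return what ALPN selected (None if nothing)."""
    ctx = make_alpn_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()

print(build_h2c_upgrade_request("example.com"))
# negotiated_protocol("example.com") needs network access, so it is not called here.
```

Calling `negotiated_protocol` against your own HTTPS host should return `"h2"` when the server agrees to HTTP/2 and `"http/1.1"` or `None` otherwise. That tells you what your server offers to clients, not what Googlebot asks for.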

A recent announcement: The saga continues

Googlebot doesn’t make HTTP/2 requests

Fortunately, Ilya Grigorik, a web performance engineer at Google, let everyone peek behind the curtains at how Googlebot is crawling websites and the technology behind it:

If that wasn’t enough, Googlebot doesn't support the WebSocket protocol. That means your server can’t send resources to Googlebot before they are requested. Supporting it wouldn't reduce network latency and round-trips; it would simply slow everything down. Modern browsers offer many ways of loading content, including WebRTC, WebSockets, loading local content from drive, etc. However, Googlebot supports only HTTP/FTP, with or without Transport Layer Security (TLS).

Googlebot supports SPDY

During my research and after John Mueller’s feedback, I decided to consult an HTTP/2 expert. I contacted Peter Nikolow of Mobilio and asked him to see if there was anything we could do to find the final answer regarding Googlebot’s HTTP/2 support. Not only did he provide us with help, Peter even created an experiment for us to use. Its results are pretty straightforward: Googlebot does support the SPDY protocol and Next Protocol Negotiation (NPN). And thus, it can’t support HTTP/2.

Below is Peter’s response:


I performed an experiment that shows Googlebot uses SPDY protocol. Because it supports SPDY + NPN, it cannot support HTTP/2. There are many cons to continued support of SPDY:

  1. This protocol is vulnerable
  2. Google Chrome no longer supports SPDY in favor of HTTP/2
  3. Servers have been dropping support for SPDY. Let’s examine the NGINX example: from version 1.9.5, they no longer support SPDY.
  4. Apache doesn't support SPDY out of the box. You need to install mod_spdy, which is provided by Google.

To examine Googlebot and the protocols it uses, I took advantage of s_server, a tool that can debug TLS connections. I used Google Search Console Fetch and Render to send Googlebot to my website.

Here's a screenshot from this tool showing that Googlebot is using Next Protocol Negotiation (and therefore SPDY):

I'll briefly explain how you can perform your own test. The first thing you should know is that you can’t use scripting languages (like PHP or Python) for debugging TLS handshakes. The reason for that is simple: these languages see HTTP-level data only. Instead, you should use special tools for debugging TLS handshakes, such as s_server.

Type in the console:

sudo openssl s_server -key key.pem -cert cert.pem -accept 443 -WWW -tlsextdebug -state -msg
sudo openssl s_server -key key.pem -cert cert.pem -accept 443 -www -tlsextdebug -state -msg

Please note the slight (but significant) difference between the “-WWW” and “-www” options in these commands. You can find more about their purpose in the s_server documentation.

Next, invite Googlebot to visit your site by entering the URL in Google Search Console Fetch and Render or in the Google mobile tester.

As I wrote above, there is no logical reason for Googlebot to keep supporting SPDY. This protocol is vulnerable; no modern browser supports it. Additionally, servers (including NGINX) have dropped support for it. It’s just a matter of time until Googlebot is able to crawl using HTTP/2. Just implement HTTP/1.1 + HTTP/2 support on your own server (your users will notice the faster loading) and wait until Google is able to send requests using HTTP/2.


Summary

In November 2015, John Mueller said he expected Googlebot to crawl websites by sending HTTP/2 requests starting in early 2016. We don’t know why, as of October 2017, that hasn't happened yet.

What we do know is that Googlebot doesn't support HTTP/2. It still crawls by sending HTTP/1.1 requests. Both this experiment and the “Rendering on Google Search” page confirm it. (If you’d like to know more about the technology behind Googlebot, then you should check out what they recently shared.)

For now, it seems we have to accept the status quo. We recommended that Robert (and you readers as well) enable HTTP/2 on your websites for better performance, but continue optimizing for HTTP/1.1. Your visitors will notice and thank you.



Friday, October 13, 2017

Writing Headlines that Serve SEO, Social Media, and Website Visitors All Together - Whiteboard Friday

Posted by randfish

Have your headlines been doing some heavy lifting? If you've been using one headline to serve multiple audiences, you're missing out on some key optimization opportunities. In today's Whiteboard Friday, Rand gives you a process for writing headlines for SEO, for social media, and for your website visitors — each custom-tailored to its audience and optimized to meet different goals.

Writing headlines that serve SEO, Social Media, and Website Visitors

Click on the whiteboard image above to open a high-resolution version in a new tab!


Video Transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we're going to chat about writing headlines. One of the big problems that headlines have is that they need to serve multiple audiences. So it's not just ranking in search engines. Even if it was, the issue is that we need to do well on social media. We need to serve our website visitors well in order to rank in the search engines. So this gets very challenging.

I've tried to illustrate this with a Venn diagram here. So you can see, basically...

SEO

In the SEO world of headline writing, what I'm trying to do is rank well, earn high click-through rate, because I want a lot of those visitors to the search results to choose my result, not somebody else's. I want low pogo-sticking. I don't want anyone clicking the back button and choosing someone else's result because I didn't fulfill their needs. I need to earn links, and I've got to have engagement.

Social media

On the social media side, it's pretty different actually. I'm trying to earn amplification, which can often mean the headline tells as much of the story as possible. Even if you don't read the piece, you amplify it, you retweet it, and you re-share it. I'm looking for clicks, and I'm looking for comments and engagement on the post. I'm not necessarily too worried about that back button and the selection of another item. In fact, time on site might not even be a concern at all.

Website visitors

For website visitors, both of these are channels that drive traffic. But for the site itself, I'm trying to drive the right visitors, the ones who are going to be loyal, who are going to come back, hopefully who are going to convert. I want to not confuse anyone. I want to deliver on my promise so that I don't create a bad brand reputation and detract from people wanting to click on me in the future. Those of you who have visited a site like Forbes or maybe even a BuzzFeed may have an association of, "Oh, man, this is going to be that clickbait stuff. I don't want to click on their stuff. I'm going to choose somebody else in the results instead of this brand that I remember having a bad experience with."

Notable conflicts

There are some notable direct conflicts in here.

  1. Keywords for SEO can be really boring on social media sites. When you try and keyword stuff especially or be keyword-heavy, your social performance tends to go terribly.
  2. Creating mystery on social, so essentially not saying what the piece is truly about, but just creating an inkling of what it might be about harms the clarity that you need for search in order to rank well and in order to drive those clicks from a search engine. It also hurts your ability generally to do keyword targeting.
  3. The need for engagement and brand reputation that you've got for your website visitors is really going to hurt you if you're trying to develop those clickbait-style pieces that do so well on social.
  4. In search, ranking for low-relevance keywords is going to drive very unhappy visitors. Just because you happen to rank for a keyword doesn't necessarily mean that you should, because you didn't serve the visitor intent with the actual content.

Getting to resolution

So how do we resolve this? Well, it's not actually a terribly hard process. In 2017 and beyond, what's nice is that search engines and social and visitors all have enough shared stuff that, most of the time, we can get to a good, happy resolution.

Step one: Determine who your primary audience is, your primary goals, and some prioritization of those channels.

You might say, "Hey, this piece is really targeted at search. If it does well on social, that's fine, but this is going to be our primary traffic driver." Or you might say, "This is really for internal website visitors who are browsing around our site. If it happens to drive some traffic from search or social, well that's fine, but that's not our intent."

Step two: For non-conflict elements, optimize for the most demanding channel.

For those non-conflicting elements, so this could be the page title that you use for SEO, it doesn't always have to perfectly match the headline. If it's a not-even-close match, that's a real problem, but an imperfect match can still be okay.

So what's nice in social is you have things like Twitter cards and the Facebook markup, graph markup. That Open Graph markup means that you can have slightly different content there than what you might be using for your snippet, your meta description in search engines. So you can separate those out or choose to keep those distinct, and that can help you as well.
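As a rough sketch of that separation, the Python snippet below renders one set of tags for search (the page title and meta description) and a different set for social cards (Open Graph and Twitter card tags). The function name and the two description strings are hypothetical; the two headlines are borrowed from the Nest example later in this transcript.

```python
from html import escape

def headline_tags(page_title, meta_description, social_title, social_description):
    """Render search-facing tags plus separate social-card tags for one page."""
    tags = [
        # Search-facing: what a search engine may show as the title and snippet.
        f"<title>{escape(page_title)}</title>",
        f'<meta name="description" content="{escape(meta_description)}">',
        # Social-facing: Open Graph and Twitter card text can differ from the above.
        f'<meta property="og:title" content="{escape(social_title)}">',
        f'<meta property="og:description" content="{escape(social_description)}">',
        '<meta name="twitter:card" content="summary">',
        f'<meta name="twitter:title" content="{escape(social_title)}">',
        f'<meta name="twitter:description" content="{escape(social_description)}">',
    ]
    return "\n".join(tags)

markup = headline_tags(
    page_title="Nest Has a New Alarm System, Video Doorbell, and Outdoor Camera",
    meta_description="Placeholder search snippet text.",  # hypothetical
    social_title="Nest is No Longer Just a Thermostat",
    social_description="Placeholder social card text.",  # hypothetical
)
print(markup)
```

Keeping the two sets of text as separate arguments makes it easy to keep them close when the channels agree and let them diverge when they conflict.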

Step three: Author the straightforward headline first.

I'm going to ask you to author the most straightforward version of the headline first.

Step four: Now write the social-friendly/click-likely version without other considerations.

Step four is to write the opposite of that, the most social-friendly or click-likely/click-worthy version. It doesn't necessarily have to worry about keywords. It doesn't have to worry about accuracy or telling the whole story. Write it without any of these other considerations.

Step five: Merge 3 & 4, and add in critical keywords.

We're going to take three and four and just merge them into something that will work for both, that compromises in the right way, compromises based on your primary audience, your primary goals, and then add in the critical keywords that you're going to need.

Examples:

I've tried to illustrate this a bit with an example. Google bought Nest years ago, and then it became part of Alphabet, the corporation that Google evolved into. So Nest is separately owned by Alphabet, Google's parent company. Nest came out with this new alarm system. In fact, the day we're filming this Whiteboard Friday, they came out with a new alarm system. So they're no longer just a provider of thermostats inside of houses. They now have something else.

Step one: So if I'm a tech news site and I'm writing about this, I know that I'm trying to target gadget and news readers. My primary channel is going to be social first, but secondarily search engines. The goal that I'm trying to reach, that's engagement followed by visits and then hopefully some newsletter sign-ups to my tech site.

Step two: My title and headline in this case probably need to match very closely. So the social callouts, the social cards and the Open Graph, that can be unique from the meta description if need be or from the search snippet if need be.

Step three: I'm going to do step three, author the straightforward headline. That for me is going to be "Nest Has a New Alarm System, Video Doorbell, and Outdoor Camera." A little boring, probably not going to do tremendously well on social, but it probably would do decently well in search.

Step four: My social click-likely version is going to be something more like "Nest is No Longer Just a Thermostat. Their New Security System Will Blow You Away." That's not the best headline in the universe, but I'm not a great headline writer. However, you get the idea. This is the click-likely social version, the one that you see the headline and you go, "Ooh, they have a new security system. I wonder what's involved in that." You create some mystery. You don't know that it includes a video doorbell, an outdoor camera, and an alarm. You just hear, "They've got a new security system. Well, I better look at it."

Step five: Then I can try and compromise and say, "Hey, I know that I need to have video doorbell, camera, alarm, and Nest." Those are my keywords. Those are the important ones. That's what people are going to be searching for around this announcement, so I've got to have them in there. I want to have them close to the front. So "Nest's New Alarm, Video Doorbell and Camera Are About to Be on Every Home's Must-Have List." All right, resolved in there.

So this process of writing headlines to serve these multiple different, sometimes competing priorities is totally possible with nearly everything you're going to do in SEO and social and for your website visitors. This resolution process is something hopefully you can leverage to get better results.

All right, everyone, we'll see you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com



Wednesday, October 11, 2017

Do iPhone Users Spend More Online Than Android Users?

Posted by MartyMeany

Apple has just launched their latest flagship phones to market and later this year they'll release their uber-flagship: the iPhone X. The iPhone X is the most expensive iPhone yet, at a cool $999. With so many other smartphones on the market offering similar functionality, it begs the question: Do iPhone users simply spend more money than everyone else?

At Wolfgang Digital, we love a bit of data, so we’ve trawled through a massive dataset of 31 million iPhone and Android sessions to finally answer this question. Of course, we’ve got some actionable nuggets of digital marketing strategy at the end, too!

Why am I asking this question?

Way back when, before joining the online marketing world, I sold mobile phones. I couldn’t get my head around why people bought iPhones. They're more expensive than their Android counterparts, which usually offer the same, if not increased, functionality (though you could argue the latter is subjective).

When I moved into the e-commerce department of the same phone retailer, my team would regularly grab a coffee and share little nuggets of interesting e-commerce trends we’d found. My personal favorite was a tale about Apple users spending more than PC users. The story I read talked about how a hotel raised prices for people booking while using an Apple device. Even with the increased prices, conversion rates didn’t budge as the hotel raked in extra cash.

I’ve always said this story was anecdotal because I simply never saw the data to back it up. Still, it fascinated me.

Finding an answer

Fast forward a few years and I’m sitting in Wolfgang Digital behind the huge dataset that powered our 2017 E-Commerce Benchmark KPI Study. It occurred to me that this data could answer some of the great online questions I’d heard over the years. What better place to start than that tale of Apple users spending more money online than others?

The online world has changed a little since I first asked myself this question, so let’s take a fresh 2017 approach.

Do iPhone users spend more than Android users?

When this hypothesis first appeared, people were comparing Mac desktop users and PC desktop users, but the game has changed since then. To give the hypothesis a fresh 2017 look, we’re going to ask whether iPhone users spend more than Android users. Looking through the 31 million sessions on both iOS and Android operating systems, then filtering the data by mobile, it didn’t take long to find the answer to this question that had followed me around for years. The results were astonishing:

On average, Android users spend $11.54 per transaction. iPhone users, on the other hand, spend a whopping $32.94 per transaction. That means iPhone users will spend almost three times as much as Android users when visiting an e-commerce site.
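As a quick sanity check on the "almost three times" figure, here is the arithmetic on the two averages quoted above. It uses only the numbers in this post; the variable names are purely illustrative.

```python
# Average spend per transaction quoted in this post (USD).
android_avg = 11.54
iphone_avg = 32.94

# How many times larger is the iPhone average than the Android average?
ratio = iphone_avg / android_avg

print(f"iPhone users spend about {ratio:.2f}x as much per transaction")
```

The ratio comes out a little under three, which is where "almost three times" comes from.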

I’m slightly smug that I’ve finally answered my question, but how do we turn this from an interesting nugget of information into an actionable insight?

What does this mean for digital marketers?

As soon as you read about iPhone users spending three times more than Android users, I’m sure you started thinking about targeting users specifically based on their operating system. If iOS users are spending more money than their Android counterparts, doesn’t it make sense to shift your spend and targeting towards iOS users?

You’re right. In both Facebook and AdWords, you can use this information to your advantage.

Targeting operating systems within Facebook

Of the "big two" ad platforms, Facebook offers the more direct form of operating system targeting. When creating your ads, Facebook’s Ad Manager will give you the option to target "All Mobile Devices," "iOS Devices Only," or "Android Devices Only." These options mean you can target those high average order value-generating iPhone users.

Targeting operating systems within AdWords

AdWords will allow you to target operating systems for both Display Campaigns and Video Campaigns. When it comes to Search, you can’t target a specific operating system. You can, however, create an OS-based audience using Google Analytics. Once this audience is built, you can remarket to an iOS audience with "iPhone"-oriented ad texts. Speaking at Wolfgang Essentials this year, Wil Reynolds showed clips of people talking through their decision to click in SERPs. It’s incredible to see people skipping over year-old content before clicking an article that mentions "iPhone." Why? Because that user has an iPhone. That’s the power of relevancy.

You’ll also be able to optimize and personalize your bids in Search, safe in the knowledge that iPhone users are more likely to spend big than Android users.

There you have it. Don’t let those mad stories you hear pass you by. You might just learn something!



Tuesday, October 10, 2017

How to Turn Low-Value Content Into Neatly Organized Opportunities - Next Level

Posted by jocameron

Welcome to the newest installment of our educational Next Level series! In our last post, Brian Childs offered up a beginner-level workflow to help discover your competitor's backlinks. Today, we're welcoming back Next Level veteran Jo Cameron to show you how to find low-quality pages on your site and decide their new fate. Read on and level up!


With an almost endless succession of Google updates fluctuating the search results, it’s pretty clear that substandard content just won’t cut it.

I know, I know — we can’t all keep up with the latest algorithm updates. We’ve got businesses to run, clients to impress, and a strong social media presence to maintain. After all, you haven’t seen a huge drop in your traffic. It’s probably OK, right?

So what’s with the nagging sensation down in the pit of your stomach? It’s not just that giant chili taco you had earlier. Maybe it’s that feeling that your content might be treading on thin ice. Maybe you watched Rand’s recent Whiteboard Friday (How to Determine if a Page is "Low Quality" in Google's Eyes) and just don’t know where to start.

In this edition of Next Level, I’ll show you how to start identifying your low-quality pages in a few simple steps with Moz Pro's Site Crawl. Once identified, you can decide whether to merge, shine up, or remove the content.

A quick recap of algorithm updates

The latest big fluctuations in the search results were said to be caused by King Fred: enemy of low-quality pages and champion of the people’s right to find and enjoy content of value.

Fred took the fight to affiliate sites, and low-value commercial sites were also affected.

The good news is that even if this isn’t directed at you, and you haven’t taken a hit yourself, you can still learn from this update to improve your site. After all, why not stay on the right side of the biggest index of online content in the known universe? You’ll come away with a good idea of what content is working for your site, and you may just take a ride to the top of the SERPs. Knowledge is power, after all.

Be a Pro

It’s best if we just accept that Google updates are ongoing; they happen all.the.time. But with a site audit tool in your toolkit like Moz Pro's Site Crawl, they don’t have to keep you up at night. Our shiny new Rogerbot crawler is the new kid on the block, and it’s hungry to crawl your pages.

If you haven’t given it a try, sign up for a free trial for 30 days:

Start a free trial

If you’ve already had a free trial that has expired, write to me and I’ll give you another, just because I can.

Set up your Moz Pro campaign — it takes 5 minutes tops — and Rogerbot will be unleashed upon your site like a caffeinated spider.

Rogerbot hops from page to page following links to analyze your website. As Rogerbot hops along, it builds a beautiful database of pages and flags issues you can use to find those laggards. What a hero!

First stop: Thin content

Site Crawl > Content Issues > Thin Content

Thin content could be damaging your site. If it’s deemed to be malicious, then it could result in a penalty. Things like zero-value pages with ads or spammy doorway pages — little traps people set to funnel people to other pages — are bad news.

First off, let’s find those pages. Moz Pro Site Crawl will flag a page as "thin content" if it has fewer than 50 words (excluding navigation and ads).
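To make the 50-word rule concrete, here is a minimal sketch of that kind of check. It is not how Site Crawl is implemented; it assumes you have already stripped navigation and ads and are left with the page's body copy as plain text. The function names and the `THIN_CONTENT_THRESHOLD` constant are hypothetical.

```python
# Word-count threshold quoted in this post.
THIN_CONTENT_THRESHOLD = 50


def count_words(body_text: str) -> int:
    """Count whitespace-separated words in already-extracted body copy."""
    return len(body_text.split())


def is_thin(body_text: str, threshold: int = THIN_CONTENT_THRESHOLD) -> bool:
    """Return True when the copy has fewer words than the threshold."""
    return count_words(body_text) < threshold


# A login page typically has only a handful of words of real copy.
print(is_thin("Log in to your account"))
```

A check like this only tells you a page is short, not that it is bad, which is why the next step is to look at how the page actually performs.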

Now is a good time to familiarize yourself with Google’s Quality Guidelines. Think long and hard about whether you may be doing this, intentionally or accidentally.

You’re probably not straight-up spamming people, but you could do better and you know it. Our mantra is (repeat after me): “Does this add value for my visitors?” Well, does it?

Ok, you can stop chanting now.

For most of us, thin content is less of a penalty threat and more of an opportunity. By finding pages with thin content, you have the opportunity to figure out if they're doing enough to serve your visitors. Pile on some Google Analytics data and start making decisions about improvements that can be made.

Using moz.com as an example, I’ve found 3 pages with thin content. Ta-da!

I’m not too concerned about the login page or the password reset page. I am, however, interested to see how the local search page is performing. Maybe we can find an opportunity to help people who land on this page.

Go ahead and export your thin content pages from Moz Pro to CSV.

We can then grab some data from Google Analytics to give us an idea of how well this page is performing. You may want to look at comparing monthly data and see if there are any trends, or compare similar pages to see if improvements can be made.

I am by no means a Google Analytics expert, but I know how to get what I want. Most of the time that is, except when I have to Google it, which is probably every second week.

Firstly: Behavior > Site Content > All Pages > Paste in your URL

  • Pageviews - The number of times that page has been viewed, even if it’s a repeat view.
  • Avg. Time on Page - How long people are on your page
  • Bounce Rate - Single page views with no interaction
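If it helps to see Bounce Rate as a calculation, here is a simplified sketch. It treats a bounce as a session that viewed exactly one page and ignores the interaction events that Google Analytics also takes into account, so treat it as an approximation of the idea rather than GA's exact formula. The `bounce_rate` function and the sample data are made up for illustration.

```python
def bounce_rate(pages_per_session: list[int]) -> float:
    """Percentage of sessions that viewed exactly one page."""
    if not pages_per_session:
        return 0.0
    bounces = sum(1 for pages in pages_per_session if pages == 1)
    return 100.0 * bounces / len(pages_per_session)


# Hypothetical sessions that started on one landing page:
# each number is how many pages that visitor viewed.
sessions = [1, 3, 1, 2, 1]
print(f"Bounce rate: {bounce_rate(sessions):.1f}%")
```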

For my example page, Bounce Rate is very interesting. This page lives to be interacted with. Its only joy in life is allowing people to search for a local business in the UK, US, or Canada. It is not an informational page at all. It doesn’t provide a contact phone number or an answer to a query that may explain away a high bounce rate.

I’m going to add Pageviews and Bounce Rate to a spreadsheet so I can track this over time.

I’ll also add some keywords that I want that page to rank for to my Moz Pro Rankings. That way I can make sure I’m targeting searcher intent and driving organic traffic that is likely to convert.

I’ll also know if I’m being outranked by my competitors. How dare they, right?

As we've found with this local page, not all thin content is bad content. Another example may be if you have a landing page with an awesome video that's adding value and is performing consistently well. In this case, hold off on making sweeping changes. Track the data you’re interested in; from there, you can look at making small changes and track the impact, or split test some ideas. Either way, you want to make informed, data-driven decisions.

Action to take for tracking thin content pages

Export to CSV so you can track how these pages are performing alongside GA data. Make incremental changes and track the results.

Second stop: Duplicate title tags

Site Crawl > Content Issues > Duplicate Title Tags

Title tags show up in the search results to give human searchers a taste of what your content is about. They also help search engines understand and categorize your content. Without question, you want these to be well considered, relevant to your content, and unique.

Moz Pro Site Crawl flags any pages with matching title tags for your perusal.
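If you want to run the same kind of check on a crawl export of your own, the idea can be sketched in a few lines. This is an illustration rather than Site Crawl's actual logic: it assumes you already have a mapping of URL to title tag, and the `find_duplicate_titles` function and example.com URLs are hypothetical.

```python
from collections import defaultdict


def find_duplicate_titles(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each title used by more than one URL to the URLs that share it."""
    urls_by_title = defaultdict(list)
    for url, title in pages.items():
        # Ignore case and stray whitespace when comparing titles.
        urls_by_title[title.strip().lower()].append(url)
    return {t: urls for t, urls in urls_by_title.items() if len(urls) > 1}


pages = {
    "https://example.com/a": "Contact Us",
    "https://example.com/b": "contact us ",
    "https://example.com/c": "About",
}

for title, urls in find_duplicate_titles(pages).items():
    print(title, "->", urls)
```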

Duplicate title tags are unlikely to get your site penalized, unless you’ve masterminded an army of pages that target irrelevant keywords and provide zero value. Once again, for most of us, it’s a good way to find a missed opportunity.

Digging around your duplicate title tags is a lucky dip of wonder. You may find pages with repeated content that you want to merge, or redundant pages that may be confusing your visitors, or maybe just pages for which you haven’t spent the time crafting unique title tags.

Take this opportunity to review your title tags, make them interesting, and always make them relevant. Because I’m a Whiteboard Friday friend, I can’t not link to this title tag hack video. Turn off Netflix for 10 minutes and enjoy.

Pro tip: To view the other duplicate pages, make sure you click on the little triangle icon to open that up like an accordion.

Hey now, what’s this? Filed away under duplicate title tags I’ve found these cheeky pages.

These are the contact forms we have in place to contact our help team. Yes, me included — hi!

I’ve got some inside info for you all. We’re actually in the process of redesigning our Help Hub, and these tool-specific pages definitely need a rethink. For now, I’m going to summon the powerful and mysterious rel=canonical tag.

This tells search engines that all those other pages are copies of the one true page to rule them all. Search engines like this, they understand it, and they bow down to honor the original source, as well they should. Visitors can still access these pages, and they won’t ever know they've hit a page with an original source elsewhere. How very magical.

Action to take for duplicate title tags on similar pages

Use the rel=canonical tag to tell search engines that https://moz.com/help/contact is the original source.

Review visitor behavior and perform user testing on the Help Hub. We’ll use this information to make a plan for redirecting those pages to one main page and adding a tool type drop-down.
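For anyone who hasn't met the rel=canonical tag before, it is a single `<link>` element in the `<head>` of each duplicate page. Here is a small sketch that builds it for the contact page mentioned above; the `canonical_tag` helper is hypothetical, and in practice your CMS or templates would usually output this for you.

```python
from html import escape


def canonical_tag(canonical_url: str) -> str:
    """Build the <link> element that goes in the <head> of each duplicate page."""
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}">'


print(canonical_tag("https://moz.com/help/contact"))
```

Every tool-specific contact form would carry the same tag, each pointing at the one original URL.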

More duplicate titles within my subfolder-specific campaign

Because at Moz we’ve got a heck of a lot of pages, I’ve got another Moz Pro campaign set up to track the URL moz.com/blog. I find this handy if I want to look at issues on just one section of my site at a time.

You just have to enter your subfolder and limit your campaign when you set it up.

Just remember we won’t crawl any pages outside of the subfolder. Make sure you have an all-encompassing, all-access campaign set up for the root domain as well.

Not enough allowance to create a subfolder-specific campaign? You can filter by URL from within your existing campaign.

In my Moz Blog campaign, I stumbled across these little fellows:

https://moz.com/blog/whiteboard-friday-how-to-get-an-seo-job

https://moz.com/blog/whiteboard-friday-how-to-get-an-seo-job-10504

This is a classic case of new content usurping the old content. Instead of telling search engines, “Yeah, so I’ve got a few pages and they’re kind of the same, but this one is the one true page,” like we did with the rel=canonical tag before, this time I’ll use the big cousin of the rel=canonical, the queen of content canonicalization, the 301 redirect.

All the power is sent to the page you are redirecting to, as well as all the actual human visitors.

Action to take for duplicate title tags with outdated/updated content

Check the traffic and authority for both pages, then add a 301 redirect from one to the other. Consolidate and rule.

It’s also a good opportunity to refresh the content and check whether it's... what? I can’t hear you — adding value to my visitors! You got it.
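Under the hood, a 301 is just a response with a "301 Moved Permanently" status and a Location header pointing at the new URL. Redirects are normally set up in your web server or CMS, so the tiny WSGI app below is only a sketch of the mechanism. The `REDIRECTS` map and the `/old-page` and `/new-page` paths are placeholders, not the Moz URLs above.

```python
# Hypothetical map of retired paths to the pages that replace them.
REDIRECTS = {
    "/old-page": "/new-page",
}


def app(environ, start_response):
    """Minimal WSGI app: 301 for known old paths, 404 for everything else."""
    target = REDIRECTS.get(environ.get("PATH_INFO", ""))
    if target:
        # Location tells browsers and crawlers where the content lives now.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]
```

To try it locally, you could serve `app` with `wsgiref.simple_server.make_server` from the standard library and request `/old-page`.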

Third stop: Duplicate content

Site Crawl > Content Issues > Duplicate Content

When the code and content on a page look the same as the code and content on another page of your site, the page will be flagged as "Duplicate Content." Our crawler will flag any pages with 90% or more overlapping content or code as having duplicate content.
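To get a feel for what a 90% overlap means, here is a rough sketch that compares two page sources with Python's standard `difflib`. This is a stand-in for illustration only: this post doesn't describe how the crawler measures overlap, so the character-level comparison, the `similarity` and `is_duplicate` helpers, and the sample pages are all assumptions.

```python
from difflib import SequenceMatcher

# Overlap threshold quoted in this post.
DUPLICATE_THRESHOLD = 0.90


def similarity(page_a: str, page_b: str) -> float:
    """Return a 0.0 to 1.0 similarity score for two page sources."""
    # autojunk=False turns off difflib's heuristic that ignores very
    # common characters in long inputs, which would skew the score.
    return SequenceMatcher(None, page_a, page_b, autojunk=False).ratio()


def is_duplicate(page_a: str, page_b: str,
                 threshold: float = DUPLICATE_THRESHOLD) -> bool:
    """Return True when two pages overlap at or above the threshold."""
    return similarity(page_a, page_b) >= threshold


# Two hypothetical pages sharing one template with only a few words of copy.
template = "<html><head><title>Help</title></head><body><nav>Home Blog Tools</nav>"
page_a = template + "<p>Reset password</p></body></html>"
page_b = template + "<p>Contact support</p></body></html>"

print(f"{similarity(page_a, page_b):.2f}", is_duplicate(page_a, page_b))
```

The more of a page that comes from shared template code and the less unique copy it carries, the closer that score creeps toward the threshold.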

Officially, in the wise words of Google, duplicate content doesn’t incur a penalty. However, it can be filtered out of the index, so still not great.

Having said that, the trick is in the fine print. One bot’s duplicate content is another bot’s thin content, and thin content can get you penalized. Let me refer you back to our old friend, the Quality Guidelines.

Are you doing one of these things intentionally or accidentally? Do you want me to make you chant again?

If you’re being hounded by duplicate content issues and don’t know where to start, then we’ve got more information on duplicate content on our Learning Center.

I’ve found some pages that clearly have different content on them, so why are these flagged as duplicates?


So friends, what we have here is thin content that’s being flagged as duplicate.

There is basically not enough content on the page for bots to distinguish them from each other. Remember that our crawler looks at all the page code, as well as the copy that humans see.

You may find this frustrating at first: “Like, why are they duplicates?? They're different, gosh darn it!” But once you pass through all the 7 stages of duplicate content and arrive at acceptance, you’ll see the opportunity you have here. Why not pop those topics on your content schedule? Why not use the “queen” again, and 301 redirect them to a similar resource, combining the power of both resources? Or maybe, just maybe, you could use them in a blog post about duplicate content — just like I have.

Action to take for duplicate pages with different content

Before you make any hasty decisions, check the traffic to these pages. Maybe dig a bit deeper and track conversions and bounce rate, as well. Check out our workflow for thin content earlier in this post and do the same for these pages.

From there you can figure out if you want to rework content to add value or redirect pages to another resource.

This is an awesome video in the ever-impressive Whiteboard Friday series which talks about republishing. Seriously, you’ll kick yourself if you don’t watch it.

Broken URLs and duplicate content

Another dive into Duplicate Content has turned up two Help Hub URLs that point to the same page.

These are no good to man or beast. They are especially no good for our analytics — blurgh, data confusion! No good for our crawl budget — blurgh, extra useless page! User experience? Blurgh, nope, no good for that either.

Action to take for messed-up URLs causing duplicate content

Zap this time-waster with a 301 redirect. For me this is an easy decision: add a 301 to the long, messed up URL with a PA of 1, no discussion. I love our new Learning Center so much that I’m going to link to it again so you can learn more about redirection and build your SEO knowledge.

It’s the most handy place to check if you get stuck with any of the concepts I’ve talked about today.

Wrapping up

While it may feel scary at first to have your content flagged as having issues, the real takeaway here is that these are actually neatly organized opportunities.

With a bit of tenacity and some extra data from Google Analytics, you can start to understand the best way to fix your content and make your site easier to use (and more powerful in the process).

If you get stuck, just remember our chant: “Does this add value for my visitors?” Your content has to be for your human visitors, so think about them and their journey. And most importantly: be good to yourself and use a tool like Moz Pro that compiles potential issues into an easily digestible catalogue.

Enjoy your chili taco and your good night’s sleep!



Listen to MozPod, the Free SEO Podcast from Moz

Posted by BrianChilds

We're marketers. We know from firsthand experience that there aren’t enough hours in the day to do everything that needs to get done. And that's even more true once you commit to leveling up and learning new skills.

The learning curve for developing digital marketing skills can be steep, and staying informed as things evolve and change (thanks, Google) can feel like a full-time job. Our Moz Training has classes to help accelerate the learning process, but as startup folks ourselves, we understand the importance of multitasking.

Learn SEO on the go

We're thrilled to introduce MozPod, an SEO podcast focused on sharing lessons from digital marketing experts. Episodes are led by instructors from Moz Academy and we discuss a wide variety of digital marketing concepts, from common terminology to recent changes and best practices.

Check it out on iTunes

Where can I listen in?


Upcoming episodes

Our first series covers conversion rate optimization, PageRank, and link building:

Ep. 1: The Science of Crawling and Indexing

Guest: Neil Martinsen-Burrell of Moz

Dr. Neil Martinsen-Burrell shares his perspective as a statistician on the development of Page Authority and Domain Authority. Great data and interesting stats.

Ep. 2: What's a Good Conversion Rate?

Guest: Carl Schmidt of Unbounce

Carl discusses the Unbounce Conversion Rate Benchmark Report and what SEOs can learn from an analysis of over 74 million landing page visitors. Great for content writers.

Ep. 3: Link Building Fundamentals

Guest: The PageOnePower team

MozPod interviews PageOnePower about how search engines place value on links. Collin, Cody, and Nicholas share the personal wisdom they've gained from working at a link building company.


Want to be a guest on MozPod?

If you’d like to share your recent SEO analysis or have a topic you think MozPod listeners would find valuable, please send us your ideas! MozPod is a place for our community of SEOs and digital marketers to learn. We’d love to hear from you.

Simply fill out this form to share your idea: Be on MozPod


Give it a listen and let us know what topics you'd like to hear about in the comments!

Listen to MozPod on iTunes



Monday, October 9, 2017

Yes, Competitors Can Edit Your Listing on Google My Business

Posted by JoyHawkins

I decided to write this article in response to a recent article that was published over at CBSDFW. The article was one of many stories about how spammers update legitimate information on Google as a way to send more leads somewhere else. This might shock some readers, but it was old news to me since spam of this nature on Google Maps has been a problem for almost a decade.

What sparked my interest in this article was Google’s response. Google stated:

Merchants who manage their business listing info through Google My Business (which is free to use), are notified via email when edits are suggested. Spammers and others with negative intent are a problem for consumers, businesses, and technology companies that provide local business information. We use automated systems to detect for spam and fraud, but we tend not to share details behind our processes so as not to tip off spammers or others with bad intent.

Someone might read that and feel safe, believing that they have nothing to worry about. However, some of us who have been in this space for a long time know that there are several incorrect and misleading statements in that paragraph. I’m going to point them out below.


"Merchants are notified by email"

  1. Google just started notifying users by email last month. Their statement makes it sound like this has been going on for ages. Before September 2017, there were no emails going to people about edits made to their listings.
  2. Not everyone gets an email about edits that have been made. To test this, I had several people submit an update to a listing I own to change the phone number. When the edit went live, the Google account that was the primary owner on the listing got an email; the Google account that was a manager on the listing did not.

Similarly, I am a manager on over 50 listings and 7 of them currently show as having updates in the Google My Business dashboard. I haven’t received a single email since they launched this feature a month ago.

"Notified [...] when edits are suggested"

Merchants are not notified when edits are "suggested." Any time I've ever heard of an email notification in the last month, it went out after the edit was already live.

Here's a recent case on the Google My Business forum. This business owner got an email when his name was updated because the edit was already live. He currently has a pending edit on his listing to change the hours of operation. Clearly this guy is on top of things, so why hasn’t he denied it? Because he wouldn’t even know about it since it’s pending.

The edit isn’t live yet, so he’s not receiving a notification — either by email or inside the Google My Business dashboard.

Edits show up in the Google My Business dashboard as "Updates from Google." Many people think that if they don’t "accept" these edits in the Google My Business dashboard, the edits won’t go live. The reality is that by "accepting" them, you’re just confirming something that’s already live on Google. If you "don’t accept," you actually need to edit the listing to revert it back (there is no "deny" button).

Here's another current example of a listing I manage inside Google My Business. The dashboard doesn’t show any updates to the website field, yet there's a pending edit that I can see on the Google Maps app. A user has suggested that the proper website is a different page on the website than what I currently have. The only way to see all types of pending edits is via Check the Facts on Google Maps. No business owner I've ever spoken to has any clue what this is, so I think it’s safe to say they wouldn’t be checking there.

Here's how I would edit that original response from Google to make it more factually correct:

Merchants who manage their business listing info through Google My Business (which is free to use) are notified when edits made by others are published on Google. Sometimes they are notified by email and the updates are also shown inside the Google My Business dashboard. Google allows users (other than the business owner) to make edits to listings on Google, but the edits are reviewed by either automated systems or, in some cases, actual human beings. Although the system isn’t perfect, Google is continually making efforts to keep the map free from spam and malicious editing.

Do you manage listings that have been edited by competitors? What's your experience been? Share your story in the comments below!

