
Clickbait Headlines and what John Mueller Says

March 17, 2020 by Aaron Weiss

Clickbait headlines have infiltrated our society in a negative fashion and have become quite pervasive and, unfortunately, persuasive. What is worse is when those headlines aren’t clicked on and read, but instead scrolled past, with the headline taken as fact without reading the entire article.

For example, we’ll get tidbits from Googlers like John Mueller such as a recent comment about how W3C validation doesn’t impact search results. Had you just read the headline, “Google’s John Mueller: We Do Not Use W3C Validation in Search Results,” you would have thought that was the end of the conversation.

Without clicking on the article and reading it myself, I immediately recognized what was going on here. Just because Mueller stated that W3C validation doesn’t impact search results, doesn’t mean it doesn’t impact the result of crawling your website or how well your website’s structure is developed. Someone who doesn’t read between the lines isn’t going to pick that up.

This article will look at a few times John has provided more information about whether a topic or factor is or is not considered in search results, or about how to approach a situation. The goal is to investigate the language used within the clickbait headlines and how his comments are reported by third parties such as SEO news websites.

The hope is to get others to understand how important it is to read articles and dissect the information to draw better conclusions. Generally, this is a return to mid-level high school contextual critical thinking, applied to four different articles that have cited responses from John Mueller of Google.

What are Clickbait Headlines?

Clickbait is a form of false advertisement that uses misleading headlines to influence the user to click through to the page. This is done with deceptive and over-sensationalized headlines.

Article #1: “Google’s John Mueller: We Do Not Use W3C Validation in Search Results”

The first article and comment is reported on by Search Engine Journal. This clickbait headline is very clear: “W3C Validation is not used as a factor in search results.” That seems simple enough.

If you do not know what W3C validation is, it is a tool that helps validate HTML code to help uncover potential warnings and errors that might make it more difficult to render a web page. Having valid HTML code on your website can help your website load more consistently across different browsers and devices.
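To make "warnings and errors" a little more concrete, here is a minimal sketch in Python of one kind of problem a validator reports: tags that are never closed or are closed out of order. This is not the W3C validator or its API. The names `TagBalanceChecker` and `find_tag_problems` are made up for illustration, and the check is stricter than the HTML specification, which allows some end tags (such as `</p>` or `</li>`) to be omitted.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: records unclosed or mismatched tags only."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of tag names that are currently open
        self.problems = []   # human-readable findings

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in self.open_tags:
            self.problems.append(f"closing </{tag}> has no matching opening tag")
            return
        # Pop back to the matching tag; anything popped on the way was left open.
        while self.open_tags:
            top = self.open_tags.pop()
            if top == tag:
                break
            self.problems.append(f"<{top}> was not closed before </{tag}>")


def find_tag_problems(html):
    """Return a list of tag-balance problems found in an HTML string."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    for tag in reversed(checker.open_tags):
        checker.problems.append(f"<{tag}> was never closed")
    return checker.problems


print(find_tag_problems("<div><p>Hello</div>"))
```

A real validator checks far more than this (attributes, nesting rules, character encoding, and so on), so treat the sketch as an illustration of the idea rather than a substitute for the tool.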

Beyond the Headline

If you click on the article, you’ll learn that the original question posed to Mueller was “whether W3C validation errors could slow down the time it takes to download a page.”

Initially, this question had nothing to do with whether W3C validation mattered to search results. The question asked whether HTML errors make a page take longer to download. The assumption here is that the concern was not download time in a browser, but download time for Googlebot.

It just so happens that Mueller decided to elaborate more fully:

“In general, the W3C validation is something that we do not use when it comes to search. So you don’t need to worry if your pages kind of meet the validation bar or not. However, using the validation is a great way to double check that you’re not doing anything broken on your site.” (Source)

The “broken-ness” referred to here is meant to help the admin understand whether the website can load properly, which can impact how well the site is crawled. Meaning: if there are obstacles in Googlebot’s way when it crawls a page, they can hinder how well or how often it crawls your site. Therefore, the key point is not truly laid out in the article: when a Google employee says that a factor doesn’t impact search results, that doesn’t mean it doesn’t impact the crawl.

Furthermore, it’s important to understand that W3C validation can help the website load properly for the user first and foremost. Lastly, W3C validation also touches on ADA compliance and ARIA factors, which may help those who need additional assistance to interact with websites and applications.

Article #2: “When John Mueller Of Google Is Frustrated A Site Not Ranking Well”

First, this clickbait headline is grammatically incorrect. It’s missing an “is” between “site” and “not.” Double-whammy.

However, this article is from Search Engine Roundtable and is mostly a quote from John referring to a question by James Bradley from a Hangout where he asked about how to recover from a drop in rankings nearly one year prior.

Beyond the Headline

What to make of this: even a seasoned Google employee with insider information can be stumped by their own algorithm. This makes me feel better about being in the SEO industry.

We’re all humans dealing with the results of machines and software developed by humans. Inevitably, something is going to go wrong, or we’re not going to understand the results and the factors that make up those results.

John is showing that he needs another set of eyes to get help on something he can’t seem to figure out. This is a sign of a truly mature adult who takes his craft seriously and asks for help when something is beyond his own understanding.

Article #3: “Google Says Don’t Focus on how it Defines Content Quality”

This article is from Edgy.app, and the clickbait headline suggests that website administrators should not focus on how Google defines quality.

The problem with this headline is how it uses “Google” instead of John Mueller. It is true that John Mueller is a subject matter expert who is speaking on behalf of Google, but this is not Google speaking.

Generally, this headline implies that Google’s definition of quality has no bearing on search rankings.

Beyond the Headline

A user asked:

“What is quality content in Google’s eyes? If two people are writing on the same content, it’s possible they have a different opinion on the same thing. Then how does Google decide which one is better?”

This is a fantastic question! How does Google determine which article is of more quality, even if they come to different conclusions? John’s response:

“So instead of trying to work back how Google’s algorithms might be working, I would recommend trying to figure out what your users are actually thinking and doing things like user studies…”

Google has discussed the importance of authority through the initialism EAT in the past. However, that initialism doesn’t tell you how to express expertise, authoritativeness, and trustworthiness. Essentially, the author can assume their content is relevant and provides value to the user, but only the user will be able to make that determination.

For years there has been an assumed correlation between quality content and word count. But a thousand-word article isn’t high quality just because of its length. That’s just one example of how even known correlations do not exactly equate to quality.

Then again, even if your article is more relevant, that does not necessarily mean that it resonates with the user or solves the problem they might be experiencing. That is why John recommends user testing to understand what it is the user is attempting to solve.

But that still doesn’t answer the question which even John doesn’t respond to exactly: what if two articles about the same thing have different findings or results? I truly believe that this isn’t something that should be left for Google to answer because it’s subjective. Instead, having a diverse set of ideas that are well researched and backed by evidence is valuable to the user and society. This is a situation where Google is merely a tool to the user to find options and it is up to the user to decide what is and isn’t important to their search.

Article #4: “Google’s John Mueller: ~2/3rds Of What He Says Is Taken Out Of Context”

I saved the best for last.

This clickbait headline from Search Engine Roundtable states that 2/3rds of what John Mueller says is taken out of context. On the surface, especially after writing about the first three examples, I could say this is accurate. Since we’re discussing how the culture of clickbait headlines may be a detrimental form of communication, I think this is quite a meta way of concluding this investigation.

Beyond the Headline

Jason Barnard of kalicube.pro published a podcast interview with John Mueller in which Jason asked whether John’s comments are taken out of context to drive traffic through clickbait headlines. John’s response:

“It is hard to say like a number but I’d guess that about 2/3rds of the content out there is kind of taken out of context and presented in a way that doesn’t really to those cases that they are talking about. And that is within the SEO space and I assume that is kind of similar within all other technical spaces or maybe even general with news.”

What John is saying is that users are not clicking on the article to learn what is really going on, nor are they attempting to understand the context of the original question (user research and intent) and the context of John’s responses. This means that John’s responses might be specific to a site or a specific situation, and users who only absorb clickbait headlines are potentially harming their understanding of the industry.

Clickbait Headlines Conclusion on what John Mueller Says

Don’t just read clickbait headlines and believe you’ve got quality information that has improved your position as an SEO.

I’ll be honest, I do this with non-SEO related news. I think we all do. We’re all humans who are much more susceptible to manipulation than we are aware of or will admit.

In general, there are a few takeaways from the above articles:

Don’t assume the clickbait headline is the whole story. User behavior has been studied to manipulate our lack of time and anxieties.
We’re all humans. We don’t always understand the things we create. Quite an existential concept really.
Even experts aren’t going to have all the answers. However, experts do ask other experts. That’s a sign of a true expert.
Try to understand your users and audience. This is a basic marketing practice (see first bullet point)
Learn about the context of something to truly understand it. New information should pose new questions.
Clickbait headlines are damaging to journalistic integrity, but everyone is susceptible to it.
Don’t believe everything you read.

Yes, Google Does Read Text on Images

July 11, 2019 by Aaron Weiss

On June 20th this year, I wrote a blog post wondering, “Can Google Read Images?” Well, it turns out they can. I found similar blog posts that have reported the same, but I wanted to try for myself in a small experiment.

To complete this test, I created an image that included the text “The Worst St Pete SEO Expert.” The only place that exact phrase existed was on the rendered image. Nowhere else. Not the filename, ALT text, etc.

Within 4 or 5 days, I found the following in an image search:

[Image: Google search result showing an image whose text is only found in the rendered image.] Proof that Google does read text within images, but the results seem to only display in image searches.

So, there’s no doubt that Google can indeed read images. However, this didn’t produce any increase in organic web rankings. Therefore, this appears to only affect image searches.

If you click the link, it goes to my homepage. That’s because the homepage is most likely a high-priority crawl target, as I set up my most recent blog posts to appear on the homepage.

I suspect if you’re a photographer or image creator looking to improve your stock photography sales, this might be an excellent option to grow your incoming traffic.

For those who are focused solely on organic web searches, this may not provide much of a rankings boost. However, it’s great information to know.

Overall, I’m impressed and tickled. Perhaps I should have provided a less self-deprecating phrase, but I wanted this to really stick out. Let’s see what trouble it brews for me later.

Can Google Read Images?

June 20, 2019 by Aaron Weiss

There’s a good reason why this blog post’s featured image says something very negative. I need it to be unquestionably unique because it’s a test.

Almost a year ago, I came across a Chrome extension called Project Naptha which allows you to highlight and extract text from an image. I was floored by the possibility. Then I saw that SnagIt from TechSmith – my go-to screen capture software – also was now able to grab text from a screenshot.

Surely if a Chrome extension and a screen capture program could read and extract text from an image, Google should be able to as well. Right?

History and Context

Knowing what has come before gives a clue on what to expect in the future. Below are some technologies that have been reasonably within the consumer’s hands in the last few decades.

Optical Character Recognition (OCR)

First, we already know there is some precedent for this. Optical character recognition (OCR) is a technology found in many consumer-level scanners and software. This allows you to scan a document and have the text extracted to a word processing program. However, the source needed to be printed text, and the fonts could not vary widely. Additionally, accuracy wasn’t high, requiring review and cleanup.
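To illustrate why early OCR struggled when fonts varied, here is a toy sketch of template matching, one classic approach to character recognition. It is an assumption for illustration only, not a description of how any particular scanner or Google product works. The 3x5 glyph bitmaps and the names `TEMPLATES`, `hamming`, and `recognize` are all invented for this example.

```python
# Toy template-matching OCR: each known letter is a 3x5 bitmap,
# and an unknown glyph is read as whichever template it differs from least.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def hamming(glyph_a, glyph_b):
    """Count the pixels that differ between two same-sized bitmaps."""
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(glyph_a, glyph_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def recognize(glyph):
    """Return the template letter with the fewest differing pixels."""
    return min(TEMPLATES, key=lambda letter: hamming(glyph, TEMPLATES[letter]))


# An "L" with one stray pixel of scanner noise is still read correctly.
noisy_l = ["#..", "#..", "#.#", "#..", "###"]
print(recognize(noisy_l))

# An "I" drawn in a different font (a plain vertical bar, no serifs)
# is closer to the "T" template than to the "I" template, so it is misread.
sans_serif_i = [".#.", ".#.", ".#.", ".#.", ".#."]
print(recognize(sans_serif_i))
```

The second case is the point: a small change in letter shape is enough to push a glyph toward the wrong template, which is one reason results needed review and cleanup.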

Text Verification

You might be familiar with Google’s reCAPTCHA tests, which validate human input and deter spam and automated form submissions. Every time a user successfully passed validation, it helped Google’s own OCR software become smarter. Having a human recognize words or phrases that the OCR software couldn’t increased the accuracy of the software. This helped Google digitize the entire New York Times archive.

Image Verification

More recently, reCAPTCHA has us picking out buses, cars, chimneys, store fronts, bridges, etc. This version is called No CAPTCHA reCAPTCHA, and it reduces the friction caused by user frustration at entering the wrong text. Instead, the user simply clicks on one or more images that match the requested image. Here, Google is improving its image recognition software, which in turn improves its image search capabilities and Street View.

Mobile Check Cashing

Cashing your Grandma’s birthday check has never been easier thanks to mobile check cashing. Take a picture of the back and front of your check and boom, money in your account. This is just an instance of OCR benefiting consumers.

Image to Chart

Microsoft Excel’s mobile app now has the capability of capturing data from a picture of a chart and importing it into a spreadsheet. This feature is in beta, and the review on Lifehacker states it’s not perfect.

Research

So we know that there are precedents for such software, but can Google actually read, understand, and leverage text within images in its search results?

Google has stated that its bots and index cannot read images, but SEO Roundtable explains that Google does have a patent for reading text in images. So they state they have the technology; they just aren’t using it in a certain fashion.

Hold on there for a second.

Some quick Googling can provide some more insight. There are many articles explaining that evidence suggests they can and do, and might have for some time.

The Method

Can I replicate these findings? I’m gonna try. That is how you discover new SEO research. Therefore, the blog post featured image in this post has something that I would not normally write about myself in order to have something unique.

To be clear, the text in the image will not appear anywhere else. This includes the filename, ALT text, embedded data, etc. It will only appear as text in the image. My WordPress installation, like many, will generate a few variant images for specific use cases (e.g., thumbnails), so there may be a few different versions that are found and indexed. I’m uncertain which version will rank, although I believe the primary featured image will.
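For anyone repeating this kind of test, a check like the following could confirm that the phrase has not leaked into the page markup. This is a minimal sketch in Python and not something used for this post. The names `TextAndAttributeCollector`, `normalize`, and `find_phrase_leaks` are hypothetical, and the sample markup and file paths are invented. It only inspects HTML text and attribute values (such as `src` and `alt`), so metadata embedded inside the image file itself would need a separate check.

```python
import re
from html.parser import HTMLParser


class TextAndAttributeCollector(HTMLParser):
    """Hypothetical helper: gathers visible text and every attribute value."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (location, value) pairs

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value:
                self.found.append((f"<{tag}> {name} attribute", value))

    def handle_data(self, data):
        if data.strip():
            self.found.append(("page text", data))


def normalize(text):
    """Lowercase and turn punctuation into spaces so 'worst-seo.png' matches 'worst seo'."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def find_phrase_leaks(html, phrase):
    """Return the locations in the markup where the test phrase appears."""
    collector = TextAndAttributeCollector()
    collector.feed(html)
    collector.close()
    target = normalize(phrase)
    return [where for where, value in collector.found if target in normalize(value)]


phrase = "The Worst St Pete SEO Expert"
clean_page = '<img src="/uploads/featured-1.png" alt="Blog featured image"><p>Can Google read images?</p>'
leaky_page = '<img src="/uploads/the-worst-st-pete-seo-expert.png" alt="">'

print(find_phrase_leaks(clean_page, phrase))
print(find_phrase_leaks(leaky_page, phrase))
```

An empty list for the published page would support the claim that the rendered image is the only place the phrase exists, at least as far as the markup is concerned.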

I’m going to give it about 3 months before I report back my findings. Since this website is still pretty new in the eyes of Google, I suspect that the algorithm is still sorting out where this website belongs.