
The Future of Search and SEO

June 16 2011 // SEO // 5 Comments

What did the announcements made at Inside Search really mean for the future of search and SEO? More than meets the eye.

Mobile Search

The key to Google’s mobile announcements is that the human computer interface is being streamlined based on the platform and query intent.

Google developed shortcuts for specific types of mobile searches, essentially providing ways to make searches with certain types of intent super efficient. These searches are now performed through a combination of location and icon selection.

No words necessary.

Google Mobile Shortcuts Example

Making it easier to search is important, but predicting query intent is even more powerful. If you’re not thinking about query intent, you’re not doing your job.

Voice Search

Voice search on desktop devices is another change in the human computer interface. Unlike mobile, this time it’s encouraging more words, not fewer.

Twiki and Gary Coleman

The actual voice search capability is impressive but I’m more interested in how it might change keyword targeting and keyword research. User syntax is a crucial part of keyword research. Do people search on your topic using this word or that word? What words are they using to modify or reformulate their queries?

The answers to those questions are bound to change as voice search becomes a greater part of the equation. In short, you speak differently than you write. This change in user syntax may not happen overnight, but I believe that as voice search becomes more prevalent we’ll be looking at different types of user syntax and, potentially, a shift in how long tail content is accessed.

Image Search

Images are becoming a connective tissue on the Internet. Whether it’s Flickr, mlkshk, Color, Instagram, Photobucket, iStockphoto, deviantART, We Heart It, Pinterest or any number of other sites and services, images are ubiquitous.

I’m an unabashed LOLcats fan and can envision a future where we communicate through a combination of memes. Think hieroglyphs for the Internet age. But I digress.

a picture is worth 1000 words

The trite phrase is that a picture is worth 1,000 words. Yet, I don’t think Google’s at the point where images alone are meaningful. The pixel matching algorithm may be quite good but those pictures still need words to describe them. File names, captions, alt text, anchor text, page content and other descriptors are necessary to provide context and results.

The drag and drop image search interface is a big change. But basic image optimization will remain important and likely increase in importance in the future.

Instant Pages

Speed. Google is consumed with reducing the time it takes between seeking and finding knowledge. Reducing the load time of a SERP click by pre-rendering the page certainly supports this mission and improves user experience. Yet, I think there’s an ulterior motive.

Google wants to get better at predicting quality.

speed of google search

We know that Google measures and likely uses the pogostick rate as a benchmark for satisfaction, which plays an important part in determining search quality. The problem with the pogostick rate was the variable load speed of the target page.

Was the pogostick rate a measure of the quality of the content or of how quickly that content was delivered?

The first way Google looked to solve this was by convincing sites to get faster. So they talked about speed as a signal and built us all that pretty graph in Google Webmaster Tools with the crazy benchmark of 1.5 seconds. Most of us made headway on speed but not enough to satisfy Google.

Google figured out a way to do it themselves and meet their own 1.5 second benchmark. Now the pogostick rate for these pre-rendered clicks is pure. Google has removed the speed of delivery as a confounding factor, giving them more confidence in the pogostick rate as a measure of content satisfaction.

Instant Pages should help Google improve overall search quality.

TL;DR

Google is changing the way we search through vastly different human computer interfaces. They’re looking to make it easier and faster to match query intent with value. If the way we search is changing, you better believe SEO is changing too. Monolithic, text only, cookie-cutter SEO devoid of intent analysis will become less and less effective.

SEO isn’t dead, it’s just getting more interesting.

Google Related Searches

June 13 2011 // SEO // 18 Comments

I love a good spreadsheet full of numbers and columns that I can filter and manipulate. But when I begin keyword research it’s not about numbers, it’s about user intent and query syntax. It’s about finding the right modifiers.

Google Related Searches

Google related searches is a powerful way to gather keyword intelligence. You’ll find a subset of related searches at the bottom of search results but you’ll want to use the advanced search option on the left hand menu to get full value from this feature.

You may need to click the More search tools link to reveal this option.

More search tools

One more housekeeping note. I’m going to split this post up into two sections, the first addressing tactical ways to use Google related searches and the second straying into a more theoretical examination that might appeal to algorithm geeks.

Find Root Modifiers

Related searches can help you quickly identify top root term modifiers.

google related searches for heart attack

For the root term ‘heart attack’ you get a nice collection of modifiers. Then enter a space after the term in the search bar to trip Google Instant.

google instant while in related searches

That’s right, you get double the suggestions when using Google Instant with related searches. Not only that, but these are different from the suggestions offered in the normal Google Instant interface.

Time to use your brain and collect the modifiers that make sense. I usually copy and paste the root modifier combinations to a text pad. Then drop the list into the Google Keyword Tool to find additional keyword opportunities and benchmark query volume.
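If the copy and paste step gets tedious, it can be scripted. Here’s a minimal sketch, assuming you’ve already pasted the related searches into a plain list. The sample phrases and the function name extract_modifiers are made up for illustration; they are not actual Google output.

```python
def extract_modifiers(root, phrases):
    """Return the words that surround `root` in each phrase, in first-seen order."""
    root = root.lower().strip()
    modifiers = []
    for phrase in phrases:
        text = " ".join(phrase.lower().split())
        if root not in text or text == root:
            continue  # skip phrases without the root term or with nothing added
        modifier = " ".join(text.replace(root, " ").split())
        if modifier and modifier not in modifiers:
            modifiers.append(modifier)
    return modifiers


# Hypothetical related searches pasted in by hand.
pasted = [
    "heart attack symptoms",
    "heart attack causes",
    "signs of a heart attack",
    "Heart Attack Symptoms",
    "cardiac arrest",
]
modifiers = extract_modifiers("heart attack", pasted)
print(modifiers)
```

Phrases that don’t contain the root term (like ‘cardiac arrest’ here) are dropped, so you’d still review those by hand as possible synonyms or semantic terms.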

Find Modifier Classes

Locating a few strong modifiers is nice but identifying a modifier class is even better. I define a modifier class as a set of structured terms. In this instance a picture really is worth 1,000 words.

find modifier classes through related searches

Here you quickly confirm that song is a modifier class for Kasabian lyrics and probably is for any band+lyrics combination. (This is a shameless plug for the amazing Kasabian.)

Another example uncovers an easy brand modifier class.

google related searches identify modifier classes

Modifier classes are great ways to understand query patterns and, should you have the content to support it, expand your footprint.

Find Term Synonyms

Usually you’ll start the research process with a target keyword. However, this term may or may not be the way your customers are actually searching for your product or service.

google related searches for tv repair

In this instance ‘tv repair’ might be synonymous with ‘tv troubleshooting’ and ‘tv problems’. It’s up to you to figure out what the query intent is for each of these variations. Don’t just blindly gather up modifiers willy-nilly!

I recommend using Google Insights for Search to see the category profile for each of these terms. In this case, you’ll find that the local category is more prevalent for those searching for ‘tv repair’ than ‘tv troubleshooting’. I interpret this to mean the intent behind ‘tv repair’ may lean toward finding a local tv repairman while the intent behind ‘tv troubleshooting’ may lean toward finding a do-it-yourself solution.

As a safeguard you’ll want to perform a synonym query on many of these modifiers.

Find Competitors

Google related searches can also bring up potential competitors.

find competitors in google related searches

For the term ‘sweatshirt’, retailers like American Eagle, American Apparel, Old Navy and Hollister are all presented. If you’re selling sweatshirts you might add these sites to your list of competitors. What’s interesting is that none of these sites rank on the first page of organic results for this term.

A quick analysis leads me to believe these sites are optimizing for ‘hoodie’ and ‘sweater’ instead of ‘sweatshirt’. That might not be a bad idea based on comparative query volume.

Find Semantic Keywords

You’ll probably have noticed by now that sometimes Google related searches don’t include the target keyword at all. In these instances Google is returning a type of semantic keyword.

find semantic keywords using google related searches

Bananas foster is my favorite dessert (and I was lucky enough to have it at Brennan’s). Here you can see that Google is returning other desserts related to bananas foster. To be fair this probably isn’t true semantics but simply a measure of closely related queries. Nevertheless it can be an interesting way to find potential keyword targets for SEO or PPC programs.

Speaking of PPC, what about pay per click?

google related searches for pay per click

Sure enough Google related searches does a nice job of bringing up related terms without the keyword term being present.

You Are A Tool

Sounds like a put down but it’s actually a compliment. Google related searches is a powerful keyword research tool, allowing you to explore and find modifiers, modifier classes, synonyms, competitors and semantic terms. Yet the most critical part is to apply your own analysis and to intelligently validate assumptions with other tool sets.

(This ends the tactical part of the post. You’re now entering the theoretical side that might only appeal to algorithm geeks. You’ve been warned.)

Bigram Breakage

What struck me as I examined related searches is how it reveals the application of bigram algorithms. Loosely interpreted, it’s a way to model where to split word pairs or break a query into component parts.

Admittedly, I’m a neophyte in this realm so I’ll let Steven Levy, author of In The Plex, explain it.

The key to understanding … was the black art of “bigram breakage”; that is, how should a search engine parse a series of words entered into a query field, making the kind of distinctions that a smart human being would make?

Related searches puts bigram breakage front and center.

Example of Bigram Algorithm

The related searches for ‘social network sites’ return modifiers on the entire term, but also for component terms: ‘sites’ and ‘social network’. This indicates that Google has learned where best to split this phrase. There aren’t any ‘network sites’ modifiers displayed.
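One way to picture bigram breakage is to score each adjacent word pair by how often the two words appear together, then split the query at the weakest pair. The sketch below does just that. It is a toy illustration, not Google’s algorithm: the counts are invented, and the ‘split at the weakest pair’ rule and the function name are my own assumptions.

```python
def split_at_weakest_bigram(query, bigram_counts):
    """Split a query at the adjacent word pair with the lowest count."""
    words = query.lower().split()
    if len(words) < 2:
        return [query.lower()]
    pairs = list(zip(words, words[1:]))
    # Index of the pair that co-occurs least often; unseen pairs count as 0.
    weakest = min(range(len(pairs)), key=lambda i: bigram_counts.get(pairs[i], 0))
    return [" ".join(words[: weakest + 1]), " ".join(words[weakest + 1 :])]


# Hypothetical co-occurrence counts, invented for this example.
counts = {("social", "network"): 900, ("network", "sites"): 40}
parts = split_at_weakest_bigram("social network sites", counts)
print(parts)
```

With these made-up counts the break lands between ‘social network’ and ‘sites’, which mirrors the component terms in the related searches. A real system would also need a rule for when not to split at all.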

Here’s another interesting example.

Google Wildcard Bigrams

Once again Google returns the whole term with modifiers. But this time it’s identified ‘social contact’ and ‘social ___ integration’ as the component terms. I, for one, am fascinated by how Google determines how to split word phrases.

Speed of (Machine) Learning

Not only can you get a sense for how Google is splitting word phrases but also how long it takes them to learn when to do so.

Angry Birds Related Searches

Related searches for ‘angry birds’ return only modifiers. It seems like Google has learned that it should not split these two words. But it hasn’t yet learned to return related searches without that keyword (e.g., Fruit Ninja).

Contrast this to another angry animal.

Angry Dogs Related Searches

The contrast is pretty stark. Here Google does split the phrase, producing modifiers for ‘dogs’ and ‘angry’.

Watch and Learn?

Understanding the science behind search can be interesting and, at times, useful. In this case, I also wonder if watching how the related searches change over time for certain valuable keywords might be instructive.

What happens when the bigram breakage for a term changes? How long does it take for Google to recognize when not to break a word pair? How long does it take before Google develops semantic terms for that word pair? How do these things impact normal day-to-day optimization efforts?

I don’t know the answers. Heck, I might not even be asking the right questions! But I believe observation can be a great teacher. So I’ll be keeping an eye on when and how certain related searches evolve over time.

Google Scribe SEO Hints

June 05 2011 // SEO + Technology // 6 Comments

Lost in the Google +1 button launch and Schema.org announcement was the release of a new version of Google Scribe. In typical Google fashion, this unassuming product may be more important than both Google +1 and Schema.org.

Google Scribe

What is Google Scribe?

Google Scribe is one of a number of Google Labs experiments.

Google Scribe helps you write better documents. Using information from what you have already typed in a document, Google Scribe’s text completion service provides related word or phrase completion suggestions. It also checks your documents for incorrect phrases, punctuations, and other errors like misspellings. In addition to saving keystrokes, Google Scribe’s suggestions indicate correct or popular phrases to use.

Think of it as an intelligent version of Google Docs.

Language Research Lab

But what is Google Scribe really about? Look no further than the engineer working on the project.

Google Scribe Engineer

That’s right, Google Scribe is about language models, something at the core of how Google interprets and evaluates web content.

Since Google Scribe’s first release on Google Labs last year, we have been poring over your feedback and busy adding the top features you asked for. Today, we’re excited to announce a new version of Google Scribe that brings more features to word processing.

Poring over your feedback might seem like they’re reading comments and suggestions submitted by users, but in actuality I’m guessing it’s the complex analysis of usage. Google Scribe is about language research. The kind of research helping Google refine algorithmic signals.

Every time you use Google Scribe you’re helping to refine the language model by choosing from one of many text completion suggestions. Google is getting smarter about language.

Semantic Proofreading

One of the new features seems to be a direct result of this analysis: semantic proofreading.

Semantic Proofreading Example

Normal spell check would not catch the words in this example because both words are correctly spelled. Yet, the language model has learned that the word awesome is rarely ever preceded by the word quiet.

That’s quite awesome.
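To make the idea concrete, here’s a toy sketch of how a bigram language model could flag a correctly spelled but unlikely word. It assumes a table of word pair counts and a list of easily confused words. Both are invented for this example, as is the function name suggest_fix; how Google Scribe actually works isn’t public.

```python
def suggest_fix(prev_word, word, bigram_counts, confusables):
    """Return a likelier replacement for prev_word, or None if it looks fine."""
    best = prev_word
    best_count = bigram_counts.get((prev_word, word), 0)
    for candidate in confusables.get(prev_word, []):
        count = bigram_counts.get((candidate, word), 0)
        if count > best_count:
            best, best_count = candidate, count
    return best if best != prev_word else None


# Hypothetical counts and confusable pairs, invented for this example.
counts = {("quiet", "awesome"): 2, ("quite", "awesome"): 850}
confusables = {"quiet": ["quite"], "quite": ["quiet"]}
print(suggest_fix("quiet", "awesome", counts, confusables))
print(suggest_fix("quite", "awesome", counts, confusables))
```

The first call suggests ‘quite’ because that pairing is far more common in the made-up counts; the second call returns None because nothing beats the word already in place.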

Good Writing Matters

Unless you’ve been living under a rock you probably know that Google is using spelling and grammar as a way to determine content quality. Any analysis of Amit Singhal’s Panda questions would indicate that grammar and spelling are gaining in algorithmic importance.

I’d recently discussed Google’s potential use of spelling and grammar on reviews with Bill Slawski. I wasn’t convinced it was a good idea.

But then Barry Schwartz reported on a Google Webmaster Forum response by Google employee John Mueller regarding spelling in user generated content.

This was noteworthy enough to prompt an official Google tweet.

Google Good Spelling Tweet

Is that clear enough for you?

Anchor Text Suggestions

This new version of Google Scribe creates automatic anchor text for a URL. That in itself is pretty interesting, but Google Scribe also gives alternate anchor text suggestions and the ability for the user to create their own.

Here are two examples using my fellow Sphinn editors: Michael Gray and Jill Whalen.

Google Scribe Link Suggestions for Wolf Howl

Google Scribe Link Suggestions for High Rankings

Clearly Google Scribe is already seeing and using back link profiles. But Google will learn about the validity of the anchor text every time someone changes the anchor text from the automated, or primary suggestion, to one of the other suggestions or creates something entirely new.

What happens when Google Scribe determines that the primary suggestion for a URL is rarely used? The implication is that link suggestions could provide a feedback mechanism on overly optimized or ‘off-topic’ anchor text.

In other words, a paid link signal.

High Quality Documents

I’m convinced Google Scribe is helping to improve Google’s ability to interpret and analyze language. But there are indications that Google could be thinking even bigger.

Google Scribe Labs Description

Sure enough the description of Google Scribe starts with that succinct elevator pitch. “Write high-quality documents quickly.” The last word tells me it’s meant to support the new digital content work flow.

Scribe Bookmarklet and Extension

You can take Google Scribe on the go using the bookmarklet or Chrome extension. I’m using the bookmarklet right now as I’m writing this post.

Google Scribe WordPress Integration

It’s a bit clunky from a UX perspective but I see a lot of potential. A more refined product might help sites ensure their users are producing well written user generated content.

Flipping The Funnel

Why limit yourself to the output of content when you can influence the input of content?

The explosion of digital content has been made possible, in large part, by blogging platforms. Yet, the quality of the content has been uneven, and that’s probably being generous. So why not attack the problem at the top of the funnel? Help people write better content.

I like the idea. In fact, I like it so much I’m exploring a side project that does the same thing in a different yet complementary way.

Google Scribe and SEO

Data and Spot Star Trek LOLcat

Like it or not, Google is using spelling and grammar to determine content quality. Google Scribe is one method being used by Google to better understand and evaluate language and anchor text. It’s not about the actual product (right now) but about the data (feedback) Google Scribe is producing.

Instead of obsessing about the specifics of the Panda update the SEO community can look to Google Scribe and take the hint. It’s not just what you say, it’s also how you say it.

So if you’re responsible for content, take a few more minutes and proofread your work. Google will.

Mechanical Turk Tips

June 03 2011 // SEO + Technology // 7 Comments

Amazon Mechanical Turk is a great way to do a wide variety of tasks, from content creation to image tagging to usability. Here are 15 tips to get the most out of Mechanical Turk.

Mechanical Turk Logo

Learn The Lingo

What’s a HIT? Mechanical Turk can be a bit confusing upon first glance. In particular, you’ll need to understand this one important acronym.

A HIT is a Human Intelligence Task and is the work you’re asking workers to perform. A HIT can refer to the specific task you’re asking them to perform but also doubles as the terminology of the actual job you post in the community.

Select A+ Workers

95 percent or more approval rate for HITs

The long and the short of it is that reputation matters and past performance is a good indicator of future performance. Limit your HITs to those with at least a 95% approval rate.

It may shrink your pool of workers and could increase the time to completion but you make up for it in QA savings.

Segment Your Workers

Match the right workers to the right task. In my experience, you get better results from US based workers when you’re doing anything that requires writing or transcription. Conversely, international workers often excel in tasks such as data validation and duplicate detection.

Give Workers More Time Than They Need

The time you give is the time workers have before the HIT disappears. Imagine starting a job and when you come back to turn in your work and collect payment the shop has closed and left town. This can really frustrate workers.

Mechanical Turk Reward Tip

I think Amazon creates this problem with the messaging around the hourly rate calculation. My advice: don’t get too hung up on the hourly rate and err on the side of providing more time for your HITs.

Provide Specific Directions

Remember that you are communicating work at a distance to an unknown person. There’s no back-and-forth dialog to clarify.

In addition, workers are looking to complete work quickly and to ensure they fulfill the HIT so their approval rate remains high. The latter, in particular, makes specificity very important.

Tell workers exactly what to do and what type of work output is expected.

Make It Look Easy

While the directions should be specific you don’t want a 500 word paragraph of text to scare folks off. Make sure your HIT looks easy from a visual perspective. This means it’s easily scanned and understood.

Take advantage of the HTML editor and build in a proper font hierarchy, appropriate input fields and use a pop of color when you really want to draw attention to something important.

Give Your HIT a Good Title

Make sure your HIT title is the appropriate length (not too short or long) and that it’s descriptive and appealing.

Mechanical Turk HIT Title examples

A good title is a mixture of SEO and marketing principles. It should be relevant and descriptive but also interesting and alluring.

Bundle The Work

If you can do it, bundle a bunch of small tasks into one HIT. For instance, have them tag 10 photos at a time.

This helps because you can set a higher price for your HIT. You’ll attract a larger pool of workers since many don’t seek out ‘penny’ HITs.
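Bundling is easy to prepare before you upload anything. A minimal sketch, assuming your items are already in a list; the photo file names are placeholders and the bundle size of 10 comes from the example above.

```python
def bundle(items, size):
    """Group items into consecutive bundles of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i : i + size] for i in range(0, len(items), size)]


# Hypothetical photo file names.
photos = [f"photo_{n:03d}.jpg" for n in range(1, 26)]
bundles = bundle(photos, 10)
print(len(bundles), [len(b) for b in bundles])
```

Each bundle then becomes one HIT. The last bundle may be smaller than the rest, so make sure your directions still make sense for a short one.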

Mind Your Email

Workers will email you – frequently. Do not ignore them.

You are joining a community. Just take a peek at Turker Nation. As with any community, you get and build a reputation. Don’t make it a bad one. Respond to your email, even if the response isn’t what workers want to hear.

In addition, you learn how to tweak your HIT by listening to and interacting with the workers.

Pay Fast

A lot of the email you may receive is around a familiar refrain: “When will you pay?” This gets tedious so I generally recommend paying quickly, reducing the amount of unproductive email and giving you a good reputation within the community.

Pay Mechanical Turk HITs Fast

That means setting your automatic approval for something like 2 or 3 days.

Develop a QA System

To pay fast you need a good QA system. You can either do this yourself or, alternatively, put the work out as a separate HIT. That’s right, you can use Mechanical Turk to QA your Mechanical Turk work. Insert your Inception or Yo Dawg joke here.

Bonus Good Work

10 Dollar Bill

Give a bonus when you find workers who have done an excellent job on a number of HITs. It doesn’t have to be a huge amount, but take the top performers and give them a bonus.

Not only is this the right thing to do, it’ll go a long way to establishing yourself in the community and developing a loyal pool of quality workers.

Build a Workforce

Once you find and bonus good workers, continue to give them HITs. You can do this by creating a list of and limiting HITs to just those workers.

If you do this you probably want to keep the ‘Required for preview’ box checked so workers not on that list aren’t frustrated by previewing a HIT they don’t have any chance of working on.

Download the worker history (under Manage > Workers) and use Excel to find high volume and high quality workers. Then create your list (under Manage > Qualification Types) so you can use it in your HIT.
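If you’d rather script this than use Excel, here’s a rough sketch of the same filter. The column names (worker_id, approved, submitted), the sample rows and the thresholds are all assumptions for illustration. Check the headers in your own download and adjust accordingly.

```python
import csv
import io

# Hypothetical export; the real column names in your download may differ.
SAMPLE_CSV = """worker_id,approved,submitted
W1,192,200
W2,45,50
W3,300,400
W4,99,100
"""


def top_workers(csv_text, min_submitted=100, min_approval=0.95):
    """Return IDs of workers with enough volume and a high approval rate."""
    keep = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        submitted = int(row["submitted"])
        approved = int(row["approved"])
        if submitted == 0 or submitted < min_submitted:
            continue  # not enough history to judge this worker
        if approved / submitted >= min_approval:
            keep.append(row["worker_id"])
    return keep


best = top_workers(SAMPLE_CSV)
print(best)
```

In this made-up data W2 is excluded for low volume and W3 for a low approval rate, leaving W1 and W4 as candidates for your list.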

Block Bad Apples

Just as you build a list of good workers, you also need to block a few of the bad ones. They might have dynamite approval ratings but for different types of tasks. Some people are good at some things and … not so good at others.

Coaching workers is time consuming and costly, so it’s probably better for you and the worker to simply part ways. You ensure the approval rate on your HITs remains high and the worker won’t put their approval rate in jeopardy.

Understand Assignments

Finally, understand and use assignments wisely. Each HIT can be assigned to a certain number of workers.

Warning on Assignments per HIT

So if your HIT is about getting feedback on your new homepage design, you might assign 500 workers to that HIT. That means you’ll get 500 reactions to your new homepage. It’s one general task that requires multiple responses.

But if your HIT is about validating phone numbers for 500 businesses, you will assign 1 worker to each HIT. That means you’ll get one validation per phone number. Do not assign 500 workers or you’ll get 500 validations per phone number. That’s wasteful and likely to irk those businesses too.
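The arithmetic is worth spelling out, since this is an expensive mistake. A small sketch using the two scenarios above; the function name is just for illustration.

```python
def total_responses(num_hits, assignments_per_hit):
    """Every HIT is completed once per assignment, so the two multiply."""
    return num_hits * assignments_per_hit


homepage_feedback = total_responses(num_hits=1, assignments_per_hit=500)
phone_validation = total_responses(num_hits=500, assignments_per_hit=1)
costly_mistake = total_responses(num_hits=500, assignments_per_hit=500)

print(homepage_feedback, phone_validation, costly_mistake)
```

Both correct setups produce 500 responses. Getting the assignments wrong on the phone number job produces 250,000, and you pay for every one of them.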

Mechanical Turk Tips

These tips are the product of experience (both mine and the talented Drew Ashlock), of trial and error, of stubbing toes during the process.

I hope this helps you avoid some of those pitfalls and allows you to get the most out of a truly innovative and valuable service.

Google SEO Communication

June 02 2011 // Marketing + SEO // 6 Comments

Google has a love-hate relationship with the SEO community. They view many SEO agencies, consultants and services as part of the problem – parasites that seek to exploit and game their algorithm. No doubt, many fall into this category.

NIN Pretty Hate Machine CD Cover

Unfortunately, Google’s lack of transparency contributes to the problem, spawning a host of poor theories and misguided practices. In addition, the changing nature of the algorithm creates a powerful variant of bit rot – outdated information and myths that stubbornly persist.

In response, Google has worked (perhaps reluctantly) to improve communication with the SEO community. They send employees to search conferences, write blogs, create videos, maintain a forum, provide informational tools and have a presence on social media platforms (Twitter) and sites (Hacker News).

The vast majority of these efforts are undertaken by one person: Matt Cutts.

Last month Google increased their communication efforts, dedicating a blog to search (it’s about time!) and doing a live 90 minute Q&A session via YouTube. I’m encouraged by these new developments but Google still doesn’t have a solid share of voice within the SEO community and when it does it is often viewed with suspicion.

Here are three ways Google could improve SEO relations.

Google Search Summit

Invite select members (perhaps 50) of the SEO community to the Google campus for a search summit with Google engineers. This is very different from a conference where the day-to-day mechanics of the SEO industry are discussed.

Instead, I propose a real exchange of ideas on the nature and problems of search. It could even have a lean component where groups are challenged to propose a new way to deal with a specific search problem.

There are a number of smart folks in the SEO community who could contribute positively to discussions on search quality or web spam. Even if Google doesn’t believe this, understanding how the SEO community perceives certain stances, guidelines and practices would be valuable.

At a minimum, the dialog would provide additional context behind search guidelines and algorithmic efforts. For Google, this means the attendees become agents of ‘truth’. By allowing the SEO community to truly engage and learn, they can help transmit Google’s message. I’m not talking about a Kool Aid conversion but instead building a greater degree of trust through knowledge transfer and personal relationships.

Attendance would require some modicum of discretion and a certain level of knowledge or interest in information retrieval, human computer interaction, natural language processing and machine learning.

Even if I didn’t get an invite (though I’d want one), I think it’s a good idea for Google and the SEO community.

Google Change Log

The SEO community is intensely curious about when and what changes are made to search, whether they be algorithmic or design oriented. Some amount of transparency here would go a long way. Would it really hurt to let the SEO community know that a certain type of bucket test was in the field?

We’re already seeing most of the UX tests, with blogs cranking out screenshots of the latest SERP oddity they’ve encountered. So why not publish a changelog, using FriendFeed as a model?

FriendFeed Change Log

FriendFeed made it clear that its changelog wasn’t comprehensive, but it did provide a level of transparency and insight into pain points and personality. The latter even more so because the user is linked to their FriendFeed account.

Imagine a Google changelog where the user is linked to a Google Profile. God forbid we learn a little bit about the search quality engineers.

I understand that there are certain changes that cannot be shared. But opening the kimono just a little would go a long way.

LOLMatts

Matt Cutts is willing to interact at length at conferences and jump into comment threads (in a single bound). He gets a bit of help from folks like Maile Ohye and John Mueller, but he’s essentially a solo act.

If Google isn’t going to allow (or encourage) more engineers to interact with stakeholders (yeah, I have a business background) then you have to amplify the limited amount of Matt we have at our disposal.

What better way than to create a Matt Cutts meme? LOLMatts!

Matt Cutts Meme on Page Sculpting

Yes, this is tongue in cheek, but my point is to do some marketing.

Matt Cutts Meme about Cloaking

Make the messages pithy and viral.

Matt Cutts Meme about Meta Keywords

Break through the clutter and keep it simple.

Matt Cutts Meme about Paid Links

Make it easier for people to pass along important information. I’ve just created four LOLMatts that cover page sculpting, cloaking, meta keywords and paid links. Of course this can go wrong in a multitude of ways and be used for evil. But the idea is to think of ways to amplify the message.

Develop some interesting infographics. Heck, Danny Sullivan even got you started. Get busy creating some presentations (you could do worse than to use Rand as a model) and upload them to SlideShare. Or create an eBook and let people pay for it with a Tweet.

Let’s see some marketing innovation.

TL;DR

Google’s rocky relationship with the SEO community could be improved through real interaction and engagement, an increase in transparency (both technical and human) and marketing techniques that would amplify their message.

The SEO community and Google would benefit from these efforts.

SEO Freeloaders

May 12 2011 // SEO // 8 Comments

This is not a ‘SEO is Dead’ post. Let me make that clear from the beginning. But SEO is going to get tougher, not because of the Panda update or anything else Google may implement but because search volume growth is decelerating.

SEO Drafting

The SEO industry has had the wind at its back as search volume soared month after month and year after year. Some might say it was pretty tough not to fall into success.

That’s not to say there wasn’t a lot of good SEO going on. But if you were posting 25% yearly SEO growth were you really being effective? Shouldn’t SEO growth be normalized based on search volume trends?
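One simple way to normalize is to divide out the growth in overall search volume. Here’s a minimal sketch of that idea. The 25% figure comes from the question above, while the 20% search volume growth is a hypothetical number chosen only to show the calculation.

```python
def normalized_growth(seo_growth, search_growth):
    """Growth left over after dividing out the growth in overall search volume."""
    return (1 + seo_growth) / (1 + search_growth) - 1


seo_growth = 0.25      # the 25% yearly SEO growth from the example above
search_growth = 0.20   # hypothetical market-wide search volume growth
relative = normalized_growth(seo_growth, search_growth)
print(f"{relative:.1%}")
```

Under that assumption the 25% shrinks to roughly 4% of growth you could actually credit to SEO. The rest is the tailwind.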

Search Volume Trends

Search Volume Trends 2004 to 2011

This graph measures explicit monthly US searches from December of 2004 to April 2011 using a mix of comScore qSearch and Nielsen//NetRatings MegaView Search reports. In that time the number of monthly searches has risen from 3.3 billion to 16.9 billion.

Search Volume Growth

To some the trend might look rosy. But look closer. Using December search volume as a benchmark, the year over year (YoY) growth in search volume is decelerating.

YoY Search Volume Growth

The YoY growth in 2011 could be in the single digits. Of course you could drill down into specific category (even keyword) search growth, but I believe this is a macro level trend based on demographics.

Search Adoption

When I debated the definition of search quality, I mapped out daily search usage against the innovation curve. In May of 2010, 79% of American adults were online and 87% of those online adults used a search engine to find information.

If you do the math, you find that approximately 165 million American adults, or 69% of the total adult population, now use search.

Search Adoption Table

That puts us a little more than halfway through the Late Majority, and that was a full year ago.
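If you want to check the math yourself, here's a small Python sketch. The percentages and the 165 million figure are the ones quoted above. The segment sizes are the standard innovation curve breakdown (2.5%, 13.5%, 34%, 34%, 16%), and the adult population is simply what those figures imply, not a number I looked up.

```python
# Sketch of the "do the math" step, using the figures quoted in the post.

ONLINE_SHARE = 0.79          # American adults who were online (May 2010)
SEARCH_SHARE_ONLINE = 0.87   # share of online adults who used a search engine
SEARCHERS = 165e6            # adults who use search, per the post

search_share = ONLINE_SHARE * SEARCH_SHARE_ONLINE
implied_adults = SEARCHERS / search_share  # population implied by the figures

# Adoption segments in order, as shares of the whole population.
SEGMENTS = [
    ("Innovators", 0.025),
    ("Early Adopters", 0.135),
    ("Early Majority", 0.34),
    ("Late Majority", 0.34),
    ("Laggards", 0.16),
]


def locate(share):
    """Return (segment name, fraction of the way through that segment)."""
    floor = 0.0
    for name, size in SEGMENTS:
        if share <= floor + size:
            return name, (share - floor) / size
        floor += size
    return SEGMENTS[-1][0], 1.0


segment, progress = locate(search_share)
print(f"Adults using search: {search_share:.0%}")
print(f"Implied adult population: {implied_adults / 1e6:.0f} million")
print(f"{segment}: {progress:.0%} of the way through")
```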

SEO Shakeout

Recently the SEO industry has grappled with the idea of standards or certifications and differed on ‘outing’ SEO companies who violate search engine guidelines. The industry is maturing, but I wonder if we’re missing the bigger picture.

That gale force tailwind we once had is now a gentle breeze. Decelerating search volume growth will squeeze mediocre SEO out of our industry.

It will push us all to up our game, to evolve and specialize. The free ride is nearly over. It’s time to put up or shut up.

Translating Panda Questions

May 08 2011 // SEO // 2 Comments

On Friday, Google released a list of questions to help guide publishers who have been impacted by the Panda update.

Because Google can’t (or won’t) give specifics about their algorithm, we’re always left to read between the lines, trying to decipher the true meaning behind the words. Statements by Matt Cutts are given the type of scrutiny Wall Street gives those by Ben Bernanke.

Speculation is entertaining, but is it productive? Google seems to encourage it, even within this recent blog post.

These are the kinds of questions we ask ourselves as we write algorithms that attempt to assess site quality. Think of it as our take at encoding what we think our users want.

So perhaps there is value (beyond entertainment) in trying to translate and decode the recent Panda questions.

Panda Questions Translation

Matt Cutts to English Translation

Would you trust the information presented in this article?

The web is still about trust and authority. The fact that this is the first question makes me believe it’s a reference to Google’s normal calculation of PageRank using the (rickety) link graph.

Is this article written by an expert or enthusiast who knows the topic well, or is it more shallow in nature?

Is Google looking at the byline of articles and the relationships between people and content? Again, the order of this question makes me think this is a reference to the declining nature of the link graph and the rising influence of the people graph.

Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?

This reveals a potential internal duplicate content score and over-optimization signal in which normal keyword clustering is thwarted by (too) similar on-site content. It may also be referred to as the eHow signal.

Would you be comfortable giving your credit card information to this site?

Outside of qualitative measures, Google might be looking for the presence or prominence of a privacy policy.

Does this article have spelling, stylistic, or factual errors?

Google Best Guess at Eiffel Tower Height

We already know that Google applies a reading level to content. But maybe they also extract and run statements through a fact checking database. So, stating that the Eiffel Tower is 500 meters tall (instead of 324) might be a negative signal.

Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?

This sounds like a mechanism to find sites that have no internal topical relevance. In particular, it feels like a signal designed to identify splogs.

Does the article provide original content or information, original reporting, original research, or original analysis?

Nice example of keyword density here! But it certainly gets the point across, doesn’t it? Google isn’t green when it comes to content recycling. Google wants original content.

Does the page provide substantial value when compared to other pages in search results?

This type of comparative relevance may be measured, over time, through pogosticking and rank normalized CTR.
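For what it's worth, here's how I picture rank normalized CTR, as a minimal Python sketch. To be clear, this is my guess at the idea and not Google's formula. The expected CTR per position, the function name and the sample numbers are all made up purely for illustration.

```python
# Hypothetical sketch: every number below is invented for illustration.
# Idea: compare a result's observed CTR with what is typical for its rank.

# Assumed baseline CTR by ranking position (illustrative values only).
EXPECTED_CTR = {1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05}


def rank_normalized_ctr(clicks, impressions, position):
    """Observed CTR divided by the expected CTR for that position.

    Above 1.0 means the result attracts more clicks than its rank alone
    would predict; below 1.0 means fewer.
    """
    if impressions == 0:
        raise ValueError("impressions must be greater than zero")
    observed = clicks / impressions
    return observed / EXPECTED_CTR[position]


# A result in position 3 with a 15% CTR beats the assumed 10% baseline...
print(rank_normalized_ctr(clicks=150, impressions=1000, position=3))
# ...while a result in position 1 with a 15% CTR falls short of 30%.
print(rank_normalized_ctr(clicks=150, impressions=1000, position=1))
```

The point of normalizing is that a raw 15% CTR means very different things at different ranks. Tracked over time, a page that keeps outperforming its position looks like it provides more value than its neighbors.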

How much quality control is done on content?

Misspelled Sign

I think this is another reference to spelling and grammar. Google is proud (and should be) of their Did You Mean? spelling correction. I can’t imagine they wouldn’t want to apply it in other ways. As for grammar, I wonder if they dislike dangling prepositions?

Does the article describe both sides of a story?

I believe this is a Made for Amazon signal that tries to identify sites where the only goal is to generate clicks on affiliate links. I wonder if they’ve been able to develop a statistical model through machine learning that identifies overly one-sided content?

Is the site a recognized authority on its topic?

This seems like a clear reference to the idea of a site being a hub for information within a specific topic.

Is the content mass-produced by or outsourced to a large number of creators, or spread across a large network of sites, so that individual pages or sites don’t get as much attention or care?

Not much between-the-line reading necessary on this one. Google doesn’t like content farms. This might as well just reference Mahalo and Demand Media.

Was the article edited well, or does it appear sloppy or hastily produced?

Once again, more emphasis on attention to detail within the content. Brush up on those writing and editing skills!

For a health related query, would you trust information from this site?

The qualifier of health makes me think this is a promotional signal, seeking to identify sites that are promoting some supplement or herb with outrageous claims of wellness. I’d guess machine learning on the content coupled with an increased need for citations (links) from .gov, .org or .edu sites could produce a decent model.

Would you recognize this site as an authoritative source when mentioned by name?

Brands matter. You can’t get much more transparent.

Does this article provide a complete or comprehensive description of the topic?

Oddly, the first thing that jumps to mind is article length. Does size really matter?

Does this article contain insightful analysis or interesting information that is beyond obvious?

Beyond obvious is an interesting turn of phrase. I’m not sure what to make of it unless they’re somehow targeting content like ‘How To Boil Water’.

It may also refer to ‘articles’ that are essentially a rehash of another article. You’ve seen them, the kind where large portions of another article are excerpted and introduced by a single short sentence.

Is this the sort of page you’d want to bookmark, share with a friend, or recommend?

Social Signals

This clearly refers to social bookmarking, Tweets, Facebook Likes and Google +1s and the social signals that should help better (or at least more popular) content rise to the top. These social gestures are the modern-day equivalent of a link.

Does this article have an excessive amount of ads that distract from or interfere with the main content?

There is evidence (and a good deal of chatter) that Google is actually rendering pages and can determine ads and chrome (masthead, navigation etc.) from actual content. If true, Google could create a content to ad ratio.

I’d also guess that this ratio is applied most often based on what is visible to the majority of users. How much real content is visible when you view your page through Google’s Browser Size tool? You should know.
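Here's a rough Python sketch of what a content to ad ratio for the visible part of a page might look like. It's pure speculation on my part: the `Block` type, the page layout and the viewport size are all invented, and nothing here reflects how Google actually measures a page.

```python
# Hypothetical sketch: block positions and sizes are invented, and nothing
# here reflects how Google actually measures a page.
from dataclasses import dataclass


@dataclass
class Block:
    kind: str    # "content", "ad" or "chrome"
    x: int       # left edge in pixels
    y: int       # top edge in pixels
    width: int
    height: int


def visible_area(block, viewport_width, viewport_height):
    """Area of the block that falls inside the initial viewport."""
    w = min(block.x + block.width, viewport_width) - max(block.x, 0)
    h = min(block.y + block.height, viewport_height) - max(block.y, 0)
    return max(w, 0) * max(h, 0)


def content_to_ad_ratio(blocks, viewport_width, viewport_height):
    """Visible content area divided by visible ad area."""
    totals = {"content": 0, "ad": 0}
    for block in blocks:
        if block.kind in totals:
            totals[block.kind] += visible_area(block, viewport_width, viewport_height)
    if totals["ad"] == 0:
        return float("inf")  # no ads visible at all
    return totals["content"] / totals["ad"]


page = [
    Block("chrome", 0, 0, 1000, 100),     # masthead and navigation
    Block("ad", 0, 100, 1000, 100),       # leaderboard ad
    Block("content", 0, 200, 700, 2000),  # article body
    Block("ad", 700, 200, 300, 600),      # sidebar ad
]

print(content_to_ad_ratio(page, viewport_width=1000, viewport_height=600))
```

Shrink the viewport and the ratio gets worse in a hurry, because the masthead and the leaderboard ad eat the space before any article text shows up. That's the number I'd want to know for my own pages.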

Would you expect to see this article in a printed magazine, encyclopedia or book?

This anachronistic question is about trust. I simply find it interesting that Google still believes that these old mediums convey more trust than their online counterparts.

Are the articles short, unsubstantial, or otherwise lacking in helpful specifics?

Another question around article length and ‘shallow’ content. I wonder if there is some sort of word diversity metric that could be applied that would help identify articles that lacked substance and specifics.
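One simple candidate for that kind of word diversity metric is the type-token ratio: unique words divided by total words. I have no idea whether Google uses anything like it, so treat the Python sketch below as an illustration of the idea only. The two sample texts are made up.

```python
# Sketch of a simple word diversity measure (type-token ratio). This is an
# illustration of the idea only, not a signal Google has confirmed.
import re


def type_token_ratio(text):
    """Unique words divided by total words, from 0.0 to 1.0."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return len(set(words)) / len(words)


thin = "Best cheap widgets. Buy cheap widgets. Cheap widgets are the best widgets."
specific = "Measure the bracket, drill two pilot holes, then fasten it with steel screws."

print(f"thin: {type_token_ratio(thin):.2f}")
print(f"specific: {type_token_ratio(specific):.2f}")
```

One caveat: the ratio naturally falls as a text gets longer, so it only makes sense to compare articles of similar length.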

Are the pages produced with great care and attention to detail vs. less attention to detail?

The verbiage of ‘pages produced’ makes me think this is about code and not content. We’ve heard that code validation isn’t really a signal, but that’s different from seeing gross errors in the mark-up that translate into a bad user experience.

Would users complain when they see pages from this site?

Blocked Sites

This is obviously a reference to the Chrome Personal Blocklist extension and new Blocked Sites functionality. Both features seemed like reactions to pressure from people like Vivek Wadhwa, Paul Kedrosky, Jeff Atwood, Michael Arrington and Rich Skrenta.

That this question is last in this list makes it seem like it was a late addition, and might lend some credence to the idea that these spam initiatives were spurred by the public attention brought by the Internati.

Panda Questions Analysis

Taken together it’s interesting to note the number of questions that seem to be consumed with grammar, spelling and attention to detail. Yet, if Google had really gotten better at identifying quality in this way, wouldn’t it have been better to apply it on the URL level and not the domain level? (I have a few ideas why this didn’t happen, but I’ll share them in another post.)

Overall, the questions point to the shifting nature of how Google measures trust and authority as well as a clear concern about critical thinking and the written word. In light of the recent changes at Google, is this evidence that Google is more concerned with returning knowledge than simple results?

WordPress Duplicate Content

April 27 2011 // Rant + SEO + Technology // 23 Comments

In February Aaron Bradley sent me an email to let me know that I had a duplicate content problem on this blog. He had just uncovered and rectified this issue on his own blog and was kind enough to give me a heads up.

Comment Pagination

The problem comes in the way that WordPress handles comment pagination. The default setting essentially creates a duplicate comment page.

Here’s what it looks like in the wild. Two pages with the same exact content.

http://blog.wolframalpha.com/2011/04/18/new-age-pyramids-enhance-population-data/comment-page-1/

http://blog.wolframalpha.com/2011/04/18/new-age-pyramids-enhance-population-data

That’s not good. Not good at all.

Comment-Page-1 Problem

The comment-page-1 issue offends my own SEO sensibilities, but how big of a problem is it really?

WordPress Spams Google

There are 28 million inurl results for comment-page-1. 28 million!

Do the same inurl search for comment-page-2 and you get about 5 million results. This means that only 5 million of these posts attracted enough comments to create a second paginated comment page. Subtract one from the other and you wind up with 23 million duplicate pages.

The Internet is a huge place so this is probably not a large percentage of total pages but … it’s material in my opinion.

Change Your Discussion Settings

If you’re running a WordPress blog I implore you to do the following.

Go to your WordPress Dashboard and select Settings –> Discussion.

How To Fix Comment-Page-1 Problem

If you regularly get a lot of comments (more than 50 in this default scenario) you might want to investigate SEO friendly commenting systems like Disqus, IntenseDebate or LiveFyre.

Unchecking the ‘break comments into pages’ setting will ensure you’re not creating duplicate comment pages moving forward. Prior comment-page-1 URLs did redirect, but seemed to be doing so using a 302 (yuck). Not satisfied, I sought out a more permanent solution.

Implement an .htaccess RewriteRule

It turns out that this has been a known issue for some time and there’s a nice solution to the comment-page-1 problem in the WordPress Forum courtesy of Douglas Karr. Simply add the following rewrite rule to your .htaccess file.

RewriteRule ^(.*)/comment-page-1/ $1/ [R=301,L]

This puts 301s in place for any comment-page-1 URL. You could probably use this and keep the ‘break comments into pages’ setting on, which would remove duplicate comment-page-1 URLs but preserve comment-page-2 and above.

Personally, I’d rather have the comments all on one page or move to a commenting platform. So I turned the ‘break comments into pages’ setting off and went a step further in my rewrite rule.

RewriteRule ^(.*)/comment-page-.* $1/ [R=301,L]

This puts 301s in place for any comment-page-#. Better safe than sorry.
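Before touching a live .htaccess file, it can help to sanity check the patterns somewhere harmless. Here's a small Python sketch that does that. Python's re module is only a stand-in for mod_rewrite, but these simple patterns should behave the same way in both. It assumes each pattern captures the post path in a parenthesized group, since $1 only has something to refer to when that group exists. The helper name is mine.

```python
# Sketch: Python's re module stands in for mod_rewrite here. It is not
# Apache, but these simple patterns behave the same way in both.
import re

# Both patterns capture the post path so the "$1/" target has a value;
# "$1/" is written as r"\1/" in Python.
FIRST_PAGE_ONLY = re.compile(r"^(.*)/comment-page-1/")
ANY_COMMENT_PAGE = re.compile(r"^(.*)/comment-page-.*")


def rewrite(pattern, path):
    """Return the redirect target, or None when the rule does not match."""
    match = pattern.match(path)
    return match.expand(r"\1/") if match else None


# In .htaccess the path is matched without its leading slash.
post = "2011/04/18/new-age-pyramids-enhance-population-data"

print(rewrite(FIRST_PAGE_ONLY, post + "/comment-page-1/"))
print(rewrite(FIRST_PAGE_ONLY, post + "/comment-page-2/"))
print(rewrite(ANY_COMMENT_PAGE, post + "/comment-page-2/"))
print(rewrite(ANY_COMMENT_PAGE, post + "/"))
```

The first rule should leave comment-page-2 alone, the second should catch every comment page, and neither should touch the post URL itself. Once the rules are live, request an old comment-page-1 URL and confirm the response is a 301 to the post.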

Don’t Rely on rel=canonical

Many of the comment-page-1 URLs have a rel=canonical in place. However, sometimes it is set up improperly.

Improper Rel=Canonical

Here the rel=canonical actually reinforces the duplicate comment-page-1 URL. I’m not sure if this is a problem with the Meta SEO Pack or simple user error in using that plugin.

Many times the rel=canonical is set up just fine.

Canonical URL from All-In-One SEO Pack

The All in One SEO Pack does have a Canonical URL option. I don’t use that option but I’m guessing it probably addresses this issue. The problem is that rel=canonical doesn’t stick nearly as well as a 301.

Comment-Page-1 in SERP

So even though this post from over three months ago has a rel=canonical, the comment-page-1 URL is still being returned. In fact, there are approximately 110 instances of this on this domain alone.

Comment Page 1 Site Results

Stop Comment-Page-1 Spam

23 million pages and counting. Sure, it would be nice if WordPress would fix this issue, but short of that it’s up to us to stop this. Fix your own blog and tell a friend.

Friends don’t let friends publish duplicate content.

The Fresh Content Myth

April 14 2011 // SEO // 4 Comments

One of the SEO myths that seems to stubbornly persist is the value of fresh content. The problem revolves around the definition of fresh. Google likes new content. That’s very different from refreshed content, which is where many people seem to focus their attention.

Here’s a quick guide to new versus refreshed content, illustrated with cats (and Vladimir Putin).

Google Loves Kittens

New content is what Google craves. Googlebot will fawn over newly minted content. “Awwwww, so cute!”

This is what Google means when they say they want fresh content. They want kittens! It’s not that they don’t like cats, but they’re all grown up. They’re not as exciting or surprising anymore.

Kittens For Sale!

One of the tricks some people talk about is changing the time stamp on a piece of content. The idea is that changing the date makes the content look new. Now, my SEO philosophy is based on the idea that search engines are like children, but even a five-year-old can tell the difference between a kitten and a cat.

Kitten vs Cat

Saying your cat is really a kitten won’t work. Google knows it’s still a cat.

Renaming Your Cat

Others will change the title tag thinking that, just by making this change, the search engine will treat it like ‘fresh’ content. If you renamed your cat, would it suddenly become a kitten?

One of the problems here is that you can sometimes change the title tag to something better … or worse.

Cat vs Putin

If your title tag for this content was Cat and you change it to Russian Blue you’ll probably do better. If you then change the title tag from Russian Blue to Blue Russian you probably won’t. At that point Google may think your content is about a very cold individual from Russia.

Renaming your content doesn’t transform your cat into a kitten. However, changing your cat’s name may have an impact on relevance one way or the other.

Dressing Your Cat

Basement Cat Sweater

New comments or reviews make my content fresh, right? Wrong! Dressing your content up in a new outfit does not turn it into new content. Once again, you might get a false positive, because the additional text may add (or subtract due to pagination) something to the content.

Putting a sweater on your cat might make it more interesting, but it’s still a cat.

Changing Your Cat

Maybe you come home and your child has painted the cat green. Perhaps the cat comes home one night with a small notch out of its ear. Is that cat now a kitten? No! Small changes to your content do not make it new.

All Cats

Even major changes to content don’t transform a cat into a kitten. Again, you might change the relevance of that content (for good or for bad) but it won’t be new. If the content is completely different (a dog instead of a cat) you probably want to create a new piece of content.

Because Google likes puppies too.

Are Brands Good For Search?

April 11 2011 // eCommerce + SEO // Comments Off on Are Brands Good For Search?

Brands are becoming a greater part of search results. But is that a good thing?

Brands in Search

Brands are the solution, not the problem. Brands are how you sort out the cesspool.

That’s what then-Google CEO Eric Schmidt said in October of 2008. In March of 2009 the Vince Change made good on that comment, giving brands an extra boost in search results. And today we talk about the rising prominence of brand signals in Google’s algorithm.

Brands, Comfort and Trust

Many claim that brands increase trust in search results. Users see something familiar and that conveys a level of trust. This might be true (though I think people may be conflating comfort with trust) but, more to the point, is it really what search is about?

Why do we search? Many definitions of search imply the act of locating something otherwise unknown or concealed from us. That certainly doesn’t apply to these brands.

Around The Internet Corner

A recent comment on this blog is what really got me thinking about how brands and search intersect.

If I wanted to buy something, I’d go straight to Amazon. I don’t need the top of half my Google searches to all be stores I can drive down the street to get to.

A number of years ago it was far more difficult to get from one point of the Internet to the other. Connection speeds were slower and tabbed browsing wasn’t as ubiquitous as it is today.

Search supplanted browsers as the fastest way to get from point A to point B. But today, not only are those stores ‘down the street’, they’re also just around the Internet corner. The next site is one ‘open a new tab’ click away.

Search may no longer be the fastest way to get from point A to point B.

Speedy Navigation

Search Speed and Navigation

I think Google has responded to this evolution. Google Instant can be viewed in a very different light if you think about whether search or the browser is the fastest way to get from place to place. Shaving off those seconds is tremendously important in ensuring that users continue to use Google to navigate the Internet.

What about navigation? Navigational searches are on the rise, and we seem to tacitly accept navigational search as a given part of the landscape. But why?

Why are we still using Google to search inventory of known sites and brands? We know how to get to these stores. Well … Google makes it easy, providing more and more pathways to brands and stores.

Google Related Brands and Stores

I think these implementations might also be teaching users that they could simply visit these sites directly. Right now inertia is on Google’s side, but for how much longer?

Better Brand Search

Browsers have a real opportunity to retake control of user navigation. Unfortunately, the human computer interface for browsers is dreadful.

Firefox Search Box

Could the search box dynamically change the search engine based on the query? Right now the user is forced to change this on a per query basis. And I’d bet selecting other search engines is a low single digit percentage activity.

Maybe a transactional search brings up the option to search your favorite stores, launching each in a separate tab? It could even be a separate window, creating a self-contained environment for you to shop your favorite stores for that product.

Perhaps as you visit eCommerce sites your browser prompts you to add that site to your personal mall. Then when you’re looking for a product, you simply enter it (in a different and well labeled field) and your personal mall is created.

This is but one off-the-cuff idea! There are so many other ways to tackle this problem that would eliminate the need for traditional search engines.

But who is going to do this? Mozilla has little incentive to innovate in this direction given their lucrative relationship with Google. Chrome? Not unless someone else did it first. That leaves Internet Explorer, which has consistently shown a lack of vision and execution.

Another search engine? I do like what DuckDuckGo is doing, by automatically putting an Amazon search result at the top when it identifies a transactional query. But I’m not sure any upstart has the power to turn the tide without help from a browser.

Brands Hasten Search Demise

Why search Google if it’s just returning the same brands I already know and trust? Especially since I can get to those sites (quickly) without the annoying ads.

By placing more and more brands at the top of search results I feel like Google is hastening this realization. Users may begin to see the results as more comfortable and trusted but not more valuable.

brand directory

Search is currently the infrastructure of the Internet mall. It’s how people ‘walk’ from one store to the other. Homogenized brand results may turn Google into a directory of the Internet mall. You might reference the directory once in a while when you’re stuck, but most of the time you’ll ‘walk’ from store to store on your own instead.
