San Francisco SEO and Internet Marketing | Blind Five Year Old

PageRank Ponzi

September 09 2011 // Rant + SEO // 12 Comments

Why are you still submitting your site and articles to directories? Sure, there was a time when directories were valuable. But that time has passed. So stop feeding their business and build your own instead.

Totally Flabbergasted LOLcat

PageRank Ponzi

Directories are essentially a form of PageRank ponzi. They use your content to build their business – to build their trust and authority – and, in exchange, lease a small fraction of that trust and authority (e.g. PageRank) back to you.

You either give away or actually pay to provide them with content. They take your assets, gladly, and use them to do what you should be doing. Even if you get a small benefit from this exchange, you’re getting the short end of the stick.

Directory Heyday

There was a time when directories were useful and valuable. From the mid-to-late 90s to around 2003, directories were used by many to find sites and content. This was before tabbed browsing and broadband connections made it easy to get from one site to another. This was before search became the dominant way to navigate the web. This was before social platforms allowed you to tap your social graph and crowdsource information.

One only needs to look at the search volume for the term ‘web directory’ to see that this is an outdated method of online discovery.

Distribution

In the directory heyday it may have been difficult to get your site, article or blog post distributed. The web was not nearly as connected or fluid.

But today we have blogging platforms, a robust social graph and numerous social media outlets that give you an opportunity to capitalize on your own intellectual property instead of giving it away to others for peanuts.

We Are The Directory

Whether you call it curation or crowdsourcing there are other repositories that mimic and exceed the traditional directory. You might search Delicious. In fact, more people should. Or you might try out Trunk.ly.

We’re doing the work of directories every day.

Caffeine

In June of 2010, Google launched Caffeine and increased their ability to crawl and index the web. This was one of the last pieces of the puzzle in making directories obsolete.

Previously, directories might have been able to quickly surface new sites or content that hadn’t yet been found by Google. But that’s just not the case today. Google finds new content even in the dark and dusty corners of the Internet where Geocities pages lurk and survive.

Google Directory

So what does Google think about directories today?

Google Directory No Longer Available Message

Google shut down their directory. Read that again and think about what it means for the future and value of directories. And don’t get me started on the utter collapse of DMOZ. (No, I’m not even going to link there.)

As an aside, Google may want to consider a folder level URL removal so directory results (which return a 404) don’t clutter up SERPs.

Directory Spam

Most web directories are hastily thrown together arbitrage sites that serve as outposts for spam. Here’s an excerpt from an email sent to me by an ‘SEO Consultant’.

Directory Spam

This is not SEO, at least not the SEO I practice. Some may reject this carpet bombing approach but subscribe to the idea that a handful of paid directories are worthwhile.

I say save your money.

Paid Link or Paid Listing?

Jack McCoy from Law & Order

Frankly, I’m still a bit irked that Google doesn’t view a paid listing as a paid link. The argument for paid directories is that they provide a certain level of curation that makes them valuable. You’re paying for someone to curate that directory – not for the link. This seems a very thin argument at best, and a bunch of claptrap at worst. Most, if not all, directories are pretty much a free-for-all as long as what you’re submitting isn’t complete spam or off topic. The level of curation is marginal, and I’m being nice.

Not only that, but it comes down to intent. For some reason I hear Jack McCoy yelling ‘intent follows the bullet’. It’s not a perfect analogy, but the general idea is that intent matters. Today, the intent for a directory listing is, quite simply, to secure a back link. So, what exactly is the difference between a paid link and a paid listing? There is none as far as I can tell.

Link Value

REM Out of Time Cover Art

How valuable is that directory link anyway? I’m telling you that the value of these links declines every day. People aren’t using these sites. Newer technologies have replaced directories in the information ecosystem. The closure of the Google Directory should be a wake up call to anyone still clinging to this practice.

TL;DR

Traditional directories are an obsolete method of information discovery. Even if they provide some small benefit today, you’re paying a hefty price to support someone else’s dying business model. Stop PageRank ponzi and invest in the future and yourself instead.

Did You Mean Humor

September 04 2011 // Humor + SEO // 1 Comment

Last week I spent about two days in a WordPress PHP maze of death that, in exasperation, led me to type a cartoon-like expression of frustration into a Google search.

Google Did You Mean Result for Arrrugghghga

And I laughed. Because sure enough Google had four different versions for me to choose from. Four!

We know that Google has a sense of humor. Take the ‘did you mean’ result for recursion.

Google Did You Mean Result for Recursion

That’s right. The ‘did you mean’ result for recursion is … recursion. Clicking on recursion takes you right back to the same page. I’m amused. But don’t try to pay Google a compliment.

Google Did You Mean Result for Compliment

What ‘did you mean’ suggestions have made you laugh?

Optimize The Google +1 Button

September 01 2011 // SEO + Social Media // 20 Comments

Last week Google made a major change to the +1 button, allowing users to share a +1 with Circles on Google+. Essentially, the +1 button now has similar functionality to Facebook’s Like button. I’d been procrastinating on Like button optimization so this was the impetus I needed to finally do something about it.

Here’s what I did to optimize the Google +1 button. Let me state up front that I am not a coding guru. But I know enough to be dangerous, enjoy tinkering and am not afraid to break my site as I experiment.

I recommend this only for those using a self-hosted WordPress blog.

Speed

When the Google +1 button first came out it was s l o w. It was the last of my buttons to load. But that changed when Google released asynchronous code. If you haven’t made the switch yet, go and get the new code.

I use the Smart Sharing plugin for my floating share bar. So I just dropped the appropriate code snippets into the ‘Custom Codes’ area. This is really easy. After doing this, the +1 button is often the very first to load.

Placement

Where you put the Google +1 button matters. The floating share bar on this blog, powered by the Smart Sharing plugin, did require a fair amount of CSS customization to display properly. (Thank goodness for Firebug and other sites that use a similar display.) But the customization has been worth it!

It’s not just about where but how many buttons you present. A huge row or drop down full of buttons can cause indecision. Too many choices are bad. So pick the buttons that are meaningful to your audience and use them.

I get far more social engagement with a handful of floating buttons than a slew of buttons at the top or bottom of the post.

Snippet

Now that speed and placement are taken care of it’s time to optimize the snippet. The snippet is the title, description and image displayed when someone shares your content through a Like or Google +1 button. Controlling what this looks like is important.

Here’s an optimized snippet from this blog.

Optimized Google +1 Button Snippet

And here’s a non-optimized snippet from the Official Google Blog.

Non-Optimized Google +1 Button Snippet

Now, maybe Google can get away with a poor snippet but the rest of us probably can’t.

Structured Markup

Optimizing your snippet means adding some structured markup to your page. This is a lot less complicated than it sounds. In fact, both Facebook and Google are omnivorous and will parse your site looking for any way to cobble together the best snippet.

How Google Populates the Google +1 Snippet

Google recommends using the new Schema.org microdata. But they seem happy to fall back on the more ubiquitous Open Graph protocol (funny how they don’t mention the word Facebook) or normal meta title and description tags. In fact, the only method they don’t recommend is having absolutely no structured markup at all.
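The fallback behavior described above can be sketched in a few lines. This is only an illustration of the order the post describes (Schema.org microdata, then Open Graph, then plain title and description tags), not Google's actual logic. The function name, dictionary keys and example.com URL are made up for the example.

```python
from typing import Optional

# Sources in the order the post describes: Schema.org microdata first,
# then Open Graph, then plain title/meta description tags.
# (Assumed order for illustration; the real parser is not public.)
FALLBACK_ORDER = ("schema", "open_graph", "meta")


def pick_snippet_field(field: str, markup: dict) -> Optional[str]:
    """Return the first non-empty value for `field` across the fallback order."""
    for source in FALLBACK_ORDER:
        value = markup.get(source, {}).get(field)
        if value:
            return value
    return None  # nothing usable, so the crawler would have to guess


# Hypothetical page: no microdata, Open Graph title and image, plain meta description.
page = {
    "schema": {},
    "open_graph": {
        "title": "Optimize The Google +1 Button",
        "image": "https://example.com/plus-one.png",
    },
    "meta": {
        "title": "Optimize The Google +1 Button | Blog",
        "description": "How to control your +1 snippet.",
    },
}

snippet = {f: pick_snippet_field(f, page) for f in ("title", "description", "image")}
print(snippet)
```

In this made-up page the title and image come from Open Graph and the description falls through to the meta tag, which is the "cobble together the best snippet" behavior in miniature.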

Don’t obsess about implementing microdata because Google certainly isn’t.

Most of you probably have the title and description taken care of, but it’s the image that might be causing you some heartburn. In many ways though, that might be the most important part of the snippet. People scan content and a good image is like a stop sign for the eyes.

For a long time Facebook was using my Feedburner chicklet for my snippet image. I got tired of seeing this and created a default image for all of my posts. It wasn’t optimal but at least I didn’t look like a buffoon anymore. Read on to see how I took things to the next level.

Plugins

Like millions of others, I rely on a number of plugins to add features and functionality to my WordPress blog. It’s important that you understand the plugins I’m using since it will impact how you might implement snippet optimization with your own WordPress configuration.

I use the Like plugin to generate my Open Graph tags. It’s worked well, though I may think about doing it myself after going through this process. In general, this plugin is easy to use and efficient. It’s this plugin that allowed me to hard code a default image for my snippet.

I use All In One SEO to optimize title and meta description, among other things. I hear good things about WordPress SEO by Yoast, but haven’t had any real reason to switch. I may eventually, but I use other plugins that may have to be deactivated to take advantage of the full feature set offered.

I tried the Schema for WordPress plugin and admire the effort, but find it too complex and onerous. In addition, I don’t believe it supports the Schema meta tags which is what the Google +1 button seems to want.

Frankly, this is part of the problem with the microdata implementation. Do I put it in the head or the body? Or both? It’s almost too flexible.

Custom Field

So, with all those plugins in place, my main task was to find a way to generate a different image for each post. That’s where a custom field comes in handy. Custom fields sound complicated but they’re actually pretty straightforward. I created a custom field that would contain the URL of the snippet image.

You’ll see the Custom Fields option in your normal Edit Post window.

Enter a new Custom Field

Click ‘Enter new’ and you’ll now be presented with an entry field.

Configuring a Custom Field

Just enter the name of your custom field (I used og_img but you could name it something different) and the value you want to give that custom field for this post, then click the ‘Add Custom Field’ button. In this case, enter the full URL of the image you want to use for this post’s snippet in the value field.

This creates that custom field in the database and associates the URL for that custom field for this post. This does not mean that you are assigning this value (this URL) to every post. This is just the way you get the custom field ball rolling.

Custom Fields in Drop Down Menu

Now you’ve got the custom field in place for that post and you can select that custom field from the drop down menu for all other posts from now on.

In fact, that’s how you’ll specify the image snippet for each post moving forward. You only use the ‘Enter new’ to create the custom field. Once created, always select it from the Name drop down menu.

Yes, this means that you’ll be taking one extra step each time you publish a blog post. I see this as a small price to pay for an optimized image snippet and lump it in with writing an optimized title and description. They’re simply part of the publishing process.

Image

The image you use can make or break your snippet. It’s not just about the content of that image but the size of that image. You could have the best, funniest image ever but if it doesn’t translate on social platforms then it’s all for naught. In this case size actually refers to the dimensions of the image.

A square image (e.g. 300×300) is a safe bet. If you’re not using a square or squarish image then you want to pick vertical rectangles over horizontal rectangles. Obviously, really long vertical rectangles won’t work either. Use common sense, test and learn as you go.
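If you want a quick sanity check before uploading, the rule of thumb above can be written as a small helper. The thresholds here (what counts as "squarish" and how tall is "too tall") are my own guesses for illustration, not numbers published by Facebook or Google, so tune them as you test.

```python
def classify_snippet_image(width: int, height: int,
                           squarish: float = 0.2,
                           max_tall: float = 2.0) -> str:
    """Classify image dimensions using the square > vertical > horizontal rule of thumb.

    `squarish` and `max_tall` are assumed thresholds, not platform limits.
    """
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    ratio = height / width
    if abs(ratio - 1.0) <= squarish:
        return "square-ish: safe bet"
    if ratio > max_tall:
        return "too tall: likely to display poorly"
    if ratio > 1.0:
        return "vertical: usually fine"
    return "horizontal: test before using"


for size in [(300, 300), (200, 300), (600, 200), (100, 400)]:
    print(size, "->", classify_snippet_image(*size))
```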

Image Snippet Size Guide

One other thing to note is that the image you use doesn’t actually have to be in the blog post. I personally think it should, but you can upload any image to use as your image snippet. This might come in handy if you don’t have any images in a post (not recommended) or images that won’t translate onto social platforms (such as a horizontal code screen capture.)

Header Code

Creating the Custom Field only gets you half way there. Then you have to use it within your theme. That means you’ll need to tinker with your header.php file by navigating to Appearance -> Editor -> Header (header.php).

Edit Theme Header

First things first. Copy your entire header.php file into a text file just in case you really screw something up and need to revert back. Just copy everything in that window into your favorite text application.

Next, you want to place the following code into your header.php file between the head tags (<head>, </head>). I recommend placing the code in just before the close (</head>) tag.

 <!-- Snippet image (og:image) -->
 <?php $og_img = get_post_meta($post->ID, 'og_img', true); // custom field value, or '' if not set ?>
 <?php if ($og_img) { ?>
 <meta property="og:image" content="<?php echo esc_url($og_img); ?>" />
 <?php } else { ?>
 <meta property="og:image" content="https://www.blindfiveyearold.com/wp-content/uploads/2008/09/blind-five-year-old-150x150.png" />
 <?php } ?>

This code creates an og:image tag using the image in your custom field (og_img) or, if there is no custom field for that post, will use the default image specified instead. This is essentially the code that starfly describes in this WordPress forum post.

Make sure you change the custom field name appropriately (if you didn’t use og_img as I did) and that you use your own default image instead of mine. The first line of this code is non-functional and is there only to help you identify it amid all the other code.

After you’ve entered this code, click ‘update the file’ and you’re done!

Validation

To test whether it works you can first look at your blog post and view source. Look for the og:image tag and make sure that it’s populating correctly. If that’s not your cup of tea, you can also run your blog post through Facebook’s Debugger (formerly the URL Linter).
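The view-source check can also be scripted. Here is a minimal sketch using only Python's standard library. It parses a chunk of HTML and reports any og:image values it finds. The sample markup and example.com URL are placeholders.

```python
from html.parser import HTMLParser


class OgImageFinder(HTMLParser):
    """Collect the content of every <meta property="og:image"> tag."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attr = dict(attrs)
        if attr.get("property") == "og:image" and attr.get("content"):
            self.images.append(attr["content"])


def find_og_images(html: str) -> list:
    """Return all og:image URLs found in the given HTML source."""
    finder = OgImageFinder()
    finder.feed(html)
    return finder.images


# Placeholder markup standing in for the source of a blog post.
sample = """<html><head>
<title>Example post</title>
<meta property="og:image" content="https://example.com/uploads/post-image.png" />
</head><body></body></html>"""

print(find_og_images(sample))
```

To check a live post, fetch the page source first (for example with urllib.request.urlopen) and pass the decoded text to find_og_images. An empty list means the tag is missing. More than one value suggests a plugin and your header code are both writing the tag.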

Finally, try sharing your post using the Google +1 and Like buttons and make sure it’s optimized to your liking.

Where’s the Microdata?

I intended to show you how to cover your bases and implement an additional set of microdata in your header. It was going to be pretty darn clever. In fact, I spent two days figuring it all out, cursing PHP and breaking my site for hours on end. Trust me, I want to show it off!

But as I did due diligence on this post the microdata seemed to confuse Facebook and was, overall, just too brittle to use at this point. Honestly, do you really want to maintain two different types of structured markup?

I can see using microdata in other ways, but not for snippet optimization.

Google +1 Button Optimization

Two Birds with One Stone

So, in short, I’m using All In One SEO and the Like plugin along with a Custom Field to generate Open Graph tags that are then used by both Facebook and Google in their button snippets. Two birds with one stone!

I welcome questions on my implementation as well as feedback from others who can improve my code or have found other ways to optimize these snippets.

What Does Google Think Your Site Is About?

August 15 2011 // SEO // 7 Comments

Over the last two and a half years I’ve become convinced that Google performs rich categorization of sites and queries. The signs are plain as day and the impact is substantial. Categorization allows Google to apply algorithmic changes by category as well as deliver more relevant and diverse search results.

Scattergories

Site Categorization

In late 2008 I wrote about how taxonomic search could be used to create diversity in search engine results. Later, in May 2009, Google launched search options that included the ability to filter your results by Forums and Reviews.

Rich Snippets stole the show during this launch but it should have been crystal clear that Google was able to distinguish site or page type. How else would it deliver results in these filters?

Google made this even more transparent when they tested the ability to see more or fewer shopping sites. Obviously Google was able to tell (with relative accuracy) which sites were eCommerce enabled.

You may also come across search results for Q&A sites or forums that include a pseudo-rich snippet which includes the number of posts and authors.

Example of Google Discussion Rich Snippet

This is not structured markup. This isn’t RDFa or microdata. Instead, Google has identified patterns in these sites, extracting the relevant information and creating their own rich snippet. More recently, Google demonstrated that it could determine if a page contained search results.

These are interesting applications but it’s what is at the heart of it all that is really remarkable – the ability to algorithmically identify and categorize a page or site.

Query Categorization

If it were a snake it would have bit you already. Of course Google categorizes queries! You see this when local results are triggered, or when the Onebox is presented. Enter a query that is product based and the Shopping Onebox is likely to appear. Enter a query that is health related and the Health Onebox may appear.

Google Health Onebox

It’s not a matter of if, but how deeply Google categorizes queries.

Matching Queries to Categories

Matching queries to categories can help deliver the right information to users. You can already see this happening as Google changes the left hand search options based on the query.

You won’t get a Recipes option if you search for ‘baked new jersey’.

You won’t get a Books option if you search for ‘crying of lot 23’.

These examples could be delivered by simply accessing relevant databases of recipes or books, but I’m guessing the relation is far more dynamic.

Matching Documents to Queries

The question then becomes how Google matches web pages (aka documents) to queries. What documents (or results) does Google return based on the query categorization?

A recent patent dissected by Bill Slawski shows how Google might think about matching documents, queries, keywords and categories. It indicates that what Google thinks your site is about could influence how and when it is returned for certain queries.

If you were to take all of that information that Google provides for your site, and try to guess at a category or categories that Google might assign for your site, could you?

That is an interesting question.

Google Ad Planner

One of the unconventional places I’ve looked at recently to give me a sense of what Google might think a site is about is Google Ad Planner.

Google Ad Planner Result for SEO by the Sea

The content categories are usually a mixture of accurate and bizarre, particularly around geography. Now, I’m not saying that the content categories here influence Google’s categorization. But I have found it illuminating to look at Ad Planner results when comparing competitors.

Beyond the content categories are affinity scores for sites also visited and audience interests. The latter is another interesting data point when thinking about how Google might categorize sites.

Google Ad Planner Sites Visited and Audience Interests

The good news is Google has a compelling reason for this data to be accurate (it’s attached to advertising) and the data seems to be updated frequently. As an example, the Ad Planner thumbnail for this blog shows my most recent blog post.

Google Ad Planner Results for Blind Five Year Old

Why Google Ad Planner thinks I’m in Canada is a bit of a mystery. I did write blog posts about Hockey Memories and The Flyers 1987 Stanley Cup Playoffs, but I’m not in Canada (as lovely a country as it may be).

Taxonomy within Google Ad Planner

After verifying site ownership you can change your categories, which reveals a fairly robust taxonomy. Is this the same taxonomy Google uses in their algorithm? Probably not. But it might help inform or update the one used for search. So, I’ve gone ahead and changed my categories and description.

It’s in my can’t-hurt-and-might-help category.

TL;DR

Google continues to work on matching categories to queries and queries to web pages to improve search quality. Understanding how your site might be perceived by Google is an important new step in search engine optimization.

Doppelgangers

August 10 2011 // Humor + Life // 3 Comments

I’ve thought about this post for a while but never pulled the trigger. But with the ruckus over names on G+ and the ridiculous Same Name reality show I thought it was finally time to blog about doppelgängers.

Doppelgänger

Merriam-Webster defines doppelgänger as:

1: a ghostly counterpart of a living person
2 a: double (sense 2a)
b: alter ego (sense b)
c: a person who has the same name as another

The emphasis is mine and I’m using that version of the definition here. It’s easy to find Internet doppelgängers by doing a vanity search on Google or setting up a Google Alert on your name.

So without further ado, let me introduce you to my three doppelgängers. (Who knew I was so talented?)

Pro Freestyle Skateboarder

AJ Kohn co-founded One Skateboard Co and is an accomplished freestyle skateboarder.

The odd thing is we both grew up near Philadelphia. Because of this, every now and again I get a random email from someone looking to reconnect with him. In a few cases it’s been tough for me to persuade them that I’m not that AJ Kohn.

Amateur Baseball Player

AJ Kohn is an infielder on the Plainview Bucks, an amateur baseball team in Minnesota.

Team Photo of Plainview Bucks

I don’t know which one is AJ Kohn, but I do know that he’s the star of the team, batting .424 with 31 hits, 11 runs and 12 RBI.

Marine Biologist

AJ Kohn was the first to report that some cone snails hunt fish by paralyzing them with a powerful neurotoxin. Their sting can even kill people!

Cone Snail

Now Professor Emeritus at University of Washington, he dedicated his professional life to “understanding of the evolutionary processes that have led to high biotic diversity in tropical marine environments.”

Passion

I take great comfort in my doppelgängers because they remind me to be passionate about what I do. Each AJ Kohn followed a passion, whether it be skateboarding, baseball or marine biology. I admire that.

What about your doppelgängers? Find yours and tell me what they’re up to. #doppelgangers

Comment Censorship

August 07 2011 // Rant + Social Media + Technology // 18 Comments

In the past month I’ve left a number of comments on blogs only to find they were never published.

Fry Meme Spam or Blog Censorship

I’d like to believe that the blog owners simply didn’t see my comment. That it fell into their spam queue which they rarely, if ever, look at. Because the alternative is that they saw the comment and decided to suppress it. Now, it’s their blog – their little corner of the Internet – but this type of censorship is troubling.

Comments Content

What about the content of my comments? To be fair, in some instances I was disagreeing with some or all of the content in that post. But I did so in a constructive manner, using links to my own thoughts on the topic or to other material to help round out my argument.

I regularly publish comments on this blog that are contrary to my own opinion. One only has to look at the comments on my Stop Writing For People post for examples. I’m free to respond and defend myself, but having the debate in the open is important. It builds trust, much like having bad reviews on a product is actually a good thing.

Comments are incredibly valuable because they provide additional information on the content. They make your content better through clarification, confirmation, addition and debate.

Comments = Content.

Comments are a rich source of meta information that deliver value to both readers and search engines. This extends to links as well! Relevant links in comments help create a web of information that users now and in the future will find useful.

Yet it is those links that may be at the root of the problem.

Comment Spam

It’s like the Internet version of a plague of locusts. One of the most popular ways to combat comment spam is to screen comments that have links. This is one of the default settings in Akismet.

It makes sense since many spammers will drop a link or links in comments. But links are not the problem. Spammers are the problem.
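To see how blunt a link-based screen is, here is a rough sketch of one. This is not how Akismet or any particular plugin works internally. The pattern, function name and threshold are assumptions made up for illustration.

```python
import re

# Naive pattern for spotting links in a comment (assumed for illustration).
LINK_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def needs_moderation(comment_text: str, max_links: int = 0) -> bool:
    """Hold a comment for review when it contains more than `max_links` links."""
    return len(LINK_PATTERN.findall(comment_text)) > max_links


print(needs_moderation("Great post!"))
print(needs_moderation("I disagree, and explain why here: https://example.com/my-take"))
```

The second comment is exactly the kind of contextual link worth having, yet it gets held just like spam would. The filter counts links. It knows nothing about who left them or why.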

What’s wrong with contextual links to relevant content? This is not behavior that should be penalized. In fact, it should be encouraged. In many ways, the comment spam problem threatens the link graph.

ratio of comment spam to real comments

Not only that but, anecdotally, it seems that comment spam sometimes pushes people to disable comments altogether. When the ratio of comment spam to real comments is too high, many simply give up. I understand the decision but it’s depressing that it gets to that point.

Outsourcing

Fed up with comment spam and general comment management, have we decided to outsource engagement to social networks? Twitter, Facebook, LinkedIn, and Google+ are all happy to provide venues in which comments can flourish. Make no mistake, these venues understand the value of comments.

Is our obsession with amplification and generating social proof robbing us of the real value of comments and conversation? Certainly there is some hope that it’s like a rubber band. The content goes out, but then snaps back, drawing more comments to your content. It works to a certain extent, but by how much and at what cost is an interesting debate.

The Filter Bubble

Of course these bloggers may have seen my comment and simply decided not to publish it. Eli Pariser argues that personalization and ‘invisible algorithmic editing’ are a real danger, but I think comment censorship (whether intentional or accidental) is the true menace.

I believe much of the hype around the filter bubble is FUD. Personalization is rather minimal in most cases though I do agree with Gabriel Weinberg’s view of how to deal with personalization.

Personalization is not a black and white feature. It doesn’t have to be on or off. It isn’t even one-dimensional. At a minimum users should know which factors are being used and at best they should be able to choose which factors are being used, to what degree and in what contexts.

Personalization deals with the fact that some content isn’t being made readily visible. Comment censorship excises content from the Internet altogether.

Identity

So what could help get us out of this morass? How can we ensure comments are once again a vital part of the content ecosystem? Identity.

Identity

The reason why many embraced Facebook comments was because comments are attached to an identity. Not only that, but an identity that people cared about. This obviates the need for aggressive moderation. You might run into a troll, but it’ll be a troll you can clearly identify and block.

Identity essentially stops comment spam because you can’t post as Best Miami Attorneys. Comment moderation is suddenly manageable again.

Censorship

A commenting system that uses identity removes most of the uncertainty around comment censorship. If my comment isn’t published, it’s likely because that blogger made an active decision to toss it into the HTML version of The Bermuda Triangle.

Cat Censors Blog Comments

If the filter bubble can be managed through making personalization transparent, so too can comment censorship. A third-party, identity-backed comment system could track the number of comments censored on each blog. A grade or score could then be shown to let users know how much of the conversation was being censored. In some ways it would be like Charity Navigator but for blogs.

So perhaps the blogger who touts the benefits of community actually censors 32% of blog comments. That might be an interesting thing to know.

Could this get messy? Sure. But you can build a system of checks and balances.

Reputation

Bad Reputation by Joan Jett

Joan Jett might not care about her bad reputation but you should. Whether it’s a thumbs-up, thumbs-down, number of Likes, sentiment analysis, length of comments, spelling and grammar or other metrics, a savvy comment system could begin to assign reputation to each user.

So the censorship percentage wouldn’t be flat in nature. If you blocked a known troll, no worries. If you censored someone who had a history of abusive comments full of foul language, no problem.

On the other hand, it would be disturbing if you censor someone who consistently adds value to conversations. The reputation of those you censor would matter.
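Here is a back-of-the-napkin sketch of how a reputation-aware censorship score might differ from a flat percentage. No such system exists as far as I know. The reputation scale (0 to 1) and the weighting scheme are purely hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    author_reputation: float  # hypothetical score in [0, 1]; 1 = consistently adds value
    published: bool


def censorship_rate(comments) -> float:
    """Flat share of comments that were not published."""
    if not comments:
        return 0.0
    return sum(not c.published for c in comments) / len(comments)


def weighted_censorship_rate(comments) -> float:
    """Share of total commenter reputation that was suppressed."""
    total = sum(c.author_reputation for c in comments)
    if total == 0:
        return 0.0
    censored = sum(c.author_reputation for c in comments if not c.published)
    return censored / total


# Blog A blocks two known trolls; Blog B blocks one valued commenter.
blog_a = [Comment(0.9, True), Comment(0.8, True), Comment(0.05, False), Comment(0.05, False)]
blog_b = [Comment(0.9, False), Comment(0.1, True)]

for name, comments in (("A", blog_a), ("B", blog_b)):
    print(name, round(censorship_rate(comments), 2), round(weighted_censorship_rate(comments), 2))
```

Both blogs censor half of their comments on a flat count, but the weighted score separates the blog that blocks trolls from the one that silences a valued contributor.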

Confidence

I’d like to be confident that I’m not missing good comments that wind up going into spam.

I’d like to be confident that if I take the time and effort to comment on a blog, it will be published and, hopefully, spark further comment and conversation.

I’d like to be confident that the comments I read are not biased and simply a form of self-curated cheerleading.

“Confidence is contagious. So is lack of confidence.” – Vince Lombardi

The Internet desperately needs more confidence.

Inside Google’s Search Office

August 04 2011 // SEO // 17 Comments

I had the opportunity to attend last night’s Churchill Club event held at the Computer History Museum that brought together Amit Singhal, Ben Gomes and Matt Cutts.

Inside Google’s Search Office marked the first time the three have appeared together. If that wasn’t enough, the event was moderated by Danny Sullivan, who is both incredibly smart about search and a gifted moderator.

For someone who follows search closely, this was a can’t miss event. Thanks to Rick Bucich (who also helped this introvert mingle) for the heads up on the event.

This was not an SEO conference or event. Very few in the audience were hoping to capture that ‘gotcha’ moment. Instead, we were given a more human look at how these three Googlers think and feel about their work.

Passion

All clearly have a passion for search. Amit, in particular, was animated and visibly enthusiastic about the subject matter, at times perched on the edge of his seat as he told a story or answered a question. Matt was his usual self, mixing down-to-earth advice with a goofy sense of humor. Ben, the most reserved of the bunch, lit up when talking about specific challenges faced and overcome.

I saw Amit bristle at the idea that AdWords spending influences organic search. I watched Matt grimace at the mention of Eli Pariser’s Filter Bubble and the conclusion that Google was doing something untoward in personalizing results. These things generate a response because they care.

What they care most about is doing right by the user. This often sounds trite or contrived. Some claim that this altruistic goal is just a facade for greed. But I don’t think any of them would dispute that Google prospers when they do right by the user. They strongly believe that one leads to the other.

As an aside, I’m not sure who would win a gesticulation war between Matt and Amit.

Testing

We all know that Google does a tremendous amount of testing. Amit described the process Google goes through in approving an algorithm update. It reminded me a bit of Schoolhouse Rock’s How a Bill Becomes a Law.

Schoolhouse Rock Bill on Steps

Many changes are tested but only a few make it through the entire process and result in an algorithm update. Based on the ballpark numbers mentioned (20,000 changes and 500 updates), roughly 2.5% of the tests result in an algorithm change.
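The arithmetic is easy to check. A quick sketch, using only the ballpark numbers mentioned at the event (rough figures from the talk, not official counts):

```python
# Ballpark figures mentioned at the event; treat them as rough estimates.
changes_tested = 20_000
updates_launched = 500

# Share of tested changes that survive the whole approval process.
launch_rate = updates_launched / changes_tested
print(f"{launch_rate:.1%} of tested changes became algorithm updates")
```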

Relevance

One of the things I’ve written about previously is the subjectivity of relevance and quality. One man’s trash is another man’s treasure. This was the first time I heard Google address this issue head on. While they believe that their scientific method allows them to come closer to an objective relevance, they know it won’t be perfect.

Matt talked about the fact that each algorithm update will make a certain number of queries better but a much smaller number of queries worse. The idea that each algorithm update will only have beneficial results is … fantasy.

The Needs of the Many Outweigh The Needs of the Few

I’ll paraphrase another quote referenced by Danny and Ben. Two humans will agree on relevance only 80% of the time. If you looked at that same result a year later, you may not agree with yourself, let alone someone else. The implication (one I happen to agree with) is that relevance is a moving target.

These statements showed a humility that I had previously found lacking. Google should rightly be proud of having the best search algorithm on the planet, but it was refreshing to hear them admit that it was an imperfect work-in-progress. Google is on the cutting edge of this field and they continue to work (hard) at getting it right.

Speed

Speed was a central theme in the discussion of search innovation. It got me thinking more about measuring quality and relevance based on the amount of time it took from query to click.

Obviously Google has worked to reduce that time through Google Instant and, most recently, Instant Pages. But might Google score search results on how quickly they satisfied query intent? That doesn’t mean the user clicked on the first result, it simply means that the results presented yielded the best user result as measured in time to productive click.

Think about that for a moment. Is Google less concerned with getting rank right and more concerned with optimizing for quick knowledge transfer? Clearly having the right listings at or near the top matters, but perhaps a combination of listings is most effective. We focus on the percentage of clicks by rank, but that may not be the dominant metric Google is using.
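To make the idea concrete, here is a minimal sketch of how a ‘time to productive click’ score might be computed. This is purely my own speculation, not anything Google described: the `Click` record, the 30-second dwell threshold and the use of a median are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional

# Hypothetical threshold: a click counts as "productive" if the user
# stays on the result at least this long before returning to search.
MIN_DWELL_SECONDS = 30.0


@dataclass
class Click:
    seconds_after_query: float  # time from query to this click
    dwell_seconds: float        # time spent on the clicked result


def time_to_productive_click(clicks: list[Click]) -> Optional[float]:
    """Return seconds from query to the first productive click, or None."""
    for click in sorted(clicks, key=lambda c: c.seconds_after_query):
        if click.dwell_seconds >= MIN_DWELL_SECONDS:
            return click.seconds_after_query
    return None


def score_result_set(sessions: list[list[Click]]) -> Optional[float]:
    """Median time to productive click across sessions (lower is better)."""
    times = [t for s in sessions if (t := time_to_productive_click(s)) is not None]
    return median(times) if times else None


sessions = [
    [Click(4.0, 3.0), Click(12.0, 95.0)],  # bounces once, then satisfied
    [Click(6.0, 120.0)],                   # satisfied by the first click
    [Click(5.0, 2.0)],                     # never satisfied
]
print(score_result_set(sessions))
```

Note that the score ignores which rank was clicked. A set of results where users reach a satisfying page quickly scores well even if that page was not listed first.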

Social

Google seems both excited and cautious about how it integrates and uses social signals. Amit reduced social to ‘who knows who and about what.’ But there’s a lot more to be done here and identity is clearly on their mind as a way to strengthen this signal.

Google Plus Logo

In addition, I believe Google is a bit stunned at the quick adoption rate of Google+. I’m not saying they didn’t think it would be successful, but it feels like they didn’t expect these types of numbers. At this stage, I think they’d like G+ users to take a breath and provide them some time to iterate.

Googlers

This was also a fun event for me because I met a few more Googlers. I got to talk with Sagar Kamgar and Jeremy Hylton about rel=”author” and how it may change moving forward, met Jim Haljun and chatted briefly with both Amit and Ben. And I was happy to reconnect with Matt after missing him at this year’s SMX West.

I’m not a Google apologist. I think my writing makes that clear. But I find it difficult to stomach claims that Googlers are elitist liars who feed the SEO community misinformation. This irritates me because nearly all the Googlers I’ve met (online and in person) have been smart, friendly and very generous with their time.

Future

Fancy Feast Cat Food

Amit and Matt both shared their visions for the future of search. In five years Amit would like to see search behaving more like the ship’s computer on Star Trek. He’s keen on voice search and the growing impact of mobile devices.

Matt’s vision was limited to the next few years where the use of geo-location, mobile devices and other contextual data could allow Google to prompt users. Matt would never forget to pick up cat food again! He’d also like the voice to text ability to be refined so he can respond to email on his drive home. “That shouldn’t be that hard. I’ll file a bug when I get home.”

Beyond these predictions I was caught up in their passion and reminded of why I love search: it’s full of amazingly talented people with inquisitive minds working on something that is constantly evolving.

Computational Humor

July 31 2011 // Humor + SEO // Comments Off on Computational Humor

Googlebot does not laugh. It is not the lovable Johnny 5 and is probably closer to HAL in disposition.

HAL from 2001: A Space Odyssey

Humor Signal

Does Google use humor as a search algorithm signal? My first instinct is to state, without hesitation, that there is no humor signal. There’s no analysis of puns or witticisms or double-entendres or other grammatical humor.

The clever pun you use as your browser title will be lost on Googlebot and, truth be told, the majority of your readers as well. Googlebot doesn’t have a funny bone.

Backhanded Signal

Yet the algorithm will reward humor through other signals. Something that’s really funny will attract links and generate a high number of Likes, Tweets and other social gestures.

So humor is reflected in the algorithm through the link graph and social signals.

Research shows that our sharing behavior is heavily informed by humor. We share to get a laugh. Just ask Ben Huh, CEO of I Can Has Cheezburger?

But that’s an indirect signal, sort of like a backhanded compliment. Google doesn’t really know it’s funny, it just knows it’s popular and engaging.

Computational Humor

Google might be able to determine whether a document is humorous or not. Computational humor, a relatively new field of study, has shown a moderate amount of promise.

Early computational humor research focused on producing jokes, resulting in the Joke Analysis and Production Engine (JAPE) and HAHAcronym, which does just what it says. More recently, there’s been a greater emphasis on identifying humor using natural language processing and machine learning (pdf).

Researchers were able to train an algorithm to distinguish, with high accuracy, humorous one-liners from standard headlines and even proverbs, which have similar sentence construction.

Google Humor

Is Google interested in humor? It’s tough to tell. But a recent patent, brought to my attention by (who else) Bill Slawski, titled Embedded Communication of Link Information contains an interesting idea.

The improved link tags may allow the publishers to communicate additional information, such as opinions, about the content locations and/or document locations. The additional information may be along one or more dimensions. Therefore, different information may be conveyed at the same time. For example, one dimension may indicate that a content location and/or a document location is offensive as well as funny.

For example, “offensive=very” or “funny=somewhat.”

Coincidence? Probably. Nevertheless, I think it’s interesting that humor was one of the examples selected. And most of the Googlers I know have a keen wit.

Cat in a Dryer LOLcat

Humor and Quality

At the end of the day the real question is whether humor and quality are related. Does the incidence of certain types of puns or other humor signal quality? Could quality be predicted based on Monty Python references? Or the presence of alliteration?

Could you construct a database of humor? Monty Python phrases. Caddyshack quotes. Famous puns. Palindromes. Internet memes. Knock knock jokes. Using that database, could each type of humor be given a score based on how often it appears in what quality of content?

Those scores in place, you can begin to use humor as a signal to help ‘turn up’ high quality content. Perhaps … to 11.
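Here is a rough sketch of what that scoring might look like. Everything in it is hypothetical: the tiny `HUMOR_DB`, the phrase matching, and the assumption that each document already carries a quality score between 0 and 1 from some other source.

```python
from collections import defaultdict

# Hypothetical humor database: humor type -> phrases that signal it.
HUMOR_DB = {
    "monty_python": ["it's just a flesh wound", "nobody expects the spanish inquisition"],
    "caddyshack": ["be the ball"],
    "spinal_tap": ["goes to 11"],
}


def humor_types_in(text: str) -> set[str]:
    """Return the humor types whose phrases appear in the text."""
    lowered = text.lower()
    return {kind for kind, phrases in HUMOR_DB.items()
            if any(p in lowered for p in phrases)}


def score_humor_types(docs: list[tuple[str, float]]) -> dict[str, float]:
    """Average quality (0 to 1, assumed known) of documents per humor type."""
    totals, counts = defaultdict(float), defaultdict(int)
    for text, quality in docs:
        for kind in humor_types_in(text):
            totals[kind] += quality
            counts[kind] += 1
    return {kind: totals[kind] / counts[kind] for kind in totals}


docs = [
    ("Our amp review: this one goes to 11.", 0.9),
    ("Relax, it's just a flesh wound. Now, the data.", 0.8),
    ("BUY CHEAP AMPS goes to 11 click here", 0.2),
]
print(score_humor_types(docs))
```

A humor type that shows up in both good and bad content ends up with a middling score, so it would carry little weight as a signal either way.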

Perhaps the relationship between content quality and humor doesn’t exist. There’s a tremendous amount of personal taste involved and the cultural differences are also substantial. In addition, much of our humor is visual in nature and for all the advances in image search Googlebot still can’t really see.

Yet, humor is clearly a compelling part of the content puzzle. And there’s a part of me that believes that the mere presence of humor indicates a higher level of care and passion, which dovetails nicely with Google’s recent guidance on high quality sites. I doubt you’re going to see humor (at least intentional humor) on splogs or MFA sites.

I still don’t think there’s a humor signal. But maybe there should be. It would be nice to reward writers for their spelling, grammar and creativity.

Image Search in Google Analytics

July 26 2011 // Analytics + SEO // 19 Comments

Think you got a bump from Panda 2.3? Not so fast.

Image Search Analytics

In looking at a number of client sites I noticed that image search traffic, tracked under referring traffic (google.com / referral) with the referral path of imgres, fell off a cliff as of July 23rd.

Where'd My Image Traffic Go?

Where did that image traffic go? Organic.

Organic Image Search Traffic Bump

So if you thought you’d been the beneficiary of Panda 2.3 (launched late last week), you might want to make sure it’s not a phantom image search bump.

The Definition of Organic

At present I can’t find an easy way within Google Analytics to distinguish between organic traffic that is search based versus image based. That strikes me as a step back since these forms of traffic are not homogeneous in nature. Lumping image search in with organic is like smearing Vaseline on your windshield. I can still see, just not as well as I could before.

There’s probably a hack you can put together via filters, but most users won’t make that effort.
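If you have raw referrer URLs (from server logs, for instance), the logic behind such a hack might look something like this. It is a sketch under one big assumption: that image search visits still arrive with a google.com referrer whose path starts with /imgres, as they did under the old reporting. The function name is mine, and it only handles google.com hosts, not country domains.

```python
from urllib.parse import urlparse


def classify_google_referrer(referrer: str) -> str:
    """Label a referrer as 'image', 'web', or 'other'.

    Assumption: image search visits still carry a Google referrer
    whose path starts with /imgres.
    """
    parsed = urlparse(referrer)
    host = parsed.hostname or ""
    if not (host == "google.com" or host.endswith(".google.com")):
        return "other"
    return "image" if parsed.path.startswith("/imgres") else "web"


print(classify_google_referrer("http://www.google.com/imgres?imgurl=http://example.com/a.jpg"))
print(classify_google_referrer("http://www.google.com/search?q=seo"))
```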

Where’s Image Search?

This isn’t the first time Google has played Where’s Waldo with image search. On May 6th, 2010 Google moved image search traffic from images.google.com to google.com.

images.google.com traffic drop

At least that time you could wander around Google Analytics and spot the new source/medium that would provide the same level of specificity. Oddly, you’d still see some stray images.google.com traffic after this change. I always meant to track that down but never got around to it. This new update seems to finish the job and eliminate the remaining images.google.com traffic that had been trickling in.

New Dimension Please

I am hoping that this is just evidence that Google Analytics will launch a new dimension so we can separate these two different types of search traffic. Yet, you’d think they’d launch the dimension before migrating the traffic.

For a long time I figured that these changes were an indication that image search was the ugly duckling of the bunch. But recent events make me believe that Google is very invested in image search, so why the lack of viable reporting? No, ‘it’s free’ is not the right answer.

I’m waiting to hear from a few Google sources and will update this post if I get any type of insight or confirmation. Until then, how do you feel about this change?

Google Plus, Identity and SEO

July 25 2011 // SEO + Social Media // 23 Comments

Google+ has seen unprecedented adoption over the last month and it is clearly the best social effort Google has produced to date.

But why is Google pursuing social in the first place?

G+

Some see G+ as proof that Google is chasing tail lights, while others see it as a natural extension of Google’s mission, which is ‘to organize the world’s information and make it universally accessible and useful.’ In this instance, that information is social in nature.

I see two compelling reasons behind G+: search and advertising.

G+ provides Google with a primary source of social data to mine and use for search purposes. Can you blame Google for not wanting to rely on Twitter or Facebook for social signals?

G+ also seeks to break the stranglehold Facebook has on attention. This is important because the ad business runs on attention. If you’re on Facebook for 55 minutes a day, that leaves less time for other sites. G+ doesn’t have to beat Facebook; it simply needs to fracture its monopoly on attention.

Identity

Who Album Cover

One of the more controversial aspects of Google+ has been Google’s insistence on people using real names.

Google Profiles is a product that works best in the identified state. This way you can be certain you’re connecting with the right person, and others will have confidence knowing that there is someone real behind the profile they’re checking out. For this reason, Google Profiles requires you to use the name that you commonly go by in daily life.

I probably won’t make many friends by saying I completely agree with this policy.

Some believe that sacrificing anonymity will sacrifice risk-taking, honesty and whistle-blowing. There may be less of that, but it won’t stop. And is it so bad to encourage people to take risks and be honest without the crutch of anonymity? For those of you who think I’m not personally aware of the issues here, let me disabuse you of that notion.

Back in 2008 I dashed off a quick email to the editor of The Contra Costa Times. It wound up in the paper and was attributed with my name and city. About a week later I got this in the mail.

coward writes a letter

This person found my address (which was not in the paper) and sent me this erudite note. It was unsigned and did not have a return address. It was a bit chilling. Yet, the anonymity used betrays a cowardly nature. This person also can’t seem to get my name right, but who cares about details like that when you just want to call someone names.

I’m not saying all anonymity leads to this type of behavior. Nor do I downplay the need for anonymity in certain situations. Speaking out in other countries could have dire consequences. But one of the reasons people embraced Facebook’s new commenting system was the fact that identity would reduce spam and needless flame wars. No more anonymous pot shots from the peanut gallery.

Could Google be doing a better job in how they handle the policy? Yes. But I believe they’re far more concerned with ensuring that spam doesn’t enter this new ecosystem, both for user experience and data integrity.

What’s In A Name?

Nick Halden Alias

I don’t think Google is saying you have to use your true name. I’m assuming that Mark Twain could sign up, but so could Samuel Clemens. And it’s not that hard to set up a new persona online. Maybe we’d even see profiles for both Neal Caffrey and Nick Halden.

But why does Google care so much about identity, particularly on G+? What does it really do for them? Here’s where I think the rubber really meets the road.

Identity allows Google to map the web of people.

Since Google launched, they’ve indexed the web using links. That’s worked amazingly well, despite what many competitors and faux-academics have contended. Mind you, there’s still a lot of room for improvement.

Today we know that there are people behind sites and documents. Facebook understood that people were ‘hiding’ behind websites and they wanted to reveal who was visiting (and Liking) those sites. Google, on the other hand, seems more interested in who created those websites and documents. They’re looking to augment the web of links with the web of people. That’s why identity is so important.

Without identity Google can’t create a web of people. They can’t mine the interactions on G+ to determine trust and influence. They need a primary key for establishing a new social database and a name makes a whole lot of sense.

Yes, Google will want alternate email addresses, pseudonyms and even usernames, but they need something to pivot around. Mapping the social graph necessitates this type of approach. Google’s already pretty good at this, which is why you may be presented with suggested links in your Google Profile. One look at your Social Connections should give you an idea of how much Google knows. (Really, go look at this today!)

SEO

The carrot that Google can wave to encourage identity is search. Rel=”author” is the first overt way that Google is rewarding identity in search results. But that’s just the tip of the iceberg as far as I can tell.

Getting your smiling mug on search results will certainly help your click-through rate and boost your ego, but long term Google can begin to map all of your social interactions. They’ll encourage you to link more and more of your digital self so they can confidently perform this analysis.

I’m a firm believer that Google has a GIGO problem. The explosion of digital content has made everyone an author, an expert and a pundit. Jason Calacanis certainly doesn’t suffer fools gladly.

There are a lot of stupid people out there … and stupid people shouldn’t write.

I don’t agree with that statement but I do with what Jason said next.

There needs to be a better system for tuning down the stupid people and tuning up the smart people.

Google needs a better way to sift through and determine quality in an age where content can be produced and distributed with such ease.

Instead of going from the document level and building up, what if you used an ‘author’ filter? Google’s Panda update created an A and B pile for site quality. Google could do the same for author quality. Yes, it’s a scary thought that Google might pass judgement on your corpus of content, but in the end that’s pretty much their job.

This doesn’t mean that B pile authored content wouldn’t rank, but it may need to exhibit other signals to outrank A pile authored content.

AuthorRank

AuthorRank will help augment the rickety link graph. The reliance on a Google Profile, insistence on a real name, the implementation of rel=”author” and the whole of G+ make me believe that authorship is a vital part of how Google wants to measure the web. I’m hopeful that Google can determine a better way to measure these social gestures, beyond what Klout, PeerIndex or others currently provide.

Could Google transform trust and influence from a popularity contest to a meritocracy?

Google could look at the speed of sharing. If the time between my interaction with that content and my sharing of that content is too short, I’d like that to count less. This may eliminate much of the robo-syndication that persists today and even reduce the sheep and ego mentality of promoting luminary content.
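As a back-of-the-envelope illustration, a share could be discounted by how unlikely it is that the sharer actually read the piece first. This is only my own sketch of the idea: the 250 words-per-minute reading speed and the linear weighting are assumptions, not anything Google has described.

```python
WORDS_PER_MINUTE = 250  # assumed average reading speed


def share_weight(word_count: int, seconds_before_share: float) -> float:
    """Weight a share between 0 and 1 by how plausible it is that the
    sharer read the content before sharing it."""
    if word_count <= 0:
        return 1.0
    expected_seconds = word_count / WORDS_PER_MINUTE * 60
    return max(0.0, min(1.0, seconds_before_share / expected_seconds))


# A 1,000-word post takes about four minutes to read at this speed.
print(share_weight(1000, 12))   # shared 12 seconds after opening
print(share_weight(1000, 300))  # shared after five minutes
```

An instant share still counts for something, just far less than one that comes after enough time to have read the post.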

In addition, comments are a rich source of data, both as metadata on the shared piece of content and as a signal of depth and breadth of engagement. You’ll have noted that some of the Internet famous get a slew of comments on G+. But how many of them are of the ‘brilliant post’, ‘totally agree’ and ‘Yes!’ variety? What percentage of your comments are productive versus congratulatory?
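A crude way to put a number on that last question is to match comments against a list of stock phrases. The phrase list below is hypothetical and far too short to be useful in practice; it only shows the shape of the calculation.

```python
import re

# Hypothetical list of purely congratulatory phrases.
CHEERLEADING = {"brilliant post", "totally agree", "yes", "great post", "+1"}


def is_congratulatory(comment: str) -> bool:
    """True if the comment is nothing more than a stock phrase."""
    # Lowercase and drop punctuation, keeping word characters, '+' and spaces.
    normalized = re.sub(r"[^\w+ ]", "", comment.lower()).strip()
    return normalized in CHEERLEADING


def productive_share(comments: list[str]) -> float:
    """Fraction of comments that are not pure cheerleading."""
    if not comments:
        return 0.0
    productive = sum(not is_congratulatory(c) for c in comments)
    return productive / len(comments)


comments = [
    "Brilliant post!",
    "Yes!",
    "Totally agree.",
    "I'd push back on the speed signal: bots can wait too.",
]
print(productive_share(comments))
```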

AuthorRank could provide real value to search right out of the gate. Imagine if blogs without real identity were given less prominence? Suddenly splogs would disappear as would many of the exact match keyword domains with MFA (Made for AdSense or Made for Amazon) content.

TL;DR

Google+ requires identity so it can confidently map a web of people and use authorship to better refine search results. In the process, and as an added bonus, they may also break Facebook’s monopoly on attention.
