Posted by randfish

You may have seen the recent string of posts about SEO vs. Social Media, starting with this effective, but poorly argued controversy-bait, which was excoriated by Elysia Brooker and Hugo Guzman, then followed up with a more nuanced view by Darren Rowse. While I’m not particularly interested (nor do I think there’s much value) in re-hashing or arguing these points, I did think the topic warranted attention, as it brings up some excellent points marketers should carefully consider as they invest in their craft.

We Search for What We Want + Need

The search for information and answers has been essential to humans since time immemorial. And there’s no sign that our latest iteration, web search, is losing any steam:

Growth in Search Query Volume 2006-2010

Even as we’ve reached a maturity point with broadband adoption and online population, searches are rising. We’re not searching less every month; we’re searching more.

Search is an intent-driven activity. We don’t search casually (much); we search to find answers, information, and goods and services to consume. The power of search marketing – whether paid or organic – is simple: be in front of the consumer at the time of consumption. There’s no more effective time to be present and no more effective way of knowing what is desired. All the social graph analysis in the world won’t tell you that Sunday evening, I got fed up with my current selection of footwear and, after some searching, spent a few hundred dollars on Zappos. But being front and center when I queried "mens puma shoes" brought them some nice business.

We’re Social to Discover and Share

Social media – whether it’s Twitter, Facebook, Flickr, Reddit, StumbleUpon or something else – is about connections, interaction, discovery and distraction. We hardly ever use these portals as a way to find answers, though they certainly may provide plenty of answers to unasked questions.

Social media marketing advocates often make the case that social is how we find out about new products on the web, but, at least so far, the data doesn’t back up this assertion:

ATG Study on Where Users Discover Products
ATG Study on How Users Discover Products via SearchEngineLand

However, I am strongly inclined to believe the claim that social media is how we find out about new content on the web, particularly when we’re not seeking something in particular (as with a search). Blogs, pictures, video, research and the like are surely seeing an increased share of their visits from social, and that branding exposure is definitely valuable.

Some recent GroupM Research helped to shed the light of data on this supposition, noting that:

  • The click-through rate in organic search results for users who have been exposed to a brand’s social marketing campaign is 2.4X higher than for those who haven’t; for paid search, it showed a jump from 4.5% to 11.8% (in both cases, this is for branded queries)
  • Consumers using social media are 1.7x more likely to search with the intention of making a list of brands or products to consider purchasing compared to those who do not use social media

Ben Yoskovitz talked about this value in his recent analysis:

Based on the information in this report, it’s reasonable to argue that social media marketing can increase the quality of leads (and not just the volume). It’s possible to hone in on, and understand intent through search and how social media exposure affects that intent. And as people are exposed (and I would say involved with – since exposure sounds like you’re just broadcasting stuff at people, which isn’t what social media is about) to social media their intent is more focused and driven towards lead conversion

That’s the kind of social media marketing value I can get behind. Get exposed to potential customers through social so that when they build their consideration set, search and purchase, you’ll have a leg up on the competition.

What Drives Traffic (and Converts) for Whom

It pays to understand the bias of this flare-up’s instigator, and I’ve got plenty of compelling data myself to see his perspective. Last weekend, I started publishing content on a personal blog – no domain authority, no links and little chance of performing well in search. But the results from social media – Twitter, Facebook and Hacker News in particular – are fairly remarkable:

Traffic Data

The search traffic demand, all 78 visits, was generated by the articles that became popular on Twitter & HN. The site itself still doesn’t rank for its own name. Yet, social media sent 22,000 visits over 9 days. No wonder bloggers, in particular those that monetize through advertising, sponsorships and other traffic-driven systems, have a proclivity for investing in social traffic. Perhaps it’s not so crazy to suggest on Problogger.net, a site about growing blog traffic and improving monetization, that social can be "better" than SEO.

I’d still argue that overall, referring traffic of all kinds sent from social, particularly from the largest network (Facebook), is only a fraction of the visits Google sends out each day (unless you’re in the business of appealing to the Facebook audience’s biases – I was a bit frustrated with how the data was clearly manipulated in the referenced piece to fit the story). But social does eliminate some of the inherent biases that search engines carry and lets content that appeals to social users flourish regardless of the site’s ability to grow its link profile, make content accessible to spiders or effectively target keywords.

Now let’s look at an example on the opposite end of the spectrum – conversions for a B2B product.

SEOmoz’s PRO membership may not be a good investment unless you’re a marketer actively engaged with SEO, but given that both the search and social traffic our site attracts likely fall into this intent group (interested in SEO and likely to be in web marketing), a comparison seems fair.

First, I did some prep work in our Google Analytics account by creating an advanced segment called "social traffic" that contains any referral source with "twitter," "facebook," "stumbleupon," "linkedin," "flickr," and "ycombinator" – these represent the vast majority of our social media sources. Next, I compared this traffic quantitatively with our search referrals over the past two weeks:

  • Social Traffic – 26,599 visits from 30 sources
  • Organic Search Traffic – 102,349 visits from 20 sources

I then compared the percent of these reaching our landing or purchase pages for PRO membership. Here’s organic search:

Organic Search Traffic

And here’s social traffic:

Social Traffic

Here’s what I see:

  • 4.5% of organic search visitors considered a purchase
  • 1.3% of social traffic considered a purchase
  • While I can’t disclose full numbers, I can see that a fair number of search visits converted vs. zero for social.

In fact, looking at the entire year-to-date traffic to SEOmoz from social sources, it appears that not a single visit has converted for us. Social may be a great way to drive traffic, build branding and make a purchase more likely in the future, but from a direct conversion standpoint, it doesn’t hold a candle to search. To be fair, I’m not looking at full life cycle or even first-touch attribution, which makes this analysis less comprehensive, though likely still directionally informative.

Takeaways

Given the research and data here and in the posts/content referenced, I think we can say a few things about search and social as marketing channels:

  1. There shouldn’t be a VS.: This isn’t about pitting web marketers against each other (or perhaps, more accurately, themselves, since our industry survey data suggests many of us are responsible for both). There’s obvious value in both channels and to suggest otherwise is ideological nonsense and worse, self-defeating.
  2. Search Converts: Billions aren’t being wasted on Google’s search ads – that sucker sends intent-driven, focused, conversion-ready visits like nobody else on the web.
  3. Social Has Value: Those exposed to a social campaign are better customers and prospects, making social not only a branding and traffic channel, but an opportunity for conversion rate optimization.
  4. SEO Is Hard in the Early Stages: Without a strong link profile, even great content may not perform particularly well in search results.
  5. Segmenting Search and Social is Key: Unless you separate, analyze and iterate, you’re doomed to miss opportunities and falsely attribute value. I’m particularly worried about those marketers who invest heavily in social to the detriment of SEO because the immediacy of the rewards is so much more tangible and emotionally compelling (He’s following me on Twitter! We have 200 Facebook fans!) – make sure appropriate effort goes where it can earn ROI; it’s our job.

For another interesting (and more social-media biased) perspective, check out Search vs. Social from Bradford Cross.

I’d love to hear more from you on this topic, too. 


Posted by randfish

The process of launching a new website is, for many entrepreneurs, bloggers and business owners, an uncertain and scary prospect. This is often due to both unanswered questions and incomplete knowledge of which questions to ask. In this post, I’ll give my best recommendations for launching a new site from a marketing and metrics setup perspective. This won’t just help with SEO, but on traffic generation, accessibility, and your ability to measure and improve everything about your site.

#1 – Install Visitor Analytics

Nothing can be improved that is not tracked. Keeping these immortal words of wisdom in mind, get your pages firing analytics code before your first visitor. Google Analytics is the obvious choice, and customization options abound (for most sites more advanced than a basic blog, I’d highly recommend at least using first-touch attribution).

Google Analytics Metrics

Google Analytics, or any other package (see some alternatives here), needs to be placed on every page of your site and verified. Do yourself a favor and install it in a template file you can be sure is on every page (e.g. footer.php). GA’s instructions will indicate that placing the code at the top of the page is key, but I’m generally in favor of leaving it at the bottom to help page load time for visitors (though the new asynchronous GA code is pretty fast).
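
For reference, here’s a minimal sketch of the asynchronous tracking snippet (the ‘UA-XXXXXXX-X’ property ID is a placeholder – swap in your own, and Google may tweak this code over time). Drop it into that shared template file so it renders on every page:

<script type="text/javascript">
  // Asynchronous Google Analytics snippet – replace UA-XXXXXXX-X with your property ID
  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-XXXXXXX-X']);
  _gaq.push(['_trackPageview']);
  (function() {
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
  })();
</script>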

#2 – Set Up Google & Bing Webmaster Tools Accounts

Both Google & Bing have webmaster tools programs that monitor data about your site and message it back to you through online interfaces. This is the heartbeat of your site from the search engines’ perspective and for that reason, it’s wise to stay on top of the data they share.

Bing Webmaster Tools

That said, the numbers inside these tools are not always perfect, and often have serious flaws. The referring keywords and traffic data are, in my experience, far off what analytics tools will report (and in those cases, trust your analytics, not the engines’ tools). Likewise, crawl, spidering and indexation data isn’t always solid, either. Nonetheless, new features and greater accuracy continue to roll out (more of the former than the latter unfortunately) and it’s worth having these both set up.
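
If you choose the meta tag method of verification, each service gives you a one-line tag to place in the <head> of your home page – roughly like the examples below (the content values are placeholders generated for your account, and the exact tag names may change over time):

<meta name="google-site-verification" content="your-google-token-here" />
<meta name="msvalidate.01" content="your-bing-token-here" />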

#3 – Run a Crawl Simulation of Your Site

No matter how perfect you or your developers are, there are always problems at launch – broken links, improper redirects, missing titles, pages lacking rel=canonical tags (see more on why we recommend using it and the dangers of implementing it improperly), files blocked by robots.txt, etc.

Web App Crawl Data

By running a crawl test with a free tool like Xenu or GSiteCrawler, or leveraging a paid tool like Custom Crawl from Labs or the Crawl Service in the Web App (pictured above), you can check your site’s accessibility and ensure that visitors and search engines can reach pages successfully in the ways you want. If you launch first, you’ll often find that critical errors are left to rot because the priority list fills up so quickly with other demands on development time. Crawl tests are also a great way to verify contractor or outsourced development work.

#4 – Test Your Design with Browser Emulators

In addition to testing for search engine and visitor accessibility, you’ll want to make sure the gorgeous graphics and layout you’ve carefully prepared check out in a variety of browsers. My rule is to test anything that has higher than 2% market share, which currently means (according to Royal Pingdom): Internet Explorer, Firefox, Chrome, Safari and Opera.

There’s a great list of browser testing options from FreelanceFolder here, so I’ll just add that in-person testing, on your own PCs & Macs, is also a highly recommended use of an hour.

#5 – Set Up RSS Feed Analytics

Virtually every site will have some form of structured data being pushed out through an RSS feed. And, just like visitor analytics, if you want to improve the reach and quality of the feed, you’ll need to leverage data.

Feedburner Dashboard for SEOmoz

Feedburner is the de facto software of choice, and it’s very solid (though, good alternatives do exist). Getting your feed and the analytics to track and measure it is typically a very easy process because there’s nothing to verify – you can create and promote any feed you want with just a few button clicks.

One important recommendation – don’t initially use the counter "chicklet" like: Feedburner Chicklet with 0 Readers 

It has a bad psychological impact to see that no one has subscribed to your new RSS feed. Instead, just provide a standard link or graphic and after you’ve amassed a few hundred or thousand readers, use the numeric readout to provide additional social proof.
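
If you do route readers through Feedburner, a common setup is to point your site’s feed autodiscovery tag at the Feedburner URL so new subscribers (and their stats) flow through it. A rough example, with a placeholder feed title and name:

<link rel="alternate" type="application/rss+xml" title="My Blog RSS Feed" href="http://feeds.feedburner.com/YourFeedName" />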

#6 – Tag the Actions that Matter

No matter what your site is, there are actions you’re hoping visitors will take – from tweeting a link to your post to leaving a comment to buying a product or subscribing to an email list. Whatever those actions might be, you need to record the visits that make them through your analytics tool. Casey Henry’s post on Google Analytics’ Event Tracking will provide a thorough walkthrough.

Once action tracking is in place, you can segment traffic sources and visit paths by the actions that were taken and learn more about what predicts a visitor is going to be valuable. If you’re pouring hours each day into Twitter but seeing no actions, you might try a different channel, even if the traffic volume is high.
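
As a quick illustration of the kind of tagging Casey’s post covers, here’s roughly what an event call might look like on a newsletter signup link using the asynchronous GA syntax (the category/action/label names are made up for this example):

<a href="/newsletter" onclick="_gaq.push(['_trackEvent', 'Engagement', 'Newsletter Signup', 'Sidebar CTA']);">Subscribe to our newsletter</a>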

#7 – Conduct an Online Usability/Branding Test

Before a formal launch, it can be extremely helpful to get a sense of what users see, experience and remember when they browse to your site for a few seconds or try to take an action. There’s some fantastic new software to help with this, including Clue App, screenshot below:

Clue App Test on SEOmoz

Last week, I set up a Clue App test for SEOmoz’s homepage in 30 seconds and tweeted a single link to it, which garnered 158 kind responses with words and concepts people remembered from the visit. This type of raw testing isn’t perfect, but it can give you a great look into the minds of your visitors. If the messages being taken away aren’t the ones you intended, tweaking may be critical.

In addition to Clue, dozens of other easy usability and user-testing apps are now on the market. Conversion Rate Experts has a good list here and Craig Tomlin’s got another excellent one here.

#8 – Establish a KPI Dashboard

No matter what your website does, you live and die by some key metrics. If you’re starting out as a blogger, your RSS subscribers, unique visits, pageviews and key social stats (tweets, links, Facebook shares, etc) are your lifeblood. If you’re in e-commerce, it’s all of the above plus # of customers, sales, sales volume, returning vs. new buyers, etc.

SEOmoz Partial KPI Chart

Whatever your particular key metrics might be, you need a single place – often just a basic spreadsheet – where these important numbers are tracked on a daily or weekly basis. Setting this up before you launch will save you a ton of pain later on and give you consistent statistics to work back from and identify trends with in the future.

#9 – Build an Email List of Friends & Business Contacts for Launch

This may seem non-obvious, but it’s shocking how a friendly email blast to just a few dozen of your close contacts can help set the stage for a much more successful launch. Start by building a list of the people who owe you favors, have helped out and who you can always rely on. If you’re feeling a bit more aggressive in your marketing, you can go one circle beyond that to casual business partners and acquaintances.

Once you have the list, you’ll need to craft an email. I highly recommend being transparent, requesting feedback and offering to return the favor. You should also use BCC and make yourself the recipient. No one wants to be on a huge, visible email list with folks they may not know (and get the resulting reply-all messages).

#10 – Create Your Google Alerts

The Alerts service from Google certainly isn’t perfect, but it’s free, ubiquitous, and can give you the heads up on some of the sites and pages that mention your brand or link to you in a timely fashion.

Google Alerts

Unfortunately, the service sends through a lot of false positives – spam, scraper sites and low quality junk. It also tends to miss a lot of good, relevant mentions and links, which is why the next recommendation’s on the list. 

#11 – Bookmark Brand Tracking Queries

In order to keep track of your progress and identify the sites and pages that mention or link to your new site, you’ll want to set up a series of queries that can run on a regular basis (or automated if you’ve got a good system for grabbing the data and putting it into a tracking application). These include a number of searches at Google, Twitter and Backtype:

Reputation Monitoring Queries

The queries should use your brand name in combination with specific searches, like the example below (using "seomoz" and "seomoz.org"):

You can add more to this list if you find them valuable/worthwhile, but these basics should take you most of the way on knowing where your site has been mentioned or referenced on the web.

#12 – Make Email Signup/Subscription Available

Capturing the email addresses of your potential customers/audience can be a huge win for the influence you’re able to wield later to promote new content, products or offerings. Before you launch, you’ll want to carefully consider how and where you can offer something in exchange for permission to build an email list.

One of the most common ways to build good lists is to offer a whitepaper, e-book, video or other exclusive content piece for download/access to those who enter an email address. You can also collect emails from comment registration (which tend to be lower overall quality), through an email newsletter subscription offering (which tends to be very high quality) or via a straight RSS subscription (but you’ll need to self-manage if you want to have full access to those emails). Services like MailChimp, ExactTarget, Constant Contact and iContact are all options for this type of list building and management.

#13 – Create Your Site/Brand’s Social Accounts

Social media has become popular and powerful enough that any new site should be taking advantage of it. At a minimum, I’d recommend creating accounts on the following networks:

And if you have more time or energy to devote, I’d also invest in these:

Setting up these accounts diligently is important – don’t just re-use the same short bio or snippet over and over. Spend the time to build fleshed out profiles that have comprehensive information and interact/network with peers and those with similar interests to help build up reputation on the site. The effort is worth the reward – empty, unloved social accounts do virtually nothing, but active ones can drive traffic, citations, awareness and value.

BTW – Depending on the size and structure of your site, you may also want to consider creating a Facebook Fan Page, a LinkedIn Company Page and profiles on company tracking sites like Crunchbase, BusinessWeek and  the Google Local Business Center.

#14 – Connect Your Social Accounts

If you’ve just set up your social account, you’ve likely added your new site as a reference point already, but if not, you should take the time to visit your various social profiles and make sure they link back to the site you’re launching.

Rand's Twitter Profile

Not all of these links will provide direct SEO value (as many of them are "nofollowed"), but the references and clicks you earn from those investigating your profiles based on your participation may prove invaluable. It’s also a great way to leverage your existing branding and participation to help the traffic of your new site.

#15 – Form a List of Target Press, Blogger and Industry People for Outreach

Depending on your niche, you may have traditional media outlets, bloggers, industry luminaries, academics, Twitter personalities, powerful offline sources or others that could provide your new site with visibility and value. Don’t just hope that these folks find you – create a targeted list of the sites, accounts and individuals you want to connect with and form a strategy to reach the low hanging fruit first.

The list should include as much contact information as you can gather about each target – including Twitter account name, email (if you can find it), and even a physical mailing address. You can leverage all of these to reach out to these folks at launch (or have your PR company do it if you have one). If you tell the right story and have a compelling site, chances are good you’ll get at least a few of your targets to help promote or, at the least, visit and be aware of you.

#16 – Build a List of Keywords to Target in Search Engines

This is SEO basics 101, but every new site should keep in mind that search engines get lots of queries for virtually everything under the sun. If there are keywords and phrases you know you want to rank for, these should be in a list that you can measure and work toward. Chances are that at launch, you won’t even be targeting many of these searches with specific pages, but if you build the list now, you’ll have the goal to create these pages and work on ranking for those terms.

As you’re doing this, don’t just choose the highest traffic keywords possible – go for those that are balanced; moderate to high in volume, highly relevant in terms of what the searcher wants vs. what your page/site offers and relatively low in difficulty.

See this post for more tips – Choosing the Right Keyphrases – from Sam Crocker.

#17 – Set Targets for the Next 12 Months

Without goals and targets, there’s no way to know whether you’re meeting, beating or failing against expectations – and every endeavor, from running a marathon to cooking a meal to building a company or just launching a personal blog, will fail if there aren’t clear expectations set at the start. If you’re relatively small and just starting out, I’d set goals for the following metrics:

  • Average weekly visits (via analytics)
  • Average page views (via analytics)
  • Number of new posts/pages/content pieces produced per month
  • Number of target contacts (from item #15) that you’ve reached
  • Social media metrics (depending on your heaviest use platform, e.g. # of Twitter followers if you’re a heavy Tweeter)
  • Any of the key items from #8 on this list (your KPI dashboard)

And each of these should have 3, 6 and 12 month targets. Don’t be too aggressive, as you’ll find yourself discouraged or, worse, not taking your own targets seriously. Likewise, don’t sell yourself short by setting goals that you can easily achieve – stretch at least a little.

Every 3-6 months, you should re-evaluate these and create new goals, possibly adding new metrics if you’ve taken new paths (RSS subscribers, views of your videos, emails collected, etc.)

#18 – Plug in the SEOmoz Web App

I know this one’s a bit self-serving, but I’d like to think I’d add it here even if it weren’t my company (I recently set up my own personal blog and found the crawling, rank tracking and new GA integration features pretty awesome for monitoring the growth of a new site).

PRO Web App

The SEOmoz Web App has a number of cool tracking and monitoring features, as well as recommendations for optimizing pages targeting keywords, that make it valuable for new sites that are launching. The crawl system can serve to help with #3 on this list at the outset, but ongoing, it continues to crawl pages and show you your site’s growth and any errors or missed opportunities. Tracking rankings can let you follow progress against item #16, even if that progress is moving from ranking in the 40s to the 20s (where very little search traffic will be coming in, even if you’re making progress). And the new GA integration features show the quantity of pages, keywords and visits from search engines to track progress from an SEO standpoint.


Using this list, you should be able to set up a new site for launch and feel confident that your marketing and metrics priorities are in place. Please feel free to share other suggestions for pre and post-launch tactics to help get a new site on its feet. I’m looking forward to seeing what other recommendations you’ve got.


Posted by randfish

Earlier this year we asked the community to take our SEO Industry Survey. We had originally hoped to get at least 3,000 responses and were completely blown away when over 10,000 people ended up taking the survey! Of course, it never hurts to have an iPad as the grand prize, but I’m still very excited about the extent of this report. As a comparison, another excellent survey in our industry earlier this year from eConsultancy and SEMPO generated ~1,500 responses (results are detailed in this SELand article).

Our survey’s goal was to gather information about SEO in 2010 and share it publicly. We asked questions around:

  • Who are the people in the SEO community?
  • How do they learn about SEO and sharpen their skills?
  • How are companies embracing search marketing?
  • Which tools and tactics do people in the industry use to support their SEO and social media efforts?

After some detailed number crunching by our good friend Will Critchlow from Distilled, we’re happy to present to you the results from the data.

Get the 2010 Industry Results Here

Some of the cool things you’ll see include:

  • What percent of SEOs say they buy links, report spam and how many overlap?
  • Salary ranges across countries, experience levels and job descriptions
  • Demographics of SEO – we might need to work on our male/female ratio
  • and lots more – just go read it!

We’ve also created a spiffy infographic to help visualize the survey results:

SEO Industry Survey

 

For those who’d like to delve into the data more deeply, and extract new views on the information from the 10K+ responses, we’ve made the full data dump available in CSV form: download here. We’d love to see any interesting/unique analyses on this information, and we hope it’s useful to those organizations and companies seeking to learn more about the SEO market.

Winners!

We can’t forget to mention the people who won gifts for participating in the survey. The winners were notified back in June and they’ve all received their prizes. Here are the winners:

Grand Prize: 32GB Wi-Fi iPad:
Sam Ilowitz

First Prize: 120min Flip Mino HD Camera with custom SEOmoz artwork:
Jared Reed
Jay Estis
J. Smeekens

Second Prize: gift certificates to the SEOmoz Zazzle Store:
Gareth Allen
Jody Lonergan
Anton Korzhuk
Jason Tan
Robert Palmer
Sebastien Mégraud
Lindsay Copeland
Joakim Eriksson
Nicholas Foo
Brian Hutchison

It’s been a tremendous pleasure and honor to be part of such a powerful and growing industry, and this survey highlights the depth, breadth and uniqueness of those who do SEO professionally. Thanks so much for participating – we hope to make this a biennial (or possibly even annual) tradition.

p.s. We’ve also got the questions in individual results format on this detail page. Feel free to use any of the images and data in your reports, presentations, analyses, slide decks, etc. but if you use them online, we’d appreciate a link (nofollow is fine, but remember it leaks PageRank) ;-)


Posted by JoannaLord

Are you an SEOmoz fan? Do you love our software? Do you dream of introducing Roger to your friends? Well, what if you could do all that and make money? Too awesome to believe, huh? Well now you can! We are excited to present our new affiliate program, which has higher payouts, an easier management platform, and better service all-around.

For years we have been struggling to really let the potential of our promoters and evangelists shine through. Well enough of that.

We have moved to the HasOffers platform, another Seattle-based startup that is quickly earning a reputation as an industry leader. With HasOffers our affiliates will now enjoy more visibility into their account, top-notch reporting, and an easy to navigate platform for better usability.

Now let’s talk money!
Like I said, we are paying out big for your help in promoting our software. It’s a win-win…you get to share software you love and make money while you’re at it! Here is what the payouts look like.

Affiliate payout table
              (Wowzers is right! These are the highest affiliate payouts in our industry!)

 

Got questions? We have some answers!

How long does my cookie last?

We are giving a generous 60-day cookie, to make sure you get the credit you deserve!

What kinds of tracking does HasOffers provide me?

Our new program offers real-time tracking. This means when they convert…you know! Please note the tracking will begin on the click, not the impression.

What kind of resources are available to me?
We have over 25 different creatives in there for you to get started with. This includes a variety of themes as well as sizes. We are also going to be adding to this regularly, based on affiliate feedback and needs. In addition to a plethora of creatives, we provide you a variety of optimized landing pages to help your traffic better understand what SEOmoz PRO is all about and purchase with confidence!

What kinds of campaigns are allowed?
We allow a number of different campaigns – website, blog, email, and coupon are just some examples. Currently, we are not accepting incentivized traffic, and paid search campaigns require affiliate manager approval.

How often do I get paid?
We work off a 30-day pay period and pay on Net 30 terms after the close of each period.

What about all my other questions?
Well friends, this is the awesome part. We have moved this affiliate program in-house because we are serious about making this a top-notch affiliate program. If you have any questions you can contact us directly at affiliate@seomoz.org, and we will get back to you as speedy as speedy can be!

That about sums it up for now. We are so excited, and urge all of you to check out the program and sign up! If you are looking for more information you can read about the new affiliate program in detail, or if you are ready to sign up and get promoting, you can join below!

Become an Affiliate button

 


Posted by Lindsay

Some of the Internet’s most important pages, from many of the most linked-to domains, are blocked by a robots.txt file. Does your website misuse the robots.txt file, too? Find out how search engines really treat robots.txt blocked files, entertain yourself with a few seriously flawed implementation examples and learn how to avoid the same mistakes yourself.

The robots.txt protocol was established in 1994 as a way for webmasters to indicate which pages and directories should not be accessed by bots. To this day, respectable bots adhere to the entries in the file… but only to a point.

Your Pages Could Still Show Up in the SERPs

Bots that follow the instructions of the robots.txt file, including Google and the other big guys, won’t index the content of the page, but they may still include the URL itself in their index. We’ve all seen these limited listings in the Google SERPs. Below are two examples of pages that have been excluded using the robots.txt file yet still show up in Google.

Cisco Login Page

The below highlighted Cisco login page is blocked in the robots.txt file, but shows up with a limited listing on the second page of a Google search for ‘login’. Note that the Title Tag and URL are included in the listing. The only thing missing is the Meta Description or a snippet of text from the page.

Cisco Login Page SERP

WordPress’s Next Blog Page

One of WordPress.com’s 100 most popular pages (in terms of linking root domains) is www.wordpress.com/next. It is blocked by the robots.txt file, yet it still appears in position four in Google for the query ‘next blog’.

WordPress Next Blog SERP

As you can see, adding an entry to the robots.txt file is not an effective way of keeping a page out of Google’s search results pages.

Robots.txt Usage Can Block Inbound Link Effectiveness

The thing about using the robots.txt file to block search engine indexing is not only that it is quite ineffective, but that it also cuts off your inbound link flow. When you block a page using the robots.txt file, the search engines don’t index the contents (OR LINKS!) on the page. This means that if you have inbound links to the page, this link juice cannot flow to other pages. You create a dead end.

(If this depiction of Googlebot looks familiar, that’s because you’ve seen it before! Thanks Rand.)

Even though the inbound links to the blocked page likely have some benefit to the domain overall, this inbound link value is not being utilized to its fullest potential. You are missing an opportunity to pass some internal link value from the blocked page to more important internal pages.

3 Big Sites with Blocked Opportunity in the Robots.txt File

I’ve scoured the net looking for the best bloopers possible. Starting with the SEOmoz Top 500 list, I hammered OpenSiteExplorer in search of heart-stopping Top Pages lists like this:

Digg's Top Five Pages

Ouch, Digg. That’s a lot of lost link love!

This leads us to our first seriously flawed example of robots.txt use.

#1 – Digg.com

Digg.com used the robots.txt to create as much disadvantage as possible by blocking a page with an astounding 425,000 unique linking root domains, the "Submit to Digg" page.

Submit to Digg

The good news for Digg is that from the time I started researching for this post to now, they’ve removed the most harmful entries from their robots.txt file. Since you can’t see this example live, I’ve included Google’s latest cache of Digg’s robots.txt file and a look at Google’s listing for the submit page(s).

Digg Robots.txt Cache

As you can see, Google hasn’t begun indexing the content that Digg.com had previously blocked via the robots.txt.

Digg Submit SERP

I would expect Digg to see a nice jump in search traffic following the removal of its most linked-to pages from the robots.txt file. They should probably keep these pages out of the index with the robots meta tag, ‘noindex’, so as not to flood the engines with redundant content. This move would ensure that they benefit from the link juice without cluttering the search engine indexes.

If you aren’t up to speed on the use of noindex, all you have to do is place the following meta tag into the <head> section of your page:

<meta name="robots" content="noindex, follow">

The ‘noindex’ portion tells the bots not to index that particular page, while adding ‘follow’ allows them to follow the links on the page. This is usually the best scenario, as it means that the link juice will flow to the followed links on the page. Take, for example, a paginated search results page. You probably don’t want that specific page to show up in the search results, as the contents of page 5 of that particular search are going to change day to day. But by using ‘noindex, follow’, the links to products (or jobs, in this example from Simply Hired) will be followed and hopefully indexed.

Alternatively, you can use "noindex, nofollow", but that’s a mostly pointless endeavor, as you’re blocking link juice just as you would with the robots.txt.

#2 – Blogger.com & Blogspot.com

Blogger and Blogspot, both owned by Google, show us that everyone has room for improvement. The way these two domains are interconnected does not utilize best practices and much link love is lost along the way.

Blogger Home Page Screenshot

Blogger.com is the brand behind Google’s blogging platform, with subdomains hosted at ‘yourblog.blogspot.com’. The link juice blockage and robots.txt issue that arises here is that www.blogspot.com is entirely blocked with the robots.txt. As if that wasn’t enough, when you try to pull up the home page of Blogspot, you are 302 redirected to Blogger.com.

Note: All subdomains, aside from ‘www’, are accessible to robots.

A better implementation here would be a straight 301 redirect from the home page of Blogspot.com to the main landing page on Blogger.com. The robots.txt entry should be removed altogether. This small change would unlock the hidden power of more than 4,600 unique linking domains. That is a good chunk of links.

#3 – IBM

IBM has a page with 1001 unique linking domains that is blocked by the robots.txt file. Not only is the page blocked in the robots.txt, but it also does a triple-hop 302 to another location, shown below.

IBM

When a popular page is expired or moved, the best solution is usually a 301 redirect to the most suitable final replacement.

Superior Solutions to the Robots.txt

In the big site examples highlighted above, we’ve covered some misuses of the robots.txt file. Some scenarios weren’t covered. Below is a  list of effective solutions to keep content out of the search engine index without link juice leak.

Noindex

In most cases, the best replacement for robots.txt exclusion is the robots meta tag. By adding ‘noindex’ and making sure that you DON’T add ‘nofollow’, your pages will stay out of the search engine results but will pass link value. This is a win/win!

301 Redirect

The robots.txt file is no place to list old worn out pages. If the page has expired (deleted, moved, etc.) don’t just block it. Redirect that page using a 301 to the most relevant replacement. Get more information about redirection from the Knowledge Center.

Canonical Tag

Don’t block your duplicate page versions in the robots.txt. Use the canonical tag to keep the extra versions out of the index and to consolidate the link value whenever possible. Get more information from the Knowledge Center about canonicalization and the use of the rel=canonical tag.
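
As a refresher, the tag is a single line in the <head> of each duplicate version, pointing at the copy you want indexed (the URL below is illustrative):

<link rel="canonical" href="http://www.example.com/product/blue-widget/" />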

Password Protection

The robots.txt file is not an effective way of keeping confidential information out of the hands of others. If you are making confidential information accessible on the web, password protect it. If you have a login screen, go ahead and add the ‘noindex’ meta tag to the page. If you expect a lot of inbound links to this page from users, be sure to link to some key internal pages from the login page. This way, you will pass the link juice through.

Effective Robots.txt Usage

The best way to use a robots.txt file is to not use it at all. Well… almost. Use it to indicate that robots have full access to all files on your website and to direct robots to your sitemap.xml file. That’s it.

Your robots.txt file should look like this:

—————–

User-agent: *
Disallow:

Sitemap: http://www.yoursite.com/sitemap.xml

—————–

The Bad Bots

Earlier in the post I mentioned "bots that follow the instructions of the robots.txt file," which implies that there are bots that don’t adhere to the robots.txt at all. So while you’re doing a good job of keeping out the good bots, you’re doing a horrible job of keeping out the "bad" bots. Additionally, filtering to only allow bot access to Google/Bing isn’t recommended for three reasons:

  1. The engines change/update bot names frequently (e.g. the Bing bot name change recently)
  2. Engines employ multiple types of bots for different types of content (e.g. images, video, mobile, etc.)
  3. New engines and content discovery technologies getting off the ground (e.g. Blekko, Yandex, etc.) stand even less of a chance against institutionalized preferences for existing user agents only – and search competition is good for the industry.

Competitors

If your competitors are SEO savvy in any way, shape or form, they’re looking at your robots.txt file to see what they can uncover. Let’s say you’re working on a new redesign, or a whole new product line, and you have a line in your robots.txt file that disallows bots from "indexing" it. If a competitor comes along, checks out the file and sees a directory called "/newproducttest", then they’ve just hit the jackpot! Better to keep that on a staging server, or behind a login. Don’t give all your secrets away in this one tiny file.
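
To make that concrete, imagine a competitor opening your robots.txt and finding entries along these lines (a hypothetical file, borrowing the directory from the example above):

User-agent: *
Disallow: /newproducttest/
Disallow: /2011-redesign/

Every disallow line like this is a neon sign pointing at exactly the thing you were trying to keep quiet.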

Handling Non-HTML & System Content

  • It isn’t necessary to block .js and .css files in your robots.txt. The search engines won’t index them, but sometimes they like the ability to analyze them so it is good to keep access open.
  • There isn’t a great solution for keeping non-HTML files, like PDFs, out of the search engines’ index without using the robots.txt file. Hopefully, any content of this type that you want excluded is behind a login.
  • Images! Every website has background images or images used for styling that you don’t want to have indexed. Make sure these images are displayed through the CSS and not the <img> tag as much as possible (see the sketch just after this list). This will keep them from being indexed, rather than having to disallow the "/style/images" folder in the robots.txt.
  • A good way to determine whether the search engines are even trying to access your non-HTML files is to check your log files for bot activity.
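
Regarding the images point above, here’s a minimal sketch of the CSS approach (the class name and file path are made up): rather than emitting <img src="/style/images/banner.png">, reference the file from a stylesheet so it never appears as indexable page content.

<style type="text/css">
  /* decorative banner pulled in via CSS rather than an <img> tag */
  .header-banner { background-image: url(/style/images/banner.png); width: 960px; height: 120px; }
</style>
<div class="header-banner"></div>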

More Reading

Both Rand Fishkin & Andy Beard have covered robots.txt misuse in the past. Take note of the publish dates and be careful with both of these posts, though, because they were written before the practice of internal PR sculpting with the nofollow link attribute was discouraged. In other words, these are a little dated but the concept descriptions are solid.

Action Items

  1. Pull up your website’s robots.txt file(s). If anything is disallowed, keep reading.
  2. Check out the Top Pages report in OSE to see how serious your missed opportunity is. This will help you decide how much priority to give this issue compared to your other projects.
  3. Add the noindex meta tag to pages that you want excluded from the search engine index.
  4. 301 redirect the pages on your domain that don’t need to exist anymore and were previously excluded using the robots.txt file.
  5. Add the canonical tag to duplicate pages previously robots.txt’d.
  6. Get more search traffic.

Happy Optimizing!


Posted by randfish

In the last year, there’s been a plethora of entrants to the field of link building services outside the traditional software approach of reverse-engineering competitors’ backlinks (like our Link Intersect, LAA or Open Site Explorer tools) and consulting/direct purchase. In this post, I’ll try to cover some of the interesting major new services, as well as present some long-standing options that some SEOs may not have discovered.

I’ve segmented the services below into unique sections to help differentiate the types of link building they offer. Some are more service-based, others are pure-software and the first section is more visibility-based than direct link  acquisition.

Zemanta

One of the more unique offerings in the last few years, Zemanta lets publishers submit a feed of content or images to them, which then appear in front of bloggers in the "composition" window (while they write their posts). These are labeled as "related posts" and have multiple benefits:

  • They can improve branding amongst a blogging audience (as bloggers will see your site/brand name while they write)
  • They can draw in direct links (if the blogger chooses to link to your work in the post or as a "related post" at the bottom – or through links from image references)
  • They can attract direct traffic from the bloggers themselves, who are likely to click on links/content that appears to be interesting

Zemanta's Content Recommendations
You can try Zemanta’s service via a demo on their site

Zemanta has (according to their team) been formally approved by Google’s search quality folks as a white-hat service (which makes sense since all they’re doing is showing advertising content to writers, who then determine if they want to link or not) and is now included in WordPress and Blogger.

SEOmoz has been using them for over a year now (we started with a trial and continued on) and we’ve seen good results – we tend to get a half dozen or so links to our content (the blog and YOUmoz) each month which can be seen through their reporting system (which has some upgrades in the works).

*Other than our paid use of the service, SEOmoz does not have any affiliations with Zemanta or its founders.

MyBlogGuest

Founded by Ann Smarty, MyBlogGuest provides a platform for those seeking to write and receive guest posts. The service is relatively simple, but potentially quite powerful. If a reasonable number of quality blogs and sites participate in the marketplace, the opportunities for providing great posts and receiving traffic and links back are tremendous (as are the opportunities for those seeking more content and relationships).

Blogging is an inherently social field and while the links may be a primary driver for many interested in the site, Ann has made it clear that she hopes deeper relationships will emerge from the connections. The site’s layout and signup process are impressive and compelling, though driving action once inside the platform could still use a bit more polish.

MyBlogGuest Screenshot
The marketplace is currently based on a forum connections system

You can read more about the project in SearchEngineLand’s interview with Ann from February.

I’ll be surprised if some Silicon Valley-style startups don’t pop up to copy this model. Hopefully Ann can stay far enough ahead of the game through a network effect to remain competitive. It’s a terrific idea that needs only enough branding and awareness in the space to take off.

*SEOmoz does not have an affiliation with this site, though we have contracted Ann, personally, to do projects for us in the past.

EightFoldLogic’s Linker

Originally known as Enquisite, EightFoldLogic, a software company with offices in Victoria, BC and San Francisco, has recently launched a marketplace of their own for website owners of all stripes called "Linker." The premise is similar to MyBlogGuest, but the audience is wider and the interface more customized for creating one-to-one, private connections.

Eightfoldlogic's Linker
Linker enables the creation of "criteria" much like personal ads for linking connections

EightFoldLogic Linker
Within a day of signing up in a single category, I had four potential "matches"

Linker’s goal is to connect sites and marketers interested in partnerships or link relationships with one another. Since their service ends at the time of connection, the method of obtaining the link is up to the parties involved. This means plenty of white hat options, but also potential gray hat ones – however, EightFoldLogic’s Richard Zwicky and the audience they’ve traditionally attracted lean white hat, so I expect this won’t be an issue unless the audience changes substantially.

The concept of marketplaces for link acquisition and connecting to site owners interested in links is a compelling one, but the key, as with MyBlogGuest above, will be achieving the critical mass of users necessary to make the service valuable. To that end, Linker’s made their product completely free for the next couple months – you can sign up here.

*SEOmoz provides link data via our API to EightFoldLogic but does not have a financial stake in the company or this product.

Whitespark’s Local Citation Finder

A few months ago, I wrote a blog post about a tactic to grow your Google local/maps rankings that involved a similar principle to the automated tool built by Whitespark and Ontolo.

The concept is to find sites that are included in Google Local’s "sources" for maps and local review data that link to or reference multiple sites that rank in the local results. It’s a simple idea, but well executed and incredibly useful for those seeking to optimize their local listings. You can try the Local Citation Finder here – results take just a few minutes to be returned.

Whitespark Local Citation Finder
Enter some data about your site/goals and the citation finder will email you potential sources for listings

As the local results grow in importance and competition, and as the value of having these consistent, multiple listings rises, I suspect this tool will be incredibly popular. I’d love to see further productization around showing more data about the importance/value of particular local listing sites, and some opportunities to help control and manage those listings, but this first version is pretty exciting on its own.

*SEOmoz does not have a financial or product relationship with either WhiteSpark or Ontolo, though we have been talking to the latter about use of our API in other products.

—————

Although there are dozens of other services I’d love to cover, these are some of the most interesting to me, personally. As always, looking forward to your thoughts and recommendations, too!


Posted by Suzzicks

So here is the deal: Traditional websites frequently rank in mobile search results – especially if you are searching from a SmartPhone. What you may not realize is that the converse is also true – mobile pages can rank well in traditional search. This is quite an interesting phenomenon, and something that we need to address strategically.

Mobile Search-Subway Sandwiches

All One Index Soon?

Why does this happen?

Well, Google has said that they really don’t want to index two versions of the web – one mobile and one traditional. Even though they do have different mobile-specific bots, they want all of those bots to feed into one index. Hmmmm….Is it just an interesting coincidence that they just launched multi-format sitemaps in Google, where you can combine all the different types of sitemaps that we previously had to submit separately? Possibly. At least it could indicate a shift away from multiple indexes.

Did anyone notice that this shift happened pretty soon after Caffeine, as did the re-launch of Google Images, and some significant changes in Google Places?

Hmmmm…..It seems that now things might be all moving to one index with different types of ‘indexing attributes’ that will replace the need for different indexes in the long run. That would actually do lots of things that Caffeine has done, like speed up searches, and allow them to algorithmically prioritize things by freshness more effectively….

Different Indexes for Smart Phones and Feature Phones

But I have gone astray – we were talking about mobile. We can’t know for sure if there are different mobile indexes. There definitely were in the beginning of ‘mobile’ – you could always tell because the results were SO bad! Even in the past two years, I have seen mobile search results that were way off base – for example, the top result for a search on ‘subway sandwiches’ was a Gawker article for a long time; then Subway.com, and then m.subway.com. I just checked, and they have finally sorted that one out! About 18 months ago Google changed the location of their mobile engine from m.Google.com to Google.com/m, and it did seem that the ‘/m’ feature phone search results were a bit better than they had been, but who knows!

As I have mentioned, there are different mobile search engine crawlers that are evaluating your website as if it was being rendered on a mobile phone. These mobile bots actually have generic and more specific user agent strings that will spoof actual phone handset models in order to understand how the website would behave on the different phones. While they don’t do a great job, Google actually does try to only provide you with mobile search results that will actually work well on your particular handset – What that means is that there are slight variations on search results from phone to phone.

There are some simple ways to check what I am now describing as ‘mobile indexing attributes.’ I always start mobile rankings research by doing a normal search from my traditional computer. We know more about the traditional algorithm, so that sets my baseline for comparison. From there, I will do the same search from Google.com/m to see the differences. In most cases, the websites that are included in the traditional search results will be included in the SmartPhone search results – but sometimes in a slightly different order.

You don’t have to have tons of different phones to get a sense for what is going on in mobile search. There are a couple of quick tips and tricks to help you do this all from the web. The first thing to know is that you can do searches from your computer directly from Google.com/m. The results you get will be generic ‘SmartPhone’ search results. From that page, you can move on to see the results for the same query on feature phones by simply scrolling to the bottom of the page, changing the drop-down that says ‘web’ to say ‘mobile,’ and hitting ‘go.’ This set of results will be the generic FeaturePhone results.

Mobile-Friendly Signals for the Search Engines

The best way to indicate to the search engines that your page is mobile-ready (beyond including the ‘no-transform’ tag, discussed more in another post called What is Mobile Search Engine Transcoding?) is to provide the search engines with pages that will work well on mobile phones. Handheld stylesheets can be included on any page on your site. If you don’t have mobile-specific pages, you can use these stylesheets to tell mobile browsers how you would like your existing pages to look when they are displayed on a mobile phone. These are especially good if you would like to change the order your content appears in when it is displayed on a mobile phone, and they should also be used to prevent the need for left-to-right scrolling when your site is displayed on a mobile phone.
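
A bare-bones example of what this looks like in the <head> of a hybrid page – one stylesheet for desktop rendering and one for handhelds (file names are placeholders):

<link rel="stylesheet" type="text/css" media="screen" href="/css/screen.css" />
<link rel="stylesheet" type="text/css" media="handheld" href="/css/handheld.css" />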

If you have mobile-specific pages, you should set up user-agent detection on your site to ensure that, regardless of which version ranks (mobile or traditional), users are presented with the appropriate version of the page based on the device they are using to access it. If they are on a mobile phone, they should automatically be sent to the mobile version of a page – even if it is the traditional page that actually ranked in the search engines. Conversely, if they are on a traditional computer and happen to click on a mobile version of a page, they should automatically be sent to the version of the page that is meant for traditional-computer viewing.
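
Server-side detection is generally the more robust way to do this, but as a rough client-side illustration of the idea (the user-agent list and the m. subdomain are assumptions for this sketch, not a definitive implementation):

<script type="text/javascript">
  // Very rough sketch: send obvious mobile user agents to the mobile version of the same path
  if (/iPhone|Android|BlackBerry|Opera Mini|IEMobile/i.test(navigator.userAgent)) {
    window.location.replace("http://m.example.com" + window.location.pathname);
  }
</script>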

Last, include a page-to-page link in the upper left-hand corner of each page that allows people to move between the mobile and traditional versions of the pages, if they can’t find what they are looking for or need to override the user-agent detection and redirection. The upper left-hand corner is the ideal location for this link, because it is always the first thing that people will be able to see, even if there is a mobile rendering problem with the site. If something is wrong with the way the page looks on someone’s phone, you don’t want to make them search all over for the button to fix it!

You should still create the handheld stylesheet for your mobile-specific pages and traditional pages as well, just in case something goes wrong. They are a good signal to the search engines that the pages should be ranked in mobile search results.

Mobile Usability Options:

  1. Mobile/Traditional Hybrid Pages Only: One set of pages that has two or more style sheets – One for traditional web rendering, usually called ‘screen,’ and one for mobile web rendering, usually called ‘handheld.’ An important note is that the iPhone will automatically pull the ‘screen’ stylesheet, unless you give other instructions.
  2. Traditional Pages for Computers and Mobile Pages for All Phones: Two sets of pages – one to be shown on traditional computers and one to be shown on mobile phones. The file structure of the mobile pages should be an exact replica of the traditional pages, with the addition of the ‘m.’ or ‘/m’. User-agent detection and redirection delivers feature phone users and smart phone users here automatically if they click on a link to a traditional page.

    Always include links between the mobile site and the traditional site in the upper left-hand corner of the page. Both sets of pages should have a handheld stylesheet to control mobile rendering. User-agent detection and redirection should also be in place to automatically deliver people on traditional computers who click on the mobile pages to the traditional version of the page instead.

  3. Mobile/Traditional Hybrid Pages for Traditional and SmartPhone, Mobile-Specific Pages for Feature Phones: Two sets of pages. The first set is the mobile/traditional hybrid pages, which use separate external stylesheets to render on traditional computer screens and smart phones. The second set is mobile-specific pages, hosted on an ‘m.’ subdomain or a ‘/m’ subdirectory. The file structure should be an exact replica of the traditional file structure, with the addition of the ‘m.’ or ‘/m’. User-agent detection and redirection delivers feature phone users here automatically if they click on a link to a traditional page while they are on a feature phone.

    Always include links between the mobile site and the traditional site in the upper left-hand corner of the page. Both sets of pages should have a handheld stylesheet to control mobile rendering. User-agent detection and redirection should also be in place to automatically deliver people on traditional computers who click on the mobile pages to the traditional version of the page instead.

  4. Traditional Pages for Computers, Graphical Mobile Pages for Smart Phones, Text Mobile Pages for Feature Phones: Three sets of pages: traditional pages for traditional computers, touch-optimized pages for smart phones with touch screens, and mobile-optimized pages for feature phones and smart phones without touch screens. User-agent detection and redirection delivers users with touch screens to the touch-screen pages if they click on a link while they are on a touch-screen phone, and delivers users on feature phones and smart phones without a touch screen to the mobile-optimized pages. In this scenario, you will need two mobile-specific subdomains or subdirectories. I recommend using ‘touch.’ or ‘/touch’ for the touch-screen pages, and ‘m.’ or ‘/m’ for the mobile-optimized pages.

    Always include links between the mobile sites and the traditional site in the upper left-hand corner of the page. All three sets of pages should have a handheld stylesheet to control mobile rendering. User-agent detection and redirection should also be in place to automatically deliver people on traditional computers who click on either version of the mobile pages to the traditional version of the page instead.



SEOmoz Daily SEO Blog

Posted by randfish

Today, Yahoo! formally announced that it’s fully transitioning its search engine backend to Microsoft’s Bing. While this is good news on many fronts for marketers (simplification of advertising platforms, a bigger competitor for Google, etc.), it’s a big loss to webmasters who relied on some of the advanced link data available from Yahoo! Search that’s now unavailable.

While Yahoo! is maintaining their Site Explorer service, advanced query parameters on searches using the link: and linkdomain: operators no longer return results.

Yahoo!'s Linkdomain Command No Longer Returns Results

For the query above, Yahoo! previously showed pages that pointed to any page on SEOmoz.org from sites with the .edu TLD extension (these now return no results)

Webmasters and marketers will no longer be able to use advanced parameters such as inurl:, intitle:, and site: on link: and linkdomain: searches, breaking many data sources for software tools and limiting link research abilities. However, there are several worthwhile solutions/replacements, including tools from SEOmoz (though I’ll also cover a few others).

#1 – Linkscape Advanced Reports

SEOmoz PRO members now have unlimited access to Linkscape advanced reports, which can apply filters through the UI in much the same way one could with Yahoo! link searches.

Linkscape Advanced Report Filtering on EDU sites

Using the filters and search capabilities, I can add nearly all of the filters previously possible through Yahoo!, and many others unique to Linkscape.

This tool is available at www.seomoz.org/linkscape

#2 – OpenSiteExplorer CSV Exports

Another methodology, without quite as many bells and whistles, is to use Open Site Explorer. While Linkscape offers filtering right inside the interface, Open Site Explorer is built for speed, meaning you can see lots of links, but only in the views directly ported from our API. To get into deep filtering, you’ll need to use the CSV export + Excel (or your favorite spreadsheet program).

Filter on OpenSiteExplorer

The filters in OSE are more limited than Linkscape’s, but most reports take <10 seconds to generate

When I export the results to CSV and open in Microsoft Excel, I can easily filter for the .edu links (or any other modifier I’m interested in). OSE also shows up to 10,000 links per report vs. Linkscape’s 3,000.

CSV Export Filter on EDU Links from Open Site Explorer

Using the "find" command in Excel is the simplest methodology, but you can do all sorts of awesome filtering using more advanced techniques

This tool is available at www.opensiteexplorer.org
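If you’d rather script the filtering than do it in Excel, a few lines of Python can slice the exported CSV the same way. This is only a sketch – the file name and the "URL" column header are assumptions, so check your export’s actual headers first:

```python
# Rough sketch: filter an Open Site Explorer CSV export down to .edu linking pages.
# File name and column header are assumptions -- match them to your actual export.
import csv

with open("ose_inbound_links.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

edu_links = [
    r for r in rows
    if ".edu/" in r.get("URL", "") or r.get("URL", "").endswith(".edu")
]

print(f"{len(edu_links)} of {len(rows)} exported links come from .edu pages")
for row in edu_links[:20]:
    print(row.get("URL", ""))
```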

#3 – Majestic SEO

Majestic SEO is built on MJ-12, a UK-based distributed crawling project, and offers a tool for backlink research. The index is built slightly differently from how the major search engines and Linkscape build theirs – instead of fresh indices built from regular crawls, MJ-12 adds new links and pages to an ongoing index as they’re discovered. This means a much larger dataset, but not always the same level of freshness, and limited de-duplication/canonicalization. However, many SEOs like this project a lot, and MJ-12 enables the same filtering available in Linkscape:

Majestic SEO Filter for EDU links

Many cool filters and ordering options are available via MJ’s tool, and reports typically return fairly quickly

This tool is available at www.majesticseo.com

#4 – Yahoo! Site Explorer CSV Exports

Just as CSV exports from Open Site Explorer can enable link searching, so too can exports from Yahoo!’s Site Explorer. The big limitation is the 1,000-link cap (1/3rd that of Linkscape and 1/10th that of Open Site Explorer). Previously, SEOs would use modified queries to request more link data from Yahoo!, but with this switch, the only remaining option is to request links for many individual pages on a single domain to get a better sense of sites with more than 1,000 external links.

Yahoo Site Explorer

The "Export first 1000 results to TSV" button + Excel filtering option enables marketers to do research, but is limited in quantity

This tool is available at search.siteexplorer.yahoo.com
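To stretch the 1,000-link cap a bit, you can export Site Explorer data for several URLs on the same domain and merge the files. A rough sketch, assuming the linking page’s URL sits in the first column of each TSV (verify this against your own exports):

```python
# Sketch: merge several 1,000-row Site Explorer TSV exports and de-duplicate
# the linking URLs. Assumes the linking URL is in the first column of each file.
import csv
import glob

seen = set()
for path in glob.glob("site_explorer_export_*.tsv"):
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            if row and row[0].startswith("http"):
                seen.add(row[0].strip())

print(f"{len(seen)} unique linking URLs across all exports")
```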

#5 – The SEOmoz API

For those with some programming skills, SEOmoz offers a free API for link data with up to 1 million calls per month, as well as a larger, full-featured paid link data API. This is the same API that powers both the Linkscape tool and Open Site Explorer, as well as integrations with Conductor, Hubspot, Flippa, Brightedge and many others.

SEOmoz's API Wiki

The API Wiki offers lots of information and examples on how to make calls to the service and integrate it with your own software or processes.

This API is available at www.seomoz.org/api
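As a loose illustration of what a call might look like, here’s a sketch in Python. The endpoint path, parameter names, and HMAC-SHA1 signing scheme are written from memory of the docs of the era and should be treated as assumptions – the API Wiki is the authoritative reference:

```python
# Hedged sketch of a signed call to the SEOmoz (Linkscape) URL-metrics endpoint.
# Endpoint, parameter names, and signing scheme are assumptions -- check the API Wiki.
import base64
import hashlib
import hmac
import time
import urllib.parse
import urllib.request

ACCESS_ID = "member-xxxxxxxx"   # placeholder credentials
SECRET_KEY = "your-secret-key"

def url_metrics(target_url: str) -> bytes:
    expires = int(time.time()) + 300
    to_sign = f"{ACCESS_ID}\n{expires}"
    signature = base64.b64encode(
        hmac.new(SECRET_KEY.encode(), to_sign.encode(), hashlib.sha1).digest()
    ).decode()
    query = urllib.parse.urlencode(
        {"AccessID": ACCESS_ID, "Expires": expires, "Signature": signature}
    )
    endpoint = (
        "http://lsapi.seomoz.com/linkscape/url-metrics/"
        + urllib.parse.quote(target_url, safe="")
        + "?" + query
    )
    with urllib.request.urlopen(endpoint) as resp:
        return resp.read()  # response body with link metrics for the URL

print(url_metrics("www.seomoz.org"))
```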

#6 – Other possibilities

In addition to these sources, there are a few other options, albeit with less fully functional or open systems. These include:

Other sources may yet emerge, and certainly players like Majestic and SEOmoz are working hard to improve their coverage, quality and functionality. It will be interesting to see how this change affects the link research landscape – hopefully Bing is working on something valuable to help replace this functionality and to serve up data when Yahoo! Site Explorer is also retired (currently scheduled for 2012).

 



SEOmoz Daily SEO Blog

Posted by randfish

The team at SEOmoz has been hard at work this week, smoothing out a lot of the initial bumps we’ve seen with our beta launch of the new web app. We anticipated the app would be popular, but I don’t think any of us were prepared for just how many keywords needed rank checking/grading and pages needed crawling/error-checking. Our queue to fetch rankings/crawl URLs had a backlog of multiple tens of thousands of requests all week, and the dev team’s been slogging away on parallelization, separation of queueing stages and other fixes.

Beta Web App

Our next big release is scheduled for August 25 (possibly the 26th depending on how repairs go) and we’re all crazy excited (and more than a little nervous, sleep deprived and caffeinated). Feel free to start marking your calendars; I know we have :-)

But, today, I’m here to talk about (and ask about) the future of the web app. We’ve got a nearly endless list of features & functionality we’re hoping to add to the web app in the weeks and months to come, and we need your help in prioritizing what YOU care about. To start, I’ll share two lists – the first is our "quick hit" list of items we’re planning to address in the next 2-3 weeks (some will even be in time for our "big" launch on August 25th). The second is some larger concepts we’ve been noodling around with that may take a few months to get in. With both, we’re hoping you’ll give your two cents and help us prioritize which items to concentrate on.

Quick Hits List

#1 – Printable Reports (DOC & PDF)

We’ve heard from a lot of users already that they’d like the ability to export the crawl diagnostic reports, on-page summaries/report cards and ranking data into DOC or PDF files to be integrated into internal or client reporting. Luckily, this is a feature that’s early on our roadmap, possibly coming as soon as September.

#2 – On-Page Optimization Interface Tweaks

On Page Analysis Report Card

The on-page analysis section has already garnered a lot of kind words and hopefully helped many of you improve your targeting for some easy rankings wins. However, there are a few tweaks folks have suggested to make it more usable, including removing the "fix" level-of-difficulty label on elements that are already completed and offering a way to re-order the recommendations so the incomplete ones appear at the top.

We’re also working on ways, in the longer term, to make this page shorter and the information more quickly digestible. Look for some interface experiments coming soon.

#3 – Adding Issues to Crawl Diagnostics

We currently track 20 unique crawl issues (split between errors, warnings and notices). Some other items we’ve considered tracking include:

  • Use of meta noarchive (notice)
  • Pages with display: none in their CSS (notice)
  • Pages lacking analytics tracking code (warning)
  • Pages that return any response code outside those we already track – 200, 301/2, 40x, 50x (error)
  • Pages that redirect through more than two chains (warning)
  • Pages that serve a meta refresh (notice)
  • Pages that redirect with JavaScript (warning)

If you have additional items you’d like to see in the crawl diagnostics, please let us know!
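For illustration only, here’s a rough sketch of how a check for two of the proposed notices (meta refresh and display: none) might look. A real crawler would parse the DOM and external CSS rather than lean on regexes, and the URL below is a placeholder:

```python
# Illustrative only: quick-and-dirty checks for two of the proposed notices
# (meta refresh and display:none) on a single fetched page.
import re
import urllib.request

def quick_checks(url: str) -> dict:
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return {
        "meta_refresh": bool(
            re.search(r'<meta[^>]+http-equiv=["\']?refresh', html, re.IGNORECASE)
        ),
        "display_none": bool(
            re.search(r'display\s*:\s*none', html, re.IGNORECASE)
        ),
    }

print(quick_checks("http://www.example.com/"))
```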

#4 – "Ignorable" Crawl Issues

Some of our members have noted that they’d like to be able to "ignore" an issue and have it exist only in an "archived" issues section. We think this is a great idea, as there can be times when we catch a 404, duplicate content, robots blocking, etc., and it’s not a problem for your site but an intentional move. When this happens, it can be frustrating to keep seeing the error/warning message, so an archiving system might be ideal.

We’re still working out how to implement this, but an "ignore all issues of this type" option and a specific "ignore this issue for this URL" option are currently on the roster.

#5 – Bulk Keyword Import System

Today, it can be a bit frustrating to add more than 5-10 keywords and labels at a time. We’d like to build a system that lets you upload a CSV or paste in rows with label data included in a consistent format, to make bulk insertion and labeling easier.
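A bulk format along these lines – shown here purely as a hypothetical sketch, not a final spec – might look something like this:

```python
# Hypothetical "keyword,label" bulk-import format and a tiny parser for it.
# The two-column layout is illustrative only.
import csv
import io

pasted = """keyword,label
mens running shoes,category pages
puma suede,product pages
running shoe reviews,blog content
"""

reader = csv.DictReader(io.StringIO(pasted))
for row in reader:
    print(f"add keyword {row['keyword']!r} with label {row['label']!r}")
```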

Big Ideas for the Future

Although we’ve amassed literally hundreds of ideas for upgrading and adding to the web app’s feature set, we’re really excited about a few key ones that contain many mini-features. These include:

A) Integration with Google Analytics

One of the projects we’re most excited about is integrating with Google Analytics (and later, other packages like Webtrends and Omniture). You can see some of our early ideas below in wireframe format (these ARE NOT finished designs by any means, just illustrations I made in Flash).

Web App Analytics
Integration Wireframe Teaser

We’re keen on the idea of having some stacked area graphs to help you see when traffic from different sources varies, and to help measure indexation via the chart below. Splitting out social traffic by filtering against a set of referrers (ReadWriteWeb does a good breakdown of sources) also struck us as a great feature.
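As a rough sketch of the referrer-splitting idea (the domain lists below are illustrative, not an official taxonomy):

```python
# Sketch: split visits into social / search / other buckets by referrer domain.
from urllib.parse import urlparse

SOCIAL = {"twitter.com", "t.co", "facebook.com", "reddit.com", "stumbleupon.com", "digg.com"}
SEARCH = {"google.com", "bing.com", "yahoo.com"}

def classify(referrer: str) -> str:
    host = urlparse(referrer).netloc.lower().removeprefix("www.")
    if host in SOCIAL:
        return "social"
    if any(host == d or host.endswith("." + d) for d in SEARCH):
        return "search"
    return "direct" if not host else "other referral"

for ref in ["http://twitter.com/randfish", "http://www.google.com/search?q=seo", ""]:
    print(ref or "(no referrer)", "->", classify(ref))
```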

From there, we’re also bullish on including data about specific keywords alongside rankings, keyword difficulty scores and estimates from Google AdWords:

Keyword Search Traffic from GA

With this data, we think we can calculate some cool metrics around the potential opportunity of a given keyword, though this will, obviously, require some testing and refinement.

B) Crawl Depth Analysis

We’ve long wanted a way to visualize a site’s internal link structure and see how the depth of pages from the homepage might actually be influencing crawling, rankings and traffic. With the custom crawl & crawl diagnostics system, we believe we can architect this into the web app’s dataset (though it’s unfortunately non-trivial to do so). You can see a very early wireframe below:

Crawl Depth Report
Teaser Wireframe

This is one of our more ambitious projects, but we’d love your thoughts about whether it would be valuable/useful for your campaigns.

C) XML Sitemaps Builder

Building an XML Sitemap can be a pain, even with some of the specialized software out there (though we at SEOmoz are big fans of John Mueller’s GSiteCrawler). Since the web app is already crawling your site’s pages, it only makes sense that we could construct an XML sitemap, plug into Google Webmaster Tools’ API and help you verify the sitemap and make custom tweaks based on what you want to include or exclude.
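The mechanics of the output side are straightforward. A minimal sketch of turning a list of crawled URLs into a sitemap file might look like this (the URLs are placeholders):

```python
# Minimal sketch: build an XML sitemap from a list of crawled URLs.
import xml.etree.ElementTree as ET

def build_sitemap(urls, out_path="sitemap.xml"):
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for u in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = u
    ET.ElementTree(urlset).write(out_path, encoding="utf-8", xml_declaration=True)

build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/blog/",
    "http://www.example.com/tools/",
])
```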

D) Keyword Research System

A relatively obvious next step would be the addition of a keyword research tool. We’d like to integrate the functionality of the keyword difficulty tool’s analysis along with data from Google’s AdWords API. This might help you choose which keywords are most likely to produce value for your site and deserve some content/targeting in SEO.

E) Historical Link Analysis

One feature we hear demand for all the time is historical link information. We’ve actually got the data already stored from previous indices, but in testing retrieval, we’ve found that numbers can really bounce around due to the massive amount of noise in the "not-so-awesome" parts of the web (spammy sites, scrapers, etc.). Thus, we’re looking into ways to scrub the data a bit before building this system (possibly by using our metrics to offer the option of showing only linking pages with mozRank 2-3+, which tend to be relatively high quality). This work may take us into November or later, but we’ve got our fingers crossed that it can be in the web app by year’s end.

Link Growth Over Time
Wireframe Teaser

The wireframes above are just some initial concepts. We’d also really like to be able to show you pages/sites that were linking to you in a previous index but aren’t any longer, as well as those that are newly linking.

F) Social Media / Link Monitoring System

Finally, we’ve got a project to turn some of the early work from Blogscape and our Social Media Monitoring prototype into a more robust, fully functional system. Our goal here is to provide a list of all the pages, tweets, blog posts and links that your site acquires in a more real-time environment. So many of us are constantly doing Google Blog searches and Twitter searches and looking at our referrers via analytics that we thought it would be great to combine all that data in a single repository, so you can keep up to date on what the web is saying about you (and, more relevantly, how important each of those sources is).

We’re still at the very early stages of this work, but hope to have some wireframes to show in the not-too-distant future – possibly in the next feedback request post.

Just for fun, I thought I’d include a poll regarding these "big" ideas and see which you’re most excited about:



With our next big launch just 9 days away (yikes!), we’re all working hard to make the web app and the many other pieces that are releasing better, faster and more stable. However, we’d love your opinions and will certainly use that feedback to improve, if not next week then in the future.

Also – as we move forward, we’ve decided to be more open about our product development and roadmap (as part of our commitment to being TAGFEE), so you can expect a post every few weeks or so detailing some of our ideas and asking for your thoughts on what to build next and how to improve.

p.s. If you haven’t tried the web app beta yet, give it a spin – it’s PRO-only, and some sections are a little slow, but by building a campaign now, you’ll have more historical data and trends to compare over time as the app improves.



SEOmoz Daily SEO Blog

Posted by randfish

Many of our keen members observed that late last week, Linkscape’s index updated (this is actually our 27th index update since starting the project in 2008). This means new link data in Open Site Explorer and Linkscape Classic, as well as new metric data via the mozbar and in our API.

Index 27 Statistics

For those who are interested, you can follow the Linkscape index update calendar on our API Wiki (as you can see, this update was about a week early).

Although we’ve now crawled many hundreds of billions of pages since launch, we only serve our uber-freshest index. Historical data is something we want to do soon – more on that later. This latest index’s stats feature:

  • Pages – 40,152,060,523
  • Subdomains – 284,336,725
  • Root Domains – 91,539,345
  • Links – 420,049,105,986
  • % of Nofollowed Links – 2.02%
  • % of Nofollows on Internal Links – 58.7%
  • % of Nofollows on External Links – 41.3%
  • % of Pages w/ Rel Canonical – 4.3%

These numbers continue the trend we’ve been seeing for some time: internal nofollow usage is declining slightly, while rel canonical is down a bit in this index but up substantially since the start of the year (this likely has more to do with our crawl selection than with sites actually removing canonical URL tags).

Comparing Metrics from Index to Index

One of the biggest requests we get is the ability to track historical information about your metrics from Linkscape. We know this is really important to everyone and we want to make it happen soon, but we have some technical and practical challenges to overcome. The biggest is that what we crawl changes substantially with each index, both due to our improvements in what to crawl (and what to ignore) and due to the web’s massive changes each month (60%+ of the pages we fetched 6 months ago no longer exist!).

For now, the best advice I can give is to measure yourself against competitors and colleagues rather than against your own metrics from last month or last year. If you’re improving against the competition, chances are good that your overall footprint is increasing at a higher rate than theirs. You might even "lose" links in a raw count from the index but actually have improved, simply because a few hundred spam/scraper websites weren’t crawled this time around, or because we’ve done better URL canonicalization than last round, or because your link rotated out of the top of a popular RSS feed many sites were reproducing.

OpenSiteExplorer Comparison Report
Measuring against other sites in your niche is a great way to compare from index to index

If you’ve got more questions about comparisons and index modifications over time, feel free to ask in the comments and we’ll try to dive in. For those who are interested, our current thinking around providing historical tracking is to give multiple number sets – e.g., # of links from mR 3+ pages, # of links from mR 1-3 pages, etc. – to help show how many "important" links you’re gaining/losing; these fluctuate much less from index to index and may be better benchmarking tools.
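To make the bucketing idea concrete, here’s a tiny sketch of counting links by mozRank band (the data and thresholds are illustrative only):

```python
# Sketch: count linking pages by mozRank band from (url, mozRank) pairs,
# e.g. pulled from an export or the API. Thresholds and data are illustrative.
from collections import Counter

def bucket(moz_rank: float) -> str:
    if moz_rank >= 3:
        return "mR 3+"
    if moz_rank >= 1:
        return "mR 1-3"
    return "mR <1"

links = [
    ("http://example.edu/page", 4.2),
    ("http://blog.example.com/", 1.8),
    ("http://spammy.example.net/", 0.3),
]
counts = Counter(bucket(mr) for _, mr in links)
print(dict(counts))  # e.g. {'mR 3+': 1, 'mR 1-3': 1, 'mR <1': 1}
```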

Integration with Conductor’s Searchlight Software

SEOmoz is proud to be powering Conductor’s new Searchlight software. I got to take a demo of their toolset 2 weeks ago (anyone can request one here) and was very impressed. See for yourself with a few exclusive screenshots I’ve wrangled up:

Searchlight Screenshot 1/4

Searchlight Screenshot 2/4

Searchlight Screenshot 3/4

Searchlight Screenshot 4/4

Conductor's Seth Besmertnik at the Searchlight Launch Event

And at the bottom of the series is Seth Besmertnik, Conductor’s CEO, during the launch event (note the unbuttoned top button of his shirt with the tie; this indicates Seth is a professional, but he’s still a startup guy at heart). Searchlight already has some impressive customers including Monster.com, Care.com, Siemens, Travelocity, Progressive and more. I think many in the SEO field will agree that moving further into software is a smart move for the Conductor team, and the toolset certainly looks promising.

Conductor’s also releasing some cool free research data on seasonality (request form here). Couldn’t resist sharing a screenshot below of the sample Excel workbook they developed:

Keyword Seasonality Excel Workbook from Conductor

mmm… prepopulated

SEOmoz’s Linkscape index currently powers the link data section of Searchlight via our API and we’re looking forward to helping many other providers of search software in the future. We’re also integrated with Hubspot’s Grader.com and EightFoldLogic‘s (formerly Enquisite) Linker, so if you’re seeking to build an app and need link data, you can sign up for free API access and get in touch if/when you need more data.

The Link Juice App for iPhone

We’re also very excited about the popular and growing iPhone app – LinkJuice. They’ve just recently updated the software with a few recommendations straight from Danny Dover and me!

LinkJuice App 1/2
LinkJuice App 2/2

The LinkJuice folks have promised an Android version is on its way soon, and since that’s my phone of choice, I can’t wait!

If you’ve got an app, software piece or website that’s powered by Linkscape, please do drop us a line so we can include it. I’ve been excited to see folks using it for research – like Sean’s recent YOUmoz post on PageRank correlations – as well as in many less public research works.

Oh, and if you somehow missed the announcement, go check out the new Beginner’s Guide to SEO! It’s totally free and Danny’s done a great job with it.



SEOmoz Daily SEO Blog