
Posted by randfish

In early June of this year, SEOmoz released some ranking correlation data about Google's web results and how they mapped against specific metrics. This exciting work gave us valuable insight into Google's rankings system, confirming many assumptions and opening up new lines of questions. When Google announced their new Places Results at the end of October, we couldn't help but want to learn more.

In November, we gathered data for 220 search queries – 20 US cities and 11 business "types" (different kinds of queries). This dataset is smaller than the one behind our web results study and was intended as an initial data gathering project before we dove deeper, but our findings proved surprisingly significant (from a statistical standpoint) and thus, we're making the results and report publicly available.

As with our previous collection and analysis of this type of data, it’s important to keep a few things in mind:

  1. Correlation ≠ Causation – the findings here are merely indicative of what high ranking results are doing that lower ranking results aren't (or, at least, are doing less of). It's not necessarily the case that any of these factors are the cause of the higher rankings; they could merely be a side effect of pages that perform better. Nevertheless, it's always interesting to know what higher ranking sites/pages are doing that their lower ranking peers aren't.
  2. Statistical Significance – the report specifically highlights results whose correlations are more than two standard errors away from zero (a 98%+ chance of a non-zero correlation). Many of the factors we measured fall into this category, which is why we're sharing despite the smaller dataset. In terms of the correlation numbers, remember that 0.00 is no correlation and 1.0 is perfect correlation. In our opinion, in algorithms like Google's, where hundreds of factors are supposedly at play together, data in the 0.05-0.1 range is interesting and data in the 0.1-0.3 range is potentially worth more significant attention.
  3. Ranked Correlations – the correlations compare pages that ranked higher vs. those that ranked lower, and the report (and the data below) gives average correlations across the entire dataset (except where specified), with standard error as a metric for accuracy (see the sketch after this list).
  4. Common Sense is Essential – you'll see some datapoints, just like in our web results set, suggesting that sites not following commonly held "best practices" (like using the name of the queried city in your URL) rank better. We strongly urge readers to use this data as a guideline, but not a rule (for example, it could be that many results using the city name in the URL are national chains with multiple "city" pages, and thus aren't as "local" in Google's eyes as their peers).
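
For readers who want to try this kind of analysis on their own result sets, here is a minimal sketch of the ranked-correlation approach described in points 2 and 3. It assumes a hypothetical pandas DataFrame with one row per result and columns for the query, the ranking position, and each factor; it is not SEOmoz's actual pipeline.

    import numpy as np
    import pandas as pd
    from scipy.stats import spearmanr

    def mean_ranked_correlation(df: pd.DataFrame, factor: str):
        """Mean Spearman correlation of `factor` with ranking, plus its standard error."""
        per_query = []
        for _, serp in df.groupby("query"):
            if serp[factor].nunique() > 1:  # correlation is undefined on constant columns
                rho, _ = spearmanr(-serp["position"], serp[factor])  # position 1 = top result
                per_query.append(rho)
        corrs = np.array(per_query)
        mean = corrs.mean()
        std_err = corrs.std(ddof=1) / np.sqrt(len(corrs))  # standard error of the mean
        return mean, std_err

    # e.g. mean, se = mean_ranked_correlation(places_df, "domain_link_count")
    # A mean correlation more than ~2 standard errors from zero is roughly the bar
    # the report uses for "significantly non-zero".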

With those out of the way, let’s dive into the dataset, which you can download a full version of here:

  • The 20 cities included:
    • Indianapolis
    • Austin
    • Seattle
    • Portland
    • Baltimore
    • Boston
    • Memphis
    • Denver
    • Nashville
    • Milwaukee
    • Las Vegas
    • Louisville
    • Albuquerque
    • Tucson
    • Atlanta
    • Fresno
    • Sacramento
    • Omaha
    • Miami
    • Cleveland
  • The 11 Business Types / Queries included:
    • Restaurants
    • Car Wash
    • Attorneys
    • Yoga Studio
    • Book Stores
    • Parks
    • Ice Cream
    • Gyms
    • Dry Cleaners
    • Hospitals

Interestingly, the results we gathered seem to indicate that the Google Places ranking algorithm doesn't differ much across cities, but when business/query types are considered, there are indications that Google may indeed be changing up how the rankings are calculated (an alternative explanation is that different business segments simply have dramatically different weights on the factors depending on their type).

For this round of correlation analysis, we contracted Dr. Matthew Peters (who holds a PhD in Applied Math from the University of Washington) to create a report of his findings based on the data. In discussing the role that cities/query types played, he noted:

City is not a significant source of variation for any of the variables, suggesting that Google's algorithm is the same for all cities. However, for 9 of the 24 variables we can reject the null hypothesis that business type is not a significant source of variation in the correlation coefficients at α=0.05. This is highly unlikely to have occurred by chance. Unfortunately there is a caveat to this result. The results from ANOVA assume the residuals to be normally distributed, but in most cases the residuals are not normal as tested with a Shapiro-Wilk test.

You can download his full report here.
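
To make the quoted test concrete, here is a minimal sketch (with made-up numbers, not Dr. Peters' actual data) of a one-way ANOVA on correlation coefficients grouped by business type, followed by the Shapiro-Wilk check on the residuals that he flags as the caveat.

    import numpy as np
    from scipy.stats import f_oneway, shapiro

    # corr_by_type: {business_type: correlation coefficients, one per city} - illustrative values
    corr_by_type = {
        "restaurants": np.array([0.12, 0.08, 0.15, 0.11]),
        "attorneys":   np.array([0.02, 0.05, 0.01, 0.04]),
        "gyms":        np.array([0.09, 0.10, 0.07, 0.12]),
    }

    groups = list(corr_by_type.values())
    f_stat, p_value = f_oneway(*groups)  # H0: business type is not a source of variation
    print(f"ANOVA F={f_stat:.2f}, p={p_value:.3f}")

    # Residuals = each value minus its group mean; ANOVA assumes these are normal.
    residuals = np.concatenate([g - g.mean() for g in groups])
    w_stat, p_norm = shapiro(residuals)
    print(f"Shapiro-Wilk p={p_norm:.3f}  (a small p means the normality assumption is doubtful)")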

Next, let’s look at some of the more interesting statistical findings Matt discovered. These are split into 4 unique sections, and we’re looking only at the correlations with Places results (though the data and report also include web results).

Correlation with Page-Specific Link Popularity Factors

Google Places Correlations with Page-Specific Link Popularity Elements

With the exception of PageRank, all data comes via SEOmoz’s Linkscape data API.

NOTE: In this data, mozRank and PageRank are not significantly different than zero.

Domain-Wide Link Popularity Factors

Google Places Domain Link Factor Correlations

All data comes via SEOmoz’s Linkscape data API.

NOTE: In this data, all of the metrics are significant.

Keyword Usage Factors

Google Places Keyword Usage Correlations 

All data comes directly from the results page URL or the Places page/listing. Business keyword refers to the type, such as "ice cream" or "hospital," while city keyword refers to the location, such as "Austin" or "Portland." The relatively large, negative correlation with the city keyword in URLs is an outlier (no other element we measured for local listings had a significant negative correlation). My personal guess is that nationwide sites trying to rank city-targeted pages generally don't perform as well as local-only results, and that could introduce this bias, but we don't have evidence to prove that theory and other explanations are certainly possible.

NOTE: In this data, correlations for business keyword in the URL and city keyword in the title element were not significantly different than zero.
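
If you want to run the same kind of check against your own listings, a rough sketch of the keyword-usage features above might look like this (the function and field names are illustrative, not SEOmoz's actual schema):

    def keyword_usage_features(url: str, title: str, business_kw: str, city_kw: str) -> dict:
        """Boolean keyword-usage signals for one Places result (illustrative only)."""
        url_l = url.lower().replace("-", " ").replace("_", " ")
        title_l = title.lower()
        return {
            "business_kw_in_url": business_kw.lower() in url_l,
            "city_kw_in_url": city_kw.lower() in url_l,
            "business_kw_in_title": business_kw.lower() in title_l,
            "city_kw_in_title": city_kw.lower() in title_l,
        }

    # keyword_usage_features("http://www.example.com/portland-attorneys/",
    #                        "Smith & Jones | Portland Attorneys",
    #                        "attorneys", "portland")
    # -> all four flags True for this made-up example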

Places Listings, Ratings + Reviews Factors

Google Places Listings Correlations

All data comes directly from Google Places’ page about the result.

NOTE: In this data, all of the metrics are significant. 

Interesting Takeaways and Notes from this Research:

  • In Places results, domain-wide link popularity factors seem more important than page-specific ones. We’ve heard that links aren’t as important in local/places and the data certainly suggest that’s accurate (see the full report to compare correlations), but they may not be completely useless, particularly on the domain level.
  • Using the city and business type keyword in the page title and the listing name (when claiming/editing your business’s name in the results) may give a positive boost. Results using these keywords seem to frequently outrank their peers. For example: Portland Attorneys Places Results
     
  • More is almost always better when it comes to everything associated with your Places listing – more related maps, more reviews, more "about this place" results, etc. However, these factors don't appear as powerful as we'd initially thought. It could be that the missing "consistency" metric is a big part of why the correlations here weren't higher.
  • Several things we didn't measure in this report are particularly interesting, and unfortunately we missed them. These include:
    • Proximity to centroid (just tough to gather for every result at scale)
    • Consistency of listings (supposedly a central piece of the Local rankings puzzle) in address, phone number, business name, type
    • Presence of specific listing sources (like those shown on GetListed.org for example)
  • This data isn’t far out of whack with the perception/opinions of Local SEOs, which we take to be a good sign, both for the data, and the SEOs surveyed :-)

Our hope is to do this experiment again with more data and possibly more metrics in the future. Your suggestions are, of course, very welcome.


As always, we invite you to download the report and raw data and give us any feedback or feel free to do your own analyses and come to your own conclusions. It could even be valuable to use this same process for results you (or your clients) care about and find the missing ingredients between you and the competition.

p.s. Special thanks to Paris Childress and Evgeni Yordanov for help in the data collection process.



SEOmoz Daily SEO Blog

Generally I have not been a huge fan of registering all your websites with Google (profiling risks, etc.), but they keep using the carrot nicely to lead me astray. :D … So much so that I want to find a Googler and give them a hug.

Google recently decided to share some more data in their webmaster tools. And for many webmasters the data is enough to make it worth registering (at least 1 website)!

AOL Click Data

When speaking of keyword search volume breakdown data, people have typically shared information from the leaked AOL search data.

The big problem with that data is it is in aggregate. It is a nice free tool, and a good starting point, but it is fuzzy.

Types of Searches

There are 3 well known search classifications: navigational, transactional, and informational. Each type of query has a different traffic breakdown profile.

  • In general, for navigational searches people click the top result more often than they would on an informational search.
  • In general, for informational searches people tend to click throughout the full set of search results at a more even distribution than they would for navigational or transactional searches.
  • The only solid, recently shared public data on those breakdowns is from Dogpile [PDF], a meta search engine. But given how polluted meta search services tend to be (with ads mixed into their search results), those numbers were quite a bit off from what one might expect. And once more, they are aggregate numbers.

Other Stuff in the Search Results

Further, anecdotal evidence suggests that the appearance of vertical / universal results within the search results set can impact search click distribution. Google shows maps on 1 in 13 search results, and they have many other verticals they are pushing – video, updates, news, product search, etc. And then there are AdWords ads – which many searchers confuse with the organic search results.

Pretty solid looking estimates can get pretty rough pretty fast. ;)

The Value of Data

If there is one critical piece of marketing worth learning above all others it is that context is important.

My suggestions as to what works, another person’s opinions or advice on what you should do, and empirical truth collected by a marketer who likes to use numbers to prove his point … well all 3 data sets fall flat on their face when compared against the data and insights and interactions that come from running your own business. As teachers and marketers we try to share tips to guide people toward success, but your data is one of the most valuable things you own.

A Hack to Collect Search Volume Data & Estimated CTR Data

In their Excel plug-in, Microsoft shares the same search data they use internally, but it's not certain that Microsoft will keep sharing as much data as they do now once the Yahoo! Search deal is integrated.

Google offers numerous keyword research tools, but getting them to agree with each other can be quite a challenge.

There have been some hacks to collect organic search clickthrough rate data on Google. One of the more popular strategies was to run an AdWords ad for the exact match version of a keyword and bid low enough to land on the first page of results. Keep the ad running for a while and then run an AdWords impression share report. With that data in hand you can estimate how many actual searches there were, and then compare your organic search clicks against that to get an effective clickthrough rate.
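
The arithmetic behind that hack is simple; here is a toy example with made-up numbers:

    adwords_impressions = 4_200   # impressions your exact-match ad received
    impression_share = 0.35       # from the AdWords impression share report
    organic_clicks = 1_900        # clicks on your organic listing (from analytics)

    estimated_searches = adwords_impressions / impression_share   # ~12,000 searches
    effective_ctr = organic_clicks / estimated_searches           # ~15.8% organic CTR
    print(f"~{estimated_searches:,.0f} searches, ~{effective_ctr:.1%} organic CTR")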

The New Solution

Given search personalization and localization and the ever-changing result sets with all the tests Google runs, even the above can be rough. So what is a webmaster to do?

Well, Google upgraded the data they share inside their webmaster tools, which now includes (on a per-keyword level):

  • keyword clickthrough rank
  • clickthrough rate at various ranking positions
  • the URL that was clicked on

Trophy Keywords vs Brand Keywords

Even if your site is rather well known, going after some of the big keywords can be a bit self-defeating in terms of the value delivered. Imagine ranking #6 or #7 for SEO. Wouldn't that send a lot of search traffic? Nope.

When you take away the ego searches, the rank checkers, etc., it turns out that there isn't a ton of search volume to be had ranking on page 1 of Google for SEO.

With only a 2% CTR, the core keyword "SEO" is driving less than half the traffic driven by our 2 most common brand search keywords. Our brand might not seem like it is getting lots of traffic with only a few thousand searches a month, but when you have a > 70% CTR that can still add up to a lot of traffic. More importantly, that is the kind of traffic which is more likely to buy from you than someone searching for a broad discovery or curiosity type of keyword.
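
To see why, here is the back-of-the-envelope math with purely illustrative numbers (not our actual analytics):

    # Illustrative figures only - the point is the shape of the math, not the values.
    seo_searches, seo_ctr = 120_000, 0.02       # big "trophy" keyword, mid-page-1 ranking
    brand_searches, brand_ctr = 6_000, 0.70     # a single branded keyword

    print(int(seo_searches * seo_ctr))          # 2400 visits from the trophy keyword
    print(int(brand_searches * brand_ctr))      # 4200 visits - far fewer searches, far more clicks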

The lessons for SEOs in that data?

  • Core keywords & raw mechanical SEO are both quite frequently heavily over-rated in terms of value.
  • Rather than sweating over ranking well for the hardest keywords, first focus on more niche keywords that are easy to rank for.
  • If you have little rank and little work to do then there is lots of time to focus on giving people reasons to talk about you and reference you.
  • Work on building up brand & relationships. This not only gives your link profile more karma, but it sends you a steady stream of leads for if/when you fall out of favor a bit with the search engines.
Those who perceive you well will seek you out and buy from you. But it is much harder to sell to someone who sees you as just another choice amongst many results.

Search is becoming the default navigational tool for the web. People go to Google and then type in “yahoo.” If you don’t have a branded keyword as one of your top keywords that might indicate long-term risk to your business. If a competitor can clone most of what you are doing and then bake in a viral component you are toast.

Going After the Wrong Brand Keywords

Arbitraging 3rd party brands is an easy way to build up distribution quickly. This is why there are 4,982 Britney Spears fan blogs (well 2 people are actually fans, but the other 4,980 are marketers).

But if you want to pull in traffic you have to go after a keyword that is an extension of the brand. Ranking for “eBay” probably won’t send you much traffic (as their clickthrough rate on their first result is probably even higher than the 70% I had above). Though if you have tips on how to buy or sell on eBay those kinds of keywords might pull in a much higher clickthrough rate for you.

To confirm the above I grabbed data for a couple SEO tool brands we rank well for. A number 3 ranking (behind a double listing) and virtually no traffic!

Different keyword, same result

Informational Keywords

Link building is still a bit of a discovery keyword, but I think it is perhaps a bit later staged than just the acronym “SEO.” Here the click volume distribution is much flatter / less consolidated than it was on the above brand-oriented examples.

If, when Google lowers your rank, you still pull in a fairly high CTR, that might be a signal to them that your site should rank a bit higher.

Enough Already!

Enough about our keywords, what does your keyword data tell you? How can you better integrate it to grow your business?

SEO Book.com – Learn. Rank. Dominate.

I recently came across an interesting stream of search traffic.

The demographic using this search stream was one I had no direct experience of previously. I was amazed at the high level of site interaction this group engaged in. It was related to the wedding of two people I'd never previously heard of – T.I. & Tiny. From the names of the people who responded, I determined the traffic was mostly African-American. Pretty obvious given the topic, right?

What was interesting was this group engaged and responded at a much higher level than other groups I was targeting on similar campaigns. It was a reminder of the different ways some demographics choose to participate online, especially when the marketing pitch reflects them.

Target Marketing

Target marketing, otherwise known as market segmentation, is marketing focusing on specific groups of people.

Marketers use demographic profiles to break down groups into a series of traits, such as gender, race, age, income, disabilities, mobility, educational attainment, home ownership, employment status, and location. This helps marketers determine the correct pitch, language and approach to use when trying to appeal to a given audience.

When we use search keyword lists, it’s often easy to lump people who use the same keywords together. However, if we add demographic information into the mix, our marketing can become more focused, which can translate to higher conversions, and higher returns.

For example, according to a recent demographic study, the African-American market makes up 13 percent of the U.S. population and spends more than 0 billion every year. African-American buying power is expected to reach trillion this year. 26 percent of African-American households had incomes of ,000 per year. 64 percent of African Americans—versus 51 percent of Caucasians—spend more on products they perceive as being “the best”. That last piece of information is very useful if you were designing a page to appeal directly to this market.

How about the gay market? This market tends to be affluent. The average annual income for a gay household is ,000, 20.4 percent higher than in a heterosexual household. This group tends to have a high level of education. Some 83 percent of gays and lesbians have either attended or graduated from college. This market is also brand-loyal. Approximately 89 percent of gays and lesbians are brand-affiliated and are highly likely to seek out brands that advertise to them – i.e. advertising that depicts gay lifestyles and models, for example.

How about women? Women make up 51 percent of the US population and influence at least 80 percent of all spending on consumer goods in the United States. By 2010, women are expected to control trillion, or approximately 60 percent of the nation's wealth. Retail stores are designed around women, and it would be interesting to note how women and men may respond differently to the online retail equivalent.

General marketing one-size-fits-all messages may miss such groups. How much advertising language is geared towards white, middle class family groups, for example? That’s fine if a white, middle class family group is the target market, but it pays to be aware of groups we may be missing.

Relevance

Relevance is more than matching a search keyword to page topic.

"Know thy customer" and reflect your audience in your site design, language and pitch. Do your pages reflect your world view, or the world view of your customers? Is there a difference? Can you use keyword terms to identify and segment specific demographic groups? Are there keywords that women are more likely to use than men? Keywords that Hispanics are more likely to use than African Americans? Think about the ways different groups in our society use language.

Your website should hold up a mirror to your target audience, using their language, depicting their lifestyles, and speaking directly to their wants and needs.

Research

In my T.I. & Tiny example, the demographic was pretty obvious. It was easy to picture the fanbase, and adjust the language and pitch accordingly.

For more in-depth demographic information, you could look at census data, available from the US Census Bureau, or your regional equivalent. Check out the County and City Data Book.

Using keyword research tools, look for broad keyword associations to get a feel for language use and associated areas to target.

The Inside Facebook Blog often provides interesting snippets of demographic data about Facebook usage and trends, which will likely be reflected in the wider online community.

Professional data mining companies, such as Nielsen, are great sources, if you have the budget. And if you want to dig even deeper, check out the VALS survey.

SEO Book.com – Learn. Rank. Dominate.

Both Yahoo! and Microsoft have confirmed that they will start testing the Bing algorithm live on some Yahoo! traffic this month. One of the big questions from the SEO perspective is what happens to Yahoo! Site Explorer? If it goes away, then webmasters will need to get link data from web indexes built by SEO companies, perhaps Open Site Explorer and/or Majestic SEO.

Yahoo! also offers a link: search in their BOSS program. While they have stated that the BOSS program will live on, there is little chance of the link: operator working in it over the long run, as Bing has disabled inbound link search.

Blekko Search Engine

Blekko, a soon-to-launch search start-up, doesn't have much to lose by sharing data. In the short run, anything that gains awareness will likely make them money in the long run. And so they are doing just that:

Blekko is also showing just about all the behind the scenes data that they have to determine rank and relevancy. You can see inbound links, duplicated content and associated metadata for any domain in their index.

Blekko will also come with custom slashtags, which users can use to personalize search. An end-user feature for average users? Not sure. But it will be interesting to web developers & power searchers. There are already heated debates in the TechCrunch comments about whether people will use that feature. IMHO the point isn't for it to be an end user service for average searchers, but to be one which generates discussion & builds loyalty amongst power users. And clearly it is working. :D

They are also following the Jason Callus-Anus strategy of anti-SEO marketing (while giving SEOs tons of free data):

The SEO gamers, content farmers and link shoppers are not going to be happy. These guys are flooding the web with content designed to turn a profit, not inform, and the searcher pays the price. One company alone generates literally tens of thousands of pages every day that are solely designed to make money from SEO traffic. Slashtags are the perfect way to bypass them and search only the sites you like.

One more reason the content farmers aren’t going to be happy: we’re opening up all the data that is the core foundation of their business. Link data, site data, rank data – all there for everyone to see. In one fell swoop the playing field just got leveled.

I think a core concept which many search engines have forgotten (in an attempt to chase Google) is that if you have a place in the hearts and minds of webmasters & web developers then they will lead other users to your service.

Money is one way to buy loyalty. And Google will pay anyone to syndicate their ads, no matter what sort of externalities that leads to. But now the web is polluted with content mills. Which is an opportunity for Blekko to differentiate.

Since Yahoo! is a big publisher, they had mixed incentives on this front. They do share a lot of cool stuff, but they are also the same company which just disappeared the default online keyword research tool and replaced it with nothing, and they recently purchased a content mill. This was a big area where Bing could have won. They created a great SEO guide & are generally more receptive to webmaster communications, but they have fumbled following redirects & have pulled back on the data they share. Further, if you look at Bing's updated PPC guidelines, you will see that they are pushing out affiliates and chasing the same brand ad dollars which Google wants. Bing will be anything but desperate for market share after they get the Yahoo! deal in place.

Blekko goes one further than the traditional sense of “open” for their launch. They not only give you the traditional open strategy:

Furthermore, we intend to be fully open about our crawl and rank data for the web. We don’t believe security through obscurity is the best way to drive search ranking quality forward. So we have a set of tools on blekko.com which let you understand what factors are driving our rankings, and let you dive behind any url or site to see what their web search footprint looks like.

but they also offer a “Search Bill of Rights” which by default other search companies can’t follow (based on their current business models):

1. Search shall be open
2. Search results shall involve people
3. Ranking data shall not be kept secret
4. Web data shall be readily available
5. There is no one-size-fits-all for search
6. Advanced search shall be accessible
7. Search engine tools shall be open to all
8. Search & community go hand-in-hand
9. Spam does not belong in search results
10. Privacy of searchers shall not be violated

And so based on the above they appeal to…

  • anyone who submits themselves to the open ideology
  • journalists who hate content mills
  • searchers who hate junk search results
  • SEOs & webmasters who like free data
  • programmers who like to hack and tweak
  • people interested in personal freedom & privacy

From a marketing perspective, their site hasn't even launched yet and there are *at least* a half-dozen different reasons to talk about them! Pretty savvy marketing. :D


SEO Book.com – Learn. Rank. Dominate.

The other day a person contacted me about wanting to help me with ad retargeting on one of my sites, but in order to do so they would have had to track my site. That would have given them tons of great information about how they could retarget all my site's visitors around the web. And they wanted me to give that up for free in an offer which was made to sound compelling, but lacked substance. And so they never got a response. :D

Given that we live in “the information age” it is surprising how little people value data & how little they expect you to value it. But there are still a lot of naive folks online! Google has a patent for finding under-served markets. And they own the leading search engine + the leading online ad network.

At any point in time they can change who they are voting for, and why they are voting that way.

They acquired YouTube and then universal search was all the rage.

Yes they have been pretty good at taking the longterm view, but that is *exactly* why so many businesses are afraid of them. Google throws off so much cash and collects so much data that they can go into just about any information market and practice price dumping to kill external innovation & lock up the market.

Once they own the market they have the data. From there a near infinite number of business models & opportunities appear.

Google recently became the #1 shopping search engine. How did they respond? More promotion of their shopping search feature.

All those star ratings near the ads lead to a thin-affiliate-style, Google value-add shopping search experience. Featured placement goes to those who are willing to share more data in exchange for promotion, and then over time Google will start collecting data directly and drive the (non-Google) duplication out of the marketplace.

You can tell where Google aims to position itself in the long run by what they consider to be spam. Early remote quality rater guidelines highlighted how spammy the travel vertical is with hotel sites. Since then Google has added hotel prices to their search results, added hotels to some of their maps, and they just acquired ITA Software – the company which powers many airline search sites.

Against this backdrop there was an article in the NYT about small book shops partnering up with Google. The title of the article reads like it is straight out of a press release: Small Stores See Google as Ally in E-Book Market. And it includes the following quotes:

Mr. Sennett acknowledged that Google would also be a competitor, since it would also sell books from its Web site. But he seemed to believe that Google would favor its smaller partners.

“I don’t see Google directly working to undermine or outsell their retail partners,” he said. “I doubt they are going to be editorially recommending books and making choices about what people should read, which is what bookstores do.”

He added, “I wonder how naïve that is at this point. We’ll have to see.”

If they have all the sales data they don’t need to make recommendations. They let you and your customers do that. All they have to do to provide a better service than you can is aggregate the data.

The long view is this: if Google can cheaply duplicate your efforts you are unneeded duplication in the marketplace.

Look at the list of business models Google publicly stated they were leery of:

  • ebook sites
  • get rich quick
  • comparison shopping sites
  • travel aggregators

3 out of 4 ain't bad. But even on the one they missed, they still have an AdSense category for it. :D

SEO Book.com – Learn. Rank. Dominate.

While a lot of the attention on AT&T and Apple in the past few weeks has been focused on the release of the iPad and new iPhone,  the elimination of unlimited data plans is an equally important development, especially for website owners and publishers.

Even though the iPhone is meant to render full web pages, research has shown that people still prefer mobile-formatted content on iPhones …

In prior years, AT&T offered unlimited data for a flat monthly price; however, they maintained that a small percentage of users were using a disproportionate share of data. To compensate for this, they announced two new data plans and eliminated the unlimited plan. As I understand it, existing customers are grandfathered until they renew. Upon renewal, they have to choose. Engadget has an excellent breakdown of the details of the plan.

So what does this mean to website owners and publishers? IMHO if you are a publisher, you really need to evaluate your use of rich media and use of a mobile version of your site. If you think that AT&T dropping the unlimited plan is an aberration, you might want to reconsider that position. While free wifi may be on the rise, it’s not as ubiquitous as many in the valley would have you believe. I can find open free hot spots if I really need one, but it isn’t easy. So it’s not unreasonable to expect consumers to start being more conscious of their data use. Additionally, while smart phones and devices like the iPad, Blackberry, or Android can handle some rich media, studies have shown that many users prefer “lite” or mobile websites when on these devices.

From an SEO perspective, creating a mobile website has a few pitfalls to watch out for. In my experience, it's best to avoid using a separate subdomain or subfolder for a mobile version; instead, you want to serve a different CSS version or serve modified content based on mobile user agents. Again, this strategy is tricky if you don't want to look like you are cloaking; however, as long as you serve the same content to Google's mobile crawler as you do to mobile browsers, you will be fine (for more info, see this post from Google's webmaster central team).
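
Here is a minimal sketch of that user-agent approach, using Flask and hypothetical template names; the point is that the URL and content stay the same and only the presentation changes:

    from flask import Flask, render_template, request

    app = Flask(__name__)
    MOBILE_TOKENS = ("iphone", "ipod", "android", "blackberry", "opera mini")

    def is_mobile(user_agent: str) -> bool:
        """Very rough mobile detection based on common user-agent substrings."""
        ua = user_agent.lower()
        return any(token in ua for token in MOBILE_TOKENS)

    @app.route("/<path:slug>")
    def page(slug):
        # Same URL and same content either way - only the template/CSS differs,
        # which keeps you clear of the cloaking concerns mentioned above.
        template = "mobile.html" if is_mobile(request.headers.get("User-Agent", "")) else "desktop.html"
        return render_template(template, slug=slug)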

While using WordPress as a CMS has issues, this is one area in which it works to your advantage: there are multiple plugins to help you address the problem. I use WP Touch, but you can also use WP Mobile. I'm sure there are other plugins or adapters for other CMS systems. Make sure the systems can handle mission critical functions like shopping and ordering. In the month I've owned my iPad, I've made a dozen purchases from it, which I suspect is a growing trend.

To wrap up, here is what I would concentrate on as a publisher:

  • Rethink your use of complex, hard to read layouts that are overflowing with ads or other large-file-size elements and images.
  • Minimize your use of rich media elements to the places where they are most essential. IMHO, at this stage flash is a liability on so many fronts it’s not worth the headache.
  • Avoid using a subdomain or subfolder for mobile content. In addition to being a maintenance point, the potential for duplicate content and split link equity is another liability.
  • Choose your mobile implementation method carefully to avoid creating cloaking issues.

Creative Commons License photo credit: Jilles

This post originally came from Michael Gray, who is an SEO consultant.

Why You Should Care About AT&T’s iPhone Data Plans Even If You Don’t Own an iPhone



Michael Gray – Graywolf’s SEO Blog

Posted by randfish

Many of our keen members observed that late last week, Linkscape’s index updated (this is actually our 27th index update since starting the project in 2008). This means new link data in Open Site Explorer and Linkscape Classic, as well as new metric data via the mozbar and in our API.

Index 27 Statistics

For those who are interested, you can follow the Linkscape index update calendar on our API Wiki (as you can see, this update was about a week early).

Although we’ve now crawled many hundreds of billions of pages since launch, we only serve our uber-freshest index. Historical data is something we want to do soon – more on that later. This latest index’s stats feature:

  • Pages – 40,152,060,523
  • Subdomains – 284,336,725
  • Root Domains – 91,539,345
  • Links – 420,049,105,986
  • % of Nofollowed Links – 2.02%
  • % of Nofollows on Internal Links – 58.7%
  • % of Nofollows on External Links – 41.3%
  • % of Pages w/ Rel Canonical – 4.3%

These numbers continue the trend we've been seeing for some time: internal nofollow usage is declining slightly, while rel canonical usage is down a bit in this index but up substantially since the start of the year (the dip likely has more to do with our crawl selection than with sites actually removing canonical URL tags).

Comparing Metrics from Index to Index

One of the biggest requests we get is the ability to track historical information about your metrics from Linkscape. We know this is really important to everyone and we want to make it happen soon, but we have some technical and practical challenges to overcome. The biggest is that what we crawl changes substantially with each index, both due to our improvements in what to crawl (and what to ignore) and due to the web's massive changes each month (60%+ of pages we fetched 6 months ago are no longer in existence!).

For now, the best advice I can give is to measure yourself against competitors and colleagues rather than against your metrics last month or last year. If you’re improving against the competition, chances are good that your overall footprint is increasing at a higher rate than theirs. You might even "lose" links in a raw count from the index, but actually have improved simply because a few hundred spam/scraper websites weren’t crawled this time around, or we’ve done better canonicalization with URLs than last round or your link rotated out of the top of a popular RSS feed many sites were reproducing.

OpenSiteExplorer Comparison Report
Measuring against other sites in your niche is a great way to compare from index to index

If you've got more questions about comparisons and index modifications over time, feel free to ask in the comments and we'll try to dive in. For those who are interested, our current thinking around providing historical tracking is to give multiple number sets – e.g., # of links from mR 3+ pages, # of links from mR 1-3 pages, etc. – to help show how many "important" links you're gaining/losing; these fluctuate much less from index to index and may be better benchmarking tools.
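
A rough sketch of that bucketing idea (the data shape and thresholds are illustrative, not our production metrics):

    from collections import Counter

    def bucket_links(linking_page_mozranks):
        """Count a site's inbound links by the mozRank of the linking page."""
        buckets = Counter()
        for mr in linking_page_mozranks:
            if mr >= 3:
                buckets["mR 3+"] += 1
            elif mr >= 1:
                buckets["mR 1-3"] += 1
            else:
                buckets["mR <1"] += 1
        return buckets

    # bucket_links([5.2, 2.8, 0.4, 3.1, 1.0])
    # -> Counter({'mR 3+': 2, 'mR 1-3': 2, 'mR <1': 1})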

Integration with Conductor’s Searchlight Software

SEOmoz is proud to be powering Conductor’s new Searchlight software. I got to take a demo of their toolset 2 weeks ago (anyone can request one here) and was very impressed. See for yourself with a few exclusive screenshots I’ve wrangled up:

Searchlight Screenshot 1/4

Searchlight Screenshot 2/4

Searchlight Screenshot 3/4

Searchlight Screenshot 4/4

Conductor's Seth Besmertnik at the Searchlight Launch Event

And at the bottom of the series is Seth Besmertnik, Conductor’s CEO, during the launch event (note the unbuttoned top button of his shirt with the tie; this indicates Seth is a professional, but he’s still a startup guy at heart). Searchlight already has some impressive customers including Monster.com, Care.com, Siemens, Travelocity, Progressive and more. I think many in the SEO field will agree that moving further into software is a smart move for the Conductor team, and the toolset certainly looks promising.

Conductor’s also releasing some cool free research data on seasonality (request form here). Couldn’t resist sharing a screenshot below of the sample Excel workbook they developed:

Keyword Seasonality Excel Workbook from Conductor

mmm… prepopulated

SEOmoz’s Linkscape index currently powers the link data section of Searchlight via our API and we’re looking forward to helping many other providers of search software in the future. We’re also integrated with Hubspot’s Grader.com and EightFoldLogic‘s (formerly Enquisite) Linker, so if you’re seeking to build an app and need link data, you can sign up for free API access and get in touch if/when you need more data.

The Link Juice App for iPhone

We’re also very excited about the popular and growing iPhone app – LinkJuice. They’ve just recently updated the software with a few recommendations straight from Danny Dover and me!

LinkJuice App screenshots (1/2 and 2/2)

The LinkJuice folks have promised an Android version is on its way soon, and since that’s my phone of choice, I can’t wait!

If you’ve got an app, software piece or website that’s powered by Linkscape, please do drop us a line so we can include it. I’ve been excited to see folks using it for research – like Sean’s recent YOUmoz post on PageRank correlations – as well as in many less public research works.

Oh, and if you somehow missed the announcement, go check out the new Beginner’s Guide to SEO! It’s totally free and Danny’s done a great job with it.



SEOmoz Daily SEO Blog