Posted by Sam Crocker
Having access to data and large data sets is something any SEO worth his salt craves. Sure, managing a massive dataset or database can be a bit of a hassle, but having good information is key and there are a handful of uses for other people’s/sites’ data sets that are readily available for purchase online. Big budget linkbuilding isn’t the only way to spend your SEO budget these days!
Let’s take a look at five examples of datasets that you can easily and readily purchase and how you might go about using them.
The true motivation for this article was a chat with Tom… and the fact that you can now get Geocities (yes, seriously, the whole thing) in the form of a 1 TB torrent (thanks, Hacker News). How much is it going to cost you? Just your email address.
Screenshot of GeoCities–izer version of SEOmoz
What would you do with Geocities, you ask? The sky’s the limit with this one, really! I’m not saying you want to use any of the great tickers and beautiful layouts/seizure-inducing colours for which Geocities is now famous.
However, you may very well want to use the huge volume of content, which could quite easily be respun for your own purposes, for a start. Or use the epic designs for mapping out your new site. Up to you!
Why pay for keywords? Well, for starters, because sometimes you may find you have a client that has exhausted the entire set of data available through the AdWords API (yes, this has seriously happened before). If the site is strong enough and you find you’re still able to rank reasonably for long-tail terms post-MayDay, there’s no harm in creating some new content to target the long tail. This isn’t to suggest that you should buy keyphrases and not do the research yourself but, as discussed, more data is almost always better than less.
Out of Words? – Stock Image Provided by Shutterstock
And, most importantly, just because the data isn’t in the API doesn’t mean there isn’t any search volume for it!
Some of the outfits out there selling keywords and keyphrases are:
This sort of thing won’t come cheap, but it can be extremely valuable to the larger sites.
Some of you may be familiar with using 80legs as a tool to crawl and scrape your way through the interwebs. It’s a tool that I’ve not spent nearly enough time with, as I didn’t find it quite as intuitive to use as Mozenda. However, the nice thing about 80legs is that they have compensated for this a bit by offering packaged-up crawls.
The vast majority of the packages cost 0 per month (with the exception of the eBay Motors crawl for 0/month), though the data you could pull off these is extremely valuable and saves you the trouble of doing any of the crawling yourself (handy if your IP has been banned, you naughty SEOs).
Again, these sets could be used for anything from price-comparison to market analysis and right on down to content creation and keyphrase research. If you’re one of the fortunate few working in the space for which these are offered you should definitely have a look.
So, the Twitter Census dataset is just one example of the variety of datasets you can buy from InfoChimps, though the general concept of owning one year’s worth of URLs, hashtags, and smiley usage seems like it could be used in a number of ways. You could create an infographic worthy of a link from the likes of Mashable, TechCrunch, etc.
Or you could use the data to monitor keyphrase usage, common abbreviations, or any other sort of trend in social interaction (it could be a great source of keyphrases as well, as the search engines begin to take signals from social and include it directly in the SERPs). This set is currently priced at 0.
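To make the trend-monitoring idea a bit more concrete, here is a minimal Python sketch of counting hashtag usage across a pile of tweets. Everything in it is an assumption for illustration: the `count_hashtags` helper and the `sample_tweets` list are made up, and the real InfoChimps set will have its own format that you would need to parse first.

```python
import re
from collections import Counter

# Matches a '#' followed by word characters; captures the tag without the '#'.
HASHTAG_RE = re.compile(r"#(\w+)")


def count_hashtags(tweets):
    """Return a Counter of lowercased hashtags found in an iterable of tweet texts."""
    counts = Counter()
    for text in tweets:
        counts.update(tag.lower() for tag in HASHTAG_RE.findall(text))
    return counts


# Hypothetical sample data, standing in for rows loaded from a purchased dataset.
sample_tweets = [
    "Loving the new #SEO tools this week",
    "Five #linkbuilding tips every #seo should know",
    "Is #linkbait dead? #SEO",
]

hashtag_counts = count_hashtags(sample_tweets)
print(hashtag_counts.most_common(3))
```

Lowercasing the tags means #SEO and #seo are counted together, which is usually what you want when hunting for keyphrases rather than exact spellings.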
Rand was being a bit coy about this one, and at time of press I wasn’t able to get a serious price out of them, but there’s a price for everything, right? Any serious bidders should probably get in touch with the SEOmoz team directly…
Along these lines, there are a number of other datasets that do not have a set price but that I’m sure you could get your hands on with enough money and by asking the right people. These would include Backtype API data, Wordtracker, or Amazon’s entire product catalogue. It all comes down to asking the right people, but ultimately anyone with a brain for business and a load of data would sell you their info if you know how to ask for it.
Don’t you just love it when you can get your hands on some awesome free stuff that you never knew you wanted in the first place? Well, thankfully, there are a few datasets that I came across that I thought were worth sharing and could give you some value for free.
Free Stuff! Stock Image from Shutterstock
Feel free to take a gander at these datasets and try to make use of the data! Can you say "infographic ammunition"?
The entire dataset from the New York Stock Exchange from 1970-current (Open, Close, High, Low, Volume).
Massive sets of US Census Data.
And for those of us based over in the UK – huge volumes of UK Government data right at your fingertips.
Other huge datasets to get stuck into: Project Gutenberg for over 6,000 full books available online. These book lists at the very least could be of
One thing that you may have noticed is that large datasets, once you provide them to people, tend to be solid gold for linkbait. We could focus an entire post around this, but if you’ve got access to great data and you’re not offering it out to your users/curious SEOs, what are you thinking?! Publish the data, make it free to download, and require a link back for attribution from anyone who wants to use it. Simples!