A few weeks ago Danny showed us some of the basics for image SEO, a medium that may not initially seem valuable for SEO purposes. Well, Danny dispelled that illusion swiftly, with a little help from his friend Doc Brown. This week, Danny's out there alone but still manages to show us that words aren't all they're cracked up to be; videos can yield some great SEO value, too. Besides giving us proven and actionable suggestions, Danny also postulates on some experimental and potential ways to optimize images that may prove useful now and in the future.

Video Transcription

Hello, everybody. My name is Danny Dover. I work here at SEOmoz as the lead SEO. On today’s Whiteboard Friday, I’m going to tell you about the basics of image SEO. We found, when we were doing correlation analysis, that images and specifically the alt text that’s inside of them are a remarkably well-correlated metric for SEO. Besides just being useful for people, images are also, it turns out, useful for search engines. I think part of the reason behind that is that pages that are well developed tend to also have images on them because they help portray information in a way that text-based content can’t.

Let me go over some of the important factors with image SEO. Number one, I already mentioned this a little bit, is alt text. Alt text is the text that you provide for an image in case it can’t be displayed. Maybe the image is gone or maybe someone is using a program that can’t display images. This is the text that takes its place. So it makes a lot of sense from an SEO perspective that this metric is going to be important because it’s how you tell the search engines and other technologies what the image represents. With these, I recommend keeping them below about 140 characters. It’s a rough rule of thumb. Also, have them be descriptive and in line with what you’re trying to target for that page.
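
As a rough illustration of the alt text advice above, here is a minimal sketch that scans a page's HTML for images with missing or overly long alt text. It uses only Python's standard library. The names `AltTextAuditor` and `audit_alt_text` are made up for this example, and the 140-character limit is just the rule of thumb from the talk, not a documented search engine threshold.

```python
from html.parser import HTMLParser

MAX_ALT_LENGTH = 140  # rough rule of thumb from the talk, not a hard limit


class AltTextAuditor(HTMLParser):
    """Collects <img> tags whose alt text is missing, empty, or too long."""

    def __init__(self):
        super().__init__()
        self.problems = []  # list of (src, reason) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "(no src)")
        alt = attributes.get("alt")
        if alt is None or not alt.strip():
            self.problems.append((src, "missing alt text"))
        elif len(alt) > MAX_ALT_LENGTH:
            self.problems.append((src, "alt text longer than 140 characters"))


def audit_alt_text(html):
    """Return the list of problem images found in an HTML string."""
    auditor = AltTextAuditor()
    auditor.feed(html)
    return auditor.problems


if __name__ == "__main__":
    page = '<img src="apple.jpg" alt="Red Fuji apple"><img src="1299254400.jpg">'
    for src, reason in audit_alt_text(page):
        print(src, "->", reason)
```

A check like this only tells you whether alt text exists and how long it is. Whether it is descriptive and in line with the page's target still takes a human read.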

Number two is the file name. This works off the exact same principles. The file name is also information you give directly to the search engines and to other technologies to identify what the image is about. I would wager, if you will, that the file name is probably a rougher signal than the alt text. Alt text, from my experience, is not there all the time. In fact, it’s left out many times, which is bad for SEO. But when it is included, it tends to be a clearer signal than a file name, which a lot of times is just algorithmically generated from the timestamp, so it’s just a bunch of numbers.
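
To make the "just a bunch of numbers" point concrete, here is a small sketch that flags file names that look auto-generated. The function name `looks_auto_generated` and the list of camera-style prefixes are assumptions made for this example. It is a heuristic for spotting candidates to rename, not a description of how any search engine reads file names.

```python
import os
import re

# An optional camera-style prefix, then nothing but digits and separators.
# The prefix list is an illustrative assumption, not an exhaustive one.
AUTO_NAME = re.compile(r"(?:img|dsc|dscn|image|photo)?[\d_\-\s]+", re.IGNORECASE)


def looks_auto_generated(filename):
    """Return True when the name (ignoring the extension) carries no real words."""
    stem, _extension = os.path.splitext(os.path.basename(filename))
    return AUTO_NAME.fullmatch(stem) is not None


if __name__ == "__main__":
    for name in ["1299254400.jpg", "IMG_20110304_1530.JPG", "red-fuji-apple.jpg"]:
        print(name, looks_auto_generated(name))
```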

Number three is the surrounding text. I think a lot of people don’t think about this when they’re thinking about image SEO. The text around an image tells a lot about the image itself. This makes sense, right? You’ll see a lot of times where images will be on a blog post and you’ll have a caption describing the image. This is just another signal telling the search engine and other people and technologies what it is this image is about. The surrounding text can either be a caption, like you’ve seen traditionally, or it can just be the paragraphs around the image. A lot of times an image will be used to supplement what the textual information is talking about. So the surrounding text is very important.

Fourth, as with all SEO, inbound links are important. It wouldn’t necessarily be inbound links to the image URL, although it could be, but what I mean in this context is links going to the page that has the image embedded on it. Just like in normal SEO, the anchor text of those inbound links, where they’re coming from, and how many of them there are, are all really important factors for image SEO and for SEO in general.

Last is number five which is human categorization. The search engines, especially at the beginning when they were developing this image recognition software, used humans. They would hire people and they’d say, "Label this." Google was semi-famous for creating this game, Google Image Labeler, which I think you can still find online, where it would show you an image of, say, an apple. They would ask you in Family Feud style, which is a game show here in the States, to list words that are associated with that object. You’d say something like apple, and you’d earn points if someone else also said apple. Maybe it’s red, Fuji, or Granny Smith, or whatever it is. So other words that are associated with the image. And that way they could train their software to start to understand what general shapes and ideas mean within images.

On the other side here, I have some more theoretical things that search engines may be using, while the things on this side are the things that we know they’re using. We’ve heard search engineers talk about this. We’ve seen direct evidence. These are things that I think you should pay attention to but probably just going forward. It’s more just for your knowledge rather than for you to use on your day to day.

The first one is OCR. OCR stands for optical character recognition. It’s a very established software. It comes in a lot of Adobe products. You can get it in lots of places. What it does is it scans an image and can identify characters in it, characters like letters or numbers or spaces or whatever. From that, you can take actual text out of images. Again, this is a very popular software. It seems very likely to me that search engines are using this at least to some degree. It would be very costly for them from a resource perspective to use on every image on the Internet, but it would certainly make sense if they were using it on some or at least playing around with the technology.

Number two is color analysis. It’s very easy from a development perspective to identify at least one color, maybe the primary color, within an image. You pick a pixel and you see what the hex code is, or whatever it is that you’re measuring that in; it will be based on file type. It’s pretty easy to get a general idea of what the color of an image is. This is helpful from a design standpoint if you’re looking for certain color themes that go with each other or color patterns. Now we’ve seen this actually in the SERPs. If you go to Google image search, and Bing actually had this first, you can pick to see only images that are of a certain color. Black and white is the obvious one, but then other colors as well.
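
Here is a minimal sketch of the "pick a pixel and read its hex code" idea. To stay self-contained it works on a plain list of (red, green, blue) tuples rather than a real image file; in practice an imaging library would have to decode the file into pixel data first. The function names and the bucket size of 32 are assumptions chosen for illustration, and nothing here is meant to describe what the search engines actually run.

```python
from collections import Counter


def to_hex(pixel):
    """Convert an (r, g, b) tuple with 0-255 channels to a hex code."""
    r, g, b = pixel
    return "#{:02x}{:02x}{:02x}".format(r, g, b)


def primary_color(pixels, bucket=32):
    """Return the most common color as a hex code.

    Each channel is rounded down to a multiple of `bucket` first, so that
    near-identical shades are counted together.
    """
    counts = Counter(
        tuple((channel // bucket) * bucket for channel in pixel)
        for pixel in pixels
    )
    color, _count = counts.most_common(1)[0]
    return to_hex(color)


if __name__ == "__main__":
    # Three reddish pixels and one blue pixel stand in for a tiny image.
    sample = [(250, 10, 10), (245, 5, 20), (240, 0, 0), (10, 10, 250)]
    print(to_hex(sample[0]))
    print(primary_color(sample))
```

Grouping shades into buckets is one simple design choice among many. It trades precision for a more stable answer on photos, where almost no two pixels share exactly the same value.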

Number three is file size and type. This one, I think, is more all about the extreme. If the image is ridiculously big, it’s probably not going to get indexed just because the search engines don’t want to spend the resources on that. The exception to that would be if it’s ridiculously well linked to also. It’s about finding these outliers. You probably don’t want to have an image that’s really, really big. It’s probably not going to get indexed. Again, I think what it really comes down to is this is hurtful for users also because they’re going to have to spend time downloading that. If bandwidth is a concern, they’re probably going to click away to begin with. Image size and along with image type, the standard image things are all probably fine for Google.

I’ve heard just a rough rumor here that JPEG is preferred, but honestly GIFs and PNGs and all those other things are probably fine. I would not worry about those aspects. Only worry about it if you’re using obscure file formats, which you shouldn’t be doing to begin with.
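
Since this factor is about outliers, a quick check for extreme file sizes and obscure formats could be sketched as follows. The 5 MB cutoff is purely hypothetical, since the talk gives no number, and the extension list simply reflects the formats named above. The name `image_warnings` is made up for this example.

```python
import os

COMMON_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png"}  # formats named in the talk
MAX_BYTES = 5 * 1024 * 1024  # hypothetical cutoff; the talk gives no number


def image_warnings(filename, size_in_bytes):
    """Return a list of warnings for an image that looks like an outlier."""
    warnings = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in COMMON_EXTENSIONS:
        warnings.append("uncommon file format: " + (extension or "none"))
    if size_in_bytes > MAX_BYTES:
        warnings.append("very large file")
    return warnings


if __name__ == "__main__":
    print(image_warnings("red-fuji-apple.jpg", 200_000))
    print(image_warnings("orchard-scan.tiff", 10 * 1024 * 1024))
```

For files on disk, `os.path.getsize` can supply the size argument.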

The last one on here is the other images on the page. This is twofold. The first part is that the other images on the page are likely related to the given image, because they’re on the same page. Right? The other part, and I see this happen a lot especially with bigger clients, is when you put lots and lots of images on one page, like an image gallery, those pages tend to be very hard to get indexed. The reason for that is there’s not a lot of unique textual content. A lot of times it’s just overwhelming to users. It doesn’t provide a lot of benefit in a search result.

That’s all the time I got today. I appreciate you listening to this. Please feel free to ask questions in the comments below. Thank you.

