We have spent a lot of time talking about link building and social media marketing, but not enough about on-page or website factors. So today we have the pleasure of talking to Alan Bleiweiss, who is considered to be an SEO Audit Master. Alan has spoken at many SEO conferences including Pubcon, SMX, Blueglass and Affiliate Summit. He provides a free SEO Audit Checklist on his website. You can follow Alan on Twitter @AlanBleiweiss.
1) What did you do before you got into SEO?
I started out building web sites in January ’95. From there I went on to head up the Web Development team for an agency in New York. That’s where I got my first experiences with enterprise scale and Fortune 500 clients. Everything was about best-practices professional services. It was at that agency I realized I needed to teach myself coding (ColdFusion), and when I went out on my own, my consulting became a full-service, self-contained offering.
2) How did you get into SEO?
Around 2000, clients started asking me how they could get found in search engines more consistently. Up until that point, it had just been a matter of “create the site and it will magically show up” given how few companies in most markets had a real site. And we had the directories of course, as well as traditional (old-school) marketing where the web site was listed.
3) I understand you just left an agency to begin providing your own consulting services. What was your big motivation behind that?
Motivation? Parent company restructuring. 90% of our work had been for internal parent company projects. When they restructured, we had to look at costs and revenue models. Since it would have required months to refocus into a true outward-serving search agency, we were a “secondary” or “luxury” service to their main corporate mission that didn’t fit the near-term plan. I fully expect Click2Rank to reawaken one day, maybe a year from now. But that depends on how they move through the coming year. And if/when they do, I’ll be around to help consult with what it looks like, if nothing else.
4) You are well known for conducting SEO Audits so what are the top three mistakes people make?
The biggest one isn’t necessarily a mistake as much as not having a full enough grasp of the concept of topical focus. Topical focus really matters. To the point where if you’re not aware of how much it matters, topical dilution occurs. Confusion then ensues when mathematically based algorithms do the evaluation of “what is the primary focus of this page?”. From there, it’s a lack of awareness or lack of acceptance of how seriously the notion of “natural” needs to be considered. Look at the web. Billions upon billions of pages of content. Across millions of sites.
Take any single topic. There could be hundreds or even thousands of sites out there laser focused as far as being truly relevant to a topic. Yet because we all share words across nuanced variations of a topic, you end up with tens of thousands or hundreds of thousands or millions of other pages or sites that “appear” to be about that topic, yet aren’t. Right? That’s why Topical focus is so important. Yet it goes further. The overwhelming majority of content is NOT produced for search engines. It’s produced by people who have clue-zero about SEO.
While the search engines are finally driving the concept that they need us to tailor our site architecture to help them figure it all out, the core of search engine work has always been about how to figure it out – that soup of content and links and presence coming from the bigger web. And rightly they should have and should continue to do so. Except that means “natural” content has to look like it belongs in that soup bowl. “Click here” is lately becoming a buzz phrase in the SEO community because people think they have finally realized that anchor text really has to be more diversified.
So where did that “click here” concept come from? Yeah. The soup bowl.
Of course it’s much more complex than that – so many factors. Anchor text is just one example. But that’s the new frontier – people finally waking up thanks to Google’s efforts, to the notion that their sites have to be able to look like they belong in the soup bowl. It’s something that IR (Information Retrieval) and Semantic Web experts have known all along, or at least saw coming years ago. That’s why we haven’t ever had the suffering that so many in the industry have had to deal with.
The last issue is webmasters who for years have tried to “follow the rules” and got burned. That came from their not realizing those issues are so important and not realizing that SEO really is a complex art unto itself, from thinking that they could “just figure it out” themselves, or from believing they had enough information to run with after they “read an article by guru X”.
5) What tools do you use to search a website for duplicate content? Any good tips for finding duplicate pages or near duplicate pages?
Let’s start with the less obvious tool. Google’s site: operator. Just this week I ran it and found a LOT of duplicate content all perpetuated by the client themselves. Run “site:domain.com”, then “site:www.domain.com”, then “site:domain.com -www”. You’d be surprised how many sites have multiple subdomains that are pure duplicate content of the main site. Test sites. Dev sites. Some sort of promotion gone very wrong sites. Horrific architecture induced sites. And more… Facepalm inducing duplication on massive scales…
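As a small illustration of the three searches just described, here is a minimal Python sketch that builds the query strings for a given domain. The function name `site_queries` and the example domain are hypothetical, and the sketch only produces text to paste into Google; it does not query any search engine.

```python
def site_queries(domain):
    """Build the three site: searches described above for one domain."""
    bare = domain.strip().lower()
    if bare.startswith("www."):
        bare = bare[4:]  # work from the bare domain so all three queries line up
    return [
        f"site:{bare}",       # the domain as a whole, subdomains included
        f"site:www.{bare}",   # only the www host
        f"site:{bare} -www",  # the exclusion form quoted in the interview
    ]


for query in site_queries("domain.com"):
    print(query)
```

Comparing the result counts and the hostnames that appear for each query is what surfaces stray test, dev, or promotional subdomains.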
Then of course, I go and look for a product details page. Any page will do on some sites. Copy a snippet of the product’s description, and search for that, in quotes. Not a huge amount of effort needed to uncover cross-site duplicate content there. And by only copying a snippet (one full long sentence for example) and this time doing it without the quote marks, you can often find near duplicate content by seeing all the bolded text show up splattering the SERP snippets Google brings back.
I also do that with services page descriptions for service based sites, because I can’t tell you how often someone in a company who was charged with content just went out on the web, found what the other guys were doing, and ripped it completely. [headdesk].
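Once the snippet search turns up a suspicious page, the two pieces of copy can be compared directly. The sketch below is one possible way to do that locally, using word shingles and Jaccard similarity; this is a swapped-in technique for illustration, not something described in the interview and not a claim about how Google scores duplication. The function names and the shingle size of 5 are assumptions.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a, b, size=5):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


ours = "the quick brown fox jumps over the lazy dog today"
theirs = "the quick brown fox jumps over the lazy cat today"
print(similarity(ours, theirs))
```

A score near 1.0 suggests copied text, while a middling score on long descriptions points to the lightly reworded near duplicates mentioned above. Any cut-off is a judgment call.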
Also check for http vs. https, international versions of sites, mobile versions, listing products in sixteen categories (so our customers can find them)… Duplicate content is a lot more prolific than many people realize. And you need to kill that off because it really is not wise to trust Google to “figure it out” as far as “this is my main site so ignore the rest”.
Run an on-site search. Wow. Look at that. Duplicate content sixty-fold. Because you’re not blocking search results from being indexed. OMG.
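The http vs. https and www vs. non-www duplication mentioned above can be spotted in any list of crawled URLs. The following is a minimal sketch under stated assumptions: the function names are hypothetical, a trailing slash is treated as the same page, and other subdomains (mobile, dev, international) are deliberately left as separate keys.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def variant_key(url):
    """Collapse scheme and a leading www. so variants of one page share a key."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"  # assumption: trailing slash is the same page
    query = "?" + parts.query if parts.query else ""
    return host + path + query


def duplicate_groups(urls):
    """Return only the keys reached by more than one distinct URL."""
    groups = defaultdict(set)
    for url in urls:
        groups[variant_key(url)].add(url)
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


crawled = [
    "http://domain.com/widgets/",
    "https://www.domain.com/widgets",
    "https://domain.com/about",
]
print(duplicate_groups(crawled))
```

Each group in the output is a set of addresses that serve one page and should resolve to a single version.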
6) How important are “heading tags” and do you look for them on each page of the website?
In the overall scheme of things, many “factors” are minor. So it’s a matter of “how many really big issues have I found” for a particular site as to whether I’ll even bother including “You aren’t properly seeding H1 tags”. Because let’s be very real here. If you brought me in for an audit, it’s quite likely your site is going to end up needing a massive overhaul. Yeah I’m mean like that.
No – seriously – header tags are important if for no other reason than they’re a best practice – usability (especially for visually impaired visitors) is the core foundation to all we do. Not search engines. Then yes, header tags DO contribute to helping strengthen the topical focus of a page. So I DO advocate them. Properly nested header tags. Where the main H1 isn’t wrapped around your site’s logo across your entire site. < sigh >.
Given that I could very well task a site to have its entire IA overhauled, I would throw header tags into that overhaul process. But if I’m recommending URL changes, topical regrouping, major server changes, a comprehensive community outreach campaign, and other 30,000 foot recommendations, I likely won’t detail header tag fixes in the audit. I’ll just refer a client to read my SEO for Content Writing ebook which includes the “simpler” page level stuff and tell them their writers need to follow that to the letter.
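For the “properly nested header tags” point above, a quick page-level check can be sketched with Python’s standard `html.parser`. The class and function names are hypothetical, and the two rules checked (exactly one H1, no skipped levels going down) are assumptions about what “properly nested” means here, not rules quoted from the interview.

```python
from html.parser import HTMLParser


class HeadingAudit(HTMLParser):
    """Collect heading levels (1 to 6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # Tags arrive lowercased; h1..h6 are two characters, "h" plus a digit.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_problems(html):
    """Return a list of human-readable nesting problems for one page."""
    audit = HeadingAudit()
    audit.feed(html)
    problems = []
    h1_count = audit.levels.count(1)
    if h1_count != 1:
        problems.append(f"expected one h1, found {h1_count}")
    for prev, cur in zip(audit.levels, audit.levels[1:]):
        if cur > prev + 1:
            problems.append(f"h{prev} is followed by h{cur}, skipping a level")
    return problems


print(heading_problems("<h1>Widgets</h1><h3>Blue widgets</h3>"))
```

An empty list means the page passes both checks. It says nothing about whether the heading text itself supports the page’s topical focus.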
7) Subdomains can cause a variety of issues with the main domain. What are the most common problems you encounter and what are your recommendations to fix them?
See my above reply regarding duplicate content. That’s the biggest offender when it comes to subdomains. From there, once again (see the early part of my interview responses), it’s a matter of people not grasping topical focus issues, or on a more refined scale, topical intent factors. People butcher the use of subdomains for a host of reasons. My recommendation is always going to vary depending on a) what’s the best practice for this unique situation, coupled with b) how this aspect of my potential recommendation fits into this unique client’s budgetary, resource and internal political constraints as relates to whether they can even ever get to this work.
I know that’s a non-answer; however that’s how much I consider what I recommend given how many clients I have done audits for. However, generally speaking, you should ONLY use subdomains if you have the counsel of a true Information Retrieval expert at your side, AND you have the resources to drive each subdomain with the same effort, weight and leverage you would a stand-alone site. Fail on either account and you might get lucky. Or you might shoot yourselves in the foot. The more likely of the two outcomes.
Seriously though – I routinely implement successful subdomain strategies but it’s a “big kids” tactic that you need to be able to support the correct way. So that’s why it’s best to go fully self-contained for the overwhelming majority of site owners. Half measures are the issue.
8) What are your thoughts on internal anchor text and internal link structure?
User Experience, combined with “what would it need to look like to fit in that soup bowl?”. So kill that massively super-over-saturated and polluted main site navigation with 800 options in dropdowns or slide-outs. STOP using exact match anchors within your content. Kill off 80% of your nonsense “related products” links, sidebar boxes, in-content “similar” widgets, and anything else that isn’t laser focused on refined topical focus. Reduce your section level sub-navigation down to a bare minimum that only links to the stuff that really is tightly related.
And one last word to a select handful of readers. Stop wasting time with page sculpting / 1st link counts / nofollow / noindex follow nonsense. Really. Yeah – you know who you are – it’s time to get with the program the rest of us got with years ago about page sculpting. That’s a shiny object you’ve been blinded by for way too long. YES, each of those has its place in a best practices world. Not for the reasons you think though. So stop it. Just stop.
9) What are your best practices for image optimization? Do you still think it’s important?
Yes, it’s still important. However it has always ONLY been important in some situations. Unless you talk about true best practices given how image optimization helps visually impaired visitors, which means it’s important always. Other than that specific reason though, in many situations it’s a second tier “luxury” task. The bigger your site, the more images you have, the more relevant images are to your business, or the more likely images play a key role in conversion – sure – the more important image optimization becomes strictly from an SEO perspective.
And IF image optimization is important to your site, get it back into your main site’s domain. Rip it OFF the cloud. Or OFF that subdomain. Because your site speed matters. Duh.
10) The robots.txt file can be a very powerful way to help control what pages are indexed. Can you provide a few examples of the type of pages people should block via the robots.txt file?
Site search. Please. I implore you – block your site’s internal search environment. Get it into a folder in the URL structure so you can block that entire folder. Development areas within your site should also be blocked. Whether that’s a subdomain or subfolder. Block that sucker. Fast. Before Google’s index implodes trying to determine what’s to be indexed or ranked. Wow. Can’t tell you how many development and test sites I find polluting the indexing of a client site…
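To make the folder-blocking advice concrete, the sketch below holds a small robots.txt as a string and uses Python’s standard `urllib.robotparser` to confirm which URLs it blocks. The folder names `/search/` and `/dev/`, the example URLs, and the helper `is_blocked` are all hypothetical; substitute whatever folders the site actually uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: site search and a development area each live
# in their own folder, so one Disallow line covers each of them.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /dev/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_blocked(url):
    """True if the rules above disallow crawling this URL for any crawler."""
    return not parser.can_fetch("*", url)


for url in (
    "https://www.domain.com/search/?q=widgets",
    "https://www.domain.com/dev/index.html",
    "https://www.domain.com/products/widget",
):
    print(url, "blocked" if is_blocked(url) else "allowed")
```

A robots.txt file applies only to the host it is served from, so a development subdomain needs its own file at its own root rather than a rule in the main site’s file.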
11) What are your go-to tools that you use day in and day out to perform website audits?
Excel, Open Site Explorer, Screaming Frog, Google Webmaster Tools, Google Analytics, Tools.Pingdom.com, URIValet.com, Google search, and Raven Tools (when I want a different angle view on things). Also check out Bing’s Phoenix (Bing Webmaster Tools): if you have access to that, you’re going to get a wealth of info as well. And please let’s not forget about my brain, or that of my network of industry experts who have more experience in any given aspect of those clients where I need someone’s expert take outside my own…
12) The biggest low hanging fruit we come across are 404 pages that have links to them. At one point the client changed or took down the URL that has links pointing to it. How do you identify links that go to 404 pages? Do you use a tool?
Google Analytics error section and Screaming Frog.
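Whichever tool produces the crawl data, the filtering step is the same: keep the URLs that return 404 and still have links pointing at them. Here is a minimal sketch that does this over a CSV export. The column names `Address`, `Status Code` and `Inlinks` are assumptions for illustration and may not match the export of any particular tool, so pass the real column names as arguments.

```python
import csv
import io


def broken_linked_pages(csv_text, url_col="Address",
                        status_col="Status Code", links_col="Inlinks"):
    """Return (url, link_count) for 404 pages that still have links, most-linked first."""
    rows = csv.DictReader(io.StringIO(csv_text))
    broken = []
    for row in rows:
        link_count = int(row[links_col] or 0)  # treat a blank cell as zero links
        if row[status_col].strip() == "404" and link_count > 0:
            broken.append((row[url_col], link_count))
    return sorted(broken, key=lambda item: item[1], reverse=True)


# Hypothetical export data, for illustration only.
sample = (
    "Address,Status Code,Inlinks\n"
    "https://domain.com/a,200,5\n"
    "https://domain.com/old,404,12\n"
    "https://domain.com/gone,404,0\n"
    "https://domain.com/moved,404,3\n"
)
for url, count in broken_linked_pages(sample):
    print(count, url)
```

Sorting by link count puts the pages most worth redirecting or restoring at the top of the list.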
I want to thank Alan for taking time out of his day to answer some questions about website audits. Thank you Alan!