The Danny Sullivan Klout Effect

For the longest time, I have used my one-word name “jefflouella” as my username for every social media outlet. While I dig the “my name is my address” aspect of this approach, my content is not focused by any means. These accounts are about me, including work, family, and hobbies. If you know me, you know I have a ton of hobbies.

I’ve been getting the feeling that when I post about SEO one minute and home brewing beer the next, I lose followers who are only interested in one or the other. Since SEO is my livelihood, I decided to break off and start a new Twitter handle and website focused on SEO. This article is currently posted on the new site, www.thetechseo.com, and my new SEO-related Twitter handle is @TheTechSEO.

Following Klout Scores

I’m not super crazy about followers and influence on Twitter. I have friends who freak out when they lose a single Twitter follower. While I am not in competition with anyone, I do like to see how I am progressing. I guess this is more of a confirmation that I am not just spinning my wheels.

Since my @TheTechSEO handle is so new, I have very little Klout with it. The account is about two weeks old, I have about 65 followers, and I have posted fewer than 30 tweets so far.

On Friday, June 15th, 2012, I posted a response to a tweet from Danny Sullivan @dannysullivan. It went something like this.

https://twitter.com/TheTechSEO/status/213683005946994689

https://twitter.com/TheTechSEO/status/213684277718687745

Over the weekend, I didn’t think too much of this. I was excited to get a response from Danny Sullivan, who is a Twitter power user and industry thought leader, but that was it. Then this morning, during my Monday morning run-through of analytics and site QA, I found something interesting: my Klout Score had doubled.

Klout Score - Danny Sullivan Effect

And so did some other numbers.

Klout Score - Danny Sullivan Effect

Klout Conclusion

While my score doubled in many categories after Danny Sullivan tweeted at me, I am starting from nothing. This one exchange did not put me at the top of the Klout charts by any means, but I found the data very interesting. I know this post isn’t directly about SEO, but a measure of influence like Klout could eventually become the new PageRank.

Bottom line, Klout is about influence and Danny Sullivan is an influential dude.

Screaming Frog SEO Spider 1.9 Review

Summary

Pros:

  • Cross-platform (Windows, Mac, Linux)
  • Updated regularly
  • Great feature set

Cons:

  • Struggles on very large sites
  • Pricey – £99 (about $150 USD) per year

Where to get:

Screaming Frog Official Site

For an SEO who works on large-scale websites, a good spider is key to finding hidden issues on a site. Many of the sites I work on have tens of thousands of pages, and it is almost impossible to hit up every page one at a time. A good SEO spider is a must-have tool for a technical SEO, but is Screaming Frog SEO Spider the answer?

I am on a constant lookout for new spidering software. Many of the free SEO crawling tools on the market are good, but not great, and they have scalability and stability issues. When I downloaded Screaming Frog SEO Spider over a year ago, I found a good spider that has been continually getting better. Unlike many of the free crawlers that have not been updated in years, Screaming Frog SEO Spider sees updates almost every quarter. Active development is very important to me when selecting a tool.

I have recently moved to a 100% Macintosh environment, and one beautiful thing is that Screaming Frog SEO Spider is cross-platform, working on Windows, Mac OS X, and even Linux. No other desktop SEO spider on the market is this OS-friendly.

What does this SEO Spider Do?

Screaming Frog pulls a plethora of data with its crawl. This data ranges from on-page content to server responses. Each data point can be broken down into different reports that can be exported as raw CSV files. These reports include:

  • Errors – Client & server errors (No responses, 4XX, 5XX)
  • Redirects – (3XX, permanent or temporary)
  • External Links – All followed links and their subsequent status codes
  • URI Issues – Non-ASCII characters, underscores, uppercase characters, dynamic URIs, and URLs longer than 115 characters
  • Duplicate Pages – Hash value (MD5 checksum) lookup for pages with duplicate content
  • Page Title – Missing, duplicate, over 70 characters, same as h1, multiple
  • Meta Description – Missing, duplicate, over 156 characters, multiple
  • Meta Keywords – Mainly for reference as it’s only (barely) used by Yahoo.
  • H1 – Missing, duplicate, over 70 characters, multiple
  • H2 – Missing, duplicate, over 70 characters, multiple
  • Meta Robots – Index, noindex, follow, nofollow, noarchive, nosnippet, noodp, noydir, etc.
  • Meta Refresh – Including target page and time delay
  • Canonical link element & canonical HTTP headers
  • X-Robots-Tag
  • File Size
  • Page Depth Level
  • Inlinks – All pages linking to a URI
  • Outlinks – All pages a URI links out to
  • Anchor Text – All link text. Alt text from images with links
  • Follow & Nofollow – At link level (true/false)
  • Images – All URIs with the image link & all images from a given page. Images over 100kb, missing alt text, alt text over 100 characters
  • User-Agent Switcher – Crawl as Googlebot, Bingbot, or Yahoo! Slurp
  • Custom Source Code Search – The spider allows you to find anything you want in the source code of a website, whether that’s analytics code, specific text, or other markup. (Please note – this is not a data extraction or scraping feature yet.)
  • XML Sitemap Generator – You can create a basic XML sitemap using the SEO spider.

When I first start to look at a client’s website, I pop open Screaming Frog and run the site through it. The speed and ease with which the crawler moves through the site is one sign I use to determine how well the site is built.

Being the Tech SEO, I look at the technical elements of the site first: errors, redirects, URL structure, robots tags, and canonicals. Screaming Frog SEO Spider lets me get a quick look at the health of an entire site.

Screaming Frog also allows me to look at some of the more important content elements such as title, headers, and description. At a glance, I can sort content to help find duplicate content issues.

One of my favorite features is the Custom Source Code Search. I tend to use this feature to find pages such as soft 404s and pages containing specific code, like analytics scripts.
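
If you want to run the same kind of check outside the spider on a handful of URLs, a short script will do. Below is a minimal sketch, not Screaming Frog’s own feature: the URLs, the analytics ID, and the “Page not found” phrase are hypothetical placeholders you would swap for your own.

```python
import requests

# Hypothetical inputs: swap in your own URLs and search strings.
urls = [
    "https://www.example.com/",
    "https://www.example.com/products/widget",
]
must_contain = "UA-XXXXXX"           # e.g. an analytics account ID
must_not_contain = "Page not found"  # a phrase that signals a soft 404

for url in urls:
    html = requests.get(url, timeout=10).text
    if must_contain not in html:
        print(f"Missing analytics snippet: {url}")
    if must_not_contain in html:
        print(f"Possible soft 404 (error copy on a live page): {url}")
```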

Where does the SEO spider fall short?

Overall, this is a great SEO spider, but I have had some issues with sites that are over 100,000 pages. It is possible to allocate more memory to the program itself to help combat the large-site issue.

I increased my install’s memory allocation to 4GB, and that seemed to work well. Occasionally, when I hit a site that is too large and has too many problems, Screaming Frog can slow down dramatically. When this happens, sometimes saving the project helps. Other times, the program freezes while trying to save.

Screaming Frog recommends crawling very large sites in sections. This is possible within the SEO Spider’s configuration using regular expressions (regex). Not all sites’ URL structures are set up nicely, though, so this may not even be possible.
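
To give a rough idea of the kind of include pattern this sectioned crawling relies on, here is how a regex that restricts a crawl to a single directory behaves. The domain and paths are made up; the exact pattern depends entirely on your own URL structure.

```python
import re

# Hypothetical include pattern: only URLs under /mens/ on example.com.
include = re.compile(r"^https?://www\.example\.com/mens/.*")

candidates = [
    "https://www.example.com/mens/shirts/blue-oxford",
    "https://www.example.com/womens/dresses/summer",
    "https://www.example.com/mens/",
]

for url in candidates:
    status = "crawl" if include.match(url) else "skip"
    print(f"{status}: {url}")
```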

I’ve started combating this by only allowing the spider to crawl the first 75,000 pages. That gives me a large enough sample to identify issues that apply globally across a site.

A feature I would love to see in Screaming Frog SEO Spider is more advanced reporting. While the raw CSVs allow me to pull the data I need, some dashboard-level reporting could bring this product to the next level. Some ideas would be percentages of things like broken links, missing alt text, and pages missing headers. I am looking for high-level data I can report to my clients on a monthly basis to show progress.
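
In the meantime, that kind of dashboard number is easy enough to roll up yourself from an export. Here is a minimal sketch assuming a hypothetical CSV of image data with an “Alt Text” column; check the file name and headings against whatever your export actually contains.

```python
import csv

# Hypothetical export: a CSV of image data with an "Alt Text" column.
missing = 0
total = 0
with open("images_export.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        total += 1
        if not row.get("Alt Text", "").strip():
            missing += 1

if total:
    print(f"{missing}/{total} images missing alt text ({missing / total:.1%})")
```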

On top of the dashboards, I would like to be able to save my crawl information to do comparative reporting week over week, month over month, and year over year. SEOmoz’s Pro Campaign does something like this as a web app. It would be great if Screaming Frog could do the same on the desktop.

Overall Impression

Screaming Frog SEO Spider is my go-to tool when I start evaluating a website. While there are some limitations to what it can do, it is one of the top SEO spiders on the market. With its continual quarterly updates, the spider is only going to get better, faster, and more feature-rich. If you have the £99 (about $150) a year to spend on an SEO spider, Screaming Frog is a good choice.

How Technical SEO Procrastination Hurts Your Redesign Effort

Throughout my career, I’ve worked on web design, development, and marketing teams and have touched almost every aspect of a website relaunch. Today, as an SEO, I see the same common technical SEO mistakes made over and over again. This article explores those mistakes and discusses why it is so important to plan ahead and implement certain elements at the beginning of the process rather than waiting to fix them in phase 2.

There are different levels of a site redesign. These range from a minor cosmetic update to a complete replatforming. Every time a web developer opens up the code, there is an opportunity to optimize for the search engines. Unfortunately, most web developers simply do not have the time, the budget, or the requirements for technical SEO.

Typical Redesign Process

Web Design Process Diagram

In a typical redesign engagement, a kickoff meeting starts the whole process. This is where the ideas fly. “The site needs to be social,” “we need a way for our clients to contact us,” and “we’d like beautiful photography on the homepage” are typical ideas you may hear at a kickoff meeting.

The information architects build the taxonomy and wireframes, the designers bring in the emotional connections, and the developers get the site up and functional in the time they have been allotted (which is usually when the project is mostly over budget and their hours have been cut). The site launches and there is a big party. A week later, someone from analytics reports that conversions are up, but natural search traffic is way down.

How could this be? The site went through three rounds of design, development, and QA, and it looks amazing. We even ran a usability study and passed with flying colors. We are using jQuery and other cool DHTML effects that our users love. Why does Google hate us so much?

The good news is that Google doesn’t hate you. The bad news is that, during that initial kickoff meeting, SEO was not a big focus. Even when it is, the discussion is usually just about keywords and content. I’ve had clients cut part of the SEO budget to buy better stock photography or to integrate a new analytics system. SEO just falls by the wayside.

Coming from a development background and believing in a more agile philosophy of continual updates, I know there are a lot of things you can do to improve technical SEO after a site is launched: content tweaks, link building, and internal links are just a few. On the other hand, there are a few elements that should be set before any other development is in place. These are URL structure, redirects, server headers, and code structure.

Four Elements to Lock into Place before a Site Redesign

URL Structure

If the site updates are purely design-based and every page URL will remain the same, you should be all right. If you are changing taxonomies, CMS systems, or programming languages, chances are your URLs are going to need to change too.

To me, this is the first aspect of a new site’s architecture to approach with SEO in mind. While it is easy to change URLs, it’s not so easy to have Google reassign the value from the old URL to the new URL. If you change a URL from “URL A” to “URL B,” then a couple of weeks later from “URL B” to “URL C,” all the value from “URL A” probably didn’t make it to “URL B,” and even less of it gets to “URL C.” If this were one page’s URL, it would not be a real issue. If you are a large e-commerce site with multiple ways to reach one page, there could literally be hundreds of thousands of redirects happening.

This is why I always start by hammering out my URL structure. It is the foundation of a web page. Set a good foundation and you can build off of it forever. Think of all the possible variations, document them, and make sure developers stick with them; a quick validation sketch follows the tips below.

Quick URL Tips

  • Avoid overly long URLs.
  • Have clean URLs without parameters. If parameters are 100% necessary, keep them to only one or two.
  • Keep folder structure to a minimum. I try to make all my sites no more than one or two folders deep.
  • Keep URLs all lowercase.
  • Use hyphens, not underscores, as word separators.
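
Here is the validation sketch mentioned above: a small, hypothetical checker that flags a URL against these tips. The 115-character limit mirrors the URI report mentioned earlier; the other thresholds are my own rough rules of thumb, not an official standard.

```python
from urllib.parse import urlparse

def url_warnings(url: str) -> list[str]:
    """Flag a URL against some rough, self-imposed structure rules."""
    parsed = urlparse(url)
    warnings = []
    if len(url) > 115:
        warnings.append("URL longer than 115 characters")
    if parsed.query and parsed.query.count("&") + 1 > 2:
        warnings.append("more than 2 query parameters")
    # Folder depth = path segments before the final file/slug.
    depth = len([seg for seg in parsed.path.split("/") if seg]) - 1
    if depth > 2:
        warnings.append("more than 2 folders deep")
    if any(c.isupper() for c in parsed.path):
        warnings.append("uppercase characters in path")
    if "_" in parsed.path:
        warnings.append("underscores instead of hyphens")
    return warnings

# Hypothetical URL that breaks several of the rules above.
print(url_warnings("https://www.example.com/Mens_Shirts/blue/oxford/slim?a=1&b=2&c=3"))
```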

Redirects

I’ve noticed recently that most developers do not fully understand redirects and their effects. I have some clients who still use a meta refresh as their mode of redirect. CMSs like Microsoft SharePoint use the temporary 302 redirect by default. Most developers code redirects for functionality but don’t understand the meaning behind them. To a lot of developers, when someone clicks on “URL A” and gets redirected to “URL B,” the redirect worked. Unfortunately, that is not enough when the search engines spider a website.

First off, avoid using client-side redirects such as meta refreshes and JavaScript. While spiders try to understand these types of redirects, they do not pass page value well. Use a server-side redirect instead.

The two most common server-side redirects are the 301 and the 302. As I mentioned above, a 302 redirect is temporary. Think of it as a “We will be back in 15 minutes” sign on the door of a retail store. A 301 redirect is permanent. Think of that as a sign on the door that states, “We have moved to a new location. Our new address is…”

With a permanent redirect, the value that “URL A” has acquired over time will be passed to “URL B”. In the case of a temporary redirect, the value from “URL A” will stay with “URL A”, while the customer will be taken to “URL B.” This may cause indexation issues in the search engines.

Redirection Tips

  • Use 301 server-side redirects
  • Avoid multiple hops. Redirect “URL A” to “URL C”, not “A” to “B” to “C”.
  • Add the redirect rules to the server config file. This helps with server performance.
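
Checking which redirect a URL actually returns, and how many hops it takes, is easy to script. The sketch below follows each hop and prints its status code so a 302 or a chained redirect stands out; the URL is a placeholder.

```python
import requests

def show_redirect_chain(url: str) -> None:
    """Print every hop in a redirect chain with its status code."""
    response = requests.get(url, allow_redirects=True, timeout=10)
    for hop in response.history:  # each intermediate redirect response
        print(f"{hop.status_code}  {hop.url}")
    print(f"{response.status_code}  {response.url}  (final)")

# Hypothetical old URL that should 301 straight to its new home.
show_redirect_chain("http://www.example.com/old-page")
```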

Server Header Status

Another common mistake I see with CMS systems is soft 404 error pages. A soft 404 is when a customer lands on a “page not found” type page, but the server returns a 200 “everything is OK” status. This can cause a ton of confusion for the search engines and add unnecessary pages to the index.

When a site is relaunched and the URLs have changed, not all pages get redirected properly. If content was repurposed, that content often contains links to the old URLs. These cause error pages to pop up. If those error pages tell spiders that they are “OK” pages rather than error pages, the search engines will index these bad pages thinking they are good pages.

On a large-scale site, this can cause over-indexing issues. If your site only has 1,000 pages but Google says you have 9,000, you may have a problem with soft 404 pages. This may cause the search engines to think 8,000 of the pages are the same and discount your site for low quality.

To be sure your error pages are true error pages, take multiple URLs from your site and remove a letter or two from the filename, folder name, or even the file extension. Copy these broken links and paste them into a site like www.URIValet.com, or use a browser extension like HTTPFox or Live HTTP Headers. If your pages return a 404 server status, you are fine. If not, you know what to do.
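
The same check can be scripted for a whole batch of deliberately broken URLs. This is a rough sketch that assumes your pages answer plain GET requests; the mangled URLs are placeholders you would generate from real pages on your site.

```python
import requests

# Hypothetical URLs with a character chopped out of the folder or filename.
broken_urls = [
    "https://www.example.com/producs/widget",   # misspelled folder
    "https://www.example.com/products/widge",   # truncated filename
]

for url in broken_urls:
    status = requests.get(url, timeout=10, allow_redirects=True).status_code
    verdict = "OK (true 404)" if status == 404 else f"check this (returned {status})"
    print(f"{url} -> {verdict}")
```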

Code Structure

HTML code structure is something that can change once the site goes live, but in reality, it never does. While there may be tweaks here and there, the site’s templated layout is usually set in stone. While I took a harder stance on coding quality in the past than I do now, I still believe starting off with great structure, clean code, and minimal elements will help in the long run. Here are a couple of tips I have found helpful.

Code Order

Many developers code in the order they see the wireframes or design comps. This usually means the code order goes something like this: the <head>, then the site header, the main navigation and any sub-navigation, and finally the page content, with the footer at the end.

Structurally, there is nothing officially wrong with this, and it is pretty standard. The issue lies in the size of the site elements. Many sites have gigantic navigations and sub-navigations. Depending on how much code is in the <head> tag, I sometimes see the first line of the main HTML content land around line 1200 or 1500 of the source.

I personally prefer to have HTML content as close to the opening <body> tag as I can. When I develop a site, I push to have my content above my navigation structures. For example, in a site structure I like to see, the main content block sits right after the opening <body> tag, followed by the header, navigation, and footer, with CSS used to put the navigation back at the top visually.

Now the first thing a search spider would see is HTML content.

JavaScript, CSS, and HTTP Requests

With the advent of popular JavaScript libraries like jQuery, I have seen excessive use of externalized JavaScript files. In one recent review, I saw close to 40 external JavaScript files being called in the head of the HTML.

Having so many calls to a server can slow down page speed drastically. It is faster to download one large file than to download the same amount of data cut into multiple files. The HTTP request diagram below, from WebsiteOptimization.com, demonstrates a typical HTTP request. Before the first byte of data is downloaded, the request has to resolve the IP, open a socket, and wait until the server starts sending data. Multiply this by as many requests as your page is making and you can see where all that wait time comes from.

HTTP Request Diagram

I have witnessed page speed improvements of 10 seconds or more just by combining JS and CSS files. JavaScript and CSS files are not the only HTTP requests your site is making; images count too. Try combining imagery into CSS sprites to help minimize those requests.

Quick HTTP Request Tips

  • Combine all JS and CSS files into one file each
  • Minify and gzip those files
  • Try using Google’s AJAX Libraries API to serve common libraries like jQuery from a CDN
  • Minimize the number of image requests by using CSS sprites
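
A build step that combines and compresses files does not have to be elaborate. Here is a bare-bones sketch that concatenates a few JS files and writes a pre-gzipped copy; the file names are placeholders, and real minification would use a dedicated minifier rather than simple concatenation.

```python
import gzip
from pathlib import Path

# Hypothetical source files; a real build would also run a minifier over them.
sources = [Path("jquery.plugins.js"), Path("site.js"), Path("tracking.js")]

combined = "\n;\n".join(p.read_text(encoding="utf-8") for p in sources)
Path("site.combined.js").write_text(combined, encoding="utf-8")

# Pre-compress so the server can answer with one small, gzipped response.
with gzip.open("site.combined.js.gz", "wt", encoding="utf-8") as f:
    f.write(combined)

print(f"{len(sources)} files -> 1 request, {len(combined)} bytes before gzip")
```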

Schema.org Microdata Implementation

This last tip is not for everybody. I am a firm believer in adding semantic markup to HTML, but it is not for every site. Though there are elements on every site that can benefit from Schema.org, I feel e-commerce sites, review sites, and any company with multiple locations (like retail stores) benefit the most. Since Google, Yahoo, and Bing all support Schema.org, microdata is the preferred format.

Schemas can be added at any time but, in my opinion, should be thought about from the beginning of a redesign. It is easier to structure coding elements at the beginning of development and harder afterward. It’s not that it can’t be done, but I find that most of the time companies do not want to pay to add SEO later. And that is the theme of this whole article.

Sometimes it is easier and more cost-effective to implement these principles at the beginning of the process. If you wait too long, your site will lose value and your wallet will need to open even wider. Take the time to plan technical SEO from the start. Setting a solid foundation to build from is more effective than patching holes later.

This article is not a full technical checklist of elements. Feel free to leave any feedback or opinions below in the comments.