SEO Guide to Evaluating Web Site Software
The SEO forums are seeing a bump in the number of people asking about SEO for sites built using various software packages. In the past week or so I have seen questions on sites ranging from Directories to Shopping Carts to Blogs. So I thought I might take a deeper look into this issue. While I wrote an overview of Optimizing User Generated Content a while back, this article focuses more on what to look for when considering a specific software package. It could probably also serve as a guide for programmers who are building this type of software and are interested in “real” search engine friendliness.
What is the problem here? Well, a great piece of programming and an outstanding software package do not always equal “search engine friendly”, even if the vendor says so. Most programmers are just that, programmers. They do not know the ins and outs of the ever-changing SEO space, and if they took the time to keep up with SEO, they might fall behind on their programming skills. Most stick to what they know. Even if they do take the time to try to learn SEO, they face the same uphill battle that new SEOs face every day: finding reliable, accurate SEO information. Too many places give bad or outdated information, and how is the programmer to know the difference? This leads to great applications which are rather limited in their SEO viability out of the box. I cannot think of a single online software package in any category that would not need modification to be “SEO friendly”. However, some are much worse than others and some are actually very harmful. I won’t name names (yet), but I think you need to know some of the issues to look out for when evaluating software for SEO or “search engine friendliness”.
While all SEO issues apply, let’s talk about the few that tend to show up more frequently in programs and scripts used to build dynamic web sites.
The first issue is Code Bloat: This is the effect of having more code in the HTML than actual text that displays on the page. The lower your code-to-content ratio, the easier it is for spiders to read and understand your page’s content. Here are a few things that usually add to code bloat.
1. On Page Script. When you view the source of a page generated by one of these programs, any scripting should be called from an external file. JavaScript seems to be the most common issue, though it can happen with other scripting languages such as PHP or C++ as well.
- GOOD: All scripts called from external files
- BAD: Scripting code placed directly in the page code
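One rough way to check a package for this is to save a page it generates and count how much script sits directly in the source. The sketch below uses Python's standard html.parser; the class and function names are my own, and it only looks at script tags, not inline event handlers such as onclick.

```python
from html.parser import HTMLParser


class ScriptAudit(HTMLParser):
    """Counts external <script src=...> tags and characters of inline script."""

    def __init__(self):
        super().__init__()
        self.external = 0
        self.inline_chars = 0
        self._in_inline = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            if dict(attrs).get("src"):
                self.external += 1      # script pulled from an external file
            else:
                self._in_inline = True  # script body lives in the page itself

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_inline = False

    def handle_data(self, data):
        if self._in_inline:
            self.inline_chars += len(data.strip())


def audit_scripts(html):
    """Return (external script count, inline script character count)."""
    parser = ScriptAudit()
    parser.feed(html)
    parser.close()
    return parser.external, parser.inline_chars
```

A page that lands in the GOOD column should come back with an inline count of zero.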
2. Style Sheets. The programmers are not usually designers either and tend not to use CSS to its fullest potential. Most programmers know some basic CSS, HTML or XHTML, but this does not mean they know the best way to utilize this code for SEO. Additionally, these style sheets should be implemented as external files.
- GOOD: Designed using CSS as extensively as possible and CSS called from an external file
- BAD: Uses outdated HTML markup, places CSS inline or puts the CSS in the header of the page
3. Program Comments: Some programmers like to leave “helpful notes” in the code of a page in the form of comments. This just adds to Code Bloat. This type of information should be added to a FAQ or Doc type file.
- GOOD: No comments in the code and additional information in external documents
- BAD: Instructions or usage information placed in the page’s code via comments
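To put a number on code bloat, you can compare the visible text on a page to the total size of its HTML. The sketch below is one possible measure, not an official metric: it divides visible-text characters by total characters, so a higher value means less bloat. Script bodies, style bodies and comments all count as code. The names are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Comments never reach handle_data, so they are excluded automatically.
        if not self._skip_depth:
            self.chunks.append(data)


def content_ratio(html):
    """Return visible-text characters divided by total HTML characters."""
    if not html:
        return 0.0
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())  # collapse whitespace
    return len(text) / len(html)
```

Running this against the same page from two competing packages gives a quick, if crude, side-by-side comparison.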
The next issue is Duplicate Content: This is a large SEO issue and it is growing. The engines only want one version of any specific content indexed. When they find pages that contain the same content they will pick one (seemingly at random) and list it, while any others are relegated to Supplemental Results. Many site scripts have “features” which end up creating duplicate content issues; here are some of the common ones.
1. Page Templates: Many times these scripts come with dynamic pages which are built around a page template. While there is nothing wrong with this per se, these template pages can lead to duplicate content issues if done poorly. The entire code bloat issue above comes into play here. If only a small portion of the page changes dynamically, then the rest of the page is duplicated on every instance of that page. This is what leads to the duplicate content issues.
- GOOD: A large percentage of the code on each page changes dynamically
- BAD: Only small portions of the page change dynamically or design and navigation overwhelm the content
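A quick way to see how much of a template repeats is to compare the visible text of two generated pages. The sketch below uses overlapping three-word "shingles" and a Jaccard score, which is a common near-duplicate heuristic and my own choice here; it is not how any particular engine measures duplication. It assumes you have already extracted the text from each page.

```python
def shingles(text, size=3):
    """Return the set of overlapping word groups of the given size."""
    words = text.lower().split()
    count = max(len(words) - size + 1, 1)
    return {tuple(words[i:i + size]) for i in range(count)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of two texts: 0.0 is no overlap, 1.0 is identical."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    return len(a & b) / len(a | b)
```

If two pages that are supposed to cover different items score close to 1.0, the template is overwhelming the content.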
2. Session Variables: These are parameters sometimes found in the URL, which track your progress through a site. They are used for a number of purposes: they can keep track of which shopping cart info is yours or remember some information you had to input. There are ways to use sessions without this variable in the URL. If the Session ID is in the URL, it causes problems in a couple of areas. First, if a few people copy the URL and link to a page, each with a different Session ID, you have a duplicate content problem. Additionally, as we will cover in a bit, this adds one more variable to a URL in which you want as few variables as possible. On a side note, the Google Webmaster Guidelines specifically state “Don’t use ‘&id=’ as a parameter in your URLs, as we don’t include these pages in our index.” and suggest you avoid session IDs as well.
- GOOD: Does not use session variables to track surfers
- BAD: Includes session variables in the page URLs or uses “ID” as a parameter name
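To see why session IDs in URLs create duplicates, it helps to strip them and notice that the remaining URLs are identical. The sketch below does that with Python's urllib.parse. The parameter names in SESSION_PARAMS are assumptions; check what the package you are evaluating actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; adjust for the software under review.
SESSION_PARAMS = {"phpsessid", "sid", "sessionid", "jsessionid"}


def strip_session(url):
    """Return the URL with any known session parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Two inbound links that differ only in their session ID collapse to the same address once this is applied, which is the single page the engines should have seen in the first place.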
3. Print This Page: A wonderful feature that some of these scripts incorporate is the “print this page” feature. It’s used here on AppliedSEO.com. However, this feature creates a duplicate version of any page it is implemented on if one small step is not taken. The most common method to fix this is to include a meta “noindex,nofollow” tag in the head of the page. Without this it is an obvious duplicate content issue.
- GOOD: The print pages are not allowed to be indexed
- BAD: These pages are allowed to be indexed
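You can confirm that a print version carries the tag by checking its source for a robots meta tag. A minimal sketch using html.parser follows; the names are hypothetical, and it does not look at robots.txt or HTTP headers, which can also keep a page out of the index.

```python
from html.parser import HTMLParser


class RobotsMeta(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def is_noindex(html):
    """Return True if the page asks engines not to index it."""
    parser = RobotsMeta()
    parser.feed(html)
    parser.close()
    return "noindex" in parser.directives
```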
On to the Indexability issue: One of the common obstacles to good search placement is indexability. This is simply the ability of your pages to be indexed properly. In the past, this idea was attached solidly to things like “Meta Tags” and “Keyword Density”. Today we understand that many more factors go into the indexability of a web page. In an application building a dynamic website it is hard to address all of the issues, but with a focus on a few tags and content implementation you can gain some ground in search placement.
1. Dynamic Tags: While most meta tags are obsolete, there are a couple of tags you want to make sure are implemented correctly. While not a “meta” tag, the “Title” tag is the most important tag on the page. This tag needs to be unique on every page and be very specific to a page’s content. The only meta tags worth worrying about are the “description” and “keyword” tags. Each of these should also be unique and specific to the content of a page.
- GOOD: Script allows for unique Title, Description and Keyword tags on EVERY page of the site
- BAD: Tag construction automated by the software with little chance to modify or only slight modifications allowed
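A simple test of a package's demo site is to collect a handful of pages and see whether any share a Title. The sketch below assumes you already have the HTML for each URL in a dictionary. It uses a regular expression, which is good enough for a spot check but is not a full HTML parser.

```python
import re
from collections import defaultdict

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def duplicate_titles(pages):
    """Map each repeated title to the URLs using it.

    `pages` is a dict of URL -> HTML source. Pages with no title are
    grouped under the empty string.
    """
    by_title = defaultdict(list)
    for url, html in pages.items():
        match = TITLE_RE.search(html)
        # Normalize whitespace and case so near-identical titles group together.
        title = " ".join(match.group(1).split()).lower() if match else ""
        by_title[title].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}
```

The same approach could be extended to the description tag.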
2. Types of Content: While this is more a function of the type of site you are building, it is important to address in some cases. Photoblogs or Gallery applications focus on images, but they need to contain a textual component. Directories or Topscripts which only do linking also need more text on the pages. It is the text on a page that is the biggest factor in indexing and if you don’t have it, the pages will not rank.
- GOOD: Each dynamic page contains a good percentage of unique topical text
- BAD: Pages only contain non-textual content or a very small percentage of unique text
3. Content Display: Making your content easy for the spiders to find is the goal here. The spiders usually read from the top of the page down, so having the main content of your page at the top of the page’s code is a good practice. Additionally, there are some neat “tricks” programmers can use to dynamically insert content into a page. Some of these tricks make the content unreadable by the spiders. If you see a scroll bar in the middle of a page for example, the content inside that element is probably not accessible to spiders. Interactive content is usually a problem from an SEO standpoint as well.
- GOOD: Clean standard HTML pages and navigation with a simple display of content
- BAD: Fancy content insertion or content that changes without reloading the page
OK, let’s talk about Spiderability: This does not have much to do with your rankings, but is more focused on allowing the spiders access to your content. If a spider has problems seeing your content or understanding what it is seeing, you will have problems getting your pages indexed at all. The search engine spiders are becoming more sophisticated every day but there are still some basic guidelines any site or page should follow, dynamically generated or not.
1. Navigation/Menus: The navigation is the key to a spider’s journey through your website. Some scripts create flashy drop down or fly-out menus that dynamically update as the site evolves. These types of menus are normally done using JavaScript or DHTML, both of which make the links in these menus non-spiderable. A basic textual navigation allows the spiders easier access to internal pages and additionally passes some contextual information about a page via the text used in the links. Other types of non-spiderable navigation to avoid include both JavaScript mouseover and image map graphical navigation.
- GOOD: Basic text link navigation
- BAD: Image based navigation, drop down or fly-out DHTML/JavaScript menus
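To get a feel for how a menu looks to a spider, you can list the plain links in the page source and count the ones that only work through script. The sketch below treats an anchor as crawlable when its href is a real address rather than "#" or a "javascript:" call. That rule of thumb is my own simplification, and the names are hypothetical.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Separates plain links from anchors that depend on script."""

    def __init__(self):
        super().__init__()
        self.crawlable = []
        self.script_only = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = (dict(attrs).get("href") or "").strip()
        if href and href != "#" and not href.lower().startswith("javascript:"):
            self.crawlable.append(href)
        else:
            self.script_only += 1


def audit_links(html):
    """Return (list of crawlable hrefs, count of script-only anchors)."""
    parser = LinkAudit()
    parser.feed(html)
    parser.close()
    return parser.crawlable, parser.script_only
```

Links that are written into the page by script at load time will not show up in either count, which is itself a sign the navigation is not spiderable.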
2. Code Validation: Programmers know good code. But they don’t always know every type of code, and taking shortcuts or using non-standard mark-up can cause issues with a spider’s ability to analyze a page. Additionally, some enhanced features of a script may require the page to include non-valid mark-up. It is not just the page code that needs to be validated; if the page references an external CSS file, it should be validated as well. By running your page through an HTML Validator or your style sheets through a CSS Validator you can easily identify coding errors and fix them. Engines have publicly stated that their spiders work better when crawling validated pages.
- GOOD: All pages and style sheets validated
- BAD: Any validation errors on pages or style sheets
3. Mod-ReWrite: I saved this for last because for some reason it seems to have become a litmus test for the SEO capabilities of site scripts. There are even mods for some scripts that only do the mod-rewrite and then claim to have made the site “Search Engine Friendly”. If you have read this far you know that could never be true. A mod-rewrite will take URLs with a bunch of parameters and turn them into normal looking URLs with keyword related file names. Since spiders have a hard time with URLs containing more than 3 parameters, this makes it easier for the spiders to travel the site. This is a good thing, don’t get me wrong, but it takes much more than this to make a script generated site “Search Engine Friendly”. The mod-rewrite can be a complicated task and needs to be maintained for continually updated sites.
- GOOD: An automatically maintained mod-rewrite feature
- BAD: Dynamic URLs with more than 3 parameters
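Two small helpers illustrate both halves of this. The first counts the parameters in a URL so you can flag anything over the limit of 3 mentioned above. The second turns a page title into the kind of keyword file name a rewrite would expose. Both are sketches with hypothetical names; the rewrite rules themselves live in the web server configuration and are not shown.

```python
import re
from urllib.parse import parse_qsl, urlsplit


def param_count(url):
    """Return the number of query-string parameters in a URL."""
    return len(parse_qsl(urlsplit(url).query, keep_blank_values=True))


def slugify(title):
    """Turn a page title into a lowercase, hyphen-separated URL segment."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
```

For example, a product titled "Blue Widgets & More!" would become the path segment blue-widgets-more.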
It is very common for a web site generated dynamically by a script or application to have SEO issues. As a matter of fact, almost all of them do. But these types of websites are coming of age. Webmasters, SEOs and yes, Programmers are going to need to become more aware of all the different issues which can affect the “Search Engine Friendliness” of a site generated this way. Until these programs come up to SEO speed, I would not recommend using any of them without some sort of modification. Just because a script claims to be “SEO Ready” or “Search Engine Friendly” does not make it so. Compare it to the items listed above and if there are more GOODs than BADs, you have found one of the best. Most will have a majority in the BAD category. Find one which excels in the GOOD category and fix the rest, and you will love the results.
- Article Permalink
- http://www.appliedseo.com/seo-guide-to-evaluating-web-site-software/



Yet another absolutely great article, John!
I linked to it.