By Sebastian | February 2, 2010
Last time I looked, search engines evaluated links when drawing site hierarchies and link graphs. They don’t give a dead rat’s ass whether your fucking URIs match utterly meaningless file system structures. URIs are totally independent of OS restrictions, hierarchies, and brain farts alike.
So why do so many “SEOs” out there still advise their clients to maintain URIs within a (pseudo) file system hierarchy? Because someone replaced their brain with a pile of bullshit. Or pseudo-PageRank. Or both. Or other disgusting crap.
Well-deserved insults aside – here comes the academic explanation. It’s all Google’s fault. Back in the good old days when Google was dancing monthly, they had a faulty algo in their toolbar that kinda “guessed” PageRank, partly based on directory structures. For URIs without an entry in the propagated toolbar PageRank database, it displayed the green pixel width of the parent directory, minus one or two pixels.
If an indexed page example.com/holy/index.html showed 4 green pixels, a not-yet-indexed page like example.com/holy/crap.html sometimes showed 3 green pixels.
Of course PageRank isn’t measured in green pixels. Savvy SEOs knew that toolbar PageRank is for entertainment purposes only, and that guessed toolbar PageRank means less than nothing with regard to free search engine traffic. Unfortunately, not all webmasters and SEOs are savvy. Actually, most aren’t, thanks to stultifying webmaster hangouts and the SEO blogosphere.
Many clueless SEO clowns chasing toolbar PageRank began serving all their crap from the root directory. Until the next SEO myth came out. Well, breadcrumb navigation is a great thing to do, if it makes sense, but it doesn’t boost PageRank on “directory level” via underlying URI structures (mimicking a static file system). Not in the root, not in deeper levels. Never. PageRank is influenced solely by links and their attributes.
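For the skeptics: here’s a toy power-iteration PageRank sketch (the graphs and page paths are invented for illustration) showing that a page’s score depends only on who links to it, not on how deep its URI pretends to be.

```python
# Toy PageRank: scores depend only on the link graph, not on URI paths.
# All page paths below are hypothetical.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping page -> list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:  # dangling page: spread its rank evenly
                for p in pages:
                    new[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new[target] += share
        rank = new
    return rank

# The same page with two URI styles, identical inbound links:
graph_flat = {"/": ["/sku"], "/sku": ["/"]}
graph_deep = {"/": ["/cat/sub/sku"], "/cat/sub/sku": ["/"]}

print(pagerank(graph_flat)["/sku"])
print(pagerank(graph_deep)["/cat/sub/sku"])  # identical: path depth is irrelevant
```

Both product pages end up with exactly the same score, because the link graphs are identical; only the URI strings differ.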
Changing URI structures from root-only to equally keyword-stuffed directory structures resulted in useless 301 orgies, making crawling and indexing more complex for search engines. In fact, it created redirect chains like example.com/products/sku (initial version) to example.com/sku (root-only version) to example.com/category/subcategory/justanotheruselesslevel/sku.
Two more revamps (plus server name canonicalization), and search engines won’t index any product page, because five redirects in a row is the maximum. There’s no maximum when it comes to SEO myths, so probably most over-SEO’ed pages will drop out of all search indexes soon.
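If you must revamp, at least flatten the chain: point every legacy URI straight at the final target in a single 301. A rough sketch of the idea (all URIs made up):

```python
# Flatten a redirect map so every legacy URI points straight at its final
# target: one 301 instead of a chain. URIs are invented for illustration.

def flatten_redirects(redirects):
    """redirects: dict of source -> target. Returns a map where each
    source goes directly to the end of its chain."""
    flat = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        while target in redirects:       # follow the chain...
            if target in seen:           # ...but bail out on redirect loops
                break
            seen.add(target)
            target = redirects[target]
        flat[source] = target
    return flat

chain = {
    "/products/sku": "/sku",                      # revamp #1: root-only
    "/sku": "/category/subcategory/useless/sku",  # revamp #2: deep again
}
print(flatten_redirects(chain)["/products/sku"])
# -> /category/subcategory/useless/sku, in a single hop
```

Run that over your redirect config after every revamp and no crawler ever has to chase more than one hop.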
I love idiocy. Really, I do love it. I’m going to launch just another SEO myth like “replace slashes with backslashes because Google fell in love with IIS” – that’s myth number four. Next up is “replace \.htm|\.php|\.html with \.asp|\.aspx because Bing gives those script name extensions more weight” – that’s myth number five. The day after tomorrow I’ll dominate the InterWebs. Aaaahhhrrrggg…
So why is the totally flawed concept of hierarchical URIs, as in file systems, so popular? Because meaningful URIs make sense for (bookmarking) users, and increase SERP CTRs. Not that a visitor cares about the hierarchies a webmaster considers logical. The opposite is true.
E.g. example.com/widgets/green/xxl is very much webmaster friendly, but meaningless to users. A visitor would appreciate example.com/green-widgets/ with prices for all sizes (XS to XXL) on this page (even better would be example.com/widgets/, where the punter can choose color as well as size). A search for [green widgets in XXL] delivers the desired sales pitch just as well as [green widgsts] (sic!) without a size.
In fact, there are very few sites where any hierarchy in URIs makes sense at all. If there’s no hierarchy from a user’s (current) point of view, example.com/sku (or example.com/unique-keyword-phrase, or example.com/buy/unique-keyword-phrase|sku) is the best choice for a good-looking and meaningful URI. That doesn’t mean you should drop hierarchical breadcrumbs. It means that URIs have nothing to do with the structure of your vendor’s data feed, and that breadcrumbs aren’t static.
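A trivial sketch of the point: derive the URI from the product’s name, not from the feed’s hierarchy. The slug rule below is just one plausible convention, not gospel.

```python
# Build a short, meaningful URI from the product name itself, ignoring
# whatever hierarchy the vendor's data feed imposes. Purely illustrative.
import re

def slug_uri(name):
    """Lowercase the name, collapse anything non-alphanumeric to hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return "/" + slug

print(slug_uri("Green Widgets"))  # -> /green-widgets
```

Feed says category/vendor/product/color/size; the user gets /green-widgets. Everybody’s happy except the feed.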
In real life, hierarchies just don’t work. Most users don’t browse Yahoo’s directory or the ODP, they perform a search. This sort of categorizing is flawed by design, because the result is always subjective and therefore not generic enough to be useful for everyone. Don’t say “but Yahoo did it too, so it can’t be that wrong”. Trust me, it’s outdated bullshit, a relic of the Internet’s early Jurassic. Just because my ancestors buried their dead in trees, that doesn’t mean I can’t have a funeral at sea.
A node (Web page) can appear at many coordinates in a webbed structure (Web site), and each coordinate can be expressed as another navigation path (breadcrumb) leading to this node. That makes the breadcrumbs transient, and you must not use transient attributes or behavior as (parts of) persistent identifiers, like URIs.
Rest assured Google can (or at least will) handle dynamic breadcrumbs without losing the node’s actual context (with regard to search query relevance), so ditching static breadcrumb components in URIs will neither lower SERP CTRs nor uglify SERP displays. You’ll get a breadcrumb’ed SERP display even for ugly URIs like example.com/p?id=Hj8TSc&ctx=k0Oh5Ew, because the breadcrumbs are gathered from links, not from path components of URIs.
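To illustrate: a breadcrumb can be derived from the link graph as the shortest click-path from the home page, with the URI itself playing no role at all. The site structure below is invented; only the ugly URI is borrowed from the paragraph above.

```python
# Derive a breadcrumb trail from the link graph (shortest click-path from
# the home page), not from URI path components. Site structure is invented.
from collections import deque

def breadcrumb(links, start, target):
    """BFS over links (dict: page -> list of linked pages); returns the
    shortest navigation path from start to target, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in links.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

site = {
    "/": ["/widgets", "/about"],
    "/widgets": ["/green-widgets"],
    "/green-widgets": ["/p?id=Hj8TSc&ctx=k0Oh5Ew"],
}
print(breadcrumb(site, "/", "/p?id=Hj8TSc&ctx=k0Oh5Ew"))
# -> ['/', '/widgets', '/green-widgets', '/p?id=Hj8TSc&ctx=k0Oh5Ew']
```

The ugly query-string URI gets a perfectly sensible trail, because the trail comes from how the page is linked, not from what its URI looks like.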
In other words: your information architecture should be based on estimated (even better: tested!) user behavior, not on your product portfolio or technical indicators. Make short and meaningful URIs for users. Provide short click-paths to all of your content. Interlink the hell out of your page portfolio. Link for the sake of your sales and easy navigation, not along alien hierarchies like category/vendor/product/color/size.
If it makes sense, search engines will follow suit. They don’t care much about path components of URIs. A page linked from the root index page or a powerful hub will rank well regardless of whether its URI is example.com/sku or example.com/products/sku. Users will like it better when its URI is example.com/green-widget.
You don’t even need to provide a consistent URI pattern. Having a ton of ancient example.com/sku URIs plus fewer example.com/meaningful-string URIs for newer products is fine with the engines, and fine with your users, as long as your linkage is user friendly.
Think! Be creative. Don’t fall for ancient sagas.