Let the LDA Google Games Begin

By theGypsy | September 8, 2010

The LSI myth is all grown up!

Here I go again… and seriously, I just can’t fucking help me’self. Ya know? I’ve spent many years defending the flame against those dumb ass people pretending to be SEOs. In particular, the ever popular ‘Dave Bait’ that is Latent Semantic Indexing/Analysis. Yup, even recently pounded it here.

So you can imagine my emotions when news started to come outta the land of Moz about an LDA tool. At first some of the geekier instincts thought, this might be neat. That soon drifted back to fucking reality as the age old words of isolated anomalies took hold. Then of course it hit me like a link spammer on an abandoned blog;

Here we go again. Time for more snake oil and mass confusion!

Leaving the land behind

Now, I don’t want this to become a debate over LDA and what it means. Nor turn into Moz-illa just yet. We need the larger picture first. As with LSA before it, this will become an unholy comedy show for those reading the banter in the SEO space…

Loved these…

“I’m not even going to try to explain the intricacies of LDA suffice to say that it involves the long standing SEO theory of topic modelling which isn’t so much a theory but more common sense. Learn more about the flavour of the month over at SEO Moz” – Hamlin.com

Uh… wow. Nice one dood. It’s a fucking bril’ idea that you didn’t get into the intricacies my good man. Because that ‘SEO theory‘ bit? It’s a computer science thingy actually. Oh but it isn’t a theory tho? So a ‘common sense’ thingy-ma-bob? Perfect. That just makes me all warm in my guts bro.

 

Oh and maybe this ‘Practical Summary‘ has some goodies? Awwww for fucks sakes!! Can I please get that 3 minutes of my life back? (U will pay Mikey!). I gotta love the advice…

1. Make sure you are using the rel=canonical tag.

…shit…

2. Make your web site about something.

…. fucking Thomas Edison lives

3. Stop feeding the social media machine all your stuff.

…OK, this is just getting weird.

Oh and please do go and read the thinking here. But don’t bother coming to me to get those precious moments of yer life back m’kay? Then there was some wondrous bullshit floated with sentences such as;

“You can’t get away with two sentence pages and minimally valuable content any more – you have to do the hard work of creating good stuff in order to leverage this algorithm effectively. T ”

So the algorithm likes ‘good stuff‘? And two sentence pages???? Did you miss the whole Demand Media / Mahalo fiasco? meh.

 

Then there was Twitter, where I kept finding gems such as;

[Image: two tweet screenshots, not preserved]

 

We can clearly start to see the evolution of ‘Google’ and ‘LSI’ in some unholy darker shade of SEO hell that it will soon become. Shit, shit shit… Tommy? Go fetch Daddy’s boomerang.

Now, I will leave these fine, if not somewhat misguided, folks alone. Because even that is not the point here. It has already started. And if this fire is allowed to burn out of control, we shall all be consumed by it. Staffers and contractors asking. Clients and department heads asking. Snake Oil SEOs out stealing $$$ under the ‘Google LDA’ banner. We simply must implore people to look at these things reasonably.

Back to the Moz

Ok, now, I will have to lay this back at their door ultimately. Why? Because this is the genesis of the new threat, whether they like it or not. For the record I have already written about the whole ‘Google and LDA’ thang, albeit from a somewhat different angle. But all along I tried to impart one simple concept (pun intended); let’s not sell it for what it isn’t. Rand, Ben et al. did seem to try and put that out there too. But it isn’t helping.

Some comments on the post, such as these, needed to be answered to stop the madness (which they weren’t)… so allow me;

[Image: screenshot of a comment on the Moz post, not preserved]

Whoah back there… ‘behind the Google curtain’. Please get up from your desk and go sit in the corner ok? That titanic shit? I have no clue what yer saying. Go to the corner.

 

[Image: screenshot of a comment on the Moz post, not preserved]

LDA in detail? That like a 3 day SEO course? And a link to Bill’s work on named entities? Those are actually some relatively different elements there and strangely, one is ‘semantic’ analysis and the other… well… fucking named entities. And we can’t prove anything. Ok?

 

[Image: screenshot of a comment on the Moz post, not preserved]

What the fuck? Is that English ol chap? Please don’t be all gushy like that. How do you know it is a goldmine until you try it? We are interested in having you tell us if it serves a purpose. Thanks.. I hope you successfully have that dictionary removed from yer …

 

[Image: screenshot of a comment on the Moz post, not preserved]

Not a stretch to assume…LDA or some other semantic analysis? Uhm sure. It’s a search engine, that’s a safe bet. But once again, we’re not deconstructing Google no matter HOW many times we may have referenced them in the post. And that other stuff? Strange but it sure sounds a lot like phrase based IR, not LDA. We haven’t looked at that yet.

 

[Image: screenshot of a comment on the Moz post, not preserved]

Thanks for that. And you’re welcome for that… and well… What? Once more, we have no idea why everyone keeps trying to imply this is about Google. I mean, this page only scored like a 56% for those two terms.

You get the idea. I have never heard so much greek-geek speaking bullshit in all my life. I mean, I am reading comments by people that sure sound goddamned smart, but when you read them a third time, it becomes apparent they don’t even ‘get’ what LDA does, or even semantic analysis at all (synonyms being one of the common foibles). Please Rand/Ben, please do try to hit such comments, discourage the perceptions of Google gold, and continue the education into CS/IR.

 

Help stop the madness

And so my friends, I leave it to you now. Tell peeps to read up on where LDA came from. Get them to learn more about the IR world. Won’t you just help stop the bleeding? Before it’s too late? If you see some LDA bullshit out there, please contact your local SEOBS reporter. Thank you.

I really don’t care who makes an LDA tool. Shit, you can go and grab the code for yourself if you wanted to. It is just super fucking important that people understand that we can’t really get a strong indication of what any search engine is doing with limited data sets as far as methodologies and scoring. It’s a fact, get over it. And if you’re a manic Mozer, please watch the person next to you and ensure they don’t drink too much of the kool aid. Be smart about it and hopefully we won’t all be cleaning up LDA Snake Oil and mass confusion in the years to come.

/As you were…

Topics: Myths and Crap, Snake Oil | 20 Comments »

  • Kevin

    You deserve a medal for making it through all the comments. I could only make it about 1/4 of the way down before the sting of giant crocodile tears filled my eyes. Real tears. MY TEARS.

  • http://www.huomah.com theGypsy

    lol…yes, well I was vested in it. If anything I watched it as one does a car accident; some morbid fascination. There was also a good exchange in there from Dany and Rand worth reading btw. There are so many angles I could take on this stuff, I just had to focus on one at a time (avoiding the LSI fiasco all over).

  • Kevin P.

    In pursuit of middle ground, I think Rand would likely admit; a correlation and a cause are two different things.

    Personally I’m not surprised sites that have good rankings also have a good LDA score with his new tool, but I’m guessing links and simply having the keyword in the text trumps LDA.

    I have two sites competing for the same keyword. The one with the lower LDA score actually ranks higher in reality, so I’m not going to lose sleep on this. However, it’s always good to know how others are looking at search, and to realize it’s about getting several on-page and off-page factors to work in concert with each other.
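To make the correlation-versus-cause point concrete, here is a rough sketch, assuming entirely made-up numbers. The names `links`, `topic_score` and `rank_strength` are hypothetical, and nothing here reflects how the Moz tool or any search engine actually scores pages. Ranking strength is computed from links alone, yet the topic score still correlates strongly with it, simply because both rise together in this toy data.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical link counts for eight pages (invented for illustration).
links = [1, 2, 3, 4, 5, 6, 7, 8]

# Hypothetical topic scores that happen to rise alongside links.
topic_score = [12, 19, 33, 38, 52, 61, 68, 83]

# Assumption: ranking strength depends on links ONLY.
# topic_score is never used to compute it.
rank_strength = [2 * n for n in links]

r = pearson(topic_score, rank_strength)
print(f"correlation between topic score and ranking strength: {r:.3f}")
```

A high `r` here says nothing about whether changing the topic score would move a ranking; in this toy setup it cannot, by construction.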

  • Michael Martinez

    It looks to me like most of the research around LDA has focused on improving computing performance in parallel environments and analyzing video and image patterns.

    The problem with Latent Dirichlet Allocation from a search-query perspective is that it still operates on the ancient “bag of words” concept — in fact, it’s been praised for enhancing that concept.

    In “bag of words” topic analysis, word order and proximity are tossed out. Topics are decided on the basis of weightings derived from word occurrences within a document as cross-referenced across a document set.

    Researchers have found that this model can be applied to abstract things (such as complex images). However, a typical Web search query is not focused on abstraction but rather on discrete, specific subjects, such as “what will the weather be like (in my area) tonight?” and “where can I buy the least expensive gasoline (within five miles of my location)?” and “how do I bake gingerbread cookies that are soft rather than hard?”

    Tossing out word-order and proximity for queries like that (or even “ring tones” and “britney spears”) doesn’t make sense.
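A minimal sketch of the "bag of words" point above, assuming naive whitespace tokenizing and two invented example strings. The helper names `bag_of_words` and `bigrams` are hypothetical, and this is not LDA itself, only the word-count representation it starts from. Two documents with the same words in a different order get identical bags, while adjacent word pairs still tell them apart.

```python
from collections import Counter


def bag_of_words(text):
    """Count word occurrences; order and position are thrown away."""
    return Counter(text.lower().split())


def bigrams(text):
    """Adjacent word pairs, which do keep local word order."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


# Two made-up documents: same words, different order.
doc_a = "britney spears releases new ring tones"
doc_b = "new spears tones britney ring releases"

# Identical as bags of words...
same_bag = bag_of_words(doc_a) == bag_of_words(doc_b)

# ...but only doc_a contains "britney" immediately followed by "spears".
has_phrase_a = ("britney", "spears") in bigrams(doc_a)
has_phrase_b = ("britney", "spears") in bigrams(doc_b)

print(same_bag, has_phrase_a, has_phrase_b)
```

A model fed only the bags cannot distinguish these two documents, which is the objection being raised for phrase-like queries.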

    I’m getting 500 errors on YouTube right now so I cannot find it, but there was a recent video by Matt Cutts where he says that Google considers a page where “Britney” and “Spears” occur side-by-side to be more relevant to a query for “Britney Spears” than a page where “britney” is above the fold and “spears” occurs near the bottom of the page.

    People who want to believe in this LDA bullshit would do well to pay attention to what GOOGLE ACTUALLY SAYS IT LOOKS AT and leave the snake oil science to the snake oil salesmen.

  • Michael Martinez

    Rand first needs to actually demonstrate some correlations, rather than point at these low values his calculations produce and try to force them into the category of correlations.

    But his faux correlations are obviously thinly disguised attempts (despite all his disclaimers and protests to the contrary) to reverse-engineer an algorithm that is being adjusted daily.

    As was pointed out to him in numerous comments, blog posts, and by people at SMX Advanced — his attempts to find correlations in highly competitive queries are counter-productive.

    They have absolutely no clue as to what they are doing over there.

  • Fred

    First of all – I do not believe that the world famous Mr Martinez is in here posting in the SEO BULLSHIT blog – wow if it is. Nice one.

    Secondly – I’ll fucking tell you why LSI and LDA and what ever other fucking dickwart scum ass licker invents is a load of dogshit – because a search engine fucking wants cash – and the fuckers twist the serps to encourage ad clicks – not fucking serp clicks.

    Now – it doesnt take a fucking rocket scientist to work out that this “twist” has NOTHING AT ALL TO DO WITH ANYTHING OTHER THAN subtly putting shit in front of users now does it?

    Forget thinking that an SE wants to give the best serps – they DONT – they want to give serps which cause ad clicks. THATS where SEO comes in.

    Fucking rant over.

  • http://twitter.com/bradleyhunt Bradley Hunt

    Nice – brought out a complex cry; tears of sadness mixed with those of laughter. Perhaps the best lessons we can learn as SEOs and Web marketers from this is how some of these poor unfortunates buttress their identities and group affiliations upon this nonsense without letting intellectual curiosity, a semblance of reason, and more important matters get in the way.

  • Tory McGill

    I think your narrow mindedness is staggering! The LDA analysis is the way forward in search engine optimisation. Without LDA, how would we know that when writing compliant web copy we need to use other words related to the topic and main keywords? It is common knowledge that Google invented the written word and before that we were communicating in faecal finger paintings; I am just glad there are still some pioneers out there willing to create complex systems that state the most fundamental principles of communication. Plus I now have Google Scribe which means I never have to use my brain again, suppose you have a problem with Scribe as well?

  • Randy

    Wow, I just spent the last little bit catching up on posts, and had to read this one, and show it to my partner LOL –

    Keep up pointing out the obvious Bullshit Dave, that’s what makes this site and the dojo awesome.

    OH also, that is the famous Michael M posting down there. :-) Which makes this place that much cooler LOL

  • Anonymous

    Cheers for the post Dave, I have a better understanding of all this after going through your semantic analysis week over on the Dojo.

  • http://www.green9media.com/ Glenn Isaac

    Dude, you rock. You know it, so I’m not gonna pat your back too hard.

    I will, however, introduce you to a new SEO ranking model I’ve discovered, called PHRA (Post Hoc Ranking Algorithm). I can’t explain all the complexities of that awesome model here, or anywhere else, but it’s a mix of common sense and science.

    Event A occurred in the distant past and seems related to recent Event C. Event B also occurred in the distant past, and seems related to recent Event C. Therefore Event C is caused by Events A and B. Obviously, my PHRA model works.

    (full disclosure: though I’m making fun of LDA and all that, and think your analysis is badass, I’m still a total SEOmoz fanboy – sorry! ;)

  • http://www.harrr.org/rrr righini

    haters are gonna hate, dude. LDA is cool, the tool is not that cool yet, simple. Why all this anger?

  • Pingback: SEO Research: Because Nothing Makes You More Informed Than Being Misinformed | SEO Bullshit


  • http://glennfriesen.com/ Glenn Friesen

    Señor, when reasoning defeats observation, it’s doublethink.

    I don’t blindly follow any of my advisers (including the smart people like you who share their opinions online), so when I hear you or Rand/Ben discuss LDA, I privately thank you and then go out and test for myself (if the risk/reward makes sense). So far, I’ve yet to observe any effect from the LDA improvements I’ve made, hypothetically because I’ve yet to pass the threshold (the scores of competitors that outrank my site). I shall soon see, firsthand, if that hypothesis is true or false (or at least, get an indicator one way or the other).

    Nevertheless, badass perspective you share. Love the comment responses.
