Tuesday, December 09, 2008

Improving Search Engine Optimization by Incorporating Predictive Analytics

As more companies increase the size of their databases, search engine optimization (SEO) techniques can be adapted to the data mining of commercial databases. In SEO, link analysis is a measure of the quality and relevance of the set of links pointing to a given site. This measure is obtained through an algorithm that maps the hyperlinks in a series of networks, and it produces a ranking of the strength of the inbound links to a particular network. The objective of link analysis is to detect patterns or trends that lead the search engine to bring the most relevant web sites to the top of any search.

Link analysis involves many variables. Google claims that over 200 variables are analyzed in its ranking algorithm. Although I do not know which variables it uses, I surmise that they are the keystone of Google’s success. The core of any analysis is its variables. Let me suggest using predictive modeling as an additional variable that will improve SEO.

Using predictive modeling as another variable in link analysis could potentially improve SEO by a factor of 1.4 by giving depth perception to the link analysis. In ophthalmology it is an established fact that binocular vision gives depth perception, and that binocular vision increases the range of view to 1.4 times that of monocular vision. In other words, you can see better with two eyes than with one. The equivalent of depth perception in analytics is the addition of a predictive modeling (or scoring) variable to any pattern detection analysis.

A predictive modeling variable will improve the SEO because:

  1. It provides an independent variable that can act as a spare in case another variable is not working. In other words, you can use a predictive modeling variable in a correlation analysis as your independent variable against the other numerical variables in your link analysis.
  2. A predictive modeling variable will widen the field of view of your networks from 160 degrees to 200 degrees.
  3. Binocular summation (seeing with two eyes) will enhance faint but important networks and links within your data.
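The correlation analysis mentioned in point 1 could be sketched roughly as follows. This is only an illustration under stated assumptions: the per-site scores, the variable names (inbound_links, anchor_text_matches), and all numbers are hypothetical, and nothing here reflects the variables any search engine actually uses.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one value per site (five sites in total).
# The predictive modeling score plays the role of the independent variable.
predictive_score = [0.2, 0.5, 0.7, 0.4, 0.9]

# Hypothetical "flat" numerical variables from a link analysis.
link_variables = {
    "inbound_links": [12, 40, 55, 30, 80],
    "anchor_text_matches": [3, 1, 4, 2, 5],
}

# Correlate the score against each of the other numerical variables.
correlations = {
    name: pearson(predictive_score, values)
    for name, values in link_variables.items()
}

for name, r in correlations.items():
    print(f"{name}: r = {r:.2f}")
```

A variable that correlates weakly with the score may carry information the score does not, while a variable that correlates very strongly with it is a candidate for the score to stand in for.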

Among SEO scientists, statisticians, and business analysts, such a variable would increase stereopsis, the keen sense of having depth perception. In other words, it would give them another tool to do their work more efficiently.

Most of the variables used in link analysis are flat, or one-dimensional. A predictive modeling variable is multidimensional and hence a “depth variable”. Statistically, adding a “depth variable” to an analysis can be expressed as detecting the networks with two sensors instead of one. If each variable alone has a 0.6 probability of detecting a network, and the two detect independently, the probability that at least one of them detects it is:

Pb = Pr + Pl - (Pr x Pl) = 0.6 + 0.6 - (0.6 x 0.6 ) = 0.84 (1)

The improvement from 0.6 to 0.84 represents a 1.4-fold improvement (0.84 / 0.6 = 1.4). This improvement can be achieved in any analytics technique by adding a multidimensional variable to a one-dimensional variable during analysis.
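The arithmetic in equation (1) can be checked with a few lines of Python. This is a minimal sketch that assumes the two variables detect a network independently of each other, which is what the formula requires; the function name is made up for illustration.

```python
def combined_detection_probability(p_first: float, p_second: float) -> float:
    """Probability that at least one of two independent detectors fires."""
    return p_first + p_second - p_first * p_second


# Each variable alone detects a network with probability 0.6, as in the text.
p_single = 0.6
p_both = combined_detection_probability(p_single, p_single)

# Ratio of the combined probability to the single-variable probability.
improvement = p_both / p_single

print(f"combined probability: {p_both:.2f}")
print(f"improvement factor:   {improvement:.2f}")
```

Note that the 1.4 factor is specific to a starting probability of 0.6. With other single-variable probabilities the same formula gives a different ratio, and the ratio shrinks toward 1 as the single-variable probability approaches 1.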

Contact: Alberto Roldan, CEO of R&R Analytics at atomanalytics@gmail.com


Anonymous said...

This reminds me of an example from a modeling assignment I worked on many years ago.

We were predicting corporate revenue for a business directory, in cases where it was unreported, by analyzing cases where we had a high confidence in the revenue number as a training set.

It's a simple example but we discovered (of course!) that while 'number of employees' was generally correlated with revenue, the combination of number of employees and industry had an even higher correlation.

Ten lawyers and ten people in a lawn care business have very different revenue predictions, although in both cases ten lawyers and ten lawn care staff typically have twice the revenue of five lawyers and five lawn care staff.

There are probably more details that come along (ten lawyers in New York versus ten lawyers in a small town in Middle America) but at some point the addition of more variables causes the data to fragment and makes it difficult to find training data and to present the algorithm to business users in a meaningful way.

jessica said...

Nice and thoughtful writeup here on SEO. There are several discussions going on about this topic on several boards I participate in. It's nice to see some new opinions versus the same ones all the time.

