A Radar Tracking Approach to Data Mining

(Statistics and Data Mining II)

Automated decision problems are frequently encountered in statistical data processing and data mining. A heuristic filter or heuristic classifier typically has only a limited set of input data from which to draw conclusions and make a decision: REJECT, ACCEPT, or UNDETERMINED. In such cases, pre-processing the input data before applying the heuristic classifier can substantially enhance the performance of the decision system.

In this article, I’ll motivate the use of a radar-tracking algorithm to improve the performance of automated decision making and statistical estimation in data processing. I will illustrate the approach using the website visitor statistics problem.


Analysis of Visitor Statistics: Data Mining in-the-Small

(Statistics and Data Mining I)

For a variety of reasons, meaningful statistics on website visitation and visitor behavior are hard to generate. This article introduces the visitor statistics problem and describes seven challenges that statistical and data analysis techniques must overcome to produce accurate estimates. Along the way, we’ll encounter the “Good News Cheap, Bad News Expensive” Paradox of Data Mining, or why information is often used “as-is”.

This article is the first in a series on algorithms, statistics and data analysis techniques (using free and open source tools), with the visitor statistics problem as the running example.


Dear Readers!

Our Google+ (Buzz) page is where we publish shorter posts more frequently (~monthly). Feel free to check it out! Full-length articles will continue to be published here, with notifications through the Feed (you can join the list below).