The Problem: Growth in Data is Stressing Modern Analytics Systems

The size and velocity of data are growing faster than servers can keep up, and modern analytics systems cannot keep pace.

Data is doubling every two years, while server power doubles every four. Couple this with modern analytics methods whose costs tend to scale exponentially as datasets grow, and you quickly end up with an analysis gap. Approaches to solving this problem range from sampling data, to adding servers, to moving analytics into memory. None of these methods is sustainable. Treeminer approaches the problem from a new perspective: how can we structure data so that we can dramatically improve the performance of analytics algorithms?
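The widening gap follows directly from the two doubling rates above. A back-of-the-envelope sketch (the function name and normalization are illustrative, not from Treeminer's materials) makes the compounding visible:

```python
def analysis_gap(years: float) -> float:
    """Ratio of data volume to server capacity after `years` years,
    with both normalized to 1.0 at year zero, assuming the stated
    rates: data doubles every 2 years, server power every 4."""
    data = 2 ** (years / 2)      # data volume doubles every 2 years
    compute = 2 ** (years / 4)   # server power doubles every 4 years
    return data / compute

# The gap itself doubles every four years:
for y in (0, 4, 8, 12):
    print(f"year {y:2d}: data/compute ratio = {analysis_gap(y):.1f}")
```

Under these assumptions, after eight years there is four times as much data per unit of compute as at the start, even before accounting for algorithms whose cost grows faster than linearly.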


Data Modeling Times Increase Exponentially as Data Sizes Increase

Frequently, data mining algorithms build data models in order to make predictions about future data points. Unfortunately, the scalability issues associated with new data points are further compounded by the poor scalability characteristics of the algorithms that build the models. Keeping data models up to date is critical to any analytics process, and the chart to the right shows that even with small to moderately sized datasets, model creation times can be a challenge.
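To see why model creation times become a challenge, consider a hypothetical algorithm whose build time scales quadratically with row count (many distance-based and pairwise methods do). This sketch, with an illustrative function name and made-up baseline numbers, projects build time from a measured baseline:

```python
def projected_build_time(base_rows: float, base_seconds: float,
                         rows: float, exponent: float = 2.0) -> float:
    """Extrapolate model build time under an assumed power-law cost.

    If building a model on `base_rows` rows takes `base_seconds`,
    and cost scales as O(n ** exponent), estimate the time for `rows`.
    """
    return base_seconds * (rows / base_rows) ** exponent

# Hypothetical baseline: 1 million rows takes 60 seconds to model.
# Under O(n^2), 10x the data costs 100x the time: 6000 s (~1.7 hours).
print(projected_build_time(1_000_000, 60.0, 10_000_000))
```

The point of the sketch is the ratio, not the absolute numbers: with superlinear algorithms, a modest increase in data turns routine model refreshes into multi-hour jobs.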

Sampling Data or Adding Servers is not Sustainable

The resulting exponential growth in analysis times means that each year more and more hardware capacity must be added, at a time when some are predicting that the CPU growth described by Moore's Law may be slowing. Sampling data is likewise only a stop-gap measure: what's the point of maintaining a big data store if you aren't going to analyze all of it? A new approach is needed to meet these challenges. Introducing Vertical Data Mining.