It is interesting when I hear customers boast, “we’ve got 50 years’ worth of data…” or “we’ve got data for our 12 million customers…” and assume that just because they have bytes and bytes of data sitting in their data repositories, they have big data and hence need a big data analytics solution. This is not always the case.
Here are three questions that I like to ask:
- What is the size of the datasets that you will be feeding into your analysis? Just because you have 50 years’ worth of data doesn’t mean that you are going to be feeding all of it into your analysis. For example, if you are trying to profile a customer in 2015, does it make sense to consider data from customers in 1965? Probably not. A common mistake made by customers is that they consider the size of their data warehouse instead of the size of the datasets that they will be analyzing.
- What types of data are you looking to analyze? Relational database management systems are very good at storing and searching structured data, but they are not well suited for handling semi-structured and unstructured data. Big data platforms are more flexible than traditional relational database management systems, and are equipped to handle a variety of data. If you are looking to analyze data that is unstructured, or looking to combine different types of data (e.g. text, geospatial, multimedia) into your analysis, then you should probably be considering a big data solution.
- How quickly do you need to go from raw data to insights? It is important to consider not only the rate of data influx, but also how quickly you need to take action after extracting information from your data. The insights gathered from your analysis will result in either a reaction to what the data is telling you, or anticipation of what is to come. The acceptable time interval between learning something from your data and acting on it depends on the use case. For example, after you make a purchase from an online retailer, the retailer might send you an email within the next 24 to 48 hours recommending other products for you to buy, based on what other customers similar to you have purchased. The wait between the purchase and the email is reasonable, but a delay of that length is totally unacceptable if you need to take preemptive action because your network is possibly under attack.
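
To make the first question concrete, here is a minimal Python sketch that compares the size of a whole repository with the size of the slice that would actually be fed into an analysis. The records, their sizes, the date window, and the function name `analysis_footprint` are all made up for illustration; in practice you would run the equivalent query against your own data store.

```python
from datetime import date

# Hypothetical records: (customer_id, record_date, size_in_bytes).
# These values are invented for illustration only.
records = [
    (1, date(1965, 3, 1), 2_000),
    (2, date(1998, 7, 15), 3_000),
    (3, date(2014, 11, 2), 4_000),
    (4, date(2015, 1, 20), 1_000),
]


def analysis_footprint(records, start, end):
    """Return (bytes inside the analysis window, bytes in the whole repository)."""
    total = sum(size for _, _, size in records)
    in_window = sum(size for _, day, size in records if start <= day <= end)
    return in_window, total


# Assumed window: only 2014-2015 data is relevant to profiling a 2015 customer.
in_window, total = analysis_footprint(records, date(2014, 1, 1), date(2015, 12, 31))
print(f"{in_window} of {total} bytes ({in_window / total:.0%}) feed the analysis")
```

The number that matters when sizing an analytics solution is the first one returned, not the second.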
There are a lot of factors to consider when determining the appropriate analytics solution for your organization. With all of the different technologies out there, selecting the tools that are appropriate for your business objectives can be an overwhelming task. The good news is that, no matter the size of your data, there is a technology out there that can help you accomplish your business goals.