Big data has everyone talking, and every industry seems to be rushing to make every aspect of its business data driven. This phenomenon has created a thriving industry of firms trying to help enterprises achieve that goal.
The petabytes of data generated continuously also need to be stored and leveraged. As a result, the big data analytics industry serving enterprises has grown to be worth $122 billion and is expected to continue on this upward trajectory, growing to $187 billion by 2019.
With all this money being pumped into the industry, you would expect some real value by now. But the reality is that it's just not there yet; in fact, big data hasn't really transitioned from a hot topic into actionable insights that industries can depend on.
The whole big data industry is based on mining a huge mass of enterprise data to identify latent patterns and interpret them. When done properly, precious insights can be derived from this information. However, gaining actionable insights comes down to specialized analytical software and highly skilled individuals.
Lack of Talent and Poor Software Solutions
There's been a major void when it comes to big data talent in North America. Further, with President Trump's new restrictions on immigration, we're not any closer to resolving this issue.
As a result, outsourcing this function overseas (to places like Eastern Europe and South Asia) will be imperative to achieve enterprise goals. At the same time, outsourcing data related to core business functions can also be a problem. So it will be interesting to see how companies find effective solutions to this issue.
Check out how FIDO uses offshore talent in Ukraine to build a Big Data Analytics solution from scratch and replace their legacy software.
Further, a lot of big data analytical tools are just not making the grade, so companies need to think about replacing some elements of the stack:
Apache Flume: Businesses using Flume should probably move on, as alternatives like Kafka and StreamSets do the same job better.
Java: For big data, Java is just a bad idea, which is why most companies have moved on to Python and Scala.
MapReduce: It might be cheaper, but it’s too slow to make a real difference. Businesses are better off investing in Spark.
Oozie: This application fails to deliver as a scheduler or a workflow engine. It's also very buggy for software that isn't too complicated to code.
Pig: Despite the fantastic name, and although it may look like a good PL/SQL-style option, Pig just doesn't work the way most would expect (in fact, it's a little strange). Almost any other data tool does what it does, better.
Storm: This application lacks support and seems to be fading away under pressure from better technologies like Apex and Flink (which are better low-latency alternatives to Spark).
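To see why the list above pushes Spark over MapReduce, it helps to recall the shape of the MapReduce model itself. The toy sketch below is plain Python (not any framework's API; all function names are illustrative) showing the three separate passes, with the shuffle stage materializing intermediate data between them. Real MapReduce round-trips that intermediate data through disk between stages, which is exactly the cost Spark avoids by keeping chained transformations in memory.

```python
from collections import defaultdict

def map_phase(docs):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in docs:
        for word in doc.split():
            yield (word, 1)

def shuffle_phase(pairs):
    # Shuffle: group all values by key, materializing the grouped data.
    # In real MapReduce this stage round-trips through disk.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: collapse each key's list of values into a single count.
    return {key: sum(values) for key, values in groups.items()}

docs = ["big data big deal", "data driven business"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts)  # {'big': 2, 'data': 2, 'deal': 1, 'driven': 1, 'business': 1}
```

Spark expresses the same computation as a single chained in-memory pipeline (roughly `flatMap` → `map` → `reduceByKey`), so iterative workloads don't pay the per-stage materialization penalty.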
Be Like Google
So far, the firms that have been filling the gap have been leveraging big data the same way they did small data. They're essentially following the same principles as reporting BI. Further, their technological investments have only provided superficial value: interactive dashboards that visualize the data (a prettier version of the Excel charts of a decade ago). As a result, this approach hasn't been able to deliver any real value.
Companies also need to consider the fact that the human brain can only process and interpret complex data sets if they're made small enough. This can be achieved through aggregation, description, summarization, and presentation. Further, business leaders need to be aware that there is a limit to how much the data can help business processes.
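The aggregation and summarization mentioned above can be as simple as collapsing raw records into a few per-category figures a person can actually read. A minimal sketch, using hypothetical transaction data and only the standard library:

```python
from statistics import mean

# Hypothetical raw transaction records: (category, amount).
transactions = [
    ("hardware", 120.0),
    ("hardware", 80.0),
    ("software", 200.0),
    ("software", 150.0),
    ("software", 250.0),
]

def summarize(rows):
    # Aggregate raw rows into a small, human-readable summary per category.
    by_category = {}
    for category, amount in rows:
        by_category.setdefault(category, []).append(amount)
    return {
        category: {
            "count": len(amounts),
            "total": sum(amounts),
            "average": mean(amounts),
        }
        for category, amounts in by_category.items()
    }

summary = summarize(transactions)
print(summary["hardware"])  # {'count': 2, 'total': 200.0, 'average': 100.0}
```

Five raw rows become two summary lines; at petabyte scale the same reduction is what turns machine-sized data into something a decision-maker can act on.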
Although these companies have adopted technology similar to what tech giants like Google use, they haven't figured out how to use it properly. Further, although you can incorporate machine learning (ML), it can't exercise the judgment that humans can. But that's not its job: predictive algorithms don't need to understand the cause and effect behind statistical relationships to work effectively.
To make real gains from predictive analytics, enterprises need to give up on trying to figure out why things happen and focus on what the models tell them will happen. However, don't expect this shift to happen overnight, as companies first need to get over the hurdle of mistrust. At the end of the day, it's better to be like Google: accept these cultural changes quickly and reap the benefits of ML.
Embrace Applied Predictive Analytics
While predictive analytics has long been used to stop cyber attacks and detect fraud, it hasn't been widely incorporated into customer-facing businesses. Before predictive models can be built into enterprise operations, they first need to be tested for accuracy. Once that's sorted out, they can be used to make real-world decisions.
At the same time, for most businesses a predictive model doesn't have to be perfectly accurate, as we can use virtual simulations to safely ascertain how old and new methods perform against one another.
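One concrete way to pit old and new methods against each other without risking production is a backtest: replay historical data, let each method predict the next value from what it would have seen at the time, and compare the errors. The sales series and both predictors below are made up purely for illustration.

```python
def naive_forecast(history):
    # "Old" method: tomorrow looks like today.
    return history[-1]

def moving_average_forecast(history, window=3):
    # "New" method: tomorrow looks like the recent average.
    recent = history[-window:]
    return sum(recent) / len(recent)

def backtest(series, forecast, warmup=3):
    # Replay the series: at each step, predict the next value using only
    # the history available so far, then record the absolute error.
    errors = []
    for t in range(warmup, len(series)):
        prediction = forecast(series[:t])
        errors.append(abs(series[t] - prediction))
    return sum(errors) / len(errors)  # mean absolute error

daily_sales = [100, 102, 101, 105, 107, 106, 110, 112]  # hypothetical
mae_old = backtest(daily_sales, naive_forecast)
mae_new = backtest(daily_sales, moving_average_forecast)
print(round(mae_old, 2), round(mae_new, 2))  # 2.6 3.67
```

On this toy series the simple baseline actually wins, which is exactly the kind of surprise a safe simulation surfaces before a costlier model ships.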
However, it's not an easy feat, as a poorly designed algorithm can easily make life difficult. But since ML algorithms are expected to improve over time, the future of artificial intelligence (AI) in big data looks bright. At the same time, what will make the real difference is what you choose to predict.
Although it has barely made a difference over the last couple of years, big data will make a significant impact on business processes in the near future. But for that to happen, business leaders need to trust the science and figure out, right now, which processes will need to be data driven in the coming years.
This will not be easy to accomplish; it means stepping outside the box today to hit targets that lie years in the future. But for big data analytics to work for enterprises, business leaders need to lay the groundwork soon.