Monday, November 7, 2011

Think End-to-End When Parsing and Visualizing Big Data in Hadoop

It has been a big, busy week at Informatica. We introduced HParser, the industry's data parser for Hadoop. As part of the launch activities, we recorded a video discussion on parsing and visualizing Big Data, featuring Brett Sheppard from Zettaforce; Ronen Schwartz, Vice President of B2B Products at Informatica; and Karl Van den Bergh, Vice President, Product and Alliances at Jaspersoft. The discussion explains how to extend Jaspersoft business intelligence (BI) and Informatica data integration investments to leverage data stored in Hadoop. HParser is designed to reduce the need for manually written Map/Reduce scripts. You can also check out the chalktalk and product demonstration of HParser.

The pace of innovation for Hadoop has been rapid. One reason for this accelerated growth is that the Hadoop community and vendors across the relevant data computing infrastructures seem very engaged. I am talking about the solution providers that make up the eco-system, including advanced analytics, business intelligence, data processing, data management and data integration players like Informatica. This brings together a wide array of parties from both business and IT. That is great news because, in the previous round of evolution for relational databases and business intelligence, the vision and expectations of business and IT did not appear to be well aligned. As a result, there is still an IT-business gap that we, as an industry, are trying to fill. Of course, Hadoop is not perfect, and it has a lot of room to grow before it is fully integrated into the enterprise. Yet I believe there are fundamental forces at play that make this Hadoop evolution more lasting: commoditization of technology, the rise of social media, the culture of the data-driven enterprise, interest in harnessing machine-generated data, and pervasive use of data across industries. The biggest driver of all is the democratization of data. The more people use data, the more of an appetite they develop for it. Because of these dynamics, it feels like we are at a key inflection point in the data management industry.

Collaboration between data integration and BI vendors for Hadoop made sense to us because, to support sophisticated and data-hungry business managers, we need to demonstrate the value of parsing and visualizing big data end-to-end. I used to think that BI innovation was more or less complete: I wondered why we would make pretty faces even prettier when we still had to focus on the data (which is still ugly). But I am rethinking this, given that the need for analytics and BI is evolving too. The mindset is now different. Unlike the traditional one-to-one accounting mindset, people are turning to statistical algorithms and sophisticated techniques like machine learning and social graphs, with network effects coming from all the data that can now be stored much more cost-effectively. In other words, if you are thinking about traditional BI reporting and dashboards of sales data and revenue projections based on transaction data, you may be in good shape. Once you start to look at complex, unstructured data from logs, network traffic, application events, documents and industry formats, and need to see the relationships and influences within it, you are in new territory. So data integration and analytics players need to work together more tightly. We are excited to be part of the journey, helping organizations get more from unstructured data that they can act on to produce superior business results.

To learn more about the eco-system dynamics, we recommend checking out the Hadoop Eco-System discussion (Part 3 of Hadoop Tuesdays) with Matt Aslett of the 451 Group.
