Big Data Cloud Today! Experts discuss what today’s data is saying about tomorrow’s opportunities

Increased productivity, new innovations and smarter marketing are just a few advantages being realized by organizations that embrace big data.

Big Data Cloud Today!, an event held on June 7th in Mountain View, drew leaders from business and technology to discuss the next generation of Big Data use cases. Attendees at Big Data Cloud Today! learned emerging techniques, such as 3D data visualization, for distilling new insights from existing data. The event addressed the growth of big data, big data architectures, and the identification of new business opportunities.

As a participant in the event, I would like to share a few of the insights and key learnings that I felt offer the most value for Bodhtree customers and our network. Milind Bhandarkar, Chief Scientist at Pivotal; Dr. Mayank Bawa, Co-President of R&D Labs at Teradata Aster; Jim Blomo, Engineering Manager, Data-Mining at Yelp; and Gail Ennis, CEO of Karmasphere, were a few of the experts who made this event truly impactful. Speakers presented first-hand experiences and lessons learned from Big Data early-adopter organizations.

Dr. Mayank Bawa (Co-President, R&D Labs, Teradata Aster) set the tone for the conference with an excellent keynote address. ‘Why is there such excitement around Big Data Analytics in the current environment?’ and ‘How are Big Data Services & Data Sciences Unique?’ were the two questions that framed his remarks.  His presentation marvelously answered them with real-life use cases in two broad categories:

• “All kinds of data in a single platform”
• “All kinds of Analytics in a single platform”

To underscore these points, he presented various applications of Big Data Solutions such as ‘Market Basket Analysis’, ‘Telecom and Churn Analysis’, and ‘Predictive Modeling in the Insurance Domain.’ Some of the interesting takeaways, challenges and open questions are as follows:

– How will technology progress from the current state to a unified architecture?
– Every company is focused on building a platform that brings silos of data together and facilitates seamless dataflow across systems
– Empowering data sciences and improving analytical algorithms
– Relational vs. NoSQL:
  – Is there a need to build a SQL layer over NoSQL?
  – Does it add any business value?
  – The vision of Oracle and Teradata in bringing relational and NoSQL together
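To make the ‘Market Basket Analysis’ use case above concrete: at its core, the technique counts how often items are purchased together. Below is an illustrative Python sketch with invented sample baskets, not any vendor’s implementation:

```python
from collections import Counter
from itertools import combinations

def pair_counts(transactions):
    """Count how often each unordered pair of items appears in the same basket."""
    counts = Counter()
    for basket in transactions:
        # sorted() canonicalizes each pair so ('bread', 'milk') and
        # ('milk', 'bread') are counted as the same pair
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

# Invented sample data for illustration
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["bread", "butter", "milk"],
]
print(pair_counts(baskets).most_common(1))  # [(('bread', 'milk'), 3)]
```

At production scale the same counting is distributed across many machines, which is exactly where platforms like those discussed at the event come in.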

How to make Big Money from Big Data? – Sourabh Satish, Security Technologies & Response, CTO’s Office, Symantec

While Dr. Bawa presented the motivation to build a unified architecture with better analytics, Sourabh Satish of Symantec’s Security Technologies and Response group explored the advances Big Data offers in the security domain. He demonstrated some of the security tools built at Symantec and illustrated how three fields – Big Data, Data Science and Domain Expertise – can be combined to build an application.

Hadoop: A Foundation for Change – Milind Bhandarkar, Chief Scientist, Pivotal

If I had to design a metric for the most relevant and valuable presentation of the conference, then without a doubt the gold standard would be set by Milind Bhandarkar, Chief Scientist of Pivotal. Mr. Bhandarkar talked about the evolution of analytics and big data, which he characterized in three distinct phases:

• Source Systems + ETL + EDW + Visualization
• Source Systems + Hadoop & MapReduce + EDW + Visualization
• Hadoop and its ecosystem

He went on to explore several key issues in the Big Data field:

• BI vs. Big Data, and the future
• Big Data’s journey from batch processing to interactive processing. Is interactive processing possible?
• Hadoop as a service?
• Applications as a service?
• Cloudera Impala bypassing MapReduce
• The myth around how huge the volume of data in a typical analytics query really is (at Yahoo, Microsoft)

Why Hadoop is the New Infrastructure for the CMO (they may not know it yet!) – Gail Ennis, CEO, Karmasphere

Gail Ennis talked about the business use cases driving demand for Big Data in today’s rapidly changing world, the journey of technology from BI to Big Data (predictive insights), and the potential of predictive analytics in marketing and product development.

Insights from Big Data: How To? – Panel Discussion

Jim Blomo, Engineering Manager, Data-Mining, Yelp
David P. Mariani, VP Engineering, Klout

The frank and energetic interaction between Mr. Blomo and Mr. Mariani offered some of the most interesting discussion of the conference, covering topics such as:

• How to identify whether a given problem is a BI analytics problem or a Big Data problem
• Is the existing BI framework still needed? How does Big Data evolve into interactive BI?
• How a company can form a data sciences group, and their journeys in building their teams
• What qualities they looked for when selecting data scientists, since Data Scientist is not a well-defined role across the industry
• Evolving technologies in data sciences and Big Data (Hive vs. Cloudera Impala vs. Shark)
• Is ETL on the fly possible?
• Yelp and its work in data sciences
• Academic education or practical experience: which helps more in becoming a great data scientist?

Rhaghav Karnam manages the Big Data Scientists group at Bodhtree, focusing on Big Data for customers in the High Tech, Manufacturing, Life Sciences/Pharmaceuticals, and government sectors. Bodhtree enables its customers and partners to achieve business transformation through Big Data and social-mining solutions laser-focused on measurable business objectives.


Data Integration for Salesforce.com is no longer a challenging task

Enterprises implement Salesforce so that teams can focus on customers and revenue, leaving the rest to automation. But when companies try to extend that automation by integrating Salesforce with an ERP, the result is too often more headache than focus. MIDAS, an ETLE (Extraction, Transformation, Loading and Enrichment) tool, resolves the challenge by seamlessly integrating Salesforce.com with SAP, Oracle E-Business Suite and other ERPs, moving data both to and from each system.

Feature set

• Cloud based solution for invoking transformations and jobs remotely
• Broad connectivity and data delivery
• Hosting a Cloud solution with multi-tenancy capability
• Social connectors – Facebook, Twitter and LinkedIn
• Custom Connectors – SAP, Oracle E-Business Suite and Salesforce.com
• Analytics integration with Pentaho Reports, OBIEE and SAP BusinessObjects
• Powerful administration and management
• Data profiling and data quality
• Single interface to manage all the integration projects
• Flexible deployment options
• Bi-directional CRM-ERP integration

Key Benefits

• Designed for seamlessly integrating Salesforce.com with ERPs and other applications
• 300+ open-source connectors out of the box
• Transparent diagrammatic depiction of data transfer steps
• Removes custom coding for quick turnaround
• Reduces integration and maintenance costs
• Improves data quality
• Shortens implementation time
• Easy installation and configuration
• Business continuity and application availability management


PIG and Big Data – Processing Massive Data Volumes at High Speed

For most organizations, availability of data is not the challenge.  Rather, it’s handling, analyzing, and reporting on that data in a way that can be translated into effective decision-making.

PIG is an open source project intended to support ad-hoc analysis of very large data volumes. It allows us to process data collected from a myriad of sources such as relational databases, traditional data warehouses, unstructured internet data, machine-generated log data, and free-form text.

How does it process?

PIG is used to build complex jobs behind the scenes to spread the load across many servers and process massive quantities of data in an endlessly scalable parallel environment.

Unlike traditional BI tools, which report on structured data, PIG is a high-level data flow language that applies step-by-step procedures to raw data to derive valuable insights. It offers major advantages in efficiency and flexibility in accessing different kinds of data.

What does PIG do?

PIG opens up the power of MapReduce to the non-Java community. The complexity of writing Java programs can be avoided through a simple procedural language abstraction over MapReduce that exposes a more Structured Query Language (SQL)-like interface for big data applications.
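To make the abstraction concrete, here is the classic word-count job that PIG generates behind the scenes, sketched in plain Python rather than Java. This is illustrative only; a real MapReduce job distributes these phases across a cluster:

```python
from collections import defaultdict

# Map phase: each input line is split into (word, 1) pairs.
def map_phase(lines):
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

# Shuffle: group values by key, as the framework does between phases.
def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

# Reduce phase: sum the counts for each word.
def reduce_phase(groups):
    return {word: sum(ones) for word, ones in groups.items()}

lines = ["big data big insights", "big value"]
print(reduce_phase(shuffle(map_phase(lines))))
# {'big': 3, 'data': 1, 'insights': 1, 'value': 1}
```

In PIG this entire pipeline collapses into a few lines of data flow statements, which is precisely its appeal to non-Java developers.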

PIG provides common data processing operations for web search platforms, such as web log processing. PIG Latin is a language that follows a specific format: data is read from the file system, a number of operations are performed on the data (transforming it in one or more ways), and the resulting relation is written back to the file system.
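A PIG Latin script for the web log case just described might look like the following sketch. The file name and field layout are assumptions for illustration, not a real dataset:

```pig
-- Hypothetical web-log pipeline: count successful hits per URL
raw    = LOAD 'weblogs.tsv' AS (ip:chararray, url:chararray, status:int);
ok     = FILTER raw BY status == 200;
by_url = GROUP ok BY url;
counts = FOREACH by_url GENERATE group AS url, COUNT(ok) AS hits;
STORE counts INTO 'url_hits';
```

Each statement names a relation, and PIG compiles the whole flow into MapReduce jobs only when the final STORE is reached.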

PIG scripts can use functions that you define for tasks such as parsing input data or formatting output data. UDFs (user-defined functions) are written in the Java language and permit PIG to support custom processing; they are the way to extend PIG into your particular application domain.

PIG allows rapid prototyping of algorithms for processing petabytes of data. It effectively addresses data analysis challenges such as traffic log analysis and user consumption patterns to find things like best-selling products.

Common Use Cases:

PIG is mostly used for data pipelining, which includes bringing in data feeds, cleansing the data, and enriching it through transformations. A common example is log files.

PIG is also used for iterative data processing, allowing time-sensitive updates to a dataset. A common example is a “Bulletin” feed, in which a constant inflow of small pieces of new data replaces older feeds every few minutes.
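Such a refresh can be sketched in PIG Latin as loading the existing dataset alongside the new feed and writing out the merged relation. The paths and schema here are invented, and deduplication/replacement logic is omitted:

```pig
-- Hypothetical incremental refresh of a feed
history = LOAD 'feed/history' AS (id:chararray, payload:chararray);
latest  = LOAD 'feed/latest'  AS (id:chararray, payload:chararray);
merged  = UNION history, latest;
STORE merged INTO 'feed/merged';
```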

Sailaja Bhagavatula specializes in SAP Business Objects and Hadoop at Bodhtree, a business analytics services company focused on helping customers get maximum value from their data. Bodhtree not only implements the tools that enable processing and analysis of massive volumes of data; we also help businesses ensure the questions being asked target key factors for long-term growth.


Extending SAP Business Objects to All Organizational Decision-Makers

BI tools play a vital role in decision making and innovation at every level of dynamic organizations. SAP Business Objects includes tools that help expand the reach of BI information services, enabling the organization to share, integrate, and embed BI in applications, services, tools and business processes.

Unification of the BI data used across applications

BI can be used across multiple functions and is generally not specific to any particular department or team. It can be leveraged across applications related to finance, operations, sales or human resources. SAP Business Objects Enterprise provides a unified structure with a powerful semantic layer and integration capability that brings a “single version of the truth” to data drawn from multiple sources.

Share BI with any Service-Enabled Application

To build applications that extend the advantage of a company’s BI investment, developers can use the SAP Business Objects BI Software Development Kits (SDKs). These SDKs can be used in any Java or .NET based application for authentication, authorization, scheduling, content display, ad hoc query, or server administration. SAP Business Objects also offers a comprehensive set of Web Services that expose BI functionality through a platform-agnostic interface. The software also supports your organization by extending the reach of BI beyond traditional corporate users.

SAP Business Objects Web Services enhance support for your tactical and operational decision making, which improves business process efficiency.

Sridurga Vannemreddi is an SAP Business Objects and Big Data developer with Bodhtree.  For more than a decade, Bodhtree has specialized in business analytics, leveraging close partnerships with leading BI software manufacturers such as SAP Business Objects, Informatica, and IBM Cognos.  Bodhtree offers free assessments to map analytics solutions to the goals and objectives specific to your organization.
