Why data of all sizes and complexities including “Big Data” should be “Happy”

My blog centers on a conscious decision to view data as if it were “alive,” with all the complexities and mysteries of a human being. With this approach I hope to offer a journey and a platform that sparks a conversation about how this perspective can change the way we act toward data and the decisions we make around it. If I accept this premise, then in my personal opinion, at the end of the day, my data, or the data that I personally interact with or am responsible for, should be “Happy Data”.

When I studied biopsychology (a combination of psychology and neuroscience) at UCSD, we would often look at how biological processes interact with emotions and cognition. As I was earning my degree, the common debate of nature vs. nurture was often highlighted in this branch of psychology. Over the years, I have come to believe that the difference between nature and nurture is very important, but so is the relationship between them. By nature you carry the traits that might define you, but your interactions nurture you into the human being you become as an adult. Those interactions can start with your family environment (data in your organization), your extended family (loosely related data), and peer experience (how data interacts with other data), and extend to influences such as socio-economic status (will you make different decisions around your data if you are economically sound?). So if my goal is to make sure that at the end of the day my data is happy, what can I do to make sure this happens? What should I consider in the DNA of my data? What should I consider to provide a positive environment as my data matures in my ecosystem?

Nurture: In the beginning there was “Little Data”

I have a strong passion for analytics, so a lot of my examples going forward will probably gravitate toward that subject (though I will try to vary them in future postings). Let’s look at a first example, in which an organization has decided to launch an existing product line in a different channel. Let’s say that this organization has traditionally offered this product only direct to consumer, over the web and print (catalog) channels. Now they want a physical presence where they can have a more intimate relationship with the customer, and they have begun to roll out their products via kiosks in a mall. The new distribution channel’s share of the mix is anticipated to grow to 15% within 12 months, so they are being cautious not to tarnish the brand but are also cognizant that there is a certain opportunity cost if they do not move fast enough. The folks in both marketing and product development might have decided that it was more important to launch the product quickly than to check whether a proper process was in place to capture the entire 360-degree set of customer touch points. In this example, the organization rolled out the product without thinking about how transactions with the customer at a kiosk might differ from those on the web. Thus, it is treating the data with a limited view. So the data is small and young at the beginning of this process. If the data were alive like a human being, would you wait until the data grows, or would you try to deal with the data at a different stage of the process? It is best to think about it, listen to it, analyze it, interpret it, treat it, nurture it, and protect it (we will talk about security in detail in later blogs) at a stage when it is not as complex and its size is manageable. You also have a stronger chance to nurture it along the way, and you can influence the outcome of this data by beginning a relationship with it early on. You are able to change some of the environment and process once you begin to understand the importance of this data in the future.

In other postings I will try to go into more detail, with different examples at different stages of data maturity and complexity. I did not go into too much detail here because I wanted to introduce the subject first. I am excited to explore in other discussions what you should consider in your organization if the data is unstructured and rebellious, and how you would then need to act around it. Also, if you have old enterprise data that has been there a long time, what are the different ways to deal with historical and older data? Regardless, your data should be happy, and you should consider how to get there. Can you provide an example from your organization where taking this approach would have led to a different outcome? Did I miss something, or an angle that I should have considered? Thank you, and please share your comments.

Kain A Sosa is VP, Analytics at Bodhtree and a passionate leader in Data Analytics, Business Intelligence, and Big Data services, with expertise in various big data technologies such as Hadoop and BigQuery.


Why are so many customers failing in their Big Data initiatives?


I strongly believe that companies with a successful Big Data strategy have an information-centric culture where all employees are fully aware of the possibilities of well-analyzed and visualized information. Better data visualization can help you make better decisions.

As a matter of fact, Gartner’s top predictions for 2012 and beyond included this prediction about Big Data: “Through 2015, more than 85 percent of Fortune 500 organizations will fail to effectively exploit Big Data for competitive advantage.” This leads to the question “Why are so many customers failing in their Big Data initiatives?”

The success of a Big Data implementation depends directly on the maturity of the organization.

Drawing on my experience with Big Data project implementations, I would like to share an approach that includes three assessment steps, outlined below, along with the recommendations that lead to a successful Big Data implementation.

I. APPROACH

II. RECOMMENDATION

Recommend a model that demonstrates the real value of Big Data as it applies to the organization. The final recommendations and roadmap, based on our learnings, yield one of two possible outcomes:

• If an organization already has all the necessary tools, processes, systems, and solutions to solve its existing problems, then we will recommend through a business case that it is not a good contender to adopt Big Data technologies but can resolve its problems with its existing ecosystem.

• If an organization demonstrates the potential value of a Big Data investment, then we would recommend moving forward with next steps: take the executable roadmap and blueprint to engage in a Big Data proof of concept (POC).

III. METHODOLOGY

Organizations that approach big data from a value perspective with partnership between the business and IT are much more likely to be successful than those which adopt a pure technology approach. For this reason, making appropriate investments in both technology and organizational skill sets to ensure enterprise capability in extracting value from big data is essential.

Don’t wait, start now

Start collecting massive amounts of data and store it centrally with Hadoop, hire or train your data scientists, and change your culture to that of an information-centric organization. This will help you drive innovation and stay ahead. Don’t wait, as Big Data is the only way forward.

Phani Kumar Reddy is Manager, Analytics at Bodhtree, managing presales of BI, and a passionate leader in Data Analytics, Business Intelligence, and Big Data services, with expertise in various big data technologies such as Hadoop and BigQuery.


Ever wondered what happens between Map and Reduce?

Shuffle and Sort: The input passed to every reducer is sorted by key. The process of sorting the map outputs and transferring them to the reducers as inputs is known as the shuffle.
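To make the idea concrete, here is a minimal Python sketch of what the shuffle guarantees. It is not Hadoop code: the function names (`partition_for`, `shuffle`) are hypothetical, the CRC32-based partitioning is just an illustrative stand-in for a partitioner, and everything runs in memory on toy data.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_reducers):
    """Pick a reducer for a key using a deterministic hash (illustrative choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(map_outputs, num_reducers):
    """Route every (key, value) pair to one reducer, then sort each reducer's input by key."""
    partitions = defaultdict(list)
    for pairs in map_outputs:  # one list of pairs per map task
        for key, value in pairs:
            partitions[partition_for(key, num_reducers)].append((key, value))
    # Each reducer sees its pairs sorted by key.
    return {r: sorted(partitions[r], key=lambda kv: kv[0]) for r in range(num_reducers)}


# Toy output of two map tasks (assumed data).
map_outputs = [
    [("pear", 1), ("apple", 1)],
    [("apple", 1), ("fig", 1)],
]
reducer_inputs = shuffle(map_outputs, num_reducers=2)
for reducer, pairs in reducer_inputs.items():
    print(reducer, pairs)
```

Whatever the partition assignment turns out to be, all pairs for a given key land on the same reducer, and each reducer reads its keys in sorted order.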

MAP side

The output produced by the mapper is not written directly to disk. It is first buffered in memory and processed further to improve efficiency. It is often a good idea to compress the map output as it is written to disk, as doing so makes the write faster, saves disk space, and reduces the volume of data transferred to the reducer. By default the output is not compressed, but compression is easy to enable by setting ‘mapred.compress.map.output’ to ‘true’.
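The effect of compressing map output can be illustrated with a small Python sketch. This is only an analogy under stated assumptions: `zlib` stands in for whichever codec the cluster is configured with, the tab-separated line format and the helper names (`write_map_output`, `read_map_output`) are hypothetical, and in a real job you would simply set the property mentioned above rather than compress anything yourself.

```python
import zlib


def write_map_output(pairs, compress):
    """Serialize sorted map output as key<TAB>value lines, optionally compressed."""
    payload = "".join(f"{key}\t{value}\n" for key, value in sorted(pairs)).encode("utf-8")
    return zlib.compress(payload) if compress else payload


def read_map_output(blob, compressed):
    """Inverse of write_map_output: decompress if needed and parse the lines back."""
    payload = zlib.decompress(blob) if compressed else blob
    rows = (line.split("\t") for line in payload.decode("utf-8").splitlines())
    return [(key, int(value)) for key, value in rows]


# Toy map output with many repeated keys (assumed data).
pairs = [("word%d" % (i % 50), 1) for i in range(5000)]
raw = write_map_output(pairs, compress=False)
packed = write_map_output(pairs, compress=True)
print(len(raw), len(packed))
```

The compressed blob is what would sit on disk and travel over the network, so a smaller blob means less disk space and less data for the reducer to fetch, at the cost of some CPU time on both sides.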

Reduce side

The map output file resides on the local disk of the task tracker that ran the map task. It is now needed by the task tracker that is about to run the reduce task for the partition. Moreover, the reduce task needs the map output for its particular partition from several map tasks across the cluster. The map tasks may complete at different times, so the reduce task starts copying their outputs as soon as each one completes.
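Once the copies have arrived, the reducer has several sorted runs for its partition, one per map task, and needs to read them as a single sorted stream. A rough Python sketch of that merge follows. It assumes the copy has already happened and that each run is already sorted by key; `reduce_partition` is a hypothetical helper, not a Hadoop API.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def reduce_partition(sorted_runs, reduce_fn):
    """Merge the sorted map outputs for one partition and reduce the values per key."""
    # heapq.merge walks the runs lazily and yields pairs in global key order.
    merged = heapq.merge(*sorted_runs, key=itemgetter(0))
    return [
        (key, reduce_fn(value for _, value in group))
        for key, group in groupby(merged, key=itemgetter(0))
    ]


# One sorted run per completed map task (assumed data).
runs = [
    [("apple", 1), ("pear", 1)],
    [("apple", 1), ("fig", 1)],
    [("fig", 1)],
]
result = reduce_partition(runs, sum)
print(result)  # [('apple', 2), ('fig', 2), ('pear', 1)]
```

Because every run is already sorted, the merge never needs to re-sort the full data set; it only compares the heads of the runs.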

Bodhtree is a leader in ‘PACE’ technology IT services, including Product Engineering, Analytics, Cloud Computing, and Enterprise Services. Bodhtree empowers innovative business strategies through a mission to Educate, Implement, Align, and Secure transformational technology solutions.


Extending SAP Business Objects to All Organizational Decision-Makers

BI tools play a vital role in decision making and innovation at every level in dynamic organizations. SAP Business Objects includes tools that help expand the reach of BI information services, enabling the organization to share, integrate, and embed BI in applications, services, tools, and business processes.

Unification of the BI data used across applications

BI can be used across multiple functions and is generally not specific to any particular department or team. It can be leveraged across applications related to finance, operations, sales or human resources. SAP Business Objects Enterprise provides a unified structure with a powerful semantic layer and integration capability that brings a “single version of the truth” to data drawn from multiple sources.

Share BI with any service-enabled application

To build applications that extend the advantage of a company’s BI investment, developers can use the SAP Business Objects BI Software Development Kits (SDKs). These SDKs can be used in any Java or .NET based application for authentication, authorization, scheduling, content display, ad hoc query, or server administration. SAP Business Objects also offers a comprehensive set of Web Services that expose BI functionality as a platform-agnostic interface. The software also supports your organization by extending the reach of BI beyond traditional corporate business.

SAP Business Objects Web Services enhance support for your tactical and operational decision making, which improves business process efficiency.

Sridurga Vannemreddi is an SAP Business Objects and Big Data developer with Bodhtree.  For more than a decade, Bodhtree has specialized in business analytics, leveraging close partnerships with leading BI software manufacturers such as SAP Business Objects, Informatica, and IBM Cognos.  Bodhtree offers free assessments to map analytics solutions to the goals and objectives specific to your organization.
