Predictive Analytics on Cloud: Major Drivers

Predictive analytics can mine a large volume of data and turn it into valuable business insights. This requires organizations to build statistical predictive modeling systems that demand significant time, resources, and niche skill sets. It is just a matter of time before organizations recognize this and start moving their predictive and statistical analytics systems onto the cloud.

The drivers for cloud are not only big data, niche skills, and the time it takes to build the system, but also the volume of consumers’ behavioral information available online, which can help in building a foolproof predictive modeling system.

When you plan to build and deploy a predictive model, one of the major bottlenecks is converting your data into a format that facilitates building and deploying the model. This transformation is often a series of database operations (group by, join, where clauses), numerical transformations (binning, ratios, log transforms, etc.), and text processing (stemming, grouping/binning). Except for some database operations, these steps are inherently parallelizable, lending themselves nicely to cloud solutions.
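To make the parallelism point concrete, here is a minimal Python sketch of the row-wise numerical transformations (a ratio, a log transform, and binning). The field names (`spend`, `visits`, `age`), the bin edges, and the helper names are illustrative assumptions, not taken from any particular system. A thread pool stands in for what would be separate workers or nodes in a cloud deployment; the point is only that each record can be transformed independently of the others.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def transform_row(row):
    """Apply row-wise transforms: a ratio, a log transform, and binning.

    The input field names and the age bin edges are hypothetical.
    """
    spend, visits, age = row["spend"], row["visits"], row["age"]
    if age < 30:
        age_bin = "under_30"
    elif age < 50:
        age_bin = "30_to_49"
    else:
        age_bin = "50_plus"
    return {
        "customer": row["customer"],
        # Ratio feature; guard against division by zero.
        "spend_per_visit": spend / visits if visits else 0.0,
        # log1p handles zero spend gracefully.
        "log_spend": math.log1p(spend),
        "age_bin": age_bin,
    }


def transform_all(rows, workers=4):
    """Transform every row independently; map() preserves input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transform_row, rows))


sample = [
    {"customer": "a", "spend": 100.0, "visits": 4, "age": 25},
    {"customer": "b", "spend": 0.0, "visits": 0, "age": 61},
]
features = transform_all(sample)
```

Because no row depends on another, the same function could be handed to any number of workers. Joins and global group-by operations are the exceptions noted above, since they need data from several rows at once.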

Lastly, a major driver for moving the predictive analytics stack to the cloud is mobility and accessibility. Decision makers are often traveling, and a cloud-hosted solution gives them access to their business data wherever they are.

What is NoSQL? Why NoSQL? When NoSQL?

What is NoSQL?

Despite how it sounds, NoSQL means “not only SQL”: the goal is not to reject SQL but rather to compensate for the technical limitations shared by the majority of relational database implementations.

NoSQL is a whole new way of thinking about a database. NoSQL is not a relational database.

NoSQL is becoming prominent for the simple reason that the relational database model may not be the best solution for every situation.

The best way to think of a NoSQL database is as a distributed, non-relational database with a very loose structure or no fixed structure at all.

NoSQL databases are finding significant and growing industry use in big data analytics and real-time web applications.

Why NoSQL?

In 2000, Eric Brewer outlined the now famous CAP (Consistency, Availability and Partition tolerance) theorem, which states that Consistency and high Availability cannot both be maintained when a database is Partitioned across a fallible wide area network.

Many NoSQL systems therefore relax one of these guarantees, often strict consistency, so that they can stay available across partitions while coping with the data explosion.

Other important advantages, alongside this choice of CAP trade-offs, are listed below:

  • Horizontal Scalability
  • More flexible data model
  • Performance advantages
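
Horizontal scalability usually means spreading keys across several machines so that capacity grows by adding nodes. A minimal sketch of hash-based partitioning, assuming in-memory dictionaries as stand-ins for separate nodes (the `ShardedStore` class and `shard_for` helper are hypothetical names for illustration, not any product's API):

```python
import hashlib


def shard_for(key, num_shards):
    """Map a string key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy key-value store; each dict stands in for one node."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=3)
store.put("user:1", {"name": "Ann"})
store.put("user:2", {"name": "Bob"})
```

Note that simple modulo placement moves most keys when the number of shards changes; production systems often use consistent hashing or range partitioning to limit that reshuffling.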

When NoSQL?

Typically, a NoSQL database is preferred in (but not limited to) the following scenarios:

  • Real time web applications
  • Unstructured/“schema-less” data – usually, you don’t need to explicitly define your schema up front and can just include new fields
  • Huge data (TBs)
  • When scalability is critical
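
The “schema-less” point in the list above can be illustrated with a small Python sketch. The `DocumentStore` class is a hypothetical in-memory stand-in for a document-oriented collection, not the API of any real database. It only shows that records in the same collection may carry different fields without a schema change.

```python
class DocumentStore:
    """Toy in-memory collection of schema-less documents."""

    def __init__(self):
        self._docs = []

    def insert(self, doc):
        # Store a copy so later changes to the caller's dict do not leak in.
        self._docs.append(dict(doc))

    def find(self, **criteria):
        """Return documents whose fields equal every given criterion."""
        return [
            d for d in self._docs
            if all(k in d and d[k] == v for k, v in criteria.items())
        ]


customers = DocumentStore()
customers.insert({"name": "Ann", "city": "Pune"})
# A new field appears on a later record; no schema migration is needed.
customers.insert({"name": "Bob", "city": "Pune", "twitter": "@bob"})
```

In a relational table, adding the `twitter` field would typically require altering the table definition first; here the older record simply lacks the field.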


Is Social Mining Already Deciding Your Forecasting and Pricing?




A recent study conducted by Oracle Corporation in the retail sector revealed that customers are more social media savvy and the reason behind selecting a particular brand as the best brand is customer service (post sale). If you have visited Japan, Australia or India you may have seen an “Oxygen Bar.” These are establishments that sell oxygen for recreational and consumer usage…seriously. Visit www.o2bar.com.au or Google it. When I first saw the statistics below, I felt that maybe what I really need is a “free air” bar, as in free of Social Media. But this seems impossible in today’s digital world. Social Media has not only played a major role in connecting people, it has also brought a paradigm shift in the way enterprises conduct business.

Here are some quick facts about the ever-present role social media now plays in our relationships and buying decisions:

– How demand is influenced (Forecast)

– 20% of time on PCs is spent on social media. On mobile devices, people are on social media 30% of the time (Nielsen)

– Consumers are 71% more likely to make a purchase based on social media referrals (Hubspot)

– Social networks influence nearly 50% of all IT decision makers (LinkedIn – learn more at TechConnect ’12)

– 74% of consumers rely on social networks to guide purchase decisions (SproutSocial)

– Facebook is the most effective platform to get consumers talking about products (SproutSocial)

– 44% of automotive consumers conduct research on forums (Mashable)

– 81% of US respondents indicated that friends’ social media posts directly influenced their purchase decision (Forbes)

– 78% of respondents said that companies’ social media posts impact their purchases (Forbes)

It is not enough for a company to say, “I am mining social data and using Big Data technologies.” Instead, companies need to ask and clearly answer: “What are you mining?”, “Do you understand the ROI?”, and “Do you know how it integrates with demand and pricing management?” If the answers to these questions are not clear, you may not be there yet; and should any sense of complacency arise, just ask, “Is my competitor ahead of me with social mining?”


Big Data Cloud Today! Experts discuss what today’s data is saying about tomorrow’s opportunities

Increased productivity, new innovations and smarter marketing are just a few advantages being realized by organizations that embrace big data.

Big Data Cloud Today!, an event held on June 7th in Mountain View, drew leaders from business and technology to discuss the next generation of Big Data use cases. Attendees at Big Data Cloud Today! learned emerging techniques, like 3D data visualization, to distill new insights from existing data. The event addressed the growth of big data, big data architectures, and the identification of new business opportunities.

As a participant in the event, I would like to share a few of the insights and key learnings that I felt offer the most value for Bodhtree customers and network. Milind Bhandarkar, Chief Scientist at Pivotal; Dr. Mayank Bawa, Co-President, R&D Labs at Teradata Aster; Jim Blomo, Engineering Manager, Data-Mining at Yelp; and Gail Ennis, CEO of Karmasphere, were a few of the experts who made this event truly impactful. Speakers presented first-hand experiences and lessons learned from Big Data early-adopter organizations.

Dr. Mayank Bawa (Co-President, R&D Labs, Teradata Aster) set the tone for the conference with an excellent keynote address. ‘Why is there such excitement around Big Data Analytics in the current environment?’ and ‘How are Big Data Services & Data Sciences Unique?’ were the two questions that framed his remarks.  His presentation marvelously answered them with real-life use cases in two broad categories:

• “All kinds of data in a single platform”
• “All kinds of Analytics in a single platform”

To underscore these points, he presented various applications of Big Data Solutions such as ‘Market Basket Analysis’, ‘Telecom and Churn Analysis’, and ‘Predictive Modeling in the Insurance Domain.’ Some of the interesting takeaways, challenges and open questions are as follows:

– How will technology progress to a unified architecture from the current state?
– The focus of every company is on building a platform that brings silos of data together and facilitates seamless dataflow across systems
– Empowering data sciences and improving analytical algorithms
– Relational vs. NoSQL:
  – Is there a need to build a SQL layer over NoSQL?
  – Does it add any business value?
  – The vision of Oracle and Teradata in bringing relational and NoSQL together

How to make Big Money from Big Data? – Sourabh Satish, Security Technologies & Response, CTOs Office, Symantec

While Dr. Bawa presented the motivation to build a unified architecture with better analytics, Sourabh Satish, Security Technologies and Response at Symantec, explored the advances offered by Big Data in the Security Domain. He demonstrated some of the security tools built at Symantec and illustrated how the three fields – Big Data, Data Science and Domain Expertise – can be leveraged for building an application.

Hadoop: A Foundation for Change – Milind Bhandarkar, Chief Scientist, Pivotal

If I had to design a metric to calculate the most relevant and valuable presentation at the conference, then without a doubt the gold standard would be set by Milind Bhandarkar, Chief Scientist of Pivotal. Mr. Bhandarkar talked about the evolution of analytics and big data, characterizing it by three distinct stages:

• Source Systems + ETL + EDW + Visualization
• Source Systems + Hadoop & MapReduce + EDW + Visualization
• Hadoop and ecosystem

He went on to explore several key issues in the Big Data field:

• BI Vs. Big Data and Future
• Big Data’s journey from batch processing to interactive processing. Is interactive processing possible?
• Hadoop as a service?
• Applications as a service?
• Cloudera Impala bypassing MapReduce
• The myth around how huge (“Big”) the volume of data used in an analytics query really is (at Yahoo, Microsoft)

Why Hadoop is the New Infrastructure for the CMO (they may not know it yet!) – Gail Ennis, CEO, Karmasphere

Gail Ennis talked about the business use cases driving the demand for Big Data in today’s rapidly changing world, the journey of technology from BI to Big Data (predictive insights), and the potential of predictive analytics in marketing and product development.

Insights from Big Data: How-to? – Panel Discussion

Jim Blomo, Engineering Manager Data-Mining, Yelp
David P. Mariani, VP Engineering, Klout

The frank and energetic interaction between Professor Blomo and Mariani offered some of the most interesting discussion at the conference, including topics such as:

• How to identify whether a given problem is a BI analytics problem or a Big Data problem
• Is the existing BI framework still needed? How Big Data evolves into interactive BI
• How a company can form a data sciences group, and the panelists’ journeys in building their teams
• What qualities they looked for when selecting data scientists, given that Data Scientist is not a well-defined role across the industry
• Evolving technologies in data sciences and Big Data (Hive vs. Cloudera Impala vs. Shark)
• Is ETL on the fly possible?
• Yelp and its work in data sciences
• Whether academic education or practical experience helps more in becoming a great data scientist

Rhaghav Karnam manages the Big Data Scientists group at Bodhtree, focusing on Big Data for customers in the High Tech, Manufacturing, Life Sciences/Pharmaceuticals and Government industries. Bodhtree enables business transformation for its customers and partners through Big Data and social-mining solutions that are laser-focused on measurable business objectives.
