What is NoSQL? Why NoSQL? When NoSQL?

What is NoSQL?

Despite what the name suggests, NoSQL means “not only SQL”: the goal is not to reject SQL but rather to compensate for the technical limitations shared by the majority of relational database implementations.

NoSQL represents a whole new way of thinking about databases: a NoSQL database is not a relational database.

NoSQL is becoming prominent for the simple reason that the relational model may not be the best solution for every situation.

The best way to think of a NoSQL database is as a distributed, non-relational database with a very loose structure, or no fixed structure at all.

NoSQL databases are finding significant and growing industry use in big data analytics and real-time web applications.

Why NoSQL?

In 2000, Eric Brewer outlined the now-famous CAP (Consistency, Availability, Partition tolerance) theorem,

which states that consistency and high availability cannot both be maintained when a database is partitioned across a fallible wide-area network.

Since no distributed system can guarantee all three properties at once, NoSQL databases emerged to deal with the data explosion by relaxing one of them, most commonly trading strict consistency for availability and partition tolerance.
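The trade-off can be made concrete with a toy sketch. This is not any real database's behaviour, just an illustration: a two-replica key-value store where, during a network partition, an “AP” configuration keeps serving possibly stale reads while a “CP” configuration refuses requests it cannot keep consistent.

```python
# Toy sketch of the CAP trade-off (illustrative only, not a real database).
class Replica:
    def __init__(self):
        self.data = {}

class TinyStore:
    def __init__(self, mode):
        self.mode = mode          # "AP" or "CP"
        self.a = Replica()        # replica A (accepts writes)
        self.b = Replica()        # replica B (normally kept in sync)
        self.partitioned = False  # is the network between A and B down?

    def write(self, key, value):
        self.a.data[key] = value
        if not self.partitioned:
            self.b.data[key] = value  # replication only works when connected

    def read_from_b(self, key):
        if self.partitioned and self.mode == "CP":
            # CP: stay consistent by refusing to answer (unavailable).
            raise RuntimeError("unavailable: cannot guarantee consistency")
        # AP: always answer, even if the value may be stale.
        return self.b.data.get(key)

ap = TinyStore("AP")
ap.write("x", 1)
ap.partitioned = True
ap.write("x", 2)              # reaches replica A only
print(ap.read_from_b("x"))    # -> 1 (stale but available)

cp = TinyStore("CP")
cp.partitioned = True
try:
    cp.read_from_b("x")
except RuntimeError as err:
    print(err)                # consistent but unavailable
```

During the partition the AP store answers with the last value replica B saw, while the CP store raises instead of risking an inconsistent answer.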

Other important advantages of NoSQL databases include:

  • Horizontal scalability
  • A more flexible data model
  • Performance advantages
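The horizontal-scalability point can be sketched with a toy hash-sharding example: keys are spread across shards, so capacity grows by adding machines rather than buying a bigger one. Real NoSQL systems typically use consistent hashing so that adding a shard moves only a few keys, but the partitioning idea is the same; all names here are illustrative.

```python
# Toy hash-based sharding sketch (illustrative, not a production design).
import hashlib

class ShardedStore:
    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]

    def _shard_for(self, key):
        # Stable hash, so the same key always maps to the same shard.
        digest = hashlib.md5(key.encode()).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self._shard_for(key)].get(key)

store = ShardedStore(num_shards=4)
for i in range(100):
    store.put(f"user:{i}", {"id": i})

print(store.get("user:42"))            # -> {'id': 42}
print([len(s) for s in store.shards])  # keys spread across the 4 shards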

When NoSQL?

Typically, a NoSQL database is preferred in (but not limited to) the following scenarios:

  • Real-time web applications
  • Unstructured/“schemaless” data – usually you don’t need to define your schema explicitly up front and can simply add new fields as they appear
  • Huge data volumes (terabytes and beyond)
  • When scalability is critical
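A minimal illustration of the “schemaless” point, using plain Python dictionaries as stand-in documents (no particular NoSQL product's API is assumed):

```python
# Toy "schemaless" collection: documents in the same collection need not
# share fields, and new fields can appear without a schema migration.
collection = []

# Early documents only carried name and email.
collection.append({"name": "Alice", "email": "alice@example.com"})

# Later documents gained fields nobody planned for up front.
collection.append({"name": "Bob", "email": "bob@example.com",
                   "twitter": "@bob", "signup_source": "mobile"})

# Queries simply tolerate missing fields.
with_twitter = [d["name"] for d in collection if "twitter" in d]
print(with_twitter)  # -> ['Bob']
```

In a relational schema, adding `twitter` and `signup_source` would have required an `ALTER TABLE`; here the new documents just carry the extra fields.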

Big Data Cloud Today! Experts discuss what today’s data is saying about tomorrow’s opportunities

Increased productivity, new innovations and smarter marketing are just a few advantages being realized by organizations that embrace big data.

Big Data Cloud Today!, an event held on June 7th in Mountain View, drew leaders from business and technology to discuss the next generation of Big Data use cases. Attendees learned emerging techniques, such as 3D data visualization, to distill new insights from existing data. The event addressed the growth of big data, big data architectures, and the identification of new business opportunities.

As a participant in the event, I would like to share a few of the insights and key learnings that I felt offer the most value for Bodhtree customers and our network. Milind Bhandarkar, Chief Scientist at Pivotal; Dr. Mayank Bawa, Co-President of R&D Labs at Teradata Aster; Jim Blomo, Engineering Manager for Data Mining at Yelp; and Gail Ennis, CEO of Karmasphere, were a few of the experts who made this event truly impactful. Speakers presented first-hand experiences and lessons learned from Big Data early-adopter organizations.

Dr. Mayank Bawa (Co-President, R&D Labs, Teradata Aster) set the tone for the conference with an excellent keynote address. ‘Why is there such excitement around Big Data analytics in the current environment?’ and ‘How are Big Data services and data sciences unique?’ were the two questions that framed his remarks. His presentation answered them marvelously with real-life use cases in two broad categories:

• “All kinds of data in a single platform”
• “All kinds of Analytics in a single platform”

To underscore these points, he presented various applications of Big Data solutions such as ‘Market Basket Analysis’, ‘Telecom Churn Analysis’, and ‘Predictive Modeling in the Insurance Domain’. Some of the interesting takeaways, challenges and open questions were as follows:

– How will technology progress from the current state to a unified architecture?
– Every company is focused on building a platform that brings silos of data together and facilitates seamless dataflow across systems
– Empowering data science and improving analytical algorithms
– Relational vs. NoSQL:
– Is there a need to build an SQL layer over NoSQL?
– Does it add any business value?
– The vision of Oracle and Teradata in bringing relational and NoSQL together

How to Make Big Money from Big Data? – Sourabh Satish, Security Technologies & Response, CTO’s Office, Symantec

While Dr. Bawa presented the motivation to build a unified architecture with better analytics, Sourabh Satish of Security Technologies and Response at Symantec explored the advances offered by Big Data in the security domain. He demonstrated some of the security tools built at Symantec and illustrated how three fields – Big Data, data science and domain expertise – can be leveraged together to build an application.

Hadoop: A Foundation for Change – Milind Bhandarkar, Chief Scientist, Pivotal

If I had to design a metric for the most relevant and valuable presentation of the conference, then without a doubt the gold standard would be set by Milind Bhandarkar, Chief Scientist of Pivotal. Mr. Bhandarkar talked about the evolution of analytics and big data, which he characterized as three distinct stages:

• Source Systems + ETL + EDW + Visualization
• Source Systems + Hadoop & MapReduce + EDW + Visualization
• Hadoop and its ecosystem

He went on to explore several key issues in the Big Data field:

• BI vs. Big Data, and the future
• Big Data’s journey from batch processing to interactive processing – is interactive processing possible?
• Hadoop as a service?
• Applications as a service?
• Cloudera Impala bypassing MapReduce
• The myth around how huge the data volumes actually used in analytics queries are (at Yahoo, Microsoft)

Why Hadoop is the New Infrastructure for the CMO (they may not know it yet!) – Gail Ennis, CEO, Karmasphere

Gail Ennis talked about the business use cases driving demand for Big Data in today’s rapidly changing world, the journey of technology from BI to Big Data (predictive insights), and the potential of predictive analytics in marketing and product development.

Insights from Big Data: How-to? –Panel Discussion

Jim Blomo, Engineering Manager Data-Mining, Yelp
David P. Mariani, VP Engineering, Klout

The frank and energetic interaction between Blomo and Mariani offered some of the most interesting discussion of the conference, covering topics such as:

• How do you identify whether a given problem is a BI analytics problem or a Big Data problem?
• Is the existing BI framework still needed? How does Big Data evolve into interactive BI?
• How can a company form a data science group, and what was their journey in building the team?
• What qualities did they look for when selecting data scientists, given that “Data Scientist” is not a well-defined role across the industry?
• Evolving technologies in data science and Big Data (Hive vs. Cloudera Impala vs. Shark)
• Is ETL on the fly possible?
• Yelp and its work in data science
• Academic education vs. practical experience: which helps more in becoming a great data scientist?

Rhaghav Karnam manages the Big Data Scientists group at Bodhtree, focusing on Big Data for customers in the High Tech, Manufacturing, Life Sciences/Pharmaceuticals and government industries. Bodhtree enables its customers and partners to achieve business transformation through Big Data and social-mining solutions laser-focused on measurable business objectives.

Balance your Supply Chain with Big Data

Let’s start by going back…way back, from a tech perspective. In the 1840s, Samuel Finley Breese Morse, the American co-inventor of Morse code, envisioned laying cable across the Atlantic to enable telegraphic communication from the US to Europe. The business-benefit metric of the solution was a reduction in message transmission time from 10 days to only a few minutes. With this massive return, the initiative would seem like a ‘no brainer’ from today’s perspective, where communication happens at millisecond speeds from your cell phone; believe it or not, the question commonly asked then was ‘Do we really need communication so fast?’ The project ultimately took over 18 years to complete, until US president James Buchanan finally conversed with Queen Victoria over the transatlantic cable, demonstrating the first business benefit. Let us call this the ‘Paradigm Shift Period’ for communication. Modern businesses now rely on instant communication across the world, with voice and data transfers occurring at lightning speed. People, processes and technologies within business have all evolved to conform to this new paradigm of global data interconnection.

In fact, the original challenge has now come full circle. Business and government have become so efficient at capturing and transmitting data that getting the data is no longer the core issue. The challenge and opportunity now lie in processing and interpreting the terabytes, even petabytes, of available structured and unstructured data to influence effective business strategy.

The chances are that you’ve been bombarded with Big Data buzz over the last year. But in spite of all the noise, you’ve probably noticed that few of these descriptions contain focused business use cases for applying Big Data technologies. I am the first to acknowledge and agree with Gartner research that Big Data analytics is riding a hype cycle that will likely peak sometime in 2013. Between now and then, a lot of mind share will go into figuring out whether there is value for your domain, your industry and your job. If you work in supply chain, irrespective of the industry, continue reading to understand how Big Data is expected to bring both direct and indirect impact. Some of these reverberations may fundamentally change the nature of the duties performed in supply chain jobs. In 2010 we witnessed a ‘Paradigm Shift Period’ for Big Data analytics, with major players like SAP announcing the next generation of real-time analytics, as many asked a question similar to the one posed 170 years earlier: ‘Do we really need analytics so fast?’ SAP is now seeing its HANA analytics customer base grow rapidly, as are other big players like Oracle. We are witnessing an epic shift in supply chain data analytics that will make the approaches of the last decade seem antiquated.

The Supply Chain Domain

The core of any supply chain strategy is maintaining an appropriate balance between the supply and respective demand. Every other related model, including the well-known JIT (Just in Time), really targets the same goal with different degrees of precision and timeliness. Every time you enter the car repair shop and the mechanic mentions a part will take X days to order, you get a prime, though frustrating, example of a supply-demand imbalance. It is every organization’s goal to maintain a supply-demand balance by optimizing cost and quality with operational efficiencies.

On a much larger scale, I have observed operations at a $40B Hi Tech manufacturer where maintaining the supply-demand balance is a far more complex proposition. Every day, employees and partners in this supply chain ecosystem are trying to find answers to key supply chain questions, but their view is constrained to only a piece of the picture, since reports rely primarily on structured data. How fast a person can get accurate and relevant information has a significant impact on the growth, profitability and productivity of the supply chain function.

The following are some ballpark metrics for the annual activities involved in keeping supply aligned with constant variation in market demand:

Does this look ugly? It is. But think about what these numbers will be after data volumes grow 16X by 2016.

It’s a category 5 hurricane of data.

All of the above communication relates to one or more of the following four areas: assessing demand, assessing supply, fulfilment of demand, and delivery of the product/service. The efficiency and success of these activities can be tracked through metrics such as lead-time variance, forecast inaccuracy, on-time shipments and quality metrics, to name a few.

Big Data for Supply Chain

Now, let us bring Big Data into the picture and see how this outlook changes. A Big Data problem exists when data Volume, Velocity and Variety make it difficult or impossible to store, process and analyse the data using traditional technologies and methods. With Big Data technologies, the capability to find answers faster and cheaper has grown exponentially.

While we predict 16X growth in data volumes in just a few years, the human ability to comprehend data does not keep pace. From the perspective of people, processes and technology within supply chain management, improvements will need to catch up as you implement Big Data technologies. The probability is high that Big Data technologies will play a key role in handling your rapid data expansion, so gear up your people and processes to match the potential of these technological innovations. Within the broad range of supply chain roles, let us consider the role of the planner and see how his or her activities change from today’s traditional technologies to the Big Data technologies of tomorrow.

Key supply chain functions: today (traditional technologies) vs. tomorrow (Big Data technologies):

• Forecasting – Today: running reports and analysis on a daily basis (reports alone can take hours to produce). Tomorrow: forecasting with real-time dashboards, eliminating the concept of running reports; data is ready at lightning speed, with the capability to capture snapshots of the analysis.
• Demand Planning – Today: mostly human-fed structured data. Tomorrow: demand planning using structured and unstructured data (e.g. web clickstreams, Facebook likes, Twitter feeds, customer reviews, news article mentions).
• Supply Planning – Today: traditional reports and email communication. Tomorrow: supply planning using real-time data with deep insight into news about vendors and partners.
• Fulfilment & Delivery – Today: tracked through workflows and report statuses. Tomorrow: proactive delivery tracking to predict possible delays and correlate interdependent events.
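As a rough sketch of demand planning that blends structured and unstructured inputs: a structured baseline forecast nudged by a crude sentiment signal mined from text such as reviews or tweets. The word lists, weighting and numbers below are invented for illustration; a production system would use a proper NLP pipeline.

```python
# Hypothetical sketch: adjust a structured demand forecast using a crude
# sentiment signal from unstructured text. All values are made up.
POSITIVE = {"love", "great", "recommend"}
NEGATIVE = {"broken", "refund", "terrible"}

def sentiment_score(posts):
    """Positive-minus-negative word fraction, in [-1, 1]; 0 if no signal."""
    pos = neg = 0
    for post in posts:
        for raw in post.lower().split():
            word = raw.strip(".,!?")
            if word in POSITIVE:
                pos += 1
            elif word in NEGATIVE:
                neg += 1
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

def adjusted_forecast(baseline_units, posts, weight=0.1):
    """Nudge the structured forecast by up to +/- weight based on sentiment."""
    return baseline_units * (1 + weight * sentiment_score(posts))

posts = ["Love this phone, great battery", "Screen arrived broken"]
print(adjusted_forecast(10_000, posts))  # nudged up from the 10,000-unit baseline
```

The design choice worth noting is the cap: sentiment only moves the baseline by at most ±10%, so a noisy unstructured signal refines rather than overrides the structured plan.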

There is a fundamental shift from planners reading the data and recommending changes to the machine recommending changes and planners managing the exceptions. This has been the goal of many organizations for the last decade, but recent Big Data technology innovations represent quantum-leap advances toward true strategy automation.

The traditional model makes local copies of data, which the planner edits and writes back. The read/write process might take anywhere from seconds to many hours depending on the task. With Big Data, the turnaround becomes milliseconds. The natural reaction is, “Do I really need information flow that fast?” The important question is not how fast the information flows, but how quickly you can change your decision from A to B, capturing a time-sensitive opportunity or averting a major cost. In the current model, cancelling the wrong work order or failing to consider all available information can mean a poor decision. Visualize planners viewing all the information they want to see in real time while the competition is still updating data and processing reports.

Bringing the Supply Chain Contacts, Content and Context Together for Decisions

The most critical factor for effective corporate decisions is bringing the contacts, content and context closer to each other. Consider, for example, a supply chain company that knows a part defect could affect assembly, which could in turn delay customer delivery and eventually affect services. Predicting the occurrence of defects well in advance through analysis of historical Big Data has huge ROI potential, by enabling appropriate adjustments to every event in this chain. Additionally, with Big Data recommending related content and relaying all of this to the right contacts, the result is direct ROI in the form of improved quality metrics, increased customer satisfaction and reduced maintenance costs for part replacement.

Today’s Big Data technologies have the capability to demonstrate how in the automobile industry an alternator part data sheet (Content) can be analysed against all cars sold (Contacts) and reveal the root cause for battery replacements (Context), an issue which has cost the company millions of dollars in repair services. Similar examples can be found in many Big Data technology use cases across industry verticals.

All of these scenarios are primarily connecting the 3Cs, the Contacts (e.g. Customer information or internal employee) and Content (Use case specific information e.g. Battery failure) with Context (How a battery replacement is due to alternator failure).
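The alternator example can be sketched as a simple join across the 3Cs: service records (Content) matched against cars sold (Contacts) to surface a likely root cause (Context). Every record below is fabricated for illustration.

```python
# Toy 3Cs join: which part most often co-occurs with a symptom?
# All data is fabricated for illustration.
from collections import Counter

# Content: service records listing the symptom and the parts involved.
service_records = [
    {"vin": "V1", "symptom": "battery replaced", "parts": ["battery", "alternator"]},
    {"vin": "V2", "symptom": "battery replaced", "parts": ["battery", "alternator"]},
    {"vin": "V3", "symptom": "battery replaced", "parts": ["battery"]},
    {"vin": "V4", "symptom": "flat tire", "parts": ["tire"]},
]

# Contacts: cars sold, keyed by VIN.
cars_sold = {"V1": "Model A", "V2": "Model A", "V3": "Model B", "V4": "Model A"}

# Context: among sold cars with battery replacements, which other part
# co-occurs most often? A crude root-cause signal.
co_parts = Counter()
for rec in service_records:
    if rec["symptom"] == "battery replaced" and rec["vin"] in cars_sold:
        for part in rec["parts"]:
            if part != "battery":
                co_parts[part] += 1

print(co_parts.most_common(1))  # -> [('alternator', 2)]
```

The same counting-by-co-occurrence pattern scales up (on Hadoop or a similar platform) to millions of records; the toy version just shows the shape of the join.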

With traditional technologies, much of a planner’s time is spent searching for information across multiple tools, reports and manual communication. One gauge of an effective Big Data implementation is reducing the number of reports to one tenth of the current volume. Let the machines do the job of relating and correlating the huge flow of information, and put the planner in the command seat to review recommendations and approve or reject them. This directly increases the planner’s productivity, as he or she can focus on reviewing recommendations rather than searching for information.

Where to Start

All of this means you first need to conduct an assessment of your supply chain ecosystem with a specific use case in mind to which Big Data technologies will be applied. The specific area targeted for improvement may be forecast inaccuracy, which in today’s model relies mostly on structured data combined with massive exchanges of manual communication, ignoring much of the available market feedback (unstructured data). Measure the baseline and set realistic targets. Traditional forecast/demand planning fundamentally relies on a set of numbers entered by internal and external users. It does not factor in Big Data elements such as sentiment analysis of the market and internal/external unstructured communication (e.g. blogs, chats, tweets, customer reviews). When this unstructured information is correlated with structured data, new insights arise, prompting better decisions. Empirical research suggests that a 1% improvement in forecasting drives multi-fold improvements across your entire supply chain. Upon realizing these early Big Data benefits, you can expand to broader supply chain use cases.


Now, where do you initiate the change and get a quick ROI? Our recommendation is to pick the top five supply chain reports you run on your traditional BI and analytics platform, analyse them, and assess whether Big Data technologies would bring improved results. Consider the dimensions of accuracy, precision and timeliness. For example, forecasting traditionally depends on sales, BUs or operations entering their forecasts and arriving at some form of consensus. Inherent forecast inaccuracies exist, which are mitigated by a continuous improvement process. With Big Data, you start feeding unstructured market information into the analysis, casting more light on external reactions to your product. This insight provides early indications of demand variations, allowing for corrections to forecasts.


The fundamental disruption of our supply chain ecosystem has begun, with Big Data technology capabilities impacting people, process and technology. Faster, better and cheaper processing of Big Data will drive improvements in people’s behaviour and actions, bringing improved supply/demand balance. Similarly, process improvements learned from supply-chain-driven industries (e.g. automobile) will flow into other industries like Hi Tech and healthcare. The traditional daily job of a supply chain employee who reads and writes Content, relating it to a Context while working with a set of Contacts, will change dramatically. Human-driven searching will fundamentally shift to machine-driven searching that maps relevant information and makes recommendations for faster decision making. Get started with a use case that can be easily measured for ROI realization, then use this success as a launch pad to expand Big Data insights across the organization.
