The world today is engulfed in a deluge of data in many formats, generated by innumerable sources such as mobile phones, social media, digital platforms, scientific experiments and enterprise applications. This huge volume of unstructured and semi-structured data, arriving from varied sources and in different formats, is termed “Big Data”. The emerging trillion-sensor world will add further to the explosive growth of Big Data. According to Gartner, “Big data is high volume, high velocity, and high variety information assets that require new forms of processing to enable enhanced decision making, insight, discovery, and process optimization”. Big Data is thus creating opportunities for effective decision making across many domains and organizations.
For many years, Relational Database Management Systems (RDBMS) have been the comprehensive solution for storing, processing and analyzing data. Traditional database management techniques, however, are no longer sufficient to capture, store, process, analyze and visualize Big Data. This raises a question: which limitations of traditional database systems make them unable to handle Big Data? This chapter discusses those limitations, which have made it essential to shift towards breakthrough big data technologies and techniques in a world drowning in a “data tsunami”, and it elaborates the reasons for this paradigm shift from traditional database processing to big data processing technologies.
This chapter also discusses the big data analysis, processing and visualization techniques and technologies that have changed the field. Initially, large firms such as Google developed technologies like MapReduce, the Google File System and Bigtable to meet their own data needs. Eventually, most organizations, and even governments, began adopting big data solutions to improve their decision making by generating value from data trends. As a result, the development of big data techniques has risen sharply. This chapter covers major breakthrough technologies such as MapReduce, HadoopDB, Cassandra, the Chubby lock service, PLATFORA, SkyTree, Dremel, Pregel, Spanner, Shark, Megastore, Spark, F1, MLBase, NoSQL databases, HBase, HDFS, YARN, Mahout and Chukwa. Big data technologies find applications in domains including IT, government, defence, manufacturing, earth sciences, healthcare, agriculture, education, the media industry, retail, real estate, sports, and science and research activities such as the Large Hadron Collider and astronomy. The chapter also discusses the application of big data techniques and technologies in these domains to emphasize their importance in a world that increasingly relies on data-driven decision making.