High-performance computing (HPC) is one of the most important and fundamental infrastructures for scientific progress across all disciplines, and it has advanced enormously in response to the growing need to solve complex problems. HPC was expected to reach exascale capability, at least one exaFLOPS, by 2020. Compared with the first petascale computer, which came into operation in 2008, this capacity represents a thousand-fold increase.
Big data is the term used to refer to data sets so large and complex that traditional data-processing applications are inadequate to handle them. Its main challenges include gathering and capture, data cleaning and curation, search, sharing, storage, transfer, analysis, visualization, and information privacy. Recent years have witnessed a flood of network data driven by sensors, the Internet of Things (IoT), emerging online social media, cameras, machine-to-machine (M2M) communications, mobile sensing, user-generated video content, and global-scale communications, all of which have ushered in the era of big data. Processing big data requires vast amounts of storage and computing resources, and this makes it an exciting time for practitioners in HPC systems and big data.
HPC facilitates the processing of big data, but tremendous research challenges have emerged in recent years, including the scalability of computing performance for high-velocity, high-variety, and high-volume big data; deep learning with massive-scale datasets; big data programming paradigms on multi-core and GPU-hybrid distributed environments; and unstructured data processing with high-performance computing. The tools and cultures of high-performance computing and big data analytics have diverged, to the detriment of both, and it is clear that they need to come together to effectively address a range of major research areas.
The question now is: what is the biggest technical challenge in advancing HPC beyond exascale capability for numeric and big data applications? Many challenges lie ahead, including system power consumption and environmentally friendly cooling, massive parallelism and component failures, data and transaction consistency, metadata and ontology management, precision and recall at scale, and multidisciplinary data fusion and preservation. All of these challenges must ultimately be solved to achieve a computing platform that is usable by and accessible to researchers from a wide range of disciplines who may not themselves be computer experts.
Reflecting the importance of this field, the TopHPC2017 congress on Advances in High-Performance Computing and Big Data Analytics in the Exascale Era was held in Tehran, Iran, and selected papers from the congress are gathered together in this book.