Adaptive Resource Management for Distributed Data Analytics

Thamsen, Lauritz; Renner, Thomas; Verbitskiy, Ilya; Kao, Odej

doi:10.3233/978-1-61499-882-2-155

Abstract

Increasingly large datasets make scalable, distributed data analytics necessary. Frameworks such as Spark and Flink help users utilize cluster resources efficiently for their data analytics jobs. It is, however, usually difficult to anticipate the runtime behavior and resource demands of these distributed jobs, even though many resource management decisions would benefit from such information.

Addressing this general problem, this chapter presents our vision of adaptive resource management and reviews recent work in this area. The key idea is that workloads should be monitored for trends, patterns, and recurring jobs. These monitoring statistics should then be analyzed and used to calibrate cluster resource management to the actual workload. We motivate the idea, introduce a general system architecture, and review specific adaptive techniques for data placement, resource allocation, and job scheduling in the context of this architecture.

IOS Press Copyright 2024