MODEL OF SERVER POOL FOR ESTIMATION OF ENERGY CONSUMPTION IN BIG DATA PROCESSING
DOI:
https://doi.org/10.53920/ITS-2021-1-4
Keywords:
big data, cluster, server pool, Markov model, power consumption
Abstract
Platforms for organizing big data processing systems are considered. The deployment, use, architecture, and capabilities of Apache Spark in the Azure cloud are described, and the components of an Apache Spark cluster in Azure HDInsight are examined. Three types of cluster manager are distinguished: Apache Mesos, Apache Hadoop YARN, and Spark itself. A general model of task servicing in the Spark cluster is given, which makes it possible to estimate the probability of task failure, the server component of the SparkContext response delay, and the energy consumption of the architecture components. The model considers three types of resource groups: hot, warm, and cold pools of physical servers. A stochastic model of a physical server in the hot pool is constructed in the form of a Markov graph. Formulas for calculating the total average power consumption of a physical server are given.
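
As an illustration of the kind of calculation the abstract refers to, the following minimal sketch computes the average power consumption of a single server from the stationary distribution of a continuous-time Markov chain. The abstract does not give the paper's state graph or formulas, so everything concrete here is an assumption made for illustration: the three states (idle, busy, repair), the transition rates, the per-state power values, and the function names `stationary_distribution` and `average_power` are all hypothetical.

```python
from typing import List


def stationary_distribution(Q: List[List[float]]) -> List[float]:
    """Solve pi * Q = 0 with sum(pi) = 1 for a generator matrix Q."""
    n = len(Q)
    # Transpose Q so that the unknowns form a column vector,
    # then replace the last balance equation by the normalization.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    b = [0.0] * n
    A[n - 1] = [1.0] * n
    b[n - 1] = 1.0
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * c for a, c in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(n)]


def average_power(pi: List[float], power_w: List[float]) -> float:
    """Average power as the probability-weighted sum of per-state power."""
    return sum(p * w for p, w in zip(pi, power_w))


# Hypothetical states: 0 = idle, 1 = busy, 2 = repair.
# Hypothetical rates (per unit time): idle->busy 2, busy->idle 3,
# busy->repair 1, repair->idle 4. Each row of Q sums to zero.
Q = [
    [-2.0, 2.0, 0.0],
    [3.0, -4.0, 1.0],
    [4.0, 0.0, -4.0],
]
POWER_W = [100.0, 250.0, 50.0]  # hypothetical power draw per state, in watts

pi = stationary_distribution(Q)
mean_power = average_power(pi, POWER_W)
print("stationary probabilities:", pi)
print("average power, W:", mean_power)
```

With these assumed rates the balance equations give stationary probabilities 8/13, 4/13, and 1/13, so the average power is 1850/13, roughly 142.3 W. A model of a whole pool would presumably combine such per-server values, but how the paper does this is not stated in the abstract.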