Trace-Based Runtime Prediction of Reoccurring Data-Parallel Processing Jobs

Master's Thesis from the year 2021 in the subject Engineering - Computer Engineering, grade: 1.7, Technical University of Berlin, language: English, abstract: The present research proposes a novel approach to estimating the runtime of incoming jobs based on their similarity to reoccurring jobs. To achieve this goal, we use recent neural network techniques to embed the job dependencies. Subsequently, we apply several clustering techniques to form meaningful groups of reoccurring jobs. Finally, based on the similarities within the groups of samples, we predict runtimes. A recently published trace dataset allows us to develop and evaluate our contribution on more than 200,000 complex, real-world jobs. Cloud data centers must handle numerous jobs with complex parallelization every day. Runtime prediction is critical for scheduling such a heavy, complicated workload and for achieving efficient resource utilization. Moreover, accurate runtime prediction may help cloud users choose their required resources more intelligently. Despite its importance, accurate runtime prediction is not straightforward, because the execution time of jobs in complex cloud environments is affected by many factors, e.g., cluster status and users' requirements.
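
The pipeline outlined in the abstract (embed jobs, group reoccurring jobs by clustering, predict a runtime from the matching group) can be illustrated with a minimal sketch. It is not the thesis's implementation. The neural embedding step is skipped and replaced by hand-made 2-D vectors, plain k-means stands in for the unspecified clustering techniques, and predicting the mean runtime of the nearest cluster is an assumed rule. All names (`kmeans`, `predict_runtime`, `embeddings`, `runtimes`) and all values are hypothetical.

```python
from math import dist
from statistics import mean


def kmeans(points, k, iterations=10):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [mean(axis) for axis in zip(*members)]
    return centroids, labels


def predict_runtime(embedding, centroids, labels, runtimes):
    """Predict the mean runtime of the nearest cluster (an assumed rule)."""
    nearest = min(range(len(centroids)),
                  key=lambda c: dist(embedding, centroids[c]))
    return mean(r for r, l in zip(runtimes, labels) if l == nearest)


# Hypothetical job embeddings and observed runtimes in seconds.
# In the thesis, embeddings come from a neural network over job dependencies.
embeddings = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9)]
runtimes = [100.0, 900.0, 120.0, 1100.0]

centroids, labels = kmeans(embeddings, k=2)

# An incoming job whose embedding lies close to the first group.
estimate = predict_runtime((0.1, 0.1), centroids, labels, runtimes)
print(f"estimated runtime: {estimate:.1f} s")
```

In this toy data the incoming job falls into the group of the two short jobs, so the estimate is the average of their runtimes. A real system would need a learned embedding, a principled choice of the number of clusters, and handling for jobs that resemble no known group.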