Dynamic scheduling algorithm for heterogeneous computers

5.1 Conclusion
This study developed a dynamic scheduling algorithm for heterogeneous computers that schedules applications based on a historical database of runtime values. By storing and using this historical information, a scheduler can assign applications to heterogeneous processors so that all available devices are used, which allows greater computational throughput than simply assigning all applications to one device (i.e., static scheduling). The dynamic scheduling mechanism schedules applications according to their order in the queue. If the runtime predictions are reasonably accurate, the applications finish earlier than they would if they had all been statically scheduled onto the GPU, or if each had been scheduled onto the device on which it runs fastest.
In this paper, we analysed the impact of the scheduling scheme on heterogeneous computing systems that use the CPU and the GPU together. The proposed dynamic scheduling considers the remaining execution time of the applications currently running on the CPU and the GPU. It also considers the expected execution time of the incoming application, estimated from the execution-time history. According to our simulations, the proposed scheduling scheme improves system performance by maximizing the resource utilization of the CPU and the GPU. Moreover, it provides more consistent performance than existing scheduling schemes across different execution orders of the applications. Because it reduces the execution time, the estimated-execution-time scheduling also shows the best energy efficiency. Therefore, we expect that the proposed scheduling can be a solution for improving both the performance and the energy efficiency of heterogeneous computing systems. One drawback of the proposed scheme is that it requires a training period to collect the execution history. Future work should investigate how this training period can be shortened.
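The decision rule summarized above can be sketched as follows. This is a minimal illustration rather than the implementation evaluated in the study: the class name `HistoryScheduler`, the use of the mean of past runs as the predictor, and the sample runtimes are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

DEVICES = ("cpu", "gpu")


class HistoryScheduler:
    """Assigns each queued application to the device expected to finish it first.

    Hypothetical sketch: the predictor (mean of past runs) is an assumption.
    """

    def __init__(self):
        # history[(app, device)] -> list of observed runtimes in seconds
        self.history = defaultdict(list)
        # busy_until[device] -> time at which the device's queued work ends
        self.busy_until = {d: 0.0 for d in DEVICES}

    def record(self, app, device, runtime):
        """Store one measured runtime (the training period fills this table)."""
        self.history[(app, device)].append(runtime)

    def predict(self, app, device):
        """Predicted runtime: mean of past runs, or None if never observed."""
        runs = self.history.get((app, device))
        return mean(runs) if runs else None

    def schedule(self, app, now=0.0):
        """Pick the device with the earliest estimated finish time."""
        best = None
        for device in DEVICES:
            predicted = self.predict(app, device)
            if predicted is None:
                continue  # no history yet for this (app, device) pair
            # Remaining time on the device plus expected time of the new app.
            finish = max(now, self.busy_until[device]) + predicted
            if best is None or finish < best[1]:
                best = (device, finish)
        if best is None:
            raise LookupError(f"no runtime history for {app!r}")
        device, finish = best
        self.busy_until[device] = finish
        return device, finish


# Made-up runtimes standing in for the historical database.
sched = HistoryScheduler()
sched.record("matmul", "cpu", 8.0)
sched.record("matmul", "gpu", 2.0)
sched.record("sort", "cpu", 3.0)
sched.record("sort", "gpu", 2.5)

# Applications are scheduled in queue order.
queue = ["matmul", "matmul", "sort", "matmul"]
plan = [sched.schedule(app) for app in queue]
makespan = max(finish for _, finish in plan)  # 6.0 with the numbers above
```

With these sample numbers, every application runs fastest on the GPU, so both the GPU-only static schedule and the fastest-device schedule take 2 + 2 + 2.5 + 2 = 8.5 seconds. The sketch instead places `sort` on the idle CPU and finishes the whole queue at 6.0 seconds.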
5.2 Future Strategy
In the field of HPC (High Performance Computing), the current hardware trend is to design multiprocessor architectures featuring heterogeneous technologies such as specialized coprocessors (e.g., Cell/BE) or data-parallel accelerators (e.g., GPUs). Approaching the theoretical performance of these architectures is a complex issue. Indeed, substantial efforts have already been devoted to efficiently offloading parts of the computations. However, designing an execution model that unifies all computing units and their associated embedded memory remains a major challenge.
We would like to design the architecture of a runtime system that provides a high-level unified execution model tightly coupled with an expressive data management library. The main goal of this model will be to give numerical kernel designers a convenient way to generate parallel tasks over heterogeneous hardware on the one hand, and to easily develop and tune powerful scheduling algorithms on the other.
5.3 The StarPU Runtime System
Each accelerator technology usually has its own specific execution model and its own interface for manipulating data (e.g., DMA on the Cell). Porting an application to a new platform therefore often boils down to rewriting a large part of the application, which severely impairs productivity. Writing portable code that runs on multiple targets is currently a major issue, especially if the application needs to exploit multiple accelerator technologies, possibly at the same time. To help tackle this issue, we will use a modified version of StarPU, a runtime layer that provides an interface unifying execution on accelerator technologies as well as multicore processors. Middle-layer tools (such as programming environments and HPC libraries) can build on top of StarPU, instead of directly using low-level offloading libraries, and so stay focused on their specific roles rather than having to handle the efficient simultaneous use of several offloading libraries. This allows programmers to make existing applications efficiently exploit different accelerators with limited effort.
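The idea of a unified execution interface can be illustrated with a toy model. The sketch below is not the StarPU API (StarPU is a C library); the `Codelet` and `Runtime` types, the `prefer_gpu` policy, and the kernels are hypothetical names chosen for the example. It only shows the shape of the abstraction: one kernel carries an implementation per device kind, the caller submits work without naming a device, and the scheduling policy is a replaceable function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Kernel = Callable[[List[float]], List[float]]


@dataclass
class Codelet:
    """One kernel with an implementation per device kind (hypothetical type)."""
    name: str
    impls: Dict[str, Kernel]


@dataclass
class Runtime:
    """Toy runtime: a pluggable policy picks the device for each task."""
    policy: Callable[[Codelet], str]
    log: List[str] = field(default_factory=list)

    def submit(self, codelet: Codelet, data: List[float]) -> List[float]:
        device = self.policy(codelet)
        if device not in codelet.impls:
            raise ValueError(f"{codelet.name} has no {device} implementation")
        self.log.append(f"{codelet.name}@{device}")
        return codelet.impls[device](data)


def scale_cpu(xs: List[float]) -> List[float]:
    # Plain Python loop standing in for a CPU kernel.
    return [2.0 * x for x in xs]


def scale_gpu(xs: List[float]) -> List[float]:
    # Stand-in for an accelerator kernel; a real one would offload the work.
    return list(map(lambda x: 2.0 * x, xs))


def prefer_gpu(codelet: Codelet) -> str:
    # Simplest possible policy: use the GPU whenever an implementation exists.
    return "gpu" if "gpu" in codelet.impls else "cpu"


scale = Codelet("scale", {"cpu": scale_cpu, "gpu": scale_gpu})
negate = Codelet("negate", {"cpu": lambda xs: [-x for x in xs]})

rt = Runtime(policy=prefer_gpu)
doubled = rt.submit(scale, [1.0, 2.0, 3.0])   # runs the GPU stand-in
negated = rt.submit(negate, [1.0])            # falls back to the CPU version
```

Because the policy is an ordinary function passed to the runtime, a history-based policy could replace `prefer_gpu` without touching the application code that submits the tasks.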
