Featured Tutorial: Professor H. J. Siegel
George T. Abell Endowed Chair Distinguished Professor of Electrical and Computer Engineering & Professor of Computer Science
Colorado State University
Fort Collins, CO 80523-1373, USA
Date & Time: July 25 (Monday), 2016; 05:40pm - 08:30pm
Location: Sterling AB Room
Scientists and engineers always want faster computers, and faster computers generally require more energy. With rising energy costs, there is an urgent need for energy-efficient computing at many different levels. This tutorial focuses on energy-aware resource management in heterogeneous parallel and distributed computing systems. We address the problem of assigning serial and parallel tasks to machines in a heterogeneous computing environment: a collection of machines with different computational capabilities and energy-usage characteristics. These machines execute a workload composed of different tasks with diverse computational and energy requirements. The execution time and energy consumption of each task on each machine depend on how the task’s computational requirements interact with the machine’s capabilities.
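As a concrete illustration of this model, the sketch below uses a hypothetical estimated-time-to-compute (ETC) matrix and hypothetical machine power values to show that the fastest machine for a task need not be the most energy-efficient one. All numbers and names are invented for illustration, not taken from the tutorial.

```python
# Hypothetical heterogeneous computing model: execution time and energy
# consumption depend on which task runs on which machine.

# etc[t][m] = estimated execution time of task t on machine m (seconds)
etc = [
    [10.0, 4.0, 7.0],   # task 0
    [3.0, 12.0, 5.0],   # task 1
]

# apc[m] = average power consumption of machine m while computing (watts)
apc = [20.0, 60.0, 30.0]

def energy(task, machine):
    """Energy (joules) to run `task` on `machine`: time * power."""
    return etc[task][machine] * apc[machine]

# The fastest machine for a task is not necessarily the most
# energy-efficient one -- the core tension the tutorial explores.
for t in range(len(etc)):
    fastest = min(range(3), key=lambda m: etc[t][m])
    greenest = min(range(3), key=lambda m: energy(t, m))
    print(f"task {t}: fastest on machine {fastest}, "
          f"least energy on machine {greenest}")
```

For task 0 in this made-up instance, machine 1 is fastest (4 s) but machine 0 uses the least energy (200 J vs. 240 J), so time-optimal and energy-optimal choices diverge.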
In heterogeneous parallel and distributed computing systems, a critical research problem is energy-aware allocation of resources to tasks to optimize some performance objective, possibly under a given constraint. Often, these allocation decisions must be made when there is uncertainty in relevant system parameters, such as the data-dependent execution time of a given task on a given machine. It is important for system performance to be robust against uncertainty. We have designed models for defining, deriving, and quantifying the degree of robustness of a resource allocation using history-based stochastic (probabilistic) information about the execution times of tasks on different machines.
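The history-based stochastic idea can be sketched as follows: treat a task's past execution times on a machine as samples of a random variable, and quantify robustness as the probability of meeting a deadline under the resulting empirical distribution. The sample data and deadline below are assumptions for illustration, not from the tutorial.

```python
# Minimal sketch of history-based stochastic robustness: build an
# empirical distribution from observed runtimes, then compute the
# probability that a deadline is met. All numbers are illustrative.
from collections import Counter

history = [4.1, 3.9, 4.0, 5.2, 4.1, 6.0, 3.9, 4.0]  # past runtimes (s)

def empirical_pmf(samples):
    """Probability mass function estimated from historical samples."""
    n = len(samples)
    return {v: c / n for v, c in Counter(samples).items()}

def prob_meets_deadline(pmf, deadline):
    """P(execution time <= deadline) under the empirical distribution."""
    return sum(p for v, p in pmf.items() if v <= deadline)

pmf = empirical_pmf(history)
print(prob_meets_deadline(pmf, 4.5))  # 6 of 8 samples meet 4.5 s -> 0.75
```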
Energy-aware resource allocation heuristics for several example environments will be presented. The first two environments involve dynamic heuristics, which are executed on-line for situations where tasks must be assigned resources as they arrive into the system. In the first of these environments, each serial task has an associated utility function that is monotonically decreasing over time. This utility function represents the value of a serial task based on the task’s completion time, and the goal of the heuristics is to maximize the total utility earned from all serial task completions over an interval of time while satisfying an energy constraint. The second dynamic environment is similar to the first, except that parallel tasks are considered.
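One way such a dynamic heuristic might look (a simplified sketch, not the tutorial's actual heuristics): each arriving task is greedily assigned to the feasible machine that earns the most utility, where utility decays exponentially with completion time and an assignment is feasible only if it fits the remaining energy budget. Machine parameters, the decay rate, and the budget are all hypothetical.

```python
# Hypothetical on-line heuristic: assign each arriving serial task to the
# machine maximizing earned utility without exhausting the energy budget.
import math

machines = [
    {"speed": 1.0, "power": 20.0, "free_at": 0.0},  # slow but frugal
    {"speed": 3.0, "power": 90.0, "free_at": 0.0},  # fast but power-hungry
]
energy_budget = 2000.0  # joules remaining

def utility(completion_time, initial=100.0, decay=0.05):
    """Monotonically decreasing utility: worth less the later it finishes."""
    return initial * math.exp(-decay * completion_time)

def assign(task_work, arrival, budget):
    """Pick the feasible machine maximizing earned utility.

    Returns (machine index, utility earned, energy used), or None if no
    machine fits the remaining budget.  (A full heuristic would also
    update the machine's free_at time and the budget after assignment.)
    """
    best = None
    for i, m in enumerate(machines):
        run_time = task_work / m["speed"]
        finish = max(arrival, m["free_at"]) + run_time
        cost = run_time * m["power"]
        if cost > budget:
            continue  # would violate the energy constraint
        u = utility(finish)
        if best is None or u > best[1]:
            best = (i, u, cost)
    return best

choice = assign(task_work=30.0, arrival=0.0, budget=energy_budget)
print(choice)  # the fast machine wins here despite its higher energy cost
```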
We provide an analysis framework that allows a system administrator to investigate the tradeoffs between minimizing system energy consumption and maximizing the computing performance (utility) achieved by a system, which are typically conflicting goals. This can be modeled as a bi-objective optimization problem. We present a method to create a set of different serial-task resource allocations that illustrate the tradeoffs.
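The bi-objective tradeoff analysis can be illustrated by enumerating every allocation of a small task set to two machines and keeping only the Pareto-optimal (non-dominated) points; here makespan stands in for computing performance, and every parameter is invented for illustration.

```python
# Sketch: exhaustively enumerate assignments of three tasks to two
# hypothetical machines and extract the Pareto front of
# (total energy, makespan) points, where lower is better in both.
from itertools import product

etc = [[8.0, 3.0], [6.0, 2.0], [4.0, 1.5]]  # etc[task][machine], seconds
power = [15.0, 70.0]                         # watts per machine

def evaluate(assignment):
    """Return (total energy, makespan) of one task->machine assignment."""
    load = [0.0, 0.0]
    energy = 0.0
    for task, m in enumerate(assignment):
        load[m] += etc[task][m]
        energy += etc[task][m] * power[m]
    return energy, max(load)

points = {evaluate(a) for a in product(range(2), repeat=3)}
# Keep only points not dominated in both objectives.
front = sorted(p for p in points
               if not any(q[0] <= p[0] and q[1] <= p[1] and q != p
                          for q in points))
print(front)
```

Each point on the resulting front is a distinct allocation: spending more energy buys a smaller makespan, which is exactly the tradeoff curve a system administrator would inspect.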
The last two environments involve static heuristics, which are executed off-line, where a collection of independent tasks (a “bag-of-tasks”) is to be assigned to machines in a heterogeneous computing system. We assume that the stochastic execution times of the tasks are uncertain. In this study, we define “makespan-robustness” as the probability that a makespan deadline (the time to complete all of the tasks) is not violated, and “energy-robustness” as the probability that the energy budget is not violated. Typically, a smaller makespan requires more energy. For the first of these static allocation problems, the goal is to design resource allocation heuristics that maximize makespan-robustness while maintaining a specified energy-robustness constraint. For the second, we design heuristics to maximize energy-robustness while maintaining a specified makespan-robustness constraint.
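Under an assumed model where each task's execution time is an independent Gaussian, both robustness metrics can be estimated by Monte Carlo simulation, as in this hypothetical sketch (distributions, deadline, and budget are illustrative, not from the study):

```python
# Monte Carlo estimate of makespan-robustness and energy-robustness for
# a fixed allocation: one task per machine, all starting at time 0.
import random

random.seed(0)

# (mean runtime, runtime stddev, assigned machine's power in watts)
tasks = [(5.0, 1.0, 20.0), (8.0, 2.0, 45.0), (6.0, 1.5, 30.0)]
deadline, budget = 12.0, 800.0
trials = 100_000

meet_deadline = meet_budget = 0
for _ in range(trials):
    times = [max(0.0, random.gauss(mu, sd)) for mu, sd, _ in tasks]
    makespan = max(times)  # tasks run in parallel, one per machine
    energy = sum(t * p for t, (_, _, p) in zip(times, tasks))
    meet_deadline += makespan <= deadline
    meet_budget += energy <= budget

print(f"makespan-robustness ~ {meet_deadline / trials:.3f}")
print(f"energy-robustness   ~ {meet_budget / trials:.3f}")
```

Tightening the deadline or the budget lowers the corresponding robustness estimate, which is what the constrained maximization problems above trade off against each other.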
The resource management approaches presented can be applied to a variety of computing and communication system environments, including parallel, distributed, cluster, grid, Internet, cloud, embedded, multicore, content distribution networks, wireless networks, and sensor networks. Furthermore, the approaches can be used with many different system performance metrics and constraints.
This course will enable you to:
- understand the problem of energy-aware resource allocation in heterogeneous parallel and distributed computing systems
- develop and use robustness metrics to quantify the robustness of a particular resource allocation for a given computational environment
- design energy-aware resource allocation heuristics that may incorporate robustness, for both static (off-line) and dynamic (on-line) environments
- create energy-aware resource management methods that use system energy consumption as a performance measure or constraint
- learn how to use bi-objective optimization to derive sets of resource allocation solutions that can be used to analyze the tradeoffs between the conflicting goals of minimizing energy consumption and optimizing system computing performance
This course is intended for faculty, graduate students, engineers, and scientists who want to learn how to model and manage resources in parallel and distributed computing systems (including clusters and clouds) in an energy-aware way. In particular, energy can be used as a constraint when optimizing a system’s computing performance metric, or energy can be optimized while meeting a computing performance constraint.
Biography of the presenter
H. J. Siegel is the George T. Abell Endowed Chair Distinguished Professor of Electrical and Computer Engineering at Colorado State University (CSU), where he is also a Professor of Computer Science. From 2002 to 2013, he was the founding Director of the CSU Information Science and Technology Center (ISTeC), a university-wide organization for enhancing CSU’s activities pertaining to the design and innovative application of computer, communication, and information systems. Before joining CSU, he was a Professor at Purdue University from 1976 to 2001. He received two B.S. degrees from the Massachusetts Institute of Technology (MIT), and the M.A., M.S.E., and Ph.D. degrees from Princeton University. He is a Fellow of the IEEE and a Fellow of the ACM. Prof. Siegel has co-authored over 430 published technical papers in the areas of parallel and distributed computing and communications, which have been cited over 14,000 times. He was a Coeditor-in-Chief of the Journal of Parallel and Distributed Computing, and was on the Editorial Boards of the IEEE Transactions on Parallel and Distributed Systems and the IEEE Transactions on Computers.
For more information, please see www.engr.colostate.edu/~hj.