Dynamic Voltage Scaling for Low-Power Hard Real-Time Systems: Models and Assumptions for Task Groups
In this section, we introduce the models and assumptions commonly used in hard real-time inter-task DVS. Of particular interest are those pertaining to task groups; with regard to the processor models, the same models and assumptions used for intra-task DVS remain valid.
To capture tasks and their interactions (communications), task graphs are preferred, especially in multiprocessor architectures, which require explicit modeling of communication channels. On single processors, shared memory is used for inter-task communication, which lends itself to simpler models. Often, tasks are even considered to be independent. In these cases, modeling systems using task sets will often suffice. In fact, task sets are the preferred model in classic real-time scheduling. From a real-time perspective, the most important parameters of a task ti in a set T = {ti}, i = 1,…,N, are:
1. Its rate or period, Ti. To simplify the problem, certain approaches assume that all tasks have the same period; in general, however, tasks are assumed to have different periods. In each period, a different task instance (job, denoted f) needs to execute. The point in time when a job becomes ready to execute is referred to as its arrival or release time, Ai. The task set hyperperiod H is the time interval after which the pattern of task arrivals repeats itself. Some approaches focus on aperiodic tasks, where the rates are unknown.
2. Its deadline, Di. In hard real-time systems, meeting deadlines is vital for the safety of the system. Occasionally, tasks are assumed to have deadlines equal to their periods (Di = Ti) or smaller than their periods (Di < Ti). The case Di > Ti is rather uncommon, and hard to handle even in classic real-time scheduling.
3. Its execution pattern. Unlike in classic real-time scheduling, this parameter is measured not in time but in clock cycles, as the same number of clock cycles takes a different amount of time depending on the processor speed. Apart from that, as in classic real-time scheduling, a task execution pattern can be:
(a) fixed, when it does not vary from instance to instance (Ci), or
(b) variable from instance to instance (Cij in instance j), between a best case BCEi and a worst case WCEi, usually modeled by a random variable Xi with an associated probability distribution hi.
4. Its switched capacitance, SWi. The switched capacitance is an important parameter affecting the power, and thus the energy, consumption. Often tasks are assumed to employ roughly the same processor resources, meaning that the switched capacitance remains constant from task to task and instance to instance. In this case, the switched capacitance can actually be omitted, as it does not have any effect on the processor-level scheduling decisions. Nevertheless, tasks and even instances can vary in the way they use the processor. A data-processing task may actually employ more resources (a larger switched capacitance) than a control task.
Additional parameters are occasionally used in some inter-task DVS approaches to model tasks, mainly to integrate classic real-time scheduling and current inter-task DVS scheduling more closely. However, these additional parameters do not bring anything new to the basic inter-task DVS principles. Together with the measures introduced above, a number of others taken from classic real-time theory will prove useful in dealing with DVS scheduling. The most important in this sense is processor utilization.
Definition 18.1 The processor utilization for a given task set {ti}, i = 1,…,N, is computed as U = Σi=1,…,N Ci/Ti, where Ci is the time needed to execute the task's clock cycles at the maximum (reference) clock frequency.
For tasks with a variable execution pattern, this measure usually refers to the worst-case utilization, for which Ci is replaced by the time needed to execute WCEi clock cycles.
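As a concrete illustration of Definition 18.1, the following sketch computes the worst-case utilization of a small task set. The task parameters and the 100 MHz reference frequency are invented for the example, not taken from the text.

```python
# A small helper illustrating Definition 18.1.  The task parameters and the
# 100 MHz reference frequency below are illustrative assumptions.

def utilization(tasks, f_max):
    """Worst-case processor utilization at the reference clock frequency.

    tasks: list of (cycles, period) pairs, with cycles the worst-case
           execution pattern in clock cycles and period in seconds.
    f_max: maximum (reference) clock frequency in Hz.
    """
    # Ci = cycles / f_max is the execution time at maximum speed.
    return sum((cycles / f_max) / period for cycles, period in tasks)

# Two tasks on a 100 MHz reference clock: 25e6 cycles every 1 s,
# and 50e6 cycles every 2 s.
u = utilization([(25e6, 1.0), (50e6, 2.0)], f_max=100e6)
print(u)  # prints 0.5
```

A set is schedulable under preemptive EDF at maximum speed exactly when U ≤ 1, which is why this measure reappears throughout the speed-scheduling techniques below.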
Static Scheduling
In static inter-task DVS, the schedule and task speeds are fixed off-line, before the system becomes operational. Normally, all the parameters required to compute an exact schedule need to be available. This is in fact one drawback of employing static scheduling, since for tasks with variable execution time one has to assume the worst case for all instances. For highly dynamic systems, where task execution can vary a lot compared to the worst case, fully static schedules are not very effective. Additionally, they usually need to cover a full hyperperiod, which may include an extremely large number of instances. Consequently, the amount of data these methods need to provide to the run-time system may be extremely large, as the speeds and exact start times for each instance have to be stored for run-time use. On the other hand, off-line techniques can afford to be computationally intensive, as they are applied before the system becomes operational, when the designer can spend more time finding a good schedule. Furthermore, the run-time overhead for static scheduling is minimal: no computations need to take place, since task speeds and start times are already decided.
Overview of Techniques
One of the earliest publications on DVS [46] presents an optimal algorithm for static speed scheduling of independent task sets in an EDF-like manner. Refs. [53,54] present similar algorithms for sets of tasks with fixed priorities. Starting from the same initial work, Ref. [51] presents a heuristic that accounts for speed transition overheads and discrete voltage levels. Ref. [7] improves upon these by presenting an optimal EDF DVS algorithm that accounts for discrete voltage levels and tasks with nonuniform load (switched capacitances).
In Ref. [47], a nonpreemptive inter-task DVS technique is presented for sets of independent tasks with known arrival times, deadlines, and execution times. Their heuristic has a two-phase scheduling algorithm at its core. In the first phase, a feasible schedule is found, assuming that all tasks run at the nominal voltage (or maximal speed). A priority function based on interval work demand is used to schedule the most constrained tasks first. The second phase attempts to iteratively lower each task speed until no further energy reduction occurs. As the objective function is randomized with a certain offset, the best solution out of a series of schedules is selected. The algorithm is claimed to have an O(n³) complexity for sets of n tasks. A similar algorithm, including switching activity as a task parameter, is presented in Ref. [48].
In Refs. [50,55], several off-line speed-scheduling techniques for minimizing power and energy consumption are presented. All start from valid real-time schedules (RM, EDF) and extend them with speed assignment. The speed of each task is gradually decreased to a minimum beyond which deadlines start to be violated. The algorithmic complexity of these methods depends on the number of iterations taken, and thus on the number of available speeds.
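The gradual speed-lowering idea can be sketched as follows. This is a minimal illustration, assuming an EDF-scheduled set with deadlines equal to periods, so that feasibility at scaled speeds can be checked with the utilization bound Σ(Ci/si)/Ti ≤ 1; the task parameters and speed levels in the usage example are invented.

```python
# A minimal sketch of gradual speed lowering over a valid EDF schedule,
# assuming Di = Ti so that the EDF utilization bound applies.  Task
# parameters and the discrete speed levels are illustrative assumptions.

def lower_speeds(tasks, speed_levels):
    """tasks: list of (C, T), with C the execution time at maximum speed.
    speed_levels: available normalized speeds, e.g. [0.25, 0.5, 0.75, 1.0]."""
    speeds = [1.0] * len(tasks)

    def feasible():
        # EDF schedulability at the scaled speeds: sum((C/s)/T) <= 1.
        return sum((c / speeds[i]) / t for i, (c, t) in enumerate(tasks)) <= 1.0

    changed = True
    while changed:
        changed = False
        for i in range(len(tasks)):
            lower = [s for s in speed_levels if s < speeds[i]]
            if not lower:
                continue                  # already at the lowest level
            old = speeds[i]
            speeds[i] = max(lower)        # try the next lower speed
            if feasible():
                changed = True            # keep the reduction
            else:
                speeds[i] = old           # a deadline would be missed; revert
    return speeds

print(lower_speeds([(1, 10), (1, 5)], [0.25, 0.5, 0.75, 1.0]))
# prints [0.25, 0.5]
```

The number of iterations, and hence the run time, is bounded by the number of tasks times the number of available speed levels, matching the complexity remark above.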
A Classic Preemptive Static Approach
The off-line scheduling algorithm presented in Ref. [46] has inspired many of the recent approaches, and it can therefore be considered a classic static speed-selection technique. The method finds optimal speeds for a set of jobs scheduled with the preemptive earliest-deadline-first strategy, by using a processor utilization-like measure. A few necessary definitions, adapted from Ref. [45], are given below, numbered from 18.2 onward.
Definition 18.2 Given a set of real-time jobs running on a variable speed processor, the processor demand on an interval [t1, t2) is the total number of clock cycles, W[t1,t2), required by the jobs fj with arrival time Aj ≥ t1 and deadline Dj ≤ t2.

In other words, it is the amount of computation, in clock cycles, required by the jobs that can execute after t1 and have to finish before moment t2.

Definition 18.3 Given a set of real-time jobs running on a variable speed processor, the loading factor on the interval [t1, t2) is the fraction of the interval needed to execute its jobs, that is, u[t1,t2) = W[t1,t2)/(t2 − t1).

Note that for any interval, the loading factor is in fact the clock frequency needed to carry out the jobs associated with that interval. Furthermore, these are minimal values, since if a lower clock speed is used, the jobs cannot be completed inside that interval.

Definition 18.4 An interval I = [t1, t2) for which uI = sup(0 ≤ t1 < t2) u[t1,t2), meaning that it maximizes u[t1,t2), is called a critical interval. The associated job set {fj | [Aj, Dj) ⊆ I} is called the critical group.
Given the above definitions and observations, the speed scheduling algorithm from Ref. [46] iteratively assigns speeds to critical groups and eliminates them from the job set, while also extracting their associated interval from the scheduling interval. A pseudocode description of the algorithm is given in Box 18.1. Until all jobs are scheduled, the algorithm finds the critical interval and associated jobs, assigns them a unique speed based on processor load and extracts them from the unscheduled set of jobs.
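As an illustration, a straightforward Python sketch of this iteration is given below. It is not the pseudocode of Box 18.1, but a reimplementation from the description above: jobs are (arrival, deadline, cycles) triples, speeds are normalized to the maximum speed, and the critical interval is found by brute force over arrival/deadline pairs rather than with the specialized data structures of Ref. [46].

```python
# An illustrative reimplementation of the critical-interval algorithm of
# Ref. [46], following the description above.  Jobs are (arrival, deadline,
# cycles) triples; speeds are normalized to the maximum speed.

def critical_interval_schedule(jobs):
    jobs = [list(j) for j in jobs]   # mutable copies; times get compressed
    speeds = {}                      # job index -> assigned speed
    remaining = list(range(len(jobs)))
    while remaining:
        # 1. Find the critical interval [a, d): the interval maximizing the
        #    loading factor, searched over all arrival/deadline pairs.
        best = None
        for a in {jobs[i][0] for i in remaining}:
            for d in {jobs[i][1] for i in remaining}:
                if d <= a:
                    continue
                group = [i for i in remaining
                         if jobs[i][0] >= a and jobs[i][1] <= d]
                if not group:
                    continue
                u = sum(jobs[i][2] for i in group) / (d - a)
                if best is None or u > best[0]:
                    best = (u, a, d, group)
        u, a, d, group = best
        # 2. Run the critical group at the loading factor of its interval.
        for i in group:
            speeds[i] = u
        remaining = [i for i in remaining if i not in group]
        # 3. Compress time: remove [a, d) from the scheduling interval,
        #    shifting later arrivals/deadlines left, clamping those inside.
        for i in remaining:
            for k in (0, 1):
                if jobs[i][k] >= d:
                    jobs[i][k] -= d - a
                elif jobs[i][k] > a:
                    jobs[i][k] = a
    return speeds
```

Run on the four jobs of the worked example that follows, f1(0,10,1), f2(1,7,1), f3(2,6,2), and f4(3,5,1) (the exact window of f4 inside [2,6) is assumed here, as the text only fixes its interval and cycle count), the sketch assigns speeds 0.75 to f3 and f4, 0.5 to f2, and 0.25 to f1, matching the derivation in the text.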
The algorithmic complexity for this algorithm is O(N²), with N being the number of jobs. With more specialized data structures, this can be reduced to O(N log² N), according to Ref. [46]. Nevertheless, the number of jobs to consider for scheduling depends very much on the rates of the periodic task set. For the set of only five tasks from Example 18.2, the number of jobs in a hyperperiod is around 150,000, yielding a very long run time for the algorithm. This is in fact a drawback present in all the approaches that have to unroll a whole hyperperiod to carry out the scheduling.
To find the critical interval and set of jobs, the following intervals are interesting: [0,10), containing all jobs, yielding a loading factor of (1 + 1 + 2 + 1)/(10 − 0) = 0.5; [1,7), containing jobs f2, f3, and f4, with a loading factor of (1 + 2 + 1)/(7 − 1) = 0.66; and [2,6), containing the overlapping jobs f3 and f4, with the largest loading factor, (2 + 1)/(6 − 2) = 0.75. Consequently, the critical jobs are f3 and f4, which are the first to be scheduled, at a speed of 0.75 of the maximal speed. Having dealt with the interval [2,6) and the corresponding jobs, they must be excluded from the current problem, to obtain the reduced set of jobs {f′1(0,6,1), f′2(1,3,1)}. Now the step of finding critical jobs can be repeated for this reduced problem. The interesting intervals this time are [0,6), for all jobs, with a loading factor of (1 + 1)/(6 − 0) = 0.33, and [1,3), for job f′2, with a loading factor of 1/(3 − 1) = 0.5. The critical job is in this case f′2, which will be scheduled to execute at speed 0.5. Reducing the problem further, the only remaining job is f″1(0,4,1), yielding a loading factor of 0.25. Thus, the last job to be scheduled, f1, will execute at speed 0.25. The final schedule, together with the power consumption at any time, is depicted in Figure 18.6. The energy consumption for this arrangement turns out to be 60% of the maximum-speed energy.
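As a sanity check, the 60% figure can be reproduced arithmetically. This assumes that energy per clock cycle scales linearly with speed (i.e., power quadratic in speed); the power model is an assumption on my part, chosen because it reproduces the quoted number, and under a cubic power model the ratio would differ.

```python
# Rechecking the 60% energy figure from the worked example, under the
# assumed model that energy per clock cycle scales linearly with speed
# (power quadratic in speed).

speeds = [0.25, 0.5, 0.75, 0.75]   # f1, f2, f3, f4, as scheduled above
cycles = [1, 1, 2, 1]              # clock cycles per job

e_scaled = sum(s * c for s, c in zip(speeds, cycles))  # energy at chosen speeds
e_max = float(sum(cycles))          # same work done entirely at speed 1.0
print(e_scaled / e_max)             # prints 0.6
```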