For a journal version, it will be necessary to extend the simulation. In particular, it would be good to compare against several other competing approaches and to consider different platforms (varying the ratio m/k, the ratio between the expected costs, and the number of tasks).
Let’s start with a symmetric situation: medium heterogeneity (CV = 0.3), 10 processors of each type, and a varying number of tasks.
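The simulations below rely on the helpers defined in the previous sections (rgamma_cost, the schedule_* heuristics and the LB_* lower bounds). For completeness, here is a minimal sketch of rgamma_cost consistent with how it is called here, assuming costs follow a gamma distribution parametrized by its mean and standard deviation:

library(purrr)   # map_df, map_dbl
library(dplyr)   # rowwise, mutate
library(tidyr)   # gather
library(ggplot2)

rgamma_cost <- function(n, mean, sd) {
  shape <- (mean / sd)^2               # for a gamma distribution, CV = 1 / sqrt(shape)
  rgamma(n, shape = shape, scale = mean / shape)
}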
m <- 10
k <- 10
ntasks <- c(10, 20, 30, 50, 100, 200, 300, 500)
# For each number of tasks, generate 10 random instances, run all heuristics
# and record their makespans together with the lower bound.
ratios <- map_df(ntasks, function(n) {
  replicate(10, {
    costs_CPU <- rgamma_cost(n, mean = 1, sd = 0.3)
    costs_GPU <- rgamma_cost(n, mean = 1, sd = 0.3)
    costs <- matrix(c(costs_CPU, costs_GPU), ncol = 2, byrow = FALSE)
    polish <- schedule_balanced_estimate_polished(costs, m, k)
    balpolish <- schedule_balancing_balanced_estimate(costs, m, k, polish$alloc,
                                                      polish$inv)
    RR <- list(schedule_balanced_makespan_efficient(costs, m, k),
               balpolish,
               polish,
               schedule_DADA(costs, m, k),
               schedule_DualHP(costs, m, k),
               schedule_CLB2C(costs, m, k),
               schedule_HeteroPrio(costs, m, k))
    names(RR) <- c("best", "bal_polish", "polish", "DADA", "DualHP", "CLB2C",
                   "HeteroPrio")
    c(list(n = n, LB = max(LB_europar(costs, m, k),
                           LB_balance(costs, m, k))), map_dbl(RR, ~.$mks))
  }) %>% t() %>% data.frame() %>% map(unlist) %>% data.frame()
})
ratios %>%
  rowwise() %>%
  mutate(min = min(best, bal_polish, polish, DADA, DualHP, CLB2C, HeteroPrio)) %>%
  gather(key = "algo", value = "ratio", -n, -LB, -min) %>%
  ggplot(aes(x = n, y = ratio / LB, color = algo)) +
  geom_boxplot(aes(group = interaction(algo, n)), outlier.size = 1) +
  scale_color_brewer("Algorithm", palette = "Set1") +
  scale_x_log10("Number of tasks") +
  scale_y_continuous("Makespan / LB")
Small instances are easy because there are many resources and each task can be allocated to a free resource. This indicates that the hardest instances are those with around 50 tasks (2 to 3 tasks per machine). However, looking at larger instances (more than 200 tasks, or 10 tasks per machine) also makes sense because not all strategies are asymptotically equivalent.
We will therefore concentrate on small instances (a few tasks per machine), where sub-optimal scheduling decisions may have large consequences, and on large instances, for which we look at the asymptotic performance.
Let’s study how the variance of the costs impacts the results on a small symmetric instance.
m <- 10
k <- 10
tasks <- c(50, 200)
cvs <- c(0.03, 0.05, 0.1, 0.2, 0.3, 0.5, 1, 2, 3)
# Same protocol with a varying coefficient of variation for the costs.
ratios <- map_df(tasks, function(n) {
  map_df(cvs, function(cv) {
    replicate(10, {
      costs_CPU <- rgamma_cost(n, mean = 1, sd = cv)
      costs_GPU <- rgamma_cost(n, mean = 1, sd = cv)
      costs <- matrix(c(costs_CPU, costs_GPU), ncol = 2, byrow = FALSE)
      polish <- schedule_balanced_estimate_polished(costs, m, k)
      balpolish <- schedule_balancing_balanced_estimate(costs, m, k, polish$alloc,
                                                        polish$inv)
      RR <- list(schedule_balanced_makespan_efficient(costs, m, k),
                 balpolish,
                 polish,
                 schedule_DADA(costs, m, k),
                 schedule_DualHP(costs, m, k),
                 schedule_CLB2C(costs, m, k),
                 schedule_HeteroPrio(costs, m, k))
      names(RR) <- c("best", "bal_polish", "polish", "DADA", "DualHP", "CLB2C",
                     "HeteroPrio")
      c(list(n = n, cv = cv, LB = max(LB_europar(costs, m, k),
                                      LB_balance(costs, m, k))), map_dbl(RR, ~.$mks))
    }) %>% t() %>% data.frame() %>% map(unlist) %>% data.frame()
  })
})
ratios %>%
  rowwise() %>%
  mutate(min = min(best, bal_polish, polish, DADA, DualHP, CLB2C, HeteroPrio)) %>%
  gather(key = "algo", value = "ratio", -cv, -n, -LB, -min) %>%
  ggplot(aes(x = cv, y = ratio / LB, color = algo)) +
  geom_boxplot(aes(group = interaction(algo, cv)), outlier.size = 1) +
  facet_wrap(~ n, nrow = 2) +
  scale_color_brewer("Algorithm", palette = "Set1") +
  scale_x_log10("Coefficient of variation") +
  scale_y_continuous("Makespan / LB")
The difficult setting is between 0.1 and 1. Below this range, the instance is easy (all tasks are similar and can be balanced simply). Above it, some tasks have very large costs and dominate the completion times. Surprisingly, with many tasks, a large variance remains difficult for many approaches (it is harder to pack tasks below the completion time set by a single large task).
We concentrate on coefficients of variation between 0.1 and 0.5 for small instances and between 0.2 and 1 for large instances: the range from 0.1 to 1 seems safe for all instances.
Let’s now vary the ratio m/k while keeping the same amount of work on both processor types (which means that the tasks will be divided into two sets of similar size).
tasks <- c(50, 200)
mk <- 20
CPU <- c(10, 12, 14, 16, 18)
# Vary the number of CPUs m (with k = mk - m GPUs); CPU costs are scaled by
# m / k so that both processor types receive a similar amount of work.
ratios <- map_df(tasks, function(n) {
  map_df(CPU, function(m) {
    replicate(10, {
      k <- mk - m
      costs_CPU <- m / k * rgamma_cost(n, mean = 1, sd = 0.3)
      costs_GPU <- rgamma_cost(n, mean = 1, sd = 0.3)
      costs <- matrix(c(costs_CPU, costs_GPU), ncol = 2, byrow = FALSE)
      polish <- schedule_balanced_estimate_polished(costs, m, k)
      balpolish <- schedule_balancing_balanced_estimate(costs, m, k, polish$alloc,
                                                        polish$inv)
      RR <- list(schedule_balanced_makespan_efficient(costs, m, k),
                 balpolish,
                 polish,
                 schedule_DADA(costs, m, k),
                 schedule_DualHP(costs, m, k),
                 schedule_CLB2C(costs, m, k),
                 schedule_HeteroPrio(costs, m, k))
      names(RR) <- c("best", "bal_polish", "polish", "DADA", "DualHP", "CLB2C",
                     "HeteroPrio")
      c(list(n = n, m = m, LB = max(LB_europar(costs, m, k),
                                    LB_balance(costs, m, k))), map_dbl(RR, ~.$mks))
    }) %>% t() %>% data.frame() %>% map(unlist) %>% data.frame()
  })
})
ratios %>%
  rowwise() %>%
  mutate(min = min(best, bal_polish, polish, DADA, DualHP, CLB2C, HeteroPrio)) %>%
  gather(key = "algo", value = "ratio", -n, -m, -LB, -min) %>%
  ggplot(aes(x = m, y = ratio / LB, color = algo)) +
  geom_boxplot(aes(group = interaction(algo, m)), outlier.size = 1) +
  facet_wrap(~ n, ncol = 1) +
  scale_color_brewer("Algorithm", palette = "Set1") +
  scale_x_continuous("Number of CPU") +
  scale_y_continuous("Makespan / LB")
For small instances, DADA's performance degrades as the asymmetry increases, while HeteroPrio performs better than CLB2C. The other approaches behave similarly, but the hardest instances seem to be the asymmetric ones.
For large instances, all approaches are within 10% of the lower bound and the behavior of each algorithm seems stable.
Let’s now put fewer tasks on one of the processor types. We consider an asymmetric platform with roughly twice as many CPUs as GPUs (m = 13, k = 7).
tasks <- c(50, 200)
m <- 13
k <- 7
mean_CPU <- c(0.1, 0.2, 0.3, 0.5, 1, 2, 3, 5, 10)
# Vary the expected CPU cost on the asymmetric platform (m = 13, k = 7).
ratios <- map_df(tasks, function(n) {
  map_df(mean_CPU, function(m_CPU) {
    replicate(10, {
      costs_CPU <- m_CPU * rgamma_cost(n, mean = 1, sd = 0.3)
      costs_GPU <- rgamma_cost(n, mean = 1, sd = 0.3)
      costs <- matrix(c(costs_CPU, costs_GPU), ncol = 2, byrow = FALSE)
      polish <- schedule_balanced_estimate_polished(costs, m, k)
      balpolish <- schedule_balancing_balanced_estimate(costs, m, k, polish$alloc,
                                                        polish$inv)
      RR <- list(schedule_balanced_makespan_efficient(costs, m, k),
                 balpolish,
                 polish,
                 schedule_DADA(costs, m, k),
                 schedule_DualHP(costs, m, k),
                 schedule_CLB2C(costs, m, k),
                 schedule_HeteroPrio(costs, m, k))
      names(RR) <- c("best", "bal_polish", "polish", "DADA", "DualHP", "CLB2C",
                     "HeteroPrio")
      c(list(n = n, m_CPU = m_CPU, LB = max(LB_europar(costs, m, k),
                                            LB_balance(costs, m, k))), map_dbl(RR, ~.$mks))
    }) %>% t() %>% data.frame() %>% map(unlist) %>% data.frame()
  })
})
ratios %>%
  rowwise() %>%
  mutate(min = min(best, bal_polish, polish, DADA, DualHP, CLB2C, HeteroPrio)) %>%
  gather(key = "algo", value = "ratio", -n, -m_CPU, -LB, -min) %>%
  ggplot(aes(x = m_CPU, y = ratio / LB, color = algo)) +
  geom_boxplot(aes(group = interaction(algo, m_CPU)), outlier.size = 1) +
  facet_wrap(~ n, ncol = 1) +
  scale_color_brewer("Algorithm", palette = "Set1") +
  scale_x_log10("CPU expected cost") +
  scale_y_continuous("Makespan / LB")
For small instances, this shows that instances where tasks are much costlier on one processor type are easy to solve: the slow processors are nearly idle, so offloading a task to them readily reduces the makespan. On the other hand, when the initial allocation leaves the work on both processor types almost balanced, there are only a few tasks to move, and many tasks have similar performance on both types. Their allocation thus has low impact and the instance can be considered easy. Difficult instances arise when the ratio between the expected costs on the two types of machines is large enough to induce a significant imbalance, but not so large as to put all tasks on the same processor type.
For large instances, even a very strong imbalance is not problematic, as the makespan is no longer dominated by a few large costs on one type of processors. Overall, the expected costs do not seem to have an impact on the difficulty of the instances. The exception is the polish algorithm, which is affected by large tasks that would benefit from moving to the other processor type.
For small instances (2 to 3 tasks per resource), the experiments below use n = 50, m = 13 CPUs, k = 7 GPUs and an expected CPU cost of 5.
For large instances (10 tasks per resource), we keep the same platform with n = 200.
For consistency, let’s apply the same criteria on both small and large instances.
It remains to study the effect of the heterogeneity, i.e., the CVs of the costs on each processor type.
n <- 50
m <- 13
k <- 7
mean_CPU <- 5
cvs_CPU <- c(0.1, 0.2, 0.3, 0.5, 1)
cvs_GPU <- c(0.1, 0.2, 0.3, 0.5, 1)
# Vary the CPU and GPU cost CVs independently (small instance, n = 50).
ratios <- map_df(cvs_CPU, function(cv_CPU) {
  map_df(cvs_GPU, function(cv_GPU) {
    replicate(10, {
      costs_CPU <- mean_CPU * rgamma_cost(n, mean = 1, sd = cv_CPU)
      costs_GPU <- rgamma_cost(n, mean = 1, sd = cv_GPU)
      costs <- matrix(c(costs_CPU, costs_GPU), ncol = 2, byrow = FALSE)
      polish <- schedule_balanced_estimate_polished(costs, m, k)
      balpolish <- schedule_balancing_balanced_estimate(costs, m, k, polish$alloc,
                                                        polish$inv)
      RR <- list(schedule_balanced_makespan_efficient(costs, m, k),
                 balpolish,
                 polish,
                 schedule_DADA(costs, m, k),
                 schedule_DualHP(costs, m, k),
                 schedule_CLB2C(costs, m, k),
                 schedule_HeteroPrio(costs, m, k))
      names(RR) <- c("best", "bal_polish", "polish", "DADA", "DualHP", "CLB2C",
                     "HeteroPrio")
      c(list(cv_CPU = cv_CPU, cv_GPU = cv_GPU,
             LB = max(LB_europar(costs, m, k),
                      LB_balance(costs, m, k))), map_dbl(RR, ~.$mks))
    }) %>% t() %>% data.frame() %>% map(unlist) %>% data.frame()
  })
})
ratios %>%
  rowwise() %>%
  mutate(min = min(best, bal_polish, polish, DADA, DualHP, CLB2C, HeteroPrio)) %>%
  gather(key = "algo", value = "ratio", -cv_CPU, -cv_GPU, -LB, -min) %>%
  ggplot(aes(x = cv_CPU, y = ratio / LB, color = algo)) +
  geom_boxplot(aes(group = interaction(algo, cv_CPU)), outlier.size = 1) +
  facet_wrap(~ cv_GPU, ncol = 1) +
  scale_color_brewer("Algorithm", palette = "Set1") +
  scale_x_log10("CPU cost CV") +
  scale_y_continuous("Makespan / LB")
Several observations can be made: a low GPU CV does not seem interesting (the instances are easier for most approaches), and choosing 0.5 for the GPU CV makes the impact of the CPU CV on bal_polish, polish, DualHP, CLB2C and HeteroPrio apparent.
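To isolate this effect, a possible follow-up (a sketch reusing the ratios data frame just computed, not part of the original protocol) is to fix the GPU CV at 0.5 and re-plot:

ratios %>%
  filter(cv_GPU == 0.5) %>%  # keep only the retained GPU CV
  gather(key = "algo", value = "ratio", -cv_CPU, -cv_GPU, -LB) %>%
  ggplot(aes(x = cv_CPU, y = ratio / LB, color = algo)) +
  geom_boxplot(aes(group = interaction(algo, cv_CPU)), outlier.size = 1) +
  scale_color_brewer("Algorithm", palette = "Set1") +
  scale_x_log10("CPU cost CV") +
  scale_y_continuous("Makespan / LB")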
n <- 200
m <- 13
k <- 7
mean_CPU <- 5
cvs_CPU <- c(0.1, 0.2, 0.3, 0.5, 1)
cvs_GPU <- c(0.1, 0.2, 0.3, 0.5, 1)
# Same experiment with the large instance (n = 200).
ratios <- map_df(cvs_CPU, function(cv_CPU) {
  map_df(cvs_GPU, function(cv_GPU) {
    replicate(10, {
      costs_CPU <- mean_CPU * rgamma_cost(n, mean = 1, sd = cv_CPU)
      costs_GPU <- rgamma_cost(n, mean = 1, sd = cv_GPU)
      costs <- matrix(c(costs_CPU, costs_GPU), ncol = 2, byrow = FALSE)
      polish <- schedule_balanced_estimate_polished(costs, m, k)
      balpolish <- schedule_balancing_balanced_estimate(costs, m, k, polish$alloc,
                                                        polish$inv)
      RR <- list(schedule_balanced_makespan_efficient(costs, m, k),
                 balpolish,
                 polish,
                 schedule_DADA(costs, m, k),
                 schedule_DualHP(costs, m, k),
                 schedule_CLB2C(costs, m, k),
                 schedule_HeteroPrio(costs, m, k))
      names(RR) <- c("best", "bal_polish", "polish", "DADA", "DualHP", "CLB2C",
                     "HeteroPrio")
      c(list(cv_CPU = cv_CPU, cv_GPU = cv_GPU,
             LB = max(LB_europar(costs, m, k),
                      LB_balance(costs, m, k))), map_dbl(RR, ~.$mks))
    }) %>% t() %>% data.frame() %>% map(unlist) %>% data.frame()
  })
})
ratios %>%
  rowwise() %>%
  mutate(min = min(best, bal_polish, polish, DADA, DualHP, CLB2C, HeteroPrio)) %>%
  gather(key = "algo", value = "ratio", -cv_CPU, -cv_GPU, -LB, -min) %>%
  ggplot(aes(x = cv_CPU, y = ratio / LB, color = algo)) +
  geom_boxplot(aes(group = interaction(algo, cv_CPU)), outlier.size = 1) +
  facet_wrap(~ cv_GPU, ncol = 1) +
  scale_color_brewer("Algorithm", palette = "Set1") +
  scale_x_log10("CPU cost CV") +
  scale_y_continuous("Makespan / LB")
Several observations can be made here as well. It is more complicated to select a single parameter to study for large instances. A solution could be to study the effect of an identical CV for CPUs and GPUs, and then of opposite CVs (4 graphs), as sketched below.
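As a sketch of these four settings (the values 0.2 and 1 are illustrative assumptions, and the loop body would be the same replicate(10, ...) block as in the previous chunks):

cv_settings <- data.frame(cv_CPU = c(0.2, 1, 0.2, 1),   # identical low and high CVs,
                          cv_GPU = c(0.2, 1, 1, 0.2))   # then the two opposite combinations
ratios <- map2_df(cv_settings$cv_CPU, cv_settings$cv_GPU, function(cv_CPU, cv_GPU) {
  # same replicate(10, ...) body as in the previous chunks
})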
As a summary experiment, it could be interesting to measure how often each approach is the best on an instance, and its worst ratio to the best approach over its best 80% of cases.
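A possible sketch of this summary, computed from any of the ratios data frames above (column names follow this section; the 0.8 quantile implements the "best 80% of cases"):

ratios %>%
  rowwise() %>%
  mutate(min = min(best, bal_polish, polish, DADA, DualHP, CLB2C, HeteroPrio)) %>%
  ungroup() %>%
  gather(key = "algo", value = "mks", best, bal_polish, polish, DADA, DualHP,
         CLB2C, HeteroPrio) %>%
  group_by(algo) %>%
  summarize(times_best = sum(mks == min),                  # instances where the approach is the best
            worst_80 = quantile(mks / min, probs = 0.8))   # worst ratio to the best within its best 80% cases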
## R version 3.4.4 (2018-03-15)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/lib/openblas-base/libblas.so.3
## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
##
## locale:
## [1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
## [5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
## [7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ggplot2_2.2.1 dplyr_0.7.4 tidyr_0.8.0 purrr_0.2.4 stringr_1.3.0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.16 knitr_1.20 bindr_0.1.1 magrittr_1.5
## [5] munsell_0.4.3 colorspace_1.3-2 R6_2.2.2 rlang_0.2.0
## [9] plyr_1.8.4 tools_3.4.4 grid_3.4.4 gtable_0.2.0
## [13] pacman_0.4.6 htmltools_0.3.6 lazyeval_0.2.1 yaml_2.1.18
## [17] rprojroot_1.3-2 digest_0.6.15 assertthat_0.2.0 tibble_1.4.2
## [21] bindrcpp_0.2 glue_1.2.0 evaluate_0.10.1 rmarkdown_1.9
## [25] stringi_1.1.7 compiler_3.4.4 pillar_1.2.1 scales_0.5.0
## [29] backports_1.1.2 pkgconfig_2.0.1