Let’s polish the algorithm presented at Euro-Par for a more elegant proof.
Let’s start by cleaning the previous implementations:
n <- 300
m <- 20
k <- 4
replicate(10, {
costs_CPU <- 15 * rgamma_cost(n, mean = 1, sd = 0.2)
costs_GPU <- rgamma_cost(n, mean = 1, sd = 1)
costs <- matrix(c(costs_CPU, costs_GPU), ncol = 2, byrow = FALSE)
source("10_fixing_heteroprio_implem/helper.R")
source("10_fixing_heteroprio_implem/algorithms_clean.R")
euro_bef <- schedule_greedy_balance_proved(costs, m, k)$mks
best_bef <- schedule_LPT_balance_variant1(costs, m, k)$mks
source("11_cleaner_balanced_estimate/helper.R")
source("11_cleaner_balanced_estimate/algorithms.R")
euro_aft <- schedule_balanced_estimate_europar(costs, m, k)$mks
best_aft <- schedule_balanced_makespan_efficient(costs, m, k)$mks
c(euro_bef, euro_aft, best_bef, best_aft)
})
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 41.85200 49.63555 41.48358 44.08781 40.09831 40.89632 41.33427
## [2,] 41.85200 49.63555 41.48358 44.08781 40.09831 40.89632 41.33427
## [3,] 39.07669 42.68625 40.64930 42.41492 37.42225 39.17067 40.24304
## [4,] 39.07669 42.68625 40.64930 42.41492 37.42225 39.17067 40.24304
## [,8] [,9] [,10]
## [1,] 40.47192 42.90440 41.77081
## [2,] 40.47192 42.90440 41.77081
## [3,] 39.32153 40.35227 40.19545
## [4,] 39.32153 40.35227 40.19545
This confirms that the new implementation gives the same results.
Let’s try a new version that will have a clearer proof.
replicate(10, {
costs_CPU <- 15 * rgamma_cost(n, mean = 1, sd = 0.2)
costs_GPU <- rgamma_cost(n, mean = 1, sd = 1)
costs <- matrix(c(costs_CPU, costs_GPU), ncol = 2, byrow = FALSE)
euro <- schedule_balanced_estimate_europar(costs, m, k)$mks
polish <- schedule_balanced_estimate_polished(costs, m, k)$mks
best <- schedule_balanced_makespan_efficient(costs, m, k)$mks
c(euro, polish, best)
})
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 41.60851 42.48743 40.74426 40.03564 42.12906 42.06441 42.69488
## [2,] 41.10435 42.48743 40.35395 39.49828 42.12906 42.06441 42.48758
## [3,] 40.89191 40.62059 38.98903 38.35696 39.12302 41.11084 39.80933
## [,8] [,9] [,10]
## [1,] 41.99270 42.90027 41.89887
## [2,] 41.99270 42.90027 41.89887
## [3,] 41.60442 39.92920 40.03186
The results are very close, which is encouraging. Let’s keep this version for now. If writing its actual proof requires changes, they will be done later.
Let’s add a phase where the final schedule is sligthly balanced. This will provide better results than BalancedEstimate and faster than with BalancedMakespan.
replicate(10, {
costs_CPU <- 15 * rgamma_cost(n, mean = 1, sd = 0.2)
costs_GPU <- rgamma_cost(n, mean = 1, sd = 1)
costs <- matrix(c(costs_CPU, costs_GPU), ncol = 2, byrow = FALSE)
balmks <- schedule_balanced_makespan_europar(costs, m, k)
best <- schedule_balanced_makespan_efficient(costs, m, k)
polish <- schedule_balanced_estimate_polished(costs, m, k)
balpolish <- schedule_balancing_balanced_estimate(costs, m, k, polish$alloc,
polish$inv)
c(balmks$mks, best$mks, polish$mks, balpolish$mks)
})
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 41.80823 42.31467 41.72717 41.54803 40.16897 40.96336 41.40612
## [2,] 41.02800 41.55935 41.58998 40.87178 39.37354 40.06158 40.11729
## [3,] 42.06523 42.33585 42.21697 43.11098 42.62107 41.37926 42.04999
## [4,] 42.01222 42.33585 41.93238 42.35788 41.00176 41.12841 41.97814
## [,8] [,9] [,10]
## [1,] 41.74733 40.40749 40.01742
## [2,] 40.27011 39.98466 39.43645
## [3,] 42.26512 41.26964 40.62427
## [4,] 41.98280 41.26964 40.49880
The complexity of the second balancing step is \(O(\max(m,k)n\log(nmk))\). The number of additional rescheduling steps is bounded by the number of proc thanks to the transfer of the inverting task.
An example of worst case scenario is when the processors 1 finish all except for one after the processors 2 that all finish at the same time. Then, the balancing step consists in moving all the last tasks from the processors 1 to the processors 2 where they have a marginal cost.
Let’s see what the plot of Euro-Par would look like with these methods: the best method, which time complexity is \(O(n^2\log(nmk))\), the balanced polished and the polished one, which is in \(O(n\log(nmk))\).
n <- 300
m <- 20
k <- 4
cases <- data.frame(rbind(cv1 = c(0.2, 1, 0.2, 1), cv2 = c(0.2, 0.2, 1, 1)))
ratios <- map_df(cases, function(case) {
replicate(30, {
costs_CPU <- 15 * rgamma_cost(n, mean = 1, sd = case[1])
costs_GPU <- rgamma_cost(n, mean = 1, sd = case[2])
costs <- matrix(c(costs_CPU, costs_GPU), ncol = 2, byrow = FALSE)
schedule_balanced_makespan_europar(costs, m, k)
polish <- schedule_balanced_estimate_polished(costs, m, k)
balpolish <- schedule_balancing_balanced_estimate(costs, m, k, polish$alloc,
polish$inv)
RR <- list(schedule_balanced_makespan_efficient(costs, m, k),
balpolish,
polish,
schedule_DADA(costs, m, k),
schedule_DualHP(costs, m, k),
schedule_CLB2C(costs, m, k),
schedule_HeteroPrio(costs, m, k))
names(RR) <- c("best", "bal_polish", "polish", "DADA", "DualHP", "CLB2C",
"HeteroPrio")
c(list(case = paste(map(case, ~ ifelse(. == 0.2, "low", "high")),
collapse = "/"),
LB = LB_europar(costs, m, k)), map_dbl(RR, ~.$mks))
}) %>% t() %>% data.frame() %>% map(unlist) %>% data.frame()
})
## Warning in rbind_all(x, .id): Unequal factor levels: coercing to character
ratios[,c("case", "LB", "best", "bal_polish", "polish", "DualHP",
"DADA", "HeteroPrio", "CLB2C")] %>%
gather(key = "algo", value = "ratio", -case, -LB) %>%
ggplot(aes(x = case, y = ratio / LB, color = algo)) +
geom_boxplot(aes(group = interaction(algo, case)), outlier.size = 1) +
coord_cartesian(ylim = extendrange(c(1, 1.2))) +
scale_color_brewer("Algorithm", palette = "Set1") +
scale_x_discrete("Coefficient of variation (CPU/GPU)") +
scale_y_continuous("Makespan / LB")
We have three final algorithms with varying performance, time complexity and all with the same ratio that should be easier to prove than in Euro-Par.
## R version 3.4.4 (2018-03-15)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/lib/openblas-base/libblas.so.3
## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
##
## locale:
## [1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
## [5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
## [7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ggplot2_2.2.1 dplyr_0.7.4 tidyr_0.8.0 purrr_0.2.4 stringr_1.3.0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.16 knitr_1.20 bindr_0.1.1 magrittr_1.5
## [5] munsell_0.4.3 colorspace_1.3-2 R6_2.2.2 rlang_0.2.0
## [9] plyr_1.8.4 tools_3.4.4 grid_3.4.4 gtable_0.2.0
## [13] pacman_0.4.6 htmltools_0.3.6 lazyeval_0.2.1 yaml_2.1.18
## [17] rprojroot_1.3-2 digest_0.6.15 assertthat_0.2.0 tibble_1.4.2
## [21] bindrcpp_0.2 glue_1.2.0 evaluate_0.10.1 rmarkdown_1.9
## [25] stringi_1.1.7 compiler_3.4.4 pillar_1.2.1 scales_0.5.0
## [29] backports_1.1.2 pkgconfig_2.0.1