The objective is to generate random instances for the \((Pm, Pk) | prec | C_{max}\) problem. It would be interesting to have varied instances that are difficult for the studied scheduling heuristics in order to compare them. In addition, having a good estimate of the optimal value would strengthen the comparison.

Let’s start with a simple model in which most tasks are independent and some are connected to form a simple chain. In an optimal solution, the chain is executed on a dedicated GPU and all the other cores finish around the same time. For simplicity, the mean cost is 1 on GPU and \(\sqrt{m/k}\) on CPU.
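
As a rough sketch of what such a generator could look like (the actual chain_indep_proto may differ and the exponential distribution is only an example), the graph can be built with igraph by chaining the first tasks and storing the CPU and GPU costs as vertex attributes:

library(igraph)

# Hypothetical sketch of the prototype generator: one chain of tasks plus
# independent tasks, with mean cost 1 on GPU and sqrt(m / k) on CPU.
chain_indep_proto_sketch <- function(m, k, n = 200, chain_length = 20) {
  g <- make_empty_graph(n, directed = TRUE)
  # The first chain_length tasks form a chain; the others stay independent.
  g <- add_edges(g, c(rbind(1:(chain_length - 1), 2:chain_length)))
  # Exponential draws are only an example; any distribution with these means works.
  g <- set_vertex_attr(g, "GPU", value = rexp(n, rate = 1))
  g <- set_vertex_attr(g, "CPU", value = rexp(n, rate = 1 / sqrt(m / k)))
  g
}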

m <- 20  # number of CPUs
k <- 4   # number of GPUs
g <- chain_indep_proto(m, k)
schedule <- HEFT(m, k, g)
# Last completion time on each of the m + k resources
summary(sapply(1:(m + k), function(x) { tail(schedule[[x]]$end_time, 1) }))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   78.13   78.36   79.67   79.33   80.00   80.12
CP(g)
## [1] 79.5845
write_graph(g, "14_generating_random_graphs/chain_indep.dot", "dot")

The idea seems to work.

       Treename      QALS      ERLS      MxQE      EFiT      Quik      Rati
chain_indep.dot   233.329   234.088   151.294   151.294   316.805   283.218

In practice, the online heuristics are not efficient on this instance: even the best ones (MxQE and EFiT) reach roughly twice the estimated optimal makespan.

Another idea is to have \(k\) chains, one per GPU, with many other tasks depending on the tasks of these chains.
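
A hypothetical sketch of this second structure (again, the actual chain_fork_proto may differ): \(k\) chains at the beginning of the vertex set, with each remaining task depending on a randomly chosen chain task.

library(igraph)

chain_fork_proto_sketch <- function(m, k, n = 200, chain_length = 20) {
  stopifnot(n > k * chain_length)
  g <- make_empty_graph(n, directed = TRUE)
  # One chain of chain_length tasks per GPU.
  for (ch in 1:k) {
    ids <- ((ch - 1) * chain_length + 1):(ch * chain_length)
    g <- add_edges(g, c(rbind(head(ids, -1), tail(ids, -1))))
  }
  # Every remaining task depends on a random task belonging to one of the chains.
  free <- (k * chain_length + 1):n
  parents <- sample(seq_len(k * chain_length), length(free), replace = TRUE)
  g <- add_edges(g, c(rbind(parents, free)))
  # Same cost model as before: mean 1 on GPU, mean sqrt(m / k) on CPU.
  g <- set_vertex_attr(g, "GPU", value = rexp(n, rate = 1))
  g <- set_vertex_attr(g, "CPU", value = rexp(n, rate = 1 / sqrt(m / k)))
  g
}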

g <- chain_fork_proto(m, k)
schedule <- HEFT(m, k, g)
summary(sapply(1:(m + k), function(x) { tail(schedule[[x]]$end_time, 1) }))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   61.51   62.65   63.03   67.92   75.89   82.63
CP(g)
## [1] 81.59284
write_graph(g, "14_generating_random_graphs/chain_fork.dot", "dot")

There is more variability in the finish times across the cores.

       Treename      QALS      ERLS      MxQE      EFiT      Quik      Rati
 chain_fork.dot   248.313   248.011   183.835   183.835   507.331   272.022

The performance is even worse relative to the optimal makespan.

These two approaches can be improved as follows:

Improved versions
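
Judging from the calls below and from the parameter names used in the simulations, the improved generators chain_indep(m, k, n, CV) and chain_fork(m, k, n, CV) take the number of tasks \(n\) and the coefficient of variation \(CV\) of the costs as additional parameters. Assuming the costs keep the same means as before, a gamma distribution is a natural way to draw values with a given mean and CV (this is only a guess about the implementation):

# Hypothetical cost generator: gamma distribution with the requested mean and
# coefficient of variation (shape = 1 / CV^2, scale = mean * CV^2).
rcost <- function(n, mean, CV) {
  rgamma(n, shape = 1 / CV^2, scale = mean * CV^2)
}
# For instance, CPU costs for n tasks with mean sqrt(m / k) and CV = 0.1:
# rcost(n, sqrt(m / k), 0.1)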

Let’s test the new versions:

g <- chain_indep(m, k, 100, 0.1)
schedule <- HEFT(m, k, g)
summary(sapply(1:(m + k), function(x) { tail(schedule[[x]]$end_time, 1) }))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   6.752   6.821   7.433   7.560   8.326   8.550
CP(g)
## [1] 7.922416
plot.igraph(g)

Good. With a higher number of nodes, the differences between the finish times should be smaller.

g <- chain_indep(m, k, 1000, 0.1)
schedule <- HEFT(m, k, g)
summary(sapply(1:(m + k), function(x) { tail(schedule[[x]]$end_time, 1) }))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   75.14   75.34   75.62   75.74   75.90   77.34
CP(g)
## [1] 77.34249

Perfect. Let’s move to the second method:

g <- chain_fork(m, k, 100, 0.1)
schedule <- HEFT(m, k, g)
summary(sapply(1:(m + k), function(x) { tail(schedule[[x]]$end_time, 1) }))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   7.115   8.966   9.158   8.858   9.310   9.515
CP(g)
## [1] 5.245685
plot.igraph(g)

Let’s check whether, with more nodes, the critical path gets closer to the makespan.

g <- chain_fork(m, k, 1000, 0.1)
schedule <- HEFT(m, k, g)
summary(sapply(1:(m + k), function(x) { tail(schedule[[x]]$end_time, 1) }))
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   90.21   90.86   90.92   91.16   91.92   92.22
CP(g)
## [1] 45.35556
LB_europar(cbind(get.vertex.attribute(g, "CPU"),
                 get.vertex.attribute(g, "GPU")),
           m, k)
## [1] 73.47593
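
As a point of comparison (this is not the EuroPar bound used above, only a much weaker sanity check), a naive area bound can be obtained by noting that each task occupies some resource for at least its best processing time:

# Naive area lower bound: sum of the best processing times, spread over m + k resources.
LB_area_naive <- function(costs, m, k) {
  sum(pmin(costs[, 1], costs[, 2])) / (m + k)
}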

The critical path is only half the HEFT makespan. However, the lower bound from EuroPar is still reasonably tight. Since this method changed a bit from the previous prototype, let’s test the online algorithms too.

write_graph(g, "14_generating_random_graphs/chain_fork_large.dot", "dot")
            Treename      QALS      ERLS      MxQE      EFiT      Quik      Rati
chain_fork_large.dot    122.86   123.255   95.6505   95.6505   252.804   143.306

These new instances are less difficult: the best online heuristics are now within a few percent of the HEFT makespan.

Simulations

We are ready to produce a large set of instances with varying parameters.

set.seed(1)

NB <- 10

n_s <- c(10, 30, 100)
k_s <- c(1, 2, 4, 8)
mk_s <- c(1, 2, 3, 5)
CV_s <- c(0.1, 0.2, 0.3, 0.5, 1)

# Generate NB instances for every combination of the parameters above.
for (n in n_s)
  for (k in k_s)
    for (m in k * mk_s)
      for (CV in CV_s)
        for (func in c("chain_indep", "chain_fork")) {
          for (i in 1:NB) {
            print(c(n, k, m, CV, i))
            g <- get(func)(m, k, n, CV)
            name <- paste(func, n, m, k, CV, i, sep = "_")
            name <- paste(name, "dot", sep = ".")
            name <- paste("14_generating_random_graphs/instances", name, sep = "/")
            write_graph(g, name, "dot")
            schedule <- HEFT(m, k, g)
            name_HEFT <- paste(name, "HEFT", sep = ".")
            capture.output(schedule, file = name_HEFT)
            # Store the parameters, the two lower bounds (CP and LB from EuroPar),
            # and the HEFT makespan.
            params <- data.frame(func = func, n = n, m = m, k = k, CV = CV, num = i,
                                 CP = CP(g),
                                 LB_europar = LB_europar(cbind(get.vertex.attribute(g, "CPU"),
                                                               get.vertex.attribute(g, "GPU")),
                                                         m, k),
                                 HEFT = schedule$mks)
            name_params <- paste(name, "params", sep = ".")
            write.table(params, name_params, row.names = FALSE, sep = ",")
          }
        }
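
In total, these loops generate 3 × 4 × 4 × 5 × 2 × 10 = 4800 instances:

# Number of generated instances (parameter combinations times generators times NB).
length(n_s) * length(k_s) * length(mk_s) * length(CV_s) * 2 * NB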

Let’s run the code for the online algorithms:

grep "e-" input/test/* -l | xargs rm
> result.csv
for k in 1 2 4 8
do
  for mk in 1 2 3 5
  do
    m=$((k * mk))
    ./test -m $m -k $k
    grep "[^0-9]\+_[0-9]*_$m\_$k\_" output_$m-$k.dat >> result.csv
  done
done
head -n 1 output_$m-$k.dat > result.csv.tmp
cat result.csv >> result.csv.tmp
mv result.csv.tmp result.csv
sed -i "s/\t/,/g" result.csv
sed -i "s/ //g" result.csv

Let’s get a first overview of the results, after joining them with the parameters stored in the instances directory:

# Append to each line of result.csv the parameters stored in the corresponding
# .params file of the instance.
for file in $(awk -F',' 'NR>1 { print "instances/"$1".params" }' result.csv)
do
    header=$(head -n 1 $file)
    tail -n 1 $file
done > result.csv.tmp1
paste result.csv <(echo $header; cat result.csv.tmp1) > result.csv.tmp2
rm result.csv.tmp1
mv result.csv.tmp2 result.csv

data <- read.csv("14_generating_random_graphs/result.csv")
data %>%
  mutate(ratio = HEFT / pmax(CP, LB_europar)) %>%
  ggplot(aes(x = ratio)) +
  geom_histogram() +
  facet_grid(m/k ~ k)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

The lower bound is often close to the HEFT makespan, but not always.

data %>%
  mutate(LB = pmax(CP, LB_europar)) %>%
  gather(algo, mks, QALS, ERLS, MxQE, EFiT, Quik, Rati) %>%
  ggplot(aes(x = func, y = mks / LB, col = algo)) +
  geom_boxplot() +
  facet_grid(m/k ~ k)

These instances can be used to compare online scheduling algorithms. From worst to best: Quik, Rati, ERLS, QALS, MxQE and EFiT.
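
This ordering can be checked numerically, for instance by averaging the normalized makespans per algorithm (with the same packages and data frame as above):

data %>%
  mutate(LB = pmax(CP, LB_europar)) %>%
  gather(algo, mks, QALS, ERLS, MxQE, EFiT, Quik, Rati) %>%
  group_by(algo) %>%
  summarise(mean_ratio = mean(mks / LB)) %>%
  arrange(desc(mean_ratio))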

any(data$MxQE != data$EFiT)
## [1] FALSE

MxQE and EFiT always produce identical makespans on these instances.

sessionInfo()
## R version 3.4.3 (2017-11-30)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.3 LTS
## 
## Matrix products: default
## BLAS: /usr/lib/openblas-base/libblas.so.3
## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
## 
## locale:
##  [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8    
##  [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8   
##  [7] LC_PAPER=fr_FR.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] igraph_1.0.1  ggplot2_2.2.1 dplyr_0.4.3   tidyr_0.2.0   purrr_0.2.0  
## [6] stringr_1.0.0
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.10     knitr_1.15.1     magrittr_1.5     munsell_0.4     
##  [5] colorspace_1.2-2 R6_2.1.0         rlang_0.1        plyr_1.8.1      
##  [9] tools_3.4.3      parallel_3.4.3   grid_3.4.3       gtable_0.1.2    
## [13] pacman_0.4.6     DBI_0.3.1        htmltools_0.3.6  lazyeval_0.2.0  
## [17] yaml_2.1.13      rprojroot_1.2    digest_0.6.3     assertthat_0.1  
## [21] tibble_1.3.1     reshape2_1.2.2   evaluate_0.10    rmarkdown_1.5   
## [25] labeling_0.1     stringi_0.5-5    compiler_3.4.3   scales_0.4.1    
## [29] backports_1.0.5