Additional algorithms for the unrelated case (June 2, 2014 to June 18, 2014)

We need a collection of scheduling algorithms for comparing the effect of the cost generation method on the conclusions. Once we have several algorithms for allocating independent tasks on unrelated machines, we will test each method with each cost generation technique under varying parameters. We are interested in selecting the best ones to assess the performance of LPT (for the uniform case) on a simplification of an unrelated instance.

EFT and LPT for unrelated instances

# EFT
EFT <- function(costs) {
    compute_alloc(costs, EFT_alloc(costs))
}

EFT_alloc <- function(costs) {
    # At each step, schedule the unscheduled task/processor pair with the
    # earliest finish time (current ready time plus cost).
    RT <- rep(0, ncol(costs))
    index <- 1:nrow(costs)
    alloc <- numeric(nrow(costs))
    while (length(index) != 0) {
        efj <- index[1]
        efp <- 1
        for (i in index) for (j in 1:ncol(costs)) if (RT[j] + costs[i, j] < 
            RT[efp] + costs[efj, efp]) {
            efj <- i
            efp <- j
        }
        RT[efp] <- RT[efp] + costs[efj, efp]
        alloc[efj] <- efp
        index <- index[index != efj]
    }
    alloc
}
LPT_unrelated <- function(costs, func = mean) {
    compute_alloc(costs, LPT_unrelated_alloc(costs, func))
}

LPT_unrelated_alloc <- function(costs, func = mean) {
    # Consider tasks by decreasing aggregated cost (func applied to each row)
    # and assign each one to the processor with the earliest finish time.
    RT <- rep(0, ncol(costs))
    mean_costs <- sapply(1:nrow(costs), function(i) return(func(costs[i, ])))
    index <- order(mean_costs, decreasing = TRUE)
    alloc <- numeric(nrow(costs))
    for (i in index) {
        efp <- 1
        for (j in 1:ncol(costs)) if (RT[j] + costs[i, j] < RT[efp] + costs[i, 
            efp]) {
            efp <- j
        }
        RT[efp] <- RT[efp] + costs[i, efp]
        alloc[i] <- efp
    }
    alloc
}
balance_sufferage <- function(costs) {
    compute_alloc(costs, balance_sufferage_alloc(costs))
}

balance_sufferage_alloc <- function(costs) {
    # Start from the allocation minimizing each task's sufferage, then
    # repeatedly move a task away from the most loaded processor, choosing
    # the move with minimum sufferage among those that decrease the makespan.
    # Compute sufferage matrix
    S <- t(sapply(1:nrow(costs), function(i) {
        costs[i, ] - min(costs[i, ])
    }))
    # Compute default allocation that minimizes sufferage
    allocation <- sapply(1:nrow(costs), function(i) {
        which.min(S[i, ])[1]
    })
    alloc <- allocation
    # Transform as an appropriate structure
    proc <- sapply(1:ncol(costs), function(p) {
        list(proc = p, tasks = which(allocation == p))
    }, simplify = FALSE)
    # Compute ready times
    RT <- sapply(proc, function(p) {
        sum(costs[p$tasks, p$proc])
    })
    repeat {
        pmax <- which.max(RT)[1]
        suff_min <- max(S) + 1
        task <- NULL
        target <- NULL
        for (i in proc[[pmax]]$tasks) {
            for (p in which(1:ncol(costs) != pmax)) {
                suff <- S[i, p]
                cmax <- max(RT[pmax] - costs[i, pmax], RT[p] + costs[i, p])
                if (cmax < RT[pmax] && suff < suff_min) {
                  suff_min <- suff
                  task <- i
                  target <- p
                }
            }
        }
        if (is.null(task) || is.null(target)) {
            return(alloc)
        }

        proc[[target]]$tasks <- c(proc[[target]]$tasks, task)
        proc[[pmax]]$tasks <- proc[[pmax]]$tasks[proc[[pmax]]$tasks != task]
        RT[target] <- RT[target] + costs[task, target]
        alloc[task] <- target
        RT[pmax] <- RT[pmax] - costs[task, pmax]
    }
}
balance_EFT <- function(costs) {
    compute_alloc(costs, balance_EFT_alloc(costs))
}

balance_EFT_alloc <- function(costs) {
    # Same as balance_sufferage_alloc, except that the move is chosen to
    # minimize the finish time on the target processor (while still
    # decreasing the makespan).
    # Compute sufferage matrix
    S <- t(sapply(1:nrow(costs), function(i) {
        costs[i, ] - min(costs[i, ])
    }))
    # Compute default allocation that minimizes sufferage
    allocation <- sapply(1:nrow(costs), function(i) {
        which.min(S[i, ])[1]
    })
    alloc <- allocation
    # Transform as an appropriate structure
    proc <- sapply(1:ncol(costs), function(p) {
        list(proc = p, tasks = which(allocation == p))
    }, simplify = FALSE)
    # Compute ready times
    RT <- sapply(proc, function(p) {
        sum(costs[p$tasks, p$proc])
    })
    repeat {
        pmax <- which.max(RT)[1]
        RT_min <- RT[pmax]
        task <- NULL
        target <- NULL
        for (i in proc[[pmax]]$tasks) {
            for (p in which(1:ncol(costs) != pmax)) {
                RTnew <- RT[p] + costs[i, p]
                if (RTnew < RT_min) {
                  RT_min <- RTnew
                  task <- i
                  target <- p
                }
            }
        }
        if (is.null(task) || is.null(target)) 
            return(alloc)
        proc[[target]]$tasks <- c(proc[[target]]$tasks, task)
        proc[[pmax]]$tasks <- proc[[pmax]]$tasks[proc[[pmax]]$tasks != task]
        RT[target] <- RT[target] + costs[task, target]
        alloc[task] <- target
        RT[pmax] <- RT[pmax] - costs[task, pmax]
    }
}

Z <- create_matrix(c(3, 6, 1), c(1, 2))
S <- simplify_matrix(Z)
LPT(S$row, S$col)
## [1] 7
EFT(Z)
## [1] 10
LPT_unrelated(Z)
## [1] 7
LPT_unrelated(Z, min)
## [1] 7
balance_sufferage(Z)
## [1] 7
balance_EFT(Z)
## [1] 7

Ideas for more heuristics:

We now have four (or five) methods. We can start a basic experiment plan and add more heuristics later.
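
To keep the plan extensible, the heuristics could be stored in a named list (a sketch, not part of the original code), so that new methods can simply be appended later:

# Hypothetical registry of the heuristics; applied here to the toy matrix Z.
heuristics <- list(EFT = EFT, LPT_mean = LPT_unrelated, LPT_min = function(costs) LPT_unrelated(costs, 
    min), bal_suff = balance_sufferage, bal_EFT = balance_EFT)
sapply(heuristics, function(h) h(Z))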

First experiments

Let's start with only one scenario to see if any heuristic shows pathological behavior. Let's assume that there are 10 times more tasks than processors.

n <- 300
m <- n/10
mu <- 10
sigma <- 2
rhoR <- 0.8
rhoC <- 0.8

rdist <- function(n, mean, sd) {
    # Uniform distribution with the requested mean and standard deviation
    # (the standard deviation of U(a, b) is (b - a) / sqrt(12)).
    a <- mean - sqrt(12)/2 * sd  #lower
    b <- mean + sqrt(12)/2 * sd  #upper
    return(runif(n, a, b))
}
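
As a quick sanity check (illustrative only, not part of the original experiment), the empirical mean and standard deviation of a large sample should be close to mu and sigma:

x <- rdist(1e+06, mu, sigma)
c(mean = mean(x), sd = sd(x))  # expected to be close to mu = 10 and sigma = 2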

Z <- generate_matrix_corr(n, m, rdist, mu, sigma, rhoR, rhoC)
S <- simplify_matrix(Z)
timestamp()
## ##------ Wed Mar 18 13:10:26 2015 ------##
LPT(S$row, S$col)
## [1] 101
timestamp()
## ##------ Wed Mar 18 13:10:26 2015 ------##
EFT(Z)
## [1] 92.75
timestamp()
## ##------ Wed Mar 18 13:10:28 2015 ------##
LPT_unrelated(Z)
## [1] 94.65
timestamp()
## ##------ Wed Mar 18 13:10:28 2015 ------##
LPT_unrelated(Z, min)
## [1] 94.35
timestamp()
## ##------ Wed Mar 18 13:10:28 2015 ------##
balance_sufferage(Z)
## [1] 90.42
timestamp()
## ##------ Wed Mar 18 13:10:31 2015 ------##
balance_EFT(Z)
## [1] 92.14
timestamp()
## ##------ Wed Mar 18 13:10:32 2015 ------##

All methods are essentially instantaneous except EFT and balance_sufferage, which take the longest (a few seconds each according to the timestamps).
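
For more precise measurements than timestamp(), a possible sketch (not part of the original run) uses system.time():

# Elapsed times in seconds; they vary with the machine and the instance.
sapply(list(EFT = EFT, LPT_mean = LPT_unrelated, bal_suff = balance_sufferage, 
    bal_EFT = balance_EFT), function(f) system.time(f(Z))["elapsed"])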

Now, let's compare the performance of all methods over a series of instances.

n <- 100
m <- n/10
result <- data.frame()
for (i in 1:100) {
    Z <- generate_matrix_corr(n, m, rdist, mu, sigma, rhoR, rhoC)
    result <- rbind(result, list(EFT = EFT(Z), LPT_mean = LPT_unrelated(Z), 
        LPT_min = LPT_unrelated(Z, min), bal_suff = balance_sufferage(Z), bal_EFT = balance_EFT(Z)))
}
summary(result)
##       EFT           LPT_mean        LPT_min         bal_suff    
##  Min.   : 82.2   Min.   : 86.3   Min.   : 85.8   Min.   : 80.6  
##  1st Qu.: 91.9   1st Qu.: 95.5   1st Qu.: 95.5   1st Qu.: 90.2  
##  Median : 94.7   Median : 98.5   Median : 98.4   Median : 93.0  
##  Mean   : 95.1   Mean   : 98.4   Mean   : 98.5   Mean   : 93.0  
##  3rd Qu.: 98.5   3rd Qu.:102.0   3rd Qu.:102.3   3rd Qu.: 96.1  
##  Max.   :106.8   Max.   :112.0   Max.   :111.3   Max.   :106.7  
##     bal_EFT     
##  Min.   : 83.7  
##  1st Qu.: 93.1  
##  Median : 95.5  
##  Mean   : 96.0  
##  3rd Qu.: 99.4  
##  Max.   :110.8
sort(summary(result)[4, ])
##          bal_suff               EFT           bal_EFT          LPT_mean 
## "Mean   : 93.0  " "Mean   : 95.1  " "Mean   : 96.0  " "Mean   : 98.4  " 
##           LPT_min 
## "Mean   : 98.5  "

We can thus order the methods by mean makespan, from best to worst: balance-sufferage, EFT, balance-EFT, LPT-mean, and LPT-min.
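
Instead of sorting the character strings returned by summary(), the means could also be ordered numerically (a minor variant, not in the original analysis):

sort(colMeans(result))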

t.test(result$bal_suff, result$EFT)
## 
##  Welch Two Sample t-test
## 
## data:  result$bal_suff and result$EFT
## t = -2.79, df = 198, p-value = 0.005784
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.4669 -0.5956
## sample estimates:
## mean of x mean of y 
##     93.04     95.07
t.test(result$EFT, result$bal_EFT)
## 
##  Welch Two Sample t-test
## 
## data:  result$EFT and result$bal_EFT
## t = -1.233, df = 198, p-value = 0.219
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.3560  0.5431
## sample estimates:
## mean of x mean of y 
##     95.07     95.98
t.test(result$bal_EFT, result$LPT_mean)
## 
##  Welch Two Sample t-test
## 
## data:  result$bal_EFT and result$LPT_mean
## t = -3.266, df = 198, p-value = 0.001284
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.8613 -0.9541
## sample estimates:
## mean of x mean of y 
##     95.98     98.39
t.test(result$LPT_mean, result$LPT_min)
## 
##  Welch Two Sample t-test
## 
## data:  result$LPT_mean and result$LPT_min
## t = -0.2013, df = 198, p-value = 0.8407
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.606  1.308
## sample estimates:
## mean of x mean of y 
##     98.39     98.54

Based on these tests, the statistically distinct groups are (see the pairwise comparison sketch after this list):

  1. bal-suff
  2. EFT and bal-EFT
  3. LPT-min and LPT-mean
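
Instead of chaining individual t-tests, all pairwise Welch comparisons could be done at once with a correction for multiple testing (a sketch, not in the original analysis):

# Welch tests for each pair of methods, with Holm's correction.
with(stack(result), pairwise.t.test(values, ind, p.adjust.method = "holm", 
    pool.sd = FALSE))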

Let's see how frequent each method is the best:

table(sapply(1:nrow(result), function(i) {
    which.min(result[i, ])
}))
## 
##  1  4 
##  5 95

In this table, column 1 is EFT and column 4 is bal_suff: the best method is almost always bal-suff, except occasionally EFT (or its near-equivalent bal-EFT).
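
A small variant (not in the original) displays the method names instead of the column indices:

table(names(result)[apply(result, 1, which.min)])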

We can now test whether our matrix generation method leads to new observations.

Comparing both generation methods

Let's start by characterizing the efficiency of LPT in the uniform case (relative to the proposed methods).

best_cmax <- function(costs) {
    min(EFT(costs), LPT_unrelated(costs), LPT_unrelated(costs, min), balance_sufferage(costs), 
        balance_EFT(costs))
}

LPT_perf <- NULL
for (i in 1:100) {
    Z <- generate_matrix_corr(n, m, rdist, mu, sigma, 0, 0)
    S <- simplify_matrix(Z)
    cmax <- LPT(S$row, S$col)
    costs <- create_matrix(S$row, S$col)
    LPT_perf <- c(LPT_perf, cmax/best_cmax(costs))
}
summary(LPT_perf)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00    1.00    1.00    1.00    1.00    1.01

In this uncorrelated case, LPT on the simplified uniform instance is thus within 1% of the best proposed method. We then measure the ratio of LPT_unrelated over the best method when both correlations increase identically.

LPT_unrelated_perf <- data.frame()
for (i in 1:1000) {
    corr <- runif(1)
    Z <- generate_matrix_corr(n, m, rdist, mu, sigma, corr, corr)
    cmax <- LPT_unrelated(Z)
    LPT_unrelated_perf <- rbind(LPT_unrelated_perf, list(corr = corr, ratio = cmax/best_cmax(Z)))
}
library(ggplot2)
p <- ggplot(data = LPT_unrelated_perf, aes(x = corr, y = ratio))
p <- p + geom_point() + geom_smooth()
p
## geom_smooth: method="auto" and size of largest group is >=1000, so using gam with formula: y ~ s(x, bs = "cs"). Use 'method = x' to change the smoothing method.

[Figure (chunk LPT_unrelated_perf_fig): ratio of LPT_unrelated over the best heuristic as a function of the correlation, with a smoothed trend]

Now, let's see what happens when the two correlations differ, using a heat map.

iterations <- 100
nb_point <- 10 + 1  # odd because of symmetry
rhos <- seq(0, 1, length.out = nb_point)[-nb_point]

LPT_unrelated_perf_tile <- data.frame()
for (rhoR in rhos) for (rhoC in rhos) {
    ratios <- NULL
    for (i in 1:iterations) {
        Z <- generate_matrix_corr(n, m, rdist, mu, sigma, rhoR, rhoC)
        ratios <- c(ratios, LPT_unrelated(Z)/best_cmax(Z))
    }
    LPT_unrelated_perf_tile <- rbind(LPT_unrelated_perf_tile, list(rhoR = rhoR, 
        rhoC = rhoC, ratio = mean(ratios)))
}
library(ggplot2)
p <- ggplot(data = LPT_unrelated_perf_tile, aes(x = rhoR, y = rhoC, fill = ratio, 
    z = ratio))
p <- p + geom_tile() + theme_bw()
p <- p + stat_contour(aes(color = ..level..), breaks = seq(0, 2, 0.02))
p <- p + theme(legend.position = "bottom")
p <- p + scale_color_continuous(low = "#000000", high = "#000000")
library(directlabels)
## Loading required package: grid
## Loading required package: quadprog
## 
## Attaching package: 'directlabels'
## 
## The following object is masked from 'package:nlme':
## 
##     gapply
direct.label(p)
## Loading required package: proto

[Figure (chunk unnamed-chunk-6): heat map of the mean ratio as a function of rhoR and rhoC, with contour lines]

We see that the diagonal confirms our previous figure. We also see that the evolution is not linear when one of the coefficients is fixed to zero.
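
To inspect this non-linearity, one could extract a slice of the heat map data, for instance with a zero column correlation (an illustrative sketch):

# Mean ratio as a function of rhoR when rhoC is zero.
subset(LPT_unrelated_perf_tile, rhoC == 0)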

Let's see what we would get with Siegel's method:

LPT_unrelated_siegel_perf <- data.frame()
for (i in 1:1000) {
    corr <- runif(1)
    Z <- generate_matrix_siegel_corr(n, m, rdist, mu, sigma, corr)
    cmax <- LPT_unrelated(Z)
    LPT_unrelated_siegel_perf <- rbind(LPT_unrelated_siegel_perf, list(corr = corr, 
        ratio = cmax/best_cmax(Z)))
}
library(ggplot2)
p <- ggplot(data = LPT_unrelated_siegel_perf, aes(x = corr, y = ratio))
p <- p + geom_point() + geom_smooth()
p
## geom_smooth: method="auto" and size of largest group is >=1000, so using gam with formula: y ~ s(x, bs = "cs"). Use 'method = x' to change the smoothing method.

[Figure (chunk unnamed-chunk-7): ratio of LPT_unrelated over the best heuristic as a function of the correlation, with Siegel's generation method]

We see a trend similar to the one obtained with our method. Let's try to superimpose them:

LPT_unrelated_merge <- rbind(cbind(LPT_unrelated_perf, method = factor("corr")), 
    cbind(LPT_unrelated_siegel_perf, method = factor("siegel")))
library(ggplot2)
p <- ggplot(data = LPT_unrelated_merge, aes(x = corr, y = ratio, group = method))
p <- p + geom_smooth()
p
## geom_smooth: method="auto" and size of largest group is >=1000, so using gam with formula: y ~ s(x, bs = "cs"). Use 'method = x' to change the smoothing method.

[Figure (chunk unnamed-chunk-8): superimposed smoothed trends of the ratio for both generation methods]

Siegel's trend is less linear, and the behavior when the correlation is greater than 0.8 may be difficult to assess. Recall that this method was not presented as one in which the correlation coefficient can be controlled. Thus, many published papers may have used arbitrary parameters that do not yield all the behaviors that our method produces. To be fairer, it may be relevant to see which areas of this curve are generally studied, in order to assess the potential bias.

Conclusion

We have a first idea of what we can expect from our generation method and of how our proposed heuristics behave. The next step is to analyse what can be gained, from a methodological point of view, by using our generation method.

A first question is the following: when using Siegel's method as it was published, how are the properties of the matrices distributed (correlation, but also mean and standard deviation)? To answer this question, we could select a subset of the articles (or all of them) that used this method and try to reproduce the matrices to obtain their properties (or just the experiment plan). Depending on the result of this study, a bias could appear.
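
As a starting point for measuring such properties, a simple sketch (assuming the mean pairwise Pearson correlation between rows and between columns is an acceptable summary) could be:

matrix_properties <- function(costs) {
    # Global mean and standard deviation of the costs, and mean pairwise
    # correlation between rows (tasks) and between columns (machines).
    col_cor <- cor(costs)
    row_cor <- cor(t(costs))
    c(mean = mean(costs), sd = sd(as.vector(costs)), rho_col = mean(col_cor[upper.tri(col_cor)]), 
        rho_row = mean(row_cor[upper.tri(row_cor)]))
}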

A second question is whether our method, by proposing two correlation parameters, may lead to different conclusions. This is not clear yet, even though the linearity we already observed seems to be an advantage.

Another question concerns the properties of the cost matrices that exist in practice. Where could we get such information? How could we test this? Should GPU/CPU platforms be included? Can BOINC provide data on this?

If setting both correlations to the same value leads to the same conclusions, and if concrete cost matrices do not exhibit distinct row and column correlations, our method could be simplified.

As a side contribution, we may prove that LPT is the best heuristic in our collection for uniform platforms (if that is indeed the case).