Coefficient of variation of the singular/eigen values (March 18, 2015)

We are interested in a measure related to, and certainly cleaner than, the TMA. Let's consider the CV of the singular values (or, alternatively, the CV of the eigenvalues). We may need to normalize this measure as it depends on the dimensions of the matrix. We will try to derive formal bounds on it, compare it to the correlations and to both versions of the TMA, and determine its expected value for a matrix generated randomly according to a uniform law. Finally, we will sketch a method for determining which measure is the best predictor (with a linear regression).

CV of the singular values and CV of the eigenvalues

Let's do some tests:

n <- 100
m <- 100
mu <- 1
cv <- 1
U <- generate_uniform_matrix(n, m, mu, cv, cv)
R <- generate_random_matrix(n, m, mu, cv)
D <- diag(rep(1, n))
c(CV_SV(U), CV_SV(R), CV_SV(D))
## [1] 10.00  1.16  0.00

Let's test with non-square matrices:

n <- 200
U <- generate_uniform_matrix(n, m, mu, cv, cv)
R <- generate_random_matrix(n, m, mu, cv)
c(CV_SV(U), CV_SV(R))
## [1] 10.0000  0.9727

Let's see the correspondence with the eigenvalues:

n <- 100
U <- generate_uniform_matrix(n, m, mu, cv, cv)
summary(eigen(U, symmetric = FALSE, only.values = TRUE)$values)
##  Length   Class    Mode 
##     100 complex complex

The eigenvalues can only be computed for square matrices and the values are complex. We will only consider the singular values from now on.
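For reference, here is a minimal sketch of what CV_SV presumably computes, assuming it is simply the coefficient of variation (sample standard deviation over mean) of the singular values; the actual implementation may differ:

# Hypothetical sketch of CV_SV (assumption: CV of the singular values)
CV_SV_sketch <- function(mat) {
    d <- svd(mat)$d  # singular values of the matrix
    sd(d)/mean(d)  # coefficient of variation
}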

Bounds

With the previous extreme cases, we see that the range may be \( [0, \sqrt{\min(n,m)}] \): the CV is 0 when all the singular values are equal (identity matrix) and maximal for a rank-1 matrix such as the uniform one.
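As a sanity check, assuming CV_SV uses the sample standard deviation, the rank-1 case with \( p = \min(n,m) \) singular values \( (s, 0, \dots, 0) \) gives:

\[
\bar{d} = \frac{s}{p}, \qquad
\sigma = \sqrt{\frac{1}{p-1}\left[\left(s - \frac{s}{p}\right)^2 + (p-1)\frac{s^2}{p^2}\right]} = \frac{s}{\sqrt{p}}, \qquad
\frac{\sigma}{\bar{d}} = \sqrt{p},
\]

which matches the value of 10 observed above for \( p = 100 \).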

We may thus normalize the measure. But should it be done on the smallest dimension or on the largest?

We did not find any lead.

Frobenius norm

The Frobenius norm may also be interesting. Let's test it:

n <- 100
m <- 100
mu <- 1
cv <- 1
U <- generate_uniform_matrix(n, m, mu, cv, cv)
R <- generate_random_matrix(n, m, mu, cv)
D <- diag(rep(1, n))
c(norm(U, "F"), norm(R, "F"), norm(D, "F"))/c(sum(svd(U)$d), sum(svd(R)$d), 
    sum(svd(D)$d))
## [1] 1.0000 0.1516 0.1000

We see that the norm is between the sum of all the singular values divided by \( \sqrt{\min(n,m)} \) and the sum of all the singular values. This corresponds to two cases: all singular values are identical; or all singular values are zero except one.
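This follows from the standard identity linking the Frobenius norm to the singular values \( d_1, \dots, d_p \), with \( p = \min(n,m) \):

\[
\|A\|_F = \sqrt{\sum_{i=1}^{p} d_i^2}, \qquad
\frac{1}{\sqrt{p}} \le \frac{\|A\|_F}{\sum_{i=1}^{p} d_i} \le 1,
\]

where the lower bound (Cauchy-Schwarz) is reached when all singular values are equal and the upper bound when a single one is nonzero.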

Relation with other measures

Let's test whether this norm is related to any other measure, both as-is and normalized by the sum of the singular values.

measures <- NULL
for (rhoR in seq(0.1, 0.9, 0.1)) for (rhoC in seq(0.1, 0.9, 0.1)) {
    mat <- generate_heterogeneous_matrix_noise_corr(100, 100, rgamma_cost, rhoR, 
        rhoC, 3)
    measures <- rbind(measures, data.frame(mean_cor_row = mean_cor_row(mat), 
        mean_cor_col = mean_cor_col(mat), TMA1 = TMA1(mat), TMA2 = TMA2(mat), 
        CVSV = CV_SV(mat), fnorm = norm(mat, "F"), sumSV = sum(svd(mat)$d), 
        method = "noise"))
    mat <- generate_matrix_corr(100, 100, rgamma_cost, 1, 1, rhoR, rhoC)
    measures <- rbind(measures, data.frame(mean_cor_row = mean_cor_row(mat), 
        mean_cor_col = mean_cor_col(mat), TMA1 = TMA1(mat), TMA2 = TMA2(mat), 
        CVSV = CV_SV(mat), fnorm = norm(mat, "F"), sumSV = sum(svd(mat)$d), 
        method = "combination"))
}

Let's detect which measure is related to which other one.

indexNoise <- measures$method == "noise"
m_complete <- cbind(measures[, -ncol(measures)], corrProd = measures$mean_cor_row * 
    measures$mean_cor_col, normfnorm = measures$fnorm/measures$sumSV)
correlations <- NULL
for (meas1 in 1:(ncol(m_complete) - 1)) for (meas2 in (meas1 + 1):ncol(m_complete)) correlations <- rbind(correlations, 
    data.frame(name1 = colnames(m_complete)[meas1], name2 = colnames(m_complete)[meas2], 
        corr = cor.test(m_complete[indexNoise, meas1], m_complete[indexNoise, 
            meas2])$estimate, method = "noise"), data.frame(name1 = colnames(m_complete)[meas1], 
        name2 = colnames(m_complete)[meas2], corr = cor.test(m_complete[!indexNoise, 
            meas1], m_complete[!indexNoise, meas2])$estimate, method = "combination"))
correlations <- correlations[order(correlations$corr, decreasing = TRUE), ]
correlations[correlations$corr > 0.5 | correlations$corr < -0.5, ]
##              name1     name2    corr      method
## cor58         CVSV normfnorm  1.0000       noise
## cor59         CVSV normfnorm  0.9998 combination
## cor60        fnorm     sumSV  0.9171       noise
## cor46         TMA2     sumSV  0.7710       noise
## cor57         CVSV  corrProd  0.7705 combination
## cor71     corrProd normfnorm  0.7690 combination
## cor27 mean_cor_col  corrProd  0.6918 combination
## cor70     corrProd normfnorm  0.6628       noise
## cor56         CVSV  corrProd  0.6624       noise
## cor13 mean_cor_row  corrProd  0.6608 combination
## cor7  mean_cor_row      CVSV  0.6305 combination
## cor15 mean_cor_row normfnorm  0.6258 combination
## cor12 mean_cor_row  corrProd  0.6205       noise
## cor21 mean_cor_col      CVSV  0.6159 combination
## cor29 mean_cor_col normfnorm  0.6122 combination
## cor26 mean_cor_col  corrProd  0.5975       noise
## cor44         TMA2     fnorm  0.5736       noise
## cor20 mean_cor_col      CVSV  0.5148       noise
## cor28 mean_cor_col normfnorm  0.5144       noise
## cor48         TMA2  corrProd -0.5571       noise
## cor25 mean_cor_col     sumSV -0.5788 combination
## cor50         TMA2 normfnorm -0.6219       noise
## cor42         TMA2      CVSV -0.6231       noise
## cor11 mean_cor_row     sumSV -0.6532 combination
## cor16 mean_cor_col      TMA1 -0.6559       noise
## cor67        sumSV  corrProd -0.7143 combination
## cor69        sumSV normfnorm -0.9064 combination
## cor55         CVSV     sumSV -0.9123 combination

We see a clear relation between the CV of the singular values and the normalized Frobenius norm. These two measures may thus be considered equivalent and we will prefer the first one. The sumSV is not really interesting here: its behavior differs depending on the generation method, and all the negative correlations observed with the combination-based method involve it. Setting sumSV aside, the remaining negative correlations are specific to the noise-based method. Among the remaining relations, the only interesting one is between the product of the correlations and the CV of the singular values.

We thus have two similar but distinct measures: the correlations and the CV of the singular values. The TMA2 is also partially correlated with them, but only for the noise-based method.

Random matrix

R <- generate_random_matrix(100, 100, 1, 1)
c(TMA2(R), mean_cor_row(R), mean_cor_col(R), CV_SV(R))
## [1]  0.2714929 -0.0005887 -0.0017780  1.1345734

Obviously, the correlations are close to zero. Let's consider the two other measures (TMA2 and the CV of the singular values) with other parameters:

R <- generate_random_matrix(100, 100, 1, 0.1)
c(TMA2(R), CV_SV(R))
## [1] 0.008573 5.391758
R <- generate_random_matrix(100, 100, 0.1, 1)
c(TMA2(R), CV_SV(R))
## [1] 0.3119 1.1492
R <- generate_random_matrix(100, 10, 1, 1)
c(TMA2(R), CV_SV(R))
## [1] 0.4609 0.6410
R <- generate_random_matrix(10, 100, 1, 1)
c(TMA2(R), CV_SV(R))
## [1] 0.5089 0.6383

Both measures are significantly impacted by the CV. Low heterogeneity leads to nearly uniform instances (low TMA2 and large CV of the singular values). Non-square matrices tend to increase the affinity.

In any case, both measures remain low for random matrices (except when the heterogeneity is low, in which case the CV of the singular values is large).

Identifying the best predictor

Let's consider the impact of the correlation on the performance of HLPT. As the relation is almost linear, we can use a linear regression to see which measure is the best predictor.

best_cmax <- function(costs) {
    # reference makespan: best result among three heuristics, used for the ratio
    min(LPT_unrelated(costs, min), balance_sufferage(costs), balance_EFT(costs))
}

LPT_unrelated_perf <- NULL
for (i in 1:100) {
    corr <- runif(1)
    Z <- generate_matrix_corr(200, 50, rgamma_cost, 1, 0.1, corr, corr)
    cmax <- LPT_unrelated(Z, func = min)
    LPT_unrelated_perf <- rbind(LPT_unrelated_perf, data.frame(corr = corr, 
        mean_cor_row = mean_cor_row(Z), mean_cor_col = mean_cor_col(Z), TMA1 = TMA1(Z), 
        TMA2 = TMA2(Z), CVSV = CV_SV(Z), ratio = max(1, cmax/best_cmax(Z))))
}

Before analyzing the data, let's visualize them:

library(ggplot2)
p <- ggplot(data = LPT_unrelated_perf, aes(x = corr, y = ratio))
p <- p + geom_point() + geom_smooth()
p
## geom_smooth: method="auto" and size of largest group is <1000, so using loess. Use 'method = x' to change the smoothing method.

Plot: HLPT performance ratio as a function of the correlation, with a loess smoothing curve.

Let's run a linear regression with a single predictor each time. We will quantify the prediction with the adjusted R-squared:

summary(lm(ratio ~ corr, LPT_unrelated_perf))$adj.r.squared
## [1] 0.7759
summary(lm(ratio ~ mean_cor_row, LPT_unrelated_perf))$adj.r.squared
## [1] 0.7632
summary(lm(ratio ~ mean_cor_col, LPT_unrelated_perf))$adj.r.squared
## [1] 0.7692
summary(lm(ratio ~ TMA1, LPT_unrelated_perf))$adj.r.squared
## [1] 0.7772
summary(lm(ratio ~ TMA2, LPT_unrelated_perf))$adj.r.squared
## [1] 0.7778
summary(lm(ratio ~ CVSV, LPT_unrelated_perf))$adj.r.squared
## [1] 0.7493

We see that both versions of the TMA are actually good predictors. It is surprising that the TMA outperforms the CVSV. Maybe it is similar to the fact that HLPT is better with min than with mean: focusing on the most extreme values gives better results because they are likely to play a role in an efficient schedule.

Conclusion

We will keep the following properties when generating a matrix: the generation method and its parameters (including the seed), the four heterogeneity measures, MPH, TDH, TMA1, TMA2, both correlations, and the CV of the singular values.

Let's focus on the idea that a heuristic or a measure could benefit from considering the extreme cases that are likely to occur in an optimal schedule rather than all the possible cases.

Note that the combination-based method produces negative costs when used with a large CV. This needs to be fixed.