Convergence of Markov process method (December 21, 2015)

Let’s study the empirical convergence rate of a cost matrix generation method that is based on a Markov process. The advantage of this method is to ensure that the generated matrix is drawn uniformly among all the matrices that have some specific heterogeneity properties.

Let’s start by implementing the transition code:

transition <- function(M) {
  i <- ceiling(runif(2, min = 0, max = nrow(M)))
  j <- ceiling(runif(2, min = 0, max = ncol(M)))
  if (M[i[1],j[2]] != 0 && M[i[2],j[1]] != 0) {
    M[i[1],j[1]] <- M[i[1],j[1]] + 1
    M[i[1],j[2]] <- M[i[1],j[2]] - 1
    M[i[2],j[1]] <- M[i[2],j[1]] - 1
    M[i[2],j[2]] <- M[i[2],j[2]] + 1
  }
  M
}

Now, the code to generate some initial matrices:

gen_matrix <- function(n, m, M) {
  base_row <- sample(1:M, n, replace = TRUE)
  base_col <- sample(1:M, m, replace = TRUE)
  base_row %*% t(base_col)
}

Now, let’s run a few tests and see how the correlations behave with the number of transitions:

M <- gen_matrix(10, 10, 10)
meas_corr <- NULL
iterations <- 1e5
for (i in 0:iterations) {
  if (floor(log(i + 1, base = 1.3)) >= floor(log(i, base = 1.3)) + 1)
    meas_corr <- rbind(meas_corr, data.frame(i = i,
                                             mean_cor_row = mean_cor_row(M),
                                             mean_cor_col = mean_cor_col(M)))
  M <- transition(M)
}

Let’s plot the result:

gather(meas_corr, measure, correlation, contains("mean_cor_")) %>%
  ggplot(aes(x = i, y = correlation, color = measure)) +
  geom_point() +
  geom_line()

The convergence is reached for 50 000 iterations. Let’s try with different values:

meas_corr <- NULL
iterations <- 1e5
for (n in c(10, 20))
  for (m in c(10, 20)) {
    M <- gen_matrix(n, m, 10)
    for (i in 0:iterations) {
      if (floor(log(i + 1, base = 1.3)) >= floor(log(i, base = 1.3)) + 1)
        meas_corr <- rbind(meas_corr,
                           data.frame(n = n, m = m, i = i,
                                      mean_cor_row = mean_cor_row(M),
                                      mean_cor_col = mean_cor_col(M)))
      M <- transition(M)
    }
  }

And the result:

gather(meas_corr, measure, correlation, contains("mean_cor_")) %>%
  ggplot(aes(x = i, y = correlation, color = measure)) +
  facet_grid(n ~ m) +
  geom_point() +
  geom_line()

The speed of the convergence is lower when the dimension increase. Let’s investigate this for squared matrix:

meas_corr <- NULL
iterations <- 1e6
for (n in c(10, 20, 30, 50, 100)) {
  M <- gen_matrix(n, n, 10)
  for (i in 0:iterations) {
    if (floor(log(i + 1, base = 1.3)) >= floor(log(i, base = 1.3)) + 1)
      meas_corr <- rbind(meas_corr,
                         data.frame(n = n, i = i,
                                    mean_cor_row = mean_cor_row(M),
                                    mean_cor_col = mean_cor_col(M)))
    M <- transition(M)
  }
}

And the result:

gather(meas_corr, measure, correlation, contains("mean_cor_")) %>%
  ggplot(aes(x = i, y = correlation, color = measure)) +
  facet_wrap(~n) +
  geom_point() +
  geom_line()

If 10 000 iterations are sufficient for n=10, about 500 000 are necessary for n=30. The rate seems to increase quickly. Let’s consider the time required to reach correlations below 0.5 with a bit of stability.

meas_corr <- NULL
iterations <- 1e6
for (n in c(10, 20, 40, 80)) {
  M <- gen_matrix(n, n, 10)
  already <- FALSE
  for (i in 0:iterations) {
    if (floor(log(i + 1, base = 1.3)) >= floor(log(i, base = 1.3)) + 1)
      if (mean_cor_row(M) <= 0.5 || mean_cor_col(M) <= 0.5) {
        if (already)
          break
        else
          already <- TRUE
      }
    M <- transition(M)
  }
  meas_corr <- rbind(meas_corr, data.frame(n = n, i = i))
}

And the result:

meas_corr %>%
  ggplot(aes(x = n, y = i)) +
  geom_point() +
  geom_line()

We see a quadratic progression. Let’s see how non-squared matrices are affected:

meas_corr <- NULL
iterations <- 1e6
for (n in c(10, 20, 30, 40, 60, 80)) {
  M <- gen_matrix(n, 80, 10)
  already <- FALSE
  for (i in 0:iterations) {
    if (floor(log(i + 1, base = 1.3)) >= floor(log(i, base = 1.3)) + 1)
      if (mean_cor_row(M) <= 0.5 || mean_cor_col(M) <= 0.5) {
        if (already)
          break
        else
          already <- TRUE
      }
    M <- transition(M)
  }
  meas_corr <- rbind(meas_corr, data.frame(n = n, i = i))
}

And the result:

meas_corr %>%
  ggplot(aes(x = n, y = i)) +
  geom_point() +
  geom_line()

We can conjecture that the rate is in O(nm).

Let’s finish with an analysis of the impact of the maximum cost:

meas_corr <- NULL
iterations <- 1e6
for (MAX in 2:12) {
  M <- gen_matrix(50, 50, MAX)
  already <- FALSE
  for (i in 0:iterations) {
    if (floor(log(i + 1, base = 1.3)) >= floor(log(i, base = 1.3)) + 1)
      if (mean_cor_row(M) <= 0.5 || mean_cor_col(M) <= 0.5) {
        if (already)
          break
        else
          already <- TRUE
      }
    M <- transition(M)
  }
  meas_corr <- rbind(meas_corr, data.frame(M = MAX, i = i))
}

And the result:

meas_corr %>%
  ggplot(aes(x = M, y = i)) +
  geom_point() +
  geom_line()

OK, it seems to be at least quadratic. Let’s apply a regression and see what better fit the points.

rsquared <- NULL
for (k in 2:6)
  rsquared <- c(rsquared, summary(lm(i ~ I(M ^ k), meas_corr))$r.squared)
print(rsquared)

## [1] 0.8466172 0.9196449 0.9560563 0.9704478 0.9717191

rsquared <- NULL
for (k in 2:6)
  rsquared <- c(rsquared, summary(lm(i ~ I(M ^ k), meas_corr[meas_corr$M < 8,]))$r.squared)
print(rsquared)

## [1] 0.9767188 0.9852752 0.9668185 0.9366381 0.9032574

The regression on the stable low values suggests the rate to be in O(M^3) but it could be even higher.

Conclusion

We end up with a convergence rate of at least O(nmM^2). This study is however limited by the following factors:

it is empirical
it assumes that the correlation is a good indicator of the convergence
no average were performed (only a single test each time) and we may have outliers
the measures could be imprecise due to some approximations

Still it provides some idea of the convergence rate.

Perspectives

The impact of the maximum cost should be investigated (note that the maximum cost in the matrix is M^2). Also, the impact of this parameter could be minimized by changing the transition mechanism to add and subtract values larger than 1.