Split an adjacency matrix into train and test adjacency matrices using either data thinning (Poisson or Gaussian edges) or data fission (Bernoulli edges).

Usage

split_network(
  A,
  distribution,
  epsilon = 0.5,
  gamma = NULL,
  allow_self_loops = TRUE,
  is_directed = TRUE,
  tau = NULL
)

Arguments

A: The square adjacency matrix to be split.
distribution: The distribution that the edges of the adjacency matrix follow. Acceptable distributions are `"gaussian"`, `"poisson"`, or `"bernoulli"`.
epsilon: The parameter controlling the amount of information allocated to the train network versus the test network. For Gaussian and Poisson networks, this must be strictly between 0 and 1. A larger value of `epsilon` indicates more information in the train network. For Bernoulli networks, this input is an alias for the `gamma` parameter.
gamma: For Bernoulli networks, the parameter controlling the amount of information allocated to the train network versus the test network. This must be strictly between 0 and 0.5. A larger value of `gamma` indicates less information in the train network, and more in the test network.
allow_self_loops: A logical indicating whether the network allows self loops (edges pointing from a node to itself). The default is `TRUE`. If set to `FALSE`, the values along the diagonal of the adjacency matrix are ignored.
is_directed: A logical indicating whether the network is directed. The default is `TRUE`. If set to `FALSE`, only the values in the upper triangular portion of the matrix are used.
tau: For networks with Gaussian edges only, this parameter indicates the known common standard deviation (square root of the variance) of the edges in the network.
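
To make the roles of `epsilon` and `tau` concrete, the following base R sketch shows the standard data thinning constructions for Poisson and Gaussian edges. It is an illustration only, assuming the usual binomial and Gaussian thinning recipes; this page does not state how `split_network` computes or scales its output, so the results may differ from what the function returns. All variable names below are hypothetical.

```r
set.seed(1)
n <- 10
epsilon <- 0.3  # share of information sent to the train network

# Poisson edges: thin each count with a binomial draw (assumed recipe).
A_pois <- matrix(rpois(n^2, lambda = 4), nrow = n)
Atr_pois <- matrix(rbinom(n^2, size = A_pois, prob = epsilon), nrow = n)
Ate_pois <- A_pois - Atr_pois

# Gaussian edges with known common standard deviation tau (assumed recipe).
tau <- 5
A_gauss <- matrix(rnorm(n^2, mean = 10, sd = tau), nrow = n)
noise <- matrix(
  rnorm(n^2, mean = 0, sd = tau * sqrt(epsilon * (1 - epsilon))),
  nrow = n
)
Atr_gauss <- epsilon * A_gauss + noise
Ate_gauss <- A_gauss - Atr_gauss
```

In both cases the two pieces add back up to the original matrix, and a larger `epsilon` moves more of each edge's signal into the train piece. The Gaussian recipe needs `tau` because the variance of the added noise depends on it.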

Value

A list with two elements labeled `"Atr"` and `"Ate"`, which are the train and test networks, respectively.

Examples

# Split a simulated Gaussian adjacency matrix
A_gaussian <- matrix(rnorm(n = 10^2, mean = 10, sd = 5), nrow = 10)
gaussian_split <- split_network(A_gaussian, "gaussian", 0.3, tau = 5)
A_gaussian_tr <- gaussian_split$Atr
A_gaussian_te <- gaussian_split$Ate

# Split a simulated Bernoulli adjacency matrix with gamma = 0.25
A_bernoulli <- matrix(rbinom(n = 10^2, size = 1, prob = 0.5), nrow = 10)
bernoulli_split <- split_network(A_bernoulli, "bernoulli", gamma = 0.25)
A_bernoulli_tr <- bernoulli_split$Atr
A_bernoulli_te <- bernoulli_split$Ate
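
The Bernoulli case uses `gamma` rather than a thinning proportion. The base R sketch below illustrates one common data fission construction for binary edges, in which each edge indicator is flipped independently with probability `gamma` to form the train network and the original matrix is kept as the test network. This is an assumption made for illustration; this page does not confirm that `split_network` uses exactly this construction, and the variable names are hypothetical.

```r
set.seed(1)
n <- 10
gamma <- 0.25  # probability that an edge indicator is flipped

A <- matrix(rbinom(n^2, size = 1, prob = 0.5), nrow = n)

# Independent flip indicators, one per entry of A.
flip <- matrix(rbinom(n^2, size = 1, prob = gamma), nrow = n)

# Train network: A with the selected entries flipped (elementwise XOR).
Atr_sketch <- abs(A - flip)

# Test network: the original matrix (assumed construction).
Ate_sketch <- A

# Observed fraction of flipped entries, close to gamma on average.
mean(Atr_sketch != A)
```

Under this construction, a `gamma` near 0 leaves the train network almost identical to `A`, while a `gamma` near 0.5 makes it close to pure noise. That matches the documented behavior that a larger `gamma` puts less information in the train network.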