Conduct inference using `Ate` for the selected target of inference: a linear combination of connectivity parameters based upon estimated communities. Note that communities should be estimated using the `Atr` matrix produced from the `split_network()` function prior to using this function.
infer_network.Rd
Usage
infer_network(
Ate,
u,
communities,
distribution,
epsilon = 0.5,
gamma = NULL,
Atr = NULL,
allow_self_loops = TRUE,
is_directed = TRUE,
tau = NULL
)
Arguments
- Ate
The test adjacency matrix produced from the `split_network()` function which will be used to conduct inference.
- u
The linear combination vector (or matrix) which specifies which connectivity parameters should be considered when constructing the selected target of inference. This input should have norm 1. If `is_directed` is set to `FALSE`, then the matrix version of `u` must be upper triangular.
- communities
A vector or matrix which specifies the estimated communities. As a vector, it should have length `n` when `Ate` is of size `n` x `n`, where the ith element is the number of the community that the ith node belongs to. For example, `communities[2] = 3` indicates that the 2nd node belongs to the 3rd estimated community. As a matrix, it should be of size `n` x `K`, where `K` is the number of estimated communities, and contain only 0s and 1s, where `communities[i, k] = 1` indicates that the ith node belongs to the `k`th community. Each node may belong to only one community, so each row of this matrix should contain exactly one 1.
- distribution
The distribution that the edges of the adjacency matrix follow. Acceptable distributions are `"gaussian"`, `"poisson"`, or `"bernoulli"`.
- epsilon
The parameter controlling the amount of information allocated to the train network versus the test network. For Gaussian and Poisson networks, this must be between 0 and 1 (non-inclusive). A larger value of epsilon indicates more information in the train network. For Bernoulli networks, this input is an alias to the `gamma` parameter.
- gamma
For Bernoulli networks, the parameter controlling the amount of information allocated to the train network versus the test network. This must be between 0 and 0.5 (non-inclusive). A larger value of `gamma` indicates less information in the train network, and more in the test network.
- Atr
The train adjacency matrix produced from the `split_network()` function. This is only necessary when the network has edges which follow the Bernoulli distribution.
- allow_self_loops
A logical indicating whether the network allows self loops (edges pointing from a node to itself). By default this parameter is set to `TRUE`. If this is set to `FALSE`, then values along the diagonal of the adjacency matrix will be ignored.
- is_directed
A logical indicating whether the network is directed. By default this parameter is set to `TRUE`. If this is set to `FALSE`, then only the values in the upper triangular portion of the adjacency matrix will be used.
- tau
For networks with Gaussian edges only, this parameter indicates the known common standard deviation (square root of the variance) of the edges in the network.
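The two accepted forms of `communities` and the norm-1 requirement on `u` can be sketched in base R as follows. This is an illustrative sketch only: the membership vector is made up, and the norm is assumed here to be the Euclidean (Frobenius) norm, which the text above does not state explicitly.

```r
# Hypothetical inputs for illustration: 10 nodes, K = 3 estimated communities
communities_vec <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3)
n <- length(communities_vec)
K <- max(communities_vec)

# Equivalent n x K membership matrix: entry [i, k] is 1 when node i
# belongs to community k, and 0 otherwise
communities_mat <- matrix(0, nrow = n, ncol = K)
communities_mat[cbind(seq_len(n), communities_vec)] <- 1

# Each row contains exactly one 1, so every node has a single community
stopifnot(all(rowSums(communities_mat) == 1))

# A K x K "u" that weights the within-community connectivity of
# communities 1 and 2 equally, rescaled so that it has norm 1
# (assumption: "norm" means the Euclidean / Frobenius norm)
u_raw <- matrix(0, nrow = K, ncol = K)
u_raw[1, 1] <- 1
u_raw[2, 2] <- 1
u_matrix <- u_raw / sqrt(sum(u_raw^2))
```

Either `communities_vec` or `communities_mat` could then be passed as `communities`, and `u_matrix` as `u`.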
Value
A list with two elements labeled `"estimate"` and `"estimate_variance"`, which contain the estimate for the selected target of inference and an estimate of the variance of the estimator, respectively.
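As a rough sketch of how the two returned elements might be combined, the helper below builds a two-sided normal-approximation confidence interval at a chosen level. The function name `normal_ci` and the numbers in `inference_result` are hypothetical; the list is a stand-in for the output of `infer_network()`, not actual output.

```r
# Hypothetical stand-in for the list returned by infer_network();
# the numbers are made up for illustration only
inference_result <- list(estimate = 9.8, estimate_variance = 0.25)

# Two-sided normal-approximation interval at a chosen confidence level
# (normal_ci is an illustrative helper, not part of the package)
normal_ci <- function(result, level = 0.90) {
  z <- stats::qnorm(1 - (1 - level) / 2)
  margin <- z * sqrt(result$estimate_variance)
  c(lower = result$estimate - margin, upper = result$estimate + margin)
}

ci_90 <- normal_ci(inference_result, level = 0.90)
ci_95 <- normal_ci(inference_result, level = 0.95)
```

With `level = 0.90` the multiplier is `qnorm(0.95)`, so the interval is centered on the estimate and widens as the level increases.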
Examples
# ==============================
# == Gaussian network example ==
# ==============================
# (Poisson networks proceed nearly identically except that the parameter
# tau is not necessary to input into split_network() or infer_network())
# First, split a simulated Gaussian adjacency matrix
A_gaussian <- matrix(stats::rnorm(n = 10^2, mean = 10, sd = 5), nrow = 10)
gaussian_split <- split_network(A = A_gaussian, distribution = "gaussian",
                                epsilon = 0.3, tau = 5)
A_gaussian_tr <- gaussian_split$Atr
A_gaussian_te <- gaussian_split$Ate
# Estimate communities from the train matrix using spectral
# clustering with K = 3 communities
if (requireNamespace("nett", quietly = TRUE)) {
  communities_estimate <- nett::spec_clust(A_gaussian_tr, K = 3)
} else {
  communities_estimate <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3)
}
# This particular "u" vector will specify that we want to conduct inference
# for the mean connectivity within the first estimated community.
# Note that "u" is of length 9, because we have 3 estimated communities.
# (This should always be of length K^2)
u_vector <- c(1, 0, 0, 0, 0, 0, 0, 0, 0)
# We can also specify "u" in matrix form.
u_matrix <- matrix(c(1, 0, 0,
                     0, 0, 0,
                     0, 0, 0), nrow = 3)
# Conduct inference for the selected target (mean connectivity within the
# first estimated community)
gaussian_inference <-
  infer_network(Ate = A_gaussian_te, u = u_matrix,
                communities = communities_estimate,
                distribution = "gaussian",
                epsilon = 0.3, tau = 5)
# Produce a 90% confidence interval for the target of inference
margin_of_error <- sqrt(gaussian_inference$estimate_variance) * qnorm(0.95)
ci_upper_bound <- gaussian_inference$estimate + margin_of_error
ci_lower_bound <- gaussian_inference$estimate - margin_of_error
# ===============================
# == Bernoulli network example ==
# ===============================
# First, split a simulated Bernoulli adjacency matrix with gamma=0.10
A_bernoulli <- matrix(stats::rbinom(n = 10^2, size = 1, prob = 0.5), nrow = 10)
bernoulli_split <- split_network(A = A_bernoulli, distribution = "bernoulli",
                                 gamma = 0.10)
A_bernoulli_tr <- bernoulli_split$Atr
A_bernoulli_te <- bernoulli_split$Ate
# Estimate communities from the train matrix using spectral
# clustering with K = 3 communities
if (requireNamespace("nett", quietly = TRUE)) {
  communities_estimate <- nett::spec_clust(A_bernoulli_tr, K = 3)
} else {
  communities_estimate <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3)
}
# This particular "u" vector will specify that we want to conduct inference
# for the mean connectivity within the first estimated community.
# Note that "u" is of length 9, because we have 3 estimated communities.
# (This should always be of length K^2)
u_vector <- c(1, 0, 0, 0, 0, 0, 0, 0, 0)
# We can also specify "u" in matrix form.
u_matrix <- matrix(c(1, 0, 0,
                     0, 0, 0,
                     0, 0, 0), nrow = 3)
# Conduct inference for the selected target (mean connectivity within the
# first estimated community)
bernoulli_inference <-
  infer_network(Ate = A_bernoulli_te, u = u_matrix,
                communities = communities_estimate,
                distribution = "bernoulli",
                gamma = 0.10, Atr = A_bernoulli_tr)
# Produce a 90% confidence interval for the target of inference
margin_of_error <- sqrt(bernoulli_inference$estimate_variance) * qnorm(0.95)
ci_upper_bound <- bernoulli_inference$estimate + margin_of_error
ci_lower_bound <- bernoulli_inference$estimate - margin_of_error
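For undirected networks (`is_directed = FALSE`), the Arguments section requires the matrix version of `u` to be upper triangular. The base R sketch below builds one such `u` for a hypothetical target (connectivity between estimated communities 1 and 2) and checks both requirements. It does not call `infer_network()`, and the norm is again assumed to be the Euclidean (Frobenius) norm.

```r
K <- 3

# Hypothetical target: connectivity between estimated communities 1 and 2.
# Row index < column index, so the entry sits in the upper triangle.
u_undirected <- matrix(0, nrow = K, ncol = K)
u_undirected[1, 2] <- 1

# Check the two requirements: nothing below the diagonal, and norm 1
# (assumption: "norm" means the Euclidean / Frobenius norm)
is_upper_triangular <- all(u_undirected[lower.tri(u_undirected)] == 0)
has_unit_norm <- isTRUE(all.equal(sqrt(sum(u_undirected^2)), 1))
stopifnot(is_upper_triangular, has_unit_norm)
```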