Use Simulated Annealing to attempt to match a network to a vector of mean statistics

This function attempts to find a network or networks whose statistics match those passed in via the target.stats vector.
Usage

san(object, ...)

## S3 method for class 'formula'
san(
  object,
  response = NULL,
  reference = ~Bernoulli,
  constraints = ~.,
  target.stats = NULL,
  nsim = NULL,
  basis = NULL,
  output = c("network", "edgelist", "pending_update_network"),
  only.last = TRUE,
  control = control.san(),
  verbose = FALSE,
  offset.coef = NULL,
  ...
)

## S3 method for class 'ergm_model'
san(
  object,
  response = NULL,
  reference = ~Bernoulli,
  constraints = ~.,
  target.stats = NULL,
  nsim = NULL,
  basis = NULL,
  output = c("network", "edgelist", "pending_update_network"),
  only.last = TRUE,
  control = control.san(),
  verbose = FALSE,
  offset.coef = NULL,
  ...
)
Arguments

object: Either a formula or an ergm_model object; see the Methods (by class) section below.

...: Further arguments passed to other functions.

response: Name of the edge attribute whose value is to be modeled in the valued ERGM framework. Defaults to NULL for simple (binary) ERGMs.

reference: A one-sided formula specifying the reference measure (h(y)) to be used. See the help for ERGM reference measures implemented in the ergm package.

constraints: A one-sided formula specifying one or more constraints on the support of the distribution of the networks being simulated. See the documentation for the similar argument of ergm().

target.stats: A vector of the same length as the number of non-offset statistics implied by the model formula.

nsim: Number of networks to generate. Deprecated.

basis: If not NULL, a network object to use as the starting network for the annealing; by default, the network on the left-hand side of the formula is used.

output: Character, one of "network" (the default), "edgelist", or "pending_update_network", specifying the form in which the result is returned.

only.last: If TRUE (the default), only the network from the final annealing run is returned; otherwise, a network from the end of each run is returned.

control: A list of control parameters for algorithm tuning; see control.san().

verbose: Logical or numeric giving the level of verbosity. Higher values produce more verbose output.

offset.coef: A vector of offset coefficients; these must be passed in by the user. Note that these should be the same set of coefficients one would pass to ergm().

formula: (By default, the formula is taken from the object.)
Details

Acceptance probabilities for proposed toggles are computed as we now describe. There are two contributions: one from targeted statistics and one from offsets.

For the targeted statistics, a matrix of weights W is determined on each san iteration as follows. On the first iteration, W is the n by n identity matrix (n = number of target statistics), divided by n. On subsequent iterations: if control$SAN.invcov.diag = FALSE (the default), then W is the inverse of the covariance matrix of the targeted statistics, divided by the sum of its (the inverse's) diagonal; if control$SAN.invcov.diag = TRUE, then W is the inverse of the diagonal (regarded as a matrix) of the covariance matrix of the targeted statistics, divided by the sum of its (the inverse's) diagonal. In either case, the covariance matrix is computed from the proposals (not the acceptances) made on the previous iteration, and the normalization of W is such that sum(diag(W)) = 1.

The component of the acceptance probability coming from the targeted statistics is then computed for a given W as exp([y.Wy - x.Wx]/T), where T is the temperature, y the column vector of differences network statistics - target statistics computed before the current proposal is made, x the same column vector of differences computed assuming the current proposal is accepted, and . the dot product. If control$SAN.maxit > 1, then on the i-th iteration the temperature T takes the value control$SAN.tau * (1/i - 1/control$SAN.maxit)/(1 - 1/control$SAN.maxit); if control$SAN.maxit = 1, then T takes the value 0. Thus, T steps down from control$SAN.tau to 0 and is always 0 on the final iteration.
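The cooling schedule can be checked directly. The sketch below (with illustrative values for control$SAN.tau and control$SAN.maxit, not the control.san() defaults) tabulates the temperature over the iterations:

```r
# Illustrative control settings; control.san() supplies the real defaults.
SAN.tau   <- 1
SAN.maxit <- 4

# Temperature on the i-th of maxit iterations, per the formula above.
temp <- function(i, tau = SAN.tau, maxit = SAN.maxit) {
  if (maxit == 1) return(0)
  tau * (1 / i - 1 / maxit) / (1 - 1 / maxit)
}

temps <- sapply(seq_len(SAN.maxit), temp)
temps  # starts at SAN.tau, decreases monotonically, and ends at exactly 0
```

Note that the schedule reaches 0 exactly on the final iteration, so the last run is a pure greedy search toward the targets.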
Offsets also contribute to the acceptance probability, as follows. If eta are the canonical offsets and Delta the corresponding change statistics for a given proposal, then the offset contribution to the acceptance probability is simply exp(eta.Delta), where . denotes the dot product. By default, finite offsets are ignored, but this behavior can be changed by setting control$SAN.ignore.finite.offsets = FALSE.

The overall acceptance probability is the product of the targeted-statistics contribution and the offset contribution (with the product capped at one).
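As a concrete, purely illustrative sketch of the full computation, the following R code builds W from a made-up covariance matrix of proposals and then combines the targeted-statistics and offset contributions as described above. None of these objects come from the ergm package itself; all inputs are invented for illustration.

```r
# Made-up covariance matrix of three targeted statistics, standing in for
# the one estimated from the proposals of the previous iteration.
S <- matrix(c(4, 1, 0,
              1, 2, 1,
              0, 1, 3), nrow = 3, byrow = TRUE)

# Default weighting (control$SAN.invcov.diag = FALSE): inverse of the
# covariance matrix, scaled so that sum(diag(W)) = 1.
W <- solve(S)
W <- W / sum(diag(W))

# Alternative (control$SAN.invcov.diag = TRUE): inverse of the diagonal of
# the covariance matrix (regarded as a matrix), scaled the same way.
W_diag <- diag(1 / diag(S))
W_diag <- W_diag / sum(diag(W_diag))

# Differences (network statistics - target statistics): y before the
# proposal, x assuming the proposal is accepted.
y <- c(3, -1, 2)
x <- c(4, -1, 2)   # this proposal moves the first statistic further away

Temp  <- 0.5       # current temperature T
eta   <- c(-0.1)   # canonical offset coefficients (illustrative)
Delta <- c(1)      # corresponding change statistics (illustrative)

# Targeted-statistics contribution: exp([y.Wy - x.Wx]/T).
stat_part <- exp((drop(t(y) %*% W %*% y) - drop(t(x) %*% W %*% x)) / Temp)

# Offset contribution: exp(eta.Delta).
offset_part <- exp(sum(eta * Delta))

# Overall acceptance probability: the product, capped at one.
p_accept <- min(1, stat_part * offset_part)
```

Because the proposal here worsens the fit, stat_part is well below one and the proposal is accepted only with small probability; a fit-improving proposal at low temperature would drive the uncapped product above one and be accepted with certainty.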
Value

A network or list of networks that hopefully have network statistics close to the target.stats vector.
Methods (by class)

formula: Sufficient statistics are specified by a formula.

ergm_model: A lower-level function that expects a pre-initialized ergm_model.
Examples

# initialize x to a random undirected network with 50 nodes and a density of 0.05
x <- network(50, density = 0.05, directed = FALSE)

# try to find a network on 50 nodes with 300 edges, 150 triangles,
# and 1250 4-cycles, starting from the network x
y <- san(x ~ edges + triangles + cycle(4), target.stats = c(300, 150, 1250))

# check results
summary(y ~ edges + triangles + cycle(4))

# initialize x to a random directed network with 50 nodes
x <- network(50)

# add vertex attributes
x %v% 'give' <- runif(50, 0, 1)
x %v% 'take' <- runif(50, 0, 1)

# try to find a set of 100 directed edges making the outward sum of
# 'give' and the inward sum of 'take' both equal to 62.5, so in
# edges (i,j) the node i tends to have above average 'give' and j
# tends to have above average 'take'
y <- san(x ~ edges + nodeocov('give') + nodeicov('take'),
         target.stats = c(100, 62.5, 62.5))

# check results
summary(y ~ edges + nodeocov('give') + nodeicov('take'))

# initialize x to a random undirected network with 50 nodes
x <- network(50, directed = FALSE)

# add a vertex attribute
x %v% 'popularity' <- runif(50, 0, 1)

# try to find a set of 100 edges making the total sum of
# popularity(i) and popularity(j) over all edges (i,j) equal to
# 125, so nodes with higher popularity are more likely to be
# connected to other nodes
y <- san(x ~ edges + nodecov('popularity'), target.stats = c(100, 125))

# check results
summary(y ~ edges + nodecov('popularity'))

# creates a network with denser "core" spreading out to sparser
# "periphery"
plot(y)