Stochastic Block Model
- Latent Class Model
- for each actor (node), we assign group memberships
two components in the block model,
- the vector of grop membership :
- conditional on the group membership, the block matrix represents the edge probability of two nodes
- 에 부여된 확률 0.8: the connection prob within group 1
- 에 부여된 확률 0.9: the connection prob within group 1
- 에 부여된 확률 0.05: the connection prob b/w group 1 and group 2
※ notations:
- : adjacency Matrix
- : the total membership groups ()
- : -vector. node 가 속하게 될 group 에만 부여하고 이외에는 모두 .
- : block matrix
-
entry : prob of occurence of a edge from a node in
-
block Matrix 라는 발상은, block membership 이 given 일 때, edge 들이 conditionally independent 라는 것.
⇒ 따라서
- Bernoulli distribution with success probability
- independent of for , given and .
Likelihood function
In real data, and are unknown. we need additional assumptions.
- is independent of .
- , the latent group follows multinomial distribution with , i.e.,
, where is the total number of nodes that belongs to group .
In a Bayesian inference, we can assign prior .
Inference
By combining equation (1) and (2), we can make an inference by the frequentist way using EM algorithm.
Bayesian
we need to assign prior for , , where are Matrix.
Posterior Distribution
\begin{alignat}{2} \pi(Z , \theta, C , \alpha \Big | Y) & \propto f(Y \Big | Z , \theta, C , \alpha) && \cdot \pi(Z , \theta, C , \alpha) \\ & = f(Y \Big | Z , \theta, C , \alpha) && \cdot \pi(Z \Big | \theta, C , \alpha) && \cdot \pi(\theta \Big | C , \alpha) && \cdot \pi(C, \alpha) \\ & = f(Y \Big | Z , C) && \cdot \pi(Z \Big | \theta) && \cdot \pi(\theta \Big | \alpha) && \cdot \pi(C) \cdot \pi(\alpha) \\ & \propto \prod_{i<j} \Big[ (z_i' C z_j)^{y_{ij}}(1- z_i' C z_j)^{1-y_{ij}} \Big] && \cdot \prod_{k=1}^K \theta_k^{n_k} \Bigg[\Gamma(k \alpha) \Big \{ \sum^k_{i=} \theta_k = 1 \prod_{i=1}^k \frac{\theta_z^{\alpha-1}}{\Gamma(\alpha)}\Big \} \Bigg] && \cdot\prod^k_{i<j} \Big[ C_{ij}^{A_{ij}-1} (1-C_{ij})^{B_{ij}-1} \Big] && \cdot \alpha^{a-1} e^{-b \alpha} \end{alignat}Computation is veery straightforward via MCMC.
Mixed Membership Block Model (MMBM)
상기의 SBM 에서, 각 actor 는 오직 1개의 membership 을 가짐. 이러한 SBM 의 제약을 완화 (alleviate) 하기 위해 MMBM 이 제언되었다. 해당 모델에서는 특정 membership 에 1을 부여하는 것이 아니라, 각 membership 발생 여부에 prob 을 할당하여 다중 membership 을 각 ndoe 에 할당한다.
이때 MMSB 의 vector 의 각 항은 각각 group 1, group 2, group 3 일 prob.
- : weights or probabilities in the groups that have to be non-negative and sum to 1.
For each dyad , a latent membership is drawn from the multinomial distribution with .
, where is Matrix of membership probability.
The likeliood is .
The goal of inference estimating not but the mixed memberships .
Degree-corrected SBM
use Poisson Model.
- : the number of edges from dyad (i,j) following a Poisson dist
- : the expected number of edges from a node in group to a node in group
The likelihood is
In a large sparse graph, where the edge probability equals the expected number of edges, DC-SBM is asymptotically equivalient to the Bernoulli counterpart in equation (3).
- : the ratio of node ‘s degree to the sum of degrees in node ‘s group.
Under constraint , then equation (3) becomes
which is DC-SBM.
SBM ignores the variation of node degrees in a real network.
With DC-SBM, we can make correction better.