Dirichlet Process

The original definition of the Dirichlet process (DP) is due to Ferguson (1973)[1].

Given a measurable space $(\Omega, \mathcal{F})$, a random probability measure $G$ is said to follow a Dirichlet process with baseline probability measure $H$ and scaling parameter $\alpha > 0$ if, for every finite measurable partition $\Omega = \displaystyle\cup_{i=1}^n B_i$, $$ (G(B_1), G(B_2), \dots, G(B_n)) \sim \text{Dir}(\alpha H(B_1), \alpha H(B_2), \dots, \alpha H(B_n)) $$ This is denoted by $G \sim \text{DP}(\alpha, H)$.
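
As a small worked case of this definition (the two-set partition is chosen only for illustration): taking the partition $\{B, B^c\}$ for a measurable set $B$ reduces the Dirichlet distribution to a Beta distribution, $$ G(B) \sim \text{Beta}(\alpha H(B), \alpha (1 - H(B))) $$ so that $$ \mathbb{E}[G(B)] = H(B), \qquad \text{Var}[G(B)] = \frac{H(B)(1 - H(B))}{\alpha + 1} $$ A draw $G$ is therefore centred on $H$, and a larger $\alpha$ concentrates it more tightly around $H$.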

An alternative definition is the stick-breaking process, which defines the Dirichlet process constructively by writing a distribution sampled from the process as $f(x)=\displaystyle\sum_{k=1}^{\infty} \beta_k \delta_{x_k}(x)$, where $\delta_{x_k}$ is a point mass at the atom $x_k$ and the weights $\beta_k \ge 0$ sum to one.

The sampling procedure can be described as follows:

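  1. Set $k = 1$.
  2. Sample $x_k \sim H$.
  3. Sample $v_k \sim \text{Beta}(1, \alpha)$ and let $\beta_k = v_k \left(1 - \sum_{j=1}^{k-1} \beta_j\right)$, so that $\beta_1 = v_1$ and each new weight is a fraction of the stick that remains.
  4. Set $k = k + 1$ and go to step 2.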
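
The stick-breaking steps can be sketched in Python as below. This is a minimal illustration rather than a reference implementation. It uses the standard form of the construction, $v_k \sim \text{Beta}(1, \alpha)$ with $\beta_k = v_k \prod_{j<k} (1 - v_j)$, and truncates the infinite sum after a fixed number of atoms. The names `stick_breaking`, `base_sampler`, and `num_atoms`, as well as the standard normal choice for $H$, are assumptions made for the example.

```python
import random


def stick_breaking(alpha, base_sampler, num_atoms, seed=None):
    """Truncated stick-breaking draw from DP(alpha, H).

    Returns (atoms, weights) for the first `num_atoms` sticks. The weights
    sum to slightly less than 1; the shortfall is the mass of the tail that
    the truncation discards.
    """
    rng = random.Random(seed)
    atoms, weights = [], []
    remaining = 1.0  # length of the stick that has not been broken off yet
    for _ in range(num_atoms):
        atoms.append(base_sampler(rng))    # x_k ~ H
        v = rng.betavariate(1.0, alpha)    # v_k ~ Beta(1, alpha)
        weights.append(v * remaining)      # beta_k = v_k * (1 - sum_{j<k} beta_j)
        remaining *= 1.0 - v               # shrink the leftover stick
    return atoms, weights


# Illustrative choice of H: the standard normal distribution.
atoms, weights = stick_breaking(
    alpha=2.0,
    base_sampler=lambda rng: rng.gauss(0.0, 1.0),
    num_atoms=200,
    seed=0,
)
```

With $\alpha = 2$ the expected leftover mass after 200 sticks is $(2/3)^{200}$, so the truncated weights sum to one for practical purposes. A smaller $\alpha$ puts most of the mass on the first few atoms, while a larger $\alpha$ spreads it over many.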

The digamma function is defined as the logarithmic derivative of the gamma function $$ \psi(x)=\frac{\mathrm{d}}{\mathrm{d} x} \ln (\Gamma(x))=\frac{\Gamma^{\prime}(x)}{\Gamma(x)} \sim \ln x-\frac{1}{2 x} $$ where the asymptotic approximation holds as $x \to \infty$. It satisfies the recurrence $$ \psi(z+1)=\psi(z)+\frac{1}{z} $$
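
Both digamma identities can be checked numerically. The sketch below approximates $\psi$ by a central finite difference of `math.lgamma` from the Python standard library. The helper name `digamma_fd`, the step size `h`, and the test points are assumptions chosen for illustration; a library routine would normally be used instead.

```python
import math


def digamma_fd(x, h=1e-5):
    """Approximate psi(x) by a central difference of ln Gamma(x)."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)


# Recurrence: psi(z + 1) - (psi(z) + 1/z) should be close to zero.
z = 3.5
recurrence_gap = digamma_fd(z + 1.0) - (digamma_fd(z) + 1.0 / z)

# Asymptotic form: psi(x) - (ln x - 1/(2x)) should be small for large x.
x = 50.0
asymptotic_gap = digamma_fd(x) - (math.log(x) - 1.0 / (2.0 * x))
```

The recurrence gap is limited only by the finite-difference error, whereas the asymptotic gap shrinks as $x$ grows because the approximation omits higher-order terms.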


  1. Ferguson, T. S. (1973). A Bayesian Analysis of Some Nonparametric Problems. The Annals of Statistics, 1(2), 209–230. 

Updated 2023-01-26