Dynamic Inventory Allocation with Demand Learning for Seasonal Goods

发表在 Production and Operations Management, 2021. DOI: https://doi.org/10.1111/poms.13315

Key words: multi-echelon inventory; demand learning; dynamic programming


这篇文章研究的是 inventory allocation 的问题. Two-echelon network.

要把一个仓库里的货运给 $N$ 个零售商,零售商之间无法转运货物;总的货物量是 $w_0$;lead time 为0;历史需求信息集 $\mathscr{H}_t$(censored or uncensored)。

定义 cost:

$$ L_{i, t}\left(y_{i, t}\right):=p_{i, t}\left[D_{i, t}-y_{i, t}\right]^{+}+h_{i, t}\left[y_{i, t}-D_{i, t}\right]^{+} $$

文章把这种库存分配问题建模成了一个动态规划:

$$ \begin{aligned} & G_{t}\left(\mathbf{y}, w_{t}, \mathbf{x}_{t}, \mathscr{H}_{t}\right)=\sum_{i=1}^{N} \mathrm{E}\left[L_{i, t}(\mathbf{y}) \mid \mathscr{H}_{t}\right] +\gamma \mathrm{E}\left[V_{t+1}\left(w_{t}-\mathbf{1}^{\mathrm{T}}\left(\mathbf{y}-\mathbf{x}_{t}\right),\left[\mathbf{y}-\mathbf{D}_{t}\right]^{+}, \mathscr{H}_{t+1}\right) \mid \mathscr{H}_{t}\right] \\ & \qquad V_{t}\left(w_{t}, \mathbf{x}_{t}, \mathscr{H}_{t}\right)=\min _{\mathbf{y} \geq \mathbf{x}_{t} \atop \mathbf{1}^{\mathrm{T}} \mathbf{y} \leq \mathbf{1}^{\mathrm{T}} \mathbf{x}_{t}+w_{t}} G_{t}\left(\mathbf{y}, w_{t}, \mathbf{x}_{t}, \mathscr{H}_{t}\right) \\ & \qquad V_{\mathrm{T}+1}\left(w_{0, T+1}, \mathbf{x}_{T+1}, \mathscr{H}_{\mathrm{T}+1}\right)=0 . \end{aligned} $$

它等价于:

$$ \begin{aligned} \min\; & \sum_{t=1}^{T} \sum_{i=1}^{N} \gamma^{t-1} \mathrm{E}\left[\mathrm{E}\left[L_{i, t}\left(y_{i, t}\right) \mid \mathscr{H}_{t}\right]\right] \\ \text { s.t. } & y_{i, t}=x_{i, t}+a_{i, t} \forall i, t, \mathscr{H}_{t} \\ & x_{i, t+1}=\left[y_{i, t}-D_{i, t}\right]^{+} \forall i, t, \mathscr{H}_{t} \\ & \sum_{i=1}^{N} \sum_{t=1}^{T} a_{i, t} \leq w_{0} \text { a.s. } \\ & a_{i, t} \geq 0 \;\; \forall i, t, \mathscr{H}_{t} . \end{aligned} $$

在这个过程中,可以以任意的方式进行 demand learning。

Two types of demand learning models that this framework can capture are Bayesian methods, where the decision maker updates her beliefs on the unknown demand model parameter distributions with time, and time series models (e.g., ARMA or ARIMA).

Bayesian methods:

$$ \mathbb{P}\left[D_{i, t}=d \mid \mathscr{H}_{t}\right]=\int_{\boldsymbol{\theta} \in \Theta} p_{i, t}(d \mid \boldsymbol{\theta}) f\left(\boldsymbol{\theta} \mid \mathscr{H}_{t}\right) \mathrm{d} \boldsymbol{\theta} $$

ARMA/ARIMA models:

$$ D_{i, t}=\mu_{i}+\alpha_{1} D_{i, t-1}+\cdots+\alpha_{p} D_{i, t-p}+\varepsilon_{i, t}+\beta_{1} \varepsilon_{i, t-1}+\cdots+\beta_{q} \varepsilon_{i, t-q} $$

这两种方法都要求 uncensored demand information.

文章设计了一种 Lagrangian Relaxation 的近似方法,并说明了这种方法的渐进最优性。

updatedupdated2023-01-262023-01-26