发表在 Operations Research, 2010. DOI: https://doi.org/10.1287/opre.1090.0746
Subject classifications: robust optimization; inventory control.
Area of review: Optimization.
这篇文章提出了用鲁棒优化解决多阶段库存管理的方法。文章假设需求分布是未知的,但是知道一些统计信息(support, mean...)。
文章提到了一个概念,tractable replenishment policy,指可以在多项式时间内计算出来的库存管理策略。使用动态规划来得到库存管理策略有可能不是多项式内可计算的。
For instance, the celebrated optimum base-stock policy may not necessarily be a tractable one.
这篇文章用时间序列来考虑需求存在相关性的情况,并且使用了因子模型。
Our proposed robust optimization approximation is based upon a comprehensive factor-based demand model that can capture correlations such as the autoregressive nature of demand, the effect of external factors, as well as trends and seasonality, among others.
文章首先给出了 Stochastic Inventory Model: $$ \begin{aligned} Z_{\text {STOC }}=&\min \sum_{t=1}^{T}\left(\mathrm{E}\left(c_{t} x_{t}\left(\tilde{\mathbf{d}}_{t-1}\right)\right)+\mathrm{E}\left(h_{t}\left(y_{t+1}\left(\tilde{\mathbf{d}}_{t}\right)\right)^{+}\right)+\mathrm{E}\left(b_{t}\left(y_{t+1}\left(\tilde{\mathbf{d}}_{t}\right)\right)^{-}\right)\right) \\ &\text {s.t. } y_{t+1}\left(\tilde{\mathbf{d}}_{t}\right)=y_{t}\left(\tilde{\mathbf{d}}_{t-1}\right)+x_{t-L}\left(\tilde{\mathbf{d}}_{t-L-1}\right)-\tilde{d}_{t}, t=1, \ldots, T \\ & 0 \leqslant x_{t}\left(\tilde{\mathbf{d}}_{t-1}\right) \leqslant S_{t} \quad t=1, \ldots, T-L . \end{aligned} $$
这里的 lead time = $L$。需求由多个因子决定:
$$ d_{t}(\widetilde{\mathbf{z}}) \triangleq \tilde{d}_{t}=d_{t}^{0}+\sum_{k=1}^{N} d_{t}^{k} \tilde{z}_{k}, \quad t=1, \ldots, T $$
并且有时间序列上的相关性:
$$ d_{t}(\widetilde{\mathbf{z}})= \begin{cases}d_{t}^{0} & \text { if } t \leqslant 0, \\ \sum_{j=1}^{p} \phi_{j} d_{t-j}(\widetilde{\mathbf{z}})+\tilde{z}_{t}+\sum_{j=1}^{\min \{q, t-1\}} \theta_{j} \tilde{z}_{t-j} & \text { otherwise }\end{cases} $$
接着提出用 directional deviations 来表达分布信息:
Given a random variable $\tilde{z}$, the forward deviation is defined as
$$ \sigma_{f}(\tilde{z}) \triangleq \sup _{\theta>0}\left\{\sqrt{2 \ln \left(\mathrm{E}(\exp (\theta(\tilde{z}-\mathrm{E}(\tilde{z})))) / \theta^{2}\right.}\right\} $$
and backward deviation is defined as
$$ \sigma_{b}(\tilde{z}) \triangleq \sup _{\theta>0}\left\{\sqrt{2 \ln \left(\mathrm{E}(\exp (-\theta(\tilde{z}-\mathrm{E}(\tilde{z})))) / \theta^{2}\right.}\right\} $$
文章还提到一个有用的命题:
(Scarf 1958). Let $\tilde{z}$ be a random variable in $[-\mu, \infty)$ with mean $\mu$ and standard deviation $\sigma$; then, for all $a \geqslant-\mu$,
$$ \mathrm{E}\left((\tilde{z}-a)^{+}\right) \leqslant\left\{\begin{array}{ll} \frac{1}{2}\left(-a+\sqrt{\sigma^{2}+a^{2}}\right) & \text { if } a \geqslant \frac{\sigma^{2}-\mu^{2}}{2 \mu} \\ -a \frac{\mu^{2}}{\mu^{2}+\sigma^{2}}+\mu \frac{\sigma^{2}}{\mu^{2}+\sigma^{2}} & \text { if } a<\frac{\sigma^{2}-\mu^{2}}{2 \mu} \end{array} .\right. $$
Moreover, the bound is achievable.
文章在最后讨论了各种 policy 的求解,如
Static Replenishment Policy:
$$ x_{t}\left(\tilde{\mathbf{d}}_{t-1}\right)=x_{t}^{0} $$
Linear Replenishment Policy:
$$ x_{t}^{\mathrm{LRP}}\left(\tilde{\mathbf{d}}_{t-1}\right)=x_{t}^{0}+\mathbf{x}_{t}^{\prime} \tilde{\mathbf{z}} $$
提出 Truncated Linear Replenishment Policy:
$$ x_{t}^{\mathrm{TLRP}}\left(\tilde{\mathbf{d}}_{t-1}\right)=\min \left\{\max \left\{x_{t}^{0}+\mathbf{x}_{t}^{\prime} \widetilde{\mathbf{z}}, 0\right\}, S_{t}\right\} $$
Policy 是与因子 $\tilde{z}$ 相关联的。并且可以由鲁棒优化高效计算。
总的来说,我觉得这篇文章写的比较乱,让人一眼看不清主线;另外大量引用了作者之前做的工作。