AI Fundamentals L22: Uncertainty over Time I

Time and Uncertainty
States and Observations
• discrete-time models: we view the world as a series of snapshots or time slices
• we assume the time interval ∆ between slices is the same for every interval
• Xt: denotes the set of state variables at time t, which we assume to be unobservable
• Et: denotes the set of observable evidence variables; the observation at time t is Et = et for some set of values et

Transition and sensor models
• The transition model specifies the probability distribution over the latest state variables,
given the previous values: P(Xt | X0:t−1).
• Problem: the set X0:t−1 is unbounded in size as t increases.
• Solution: Markov assumption
the current state depends on only a finite fixed number of previous states; for a first-order
process, P(Xt | X0:t−1) = P(Xt | Xt−1)
• P(Et | Xt) is our sensor model, sensor Markov assumption: P(Et | X0:t, E1:t−1) = P(Et | Xt)

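The discrete-time model assumes a constant time interval between slices. The Markov assumption lets us handle an unbounded sequence of states while keeping the model manageable. The sensor model describes how the unobservable state variables give rise to the observable evidence variables.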

• To complete the specification, we also need the prior probability distribution at time 0, P(X0).

• Umbrella World: a first-order Markov process; the probability of rain is assumed to
depend only on whether it rained the previous day
• The first-order Markov assumption says that the state variables contain all the
information needed to characterize the probability distribution for the next time
slice.
• Ways to improve the accuracy of the approximation
— Increasing the order of the Markov process model
— Increasing the set of state variables
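To make the three ingredients (prior, transition model, sensor model) concrete, here is a minimal sketch of the Umbrella World as a generative model. The probabilities are assumed, illustrative values that these notes do not give, and the names `PRIOR`, `TRANSITION`, `SENSOR`, and `sample_sequence` are hypothetical.

```python
import random

# Assumed, illustrative numbers for the Umbrella World (not given in these notes).
PRIOR = {True: 0.5, False: 0.5}        # P(X0): probability of rain on day 0
TRANSITION = {True: 0.7, False: 0.3}   # P(rain_t = True | rain_{t-1})
SENSOR = {True: 0.9, False: 0.2}       # P(umbrella_t = True | rain_t)


def sample_sequence(steps, rng):
    """Sample hidden states X0..Xt and observations E1..Et from the model."""
    rain = rng.random() < PRIOR[True]
    states, observations = [rain], []
    for _ in range(steps):
        rain = rng.random() < TRANSITION[rain]    # transition model: depends only on yesterday
        umbrella = rng.random() < SENSOR[rain]    # sensor model: depends only on today's state
        states.append(rain)
        observations.append(umbrella)
    return states, observations


if __name__ == "__main__":
    states, observations = sample_sequence(5, random.Random(0))
    print("rain:    ", states)
    print("umbrella:", observations)
```

Each new state is drawn from the previous state alone (first-order Markov assumption), and each observation from the current state alone (sensor Markov assumption).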

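The prior probability distribution is an important concept in probability theory: it describes the distribution over an event or state before any additional information is taken into account.

In temporal models, and Markov processes in particular, the probability distribution changes over time. For the whole sequence from time 0 to time t, the joint distribution can be written as: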

P(X0:t, E1:t) = P(X0) ∏_{i=1}^{t} P(Xi | Xi−1) P(Ei | Xi)

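Here:

P(X0:t, E1:t) is the joint probability distribution of the state variables X0:t and the evidence variables E1:t from time 0 to time t.
P(X0) is the prior probability distribution at time 0, that is, the distribution over the state variables X0 before any observations are available.
∏_{i=1}^{t} P(Xi | Xi−1) is the product of the conditional distributions of each state Xi given the previous state Xi−1; it expresses the Markov property of the states (the transition model).
P(Ei | Xi) is the conditional distribution of the evidence variables Ei given the state variables Xi (the sensor model); it appears as a factor inside the same product over i.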
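The factorization above can be checked numerically. The sketch below evaluates the joint probability of one concrete state and observation sequence in the Umbrella World. The numbers are assumed for illustration (the notes do not give them), and `bernoulli` and `joint_probability` are hypothetical helper names.

```python
# Assumed, illustrative Umbrella World numbers (not given in these notes).
PRIOR = {True: 0.5, False: 0.5}               # P(X0)
P_RAIN_GIVEN_PREV = {True: 0.7, False: 0.3}   # P(X_i = rain | X_{i-1})
P_UMBRELLA_GIVEN_RAIN = {True: 0.9, False: 0.2}  # P(E_i = umbrella | X_i)


def bernoulli(p_true, value):
    """Probability of a Boolean value when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint_probability(states, observations):
    """P(x_0..x_t, e_1..e_t); states has t + 1 entries, observations has t."""
    if len(states) != len(observations) + 1:
        raise ValueError("need exactly one more state than observations")
    prob = PRIOR[states[0]]                                   # P(x_0)
    for i in range(1, len(states)):
        prob *= bernoulli(P_RAIN_GIVEN_PREV[states[i - 1]], states[i])        # P(x_i | x_{i-1})
        prob *= bernoulli(P_UMBRELLA_GIVEN_RAIN[states[i]], observations[i - 1])  # P(e_i | x_i)
    return prob


if __name__ == "__main__":
    # Rain on days 0, 1, 2 and an umbrella seen on days 1 and 2:
    # 0.5 * (0.7 * 0.9) * (0.7 * 0.9)
    print(joint_probability([True, True, True], [True, True]))
```

Summing `joint_probability` over every possible state and observation sequence of a fixed length gives 1, which is a quick sanity check that the factorization defines a proper distribution.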

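In the Umbrella World example, we assume that the probability of rain depends only on whether it rained the previous day, which makes it a first-order Markov process. The first-order Markov assumption means that the state variables contain all the information needed to characterize the probability distribution for the next time slice.

There are two ways to improve the accuracy of this approximation:

Increase the order of the Markov process: going from first order to a higher order lengthens the memory of the process.
Increase the set of state variables: adding more state variables captures more of the system's complexity.

Either change lets the model capture the system's dynamics over time more faithfully, which improves the accuracy of the resulting probability distributions.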
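The two remedies are closely related: a second-order process over X can be rewritten as a first-order process over an enlarged state Yt = (Xt−1, Xt). The sketch below builds that first-order transition table from a second-order one. The table values are hypothetical, chosen only for illustration, and `first_order_over_pairs` is an assumed helper name.

```python
from itertools import product

# Hypothetical second-order table: P(rain_t = True | rain_{t-2}, rain_{t-1}).
SECOND_ORDER = {
    (True, True): 0.8,
    (False, True): 0.6,
    (True, False): 0.4,
    (False, False): 0.2,
}


def first_order_over_pairs(second_order):
    """Build P(Y_t | Y_{t-1}) for the enlarged state Y_t = (X_{t-1}, X_t)."""
    pairs = list(product([True, False], repeat=2))
    table = {}
    for prev in pairs:          # prev = (x_{t-2}, x_{t-1})
        for nxt in pairs:       # nxt  = (x_{t-1}, x_t)
            if nxt[0] != prev[1]:
                # The shared component x_{t-1} must agree, otherwise the move is impossible.
                table[prev, nxt] = 0.0
            else:
                p_rain = second_order[prev]
                table[prev, nxt] = p_rain if nxt[1] else 1.0 - p_rain
    return pairs, table


if __name__ == "__main__":
    pairs, table = first_order_over_pairs(SECOND_ORDER)
    for prev in pairs:
        print(prev, [table[prev, nxt] for nxt in pairs])
```

The enlarged model has four states instead of two, which is the price paid for keeping the first-order form.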

Inference in Temporal Models
• Formulate the basic inference tasks that must be solved:
— Filtering or state estimation is the task of computing the belief state P (Xt | e1:t)
— Prediction: This is the task of computing the posterior distribution over the future
state, given all evidence to date.
— Smoothing: This is the task of computing the posterior distribution over a past state,
given all evidence up to the present
— Most likely explanation: Given a sequence of observations, we might wish to find the
sequence of states that is most likely to have generated those observations
• Besides inference tasks:
— Learning: The transition and sensor models, if not yet known, can be learned from observations

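Filtering or state estimation (P(Xt | e1:t)):
- This task computes the posterior distribution over the current state of the system (at time t) from all the evidence (observations) collected so far. The belief state P(Xt | e1:t) represents our best estimate of the state Xt given the evidence history. It amounts to updating our knowledge of the system's state as new data arrives.
Prediction:
- Prediction looks ahead. It computes the posterior distribution over the state at some future time, P(Xt+n | e1:t) with n > 0, given all evidence to date. This is useful for anticipating what is likely to happen next in the system.
Smoothing:
- Smoothing uses later observations to improve our estimate of a past state. It computes the posterior distribution over a past state, P(Xt−n | e1:t) with n > 0, given all evidence up to the present (e1:t). It is like using what we know now to look back and better understand what happened earlier.
Most likely explanation:
- Given a sequence of observations, this task finds the sequence of states most likely to have generated them. It identifies the most probable state path behind the observed data, which helps in understanding the underlying cause of the observations.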
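The sketch below shows how prediction, smoothing, and the most likely explanation can be computed for a two-state Umbrella World. It assumes illustrative probabilities that the notes do not give, uses the standard forward-backward and Viterbi recursions (the notes do not name specific algorithms), and all function names are hypothetical.

```python
# Assumed, illustrative Umbrella World numbers (not given in these notes).
STATES = ("rain", "no_rain")
PRIOR = [0.5, 0.5]                    # P(X0)
T = [[0.7, 0.3], [0.3, 0.7]]          # T[i][j] = P(X_t = j | X_{t-1} = i)
SENSOR = {True: [0.9, 0.2],           # P(umbrella | X_t = j)
          False: [0.1, 0.8]}          # P(no umbrella | X_t = j)


def normalize(v):
    total = sum(v)
    return [x / total for x in v]


def step(belief):
    """One transition: sum_i P(X_{t+1} = j | X_t = i) * belief[i]."""
    return [sum(belief[i] * T[i][j] for i in range(2)) for j in range(2)]


def filter_all(evidence):
    """Return the filtered distributions P(X_k | e_{1:k}) for k = 1..t."""
    belief, out = PRIOR, []
    for e in evidence:
        predicted = step(belief)
        belief = normalize([SENSOR[e][j] * predicted[j] for j in range(2)])
        out.append(belief)
    return out


def predict(evidence, n):
    """P(X_{t+n} | e_{1:t}): filter, then apply n transitions with no new evidence."""
    belief = filter_all(evidence)[-1] if evidence else PRIOR
    for _ in range(n):
        belief = step(belief)
    return belief


def smooth(evidence, k):
    """P(X_k | e_{1:t}) for 1 <= k <= t, combining a forward and a backward pass."""
    forward = filter_all(evidence)[k - 1]
    backward = [1.0, 1.0]
    for e in reversed(evidence[k:]):          # evidence e_{k+1}..e_t, latest first
        backward = [sum(T[i][j] * SENSOR[e][j] * backward[j] for j in range(2))
                    for i in range(2)]
    return normalize([forward[i] * backward[i] for i in range(2)])


def most_likely_sequence(evidence):
    """Viterbi: most likely X_1..X_t given e_{1:t} (X0 is summed out)."""
    start = step(PRIOR)
    best = [SENSOR[evidence[0]][j] * start[j] for j in range(2)]
    back = []
    for e in evidence[1:]:
        pointers, scores = [], []
        for j in range(2):
            i = max(range(2), key=lambda i: best[i] * T[i][j])   # best predecessor of j
            pointers.append(i)
            scores.append(SENSOR[e][j] * best[i] * T[i][j])
        back.append(pointers)
        best = scores
    state = max(range(2), key=lambda j: best[j])
    path = [state]
    for pointers in reversed(back):           # follow back-pointers to day 1
        state = pointers[state]
        path.append(state)
    return [STATES[s] for s in reversed(path)]


if __name__ == "__main__":
    evidence = [True, True, False, True, True]
    print(predict(evidence, 1))
    print(smooth(evidence, 1))
    print(most_likely_sequence(evidence))
```

Prediction without new evidence drifts toward the stationary distribution of the transition model, while smoothing sharpens an earlier estimate by folding in later evidence. For long sequences a practical implementation would work with log probabilities in `most_likely_sequence` to avoid underflow.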

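Besides these inference tasks, there is also learning:

Learning:
- If the transition model and sensor model are not yet known, learning uses observed data to estimate the parameters of the temporal model: the transition model (how the state evolves over time) and the sensor model (how observations are generated from states). Learning these models well is essential, because the quality of inference depends heavily on their accuracy.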
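As a rough illustration of learning, the sketch below estimates both models by counting. It makes a strong simplifying assumption that the notes do not make: the training data includes the true state at every step. When states are hidden, as in the setting above, an iterative method such as expectation-maximization is needed instead. The function name `estimate_models` is hypothetical.

```python
from collections import Counter


def estimate_models(states, observations):
    """Maximum-likelihood estimates from counts.

    Assumes fully labelled data: states x_0..x_t and observations e_1..e_t.
    Returns two dicts keyed by (given, value) holding P(value | given).
    """
    trans_counts, sensor_counts = Counter(), Counter()
    for prev, cur in zip(states, states[1:]):
        trans_counts[prev, cur] += 1           # count x_{i-1} -> x_i
    for state, obs in zip(states[1:], observations):
        sensor_counts[state, obs] += 1         # count (x_i, e_i) pairs

    def conditional(counts):
        totals = Counter()
        for (given, _), n in counts.items():
            totals[given] += n
        return {(given, value): n / totals[given]
                for (given, value), n in counts.items()}

    return conditional(trans_counts), conditional(sensor_counts)


if __name__ == "__main__":
    states = ["rain", "rain", "dry", "dry", "rain"]
    observations = [True, False, False, True]
    transition, sensor = estimate_models(states, observations)
    print(transition)
    print(sensor)
```

Pairs that never occur in the data are simply absent from the result, so a practical version would add smoothing to avoid zero probabilities.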

Filtering messages
We can think of the filtered estimate P (Xt | e1:t) as a “message” f1:t:
• Propagated forward along the sequence
• Modified by each transition
• Updated by each new observation
So that
f1:t+1 = Forward(f1:t, et+1)
We bootstrap the process with f1:0 = P (X0)

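In other words, filtering treats the estimate P(Xt | e1:t) as a "message" f1:t that is passed forward along the sequence, modified at each state transition, and updated by each new observation.

In the recursion f1:t+1 = Forward(f1:t, et+1), f1:t+1 is the updated message at time t+1, and Forward is an operation that combines the message f1:t from time t with the new observation et+1 to produce the message for time t+1.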

The process starts from the initial condition f1:0 = P(X0). Here f1:0 is the prior probability distribution over the initial state X0, before any observations have arrived. It is the starting point of the whole filtering process.
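The notes leave the Forward operation abstract. A minimal sketch, assuming the standard two-step form (push the message through the transition model, then weight by the sensor model and normalize) and illustrative Umbrella World numbers that the notes do not give, might look like this. The names `forward` and `filter_sequence` are hypothetical.

```python
# Assumed, illustrative Umbrella World numbers (not given in these notes).
PRIOR = {"rain": 0.5, "no_rain": 0.5}                  # f_{1:0} = P(X0)
TRANSITION = {                                         # TRANSITION[x0][x1] = P(X_{t+1} = x1 | X_t = x0)
    "rain": {"rain": 0.7, "no_rain": 0.3},
    "no_rain": {"rain": 0.3, "no_rain": 0.7},
}
SENSOR = {"rain": 0.9, "no_rain": 0.2}                 # P(umbrella = True | X_t)


def forward(message, umbrella):
    """One step f_{1:t+1} = Forward(f_{1:t}, e_{t+1})."""
    # 1. Modified by the transition: one-step prediction of the next state.
    predicted = {x1: sum(message[x0] * TRANSITION[x0][x1] for x0 in message)
                 for x1 in message}
    # 2. Updated by the new observation: weight by the sensor model.
    weighted = {x: (SENSOR[x] if umbrella else 1.0 - SENSOR[x]) * predicted[x]
                for x in predicted}
    # 3. Normalize so the message is a probability distribution again.
    total = sum(weighted.values())
    return {x: p / total for x, p in weighted.items()}


def filter_sequence(evidence):
    """Return P(X_t | e_{1:t}) after folding in every observation in order."""
    message = dict(PRIOR)          # bootstrap with f_{1:0} = P(X0)
    for e in evidence:
        message = forward(message, e)
    return message


if __name__ == "__main__":
    print(filter_sequence([True]))
    print(filter_sequence([True, True]))
```

With these assumed numbers, one umbrella observation gives P(rain) ≈ 0.818 and a second gives ≈ 0.883. Each step needs only the previous message and the new observation, so the cost per update does not grow with t.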
