2D transformations

2023-04-10

Scale

Figure 1: Uniform scaling

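Shrinking a clock in the 2D plane uniformly to half its size means, geometrically, that every point $(x, y)$ maps to $(x', y')$ with $x' = sx$, $y' = sy$ and $s = 0.5$. Writing the coordinates as a column vector, we can compute the scaled result with a matrix product: $\left[\begin{matrix}x' \\ y'\end{matrix}\right] = \left[\begin{matrix}s & 0 \\ 0 & s\end{matrix}\right]\left[\begin{matrix}x \\ y\end{matrix}\right]$ . The matrix $\left[\begin{matrix}s & 0 \\ 0 & s\end{matrix}\right]$ is called the scaling matrix.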

Scaling can also be non-uniform, meaning $x$ and $y$ use different scale factors. In the figure below, $x$ is halved while $y$ is left unchanged:

Figure 2: Non-uniform scaling

The scaling matrix is built the same way; we only need to put each factor in the right place: $\left[\begin{matrix}x' \\ y'\end{matrix}\right] = \left[\begin{matrix}s_x & 0 \\ 0 & s_y\end{matrix}\right]\left[\begin{matrix}x \\ y\end{matrix}\right]$ .
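To make the scaling matrix concrete, here is a minimal sketch in plain Python. The helper names `scale_matrix` and `apply` are made up for this illustration and are not from the course material.

```python
def scale_matrix(sx, sy):
    """Return the 2x2 scaling matrix [[sx, 0], [0, sy]] as nested lists."""
    return [[sx, 0.0], [0.0, sy]]


def apply(m, p):
    """Multiply a 2x2 matrix by a 2D point given as an (x, y) tuple."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y, m[1][0] * x + m[1][1] * y)


# Uniform scaling: both coordinates are halved (s = 0.5).
uniform = apply(scale_matrix(0.5, 0.5), (2.0, 4.0))
# Non-uniform scaling: only x is halved, y keeps its value.
non_uniform = apply(scale_matrix(0.5, 1.0), (2.0, 4.0))
print(uniform, non_uniform)
```

The only difference between the two cases is which factors sit on the diagonal.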

Reflection Matrix

Reflection (or flipping) mirrors the image across the $x$-axis or the $y$-axis. Taking a flip across the $y$-axis as an example, the mirrored coordinates are $x' = -x$ , $y' = y$ . In matrix form: $\left[\begin{matrix}x' \\ y'\end{matrix}\right] = \left[\begin{matrix}-1 & 0 \\ 0 & 1\end{matrix}\right]\left[\begin{matrix}x \\ y\end{matrix}\right]$ .

Figure 3: Reflection (flip)
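As a quick check of the reflection matrix, the sketch below mirrors a point across the $y$-axis and then mirrors it again. The constant and function names are illustrative assumptions, and the $x$-axis matrix is the analogous case, not one spelled out in the text.

```python
# Reflection matrices written as nested lists (rows of the 2x2 matrix).
REFLECT_ACROSS_Y_AXIS = [[-1, 0], [0, 1]]  # x' = -x, y' = y (the case in the text)
REFLECT_ACROSS_X_AXIS = [[1, 0], [0, -1]]  # x' = x, y' = -y (the analogous case)


def apply(m, p):
    """Multiply a 2x2 matrix by a 2D point given as an (x, y) tuple."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y, m[1][0] * x + m[1][1] * y)


p = (3, 2)
mirrored = apply(REFLECT_ACROSS_Y_AXIS, p)         # x changes sign, y is kept
restored = apply(REFLECT_ACROSS_Y_AXIS, mirrored)  # reflecting twice undoes the flip
flipped_down = apply(REFLECT_ACROSS_X_AXIS, p)     # y changes sign, x is kept
print(mirrored, restored, flipped_down)
```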

Shear Matrix

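Shear: dragging one edge of the image along some direction. As an example, drag the top edge of the rectangle in the figure horizontally to the right by $a$ units: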

Figure 4: Shear

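The result has the following properties:

The vertical coordinate of every point is unchanged (vertical shift is always 0)
At $y = 0$ , $x$ is the same before and after the drag (horizontal shift is 0 at $y = 0$ )
On the top edge, at $y = 1$ , the result is $x' = x + a$ (horizontal shift is $a$ at $y = 1$ )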

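Going one step further, at $y = 0.5$ we get $x' = x + 0.5a$ : the horizontal shift grows linearly with $y$ , so in general $x' = x + ay$ while $y' = y$ . This gives the matrix form of the shear: $\left[\begin{matrix}x' \\ y'\end{matrix}\right] = \left[\begin{matrix}1 & a \\ 0 & 1\end{matrix}\right]\left[\begin{matrix}x \\ y\end{matrix}\right]$ .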
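The three properties of the shear can be checked on the corners of a unit square. This is only a sketch; `shear_x` and `apply` are hypothetical helper names, and $a = 0.5$ is an arbitrary choice.

```python
def shear_x(a):
    """2x2 matrix for a horizontal shear: x' = x + a*y, y' = y."""
    return [[1.0, a], [0.0, 1.0]]


def apply(m, p):
    """Multiply a 2x2 matrix by a 2D point given as an (x, y) tuple."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y, m[1][0] * x + m[1][1] * y)


# Corners of the unit square, counterclockwise from the origin.
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
sheared = [apply(shear_x(0.5), c) for c in corners]
print(sheared)
```

The bottom edge ($y = 0$) stays where it is, the top edge ($y = 1$) moves right by $a$, and no point changes its $y$ value.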

Rotate

Unless stated otherwise, a rotation is about the origin $(0, 0)$ and counterclockwise (CCW).

In the figure below, rotating the left image by 45° means rotating it counterclockwise by 45° about the origin.

Figure 5: Counterclockwise rotation by 45° about the origin

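From the previous examples we can guess that rotation also has a matrix form (the rotation formula): $\left[\begin{matrix}x' \\ y'\end{matrix}\right] = \left[\begin{matrix}A & B \\ C & D\end{matrix}\right]\left[\begin{matrix}x \\ y\end{matrix}\right]$ . Assume the example square has side length 1 and is rotated by an angle $\theta$ ; the following relations then hold: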

Figure 6: Rotation relations

Since every point must satisfy the rotation formula, particular points must satisfy it too. Take the rotated point $(\cos\theta, \sin\theta)$ , whose position before the rotation is $(1, 0)$ . It must satisfy: $\left[\begin{matrix}\cos\theta \\ \sin\theta\end{matrix}\right] = \left[\begin{matrix}A & B \\ C & D\end{matrix}\right]\left[\begin{matrix}1 \\ 0\end{matrix}\right]$ . Expanding gives:

$\cos\theta = A \times 1 + B \times 0 = A$
$\sin\theta = C \times 1 + D \times 0 = C$

Likewise, take the rotated point $(-\sin\theta, \cos\theta)$ , whose position before the rotation is $(0, 1)$ . It must satisfy: $\left[\begin{matrix}-\sin\theta \\ \cos\theta\end{matrix}\right] = \left[\begin{matrix}A & B \\ C & D\end{matrix}\right]\left[\begin{matrix}0 \\ 1\end{matrix}\right]$ . Expanding gives:

$-\sin\theta = A \times 0 + B \times 1 = B$
$\cos\theta = C \times 0 + D \times 1 = D$

So the rotation formula satisfied by every point is: $\left[\begin{matrix}x' \\ y'\end{matrix}\right] = \left[\begin{matrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{matrix}\right]\left[\begin{matrix}x \\ y\end{matrix}\right]$ . The corresponding rotation matrix is $R_{\theta} = \left[\begin{matrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{matrix}\right]$ .
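The derivation says that the columns of $R_{\theta}$ are the images of $(1, 0)$ and $(0, 1)$. A small sketch can confirm this numerically; `rotation_matrix` and `apply` are names assumed for this example, and the angle is taken in radians because that is what Python's `math` module expects.

```python
import math


def rotation_matrix(theta):
    """2x2 matrix for a counterclockwise rotation by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def apply(m, p):
    """Multiply a 2x2 matrix by a 2D point given as an (x, y) tuple."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y, m[1][0] * x + m[1][1] * y)


theta = math.radians(45)
r = rotation_matrix(theta)
image_of_x_axis = apply(r, (1.0, 0.0))  # first column of R: (cos, sin)
image_of_y_axis = apply(r, (0.0, 1.0))  # second column of R: (-sin, cos)
print(image_of_x_axis, image_of_y_axis)
```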

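In the same way we obtain $R_{-\theta} = \left[\begin{matrix}\cos\theta & \sin\theta \\ -\sin\theta & \cos\theta\end{matrix}\right]$ . Note that $R_{-\theta}$ means a clockwise rotation by $\theta$ about the origin. Comparing it with $R_{\theta}$ shows that it is the transpose, and since rotating by $-\theta$ undoes rotating by $\theta$ it is also the inverse: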

$R_{-\theta} = R_{\theta}^T$
$R_{-\theta} = R_{\theta}^{-1}$

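In other words, if we know the matrix for a counterclockwise rotation by $\theta$ about the origin, taking its transpose gives the matrix for the clockwise rotation by $\theta$ . More generally, a matrix whose inverse equals its transpose is called an orthogonal matrix.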
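Both identities can be verified numerically. The sketch below uses hypothetical helpers (`rotation_matrix`, `transpose`, `matmul`) and an arbitrary angle of 30°; it compares with a tolerance because floating-point results are only approximately equal.

```python
import math


def rotation_matrix(theta):
    """2x2 matrix for a counterclockwise rotation by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def transpose(m):
    """Swap rows and columns of a 2x2 matrix."""
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


theta = math.radians(30)
r = rotation_matrix(theta)
r_neg = rotation_matrix(-theta)      # clockwise rotation by theta
product = matmul(transpose(r), r)    # should be close to the identity matrix

same_as_transpose = all(
    math.isclose(r_neg[i][j], transpose(r)[i][j], abs_tol=1e-12)
    for i in range(2) for j in range(2)
)
print(same_as_transpose, product)
```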

Linear Transforms = Matrices (of the same dimension)

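If a transformation satisfies $x' = ax + by$ and $y' = cx + dy$ , i.e. it can be written as $\left[\begin{matrix}x' \\ y'\end{matrix}\right] = \left[\begin{matrix}a & b \\ c & d\end{matrix}\right]\left[\begin{matrix}x \\ y\end{matrix}\right]$ (equivalently $\mathbf{x}' = M\mathbf{x}$ , where $\mathbf{x}$ denotes a coordinate vector), then it is a linear transformation.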

Translation

Translating an image gives $x' = x + t_x$ , $y' = y + t_y$ . Written in matrix form, however, this becomes: $\left[\begin{matrix}x' \\ y'\end{matrix}\right] = \left[\begin{matrix}1 & 0 \\ 0& 1\end{matrix}\right]\left[\begin{matrix}x \\ y\end{matrix}\right] + \left[\begin{matrix}t_x \\ t_y\end{matrix}\right]$ . The extra added term means a translation cannot be expressed as a single $2 \times 2$ matrix product, so translation is not a linear transformation.

Figure 7: Translation

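Humans are lazy, and our teacher Yan has said more than once that laziness is a virtue (laziness here does not mean avoiding problems, but solving them in convenient and efficient ways). People do not want to treat translation as a special case, so is there a way to write translation as a matrix product as well? The answer is to introduce homogeneous coordinates. Specifically, we add a third coordinate $w$ : for a point, $w = 1$ ; for a vector, $w = 0$ .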

$\text{2D point} = (x, y, 1)^T$
$\text{2D vector} = (x, y, 0)^T$

With this in place, consider the following expression:

$\left[\begin{matrix}x' \\ y' \\ w'\end{matrix}\right] = \left[\begin{matrix}1 & 0 &t_x\\ 0& 1 & t_y\\0 & 0 & 1\end{matrix}\right]\left[\begin{matrix}x \\ y \\ 1\end{matrix}\right] = \left[\begin{matrix}x+t_x \\ y+t_y \\ 1\end{matrix}\right]$

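The result is exactly what translation requires. But why distinguish points from vectors with 1 and 0? Because with this distinction, operations on vectors and points in homogeneous coordinates keep some nice properties: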

vector + vector = vector
$(x_1, y_1, 0) + (x_2, y_2, 0) = (x_1+x_2, y_1+y_2, 0)$
point - point = vector
$(x_1, y_1, 1) - (x_2, y_2, 1) = (x_1-x_2, y_1-y_2, 0)$
point + vector = point
$(x_1, y_1, 1) + (x_2, y_2, 0) = (x_1+x_2, y_1+y_2, 1)$
point + point = ??

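As for the fourth item above, adding two points has no meaning by itself, but homogeneous coordinates give it one: a point $(x, y, w)$ with $w \neq 0$ is taken to represent the 2D point $(x/w, y/w, 1)$ . Then: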

point + point = point
$(x_1, y_1, 1) + (x_2, y_2, 1) = (x_1+x_2, y_1+y_2, 2) = ((x_1+x_2)/2, (y_1+y_2)/2, 1)$
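The bookkeeping with $w$ can be tried out directly on tuples. The following is a rough sketch; `add`, `sub` and `normalize` are illustrative names, and raising an error for $w = 0$ is a choice made for this example only.

```python
def add(a, b):
    """Component-wise sum of two homogeneous triples (x, y, w)."""
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    """Component-wise difference of two homogeneous triples (x, y, w)."""
    return tuple(x - y for x, y in zip(a, b))


def normalize(p):
    """Divide by w so a point (x, y, w) with w != 0 becomes (x/w, y/w, 1)."""
    x, y, w = p
    if w == 0:
        raise ValueError("w = 0 describes a vector, not a point")
    return (x / w, y / w, 1.0)


p1 = (0.0, 0.0, 1.0)  # point
p2 = (4.0, 2.0, 1.0)  # point
v = (1.0, 1.0, 0.0)   # vector

diff = sub(p2, p1)                 # point - point: w becomes 0, a vector
moved = add(p1, v)                 # point + vector: w stays 1, a point
midpoint = normalize(add(p1, p2))  # point + point: w is 2, normalize to w = 1
print(diff, moved, midpoint)
```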

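In homogeneous coordinates, then, the sum of two points is their midpoint. One of the most important purposes of homogeneous coordinates is to let every transformation seen so far, translation included, be written as a single matrix multiplication. Before homogeneous coordinates, a linear transformation followed by a translation had the form: $\left[\begin{matrix}x' \\ y'\end{matrix}\right] = \left[\begin{matrix}a & b \\ c& d\end{matrix}\right]\left[\begin{matrix}x \\ y\end{matrix}\right] + \left[\begin{matrix}t_x \\ t_y\end{matrix}\right]$ . A transformation made of a linear transformation plus a translation like this is called an affine transformation. Every affine transformation can be written in homogeneous coordinates: $\left[\begin{matrix}x' \\ y' \\ 1\end{matrix}\right] = \left[\begin{matrix}a & b &t_x\\ c& d & t_y\\0 & 0 & 1\end{matrix}\right]\left[\begin{matrix}x \\ y \\ 1\end{matrix}\right]$ . The earlier scaling and rotation, as well as translation, can all be written this way: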

Scale: $S(s_x, s_y) = \left(\begin{matrix}s_x & 0 & 0 \\0 & s_y & 0\\0 & 0 & 1\end{matrix}\right)$
Rotation: $R(\theta) = \left(\begin{matrix}\cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0\\0 & 0 & 1\end{matrix}\right)$
Translation: $T(t_x, t_y) = \left(\begin{matrix}1 & 0 & t_x \\0 & 1 & t_y\\0 & 0 & 1\end{matrix}\right)$

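There is a pattern here: the upper-left $2 \times 2$ block holds the linear transformation, the first two entries of the third column hold the translation, and the third row is always $(0, 0, 1)$ (note: this holds only for the homogeneous representation of 2D affine transformations). Homogeneous coordinates do come at a cost (a trade-off), however: one extra dimension.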
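The three homogeneous matrices can be sketched as small builder functions. The names `scale`, `rotate`, `translate` and `apply` are assumptions for this example. The sketch also shows one consequence of setting $w = 0$ for vectors: the translation column is multiplied by zero, so a vector is left unchanged by a translation.

```python
import math


def scale(sx, sy):
    """3x3 homogeneous scaling matrix S(sx, sy)."""
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]


def rotate(theta):
    """3x3 homogeneous matrix R(theta), counterclockwise, theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def translate(tx, ty):
    """3x3 homogeneous translation matrix T(tx, ty)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def apply(m, h):
    """Multiply a 3x3 matrix by a homogeneous triple (x, y, w)."""
    return tuple(sum(m[i][j] * h[j] for j in range(3)) for i in range(3))


point = (1.0, 2.0, 1.0)   # w = 1
vector = (1.0, 2.0, 0.0)  # w = 0

moved_point = apply(translate(3.0, -1.0), point)    # the point is shifted
moved_vector = apply(translate(3.0, -1.0), vector)  # the vector is not
scaled = apply(scale(2.0, 3.0), point)
rotated = apply(rotate(math.radians(90)), (1.0, 0.0, 1.0))
print(moved_point, moved_vector, scaled, rotated)
```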

Inverse Transform

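If an image reaches a new position after some transformation $M$ , applying the inverse transformation at the new position brings it back to its initial position. Mathematically, the inverse transformation corresponds to multiplying by the inverse matrix $M^{-1}$ .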

Figure 8: Inverse transform
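For a 2D affine matrix whose third row is $(0, 0, 1)$, the inverse can be written in closed form: invert the upper-left $2 \times 2$ block, then apply that inverse to the negated translation. The sketch below is one way to do it, not a method given in the notes; `invert_affine` and `apply` are hypothetical names, and the example matrix is arbitrary.

```python
def invert_affine(m):
    """Invert a 3x3 affine matrix whose last row is (0, 0, 1)."""
    (a, b, tx), (c, d, ty), _ = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("the transformation is not invertible")
    # Inverse of the 2x2 linear block.
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    # The new translation is the inverse block applied to (-tx, -ty).
    return [[ia, ib, -(ia * tx + ib * ty)],
            [ic, id_, -(ic * tx + id_ * ty)],
            [0.0, 0.0, 1.0]]


def apply(m, h):
    """Multiply a 3x3 matrix by a homogeneous triple (x, y, w)."""
    return tuple(sum(m[i][j] * h[j] for j in range(3)) for i in range(3))


# Scale by (2, 4), then translate by (3, -1).
m = [[2.0, 0.0, 3.0], [0.0, 4.0, -1.0], [0.0, 0.0, 1.0]]
start = (1.0, 1.0, 1.0)
moved = apply(m, start)                # transformed position
back = apply(invert_affine(m), moved)  # returns to the starting position
print(moved, back)
```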

Composing Transforms

Consider the problem in the figure below: what transformation turns the image on the left into the image on the right?

Figure 9: Composing transforms

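Let us first try translating ( $T_{(1,0)}$ ) and then rotating ( $R_{45}$ ; remember the default is counterclockwise about the origin). We adopt the convention that transformations are applied one by one from right to left, so translate-then-rotate is written $R_{45}{\cdot}T_{(1,0)}$ . The result is not what we want: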

Figure 10: Translate first, then rotate

Now try rotating first and then translating; this gives the desired result:

Figure 11: Rotate first, then translate

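We draw two conclusions:

A complex transformation can be built from simple transformations
The order of the transformations affects the result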

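We can also understand $R_{45}{\cdot}T_{(1,0)} \neq T_{(1,0)}{\cdot}R_{45}$ from the fact that matrix multiplication is not commutative. The rotate-then-translate transformation above can be written as follows (note again that transformations are applied from right to left):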

$T_{(1,0)}{\cdot}R_{45}{\cdot}\left[\begin{matrix}x\\ y\\1\end{matrix}\right] = \left[\begin{matrix}1 & 0 & 1\\0 & 1 & 0\\0 & 0 & 1\end{matrix}\right]\left[\begin{matrix}\cos 45^\circ & -\sin 45^\circ & 0\\ \sin 45^\circ & \cos 45^\circ & 0\\0 & 0 & 1\end{matrix}\right]\left[\begin{matrix}x\\ y\\1\end{matrix}\right]$

Generalizing, applying the transformations $A_1, A_2, A_3, \dots, A_n$ to $X$ in that order gives:

$A_{n}(...A_{2}(A_{1}(X))) = A_{n} \cdots A_{2} \cdot A_{1} \cdot \left(\begin{matrix}x\\ y\\1\end{matrix}\right)$

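Matrix multiplication is not commutative, but it is associative. So we can compute the product $A_{n} \cdots A_{2} \cdot A_{1}$ first and then multiply it by the column vector at the end. Since every transformation in 2D homogeneous coordinates is a $3 \times 3$ matrix, the product of all of them is also a $3 \times 3$ matrix, which is a very convenient property for computation.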
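Both points, that order matters and that the product can be computed once up front, can be illustrated with a short sketch. The helper names (`matmul`, `apply`, `rotate`, `translate`) and the test point $(1, 0)$ are assumptions made for this example.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, h):
    """Multiply a 3x3 matrix by a homogeneous triple (x, y, w)."""
    return tuple(sum(m[i][j] * h[j] for j in range(3)) for i in range(3))


def rotate(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def translate(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


r = rotate(math.radians(45))
t = translate(1.0, 0.0)
p = (1.0, 0.0, 1.0)

# Transformations apply from right to left.
rotate_then_translate = matmul(t, r)  # T * R
translate_then_rotate = matmul(r, t)  # R * T

a = apply(rotate_then_translate, p)
b = apply(translate_then_rotate, p)
# Associativity: one precomputed matrix gives the same point as two steps.
step_by_step = apply(t, apply(r, p))
print(a, b, step_by_step)
```

Precomputing the product pays off when the same chain of transformations is applied to many points, since each point then needs only one matrix-vector multiplication.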

Decomposing Complex Transforms

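Earlier we said that a rotation is counterclockwise about the origin by default. What if we need to rotate about a given point $c = (c_x, c_y)$ instead? This can be solved by decomposing the transformation: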

Translate the image so that $c$ moves to the origin (every point of the image undergoes $x-c_x, y-c_y$ )
Rotate
Translate back

Figure 12: Rotating about a given point by decomposing the transformation

The corresponding matrix form is (again, the transformations are applied from right to left): $T(c) \cdot R(\alpha) \cdot T(-c)$
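The three steps can be sketched as a single composed matrix. `rotate_about` and the other helper names are hypothetical, and the centre $(1, 1)$ and the angle of 90° are arbitrary choices for the demonstration.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, h):
    """Multiply a 3x3 matrix by a homogeneous triple (x, y, w)."""
    return tuple(sum(m[i][j] * h[j] for j in range(3)) for i in range(3))


def rotate(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def translate(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def rotate_about(alpha, cx, cy):
    """T(c) * R(alpha) * T(-c): move c to the origin, rotate, move back."""
    return matmul(translate(cx, cy), matmul(rotate(alpha), translate(-cx, -cy)))


m = rotate_about(math.radians(90), 1.0, 1.0)
centre = apply(m, (1.0, 1.0, 1.0))  # the centre of rotation stays fixed
q = apply(m, (2.0, 1.0, 1.0))       # one unit right of c ends up one unit above c
print(centre, q)
```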

* Do not reproduce without permission.

最爱午后红茶