
Improved Trajectory Planning for On-Road Self-Driving Vehicles Via Optimization

Author: 論文推土機, published 2022-03-04

This post walks through chapter 6 of "Improved trajectory planning for on-road self-driving vehicles via combined graph search, optimization & topology analysis", which uses DDP (differential dynamic programming) for trajectory planning.

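We treat a trajectory as a state sequence X together with a control sequence U, both indexed by time step (the superscript k is the iteration index):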

\begin{array}{l} X^{(k)} \doteq\left\{x_{0}, x_{1}, \ldots, x_{i}, \ldots, x_{N-1}, x_{N}\right\} \\ U^{(k)} \doteq\left\{u_{0}, u_{1}, \ldots, u_{i}, \ldots, u_{N-1}\right\} \end{array}

together with the discrete state transition equation:

x_{i+1}=f_{d}\left(x_{i}, u_{i}\right)

The objective is defined as:

\begin{array}{c} U^{*}=\underset{U}{\operatorname{argmin}} J(X, U) \\ J(X, U)=g_{N}\left(x_{N}\right)+\sum_{i=0}^{N-1} g\left(x_{i}, u_{i}\right) \end{array}

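That is, we need to find a control sequence U over time that minimizes the cost function J, where g is the stage cost and g_N is the terminal cost.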
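To make the notation concrete, here is a minimal sketch that builds X from U with the transition equation and then evaluates J. The toy dynamics, the cost terms, and the names `rollout`, `total_cost`, and `DT` are assumptions chosen for illustration; none of them come from the paper.

```python
def rollout(x0, controls, f_d):
    """Apply x_{i+1} = f_d(x_i, u_i) to build the state sequence X from U."""
    states = [x0]
    for u in controls:
        states.append(f_d(states[-1], u))
    return states


def total_cost(states, controls, g, g_N):
    """J(X, U) = g_N(x_N) + sum_{i=0}^{N-1} g(x_i, u_i)."""
    running = sum(g(x, u) for x, u in zip(states[:-1], controls))
    return g_N(states[-1]) + running


# Hypothetical toy problem: a scalar integrator driven toward x = 1.
DT = 0.5


def f_d(x, u):
    return x + DT * u


def g(x, u):
    return 0.5 * u * u


def g_N(x):
    return 10.0 * (x - 1.0) ** 2


U = [1.0, 1.0]
X = rollout(0.0, U, f_d)      # N = 2 controls give N + 1 = 3 states
J = total_cost(X, U, g, g_N)
```

Note that X has one more element than U, matching the index ranges in the definitions above.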

Next we define the value function (optimal cost-to-go) at step i as the minimum over u_i of the stage cost g at step i plus the value function at the next state x_{i+1}:

V\left(x_{i}\right)=\min _{u_{i}}\left[g\left(x_{i}, u_{i}\right)+V\left(x_{i+1}\right)\right]
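The recursion can be checked on a tiny problem where the controls are restricted to a finite set, so the minimum is an exhaustive search. This is only a sketch: the control set `CONTROLS`, the horizon `N`, and the toy dynamics and costs are hypothetical, and the step index is passed explicitly because the finite-horizon value function depends on it.

```python
from itertools import product

CONTROLS = (-1.0, 0.0, 1.0)   # hypothetical discretized control set
N = 3                         # hypothetical horizon


def f_d(x, u):
    return x + 0.5 * u


def g(x, u):
    return 0.1 * u * u + 0.05 * x * x


def g_N(x):
    return (x - 1.0) ** 2


def value(x, i):
    """V at step i: min over u_i of g(x_i, u_i) + V(x_{i+1}); at step N it is g_N."""
    if i == N:
        return g_N(x)
    return min(g(x, u) + value(f_d(x, u), i + 1) for u in CONTROLS)


def brute_force(x0):
    """Minimize J directly by enumerating every control sequence."""
    best = float("inf")
    for controls in product(CONTROLS, repeat=N):
        x, cost = x0, 0.0
        for u in controls:
            cost += g(x, u)
            x = f_d(x, u)
        best = min(best, cost + g_N(x))
    return best
```

Both functions return the same optimal cost from the same start state, which is exactly what the recursion claims: minimizing step by step is equivalent to minimizing J over the whole sequence.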

Now call the sum of the two terms inside the value function P:

\begin{aligned} P\left(x_{i}, u_{i}\right) &=g\left(x_{i}, u_{i}\right)+V\left(x_{i+1}\right) \\ &=g\left(x_{i}, u_{i}\right)+V\left(f_{d}\left(x_{i}, u_{i}\right)\right) \end{aligned}

Then define Q as the change in P under a small perturbation (δx, δu):

\begin{aligned} Q(\delta x, \delta u) &=P\left(x_{i}+\delta x, u_{i}+\delta u\right)-P\left(x_{i}, u_{i}\right) \\ &=g\left(x_{i}+\delta x, u_{i}+\delta u\right)-g\left(x_{i}, u_{i}\right) \\ &+V\left(f_{d}\left(x_{i}+\delta x, u_{i}+\delta u\right)\right)-V\left(f_{d}\left(x_{i}, u_{i}\right)\right) \end{aligned}

We approximate Q by a quadratic around (0,0), i.e. we take its second-order Taylor expansion:

\begin{aligned} Q(\delta x, \delta u) & \approx \tilde{Q}(\delta x, \delta u)=Q(0,0) \\ &+\left.\frac{\partial Q(\delta x, \delta u)}{\partial[\delta x]}\right|_{0,0}[\delta x]+\left.\frac{\partial Q(\delta x, \delta u)}{\partial[\delta u]}\right|_{0,0}[\delta u]+\left.\frac{1}{2}[\delta x]^{T} \frac{\partial^{2} Q(\delta x, \delta u)}{\partial[\delta x]^{2}}\right|_{0,0}[\delta x] \\ &+\left.[\delta u]^{T} \frac{\partial^{2} Q(\delta x, \delta u)}{\partial[\delta u] \partial[\delta x]}\right|_{0,0}[\delta x]+\left.\frac{1}{2}[\delta u]^{T} \frac{\partial^{2} Q(\delta x, \delta u)}{\partial[\delta u]^{2}}\right|_{0,0}[\delta u] \\ &=0+Q_{x}[\delta x]+Q_{u}[\delta u]+\frac{1}{2}[\delta x]^{T} Q_{x x}[\delta x]+[\delta u]^{T} Q_{u x}[\delta x]+\frac{1}{2}[\delta u]^{T} Q_{u u}[\delta u] \\ &=\frac{1}{2}\left[\begin{array}{c} 1 \\ \delta x \\ \delta u \end{array}\right]^{T}\left[\begin{array}{ccc} 0 & Q_{x}^{T} & Q_{u}^{T} \\ Q_{x} & Q_{x x} & Q_{x u} \\ Q_{u} & Q_{u x} & Q_{u u} \end{array}\right]\left[\begin{array}{c} 1 \\ \delta x \\ \delta u \end{array}\right] \end{aligned}

where:

\begin{aligned} Q_{x} &=\left.\frac{\partial[Q(\delta x, \delta u)]}{\partial[\delta x]}\right|_{0,0}=\left.\frac{\partial g\left(x_{i}+\delta x, u_{i}+\delta u\right)}{\partial[\delta x]}\right|_{0,0}+\left.\frac{\partial V\left(f_{d}\left(x_{i}+\delta x, u_{i}+\delta u\right)\right)}{\partial[\delta x]}\right|_{0,0} \\ &=\left.\frac{\partial g}{\partial x}\right|_{x_{i}, u_{i}}+\left.\left.\frac{\partial f_{d}}{\partial x}\right|_{x_{i}, u_{i}} ^{T} \cdot \frac{\partial V}{\partial x}\right|_{x_{i+1}, u_{i+1}}=g_{x}+f_{x}^{T} V_{x}^{\prime} \\ Q_{u} &=\left.\frac{\partial[Q(\delta x, \delta u)]}{\partial[\delta u]}\right|_{0,0}=\left.\frac{\partial g\left(x_{i}+\delta x, u_{i}+\delta u\right)}{\partial[\delta u]}\right|_{0,0}+\left.\frac{\partial V\left(f_{d}\left(x_{i}+\delta x, u_{i}+\delta u\right)\right)}{\partial[\delta u]}\right|_{0,0} \\ &=\left.\frac{\partial g}{\partial u}\right|_{x_{i}, u_{i}}+\left.\left.\frac{\partial f_{d}}{\partial u}\right|_{x_{i}, u_{i}} ^{T} \cdot \frac{\partial V}{\partial x}\right|_{x_{i+1}, u_{i+1}}=g_{u}+f_{u}^{T} V_{x}^{\prime} \\ Q_{x x} &=\left.\frac{\partial^{2}[Q(\delta x, \delta u)]}{\partial[\delta x]^{2}}\right|_{0,0}=\left.\frac{\partial^{2} g\left(x_{i}+\delta x, u_{i}+\delta u\right)}{\partial[\delta x]^{2}}\right|_{0,0}+\left.\frac{\partial^{2} V\left(f_{d}\left(x_{i}+\delta x, u_{i}+\delta u\right)\right)}{\partial[\delta x]^{2}}\right|_{0,0} \\ &=\left.\frac{\partial^{2} g}{\partial x^{2}}\right|_{x_{i}, u_{i}}+\left.\left.\left.\frac{\partial f_{d}}{\partial x}\right|_{x_{i}, u_{i}} ^{T} \cdot \frac{\partial^{2} V}{\partial x^{2}}\right|_{x_{i+1}, u_{i+1}} \cdot \frac{\partial f_{d}}{\partial x}\right|_{x_{i}, u_{i}}+\left.\left.\frac{\partial V}{\partial x}\right|_{x_{i+1}, u_{i+1}} \cdot \frac{\partial^{2} f_{d}}{\partial x^{2}}\right|_{x_{i}, u_{i}} \\ &=g_{x x}+f_{x}^{T} \cdot V_{x x}^{\prime} \cdot f_{x}+V_{x}^{\prime} \cdot f_{x x} \end{aligned}

\begin{aligned} Q_{u x} &=\left.\frac{\partial^{2}[Q(\delta x, \delta u)]}{\partial[\delta u] \partial[\delta x]}\right|_{0,0}=\left.\frac{\partial^{2} g\left(x_{i}+\delta x, u_{i}+\delta u\right)}{\partial[\delta u] \partial[\delta x]}\right|_{0,0}+\left.\frac{\partial^{2} V\left(f_{d}\left(x_{i}+\delta x, u_{i}+\delta u\right)\right)}{\partial[\delta x] \partial[\delta u]}\right|_{0,0} \\ &=\left.\frac{\partial^{2} g}{\partial u \partial x}\right|_{x_{i}, u_{i}}+\left.\left.\left.\frac{\partial f_{d}}{\partial u}\right|_{x_{i}, u_{i}} ^{T} \cdot \frac{\partial^{2} V}{\partial x^{2}}\right|_{x_{i+1}, u_{i+1}} \cdot \frac{\partial f_{d}}{\partial x}\right|_{x_{i}, u_{i}}+\left.\left.\frac{\partial V}{\partial x}\right|_{x_{i+1}, u_{i+1}} \cdot \frac{\partial^{2} f_{d}}{\partial u \partial x}\right|_{x_{i}, u_{i}} \\ &=g_{u x}+f_{u}^{T} \cdot V_{x x}^{\prime} \cdot f_{x}+V_{x}^{\prime} \cdot f_{u x} \\ Q_{u u} &=\left.\frac{\partial^{2}[Q(\delta x, \delta u)]}{\partial[\delta u]^{2}}\right|_{0,0}=\left.\frac{\partial^{2} g\left(x_{i}+\delta x, u_{i}+\delta u\right)}{\partial[\delta u]^{2}}\right|_{0,0}+\left.\frac{\partial^{2} V\left(f_{d}\left(x_{i}+\delta x, u_{i}+\delta u\right)\right)}{\partial[\delta u]^{2}}\right|_{0,0} \\ &=\left.\frac{\partial^{2} g}{\partial u^{2}}\right|_{x_{i}, u_{i}}+\left.\left.\left.\frac{\partial f_{d}}{\partial u}\right|_{x_{i}, u_{i}} ^{T} \cdot \frac{\partial^{2} V}{\partial x^{2}}\right|_{x_{i+1}, u_{i+1}} \cdot \frac{\partial f_{d}}{\partial u}\right|_{x_{i}, u_{i}}+\left.\left.\frac{\partial V}{\partial x}\right|_{x_{i+1}, u_{i+1}} \cdot \frac{\partial^{2} f_{d}}{\partial u^{2}}\right|_{x_{i}, u_{i}} \\ &=g_{u u}+f_{u}^{T} \cdot V_{x x}^{\prime} \cdot f_{u}+V_{x}^{\prime} \cdot f_{u u} \end{aligned}

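The formulas are tedious, but they are nothing more than step-by-step differentiation (the chain rule) plus some notational shorthand. A prime, as in V_x' and V_xx', denotes a derivative of V evaluated at the next state x_{i+1}.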
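As a sanity check on the five formulas, the sketch below implements them for a scalar state and scalar control, where every transpose and matrix product reduces to ordinary multiplication. The function name `q_derivatives`, the toy `f_d`, `g`, and the quadratic `V_next` are all assumptions for illustration. Because `V_next` is exactly quadratic here, the formulas should match the true derivatives of P.

```python
import math


def q_derivatives(g_derivs, f_derivs, V_x_next, V_xx_next):
    """Scalar Q_x, Q_u, Q_xx, Q_ux, Q_uu from the derivatives of g, f_d and V'."""
    g_x, g_u, g_xx, g_ux, g_uu = g_derivs
    f_x, f_u, f_xx, f_ux, f_uu = f_derivs
    Q_x = g_x + f_x * V_x_next
    Q_u = g_u + f_u * V_x_next
    Q_xx = g_xx + f_x * V_xx_next * f_x + V_x_next * f_xx
    Q_ux = g_ux + f_u * V_xx_next * f_x + V_x_next * f_ux
    Q_uu = g_uu + f_u * V_xx_next * f_u + V_x_next * f_uu
    return Q_x, Q_u, Q_xx, Q_ux, Q_uu


# Hypothetical toy problem (not from the paper).
def f_d(x, u):
    return x + 0.1 * (x * u + math.sin(x)) + 0.05 * u * u


def g(x, u):
    return 0.5 * x * x + x * u + u * u


def V_next(x):
    return 2.0 * (x - 1.0) ** 2


def P(x, u):
    return g(x, u) + V_next(f_d(x, u))


def q_at(x, u):
    """Evaluate the Q terms at (x, u) using hand-derived derivatives of the toy problem."""
    x_next = f_d(x, u)
    g_derivs = (x + u, x + 2.0 * u, 1.0, 1.0, 2.0)
    f_derivs = (
        1.0 + 0.1 * (u + math.cos(x)),  # f_x
        0.1 * x + 0.1 * u,              # f_u
        -0.1 * math.sin(x),             # f_xx
        0.1,                            # f_ux
        0.1,                            # f_uu
    )
    return q_derivatives(g_derivs, f_derivs, 4.0 * (x_next - 1.0), 4.0)
```

Comparing `q_at(x, u)` against central finite differences of `P` at the same point is a cheap way to catch a sign or index slip when deriving these terms by hand.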

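Now, what we are after is the optimal control sequence. Since Q is expressed in terms of small perturbations, what we look for is the optimal small perturbation; by iterating again and again, the control sequence keeps approaching the optimal one. This is the same idea as in this figure:

[Figure: image not preserved]

Starting from an initial state, each iteration applies a small delta, eventually approaching the optimal state and yielding the optimal control sequence for it. See also 論文推土機: 對特斯拉決策規劃技術的理解 (ddp optimization).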

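Because this approximation is quadratic in δu, its minimizer is the point where the derivative with respect to δu vanishes (this is a minimum when Q_uu is positive definite):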

\frac{\partial Q(\delta x, \delta u)}{\partial[\delta u]}=Q_{u}+Q_{u x} \cdot \delta x+Q_{u u} \cdot \delta u=0

which gives the optimal perturbation:

\begin{array}{c} \delta u^{*}=-Q_{u u}^{-1}\left(Q_{u}+Q_{u x} \cdot \delta x\right) \\ =k+K \cdot \delta x \\ k=-Q_{u u}^{-1} Q_{u} \\ K=-Q_{u u}^{-1} Q_{u x} \end{array}
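A small numeric sketch of the feedforward term k and the feedback gain K, again for the scalar case where the inverse of Q_uu is a division. The helper names and the illustrative Q values are assumptions, not values from the paper.

```python
def feedback_gains(Q_u, Q_ux, Q_uu):
    """Return (k, K) such that du* = k + K * dx (scalar state and control)."""
    if Q_uu <= 0.0:
        raise ValueError("Q_uu must be positive for du* to be a minimizer")
    k = -Q_u / Q_uu
    K = -Q_ux / Q_uu
    return k, K


def q_model(dx, du, Q_x, Q_u, Q_xx, Q_ux, Q_uu):
    """The quadratic model of Q(dx, du) from the Taylor expansion."""
    return (Q_x * dx + Q_u * du + 0.5 * Q_xx * dx * dx
            + Q_ux * du * dx + 0.5 * Q_uu * du * du)


# Illustrative numbers only.
Q_x, Q_u, Q_xx, Q_ux, Q_uu = 0.4, -1.0, 2.0, 0.5, 4.0
k, K = feedback_gains(Q_u, Q_ux, Q_uu)
dx = 0.2
du_star = k + K * dx
```

For any fixed dx, moving du away from `du_star` in either direction increases `q_model`, which is what the zero-derivative condition guarantees when Q_uu is positive.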

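There is one more step here that differs slightly from the standard iLQR (DDP) presentation: the derivative expressions for Q above contain V, while we need Q in order to obtain V. The trick is to substitute the optimal δu* back into the quadratic model, which expresses the derivatives of V at step i purely in terms of the Q terms, so the recursion can run backward from the terminal step. The reasoning is explained in 論文推土機: Synthesis and Stabilization of Complex Behaviors through Online Trajectory Optimization.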

\begin{aligned} \frac{\partial V}{\partial x} &=Q_{x}-Q_{u} Q_{u u}^{-1} Q_{u x} \\ \frac{\partial^{2} V}{\partial x^{2}} &=Q_{x x}-Q_{x u} Q_{u u}^{-1} Q_{u x} \end{aligned}
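The substitution can be verified numerically in the scalar case: plugging δu* = k + K·δx into the quadratic model should leave a quadratic in δx whose linear and quadratic coefficients are the V_x and V_xx given above. The names `value_backup` and `minimized_model` and the Q values are illustrative assumptions; the constant term `dV` is not written out in this post and is included only so the comparison is exact.

```python
def value_backup(Q_x, Q_u, Q_xx, Q_ux, Q_uu):
    """Scalar V_x and V_xx at step i from the Q terms, plus the constant term."""
    V_x = Q_x - Q_u * Q_ux / Q_uu
    V_xx = Q_xx - Q_ux * Q_ux / Q_uu
    dV = -0.5 * Q_u * Q_u / Q_uu  # constant term of the minimized model
    return V_x, V_xx, dV


def minimized_model(dx, Q_x, Q_u, Q_xx, Q_ux, Q_uu):
    """Quadratic model of Q evaluated at the optimal du* for this dx."""
    du = -(Q_u + Q_ux * dx) / Q_uu
    return (Q_x * dx + Q_u * du + 0.5 * Q_xx * dx * dx
            + Q_ux * du * dx + 0.5 * Q_uu * du * du)


Q = (0.4, -1.0, 2.0, 0.5, 4.0)  # illustrative Q_x, Q_u, Q_xx, Q_ux, Q_uu
V_x, V_xx, dV = value_backup(*Q)
```

In the vector case the same expressions hold with matrix products, with the ordering and transposes depending on whether gradients are treated as row or column vectors.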

The algorithm proceeds as follows:

[Two figures showing the algorithm flow; images not preserved]
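Since the paper's algorithm listing is not reproduced here, the following is only a generic sketch of how the backward and forward passes fit together, not the paper's method. It is an iLQR-style loop on a hypothetical scalar problem: the second-order dynamics terms (f_xx, f_ux, f_uu) are dropped, a simple backtracking factor `alpha` on k is added, and the dynamics, weights `R` and `W`, horizon `N`, and `TARGET` are all assumptions.

```python
import math

DT, N = 0.1, 30        # hypothetical step size and horizon
R, W = 0.01, 100.0     # hypothetical control and terminal weights
TARGET = 1.0           # hypothetical goal state


def f_d(x, u):
    return x + DT * (u - math.sin(x))  # toy nonlinear dynamics


def rollout(x0, U):
    X = [x0]
    for u in U:
        X.append(f_d(X[-1], u))
    return X


def cost(X, U):
    return 0.5 * W * (X[-1] - TARGET) ** 2 + sum(0.5 * R * u * u for u in U)


def backward(X, U):
    """Backward pass: gains k_i, K_i from the Q terms, starting at the terminal cost."""
    V_x, V_xx = W * (X[-1] - TARGET), W
    ks, Ks = [0.0] * N, [0.0] * N
    for i in reversed(range(N)):
        f_x, f_u = 1.0 - DT * math.cos(X[i]), DT
        Q_x = f_x * V_x                 # g_x = 0 for this stage cost
        Q_u = R * U[i] + f_u * V_x
        Q_xx = f_x * V_xx * f_x         # g_xx = 0, f_xx term dropped
        Q_ux = f_u * V_xx * f_x
        Q_uu = R + f_u * V_xx * f_u
        ks[i], Ks[i] = -Q_u / Q_uu, -Q_ux / Q_uu
        V_x = Q_x - Q_u * Q_ux / Q_uu
        V_xx = Q_xx - Q_ux * Q_ux / Q_uu
    return ks, Ks


def forward(X, U, ks, Ks, alpha):
    """Forward pass: apply du = alpha * k + K * dx along the real dynamics."""
    x, X_new, U_new = X[0], [X[0]], []
    for i in range(N):
        u = U[i] + alpha * ks[i] + Ks[i] * (x - X[i])
        U_new.append(u)
        x = f_d(x, u)
        X_new.append(x)
    return X_new, U_new


def ilqr(x0, iters=50):
    U = [0.0] * N
    X = rollout(x0, U)
    J = cost(X, U)
    history = [J]
    for _ in range(iters):
        ks, Ks = backward(X, U)
        for alpha in (1.0, 0.5, 0.25, 0.125):
            X_new, U_new = forward(X, U, ks, Ks, alpha)
            J_new = cost(X_new, U_new)
            if J_new < J:
                break
        else:
            break  # no step size improved the cost: treat as converged
        X, U, J_prev, J = X_new, U_new, J, J_new
        history.append(J)
        if J_prev - J < 1e-9:
            break
    return X, U, history


X, U, history = ilqr(0.0)
```

Each accepted iteration lowers the cost, so `history` is decreasing, and the final state ends close to `TARGET` because the terminal weight is large relative to the control weight.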

For the detailed derivation, see the lecture slides:

http://rail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-10.pdf

and:

https://jonathan-hui.medium.com/rl-lqr-ilqr-linear-quadratic-regulator-a5de5104c750

The resulting planned trajectory is shown in the figure:

[Figure: trajectory planning result; image not preserved]

Doesn't it look a lot like Tesla's video?

Next up is a detailed look at the cost function: to be continued.
