Paper title
Non-linear State-space Model Identification from Video Data using Deep Encoders
Paper authors
Paper abstract
Identifying systems with high-dimensional inputs and outputs, such as systems measured by video streams, is a challenging problem with numerous applications in robotics, autonomous vehicles and medical imaging. In this paper, we propose a novel non-linear state-space identification method that starts from high-dimensional input and output data. Multiple computational and conceptual advances are combined to handle the high-dimensional nature of the data. An encoder function, represented by a neural network, is introduced to learn a reconstructability map that estimates the model states from past inputs and outputs. This encoder function is learned jointly with the dynamics. Furthermore, multiple computational improvements, such as an improved reformulation of multiple shooting and batch optimization, are proposed to keep the computational time under control when dealing with large, high-dimensional datasets. We apply the proposed method to a video stream of a simulated environment of a controllable ball in a unit box. The study shows that the model obtained with the proposed method achieves low simulation error and excellent long-term prediction capability.
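The encoder idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the loss form, the function names (`encoder_loss`, `encoder`, `f`, `h`), the scalar signals and the toy linear system are all assumptions chosen for readability. The sketch evaluates a multiple-shooting style loss in which each short subsection of the data gets its initial state from an encoder applied to past inputs and outputs, after which the state is propagated through the dynamics and compared with the measured outputs. In the paper's setting the encoder would be a neural network and the outputs would be video frames; training by gradient descent is omitted here.

```python
import random


def encoder_loss(u, y, encoder, f, h, n_past, horizon, starts):
    """Mean squared simulation error over short subsections of the data.

    For each start index t, the encoder estimates the state at time t from
    the n_past inputs and outputs before t. The state is then propagated
    `horizon` steps with f and compared with the measured outputs through h.
    """
    total, count = 0.0, 0
    for t in starts:
        x = encoder(u[t - n_past:t], y[t - n_past:t])  # state estimate at t
        for k in range(horizon):
            err = y[t + k] - h(x)
            total += err * err
            count += 1
            x = f(x, u[t + k])  # one-step state update
    return total / count


# Hypothetical toy system (assumption): x[t+1] = 0.9 x[t] + 0.5 u[t], y[t] = x[t]
rng = random.Random(0)
N = 200
u = [rng.uniform(-1.0, 1.0) for _ in range(N)]
x, y = 0.0, []
for t in range(N):
    y.append(x)
    x = 0.9 * x + 0.5 * u[t]

f = lambda x, u_t: 0.9 * x + 0.5 * u_t
h = lambda x: x

# For this toy system one past sample is enough to reconstruct the state.
good_encoder = lambda u_past, y_past: 0.9 * y_past[-1] + 0.5 * u_past[-1]
bad_encoder = lambda u_past, y_past: 0.0  # ignores the past data

n_past, horizon, batch_size = 1, 20, 16
# Batch optimization: evaluate the loss on a random subset of subsections.
starts = rng.sample(range(n_past, N - horizon + 1), batch_size)

loss_good = encoder_loss(u, y, good_encoder, f, h, n_past, horizon, starts)
loss_bad = encoder_loss(u, y, bad_encoder, f, h, n_past, horizon, starts)
print(loss_good, loss_bad)
```

Because every subsection starts from an encoder estimate instead of a free initial-state parameter, the number of optimization variables does not grow with the number of subsections, and the subsections can be sampled in batches. This is one plausible reading of the "reformulation of multiple shooting and batch optimization" mentioned in the abstract.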