Paper title
Efficiency in Real-time Webcam Gaze Tracking
Paper authors
Paper abstract
Efficiency and ease of use are essential for practical applications of camera-based eye and gaze tracking. Gaze tracking estimates where a person is looking on a screen from face images captured by a computer-facing camera. In this paper we investigate two complementary forms of efficiency in gaze tracking: (1) the computational efficiency of the system, which is dominated by the inference speed of a CNN that predicts gaze vectors; and (2) the usability efficiency, which is determined by the tediousness of the mandatory calibration of the gaze vector to a computer screen. To do so, we evaluate the computational speed/accuracy trade-off for the CNN and the calibration effort/accuracy trade-off for screen calibration. For the CNN, we evaluate full-face, two-eye, and single-eye input. For screen calibration, we measure the number of calibration points needed and evaluate three types of calibration: (1) pure geometry, (2) pure machine learning, and (3) hybrid geometric regression. Results suggest that single-eye input and geometric regression calibration achieve the best trade-off.
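To make the calibration step more concrete, the following is a minimal illustrative sketch, not the authors' method. It assumes a camera-centered coordinate system in which the screen lies in the plane z = 0 and the eye sits at positive z (in millimetres). The function names `gaze_to_screen` and `fit_axis` are hypothetical. The first function is a purely geometric ray-plane intersection; the second fits a per-axis linear correction from calibration points, which only loosely mirrors the idea of combining geometry with regression.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def gaze_to_screen(eye: Vec3, gaze: Vec3) -> Tuple[float, float]:
    """Intersect a gaze ray with the assumed screen plane z = 0.

    eye:  3D eye position (assumed to have z > 0, i.e. in front of the screen).
    gaze: 3D gaze direction (need not be normalized).
    """
    ex, ey, ez = eye
    gx, gy, gz = gaze
    if gz >= 0:
        # The ray is parallel to the plane or points away from it.
        raise ValueError("gaze ray does not point toward the screen plane")
    t = -ez / gz  # ray parameter at which z reaches 0
    return (ex + t * gx, ey + t * gy)


def fit_axis(pred: Sequence[float], target: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of target ~ a * pred + b for one screen axis."""
    n = len(pred)
    if n < 2 or n != len(target):
        raise ValueError("need at least two matching calibration samples")
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0:
        raise ValueError("calibration predictions must not all be identical")
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    a = cov / var_p
    b = mean_t - a * mean_p
    return a, b


if __name__ == "__main__":
    # Geometric projection of one gaze sample (values are made up).
    x, y = gaze_to_screen((0.0, 0.0, 600.0), (0.1, -0.2, -1.0))
    print(x, y)

    # Per-axis correction fitted from three hypothetical calibration points.
    a, b = fit_axis([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
    print(a, b)
```

In this sketch, more calibration points give the regression more data to correct systematic offsets in the geometric projection, which is the calibration effort/accuracy trade-off the abstract refers to.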