Paper title
Model selection for count timeseries with applications in forecasting number of trips in bike-sharing systems and its volatility
Paper authors
Abstract
Forecasting the number of trips in bike-sharing systems and its volatility over time is crucial for planning and optimizing such systems. This paper develops timeseries models to forecast hourly count timeseries data and to estimate its volatility. Such models need to account for complex patterns over various temporal scales, including hourly, daily, weekly and annual, as well as for temporal correlation. Capturing this complex structure requires a large number of parameters. Here a structural model selection approach is used to choose the parameters. At each step, this method explores the parameter space for a group of covariates, where each group is constructed to represent a particular structure in the model. The statistical models used are extensions of Generalized Linear Models to timeseries data. One challenge in using such models is the explosive behavior of the simulated values. To address this issue, we develop a technique that damps a simulated value if it falls outside an admissible interval. The admissible interval is defined using measures of variability of the left and right tails. A new definition of outliers is proposed based on these variability measures, and it is shown to be useful in the context of asymmetric distributions.
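The abstract does not state how the tail variability measures or the damping rule are defined, so the following is only a minimal sketch under assumed definitions, not the paper's method. It assumes the left and right spreads are the distances from the median to the lower and upper quartiles, that the admissible interval extends `k` spreads on each side of the median, and that damping shrinks the excess beyond the violated bound by a factor `shrink`. All function names, `k`, and `shrink` are hypothetical.

```python
def quantile(xs_sorted, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(xs_sorted) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs_sorted) - 1)
    frac = pos - lo
    return xs_sorted[lo] * (1 - frac) + xs_sorted[hi] * frac


def admissible_interval(history, k=3.0):
    """Asymmetric interval around the median (assumed definition).

    The left spread is median minus lower quartile, the right spread is
    upper quartile minus median; each side gets its own width.
    """
    xs = sorted(history)
    med = quantile(xs, 0.5)
    left_spread = med - quantile(xs, 0.25)
    right_spread = quantile(xs, 0.75) - med
    return med - k * left_spread, med + k * right_spread


def damp(value, lower, upper, shrink=0.5):
    """Pull a value that lies outside [lower, upper] back toward the bound.

    Values inside the interval are returned unchanged (assumed rule).
    """
    if value > upper:
        return upper + shrink * (value - upper)
    if value < lower:
        return lower - shrink * (lower - value)
    return value


def is_outlier(value, lower, upper):
    """Flag a value as an outlier when it falls outside the interval."""
    return value < lower or value > upper


# Hypothetical hourly trip counts with a heavy right tail.
history = [1, 2, 2, 3, 3, 4, 5, 8, 20]
lower, upper = admissible_interval(history)
simulated = 15
print(lower, upper, damp(simulated, lower, upper), is_outlier(20, lower, upper))
```

Because the two sides of the interval are computed separately, a right-skewed count distribution gets a wider upper margin than lower margin, which is the kind of asymmetry the abstract says the outlier definition is meant to handle.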