Paper title

Non-parametric regression for networks

Authors

Katie E. Severn, Ian L. Dryden, Simon P. Preston

Abstract

Network data are becoming increasingly available, and so there is a need to develop suitable methodology for statistical analysis. Networks can be represented as graph Laplacian matrices, which are a type of manifold-valued data. Our main objective is to estimate a regression curve from a sample of graph Laplacian matrices conditional on a set of Euclidean covariates, for example in dynamic networks where the covariate is time. We develop an adapted Nadaraya-Watson estimator which has uniform weak consistency for estimation using Euclidean and power Euclidean metrics. We apply the methodology to the Enron email corpus to model smooth trends in monthly networks and highlight anomalous networks. Another motivating application is given in corpus linguistics, which explores trends in an author's writing style over time based on word co-occurrence networks.
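
To illustrate the kernel-smoothing idea in the abstract, here is a minimal Python sketch. It is not the authors' implementation. It assumes the Euclidean metric, under which a kernel-weighted Fréchet mean reduces to a kernel-weighted average of the graph Laplacian matrices; the power Euclidean case is not covered. The function names, the Gaussian kernel, the bandwidth parameter, and the toy networks are all assumptions made for illustration.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Return L = D - A for a symmetric weighted adjacency matrix A."""
    adjacency = np.asarray(adjacency, dtype=float)
    return np.diag(adjacency.sum(axis=1)) - adjacency


def gaussian_kernel(u):
    """Unnormalised Gaussian kernel (an assumed choice of kernel)."""
    return np.exp(-0.5 * u ** 2)


def nadaraya_watson_laplacian(t_query, times, laplacians, bandwidth):
    """Kernel-weighted average of Laplacians at covariate value t_query.

    times:      shape (n,), covariate values (e.g. months)
    laplacians: shape (n, m, m), one graph Laplacian per observation
    bandwidth:  positive smoothing parameter (hypothetical name)
    """
    times = np.asarray(times, dtype=float)
    laplacians = np.asarray(laplacians, dtype=float)
    weights = gaussian_kernel((times - t_query) / bandwidth)
    weights = weights / weights.sum()
    # Contract the weight vector against the first axis of the stack.
    return np.tensordot(weights, laplacians, axes=1)


# Toy example: three 3-node networks observed at times 0, 1, 2.
adjacencies = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, 1, 1], [1, 0, 0], [1, 0, 0]],
    [[0, 0, 1], [0, 0, 2], [1, 2, 0]],
]
times = [0.0, 1.0, 2.0]
laplacians = np.array([graph_laplacian(a) for a in adjacencies])

estimate = nadaraya_watson_laplacian(1.5, times, laplacians, bandwidth=0.5)
print(estimate)
```

Because the weights are non-negative and sum to one, the estimate is a convex combination of graph Laplacians, so it stays symmetric with zero row sums and non-positive off-diagonal entries. One simple way to flag an anomalous network, again as an assumption rather than the paper's procedure, would be to compare each observed Laplacian with the fitted value at its own time point using the Frobenius norm.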
