论文标题

数值恒定识别的标准

Criteria for the numerical constant recognition

论文作者

Odrzywolek, Andrzej

论文摘要

在实验数学,数值分析,计算机代数系统,模型构建,机器学习,近似和数据压缩的许多领域中出现了对功能识别/近似的需求。最低估的方法之一是符号回归。在文章中,还采用了还原主义方法,将完整的问题减少到恒定功能,即纯数字(十进制,浮点)。但是,现有的解决方案由于缺乏区分随机公式,匹配或字面上的十进制膨胀和可能的“精确”(最佳)表达式匹配的固体标准困扰着。特别是,从未开发令人信服的搜索停止标准。在文章中,提供了从统计意义上工作的这样的标准。可以将识别过程视为(1)按照提高Kolmogorov复杂性K(2)随机过程的顺序列出所有公式,并具有适当的统计分布(3)小数线的压缩。这三种方法都非常一致,并且基本上为搜索的实际深度提供了相同的限制。测试的唯一公式计数不得超过1/Sigma,其中Sigma是目标常数的相对数值误差。除此之外,进一步的搜索是毫无意义的,因为在方法(1)的视图中,误差范围内的等效表达式数量呈指数增长。鉴于(2),随机匹配方法的概率1;鉴于(3)压缩比小于1。

The need for recognition/approximation of functions in terms of elementary functions/operations emerges in many areas of experimental mathematics, numerical analysis, computer algebra systems, model building, machine learning, approximation and data compression. One of the most underestimated methods is the symbolic regression. In the article, reductionist approach is applied, reducing full problem to constant functions, i.e, pure numbers (decimal, floating-point). However, existing solutions are plagued by lack of solid criteria distinguishing between random formula, matching approximately or literally decimal expansion and probable ''exact'' (the best) expression match in the sense of Occam's razor. In particular, convincing STOP criteria for search were never developed. In the article, such a criteria, working in statistical sense, are provided. Recognition process can be viewed as (1) enumeration of all formulas in order of increasing Kolmogorov complexity K (2) random process with appropriate statistical distribution (3) compression of a decimal string. All three approaches are remarkably consistent, and provide essentially the same limit for practical depth of search. Tested unique formulas count must not exceed 1/sigma, where sigma is relative numerical error of the target constant. Beyond that, further search is pointless, because, in the view of approach (1), number of equivalent expressions within error bounds grows exponentially; in view of (2), probability of random match approaches 1; in view of (3) compression ratio much smaller than 1.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源