Comparison of Gaussian process software

From Wikipedia, the free encyclopedia

Revision as of 22:07, 1 September 2021

This is a comparison of statistical analysis software that supports inference with Gaussian processes, often using approximations.

This article is written from the point of view of Bayesian statistics, whose terminology may differ from the one commonly used in kriging. The next section clarifies the mathematical and computational meaning of the information provided in the table, independently of contextual terminology.

Description of columns

This section details the meaning of the columns in the table below.

Solvers

These columns are about the algorithms used to solve the linear system defined by the prior covariance matrix, i.e., the matrix obtained by evaluating the kernel on the input points.

  • Exact: whether generic exact algorithms are implemented. These algorithms are usually appropriate only up to a few thousand datapoints.
  • Specialized: whether specialized exact algorithms for specific classes of problems are implemented. Supported specialized algorithms may be indicated as:
    • Kronecker: algorithms for separable kernels on grid data.[1]
    • Toeplitz: algorithms for stationary kernels on uniformly spaced data.[2]
    • Semisep.: algorithms for semiseparable covariance matrices.[3]
    • Sparse: algorithms optimized for sparse covariance matrices.
    • Block: algorithms optimized for block diagonal covariance matrices.
  • Approximate: whether generic or specialized approximate algorithms are implemented. Supported approximate algorithms may be indicated as:
    • Sparse: algorithms based on choosing a set of "inducing points" in input space.[4]
    • Hierarchical: algorithms which approximate the covariance matrix with a hierarchical matrix.[5]
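As an illustration of the role of an exact solver, the following sketch (kernel, data, and hyperparameter values are hypothetical) performs exact Gaussian process regression with a Cholesky factorization of the prior covariance matrix; the O(n³) cost of this factorization is what limits exact methods to a few thousand datapoints.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def rbf_kernel(x1, x2, scale=1.0):
    # Prior covariance k(x, x') = exp(-(x - x')^2 / (2 scale^2)),
    # evaluated on all pairs of points.
    return np.exp(-0.5 * ((x1[:, None] - x2[None, :]) / scale) ** 2)

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)             # training inputs
y = np.sin(x) + 0.1 * rng.normal(size=50)   # noisy observations
xs = np.linspace(0, 10, 200)                # test inputs

# Covariance matrix of the data: kernel matrix plus noise variance.
K = rbf_kernel(x, x) + 0.1**2 * np.eye(len(x))
L = cho_factor(K)                 # the O(n^3) factorization step
alpha = cho_solve(L, y)           # solve K alpha = y
mean = rbf_kernel(xs, x) @ alpha  # posterior mean at the test inputs
```

Specialized and approximate solvers replace the factorization step with structure-exploiting or approximate algorithms while leaving the rest of the computation unchanged.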

Input

These columns are about the points on which the Gaussian process is evaluated, i.e., the points x if the process is f(x).

  • ND: whether multidimensional input is supported. If it is, multidimensional output is always possible by adding a dimension to the input, even without direct support.
  • Non-real: whether arbitrary non-real input is supported (for example, text or complex numbers).
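To illustrate the remark about multidimensional output: it can be emulated by appending an output-index coordinate to each input point, so that a single-output process over the augmented input space represents all outputs jointly. A hypothetical sketch:

```python
import numpy as np

# Two outputs f_0(x), f_1(x) observed at the same scalar inputs.
x = np.array([0.0, 1.0, 2.0])
y = np.array([[0.0, 1.0],   # (f_0(0), f_1(0))
              [2.0, 3.0],   # (f_0(1), f_1(1))
              [4.0, 5.0]])  # (f_0(2), f_1(2))

# Single-output dataset on 2D inputs (x, output index):
X_aug = np.array([(xi, j) for xi in x for j in range(2)])
y_aug = y.reshape(-1)

# Row k of X_aug is a pair (x, j) and y_aug[k] holds f_j(x); any GP
# software supporting 2D input can now fit both outputs jointly, with
# the kernel's dependence on j encoding the inter-output correlation.
```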

Output

These columns are about the values yielded by the process, and how they are connected to the data used in the fit.

  • Likelihood: whether arbitrary non-Gaussian likelihoods are supported.
  • Errors: whether arbitrary non-uniform correlated errors on datapoints are supported for the Gaussian likelihood. Errors can always be handled manually by adding a kernel component; this column is about the possibility of manipulating them separately from the kernel. Partial error support may be indicated as:
    • iid: the datapoints must be independent and identically distributed.
    • Uncorrelated: the datapoints must be independent, but can have different distributions.
    • Stationary: the datapoints can be correlated, but the covariance matrix must be a Toeplitz matrix, in particular this implies that the variances must be uniform.
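For instance, with a software that lacks direct error support, known Gaussian errors can be handled manually by adding their covariance to the kernel matrix; the matrices below are hypothetical and only illustrate the cases listed above:

```python
import numpy as np

n = 4
K = np.eye(n) + 0.5   # a hypothetical prior covariance matrix

# "iid": same variance on every datapoint;
# "Uncorrelated": a different variance per datapoint;
# general case: an arbitrary error covariance matrix.
iid = 0.1**2 * np.eye(n)
uncorrelated = np.diag([0.1, 0.2, 0.1, 0.3]) ** 2
correlated = 0.01 * np.array([[2.0, 1.0, 0.0, 0.0],
                              [1.0, 2.0, 1.0, 0.0],
                              [0.0, 1.0, 2.0, 1.0],
                              [0.0, 0.0, 1.0, 2.0]])

# With a Gaussian likelihood, the errors simply add their covariance
# to the prior covariance before solving:
K_total = K + correlated
```

Software with direct error support exposes these matrices as a separate argument instead of requiring them to be folded into the kernel.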

Hyperparameters

These columns are about finding values for variables that enter the definition of the specific problem but cannot be inferred by the Gaussian process fit, for example parameters in the formula of the kernel.

  • Prior: whether specifying arbitrary hyperpriors on the hyperparameters is supported.
  • Posterior: whether estimating the posterior is supported beyond point estimation, possibly in conjunction with other software.

If both the "Prior" and "Posterior" cells contain "Manually", the software provides an interface for computing the marginal likelihood and its gradient w.r.t. hyperparameters, which can be fed into an optimization/sampling algorithm, e.g., gradient descent or Markov chain Monte Carlo.
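A minimal sketch of this manual route (kernel, data, and hyperparameter are hypothetical): the negative log marginal likelihood of a Gaussian-likelihood model is passed to a generic optimizer to produce a point estimate of the kernel scale; sampling it with Markov chain Monte Carlo instead would yield a posterior.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
x = np.linspace(0, 5, 30)
y = np.sin(x) + 0.1 * rng.normal(size=30)

def neg_log_marginal_likelihood(log_scale):
    # Gaussian likelihood: y ~ N(0, K + sigma^2 I), so the marginal
    # likelihood is available in closed form.
    scale = np.exp(log_scale)
    K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / scale) ** 2)
    C = K + 0.1**2 * np.eye(len(x))
    _, logdet = np.linalg.slogdet(C)
    return 0.5 * (logdet + y @ np.linalg.solve(C, y))

# Any optimizer works here; gradient-based ones would use the gradient
# interface mentioned above.
res = minimize_scalar(neg_log_marginal_likelihood,
                      bounds=(-3, 3), method="bounded")
scale_hat = np.exp(res.x)
```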

Linear transformations

These columns are about the possibility of fitting datapoints simultaneously to a process and to linear transformations of it.

  • Deriv.: whether it is possible to take an arbitrary number of derivatives, up to the maximum allowed by the smoothness of the kernel, for any differentiable kernel. Partial support may be indicated as a maximum derivative order or as an implementation restricted to some kernels. Integrals can be obtained indirectly from derivatives.
  • Finite: whether finite arbitrary linear transformations are allowed on the specified datapoints.
  • Sum: whether it is possible to sum various kernels and access separately the processes corresponding to each addend. It is a particular case of finite linear transformation but it is listed separately because it is a common feature.
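The "Sum" feature can be reproduced manually in any software exposing kernel matrices, because for f = f1 + f2 with independent addends the posterior mean of f1 given data y is K1 (K1 + K2 + E)^(-1) y, where E is the error covariance. A hypothetical sketch:

```python
import numpy as np

def rbf(x1, x2, scale):
    return np.exp(-0.5 * ((x1[:, None] - x2[None, :]) / scale) ** 2)

x = np.linspace(0, 10, 60)
y = np.sin(3 * x) + 0.05 * x             # hypothetical data

K1 = rbf(x, x, 0.5)                      # short-scale addend f1
K2 = 4.0 * rbf(x, x, 5.0)                # long-scale addend f2
C = K1 + K2 + 0.1**2 * np.eye(len(x))    # total covariance + noise

alpha = np.linalg.solve(C, y)

# Posterior means of each addend and of the sum at the training points:
mean_f1 = K1 @ alpha
mean_f2 = K2 @ alpha
mean_f = mean_f1 + mean_f2               # equals (K1 + K2) @ alpha
```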

Comparison table

(Columns Exact, Specialized and Approximate refer to Solvers; ND and Non-real to Input; Likelihood and Errors to Output; Prior and Posterior to Hyperparameters; Deriv., Finite and Sum to Linear transformations.)

Name | License | Language | Exact | Specialized | Approximate | ND | Non-real | Likelihood | Errors | Prior | Posterior | Deriv. | Finite | Sum
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
PyMC3 | Apache | Python | Yes | Kronecker | Sparse | ND | No | Any | Correlated | Yes | Yes | No | Yes | Yes
Stan | BSD, GPL | custom | Yes | No | No | ND | No | Any | Correlated | Yes | Yes | No | Yes | Yes
scikit-learn | BSD | Python | Yes | No | No | ND | Yes | Bernoulli | Uncorrelated | Manually | Manually | No | No | No
GPvecchia[6] | GNU GPL | R | Yes | No | Sparse, Hierarchical | ND | No | Exponential family | Uncorrelated | No | No | No | No | No
GpGp | MIT | R | No | No | Sparse | ND | No | Gaussian | iid | Manually | Manually | No | No | No
GPy[7] | BSD | Python | Yes | No | Sparse | ND | No | Many | Uncorrelated | Yes | Yes | No | No | No
pyGPs[8] | BSD | Python | Yes | No | Sparse | ND | Graphs, Manually | Bernoulli | iid | Manually | Manually | No | No | No
GPyTorch[9] | MIT | Python | Yes | No | Sparse | ND | No | Bernoulli | No | | | First | RBF |
GPML[10][11] | BSD | MATLAB | Yes | No | Sparse | ND | No | Many | iid | Manually | Manually | No | No | No
fbm[11] | Free | C | Yes | No | No | ND | No | Bernoulli, Poisson | Uncorrelated, Stationary | Many | Yes | No | |
gptk[12] | BSD | R | Yes | Block? | Sparse | ND | No | Gaussian | No | Manually | Manually | No | No | No
SuperGauss | GNU GPL | R, C++ | No | Toeplitz[a] | No | 1D | No | Gaussian | No | Manually | Manually | No | No | No
celerite[3] | MIT | Python, Julia, C++ | No | Semisep.[b] | No | 1D | No | Gaussian | Uncorrelated | Manually | Manually | No | No |
george[5] | MIT | Python, C++ | Yes | No | Hierarchical | ND | No | Gaussian | Uncorrelated | Manually | Manually | No | No | Manually
neural-tangents[13][c] | Apache | Python | Yes | Block, Kronecker | No | ND | No | Gaussian | No | No | No | No | No | No
STK | GNU GPL | MATLAB | Yes | No | No | ND | No | Gaussian | Uncorrelated | Manually | Manually | No | No | Manually
UQLab[14] | Proprietary | MATLAB | | | | | | | | | | | |
ooDACE[15] | Proprietary | MATLAB | | | | ND | No | | | | | | |
GPstuff[11] | GNU GPL | MATLAB, R | Yes | No | Sparse | ND | No | Many | Many | Yes | | First | RBF |
GSTools | GNU LGPL | Python | Yes | No | No | ND | No | Gaussian | No | No | No | No | No | No
GPR | Apache | C++ | Yes | No | Sparse | ND | No | Gaussian | iid | Some, Manually | Manually | First | No | No
PyKrige | BSD | Python | | | | 2D, 3D | No | | | | | | |
GPflow[7] | Apache | Python | Yes | No | Sparse | | | Many | | Yes | Yes | | |

Notes

  1. ^ SuperGauss implements a superfast Toeplitz solver with computational complexity O(N log² N).
  2. ^ celerite implements only a specific subalgebra of kernels which can be solved in O(N) time.[3]
  3. ^ neural-tangents is a specialized package for infinitely wide neural networks.
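As an illustration of the Toeplitz structure exploited by note 1 (this is SciPy's Levinson-based solver, not SuperGauss's own superfast algorithm, and the kernel is hypothetical): a stationary kernel on uniformly spaced points yields a Toeplitz matrix, which can be solved in O(N²) time and O(N) extra storage instead of the O(N³) of a generic dense solve.

```python
import numpy as np
from scipy.linalg import solve_toeplitz, toeplitz

# Stationary kernel on uniformly spaced points: K[i, j] depends only on
# |i - j|, so the whole matrix is determined by its first column.
x = np.arange(100) * 0.1
first_col = np.exp(-0.5 * x**2)   # hypothetical RBF kernel, first column
first_col[0] += 0.01              # nugget (noise variance) on the diagonal

b = np.sin(x)
alpha = solve_toeplitz(first_col, b)   # Levinson recursion: O(N^2) time

# Same result as the generic dense O(N^3) solve:
alpha_dense = np.linalg.solve(toeplitz(first_col), b)
```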

References

  1. ^ P. Cunningham, John; Gilboa, Elad; Saatçi, Yunus (Feb 2015). "Scaling Multidimensional Inference for Structured Gaussian Processes". IEEE Transactions on Pattern Analysis and Machine Intelligence. 37 (2): 424–436. doi:10.1109/TPAMI.2013.192. PMID 26353252. S2CID 6878550.
  2. ^ Leith, D. J.; Zhang, Yunong; Leithead, W. E. (2005). "Time-series Gaussian Process Regression Based on Toeplitz Computation of O(N²) Operations and O(N)-level Storage". Proceedings of the 44th IEEE Conference on Decision and Control: 3711–3716. doi:10.1109/CDC.2005.1582739. S2CID 13627455.
  3. ^ a b c Foreman-Mackey, Daniel; Angus, Ruth; Agol, Eric; Ambikasaran, Sivaram (9 November 2017). "Fast and Scalable Gaussian Process Modeling with Applications to Astronomical Time Series". The Astronomical Journal. 154 (6): 220. arXiv:1703.09710. Bibcode:2017AJ....154..220F. doi:10.3847/1538-3881/aa9332. S2CID 88521913.
  4. ^ Quiñonero-Candela, Joaquin; Rasmussen, Carl Edward (5 December 2005). "A Unifying View of Sparse Approximate Gaussian Process Regression". Journal of Machine Learning Research. 6: 1939–1959. Retrieved 23 May 2020.
  5. ^ a b Ambikasaran, S.; Foreman-Mackey, D.; Greengard, L.; Hogg, D. W.; O’Neil, M. (1 Feb 2016). "Fast Direct Methods for Gaussian Processes". IEEE Transactions on Pattern Analysis and Machine Intelligence. 38 (2): 252–265. arXiv:1403.6015. doi:10.1109/TPAMI.2015.2448083. PMID 26761732. S2CID 15206293.
  6. ^ Zilber, Daniel; Katzfuss, Matthias (January 2021). "Vecchia–Laplace approximations of generalized Gaussian processes for big non-Gaussian spatial data". Computational Statistics & Data Analysis. 153. doi:10.1016/j.csda.2020.107081. ISSN 0167-9473. Retrieved 1 September 2021.
  7. ^ a b Matthews, Alexander G. de G.; van der Wilk, Mark; Nickson, Tom; Fujii, Keisuke; Boukouvalas, Alexis; León-Villagrá, Pablo; Ghahramani, Zoubin; Hensman, James (April 2017). "GPflow: A Gaussian process library using TensorFlow". Journal of Machine Learning Research. 18 (40): 1–6. arXiv:1610.08733. Retrieved 6 July 2020.
  8. ^ Neumann, Marion; Huang, Shan; E. Marthaler, Daniel; Kersting, Kristian (2015). "pyGPs -- A Python Library for Gaussian Process Regression and Classification". Journal of Machine Learning Research. 16: 2611–2616.
  9. ^ Gardner, Jacob R; Pleiss, Geoff; Bindel, David; Weinberger, Kilian Q; Wilson, Andrew Gordon (2018). "GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration" (PDF). Advances in Neural Information Processing Systems. 31: 7576–7586. arXiv:1809.11165. Retrieved 23 May 2020.
  10. ^ Rasmussen, Carl Edward; Nickisch, Hannes (Nov 2010). "Gaussian processes for machine learning (GPML) toolbox". Journal of Machine Learning Research. 11 (2): 3011–3015.
  11. ^ a b c Vanhatalo, Jarno; Riihimäki, Jaakko; Hartikainen, Jouni; Jylänki, Pasi; Tolvanen, Ville; Vehtari, Aki (Apr 2013). "GPstuff: Bayesian Modeling with Gaussian Processes". Journal of Machine Learning Research. 14: 1175–1179. Retrieved 23 May 2020.
  12. ^ Kalaitzis, Alfredo; Lawrence, Neil D. (May 20, 2011). "A Simple Approach to Ranking Differentially Expressed Gene Expression Time Courses through Gaussian Process Regression". BMC Bioinformatics. 12 (1): 180. doi:10.1186/1471-2105-12-180. ISSN 1471-2105. Retrieved 1 September 2021.
  13. ^ Novak, Roman; Xiao, Lechao; Hron, Jiri; Lee, Jaehoon; Alemi, Alexander A.; Sohl-Dickstein, Jascha; Schoenholz, Samuel S. (2020). "Neural Tangents: Fast and Easy Infinite Neural Networks in Python". International Conference on Learning Representations. arXiv:1912.02803.
  14. ^ Marelli, Stefano; Sudret, Bruno (2014). "UQLab: a framework for uncertainty quantification in MATLAB" (PDF). Vulnerability, Uncertainty, and Risk. Quantification, Mitigation, and Management: 2554–2563. doi:10.3929/ethz-a-010238238. Retrieved 28 May 2020.
  15. ^ Couckuyt, Ivo; Dhaene, Tom; Demeester, Piet (2014). "ooDACE toolbox: a flexible object-oriented Kriging implementation" (PDF). Journal of Machine Learning Research. 15: 3183–3186. Retrieved 8 July 2020.