From 11008d691efb4b5c968cf146f46d6c907b925ec7 Mon Sep 17 00:00:00 2001 From: st-- Date: Sat, 9 Jan 2021 17:05:33 +0000 Subject: [PATCH] Documentation improvements (inc. lengthscale explanation) and Matern12Kernel alias (#213) * various edits for clarity and typos * remove reference to not-yet-implemented feature (#38) * adds Matern12Kernel as alias for ExponentialKernel (in line with the explicitly defined Matern32Kernel and Matern52Kernel) and gives all aliases docstrings * incorporates the lengthscales explanation from #212. Co-authored-by: David Widmann --- docs/make.jl | 2 +- docs/src/create_kernel.md | 10 +-- docs/src/index.md | 2 +- docs/src/kernels.md | 71 +++++++++++--------- docs/src/metrics.md | 15 +++-- docs/src/transform.md | 10 +-- docs/src/userguide.md | 93 ++++++++++++++------------ src/KernelFunctions.jl | 2 +- src/basekernels/constant.jl | 4 +- src/basekernels/exponential.jl | 42 ++++++++++-- src/basekernels/matern.jl | 2 + src/basekernels/piecewisepolynomial.jl | 7 +- src/transform/lineartransform.jl | 1 - 13 files changed, 157 insertions(+), 104 deletions(-) diff --git a/docs/make.jl b/docs/make.jl index 619a9f102..0970c5722 100644 --- a/docs/make.jl +++ b/docs/make.jl @@ -16,7 +16,7 @@ makedocs( "User Guide" => "userguide.md", "Examples"=>"example.md", "Kernel Functions"=>"kernels.md", - "Transform"=>"transform.md", + "Input Transforms"=>"transform.md", "Metrics"=>"metrics.md", "Theory"=>"theory.md", "Custom Kernels"=>"create_kernel.md", diff --git a/docs/src/create_kernel.md b/docs/src/create_kernel.md index 049b6e0a3..488632935 100644 --- a/docs/src/create_kernel.md +++ b/docs/src/create_kernel.md @@ -2,9 +2,9 @@ KernelFunctions.jl contains the most popular kernels already but you might want to make your own! 
-Here are a few ways depending on how complicated your kernel is : +Here are a few ways depending on how complicated your kernel is: -### SimpleKernel for kernels function depending on a metric +### SimpleKernel for kernel functions depending on a metric If your kernel function is of the form `k(x, y) = f(d(x, y))` where `d(x, y)` is a `PreMetric`, you can construct your custom kernel by defining `kappa` and `metric` for your kernel. @@ -20,7 +20,7 @@ KernelFunctions.metric(::MyKernel) = SqEuclidean() ### Kernel for more complex kernels If your kernel does not satisfy such a representation, all you need to do is define `(k::MyKernel)(x, y)` and inherit from `Kernel`. -For example we recreate here the `NeuralNetworkKernel` +For example, we recreate here the `NeuralNetworkKernel`: ```julia struct MyKernel <: KernelFunctions.Kernel end @@ -28,7 +28,7 @@ struct MyKernel <: KernelFunctions.Kernel end (::MyKernel)(x, y) = asin(dot(x, y) / sqrt((1 + sum(abs2, x)) * (1 + sum(abs2, y)))) ``` -Note that `BaseKernel` do not use `Distances.jl` and can therefore be a bit slower. +Note that the fallback implementation of the base `Kernel` evaluation does not use `Distances.jl` and can therefore be a bit slower. ### Additional Options @@ -37,7 +37,7 @@ Finally there are additional functions you can define to bring in more features: - `KernelFunctions.dim(x::MyDataType)`: by default the dimension of the inputs will only be checked for vectors of type `AbstractVector{<:Real}`. If you want to check the dimensionality of your inputs, dispatch the `dim` function on your datatype. Note that `0` is the default. - `dim` is called within `KernelFunctions.validate_inputs(x::MyDataType, y::MyDataType)`, which can instead be directly overloaded if you want to run special checks for your input types. - `kernelmatrix(k::MyKernel, ...)`: you can redefine the diverse `kernelmatrix` functions to eventually optimize the computations. 
- - `Base.print(io::IO, k::MyKernel)`: if you want to specialize the printing of your kernel + - `Base.print(io::IO, k::MyKernel)`: if you want to specialize the printing of your kernel. KernelFunctions uses [Functors.jl](https://github.com/FluxML/Functors.jl) for specifying trainable kernel parameters in a way that is compatible with the [Flux ML framework](https://github.com/FluxML/Flux.jl). diff --git a/docs/src/index.md b/docs/src/index.md index 61bd94887..719e6bb79 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -1,6 +1,6 @@ # KernelFunctions.jl -Model agnostic kernel functions compatible with automatic differentiation +Model-agnostic kernel functions compatible with automatic differentiation **KernelFunctions.jl** is a general purpose kernel package. It aims at providing a flexible framework for creating kernels and manipulating them. diff --git a/docs/src/kernels.md b/docs/src/kernels.md index cd535a002..492437727 100644 --- a/docs/src/kernels.md +++ b/docs/src/kernels.md @@ -4,7 +4,7 @@ # Base Kernels -These are the basic kernels without any transformation of the data. They are the building blocks of KernelFunctions +These are the basic kernels without any transformation of the data. They are the building blocks of KernelFunctions. ## Constant Kernels @@ -86,21 +86,20 @@ The [`FBMKernel`](@ref) is defined as k(x,x';h) = \frac{|x|^{2h} + |x'|^{2h} - |x-x'|^{2h}}{2}, ``` -where $h$ is the [Hurst index](https://en.wikipedia.org/wiki/Hurst_exponent#Generalized_exponent) and $00 $ is the lengthscale and $p_i>0$ is the period. +where $l_i > 0$ is the lengthscale and $p_i > 0$ is the period. -## Matern Kernels +## Matérn Kernels -### Matern Kernel +### General Matérn Kernel The [`MaternKernel`](@ref) is defined as @@ -110,7 +109,15 @@ The [`MaternKernel`](@ref) is defined as where $\nu > 0$. 
-### Matern 3/2 Kernel +### Matérn 1/2 Kernel + +The Matérn 1/2 kernel is defined as +```math + k(x,x') = \exp\left(-|x-x'|\right), +``` +equivalent to the Exponential kernel. `Matern12Kernel` is an alias for [`ExponentialKernel`](@ref). + +### Matérn 3/2 Kernel The [`Matern32Kernel`](@ref) is defined as @@ -118,7 +125,7 @@ The [`Matern32Kernel`](@ref) is defined as k(x,x') = \left(1+\sqrt{3}|x-x'|\right)\exp\left(\sqrt{3}|x-x'|\right). ``` -### Matern 5/2 Kernel +### Matérn 5/2 Kernel The [`Matern52Kernel`](@ref) is defined as @@ -128,7 +135,7 @@ The [`Matern52Kernel`](@ref) is defined as ## Neural Network Kernel -The [`NeuralNetworkKernel`](@ref) (as in the kernel for an infinitely wide neural network interpretated as a Gaussian process) is defined as +The [`NeuralNetworkKernel`](@ref) (as in the kernel for an infinitely wide neural network interpreted as a Gaussian process) is defined as ```math k(x, x') = \arcsin\left(\frac{\langle x, x'\rangle}{\sqrt{(1+\langle x, x\rangle)(1+\langle x',x'\rangle)}}\right). @@ -142,19 +149,23 @@ The [`PeriodicKernel`](@ref) is defined as k(x,x';r) = \exp\left(-0.5 \sum_i (sin (π(x_i - x'_i))/r_i)^2\right), ``` -where $r$ has the same dimension as $x$ and $r_i >0$. +where $r$ has the same dimension as $x$ and $r_i > 0$. 
## Piecewise Polynomial Kernel -The [`PiecewisePolynomialKernel`](@ref) is defined as - +The [`PiecewisePolynomialKernel`](@ref) is defined for $x, x'\in \mathbb{R}^D$, a positive-definite matrix $P \in \mathbb{R}^{D \times D}$, and $V \in \{0,1,2,3\}$ as ```math - k(x,x'; P, V) =& \max(1 - r, 0)^{j + V} f(r, j),\\ - r =& x^\top P x',\\ - j =& \lfloor \frac{D}{2}\rfloor + V + 1, + k(x,x'; P, V) = \max(1 - \sqrt{x^\top P x'}, 0)^{j + V} f_V(\sqrt{x^\top P x'}, j), +``` +where $j = \lfloor \frac{D}{2}\rfloor + V + 1$, and $f_V$ are polynomials defined as follows: +```math +\begin{aligned} + f_0(r, j) &= 1, \\ + f_1(r, j) &= 1 + (j + 1) r, \\ + f_2(r, j) &= 1 + (j + 2) r + ((j^2 + 4j + 3) / 3) r^2, \\ + f_3(r, j) &= 1 + (j + 3) r + ((6 j^2 + 36j + 45) / 15) r^2 + ((j^3 + 9 j^2 + 23j + 15) / 15) r^3. +\end{aligned} ``` -where $x\in \mathbb{R}^D$, $V \in \{0,1,2,3\} and $P$ is a positive definite matrix. -$f$ is a piecewise polynomial (see source code). ## Polynomial Kernels @@ -166,7 +177,7 @@ The [`LinearKernel`](@ref) is defined as k(x,x';c) = \langle x,x'\rangle + c, ``` -where $c \in \mathbb{R}$ +where $c \in \mathbb{R}$. ### Polynomial Kernel @@ -176,7 +187,7 @@ The [`PolynomialKernel`](@ref) is defined as k(x,x';c,d) = \left(\langle x,x'\rangle + c\right)^d, ``` -where $c \in \mathbb{R}$ and $d>0$ +where $c \in \mathbb{R}$ and $d>0$. ## Rational Quadratic @@ -223,35 +234,33 @@ where $i\in\{-1,0,1,2,3\}$ and coefficients $a_i$, $b_i$ are fixed and residuals ### Transformed Kernel -The [`TransformedKernel`](@ref) is a kernel where input are transformed via a function `f` +The [`TransformedKernel`](@ref) is a kernel where inputs are transformed via a function `f`: ```math - k(x,x';f,\widetile{k}) = \widetilde{k}(f(x),f(x')), + k(x,x';f,\widetilde{k}) = \widetilde{k}(f(x),f(x')), ``` - -Where $\widetilde{k}$ is another kernel and $f$ is an arbitrary mapping. +where $\widetilde{k}$ is another kernel and $f$ is an arbitrary mapping. 
 ### Scaled Kernel The [`ScaledKernel`](@ref) is defined as ```math - k(x,x';\sigma^2,\widetilde{k}) = \sigma^2\widetilde{k}(x,x') + k(x,x';\sigma^2,\widetilde{k}) = \sigma^2\widetilde{k}(x,x') , ``` - -Where $\widetilde{k}$ is another kernel and $\sigma^2 > 0$. +where $\widetilde{k}$ is another kernel and $\sigma^2 > 0$. ### Kernel Sum -The [`KernelSum`](@ref) is defined as a sum of kernels +The [`KernelSum`](@ref) is defined as a sum of kernels: ```math k(x, x'; \{k_i\}) = \sum_i k_i(x, x'). ``` -### KernelProduct +### Kernel Product -The [`KernelProduct`](@ref) is defined as a product of kernels +The [`KernelProduct`](@ref) is defined as a product of kernels: ```math k(x,x';\{k_i\}) = \prod_i k_i(x,x'). @@ -259,7 +268,7 @@ The [`KernelProduct`](@ref) is defined as a product of kernels ### Tensor Product -The [`TensorProduct`](@ref) is defined as : +The [`TensorProduct`](@ref) is defined as: ```math k(x,x';\{k_i\}) = \prod_i k_i(x_i,x'_i) diff --git a/docs/src/metrics.md b/docs/src/metrics.md index 644fcb2a0..12260adbc 100644 --- a/docs/src/metrics.md +++ b/docs/src/metrics.md @@ -1,16 +1,19 @@ # Metrics -KernelFunctions.jl relies on [Distances.jl](https://github.com/JuliaStats/Distances.jl) for computing the pairwise matrix. -To do so a distance measure is needed for each kernel. Two very common ones can already be used : `SqEuclidean` and `Euclidean`. -However all kernels do not rely on distances metrics respecting all the definitions. That's why additional metrics come with the package such as `DotProduct` (`<x,y>`) and `Delta` (`δ(x,y)`). -Note that every `SimpleKernel` must have a defined metric defined as : +`SimpleKernel` implementations rely on [Distances.jl](https://github.com/JuliaStats/Distances.jl) for efficiently computing the pairwise matrix. +This requires a distance measure or metric, such as the commonly used `SqEuclidean` and `Euclidean`. 
+ +The metric used by a given kernel type is specified as ```julia - KernelFunctions.metric(::CustomKernel) = SqEuclidean() +KernelFunctions.metric(::CustomKernel) = SqEuclidean() ``` +However, there are kernels that can be implemented efficiently using "metrics" that do not respect all the definitions expected by Distances.jl. For this reason, KernelFunctions.jl provides additional "metrics" such as `DotProduct` ($\langle x, y \rangle$) and `Delta` ($\delta(x,y)$). + + ## Adding a new metric -If you want to create a new distance just implement the following : +If you want to create a new "metric" just implement the following: ```julia struct Delta <: Distances.PreMetric diff --git a/docs/src/transform.md b/docs/src/transform.md index ca927f5e9..31e4e1732 100644 --- a/docs/src/transform.md +++ b/docs/src/transform.md @@ -1,9 +1,9 @@ -# Transform +# Input Transforms `Transform` is the object that takes care of transforming the input data before distances are being computed. It can be as standard as `IdentityTransform` returning the same input, or multiplying the data by a scalar with `ScaleTransform` or by a vector with `ARDTransform`. -There is a more general `Transform`: `FunctionTransform` that uses a function and apply it on each vector via `mapslices`. -You can also create a pipeline of `Transform` via `TransformChain`. For example `LowRankTransform(rand(10,5))∘ScaleTransform(2.0)`. +There is a more general `Transform`: `FunctionTransform` that uses a function and applies it on each vector via `mapslices`. +You can also create a pipeline of `Transform` via `TransformChain`. For example, `LowRankTransform(rand(10,5))∘ScaleTransform(2.0)`. -One apply a transformation on a matrix or a vector via `KernelFunctions.apply(t::Transform,v::AbstractVecOrMat)` +A transformation `t` can be applied to a matrix or a vector `v` via `KernelFunctions.apply(t, v)`. 
-Check the list on the [API page](@ref Transforms) +Check the full list of provided transforms on the [API page](@ref Transforms). diff --git a/docs/src/userguide.md b/docs/src/userguide.md index 59bdda9ec..39fe55b23 100644 --- a/docs/src/userguide.md +++ b/docs/src/userguide.md @@ -2,88 +2,95 @@ ## Kernel creation -To create a kernel chose one of the kernels proposed, see [Base Kernels](@ref), or create your own, see [Creating your own kernel](@ref) -For example to create a square exponential kernel +To create a kernel object, choose one of the pre-implemented kernels, see [Base Kernels](@ref), or create your own, see [Creating your own kernel](@ref). +For example, a squared exponential kernel is created by ```julia k = SqExponentialKernel() ``` -Instead of having lengthscale(s) for each kernel we use `Transform` objects (see [Transform](@ref)) which are directly going to act on the inputs before passing them to the kernel. -For example to premultiply the input by 2.0 we create the kernel the following options are possible -```julia - k = transform(SqExponentialKernel(),ScaleTransform(2.0)) # returns a TransformedKernel - k = @kernel SqExponentialKernel() l=2.0 # Will be available soon - k = TransformedKernel(SqExponentialKernel(),ScaleTransform(2.0)) -``` -Check the [`Transform`](@ref) page to see the other options. -To premultiply the kernel by a variance, you can use `*` or create a `ScaledKernel` -```julia - k = 3.0*SqExponentialKernel() - k = ScaledKernel(SqExponentialKernel(),3.0) - @kernel 3.0*SqExponentialKernel() -``` + +!!! tip "How do I set the lengthscale?" + Instead of having lengthscale(s) for each kernel we use [`Transform`](@ref) objects which act on the inputs before passing them to the kernel. Note that the transforms such as [`ScaleTransform`](@ref) and [`ARDTransform`](@ref) _multiply_ the input by a scale factor, which corresponds to the _inverse_ of the lengthscale. 
+ For example, a lengthscale of 0.5 is equivalent to premultiplying the input by 2.0, and you can create the corresponding kernel as follows: + ```julia + k = transform(SqExponentialKernel(), ScaleTransform(2.0)) + k = transform(SqExponentialKernel(), 2.0) # implicitly constructs a ScaleTransform(2.0) + ``` + Check the [Input Transforms](@ref) page for more details. The API documentation contains an [overview of all available transforms](@ref Transforms). + +!!! tip "How do I set the kernel variance?" + To premultiply the kernel by a variance, you can use `*` with a scalar number: + ```julia + k = 3.0 * SqExponentialKernel() + ``` ## Using a kernel function -To compute the kernel function on two vectors you can call +To evaluate the kernel function on two vectors you simply call the kernel object: ```julia k = SqExponentialKernel() x1 = rand(3) x2 = rand(3) - k(x1,x2) + k(x1, x2) ``` ## Creating a kernel matrix Kernel matrices can be created via the `kernelmatrix` function or `kerneldiagmatrix` for only the diagonal. -An important argument to give is the dimensionality of the input `obsdim`. It tells if the matrix is of the type `# samples X # features` (`obsdim`=1) or `# features X # samples`(`obsdim`=2) (similarly to [Distances.jl](https://github.com/JuliaStats/Distances.jl)) +An important argument to give is the data layout of the input `obsdim`. It specifies whether the number of observed data points is along the first dimension (`obsdim=1`, i.e. the matrix shape is number of samples times number of features) or along the second dimension (`obsdim=2`, i.e. the matrix shape is number of features times number of samples), similarly to [Distances.jl](https://github.com/JuliaStats/Distances.jl). If not given explicitly, `obsdim` defaults to [`defaultobs`](@ref). 
For example: ```julia k = SqExponentialKernel() - A = rand(10,5) - kernelmatrix(k,A,obsdim=1) # Return a 10x10 matrix - kernelmatrix(k,A,obsdim=2) # Return a 5x5 matrix - k(A,obsdim=1) # Syntactic sugar + A = rand(10, 5) + kernelmatrix(k, A, obsdim=1) # returns a 10x10 matrix + kernelmatrix(k, A, obsdim=2) # returns a 5x5 matrix + k(A, obsdim=1) # Syntactic sugar ``` -We also support specific kernel matrices outputs: +We also support specific kernel matrix outputs: - For a positive-definite matrix object`PDMat` from [`PDMats.jl`](https://github.com/JuliaStats/PDMats.jl), you can call the following: ```julia using PDMats k = SqExponentialKernel() - K = kernelpdmat(k,A,obsdim=1) # PDMat + K = kernelpdmat(k, A, obsdim=1) # PDMat ``` -It will create a matrix and in case of bad conditionning will add some diagonal noise until the matrix is considered PSD, it will then return a `PDMat` object. For this method to work in your code you need to include `using PDMats` first +It will create a matrix and in case of bad conditioning will add some diagonal noise until the matrix is considered positive-definite; it will then return a `PDMat` object. For this method to work in your code you need to include `using PDMats` first. - For a Kronecker matrix, we rely on [`Kronecker.jl`](https://github.com/MichielStock/Kronecker.jl). Here are two examples: ```julia using Kronecker -x = range(0,1,length=10) -y = range(0,1,length=50) -K = kernelkronmat(k,[x,y]) # Kronecker matrix -K = kernelkronmat(k,x,5) # Kronecker matrix +x = range(0, 1, length=10) +y = range(0, 1, length=50) +K = kernelkronmat(k, [x, y]) # Kronecker matrix +K = kernelkronmat(k, x, 5) # Kronecker matrix ``` -Make sure that `k` is a vector compatible with such constructions (with `iskroncompatible`). Both method will return a . 
For those methods to work in your code you need to include `using Kronecker` first -- For a Nystrom approximation : `kernelmatrix(nystrom(k, X, ρ, obsdim = 1))` where `ρ` is the proportion of sampled used. +Make sure that `k` is a kernel compatible with such constructions (with `iskroncompatible(k)`). Both methods will return a Kronecker matrix. For those methods to work in your code you need to include `using Kronecker` first. +- For a Nystrom approximation: `kernelmatrix(nystrom(k, X, ρ, obsdim=1))` where `ρ` is the fraction of data samples used in the approximation. ## Composite kernels -One can create combinations of kernels via `KernelSum` and `KernelProduct` or using simple operators `+` and `*`. -For example : +Sums and products of kernels are also valid kernels. They can be created via `KernelSum` and `KernelProduct` or using simple operators `+` and `*`. +For example: ```julia k1 = SqExponentialKernel() k2 = Matern32Kernel() - k = 0.5 * k1 + 0.2 * k2 # KernelSum - k = k1 * k2 # KernelProduct + k = 0.5 * k1 + 0.2 * k2 # KernelSum + k = k1 * k2 # KernelProduct ``` -## Kernel Parameters +## Kernel parameters -What if you want to differentiate through the kernel parameters? Even in a highly nested structure such as : +What if you want to differentiate through the kernel parameters? This is easy even in a highly nested structure such as: ```julia - k = transform(0.5*SqExponentialKernel()*MaternKernel()+0.2*(transform(LinearKernel(),2.0)+PolynomialKernel()),[0.1,0.5]) + k = transform( + 0.5 * SqExponentialKernel() * Matern12Kernel() + + 0.2 * (transform(LinearKernel(), 2.0) + PolynomialKernel()), + [0.1, 0.5]) ``` -One can get the array of parameters to optimize via `params` from `Flux.jl` - +One can access the named tuple of trainable parameters via `Functors.functor` from `Functors.jl`. 
+This means that in practice you can implicitly optimize the kernel parameters by calling: ```julia - using Flux - params(k) +using Flux +kernelparams = Flux.params(k) +Flux.gradient(kernelparams) do + # ... some loss function on the kernel .... +end ``` diff --git a/src/KernelFunctions.jl b/src/KernelFunctions.jl index eeb57a9c7..58eaecc6e 100644 --- a/src/KernelFunctions.jl +++ b/src/KernelFunctions.jl @@ -28,7 +28,7 @@ export SqExponentialKernel, RBFKernel, GaussianKernel, SEKernel export LaplacianKernel, ExponentialKernel, GammaExponentialKernel export ExponentiatedKernel export FBMKernel -export MaternKernel, Matern32Kernel, Matern52Kernel +export MaternKernel, Matern12Kernel, Matern32Kernel, Matern52Kernel export LinearKernel, PolynomialKernel export RationalQuadraticKernel, GammaRationalQuadraticKernel export MahalanobisKernel, GaborKernel, PiecewisePolynomialKernel diff --git a/src/basekernels/constant.jl b/src/basekernels/constant.jl index db7f94870..99d796cdd 100644 --- a/src/basekernels/constant.jl +++ b/src/basekernels/constant.jl @@ -1,11 +1,11 @@ """ ZeroKernel() -Create a kernel that always returning zero +Create a kernel that always returns zero ``` κ(x,y) = 0.0 ``` -The output type depends of `x` and `y` +The output type depends on `x` and `y` """ struct ZeroKernel <: SimpleKernel end diff --git a/src/basekernels/exponential.jl b/src/basekernels/exponential.jl index 1a9c38a34..8797a4536 100644 --- a/src/basekernels/exponential.jl +++ b/src/basekernels/exponential.jl @@ -6,8 +6,7 @@ The squared exponential kernel is a Mercer kernel given by the formula: κ(x, y) = exp(-‖x - y‖² / 2) ``` Can also be called via `RBFKernel`, `GaussianKernel` or `SEKernel`. -See also [`ExponentialKernel`](@ref) for a -related form of the kernel or [`GammaExponentialKernel`](@ref) for a generalization. +See [`GammaExponentialKernel`](@ref) for a generalization. 
""" struct SqExponentialKernel <: SimpleKernel end @@ -20,10 +19,29 @@ iskroncompatible(::SqExponentialKernel) = true Base.show(io::IO,::SqExponentialKernel) = print(io,"Squared Exponential Kernel") ## Aliases ## + +""" + RBFKernel() + +See [`SqExponentialKernel`](@ref) +""" const RBFKernel = SqExponentialKernel + +""" + GaussianKernel() + +See [`SqExponentialKernel`](@ref) +""" const GaussianKernel = SqExponentialKernel + +""" + SEKernel() + +See [`SqExponentialKernel`](@ref) +""" const SEKernel = SqExponentialKernel + """ ExponentialKernel() @@ -31,6 +49,8 @@ The exponential kernel is a Mercer kernel given by the formula: ``` κ(x,y) = exp(-‖x-y‖) ``` +Can also be called via `LaplacianKernel` or `Matern12Kernel`. +See [`GammaExponentialKernel`](@ref) for a generalization. """ struct ExponentialKernel <: SimpleKernel end @@ -42,9 +62,23 @@ iskroncompatible(::ExponentialKernel) = true Base.show(io::IO, ::ExponentialKernel) = print(io, "Exponential Kernel") -## Alias ## +## Aliases ## + +""" + LaplacianKernel() + +See [`ExponentialKernel`](@ref) +""" const LaplacianKernel = ExponentialKernel +""" + Matern12Kernel() + +See [`ExponentialKernel`](@ref) +""" +const Matern12Kernel = ExponentialKernel + + """ GammaExponentialKernel(; γ = 2.0) @@ -53,7 +87,7 @@ The γ-exponential kernel [1] is an isotropic Mercer kernel given by the formula κ(x,y) = exp(-‖x-y‖^γ) ``` Where `γ > 0`, (the keyword `γ` can be replaced by `gamma`) -For `γ = 2`, see `SqExponentialKernel` and `γ = 1`, see `ExponentialKernel`. +For `γ = 2`, see [`SqExponentialKernel`](@ref); for `γ = 1`, see [`ExponentialKernel`](@ref). [1] - Gaussian Processes for Machine Learning, Carl Edward Rasmussen and Christopher K. I. Williams, MIT Press, 2006. 
diff --git a/src/basekernels/matern.jl b/src/basekernels/matern.jl index e6b8ee8ea..89c1741b1 100644 --- a/src/basekernels/matern.jl +++ b/src/basekernels/matern.jl @@ -33,6 +33,8 @@ metric(::MaternKernel) = Euclidean() Base.show(io::IO, κ::MaternKernel) = print(io, "Matern Kernel (ν = ", first(κ.ν), ")") +## Matern12Kernel = ExponentialKernel aliased in exponential.jl + """ Matern32Kernel() diff --git a/src/basekernels/piecewisepolynomial.jl b/src/basekernels/piecewisepolynomial.jl index baf788348..c487efdc4 100644 --- a/src/basekernels/piecewisepolynomial.jl +++ b/src/basekernels/piecewisepolynomial.jl @@ -2,19 +2,18 @@ PiecewisePolynomialKernel{V}(maha::AbstractMatrix) Piecewise Polynomial covariance function with compact support, V = 0,1,2,3. -The kernel functions are 2v times continuously differentiable and the corresponding -processes are hence v times mean-square differentiable. The kernel function is: +The kernel functions are 2V times continuously differentiable and the corresponding +processes are hence V times mean-square differentiable. The kernel function is: ```math κ(x, y) = max(1 - r, 0)^(j + V) * f(r, j) with j = floor(D / 2) + V + 1 ``` where `r` is the Mahalanobis distance mahalanobis(x,y) with `maha` as the metric. - """ struct PiecewisePolynomialKernel{V, A<:AbstractMatrix{<:Real}} <: SimpleKernel maha::A j::Int function PiecewisePolynomialKernel{V}(maha::AbstractMatrix{<:Real}) where V - V in (0, 1, 2, 3) || error("Invalid paramter v=$(V). Should be 0, 1, 2 or 3.") + V in (0, 1, 2, 3) || error("Invalid parameter V=$(V). Should be 0, 1, 2 or 3.") LinearAlgebra.checksquare(maha) j = div(size(maha, 1), 2) + V + 1 return new{V,typeof(maha)}(maha, j) diff --git a/src/transform/lineartransform.jl b/src/transform/lineartransform.jl index dd1e4db1b..a500da0a9 100644 --- a/src/transform/lineartransform.jl +++ b/src/transform/lineartransform.jl @@ -9,7 +9,6 @@ The second dimension of `A` must match the number of features of the target. 
```julia-repl julia> A = rand(10, 5) - julia> tr = LinearTransform(A) ``` """