From 690001c6264114d3921572f3cc019dda41d80e52 Mon Sep 17 00:00:00 2001
From: "Documenter.jl"
Date: Mon, 15 Apr 2024 15:40:21 +0000
Subject: [PATCH] build based on aecaae5

---
 previews/PR297/distributions/index.html |  2 +-
 previews/PR297/examples/index.html      | 92 ++++++++++++++-----------
 previews/PR297/index.html               |  2 +-
 previews/PR297/search/index.html        |  2 +-
 previews/PR297/transforms/index.html    | 20 +++---
 5 files changed, 66 insertions(+), 52 deletions(-)

diff --git a/previews/PR297/distributions/index.html b/previews/PR297/distributions/index.html
index 64afd43b..1f9da147 100644
--- a/previews/PR297/distributions/index.html
+++ b/previews/PR297/distributions/index.html
@@ -20,4 +20,4 @@
 dist: Beta{Float64}(α=2.0, β=2.0)
 transform: Bijectors.Logit{Float64}(0.0, 1.0)
)
julia> tdist isa UnivariateDistributiontrue

We can then compute the logpdf of the resulting distribution:

julia> # Some example values
-       x = rand(dist)0.4861689643053158
julia> y = tdist.transform(x)-0.05533826042475478
julia> logpdf(tdist, y)-0.9823602192137184
+ x = rand(dist)0.5781817614856105
julia> y = tdist.transform(x)0.31531377864527943
julia> logpdf(tdist, y)-1.0303360620857969

diff --git a/previews/PR297/examples/index.html b/previews/PR297/examples/index.html
index d0576bf2..c5b5dc50 100644
--- a/previews/PR297/examples/index.html
+++ b/previews/PR297/examples/index.html
@@ -5,69 +5,83 @@
)
julia> x = rand(rng, td) # ∈ (0, 1)0.3384404850130036

It's worth noting that support(Beta) is the closed interval [0, 1], while the constrained-to-unconstrained bijection, Logit in this case, is only well-defined as a map (0, 1) → ℝ for the open interval (0, 1). This is of course not an implementation detail. ℝ is itself open, thus no continuous bijection exists from a closed interval to ℝ. But since the boundaries of a closed interval have what's known as measure zero, this doesn't end up affecting the resulting density with support on the entire real line. In practice, this means that

julia> td = transformed(Beta())UnivariateTransformed{Beta{Float64}, Bijectors.Logit{Float64}}(
 dist: Beta{Float64}(α=1.0, β=1.0)
 transform: Bijectors.Logit{Float64}(0.0, 1.0)
-)
julia> inverse(td.transform)(rand(rng, td))0.8130302707446476

will never result in 0 or 1, though any sample arbitrarily close to either 0 or 1 is possible. Disclaimer: numerical accuracy is limited, so you might still see 0 or 1 if you're lucky.

Multivariate ADVI example

We can also do multivariate ADVI using the Stacked bijector. Stacked gives us a way to combine univariate and/or multivariate bijectors into a single multivariate bijector. Say you have a vector x of length 2 and you want to transform the first entry using Exp and the second entry using Log. Stacked gives you an easy and efficient way of representing such a bijector.

julia> using Bijectors: SimplexBijector
julia> # Original distributions - dists = (Beta(), InverseGamma(), Dirichlet(2, 3));
julia> # Construct the corresponding ranges - ranges = [];
julia> idx = 1;
julia> for i in 1:length(dists) +)
julia> inverse(td.transform)(rand(rng, td))0.8130302707446476

will never result in 0 or 1, though any sample arbitrarily close to either 0 or 1 is possible. Disclaimer: numerical accuracy is limited, so you might still see 0 or 1 if you're lucky.
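As a quick sanity check of that claim, a sketch reusing the rng and td defined above draws many samples and maps them back to the constrained space:

ys = rand(rng, td, 10_000)          # samples on the real line
xs = inverse(td.transform).(ys)     # mapped back into the support of Beta()
all(x -> 0 < x < 1, xs)             # expected to hold, up to numerical accuracy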

Multivariate ADVI example

We can also do multivariate ADVI using the Stacked bijector. Stacked gives us a way to combine univariate and/or multivariate bijectors into a single multivariate bijector. Say you have a vector x of length 2 and you want to transform the first entry using Exp and the second entry using Log. Stacked gives you an easy and efficient way of representing such a bijector.
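A minimal sketch of that two-entry example, using the elementwise exp and log functions in place of the older Exp and Log bijector types (names and values here are purely illustrative):

using Bijectors: Stacked, elementwise

b = Stacked((elementwise(exp), elementwise(log)), [1:1, 2:2])
x = [0.0, 1.0]
b(x)   # first entry exponentiated, second entry log-transformed: [1.0, 0.0]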

julia> using Bijectors: SimplexBijector
+       
+       # Original distributions
julia> dists = (Beta(), InverseGamma(), Dirichlet(2, 3));

julia> # Construct the corresponding ranges
julia> ranges = [];
julia> idx = 1;
julia> for i in 1:length(dists) d = dists[i] push!(ranges, idx:(idx + length(d) - 1)) global idx idx += length(d) - end;
julia> ranges3-element Vector{Any}: + end;
julia> ranges + + # Base distribution; mean-field normal3-element Vector{Any}: 1:1 2:2 - 3:4
julia> # Base distribution; mean-field normal - num_params = ranges[end][end]4
julia> d = MvNormal(zeros(num_params), ones(num_params));
julia> # Construct the transform - bs = bijector.(dists); # constrained-to-unconstrained bijectors for dists
julia> ibs = inverse.(bs); # invert, so we get unconstrained-to-constrained
julia> sb = Stacked(ibs, ranges) # => Stacked <: BijectorStacked(Any[Inverse{Bijectors.Logit{Float64}}(Bijectors.Logit{Float64}(0.0, 1.0)), Base.Fix1{typeof(broadcast), typeof(exp)}(broadcast, exp), Inverse{Bijectors.SimplexBijector}(Bijectors.SimplexBijector())], Any[1:1, 2:2, 3:4], Any[1:1, 2:2, 3:5])
julia> # Mean-field normal with unconstrained-to-constrained stacked bijector - td = transformed(d, sb);
julia> y = rand(td)5-element Vector{Float64}: - 0.5771946241605288 - 1.9359652721377536 - 0.49224220792733864 - 0.19470455754410665 - 0.3130532345285547
julia> 0.0 ≤ y[1] ≤ 1.0true
julia> 0.0 < y[2]true
julia> sum(y[3:4]) ≈ 1.0false

Normalizing flows

A very interesting application is that of normalizing flows.[1] Usually this is done by sampling from a multivariate normal distribution, and then transforming this to a target distribution using invertible neural networks. Currently there are two such transforms available in Bijectors.jl: PlanarLayer and RadialLayer. Let's create a flow with a single PlanarLayer:

julia> d = MvNormal(zeros(2), ones(2));
julia> b = PlanarLayer(2)PlanarLayer(w = [0.26436875120904013, 1.2765374549928816], u = [-1.19574270205447, -0.5523066081262475], b = [1.368563630624271])
julia> flow = transformed(d, b)MultivariateTransformed{DiagNormal, PlanarLayer{Vector{Float64}, Vector{Float64}}}( + 3:4
julia> num_params = ranges[end][end]4
julia> d = MvNormal(zeros(num_params), ones(num_params));

julia> # Construct the transform
julia> bs = bijector.(dists); # constrained-to-unconstrained bijectors for dists
julia> ibs = inverse.(bs); # invert, so we get unconstrained-to-constrained
julia> sb = Stacked(ibs, ranges) # => Stacked <: BijectorStacked(Any[Inverse{Bijectors.Logit{Float64}}(Bijectors.Logit{Float64}(0.0, 1.0)), Base.Fix1{typeof(broadcast), typeof(exp)}(broadcast, exp), Inverse{Bijectors.SimplexBijector}(Bijectors.SimplexBijector())], Any[1:1, 2:2, 3:4], Any[1:1, 2:2, 3:5])

julia> # Mean-field normal with unconstrained-to-constrained stacked bijector
julia> td = transformed(d, sb);
julia> y = rand(td)5-element Vector{Float64}: + 0.1116590754498559 + 0.8324155851108938 + 0.0718082056466437 + 0.6671597677422355 + 0.2610320266111208
julia> 0.0 ≤ y[1] ≤ 1.0true
julia> 0.0 < y[2]true
julia> sum(y[3:4]) ≈ 1.0false

Normalizing flows

A very interesting application is that of normalizing flows.[1] Usually this is done by sampling from a multivariate normal distribution, and then transforming this to a target distribution using invertible neural networks. Currently there are two such transforms available in Bijectors.jl: PlanarLayer and RadialLayer. Let's create a flow with a single PlanarLayer:
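For reference, a planar layer implements a map of the form f(z) = z + u * tanh(wᵀz + b). The following is a hand-rolled sketch of that formula only, separate from Bijectors' own PlanarLayer (which additionally reparametrizes u so the map stays invertible):

using LinearAlgebra: dot

# f(z) = z + u * tanh(w'z + b): perturbs z along u by an amount tanh(w'z + b)
planar(z, w, u, b) = z .+ u .* tanh(dot(w, z) + b)

planar([0.5, -0.2], [1.0, 0.0], [0.1, 0.1], 0.0)   # a small perturbation of z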

julia> d = MvNormal(zeros(2), ones(2));
julia> b = PlanarLayer(2)PlanarLayer(w = [1.2479476657829534, -0.4644416760036228], u = [-0.9380314151318037, -0.7626662224348569], b = [-0.4392260004833269])
julia> flow = transformed(d, b)MultivariateTransformed{DiagNormal, PlanarLayer{Vector{Float64}, Vector{Float64}}}( dist: DiagNormal( dim: 2 μ: [0.0, 0.0] Σ: [1.0 0.0; 0.0 1.0] ) -transform: PlanarLayer(w = [0.26436875120904013, 1.2765374549928816], u = [-1.19574270205447, -0.5523066081262475], b = [1.368563630624271]) +transform: PlanarLayer(w = [1.2479476657829534, -0.4644416760036228], u = [-0.9380314151318037, -0.7626662224348569], b = [-0.4392260004833269]) )
julia> flow isa MultivariateDistributiontrue

That's it. Now we can sample from it using rand and compute the logpdf, like any other Distribution.

julia> y = rand(rng, flow)2-element Vector{Float64}:
- -1.7493115386470126
-  0.1592574141606397
julia> logpdf(flow, y) # uses inverse of `b`-2.082306933199818

Similarly to the multivariate ADVI example, we could use Stacked to get a bounded flow:

julia> d = MvNormal(zeros(2), ones(2));
julia> ibs = inverse.(bijector.((InverseGamma(2, 3), Beta())));
julia> sb = stack(ibs...) # == Stacked(ibs) == Stacked(ibs, [i:i for i = 1:length(ibs)]ERROR: MethodError: no method matching length(::Inverse{Bijectors.Logit{Float64}}) + 0.06009887872540687 + 1.178225666425143
julia> logpdf(flow, y) # uses inverse of `b`-2.0368184925151436

Similarly to the multivariate ADVI example, we could use Stacked to get a bounded flow:

julia> d = MvNormal(zeros(2), ones(2));
julia> ibs = inverse.(bijector.((InverseGamma(2, 3), Beta())));
julia> sb = stack(ibs...) # == Stacked(ibs) == Stacked(ibs, [i:i for i = 1:length(ibs)]ERROR: MethodError: no method matching length(::Inverse{Bijectors.Logit{Float64}}) Closest candidates are: - length(!Matched::Union{Base.KeySet, Base.ValueIterator}) - @ Base abstractdict.jl:58 - length(!Matched::Union{LinearAlgebra.Adjoint{T, S}, LinearAlgebra.Transpose{T, S}} where {T, S}) - @ LinearAlgebra /opt/hostedtoolcache/julia/1.9.4/x64/share/julia/stdlib/v1.9/LinearAlgebra/src/adjtrans.jl:295 - length(!Matched::Union{SparseArrays.FixedSparseVector{Tv, Ti}, SparseArrays.SparseVector{Tv, Ti}} where {Tv, Ti}) - @ SparseArrays /opt/hostedtoolcache/julia/1.9.4/x64/share/julia/stdlib/v1.9/SparseArrays/src/sparsevector.jl:95 + length(!Matched::Core.Compiler.InstructionStream) + @ Base show.jl:2777 + length(!Matched::DataStructures.EnumerateAll) + @ DataStructures ~/.julia/packages/DataStructures/aD5vv/src/multi_dict.jl:96 + length(!Matched::Distributed.WorkerPool) + @ Distributed /opt/hostedtoolcache/julia/1.10.2/x64/share/julia/stdlib/v1.10/Distributed/src/workerpool.jl:139 ...
julia> b = sb ∘ PlanarLayer(2)ERROR: UndefVarError: `sb` not defined
julia> td = transformed(d, b);
julia> y = rand(rng, td)2-element Vector{Float64}: - 0.231562298721395 - 1.0048660635827271
julia> 0 < y[1]true
julia> 0 ≤ y[2] ≤ 1false

Want to fit the flow?

julia> using Zygote
julia> # Construct the flow. - b = PlanarLayer(2)PlanarLayer(w = [2.342767485641014, -0.907813242000753], u = [-0.5433838798826695, 0.4268768473805427], b = [-0.7578012634282671])
julia> # Convenient for extracting parameters and reconstructing the flow. - using Functors
julia> θs, reconstruct = Functors.functor(b);
julia> # Make the objective a `struct` to avoid capturing global variables. - struct NLLObjective{R,D,T} + 0.9017322532010843 + 0.8371543544788624
julia> 0 < y[1]true
julia> 0 ≤ y[2] ≤ 1true
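The stack call above throws a MethodError in this build; a sketch that builds the same bounded flow with the Stacked constructor directly (reusing the d, ibs, and rng defined above) might look like:

sb = Stacked(ibs, [1:1, 2:2])   # one unit range per scalar bijector
b  = sb ∘ PlanarLayer(2)
td = transformed(d, b)
y  = rand(rng, td)
0 < y[1]        # first component mapped to (0, ∞)
0 ≤ y[2] ≤ 1    # second component mapped to (0, 1)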

Want to fit the flow?

julia> using Zygote
+       
+       # Construct the flow.
julia> b = PlanarLayer(2)PlanarLayer(w = [1.278731373149864, 0.6315215462304843], u = [0.7885456660197679, 0.3443023807126602], b = [-0.8746030708306238])

julia> # Convenient for extracting parameters and reconstructing the flow.
julia> using Functors
julia> θs, reconstruct = Functors.functor(b);

julia> # Make the objective a `struct` to avoid capturing global variables.
julia> struct NLLObjective{R,D,T} reconstruct::R basedist::D data::T end
julia> function (obj::NLLObjective)(θs...) transformed_dist = transformed(obj.basedist, obj.reconstruct(θs)) return -sum(Base.Fix1(logpdf, transformed_dist), eachcol(obj.data)) - end
julia> # Some random data to estimate the density of. - xs = randn(2, 1000);
julia> # Construct the objective. - f = NLLObjective(reconstruct, MvNormal(2, 1), xs);
julia> # Initial loss. - @info "Initial loss: $(f(θs...))"[ Info: Initial loss: 3040.7394373205884
julia> # Train using gradient descent. - ε = 1e-3;
julia> for i in 1:100 + end + + # Some random data to estimate the density of.
julia> xs = randn(2, 1000);

julia> # Construct the objective.
julia> f = NLLObjective(reconstruct, MvNormal(2, 1), xs);

julia> # Initial loss.
julia> @info "Initial loss: $(f(θs...))" + + # Train using gradient descent.[ Info: Initial loss: 2894.0875044145787
julia> ε = 1e-3;
julia> for i in 1:100 ∇s = Zygote.gradient(f, θs...) θs = map(θs, ∇s) do θ, ∇ θ - ε .* ∇ end - end
julia> # Final loss - @info "Final loss: $(f(θs...))"[ Info: Final loss: 2878.918022593535
julia> # Very simple check to see if we learned something useful. - samples = rand(transformed(f.basedist, f.reconstruct(θs)), 1000);
julia> mean(eachcol(samples)) # ≈ [0, 0]2-element Vector{Float64}: - 0.03258439677513573 - 0.052160493127753095
julia> cov(samples; dims=2) # ≈ I2×2 Matrix{Float64}: - 0.869062 -0.0107859 - -0.0107859 0.992585

We can easily create more complex flows by simply doing PlanarLayer(10) ∘ PlanarLayer(10) ∘ RadialLayer(10) and so on.

+ end + + # Final loss
julia> @info "Finall loss: $(f(θs...))" + + # Very simple check to see if we learned something useful.[ Info: Finall loss: 2843.6647634691403
julia> samples = rand(transformed(f.basedist, f.reconstruct(θs)), 1000);
julia> mean(eachcol(samples)) # ≈ [0, 0]2-element Vector{Float64}: + 0.0070041854750248845 + -0.055628993181150804
julia> cov(samples; dims=2) # ≈ I2×2 Matrix{Float64}: + 0.950317 -0.00828724 + -0.00828724 1.0389

We can easily create more complex flows by simply doing PlanarLayer(10) ∘ PlanarLayer(10) ∘ RadialLayer(10) and so on.
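A sketch of such a deeper flow (the layer size 10 is illustrative):

d    = MvNormal(zeros(10), ones(10))
b    = PlanarLayer(10) ∘ PlanarLayer(10) ∘ RadialLayer(10)
flow = transformed(d, b)
rand(rng, flow)   # sample from the composed flow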

diff --git a/previews/PR297/index.html b/previews/PR297/index.html index 56194d3d..dbabd569 100644 --- a/previews/PR297/index.html +++ b/previews/PR297/index.html @@ -1,2 +1,2 @@ -Home · Bijectors

Bijectors.jl

This package implements a set of functions for transforming constrained random variables (e.g. simplexes, intervals) to Euclidean space. The 3 main functions implemented in this package are the link, invlink and logpdf_with_trans for a number of distributions. The distributions supported are:

  1. RealDistribution: Union{Cauchy, Gumbel, Laplace, Logistic, NoncentralT, Normal, NormalCanon, TDist},
  2. PositiveDistribution: Union{BetaPrime, Chi, Chisq, Erlang, Exponential, FDist, Frechet, Gamma, InverseGamma, InverseGaussian, Kolmogorov, LogNormal, NoncentralChisq, NoncentralF, Rayleigh, Weibull},
  3. UnitDistribution: Union{Beta, KSOneSided, NoncentralBeta},
  4. SimplexDistribution: Union{Dirichlet},
  5. PDMatDistribution: Union{InverseWishart, Wishart}, and
  6. TransformDistribution: Union{T, Truncated{T}} where T<:ContinuousUnivariateDistribution.

All exported names from the Distributions.jl package are reexported from Bijectors.

Bijectors.jl also provides a nice interface for working with these maps: composition, inversion, etc. The following table lists mathematical operations for a bijector and the corresponding code in Bijectors.jl.

Operation | Method | Automatic
b ↦ b⁻¹ | inverse(b) | ✓
(b₁, b₂) ↦ (b₁ ∘ b₂) | b₁ ∘ b₂ | ✓
(b₁, b₂) ↦ [b₁, b₂] | stack(b₁, b₂) | ✓
x ↦ b(x) | b(x) | ×
y ↦ b⁻¹(y) | inverse(b)(y) | ×
x ↦ log|det J(b, x)| | logabsdetjac(b, x) | AD
x ↦ b(x), log|det J(b, x)| | with_logabsdet_jacobian(b, x) | ✓
p ↦ q := b_* p | q = transformed(p, b) | ✓
y ∼ q | y = rand(q) | ✓
p ↦ b such that support(b_* p) = ℝᵈ | bijector(p) | ✓
(x ∼ p, b(x), log|det J(b, x)|, log q(y)) | forward(q) | ✓

In this table, b denotes a Bijector, J(b, x) denotes the Jacobian of b evaluated at x, b_* denotes the push-forward of p by b, and x ∼ p denotes x sampled from the distribution with density p.

The "Automatic" column in the table refers to whether or not you are required to implement the feature for a custom Bijector. "AD" refers to the fact that it can be implemented "automatically" using automatic differentiation.

+Home · Bijectors

Bijectors.jl

This package implements a set of functions for transforming constrained random variables (e.g. simplexes, intervals) to Euclidean space. The 3 main functions implemented in this package are the link, invlink and logpdf_with_trans for a number of distributions. The distributions supported are:

  1. RealDistribution: Union{Cauchy, Gumbel, Laplace, Logistic, NoncentralT, Normal, NormalCanon, TDist},
  2. PositiveDistribution: Union{BetaPrime, Chi, Chisq, Erlang, Exponential, FDist, Frechet, Gamma, InverseGamma, InverseGaussian, Kolmogorov, LogNormal, NoncentralChisq, NoncentralF, Rayleigh, Weibull},
  3. UnitDistribution: Union{Beta, KSOneSided, NoncentralBeta},
  4. SimplexDistribution: Union{Dirichlet},
  5. PDMatDistribution: Union{InverseWishart, Wishart}, and
  6. TransformDistribution: Union{T, Truncated{T}} where T<:ContinuousUnivariateDistribution.
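A minimal sketch of these three functions, using Beta as the example distribution (the numeric values are whatever rand returns):

using Bijectors   # re-exports Distributions

dist = Beta(2.0, 2.0)
x = rand(dist)                      # constrained draw in (0, 1)
y = link(dist, x)                   # mapped to the real line (logit for Beta)
invlink(dist, y) ≈ x                # round-trips back to the constrained value
logpdf_with_trans(dist, x, true)    # logpdf with the log-abs-det-Jacobian correction applied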

All exported names from the Distributions.jl package are reexported from Bijectors.

Bijectors.jl also provides a nice interface for working with these maps: composition, inversion, etc. The following table lists mathematical operations for a bijector and the corresponding code in Bijectors.jl.

Operation | Method | Automatic
b ↦ b⁻¹ | inverse(b) | ✓
(b₁, b₂) ↦ (b₁ ∘ b₂) | b₁ ∘ b₂ | ✓
(b₁, b₂) ↦ [b₁, b₂] | stack(b₁, b₂) | ✓
x ↦ b(x) | b(x) | ×
y ↦ b⁻¹(y) | inverse(b)(y) | ×
x ↦ log|det J(b, x)| | logabsdetjac(b, x) | AD
x ↦ b(x), log|det J(b, x)| | with_logabsdet_jacobian(b, x) | ✓
p ↦ q := b_* p | q = transformed(p, b) | ✓
y ∼ q | y = rand(q) | ✓
p ↦ b such that support(b_* p) = ℝᵈ | bijector(p) | ✓
(x ∼ p, b(x), log|det J(b, x)|, log q(y)) | forward(q) | ✓

In this table, b denotes a Bijector, J(b, x) denotes the Jacobian of b evaluated at x, b_* denotes the push-forward of p by b, and x ∼ p denotes x sampled from the distribution with density p.

The "Automatic" column in the table refers to whether or not you are required to implement the feature for a custom Bijector. "AD" refers to the fact that it can be implemented "automatically" using automatic differentiation.

diff --git a/previews/PR297/search/index.html b/previews/PR297/search/index.html index ce49e38b..131bf126 100644 --- a/previews/PR297/search/index.html +++ b/previews/PR297/search/index.html @@ -1,2 +1,2 @@ -Search · Bijectors


      diff --git a/previews/PR297/transforms/index.html b/previews/PR297/transforms/index.html index 05f693de..c1096fc6 100644 --- a/previews/PR297/transforms/index.html +++ b/previews/PR297/transforms/index.html @@ -5,10 +5,10 @@ 2.71828 2.71828 2.71828 2.71828
      julia> logabsdetjac(elementwise(exp), x)4.0
      julia> with_logabsdet_jacobian(elementwise(exp), x)([2.718281828459045 2.718281828459045; 2.718281828459045 2.718281828459045], 4.0)

      These methods also work nicely for compositions of transformations:

      julia> transform(elementwise(log ∘ exp), x)2×2 Matrix{Float64}:
        1.0  1.0
      - 1.0  1.0

      Unlike exp, some transformations have parameters affecting the resulting transformation they represent, e.g. Logit has two parameters a and b representing the lower- and upper-bound, respectively, of its domain:

      julia> using Bijectors: Logit
      julia> f = Logit(0.0, 1.0)Bijectors.Logit{Float64}(0.0, 1.0)
      julia> f(rand()) # takes us from `(0, 1)` to `(-∞, ∞)`2.1094834878499475

      User-facing methods

      Without mutation:

      Bijectors.transformFunction
      transform(b, x)

      Transform x using b, treating x as a single input.

      source
      Bijectors.logabsdetjacFunction
      logabsdetjac(b, x)

      Return log(abs(det(J(b, x)))), where J(b, x) is the jacobian of b at x.

      source
      with_logabsdet_jacobian

      With mutation:

      Bijectors.transform!Function
      transform!(b, x[, y])

      Transform x using b, storing the result in y.

      If y is not provided, x is used as the output.

      source
      Bijectors.logabsdetjac!Function
      logabsdetjac!(b, x[, logjac])

      Compute log(abs(det(J(b, x)))) and store the result in logjac, where J(b, x) is the jacobian of b at x.

      source
      Bijectors.with_logabsdet_jacobian!Function
      with_logabsdet_jacobian!(b, x[, y, logjac])

Compute transform(b, x) and logabsdetjac(b, x), storing the result in y and logjac, respectively.

      If y is not provided, then x will be used in its place.

      Defaults to calling with_logabsdet_jacobian(b, x) and updating y and logjac with the result.

      source

      Implementing a transformation

Any callable can be made into a bijector by providing an implementation of ChangesOfVariables.with_logabsdet_jacobian(b, x).

      You can also optionally implement transform and logabsdetjac to avoid redundant computations. This is usually only worth it if you expect transform or logabsdetjac to be used heavily without the other.

      Similarly with the mutable versions with_logabsdet_jacobian!, transform!, and logabsdetjac!.

      Working with Distributions.jl

      Bijectors.bijectorFunction
      bijector(d::Distribution)

      Returns the constrained-to-unconstrained bijector for distribution d.

      source
      Bijectors.transformedMethod
      transformed(d::Distribution)
      -transformed(d::Distribution, b::Bijector)

      Couples distribution d with the bijector b by returning a TransformedDistribution.

      If no bijector is provided, i.e. transformed(d) is called, then transformed(d, bijector(d)) is returned.

      source

      Utilities

      Bijectors.elementwiseFunction
      elementwise(f)

      Alias for Base.Fix1(broadcast, f).

      In the case where f::ComposedFunction, the result is Base.Fix1(broadcast, f.outer) ∘ Base.Fix1(broadcast, f.inner) rather than Base.Fix1(broadcast, f).

      source
      Bijectors.isinvertibleFunction
      isinvertible(t)

      Return true if t is invertible, and false otherwise.

      source
      Bijectors.isclosedformMethod
      isclosedform(b::Transform)::bool
      -isclosedform(b⁻¹::Inverse{<:Transform})::bool

      Returns true or false depending on whether or not evaluation of b has a closed-form implementation.

      Most transformations have closed-form evaluations, but there are cases where this is not the case. For example the inverse evaluation of PlanarLayer requires an iterative procedure to evaluate.

      source

      API

      Bijectors.TransformType

      Abstract type for a transformation.

      Implementing

A subtype of Transform should at least implement transform(b, x).

      If the Transform is also invertible:

      • Required:
        • Either of the following:
          • transform(::Inverse{<:MyTransform}, x): the transform for its inverse.
          • InverseFunctions.inverse(b::MyTransform): returns an existing Transform.
        • logabsdetjac: computes the log-abs-det jacobian factor.
      • Optional:
        • with_logabsdet_jacobian: transform and logabsdetjac combined. Useful in cases where we can exploit shared computation in the two.

      For the above methods, there are mutating versions which can optionally be implemented:

      source
      Bijectors.BijectorType

      Abstract type of a bijector, i.e. differentiable bijection with differentiable inverse.

      source
      Bijectors.InverseType
      inverse(b::Transform)
      -Inverse(b::Transform)

      A Transform representing the inverse transform of b.

      source

      Bijectors

      Bijectors.CorrBijectorType
      CorrBijector <: Bijector

      A bijector implementation of Stan's parametrization method for Correlation matrix: https://mc-stan.org/docs/2_23/reference-manual/correlation-matrix-transform-section.html

Basically, an unconstrained strictly upper triangular matrix y is transformed to a correlation matrix via the following readable but not that efficient form:

      K = size(y, 1)
      + 1.0  1.0

      Unlike exp, some transformations have parameters affecting the resulting transformation they represent, e.g. Logit has two parameters a and b representing the lower- and upper-bound, respectively, of its domain:

      julia> using Bijectors: Logit
      julia> f = Logit(0.0, 1.0)Bijectors.Logit{Float64}(0.0, 1.0)
      julia> f(rand()) # takes us from `(0, 1)` to `(-∞, ∞)`0.6188151324413063

      User-facing methods

      Without mutation:

      with_logabsdet_jacobian

      With mutation:

      Bijectors.transform!Function
      transform!(b, x[, y])

      Transform x using b, storing the result in y.

      If y is not provided, x is used as the output.

      source
      Bijectors.logabsdetjac!Function
      logabsdetjac!(b, x[, logjac])

      Compute log(abs(det(J(b, x)))) and store the result in logjac, where J(b, x) is the jacobian of b at x.

      source
      Bijectors.with_logabsdet_jacobian!Function
      with_logabsdet_jacobian!(b, x[, y, logjac])

Compute transform(b, x) and logabsdetjac(b, x), storing the result in y and logjac, respectively.

      If y is not provided, then x will be used in its place.

      Defaults to calling with_logabsdet_jacobian(b, x) and updating y and logjac with the result.

      source

      Implementing a transformation

Any callable can be made into a bijector by providing an implementation of ChangesOfVariables.with_logabsdet_jacobian(b, x).

      You can also optionally implement transform and logabsdetjac to avoid redundant computations. This is usually only worth it if you expect transform or logabsdetjac to be used heavily without the other.

      Similarly with the mutable versions with_logabsdet_jacobian!, transform!, and logabsdetjac!.
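As a minimal sketch of this, here is a hypothetical scale-and-shift transform (not part of Bijectors itself), assuming the ChangesOfVariables and InverseFunctions packages are available:

using ChangesOfVariables, InverseFunctions

struct ScaleShift end                    # y = 2x + 1, written as a plain callable
(::ScaleShift)(x::Real) = 2x + 1

struct InvScaleShift end                 # its inverse, x = (y - 1) / 2
(::InvScaleShift)(y::Real) = (y - 1) / 2

# The one required piece: value together with log|det J|; here the Jacobian is 2.
ChangesOfVariables.with_logabsdet_jacobian(b::ScaleShift, x::Real) = (b(x), log(2))
ChangesOfVariables.with_logabsdet_jacobian(b::InvScaleShift, y::Real) = (b(y), -log(2))

# Optional: wire up `inverse` so the pair is usable in both directions.
InverseFunctions.inverse(::ScaleShift) = InvScaleShift()
InverseFunctions.inverse(::InvScaleShift) = ScaleShift()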

      Working with Distributions.jl

      Bijectors.bijectorFunction
      bijector(d::Distribution)

      Returns the constrained-to-unconstrained bijector for distribution d.

      source
      Bijectors.transformedMethod
      transformed(d::Distribution)
      +transformed(d::Distribution, b::Bijector)

      Couples distribution d with the bijector b by returning a TransformedDistribution.

      If no bijector is provided, i.e. transformed(d) is called, then transformed(d, bijector(d)) is returned.

      source

      Utilities

      Bijectors.elementwiseFunction
      elementwise(f)

      Alias for Base.Fix1(broadcast, f).

      In the case where f::ComposedFunction, the result is Base.Fix1(broadcast, f.outer) ∘ Base.Fix1(broadcast, f.inner) rather than Base.Fix1(broadcast, f).

      source
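A small sketch of that behaviour (assuming Bijectors is loaded):

using Bijectors: elementwise

f = elementwise(log ∘ exp)
f isa Base.ComposedFunction   # true: Base.Fix1(broadcast, log) ∘ Base.Fix1(broadcast, exp)
f([1.0 2.0]) ≈ [1.0 2.0]      # log.(exp.(x)) recovers x
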
      Bijectors.isclosedformMethod
      isclosedform(b::Transform)::bool
      +isclosedform(b⁻¹::Inverse{<:Transform})::bool

      Returns true or false depending on whether or not evaluation of b has a closed-form implementation.

      Most transformations have closed-form evaluations, but there are cases where this is not the case. For example the inverse evaluation of PlanarLayer requires an iterative procedure to evaluate.

      source

      API

      Bijectors.TransformType

      Abstract type for a transformation.

      Implementing

A subtype of Transform should at least implement transform(b, x).

      If the Transform is also invertible:

      • Required:
        • Either of the following:
          • transform(::Inverse{<:MyTransform}, x): the transform for its inverse.
          • InverseFunctions.inverse(b::MyTransform): returns an existing Transform.
        • logabsdetjac: computes the log-abs-det jacobian factor.
      • Optional:
        • with_logabsdet_jacobian: transform and logabsdetjac combined. Useful in cases where we can exploit shared computation in the two.

      For the above methods, there are mutating versions which can optionally be implemented:

      source
      Bijectors.BijectorType

      Abstract type of a bijector, i.e. differentiable bijection with differentiable inverse.

      source
      Bijectors.InverseType
      inverse(b::Transform)
      +Inverse(b::Transform)

      A Transform representing the inverse transform of b.

      source

      Bijectors

      Bijectors.CorrBijectorType
      CorrBijector <: Bijector

      A bijector implementation of Stan's parametrization method for Correlation matrix: https://mc-stan.org/docs/2_23/reference-manual/correlation-matrix-transform-section.html

Basically, an unconstrained strictly upper triangular matrix y is transformed to a correlation matrix via the following readable but not that efficient form:

      K = size(y, 1)
       z = tanh.(y)
       
       for j=1:K, i=1:K
      @@ -32,12 +32,12 @@
       [w1'w1 w1'w2 ... w1'wn;
        w2'w1 w2'w2 ... w2'wn;
        ...
      -]

      The diagonal elements are given by wk'wk = 1, thus x is a correlation matrix.

      Every step is invertible, so this is a bijection(bijector).

Note: The implementation doesn't follow their "manageable expression" directly, because their equation seems wrong (7/30/2020). Instead it follows the definition above the "manageable expression", which is also described in the above doc.

      source
      Bijectors.LeakyReLUType
      LeakyReLU{T}(α::T) <: Bijector

      Defines the invertible mapping

      x ↦ x if x ≥ 0 else αx

      where α > 0.

      source
      Bijectors.StackedType
      Stacked(bs)
      +]

      The diagonal elements are given by wk'wk = 1, thus x is a correlation matrix.

      Every step is invertible, so this is a bijection(bijector).

Note: The implementation doesn't follow their "manageable expression" directly, because their equation seems wrong (7/30/2020). Instead it follows the definition above the "manageable expression", which is also described in the above doc.

      source
      Bijectors.LeakyReLUType
      LeakyReLU{T}(α::T) <: Bijector

      Defines the invertible mapping

      x ↦ x if x ≥ 0 else αx

      where α > 0.

      source
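A quick usage sketch (the choice α = 0.1 is arbitrary):

using Bijectors: LeakyReLU

b = LeakyReLU(0.1)
b(-2.0)                 # == -0.2
inverse(b)(-0.2)        # == -2.0, recovering the input
logabsdetjac(b, -2.0)   # == log(0.1) on the negative branch
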
      Bijectors.StackedType
      Stacked(bs)
       Stacked(bs, ranges)
       stack(bs::Bijector...)

      A Bijector which stacks bijectors together which can then be applied to a vector where bs[i]::Bijector is applied to x[ranges[i]]::UnitRange{Int}.

      Arguments

      • bs can be either a Tuple or an AbstractArray of 0- and/or 1-dimensional bijectors
        • If bs is a Tuple, implementations are type-stable using generated functions
        • If bs is an AbstractArray, implementations are not type-stable and use iterative methods
      • ranges needs to be an iterable consisting of UnitRange{Int}
        • length(bs) == length(ranges) needs to be true.

      Examples

      b1 = Logit(0.0, 1.0)
       b2 = identity
       b = stack(b1, b2)
      -b([0.0, 1.0]) == [b1(0.0), 1.0]  # => true
      source
      Bijectors.RationalQuadraticSplineType
      RationalQuadraticSpline{T} <: Bijector

      Implementation of the Rational Quadratic Spline flow [1].

      • Outside of the interval [minimum(widths), maximum(widths)], this mapping is given by the identity map.
      • Inside the interval it's given by a monotonic spline (i.e. monotonic polynomials connected at intermediate points) with endpoints fixed so as to continuously transform into the identity map.

      For the sake of efficiency, there are separate implementations for 0-dimensional and 1-dimensional inputs.

      Notes

      There are two constructors for RationalQuadraticSpline:

      • RationalQuadraticSpline(widths, heights, derivatives): it is assumed that widths,

heights, and derivatives satisfy the constraints that make this a valid bijector, i.e.

      • widths: monotonically increasing and length(widths) == K,
      • heights: monotonically increasing and length(heights) == K,
      • derivatives: non-negative and derivatives[1] == derivatives[end] == 1.
• RationalQuadraticSpline(widths, heights, derivatives, B): other than the lengths, no assumptions are made on parameters. Therefore we will transform the parameters s.t.:
      • widths_new ∈ [-B, B]ᴷ⁺¹, where K == length(widths),
      • heights_new ∈ [-B, B]ᴷ⁺¹, where K == length(heights),
• derivatives_new ∈ (0, ∞)ᴷ⁺¹ with derivatives_new[1] == derivatives_new[end] == 1, where (K - 1) == length(derivatives).

      Examples

      Univariate

      julia> using StableRNGs: StableRNG; rng = StableRNG(42);  # For reproducibility.
      +b([0.0, 1.0]) == [b1(0.0), 1.0]  # => true
      source
      Bijectors.RationalQuadraticSplineType
      RationalQuadraticSpline{T} <: Bijector

      Implementation of the Rational Quadratic Spline flow [1].

      • Outside of the interval [minimum(widths), maximum(widths)], this mapping is given by the identity map.
      • Inside the interval it's given by a monotonic spline (i.e. monotonic polynomials connected at intermediate points) with endpoints fixed so as to continuously transform into the identity map.

      For the sake of efficiency, there are separate implementations for 0-dimensional and 1-dimensional inputs.

      Notes

      There are two constructors for RationalQuadraticSpline:

      • RationalQuadraticSpline(widths, heights, derivatives): it is assumed that widths,

heights, and derivatives satisfy the constraints that make this a valid bijector, i.e.

      • widths: monotonically increasing and length(widths) == K,
      • heights: monotonically increasing and length(heights) == K,
      • derivatives: non-negative and derivatives[1] == derivatives[end] == 1.
• RationalQuadraticSpline(widths, heights, derivatives, B): other than the lengths, no assumptions are made on parameters. Therefore we will transform the parameters s.t.:
      • widths_new ∈ [-B, B]ᴷ⁺¹, where K == length(widths),
      • heights_new ∈ [-B, B]ᴷ⁺¹, where K == length(heights),
• derivatives_new ∈ (0, ∞)ᴷ⁺¹ with derivatives_new[1] == derivatives_new[end] == 1, where (K - 1) == length(derivatives).

      Examples

      Univariate

      julia> using StableRNGs: StableRNG; rng = StableRNG(42);  # For reproducibility.
       
       julia> using Bijectors: RationalQuadraticSpline
       
      @@ -74,7 +74,7 @@
       julia> b([-1., 5.])
       2-element Vector{Float64}:
        -1.5660106244288925
      -  5.0

      References

      [1] Durkan, C., Bekasov, A., Murray, I., & Papamakarios, G., Neural Spline Flows, CoRR, arXiv:1906.04032 [stat.ML], (2019).

      source
      Bijectors.CouplingType
      Coupling{F, M}(θ::F, mask::M)

      Implements a coupling-layer as defined in [1].

      Examples

      julia> using Bijectors: Shift, Coupling, PartitionMask, coupling, couple
      +  5.0

      References

      [1] Durkan, C., Bekasov, A., Murray, I., & Papamakarios, G., Neural Spline Flows, CoRR, arXiv:1906.04032 [stat.ML], (2019).

      source
      Bijectors.CouplingType
      Coupling{F, M}(θ::F, mask::M)

      Implements a coupling-layer as defined in [1].

      Examples

      julia> using Bijectors: Shift, Coupling, PartitionMask, coupling, couple
       
       julia> m = PartitionMask(3, [1], [2]); # <= going to use x[2] to parameterize transform of x[1]
       
      @@ -101,7 +101,7 @@
       Shift([2.0])
       
       julia> with_logabsdet_jacobian(cl, x)
      -([3.0, 2.0, 3.0], 0.0)

      References

      [1] Kobyzev, I., Prince, S., & Brubaker, M. A., Normalizing flows: introduction and ideas, CoRR, (), (2019).

      source
      Bijectors.NamedTransformType
      NamedTransform <: AbstractNamedTransform

      Wraps a NamedTuple of key -> Bijector pairs, implementing evaluation, inversion, etc.

      Examples

      julia> using Bijectors: NamedTransform, Scale
      +([3.0, 2.0, 3.0], 0.0)

      References

      [1] Kobyzev, I., Prince, S., & Brubaker, M. A., Normalizing flows: introduction and ideas, CoRR, (), (2019).

      source
      Bijectors.NamedTransformType
      NamedTransform <: AbstractNamedTransform

      Wraps a NamedTuple of key -> Bijector pairs, implementing evaluation, inversion, etc.

      Examples

      julia> using Bijectors: NamedTransform, Scale
       
       julia> b = NamedTransform((a = Scale(2.0), b = exp));
       
      @@ -111,7 +111,7 @@
       (a = 2.0, b = 1.0, c = 42.0)
       
       julia> (a = 2 * x.a, b = exp(x.b), c = x.c)
      -(a = 2.0, b = 1.0, c = 42.0)
      source
      Bijectors.NamedCouplingType
      NamedCoupling{target, deps, F} <: AbstractNamedTransform

      Implements a coupling layer for named bijectors.

      See also: Coupling

      Examples

      julia> using Bijectors: NamedCoupling, Scale
      +(a = 2.0, b = 1.0, c = 42.0)
      source
      Bijectors.NamedCouplingType
      NamedCoupling{target, deps, F} <: AbstractNamedTransform

      Implements a coupling layer for named bijectors.

      See also: Coupling

      Examples

      julia> using Bijectors: NamedCoupling, Scale
       
       julia> b = NamedCoupling(:b, (:a, :c), (a, c) -> Scale(a + c));
       
      @@ -121,4 +121,4 @@
       (a = 1.0, b = 8.0, c = 3.0)
       
       julia> (a = x.a, b = (x.a + x.c) * x.b, c = x.c)
      -(a = 1.0, b = 8.0, c = 3.0)
      source
      +(a = 1.0, b = 8.0, c = 3.0)source