Implementing a rotation method
If you wish to implement your own factor rotation method or extend this package, you can do so in two ways:
- Implementing a rotation method without specifying the gradient
- Implementing a rotation method with gradients
In the following guide we will walk through both ways of implementing a rotation method. As an example we will reimplement Quartimax
, which minimizes
\[Q = \sum_p \sum_k \lambda_{pk}^4\]
where $p$ and $k$ are the row and column indices of factor loading matrix $\Lambda$ and $\lambda_{pk}$ are the entries of the factor loading matrix.
The first step to a custom implementation is to define a new struct
for the rotation method. FactorRotations.jl requires that all rotation methods inherit from RotationMethod
. One must also specify whether the new method can be used for orthogonal rotation, oblique rotation, or both. For orthogonal rotation it is required that T <: RotationMethod{Orthogonal}
. Oblique rotations must satisfy T <: RotationMethod{Oblique}
. Methods that can be used for both orthogonal and oblique rotation are defined T{RT} <: RotationMethod{RT}
.
Since Quartimax is an orthogonal rotation method, we define it as such.
julia> using FactorRotations
julia> struct MyQuartimax <: RotationMethod{Orthogonal} end
Defining the rotation quality criterion
The easiest way to define a rotation method is to implement the criterion_and_gradient!(::Nothing, ...)
function, which calculates the rotation quality criterion. nothing
fixed as the first argument specifies that this implementation does not calculate the criterion gradient.
julia> import FactorRotations: criterion_and_gradient!
julia> function criterion_and_gradient!(::Nothing, method::MyQuartimax, Λ::AbstractMatrix)
return -sum(Λ .^ 4)
end;
julia> criterion(MyQuartimax(), ones(10, 2))
-20.0
Since the algorithm in this package minimizes the criterion value, we have to make sure to return -sum(...)
instead of the original criterion for Quartimax.
FactorRotations.jl will apply Automatic Differentiation to derive the gradient for your quality criterion and use it during rotation optimization.
By default, FactorRotations.jl would use Enzyme.jl as an autodiff engine, but it could be changed with FactorRotations.set_autodiff_backend
. To enable autodiff in FactorRotations.jl, the corresponding autodiff package should be loaded first:
julia> using Enzyme
julia> grad = fill(NaN, 10, 2);
julia> criterion_and_gradient!(grad, MyQuartimax(), ones(10, 2))
-20.0
julia> grad
10×2 Matrix{Float64}:
-4.0 -4.0
-4.0 -4.0
-4.0 -4.0
-4.0 -4.0
-4.0 -4.0
-4.0 -4.0
-4.0 -4.0
-4.0 -4.0
-4.0 -4.0
-4.0 -4.0
julia> L = [
0.830 -0.396
0.818 -0.469
0.777 -0.470
0.798 -0.401
0.786 0.500
0.672 0.458
0.594 0.444
0.647 0.333
];
julia> L_rotated = rotate(L, MyQuartimax())
FactorRotation{Float64} with loading matrix:
8×2 Matrix{Float64}:
0.898755 0.194824
0.933943 0.129749
0.902131 0.103864
0.876508 0.171284
0.315572 0.876476
0.251123 0.773489
0.198007 0.714678
0.307857 0.659334
Checking against the Quartimax
implementation shows that the results are approximately equal.
julia> L_reference = rotate(L, Quartimax())
FactorRotation{Float64} with loading matrix:
8×2 Matrix{Float64}:
0.898755 0.194823
0.933943 0.129748
0.902132 0.103864
0.876508 0.171284
0.315572 0.876476
0.251124 0.773489
0.198008 0.714678
0.307858 0.659334
julia> isapprox(loadings(L_rotated), loadings(L_reference), atol = 1e-5)
true
Defining the rotation quality gradient
When the gradient formula is available, criterion_and_gradient!
method could be modified to allow ∇Q::AbstractMatrix
as the first argument. In this case FactorRotations.jl will expect that the criterion_and_gradient!(∇Q, method, Λ)
call sets ∇Q
to the $∇Q(Λ)$ in-place and also returns $Q(Λ)$.
This variant of criterion_and_gradient!
allows reusing intermediate computations between the criterion and its gradient, as well as providing more efficient gradient calculation than the autodiff-based one.
Continuing the example of MyQuartimax
:
julia> import FactorRotations: criterion_and_gradient!
julia> function criterion_and_gradient!(∇Q::Union{AbstractMatrix, Nothing}, method::MyQuartimax, Λ::AbstractMatrix)
Q = -sum(Λ .^ 4)
if !isnothing(∇Q)
∇Q .= -Λ.^3
end
return Q
end;
User-defined criterion_and_gradient!
has priority over the default autodiff-based one:
julia> grad = fill(NaN, 10, 2);
julia> criterion_and_gradient!(grad, MyQuartimax(), ones(10, 2))
-20.0
julia> grad
10×2 Matrix{Float64}:
-1.0 -1.0
-1.0 -1.0
-1.0 -1.0
-1.0 -1.0
-1.0 -1.0
-1.0 -1.0
-1.0 -1.0
-1.0 -1.0
-1.0 -1.0
-1.0 -1.0
rotate
will now also use the custom criterion_and_gradient!
:
julia> L_rotated = rotate(L, MyQuartimax())
FactorRotation{Float64} with loading matrix:
8×2 Matrix{Float64}:
0.898755 0.194824
0.933943 0.129749
0.902131 0.103865
0.876508 0.171285
0.315572 0.876476
0.251123 0.773489
0.198007 0.714678
0.307857 0.659335
julia> isapprox(loadings(L_rotated), loadings(L_reference), atol = 1e-5)
true
Note that in our criterion_and_gradient!
method gradient calculation is optional and could be skipped by passing nothing
. This variant of the method is used by criterion
:
julia> criterion(MyQuartimax(), L)
-2.8829327011730004