Used in a flash pipeline to set the criterion for
determining whether a greedy fit or backfit has "converged."
flash_set_conv_crit(flash, fn = NULL, tol)
flash: A flash or flash_fit object.
fn: The convergence criterion function (see Details below). If
NULL, then only the tolerance parameter is updated (thus a
convergence criterion can be set at the beginning of a flash pipeline,
allowing the tolerance parameter to be updated at will without needing to
re-specify the convergence criterion each time). The default convergence
criterion, which is set when the flash object is initialized, is
flash_conv_crit_elbo_diff, which calculates the
difference in the variational lower bound or "ELBO" from one iteration to
the next.
tol: The tolerance parameter (see Details below). The default, which is
set when the flash object is initialized (see
flash_init), is \(np\sqrt{\epsilon}\), where \(n\) is the
number of rows in the dataset, \(p\) is the number of columns, and
\(\epsilon\) is equal to .Machine$double.eps.
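As a rough illustration of the default tolerance formula, the following base R sketch computes \(np\sqrt{\epsilon}\) for a stand-in matrix. The matrix Y and its dimensions are made up for the example; nothing here calls flashier itself.

```r
# Stand-in data: any numeric matrix with n rows and p columns would do.
set.seed(1)
Y <- matrix(rnorm(200 * 50), nrow = 200, ncol = 50)

# Default tolerance as described above: n * p * sqrt(machine epsilon).
default_tol <- nrow(Y) * ncol(Y) * sqrt(.Machine$double.eps)
default_tol  # about 1.49e-04 for a 200 x 50 matrix
```

Because the default scales with the size of the dataset, a fixed tol such as 1e-3 can be looser or tighter than the default depending on the values of \(n\) and \(p\).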
The flash object from argument flash, with the
new convergence criterion reflected in updates to the "internal"
flash_fit object. These settings will persist across
all subsequent calls to flash_xxx functions in the same
flash pipeline (unless flash_set_conv_crit is
called again within the same pipeline).
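The persistence described above might be used as in the following sketch, which sets a criterion once and later tightens only the tolerance. It assumes that package flashier is installed and that its gtex dataset is available; the particular tolerance values and the choice of Kmax are arbitrary.

```r
library(flashier)

fl <- flash_init(gtex) |>
  flash_set_verbose(0) |>
  # Set the criterion function and a loose tolerance for the greedy fit.
  flash_set_conv_crit(fn = flash_conv_crit_max_chg, tol = 1e-2) |>
  flash_greedy(Kmax = 2) |>
  # fn is left as NULL, so only the tolerance is updated; the criterion
  # set earlier in the pipeline continues to apply during the backfit.
  flash_set_conv_crit(tol = 1e-3) |>
  flash_backfit()
```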
Function flash_set_conv_crit can be used to customize
the convergence criterion for a flash object. This criterion
determines when to stop optimizing a newly added factor
(see flash_greedy) and when to stop backfitting
(flash_backfit). Note that flash_set_conv_crit does not set the
"convergence" criterion for flash_nullcheck, because most alternative
convergence criteria do not make sense in the context of a nullcheck
(for example, flash_conv_crit_max_chg_L would simply return
the maximum \(L^2\)-normalized loading for each set of loadings
\(\ell_{\cdot k}\)).
The criterion is defined by the function supplied as argument fn,
which must accept exactly three parameters,
curr, prev, and k. curr refers to the
flash_fit object from the current iteration; prev,
to the flash_fit object from the previous iteration;
and, if the iteration is a sequential backfitting iteration (that is, a
flash_backfit iteration with argument
extrapolate = FALSE), k identifies the factor/loadings pair
that is currently being updated (in all other cases, k is
NULL). The function must output a numeric value; if the value is
less than or equal to tol, then the fit is considered to have
"converged." The meaning of "convergence" here varies according to the
operation being performed.
In the greedy algorithm, fn simply compares the fit from
one iteration to the next. During a backfit, it similarly compares fits from
one iteration to the next, but it only considers the fit to have
converged when the value of fn over successive updates to
all factor/loadings pairs is less than or equal to tol. If,
for example, factor/loadings pairs \(1, \ldots, K\) are being
sequentially backfitted, then fits are compared before and
after the update to factor/loadings 1, before and after the update to
factor/loadings 2, and so on through factor/loadings \(K\),
and backfitting only terminates when fn returns a value less
than or equal to tol for all \(K\) updates.
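The termination rule for a sequential backfit can be illustrated with a toy base R sketch. This is not flashier's internal code: the helper backfit_pass_converged and the criterion values are hypothetical, and stand in for the values fn would return after each of \(K = 4\) updates in one backfitting pass.

```r
tol <- 1e-3

# Hypothetical helper: a pass counts as converged only when the value
# returned for every factor/loadings update is less than or equal to tol.
backfit_pass_converged <- function(crit_vals, tol) {
  all(crit_vals <= tol)
}

pass_1 <- c(5e-4, 2e-3, 1e-4, 8e-4)  # the update to pair 2 exceeds tol
pass_2 <- c(3e-4, 9e-4, 1e-4, 5e-4)  # all four updates are within tol

backfit_pass_converged(pass_1, tol)  # FALSE
backfit_pass_converged(pass_2, tol)  # TRUE
```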
Package flashier provides a number of functions that may be supplied
as convergence criteria: see
flash_conv_crit_elbo_diff (the default criterion),
flash_conv_crit_max_chg,
flash_conv_crit_max_chg_L, and
flash_conv_crit_max_chg_F. Custom functions may also be
defined. Typically, they will compare the fit in curr (the current
iteration) to the fit in prev (the previous iteration).
To facilitate working with flash_fit objects, package
flashier provides a number of accessors, which are enumerated in
the documentation for object flash_fit. Custom functions
should return a numeric value that can be compared against tol; see
Examples below.
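A custom criterion might look something like the sketch below, which uses the flash_fit_get_elbo accessor to compute a relative change in the ELBO. The function conv_crit_rel_elbo is hypothetical (it is not part of flashier), and the guard against missing or non-finite values is a defensive assumption about what the accessor may return before the first update rather than documented behavior.

```r
library(flashier)

# Hypothetical custom criterion: the absolute change in the ELBO, scaled by
# the magnitude of the current ELBO. Argument k is accepted because fn must
# take three parameters, but it is not used here.
conv_crit_rel_elbo <- function(curr, prev, k) {
  elbo_curr <- flash_fit_get_elbo(curr)
  elbo_prev <- flash_fit_get_elbo(prev)
  rel_chg <- abs(elbo_curr - elbo_prev) / abs(elbo_curr)
  # Defensive assumption: if either ELBO is missing or non-finite, return
  # Inf so that the fit is not declared converged on that iteration.
  if (length(rel_chg) != 1 || !is.finite(rel_chg)) {
    return(Inf)
  }
  rel_chg
}

fl <- flash_init(gtex) |>
  flash_set_verbose(0) |>
  flash_set_conv_crit(conv_crit_rel_elbo, tol = 1e-6) |>
  flash_greedy(Kmax = 2)
```

Because the returned value is a relative change, tol here is unitless and does not need to scale with the size of the dataset; the value 1e-6 is arbitrary.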
fl <- flash_init(gtex) |>
  flash_set_conv_crit(flash_conv_crit_max_chg, tol = 1e-3) |>
  flash_set_verbose(
    verbose = 3,
    fns = flash_verbose_max_chg,
    colnames = "Max Chg",
    colwidths = 20
  ) |>
  flash_greedy(Kmax = 3)
#> Adding factor 1 to flash object...
#> Optimizing factor...
#> Iteration Max Chg
#> 1 2.91e-03
#> 2 1.20e-04
#> Factor successfully added. Objective: -86178.931
#> Adding factor 2 to flash object...
#> Optimizing factor...
#> Iteration Max Chg
#> 1 1.20e-02
#> 2 4.37e-03
#> 3 3.15e-03
#> 4 2.23e-03
#> 5 1.58e-03
#> 6 1.12e-03
#> 7 7.90e-04
#> Factor successfully added. Objective: -85147.218
#> Adding factor 3 to flash object...
#> Optimizing factor...
#> Iteration Max Chg
#> 1 2.57e-02
#> 2 1.72e-02
#> 3 1.52e-02
#> 4 1.64e-02
#> 5 1.95e-02
#> 6 1.84e-02
#> 7 1.97e-02
#> 8 1.77e-02
#> 9 8.83e-03
#> 10 7.66e-03
#> 11 6.45e-03
#> 12 5.45e-03
#> 13 4.66e-03
#> 14 4.05e-03
#> 15 3.58e-03
#> 16 3.35e-03
#> 17 3.25e-03
#> 18 3.11e-03
#> 19 2.93e-03
#> 20 2.71e-03
#> 21 2.44e-03
#> 22 2.15e-03
#> 23 1.88e-03
#> 24 1.70e-03
#> 25 1.53e-03
#> 26 1.35e-03
#> 27 1.18e-03
#> 28 1.02e-03
#> 29 8.78e-04
#> Factor successfully added. Objective: -84354.031
#> Wrapping up...
#> Done.