dSVD
In this vignette, we consider approximating a matrix as a product of two low-rank matrices (a.k.a., factor matrices).
Test data is available from toyModel.
You will see that there are five blocks in the data matrix as follows.
Here, we introduce ternary regularization so that the entries of U take values in {−1, 0, 1}:
X ≈ UV′ s.t. U ∈ {−1, 0, 1},
where X (N × M) is a data matrix, U (N × J) is a ternary score matrix, and V (M × J) is a loading matrix.
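To make the shapes in the model above concrete, the following base R sketch builds a small X as the product of a ternary U and a real-valued V. All names (U_toy, V_toy, X_toy) and sizes are hypothetical and chosen only for illustration; this is not how dSVD estimates the factors.

```r
# Illustration only: construct X = U V' from a ternary U and a real-valued V.
set.seed(42)                      # arbitrary seed, for reproducibility
N <- 6; M <- 4; J <- 2            # hypothetical sizes with J <= min(N, M)

# U: N x J score matrix with entries drawn from {-1, 0, 1}
U_toy <- matrix(sample(c(-1, 0, 1), N * J, replace = TRUE), nrow = N, ncol = J)

# V: M x J loading matrix with unconstrained (here, uniform) entries
V_toy <- matrix(runif(M * J), nrow = M, ncol = J)

# X: N x M matrix; exactly rank <= J because it is built from the factors
X_toy <- U_toy %*% t(V_toy)

dim(X_toy)                        # N rows, M columns
all(U_toy %in% c(-1, 0, 1))       # U satisfies the ternary constraint
```

In practice only X is observed, and U and V are estimated so that UV′ approximates X.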
In the dcTensor package, the objective function is optimized by combining a gradient-descent algorithm (Tsuyuzaki 2020) with ternary regularization.
In semi-ternary matrix factorization (STMF), the rank parameter J (≤ min(N, M)) must be set in advance. Other settings, such as the number of iterations (num.iter), are also available. For details of the arguments of dSVD, see ?dSVD. After the calculation, dSVD returns a list of objects. STMF is achieved by setting the ternary regularization parameter to a large value, as shown below:
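The call itself does not appear in this text, so the following is a minimal sketch of what it may look like, not a verbatim copy of the original code. It assumes that the test data come from toyModel("NMF") in the nnTensor package, that the ternary regularization parameter of dSVD is named Ter_U, and that 1e+10 counts as "large"; check ?dSVD before relying on any of these. J = 5 matches the five columns of U and V in the output, and the seed is arbitrary.

```r
# Sketch under stated assumptions; runs only if both packages are installed.
if (requireNamespace("dcTensor", quietly = TRUE) &&
    requireNamespace("nnTensor", quietly = TRUE)) {
  # Assumption: the toy data are generated by nnTensor::toyModel("NMF")
  X <- nnTensor::toyModel("NMF")

  set.seed(123456)  # arbitrary seed; the fit depends on random initialization

  # Assumption: Ter_U is the ternary regularization parameter for U
  out_STMF <- dcTensor::dSVD(X, Ter_U = 1e+10, J = 5)

  # Show the top two levels of the returned list
  str(out_STMF, 2)
}
```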
## List of 6
## $ U : num [1:100, 1:5] 0.00592 0.00582 0.00626 0.00641 0.00611 ...
## $ V : num [1:300, 1:5] 89.8 94.8 93.6 101 87.6 ...
## $ RecError : Named num [1:101] 1.00e-09 4.24e+05 3.67e+05 3.63e+05 3.65e+05 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ TrainRecError: Named num [1:101] 1.00e-09 4.24e+05 3.67e+05 3.63e+05 3.65e+05 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ TestRecError : Named num [1:101] 1e-09 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 0e+00 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
## $ RelChange : Named num [1:101] 1.00e-09 9.70e-01 1.55e-01 1.25e-02 4.53e-03 ...
## ..- attr(*, "names")= chr [1:101] "offset" "1" "2" "3" ...
The reconstruction error (RecError) and the relative change (RelChange, the amount of change from the reconstruction error in the previous step) can be used to diagnose whether the calculation has converged.
layout(t(1:2))
plot(log10(out_STMF$RecError[-1]), type="b", main="Reconstruction Error")
plot(log10(out_STMF$RelChange[-1]), type="b", main="Relative Change")
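As a numerical complement to the plots, the sketch below computes a relative change from a sequence of reconstruction errors and compares it with a threshold. The formula |previous − current| / current is an assumption; it roughly reproduces the RelChange values printed above from the rounded RecError values, but the exact definition used inside dSVD should be checked in the package source. The helper name has_converged and the threshold are hypothetical.

```r
# First few (rounded) reconstruction errors taken from the output above
rec_error <- c(4.24e+05, 3.67e+05, 3.63e+05, 3.65e+05)

# Assumed definition: |previous - current| / current
rel_change <- abs(diff(rec_error)) / rec_error[-1]
rel_change

# Hypothetical helper: TRUE once the latest relative change drops below thr
has_converged <- function(rel_change, thr = 1e-10) {
  tail(rel_change, 1) < thr
}

has_converged(rel_change)              # strict threshold
has_converged(rel_change, thr = 1e-2)  # looser threshold
```

A relative change that flattens out while the reconstruction error stays high suggests the iterations have stalled rather than that the fit is good, so it helps to read both curves together.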
The product of U and the transpose of V shows how well the original data are recovered by dSVD.
library("fields")  # provides image.plot()
# Reconstruct the data matrix from the estimated factors
recX <- out_STMF$U %*% t(out_STMF$V)
layout(t(1:2))
image.plot(X, main="Original Data", legend.mar=8)
image.plot(recX, main="Reconstructed Data (STMF)", legend.mar=8)
The histograms of U and V show that U looks ternary but V does not.
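The histograms themselves are not reproduced here. As a rough illustration of this kind of check, the sketch below draws histograms with hist() and also measures how far each entry is from its nearest value in {−1, 0, 1}. It uses simulated stand-ins (U_demo, V_demo) rather than the fitted out_STMF$U and out_STMF$V, and the helper ternary_gap is hypothetical, not part of dcTensor.

```r
# Hypothetical helper: distance from each entry to its nearest value in {-1, 0, 1}
ternary_gap <- function(A) {
  nearest <- pmax(-1, pmin(1, round(A)))  # round, then clip to [-1, 1]
  abs(A - nearest)
}

set.seed(1)  # arbitrary seed
# Simulated stand-ins: a nearly ternary matrix and an unconstrained matrix
U_demo <- matrix(sample(c(-1, 0, 1), 20, replace = TRUE) + rnorm(20, sd = 0.01),
                 nrow = 5, ncol = 4)
V_demo <- matrix(runif(12, min = 0, max = 100), nrow = 3, ncol = 4)

# Histograms: U_demo piles up near -1, 0, and 1; V_demo is spread out
layout(t(1:2))
hist(U_demo, breaks = 100, main = "U (simulated)")
hist(V_demo, breaks = 100, main = "V (simulated)")

# Numerical summary: a small maximum gap means the matrix is close to ternary
max(ternary_gap(U_demo))
max(ternary_gap(V_demo))
```

The same helper could be applied to out_STMF$U and out_STMF$V after fitting.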
## R version 4.4.3 (2025-02-28)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] nnTensor_1.3.0 fields_16.3.1 viridisLite_0.4.2 spam_2.11-1
## [5] dcTensor_1.3.0 rmarkdown_2.29
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.6 jsonlite_1.9.1 compiler_4.4.3 maps_3.4.2.1
## [5] Rcpp_1.0.14 plot3D_1.4.1 tagcloud_0.6 jquerylib_0.1.4
## [9] scales_1.3.0 yaml_2.3.10 fastmap_1.2.0 ggplot2_3.5.1
## [13] R6_2.6.1 tcltk_4.4.3 knitr_1.49 MASS_7.3-65
## [17] dotCall64_1.2 misc3d_0.9-1 tibble_3.2.1 maketools_1.3.2
## [21] munsell_0.5.1 pillar_1.10.1 bslib_0.9.0 RColorBrewer_1.1-3
## [25] rlang_1.1.5 cachem_1.1.0 xfun_0.51 sass_0.4.9
## [29] sys_3.4.3 cli_3.6.4 magrittr_2.0.3 digest_0.6.37
## [33] grid_4.4.3 rTensor_1.4.8 lifecycle_1.0.4 vctrs_0.6.5
## [37] evaluate_1.0.3 glue_1.8.0 buildtools_1.0.0 colorspace_2.1-1
## [41] pkgconfig_2.0.3 tools_4.4.3 htmltools_0.5.8.1