| Title: | An Improved Version of WA-PLS |
|---|---|
| Description: | The goal of this package is to provide an improved version of WA-PLS (Weighted Averaging Partial Least Squares) by including the tolerances of taxa and the frequency of the sampled climate variable. This package also provides a way of leave-out cross-validation that removes both the test site and sites that are both geographically close and climatically close for each cycle, to avoid the risk of pseudo-replication. |
| Authors: | Mengmeng Liu [aut] (ORCID: <https://orcid.org/0000-0001-6250-0148>), Iain Colin Prentice [aut] (ORCID: <https://orcid.org/0000-0002-1296-6764>), Cajo J. F. ter Braak [aut] (ORCID: <https://orcid.org/0000-0002-0414-8745>), Sandy P. Harrison [aut] (ORCID: <https://orcid.org/0000-0001-5687-1903>), Roberto Villegas-Diaz [aut, cre] (ORCID: <https://orcid.org/0000-0001-5036-8661>), SPECIAL Research Group @ University of Reading [cph] |
| Maintainer: | Roberto Villegas-Diaz <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.3 |
| Built: | 2026-05-31 08:36:26 UTC |
| Source: | https://github.com/special-uor/fxtwapls |
Pseudo-removed leave-out cross-validation
cv.pr.w( modern_taxa, modern_climate, nPLS = 5, trainfun, predictfun, pseudo, usefx = FALSE, fx_method = "bin", bin = NA, cpus = 4, test_mode = TRUE, test_it = 5 )cv.pr.w( modern_taxa, modern_climate, nPLS = 5, trainfun, predictfun, pseudo, usefx = FALSE, fx_method = "bin", bin = NA, cpus = 4, test_mode = TRUE, test_it = 5 )
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site. |
nPLS |
The number of components to be extracted. |
trainfun |
Training function you want to use, either
|
predictfun |
Predict function you want to use: if |
pseudo |
The geographically and climatically close sites to each test
site, obtained from |
usefx |
Boolean flag on whether or not use |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
cpus |
Number of CPUs for simultaneous iterations to execute, check
|
test_mode |
Boolean flag to execute the function with a limited number
of iterations, |
test_it |
Number of iterations to use in the test mode. |
Leave-one-out cross validation results.
fx, TWAPLS.w,
TWAPLS.predict.w, WAPLS.w, and
WAPLS.predict.w
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] point <- modern_pollen[, c("Long", "Lat")] test_mode <- TRUE # It should be set to FALSE before running dist <- fxTWAPLS::get_distance( point, cpus = 2, # Remove the following line test_mode = test_mode ) pseudo_Tmin <- fxTWAPLS::get_pseudo( dist, modern_pollen$Tmin, cpus = 2, # Remove the following line test_mode = test_mode ) cv_pr_tf_Tmin2 <- fxTWAPLS::cv.pr.w( taxa, modern_pollen$Tmin, nPLS = 5, fxTWAPLS::TWAPLS.w2, fxTWAPLS::TWAPLS.predict.w, pseudo_Tmin, usefx = TRUE, fx_method = "bin", bin = 0.02, cpus = 2, # Remove the following line test_mode = test_mode ) # Run with progress bar `%>%` <- magrittr::`%>%` cv_pr_tf_Tmin2 <- fxTWAPLS::cv.pr.w( taxa, modern_pollen$Tmin, nPLS = 5, fxTWAPLS::TWAPLS.w2, fxTWAPLS::TWAPLS.predict.w, pseudo_Tmin, usefx = TRUE, fx_method = "bin", bin = 0.02, cpus = 2, # Remove the following line test_mode = test_mode ) %>% fxTWAPLS::pb() ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] point <- modern_pollen[, c("Long", "Lat")] test_mode <- TRUE # It should be set to FALSE before running dist <- fxTWAPLS::get_distance( point, cpus = 2, # Remove the following line test_mode = test_mode ) pseudo_Tmin <- fxTWAPLS::get_pseudo( dist, modern_pollen$Tmin, cpus = 2, # Remove the following line test_mode = test_mode ) cv_pr_tf_Tmin2 <- fxTWAPLS::cv.pr.w( taxa, modern_pollen$Tmin, nPLS = 5, fxTWAPLS::TWAPLS.w2, fxTWAPLS::TWAPLS.predict.w, pseudo_Tmin, usefx = TRUE, fx_method = "bin", bin = 0.02, cpus = 2, # Remove the following line test_mode = test_mode ) # Run with progress bar `%>%` <- magrittr::`%>%` cv_pr_tf_Tmin2 <- fxTWAPLS::cv.pr.w( taxa, modern_pollen$Tmin, nPLS = 5, fxTWAPLS::TWAPLS.w2, fxTWAPLS::TWAPLS.predict.w, pseudo_Tmin, usefx = TRUE, fx_method = "bin", bin = 0.02, cpus = 2, # Remove the following line test_mode = test_mode ) %>% fxTWAPLS::pb() ## End(Not run)
Leave-one-out cross-validation as
rioja (https://cran.r-project.org/package=rioja).
cv.w( modern_taxa, modern_climate, nPLS = 5, trainfun, predictfun, usefx = FALSE, fx_method = "bin", bin = NA, cpus = 4, test_mode = FALSE, test_it = 5 )cv.w( modern_taxa, modern_climate, nPLS = 5, trainfun, predictfun, usefx = FALSE, fx_method = "bin", bin = NA, cpus = 4, test_mode = FALSE, test_it = 5 )
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site. |
nPLS |
The number of components to be extracted. |
trainfun |
Training function you want to use, either
|
predictfun |
Predict function you want to use: if |
usefx |
Boolean flag on whether or not use |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
cpus |
Number of CPUs for simultaneous iterations to execute, check
|
test_mode |
boolean flag to execute the function with a limited number
of iterations, |
test_it |
number of iterations to use in the test mode. |
leave-one-out cross validation results
fx, TWAPLS.w,
TWAPLS.predict.w, WAPLS.w, and
WAPLS.predict.w
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] ## LOOCV test_mode <- TRUE # It should be set to FALSE before running cv_tf_Tmin2 <- fxTWAPLS::cv.w( taxa, modern_pollen$Tmin, nPLS = 5, fxTWAPLS::TWAPLS.w2, fxTWAPLS::TWAPLS.predict.w, usefx = TRUE, fx_method = "bin", bin = 0.02, cpus = 2, # Remove the following line test_mode = test_mode ) # Run with progress bar `%>%` <- magrittr::`%>%` cv_tf_Tmin2 <- fxTWAPLS::cv.w( taxa, modern_pollen$Tmin, nPLS = 5, fxTWAPLS::TWAPLS.w2, fxTWAPLS::TWAPLS.predict.w, usefx = TRUE, fx_method = "bin", bin = 0.02, cpus = 2, # Remove the following line test_mode = test_mode ) %>% fxTWAPLS::pb() ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] ## LOOCV test_mode <- TRUE # It should be set to FALSE before running cv_tf_Tmin2 <- fxTWAPLS::cv.w( taxa, modern_pollen$Tmin, nPLS = 5, fxTWAPLS::TWAPLS.w2, fxTWAPLS::TWAPLS.predict.w, usefx = TRUE, fx_method = "bin", bin = 0.02, cpus = 2, # Remove the following line test_mode = test_mode ) # Run with progress bar `%>%` <- magrittr::`%>%` cv_tf_Tmin2 <- fxTWAPLS::cv.w( taxa, modern_pollen$Tmin, nPLS = 5, fxTWAPLS::TWAPLS.w2, fxTWAPLS::TWAPLS.predict.w, usefx = TRUE, fx_method = "bin", bin = 0.02, cpus = 2, # Remove the following line test_mode = test_mode ) %>% fxTWAPLS::pb() ## End(Not run)
Function to get the frequency of the climate value, which will be used to
provide fx correction for WA-PLS and TWA-PLS.
fx(x, bin, show_plot = FALSE)fx(x, bin, show_plot = FALSE)
x |
Numeric vector with the modern climate values. |
bin |
Binwidth to get the frequency of the modern climate values. |
show_plot |
Boolean flag to show a plot of |
Numeric vector with the frequency of the modern climate values.
cv.w, cv.pr.w, and
sse.sample
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Get the frequency of each climate variable fx fx_Tmin <- fxTWAPLS::fx(modern_pollen$Tmin, bin = 0.02, show_plot = TRUE) fx_gdd <- fxTWAPLS::fx(modern_pollen$gdd, bin = 20, show_plot = TRUE) fx_alpha <- fxTWAPLS::fx(modern_pollen$alpha, bin = 0.002, show_plot = TRUE) ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Get the frequency of each climate variable fx fx_Tmin <- fxTWAPLS::fx(modern_pollen$Tmin, bin = 0.02, show_plot = TRUE) fx_gdd <- fxTWAPLS::fx(modern_pollen$gdd, bin = 20, show_plot = TRUE) fx_alpha <- fxTWAPLS::fx(modern_pollen$alpha, bin = 0.002, show_plot = TRUE) ## End(Not run)
Function to get the frequency of the climate value, which will be used to
provide fx correction for WA-PLS and TWA-PLS.
fx_pspline(x, bin, show_plot = FALSE)fx_pspline(x, bin, show_plot = FALSE)
x |
Numeric vector with the modern climate values. |
bin |
Binwidth to get the frequency of the modern climate values, the curve will be p-spline smoothed later |
show_plot |
Boolean flag to show a plot of |
Numeric vector with the frequency of the modern climate values.
cv.w, cv.pr.w, and
sse.sample
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Get the frequency of each climate variable fx fx_pspline_Tmin <- fxTWAPLS::fx_pspline( modern_pollen$Tmin, bin = 0.02, show_plot = TRUE ) fx_pspline_gdd <- fxTWAPLS::fx_pspline( modern_pollen$gdd, bin = 20, show_plot = TRUE ) fx_pspline_alpha <- fxTWAPLS::fx_pspline( modern_pollen$alpha, bin = 0.002, show_plot = TRUE ) ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Get the frequency of each climate variable fx fx_pspline_Tmin <- fxTWAPLS::fx_pspline( modern_pollen$Tmin, bin = 0.02, show_plot = TRUE ) fx_pspline_gdd <- fxTWAPLS::fx_pspline( modern_pollen$gdd, bin = 20, show_plot = TRUE ) fx_pspline_alpha <- fxTWAPLS::fx_pspline( modern_pollen$alpha, bin = 0.002, show_plot = TRUE ) ## End(Not run)
Get the distance between points, the output will be used in
get_pseudo.
get_distance(point, cpus = 4, test_mode = FALSE, test_it = 5)get_distance(point, cpus = 4, test_mode = FALSE, test_it = 5)
point |
Each row represents a sampling site, the first column is longitude and the second column is latitude, both in decimal format. |
cpus |
Number of CPUs for simultaneous iterations to execute, check
|
test_mode |
Boolean flag to execute the function with a limited number
of iterations, |
test_it |
Number of iterations to use in the test mode. |
Distance matrix, the value at the i-th row, means the distance
between the i-th sampling site and the whole sampling sites.
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") point <- modern_pollen[, c("Long", "Lat")] test_mode <- TRUE # It should be set to FALSE before running dist <- fxTWAPLS::get_distance( point, cpus = 2, # Remove the following line test_mode = test_mode ) # Run with progress bar `%>%` <- magrittr::`%>%` dist <- fxTWAPLS::get_distance( point, cpus = 2, # Remove the following line test_mode = test_mode ) %>% fxTWAPLS::pb() ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") point <- modern_pollen[, c("Long", "Lat")] test_mode <- TRUE # It should be set to FALSE before running dist <- fxTWAPLS::get_distance( point, cpus = 2, # Remove the following line test_mode = test_mode ) # Run with progress bar `%>%` <- magrittr::`%>%` dist <- fxTWAPLS::get_distance( point, cpus = 2, # Remove the following line test_mode = test_mode ) %>% fxTWAPLS::pb() ## End(Not run)
Get the sites which are both geographically and climatically close to the
test site, which could result in pseudo-replication and inflate the
cross-validation statistics. The output will be used in
cv.pr.w.
get_pseudo(dist, x, cpus = 4, test_mode = FALSE, test_it = 5)get_pseudo(dist, x, cpus = 4, test_mode = FALSE, test_it = 5)
dist |
Distance matrix which contains the distance from other sites. |
x |
The modern climate values. |
cpus |
Number of CPUs for simultaneous iterations to execute, check
|
test_mode |
Boolean flag to execute the function with a limited number
of iterations, |
test_it |
Number of iterations to use in the test mode. |
The geographically and climatically close sites to each test site.
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") point <- modern_pollen[, c("Long", "Lat")] test_mode <- TRUE # It should be set to FALSE before running dist <- fxTWAPLS::get_distance( point, cpus = 2, # Remove the following line test_mode = test_mode ) pseudo_Tmin <- fxTWAPLS::get_pseudo( dist, modern_pollen$Tmin, cpus = 2, # Remove the following line test_mode = test_mode ) # Run with progress bar `%>%` <- magrittr::`%>%` pseudo_Tmin <- fxTWAPLS::get_pseudo( dist, modern_pollen$Tmin, cpus = 2, # Remove the following line test_mode = test_mode ) %>% fxTWAPLS::pb() ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") point <- modern_pollen[, c("Long", "Lat")] test_mode <- TRUE # It should be set to FALSE before running dist <- fxTWAPLS::get_distance( point, cpus = 2, # Remove the following line test_mode = test_mode ) pseudo_Tmin <- fxTWAPLS::get_pseudo( dist, modern_pollen$Tmin, cpus = 2, # Remove the following line test_mode = test_mode ) # Run with progress bar `%>%` <- magrittr::`%>%` pseudo_Tmin <- fxTWAPLS::get_pseudo( dist, modern_pollen$Tmin, cpus = 2, # Remove the following line test_mode = test_mode ) %>% fxTWAPLS::pb() ## End(Not run)
Show progress bar
pb(expr, ...)pb(expr, ...)
expr |
R expression. |
... |
Arguments passed on to
|
Return data from the function called.
Plot the residuals, the black line is 0 line, the red line is the locally estimated scatterplot smoothing, which shows the degree of local compression.
plot_residuals(train_output, col)plot_residuals(train_output, col)
train_output |
Training output, can be the output of WA-PLS, WA-PLS with
|
col |
Choose which column of the fitted value to plot, in other words, how many number of components you want to use. |
Plotting status.
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) nsig <- 3 # This should be got from the random t-test of the cross validation fxTWAPLS::plot_residuals(fit_tf_Tmin2, nsig) ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) nsig <- 3 # This should be got from the random t-test of the cross validation fxTWAPLS::plot_residuals(fit_tf_Tmin2, nsig) ## End(Not run)
Plot the training results, the black line is the 1:1 line, the red line is
the linear regression line to fitted and x, which shows the degree
of overall compression.
plot_train(train_output, col)plot_train(train_output, col)
train_output |
Training output, can be the output of WA-PLS, WA-PLS with
|
col |
Choose which column of the fitted value to plot, in other words, how many number of components you want to use. |
Plotting status.
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) nsig <- 3 # This should be got from the random t-test of the cross validation fxTWAPLS::plot_train(fit_tf_Tmin2, nsig) ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) nsig <- 3 # This should be got from the random t-test of the cross validation fxTWAPLS::plot_train(fit_tf_Tmin2, nsig) ## End(Not run)
Do a random t-test to the cross-validation results.
rand.t.test.w(cvoutput, n.perm = 999)rand.t.test.w(cvoutput, n.perm = 999)
cvoutput |
|
n.perm |
The number of permutation times to get the p value, which assesses whether using the current number of components is significantly different from using one less. |
A matrix of the statistics of the cross-validation results. Each component is described below:
R2the coefficient of determination (the larger, the better the fit).
Avg.Biasaverage bias.
Max.Biasmaximum bias.
Min.Biasminimum bias.
RMSEProot-mean-square error of prediction (the smaller, the better the fit).
delta.RMSEPthe percent change of RMSEP using the current number of components than using one component less.
passesses whether using the current number of components is significantly different from using one component less, which is used to choose the last significant number of components to avoid over-fitting.
-The degree of overall compression is assessed by doing linear regression to the cross-validation result and the observed climate values.
Compre.b0: the intercept.
Compre.b1: the slope (the closer to 1, the less the
overall compression).
Compre.b0.se: the standard error of the intercept.
Compre.b1.se: the standard error of the slope.
## Not run: ## Random t-test rand_pr_tf_Tmin2 <- fxTWAPLS::rand.t.test.w(cv_pr_tf_Tmin2, n.perm = 999) # note: choose the last significant number of components based on the p-value, # see details at Liu Mengmeng, Prentice Iain Colin, ter Braak Cajo J. F., # Harrison Sandy P.. 2020 An improved statistical approach for reconstructing # past climates from biotic assemblages. Proc. R. Soc. A. 476: 20200346. # <https://doi.org/10.1098/rspa.2020.0346> ## End(Not run)## Not run: ## Random t-test rand_pr_tf_Tmin2 <- fxTWAPLS::rand.t.test.w(cv_pr_tf_Tmin2, n.perm = 999) # note: choose the last significant number of components based on the p-value, # see details at Liu Mengmeng, Prentice Iain Colin, ter Braak Cajo J. F., # Harrison Sandy P.. 2020 An improved statistical approach for reconstructing # past climates from biotic assemblages. Proc. R. Soc. A. 476: 20200346. # <https://doi.org/10.1098/rspa.2020.0346> ## End(Not run)
Calculate Sample Specific Errors
sse.sample( modern_taxa, modern_climate, fossil_taxa, trainfun, predictfun, nboot, nPLS, nsig, usefx = FALSE, fx_method = "bin", bin = NA, cpus = 4, seed = NULL, test_mode = FALSE, test_it = 5 )sse.sample( modern_taxa, modern_climate, fossil_taxa, trainfun, predictfun, nboot, nPLS, nsig, usefx = FALSE, fx_method = "bin", bin = NA, cpus = 4, seed = NULL, test_mode = FALSE, test_it = 5 )
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site |
fossil_taxa |
Fossil taxa abundance data to reconstruct past climates, each row represents a site to be reconstructed, each column represents a taxon. |
trainfun |
Training function you want to use, either
|
predictfun |
Predict function you want to use: if |
nboot |
The number of bootstrap cycles you want to use. |
nPLS |
The number of components to be extracted. |
nsig |
The significant number of components to use to reconstruct past climates, this can be obtained from the cross-validation results. |
usefx |
Boolean flag on whether or not use |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
cpus |
Number of CPUs for simultaneous iterations to execute, check
|
seed |
Seed for reproducibility. |
test_mode |
Boolean flag to execute the function with a limited number
of iterations, |
test_it |
Number of iterations to use in the test mode. |
The bootstrapped standard error for each site.
fx, TWAPLS.w,
TWAPLS.predict.w, WAPLS.w, and
WAPLS.predict.w
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Load reconstruction data Holocene <- read.csv("/path/to/Holocene.csv") taxaColMin <- which(colnames(Holocene) == "taxa0") taxaColMax <- which(colnames(Holocene) == "taxaN") core <- Holocene[, taxaColMin:taxaColMax] ## SSE nboot <- 5 # Recommended 1000 nsig <- 3 # This should be got from the random t-test of the cross validation sse_tf_Tmin2 <- fxTWAPLS::sse.sample( modern_taxa = taxa, modern_climate = modern_pollen$Tmin, fossil_taxa = core, trainfun = fxTWAPLS::TWAPLS.w2, predictfun = fxTWAPLS::TWAPLS.predict.w, nboot = nboot, nPLS = 5, nsig = nsig, usefx = TRUE, fx_method = "bin", bin = 0.02, cpus = 2, seed = 1 ) # Run with progress bar `%>%` <- magrittr::`%>%` sse_tf_Tmin2 <- fxTWAPLS::sse.sample( modern_taxa = taxa, modern_climate = modern_pollen$Tmin, fossil_taxa = core, trainfun = fxTWAPLS::TWAPLS.w2, predictfun = fxTWAPLS::TWAPLS.predict.w, nboot = nboot, nPLS = 5, nsig = nsig, usefx = TRUE, fx_method = "bin", bin = 0.02, cpus = 2, seed = 1 ) %>% fxTWAPLS::pb() ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Load reconstruction data Holocene <- read.csv("/path/to/Holocene.csv") taxaColMin <- which(colnames(Holocene) == "taxa0") taxaColMax <- which(colnames(Holocene) == "taxaN") core <- Holocene[, taxaColMin:taxaColMax] ## SSE nboot <- 5 # Recommended 1000 nsig <- 3 # This should be got from the random t-test of the cross validation sse_tf_Tmin2 <- fxTWAPLS::sse.sample( modern_taxa = taxa, modern_climate = modern_pollen$Tmin, fossil_taxa = core, trainfun = fxTWAPLS::TWAPLS.w2, predictfun = fxTWAPLS::TWAPLS.predict.w, nboot = nboot, nPLS = 5, nsig = nsig, usefx = TRUE, fx_method = "bin", bin = 0.02, cpus = 2, seed = 1 ) # Run with progress bar `%>%` <- magrittr::`%>%` sse_tf_Tmin2 <- fxTWAPLS::sse.sample( modern_taxa = taxa, modern_climate = modern_pollen$Tmin, fossil_taxa = core, trainfun = fxTWAPLS::TWAPLS.w2, predictfun = fxTWAPLS::TWAPLS.predict.w, nboot = nboot, nPLS = 5, nsig = nsig, usefx = TRUE, fx_method = "bin", bin = 0.02, cpus = 2, seed = 1 ) %>% fxTWAPLS::pb() ## End(Not run)
TWA-PLS predict function
TWAPLS.predict.w(TWAPLSoutput, fossil_taxa)TWAPLS.predict.w(TWAPLSoutput, fossil_taxa)
TWAPLSoutput |
The output of the |
fossil_taxa |
Fossil taxa abundance data to reconstruct past climates, each row represents a site to be reconstructed, each column represents a taxon. |
A list of the reconstruction results. Each element in the list is described below:
fitthe fitted values using each number of components.
nPLSthe total number of components extracted.
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Load reconstruction data Holocene <- read.csv("/path/to/Holocene.csv") taxaColMin <- which(colnames(Holocene) == "taxa0") taxaColMax <- which(colnames(Holocene) == "taxaN") core <- Holocene[, taxaColMin:taxaColMax] ## Train fit_t_Tmin <- fxTWAPLS::TWAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5) fit_tf_Tmin <- fxTWAPLS::TWAPLS.w( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) fit_t_Tmin2 <- fxTWAPLS::TWAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5) fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) ## Predict fossil_t_Tmin <- fxTWAPLS::TWAPLS.predict.w(fit_t_Tmin, core) fossil_tf_Tmin <- fxTWAPLS::TWAPLS.predict.w(fit_tf_Tmin, core) fossil_t_Tmin2 <- fxTWAPLS::TWAPLS.predict.w(fit_t_Tmin2, core) fossil_tf_Tmin2 <- fxTWAPLS::TWAPLS.predict.w(fit_tf_Tmin2, core) ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Load reconstruction data Holocene <- read.csv("/path/to/Holocene.csv") taxaColMin <- which(colnames(Holocene) == "taxa0") taxaColMax <- which(colnames(Holocene) == "taxaN") core <- Holocene[, taxaColMin:taxaColMax] ## Train fit_t_Tmin <- fxTWAPLS::TWAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5) fit_tf_Tmin <- fxTWAPLS::TWAPLS.w( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) fit_t_Tmin2 <- fxTWAPLS::TWAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5) fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) ## Predict fossil_t_Tmin <- fxTWAPLS::TWAPLS.predict.w(fit_t_Tmin, core) fossil_tf_Tmin <- fxTWAPLS::TWAPLS.predict.w(fit_tf_Tmin, core) fossil_t_Tmin2 <- fxTWAPLS::TWAPLS.predict.w(fit_t_Tmin2, core) fossil_tf_Tmin2 <- fxTWAPLS::TWAPLS.predict.w(fit_tf_Tmin2, core) ## End(Not run)
TWA-PLS training function, which can perform fx correction.
1/fx^2 correction will be applied at step 7.
TWAPLS.w( modern_taxa, modern_climate, nPLS = 5, usefx = FALSE, fx_method = "bin", bin = NA )TWAPLS.w( modern_taxa, modern_climate, nPLS = 5, usefx = FALSE, fx_method = "bin", bin = NA )
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site. |
nPLS |
The number of components to be extracted. |
usefx |
Boolean flag on whether or not use |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
A list of the training results, which will be used by the predict function. Each element in the list is described below:
fitthe fitted values using each number of components.
xthe observed modern climate values.
taxon_namethe name of each taxon.
optimumthe updated taxon optimum
compeach component extracted (will be used in step 7 regression).
utaxon optimum for each component (step 2).
ttaxon tolerance for each component (step 2).
za parameter used in standardization for each component (step 5).
sa parameter used in standardization for each component (step 5).
ortha list that stores orthogonalization parameters (step 4).
alphaa list that stores regression coefficients (step 7).
meanxmean value of the observed modern climate values.
nPLSthe total number of components extracted.
fx, TWAPLS.predict.w, and
WAPLS.w
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Training fit_t_Tmin <- fxTWAPLS::TWAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5) fit_tf_Tmin <- fxTWAPLS::TWAPLS.w( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Training fit_t_Tmin <- fxTWAPLS::TWAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5) fit_tf_Tmin <- fxTWAPLS::TWAPLS.w( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) ## End(Not run)
TWA-PLS training function, which can perform fx correction.
1/fx correction will be applied at step 2 and step 7.
TWAPLS.w2( modern_taxa, modern_climate, nPLS = 5, usefx = FALSE, fx_method = "bin", bin = NA )TWAPLS.w2( modern_taxa, modern_climate, nPLS = 5, usefx = FALSE, fx_method = "bin", bin = NA )
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site. |
nPLS |
The number of components to be extracted. |
usefx |
Boolean flag on whether or not use |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
A list of the training results, which will be used by the predict function. Each element in the list is described below:
fitthe fitted values using each number of components.
xthe observed modern climate values.
taxon_namethe name of each taxon.
optimumthe updated taxon optimum
compeach component extracted (will be used in step 7 regression).
utaxon optimum for each component (step 2).
ttaxon tolerance for each component (step 2).
za parameter used in standardization for each component (step 5).
sa parameter used in standardization for each component (step 5).
ortha list that stores orthogonalization parameters (step 4).
alphaa list that stores regression coefficients (step 7).
meanxmean value of the observed modern climate values.
nPLSthe total number of components extracted.
fx, TWAPLS.predict.w, and
WAPLS.w
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Training fit_t_Tmin2 <- fxTWAPLS::TWAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5) fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Training fit_t_Tmin2 <- fxTWAPLS::TWAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5) fit_tf_Tmin2 <- fxTWAPLS::TWAPLS.w2( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) ## End(Not run)
WA-PLS predict function
WAPLS.predict.w(WAPLSoutput, fossil_taxa)WAPLS.predict.w(WAPLSoutput, fossil_taxa)
WAPLSoutput |
The output of the |
fossil_taxa |
Fossil taxa abundance data to reconstruct past climates, each row represents a site to be reconstructed, each column represents a taxon. |
A list of the reconstruction results. Each element in the list is described below:
fitThe fitted values using each number of components.
nPLSThe total number of components extracted.
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Load reconstruction data Holocene <- read.csv("/path/to/Holocene.csv") taxaColMin <- which(colnames(Holocene) == "taxa0") taxaColMax <- which(colnames(Holocene) == "taxaN") core <- Holocene[, taxaColMin:taxaColMax] ## Train fit_Tmin <- fxTWAPLS::WAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5) fit_f_Tmin <- fxTWAPLS::WAPLS.w( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) fit_Tmin2 <- fxTWAPLS::WAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5) fit_f_Tmin2 <- fxTWAPLS::WAPLS.w2( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) ## Predict fossil_Tmin <- fxTWAPLS::WAPLS.predict.w(fit_Tmin, core) fossil_f_Tmin <- fxTWAPLS::WAPLS.predict.w(fit_f_Tmin, core) fossil_Tmin2 <- fxTWAPLS::WAPLS.predict.w(fit_Tmin2, core) fossil_f_Tmin2 <- fxTWAPLS::WAPLS.predict.w(fit_f_Tmin2, core) ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Load reconstruction data Holocene <- read.csv("/path/to/Holocene.csv") taxaColMin <- which(colnames(Holocene) == "taxa0") taxaColMax <- which(colnames(Holocene) == "taxaN") core <- Holocene[, taxaColMin:taxaColMax] ## Train fit_Tmin <- fxTWAPLS::WAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5) fit_f_Tmin <- fxTWAPLS::WAPLS.w( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) fit_Tmin2 <- fxTWAPLS::WAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5) fit_f_Tmin2 <- fxTWAPLS::WAPLS.w2( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) ## Predict fossil_Tmin <- fxTWAPLS::WAPLS.predict.w(fit_Tmin, core) fossil_f_Tmin <- fxTWAPLS::WAPLS.predict.w(fit_f_Tmin, core) fossil_Tmin2 <- fxTWAPLS::WAPLS.predict.w(fit_Tmin2, core) fossil_f_Tmin2 <- fxTWAPLS::WAPLS.predict.w(fit_f_Tmin2, core) ## End(Not run)
WA-PLS training function, which can perform fx correction.
1/fx^2 correction will be applied at step 7.
WAPLS.w( modern_taxa, modern_climate, nPLS = 5, usefx = FALSE, fx_method = "bin", bin = NA )WAPLS.w( modern_taxa, modern_climate, nPLS = 5, usefx = FALSE, fx_method = "bin", bin = NA )
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site. |
nPLS |
The number of components to be extracted. |
usefx |
Boolean flag on whether or not use |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
A list of the training results, which will be used by the predict function. Each element in the list is described below:
fitthe fitted values using each number of components.
xthe observed modern climate values.
taxon_namethe name of each taxon.
optimumthe updated taxon optimum (u* in the WA-PLS paper).
compeach component extracted (will be used in step 7 regression).
utaxon optimum for each component (step 2).
za parameter used in standardization for each component (step 5).
sa parameter used in standardization for each component (step 5).
ortha list that stores orthogonalization parameters (step 4).
alphaa list that stores regression coefficients (step 7).
meanxmean value of the observed modern climate values.
nPLSthe total number of components extracted.
fx, TWAPLS.w, and
WAPLS.predict.w
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Training fit_Tmin <- fxTWAPLS::WAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5) fit_f_Tmin <- fxTWAPLS::WAPLS.w( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Training fit_Tmin <- fxTWAPLS::WAPLS.w(taxa, modern_pollen$Tmin, nPLS = 5) fit_f_Tmin <- fxTWAPLS::WAPLS.w( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) ## End(Not run)
WA-PLS training function, which can perform fx correction.
1/fx correction will be applied at step 2 and step 7.
WAPLS.w2( modern_taxa, modern_climate, nPLS = 5, usefx = FALSE, fx_method = "bin", bin = NA )WAPLS.w2( modern_taxa, modern_climate, nPLS = 5, usefx = FALSE, fx_method = "bin", bin = NA )
modern_taxa |
The modern taxa abundance data, each row represents a sampling site, each column represents a taxon. |
modern_climate |
The modern climate value at each sampling site. |
nPLS |
The number of components to be extracted. |
usefx |
Boolean flag on whether or not use |
fx_method |
Binned or p-spline smoothed |
bin |
Binwidth to get fx, needed for both binned and p-splined method.
if |
A list of the training results, which will be used by the predict function. Each element in the list is described below:
fitthe fitted values using each number of components.
xthe observed modern climate values.
taxon_namethe name of each taxon.
optimumthe updated taxon optimum (u* in the WA-PLS paper).
compeach component extracted (will be used in step 7 regression).
utaxon optimum for each component (step 2).
za parameter used in standardization for each component (step 5).
sa parameter used in standardization for each component (step 5).
ortha list that stores orthogonalization parameters (step 4).
alphaa list that stores regression coefficients (step 7).
meanxmean value of the observed modern climate values.
nPLSthe total number of components extracted.
fx, TWAPLS.w, and
WAPLS.predict.w
## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Training fit_Tmin2 <- fxTWAPLS::WAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5) fit_f_Tmin2 <- fxTWAPLS::WAPLS.w2( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) ## End(Not run)## Not run: # Load modern pollen data modern_pollen <- read.csv("/path/to/modern_pollen.csv") # Extract taxa taxaColMin <- which(colnames(modern_pollen) == "taxa0") taxaColMax <- which(colnames(modern_pollen) == "taxaN") taxa <- modern_pollen[, taxaColMin:taxaColMax] # Training fit_Tmin2 <- fxTWAPLS::WAPLS.w2(taxa, modern_pollen$Tmin, nPLS = 5) fit_f_Tmin2 <- fxTWAPLS::WAPLS.w2( taxa, modern_pollen$Tmin, nPLS = 5, usefx = TRUE, fx_method = "bin", bin = 0.02 ) ## End(Not run)