Displaying 20 results from an estimated 3000 matches similar to: "tol in prcomp"
2016 Mar 24
3
summary( prcomp(*, tol = .) ) -- and 'rank.'
Following from the R-help thread of March 22 on "Memory usage in prcomp",
I've started looking into adding an optional 'rank.' argument
to prcomp allowing to more efficiently get only a few PCs
instead of the full p PCs, say when p = 1000 and you know you
only want 5 PCs.
(https://stat.ethz.ch/pipermail/r-help/2016-March/437228.html)
As it was mentioned, we already
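A minimal sketch of what the proposed argument looks like in use, assuming an R version whose `prcomp` accepts `rank.`. The simulated matrix and the choice of 5 components are illustrative only and are not taken from the thread.

```r
# Simulated data (hypothetical): 200 observations on 50 variables
set.seed(42)
x <- matrix(rnorm(200 * 50), nrow = 200, ncol = 50)

# Ask for only the first 5 principal components
p5 <- prcomp(x, rank. = 5)

dim(p5$rotation)  # loadings: one column per requested PC
dim(p5$x)         # scores: one column per requested PC
```

This only shows the interface; it says nothing about how much memory is saved on large inputs.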
2016 Mar 24
3
summary( prcomp(*, tol = .) ) -- and 'rank.'
I agree with Kasper, this is a 'big' issue. Does your method of taking only
n PCs reduce the load on memory?
The new addition to the summary looks like a good idea, but Proportion of
Variance as you describe it may be confusing to new users. Am I correct in
saying Proportion of variance describes the amount of variance with respect
to the number of components the user chooses to show? So
2016 Mar 25
2
summary( prcomp(*, tol = .) ) -- and 'rank.'
> On 25 Mar 2016, at 10:41 am, peter dalgaard <pdalgd at gmail.com> wrote:
>
> As I see it, the display showing the first p << n PCs adding up to 100% of the variance is plainly wrong.
>
> I suspect it comes about via a mental short-circuit: If we try to control p using a tolerance, then that amounts to saying that the remaining PCs are effectively zero-variance, but
2016 Mar 22
3
Memory usage in prcomp
Hi All:
I am running prcomp on a very large array, roughly [500000, 3650]. The array itself is 16GB. I am running on a Unix machine and am running 'top' at the same time and am quite surprised to see that the application memory usage is 76GB. I have the 'tol' set very high (.8) so that it should only pull out a few components. I am surprised at this memory usage because prcomp uses the SVD
2016 Mar 24
0
summary( prcomp(*, tol = .) ) -- and 'rank.'
Martin, I fully agree. This becomes an issue when you have big matrices.
(Note that there are awesome methods for actually only computing a small
number of PCs (unlike your code which uses svd which gets all of them);
these are available in various CRAN packages).
Best,
Kasper
On Thu, Mar 24, 2016 at 1:09 PM, Martin Maechler <maechler at stat.math.ethz.ch
> wrote:
> Following from
2009 Nov 09
4
prcomp - principal components in R
Hello, not understanding the output of prcomp, I reduce the number of
components and the output continues to show cumulative 100% of the
variance explained, which can't be the case dropping from 8 components
to 3.
How do I get the output in terms of the cumulative % of the total
variance, so when i go from total solution of 8 (8 variables in the data
set), to a reduced number of
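One way to get cumulative proportions relative to the total variance is to compute the total from the data and divide the retained variances by it. This is a sketch on the built-in `USArrests` data, with `k = 2` chosen arbitrarily; it is not the poster's data set.

```r
x <- as.matrix(USArrests)
p <- prcomp(x, scale. = TRUE)

k <- 2  # number of components to keep (illustrative choice)

# Total variance of the centered and scaled data, computed independently
# of how many components are retained
total_var <- sum(apply(scale(x), 2, var))

# Proportion and cumulative proportion relative to the total variance
prop_var <- p$sdev[1:k]^2 / total_var
cum_prop <- cumsum(prop_var)
cum_prop
```

Because the denominator is the total variance rather than the sum over the retained components, the last cumulative value stays below 1 whenever components are dropped.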
2016 Mar 25
0
summary( prcomp(*, tol = .) ) -- and 'rank.'
As I see it, the display showing the first p << n PCs adding up to 100% of the variance is plainly wrong.
I suspect it comes about via a mental short-circuit: If we try to control p using a tolerance, then that amounts to saying that the remaining PCs are effectively zero-variance, but that is (usually) not the intention at all.
The common case is that the remainder terms have a roughly
2010 Nov 10
2
prcomp function
Hello,
I have a short question about the prcomp function. First I cite the
associated help page (help(prcomp)):
"Value:
...
SDEV the standard deviations of the principal components (i.e., the square
roots of the eigenvalues of the covariance/correlation matrix, though the
calculation is actually done with the singular values of the data matrix).
ROTATION the matrix of variable loadings
2016 Mar 25
0
summary( prcomp(*, tol = .) ) -- and 'rank.'
> On 25 Mar 2016, at 10:08 , Jari Oksanen <jari.oksanen at oulu.fi> wrote:
>
>>
>> On 25 Mar 2016, at 10:41 am, peter dalgaard <pdalgd at gmail.com> wrote:
>>
>> As I see it, the display showing the first p << n PCs adding up to 100% of the variance is plainly wrong.
>>
>> I suspect it comes about via a mental short-circuit: If we
2000 Jun 14
2
Typo in the documentation of prcomp. (PR#569)
The help for prcomp on R 1.0.0 states that the component sdev of the
return value is the eigenvalues of the cov matrix. Am I completely
mistaken, or should this be the _square root_ of the eigenvalues?
Also, the documentation is not very clear about how tol is used to omit
components. (The _code_ is clear, though. :-)
--
B/H
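Both points can be checked numerically. The sketch below uses the built-in `USArrests` data; the `tol` value of 0.1 is an arbitrary illustration. It compares `sdev^2` with the eigenvalues of the covariance matrix, then counts the components whose standard deviation exceeds `tol` times that of the first component.

```r
x <- as.matrix(USArrests)
p <- prcomp(x)

# sdev holds square roots of the eigenvalues of the covariance matrix
ev <- eigen(cov(x), symmetric = TRUE)$values
all.equal(p$sdev^2, ev)

# Components are omitted when sdev <= tol * sdev[1]
tol <- 0.1
p_tol <- prcomp(x, tol = tol)
kept <- sum(p$sdev > tol * p$sdev[1])
ncol(p_tol$rotation) == kept
```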
2006 Jun 16
2
bug in prcomp (PR#8994)
The following seems to be a bug in prcomp():
> test <- ts( matrix( c(NA, 2:5, NA, 7:10), 5, 2))
> test
Time Series:
Start = 1
End = 5
Frequency = 1
Series 1 Series 2
1 NA NA
2 2 7
3 3 8
4 4 9
5 5 10
> prcomp(test, scale.=TRUE, na.action=na.omit)
Error in svd(x, nu = 0) : infinite or missing values in 'x'
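A possible workaround, offered as an assumption rather than as the fix adopted for this report, is to drop incomplete rows before calling `prcomp`. The sketch uses a plain matrix with the same values instead of the `ts` object.

```r
# Same values as in the report, as a plain matrix rather than a ts object
m <- matrix(c(NA, 2:5, NA, 7:10), 5, 2)

# Remove rows containing NA before the decomposition
m_complete <- na.omit(m)

p <- prcomp(m_complete, scale. = TRUE)
nrow(p$x)  # number of rows that entered the analysis
```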
2016 Mar 30
1
reg-tests-1a fails with r70391
Hi,
This may be a `transitional' bug but I am reporting a make check
fail with R-devel r70391 in reg-tests-1a.Rout. The tail of
reg-tests-1a.Rout.fail is
> ## prcomp(tol=1e-6)
> x <- matrix(runif(30),ncol=10)
> s <- prcomp(x, tol=1e-6)
> stopifnot(length(s$sdev) == ncol(s$rotation))
Error: length(s$sdev) == ncol(s$rotation) is not TRUE
Execution halted
Looking at
2002 Oct 29
0
patch to mva:prcomp to use La.svd instead of svd (PR#2227)
Per the discussion about the problems with prcomp() when n << p, which
boils down to a problem with svd() when n << p,
here is a patch to prcomp() which substitutes La.svd() instead of svd().
-Greg
(This is really a feature enhancement, but submitted to R-bugs to make sure
it doesn't get lost. )
*** R-1.6.0/src/library/mva/R/prcomp.R Mon Aug 13 17:41:50 2001
---
2009 Dec 23
1
prcomp : plotting only explanatory axis arrows
Dear all,
I have a very large dataset (1712351, 20) and would like
to plot only the arrows that represent the
contribution of each variable.
On the sample below I would like to plot
only the explanatory variables (Murder, Assault..)
and not the sites.
prcomp(USArrests) # inappropriate
prcomp(USArrests, scale = TRUE)
prcomp(~ Murder + Assault + Rape, data = USArrests, scale = TRUE)
2009 Mar 08
2
prcomp(X,center=F) ??
I do not understand, from a PCA point of view, the option center=F
of prcomp()
According to the help page, the calculation in prcomp() "is done by a
singular value decomposition of the (centered and possibly scaled) data
matrix, not by using eigen on the covariance matrix" (as it's done by
princomp()) .
"This is generally the preferred method for numerical accuracy"
2006 Mar 25
1
Suggest patch for princomp.formula and prcomp.formula
Dear all,
perhaps I am using princomp.formula and prcomp.formula in a way that
is not documented to work, but then the documentation just says:
formula: a formula with no response variable.
Thus, to avoid a lot of typing, it would be nice if one could use '.'
and '-' in the formula, e.g.
> library(DAAG)
> res <- prcomp(~ . - case - site - Pop - sex, possum)
2004 Jan 15
2
prcomp scale error (PR#6433)
Full_Name: Ryszard Czerminski
Version: 1.8.1
OS: GNU/Linux
Submission from: (NULL) (205.181.102.120)
prcomp(..., scale = TRUE) does not work correctly:
$ uname -a
Linux 2.4.20-28.9bigmem #1 SMP Thu Dec 18 13:27:33 EST 2003 i686 i686 i386
GNU/Linux
$ gcc --version
gcc (GCC) 3.2.2 20030222 (Red Hat Linux 3.2.2-5)
> a <- matrix(rnorm(6), nrow = 3)
> sum((scale(a %*% svd(cov(a))$u, scale
2008 May 18
1
predict.prcomp: 'newdata' does not have the correct number of columns
Hi,
I'm doing PCA on wide matrices and I don't understand why calling
predict.prcomp on it throws an error:
> x1 <- matrix(rnorm(100), 5, 20)
> x2 <- matrix(rnorm(100), 5, 20)
> p <- prcomp(x1)
> predict(p, x2)
Error in predict.prcomp(p, x2) :
'newdata' does not have the correct number of columns
> dim(x2)
[1] 5 20
> dim(p$rotation)
[1] 20 5
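The projection can also be done by hand: center the new data with the centers stored in the fit and multiply by the rotation matrix. This is a sketch of that calculation with the same dimensions as above; the seed is added only for reproducibility.

```r
set.seed(1)
x1 <- matrix(rnorm(100), 5, 20)
x2 <- matrix(rnorm(100), 5, 20)

p <- prcomp(x1)

# Center the new data with the training centers, then rotate.
# The rotation has 20 rows (variables) and 5 columns (components),
# so newdata must have 20 columns.
scores <- scale(x2, center = p$center, scale = FALSE) %*% p$rotation
dim(scores)
```

The fit here uses the default `scale. = FALSE`, so no rescaling step is needed; a fit made with `scale. = TRUE` would also need division by `p$scale`.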
2012 Aug 23
1
Accessing the (first or more) principal component with princomp or prcomp
Hi,
To my knowledge, there are two functions that can do principal component
analysis, princomp and prcomp.
I don't really know the difference; the only thing I know is that when
the sample size < number of variables, only prcomp will work. Could someone
tell me the difference or where I can find an easy-to-read reference?
To access the first PC using princomp:
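A short sketch of extracting the first component's scores from both functions, using the built-in `USArrests` data as a stand-in; this is not the poster's own code. `princomp` stores scores in `$scores`, `prcomp` stores them in `$x`.

```r
x <- as.matrix(USArrests)

# Scores on the first principal component
pc1_princomp <- princomp(x)$scores[, 1]
pc1_prcomp   <- prcomp(x)$x[, 1]

# The two agree up to sign, so the absolute correlation is 1
abs(cor(pc1_prcomp, pc1_princomp))
```

The sign of a component is arbitrary, so the two vectors may be mirror images of each other. The reported standard deviations also differ slightly because `princomp` uses divisor n and `prcomp` uses n - 1.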