Displaying 20 results from an estimated 800 matches similar to: "How to avoid converting "_" to "." ?"
2012 Jan 10
strange Sys.Date() side effect
Any ideas what is the problem with this code?
> N <- 2; c(Sys.Date(), sprintf('N = %d', N))
[1] "2012-01-10" NA
Warning message:
In as.POSIXlt.Date(x) : NAs introduced by coercion
Best regards,
Ryszard Czerminski
AstraZeneca Pharmaceuticals LP
35 Gatehouse Drive
Waltham, MA 02451
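A possible explanation, offered as a sketch rather than a confirmed diagnosis: c() dispatches on the class of its first argument, so with a Date in first position the string is coerced to a Date as well (whether that yields NA with a warning or an error depends on the R version). Converting the date to character before combining avoids the coercion:

```r
N <- 2
# Convert the Date to character first, so c() builds a plain character
# vector instead of trying to turn the string into a Date.
res <- c(format(Sys.Date()), sprintf("N = %d", N))
print(res)
```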
2012 Jan 12
strsplit() does not split on "."?
Any ideas what is wrong?
> strsplit("a.b", ".") # generates empty strings with split="."
[1] "" "" ""
> strsplit("a b", " ") # seems to work fine with split=" ", and other
[1] "a" "b"
> R.Version()
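By default strsplit() treats the split argument as a regular expression, and "." matches any character, so every character acts as a separator. A minimal sketch of two ways to split on a literal dot:

```r
# fixed = TRUE treats the split string literally instead of as a regex
parts_fixed <- strsplit("a.b", ".", fixed = TRUE)[[1]]
# alternatively, escape the dot inside the regular expression
parts_regex <- strsplit("a.b", "\\.")[[1]]
print(parts_fixed)  # "a" "b"
print(parts_regex)  # "a" "b"
```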
2011 Nov 23
bizarre seq() behavior?
Is there any rational explanation for the bizarre seq() behavior below?
> seq(2,8.1, lenght.out=3)
[1] 2 3 4 5 6 7 8
> help(seq)
> seq(2,8,length.out=3)
[1] 2 5 8
> seq(2,8.1,length.out=3)
[1] 2.00 5.05 8.10
Except maybe that it is early in the morning :)
Best regards,
Ryszard Czerminski
AstraZeneca Pharmaceuticals LP
35 Gatehouse Drive
Waltham, MA 02451
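The likely cause is the misspelled argument name in the first call: "lenght.out" does not match "length.out", so it falls into the "..." of seq() and is disregarded, leaving a plain from/to sequence with step 1. A small sketch of the comparison (newer R versions may warn about the extra argument, hence the suppressWarnings()):

```r
# Misspelled argument: ignored, so this behaves like seq(2, 8.1)
typo <- suppressWarnings(seq(2, 8.1, lenght.out = 3))
# Correct spelling: three equally spaced points from 2 to 8.1
ok <- seq(2, 8.1, length.out = 3)
print(typo)  # 2 3 4 5 6 7 8
print(ok)    # 2.00 5.05 8.10
```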
2002 Jun 20
how to skip NA columns ?
na.omit() can be used to remove rows with NAs,
but how can I remove columns, and keep track of which columns have been removed?
I guess I can do t(na.omit(t(o))) as shown below, but this probably creates
a lot of overhead and I do not know which columns
have been removed.
> o
[,1] [,2] [,3]
[1,] 1 NA 7
[2,] 2 NA 8
[3,] 3 NA 9
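One way to do this without transposing, sketched here on a matrix rebuilt from the printout above (the variable names has_na, removed and o_clean are only illustrative):

```r
o <- matrix(c(1, 2, 3, NA, NA, NA, 7, 8, 9), nrow = 3)
# TRUE for every column that contains at least one NA
has_na <- colSums(is.na(o)) > 0
# indices of the columns that will be dropped
removed <- which(has_na)
# drop = FALSE keeps the result a matrix even if one column remains
o_clean <- o[, !has_na, drop = FALSE]
print(removed)  # 2
print(o_clean)
```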
2002 Jun 20
problem with predict()
It is most probably just my R-ignorance, but I have the following problem with
using predict(). I train the model using 164 cases and then I try to use
it on the data set with 35 cases, but I am getting 164 predictions ?
R-code below illustrates in more detail what I am doing.
Truly yours,
train = read.csv("train.csv", header = TRUE, row.names = "mol",
2010 Mar 22
sets package: converting a set to data frame?
I just started using the nice package "sets"
and I wonder if there are utilities to convert (some) sets to data frame
(as in the example below)
> library(sets)
> a <- gset(elements = list(e('A', 0.1), e('B', 0.8)))
> lst <- as.list(a)
> nr <- length(lst)
> rnames <- character()
> for (i in 1:nr) rnames[i] <- lst[[i]]
> df <-
2003 Dec 09
problem with pls(x, y, ..., ncomp = 16): Error in inherits(x, "data.frame") : subscript out of bounds
I don't know the details of pls (in the pls.pcr package, I assume), but if
you use validation="CV", that says you want to use CV to select the best
number of components. Then why would you specify ncomp as well?
> From: ryszard.czerminski at pharma.novartis.com
> When I try to use ncomp parameter in pls procedure I get
> following error:
> >
2010 Sep 27
smooth contour lines
Is there an easy way to control smoothness of the contour lines?
In the plot I am working on, the contour lines come out jagged because
of undersampling, but it is clear "by eye" that they should
be basically straight lines.
In the maps package I found the smooth.map function, but maybe there is a more
generic way of accomplishing the same thing.
Ideally there would be an option to control
2012 Jan 25
Error in predict.randomForest ... subscript out of bounds with NULL name in X
RF trains fine with X, but fails on prediction
> library(randomForest)
> chirps <-
> temp <-
> X <- cbind(1,chirps)
> rf <- randomForest(X, temp)
> yp <- predict(rf, X)
Error in predict.randomForest(rf, X) : subscript
2011 Jan 20
randomForest: too many elements specified?
I am getting "Error in matrix(0, n, n) : too many elements specified"
while building randomForest model, which looks like memory allocation
Software versions are: randomForest 4.5-25, R version 2.7.1
Dataset is big (~90K rows, ~200 columns), but this is on a big machine
(~120G RAM)
and I call randomForest like this: randomForest(x,y)
i.e. in supervised mode and not requesting
2010 Oct 22
how fit linear model with fixed slope?
I want to fit a linear model with fixed slope e.g. y = x + b
(instead of general: y = a*x + b)
Is it possible to do with lm()?
Confidentiality Notice: This message is private and may ...{{dropped:11}}
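This can be done in lm() with an offset term, which enters the model with its coefficient fixed at 1. A minimal sketch on simulated data (the data and the true intercept of 3 are made up for illustration):

```r
set.seed(1)
x <- 1:20
y <- x + 3 + rnorm(20, sd = 0.1)  # simulated data, slope fixed at 1
# offset(x) adds x with coefficient 1; only the intercept b is estimated
fit <- lm(y ~ 1 + offset(x))
b <- coef(fit)[["(Intercept)"]]
print(b)
```

For a fixed slope other than 1, say a hypothetical value a0, the same idea works with offset(a0 * x).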
2003 Nov 12
column extraction by name ?
I have a data frame (df) with columns x, y and z.
e.g. df <- data.frame(x = sample(4), y = sample(4), z = sample(4))
I can extract column z by: df$z or df[3]
I can also extract columns x,y by: df[1:2] or by df[-3].
Is it possible to extract x,y columns in a "symbolic" fashion i.e.
by equivalent of df[-z] (which is illegal) ???
Or alternatively, is there an equivalent of
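A few base R ways to select columns by name, sketched with the data frame from the question (xy1, xy2 and xy3 are illustrative names):

```r
df <- data.frame(x = sample(4), y = sample(4), z = sample(4))
# everything except "z", using set difference on the column names
xy1 <- df[setdiff(names(df), "z")]
# subset() accepts negated column names in its select argument
xy2 <- subset(df, select = -z)
# or name the wanted columns directly
xy3 <- df[c("x", "y")]
print(names(xy1))  # "x" "y"
```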
2010 Oct 04
plotmath: how to use greek symbols in expression(integral(f(tau)*dtau, 0, t))?
I would like to use Greek "tau" as a symbol of variable to integrate
over in plotmath
expression(integral(f(tau)*dtau, 0,t))
but nothing seems to work. I tried d{\tau}, d\tau, etc.,
without any success
Is it possible? How can I accomplish this?
Best regards,
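A possible fix, as a sketch: plotmath draws the bare symbol tau as the Greek letter, but "dtau" is a single unknown name and is drawn literally. Writing d * tau juxtaposes a plain "d" with the Greek tau. The null PDF device is used here only so the example runs without a display.

```r
# d * tau is rendered as "d" followed by the Greek letter tau
lab <- expression(integral(f(tau) * d * tau, 0, t))
pdf(NULL)  # null device: nothing is written to disk
plot(1:10, main = lab)
invisible(dev.off())
```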
2010 Nov 18
how to find near neighbors?
I am looking for an efficient way to find near neighbors...
More specifically...
I have two sets of points: A & B and I want to find
points in set B which are closer to set A than some
cutoff (or n-closest)
I will appreciate very much any pointers...
2006 Nov 03
R CMD BATCH: unable to start device PNG
And on that note, here is a function that I use to get around it:
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Jeffrey Horner
Sent: Friday, November 03, 2006 10:01 AM
To: ryszard.czerminski at novartis.com
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] R CMD BATCH: unable to start device PNG
2003 Oct 24
how to remove NaN columns ?
How can I remove columns with NaN entries ?
Here is my simple example:
> data <- read.csv("test.csv")
> xdata <- data[3:length(data)]
> xs <- lapply(xdata, function(x){(x - mean(x))/sqrt(var(x))})
> x <- data.frame(xs)
> x
1 -0.7071068 NaN -0.7071068 -0.7071068
2 0.7071068 NaN 0.7071068 0.7071068
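One way to drop such columns, sketched on a small data frame that mimics the printout above (the column names a, b, c are hypothetical, since the header line is not shown). Note that is.nan() has no data frame method, so it is applied column by column:

```r
x <- data.frame(a = c(-0.7071068, 0.7071068),
                b = c(NaN, NaN),
                c = c(-0.7071068, 0.7071068))
# TRUE for each column that contains at least one NaN
has_nan <- vapply(x, function(col) any(is.nan(col)), logical(1))
# keep only the columns without NaN
x_clean <- x[, !has_nan, drop = FALSE]
print(names(x_clean))  # "a" "c"
```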
2003 Oct 31
print(), cat() and simple I/O in R
I am trying to produce rather mundane output of the form e.g.
pi, e = 3.14 2.718
The closest result I achieved so far with print() is:
> print (c(pi, exp(1)), digits = 3)
[1] 3.14 2.72
> print(c("pi, e =", pi, exp(1)), digits = 3)
[1] "pi, e =" "3.14159265358979" "2.71828182845905"
I understand that c() promotes floats to strings and
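For this kind of formatted output, sprintf() combined with cat() is one option. A minimal sketch that produces the line asked for above:

```r
# sprintf() formats each number separately, so no coercion via c() is needed
line <- sprintf("pi, e = %.2f %.3f", pi, exp(1))
cat(line, "\n", sep = "")  # pi, e = 3.14 2.718
```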
2004 Jan 15
prcomp scale error (PR#6433)
Full_Name: Ryszard Czerminski
Version: 1.8.1
OS: GNU/Linux
Submission from: (NULL) (
prcomp(..., scale = TRUE) does not work correctly:
$ uname -a
Linux 2.4.20-28.9bigmem #1 SMP Thu Dec 18 13:27:33 EST 2003 i686 i686 i386
$ gcc --version
gcc (GCC) 3.2.2 20030222 (Red Hat Linux 3.2.2-5)
> a <- matrix(rnorm(6), nrow = 3)
> sum((scale(a %*% svd(cov(a))$u, scale
2006 Apr 07
strange matrix behaviour: is there a matrix with one row?
Consider this:
> y <- matrix(1:8, ncol=2)
> is.matrix(y[-c(1,2),])
[1] TRUE
> is.matrix(y[-c(1,2,3),])
> is.matrix(y[-c(1,2,3,4),])
[1] TRUE
It seems like an inconsistent behaviour:
- with 2 or more rows we have a matrix
- with 1 row we do not have a matrix and
- with 0 rows we have a matrix again
I just stumbled on this behaviour, because I had a problem
with my
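This follows from the default drop = TRUE in matrix indexing, which removes dimensions of extent one. A short sketch showing that drop = FALSE keeps the single row as a 1 x 2 matrix:

```r
y <- matrix(1:8, ncol = 2)
# default indexing drops the row dimension when one row is left
dropped <- y[-c(1, 2, 3), ]
# drop = FALSE preserves the matrix structure
one_row <- y[-c(1, 2, 3), , drop = FALSE]
print(is.matrix(dropped))  # FALSE
print(dim(one_row))        # 1 2
```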
2004 Jun 09
how to initialize random seed properly ?
I want to start R processes on multiple processors from single shell
and I want all of them to have different random seeds.
One way of doing this is
sleep 2 # (with 'sleep 1' I am often getting the same number)
Is there a simpler way without a need to sleep between invoking
different R processes ?
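One possible approach, as a sketch only: derive the seed inside each R process from its process ID combined with the current time, so that processes started in the same second still differ. The multiplier 7919 is an arbitrary choice for illustration.

```r
# Combine the process ID with the time in milliseconds, then reduce the
# value to a valid non-negative integer seed.
raw <- Sys.getpid() * 7919 + as.numeric(Sys.time()) * 1000
seed <- as.integer(raw %% .Machine$integer.max)
set.seed(seed)
print(seed)
```

This only makes the seeds different; it does not guarantee statistically independent streams. If that matters, the "L'Ecuyer-CMRG" generator together with the parallel package is worth a look.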