similar to: aggregating strings

2009 Sep 25
7
Splitting columns, strings or reg exp returning substrings
Currently as the first column in a data frame I have string values in the format xx_yy - I want to create a new column with just the substring xx (for each row in turn). Three possible ways to do this might be (1) split the string by '_' using strsplit and paste the first of the resulting variables into a new column, but I have been unable to do this for each row of my data frame in turn
2009 Jan 20
2
Merging tables
I am relatively new to R and am trying to do some basic data manipulation. Basically I have a table (csv - table 1) of data for a set of samples (rows), and a second table (table 2) of information about a subset of samples of particular interest. I want to pull out the data from table 1 for the samples in table 2, either by: * Merging the two tables based on a common identifier (SampleID - may
2012 Jan 10
2
strange Sys.Date() side effect
Any ideas what is the problem with this code? > N <- 2; c(Sys.Date(), sprintf('N = %d', N)) [1] "2012-01-10" NA Warning message: In as.POSIXlt.Date(x) : NAs introduced by coercion Best regards, Ryszard Ryszard Czerminski AstraZeneca Pharmaceuticals LP 35 Gatehouse Drive Waltham, MA 02451 USA 781-839-4304 ryszard.czerminski@astrazeneca.com
2011 Nov 23
2
bizarre seq() behavior?
Is there any rational explanation for the bizarre seq() behavior below? > seq(2,8.1, lenght.out=3) [1] 2 3 4 5 6 7 8 > help(seq) > seq(2,8,length.out=3) [1] 2 5 8 > seq(2,8.1,length.out=3) [1] 2.00 5.05 8.10 Except maybe that it is early in the morning :) Best regards, Ryszard Ryszard Czerminski AstraZeneca Pharmaceuticals LP 35 Gatehouse Drive Waltham, MA 02451 USA 781-839-4304
2012 Jan 12
3
strsplit() does not split on "."?
Any ideas what is wrong? > strsplit("a.b", ".") # generates empty strings with split="." [[1]] [1] "" "" "" > strsplit("a b", " ") # seems to work fine with split=" ", and other characters... [[1]] [1] "a" "b" > > R.Version() $platform [1]
2010 Sep 27
1
smooth contour lines
Is there an easy way to control the smoothness of contour lines? In the plot I am working on, the contour lines I am getting are jagged due to undersampling, but it is clear "by eye" that these should be basically straight lines. In the maps package I found the smooth.map function, but maybe there is a more generic way of accomplishing the same thing. Ideally there would be an option to control
2012 Jan 25
1
Error in predict.randomForest ... subscript out of bounds with NULL name in X
RF trains fine with X, but fails on prediction > library(randomForest) > chirps <- c(20,16.0,19.8,18.4,17.1,15.5,14.7,17.1,15.4,16.2,15,17.2,16,17,14.1) > temp <- c(88.6,71.6,93.3,84.3,80.6,75.2,69.7,82,69.4,83.3,78.6,82.6,80.6,83.5,76.3) > X <- cbind(1,chirps) > rf <- randomForest(X, temp) > yp <- predict(rf, X) Error in predict.randomForest(rf, X) : subscript
2001 Feb 07
5
zero inflated poisson and censored-continuous models
I wonder if there is a package that will estimate a Zero Inflated Poisson Model (ZIP), and also if there is a package that will estimate what is called the Tobit model: that is a combination of censored and observed values in the same sample. Georgina Bermann Biostatistics AstraZeneca R&D Mölndal
2011 Jan 20
1
randomForest: too many elements specified?
I am getting "Error in matrix(0, n, n) : too many elements specified" while building a randomForest model, which looks like a memory allocation error. Software versions are: randomForest 4.5-25, R version 2.7.1. The dataset is big (~90K rows, ~200 columns), but this is on a big machine (~120G RAM) and I call randomForest like this: randomForest(x,y) i.e. in supervised mode and not requesting
2002 Oct 30
2
Problems joining a Samba PDC controlled Domain
Hello, I'm having problems joining a Japanese W2K client with SrvPck 2 installed to my samba 2.2.5 PDC controlled domain. Other clients are no problem (Win98, W2k engl., WXP german). I checked the regKey (for plaintextpassword; set to 1). Reinstalled SrvPck 2, but it didn't help. The problem must be on the client side because I am having the same problem when joining another samba controlled
2002 Oct 23
1
SAMBA and Win2000 SP3
We are presently using SAMBA 2.2 w. Windows 2000 sp1 and will be upgrading to Windows 2000 sp3. Are there any known or suspected problems with the combination of Windows 2000 sp3 and SAMBA 2.2? We are using Solaris 7 on the Unix side. /ola Ola Engström Technical Computing & Information Services AstraZeneca R&D Mölndal S-431 83 Mölndal Sweden
2000 Jan 05
1
Upgrade to 2.0.6 not working
Hi All, We have been running samba-1.9.18p10 beautifully for the past year or so. Finally decided to upgrade to samba-2.0.6 and am having no luck. We are using security=server such that the users are using their NT login/password to authenticate. When I test the 2.0.6 installation with smbclient, I get the "protocol negotiation failure" message. Does anyone know what this means? I
2004 Apr 20
2
Re: [R] Unexpected behaviour of identical (PR#6799)
"Swinton, Jonathan" <Jonathan.Swinton@astrazeneca.com> writes: > # works as expected > > ac <- c('A','B'); > > identical(ac,ac[1:2]) > [1] TRUE > > #but > > af <- factor(ac) > > identical(af,af[1:2]) > [1] FALSE > > Any opinions? Did a cross-check with Splus and it doesn't do that , so I think it
2001 Apr 02
2
standard errors of fitted values are different S-plus survival package and R
Perhaps this question has been asked before: but using the function predict( fit,type="terms",se.fit=T), where fit is a coxph object in S-plus, the estimated standard errors are different. It may be different estimators of the variance of the residuals? Which one is the default in R, I don't find that too easily in the documentation. Does anybody know? I'll be very grateful
2009 Sep 25
2
summarize-plyr package
Hi, I am using the amazing package 'plyr'. I have one problem. I would appreciate help to fix the following error: Thanks. > library(plyr) > data(baseball) > summarise(baseball, + duration = max(year) - min(year), + nteams = length(unique(team))) Error: could not find function "summarise" > ddply(baseball, "id", summarise, +
2017 Sep 09
2
Avoid duplication in dplyr::summarise
Dear group, Is there a way I could avoid the sort of duplication illustrated below? i.e., I have the same dplyr::summarise function on different group_by arguments. So I'd like to create a single summarise function that could be applied to both. My attempt below fails. df <- data.frame(matrix(rnorm(40), 10, 4), f1 = gl(3, 10, labels = letters[1:3]), f2 =
2004 Apr 19
3
How to write an S4 method for sum or a Summary generic
If I have a class Foo, then I can write an S3 method for sum for it: >setClass("Foo",representation(a="integer"));aFoo=new("Foo",a=c(1:3,NA)) >sum.Foo <- function(x,na.rm){print(x);print(na.rm);sum(x@a,na.rm=na.rm)} >sum(aFoo) But how do I write an S4 method for this? All my attempts to do so have foundered. For example
2010 Apr 16
4
score counts in an aggregate function
Dear R-Users, I have a big data set "mydata" with repeated observations and some missing values. It looks like the format below:
userid  sex  item  score1  score2
1       0    1     1       1
1       0    2     0       1
1       0    3     NA      1
1       0    4     1       0
2       1    1     0       1
2       1    2     NA      1
2       1    3     1
2003 Jan 02
1
aggregate: "sum" not meaningful for factors
Dear all, I try to summarise my data per category using aggregate, but for some reason I get the error message "sum" not meaningful for factors even though my vector is numeric. The data set is shown below. Could someone please give a hint. Thanks in advance! Sincerely, Tord > names(test) [1] "ObjektID" "tallstubbyta" > is.factor(test$ObjektID);
2003 Dec 10
1
Article on Asterisk: German Linux magazine
Hi there, a friend just notified me of the cover story of "freeX 1'2004": Linux als Telefonanlage (engl.: Linux as PBX) http://www.cul.de/freex.html Cheers, Philipp