thr3ads.net - similar to: "Data transformation"

Changing Column names in (Output) csv file

2009 Dec 15

1

Dear R helpers, following is a part of R code.
data_lab <- expand.grid(c("R11", "R12", "R13"), c("R21", "R22", "R23"), c("R31", "R32", "R33"), c("R41", "R42", "R43"), c("R51", "R52", "R53"), c("R61", "R62", "R63"),

creating a dynamic output vector

2007 Nov 07

2

Let's say I have a program that returns variables whose names may be any string within the vector NAMES=c("varA","varB","varC","varD","varE","varF"..."varZ"), but I do not ever know which ones have actually been created. So in one example output, "varA", "varC", and "varD" could exist, but

Merging multiple data sets

2011 Jun 23

2

Hi, I am trying to merge data similar to the example data below:
> dat0
id var1 var2 var3
 2    1    0    1
 3    1    0    1
 4    0    1    1
 5    0    1    1
> dat1
id var4 var5 var6
 2    1    0    1
 3    1    0    1
 6    0    1    1
 7    0    1    1
> dat2
id

Help

2010 Jan 26

6

> Dear All
>
> I have data as follows.
>
> D T M L
> 0.20 1 03 141
> 0.32 1 07 62
> 0.50 1 05 49
> 0.80 1 04 46
> 0.20 2 14 130
> 0.32 2 17 52
> 0.50 2 13 41
> 0.80 2 14 36
> 0.20 3 24 120
> 0.32

How to use 'prcomp' with CLUSPLOT?

2011 Nov 04

1

Hello, I have a large data set that has more columns than rows (sample data below). I am trying to perform a partitioning cluster analysis and then plot that using pca. I have tried using CLUSPLOT(), but that only allows for 'princomp' where I need 'prcomp' as I do not want to reduce my columns. Is there a way to edit the CLUSPLOT() code to use 'prcomp', please? #

cut point in ROC

2012 Oct 25

1

var1 var2 var3 var4 var5 var6 var7 var8 var9 var10 gold
   2    3    1    2    4    0    1    4    4     3    2
   2    4    2    4    3    4    2    4    4     4    2
   3    3    0    0    4    1    0    2    4     4    2
   1    4    0    3    2    0    0    2    4     4    2
   3    4    0    2    2    0    0    0    3     4    2
   2    2    3    2    2    0    0    0    2     4    2
   2    4    1    1    2    0    0    3    3     3    2
   3    4    1    4    0    0    0    0    3     4    2
   3    1    0    2    2    1    0    2    3     3    2
   0    3    1    1    1    1    2    1    2     3    2
   1

(no subject)

2004 Jan 08

1

Hello, I have trouble converting a character string to an R object. Let me describe this by an example:
> dim(a)
[1] 270 14
> dim("a")
NULL
> names(a)
[1] "Var1" "Var2" "Var3" "Var4" "Var5" "Var6" "Var7" "Var8" "Var9"
[10] "Var10" "Var11" "Var12"

Add columns of dataset

2010 Dec 03

2

Dear all, I have a dataset that looks like
id var1 var2 var4 var7 var8
 1  0.0  0.1  0.3  0.9  0.0
 2  0.4  0.6  0.0  0.0  0.2
 3  0.0  0.0  0.0  0.8  0.7
Some columns are missing, for example, here the fourth (var3), sixth (var5) and seventh (var6) columns. I want to first determine which columns are missing in a huge dataset and then add the missing

List of Variables in Original Order

2012 Sep 27

1

I am trying to Sweave the output of calculating correlations between one variable and several others. I wanted to print a table where the odd-numbered rows contain the variable names and the even-numbered rows contain the correlations. So if VarA is correlated with all the variables in mydata.df, then it would look like var1 var2 var3 corr1 corr2 corr3 var4 var5

Several lattice plots on one page

2010 Nov 08

2

Dear all, I am trying (!!!) to generate PDFs that have 8 plots on one page:
df = data.frame(
  day = c(1,2,3,4),
  var1 = c(1,2,3,4),
  var2 = c(100,200,300,4000),
  var3 = c(10,20,300,40000),
  var4 = c(100000,20000,30000,4000),
  var5 = c(10,20,30,40),
  var6 = c(0.001,0.002,0.003,0.004),
  var7 = c(123,223,123,412),
  var8 = c(213,123,234,435),
  all = as.factor(c(1,1,1,1)))

How to sum values across multiple variables using a wildcard?

2006 Feb 21

6

I have a dataframe called "data" with 5 records (in rows) each of which has been scored on each of many variables (in columns). Five of the variables are named var1, var2, var3, var4, var5 using headers. The other variables are named using other conventions. I can create a new variable called var6 with the value 15 for each record with this code: > var6=var1+var2+var3+var4+var5

error in summary.Design

2008 Apr 28

1

Dear list, after fitting an lrm with the Design package (stored as "mymodel") I try running a summary, but I get the following error:
dim(mydata)
[1] 235 9
names(mydata)
[1] "id" "VAR1" "VAR2" "VAR3" "VAR4" "VAR5" "VAR6" "VAR7" "VAR8"
summary(mymodel)
Error in `contrasts<-`(`*tmp*`,

Changing column names

2010 Dec 31

3

Dear R helpers, wish you all a very Happy and Prosperous New Year 2011. I have the following query.
country = c("US", "France", "UK", "NewZealand", "Germany", "Austria", "Italy", "Canada")
Through some other R process, the result.csv file is generated as
result.csv
var1 var2 var3 var4 var5 var6 var7

Data transformation

2009 Nov 10

1

Dear all, I have a dataset as below:
id code1 code2 p
 1     4     8 0.1
 1     5     7 0.9
 2     1     8 0.4
 2     6     2 0.2
 2     4     3 0.6
 3     5     6 0.7
 3     7     5 0.9
I just want to rewrite it as this (vertical to horizontal):
id var1 var2 var3

Calculate Closest 5 Cases?

2004 Feb 13

3

I've only begun investigating R as a substitute for SPSS. I have a need to identify for each CASE the closest (or most similar) 5 other CASES (not including itself as it is automatically the closest). I have a fairly large matrix (50000 cases by 50 vars). In SPSS, I can use Correlate > Distances to generate a matrix of similarity, but only on a small sample. The entire matrix can not

translating SAS proc mixed into R lme()

2012 Sep 21

1

Dear R users, I need help translating this SAS code into R with lme(). I have longitudinal data with repeated measures (measurements are equally spaced in time, subjects are measured several times a year). I need to allow the slope and intercept to vary. The SAS code is: proc mixed data = survey method=reml; class subject var1 var3 var2 time; model score = var2 score_base var4 var5 var3

Unable to fit model using “lrm.fit”

2012 May 27

2

Hi, I am running a logistic regression model using lrm library and I get the following error when I run the command: mod1 <- lrm(death ~ factor(score), x=T, y=T, data = env1) Unable to fit model using “lrm.fit” where score is a numeric variable from 0 to 6. LRM executes fine for the following commands: mod1 <- lrm(death ~ score, x=T, y=T, data = env1) mod1<- lrm(death ~

Memory limits for MDSplot in randomForest package

2012 Mar 23

1

Hello, I am struggling to produce an MDS plot using the randomForest package with a moderately large data set. My data set has one categorical response variable, 7 predictor variables and just under 19000 observations. That means my proximity matrix is approximately 133000 by 133000, which is quite large. To train a random forest on this large a dataset I have to use my institution's high

linear discriminant analysis / search

2008 Mar 07

0

Dear R help list, I have a training dataset that looks like Table1. I have an unknown dataset that looks like Table2. I want a program that searches the training dataset and identifies which category (type1, type2 or type3) the unknown sample belongs to, and that also lets me know if the unknown does not belong to any of the categories. The real dataset has 600 variables and

linear discriminant analysis

2008 Mar 07

0

Dear R help list, I have a training dataset that looks like Table1. I have an unknown dataset that looks like Table2. I want a program that searches the training dataset and identifies which category (type1, type2 or type3) the unknown sample belongs to, and that also lets me know if the unknown does not belong to any of the categories. The real dataset has 600 variables and
