similar to: Data transformation

2009 Dec 15
1
Changing Column names in (Output) csv file
Dear R helpers   Following is a part of R code.   data_lab <- expand.grid(c("R11", "R12", "R13"), c("R21", "R22", "R23"), c("R31", "R32", "R33"), c("R41", "R42", "R43"), c("R51", "R52", "R53"), c("R61", "R62", "R63"),
2007 Nov 07
2
creating a dynamic output vector
Let's say I have a program that returns variables whose names may be any string within the vector NAMES=c("varA","varB","varC","varD","varE","varF"..."varZ"), but I do not ever know which ones have actually been created. So in one example output, "varA", "varC", and "varD" could exist, but
2011 Jun 23
2
Merging multiple data sets
Hi, I am trying to merge data similar to the example data below > dat0 id var1 var2 var3 2 1 0 1 3 1 0 1 4 0 1 1 5 0 1 1 > dat1 id var4 var5 var6 2 1 0 1 3 1 0 1 6 0 1 1 7 0 1 1 > dat2 id
2010 Jan 26
6
Help
> Dear All > > I have data as follows. > > D T M L > 0.20 1 03 141 > 0.32 1 07 62 > 0.50 1 05 49 > 0.80 1 04 46 > 0.20 2 14 130 > 0.32 2 17 52 > 0.50 2 13 41 > 0.80 2 14 36 > 0.20 3 24 120 > 0.32
2011 Nov 04
1
How to use 'prcomp' with CLUSPLOT?
Hello, I have a large data set that has more columns than rows (sample data below). I am trying to perform a partitioning cluster analysis and then plot that using pca. I have tried using CLUSPLOT(), but that only allows for 'princomp' where I need 'prcomp' as I do not want to reduce my columns. Is there a way to edit the CLUSPLOT() code to use 'prcomp', please? #
2012 Oct 25
1
cut point in ROC
var1 var2 var3 var4 var5 var6 var7 var8 var9 var10 gold 2 3 1 2 4 0 1 4 4 3 2 2 4 2 4 3 4 2 4 4 4 2 3 3 0 0 4 1 0 2 4 4 2 1 4 0 3 2 0 0 2 4 4 2 3 4 0 2 2 0 0 0 3 4 2 2 2 3 2 2 0 0 0 2 4 2 2 4 1 1 2 0 0 3 3 3 2 3 4 1 4 0 0 0 0 3 4 2 3 1 0 2 2 1 0 2 3 3 2 0 3 1 1 1 1 2 1 2 3 2 1
2004 Jan 08
1
(no subject)
Hello, I have trouble converting a character string to a R object. Let me describe this by an example; > dim(a) [1] 270 14 > dim("a") NULL > names(a) [1] "Var1" "Var2" "Var3" "Var4" "Var5" "Var6" "Var7" "Var8" "Var9" [10] "Var10" "Var11" "Var12"
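One common way to go from a name held in a character string to the object itself is base R's get(). The snippet below is only a sketch: the data frame `a` is a hypothetical stand-in with the dimensions quoted in the post, and `obj_name` is an illustrative variable that does not appear in the original message.

```r
# Hypothetical stand-in for the poster's object `a` (270 rows, 14 columns).
a <- as.data.frame(matrix(0, nrow = 270, ncol = 14))

# The object's name held as a character string (illustrative variable).
obj_name <- "a"

# A character vector has no dim attribute, so this is NULL.
dim(obj_name)

# get() looks the name up and returns the object it refers to.
dim(get(obj_name))
```

If the named object might not exist, exists(obj_name) can be checked first to avoid an error from get().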
2010 Dec 03
2
Add columns of dataset
Dear all, I have a dataset that looks like id var1 var2 var4 var7 var8 1 0.0 0.1 0.3 0.9 0.0 2 0.4 0.6 0.0 0.0 0.2 3 0.0 0.0 0.0 0.8 0.7 Some columns are missed, for example, here the fourth (var3), sixth(var5) and seventh (var6) columns. I want to first determine which columns are missed in a huge dataset and then add the missed
2012 Sep 27
1
List of Variables in Original Order
I am trying to Sweave the output of calculating correlations between one variable and several others. I wanted to print a table where the odd-numbered rows contain the variable names and the even-numbered rows contain the correlations. So if VarA is correlated with all the variables in mydata.df, then it would look like var1 var2 var3 corr1 corr2 corr3 var4 var5
2010 Nov 08
2
Several lattice plots on one page
Dear all, I am trying (!!!) to generate pdfs that have 8 plots on one page: df = data.frame( day = c(1,2,3,4), var1 = c(1,2,3,4), var2 = c(100,200,300,4000), var3 = c(10,20,300,40000), var4 = c(100000,20000,30000,4000), var5 = c(10,20,30,40), var6 = c(0.001,0.002,0.003,0.004), var7 = c(123,223,123,412), var8 = c(213,123,234,435), all = as.factor(c(1,1,1,1)))
2006 Feb 21
6
How to sum values across multiple variables using a wildcard?
I have a dataframe called "data" with 5 records (in rows) each of which has been scored on each of many variables (in columns). Five of the variables are named var1, var2, var3, var4, var5 using headers. The other variables are named using other conventions. I can create a new variable called var6 with the value 15 for each record with this code: > var6=var1+var2+var3+var4+var5
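A possible way to sum over columns matched by a name pattern, rather than listing each one, is to select them with grep() and pass them to rowSums(). This is a minimal sketch under stated assumptions: the data frame contents and the `other` column are invented for illustration, and the pattern assumes the target columns are named "var" followed only by digits.

```r
# Hypothetical data frame standing in for the poster's "data":
# five records, columns var1..var5, plus one column named differently.
data <- data.frame(
  var1 = rep(1, 5), var2 = rep(2, 5), var3 = rep(3, 5),
  var4 = rep(4, 5), var5 = rep(5, 5),
  other = letters[1:5]
)

# Names of the columns matching "var" followed by digits only.
var_cols <- grep("^var[0-9]+$", names(data), value = TRUE)

# Row-wise sum over the matched columns, stored as a new column.
data$var6 <- rowSums(data[var_cols])
```

Computing var_cols before assigning var6 keeps the new column out of its own sum. If the real columns can contain missing values, rowSums(..., na.rm = TRUE) is worth considering.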
2008 Apr 28
1
error in summary.Design
Dear list, after fitting an lrm with the Design package (stored as "mymodel") I try running a summary, but I get the following error: dim(mydata) [1] 235 9 names(mydata) [1] "id" "VAR1" "VAR2" "VAR3" "VAR4" "VAR5" "VAR6" "VAR7" "VAR8" summary(mymodel) Error in `contrasts<-`(`*tmp*`,
2010 Dec 31
3
Changing column names
Dear R helpers Wish you all a very Happy and Prosperous New Year 2011. I have following query. country = c("US", "France", "UK", "NewZealand", "Germany", "Austria", "Italy", "Canada") Through some other R process, the result.csv file is generated as result.csv      var1   var2  var3  var4    var5    var6   var7  
2009 Nov 10
1
Data transformation
Dear all, I have a dataset as below: id code1 code2 p 1 4 8 0.1 1 5 7 0.9 2 1 8 0.4 2 6 2 0.2 2 4 3 0.6 3 5 6 0.7 3 7 5 0.9 I just want to rewrite it as this (vertical to horizontal): id var1 var2 var3
2004 Feb 13
3
Calculate Closest 5 Cases?
I've only begun investigating R as a substitute for SPSS. I have a need to identify for each CASE the closest (or most similar) 5 other CASES (not including itself as it is automatically the closest). I have a fairly large matrix (50000 cases by 50 vars). In SPSS, I can use Correlate > Distances to generate a matrix of similarity, but only on a small sample. The entire matrix can not
2012 Sep 21
1
translating SAS proc mixed into R lme()
Dear R users, I need help with translating these SAS codes into R with lme()? I have a longitudinal data with repeated measures (measurements are equally spaced in time, subjects are measured several times a year). I need to allow slope and intercept vary. SAS codes are: proc mixed data = survey method=reml; class subject var1 var3 var2 time; model score = var2 score_base var4 var5 var3
2012 May 27
2
Unable to fit model using “lrm.fit”
Hi, I am running a logistic regression model using lrm library and I get the following error when I run the command: mod1 <- lrm(death ~ factor(score), x=T, y=T, data = env1) Unable to fit model using “lrm.fit” where score is a numeric variable from 0 to 6. LRM executes fine for the following commands: mod1 <- lrm(death ~ score, x=T, y=T, data = env1) mod1<- lrm(death ~
2012 Mar 23
1
Memory limits for MDSplot in randomForest package
Hello, I am struggling to produce an MDS plot using the randomForest package with a moderately large data set. My data set has one categorical response variable, 7 predictor variables and just under 19000 observations. That means my proximity matrix is approximately 133000 by 133000 which is quite large. To train a random forest on this large a dataset I have to use my institution's high
2008 Mar 07
0
linear discriminant analysis / search
Dear R help list, I have a training dataset that looks like Table1. I have an unknown dataset that looks like Table2. I want to have a program that should search the training dataset and identify that the unknown sample belongs to which category (type1, type2 or type3) and also if the unknown does not belong to any of the categories, it should let me know. The real dataset has 600 variables and
2008 Mar 07
0
linear discriminant analysis
Dear R help list, I have a training dataset that looks like Table1. I have an unknown dataset that looks like Table2. I want to have a program that should search the training dataset and identify that the unknown sample belongs to which category (type1, type2 or type3) and also if the unknown does not belong to any of the categories, it should let me know. The real dataset has 600 variables and