Displaying 20 results from an estimated 1000 matches similar to: "Data transformation"
2009 Dec 15
1
Changing Column names in (Output) csv file
Dear R helpers
Following is a part of R code.
data_lab <- expand.grid(c("R11", "R12", "R13"), c("R21", "R22", "R23"), c("R31", "R32", "R33"), c("R41", "R42", "R43"), c("R51", "R52", "R53"), c("R61", "R62", "R63"),
2007 Nov 07
2
creating a dynamic output vector
Let's say I have a program that returns variables whose names may be any
string within the vector
NAMES=c("varA","varB","varC","varD","varE","varF"..."varZ"), but I do
not ever know which ones have actually been created. So in one example
output, "varA", "varC", and "varD" could exist, but
2011 Jun 23
2
Merging multiple data sets
Hi,
I am trying to merge data similar to the example data below
> dat0
id var1 var2 var3
2 1 0 1
3 1 0 1
4 0 1 1
5 0 1 1
> dat1
id var4 var5 var6
2 1 0 1
3 1 0 1
6 0 1 1
7 0 1 1
> dat2
id
2010 Jan 26
6
Help
> Dear All
>
> I have data as follows.
>
> D T M L
> 0.20 1 03 141
> 0.32 1 07 62
> 0.50 1 05 49
> 0.80 1 04 46
> 0.20 2 14 130
> 0.32 2 17 52
> 0.50 2 13 41
> 0.80 2 14 36
> 0.20 3 24 120
> 0.32
2011 Nov 04
1
How to use 'prcomp' with CLUSPLOT?
Hello,
I have a large data set that has more columns than rows (sample data below). I am trying to perform a partitioning cluster analysis and then plot that using pca. I have tried using CLUSPLOT(), but that only allows for 'princomp' where I need 'prcomp' as I do not want to reduce my columns. Is there a way to edit the CLUSPLOT() code to use 'prcomp', please?
#
2012 Oct 25
1
cut point in ROC
var1 var2 var3 var4 var5 var6 var7 var8 var9 var10 gold
2 3 1 2 4 0 1 4 4 3 2
2 4 2 4 3 4 2 4 4 4 2
3 3 0 0 4 1 0 2 4 4 2
1 4 0 3 2 0 0 2 4 4 2
3 4 0 2 2 0 0 0 3 4 2
2 2 3 2 2 0 0 0 2 4 2
2 4 1 1 2 0 0 3 3 3 2
3 4 1 4 0 0 0 0 3 4 2
3 1 0 2 2 1 0 2 3 3 2
0 3 1 1 1 1 2 1 2 3 2
1
2004 Jan 08
1
(no subject)
Hello,
I have trouble converting a character string to an R object. Let me describe this with an example:
> dim(a)
[1] 270 14
> dim("a")
NULL
> names(a)
[1] "Var1" "Var2" "Var3" "Var4" "Var5" "Var6" "Var7" "Var8" "Var9"
[10] "Var10" "Var11" "Var12"
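A minimal sketch of one way to go from a name held as a string to the object itself, using base R's `get()`. The data frame `a` below is a stand-in built only for illustration (the poster's actual data is not shown), and `obj_name` is a hypothetical variable name.

```r
# Stand-in for the poster's object; the real `a` (270 x 14) is not shown in the post.
a <- as.data.frame(matrix(0, nrow = 270, ncol = 14))

obj_name <- "a"               # the object's name, held as a character string

dim(obj_name)                 # NULL: a plain string has no dimensions
dims <- dim(get(obj_name))    # get() looks up the object called "a"
dims
```

`get()` searches the calling environment by default; if the object lives elsewhere, its `envir` argument can point the lookup there.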
2010 Dec 03
2
Add columns of dataset
Dear all,
I have a dataset that looks like
id var1 var2 var4 var7 var8
1 0.0 0.1 0.3 0.9 0.0
2 0.4 0.6 0.0 0.0 0.2
3 0.0 0.0 0.0 0.8 0.7
Some columns are missing, for example, here the fourth (var3), sixth (var5)
and seventh (var6) columns. I want to first determine which columns are
missing in a huge dataset and then add the missed
2012 Sep 27
1
List of Variables in Original Order
I am trying to Sweave the output of calculating correlations between one
variable and several others. I wanted to print a table where the
odd-numbered rows contain the variable names and the even-numbered rows
contain the correlations. So if VarA is correlated with all the variables in
mydata.df, then it would look like
var1 var2 var3
corr1 corr2 corr3
var4 var5
2010 Nov 08
2
Several lattice plots on one page
Dear all,
I am trying (!!!) to generate pdfs that have 8 plots on one page:
df = data.frame(
day = c(1,2,3,4),
var1 = c(1,2,3,4),
var2 = c(100,200,300,4000),
var3 = c(10,20,300,40000),
var4 = c(100000,20000,30000,4000),
var5 = c(10,20,30,40),
var6 = c(0.001,0.002,0.003,0.004),
var7 = c(123,223,123,412),
var8 = c(213,123,234,435),
all = as.factor(c(1,1,1,1)))
2006 Feb 21
6
How to sum values across multiple variables using a wildcard?
I have a dataframe called "data" with 5 records (in rows) each of
which has been scored on each of many variables (in columns).
Five of the variables are named var1, var2, var3, var4, var5 using
headers. The other variables are named using other conventions.
I can create a new variable called var6 with the value 15 for each
record with this code:
> var6=var1+var2+var3+var4+var5
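One possible way to avoid typing every name is to select the columns by a pattern and sum across rows. This is only a sketch: the data frame below is made up to match the description (five records, var1 to var5, each row summing to 15), and the column `other` and the helper name `var_cols` are assumptions for illustration.

```r
# Hypothetical data: 5 records, var1..var5 plus one column named differently.
data <- data.frame(var1 = rep(1, 5), var2 = rep(2, 5), var3 = rep(3, 5),
                   var4 = rep(4, 5), var5 = rep(5, 5), other = rep(10, 5))

# Pick the columns whose names are "var" followed only by digits.
var_cols <- grep("^var[0-9]+$", names(data), value = TRUE)

# Sum the selected columns within each row and store the result as var6.
data$var6 <- rowSums(data[, var_cols])
data$var6
```

Note that once `var6` exists it also matches the pattern, so the column selection should be done before the new column is added if the code is run again.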
2008 Apr 28
1
error in summary.Design
Dear list,
after fitting an lrm with the Design package (stored as "mymodel") I
try running a summary, but I get the following error:
dim(mydata)
[1] 235 9
names(mydata)
[1] "id" "VAR1" "VAR2" "VAR3" "VAR4" "VAR5" "VAR6" "VAR7" "VAR8"
summary(mymodel)
Error in `contrasts<-`(`*tmp*`,
2010 Dec 31
3
Changing column names
Dear R helpers
Wish you all a very Happy and Prosperous New Year 2011.
I have following query.
country = c("US", "France", "UK", "NewZealand", "Germany", "Austria", "Italy", "Canada")
Through some other R process, the result.csv file is generated as
result.csv
var1 var2 var3 var4 var5 var6 var7
2009 Nov 10
1
Data transformation
Dear all,
I have a dataset as below:
id code1 code2 p
1 4 8 0.1
1 5 7 0.9
2 1 8 0.4
2 6 2 0.2
2 4 3 0.6
3 5 6 0.7
3 7 5 0.9
I just want to rewrite it as this (vertical to horizontal):
id var1 var2 var3
2004 Feb 13
3
Calculate Closest 5 Cases?
I've only begun investigating R as a substitute for SPSS.
I have a need to identify for each CASE the closest (or most similar) 5
other CASES (not including itself as it is automatically the closest). I
have a fairly large matrix (50000 cases by 50 vars). In SPSS, I can use Correlate > Distances to generate a matrix of similarity, but only on a small sample. The entire matrix can not
2012 Sep 21
1
translating SAS proc mixed into R lme()
Dear R users,
I need help translating this SAS code into R with lme(). I have
longitudinal data with repeated measures (measurements are equally spaced
in time, subjects are measured several times a year). I need to allow the
slope and intercept to vary.
The SAS code is:
proc mixed data = survey method=reml;
class subject var1 var3 var2 time;
model score = var2 score_base var4 var5 var3
2012 May 27
2
Unable to fit model using “lrm.fit”
Hi,
I am running a logistic regression model using lrm library and I get the
following error when I run the command:
mod1 <- lrm(death ~ factor(score), x=T, y=T, data = env1)
Unable to fit model using “lrm.fit”
where score is a numeric variable from 0 to 6.
LRM executes fine for the following commands:
mod1 <- lrm(death ~ score, x=T, y=T, data = env1)
mod1<- lrm(death ~
2012 Mar 23
1
Memory limits for MDSplot in randomForest package
Hello,
I am struggling to produce an MDS plot using the randomForest package
with a moderately large data set. My data set has one categorical
response variable, 7 predictor variables and just under 19000
observations. That means my proximity matrix is approximately 133000
by 133000 which is quite large. To train a random forest on this large
a dataset I have to use my institution's high
2008 Mar 07
0
linear discriminant analysis / search
Dear R help list,
I have a training dataset that looks like Table1.
I have an unknown dataset that looks like Table2.
I want a program that searches the training dataset and
identifies which category (type1, type2 or type3) the unknown sample
belongs to,
and if the unknown does not belong to any of the categories, it should
let me know.
The real dataset has 600 variables and
2008 Mar 07
0
linear discriminant analysis
Dear R help list,
I have a training dataset that looks like Table1.
I have an unknown dataset that looks like Table2.
I want a program that searches the training dataset and
identifies which category (type1, type2 or type3) the unknown sample
belongs to,
and if the unknown does not belong to any of the categories, it should
let me know.
The real dataset has 600 variables and