Displaying 20 results from an estimated 700 matches similar to: "Conditionally adding a constant"
2011 May 10
1
Filtering out bad data points
Hi,
I have a recurring question about how best to do this in R. I have a data
frame and a set of criteria for filtering points out. My procedure is always
to locate the indices of those points, check whether the index vector's length
is greater than 0, and then remove them. That is:
dftest <- data.frame(x=rnorm(100),y=rnorm(100));
qtile <- quantile(dftest$x,probs=c(0.05,0.95));
badIdx <- which((dftest$x
2012 May 01
3
Data frame vs matrix quirk: Hinky error message?
AdvisoRs:
Is the following a bug, feature, hinky error message, or dumb Bert?
> mtest <- matrix(1:12,nr=4)
> dftest <- data.frame(mtest)
> ix <- cbind(1:2,2:3)
> mtest[ix] <- NA
> mtest
[,1] [,2] [,3]
[1,] 1 NA 9
[2,] 2 6 NA
[3,] 3 7 11
[4,] 4 8 12
## But ...
> dftest[ix] <- NA
Error in `[<-.data.frame`(`*tmp*`, ix, value
2011 Oct 25
2
Logistic Regression - Variable Selection Methods With Prediction
Hello,
I am pretty new to R; I have always used SAS and SAS products. My
target variable is binary ('Y' and 'N') and I have about 14 predictor
variables. My goal is to compare different variable selection methods
like Forward, Backward, and All Possible Subsets. I am using
misclassification rate to pick the winning method.
This is what I have as of now:
Reg <- glm (Graduation ~.,
2008 Jan 10
2
`[.data.frame`(df3, , -2) and NA columns
Dear baseRs,
I recently made a mistake when renaming data frame columns, accidentally
creating an NA column. I found the following strange behavior when negative
indexes are used.
Can anyone explain what happens here? No "workarounds" required, just curious.
Dieter
Version: Windows, R version 2.6.1 (2007-11-26)
#-----------------------------
df = data.frame(a=0:10,b=10:20)
df[,-2]
2009 Jan 21
3
merging several dataframes from a list
Hi there,
I have a list of dataframes (generated by reading multiple files) and all
dataframes are comparable in dimension and column names. They also have a
common column, which I'd like to use for merging. To give a simple example of
what I have:
df1 <- data.frame(c(LETTERS[1:5]), c(2,6,3,1,9))
names(df1) <- c("pos", "data")
df3 <- df2 <- df1
df2$data
2011 Aug 18
2
Best way/practice to create a new data frame from two given ones with last column computed from the two data frames?
Dear expeRts,
What is the best approach to create a third data frame from two given ones, when
the third data frame's last column is computed from the last columns of the two given
data frames?
## Okay, sounds complicated, so here is an example. Assume we have the two data frames:
df1 <- data.frame(Year=rep(2001:2010, each=2), Group=c("Group 1","Group 2"), Value=1:20)
2008 Feb 19
2
Looping through a list of objects & do something...
Hey Folks,
Could somebody show me how to loop through a list of dataframes? I want to
be able to generically access their elements and do something with them.
For instance, instead of this:
df1<- data.frame(x=(1:5),y=(1:5));
df2<- data.frame(x=(1:5),y=(1:5));
df3<- data.frame(x=(1:5),y=(1:5));
plot(df1$x,df1$y);
plot(df2$x,df2$y);
plot(df3$x,df3$y);
I would like to do something like:
2008 Jan 24
2
boxplot axis labelling
Hi,
I'm very new to R, so sorry for what I'm sure is a very basic question. I'm
producing a boxplot with the data below:
df3<-data.frame(
x=c(10,11,115,12,13,14,16,17,18,21,22,23,24,26,27,28,29,3,30,32,33,34,35,4,
41,45,5,50,52,56,58,6,67,6738,68,7,8,9),
fq=c(8,11,1,2,4,4,2,2,6,3,4,2,2,1,1,1,4,51,3,1,1,1,1,35,1,1,19,2,1,1,1,14,1,
1,1,10,13,5),
2010 May 11
2
Regex and gsub
Dear group,
Here is my df :
df3 <-
structure(list(DESCRIPTION = c("COPPER May/10", "COTTON NO.2 Jul/10",
"CRUDE OIL miNY May/10", "GOLD Jun/10", "ROBUSTA COFFEE (10) Jul/10",
"SOYBEANS Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 May/10",
"WHEAT Jul/10", "SPCL HIGH GRADE ZINC USD",
2008 Jul 26
1
Can't get the correct order from melt.data.frame of reshape library.
Simple illustration,
> df3 <- data.frame(id=c(3,2,1,4), age=c(40,50,60,50), dose1=c(1,2,1,2), dose2=c(2,1,2,1), dose4=c(3,3,3,3))
> df3
id age dose1 dose2 dose4
1 3 40 1 2 3
2 2 50 2 1 3
3 1 60 1 2 3
4 4 50 2 1 3
> melt.data.frame(df3, id.var=1:2, na.rm=T)
id age variable value
1 3 40 dose1 1
2 2 50 dose1 2
3 1
2009 Feb 19
3
Multiple merge, better solution?
Hello,
My problem is that I would like to merge multiple files with a common
column, but merge accepts only two
data.frames at a time. In the real situation, I have 26 different
data.frames with a common column. I can of course use merge many times
(see below), but what would be a more sophisticated solution? A for loop?
Any ideas?
DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5))
DF2 <-
2009 Oct 07
1
merging dataframes with an unequal number of variables
Hallo Everyone
I have the kind of problem that one should never have, because one must
always plan well and communicate with one's team. But I haven't, so here
is my problem.
I have data coming in on a daily basis from surveys in 10 towns. The
questionnaire has 62 variables but some of the regions have used older
versions of the questionnaire that have a few variables less. I want to
combine
2008 Mar 10
2
write.table with row.names=FALSE unnecessarily slow?
write.table with large data frames takes quite a long time
> system.time({
+ write.table(df, '/tmp/dftest.txt', row.names=FALSE)
+ }, gcFirst=TRUE)
user system elapsed
97.302 1.532 98.837
One reason is that dimnames is always called, causing 'anonymous' row
names to be created as character vectors. Avoiding this in
src/library/utils, along the lines of
Index:
2004 Mar 15
1
gzfile & read.table on Win32
Hello ...
Are there any known problems or even gotchas to look out for when using a
gzfile connection in read.csv/read.table in Windows?
In the package PROcess, available at
www.bioconductor.org/repository/devel/package/html/PROcess.html
there are two files in the PROcess/inst/Test directory which are of the
extension *.csv.gz.
With both files, if I open up a gzfile connection, say:
vv <-
2011 Dec 07
2
Dividing rows when time is overlapping
Hi all,
I have a dataframe that was created from the fusion of two dataframes. Both
spanned the same time interval but contained different information.
When I put them together, the info overlapped since there are no holes in the
time interval of one of the dataframes. Here is an example where the rows
"sp=A and B" are part of a first df and the rows "sp=C" come from a
2004 Mar 09
5
Adding data.frames together
I have a series of data frames that are identical structurally, i.e. -
made with the same code, but I need to add them together so that they
become one, longer, data frame, i.e. - each of the slot vectors are
increased in length by the length of the added data frame vectors.
So if I have df1 with a slot A so that length(df1$A) = 100 and I have
df2 with a slot A so that length(df2$A)=200 then I
2010 Sep 15
1
Format Data Issue??
R Users,
I am new to R and have tried to figure out how to automate this
process instead of using Excel. I have read this dataframe into R
with read.table. I need to reshape the data from the first table into
the format of the second table.
TractID StandID Species CruiseDate DBHClass TreesPerAcre
Carbon Stand 1 Loblolly Pine 5/20/2010 10 1.2
Carbon Stand 1 Loblolly Pine
2012 Jul 18
2
duplicate data between two data frames according to row names
Hi everybody.
I'll first explain my problem and what I'm trying to do.
Consider this example:
I'm working on 5 different weather stations.
I have first in one file 3 of these 5 weather stations, containing their
data. Here's an example of this file:
DF1 <- data.frame(station=c("ST001","ST004","ST005"),data=c(5,2,8))
And my two other stations in
2011 Jun 06
1
Merge two columns of a data frame
I have the following data:
prefix <- c("cheap", "budget")
roots <- c("car insurance", "auto insurance")
suffix <- c("quote", "quotes")
prefix2 <- c("cheap", "budget")
roots2 <- c("car insurance", "auto insurance")
roots3 <- c("car insurance", "auto
2003 Sep 04
1
error in lm.fit
Hello R user,
I have several data frames with >100 columns and I did a linear regression
over time of each column
df1.lm <- lapply(df1, function(x) lm(x~year)$coeff[2])
that worked fine and I get the slope of each column over time - until I divided
df1 by df2
df3 <- df1/df2
> df3.lm <- lapply(df3, function(x) lm(x~year)$coeff[2])
Error in lm.fit(x, y, offset = offset, ...) :