thr3ads.net - similar to: "Conditionally adding a constant"

Displaying 20 results from an estimated 700 matches similar to: "Conditionally adding a constant"

2011 May 10

Filtering out bad data points

Hi, I always have a question about how to do this best in R. I have a data frame and a set of criteria to filter points out. My procedure is to always locate indices of those points, check if index vector length is greater than 0 or not and then remove them. Meaning dftest <- data.frame(x=rnorm(100),y=rnorm(100)); qtile <- quantile(dftest$x,probs=c(0.05,0.95)); badIdx <- which((dftest$x

Data frame vs matrix quirk: Hinky error message?

2012 May 01

AdvisoRs: Is the following a bug, feature, hinky error message, or dumb Bert?
> mtest <- matrix(1:12,nr=4)
> dftest <- data.frame(mtest)
> ix <- cbind(1:2,2:3)
> mtest[ix] <- NA
> mtest
     [,1] [,2] [,3]
[1,]    1   NA    9
[2,]    2    6   NA
[3,]    3    7   11
[4,]    4    8   12
## But ...
> dftest[ix] <- NA
Error in `[<-.data.frame`(`*tmp*`, ix, value

Logistic Regression - Variable Selection Methods With Prediction

2011 Oct 25

Hello, I am pretty new to R; I have always used SAS and SAS products. My target variable is binary ('Y' and 'N') and I have about 14 predictor variables. My goal is to compare different variable selection methods like Forward, Backward, All possible subsets. I am using misclassification rate to pick the winning method. This is what I have as of now: Reg <- glm (Graduation ~.,

`[.data.frame`(df3, , -2) and NA columns

2008 Jan 10

Dear baseRs, I recently made a mistake when renaming data frame columns, accidentally creating an NA column. I found the following strange behavior when negative indexes are used. Can anyone explain what happens here? No "workarounds" required, just curious. Dieter
Version: Windows, R version 2.6.1 (2007-11-26)
#-----------------------------
df = data.frame(a=0:10,b=10:20)
df[,-2]

merging several dataframes from a list

2009 Jan 21

Hi there, I have a list of dataframes (generated by reading multiple files) and all dataframes are comparable in dimension and column names. They also have a common column, which I'd like to use for merging. To give a simple example of what I have: df1 <- data.frame(c(LETTERS[1:5]), c(2,6,3,1,9)) names(df1) <- c("pos", "data") df3 <- df2 <- df1 df2$data

Best way/practice to create a new data frame from two given ones with last column computed from the two data frames?

2011 Aug 18

Dear expeRts, What is the best approach to create a third data frame from two given ones, when the last column of the new data frame is computed from the last columns of the two given data frames? ## Okay, sounds complicated, so here is an example. Assume we have the two data frames: df1 <- data.frame(Year=rep(2001:2010, each=2), Group=c("Group 1","Group 2"), Value=1:20)

Looping through a list of objects & do something...

2008 Feb 19

Hey Folks, Could somebody show me how to loop through a list of dataframes? I want to be able to generically access their elements and do something with them. For instance, instead of this: df1<- data.frame(x=(1:5),y=(1:5)); df2<- data.frame(x=(1:5),y=(1:5)); df3<- data.frame(x=(1:5),y=(1:5)); plot(df1$x,df1$y); plot(df2$x,df2$y); plot(df3$x,df3$y); I would like to do something like:

boxplot axis labelling

2008 Jan 24

Hi, I'm very new to R, so sorry for what I'm sure is a very basic question. I'm producing a boxplot with the data below: df3<-data.frame( x=c(10,11,115,12,13,14,16,17,18,21,22,23,24,26,27,28,29,3,30,32,33,34,35,4,41,45,5,50,52,56,58,6,67,6738,68,7,8,9), fq=c(8,11,1,2,4,4,2,2,6,3,4,2,2,1,1,1,4,51,3,1,1,1,1,35,1,1,19,2,1,1,1,14,1,1,1,10,13,5),

Regex and gsub

2010 May 11

Dear group, Here is my df : df3 <- structure(list(DESCRIPTION = c("COPPER May/10", "COTTON NO.2 Jul/10", "CRUDE OIL miNY May/10", "GOLD Jun/10", "ROBUSTA COFFEE (10) Jul/10", "SOYBEANS Jul/10", "SUGAR NO.11 Jul/10", "SUGAR NO.11 May/10", "WHEAT Jul/10", "SPCL HIGH GRADE ZINC USD",

Can't get the correct order from melt.data.frame of reshape library.

2008 Jul 26

Simple illustration,
> df3 <- data.frame(id=c(3,2,1,4), age=c(40,50,60,50), dose1=c(1,2,1,2), dose2=c(2,1,2,1), dose4=c(3,3,3,3))
> df3
  id age dose1 dose2 dose4
1  3  40     1     2     3
2  2  50     2     1     3
3  1  60     1     2     3
4  4  50     2     1     3
> melt.data.frame(df3, id.var=1:2, na.rm=T)
  id age variable value
1  3  40    dose1     1
2  2  50    dose1     2
3  1

Multiple merge, better solution?

2009 Feb 19

Hello, My problem is that I would like to merge multiple files with a common column, but merge accepts only two data.frames at a time. In the real situation, I have 26 different data.frames with a common column. I can of course use merge many times (see below), but what would be a more sophisticated solution? A for loop? Any ideas? DF1 <- data.frame(var1 = letters[1:5], a = rnorm(5)) DF2 <-

merging dataframes with an unequal number of variables

2009 Oct 07

Hello everyone, I have the kind of problem that one should never have, because one must always plan well and communicate with one's team. But I haven't, so here is my problem. I have data coming in on a daily basis from surveys in 10 towns. The questionnaire has 62 variables, but some of the regions have used older versions of the questionnaire that have a few fewer variables. I want to combine

write.table with row.names=FALSE unnecessarily slow?

2008 Mar 10

write.table with large data frames takes quite a long time
> system.time({
+ write.table(df, '/tmp/dftest.txt', row.names=FALSE)
+ }, gcFirst=TRUE)
   user  system elapsed
 97.302   1.532  98.837
One reason is that dimnames is always called, causing 'anonymous' row names to be created as character vectors. Avoiding this in src/library/utils, along the lines of Index:

gzfile & read.table on Win32

2004 Mar 15

Hello ... Are there any known problems or even gotchas to look out for when using a gzfile connection in read.csv/read.table in Windows? In the package PROcess, available at www.bioconductor.org/repository/devel/package/html/PROcess.html there are two files in the PROcess/inst/Test directory which are of the extension *.csv.gz. With both files, if I open up a gzfile connection, say: vv <-

Dividing rows when time is overlapping

2011 Dec 07

Hi all, I have a dataframe that was created from the fusion of two dataframes. Both spanned the same time interval but contained different information. When I put them together, the info overlapped, since there are no holes in the time interval of one of the dataframes. Here is an example where the rows "sp=A and B" are part of a first df and the rows "sp=C" come from a

Adding data.frames together

2004 Mar 09

I have a series of data frames that are structurally identical, i.e. made with the same code, but I need to add them together so that they become one longer data frame, i.e. each of the slot vectors is increased in length by the length of the added data frame's vectors. So if I have df1 with a slot A so that length(df1$A) = 100 and I have df2 with a slot A so that length(df2$A)=200 then I

Format Data Issue??

2010 Sep 15

R Users, I am new to R and have tried to figure out how to automate this process instead of using Excel. I have read this data frame into R with read.table. I need to reshape the data from the first table into the format of the second table.
TractID  StandID  Species        CruiseDate  DBHClass  TreesPerAcre
Carbon   Stand 1  Loblolly Pine  5/20/2010   10        1.2
Carbon   Stand 1  Loblolly Pine

duplicate data between two data frames according to row names

2012 Jul 18

Hi everybody. I'll first explain my problem and what I'm trying to do. Consider this example: I'm working on 5 different weather stations. One file contains the data for 3 of these 5 weather stations. Here's an example of this file: DF1 <- data.frame(station=c("ST001","ST004","ST005"),data=c(5,2,8)) And my two other stations in

Merge two columns of a data frame

2011 Jun 06

I have the following data: prefix <- c("cheap", "budget") roots <- c("car insurance", "auto insurance") suffix <- c("quote", "quotes") prefix2 <- c("cheap", "budget") roots2 <- c("car insurance", "auto insurance") roots3 <- c("car insurance", "auto

error in lm.fit

2003 Sep 04

Hello R user, I have several data frames with >100 columns and I did a linear regression over time of each column df1.lm <- lapply(df1, function(x) lm(x~year)$coeff[2]) that worked fine and I get the slope of each column over time - until I divided df1 by df2 df3 <- df1/df2 > df3.lm <- lapply(df3, function(x) lm(x~year)$coeff[2]) Error in lm.fit(x, y, offset = offset, ...) :
